Predictive Regression and Robust Hypothesis Testing: Predictability Hidden by Anomalous Observations

Fabio Trojani, University of Lugano and Swiss Finance Institute, fabio.trojani@usi.ch

Joint work with Lorenzo Camponovo and Olivier Scaillet

Fabio Trojani (USI and SFI), Robust Predictive Regression, May 2012
Motivation: Literature

A huge literature has investigated whether stock returns can be predicted by economic variables such as, e.g., dividend yields, labor income, or the interest rate. See, e.g., Rozeff (1984), Fama and French (1988), Campbell and Shiller (1988), Nelson and Kim (1993), Goetzmann and Jorion (1995), Kothari and Shanken (1997), Campbell and Yogo (2006), Jansson and Moreira (2006), Polk, Thompson and Vuolteenaho (2006), Santos and Veronesi (2006), among others.
Motivation: Econometric Approach

The econometric approach is mostly based on a predictive regression model of the form

y_t = α + β x_{t-1} + u_t,
x_t = µ + ρ x_{t-1} + v_t,

where y_t denotes the stock return and x_{t-1} is an economic variable assumed to predict y_t.

Hypothesis of no predictability: H_0 : β_0 = 0, where β_0 is the true value of the unknown parameter β.
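The dynamics above can be simulated in a few lines; a minimal sketch (the function name, parameter values, the near-unit-root ρ, and the innovation correlation are illustrative choices, not the paper's calibration):

```python
import numpy as np

def simulate_predictive_regression(n, alpha, beta, mu, rho, corr=-0.9, seed=0):
    """Simulate y_t = alpha + beta*x_{t-1} + u_t, x_t = mu + rho*x_{t-1} + v_t,
    with correlated innovations (u_t, v_t)."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, corr], [corr, 1.0]])
    innov = rng.multivariate_normal([0.0, 0.0], cov, size=n + 1)
    x = np.empty(n + 1)
    x[0] = mu / (1.0 - rho)  # start the predictor at its stationary mean
    for t in range(1, n + 1):
        x[t] = mu + rho * x[t - 1] + innov[t, 1]
    y = alpha + beta * x[:-1] + innov[1:, 0]  # y_t regressed on the lagged x
    return y, x

y, x = simulate_predictive_regression(180, 0.0, 0.0, 0.0, 0.95)
```

With β = 0 the regressor carries no information about y_t, which is exactly the null being tested.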
Motivation: Issues

Issue: Because of endogenous (nearly integrated) predictors and correlated innovations, standard asymptotic theory implies bias. Recently, new approaches have been proposed in order to overcome this problem.

Bias-corrected methods: Stambaugh (1999), and Amihud, Hurvich and Wang (2008).
Near-to-unit-root asymptotics: Lewellen (2004), Torous, Valkanov and Yan (2004), and Campbell and Yogo (2006).
Resampling methods: Wolf (2000), Choi and Chue (2007), and Ang and Bekaert (2007).

Nevertheless, these new procedures still lead in a number of cases to diverging results and conclusions!
Contribution (1): The Lack of Robustness

We show through Monte Carlo simulations that all these testing procedures are dramatically non-resistant to even small fractions of anomalous observations in the data. In particular, the presence of anomalous observations dramatically decreases the tests' ability to reject the null of no predictability.

We quantify theoretically the robustness properties of resampling-method tests of predictability using the concept of breakdown point. These results show a dramatic lack of robustness of resampling methods.
Contribution (2): Robust Tests of Predictability

To overcome this robustness problem, we develop a novel class of robust bootstrap and subsampling tests for general time series settings, which are resistant to anomalous observations. The robust tests are very general and apply easily to predictive regression models. Monte Carlo analysis confirms the reliability of these testing procedures.

Besides the robustness improvement, our robust approach also reduces the computational costs!
Contribution (3): Empirical Analysis

We provide a robust assessment of the recent empirical evidence on stock return predictability for US stock market data. We study single-predictor and multi-predictor models, using well-known predictive variables suggested in the literature:

Dividend Yields. Variance Risk Premia. Labor Income.

For all three predictors (and in particular for dividend yields), our robust approach suggests more significance in favor of predictability than nonrobust methods!
Outline

(1) The Lack of Robustness. Monte Carlo Analysis. Quantile Breakdown Point.
(2) The Robust Approach. Robust Resampling Methods. Monte Carlo Analysis.
(3) Empirical Analysis. Single-Predictor Model. Multi-Predictor Model.
Conclusions.
The Lack of Robustness

We start our analysis by studying through Monte Carlo simulations the robustness properties of some of the most recent tests of predictability. In particular, we consider:

Bias-corrected methods: Amihud, Hurvich and Wang (2008).
Tests based on near-to-unit-root asymptotics: Campbell and Yogo (2006).
Resampling-method tests: Wolf (2000), Choi and Chue (2007).
Monte Carlo Setting

We generate N = 1,000 samples ((y_1, x_0), ..., (y_n, x_{n-1})) of size n = 180 according to the predictive regression model, with β = 0, 0.5, ..., 5.

To study robustness, we also consider replacement-outlier random samples ((ỹ_1, x_0), ..., (ỹ_n, x_{n-1})), where ỹ_t = (1 - p_t) y_t + p_t y_{3max}, with y_{3max} = 3 max(y_1, ..., y_n) and p_t an iid 0-1 random sequence such that P[p_t = 1] = 4%.

We test the null hypothesis H_0 : β_0 = 0.
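The replacement-outlier scheme above can be sketched directly (the function name and seed are illustrative):

```python
import numpy as np

def contaminate(y, prob=0.04, seed=1):
    """Replacement outliers: y~_t = (1 - p_t) y_t + p_t * y_3max,
    with y_3max = 3 * max(y) and P[p_t = 1] = prob."""
    rng = np.random.default_rng(seed)
    p = rng.random(len(y)) < prob          # iid 0-1 contamination indicators
    y_3max = 3.0 * np.max(y)               # the common replacement value
    return np.where(p, y_3max, y), p
```

Each contaminated observation is replaced by the same large value, three times the sample maximum, mimicking a small cluster of anomalous returns.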
Power Curves

[Figure: four power-curve panels.] We plot the proportion of rejections of the null hypothesis H_0 : β = 0, as the true parameter value β ranges over [0, 5]. In the top panels we present the bias-corrected method (left) and the test based on near-to-unit-root asymptotics (right). In the bottom panels we consider the bootstrap (left) and the subsampling (right). We consider noncontaminated samples (solid line) and contaminated samples (dashed line).
The Lack of Robustness

To overcome this robustness problem, we have to robustify these testing procedures. Unfortunately, this task may be hard to achieve for bias-corrected methods and tests based on near-to-unit-root asymptotics.

What about resampling methods? By applying resampling methods to robust statistics, do we get robustness? For the iid case the answer is NO! (see, e.g., Singh, 1998, Salibian-Barrera and Zamar, 2002, and Camponovo, Scaillet and Trojani, 2012). What about time series settings?
Robustness Analysis: Breakdown Point

Let X^{(n)} = (X_1, ..., X_n) be a time series sample and T_n := T(X^{(n)}) a statistic with breakdown point 0 < b ≤ 0.5.

What is the breakdown point b? It is the smallest fraction of outliers in the sample X^{(n)} such that the statistic T(X^{(n)}) → +∞. b is an important measure of robustness; see, e.g., Donoho and Huber (1983).

Ex. 1: The mean. Let T_n = (1/n) Σ_{i=1}^n X_i; then the breakdown point is 1/n.
Ex. 2: The median. Let T_n = med(X_1, ..., X_n); then the breakdown point is 0.5.
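A quick numerical illustration of the two examples (the sample values are arbitrary):

```python
import numpy as np

x = np.arange(1.0, 11.0)   # clean sample of size n = 10
x_out = x.copy()
x_out[0] = 1e9             # a single replacement outlier

# The mean breaks down with one outlier (breakdown point 1/n): it is dragged
# to roughly 1e8. The median moves only from 5.5 to 6.5 and stays bounded,
# no matter how large the outlier is (breakdown point 0.5).
```

Pushing the outlier to infinity drives the mean to infinity, while the median never leaves the range of the clean observations.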
Robustness Analysis: The Quantile

Let X*^{(k)} = (X*_1, ..., X*_k) be a subsampling/bootstrap sample, k = m, n. Then the empirical subsampling/bootstrap distribution of T*_{n,k} := T(X*^{(k)}) approximates the sampling distribution of T_n.

For t ∈ (0, 1), the t-quantile of the resampling distribution is defined as

Q_t(X^{(n)}) = inf{ x : P*(T*_{n,k} ≤ x) ≥ t }.

We characterize the robustness properties of resampling methods through the breakdown point of the distribution quantile Q_t.
Robustness Analysis: Quantile Breakdown Point

The breakdown point of the t-quantile Q_t is given by

b_t = (1/n) inf{ p, 1 ≤ p ≤ n/2 : Q_t(X^{(n)} + Z_p^ζ) = +∞ for some outlier sample Z_p^ζ },

where Z_p^ζ ranges over the set of n-component outlier samples and p ∈ ℕ, ζ ∈ ℝ are the number and size of the outliers, respectively.

When breakdown occurs, inference based on the subsampling/bootstrap distribution becomes meaningless!

Contribution: We compute quantile breakdown point formulas for the subsampling and bootstrap, as a function of n, m, b, and t.
Robustness Analysis: Numerical Examples

Quantile breakdown point of subsampling and bootstrap, n = 120, b = 0.5.

                         t = 0.95   t = 0.99
Subsampling (m = 10)     0.05       0.05
Subsampling (m = 15)     0.0667     0.0667
Bootstrap (m = 10)       0.3333     0.2667
Bootstrap (m = 15)       0.325      0.2167
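To see the mechanism behind such small breakdown values, here is a minimal Monte Carlo sketch with the sample mean as statistic (contamination level, sample sizes, and outlier magnitude are illustrative): almost every bootstrap resample of a 5%-contaminated sample picks up at least one outlier, so even high quantiles of the bootstrap distribution explode.

```python
import numpy as np

rng = np.random.default_rng(0)
n, B = 120, 2000
x = rng.standard_normal(n)
x[:6] = 1e8                    # contaminate 5% of the sample

# Bootstrap the sample mean (breakdown point 1/n): a resample avoids all
# six outliers only with probability (114/120)^120, about 0.2%, so the
# bootstrap distribution is dominated by outlier-driven means.
boot_means = np.array([rng.choice(x, size=n).mean() for _ in range(B)])
q95 = np.quantile(boot_means, 0.95)
```

The 0.95-quantile q95 is of outlier magnitude rather than of the order of a clean sample mean, which is exactly the quantile breakdown quantified in the table above.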
Conclusions on the Lack of Robustness

Standard testing procedures for predictability are dramatically non-resistant to even small fractions of anomalous observations in the data. In particular, the presence of anomalous observations dramatically decreases the tests' ability to reject the null of no predictability.

We analyze theoretically the robustness properties of resampling methods through the concept of breakdown point. These theoretical results confirm the dramatic lack of robustness of resampling methods.
Robust Approach: General Time Series Setting

To overcome the robustness problem, we consider a fast resampling approach.

Fast procedures: see, e.g., Shao and Tu (1995), Davidson and MacKinnon (1999), Hu and Kalbfleisch (2000), Andrews (2002), Goncalves and White (2004), Hong and Scaillet (2006).

Robust fast procedures: see, e.g., Salibian-Barrera and Zamar (2002), Salibian-Barrera, Van Aelst and Willems (2006, 2007), Camponovo, Scaillet and Trojani (2012).
Robust Approach: The Idea

Let X^{(n)} = (X_1, ..., X_n) be a sample defined on the probability space (Ω, F, P), indexed by the parameter θ.

M-estimator. Let θ̂_n be an estimator of θ defined as the solution of

ψ_n(X^{(n)}, θ̂_n) := (1/n) Σ_{i=1}^n g(X_i; θ̂_n) = 0.

(T_n := θ̂_n is the statistic of interest.)

Fast resampling approach idea. For each random sample X*^{(k)} = (X*_1, ..., X*_k), instead of computing θ̂*_k as the solution of the equation ψ_k(X*^{(k)}, θ̂*_k) = 0, we compute a first-order approximation of θ̂*_k.
Robust Approach: Construction

Fast resampling approach.

Original sample. Denote by θ_0 the true value. Let θ̂_n be the solution of ψ_n(X^{(n)}, θ̂_n) = 0, and let A_0 = -[∂ψ_n(X^{(n)}, θ)/∂θ |_{θ=θ_0}]^{-1}. Taylor expansion around θ_0:

θ̂_n = θ_0 + A_0 ψ_n(X^{(n)}, θ_0) + o_p(1).

Random sample. Let Â_n be a consistent estimator of A_0. Instead of computing θ̂*_k as the solution of ψ_k(X*^{(k)}, θ̂*_k) = 0, we consider the approximation

θ̂*_k ≈ θ̂_n + Â_n ψ_k(X*^{(k)}, θ̂_n).
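A minimal sketch of the fast update, using the sample mean as M-estimator so that every quantity is explicit (here g(x; θ) = x - θ, hence Â_n = 1 and the first-order update coincides exactly with re-estimation on each block; in general it is only a first-order approximation):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 200, 50
x = rng.standard_normal(n) + 2.0

# g(x; theta) = x - theta, so psi_n(theta) = mean(x) - theta.
# theta_hat solves psi_n = 0; A_hat = -(d psi_n / d theta)^(-1) = 1.
theta_hat = x.mean()
A_hat = 1.0

# Fast subsampling: no re-estimation on each overlapping block, just the
# first-order update theta*_m = theta_hat + A_hat * psi_m(block, theta_hat).
blocks = [x[i:i + m] for i in range(n - m + 1)]
theta_star = np.array([theta_hat + A_hat * (b.mean() - theta_hat) for b in blocks])
```

The computational gain comes from evaluating only the estimating function on each block, never re-solving the estimating equation.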
Robust Approach: General Properties

A general approach that can be applied to a wide class of resampling methods (subsampling/bootstrap).

Computational costs. The classic approach with robust estimators easily becomes unfeasible; see, e.g., Salibian-Barrera and Zamar (2002). Our approach instead requires only the estimators Â_n and θ̂_n.

Consistency conditions. Bootstrap: Goncalves and White (2004). Subsampling: Hong and Scaillet (2006).
Robust Approach: Robustness Properties

Robustness of the quantile. θ̂_n + Â_n ψ_k(X*^{(k)}, θ̂_n) may diverge to infinity only when: θ̂_n diverges to infinity; Â_n is a singular matrix; or ψ_k is not bounded. The quantile breakdown point therefore depends only on the M-estimator θ̂_n, the estimating function, and the estimator Â_n.

(Robust θ̂_n) + (robust Â_n) + (‖ψ_k‖ < c < ∞) ⇒ robust resampling method!
Robust Approach: Predictive Regression Models

Finally, we can apply our robust approach to the predictive regression model. Let z^{(n)} = ((y_1, x_0), ..., (y_n, x_{n-1})) be the observation sample. Then we apply our robust approach to the robust M-estimator θ̂_n^R of the parameter θ := (α, β), defined as the solution of

ψ_{n,c}(z^{(n)}, θ̂_n^R) := (1/n) Σ_{t=1}^n g_c(y_t, w_{t-1}, θ̂_n^R) = 0,

where w_t := (1, x_t)′ and the estimating function g_c is defined by

g_c(y_t, w_{t-1}, θ) = (y_t - θ′w_{t-1}) w_{t-1} · min(1, c / ‖(y_t - θ′w_{t-1}) w_{t-1}‖).
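A sketch of the bounded estimating function g_c (the tuning constant c = 4 and the returned weight are illustrative choices, not the paper's exact tuning):

```python
import numpy as np

def g_c(y_t, w_t, theta, c=4.0):
    """Huber-type bounded estimating function: the OLS score
    (y - theta'w) w, downweighted by min(1, c / ||score||)."""
    score = (y_t - theta @ w_t) * w_t
    norm = np.linalg.norm(score)
    weight = min(1.0, c / norm) if norm > 0 else 1.0
    return weight * score, weight
```

The weight min(1, c/‖·‖) caps the norm of the score at c; this boundedness is what gives the resampling scheme its maximal quantile breakdown point.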
Robust Predictive Regression Model

Let b_t denote the breakdown point of the t-quantile of our robust bootstrap or subsampling. Then b_t = 0.5 for all t ∈ (0, 1), i.e., our robust approach implies a maximal breakdown point for each t-quantile!

We also study through Monte Carlo simulations the robustness properties of our approach.
Power Curves

[Figure: two power-curve panels.] We plot the proportion of rejections of the null hypothesis H_0 : β = 0, as the true parameter value β ranges over [0, 5]. We consider the robust bootstrap (left) and the robust subsampling (right), for noncontaminated samples (solid line) and contaminated samples (dashed line).
Conclusions on the Robust Approach

For the class of M-estimators, we provide a fast robust approach that implies robust resampling methods. The approach can be applied to a wide class of resampling methods (subsampling/bootstrap). The resampling methods directly inherit the robustness properties of the robust bounded estimating function.

Using our robust approach, we introduce robust resampling tests of predictability.
Empirical Analysis

We consider both single-predictor and two-predictor models. In particular, we study the ability of

Dividend Yield, Variance Risk Premia, Labor Income,

to predict future stock returns.
Single-Predictor Model

Data. S&P 500 index data (1871-2008) from Shiller (2000). We define the one-period real total return as R_t = (P_t + d_t)/P_{t-1}, where P_t is the end-of-month real stock price and d_t is the real dividend paid during month t.

We define the annualized dividend series D_t as

D_t = d_t + (1 + r_t) d_{t-1} + ... + (1 + r_t)···(1 + r_{t-10}) d_{t-11},

where r_t is the one-month Treasury-bill rate.

Predictive regression model: ln(R_t) = α + β (D_{t-1}/P_{t-1}) + ε_t.
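The annualized dividend construction can be sketched as follows, assuming monthly dividend and T-bill-rate series aligned by date (the function name and alignment conventions are assumptions):

```python
import numpy as np

def annualized_dividends(d, r):
    """D_t = d_t + (1+r_t) d_{t-1} + ... + (1+r_t)...(1+r_{t-10}) d_{t-11}:
    the last 12 monthly dividends, reinvested at the one-month T-bill rate."""
    d, r = np.asarray(d, float), np.asarray(r, float)
    D = np.full(len(d), np.nan)          # undefined for the first 11 months
    for t in range(11, len(d)):
        total, growth = d[t], 1.0
        for j in range(1, 12):
            growth *= 1.0 + r[t - j + 1]  # compound forward from month t-j+1
            total += growth * d[t - j]
        D[t] = total
    return D
```

With all rates at zero, D_t reduces to the plain 12-month dividend sum, which gives a quick sanity check.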
Confidence Intervals: Nonrobust Tests

[Figure: four panels of confidence interval bounds over 1995-2010.] 90% confidence intervals for the parameter β. In the top panels we present the bias-corrected method (left) and the test based on near-to-unit-root asymptotics (right). In the bottom panels we consider the bootstrap (left) and the subsampling (right).
Confidence Intervals: Robust Tests

[Figure: two panels of confidence interval bounds over 1995-2010.] 90% confidence intervals for the parameter β. We consider the robust bootstrap (left) and the robust subsampling (right).
Huber Weights: The Anomalous Observations

[Figure: Huber weights over 1980-2010.] We plot the Huber weights for the predictive regression model in the period 1980-2010.
Properties of the Anomalous Observations

In the whole dataset for the period 1980-2010, the proportion of anomalous observations is 5.55%. Particularly influential observations are:

October 2008: Lehman Brothers default on September 15, 2008.
October 2001: terrorist attack on the Twin Towers on September 11, 2001.
November 1987: Black Monday on October 19, 1987, one of the largest one-day percentage declines in recorded stock market history.
Two-Predictor Model: Bollerslev et al. (2009)

We study the forecast ability of dividend yields and variance risk premia in the two-predictor model proposed in Bollerslev, Tauchen and Zhou (2009):

(1/k) ln(R_{t+k,t}) = α + β_1 ln(D_t/P_t) + β_2 VRP_t + ε_{t+k,t},

where ln(R_{t+k,t}) := ln(R_{t+1}) + ... + ln(R_{t+k}) and VRP_t := IV_t - RV_t.
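The k-period log return on the left-hand side can be built as a rolling sum; a minimal sketch (the VRP comment is only a reminder of the definition, not a data construction):

```python
import numpy as np

def k_period_log_return(R, k):
    """ln(R_{t+k,t}) = ln(R_{t+1}) + ... + ln(R_{t+k}), as a rolling sum
    over the next k one-period gross returns."""
    logR = np.log(np.asarray(R, float))
    return np.array([logR[t + 1:t + k + 1].sum() for t in range(len(R) - k)])

# Variance risk premium proxy: VRP_t = IV_t - RV_t
# (model-free implied variance minus realized variance).
```

Each entry t of the output is the cumulative log return over months t+1 through t+k, matching the left-hand side of the regression above.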
Confidence Intervals: Dividend Yields

[Figure: four panels of confidence interval bounds over 2005-2010.] 90% confidence intervals for the parameter β_1. In the top panels, we consider the nonrobust bootstrap (left) and the nonrobust subsampling (right). In the bottom panels, we consider the robust bootstrap (left) and the robust subsampling (right).
Confidence Intervals: Variance Risk Premia

[Figure: four panels of confidence interval bounds over 2005-2010.] 90% confidence intervals for the parameter β_2. In the top panels, we consider the nonrobust bootstrap (left) and the nonrobust subsampling (right). In the bottom panels, we consider the robust bootstrap (left) and the robust subsampling (right).
Huber Weights: The Anomalous Observations

[Figure: Huber weights over 1990-2010.] We plot the Huber weights for the predictive regression model in the period 1990-2010.
Properties of the Anomalous Observations

In the whole dataset for the period 1990-2010, the proportion of anomalous observations is 8.33%. Particularly influential observations are:

October 2008: Lehman Brothers default on September 15, 2008.
August 2002: dot-com bubble collapse.
October 2001: terrorist attack on the Twin Towers on September 11, 2001.
Two-Predictor Model: Santos and Veronesi (2006)

We study the forecast ability of dividend yields and labor income in the two-predictor model proposed in Santos and Veronesi (2006):

ln(R_t) = α + β_1 ln(D_{t-1}/P_{t-1}) + β_2 s_{t-1} + ε_t,

where s_{t-1} = w_{t-1}/C_{t-1} is the share of labor income to consumption.

In this study, we consider quarterly returns on the value-weighted CRSP index, which includes NYSE, AMEX, and NASDAQ.
Confidence Intervals: Dividend Yields

[Figure: four panels of confidence interval bounds over 1995-2010.] 90% confidence intervals for the parameter β_1. In the top panels, we consider the nonrobust bootstrap (left) and the nonrobust subsampling (right). In the bottom panels, we consider the robust bootstrap (left) and the robust subsampling (right).
Confidence Intervals: Labor Income

[Figure: four panels of confidence interval bounds over 1995-2010.] 90% confidence intervals for the parameter β_2. In the top panels, we consider the nonrobust bootstrap (left) and the nonrobust subsampling (right). In the bottom panels, we consider the robust bootstrap (left) and the robust subsampling (right).
Huber Weights: The Anomalous Observations

[Figure: Huber weights over 1950-2010.] We plot the Huber weights for the predictive regression model in the period 1950-2010.
Properties of the Anomalous Observations

In the whole dataset for the period 1950-2010, the proportion of anomalous observations is 4.58%. Particularly influential observations are:

4Q 2008-1Q 2009: Lehman Brothers default on September 15, 2008.
4Q 1987-1Q 1988: Black Monday on October 19, 1987, one of the largest one-day percentage declines in recorded stock market history.
4Q 1973-1Q 1974: oil crisis.
Conclusions on the Empirical Analysis

Using our robust tests, we show that the dividend yield is a robust predictive variable of market returns, significant at the 5% level in all our regressions, for each subperiod, sampling frequency, and forecasting horizon considered!

The variance risk premium is a robust predictive variable of future market returns at quarterly forecasting horizons (both using nonrobust and robust tests).

Using nonrobust tests, the forecast ability of labor income is absent or less pronounced in some subperiods.
Final Conclusions

(1) Standard testing procedures for predictability are dramatically non-resistant to even small fractions of anomalous observations in the data. The presence of such observations dramatically decreases the tests' ability to reject the null of no predictability.

(2) To overcome this robustness problem, we introduce a new class of resampling-method tests with desirable robustness properties.

(3) Our robust tests detect predictability structures more consistently than classical nonrobust methods.