Exact Split-Sample Permutation Tests of Orthogonality and Random Walk in the Presence of a Drift Parameter


Richard Luger

Department of Economics, Emory University, Atlanta, GA 30322-2240, USA

February 13, 2006

Abstract

Consider the predictive regression $y_t = \beta_0 + \beta_1 x_{t-1} + \varepsilon_t$, $t = 1,\dots,T$, where $\beta_0$ is unknown and the error terms are, in general, neither independent nor identically distributed. Assuming conditionally median-zero errors, Campbell and Dufour [International Economic Review 38 (1997) p. 151] propose finite-sample generalized bounds tests of the null hypothesis $H_0\colon \beta_1 = 0$. The framework includes tests of the random walk hypothesis in the presence of a drift. This paper proposes a split-sample permutation principle to eliminate the bound from the Campbell-Dufour test procedure. The result is a new procedure which yields tests with the correct size (as opposed to the tests of Campbell and Dufour, which only have the correct level). The test procedure proposed in this paper retains all the virtues of that in Campbell and Dufour: it remains exact in the presence of general forms of feedback, non-normality, conditional heteroscedasticity, and non-linear dependence. A simulation study confirms the reliability of the new procedure, and its power is strikingly superior to the bounds tests of Campbell and Dufour, especially when the x-process is highly persistent. The test procedure developed is applied to U.S. data to examine whether interest rates or interest rate spreads help predict future output growth.

JEL classification: C12, C15, C22, C52, C53

Keywords: Permutation test; Orthogonality test; Random walk; Predictive regression; Feedback; Exact inference.

Tel.: +1-404-727-0328; fax: +1-404-727-4639. E-mail address: rluger@emory.edu (R. Luger).

1. Introduction

Many models in economics and finance imply orthogonality conditions of the form

$$E[y_t \mid \Omega_{t-1}] = \beta_0, \tag{1}$$

where $\Omega_t = (x_0,\dots,x_t)$ defines an information vector. The condition in (1) states that the conditional expectation of $y_t$ with respect to the past of $x_t$ is a constant. Although the orthogonality condition in (1) entertains many alternatives, it is typical in empirical work to run linear predictive regressions of the form

$$y_t = \beta_0 + \beta_1 x_{t-1} + \varepsilon_t, \tag{2}$$

and to test the null hypothesis $H_0\colon \beta_1 = 0$. A statistically significant non-zero value of $\beta_1$ means that the x-variable is a useful linear predictor of the variable that appears on the left-hand side of (2). Tests of stock return predictability, forward rate unbiasedness, the permanent income hypothesis, the expectations hypothesis of the term structure of interest rates, and the constant real interest rate hypothesis can all be cast in this form. Examples include Fama and Schwert (1977), Rozeff (1984), Keim and Stambaugh (1986), Campbell (1987), Fama and French (1988, 1989), Campbell and Shiller (1988), Hodrick (1992), Kothari and Shanken (1997), and Pontiff and Schall (1998).

The usual regression testing procedure relies on first-order asymptotic theory, which implies that the t-statistic is approximately standard normal in large samples. As both simulation and analytic studies have shown, the normal distribution can be a very poor approximation of the t-statistic's finite-sample null distribution. The problem largely originates in feedback from the error terms to future values of the regressor, even though the error term and the regressor are contemporaneously uncorrelated; i.e., condition (1) allows feedback so that $E[\varepsilon_t x_{t+i}]$, for $i \ge 0$, may be different from zero. In such cases, the t-test has the wrong size, rejecting the null too often. This over-rejection problem is further exacerbated when the regressor variable is highly persistent. Evidence is found in Mankiw and Shapiro (1986), Banerjee and Dolado (1987, 1988), Galbraith, Dolado, and Banerjee (1987), Banerjee, Dolado, and Galbraith (1990), Elliott and Stock (1994), and Stambaugh (1999). The reliability of inference in studies using highly persistent regressors such as interest rates, dividend-price ratios, and forward rates is therefore questionable.

Several remedies have been proposed for the over-rejection problem. Local-to-unity asymptotic solutions are proposed in Elliott and Stock (1994), Cavanagh, Elliott, and Stock (1995), Valkanov (2003), Campbell and Yogo (2003), and Torous, Valkanov, and Yan (2005).

Other solutions include: size corrections proposed in Stambaugh (1999) and Lewellen (2003); resampling approaches proposed in Nelson and Kim (1993), Goetzmann and Jorion (1993), and Wolf (2000); and alternative covariance-based t-tests in Maynard and Shimotsu (2004). None of these methods, however, are provably exact.

An exact test procedure is proposed by Campbell and Dufour (1997). Their approach focuses on non-parametric analogues of the t-test, based on sign and signed rank statistics, which are exact under very weak assumptions concerning the distribution of $y_t$ and the relationship between $y_t$ and $x_t$. For one group of tests, they simply assume that $y_t$ has conditional median $\beta_0$; for the other, they make the additional assumption that the distribution of $y_t$ is conditionally symmetric about $\beta_0$. These assumptions leave open the possibility of feedback from $y_t$ to current and future values of the x-variable, without specifying the form of feedback or any other property of the x-process; no other assumptions about the y-variable, such as normality or even the existence of moments, are required. The Campbell-Dufour test procedure combines an exact non-parametric confidence interval for the parameter $\beta_0$ with conditional non-parametric tests linked to each point in the confidence interval. Their approach yields finite-sample generalized bounds tests, which are invariant to general forms of conditional heteroscedasticity and non-linear dependence.

The random walk hypothesis corresponds to $\beta_1 = 0$ in model (2) with the left-hand side variable defined as $\Delta y_t = y_t - y_{t-1}$ and the regressor variable defined as $y_{t-1}$. Among the many ways to test this hypothesis, Cochrane's (1988) variance-ratio methodology is quite popular, especially in empirical finance (see, for example, Poterba and Summers 1988; Lo and MacKinlay 1988; Liu and He 1991; Kim, Nelson, and Startz 1991). Lo and MacKinlay (1988) derive the asymptotic sampling theory for the variance-ratio statistic. In recognition of the time-varying volatilities that characterize financial time series, Lo and MacKinlay also derive a heteroscedasticity-consistent estimator of the variance-ratio's asymptotic variance. As the simulation results of Kim, Nelson, and Startz (1998) indicate, the variance-ratio test can reject the null hypothesis too often, depending on the degree of heterogeneity in the conditional variances. Compared to such parametric procedures, the invariance of the Campbell-Dufour approach to general forms of conditional heteroscedasticity and non-linear dependence makes it particularly attractive for testing the random walk hypothesis.

Following Campbell and Dufour (1997), a test of $\beta_1 = 0$ proceeds in two steps. First, one establishes an exact confidence interval for the parameter $\beta_0$ that is valid at least under the null.

Second, sign and signed rank statistics based on the products $(y_t - b)g_{t-1}$, where $g_t = g_t(I_t)$, $t = 0,\dots,T-1$, is a sequence of measurable functions of the information vector $I_t$, are computed for each candidate value $b$ in the confidence interval. Campbell and Dufour show that the sign and Wilcoxon tests apply in this context such that, when combined using Bonferroni's inequality with the confidence interval for $\beta_0$, an exact finite-sample bounds test can be performed.

As the simulation results in Campbell and Dufour (1997), Luger (2003), and those in Section 4 of the present paper show, the Campbell-Dufour approach can be quite conservative in finite samples, such that power losses can result against certain alternatives. This paper proposes a split-sample permutation principle to eliminate the bound from the Campbell-Dufour test procedure. The result is a new procedure which yields tests with the correct size (as opposed to the tests of Campbell and Dufour, which only have the correct level). The test procedure proposed in this paper retains all the virtues of that in Campbell and Dufour: it remains exact in the presence of general forms of feedback, non-normality, conditional heteroscedasticity, and non-linear dependence. Simulations reveal that eliminating the bound from the Campbell-Dufour test procedure can lead to dramatic power gains, especially when the x-process is highly persistent.

This paper extends the permutation principles described in Dufour and Roy (1985), McCabe (1989), Kennedy (1995), and Luger (2006) to tests of orthogonality and random walk when there is an unknown intercept or drift parameter. The key insight is that a permutation distribution can be generated by splitting the sample in two parts. The new test procedure is built on three exchangeability properties that hold under the null hypothesis: (i) conditional on $g_0,\dots,g_{T-1}$, the statistics $I[(y_t - b)g_{t-1} \le 0]$, $t = 1,\dots,T$, where $I[\,\cdot\,]$ is the indicator function (the basic building blocks of the Campbell-Dufour approach), are exchangeable when evaluated at $b = \beta_0$; (ii) the split-sample confidence interval constructed as per the first step of the Campbell-Dufour approach is equal in distribution to one based on the permuted sample; and (iii) split-sample conditional sign statistics linked to each point in the confidence interval are equal in distribution to ones based on the permuted sample. This last property is used with a Monte Carlo resampling technique to construct conditional distribution-free tests of any hypothesis that restricts the value of $\beta_1$ in the context of (2).

The rest of this paper is organized as follows. Section 2 reviews the Campbell-Dufour approach. Section 3 derives the size-correct test procedure. Section 4 compares the size and power of the proposed test procedure with that of Campbell and Dufour. The results show that the power of the Campbell-Dufour test procedure decreases as the x-process becomes more persistent.

The new test procedure does not have this problem. Further, the power gap between the Campbell-Dufour tests and the new one becomes greater under heavier-tailed error terms. Section 5 applies the methods to test whether interest rates or interest rate spreads help predict future output growth. Ang, Piazzesi, and Wei (2005) develop a no-arbitrage model for the dynamics of the term structure of interest rates jointly with output growth. Their model predicts that the nominal short rate contains more information about future output growth than any yield spread. Contrary to Ang, Piazzesi, and Wei, the yield spread is found to be a statistically significant predictor of future output growth, while the nominal short rate is not, once the correct non-parametric inference procedures are used. Section 6 offers some conclusions.

2. The Campbell-Dufour Approach

Campbell and Dufour (1997) develop their tests in the framework of a general model involving the random variables $y_1,\dots,y_T, x_0,\dots,x_{T-1}$, and the corresponding information vectors defined by $I_t = (x_0, x_1,\dots,x_t, y_1,\dots,y_t)$, where $t = 0,\dots,T-1$, with the convention that $I_0 = (x_0)$. (Note that $I_t \supseteq \Omega_t$.) Campbell and Dufour introduce tests of the independence of $y_t$ from $I_{t-1}$, which are exact under very weak assumptions concerning the distribution of $y_t$ and the relationship between $y_t$ and $x_t$. In particular, they assume that

$$y_t \text{ is independent of } I_{t-1}, \tag{3}$$

and

$$P(y_t > \beta_0) = P(y_t < \beta_0), \tag{4}$$

for $t = 1,\dots,T$. No assumptions are made about the process governing $x_t$ nor about the relationship between $y_t$ and $x_t$. This leaves open the possibility of feedback from $y_t$ to current and future values of the x-variable, without specifying the form of feedback; the variables $y_t$ need not be normally nor identically distributed. Campbell and Dufour also develop tests under the stronger assumption:

$$y_1,\dots,y_T \text{ have continuous distributions symmetric about } \beta_0. \tag{5}$$

Clearly, (5) implies (4), but the converse is not necessarily true.

The Campbell-Dufour approach yields an exact test of $H_0$ in the context of (2) under the assumption that the error terms $\varepsilon_t$ are independently distributed according to median-zero distributions. Of course, one can just as easily test restrictions such as $\beta_1 = \beta_1^\tau$, for any $\beta_1^\tau \in \mathbb{R}$, simply by reparameterizing the model as $y_t - \beta_1^\tau x_{t-1} = \beta_0 + \beta_1 x_{t-1} + \varepsilon_t$. This is precisely how the random walk hypothesis ($H_0^{RW}\colon \beta_1 = 1$) is tested when $x_{t-1} = y_{t-1}$.

Following Campbell and Dufour, a test of the null $H_0$ when $\beta_0$ is unknown proceeds in two steps. First, one establishes an exact confidence interval for the parameter $\beta_0$ that is valid at least under the null hypothesis. Following well-known results from non-parametric statistics, such a confidence interval is easily constructed for the parameter $\beta_0$ when $\beta_1 = 0$, where the confidence bounds correspond to certain order statistics. The order statistics of interest are $y_{(1)},\dots,y_{(T)}$, which represent the observations $Y = (y_1,\dots,y_T)$ placed in increasing order of magnitude; i.e., $y_{(1)} \le y_{(2)} \le \dots \le y_{(T)}$. An exact confidence interval, say $C_{\beta_0}(Y,\alpha)$, for $\beta_0$ with level $1-\alpha$ is given by $[y_{(k+1)}, y_{(T-k)}]$, where $k$ is the largest integer such that $P(B \le k) \le \alpha/2$, for $B$ a binomial random variable with number of trials $T$ and probability of success 1/2 (Campbell and Dufour 1997, p. 157).

The second step of the Campbell-Dufour test procedure is to compute, for each candidate value $b$ in the confidence interval, the value of sign or signed rank test statistics. The basic building blocks of these statistics are the products $z_t(b) = (y_t - b)g_{t-1}$, where $g_t = g_t(I_t)$, $t = 0,\dots,T-1$, is a sequence of measurable functions of the information vector $I_t$. Let $s[z] = 0.5(1 - \mathrm{sign}[z])$, where $\mathrm{sign}[z] = 1$ if $z > 0$, and $\mathrm{sign}[z] = -1$ if $z \le 0$. Campbell and Dufour consider the sign statistic

$$S_g^+(Y,b) = \sum_{t=1}^T s[(y_t - b)g_{t-1}] \tag{6}$$

and the Wilcoxon signed rank statistic

$$SR_g^+(Y,b) = \sum_{t=1}^T s[(y_t - b)g_{t-1}]\, R_t(|Y - b|), \tag{7}$$

where $R_t(|Y - b|)$ is the rank of $|y_t - b|$ when $|y_1 - b|,\dots,|y_T - b|$ are placed in ascending order. In contrast to the usual definition of Wilcoxon-type statistics, where the absolute ranks would be based on the products $(y_t - b)g_{t-1}$, the definition in (7) uses the absolute ranks of $y_t - b$. Under assumptions ensuring that the z-series contains no zeros and that no ties occur among the absolute values $|y_1 - b|,\dots,|y_T - b|$ with probability 1, Campbell and Dufour show that the sign and Wilcoxon tests apply in this context; i.e., $S_g^+(Y,\beta_0)$ follows a binomial distribution with number of trials $T$ and probability of success 1/2, and $SR_g^+(Y,\beta_0)$ is distributed like $W = \sum_{t=1}^T t B_t$, where $B_1,\dots,B_T$ are mutually independent uniform Bernoulli variables on $\{0, 1\}$.
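To make the first step concrete, here is a minimal Python sketch of the order-statistic confidence interval $C_{\beta_0}(Y,\alpha)$; the function name and the simulated data are illustrative, not part of the original paper.

```python
import numpy as np
from scipy.stats import binom

def sign_ci(y, alpha=0.05):
    """Exact confidence interval [y_(k+1), y_(T-k)] for the median beta_0,
    where k is the largest integer with P(B <= k) <= alpha/2 and
    B ~ Binomial(T, 1/2) (Campbell and Dufour 1997, p. 157)."""
    y = np.sort(np.asarray(y, dtype=float))
    T = len(y)
    k = int(binom.ppf(alpha / 2, T, 0.5))   # smallest k with cdf >= alpha/2
    if binom.cdf(k, T, 0.5) > alpha / 2:    # step back if ppf overshoots
        k -= 1
    k = max(k, 0)                           # guard for very small T
    return y[k], y[T - k - 1]               # y_(k+1) and y_(T-k), 0-based

# Illustrative use on simulated data:
rng = np.random.default_rng(0)
print(sign_ci(rng.standard_normal(50), alpha=0.05))
```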

The distribution of the Wilcoxon variate $W$ has been tabulated for various sample sizes; see Table A.4 in Hollander and Wolfe (1973) for $T \le 15$. For larger values, the standard normal distribution provides a very good approximation to the standardized version $(SR_g^+(Y,\beta_0) - E[SR_g^+(Y,\beta_0)])/\sqrt{\mathrm{Var}[SR_g^+(Y,\beta_0)]}$.

Consider the standardized versions of the sign and Wilcoxon signed rank statistics:

$$S_g^\ast(Y,b) = \frac{S_g^+(Y,b) - T/2}{\sqrt{T/4}}, \qquad SR_g^\ast(Y,b) = \frac{SR_g^+(Y,b) - T(T+1)/4}{\sqrt{T(T+1)(2T+1)/24}}, \tag{8}$$

where null means and variances are used. For any $0 \le \alpha \le 1$, let $c_S(\alpha)$ and $c_{SR}(\alpha)$ be the critical values such that

$$P\!\left(|S_g^\ast(\beta_0)| > c_S(\alpha)\right) \le \alpha, \qquad P\!\left(|SR_g^\ast(\beta_0)| > c_{SR}(\alpha)\right) \le \alpha. \tag{9}$$

Note that $S_g^\ast(Y,\beta_0)$ and $SR_g^\ast(Y,\beta_0)$ have discrete symmetric distributions about zero, so it may not be possible to make the tail probabilities in (9) equal to $\alpha$ in small samples.

Consider the standardized sign statistic, $S_g^\ast(Y,b)$, and let

$$SB = \inf\{|S_g^\ast(Y,b)| : b \in C_{\beta_0}(Y,\alpha_1)\}. \tag{10}$$

Campbell and Dufour (1997, Proposition 2) show that $P(SB > c_S(\alpha_2/2)) \le \alpha_1 + \alpha_2$; so if one adopts the following two-sided decision rule: reject $H_0$ when $SB$ is significant at level $\alpha_2/2$, then the overall level of the two-step procedure is bounded by $\alpha = \alpha_1 + \alpha_2$. The same argument applies to $SRB = \inf\{|SR_g^\ast(Y,b)| : b \in C_{\beta_0}(Y,\alpha_1)\}$.

It should be noted that the part of the Campbell-Dufour approach adopted here rejects the null only if there is sufficient evidence against it. This procedure is driven by a minimax argument as in Dufour (2006), whereby the null hypothesis is viewed as the union of point null hypotheses: $H_0 = \bigcup_{b \in B_0} \{\beta_0 = b, \beta_1 = 0\}$, where $B_0 \subseteq \mathbb{R}$ is a set of admissible values for $\beta_0$. Seen this way, it becomes natural to reject $H_0$ whenever, for all admissible values of $\beta_0$ (under the null), the corresponding point null is rejected. The bound on the overall significance level comes from replacing $B_0$ by $C_{\beta_0}(Y,\alpha_1)$.
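For illustration, a sketch of the second step might compute (6)-(8) and scan candidate values of $b$ over the interval. Since the statistics are step functions of $b$, a fine grid gives a close approximation to the infimum in (10); the function names are mine, and sign_ci() is the sketch given above.

```python
import numpy as np

def s(z):
    """s[z] = 0.5(1 - sign[z]), with sign[z] = 1 if z > 0 and -1 if z <= 0."""
    return (1 - np.where(z > 0, 1, -1)) / 2

def sign_stat(y, g_lag, b):
    """S_g^+(Y, b) in (6); g_lag holds g_0, ..., g_{T-1}."""
    return s((y - b) * g_lag).sum()

def wilcoxon_stat(y, g_lag, b):
    """SR_g^+(Y, b) in (7), using the absolute ranks of y_t - b."""
    ranks = np.abs(y - b).argsort().argsort() + 1
    return (s((y - b) * g_lag) * ranks).sum()

def SB(y, g_lag, ci, n_grid=1001):
    """Bounds statistic (10): inf of |standardized S_g^+| over b in the CI."""
    T = len(y)
    grid = np.linspace(ci[0], ci[1], n_grid)
    return min(abs((sign_stat(y, g_lag, b) - T / 2) / np.sqrt(T / 4))
               for b in grid)
```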

It remains to discuss the choice of the functions $g_t = g_t(I_t)$. For power considerations, Campbell and Dufour suggest

$$g_t = x_t - \hat{m}_t(x), \qquad t = 0,\dots,T-1, \tag{11}$$

where $\hat{m}_t(x) = \mathrm{median}(x_0, x_1,\dots,x_t)$; the initial value is defined as $\hat{m}_0(x_0) = 0$ so that $g_{t-1} \ne 0$ may be assumed. It is important to note that $\hat{m}_t(x)$ only depends on information up to time $t$. As Campbell and Dufour remark, other centering functions may be more appropriate if the median of the x-variable is assumed non-constant. For example, one could consider an explicit model of the x-process.

3. Size-Correct Test Procedure

The Campbell-Dufour approach ensures that the probability of a Type I error is bounded from above by the desired level $\alpha$. This section shows how to eliminate the bound from the Campbell-Dufour approach so as to obtain a size-correct test procedure; i.e., one for which the probability of a Type I error is exactly equal to $\alpha$.

The tests of $H_0\colon \beta_1 = 0$ are developed in the context of model (2) under very general distributional assumptions for the error terms. In particular, the error terms are not assumed to be statistically independent. Let $\mathrm{sign}[\varepsilon_1^t]$ denote the vector $(\mathrm{sign}[\varepsilon_1],\dots,\mathrm{sign}[\varepsilon_t])$. The family of tests developed here assumes merely that the error terms satisfy

$$P_0\!\left(\mathrm{sign}[\varepsilon_t] = 1 \mid \Omega_{t-1}, \mathrm{sign}[\varepsilon_1^{t-1}]\right) = P_0\!\left(\mathrm{sign}[\varepsilon_t] = -1 \mid \Omega_{t-1}, \mathrm{sign}[\varepsilon_1^{t-1}]\right), \tag{12}$$

for each $t = 1,\dots,T$, where $P_0$ indicates a probability computed under $H_0$. Under such a general distributional assumption for the error terms, testing $H_0\colon \beta_1 = 0$ is equivalent to testing whether the conditional median of $y_t$, given $\Omega_{t-1}$ and $\mathrm{sign}[\varepsilon_1^{t-1}]$, equals $\beta_0$. Discrete and asymmetric conditional error distributions are thus admitted in this framework, provided there is no mass at the origin. As Proposition 1 below shows, Assumption (12) is sufficient to ensure the validity of the Campbell-Dufour approach based on the sign statistic; i.e., one need not assume that $y_t$ is independent of the past of $x_t$ and $y_t$ itself as in (3).

Clearly, Assumption (12) still leaves open the possibility of feedback from $y_t$ to current and future values of the x-variable, and the process governing $x_t$ may be non-stationary. Furthermore, the errors may be conditionally heteroscedastic. Indeed, several popular models of time-varying conditional variance, such as generalized autoregressive conditional heteroscedasticity (GARCH) and stochastic volatility models, posit $\varepsilon_t = \sigma_t \eta_t$, where $\sigma_t$ is independent of $\eta_t$, and $\{\eta_t\}$ is an i.i.d. sequence of random variables drawn from a symmetric distribution (such as a standard normal or Student-t distribution). In those cases (12) is trivially satisfied.

Note that the conditional variance need not be finite nor even follow a stationary process, and moments need not exist. In fact, other than (12), no restrictions are placed on the degree of heterogeneity and persistence of the conditional variance process (or any other non-linear dependence process governing $\varepsilon_t$).

As in the Campbell-Dufour approach, the simple products $z_t(b) = (y_t - b)g_{t-1}$, $t = 1,\dots,T$, form the basic building blocks of the inference method proposed here. Consider then the class of sign statistics defined by

$$S_g(Y,b) = \sum_{t=1}^T \mathrm{sign}[(y_t - b)g_{t-1}]. \tag{13}$$

Note that it is easier to implement two-sided permutation tests based on (13) rather than on (6), since $S_g(Y,\beta_0)$ is already symmetrically distributed about the origin. The next two results establish the key exchangeability properties that ensure the exactness of the proposed inference method.

Proposition 1. Suppose the model in (2) holds with error terms satisfying (12) and that $P_0[g_t = 0] = 0$ for $t = 0,\dots,T-1$. Then, under the null hypothesis and conditional on $g = (g_0,\dots,g_{T-1})$,

$$\left(\mathrm{sign}[(y_1 - \beta_0)g_0],\dots,\mathrm{sign}[(y_T - \beta_0)g_{T-1}]\right) \stackrel{d}{=} \left(\mathrm{sign}[(y_{d_1} - \beta_0)g_0],\dots,\mathrm{sign}[(y_{d_T} - \beta_0)g_{T-1}]\right),$$

for all permutations $d_1,\dots,d_T$ of the integers $1,\dots,T$, where $\stackrel{d}{=}$ stands for equality in distribution.

Proof. Conditional on $g$ and assuming $g_t \ne 0$ with probability 1 for $t = 0,\dots,T-1$, the variables $\mathrm{sign}[(y_t - \beta_0)g_{t-1}]$, $t = 1,\dots,T$, are mutually independent under Assumption (12) with

$$P_0\{\mathrm{sign}[(y_t - \beta_0)g_{t-1}] = 1 \mid g\} = P_0\{\mathrm{sign}[(y_t - \beta_0)g_{t-1}] = -1 \mid g\} = 1/2.$$

To see this, let $Z = (\mathrm{sign}[(y_1 - \beta_0)g_0],\dots,\mathrm{sign}[(y_T - \beta_0)g_{T-1}])$ and consider the null characteristic function of $Z$:

$$\varphi_Z(\tau) = E[\exp(i\tau' Z)] = E\left[\prod_{t=1}^T \exp\!\left(i\tau_t\, \mathrm{sign}[(y_t - \beta_0)g_{t-1}]\right)\right],$$

where $\tau = (\tau_1,\dots,\tau_T)' \in \mathbb{R}^T$ and $i = \sqrt{-1}$. Conditional on $\Omega_{T-1}$ and $\mathrm{sign}[\varepsilon_1^{T-1}]$, the null characteristic function can be written as

$$\varphi_Z(\tau) = E\left[\prod_{t=1}^{T-1} \exp\!\left(i\tau_t\, \mathrm{sign}[(y_t - \beta_0)g_{t-1}]\right) E\!\left[\exp\!\left(i\tau_T\, \mathrm{sign}[(y_T - \beta_0)g_{T-1}]\right) \Big| \Omega_{T-1}, \mathrm{sign}[\varepsilon_1^{T-1}]\right]\right].$$

Under (12) and assuming that $g_{T-1} \ne 0$ with probability 1,

$$P_0\!\left(\mathrm{sign}[(y_T - \beta_0)g_{T-1}] = 1 \mid \Omega_{T-1}, \mathrm{sign}[\varepsilon_1^{T-1}]\right) = P_0\!\left(\mathrm{sign}[(y_T - \beta_0)g_{T-1}] = -1 \mid \Omega_{T-1}, \mathrm{sign}[\varepsilon_1^{T-1}]\right) = 1/2.$$

It follows that

$$\varphi_Z(\tau) = \tfrac{1}{2}\!\left[\exp(i\tau_T) + \exp(-i\tau_T)\right] E\left[\prod_{t=1}^{T-1} \exp\!\left(i\tau_t\, \mathrm{sign}[(y_t - \beta_0)g_{t-1}]\right)\right].$$

The argument can be repeated recursively to find

$$\varphi_Z(\tau) = \prod_{t=1}^T \tfrac{1}{2}\!\left[\exp(i\tau_t) + \exp(-i\tau_t)\right],$$

which is the product of the characteristic functions of symmetric Bernoulli distributions on $\{-1, 1\}$; i.e., under the null and conditional on $g$,

$$\left(\mathrm{sign}[(y_1 - \beta_0)g_0],\dots,\mathrm{sign}[(y_T - \beta_0)g_{T-1}]\right) \stackrel{d}{=} \left(S_1\, \mathrm{sign}[g_0],\dots,S_T\, \mathrm{sign}[g_{T-1}]\right),$$

where $S_1,\dots,S_T$ are mutually independent uniform Bernoulli variables on $\{-1, 1\}$. It follows, under the null and conditional on $g$, that the variables $\mathrm{sign}[(y_t - \beta_0)g_{t-1}]$, $t = 1,\dots,T$, are i.i.d., and hence exchangeable.

The proof of Proposition 1 represents a generalization of the result in Campbell and Dufour (1997), in that here the sign statistic is shown to follow the usual binomial distribution without assuming that $y_t$ is independent of $I_{t-1}$.

The proposed test procedure requires that the sample $Y$ be split in two subsamples of observations in order to generate a permutation distribution. The split is taken at the sample midpoint (the reason for this choice is discussed below in Remark 1). It will be assumed that $T$ is even, so that the midpoint $T/2$ is an integer. Under the null hypothesis, the choice of which subsample to denote as the first is arbitrary. A practical prescription, however, would be to denote the first (second) subsample as the one where the x-variable exhibits less (more) variability to ensure greater power.

The first subsample, say $Y_1 = (y_1,\dots,y_{T/2})$, is used to construct a confidence interval for the parameter $\beta_0$ as per the first step of the Campbell-Dufour approach. Denote by $C_{\beta_0}(Y_1,\alpha_1)$ such a confidence interval. The second subsample is used to find

$$SL(Y_2) = \inf\{|S_g(Y_2,b)| : b \in C_{\beta_0}(Y_1,\alpha_1)\}, \tag{14}$$

where $Y_2 = (y_{T/2+1},\dots,y_T)$ and $S_g(Y_2,b) = \sum_{t=T/2+1}^T \mathrm{sign}[(y_t - b)g_{t-1}]$.
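A sketch of the split-sample statistic (14), reusing sign_ci() from the earlier sketch; as before, the infimum of the step function $|S_g(Y_2,b)|$ is approximated by scanning a fine grid, and the function name is an assumption of mine.

```python
import numpy as np

def SL(y, g_lag, alpha1=0.05, n_grid=1001):
    """SL(Y_2) = inf{|S_g(Y_2, b)| : b in C_{beta0}(Y_1, alpha1)} as in (14)."""
    T = len(y)
    m = T // 2
    lo, hi = sign_ci(y[:m], alpha1)        # first-step CI from Y_1
    y2, g2 = y[m:], g_lag[m:]
    return min(abs(np.where((y2 - b) * g2 > 0, 1, -1).sum())
               for b in np.linspace(lo, hi, n_grid))
```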

The next result establishes a permutation distribution for the first-step confidence interval, which is then used with Proposition 1 to obtain a permutation test of $H_0$.

Proposition 2. Suppose the model in (2) holds with error terms satisfying (12) and let $C_{\beta_0}(Y,\alpha)$ be a confidence interval constructed as per the first step of the Campbell-Dufour approach. Then, under the null hypothesis,

$$C_{\beta_0}(Y_1,\alpha) \stackrel{d}{=} C_{\beta_0}(Y_1(d),\alpha),$$

where $Y_1(d) = (y_{d_1},\dots,y_{d_{T/2}})$ for every permutation $d = (d_1,\dots,d_T)$ of the integers $1,\dots,T$.

Proof. Observe that the order statistics $y_{(1)},\dots,y_{(T/2)}$ used to construct $C_{\beta_0}(Y_1,\alpha)$ are related to the empirical distribution function, $F_1$, of $Y_1$ under the null by

$$\frac{T}{2} F_1(b) = \begin{cases} 0 & \text{if } b < y_{(1)}, \\ t & \text{if } y_{(t)} \le b < y_{(t+1)},\ t = 1,\dots,T/2 - 1, \\ T/2 & \text{if } b \ge y_{(T/2)}. \end{cases}$$

Let $S^+(Y_1,b) = \sum_{t=1}^{T/2} s[y_t - b]$ and note that $\frac{T}{2}F_1(b) = S^+(Y_1,b)$. Therefore, the statistic $\frac{T}{2}F_1(\beta_0)$ is distributed according to $B(T/2, 1/2)$, a binomial distribution with number of trials $T/2$ and probability of success 1/2. Let $B_\alpha$ denote the quantile of order $\alpha$ of a $B(T/2, 1/2)$ distribution. Since $\frac{T}{2}F_1(\beta_0)$ is exactly pivotal, it follows that

$$P_0\!\left[B_{\alpha_1} \le \tfrac{T}{2}F_1(\beta_0) \le B_{1-\alpha_2}\right] = P_0\!\left[F_1^{-1}(2B_{\alpha_1}/T) \le \beta_0 \le F_1^{-1}(2B_{1-\alpha_2}/T)\right] = 1 - \alpha_1 - \alpha_2,$$

where $F_1^{-1}(u) = \inf\{x : F_1(x) \ge u\}$. If $k$ is the largest integer such that $P[B \le k] \le \alpha/2$, where $B \sim B(T/2, 1/2)$, then the symmetric confidence interval $C_{\beta_0}(Y_1,\alpha) = [y_{(k+1)}, y_{(T/2-k)}]$ has level $1-\alpha$. In light of Proposition 1, it is easy to see that $S^+(Y_1,\beta_0) \stackrel{d}{=} S^+(Y_1(d),\beta_0)$, where $S^+(Y_1(d),\beta_0) = \sum_{t=1}^{T/2} s[y_{d_t} - \beta_0]$. It directly follows that $C_{\beta_0}(Y_1,\alpha)$ is distributed like $C_{\beta_0}(Y_1(d),\alpha)$.
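Proposition 2 can be checked numerically: under the null, intervals built from the original first half and from the first half of a permuted sample should have the same distribution. A small Monte Carlo sketch, using an i.i.d. normal null purely for illustration and reusing sign_ci() from above:

```python
import numpy as np

rng = np.random.default_rng(1)
T, reps = 100, 5000
lo_orig, lo_perm = [], []
for _ in range(reps):
    y = rng.standard_normal(T)                        # null sample, beta_0 = 0
    d = rng.permutation(T)                            # a random permutation
    lo_orig.append(sign_ci(y[: T // 2], 0.05)[0])     # C(Y_1, alpha) lower bound
    lo_perm.append(sign_ci(y[d][: T // 2], 0.05)[0])  # C(Y_1(d), alpha)
# The two sets of endpoints should have matching quantiles.
print(np.quantile(lo_orig, [0.1, 0.5, 0.9]))
print(np.quantile(lo_perm, [0.1, 0.5, 0.9]))
```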

Propositions 1 and 2 pave the way for the following result.

Proposition 3. Suppose the model in (2) holds with error terms satisfying (12) and that $P_0(g_t = 0) = 0$ for $t = 0,\dots,T-1$. Let $SL(Y) = \inf\{|S_g(Y_2,b)| : b \in C_{\beta_0}(Y_1,\alpha_1)\}$, where $S_g(Y_2,b) = \sum_{t=T/2+1}^T \mathrm{sign}[(y_t - b)g_{t-1}]$, and $C_{\beta_0}(Y_1,\alpha_1)$ is an $\alpha_1$-level confidence interval for $\beta_0$ based on $Y_1$ constructed as per the first step of the Campbell-Dufour approach. Then, under the null hypothesis and conditional on $g = (g_0,\dots,g_{T-1})$,

$$SL(Y) \stackrel{d}{=} SL(Y(d)),$$

where $Y(d) = (y_{d_1},\dots,y_{d_T})$ for every permutation $d$ of the integers $1,\dots,T$; $SL(Y(d)) = \inf\{|S_g(Y_2(d),b)| : b \in C_{\beta_0}(Y_1(d),\alpha_1)\}$, where $S_g(Y_2(d),b) = \sum_{t=T/2+1}^T \mathrm{sign}[(y_{d_t} - b)g_{t-1}]$, and $C_{\beta_0}(Y_1(d),\alpha_1)$ is an $\alpha_1$-level confidence interval for $\beta_0$ based on $Y_1(d)$ constructed as per the first step of the Campbell-Dufour approach.

Proof. Follows directly from Propositions 1 and 2.

The equally likely property in Proposition 3 can be used to construct conditional $\alpha$-level distribution-free tests of the null $H_0\colon \beta_1 = 0$. Since large values of $SL(Y)$ are more probable under the alternative hypothesis, the null is rejected if the observed value of $SL(Y)$ falls in a set $R_\alpha$ containing the $T!\,\alpha$ largest values of the test statistic $SL(Y(d))$ that can be obtained from the class of all permutations. Although such tests are conditional ones, it is easy to see that they also maintain their size unconditionally.

Remark 1. Denote by $m$ and $T - m$ the number of observations in $Y_1$ and $Y_2$, respectively. Proposition 3 makes evident the tradeoff between the number of observations included in $Y_1$ versus $Y_2$. If $m$ is too small, the permutation distribution of $C_{\beta_0}(Y_1(d),\alpha_1)$ has an increased variance. Consequently, $SL(Y(d))$ also has an increased variance, thereby inflating the values of the test statistic that fall in $R_\alpha$. Further, it is clear that power losses also occur if $T - m$ is too small, since $S_g(Y_2,b)$ is based on fewer observations. Therefore, $m = T/2$ is a natural choice given this tradeoff.

Remark 2. A remarkable feature of the equally likely property in Proposition 3 is that it does not depend on $\alpha_1$, the level of the first-step confidence interval. The value of $\alpha_1$ does not affect the behavior of the split-sample permutation test procedure under the alternative hypothesis either, since $C_{\beta_0}(Y_1,\alpha_1)$ and $S_g(Y_2,b)$ are computed over different subsamples. A computationally simplified procedure thus obtains by letting $\alpha_1$ tend to 1.

It is easy to see that the sample median of $Y_1$ constitutes a (degenerate) confidence interval for which $\alpha_1 = 1$. The proposed test procedure is thus based on $S_g(Y_2, \tilde{b})$, where $\tilde{b} = \mathrm{median}(y_1,\dots,y_{T/2})$.

Remark 3. It is important to emphasize that for each randomly chosen permutation, $S_g(Y_2(d),b) = \sum_{t=T/2+1}^T \mathrm{sign}[(y_{d_t} - b)g_{t-1}]$ is computed by holding $g_0,\dots,g_{T-1}$ in the same order used to compute the original $S_g(Y_2,b)$.

A simple method to obtain a test with the desired size and a precise p-value, without enumerating the entire permutation distribution, is the Monte Carlo (MC) procedure proposed by Dwass (1957). The MC-S procedure for a two-sided test of $H_0$ is summarized as follows:

1. Compute the value of $S_g(d_M) = \sum_{t=T/2+1}^T \mathrm{sign}[(y_t - \tilde{b})g_{t-1}]$, where $\tilde{b} = \mathrm{median}(y_1,\dots,y_{T/2})$.

2. Generate a random permutation $d_m = (d_1,\dots,d_T)$ of the integers $1,\dots,T$ and compute $S_g(d_m) = \sum_{t=T/2+1}^T \mathrm{sign}[(y_{d_t} - \tilde{b})g_{t-1}]$, where $\tilde{b} = \mathrm{median}(y_{d_1},\dots,y_{d_{T/2}})$.

3. Repeat Step 2 to obtain $S_g(d_m)$, $m = 1,\dots,M-1$. Note that ties among the $|S_g|$'s have a non-zero probability of occurrence. The desired test size can nevertheless be achieved by applying the following tie-breaking rule (Dufour 2006):

4. Draw variates $U_i$, $i = 1,\dots,M$, from a continuous uniform distribution independently of the $S_g$'s and arrange the pairs $(|S_g(d_1)|, U_1),\dots,(|S_g(d_M)|, U_M)$ according to the lexicographic order:

$$(|S_g(d_i)|, U_i) < (|S_g(d_j)|, U_j) \iff \left[|S_g(d_i)| < |S_g(d_j)| \text{ or } \left\{|S_g(d_i)| = |S_g(d_j)| \text{ and } U_i < U_j\right\}\right].$$

5. Compute the rank of $(|S_g(d_M)|, U_M)$ in the lexicographic ordering according to

$$R_M = 1 + \sum_{i=1}^{M-1} I\!\left[|S_g(d_M)| > |S_g(d_i)|\right] + \sum_{i=1}^{M-1} I\!\left[|S_g(d_M)| = |S_g(d_i)|\right] I\!\left[U_M > U_i\right],$$

where $I[\,\cdot\,]$ is again the indicator function.

6. If $\alpha M$ is an integer, then the null hypothesis is rejected at level $\alpha$ whenever the MC p-value

$$\hat{p}_M = \frac{M - R_M + 1}{M}$$

is less than or equal to $\alpha$. Otherwise, the null hypothesis is maintained.

For a given, possibly small, number of random draws, this MC-S procedure allows one to control exactly the size of the test; i.e., if $\alpha M$ is an integer, then $P_0(R_M \ge M - \alpha M + 1) = \alpha$. As $M$ increases without bound, inference based on the MC-S procedure becomes equivalent to that based on the entire permutation distribution. See Dufour (2006) and Dufour and Khalaf (2001) for more on the technique of MC testing.

4. Simulation Experiments

This section reports the results of a simulation study in which the proposed MC-S test procedure is compared to the Campbell-Dufour test procedure. The model specifications considered in this section are similar to those studied in Campbell and Dufour (1997). In the first example, which is based on Mankiw and Shapiro (1986), $x_t$ is assumed to follow an autoregressive model. The relevant equations in this case are

$$y_t = \beta_0 + \beta_1 x_{t-1} + \varepsilon_t, \qquad x_t = \mu + \phi x_{t-1} + \eta_t, \tag{15}$$

for $t = 1,\dots,T$, where the vectors $(\varepsilon_1,\eta_1),\dots,(\varepsilon_T,\eta_T)$ are i.i.d. according to a standard bivariate normal distribution with correlation coefficient $\rho$. The coefficient $\rho$ determines the importance of feedback: when $\rho = 0$ there is no feedback from $\varepsilon_t$ to future values of the x-variable. To investigate the relative behavior of the non-parametric tests as the degree of tail thickness varies, error vectors were also generated by setting $\varepsilon_t = \rho\,\eta_t + e_t\sqrt{1-\rho^2}$, where $e_t$ and $\eta_t$ are i.i.d. according to a Student-t distribution with degrees of freedom equal to 3 or 1 (Cauchy). Although not necessary for the proposed MC-S test procedure, attention is restricted to symmetrically distributed error terms to ensure the theoretical validity of the Campbell-Dufour test procedure based on the Wilcoxon-type statistic. The initial value is generated as $x_0 = \mu + \eta_0$ under both the null and alternative hypotheses. The parameter values are $\beta_0 = \mu = 0$, $\rho = 0.9$, $\beta_1 = 0, 0.1, 0.2$, and $\phi = 0.9, 0.95, 1$. Although the parameter $\beta_0$ is set equal to zero, this is not assumed known, and the non-parametric procedures dealing with a possibly non-zero intercept are applied.
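The whole MC-S procedure is straightforward to code. The sketch below follows Steps 1-6 with the degenerate first-step interval of Remark 2 and the tie-breaking rule above, and applies it to one sample drawn from design (15) with normal errors. The helper names are mine; centering_g() implements the choice (11).

```python
import numpy as np

def centering_g(x):
    """Centering function (11): g_t = x_t - median(x_0, ..., x_t),
    with the convention m_hat_0 = 0 so that g_0 = x_0."""
    x = np.asarray(x, dtype=float)
    g = np.empty_like(x)
    g[0] = x[0]
    for t in range(1, len(x)):
        g[t] = x[t] - np.median(x[: t + 1])
    return g

def mc_s_pvalue(y, g_lag, M=200, seed=0):
    """MC-S p-value for H_0: beta_1 = 0, following Steps 1-6 above
    with the degenerate first-step interval of Remark 2."""
    rng = np.random.default_rng(seed)
    T = len(y)
    m = T // 2

    def stat(idx):
        # Remark 3: g_0, ..., g_{T-1} are kept in their original order.
        b = np.median(y[idx[:m]])
        z = (y[idx[m:]] - b) * g_lag[m:]
        return abs(np.where(z > 0, 1, -1).sum())

    s_obs = stat(np.arange(T))                       # Step 1
    s_perm = np.array([stat(rng.permutation(T))      # Steps 2-3
                       for _ in range(M - 1)])
    u = rng.uniform(size=M)                          # Step 4: tie-breakers
    R = (1 + np.sum(s_obs > s_perm)                  # Step 5
           + np.sum((s_obs == s_perm) & (u[-1] > u[:-1])))
    return (M - R + 1) / M                           # Step 6

# One sample from design (15): normal errors, beta_0 = mu = 0, rho = 0.9.
rng = np.random.default_rng(42)
T, phi, rho, b1 = 50, 0.95, 0.9, 0.1
eta = rng.standard_normal(T + 1)
eps = rho * eta[1:] + np.sqrt(1 - rho ** 2) * rng.standard_normal(T)
x = np.empty(T + 1)
x[0] = eta[0]                                        # x_0 = mu + eta_0
for t in range(1, T + 1):
    x[t] = phi * x[t - 1] + eta[t]
y = b1 * x[:-1] + eps
print(mc_s_pvalue(y, centering_g(x[:-1]), M=200))
```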

Each experiment is based on 1000 replications, and sample sizes T = 50, 100 are considered. For each data-generating configuration, the MC-S test procedure was implemented with $M = 200$, so that the null $H_0\colon \beta_1 = 0$ is rejected at the 5% level when $R_{200} \ge 191$. The two-step Campbell-Dufour procedure was implemented by first constructing an exact confidence interval for $\beta_0$ with level $1 - \alpha_1 = 0.99$ and then rejecting the null if the smallest absolute value of the non-parametric test statistic over the confidence interval was significant at level $\alpha_2 = 0.04$. The choice of $\alpha_1$ and $\alpha_2$ follows Campbell and Dufour (1997), where it is observed that the power of the two-step procedure increases as the confidence interval becomes wider. Even though Campbell and Dufour (1997) consider aligned sign and signed rank statistics based on the sample median, those are not considered here since the focus is on provably exact test procedures.

Tables 1 and 2 present the empirical rejection rates for the first example. The results in Table 1 are for the case where T = 50, while Table 2 reports those when the sample size is doubled to T = 100. The salient features are summarized as follows.

Table 1
Empirical rejection rates (in percentages) of the null hypothesis $H_0\colon \beta_1 = 0$ in model (15) with T = 50

                 Normal             t(3)               Cauchy
Test   β_1:   0.0   0.1   0.2    0.0   0.1   0.2    0.0   0.1   0.2
φ = 0.90
  SB          0.0   0.8   4.4    0.0   1.7   8.8    0.0  11.1  22.7
  SRB         0.3   3.6  13.4    0.4   6.5  19.6    0.8  22.1  38.0
  MC-S        5.0  16.2  32.3    4.5  21.7  44.2    5.1  61.7  75.4
φ = 0.95
  SB          0.1   0.6   6.3    0.0   1.6   8.8    0.0  10.8  18.8
  SRB         0.6   2.5  13.9    0.8   5.5  19.8    0.3  21.4  30.8
  MC-S        4.5  13.2  37.4    5.0  23.4  50.6    5.2  62.1  77.8
φ = 1.00
  SB          0.0   0.4   3.4    0.0   1.7   6.5    0.0   9.1  15.3
  SRB         0.2   1.8   6.9    0.1   3.9  14.7    0.0  14.0  23.7
  MC-S        4.9  17.7  41.3    4.7  32.4  56.5    5.0  67.0  81.6

Note: Nominal level is 5%. SB and SRB refer to the sign and signed rank bounds tests of Campbell and Dufour (1997). MC-S refers to the proposed split-sample permutation test.

Table 2
Empirical rejection rates (in percentages) of the null hypothesis $H_0\colon \beta_1 = 0$ in model (15) with T = 100

                 Normal             t(3)               Cauchy
Test   β_1:   0.0   0.1   0.2    0.0   0.1   0.2    0.0   0.1   0.2
φ = 0.90
  SB          0.1   5.4  27.9    0.4  10.0  42.7    0.1  57.2  76.1
  SRB         0.5  13.9  47.6    1.5  20.7  59.7    1.4  68.8  84.7
  MC-S        5.3  23.5  49.9    5.0  34.6  67.7    4.9  84.3  95.1
φ = 0.95
  SB          0.1   7.9  31.3    0.1  18.1  50.8    0.1  55.9  68.2
  SRB         1.2  18.3  49.7    0.4  28.9  64.7    1.1  68.6  79.5
  MC-S        4.8  30.8  59.4    5.2  47.7  76.4    4.5  88.9  95.1
φ = 1.00
  SB          0.1   6.2  27.0    0.2  13.1  33.9    0.0  39.5  50.3
  SRB         0.3  11.0  36.0    0.5  18.6  40.9    0.6  47.1  57.5
  MC-S        5.0  43.9  77.2    4.9  62.8  87.0    5.1  90.8  95.7

Note: Nominal level is 5%. SB and SRB refer to the sign and signed rank bounds tests of Campbell and Dufour (1997). MC-S refers to the proposed split-sample permutation test.

- The Campbell-Dufour tests are undersized, which can lead to substantial power losses. Consistent with the theory, the size of the MC-S test corresponds closely to the nominal value of 5%.

- The MC-S test outperforms the Campbell-Dufour tests in terms of power, often by a very wide margin. These results are all the more impressive considering that the MC-S test only uses information on signs and effectively uses only half the sample to detect departures from the null.

- As expected, the power of the MC-S test increases as the process governing the x-variable becomes more persistent. Curiously, the opposite occurs with the Campbell-Dufour tests. For example, when T = 100 and $\phi = 0.9$, the SRB test has a rejection rate of about 80% under Cauchy errors. That rate drops to nearly 50% when $\phi = 1$.

To see why the power of the Campbell-Dufour test procedure is a decreasing function of the persistence of the x-process, consider the first-step confidence interval $C_{\beta_0}(Y,\alpha_1)$. When $\beta_1 \ne 0$, $C_{\beta_0}(Y,\alpha_1)$ becomes a confidence interval for $\beta_0 + \beta_1 m(x_t)$, where $m(x_t)$ is the population median of $x_t$. If $\phi = 0$, then $m(x_t)$ equals the constant $\mu$. In that case, $C_{\beta_0}(Y,\alpha_1)$ is shifted away from $\beta_0$ depending on the sign and magnitude of $\beta_1 \mu$. As $\phi$ increases toward 1, $C_{\beta_0}(Y,\alpha_1)$ is more likely to include a centering value for the products $z_t(b) = (y_t - b)g_{t-1}$, $t = 1,\dots,T$, which decreases the probability of rejecting $H_0$. This phenomenon is illustrated in Figure 1, where (smoothed) empirical power functions are shown for values of $\phi$ in the interval [0.9, 1] under normally distributed error terms with $\rho = 0.9$, $\beta_1 = 0.2$, and T = 100. The figure clearly shows the power of the Campbell-Dufour tests declining as the x-process nears non-stationarity. The MC-S test does not have this power problem.

Figure 1. Empirical power functions: model (15) under normally distributed error terms with $\rho = 0.9$, $\beta_1 = 0.2$, and T = 100. SB and SRB refer to the sign and signed rank bounds tests of Campbell and Dufour (1997). MC-S refers to the proposed split-sample permutation test.

The second example studied is the random walk hypothesis in the context of a model of financial returns that allows for the presence of a time-varying risk premium. In the ARCH-in-mean class of models (Engle, Lilien, and Robins 1987), some function of the conditional variance appears as an explanatory variable in the conditional mean of returns, so that the mean return increases in proportion to the variance. Suppose $y_t$ represents the logarithm of an asset price and consider a specification in which the conditional mean of returns is a linear function of the conditional variance itself:

$$\Delta y_t = \beta_0 + \beta_1 y_{t-1} + \delta h_t + \varepsilon_t, \tag{16}$$

where $\varepsilon_t = \sqrt{h_t}\, v_t$, $h_t = a_0 + a_1 \varepsilon_{t-1}^2$, with $v_t$ i.i.d. N(0, 1) for $t = 1,\dots,T$; the initial values are $y_0 = \beta_0 + \delta h_0 + \varepsilon_0$ and $h_0 = a_0$. The errors in model (16) are governed by an ARCH(1) process, which has stationary moments of order 2 and 4 if $3a_1^2 < 1$. In that case, the terms $\varepsilon_t$ have kurtosis $3(1 - a_1^2)/(1 - 3a_1^2)$. This model therefore exhibits excess kurtosis, a well-known feature of financial series, as soon as $a_1 > 0$.

Luger (2003) also extended the non-parametric approach of Campbell and Dufour (1997) to testing for a random walk. Luger's (2003) sign and signed rank statistics are computed as

$$S_m = \sum_{t=1}^m s[\Delta_t] \tag{17}$$

and

$$W_m = \sum_{t=1}^m s[\Delta_t]\, R_t(|\Delta|), \tag{18}$$

where $\Delta_t = \Delta y_{t+m} - \Delta y_t$, for $t = 1,\dots,m$ ($m = T/2$), and $R_t(|\Delta|)$ is the rank of $|\Delta_t|$ when the elements of $|\Delta| = (|\Delta_1|,\dots,|\Delta_m|)$ are placed in ascending order. Luger (2003) shows that the use of the long (non-overlapping) differences, $\Delta_t$, eliminates the unknown drift parameter $\beta_0$ in a way that preserves the properties of the original errors. In particular, assuming that the density of $(\varepsilon_1,\dots,\varepsilon_T)$ is symmetric, he shows that when $\beta_1 = 0$: (i) the statistic $S_m$ defined by (17) is distributed according to a binomial distribution with number of trials $m$ and probability of success 1/2; and (ii) the statistic $W_m$ defined by (18) is distributed like $W(m) = \sum_{t=1}^m t B_t$, where as before $B_1,\dots,B_m$ are mutually independent uniform Bernoulli variables on $\{0, 1\}$.

In the case of the random walk hypothesis, Campbell and Dufour's sign test and the proposed split-sample permutation test are computed with $z_t(b) = (\Delta y_t - b)g_{t-1}$, where $g_t = y_t - \mathrm{median}(y_0,\dots,y_t)$; Campbell and Dufour's signed rank test uses $R_t(|\Delta Y - b|)$, the rank of $|\Delta y_t - b|$ when $|\Delta y_1 - b|,\dots,|\Delta y_T - b|$ are placed in ascending order.

The random walk with constant drift hypothesis corresponds to $\beta_1 = 0$ and $\delta = 0$ in model (16). The relative performance of the proposed method was investigated for parameter values $\beta_1 + 1 = 1, 0.99, 0.98$, $\delta = 0, 0.25, 0.5$, $a_1 = 0.9, 0.95, 0.99$, and sample sizes T = 500, 1000. The other parameters were held constant at values $\beta_0 = 0$ and $a_0 = 1$. The results are reported in Tables 3 and 4.

The power of Luger's (2003) tests depends by construction on the value of the drift parameter, $\beta_0$. Hence, $S_m$ and $W_m$ are seen in Tables 3 and 4 to have trivial power when $\delta = 0$, since the drift parameter is set equal to zero here.
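For illustration, the following sketch simulates model (16) and computes the statistics (17)-(18). The helper names are mine, and the difference $\Delta_t = \Delta y_{t+m} - \Delta y_t$ follows the reconstruction given above; parameter values are purely illustrative.

```python
import numpy as np

def simulate_arch_m(T, b0=0.0, b1=0.0, delta=0.0, a0=1.0, a1=0.9, seed=0):
    """Log prices from model (16): dy_t = b0 + b1 y_{t-1} + delta h_t + eps_t,
    with ARCH(1) errors eps_t = sqrt(h_t) v_t, h_t = a0 + a1 eps_{t-1}^2."""
    rng = np.random.default_rng(seed)
    y = np.empty(T + 1)
    h = a0                                   # h_0 = a_0
    eps = np.sqrt(h) * rng.standard_normal()
    y[0] = b0 + delta * h + eps              # y_0 = beta_0 + delta h_0 + eps_0
    for t in range(1, T + 1):
        h = a0 + a1 * eps ** 2
        eps = np.sqrt(h) * rng.standard_normal()
        y[t] = y[t - 1] + b0 + b1 * y[t - 1] + delta * h + eps
    return y

def luger_stats(y):
    """S_m and W_m in (17)-(18) from Delta_t = dy_{t+m} - dy_t, t = 1..m."""
    dy = np.diff(y)
    m = len(dy) // 2
    d = dy[m:2 * m] - dy[:m]                 # non-overlapping differences
    s = (d <= 0).astype(int)                 # s[z] = 0.5(1 - sign[z])
    ranks = np.abs(d).argsort().argsort() + 1
    return s.sum(), (s * ranks).sum()

S_m, W_m = luger_stats(simulate_arch_m(500, delta=0.25, a1=0.95, seed=3))
```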

Table 3
Empirical rejection rates (in percentages) of the null hypothesis $H_0\colon \beta_1 = \delta = 0$ in model (16) with T = 500

                a_1 = 0.90          a_1 = 0.95          a_1 = 0.99
Test  β_1+1:  1.0   0.99  0.98    1.0   0.99  0.98    1.0   0.99  0.98
δ = 0.0
  SB          0.2   2.5    9.0    0.1   3.4    9.7    0.3   4.3   12.0
  SRB         0.4   4.1   17.2    0.2   7.2   18.0    0.8   8.2   20.8
  S_m         4.6   4.2    4.5    4.7   5.1    5.9    4.4   5.3    5.5
  W_m         5.0   2.8    2.6    5.1   3.1    4.2    4.9   3.3    3.2
  MC-S        4.7  11.3   24.1    4.9  12.2   23.4    4.8  14.7   26.0
δ = 0.25
  SB          0.0   3.0   16.6    0.0   5.1   20.6    0.0   7.6   28.7
  SRB         0.0   4.9   20.8    0.0   7.9   25.6    0.0  10.4   32.8
  S_m         8.2  62.6   51.1    7.7  67.7   58.1    8.3  72.6   61.6
  W_m        12.2  67.6   50.5   13.4  70.6   57.3   15.6  74.1   61.4
  MC-S        7.2  65.0   69.0    6.7  69.5   75.2    8.1  73.6   83.2
δ = 0.5
  SB          0.0   5.1   22.7    0.0   6.8   28.6    0.0   9.0   35.0
  SRB         0.0   6.6   27.6    0.2   9.1   36.4    0.3  11.0   40.9
  S_m        12.1  81.1   65.5   12.3  81.2   66.2   13.4  81.7   68.8
  W_m        18.6  83.6   67.0   24.2  82.4   66.0   25.1  83.6   69.7
  MC-S       13.1  78.8   82.9   12.8  82.3   84.8   13.9  82.0   90.9

Note: Nominal level is 5%. SB and SRB refer to the sign and signed rank bounds tests of Campbell and Dufour (1997). S_m and W_m refer to the sign and signed rank tests of Luger (2003). MC-S refers to the proposed split-sample permutation test.

Table 4
Empirical rejection rates (in percentages) of the null hypothesis $H_0\colon \beta_1 = \delta = 0$ in model (16) with T = 1000

                a_1 = 0.90          a_1 = 0.95          a_1 = 0.99
Test  β_1+1:  1.0   0.99  0.98    1.0   0.99  0.98    1.0   0.99  0.98
δ = 0.0
  SB          0.4  14.1   40.4    0.1  17.2   47.5    0.7  19.5   49.9
  SRB         0.5  22.2   59.3    0.3  24.3   64.7    1.0  28.8   67.1
  S_m         4.7   5.3    5.4    4.4   6.9    7.3    5.2   9.0    8.6
  W_m         4.5   3.0    2.4    4.8   4.1    4.2    4.7   5.1    5.6
  MC-S        4.7  25.1   49.4    4.5  29.6   55.3    4.9  33.1   55.4
δ = 0.25
  SB          0.0  26.5   66.5    0.0  35.1   76.7    0.0  44.6   83.3
  SRB         0.0  28.8   66.8    0.0  39.4   77.0    0.0  48.8   84.0
  S_m         7.6  62.1   56.1    8.4  69.6   57.5    8.2  72.8   65.8
  W_m        12.8  62.2   57.0   15.0  71.7   61.7   15.0  74.3   68.3
  MC-S        6.8  83.1   93.2    6.9  89.3   97.0    7.8  93.8   97.5
δ = 0.5
  SB          0.0  36.0   83.8    0.0  42.0   88.4    0.0  48.6   90.2
  SRB         0.8  41.5   82.4    5.1  46.8   86.0   11.8  54.7   89.0
  S_m        12.9  74.9   64.5   11.2  79.0   71.9   14.0  82.2   71.6
  W_m        20.2  77.4   66.3   20.5  80.0   74.2   23.1  83.1   73.1
  MC-S       11.7  91.7   98.8   12.4  95.2   99.7   12.8  96.7   99.4

Note: Nominal level is 5%. SB and SRB refer to the sign and signed rank bounds tests of Campbell and Dufour (1997). S_m and W_m refer to the sign and signed rank tests of Luger (2003). MC-S refers to the proposed split-sample permutation test.

Against the stationary alternatives considered here, the probability of rejecting the null is an increasing function of $\beta_0$ and $\beta_1 + 1$, except at $\beta_1 = 0$ (the null). When $\beta_0 = 0$ and $\beta_1 + 1 < 1$, power is lost because the stationary distribution of $y_t$ is symmetric, such that $\Delta_t$ has median zero. This is revealed in Tables 3 and 4, where power declines as $\beta_1 + 1$ takes on values further from 1, even when $\delta \ne 0$. Note, however, that those tests will have non-trivial power against non-stationary alternatives even if the drift is identically zero. Luger (2003) modifies the definition of $\Delta_t$ to improve power with respect to $\beta_0$. However, no analytical results establish the finite-sample validity of the resulting tests.

The MC-S test does not suffer from this problem. Indeed, Tables 3 and 4 show power increases in every direction away from the null. Further, that rate of increase is greater than that of Campbell and Dufour's tests. For example, when $a_1 = 0.90$, $\beta_1 + 1 = 0.98$, and T = 1000 (Table 4, column 3), the power of the MC-S test increases from about 50% to nearly 100% as $\delta$ increases from 0 to 0.5, while the power of the SRB test only increases from about 60% to 82% (and further increases to only 89% as $a_1$ increases to 0.99).

5. Empirical Illustration

A large body of empirical literature has examined whether asset prices might help predict real economic activity. In particular, interest rates and interest rate (or yield) spreads, the difference between long- and short-term interest rates, have received considerable attention. Harvey (1988, 1989) finds that the real term structure contains information about future consumption and output growth. Stock and Watson (1989) find that nominal interest rate spreads are important leading indicators of economic activity. Estrella and Hardouvelis (1991) find that the yield spread between ten-year Treasury bonds and three-month Treasury bills is a useful predictor of future growth in output, consumption, and investment. Estrella and Mishkin (1997) confirm that the basic result of Estrella and Hardouvelis (1991) continues to hold in a number of European countries as well as in the United States. Related evidence about the predictive power of interest rate spreads for future economic activity is also found in Mishkin (1990, 1991), Jorion and Mishkin (1991), Kozicki (1997), Hamilton and Kim (2000), and Estrella (2005).

Ang, Piazzesi, and Wei (2005) develop a no-arbitrage model for the dynamics of the term structure of interest rates jointly with output growth. Their model provides a complete characterization of expectations of future output growth. Further, their model predicts that the nominal short rate contains more information about future output growth than any yield spread.

The goal of this section is to illustrate the non-parametric tests in this context. Contrary to Ang, Piazzesi, and Wei, the yield spread is found to be a statistically significant predictor of future output growth, while the nominal short rate is not.

Figure 2. Quarterly GDP growth (solid line), with yield spread (top panel) and short-term interest rate (bottom panel) lagged 1 period.

The data consist of zero-coupon U.S. Treasury bill and bond yields from the Center for Research in Securities Prices for the period covering April 1960 to December 2004. The three-month rate is taken as the short rate, and the slope of the yield curve is measured as the difference between the five-year and the three-month rates. The quarterly rates used are averages of the monthly rates. The quarterly data on real GDP are seasonally adjusted in billions of chained 2000 dollars, obtained from the Federal Reserve Bank of St. Louis Web site. Annualized output growth from quarter to quarter is defined as $y_t = 400\,(\log \mathrm{GDP}_t - \log \mathrm{GDP}_{t-1})$. These data definitions are similar to those in Ang, Piazzesi, and Wei (2005).
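The output growth series is straightforward to construct; a minimal sketch, with gdp standing for a hypothetical array of quarterly real GDP levels:

```python
import numpy as np

def output_growth(gdp):
    """Annualized quarterly growth: y_t = 400 (log GDP_t - log GDP_{t-1})."""
    gdp = np.asarray(gdp, dtype=float)
    return 400.0 * np.diff(np.log(gdp))
```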

The predictive regression model of interest is that in (2), where $x_{t-1}$ is either the short rate or the yield spread. Figure 2 plots one-quarter GDP growth, $y_t$, with the short rate and the yield spread, both lagged one quarter. There are 178 observations once the first lag is allowed for, covering the period 1960Q3 to 2004Q4.

Table 5 reports the results of the non-parametric tests, where the entries are two-sided p-values, in percentages, of the null hypothesis of no predictability. Results are reported for both the short-term interest rate (first column) and the yield spread (second column) as predictor. The table also reports the value of the sample correlation between the predictor variable and output growth ($\hat{\rho}$) and the estimated coefficient $\hat{\phi}$ in the autoregression $x_t = \mu + \phi x_{t-1} + \eta_t$.

Table 5
Output growth predictability: non-parametric test results

Test    Short rate   Yield spread
SB         100.00        13.38
SRB         99.94         3.17
MC-S        78.10         5.20
ρ̂           -0.27         0.24
φ̂            0.95         0.83

Note: Entries on the first 3 lines are two-sided p-values (in %) of tests of the null of no predictability.

The results clearly show that the short-term interest rate has no predictive power for output growth: the p-values indicate that the corresponding test statistics are far from significant. When the yield spread appears as the explanatory variable, the p-values change dramatically. The greatest change occurs with the SRB test: the p-value drops from about 100% to 3.2%. The MC-S test confirms the significance of the yield spread as a predictor of future GDP growth, without the symmetry assumption required for the validity of the SRB test procedure. These non-parametric results refute the theoretical prediction of Ang, Piazzesi, and Wei (2005) that the nominal short rate contains more information about future output growth than any yield spread. It should be emphasized that these results are valid under far more general distributional assumptions than those required for parametric inference.

6. Conclusion

The Campbell and Dufour (1997) test procedure combines an exact non-parametric confidence interval for the parameter $\beta_0$ with conditional non-parametric tests linked to each point in the confidence interval. Their approach yields finite-sample generalized bounds tests; i.e., tests for which the probability of a Type I error is less than or equal to $\alpha$.

The procedure presented in this paper extends the Campbell-Dufour approach based on sign statistics. It is well known that sign statistics are the only statistics which can produce valid tests for hypotheses about a median under sufficiently general distributional assumptions, allowing for non-normal, possibly heteroscedastic observations (Lehmann and Stein 1949; Pratt and Gibbons 1981, p. 218). Critical values for such tests are determined by assigning random signs to the absolute values of the observations. The approach here exploits this result by considering the signs of the permuted sample.

The proposed test procedure builds on two key exchangeability properties for the signs of the permuted sample under the null hypothesis. The first property (Proposition 1) is that the signs of the observations are exchangeable when they are centered at the true value of $\beta_0$. The second property (Proposition 2) is that a null permutation distribution can be generated for the first-step Campbell-Dufour confidence interval, which is derived by inverting the acceptance region of a sign statistic, by splitting the sample in two subsamples of observations. By combining these two properties, Proposition 3 establishes an exact null permutation distribution for a class of sign statistics. A remarkable feature of this result is that it does not depend on the level of the first-step confidence interval, so that the sample median may be taken as a (degenerate) confidence interval. Proposition 3 is then used with a Monte Carlo resampling technique to obtain an exact and computationally inexpensive inference procedure.

The result is a new procedure which yields tests of orthogonality and random walk for which the probability of a Type I error is exactly $\alpha$. Further, the proposed test procedure has all the virtues of that in Campbell and Dufour: it remains exact in the presence of general forms of feedback, non-normality, conditional heteroscedasticity, and non-linear dependence. These robustness features are particularly appealing when one deals with financial time series.

One could not say a priori whether the new procedure would yield more power. The procedure does indeed eliminate the bound from the sign-based Campbell-Dufour approach, but the cost is that only half the sample is effectively used to detect departures from the null.