Multivariate Signed-Rank Tests in Vector Autoregressive Order Identification

Size: px

Start display at page:

Download "Multivariate Signed-Rank Tests in Vector Autoregressive Order Identification"

Imogen Marsh
6 years ago
Views:

1 Statistical Science 2004, Vol 9, No 4, DOI 024/ Institute of Mathematical Statistics, 2004 Multivariate Signed-Ran Tests in Vector Autoregressive Order Identification Marc Hallin and Davy Paindaveine Abstract The classical theory of ran-based inference is essentially limited to univariate linear models with independent observations The objective of this paper is to illustrate some recent extensions of this theory to time-series problems serially dependent observations in a multivariate setting multivariate observations under very mild distributional assumptions mainly, elliptical symmetry; for some of the testing problems treated below, even second-order moments are not required After a brief presentation of the invariance principles that underlie the concepts of rans to be considered, we concentrate on two examples of practical relevance: the multivariate Durbin Watson problem testing against autocorrelated noise in a linear model context and 2 the problem of testing the order of a vector autoregressive model, testing VARp 0 against VARp 0 + dependence These two testing procedures are the building blocs of classical autoregressive order-identification methods Based either on pseudo-mahalanobis Tyler or on hyperplane-based Oja and Paindaveine signs and rans, three classes of test statistics are considered for each problem: statistics of the sign-test type, 2 Spearman statistics and 3 van der Waerden normal score statistics Simulations confirm theoretical results about the power of the proposed ran-based methods and establish their good robustness properties Key words and phrases: Rans, signs, Durbin Watson test, interdirections, elliptic symmetry, autoregressive processes RANKS, SIGNS AND SEMIPARAMETRIC MODELS Ran-Based Methods: From Nonserial Univariate to Multivariate Serial Ran-based methods for a long time have been essentially limited to statistical models that involve univariate independent observations Save a few exceptions such as testing against bivariate dependence, tests based on runs, tests for scale or goodness-of-fit methods that do not address any specific alternative, classical monographs mainly deal with single-response linear models with independent errors: one- and two- Marc Hallin and Davy Paindaveine are Professors, ISRO, ECARES and Département de Mathématique, Université Libre de Bruxelles, Brussels, Belgium mhallin@ulbacbe, dpaindav@ulbacbe sample location, analysis of variance, regression and so forth The need for non-gaussian, distribution-free and robust methods is certainly no less acute in problems that involve multivariate and/or serially dependent timeseries data Ran-based methods for multivariate observations attracted much attention in the late fifties and the sixties, leading to a fairly complete theory of hypothesis testing based on componentwise rans A unified account of this line of research is given in the monograph by Puri and Sen 97 Componentwise rans, however, are not affine-invariant and hence they crucially depend on the often arbitrary choice of a coordinate system; as a consequence, they cannot yield distribution-free statistics The resulting tests are permutation tests However, if invariance and distribution-freeness are lost, there is little reason to consider permutations of componentwise ran vectors rather than permutations of the observations them- 697

2 698 M HALLIN AND D PAINDAVEINE selves The resulting theory, therefore, is not entirely satisfactory Interest in an adequate generalization of rans and signs for multivariate observations still in the independent case was revived in the nineties with a series of papers by Oja, Randles, Hettmansperger and their collaborators: see Oja 999 for a review The signs and rans we consider herein belong to this vein, and we refer to Section 3 for details Despite the fact that some of the earliest and most classical ran tests such as runs tests and turning point tests were of a genuine serial nature, no systematic and coherent theory of serial ran-based statistics was constructed until the mid-eighties The reason for this late interest is probably the confusing idea that since rans are intimately related with independence or, at least, exchangeability, they are inherently confined to the analysis of independent observations This idea, however, does not resist closer examination, since rans, whatever their definition, always should be computed from a series of residuals that reduce to white noise under some null hypothesis to be tested Serial statistics based on the rans of univariate observations or residuals were considered in a series of papers Hallin, Ingenblee and Puri, 985; Hallin and Puri, 988, 99, 994; see Hallin and Puri 992 for a review of ran-based testing in a univariate autoregressive moving average ARMA context The purpose of this paper is to combine these two extensions of the classical theory: time-series in a multivariate setting Rather than give a general exposition for which we refer to Hallin and Paindaveine, 2004a, 2005, we concentrate on two important particular problems: a multivariate version of the classical Durbin Watson test and 2 the tests that allow for autoregressive order identification, namely, the problem of testing VARp 0 against VARp 0 + dependence which reduces to the Durbin Watson problem for p 0 = 0 In both cases, we limit ourselves to constant, linear and normal ran-weighting functions the so-called score functions, which yield test statistics of the sign, Spearman and van der Waerden types, respectively 2 From Classical Univariate Signed Rans to Multivariate Signs and Rans Denote by Z n,,zn n an n-tuple of univariate iid random variables with common density f satisfying the symmetry assumption f z = fz, z Z, and consider the group G ={g n g } of transformations g n g : Z n,,zn n g n n g Z,,Zn n := g Z n,,g Z n, where g : R R is antisymmetric [g z = gz], continuous and order-preserving [z <z 2 gz < gz 2 ] The vector of signed rans s n Rn +;,, s n n R n +;n,wheresn t := I n [Z t >0] I [Z t n stands <0] for the sign of Z t n and R n +;t denotes the ran of Z t n among Z n,, Zn n, constitutes up to a factor ± a maximal invariant for G This means that, beyond the fact that the signed rans are invariant statistics [which means they tae the same value in the transformed sample g n g Z n,,zn n as in the original sample Z n,,zn n for all g], any invariant statistic can be expressed as a function of the signed rans The invariance principle, which says one should restrict to invariant test statistics, therefore naturally leads to tests based on the signed rans Thans to the fact that the group G generates the set of all possible symmetric densities f, the resulting signed-ran tests are distributionfree Similarly, denote by Z n,,zn n an n-tuple of -dimensional iid random vectors with common density f The univariate assumption of symmetry is replaced by the assumption of elliptical symmetry We say that a random vector Z, with density f = f,f, is elliptically symmetric if there exist a symmetric, positive definite matrix and a function f : R + 0 R+ 0 satisfying 0 r frdr<, with 2 f,f z = c,f det /2 f /2 z, where c,f is a normalizing constant and /2 z :=z z /2 n z R, denotes the norm of z in the metric associated with [we write /2 for the unique upper-triangular array with positive diagonal elements satisfying = /2 /2 ] The contours of f,f clearly are a family of ellipsoids centered at the origin, the shape of which is characterized by the matrix ; the nonnegative function f will be called a radial density, although it does not integrate to Note that need not

3 MULTIVARIATE SIGNED-RANK TESTS 699 be the covariance matrix of Z; the ran-based Durbin Watson tests we are describing in Section 3 do not even require finite second-order moments to exist In practice, of course, both and f remain unspecified nuisance parameters The multivariate generalizations of signed rans we are now considering are based on arguments of invariance with respect to the group G of continuous orderpreserving radial transformations a direct extension to the multivariate setting of the group G above and the group G a of affine transformations acting on R Let d t = d n /2 Z n t /d n := := /2 Z n t Then U n is the unit vector that points in the direction of the sphericized vector /2 Z n t Clearly, if Z n t has density 2, then the density of /2 Z n t is constant over the spheres centered at 0 thisiswhy we call it sphericized, while U n unit sphere S in R,justass n is uniform over the t in the univariate setting is uniform over S 0 ={, }, the unit sphere in R For each,definethegroup of continuous orderpreserving radial transformations G n ={gn g } with [cf with above] g n g : Z,,Z n g n g Z,,Z n := g d n ; /2 U n ;,,g d n ;n /2 U n ;n, where g : R + R + is a continuous, strictly increasing function such that g0 = 0 and lim r gr = The transformation g n g is radial in the sense that, under the action of g n g, the residuals Z t = d /2 U move along a half line running through the origin in R This group is a generating group for the fixed- submodel and, quite analogous to the univariate case, a maximal invariant for this group is the couple U n, Rn, where the matrix Un = U n ;,,Un ;n collects the signs of the observations and R n = Rn ;,,Rn ;n is the vector of the rans R n of dn among dn ;,,dn ;n, t =,,n Similarly, the group G a of affine transformations of R generates the fixed-f submodel Indeed, Z and Z 2 have elliptical densities f,f and f 2,f, d /2 respectively, iff Z 2 = 2 /2 Z,where /2 2 /2 clearly belongs to G a = d denotes equality in distribution In view of this, U n and R n can be considered as multivariate generalizations of the usual signs and rans of absolute values, respectively We refer to Hallin and Paindaveine 2003 for a characterization of the testing problems for which this invariance approach maes sense Of course, when is unspecified, these multivariate signs and rans cannot be computed from the observations Z n t In Sections 2 and 22, we describe two different empirical reconstructions of these multivariate signs and rans 2 Pseudo-Mahalanobis signs and rans: The Tyler signs and rans The most natural way to deal with the nonspecification of consists of replacing U n and Rn with Un and Rn, respectively, where = n is some reasonable estimator of : namely, we require to be root-n consistent so that this replacement asymptotically has only limited effect and affine-equivariant to ensure the affine invariance of the resulting test statistics A possible choice for is the empirical covariance matrix of the Z n t s, but this estimate is nown to be highly nonrobust and its consistency requires finite moments of order 2 We therefore suggest using Tyler s 987 estimator of shape This estimator is defined as Tyl := C Tyl,whereC Tyl is the unique upper-triangular matrix with nonnegative diagonal and upper left element such that CTyl Z t CTyl Z t 3 = n C Tyl Z t C Tyl Z t I t= Tyl C I stands for the identity matrix This estimate thus is such that the empirical covariance of the resulting signs U n coincides with the covariance matrix I of the uniform distribution over the unit sphere S It is affine-equivariant and, under the assumption that the Z n t s are iid with density 2, it can be shown without maing any moment assumption that is root-n consistent for a, wherea is some positive constant The resulting Tyler signs U n are strictly equivariant under both G and G a,butthetyler rans R n are invariant under G a only However, it can be shown that U n Un and R n Rn are o P as n, so that, although the rans R n are not invariant under G, they are at least asymptotically invariant under G, in the sense that they are asymptotically equivalent to the strictly invariant exact rans R n When the choice of is not imposed, we use the somewhat heavier terminology pseudo-mahalanobis signs and pseudo-mahalanobis rans

4 700 M HALLIN AND D PAINDAVEINE 22 Hyperplane-based signs and rans Another approach to reconstructing the exact signs U n and the exact rans R n is based on counts of hyperplanes For the signs, the idea is due to Randles 989 For any pair Z n t, Z n t 2, t t 2 n, consider the n 2 hyperplanes going through the origin and out of the n 2 remaining Z n t s t t t 2 Define the interdirection c t n t 2 as the number of such hyperplanes that separate Z n and Z n t 2 see Figure t for an illustration in the bivariate case Interdirections are invariant under the affine group G a and under the group G of radial transformations, irrespective of Due to this invariance, it is intuitively clear that πp t n t 2 := πc t n t 2 / n 2 is a consistent estimate of the angle arc cosu U 2 between U n and U n 2 Interdirections thus allow for a reconstruction of those angles equivalently, a reconstruction of their cosines U U 2, since the U s are unit vectors: quite remarably, they do the same job, with the same invariance properties, as the Tyler cosines U U 2, but require no estimation of The respective advantages of Tyler angles and Randles interdirections are discussed in Hallin and Paindaveine 2002c The hyperplane-based cosines p t n t 2 are sufficient for the first problem we treat Section 3 For the second problem Section 4, we need the slightly more informative concept of absolute interdirections Hallin and Paindaveine, 2004b, 2005 The basic idea is exactly the same and the same hyperplanes are taen into account as before However, instead of counting the number of hyperplanes that separate Z n t now count the number c n t;i and Z n t 2,we of hyperplanes that separate Z n t and the transformed unit vectors /2 u i, FIG An illustration for Randles interdirections in the bivariate case The interdirection associated with Z 4 and Z 5 is c 45 = 2 in this small sample of size n = 5[two separating hyperplanes out of a total of 3 = 3 to be considered] FIG2 An illustration for lift interdirections in the bivariate case The lift interdirection associated with Z 4 is l 4 = 2 within this small sample of size n = 4[two separating hyperplanes out of a total of 32 = 3 hyperplanes to be considered] i =,,,whereu,,u forms the canonical basis of R Then, for the same reasons as above, πp n t;i := πc n t;i / n allows for a consistent estimation of the angles arc cosu u i, i =,,,sothat the vectors cosπp t;i, i =,,are consistent estimators of the signs U themselves Absolute interdirections are invariant under the group of radial transformations; however, they are only asymptotically affine-equivariant in the sense that they converge to strictly equivariant quantities Along with the hyperplane-based concepts of signs just described, we propose using a hyperplane-based concept of rans introduced by Oja and Paindaveine 2004 This concept relies on the so-called lift interdirections For any Z n t, consider the n hyperplanes going s t t through out of the n remaining Z n t The lift interdirection l n associated with Z n t t is defined as the number of such hyperplanes that separate Z n t and Z n t see Figure 2 for an illustration in the bivariate case Lift interdirections can be shown to converge to some monotone increasing function of the distances d n, so that their rans converge to the exact rans R n Again, we are able to reconstruct, as n, a quantity that depends on the unspecified shape matrix without estimating it When used in the procedures described below, the lift interdirection rans are those associated with a symmetrized version of lift interdirections see Oja and Paindaveine, 2004, for details 2 THE GENERAL LINEAR MODEL WITH VECTOR AUTOREGRESSIVE ERRORS The model we are considering throughout is the -variate general linear model with vector autoregressive VAR error terms [the more general case of vector

5 MULTIVARIATE SIGNED-RANK TESTS 70 autoregressive moving average VARMA errors could be treated as well; we restrict to the VAR case for the sae of simplicity] Under this model, the observation is an n-tuple Y Y, Y,2 Y, Y n := := Y n, Y n,2 Y n, of -variate random vectors that satisfies 4 Y n = X n β + V n, where x, x,2 x,m x X n := := x n, x n,2 x n,m x n and β, β,2 β, cβ β := := β m, β m,2 β m, β m denote an n m matrix of constants the design matrix and the m regression parameter, respectively Instead of the traditional assumption that the error term Y n V V, V,2 V, V n := := V n, V n,2 V n, V n is white noise, we rather assume V t, t =,,n to be a finite realization of length n ofthevarp process generated by p 5 V t = A i V t i + ε t, t Z, i= where {ε t t Z} is a -dimensional white-noise process with elliptical density 2 Under 4 and 5, 6 t Y t = β x t + G u ε t u + r t, u=0 t =,,n, with matrices G u the Green s matrices of the VAR operator characterized by the linear recursion G u = p i= A ig u i, u Z, and initial conditions G 0 = I, G = G 2 = =G p+ = 0 The remainder term r t is related to the influence of the unobserved initial values V 0,,V p+ It is easy to see that, under the traditional VAR stationarity assumptions, lim t t r t is bounded in probability, where < is the modulus of the smallest root of the characteristic polynomial associated with 5 Letting θ := vec β, vec A,,vec A p R m+2p =: R K, we write P n θ,,f for the probability distribution of the observation Y n under 6 3 RANK-BASED DURBIN WATSON TESTS 3 The Gaussian Durbin Watson Test Consider the first-order version p = of the general model described in Section 2 Writing A instead of A, 6 taes the form Y t = β t 7 x t + A u ε t u + A t V 0, t =,,n u=0 The Durbin Watson testing problem deals with the null hypothesis that V t is white noise, that is, that A = 0 Under this hypothesis, the observations are serially independent, of the form Y t = β x t + ε t The regression parameter β, as well, of course, as the underlying elliptic density the shape matrix and the radial density f of ε t, remain unspecified The multivariate version of the traditional Gaussian Durbin Watson procedure relies on the following test statistic Denote by ˆβ n N := X X X Y the usual least squares estimate of β and denote by Z t := Y t ˆβ n N x t the corresponding estimated residuals Write N := n nt= Z t Z t for the empirical residual covariance matrix The null hypothesis of serially independent errors is rejected at asymptotic level α whenever 8 Z s s,t=2 = n n t=2 W n DW := n N Z tz s N Z t /2 N Z tz t /2 N 2 exceeds the α quantile χ 2 of the chi-squared 2 ; α distribution with 2 degrees of freedom [ M := i,j= M ij 2 /2 stands for the Euclidean norm of the matrix M = M ij ] Being the sum of all residual squared cross-correlation coefficients at lag, this test statistic has a clear intuitive interpretation: in the univariate case, it reduces to the squared residual autocorrelation coefficient of order 32 Multivariate Signed-Ran Durbin Watson Tests The Gaussian test just described requires finite second-order moments, whereas the signed-ran tests

6 702 M HALLIN AND D PAINDAVEINE we now consider remain valid under arbitrarily heavy tails: only finite radial Fisher information 0 [ f / f r] 2 r f r dr/ 0 r frdr is required Any consistent sequence of estimates of β can be substituted for the Gaussian ˆβ n N [consistency here means consistency under the null hypothesis at the appropriate optimal rate ; the definition of this rate depends on the asymptotic behavior of the regression constants; see Hallin and Paindaveine, 2005, Section 2] If, however, the tests are to remain valid under infinite second-order moments, robust estimators that resist heavy-tailed distributions such as the M estimators proposed by Davis and Wu 997 should be used; denote by ˆβ n such an estimator The residuals associated with ˆβ n are obtained as in Section 3 Denote by U n t and R t n the sign and the ran among Z,,Z n, respectively, of the residual Z t In principle, any combination of a pseudo-mahalanobis or hyperplane-based sign with a pseudo-mahalanobis or hyperplane-based ran can be considered four possibilities, thus However, hybrid statistics that mix the two types Tyler signs, eg, with lift-interdirection rans are somewhat incoherent, so we restrict ourselves to combining signs and rans of the same type either pseudo-mahalanobis or hyperplane-based; we use the same notation for both cases We concentrate on three versions of signed-ran Durbin Watson statistics: A multivariate Durbin Watson statistic of the signtest type, n DW;sign W := 2 U s n U tu s U t s,t=2 9 = 2 2 U n t U t t=2 2 A multivariate Durbin Watson statistic of the Spearman type, n DW;Sp W := 9 2 n n + 4 R s n R n s Rn t R n t s,t=2 0 U s U tu s U t 9 2 = n n + 4 R n t R n 2 t U tu t t=2 3 A multivariate Durbin Watson statistic of the van der Waerden type, W n DW;vdW := n s,t=2 R n s n + R n t n + n R s n + U s U tu s U t = n R t n n + t=2 n R t U t U t n + n R t n + 2, where, denoting by F χ 2 u the quantile function of the chi-squared variable with degrees of freedom, u := F χ 2 u, u ]0, [ In all cases, the null hypothesis of serially independent errors is rejected whenever the test statistic exceeds the α quantile of a chi-squared distribution with 2 degrees of freedom 33 Asymptotic Relative Efficiencies The asymptotic relative efficiencies ARE; with respect to the traditional Gaussian procedure described in Section 3 of the signed-ran tests in Section 32 were derived by Hallin and Paindaveine 2005, who also established a multivariate serial version of the classical Chernoff Savage result This result shows that the asymptotic relative efficiency [with respect to the Gaussian procedure based on 8] of the van der Waerden tests list item 3 based on is uniformly larger than Some of these ARE values are reported in Table for several elliptic Student distributions and several dimensions of the observation space Note that the elliptical Student distributions considered have strictly more than 2 degrees of freedom in order for the Gaussian procedure to be valid

7 MULTIVARIATE SIGNED-RANK TESTS 703 TABLE AREs with respect to the Gaussian procedure of the sign-type S, Spearman-type Sp and van der Waerden-type vdw Durbin Watson tests under various -variate Student and normal densities, =, 2, 4, 6, 0 Degrees of freedom of the underlying t density Test S Sp vdw S Sp vdw S Sp vdw S Sp vdw S Sp vdw Numerical Study 34 Size and power To study the size and power of the Durbin Watson tests described in Sections 3 and 32, we generated N = 000 independent samples ε,,ε 650 of size n = 650 from various bivariate spherical densities, with mean zero and identity covariance matrix the bivariate normal and bivariate Student distributions with, 3 and 8 degrees of freedom From each of these samples, we constructed a series of 650 observations Y,,Y 650 characterized by the linear models 2 Y t = β I [50 t 575] + β 2 I [576 t 650] + V t, V t mav t = ε t, m= 0,, 2, with initial value V 0 = 0, β = 0, β2 = 0 and A = Dropping observations Y through Y 500 this warming up period of 500 observations allows approximate stationarity to be achieved, we performed on the remaining n = 50 observations Y,,Y 50 := Y 50,,Y 650 the following seven Durbin Watson tests at asymptotic probability level α = 5%: a The Gaussian Durbin Watson test based on 8 b The sign-test-type Durbin Watson test based on 9 with Tyler signs b2 The sign-test-type Durbin Watson test based on 9 with hyperplane-based signs interdirections c The Spearman Durbin Watson test based on 0 with Tyler signs and rans c2 The Spearman Durbin Watson test based on 0 with hyperplane-based signs and rans d The van der Waerden Durbin Watson test based on with Tyler signs and rans d2 The van der Waerden Durbin Watson test based on with hyperplane-based signs and rans In the Gaussian test a, the least squares estimator ˆβ N := Y t Y t t= t=76 was used for β = β β 2, while in the ran-based procedures, the location center of each group was estimated by the multivariate affine-equivariant median introduced by Hettmansperger and Randles 2002; the latter is root-n consistent and, consequently, the resulting ran-based procedures are valid without any assumptions on the tails of the underlying densities so that, unlie the Gaussian test, the ran-based tests are valid under the t distribution The Tyler estimate Tyl was computed from the algorithm of Randles 2000

8 704 M HALLIN AND D PAINDAVEINE TABLE 2 Rejection frequencies out of N = 000 replications under various values ma, m = 0,, 2[cf 2], of the autoregression matrix and various innovation densities of the Gaussian parametric Durbin Watson test φ N, the Tyler signed-ran van der Waerden φ vdw, Spearman φ Sp and sign φ S Durbin Watson tests, and their hyperplane-based counterparts φvdw h, φh Sp and φh S ; the series length is 50 Autoregression matrix Autoregression matrix Innovation Innovation Test density 0 A 2A ARE density 0 A 2A ARE φ N N t φ vdw φ S φ Sp φvdw h φs h φsp h φ N t t Undefined φ vdw Undefined φ S Undefined φ Sp Undefined φvdw h Undefined φs h Undefined φsp h Undefined Iterations were stopped as soon as the Frobenius norm of the difference between the two members of 3 fell below 0 6 Rejection frequencies are reported in Table 2 The corresponding individual confidence intervals for N = 000 replications, at confidence level 095, have half-widths 004, 0025 and 003 for frequencies of the order of , and 050, respectively It appears that, under the null hypothesis, none of the rejection frequencies significantly differs from the nominal 5% level All tests thus apparently are valid and unbiased even the Gaussian test under the Cauchy density, although in principle it is not valid Except for the sign test, the ran-based procedures yield the same overall performance as the Gaussian test under the Gaussian density, a slight superiority under t 8 density and a more mared superiority under t 3 density This confirms the ARE values which we also report in the table Somewhat disappointingly, all methods except again for the sign tests have more or less the same power under the Cauchy density, a fact that is not explained by any ARE value, since the latter is not defined As a rule, the hyperplane versions of all ran-based tests do slightly better than their Tyler counterparts 342 Robustness To investigate the robustness properties of the various Durbin Watson procedures proposed in Section 33, we studied their resistance to innovation and observation outliers, respectively For simplicity, in this section, we consider only Gaussian series The same Monte Carlo scheme as in Section 34 was used to generate a bivariate series of length n = 650 from model 2, with iid Gaussian innovations ε,,ε 650 The resulting series Y,,Y 650 then was subjected to the following perturbations inducing observation outliers: Y + observation outliers: Observations Y t were replaced with 5Y t at time t = 549, 550, 599, 600, 649, 650 Y observation outliers: Observations Y t were replaced with 5Y t at time t = 549, 599, 649 and with 5Y t at time t = 550, 600, 650 E + innovation outliers: The Gaussian innovations ε t were replaced with 5ε t at time t = 549, 550, 599, 600, 649 and 650 E innovation outliers: The Gaussian innovations ε t were replaced with 5ε t and 5ε t at time t = 549, 599, 649 and t = 550, 600, 650, respectively We generated N = 000 series of each type The last n = 50 observations then were subjected to the

9 MULTIVARIATE SIGNED-RANK TESTS 705 TABLE 3 Rejection frequencies out of N = 000 replications for various perturbed Gaussian VAR processes of the Gaussian parametric φ N, the Tyler signed-ran van der Waerden φ vdw, Spearman φ Sp and sign-test-type φ S Durbin Watson tests, and their hyperplane-based counterparts φ h vdw, φh Sp and φh S at asymptotic probability level 5%; the series length throughout is n = 50 Autoregression matrix Autoregression matrix Type of Type of Test outliers 0 A 2A outliers 0 A 2A φ N Y E φ vdw φ S φ Sp φvdw h φs h φsp h φ N Y E φ vdw φ S φ Sp φvdw h φs h φsp h various Durbin Watson procedures described in Section 34 The resulting rejection frequencies are reported in Table 3, which thus consists of four parts one for each type of outlier, each of which is to be compared with the left upper part of Table 2 Inspection of the table reveals that, quite significantly, the type I ris of the Gaussian test is exploding up to a 70% rejection rate under Y The Gaussian procedure thus is totally unreliable in the presence of outliers, whatever their type; the corresponding rejection frequencies under the alternative thus are meaningless The ran-based tests also are affected, but considerably less so, with a rejection rate under the null that, in general, does not significantly differ from the nominal As expected, the sign tests seem to be slightly more robust than the van der Waerden and Spearman tests 4 RANK-BASED SELECTION OF THE ORDER OF A VAR PROCESS 4 Gaussian Parametric VAR Order Selection Going bac to the general model described in Section 2, we now turn to the problem of testing a VARp 0 dependence in 5 against a VARp 0 + dependence For simplicity, we assume that β = 0; and f of course are nuisance parameters in this problem A sequential application of such tests can be used to identify the actual order of the unobserved autoregressive errors; see Pötscher 983 or Garel and Hallin 999 for the univariate counterpart of the problem More formally, denote by p0 the set of all values of θ R K such that A p0 + = =A p = 0, A p0 0, and for which the VARp 0 model with parameters A,,A p0 is stationary and invertible The null hypothesis then is of the form θ p0 Gaussian parametric optimal tests for this problem can be obtained, for example, by the Lagrange multiplier method; they require finite second-order moments Denote by Â,,Â p0 the estimators obtained under the assumption that the VAR model in 5 is of order p 0 Write ˆθ for vec Â,,vec Â p0, 0,,0 Defining the residuals p 0 Z t = Z t ˆθ = Y t Â i Y t i, t = p 0 +,,n, i= the residual cross-covariance matrix at lag i taes the form Ɣ n i := n p 0 i Z t Z t i = n p 0 i t=p 0 ++i t=p 0 ++i d d i /2 U U i /2

10 706 M HALLIN AND D PAINDAVEINE Write N for Ɣ n 0 The Gaussian test statistic for this problem then is 3 W p n 0 := nt p 0 ; N Q N T p0 ; N, where, writing G u = G u ˆθ for the Green s matrices associated with Â,,Â p0, 4 n /2 T p0 ; N u=2 u=2 := u=i n p 0 n p 0 n p 0 n p 0 u=p 0 n /2 vec n u /2 vec n u /2 vec n u /2 vec n u /2 vec and for p 0 =, p 0 is void N N p 0 Q N := 0 2 p 0 2 w2 I p 0 I 2 p 0 N Ɣn N Ɣn u G u N Ɣn u G u 2 N Ɣn u G u i N Ɣn u G u p 0 I p 0, I 2 p 0 W 2 with the 2 p 0 2 p 0 matrices w 2 and W 2 having i, j blocs [of dimension 2 2 ] w 2 ij := W 2 ij := n p 0 u=max2,i,j n p 0 u=maxi,j G u i N G u j G u i N G u j N N, and i, j =,,p 0, respectively Note that w 2 and W 2 differ only by their upper left 2 2 bloc The structure of this test statistic is the same as that of the univariate Gaussian Lagrange multiplier test statistic described by Garel and Hallin 999 The null hypothesis of ARp 0 dependence is rejected whenever W p n 0 exceeds the α quantile of a chi-squared distribution with 2 degrees of freedom The intuition behind the test statistic 3 is a little bit less straightforward than in the Durbin Watson case Actually, W p n 0 is a quadratic form that involves all estimated residual cross-correlation matrices, with weights that neutralize the effect of parameter estimation on the residuals and optimize the power For instance, p 0 = yields writing Â instead of Â,wehaveG u = Â u n /2 vec N n /2 Ɣn T ; N := n n u /2 vec u Âu and Q N := N 0 u=2 N 0 n n I 2 I 2 u= I 2 I 2 u=2 N Ɣn A u N A u A u N A u N N The order selection procedure then consists of first running a Durbin Watson test reducing to a simple test for randomness when β = 0 In case this is inconclusive, a VAR of order zero ie, white noise is selected and a traditional regression model is considered for the analysis If Durbin Watson is significant, then turn to testing VAR against VAR2 ie, the particular case just discussed and so on This procedure as a whole is of a heuristic nature and no precise ris can be evaluated for the final output However, consistency results have been obtained, possibly with α values varying from step to step; see Pötscher 983, Signed-Ran VAR Order Selection The procedure runs exactly as in the Gaussian parametric case, but is based on multivariate signed-ran statistics Here again, we propose three particular test statistics Each test can be computed from Tyler signs and rans or from hyperplane-based signs and rans In case interdirections are used, they should be absolute The three statistics are the following:

11 MULTIVARIATE SIGNED-RANK TESTS 707 A test statistic of the sign-test type, n W p 0 ; ;sign Q T p 0 ; ;sign, p 0 ;sign := 2 nt with n /2 p 0 ;sign as in 4, but with the sign-test T type cross-covariance matrices n Ɣ i; ;sign := /2 n p 0 i t=p 0 +i+ substituted for Ɣ n i 2 A test statistic of the Spearman type, n W U t U t i p 0 ; ;Sp Q T p 0 ; ;Sp, /2 p 0 ;Sp := 92 nt with n /2 p 0 ;Sp as in 4, but with the Spearman T cross-covariance matrices n /2 := Ɣ i; ;Sp n p 0 in p t=p 0 +i+ R t R t i U t U t i substituted for Ɣ n i 3 A test statistic of the van der Waerden type, n W p 0 ; ;vdw Q T p 0 ; ;vdw, /2 p 0 ;vdw := n T with n /2 p 0 ;vdw as in 4, but with the van der T Waerden cross-covariance matrices n Ɣ i; ;vdw := /2 n p 0 i t=p 0 +i+ R t n p 0 + R t i n p 0 + U t U t i /2 [ is as in ] substituted for Ɣ n i The null hypothesis of ARp 0 dependence is rejected whenever the test statistic exceeds the α quantile of a chi-squared distribution with 2 degrees of freedom We insist upon the fact that, contrary to the estimate N appearing in the Gaussian statistic, need no longer be the empirical marginal covariance matrix 43 Asymptotic Relative Efficiencies The asymptotic relative efficiencies, with respect to their Gaussian counterparts, of the ran-based tests used at each step of the order selection procedure are the same as in the Durbin Watson case The figures in Table as well as the generalized Chernoff Savage result of Hallin and Paindaveine 2005 thus still apply here However, a more pertinent assessment of the respective relative efficiencies of order selection procedures considered as a whole would be provided by ratios of correct identification probabilities Deriving exact values for such ratios is probably infeasible Monte Carlo evaluations, however, are possible; some numerical values are given in the simulation study below 44 Numerical Study 44 Efficiency Here we generated N =000 independent samples ε,,ε 620 of size n = 620 from various bivariate spherical densities, with mean zero and identity covariance matrix: the bivariate normal and bivariate Student distributions with in this case, the shape, not the covariance matrix, is identity, 3 and 8 degrees of freedom These samples were used in the VAR model Y t AY t = ε t with A = and initial value Y 0 = 0, yielding VAR series Y,,Y 620 of length 620 Of these observations, the last 20, denoted as Y,,Y 20, were subjected to various sequential order-identification procedures Seven versions Gaussian or ran based of the order-identification procedure were performed on each series Step one of each procedure consisted of testing for white noise against VAR dependence using a degenerate since no trend has to be estimated Durbin Watson test which coincides with the tests for randomness developed by Hallin and Paindaveine 2002b If the hypothesis of randomness cannot be rejected, the model is identified as being VAR0, that is, white noise order underidentification If randomness is rejected, the tests developed in Sections 4 and 42 are performed for testing VAR against VAR2 dependence If VAR is not rejected, the order p =

12 708 M HALLIN AND D PAINDAVEINE is correctly identified; if not, the procedure is pursued further, but we simply record overidentification of the order Of course, it is pretty natural to use the same type of test throughout the procedure The following seven types of identification procedures were considered: a The parametric Gaussian procedure b The sign-test-type procedure based on Tyler s signs and rans b2 The hyperplane-based sign-test-type procedure c The Spearman-type procedure based on Tyler s signs and rans c2 The hyperplane-based Spearman-type procedure d The van der Waerden-type procedure based on Tyler s signs and rans d2 The hyperplane-based van der Waerden-type procedure All individual tests were performed at nominal asymptotic level α = 5% In each case, the Yule Waler estimator Â := 9 20 t=2 Y t Y t 9 20 t=2 Y t Y t wasusedtoestimatea The Tyler estimate Tyl was computed from the algorithm of Randles 2000 [again, iterations were stopped as soon as the Frobenius norm of the difference between the two members of 3 fell below 0 6 ] Under-, correct and overidentification frequencies are reported in Table 4, along with the corresponding ARE figures The corresponding individual confidence intervals for N = 000 replications, at confidence level 095, have half-widths 004, 0025 and 003 for frequencies on the order of , and 050, respectively Inspection of the table reveals the excellent overall performance of all ran-based procedures considered: Hyperplane-based van der Waerden procedures uniformly outperform the Tyler-type van der Waerden ones, which in turn perform at least as well as their parametric Gaussian counterpart, even under Gaussian innovations More generally, hyperplane-based procedures van der Waerden, signs, Spearman do uniformly better than their Tyler-type competitors Although the validity of the tests used at each step of the identification procedure is not formally estab- TABLE 4 Underidentification p = 0, correct identification p = and overidentification p 2 frequencies out of N = 000 replications for the VAR model 5 under various Gaussian and Student innovation densities The seven procedures considered are based on the Gaussian parametric tests φ N, the Tyler signed-ran van der Waerden and Spearman tests φ vdw and φ Sp, the Tyler sign test φ S, and their hyperplane-based counterparts φ h vdw, φh Sp and φh S Order identification Order identification Innovation Innovation Test density 0 2 ARE density 0 2 ARE φ N N t φ vdw φ S φ Sp φvdw h φs h φsp h φ N t t Undefined φ vdw Undefined φ S Undefined φ Sp Undefined φvdw h Undefined φs h Undefined φsp h Undefined NOTE All tests were performed at probability level 5%; the series length throughout is n = 20 AREs refer to individual tests, not to the order identification procedure as a whole

13 MULTIVARIATE SIGNED-RANK TESTS 709 TABLE 5 Underidentification p = 0, correct identification p = and overidentification p 2 frequencies out of N = 000 replications in various perturbed Gaussian VAR series Order identification Order identification Type of Type of Test outliers 0 2 outliers 0 2 φ N Y E φ vdw φ S φ Sp φvdw h φs h φsp h φ N Y E φ vdw φ S φ Sp φvdw h φs h φsp h NOTE The various order identification procedures are based on the Gaussian parametric tests φ N, the Tyler signedran van der Waerden φ vdw,spearmanφ Sp and sign-test-type φ S tests, and their hyperplane-based counterparts φvdw h, φh Sp and φh S at asymptotic probability level 5%; the series length throughout is n = 20 lished under multivariate Cauchy t innovations, the final result under such densities remains excellent, with a remarable 95% frequency of correct identification for the hyperplane-based van der Waerden and Spearman versions 442 Robustness A robustness investigation also was conducted on the model in Section 342 for the various order identification procedures proposed in Sections 4 and 42 Observations, Y,,Y 620 were generated in the same way as in the previous section from model 5 with Gaussian ε t s These observations then were perturbed, as in Section 342, to produce observation outliers and innovation outliers respectively: Y + observation outliers: Observations Y t were replaced with 5Y t for t = 538, 540, 578, 580, 68 and 620 Y observation outliers: Observations Y t were replaced with 5Y t for t = 538, 578 and 68, and with 5Y t for t = 540, 580 and 620 E + innovation outliers: Gaussian innovations ε t were replaced with 5ε t for t = 538, 578, 580, 68 and 620 E innovation outliers: Gaussian innovations ε t were replaced with 5ε t for t = 538, 578 and 68, and with 5ε t for t = 540, 580 and 620 The last n = 20 observations then were subjected to the seven order identification procedures described in Section 44 The resulting under-, correct and overidentification frequencies are reported in Table 5 This simulation exercise of course is somewhat limited and allows only for very general conclusions The frequencies reported in Table 5, however, very clearly show how fragile the traditional parametric method can be in the presence of a small number of outliers: the observed proportion of correct identification based on the parametric tests drops from 0898 in the unperturbed case to 080 under the observation outlier scheme Y Quite on the contrary, the ran-based methods apparently resist quite well, irrespective of the type of outlier 5 CONCLUSIONS Ran-based methods have been confined for a long time to problems that involve univariate independent observations We show, on the basis of two particular examples the Durbin Watson and the autoregressive order selection problems, that ran methods also apply

14 70 M HALLIN AND D PAINDAVEINE to serial ie, time-series multivariate problems Two concepts of signs and rans are considered: pseudo- Mahalanobis or Tyler and the hyperplane-based or Oja Paindaveine Theoretical results establish that these methods are as efficient, locally and asymptotically, as their everyday-practice parametric competitors based on cross-correlation matrices; the van der Waerden versions even uniformly dominate the competition Simulations moreover show that the ran-based procedures successfully resist the presence of observation as well as innovation outliers, whereas traditional parametric methods literally collapse under such perturbations APPENDIX: RANKS, SIGNS AND SEMIPARAMETRIC EFFICIENCY For the reader who is familiar with local asymptotic normality or tangent spaces, we conclude this paper with a brief theoretical justification for considering ran-based methods in the analysis of a broad class of semiparametric models Details can be found in Hallin and Werer 2003 Ran-based methods apply whenever the data are generated, through some model involving a parameter θ R K, by some unobserved white noise here -dimensional with unspecified density f belonging to some class F of densities The statistical models we are considering are thus, typically, semiparametric, of the form 6 X n, A n, P := { P n θ,f, θ,f F } Assume that θ is the parameter of interest, whereas f plays the role of a nuisance parameter Whenever the fixed-f parametric submodels of 6 are locally asymptotically normal with central sequence n f θ and provided that some other regularity assumptions are met, the theory of semiparametric efficiency see Bicel, Klaassen, Ritov and Wellner, 993 stipulates that semiparametrically efficient at θ and f inference can be based on the projection n f θ of n f θ along the so-called tangent spaces Another way to reach semiparametric efficiency still at θ and f is possible when the fixed-θ submodels of 6 are generated by some group of transformations G n θ acting over X n, A n, with maximal invariant R n θ Hallin and Werer 2003 showed that, under quite general conditions, the difference between n n f θ and f θ := E[ n f θ Rn θ] tends to zero as n, in probability, under P n θ,f Conditioning on the maximal invariant thus does the same job as projecting along tangent spaces Now, in most models that involve unobserved white noise with unspecified density f, residual rans and/or signs their definitions depend on the class of densities F provide a maximal invariant R n θ Ran-based methods thus, in a sense, allow for bypassing tangent space calculations in the construction of semiparametrically efficient inference procedures Besides these semiparametric efficiency features, of course, they also enjoy their usual properties of distribution-freeness a consequence of invariance, robustness and so forth ACKNOWLEDGMENT This research was supported by a PAI contract with the Belgian Federal Government and an Action de Recherche Concertée of the Communauté Française de Belgique REFERENCES BICKEL, PJ,KLAASSEN, CAJ,RITOV, YandWELLNER, J A 993 Efficient and Adaptive Estimation for Semiparametric Models Johns Hopins Univ Press, Baltimore DAVIS,RAandWU, W 997 M-estimation for linear regression with infinite variance Probab Math Statist 7 20 GAREL, B and HALLIN, M 999 Ran-based autoregressive order identification J Amer Statist Assoc HALLIN, M,INGENBLEEK, J-FandPURI, M L 985 Linear serial ran tests for randomness against ARMA alternatives Ann Statist HALLIN, M and PAINDAVEINE, D 2002a Optimal tests for multivariate location based on interdirections and pseudo- Mahalanobis rans Ann Statist HALLIN, M and PAINDAVEINE, D 2002b Optimal procedures based on interdirections and pseudo-mahalanobis rans for testing multivariate elliptic white noise against ARMA dependence Bernoulli HALLIN, M and PAINDAVEINE, D 2002c Multivariate signed rans: Randles interdirections or Tyler s angles? In Statistical Data Analysis Based on the L Norm and Related Methods Y Dodge, ed Birhäuser, Basel HALLIN, M and PAINDAVEINE, D 2003 Affine invariant linear hypotheses for the multivariate general linear model with VARMA error terms In Mathematical Statistics and Applications: Festschrift for Constance van Eeden M Moore, S Froda and C Léger, eds IMS, Beachwood, OH HALLIN, M and PAINDAVEINE, D 2004a Ran-based optimal tests of the adequacy of an elliptic VARMA model Ann Statist HALLIN, M and PAINDAVEINE, D 2004b Asymptotic linearity of serial and nonserial multivariate signed ran statistics J Statist Plann Inference To appear HALLIN, M and PAINDAVEINE, D 2005 Affine-invariant aligned ran tests for the multivariate general linear model with VARMA errors J Multivariate Anal

15 MULTIVARIATE SIGNED-RANK TESTS 7 HALLIN, M and PURI, M L 988 Optimal ran-based procedures for time-series analysis: Testing an ARMA model against other ARMA models Ann Statist HALLIN, M and PURI, M L 99 Time-series analysis via ran-order theory: Signed-ran tests for ARMA models J Multivariate Anal HALLIN, M and PURI, M L 992 Ran tests for time-series analysis: A survey In New Directions in Time Series Analysis I D Brillinger, P Caines, J Gewee, E Parzen, M Rosenblatt and M S Taqqu, eds 54 Springer, New Yor HALLIN, M and PURI, M L 994 Aligned ran tests for linear models with autocorrelated error terms J Multivariate Anal HALLIN, M andwerker, B J M 2003 Semi-parametric efficiency, distribution-freeness and invariance Bernoulli HETTMANSPERGER, TPandRANDLES, R H 2002 A practical affine equivariant multivariate median Biometria OJA, H 999 Affine invariant multivariate sign and ran tests and corresponding estimates: A review Scand J Statist OJA, H and PAINDAVEINE, D 2004 Optimal signed-ran tests based on hyperplanes J Statist Plann Inference To appear PÖTSCHER, B M 983 Order estimation in ARMA models by Lagrangian multiplier tests Ann Statist PÖTSCHER, B M 985 The behaviour of the Lagrange multiplier test in testing the orders of an ARMA model Metria PURI, M L andsen, P K 97 Nonparametric Methods in Multivariate Analysis Wiley, New Yor RANDLES, R H 989 A distribution-free multivariate sign test based on interdirections J Amer Statist Assoc RANDLES, R H 2000 A simpler, affine-invariant, multivariate, distribution-free sign test J Amer Statist Assoc TYLER, D E 987 A distribution-free M-estimator of multivariate scatter Ann Statist

Optimal Procedures Based on Interdirections and pseudo-mahalanobis Ranks for Testing multivariate elliptic White Noise against ARMA Dependence

Optimal Procedures Based on Interdirections and pseudo-mahalanobis Rans for Testing multivariate elliptic White Noise against ARMA Dependence Marc Hallin and Davy Paindaveine Université Libre de Bruxelles,