The Asymptotics of the Mahalanobis Depth-Based Theil-Sen Estimators in Multiple Linear Models

Fang Li (a), Hanxiang Peng (a,*,1) and Fei Tan (b)

(a) Department of Mathematical Sciences, Indiana University Purdue University At Indianapolis, Indianapolis, IN, USA
(b) Institute of Public Health, College of Pharmacy and Pharmaceutical Sciences, Florida A&M University, Tallahassee, FL 32307, USA

* Corresponding author. Email address: hpeng@math.iupui.edu (Hanxiang Peng).
1 This research is supported by the US National Science Foundation under Grant No. DMS

Abstract

In this article, we propose the use of the Mahalanobis depth to construct fully affine equivariant Theil-Sen estimators of the parameters in a multiple linear model. Our construction includes the spatial depth-based Theil-Sen estimators as a special case. We exhibit that the efficiency of the proposed estimators relative to the least squares estimators increases with the number of parameters under independent and identically distributed normal errors. We point out that the estimators are robust with a possible maximal breakdown point and possess bounded influence functions. We show that the estimators are consistent and asymptotically normal for both deterministic and random covariates under suitable conditions.

AMS 2000 subject classification: primary 62G05; 62G20.

Key words: Breakdown point, efficiency, influence function, Mahalanobis distance, spatial depth function.

July 17, 2009

1 Introduction

In a simple linear model, each pair of distinct observations yields an estimate of the slope. The median of these pairwise slope estimates is known as the Theil-Sen estimator (Theil, 1950; Sen, 1968). Having a clear geometric interpretation, the Theil-Sen estimator is robust and relatively efficient, so that it is competitive with other slope estimators such as the least squares estimator. The Theil-Sen estimator is often included in textbooks on robust and nonparametric statistics (Hollander and Wolfe, 1973, 1999; Rousseeuw and Leroy, 1987; Huber, 1977; Dietz, 1989; Wilcox, 1998; Sprent, 1993; Jurečková and Picek, 2006; and Maronna, et al., 2006). It also has important applications, e.g., in astronomy to censored data by Akritas et al. (1995) and in remote sensing by Fernandes and Leblanc (2005). Recently, Wang (2005) investigated the asymptotic behavior of the Theil-Sen estimator when the covariates are random. When the covariates are deterministic, Peng, Wang and Wang (2008) obtained the consistency and asymptotic distribution of the Theil-Sen estimator, and showed that it is super-efficient when the error distribution is discontinuous. Chatterjee and Olkin (2006) proposed a nonparametric method for fitting a quadratic regression by exploiting the idea of the Theil-Sen estimator. Using the spatial median, Zhou and Serfling (2008) and Wang, Dang, Peng and Zhang (2009) constructed the multivariate Theil-Sen estimator (MTSE) of the parameters in a multiple linear model, generalizing the (univariate) Theil-Sen estimator (UTSE).
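To fix ideas, here is a minimal sketch (ours, not the authors' code; the function name theil_sen and the toy data are illustrative) of the univariate Theil-Sen estimator just described: the median of the pairwise slopes, together with the usual companion intercept estimate.

```python
# A minimal sketch of the (univariate) Theil-Sen estimator: the median of the
# slopes over all pairs of distinct observations, plus a median-based intercept.
import numpy as np
from itertools import combinations

def theil_sen(x, y):
    slopes = [(y[j] - y[i]) / (x[j] - x[i])
              for i, j in combinations(range(len(x)), 2) if x[i] != x[j]]
    slope = np.median(slopes)
    intercept = np.median(y - slope * x)    # a common companion estimate
    return intercept, slope

x = np.arange(10.0)
y = 1.0 + 2.0 * x + np.random.default_rng(0).normal(size=10)
y[3] = 50.0                                 # a gross outlier barely moves the fit
print(theil_sen(x, y))
```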

When the covariates are random, the asymptotic normality of the MTSE can be derived from the spatial quantile theory of Zhou and Serfling. Also for random covariates, Wang, et al. demonstrated that the MTSE's are robust with a possible maximal breakdown point, consistent, and asymptotically normal under mild conditions. They also conducted simulations to demonstrate the robustness and efficiency of the MTSE's and to compare them with the least squares estimators. Busarova, et al. (2006) proposed a multivariate Theil-type estimator of the parameter in a multiple linear model based on the Oja median and derived its asymptotic normality for random covariates. When the covariates are deterministic, Shen (2008, 2009) proved the asymptotic normality using the convexity lemma.

In this article, we construct Theil-Sen estimators of the parameters in a multiple linear model based on the Mahalanobis depth function. Our construction is flexible; for example, if we choose the scale matrix in the Mahalanobis depth function to be the identity matrix, the resulting estimators are the spatial depth-based Theil-Sen estimators given by Zhou and Serfling (2008) and Wang, et al. (2009). Since the Mahalanobis depth function is fully affine invariant, the Mahalanobis depth-based Theil-Sen estimators are affine equivariant. The spatial depth-based Theil-Sen estimators, by contrast, are only orthogonally equivariant, since the spatial depth is orthogonally but not affine invariant. It is worth noting that both the Mahalanobis depth and the spatial depth based Theil-Sen estimators are regression and scale equivariant.

We show that the efficiency of the MTSE's relative to the least squares estimators increases with the number of parameters and tends to one as the number of parameters goes to infinity under i.i.d. normal errors. This is an advantage of the MTSE's, bearing in mind that the MTSE's have the desired properties such as consistency and asymptotic normality under much weaker assumptions

than other estimators such as the least squares estimators. In fact, what is required of the error for the MTSE's is that it be (close to) angularly symmetric, a weaker assumption than the usual symmetry; see Liu (1990). The error for the least squares estimator, in contrast, must have mean zero and a finite second moment.

Note that the (univariate) sample median is sometimes described as inefficient, with efficiency 2/π ≈ 0.637 relative to the sample mean under normality. The spatial median generalizes the univariate median to higher dimensions. Brown (1983) showed that the efficiency of the spatial median relative to the sample mean is superior to the efficiency of the univariate median. Higher dimensional results for spherically symmetric distributions show a further increase in efficiency, which converges to one as the dimension tends to infinity. Bai, Chen, Miao and Rao (1990) exhibited that the asymptotic relative efficiency of their least distance estimator in a location regression model with normal errors tends to one as the dimension of the covariate matrices tends to infinity. Chaudhuri (1992) demonstrated that what was observed by Brown (1983) is also true for his multivariate extension of the Hodges-Lehmann estimate. Our result confirms that what Brown, Bai, et al. and Chaudhuri found is also true for multiple linear regression. This shows that the advantage in efficiency of the least squares estimator over the MTSE's diminishes as the dimension p gets larger; for very large p, the least squares estimator practically loses its merit over the MTSE's. Of course the MTSE's are much more difficult to compute than the least squares estimator. This phenomenon is comparable to the behavior of the well-known Stein estimator.

We prove that the MTSE's are asymptotically normal for both deterministic and random covariates under suitable conditions. When the covariates are deterministic, the loss function Ψ_n(b) (see (3)) that defines the MTSE's is not a U-statistic.

This causes a technical difficulty: we cannot apply the commonly used theory of U-statistics to derive the asymptotic normality. We follow Shen (2009) to overcome this difficulty. Specifically, like Shen, we prove the asymptotic normality by using the convexity lemma (Pollard, 1991), the characterization of minimizers of a convex function, and the central limit theorem for sums of dissociated random variables (Barbour and Eagleson, 1985).

The rest of the article is structured as follows. In Section 2, we construct the Theil-Sen estimators and explore existence, uniqueness and robustness. We also prove the consistency. In Section 3, we introduce the assumptions and give the main results. Section 4 is devoted to the comparison of the efficiency of the MTSE's with the LSE. Proofs can be found in Section 5.

2 The Theil-Sen Estimators and Existence, Uniqueness, Robustness and Consistency

In this section, we construct the MTSE's and discuss existence, uniqueness and robustness. At the end of the section, we give the consistency.

The Multivariate Theil-Sen Estimators. In a multiple linear model, the response y_j and covariate vector x_j satisfy

y_j = x_j' β + ε_j,  j = 1, ..., n,   (1)

where β is a p-variate (column) vector of regression parameters, the x_j's are (either random or deterministic) covariates, the ε_j's are random errors, and the x_j's and ε_j's are uncorrelated.

Let J be a p-subset of {1, ..., n}, let X_J be the p × p matrix with rows x_j', j ∈ J, and let ε_J be the column vector with components ε_j, j ∈ J. Assume for now that X_J is invertible. Then an estimate of β can be obtained as the solution of the p equations

x_j' b = y_j,  j ∈ J.   (2)

This estimate is b_J = X_J^{-1} Y_J, which is also the least squares estimate (LSE) of β based on the sub-sample {(x_j, y_j) : j ∈ J}. Here the sub-sample size is p. One does not actually have to choose exactly p observations; any m observations can be chosen as long as m is at least the number of parameters. Obviously m is at most the sample size n, so p ≤ m ≤ n. For each m-subset J of {1, ..., n} and the sub-sample with indices in J, an estimate of β is the least squares estimate b_J = (X_J' X_J)^{-1} X_J' Y_J, provided X_J has full rank. Let J_m^{(n)} be the collection of the m-subsets J of {1, ..., n} such that X_J has full rank. Then an estimator β̂_n of β is the Mahalanobis depth-based median β̂_n = MHmed{b_J : J ∈ J_m^{(n)}}, which is the minimizer of the loss function

Ψ_n(b) ≡ \binom{n}{m}^{-1} ∑_{J ∈ J_m^{(n)}} ( ‖Σ_J^{-1/2}(b − b_J)‖ − ‖Σ_J^{-1/2} b_J‖ ),  b ∈ R^p,   (3)

where ‖·‖ is the Euclidean norm and Σ_J is an estimate of the covariance functional Σ (Serfling, 2008) depending only on X_J (which is assumed hereafter), so that Σ_J and Σ are symmetric and positive definite. Since the summand in (3) is bounded by ‖Σ_J^{-1/2} b‖ by the triangle inequality, differentiation of (3) with respect to b can be passed under the expectation by the dominated convergence theorem; therefore the minimizer β̂_n must satisfy, under certain assumptions (see Remark 3 and (22) below and the discussion therein),

\binom{n}{m}^{-1} ∑_{J ∈ J_m^{(n)}} (Σ_J^{-1/2})' S( Σ_J^{-1/2}(β̂_n − b_J) ) = 0.   (4)

Here S(b) = b/‖b‖, b ≠ 0, b ∈ R^p, is the spatial sign function (Zhou and Serfling, 2008), also called the spatial unit function (Chaudhuri, 1992), and D_S(b) = 1 − ‖E S(b − ξ)‖ is termed the spatial depth function of a random vector ξ.
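As a computational illustration only (our sketch, not the authors' implementation), the code below approximates β̂_n for m = p with the scale choice Σ_J = (X_J' X_J)^{-1} discussed below, for which ‖Σ_J^{-1/2} v‖ = ‖X_J v‖. To keep it cheap it samples a fixed number of random p-subsets rather than enumerating all of J_p^{(n)}, and it minimizes the empirical loss (3) numerically; the names mtse, loss and the budget n_subsets are ours.

```python
# A hedged sketch of the Mahalanobis depth-based Theil-Sen estimator with
# m = p and Sigma_J = (X_J' X_J)^{-1}, so that ||Sigma_J^{-1/2} v|| = ||X_J v||.
import numpy as np
from scipy.optimize import minimize

def mtse(X, y, n_subsets=2000, seed=0):
    rng = np.random.default_rng(seed)
    n, p = X.shape
    XJ_list, bJ_list = [], []
    while len(bJ_list) < n_subsets:
        J = rng.choice(n, size=p, replace=False)
        XJ, yJ = X[J], y[J]
        if np.linalg.matrix_rank(XJ) < p:          # skip singular sub-designs
            continue
        bJ_list.append(np.linalg.solve(XJ, yJ))    # b_J solves x_j' b = y_j, j in J
        XJ_list.append(XJ)

    def loss(b):
        # empirical version of the loss (3); the ||X_J b_J|| offset keeps each
        # summand bounded (it does not change the minimizer)
        return sum(np.linalg.norm(XJ @ (b - bJ)) - np.linalg.norm(XJ @ bJ)
                   for XJ, bJ in zip(XJ_list, bJ_list)) / len(bJ_list)

    b0 = np.linalg.lstsq(X, y, rcond=None)[0]      # start at the full-sample LSE
    return minimize(loss, b0, method="Nelder-Mead").x
```

Since the loss (3) is convex, the starting point is only a computational convenience.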

Note that when Σ_J = I, β̂_n simplifies to the spatial depth-based MTSE proposed by Wang, et al. (2009).

Difference-Based MTSE's. Taking the pairwise differences in the multiple linear model (1), we eliminate the intercept and obtain

y_j − y_k = (x_j − x_k)' β_1 + ε_j − ε_k,  j, k = 1, 2, ..., n,   (5)

where β_1 is the (p − 1)-variate parameter resulting from β with the first component (the intercept) deleted. There are N = n(n − 1)/2 distinct pairwise differences for n distinct observations. For an integer m between p − 1, the number of parameters, and the sample size n, let K_m^{(n)} be the \binom{N}{m} combinations of m pairs (j, k) from {(j, k) : j < k, j, k = 1, ..., n}. Write {(k_{1,i}, k_{2,i}) : i = 1, ..., m} for a generic element of K_m^{(n)}, K_j = (k_{j,i} : i = 1, ..., m) for j = 1, 2, and write K for either K_1 or K_2. Then (5) can be written in matrix form as

Y_{K_1,K_2} = X_{K_1,K_2} β_1 + ε_{K_1,K_2},  (K_1, K_2) ∈ K_m^{(n)},   (6)

where Y_{K_1,K_2} = Y_{K_1} − Y_{K_2}, X_{K_1,K_2} = X_{K_1} − X_{K_2} and ε_{K_1,K_2} = ε_{K_1} − ε_{K_2} with ε_K = (ε_k : k ∈ K)'. Let b_{(K_1,K_2)} be the least squares estimator based on the subset of the observations with indices in (K_1, K_2), i.e.,

b_{(K_1,K_2)} = (X_{K_1,K_2}' X_{K_1,K_2})^{-1} X_{K_1,K_2}' Y_{K_1,K_2},  (K_1, K_2) ∈ K_m^{(n)}.   (7)

Like Zhou and Serfling (2008) and Wang, et al. (2009), we propose as an estimator β̂_{1,n} of β_1 the Mahalanobis depth-based median β̂_{1,n} = MHmed{b_{(K_1,K_2)} : (K_1, K_2) ∈ K_m^{(n)}}, which is the minimizer of

Ψ_{n,d}(b) ≡ \binom{N}{m}^{-1} ∑_{(K_1,K_2) ∈ K_m^{(n)}} ( ‖Σ_{(K_1,K_2)}^{-1/2}(b − b_{(K_1,K_2)})‖ − ‖Σ_{(K_1,K_2)}^{-1/2} b_{(K_1,K_2)}‖ ),  b ∈ R^{p−1},   (8)

where Σ_{(K_1,K_2)} is an estimate of the covariance functional Σ depending only on X_{K_1,K_2}, which is assumed hereafter. Here the sum in (8) is taken over the subset of K_m^{(n)} on which all the least squares estimates exist. Note that when Σ_{(K_1,K_2)} = I, the proposed estimator β̂_{1,n} simplifies to the MTSE proposed by Zhou and Serfling (2008) and Wang, et al. (2009).

The symmetry of ε_{K_1,K_2} is shown later to be one of the sufficient conditions that guarantee the uniqueness of the MTSE's. Even though each component of ε_{K_1,K_2} is symmetric, we do not have the symmetry of ε_{K_1,K_2}, i.e., ε_{K_1,K_2} =_d −ε_{K_1,K_2}, without the assumption of central symmetry on the error ε. The reason is that the components of the random vector ε_{K_1,K_2} may be correlated; for instance, ε_2 − ε_1 and ε_3 − ε_2 are correlated. One simple remedy for this problem is to choose its components, the pairwise differences, in such a way that they do not overlap; for instance, we may choose ε_{K_1,K_2} = (ε_1 − ε_2, ε_3 − ε_4, ..., ε_{2m−1} − ε_{2m})'. Specifically, we choose the pairwise differences which constitute ε_{K_1,K_2} in such a way that each of K_1 and K_2 has distinct elements and K_1, K_2 have no element in common. We denote the set of all such (K_1, K_2) by K_m^*. It is easy to see that the total number of (K_1, K_2) in K_m^* is

N_{n,m} = \binom{n}{2}\binom{n−2}{2} ··· \binom{n−2m+2}{2} / m! = \binom{n}{2m} (2m)! / (2^m m!).

Following the above procedure, we construct the non-overlapped difference-based MTSE β̂*_{1,n} = MHmed{b_{(K_1,K_2)} : (K_1, K_2) ∈ K_m^*}, which is the minimizer of

Ψ*_{n,d}(b) ≡ N_{n,m}^{-1} ∑_{(K_1,K_2) ∈ K_m^*} ( ‖Σ_{(K_1,K_2)}^{-1/2}(b − b_{(K_1,K_2)})‖ − ‖Σ_{(K_1,K_2)}^{-1/2} b_{(K_1,K_2)}‖ )   (9)

for b ∈ R^{p−1}. Here the sum in (9) is taken over the subset of K_m^* on which all the least squares estimates exist.

Existence and Uniqueness. The proposed estimators are multivariate medians defined by the Mahalanobis depth function. Kemperman (1987) investigated the median of a finite measure on a Banach space. Here we review some of those facts tailored to our application. Let µ be a probability measure on a Banach space X with norm ‖·‖. A median θ of µ is any minimizer in X of the loss function

Ψ(y) = ∫ ( ‖x − y‖ − ‖x‖ ) µ(dx),  y ∈ X.   (10)

Clearly, the integral in (10) is finite, as the integrand is bounded by ‖y‖ for every y ∈ X. A median θ always exists when X is finite dimensional. Note that Ψ is strictly convex, hence µ has a unique median, if µ is not concentrated on any straight line in X. A simple calculus shows that θ is a median of µ if and only if ∫ φ_{θ−x}(h) µ(dx) ≥ 0 for all h ∈ X, where φ_x(h) = lim_{t↓0} (‖x + th‖ − ‖x‖)/t, x, h ∈ X, is the one-sided directional derivative at x. This limit exists for X = R^p; hence θ is a median if and only if

∫ φ_{θ−x}(h) µ(dx) = 0,  h ∈ X,   (11)

provided µ({θ}) = 0. Moreover, a sufficient condition for θ to be a median is that θ satisfies (11). For the Euclidean norm, a sufficient condition for θ to be a spatial median is

∫ S(θ − x) µ(dx) = 0.   (12)

It is worth noting that if µ is symmetric about θ, then θ satisfies (12) ((11)).
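As a concrete numerical aside (not from the paper; the sample size, dimension and names are arbitrary), the following sketch computes a sample spatial median by direct minimization and checks the empirical version of (12): the spatial signs S(θ − x_i) average to roughly zero at the minimizer.

```python
# Illustrative check of (12) for an empirical measure: at the spatial median,
# the spatial signs S(theta - x_i) average to (approximately) the zero vector.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
x = rng.standard_normal((500, 3))                     # sample points in R^3

def S(b):
    return b / np.linalg.norm(b)

theta = minimize(lambda y: np.linalg.norm(x - y, axis=1).mean(),
                 x.mean(axis=0), method="Nelder-Mead").x
print(np.mean([S(theta - xi) for xi in x], axis=0))   # ~ (0, 0, 0)
```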

We now apply the above results to obtain conditions that ensure the existence and uniqueness of the MTSE's. Let β_0 be the true but unknown parameter value and β_{1,0} be the true parameter value obtained from β_0 with the first component deleted. There are two cases.

Case i. Random Covariates. We first consider the case that x_1, ..., x_n are i.i.d. random covariates. Let Σ^{1/2} be a square root of the covariance functional Σ, so that Σ = Σ^{1/2} Σ^{1/2}'. Since ‖Σ^{-1/2} b‖ is convex in b ∈ R^p, the above facts apply with b replaced by Σ^{-1/2} b. Obviously the MTSE β̂_n (β̂_{1,n}) exists since the problem is finite dimensional. By a reasoning analogous to Remark 1 of Wang, et al. (2009), we conclude that if p ≥ 2 and the error is not a point mass, then the MTSE β̂_n (β̂_{1,n}) is unique almost surely for large n.

In order that, as n tends to infinity, the MTSE β̂_n (β̂_{1,n}) converge to the true parameter value β_0 (β_{1,0}), the population loss function b ↦ E[Ψ_n(b)] (b ↦ E[Ψ_{n,d}(b)]) must be minimized at the true parameter value β_0 (β_{1,0}). We now derive an analog of the characterization (12). Since under the true model b_J − β_0 = (X_J' X_J)^{-1} X_J' ε_J, it follows that β_0 is the minimizer of b ↦ E[Ψ_n(b)] if

E{ (Σ_J^{-1/2})' S( Σ_J^{-1/2} (X_J' X_J)^{-1} X_J' ε_J ) } = 0.   (13)

Since the x_i's and ε_i's are uncorrelated, the above equation is implied by the following sufficient condition.

Assumption 1. The error ε is symmetric about zero.

The analog of (13) for the difference-based MTSE β̂_{1,n} is

E{ (Σ_{(K_1,K_2)}^{-1/2})' S( Σ_{(K_1,K_2)}^{-1/2} (X_{K_1,K_2}' X_{K_1,K_2})^{-1} X_{K_1,K_2}' ε_{K_1,K_2} ) } = 0,

where (K_1, K_2) ∈ K_m^{(n)}. By Theorem 1 of Wang, et al. (2009), Assumption 1 implies that ε_{K_1,K_2} is also symmetric about zero for any (K_1, K_2) ∈ K_m^{(n)}. Thus the above analog is satisfied under Assumption 1. If the pairwise differences

are not overlapped, i.e., K_1 ∩ K_2 = ∅, then ε_{K_1,K_2} is symmetric about zero, and the convergence of the non-overlapped difference-based MTSE β̂*_{1,n} in Theorem 5 to the true parameter value β_{1,0} is automatic.

If we choose Σ_J = (X_J' X_J)^{-1}, then (13) simplifies to

E(S(ε_J)) = 0.   (14)

This means that the error is angularly symmetric about zero. Introduced in Liu (1990), angular symmetry is a weaker symmetry than the usual central symmetry, i.e., central symmetry implies angular symmetry. A systematic discussion of various types of symmetry can be found in Serfling (2006).

Case ii. Deterministic Covariates. We now consider the case that x_1, ..., x_n are deterministic. Analogous to the reasoning for existence in Kemperman (1987) and for uniqueness in Wang, et al. (2009), we can show that there exists a minimizer of E[Ψ_n(b)] over b ∈ R^p, and this minimizer is unique if p ≥ 2 and the error ε is not a point mass. In order that the MTSE's β̂_n, β̂_{1,n} converge to the true parameter values β_0, β_{1,0} respectively, we need the following assumption and its analogous versions for Ψ_{n,d}(b) and Ψ*_{n,d}(b).

Assumption 2. For each b ∈ R^p in a neighborhood of the true parameter value β_0, there exists a finite constant Ψ(b) such that, as n tends to infinity, E Ψ_n(b) → Ψ(b). Further,

∇Ψ(b) = 0   (15)

has the unique solution b = β_0.

The above assumption is mild. In fact, Assumption 1 implies that β_0 satisfies (15), whereas the uniqueness in (15) follows from the positive definiteness of the matrix D in Assumption 3.

Remark 1. Assumption 2 is implied by Assumptions 1 and 3 and the existence

of the finite limit Ψ(b) for b in a neighborhood of β_0.

Consistency. From the above discussion and Corollary II.2 of Andersen and Gill (1982), we can prove the consistency of the MTSE's. The following Theorem 1 is a generalization of the convergence of an average of independent random variables in Theorem 2.24 and Corollary 2.26 of Kemperman (1987) to an average of dissociated random variables (Barbour and Eagleson, 1985).

Theorem 1. Suppose that the error ε is not a point mass and p ≥ 2.

Case i. The covariates x_i's are random. (i.1) Suppose b ↦ E[Ψ_n(b)], b ∈ R^p (b ↦ E[Ψ_{n,d}(b)], b ∈ R^{p−1}) is strictly convex. If Assumption 1 holds, then β̂_n (β̂_{1,n}) converges in probability to the true parameter value β_0 (β_{1,0}), i.e., β̂_n →_P β_0 (β̂_{1,n} →_P β_{1,0}). (i.2) If b ↦ E[Ψ*_{n,d}(b)], b ∈ R^{p−1}, is strictly convex, then β̂*_{1,n} →_P β_{1,0}.

Case ii. The covariates x_i's are deterministic. (ii.1) If Assumption 2 and its analogs hold, respectively, then β̂_n →_P β_0, β̂_{1,n} →_P β_{1,0} and β̂*_{1,n} →_P β_{1,0}, respectively.

The strict convexity in Case i and the uniqueness of the solution (zero point) in Case ii are guaranteed by the positive definiteness of D in Assumption 3 and of the analog of D in Assumption 5. Note also that with non-overlapped differences, the symmetry Assumption 1 is not required for consistency, and this is also true for the asymptotic normality. More discussion can be found in Wang, et al. (2009).

Robustness. Kemperman (1987) shows that the median has the maximal breakdown point 50%. Further, a median will not change if mass is transported along open half lines (rays) with the median as endpoint, leaving any mass at the median unchanged.

Remark 2. If, instead of the least squares estimates, we use robust estimates with maximal breakdown point to construct the MTSE's, then the MTSE's have a maximal breakdown point; see Wang, et al. (2009).

3 Asymptotic Normality

In this section, we first introduce the assumptions. We then give the asymptotic normality of the MTSE's for deterministic and random covariates, and of the difference-based MTSE. The MTSE β̂_n is the minimizer of the convex loss function (3). For deterministic covariates, Ψ_n(b) is not a U-statistic, so we cannot apply the elegant theory of U-statistics. Like Shen (2009), we exploit the convexity lemma (Pollard, 1991), the characterization of minimizers of a convex function, and the central limit theorem for sums of dissociated random variables (Barbour and Eagleson, 1985) to derive the asymptotic normality of the MTSE's. We need the following assumptions.

Assumption 3. E Ψ_n(β) has a continuous second derivative ∇² E Ψ_n(β) with respect to β in a neighborhood of the true parameter value β_0 such that

lim_{n→∞} ∇² E(Ψ_n(β_0)) = D,  lim_{n→∞} Var( n^{1/2} ∇Ψ_n(β_0) ) = V   (16)

for some positive definite matrices D and V.

A simple calculus shows that

∇Ψ_n(b) = \binom{n}{m}^{-1} ∑_{J ∈ J_m^{(n)}} (Σ_J^{-1/2})' S( Σ_J^{-1/2}(b − b_J) )   (17)

and

∇²Ψ_n(b) = \binom{n}{m}^{-1} ∑_{J ∈ J_m^{(n)}} (Σ_J^{-1/2})' S^{(1)}( Σ_J^{-1/2}(b − b_J) ) Σ_J^{-1/2},   (18)

where S^{(1)}(b) = (I_p − S(b)S(b)')/‖b‖ for b ≠ 0, b ∈ R^p, and S^{(1)}(0) = 0. In view of Assumption 3.1 and Lemma 5.3 of Chaudhuri (1992), we have a sufficient condition for Assumption 3.

Remark 3. A sufficient condition for Assumption 3 to hold is that the random errors ε_1, ..., ε_n are i.i.d. with an absolutely continuous distribution (w.r.t. the Lebesgue measure) having a density that is bounded on every compact subset of the real line R; E(Σ_J^{-1/2}) is nonsingular for every J ∈ J_m^{(n)}; p ≥ 2; the second limit in (16) exists for some positive definite matrix V when the covariates x_i's are random; and, when the covariates are deterministic, both limits in (16) exist for some positive definite matrices D and V.

Under the assumptions in Remark 3, we have

∇E[Ψ_n(b)] = \binom{n}{m}^{-1} ∑_{J ∈ J_m^{(n)}} E[ (Σ_J^{-1/2})' S( Σ_J^{-1/2}(b − b_J) ) ]   (19)

and

∇²E[Ψ_n(b)] = \binom{n}{m}^{-1} ∑_{J ∈ J_m^{(n)}} E[ (Σ_J^{-1/2})' S^{(1)}( Σ_J^{-1/2}(b − b_J) ) Σ_J^{-1/2} ].   (20)

If, further, the covariates x_1, ..., x_n are i.i.d., then we have

E( (Σ_J^{-1/2})' S^{(1)}( Σ_J^{-1/2}(β_0 − b_J) ) Σ_J^{-1/2} ) = D(β_0) = D   (21)

for some positive definite D. The positive definiteness of D is ensured by the assumptions in Remark 3; more details can be found in Lemma 5.3 of Chaudhuri (1992). If, however, the covariates x_i's are deterministic, we need to assume that the limit of (19) exists as n tends to infinity and that D is positive definite.
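As a small numerical aside (ours, not part of the paper), the two functions entering (17) and (18) can be checked by finite differences: S is the gradient of the Euclidean norm, and S^{(1)} is its Hessian, equivalently the Jacobian of S.

```python
# Numerical sanity check: S is the gradient of b -> ||b||, and S1 (the paper's
# S^(1)) is its Hessian, i.e. the Jacobian of S.
import numpy as np

def S(b):
    return b / np.linalg.norm(b)

def S1(b):
    s = S(b)
    return (np.eye(len(b)) - np.outer(s, s)) / np.linalg.norm(b)

b, h = np.array([1.0, -2.0, 0.5]), 1e-6
grad = np.array([(np.linalg.norm(b + h * e) - np.linalg.norm(b - h * e)) / (2 * h)
                 for e in np.eye(3)])
jac = np.column_stack([(S(b + h * e) - S(b - h * e)) / (2 * h) for e in np.eye(3)])
print(np.allclose(grad, S(b)), np.allclose(jac, S1(b)))   # True True
```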

Analogous to the discussion in Chaudhuri (1992), the MTSE β̂_n must satisfy

∇E[Ψ_n(b)] = \binom{n}{m}^{-1} ∑_{J ∈ J_m^{(n)}} E[ (Σ_J^{-1/2})' S( Σ_J^{-1/2}(b − b_J) ) ] = 0.   (22)

Clearly (4) is the sample version of (22). From now on, we assume m = p, and m = p − 1 for the difference-based MTSE's.

Assumption 4. The random error ε and the deterministic covariates x_1, ..., x_n satisfy

\binom{n}{m}^{-1} ∑_{J ∈ J_m^{(n)}} λ_min^{-1}(Σ_J) E( {a_J² ε^{-2}} ∧ n ) = o(n)   (23)

and

\binom{n}{m}^{-1} ∑_{J ∈ J_m^{(n)}} λ_min^{-3/2}(Σ_J) = o(n^{1/2}),   (24)

where a_J = λ_min^{-1/2}(Σ_J) λ_max^{1/2}( Σ_J (X_J' X_J) ), λ_min(M) (λ_max(M)) denotes the smallest (largest) eigenvalue of a matrix M, and a ∧ b = min(a, b).

Since E({a_J² ε^{-2}} ∧ n) = n P(|ε| ≤ a_J n^{-1/2}) + E( a_J² ε^{-2} 1[|ε| > a_J n^{-1/2}] ), it follows that if the error ε has a density bounded in a neighborhood of the origin, then E({a_J² ε^{-2}} ∧ n) ≤ c a_J n^{1/2} for some constant c. While a mean is highly sensitive to outliers, a median is sensitive to the amount of mass in a neighborhood of the median: less mass in such a neighborhood results in more variation of the median estimator. The boundedness of the density in a neighborhood of the median (here the origin) is a typical condition that regulates the behavior of the median. Bai, Chen, Miao and Rao (1990) imposed boundedness in a neighborhood of the origin on the error distribution in their assumption (a). Therefore, we give the following sufficient condition.

Remark 4. Suppose that the density of the random error ε is bounded in a

neighborhood of the origin. Then (23) is implied by

\binom{n}{m}^{-1} ∑_{J ∈ J_m^{(n)}} λ_min^{-3/2}(Σ_J) λ_max^{1/2}( Σ_J (X_J' X_J) ) = o(n^{1/2}).   (25)

Conditions (25) and (24) are mild assumptions. In fact, if Σ_J = I and the covariates are of the magnitude o(n^{1/2}), then both (25) and (24) are satisfied. An important special case of this situation is that the covariates are bounded and Σ_J = (X_J' X_J)^{-1}; then Assumption 4 is satisfied. We now give our main theorem.

Theorem 2. Suppose that the error ε is not a point mass and p ≥ 2. Suppose Assumptions 1, 3 and 4 hold. Then β̂_n has the bounded influence function D^{-1}(Σ_J^{-1/2})' S and satisfies the stochastic expansion

β̂_n = β_0 + \binom{n}{m}^{-1} ∑_{J ∈ J_m^{(n)}} D^{-1} (Σ_J^{-1/2})' S( Σ_J^{-1/2}(b_J − β_0) ) + o_p(n^{-1/2}).   (26)

Hence, n^{1/2}(β̂_n − β_0) ⇒ N(0, D^{-1} V D^{-1}).

I.I.D. Random Covariates. When the covariates x_1, ..., x_n are independent and identically distributed random vectors, Wang, et al. (2009) proved the asymptotic normality of the spatial depth-based multivariate Theil-Sen estimators using the theory of U-statistics. With the aid of the convexity lemma (Pollard, 1991) and following the proof of Theorem 2, we can obtain a simple proof of the asymptotic normality of the MTSE. We introduce the assumptions and give the results below, with the details of the proofs omitted. Note that when x_1, ..., x_n are i.i.d., Assumption 4 simplifies to

E( λ_min^{-1}(Σ_J) [ { λ_min^{-1}(Σ_J) λ_max( Σ_J (X_J' X_J) ) ε^{-2} } ∧ n ] ) = o(n)   (27)

and

E( λ_min^{-3/2}(Σ_J) ) = o(n^{1/2}).   (28)

Remark 4 reduces to the following.

Remark 5. Suppose that the density of the random error ε is bounded in a neighborhood of the origin. Then (27) is implied by

E( λ_min^{-3/2}(Σ_J) λ_max^{1/2}( Σ_J (X_J' X_J) ) ) = o(n^{1/2}).   (29)

We are now ready to give the theorem, with the proof omitted.

Theorem 3. Suppose that the error ε is not a point mass and p ≥ 2. Suppose that the covariates x_1, ..., x_n in the multiple linear model (1) are i.i.d. random vectors. Assume that Assumptions 1 and 3 hold. If (27) and (28) are satisfied, then the multivariate Theil-Sen estimator β̂_n has the bounded influence function D_r^{-1}(Σ_J^{-1/2})' S and satisfies the stochastic expansion

β̂_n = β_0 + \binom{n}{m}^{-1} ∑_{J ∈ J_m^{(n)}} D_r^{-1} (Σ_J^{-1/2})' S( Σ_J^{-1/2}(b_J − β_0) ) + o_p(n^{-1/2}).   (30)

Hence, n^{1/2}(β̂_n − β_0) ⇒ N(0, D_r^{-1} V_r D_r^{-1}), where D_r = D(β_0) and V_r = V(β_0) are computed in (21) and (35).

The Difference-Based MTSE. In a simple linear regression model with deterministic covariates, Peng, Wang and Wang (2008) studied the Theil-Sen estimator under no assumption on the distribution of the error. They showed that the TSE is strongly consistent and has an asymptotic distribution under mild conditions. Wang, et al. (2009) extended these results to the spatial depth-based MTSE's and gave the asymptotic normality when the covariates are random. Here we investigate the asymptotic behavior of the Mahalanobis depth-based MTSE's when the covariates are deterministic. Our consideration includes the spatial depth-based MTSE's as a special case.

The proof of the asymptotic normality of β̂_{1,n} is analogous to the proof of Theorem 2, and we sketch the places where major differences occur. One of the differences is that we cannot directly apply Theorem 2.1 of Barbour and Eagleson (1985) to claim the asymptotic normality: the sum of random variables defining our estimate β̂_{1,n} is not dissociated, that is, b_{(K_1,K_2)} and b_{(K_3,K_4)} need not be independent when the index pairs (K_1, K_2), (K_3, K_4) are disjoint, while Theorem 2.1 gives the asymptotic normality of a sum of dissociated random variables. This technical difficulty can be circumvented by combining like terms; see the details in the proof below. We need the following assumptions.

Assumption 5. Assumption 3 is satisfied for Ψ_n(b) = Ψ_{n,d}(b) given in (8).

Under the difference model (5), using reasoning analogous to that leading from (24), (39) and (40) to Assumption 4, we introduce the following assumption.

Assumption 6. The error and the deterministic covariates satisfy

\binom{N}{m}^{-1} ∑_{(K_1,K_2) ∈ K_m^{(n)}} λ_min^{-1}(Σ_{(K_1,K_2)}) E( { λ_min^{-1}(Σ_{(K_1,K_2)}) λ_max( Σ_{(K_1,K_2)} (X_{K_1,K_2}' X_{K_1,K_2}) ) (ε_1 − ε_2)^{-2} } ∧ n ) = o(n)

and

\binom{N}{m}^{-1} ∑_{(K_1,K_2) ∈ K_m^{(n)}} λ_min^{-3/2}(Σ_{(K_1,K_2)}) = o(n^{1/2}).   (31)

Similarly, we give a sufficient condition below.

Remark 6. Suppose that the convolution of the density of the random error ε is bounded in a neighborhood of the origin. Then (31) is implied by

\binom{N}{m}^{-1} ∑_{(K_1,K_2) ∈ K_m^{(n)}} λ_min^{-3/2}(Σ_{(K_1,K_2)}) λ_max^{1/2}( Σ_{(K_1,K_2)} (X_{K_1,K_2}' X_{K_1,K_2}) ) = o(n^{1/2}).   (32)

With the pairwise differences, the symmetry of the error is still indispensable for the asymptotic normality; see Wang, et al. (2009).

Theorem 4. Suppose p ≥ 3 and ε is not a point mass. Suppose Assumptions 1, 5 and 6 hold. Then β̂_{1,n} has the bounded influence function D_d^{-1}(Σ_{(K_1,K_2)}^{-1/2})' S and satisfies the stochastic expansion

β̂_{1,n} = β_{1,0} + \binom{N}{m}^{-1} ∑_{(K_1,K_2) ∈ K_m^{(n)}} D_d^{-1} (Σ_{(K_1,K_2)}^{-1/2})' S( Σ_{(K_1,K_2)}^{-1/2}(b_{(K_1,K_2)} − β_{1,0}) ) + o_p(n^{-1/2}).   (33)

Hence, n^{1/2}(β̂_{1,n} − β_{1,0}) ⇒ N(0, D_d^{-1} V_d D_d^{-1}), where D_d and V_d are defined similarly to D and V.

If the differences are not overlapped, then the symmetry of the error is no longer required. With a proof similar to that of Theorem 4, we derive

Theorem 5. Suppose p ≥ 3 and ε is not a point mass. Suppose the analogs of Assumptions 5 and 6 hold. Then β̂*_{1,n} has the bounded influence function D_*^{-1}(Σ_{(K_1,K_2)}^{-1/2})' S and satisfies the stochastic expansion

β̂*_{1,n} = β_{1,0} + N_{n,m}^{-1} ∑_{(K_1,K_2) ∈ K_m^*} D_*^{-1} (Σ_{(K_1,K_2)}^{-1/2})' S( Σ_{(K_1,K_2)}^{-1/2}(b_{(K_1,K_2)} − β_{1,0}) ) + o_p(n^{-1/2}).   (34)

Hence, n^{1/2}(β̂*_{1,n} − β_{1,0}) ⇒ N(0, D_*^{-1} V_* D_*^{-1}), where D_* and V_* are defined similarly to D and V.

4 Efficiency Consideration

In this section, we compare the efficiency of the MTSE's with that of the LSE when the random errors are normally distributed. To calculate the asymptotic covariance matrix of the MTSE, we need to compute V and D.

Let us first calculate V. To this end, we must calculate the covariance matrix of the average in (17). Note that this average is not a multivariate U-statistic (a vector of U-statistics), since the summand in the average, Z_J = (Σ_J^{-1/2})' S( Σ_J^{-1/2}(b_J − β_0) ), is not a symmetric kernel in its arguments. Thus we cannot directly apply the formula for the variance of a U-statistic. Instead, we exploit the technique used in computing the variance of a U-statistic; see, e.g., van der Vaart (1998). Specifically, since Cov(Z_I, Z_J) = 0 if I ∩ J = ∅ and the limiting distribution is non-degenerate, the leading terms contributing to the covariance are those Cov(Z_I, Z_J) whose indices I = {I_1 < ... < I_m} and J = {J_1 < ... < J_m} have exactly one common index. More specifically, for i, j = 1, ..., m, we count those index pairs (I, J) such that I_i = J_j and I ∩ J = {I_i}. It then can be seen that

V = lim_{n→∞} n \binom{n}{m}^{-2} ∑_{i,j=1}^{m} ∑_{I,J : I_i = J_j, I ∩ J = {I_i}} E(Z_I Z_J')
  = lim_{n→∞} n \binom{n}{m}^{-2} ∑_{i,j=1}^{m} ∑_{I_i = i+j−1}^{n−2m+i+j} \binom{I_i−1}{i−1} \binom{I_i−i}{j−1} \binom{n−I_i}{m−i} \binom{n−I_i−m+i}{m−j} E[ E(Z_J | x_i, ε_i) E(Z_{J_{i,j}} | x_i, ε_i)' ],

where J = (1, 2, ..., m) and J_{i,j} is the m-subset resulting from augmenting (m+1, ..., 2m−1) with i inserted so that its j-th component equals i, for i, j = 1, ..., m. By Karamata's theorem,

∑_{l=1}^{n−2m+2} [(l+i+j−3)!/(l−1)!] [(n−l−i−j+2)!/(n−l−2m+2)!] ≈ ∑_{l=1}^{n−2m+2} l^{i+j−2} (n−l)^{2m−i−j} ≈ n^{2m−1} ∫_0^1 x^{i+j−2} (1−x)^{2m−i−j} dx,

so that we have

∑_{I_i = i+j−1}^{n−2m+i+j} \binom{I_i−1}{i−1} \binom{I_i−i}{j−1} \binom{n−I_i}{m−i} \binom{n−I_i−m+i}{m−j} ≈ [ n^{2m−1} / ((i−1)!(j−1)!(m−i)!(m−j)!) ] ∫_0^1 x^{i+j−2}(1−x)^{2m−i−j} dx = [ n^{2m−1} / (2m−1)! ] \binom{i+j−2}{i−1} \binom{2m−i−j}{m−i}.

Hence,

V = lim_{n→∞} n \binom{n}{m}^{-2} [ n^{2m−1} / (2m−1)! ] ∑_{i,j=1}^{m} \binom{i+j−2}{i−1} \binom{2m−i−j}{m−i} E[ E(Z_J | x_i, ε_i) E(Z_{J_{i,j}} | x_i, ε_i)' ]
  = [ (m!)² / (2m−1)! ] ∑_{i,j=1}^{m} \binom{i+j−2}{i−1} \binom{2m−i−j}{m−i} E[ E(Z_J | x_i, ε_i) E(Z_{J_{i,j}} | x_i, ε_i)' ].   (35)

It can be verified that

[ (m!)² / (2m−1)! ] ∑_{i=1}^{m} ∑_{j=1}^{m} \binom{i+j−2}{i−1} \binom{2m−i−j}{m−i} = m².   (36)

If Z_J is symmetric in its arguments, then we obtain a vector of U-statistics and V indeed boils down to the usual variance formula of a U-statistic, namely V = m² E[ E(Z_J | x_1, ε_1) E(Z_J | x_1, ε_1)' ].

Case 1. In this case, we assume that the covariates are deterministic and X_J ≡ I_p. We also assume that the errors follow a normal distribution with mean 0 and variance σ²; to keep the notation simple, we take σ² = 1. We choose the scale matrix Σ_J = (X_J' X_J)^{-1} = I_p. Then the proposed MTSE is the spatial median of the \binom{n}{p} least squares estimators. We now compute D and V in Theorem 2. It can be seen that Σ_J^{-1/2}(β_0 − b_J) = −ε_J has a spherically symmetric p-variate normal distribution around the origin. By an argument similar to Brown (1983),

D = [ (p−1)Γ((p−1)/2) / (p√2 Γ(p/2)) ] I_p.

Now compute V via (35). Since Z_J = ε_J/‖ε_J‖, it follows that

E(Z_J | x_i, ε_i) = E( ε_i ‖ε_J‖^{-1} | ε_i ) 1_i,  E(Z_{J_{i,j}} | x_i, ε_i) = E( ε_i ‖ε_{J_{i,j}}‖^{-1} | ε_i ) 1_j,

where 1_i is the p-dimensional vector with all components zero except for a one in the i-th component. Hence,

E[ E(Z_J | x_i, ε_i) E(Z_{J_{i,j}} | x_i, ε_i)' ] = E[ {E(ε_i ‖ε_J‖^{-1} | ε_i)}² ] 1_i 1_j' ≤ E[ E(ε_i² ‖ε_J‖^{-2} | ε_i) ] 1_i 1_j' = p^{-1} 1_i 1_j',

and V = (V_{i,j})_{p×p} with

V_{i,j} ≤ [ (p!)² / (p (2p−1)!) ] \binom{i+j−2}{i−1} \binom{2p−i−j}{p−i}.

Hence, the asymptotic covariance matrix of the proposed MTSE is

D^{-1} V D^{-1} = [ (p−1)Γ((p−1)/2) / (p√2 Γ(p/2)) ]^{-2} V.

Define ‖M‖_a = ∑_{i,j=1}^{p} |M_{i,j}| as the norm of a covariance matrix M. Notice that this norm is conservative in the sense that ‖M‖_a ≥ GV(M), where GV(M) is the generalized variance, here taken to be the sum of the variances of the individual random variables. By (36),

‖D^{-1} V D^{-1}‖_a = [ (p−1)Γ((p−1)/2) / (p√2 Γ(p/2)) ]^{-2} ∑_{i,j=1}^{p} V_{i,j}
  ≤ [ (p−1)Γ((p−1)/2) / (p√2 Γ(p/2)) ]^{-2} ∑_{i,j=1}^{p} [ (p!)² / (p (2p−1)!) ] \binom{i+j−2}{i−1} \binom{2p−i−j}{p−i}
  = p [ (p−1)Γ((p−1)/2) / (p√2 Γ(p/2)) ]^{-2}.   (37)

The covariance matrix of the least squares estimator is p I_p, with ‖p I_p‖_a = p². The asymptotic relative efficiency (ARE) of the MTSE relative to the LSE is the ratio of the norm of the covariance matrix of the LSE to that of the MTSE. Then it can be seen that the ARE satisfies

ARE(p) = ‖p I_p‖_a / ‖D^{-1} V D^{-1}‖_a ≥ (1/p) [ (p−1)Γ((p−1)/2) / (√2 Γ(p/2)) ]² → 1,  as p → ∞.

As in Brown (1983) and Chaudhuri (1992), the ARE increases with the number of regression parameters. Table 1 reports the above lower bound for several values of p.

Table 1. The lower bound of ARE(p) for several values of p.
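The snippet below (ours; the range p = 2, ..., 10 is only an illustrative choice) evaluates the lower bound derived above for a few dimensions.

```python
# Lower bound for ARE(p): (1/p) * [ (p-1) Gamma((p-1)/2) / (sqrt(2) Gamma(p/2)) ]^2.
from math import gamma, sqrt

def are_lower_bound(p):
    return ((p - 1) * gamma((p - 1) / 2) / (sqrt(2) * gamma(p / 2))) ** 2 / p

for p in range(2, 11):
    print(p, round(are_lower_bound(p), 3))
# p = 2 gives pi/4 ~ 0.785 (Brown's value for the spatial median); the bound
# increases toward 1 as p grows.
```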

Case 2. In this case, we assume that the covariates x_i's are i.i.d. random vectors and that the errors ε_i's follow the standard normal distribution. This case was studied by Wang, et al. (2009) and corresponds to our Theorem 3. We now calculate D_r and V_r in Theorem 3. It can be seen that under the true model, Σ_J^{-1/2}(β_0 − b_J) = −(X_J' X_J)^{-1/2} X_J' ε_J has a spherically symmetric p-variate normal distribution around the origin. Again, as in Case 1, by an argument similar to Brown (1983), we have

D_r = [ (p−1)Γ((p−1)/2) / (p√2 Γ(p/2)) ] E(X_J' X_J) = p [ (p−1)Γ((p−1)/2) / (p√2 Γ(p/2)) ] E(x_1 x_1'),

since the covariates X_J and the error ε are uncorrelated. Since Z_J = X_J' ε_J / ‖ε_J‖, it follows that

E(Z_J | x_i, ε_i) = E(X_J' | x_i) E( ε_J ‖ε_J‖^{-1} | ε_i ) = E( (x_1, ..., x_p) | x_i ) E( ε_i ‖ε_J‖^{-1} | ε_i ) 1_i = E( ε_i ‖ε_J‖^{-1} | ε_i ) x_i.

Hence,

E[ E(Z_J | x_i, ε_i) E(Z_{J_{i,j}} | x_i, ε_i)' ] = E[ {E(ε_i ‖ε_J‖^{-1} | ε_i)}² ] E(x_1 x_1').

By (35) and (36), we have V_r = p² E[ {E(ε_1 ‖ε_J‖^{-1} | ε_1)}² ] E(x_1 x_1'). Hence, the asymptotic covariance matrix of the MTSE is

D_r^{-1} V_r D_r^{-1} = [ (p−1)Γ((p−1)/2) / (p√2 Γ(p/2)) ]^{-2} E[ {E(ε_1 ‖ε_J‖^{-1} | ε_1)}² ] ( E(x_1 x_1') )^{-1},

so that its norm satisfies

‖D_r^{-1} V_r D_r^{-1}‖_a = [ (p−1)Γ((p−1)/2) / (p√2 Γ(p/2)) ]^{-2} E[ {E(ε_1 ‖ε_J‖^{-1} | ε_1)}² ] ‖( E(x_1 x_1') )^{-1}‖_a
  ≤ (1/p) [ (p−1)Γ((p−1)/2) / (p√2 Γ(p/2)) ]^{-2} ‖( E(x_1 x_1') )^{-1}‖_a.   (38)

The covariance matrix of the LSE is n ( E(X_n' X_n) )^{-1} = ( E(x_1 x_1') )^{-1}, with norm ‖( E(x_1 x_1') )^{-1}‖_a. Thus we obtain the same conclusion as in Case 1: the asymptotic relative efficiency of the MTSE relative to the LSE satisfies

ARE ≥ (1/p) [ (p−1)Γ((p−1)/2) / (√2 Γ(p/2)) ]².

5 Proofs

In this section, we provide the proofs.

Proof of Theorem 2. Let

R_n(α) = n { Ψ_n(β_0 + n^{-1/2}α) − Ψ_n(β_0) − n^{-1/2} α' ∇Ψ_n(β_0) },  α ∈ R^p.

We show next that Var(R_n(α)) → 0 for each α ∈ R^p. Let α_n = n^{-1/2}α and denote γ_J = Σ_J^{-1/2}(β_0 − b_J) and δ_J = Σ_J^{-1/2}(α_n + β_0 − b_J). Using ‖a + b‖ − ‖b‖ = (‖a + b‖² − ‖b‖²)/(‖a + b‖ + ‖b‖), we have

R_n(α) = n \binom{n}{m}^{-1} ∑_{J ∈ J_m^{(n)}} ( ‖δ_J‖ − ‖γ_J‖ − α_n' (Σ_J^{-1/2})' S(Σ_J^{-1/2}(β_0 − b_J)) )
  = n \binom{n}{m}^{-1} ∑_{J ∈ J_m^{(n)}} ( ( ‖Σ_J^{-1/2}α_n‖² + 2α_n' Σ_J^{-1}(β_0 − b_J) ) / (‖δ_J‖ + ‖γ_J‖) − α_n' (Σ_J^{-1/2})' S(Σ_J^{-1/2}(β_0 − b_J)) )
  = n \binom{n}{m}^{-1} ∑_{J ∈ J_m^{(n)}} ( ‖Σ_J^{-1/2}α_n‖² / (‖δ_J‖ + ‖γ_J‖) + α_n' (Σ_J^{-1/2})' S(Σ_J^{-1/2}(β_0 − b_J)) (‖γ_J‖ − ‖δ_J‖) / (‖δ_J‖ + ‖γ_J‖) )
  ≡ n \binom{n}{m}^{-1} ∑_{J ∈ J_m^{(n)}} C_J, say.

Clearly Cov(C_{J_1}, C_{J_2}) = 0 if J_1 ∩ J_2 = ∅, and 2 Cov(C_{J_1}, C_{J_2}) ≤ Var(C_{J_1}) + Var(C_{J_2}) for J_1, J_2 ∈ J_m^{(n)}. Hence, for large n,

Var(R_n(α)) ≤ n² \binom{n}{m}^{-2} ∑_{i=1}^{m} \binom{m}{i} \binom{n−m}{m−i} ∑_{J ∈ J_m^{(n)}} Var(C_J) ≤ c n \binom{n}{m}^{-1} ∑_{J ∈ J_m^{(n)}} E(C_J²),

where c is a constant independent of n. Since the spatial sign function S(·) is bounded by one and |‖γ_J‖ − ‖δ_J‖| ≤ ‖Σ_J^{-1/2}α_n‖, it follows that |C_J| ≤ 2 min( ‖Σ_J^{-1/2}α_n‖, ‖Σ_J^{-1/2}α_n‖²/‖γ_J‖ ). Accordingly, Var(R_n(α)) → 0, α ∈ R^p, if we show that

\binom{n}{m}^{-1} ∑_{J ∈ J_m^{(n)}} λ_min^{-1}(Σ_J) E( { λ_min^{-1}(Σ_J) ‖γ_J‖^{-2} } ∧ n ) = o(n).   (39)

Under the true model, b_J = β_0 + (X_J' X_J)^{-1} X_J' ε_J. Let Γ_J be an orthogonal matrix such that Γ_J X_J (X_J' X_J)^{-1} Σ_J^{-1} (X_J' X_J)^{-1} X_J' Γ_J' = Λ_J, where Λ_J is a diagonal matrix whose diagonal elements are the eigenvalues of M_J = X_J (X_J' X_J)^{-1} Σ_J^{-1} (X_J' X_J)^{-1} X_J'. Then

‖γ_J‖² = ‖Σ_J^{-1/2}(β_0 − b_J)‖² = ε_J' M_J ε_J = ε_J' Γ_J' Λ_J Γ_J ε_J ≥ λ_min(M_J) ‖ε_J‖².

Therefore, in view of ‖ε_J‖ ≥ |ε_j| for j ∈ J, (39) follows from

\binom{n}{m}^{-1} ∑_{J ∈ J_m^{(n)}} λ_min^{-1}(Σ_J) E( { λ_min^{-1}(Σ_J) λ_min^{-1}(M_J) ε_1^{-2} } ∧ n ) = o(n),   (40)

provided that the covariates x_j's and the errors ε_j's are uncorrelated, which holds in particular when the covariates are deterministic. Since λ_min^{-1}(M_J) = λ_min^{-1}( (X_J' X_J)^{-1} Σ_J^{-1} ) = λ_max( Σ_J (X_J' X_J) ), it follows that (40) is implied by (23). Hence R_n(α) − E(R_n(α)) → 0 in probability for each α ∈ R^p. By Assumption 3,

E(R_n(α)) = ½ α' D α + o(1),   (41)

where D is given in Assumption 3. Therefore,

n{ Ψ_n(β_0 + n^{-1/2}α) − Ψ_n(β_0) − n^{-1/2} α' ∇Ψ_n(β_0) } = ½ α' D α + o_p(1),  α ∈ R^p.

Since the summand in Ψ_n(β) is convex in β, it follows from the convexity lemma (Pollard, 1991) that for any M > 0,

sup_{‖α‖ ≤ M} | n{ Ψ_n(β_0 + n^{-1/2}α) − Ψ_n(β_0) − n^{-1/2} α' ∇Ψ_n(β_0) } − ½ α' D α | = o_p(1).   (42)

Let Δ_n(α) = n[ Ψ_n(β_0 + n^{-1/2}α) − Ψ_n(β_0) ] and α̂_n = arg min_{α ∈ R^p} Δ_n(α). Then α̂_n = n^{1/2}(β̂_n − β_0). Further, for any random vector γ_n bounded in probability, it follows from (42) that

Δ_n(γ_n) = γ_n' n^{1/2} ∇Ψ_n(β_0) + ½ γ_n' D γ_n + o_p(1).   (43)

This shows that Δ_n(γ_n) can be approximated by a quadratic function in γ_n, which is uniquely minimized at γ̂_n = −D^{-1} n^{1/2} ∇Ψ_n(β_0). Like Shen (2009), we shall use the convexity of the criterion function and the characterization of a minimizer to show that γ̂_n and α̂_n are equivalent. Arbitrarily fix ǫ > 0. If ‖α̂_n − γ̂_n‖ > ǫ, then there exists γ̂*_n on the line segment joining α̂_n and γ̂_n such that ‖γ̂*_n − γ̂_n‖ = ǫ.

Clearly γ̂*_n = O_p(1). Substituting it into (43), we obtain

Δ_n(γ̂*_n) = Δ_n(γ̂_n) + ½ (γ̂*_n − γ̂_n)' D (γ̂*_n − γ̂_n) + o_p(1).

Since Δ_n(·) is convex and Δ_n(α̂_n) ≤ Δ_n(γ̂_n), it follows that Δ_n(γ̂*_n) ≤ Δ_n(γ̂_n), so that

½ ǫ² λ_min(D) + o_p(1) ≤ ½ (γ̂*_n − γ̂_n)' D (γ̂*_n − γ̂_n) + o_p(1).

This shows

P( ‖α̂_n − γ̂_n‖ > ǫ ) ≤ P( Δ_n(γ̂*_n) ≤ Δ_n(γ̂_n) ) ≤ P( ½ ǫ² λ_min(D) + o_p(1) ≤ 0 ).

The last probability converges to zero as n tends to infinity by Assumption 3, and this gives the desired equivalence α̂_n − γ̂_n → 0 in probability.

We now apply the central limit theorem for sums of dissociated random variables to determine the asymptotic distribution of γ̂_n. By the Cramér-Wold device and Assumption 3, it suffices to show that

c' n^{1/2} ∇Ψ_n(β_0) ⇒ N(0, c' V c)   (44)

for every c ∈ R^p. We shall apply Theorem 2.1 of Barbour and Eagleson (1985) to prove (44). To this end, we verify their condition (2.7) for

X_J^{(n)} := n^{1/2} \binom{n}{m}^{-1} c' (Σ_J^{-1/2})' S( Σ_J^{-1/2}(β_0 − b_J) ).

Let (s^{(n)})² := Var[ c' n^{1/2} ∇Ψ_n(β_0) ] = Var[ ∑_{J ∈ J_m^{(n)}} X_J^{(n)} ]. Then by Assumption 3, (s^{(n)})² → c' V c as n tends to infinity. This together with (24) gives

n^{2m−2} (s^{(n)})^{-3} ∑_{J ∈ J_m^{(n)}} E|X_J^{(n)}|³ ≤ n^{2m−2} (s^{(n)})^{-3} n^{3/2} \binom{n}{m}^{-3} ‖c‖³ ∑_{J ∈ J_m^{(n)}} λ_min^{-3/2}(Σ_J) = O( n^{-1/2} \binom{n}{m}^{-1} ∑_{J ∈ J_m^{(n)}} λ_min^{-3/2}(Σ_J) ) → 0.

This verifies their condition (2.7) and hence shows the desired (44).

Proof of Theorem 4. First, as in the proof of Theorem 2, we have for α ∈ R^{p−1} that

R_{n,d}(α) := n{ Ψ_{n,d}(β_{1,0} + n^{-1/2}α) − Ψ_{n,d}(β_{1,0}) − n^{-1/2} α' ∇Ψ_{n,d}(β_{1,0}) }
  = n \binom{N}{m}^{-1} ∑_{(K_1,K_2) ∈ K_m^{(n)}} ( ‖Σ_{(K_1,K_2)}^{-1/2} α_n‖² / (‖γ_{(K_1,K_2)}‖ + ‖δ_{(K_1,K_2)}‖) + α_n' (Σ_{(K_1,K_2)}^{-1/2})' S( Σ_{(K_1,K_2)}^{-1/2}(β_{1,0} − b_{(K_1,K_2)}) ) (‖γ_{(K_1,K_2)}‖ − ‖δ_{(K_1,K_2)}‖) / (‖γ_{(K_1,K_2)}‖ + ‖δ_{(K_1,K_2)}‖) ),

where γ_{(K_1,K_2)} = Σ_{(K_1,K_2)}^{-1/2}(β_{1,0} − b_{(K_1,K_2)}) and δ_{(K_1,K_2)} = Σ_{(K_1,K_2)}^{-1/2}(α_n + β_{1,0} − b_{(K_1,K_2)}). Denote the summand by C_{(K_1,K_2)}. Then Cov(C_{(K_1,K_2)}, C_{(K_1',K_2')}) = 0 if (K_1 ∪ K_2) ∩ (K_1' ∪ K_2') = ∅, and 2 Cov(C_{(K_1,K_2)}, C_{(K_1',K_2')}) ≤ Var(C_{(K_1,K_2)}) + Var(C_{(K_1',K_2')}) for (K_1,K_2), (K_1',K_2') ∈ K_m^{(n)}. Since for each (K_1, K_2) ∈ K_m^{(n)} there are at most m(2n − 2m − 1) pairs (j, k) with {j, k} ∩ (K_1 ∪ K_2) ≠ ∅, it follows that, for large n, Var(R_{n,d}(α)) is bounded by

n² \binom{N}{m}^{-2} m(2n − 2m − 1) ∑_{(K_1,K_2) ∈ K_m^{(n)}} Var(C_{(K_1,K_2)}) ≤ c n \binom{N}{m}^{-1} ∑_{(K_1,K_2) ∈ K_m^{(n)}} E( C²_{(K_1,K_2)} ),

where c is a constant independent of n. Arguing as for (39), we derive Var(R_{n,d}(α)) → 0 for α ∈ R^{p−1} if we can show that

\binom{N}{m}^{-1} ∑_{(K_1,K_2) ∈ K_m^{(n)}} λ_min^{-1}(Σ_{(K_1,K_2)}) E( { λ_min^{-1}(Σ_{(K_1,K_2)}) ‖Σ_{(K_1,K_2)}^{-1/2}(β_{1,0} − b_{(K_1,K_2)})‖^{-2} } ∧ n ) = o(n).   (45)

Under the true difference model (5), b_{(K_1,K_2)} = β_{1,0} + (X_{K_1,K_2}' X_{K_1,K_2})^{-1} X_{K_1,K_2}' ε_{K_1,K_2}, so that

‖Σ_{(K_1,K_2)}^{-1/2}(β_{1,0} − b_{(K_1,K_2)})‖² ≥ λ_min(M_{(K_1,K_2)}) ‖ε_{K_1,K_2}‖²,

where M_{(K_1,K_2)} = X_{K_1,K_2} (X_{K_1,K_2}' X_{K_1,K_2})^{-1} Σ_{(K_1,K_2)}^{-1} (X_{K_1,K_2}' X_{K_1,K_2})^{-1} X_{K_1,K_2}'. Further, λ_min^{-1}(M_{(K_1,K_2)}) = λ_max( Σ_{(K_1,K_2)} (X_{K_1,K_2}' X_{K_1,K_2}) ). In view of ‖ε_{K_1,K_2}‖ ≥ |ε_i − ε_j| for some i ∈ K_1, j ∈ K_2 with i ≠ j, we see that (45) is implied by (31).

Analogously, we derive α̂_n − γ̂_n → 0 in probability, where α̂_n = n^{1/2}(β̂_{1,n} − β_{1,0}) for β̂_{1,n} given in Section 2, and

γ̂_n = −D_d^{-1} n^{1/2} ∇Ψ_{n,d}(β_{1,0}) = −D_d^{-1} n^{1/2} \binom{N}{m}^{-1} ∑_{(K_1,K_2) ∈ K_m^{(n)}} (Σ_{(K_1,K_2)}^{-1/2})' S( Σ_{(K_1,K_2)}^{-1/2}(β_{1,0} − b_{(K_1,K_2)}) ).

Next we prove the asymptotic normality of γ̂_n, which yields the asymptotic normality of β̂_{1,n}. First, let I^{(n)} be the collection of the 2m-subsets of {1, ..., n}; there are \binom{n}{2m} such subsets in total. Then we perform a matching: each (K_1, K_2) ∈ K_m^{(n)} is matched with one of the 2m-subsets I ∈ I^{(n)} such that K_1 ∪ K_2 ⊂ I, and we denote the match by (K_1, K_2) ↦ I. Note that for each I, the number of (K_1, K_2) matched to it is bounded by a constant c(m) depending only upon m. Now, for any c ∈ R^{p−1}, we have

c' n^{1/2} ∇Ψ_{n,d}(β_{1,0}) = n^{1/2} \binom{N}{m}^{-1} ∑_{(K_1,K_2) ∈ K_m^{(n)}} c' (Σ_{(K_1,K_2)}^{-1/2})' S( Σ_{(K_1,K_2)}^{-1/2}(β_{1,0} − b_{(K_1,K_2)}) )
  = ∑_{I ∈ I^{(n)}} n^{1/2} \binom{N}{m}^{-1} ∑_{(K_1,K_2) ↦ I} c' (Σ_{(K_1,K_2)}^{-1/2})' S( Σ_{(K_1,K_2)}^{-1/2}(β_{1,0} − b_{(K_1,K_2)}) ) ≡ ∑_{I ∈ I^{(n)}} W_I, say.

The last sum is a sum of dissociated random variables, that is, W_I and W_J are independent if I and J are disjoint. Thus we can apply Theorem 2.1 of Barbour and Eagleson (1985) to prove the asymptotic normality of γ̂_n.

To do so, it suffices to verify their condition (2.7) for X_I^{(n)} = W_I and k = 2m. By Assumption 5, (s^{(n)})² := Var[ c' n^{1/2} ∇Ψ_{n,d}(β_{1,0}) ] → c' V_d c. Thus,

n^{2(2m)−2} (s^{(n)})^{-3} ∑_{I ∈ I^{(n)}} E|W_I|³
  = n^{4m−2} (s^{(n)})^{-3} n^{3/2} \binom{N}{m}^{-3} ∑_{I ∈ I^{(n)}} E| ∑_{(K_1,K_2) ↦ I} c' (Σ_{(K_1,K_2)}^{-1/2})' S( Σ_{(K_1,K_2)}^{-1/2}(β_{1,0} − b_{(K_1,K_2)}) ) |³
  ≤ c n^{4m−2} (s^{(n)})^{-3} n^{3/2} \binom{N}{m}^{-3} ∑_{I ∈ I^{(n)}} ∑_{(K_1,K_2) ↦ I} ‖Σ_{(K_1,K_2)}^{-1/2} c‖³
  ≤ c n^{4m−2} (s^{(n)})^{-3} n^{3/2} \binom{N}{m}^{-3} ‖c‖³ ∑_{(K_1,K_2) ∈ K_m^{(n)}} λ_min^{-3/2}( Σ_{(K_1,K_2)} )
  = O( n^{-1/2} \binom{N}{m}^{-1} ∑_{(K_1,K_2) ∈ K_m^{(n)}} λ_min^{-3/2}( Σ_{(K_1,K_2)} ) ) → 0,

by Assumption 6. This verifies their condition (2.7), so that we obtain the asymptotic normality of c' n^{1/2} ∇Ψ_{n,d}(β_{1,0}) for any c ∈ R^{p−1} and hence that of γ̂_n. Combining this with the established equivalence α̂_n − γ̂_n = n^{1/2}(β̂_{1,n} − β_{1,0}) − γ̂_n → 0 in probability, we obtain the desired asymptotic normality of the difference-based MTSE β̂_{1,n}.

References

[1] Andersen, P.K. and Gill, R.D. (1982). Cox's Regression Model for Counting Processes: A Large Sample Study. Ann. Statist. 10.
[2] Bai, Z.D., Chen, X.R., Miao, B.Q. and Rao, C.R. (1990). Asymptotic Theory of Least Distance Estimate in Multivariate Linear Models. J. Amer. Statist. Assoc. 90.

[3] Akritas, M., Murphy, S. and LaValley, M. (1995). The Theil-Sen Estimator with Doubly Censored Data and Applications to Astronomy. J. Amer. Statist. Assoc. 90.
[4] Barbour, A.D. and Eagleson, G.K. (1985). Multiple Comparisons and Sums of Dissociated Random Variables. Adv. Appl. Prob. 17.
[5] Brown, B.M. (1983). Statistical Use of the Spatial Median. J. Roy. Statist. Soc. Ser. B 45.
[6] Busarova, D., Tyurin, Y., Möttönen, J. and Oja, H. (2006). Multivariate Theil Estimator with the Corresponding Test. Math. Meth. Statist. 15.
[7] Chatterjee, S. and Olkin, I. (2006). Nonparametric Estimation for Quadratic Regression. Statist. & Probabil. Lett. 76.
[8] Chaudhuri, P. (1992). Multivariate Location Estimation Using Extension of R-Estimates Through U-Statistics Type Approach. Ann. Statist. 20.
[9] Dietz, E. J. (1989). Teaching Regression in a Nonparametric Statistics Course. Amer. Statist. 43.
[10] Fernandes, R. and Leblanc, S. (2005). Parametric (Modified Least Squares) and Non-parametric (Theil-Sen) Linear Regressions for Predicting Biophysical Parameters in the Presence of Measurement Errors. Remote Sensing of Environment 95.
[11] Hollander, M. and Wolfe, D. A. (1999). Nonparametric Statistical Methods, 2nd Ed. John Wiley & Sons, NY.
[12] Kemperman, J. H. B. (1987). The Median of a Finite Measure on a Banach Space. Probability and Related Methods. Elsevier Science Publishers B.V. (North-Holland).
[13] Liu, R. (1990). On a Notion of Data Depth Based on Random Simplices. Ann. Statist. 18.
[14] Peng, H., Wang, S. and Wang, X. (2008). Consistency and Asymptotic Distribution of the Theil-Sen Estimator. J. Statist. Plann. Inference 138.

[15] Pollard, D. (1991). Asymptotics for Least Absolute Deviation Regression Estimators. Econometric Theory 7.
[16] Rousseeuw, P.J. and Leroy, A.M. (1987). Robust Regression and Outlier Detection. John Wiley & Sons, Inc.
[17] Sen, P. K. (1968). Estimates of the Regression Coefficient Based on Kendall's Tau. J. Amer. Statist. Assoc. 63.
[18] Serfling, R. (2006). Multivariate Symmetry and Asymmetry. In Encyclopedia of Statistical Sciences, 2nd Ed. (Kotz, Balakrishnan, Read and Vidakovic, Eds.), Wiley.
[19] Shen, G. (2008). Asymptotics of Oja Median Estimate. Statist. & Probabil. Lett. 78.
[20] Shen, G. (2009). Asymptotics of a Theil-type Estimate in Linear Regression. Statist. & Probabil. Lett. 79.
[21] Sprent, P. (1993). Applied Nonparametric Statistical Methods, 2nd Ed. CRC Press, NY.
[22] Theil, H. (1950). A Rank-Invariant Method of Linear and Polynomial Regression Analysis, I. Proc. Kon. Ned. Akad. v. Wetensch. A53.
[23] Wang, X. (2005). Asymptotics of the Theil-Sen Estimator in a Simple Linear Regression Model with a Random Covariate. J. Nonparametr. Statist. 17.
[24] Wang, X., Dang, X., Peng, H. and Zhang, H. (2009). The Theil-Sen Estimators in Multiple Linear Regression Models. Manuscript. Available from hpeng@math.iupui.edu.
[25] Wilcox, R. (1998). Simulations on the Theil-Sen Regression Estimator with Right-Censored Data. Statist. & Probabil. Lett. 39.
[26] Wilcox, R. (2004). Some Results on Extensions and Modifications of the Theil-Sen Regression Estimator. British J. Math. Statist. Psych. 57.

[27] Zhou, W. and Serfling, R. (2008). Multivariate Spatial U-Quantiles: a Bahadur-Kiefer Representation, a Theil-Sen Estimator for Multiple Regression, and a Robust Dispersion Estimator. J. Statist. Plann. Inference 138.


More information

ESTIMATING AND FORMING CONFIDENCE INTERVALS FOR EXTREMA OF RANDOM POLYNOMIALS. A Thesis. Presented to. The Faculty of the Department of Mathematics

ESTIMATING AND FORMING CONFIDENCE INTERVALS FOR EXTREMA OF RANDOM POLYNOMIALS. A Thesis. Presented to. The Faculty of the Department of Mathematics ESTIMATING AND FORMING CONFIDENCE INTERVALS FOR EXTREMA OF RANDOM POLYNOMIALS A Thesis Presented to The Faculty of the Departent of Matheatics San Jose State University In Partial Fulfillent of the Requireents

More information

Nonlinear Log-Periodogram Regression for Perturbed Fractional Processes

Nonlinear Log-Periodogram Regression for Perturbed Fractional Processes Nonlinear Log-Periodogra Regression for Perturbed Fractional Processes Yixiao Sun Departent of Econoics Yale University Peter C. B. Phillips Cowles Foundation for Research in Econoics Yale University First

More information

Moments of the product and ratio of two correlated chi-square variables

Moments of the product and ratio of two correlated chi-square variables Stat Papers 009 50:581 59 DOI 10.1007/s0036-007-0105-0 REGULAR ARTICLE Moents of the product and ratio of two correlated chi-square variables Anwar H. Joarder Received: June 006 / Revised: 8 October 007

More information

E0 370 Statistical Learning Theory Lecture 6 (Aug 30, 2011) Margin Analysis

E0 370 Statistical Learning Theory Lecture 6 (Aug 30, 2011) Margin Analysis E0 370 tatistical Learning Theory Lecture 6 (Aug 30, 20) Margin Analysis Lecturer: hivani Agarwal cribe: Narasihan R Introduction In the last few lectures we have seen how to obtain high confidence bounds

More information

Combining Classifiers

Combining Classifiers Cobining Classifiers Generic ethods of generating and cobining ultiple classifiers Bagging Boosting References: Duda, Hart & Stork, pg 475-480. Hastie, Tibsharini, Friedan, pg 246-256 and Chapter 10. http://www.boosting.org/

More information

Semicircle law for generalized Curie-Weiss matrix ensembles at subcritical temperature

Semicircle law for generalized Curie-Weiss matrix ensembles at subcritical temperature Seicircle law for generalized Curie-Weiss atrix ensebles at subcritical teperature Werner Kirsch Fakultät für Matheatik und Inforatik FernUniversität in Hagen, Gerany Thoas Kriecherbauer Matheatisches

More information

TEST OF HOMOGENEITY OF PARALLEL SAMPLES FROM LOGNORMAL POPULATIONS WITH UNEQUAL VARIANCES

TEST OF HOMOGENEITY OF PARALLEL SAMPLES FROM LOGNORMAL POPULATIONS WITH UNEQUAL VARIANCES TEST OF HOMOGENEITY OF PARALLEL SAMPLES FROM LOGNORMAL POPULATIONS WITH UNEQUAL VARIANCES S. E. Ahed, R. J. Tokins and A. I. Volodin Departent of Matheatics and Statistics University of Regina Regina,

More information

Some Proofs: This section provides proofs of some theoretical results in section 3.

Some Proofs: This section provides proofs of some theoretical results in section 3. Testing Jups via False Discovery Rate Control Yu-Min Yen. Institute of Econoics, Acadeia Sinica, Taipei, Taiwan. E-ail: YMYEN@econ.sinica.edu.tw. SUPPLEMENTARY MATERIALS Suppleentary Materials contain

More information

COS 424: Interacting with Data. Written Exercises

COS 424: Interacting with Data. Written Exercises COS 424: Interacting with Data Hoework #4 Spring 2007 Regression Due: Wednesday, April 18 Written Exercises See the course website for iportant inforation about collaboration and late policies, as well

More information

4 = (0.02) 3 13, = 0.25 because = 25. Simi-

4 = (0.02) 3 13, = 0.25 because = 25. Simi- Theore. Let b and be integers greater than. If = (. a a 2 a i ) b,then for any t N, in base (b + t), the fraction has the digital representation = (. a a 2 a i ) b+t, where a i = a i + tk i with k i =

More information

Generalized Augmentation for Control of the k-familywise Error Rate

Generalized Augmentation for Control of the k-familywise Error Rate International Journal of Statistics in Medical Research, 2012, 1, 113-119 113 Generalized Augentation for Control of the k-failywise Error Rate Alessio Farcoeni* Departent of Public Health and Infectious

More information

A Bernstein-Markov Theorem for Normed Spaces

A Bernstein-Markov Theorem for Normed Spaces A Bernstein-Markov Theore for Nored Spaces Lawrence A. Harris Departent of Matheatics, University of Kentucky Lexington, Kentucky 40506-0027 Abstract Let X and Y be real nored linear spaces and let φ :

More information

Estimating Parameters for a Gaussian pdf

Estimating Parameters for a Gaussian pdf Pattern Recognition and achine Learning Jaes L. Crowley ENSIAG 3 IS First Seester 00/0 Lesson 5 7 Noveber 00 Contents Estiating Paraeters for a Gaussian pdf Notation... The Pattern Recognition Proble...3

More information

. The univariate situation. It is well-known for a long tie that denoinators of Pade approxiants can be considered as orthogonal polynoials with respe

. The univariate situation. It is well-known for a long tie that denoinators of Pade approxiants can be considered as orthogonal polynoials with respe PROPERTIES OF MULTIVARIATE HOMOGENEOUS ORTHOGONAL POLYNOMIALS Brahi Benouahane y Annie Cuyt? Keywords Abstract It is well-known that the denoinators of Pade approxiants can be considered as orthogonal

More information

Supplementary to Learning Discriminative Bayesian Networks from High-dimensional Continuous Neuroimaging Data

Supplementary to Learning Discriminative Bayesian Networks from High-dimensional Continuous Neuroimaging Data Suppleentary to Learning Discriinative Bayesian Networks fro High-diensional Continuous Neuroiaging Data Luping Zhou, Lei Wang, Lingqiao Liu, Philip Ogunbona, and Dinggang Shen Proposition. Given a sparse

More information

Supplementary Material for Fast and Provable Algorithms for Spectrally Sparse Signal Reconstruction via Low-Rank Hankel Matrix Completion

Supplementary Material for Fast and Provable Algorithms for Spectrally Sparse Signal Reconstruction via Low-Rank Hankel Matrix Completion Suppleentary Material for Fast and Provable Algoriths for Spectrally Sparse Signal Reconstruction via Low-Ran Hanel Matrix Copletion Jian-Feng Cai Tianing Wang Ke Wei March 1, 017 Abstract We establish

More information

Optimal Jamming Over Additive Noise: Vector Source-Channel Case

Optimal Jamming Over Additive Noise: Vector Source-Channel Case Fifty-first Annual Allerton Conference Allerton House, UIUC, Illinois, USA October 2-3, 2013 Optial Jaing Over Additive Noise: Vector Source-Channel Case Erah Akyol and Kenneth Rose Abstract This paper

More information

Distributed Subgradient Methods for Multi-agent Optimization

Distributed Subgradient Methods for Multi-agent Optimization 1 Distributed Subgradient Methods for Multi-agent Optiization Angelia Nedić and Asuan Ozdaglar October 29, 2007 Abstract We study a distributed coputation odel for optiizing a su of convex objective functions

More information

Probability Distributions

Probability Distributions Probability Distributions In Chapter, we ephasized the central role played by probability theory in the solution of pattern recognition probles. We turn now to an exploration of soe particular exaples

More information

NORMAL MATRIX POLYNOMIALS WITH NONSINGULAR LEADING COEFFICIENTS

NORMAL MATRIX POLYNOMIALS WITH NONSINGULAR LEADING COEFFICIENTS NORMAL MATRIX POLYNOMIALS WITH NONSINGULAR LEADING COEFFICIENTS NIKOLAOS PAPATHANASIOU AND PANAYIOTIS PSARRAKOS Abstract. In this paper, we introduce the notions of weakly noral and noral atrix polynoials,

More information

Tail Estimation of the Spectral Density under Fixed-Domain Asymptotics

Tail Estimation of the Spectral Density under Fixed-Domain Asymptotics Tail Estiation of the Spectral Density under Fixed-Doain Asyptotics Wei-Ying Wu, Chae Young Li and Yiin Xiao Wei-Ying Wu, Departent of Statistics & Probability Michigan State University, East Lansing,

More information

AN EFFICIENT CLASS OF CHAIN ESTIMATORS OF POPULATION VARIANCE UNDER SUB-SAMPLING SCHEME

AN EFFICIENT CLASS OF CHAIN ESTIMATORS OF POPULATION VARIANCE UNDER SUB-SAMPLING SCHEME J. Japan Statist. Soc. Vol. 35 No. 005 73 86 AN EFFICIENT CLASS OF CHAIN ESTIMATORS OF POPULATION VARIANCE UNDER SUB-SAMPLING SCHEME H. S. Jhajj*, M. K. Shara* and Lovleen Kuar Grover** For estiating the

More information

Physics 215 Winter The Density Matrix

Physics 215 Winter The Density Matrix Physics 215 Winter 2018 The Density Matrix The quantu space of states is a Hilbert space H. Any state vector ψ H is a pure state. Since any linear cobination of eleents of H are also an eleent of H, it

More information

Lower Bounds for Quantized Matrix Completion

Lower Bounds for Quantized Matrix Completion Lower Bounds for Quantized Matrix Copletion Mary Wootters and Yaniv Plan Departent of Matheatics University of Michigan Ann Arbor, MI Eail: wootters, yplan}@uich.edu Mark A. Davenport School of Elec. &

More information

Generalized eigenfunctions and a Borel Theorem on the Sierpinski Gasket.

Generalized eigenfunctions and a Borel Theorem on the Sierpinski Gasket. Generalized eigenfunctions and a Borel Theore on the Sierpinski Gasket. Kasso A. Okoudjou, Luke G. Rogers, and Robert S. Strichartz May 26, 2006 1 Introduction There is a well developed theory (see [5,

More information

STRONG LAW OF LARGE NUMBERS FOR SCALAR-NORMED SUMS OF ELEMENTS OF REGRESSIVE SEQUENCES OF RANDOM VARIABLES

STRONG LAW OF LARGE NUMBERS FOR SCALAR-NORMED SUMS OF ELEMENTS OF REGRESSIVE SEQUENCES OF RANDOM VARIABLES Annales Univ Sci Budapest, Sect Cop 39 (2013) 365 379 STRONG LAW OF LARGE NUMBERS FOR SCALAR-NORMED SUMS OF ELEMENTS OF REGRESSIVE SEQUENCES OF RANDOM VARIABLES MK Runovska (Kiev, Ukraine) Dedicated to

More information

Computational and Statistical Learning Theory

Computational and Statistical Learning Theory Coputational and Statistical Learning Theory Proble sets 5 and 6 Due: Noveber th Please send your solutions to learning-subissions@ttic.edu Notations/Definitions Recall the definition of saple based Radeacher

More information

Asymptotics of weighted random sums

Asymptotics of weighted random sums Asyptotics of weighted rando sus José Manuel Corcuera, David Nualart, Mark Podolskij arxiv:402.44v [ath.pr] 6 Feb 204 February 7, 204 Abstract In this paper we study the asyptotic behaviour of weighted

More information

Using a De-Convolution Window for Operating Modal Analysis

Using a De-Convolution Window for Operating Modal Analysis Using a De-Convolution Window for Operating Modal Analysis Brian Schwarz Vibrant Technology, Inc. Scotts Valley, CA Mark Richardson Vibrant Technology, Inc. Scotts Valley, CA Abstract Operating Modal Analysis

More information

A remark on a success rate model for DPA and CPA

A remark on a success rate model for DPA and CPA A reark on a success rate odel for DPA and CPA A. Wieers, BSI Version 0.5 andreas.wieers@bsi.bund.de Septeber 5, 2018 Abstract The success rate is the ost coon evaluation etric for easuring the perforance

More information

Least Squares Fitting of Data

Least Squares Fitting of Data Least Squares Fitting of Data David Eberly, Geoetric Tools, Redond WA 98052 https://www.geoetrictools.co/ This work is licensed under the Creative Coons Attribution 4.0 International License. To view a

More information

OBJECTIVES INTRODUCTION

OBJECTIVES INTRODUCTION M7 Chapter 3 Section 1 OBJECTIVES Suarize data using easures of central tendency, such as the ean, edian, ode, and idrange. Describe data using the easures of variation, such as the range, variance, and

More information

An Approximate Model for the Theoretical Prediction of the Velocity Increase in the Intermediate Ballistics Period

An Approximate Model for the Theoretical Prediction of the Velocity Increase in the Intermediate Ballistics Period An Approxiate Model for the Theoretical Prediction of the Velocity... 77 Central European Journal of Energetic Materials, 205, 2(), 77-88 ISSN 2353-843 An Approxiate Model for the Theoretical Prediction

More information

Necessity of low effective dimension

Necessity of low effective dimension Necessity of low effective diension Art B. Owen Stanford University October 2002, Orig: July 2002 Abstract Practitioners have long noticed that quasi-monte Carlo ethods work very well on functions that

More information

Non-Parametric Non-Line-of-Sight Identification 1

Non-Parametric Non-Line-of-Sight Identification 1 Non-Paraetric Non-Line-of-Sight Identification Sinan Gezici, Hisashi Kobayashi and H. Vincent Poor Departent of Electrical Engineering School of Engineering and Applied Science Princeton University, Princeton,

More information

Estimation of the Population Mean Based on Extremes Ranked Set Sampling

Estimation of the Population Mean Based on Extremes Ranked Set Sampling Aerican Journal of Matheatics Statistics 05, 5(: 3-3 DOI: 0.593/j.ajs.05050.05 Estiation of the Population Mean Based on Extrees Ranked Set Sapling B. S. Biradar,*, Santosha C. D. Departent of Studies

More information

Model Fitting. CURM Background Material, Fall 2014 Dr. Doreen De Leon

Model Fitting. CURM Background Material, Fall 2014 Dr. Doreen De Leon Model Fitting CURM Background Material, Fall 014 Dr. Doreen De Leon 1 Introduction Given a set of data points, we often want to fit a selected odel or type to the data (e.g., we suspect an exponential

More information

1 Bounding the Margin

1 Bounding the Margin COS 511: Theoretical Machine Learning Lecturer: Rob Schapire Lecture #12 Scribe: Jian Min Si March 14, 2013 1 Bounding the Margin We are continuing the proof of a bound on the generalization error of AdaBoost

More information

Web Appendix for Joint Variable Selection for Fixed and Random Effects in Linear Mixed-Effects Models

Web Appendix for Joint Variable Selection for Fixed and Random Effects in Linear Mixed-Effects Models Web Appendix for Joint Variable Selection for Fixed and Rando Effects in Linear Mixed-Effects Models Howard D. Bondell, Arun Krishna, and Sujit K. Ghosh APPENDIX A A. Regularity Conditions Assue that the

More information

arxiv: v1 [math.nt] 14 Sep 2014

arxiv: v1 [math.nt] 14 Sep 2014 ROTATION REMAINDERS P. JAMESON GRABER, WASHINGTON AND LEE UNIVERSITY 08 arxiv:1409.411v1 [ath.nt] 14 Sep 014 Abstract. We study properties of an array of nubers, called the triangle, in which each row

More information

Shannon Sampling II. Connections to Learning Theory

Shannon Sampling II. Connections to Learning Theory Shannon Sapling II Connections to Learning heory Steve Sale oyota echnological Institute at Chicago 147 East 60th Street, Chicago, IL 60637, USA E-ail: sale@athberkeleyedu Ding-Xuan Zhou Departent of Matheatics,

More information

Best Linear Unbiased and Invariant Reconstructors for the Past Records

Best Linear Unbiased and Invariant Reconstructors for the Past Records BULLETIN of the MALAYSIAN MATHEMATICAL SCIENCES SOCIETY http:/athusy/bulletin Bull Malays Math Sci Soc (2) 37(4) (2014), 1017 1028 Best Linear Unbiased and Invariant Reconstructors for the Past Records

More information

Characterization of the Line Complexity of Cellular Automata Generated by Polynomial Transition Rules. Bertrand Stone

Characterization of the Line Complexity of Cellular Automata Generated by Polynomial Transition Rules. Bertrand Stone Characterization of the Line Coplexity of Cellular Autoata Generated by Polynoial Transition Rules Bertrand Stone Abstract Cellular autoata are discrete dynaical systes which consist of changing patterns

More information

A := A i : {A i } S. is an algebra. The same object is obtained when the union in required to be disjoint.

A := A i : {A i } S. is an algebra. The same object is obtained when the union in required to be disjoint. 59 6. ABSTRACT MEASURE THEORY Having developed the Lebesgue integral with respect to the general easures, we now have a general concept with few specific exaples to actually test it on. Indeed, so far

More information

Non-uniform Berry Esseen Bounds for Weighted U-Statistics and Generalized L-Statistics

Non-uniform Berry Esseen Bounds for Weighted U-Statistics and Generalized L-Statistics Coun Math Stat 0 :5 67 DOI 0.007/s4004-0-009- Non-unifor Berry Esseen Bounds for Weighted U-Statistics and Generalized L-Statistics Haojun Hu Qi-Man Shao Received: 9 August 0 / Accepted: Septeber 0 / Published

More information

12 Towards hydrodynamic equations J Nonlinear Dynamics II: Continuum Systems Lecture 12 Spring 2015

12 Towards hydrodynamic equations J Nonlinear Dynamics II: Continuum Systems Lecture 12 Spring 2015 18.354J Nonlinear Dynaics II: Continuu Systes Lecture 12 Spring 2015 12 Towards hydrodynaic equations The previous classes focussed on the continuu description of static (tie-independent) elastic systes.

More information

Testing Properties of Collections of Distributions

Testing Properties of Collections of Distributions Testing Properties of Collections of Distributions Reut Levi Dana Ron Ronitt Rubinfeld April 9, 0 Abstract We propose a fraework for studying property testing of collections of distributions, where the

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 11 10/15/2008 ABSTRACT INTEGRATION I

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 11 10/15/2008 ABSTRACT INTEGRATION I MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 11 10/15/2008 ABSTRACT INTEGRATION I Contents 1. Preliinaries 2. The ain result 3. The Rieann integral 4. The integral of a nonnegative

More information

LORENTZ SPACES AND REAL INTERPOLATION THE KEEL-TAO APPROACH

LORENTZ SPACES AND REAL INTERPOLATION THE KEEL-TAO APPROACH LORENTZ SPACES AND REAL INTERPOLATION THE KEEL-TAO APPROACH GUILLERMO REY. Introduction If an operator T is bounded on two Lebesgue spaces, the theory of coplex interpolation allows us to deduce the boundedness

More information

A1. Find all ordered pairs (a, b) of positive integers for which 1 a + 1 b = 3

A1. Find all ordered pairs (a, b) of positive integers for which 1 a + 1 b = 3 A. Find all ordered pairs a, b) of positive integers for which a + b = 3 08. Answer. The six ordered pairs are 009, 08), 08, 009), 009 337, 674) = 35043, 674), 009 346, 673) = 3584, 673), 674, 009 337)

More information

Research Article On the Isolated Vertices and Connectivity in Random Intersection Graphs

Research Article On the Isolated Vertices and Connectivity in Random Intersection Graphs International Cobinatorics Volue 2011, Article ID 872703, 9 pages doi:10.1155/2011/872703 Research Article On the Isolated Vertices and Connectivity in Rando Intersection Graphs Yilun Shang Institute for

More information

The Fundamental Basis Theorem of Geometry from an algebraic point of view

The Fundamental Basis Theorem of Geometry from an algebraic point of view Journal of Physics: Conference Series PAPER OPEN ACCESS The Fundaental Basis Theore of Geoetry fro an algebraic point of view To cite this article: U Bekbaev 2017 J Phys: Conf Ser 819 012013 View the article

More information

Curious Bounds for Floor Function Sums

Curious Bounds for Floor Function Sums 1 47 6 11 Journal of Integer Sequences, Vol. 1 (018), Article 18.1.8 Curious Bounds for Floor Function Sus Thotsaporn Thanatipanonda and Elaine Wong 1 Science Division Mahidol University International

More information

GEE ESTIMATORS IN MIXTURE MODEL WITH VARYING CONCENTRATIONS

GEE ESTIMATORS IN MIXTURE MODEL WITH VARYING CONCENTRATIONS ACTA UIVERSITATIS LODZIESIS FOLIA OECOOMICA 3(3142015 http://dx.doi.org/10.18778/0208-6018.314.03 Olesii Doronin *, Rostislav Maiboroda ** GEE ESTIMATORS I MIXTURE MODEL WITH VARYIG COCETRATIOS Abstract.

More information

Ch 12: Variations on Backpropagation

Ch 12: Variations on Backpropagation Ch 2: Variations on Backpropagation The basic backpropagation algorith is too slow for ost practical applications. It ay take days or weeks of coputer tie. We deonstrate why the backpropagation algorith

More information

Quantum algorithms (CO 781, Winter 2008) Prof. Andrew Childs, University of Waterloo LECTURE 15: Unstructured search and spatial search

Quantum algorithms (CO 781, Winter 2008) Prof. Andrew Childs, University of Waterloo LECTURE 15: Unstructured search and spatial search Quantu algoriths (CO 781, Winter 2008) Prof Andrew Childs, University of Waterloo LECTURE 15: Unstructured search and spatial search ow we begin to discuss applications of quantu walks to search algoriths

More information

Topic 5a Introduction to Curve Fitting & Linear Regression

Topic 5a Introduction to Curve Fitting & Linear Regression /7/08 Course Instructor Dr. Rayond C. Rup Oice: A 337 Phone: (95) 747 6958 E ail: rcrup@utep.edu opic 5a Introduction to Curve Fitting & Linear Regression EE 4386/530 Coputational ethods in EE Outline

More information

TABLE FOR UPPER PERCENTAGE POINTS OF THE LARGEST ROOT OF A DETERMINANTAL EQUATION WITH FIVE ROOTS. By William W. Chen

TABLE FOR UPPER PERCENTAGE POINTS OF THE LARGEST ROOT OF A DETERMINANTAL EQUATION WITH FIVE ROOTS. By William W. Chen TABLE FOR UPPER PERCENTAGE POINTS OF THE LARGEST ROOT OF A DETERMINANTAL EQUATION WITH FIVE ROOTS By Willia W. Chen The distribution of the non-null characteristic roots of a atri derived fro saple observations

More information