Journal of Multivariate Analysis 102 (2011) 1315–1319

Superefficient estimation of the marginals by exploiting knowledge on the copula

John H.J. Einmahl, Ramon van den Akker

Department of Econometrics & OR and CentER, Tilburg University, PO Box 90153, NL-5000 LE Tilburg, The Netherlands

Article history: Received 9 November 2010; available online 6 May 2011.
AMS subject classifications: 62G05; 62G20.
Keywords: Copula; Estimation of marginals; Superefficient estimation.

Abstract

We consider the problem of estimating the marginals in the case where there is knowledge on the copula. If the copula is smooth, it is known that it is possible to improve on the empirical distribution functions: optimal estimators still have a rate of convergence $n^{-1/2}$, but a smaller asymptotic variance. In this paper we show that for non-smooth copulas it is sometimes possible to construct superefficient estimators of the marginals: we construct both a copula and, exploiting the information our copula provides, estimators of the marginals with the rate of convergence $\log n / n$.

© 2011 Elsevier Inc. All rights reserved.

1. Introduction

Suppose one observes a random sample from a bivariate distribution. By Sklar's theorem (see, e.g., [5]) the distribution function is determined by its copula and the marginal distributions. In semiparametric copula models, it is assumed that the copula depends on a Euclidean parameter and, apart from (absolute) continuity, no assumptions are imposed on the marginals. The study of efficient estimation for semiparametric copula models originated in [3,2], which focused on efficient estimation of the copula parameter. [3] also noted that exploiting the knowledge on the copula may help to improve on the marginal empirical distribution functions.
Following the setup in [3], [1] and [7, Chapter 5] provide efficient estimators of the marginals, incorporating the information the copula provides, with the standard rate of convergence $n^{-1/2}$ and a limiting distribution that has less spread than the limiting distribution of the empirical distribution functions. In those models smoothness assumptions on the copula are imposed.

This paper shows that in the absence of smoothness, superefficient estimation of the marginals is possible. To this end, we construct, in Section 2, a specific copula. In Section 3, we construct an estimator of the marginals that exploits the information our copula provides, and show that its rate of convergence is $\log n / n$. Our copula is a best copula in the sense that $\log n / n$ is the best possible rate of convergence.

2. The copula

In this section we define our copula. To this end we introduce independent Bernoulli variables $(B_k)_{k \in \mathbb{N}}$ with success probability 1/2, and define Bernoulli variables $(\tilde B_k)_{k \in \mathbb{N}}$ by $\tilde B_k = B_k$ for $k$ odd and $\tilde B_k = 1 - B_k$ for $k$ even. Using these Bernoulli sequences, we introduce the random pair $(U, V)$ by

$$U = \sum_{k=1}^{\infty} \frac{B_k}{2^k}, \quad \text{and} \quad V = \sum_{k=1}^{\infty} \frac{\tilde B_k}{2^k}.$$

Corresponding author. E-mail addresses: j.h.j.einmahl@uvt.nl (J.H.J. Einmahl), r.vdakker@uvt.nl (R. van den Akker).
doi:10.1016/j.jmva.2011.04.015
Hence $V$ is a one-to-one function of $U$ and the inverse is the same function. Note that $U$ and $V$ are uniformly distributed on $[0, 1]$. The joint distribution of $(U, V)$ thus defines a copula, which we will denote by $C$. This copula can be interpreted as an infinite shuffle of min (see [4] for shuffles of min).

We provide a second construction of $C$ that might be more intuitive and allows us to introduce notation that is needed in the remainder of the paper. Define, for $k \in \mathbb{N}$ and $p, q = 1, \ldots, 2^k$, the sets

$$A^{(k)}_{p,q} = \left[(p-1)2^{-k},\, p2^{-k}\right) \times \left[(q-1)2^{-k},\, q2^{-k}\right).$$

Next, we define, for $k \in \mathbb{N}$ and $p = 1, \ldots, 2^k$, indices $q^{(k)}(p)$ as follows. For $k = 1$ we set $q^{(1)}(1) = 1$ and $q^{(1)}(2) = 2$. For $k \geq 2$ we set, for $p = 1, \ldots, 2^{k-1}$,

$$q^{(k)}(2p) = \begin{cases} 2q^{(k-1)}(p), & k \text{ odd}; \\ 2q^{(k-1)}(p) - 1, & k \text{ even}, \end{cases} \quad \text{and} \quad q^{(k)}(2p-1) = \begin{cases} 2q^{(k-1)}(p) - 1, & k \text{ odd}; \\ 2q^{(k-1)}(p), & k \text{ even}. \end{cases}$$

Next we introduce, for $k \in \mathbb{N}$, $S_k = \bigcup_{p=1}^{2^k} A^{(k)}_{p,q^{(k)}(p)}$; see Fig. 1 for an illustration.

Fig. 1. The support $S_k$ of the copula $C_k$ for $k = 1, 2, 3$.

Now we are able to introduce, for $k \in \mathbb{N}$, random variables $(U^{(k)}, V^{(k)})$ that are uniformly distributed on $S_k$ (the density equals $2^k$). Note that $U^{(k)}$ and $V^{(k)}$ are uniformly distributed on $[0, 1]$, so the law of $(U^{(k)}, V^{(k)})$ defines a copula $C_k$. It is easy to see that $C_k \to C$ pointwise, as $k \to \infty$. In particular, we have, for all $k, m \in \mathbb{N}$ and all $p, q = 1, \ldots, 2^k$,

$$P\{(U^{(k)}, V^{(k)}) \in A^{(k)}_{p,q}\} = P\{(U^{(k+m)}, V^{(k+m)}) \in A^{(k)}_{p,q}\} = P\{(U, V) \in A^{(k)}_{p,q}\},$$

and this probability equals $2^{-k}$ in the case $q = q^{(k)}(p)$ and 0 in the case $q \neq q^{(k)}(p)$.

3. The estimator and its limiting behavior

Available is a random sample $(X_1, Y_1), \ldots, (X_n, Y_n)$ from a bivariate distribution function $H$ which has $C$, as defined in Section 2, as copula. By Sklar's theorem we have, for all $(x, y) \in \mathbb{R}^2$, $H(x, y) = C(F(x), G(y))$, where $F$ and $G$ are the marginal distribution functions of $X_1$ and $Y_1$, respectively. The only assumption we impose on $F$ and $G$ is that they belong to $\mathcal{F}$, the set of continuous distribution functions on the real line.
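As a rough illustration of this setting, the following Python sketch draws a sample from a distribution $H$ with copula $C$ and computes the indices $q^{(k)}(p)$ from the recursion of Section 2. It is only a sketch under stated assumptions: the Bernoulli expansions are truncated after 52 binary digits, the marginals are taken to be $F = G = \Phi$, and the function names `cell_indices`, `sample_copula` and `sample_h` are hypothetical names chosen for this example.

```python
import random
from statistics import NormalDist


def cell_indices(k):
    """Return [q^(k)(1), ..., q^(k)(2^k)] via the recursion of Section 2."""
    q = [1, 2]                              # level 1: q(1) = 1, q(2) = 2
    for level in range(2, k + 1):
        nxt = []
        for qp in q:                        # qp = q^(level-1)(p) for p = 1, 2, ...
            if level % 2 == 1:              # odd level: q(2p-1) = 2qp - 1, q(2p) = 2qp
                nxt += [2 * qp - 1, 2 * qp]
            else:                           # even level: q(2p-1) = 2qp, q(2p) = 2qp - 1
                nxt += [2 * qp, 2 * qp - 1]
        q = nxt
    return q


def sample_copula(n, rng, bits=52):
    """Draw n pairs (U, V) from C; expansions are truncated at `bits` digits (assumption)."""
    pairs = []
    for _ in range(n):
        u = v = 0.0
        for k in range(1, bits + 1):
            b = rng.getrandbits(1)                  # B_k, Bernoulli(1/2)
            b_tilde = b if k % 2 == 1 else 1 - b    # digit is flipped at even positions
            u += b * 2.0 ** -k
            v += b_tilde * 2.0 ** -k
        pairs.append((u, v))
    return pairs


def sample_h(n, rng):
    """Sample (X, Y) with copula C and (assumed) standard normal marginals."""
    phi_inv = NormalDist().inv_cdf
    # The filter guards against an exact 0, which inv_cdf rejects (practically never occurs).
    return [(phi_inv(u), phi_inv(v))
            for u, v in sample_copula(n, rng) if u > 0.0 and v > 0.0]


if __name__ == "__main__":
    print(cell_indices(3))
    print(sample_h(3, random.Random(0)))
```

Each simulated pair $(U, V)$ falls, at every level $k$, into the square $A^{(k)}_{p,q^{(k)}(p)}$ determined by its first $k$ binary digits, which is how the two constructions of $C$ line up.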
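We introduce our estimator of $F$ via its quantile function. First, we define $Q_n(u)$ on the set $\{p2^{-k} \mid p = 0, \ldots, 2^k,\ k \geq 1\}$. Set $Q_n(0) = X_{1:n}$, $Q_n(1) = X_{n:n}$, and define $Q_n(p2^{-k})$ for $k \in \mathbb{N}$ and $p \in \{1, \ldots, 2^k - 1\}$ odd, recursively by (we adopt the usual convention $\max \emptyset = -\infty$):

$$Q_n\!\left(\frac{p}{2^k}\right) = \max\left\{ Q_n\!\left(\frac{p-1}{2^k}\right),\ \max_{i \in I^p_k} X_i \right\},$$

with

$$I^p_k = \begin{cases} \left\{ i \in \{1, \ldots, n\} \;\middle|\; X_i \in \left(Q_n\!\left(\tfrac{p-1}{2^k}\right), Q_n\!\left(\tfrac{p+1}{2^k}\right)\right),\ \max\limits_{j:\, X_j \in \left(Q_n\left(\frac{p-1}{2^k}\right),\, X_i\right]} Y_j < \min\limits_{j:\, X_j \in \left(X_i,\, Q_n\left(\frac{p+1}{2^k}\right)\right]} Y_j \right\}, & \text{for } k \text{ odd}, \\[2ex] \left\{ i \in \{1, \ldots, n\} \;\middle|\; X_i \in \left(Q_n\!\left(\tfrac{p-1}{2^k}\right), Q_n\!\left(\tfrac{p+1}{2^k}\right)\right),\ \min\limits_{j:\, X_j \in \left(Q_n\left(\frac{p-1}{2^k}\right),\, X_i\right]} Y_j > \max\limits_{j:\, X_j \in \left(X_i,\, Q_n\left(\frac{p+1}{2^k}\right)\right]} Y_j \right\}, & \text{for } k \text{ even}. \end{cases}$$

Next, we extend the domain to $[0, 1]$ by $Q_n(u) = \sup\{Q_n(p2^{-k}) \mid p2^{-k} \leq u\}$. As estimator of $F$ we take the distribution function associated with $Q_n$. We denote this estimator by $\hat F_n$. Note that $\hat F_n$ can be written as $\hat F_n(x) = \sum_{i=1}^{n} p_i 1_{(-\infty, x]}(X_i)$, where the probability masses $p_i$ only depend on the observations via the ranks $(R^X_j, R^Y_j)$ of $(X_j, Y_j)$, $j = 1, \ldots, n$.

The following theorem is the main result of this paper.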
Theorem 3.1. For $F, G \in \mathcal{F}$ we have ($\|\cdot\|$ denotes the sup-norm):

$$\frac{1}{2} \leq \liminf_{n \to \infty} \frac{n}{\log n} \|\hat F_n - F\| \leq \limsup_{n \to \infty} \frac{n}{\log n} \|\hat F_n - F\| \leq 4 \quad \text{a.s.} \tag{1}$$

The theorem demonstrates that $\hat F_n$ is superefficient, i.e. the rate of convergence is $\log n / n$ instead of the usual rate $n^{-1/2}$.

Remark 1. In the proof of Theorem 3.1 we exploit that any estimator $\tilde F_n$ of $F$ that concentrates on $X_1, \ldots, X_n$ satisfies

$$\liminf_{n \to \infty} \frac{n}{\log n} \|\tilde F_n - F\| \geq \frac{1}{2} \quad \text{a.s.} \tag{2}$$

This property implies that our estimator $\hat F_n$ achieves the best attainable rate of convergence $\log n / n$. As the bound (2) does not depend on the copula, our copula $C$ can be interpreted as a best one (in terms of rate of convergence).

Remark 2. A natural question is whether $Z_n = \left((n/\log n)(\hat F_n(x) - F(x))\right)_{x \in \mathbb{R}}$, seen as an element of $\ell^\infty(\mathbb{R})$, weakly converges (if so, the limit determines the limiting distribution of $(n/\log n)\|\hat F_n - F\|$ by an application of the continuous mapping theorem). The answer is negative. For $F = I$, where $I$ denotes the distribution function of the Uniform[0, 1] distribution, the argument is as follows (the general case easily follows from the uniform case). Since $\hat F_n$ concentrates on the observations and, as we exploit in the proof of Theorem 3.1, the maximal spacing $\Delta_n$ of $n$ i.i.d. draws from the Uniform[0, 1] distribution satisfies $(n/\log n)\Delta_n \to 1$ a.s., we have, for any $\eta \in (0, 1)$, $\epsilon \in (0, 1/2)$ and any finite partition $\bigcup_{i=1}^{k} T_i$ of $[0, 1]$,

$$\lim_{n \to \infty} P\left\{ \sup_i \sup_{u, u' \in T_i} \frac{n}{\log n} \left| \hat F_n(u) - \hat F_n(u') - (u - u') \right| > \epsilon \right\} = 1 > \eta,$$

which shows that $Z_n$ is not tight.

As an illustration, Fig. 2 presents a realization of our estimator and the empirical distribution function $F^{\mathrm{edf}}_n$ for $n = 100$, $F = \Phi$, the standard normal distribution function, and $G \in \mathcal{F}$, and Fig. 3 presents the centered versions of the estimates.

Fig. 2. Realization of $\hat F_n$ (solid) and $F^{\mathrm{edf}}_n$ (dashed) for $n = 100$, $F = \Phi$ (dotted), and $G \in \mathcal{F}$.

Proof of Theorem 3.1. Introduce $U_i = F(X_i)$ and $V_i = G(Y_i)$, and recall that monotone transformations of the marginals do not change the copula. Let $\hat F^U_n$ denote the distribution function resulting from computing $\hat F_n$ from $(U_i, V_i)_{i=1}^{n}$ instead of $(X_i, Y_i)_{i=1}^{n}$.
As $\hat F_n(x) = \hat F^U_n(F(x))$ a.s. we have $\|\hat F_n - F\| = \|\hat F^U_n - I\|$ a.s., which shows that it suffices to prove (1) for $F = G = I$. To stress that we consider uniform marginals we denote the observations by $(U_i, V_i)$ in the remainder of the proof. As the probability of a tie in $(U_i)_{i=1}^{n}$ or $(V_i)_{i=1}^{n}$ equals zero, we throughout work on the event that there are no ties.

Let $\Delta_n = \max_{i=1,\ldots,n+1} (U_{i:n} - U_{i-1:n})$, with $U_{0:n} = 0$ and $U_{n+1:n} = 1$, denote the maximal spacing of $U_1, \ldots, U_n$. Observe that any estimator $\tilde F_n$ of $I$ of the form $\tilde F_n(u) = \sum_{i=1}^{n} p_i 1_{[0,u]}(U_i)$ satisfies $\|\tilde F_n - I\| \geq \Delta_n/2$. Observe that $\|\hat F_n - I\| = \|Q_n - I\|$. As it is well-known (see, e.g., [6]) that $(n/\log n)\Delta_n \to 1$ a.s., we see that the theorem holds once we establish the bound $\|Q_n - I\| \leq 4\Delta_n$. As $|Q_n(0) - 0| \leq \Delta_n$ and $|Q_n(1) - 1| \leq \Delta_n$ we have to prove

$$|Q_n(u) - u| \leq 4\Delta_n, \quad \text{for all } u \in (0, 1). \tag{3}$$
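The two facts about the maximal spacing used in this part of the proof can be explored numerically. The sketch below is an illustration only, with hypothetical helper names: it computes $\Delta_n$ for simulated uniform draws, reports the normalized value $n\Delta_n/\log n$, and evaluates the sup-distance of the empirical distribution function, which, like any estimator that concentrates on the observations, is at least $\Delta_n/2$.

```python
import math
import random


def maximal_spacing(us):
    """Delta_n: the largest gap between consecutive order statistics, with 0 and 1 added."""
    pts = [0.0] + sorted(us) + [1.0]
    return max(b - a for a, b in zip(pts, pts[1:]))


def edf_sup_distance(us):
    """sup_u |F_edf(u) - u| for the empirical distribution function of `us`."""
    xs = sorted(us)
    n = len(xs)
    # The supremum is attained at an observation or as a left limit at an observation.
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))


if __name__ == "__main__":
    rng = random.Random(0)
    for n in (100, 1000, 10000):
        us = [rng.random() for _ in range(n)]
        delta = maximal_spacing(us)
        print(n, delta, n * delta / math.log(n), edf_sup_distance(us))
```

The almost sure limit $n\Delta_n/\log n \to 1$ is approached slowly, so moderate sample sizes should not be expected to give values very close to 1.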
Fig. 3. Realization of $\hat F_n - F$ (solid) and $F^{\mathrm{edf}}_n - F$ (dashed) for $n = 100$, $F = \Phi$, and $G \in \mathcal{F}$.

Denote $\mathcal{U} = \{U_1, \ldots, U_n\}$ and introduce the random variable

$$K = \max\left\{ k \in \mathbb{N} \;\middle|\; \forall p = 1, \ldots, 2^{k+1}: \left(\frac{p-1}{2^{k+1}}, \frac{p}{2^{k+1}}\right] \cap \mathcal{U} \neq \emptyset \right\}.$$

In the case $K = -\infty$ we have $\Delta_n \geq 1/4$ and (3) trivially holds, so we only need to consider $K \geq 1$. We will prove, for $k = 1, \ldots, K$ and $p = 1, \ldots, 2^k - 1$ odd,

$$Q_n\!\left(\frac{p}{2^k}\right) = \max_{i=1,\ldots,n} \{U_i \mid U_i < p2^{-k}\}. \tag{4}$$

Before we prove (4) we show that (4) implies (3). From (4) it is immediate that (3) holds for $u \in \{p2^{-K} \mid p = 1, \ldots, 2^K - 1\}$; to be precise, we have, for $p = 1, \ldots, 2^K - 1$,

$$\left| Q_n\!\left(\frac{p}{2^K}\right) - \frac{p}{2^K} \right| \leq \Delta_n.$$

Let $K' = K + 1$ and note that the intervals $((p-1)2^{-K'}, p2^{-K'}]$ and $(p2^{-K'}, (p+1)2^{-K'}]$ both contain at least one observation. The definition of $Q_n$ and (4) now yield $Q_n(p2^{-K'}) \in [(p-1)2^{-K'}, (p+1)2^{-K'})$ and the definition of $K$ implies $\Delta_n \geq 2^{-(K'+1)}$. A combination of these observations immediately yields

$$\left| Q_n\!\left(\frac{p}{2^{K'}}\right) - \frac{p}{2^{K'}} \right| \leq 2^{-K'} \leq 2\Delta_n,$$

which shows that (3) holds for all $u \in \{p2^{-K'} \mid p = 1, \ldots, 2^{K'} - 1\}$. Finally, we consider $u \in (0, 1)$ with $u2^{K'} \notin \mathbb{N}$. Let $p$ be such that $u \in (p2^{-K'}, (p+1)2^{-K'})$. We easily obtain the bound

$$-4\Delta_n \leq \frac{p}{2^{K'}} - \frac{1}{2^{K'}} - \frac{p+1}{2^{K'}} \leq Q_n(u) - u \leq \frac{p+1}{2^{K'}} + \frac{1}{2^{K'}} - \frac{p}{2^{K'}} \leq 4\Delta_n.$$

We conclude that (3) indeed holds.

We conclude the proof by establishing (4). We start with $k = p = 1$. Since the squares $A^{(1)}_{1,1}$ and $A^{(1)}_{2,2}$ both contain at least two observations and $A^{(1)}_{2,2}$ is north of $A^{(1)}_{1,1}$, it follows from the definition of $Q_n(1/2)$ that $Q_n(1/2) \geq \max_i \{U_i \mid U_i < 1/2\}$. As the square $A^{(2)}_{3,4}$ is north of $A^{(2)}_{4,3}$ and both squares contain at least one observation it is also immediate that $Q_n(1/2) < \min_i \{U_i \mid U_i \geq 1/2\}$. Hence (4) indeed holds for $k = p = 1$.

Suppose that we have shown (4) to hold for $k = 1, \ldots, \bar K - 1$, with $\bar K \leq K$. We show that (4) also holds for $k = \bar K$. We have to discuss the cases $\bar K$ even and $\bar K$ odd separately. As the arguments are similar, we only discuss the case $\bar K$ odd. For $p$ odd we obtain from the induction hypothesis that all observations that are relevant for $Q_n(p2^{-\bar K})$, i.e.
the observations $U_i$ that belong to the interval $(Q_n((p-1)2^{-\bar K}), Q_n((p+1)2^{-\bar K})]$, correspond to observations $(U_i, V_i)$ that fall in the sets $A^{(\bar K)}_{p,q^{(\bar K)}(p)}$ and $A^{(\bar K)}_{p+1,q^{(\bar K)}(p+1)}$. As $\bar K \leq K$ both squares contain at least one observation. As $\bar K$ is odd, $A^{(\bar K)}_{p+1,q^{(\bar K)}(p+1)}$ is north of $A^{(\bar K)}_{p,q^{(\bar K)}(p)}$. It follows that $Q_n(p2^{-\bar K}) \geq \max_i \{U_i \mid U_i < p2^{-\bar K}\}$. The mass that $C$ assigns to the set $A^{(\bar K)}_{p+1,q^{(\bar K)}(p+1)}$ concentrates in the two subsets $A^{(\bar K+1)}_{2p+1,q^{(\bar K+1)}(2p+1)}$ and $A^{(\bar K+1)}_{2(p+1),q^{(\bar K+1)}(2(p+1))}$, and both sets contain at least one observation. As $\bar K + 1$ is even, the set $A^{(\bar K+1)}_{2(p+1),q^{(\bar K+1)}(2(p+1))}$ is south of $A^{(\bar K+1)}_{2p+1,q^{(\bar K+1)}(2p+1)}$. This easily yields $Q_n(p2^{-\bar K}) < \min_i \{U_i \mid U_i \geq p2^{-\bar K}\}$. We conclude that (4) holds for $k = \bar K$ as well, which concludes the induction argument.
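Identity (4) can be checked by simulation. The following Python sketch is one possible reading of the recursive definition of $Q_n$, not a reference implementation: it assumes that the candidate observations $X_i$ range strictly inside the interval between the two neighbouring dyadic quantiles, so that the comparison set to the right of $X_i$ is never empty, and it truncates the Bernoulli expansions at 52 digits. All function names are hypothetical, and the level $K$ is returned as 0 (rather than $-\infty$) when no level qualifies.

```python
import math
import random


def sample_copula(n, rng, bits=52):
    """Draw n pairs (U, V) from C via the (truncated) Bernoulli construction."""
    pairs = []
    for _ in range(n):
        u = v = 0.0
        for k in range(1, bits + 1):
            b = rng.getrandbits(1)
            u += b * 2.0 ** -k
            v += (b if k % 2 == 1 else 1 - b) * 2.0 ** -k
        pairs.append((u, v))
    return pairs


def lowest_terms(p, k):
    """Reduce the dyadic fraction p / 2^k to lowest terms, as a key (p, k)."""
    while k > 0 and p % 2 == 0:
        p, k = p // 2, k - 1
    return p, k


def dyadic_quantiles(xs, ys, depth):
    """Q_n(p / 2^k) for all odd p and k <= depth, stored under the key (p, k)."""
    pts = sorted(zip(xs, ys))
    q = {(0, 0): pts[0][0], (1, 0): pts[-1][0]}    # Q_n(0) = X_{1:n}, Q_n(1) = X_{n:n}
    for k in range(1, depth + 1):
        for p in range(1, 2 ** k, 2):
            lo = q[lowest_terms(p - 1, k)]
            hi = q[lowest_terms(p + 1, k)]
            window = [(x, y) for x, y in pts if lo < x <= hi]   # sorted by x
            best = lo
            for i in range(len(window) - 1):       # candidates strictly below hi
                left = [y for _, y in window[: i + 1]]
                right = [y for _, y in window[i + 1:]]
                if k % 2 == 1:
                    ok = max(left) < min(right)    # right block lies north
                else:
                    ok = min(left) > max(right)    # right block lies south
                if ok:
                    best = window[i][0]            # later candidates are larger
            q[(p, k)] = best
    return q


def resolution_level(us):
    """Largest k with every ((p-1)/2^(k+1), p/2^(k+1)] non-empty; 0 if there is none."""
    k = 0
    while True:
        m = 2 ** (k + 2)                           # number of cells at level k + 1
        cells = {math.ceil(u * m) for u in us if u > 0.0}
        if len(cells) < m:
            return k
        k += 1


if __name__ == "__main__":
    pairs = sample_copula(300, random.Random(0))
    us, vs = [u for u, _ in pairs], [v for _, v in pairs]
    K = resolution_level(us)
    q = dyadic_quantiles(us, vs, K)
    print(K, all(q[(p, k)] == max(u for u in us if u < p / 2 ** k)
                 for k in range(1, K + 1) for p in range(1, 2 ** k, 2)))
```

The quadratic scan over each window keeps the sketch close to the definition; a version based on running maxima and minima would be faster but harder to compare with the text.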
References

[1] X. Chen, Y. Fan, V. Tsyrennikov, Efficient estimation of semiparametric multivariate copula models, Journal of the American Statistical Association 101 (2006) 1228–1240.
[2] C. Genest, B.J.M. Werker, Conditions for the asymptotic semiparametric efficiency of an omnibus estimator of dependence parameters in copula models, in: C.M. Cuadras, J. Fortiana, J.A. Rodríguez-Lallena (Eds.), Distributions with Given Marginals and Statistical Modeling, Kluwer, Dordrecht, 2002, pp. 103–112.
[3] C.A.J. Klaassen, J.A. Wellner, Efficient estimation in the bivariate normal copula model: normal margins are least favourable, Bernoulli 3 (1997) 55–77.
[4] P. Mikusinski, H. Sherwood, M. Taylor, Shuffles of min, Stochastica XIII (1992) 61–74.
[5] R. Nelsen, An Introduction to Copulas, 1st ed., Springer-Verlag, New York, 1999.
[6] E. Slud, Entropy and maximal spacings for random partitions, Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete 41 (1978) 341–352.
[7] R. Van den Akker, Integer-valued time series, Ph.D. Thesis, CentER Dissertation Series 197, Tilburg University, 2007. Available at: http://arno.uvt.nl/show.cgi?did=306632.