On the empirical multilinear copula process for count data

Size: px
Start display at page:

Download "On the empirical multilinear copula process for count data"

Transcription

1 Bernoulli 20(3), 2014, DOI: /13-BEJ524 arxiv: v1 [math.st] 4 Jul 2014 On the empirical multilinear copula process for count data CHRISTIAN GENEST 1,*,**, JOHANNA G. NEŠLEHOVÁ1,, and BRUNO RÉMILLARD2 1 Department of Mathematics and Statistics, McGill University, 805, rue Sherbrooke ouest, Montréal (Québec), Canada H3A 0B9. * cgenest@math.mcgill.ca; johanna@math.mcgill.ca; url: ** Service de l enseignement des méthodes quantitatives de gestion, HEC Montréal, 3000, chemin de la Côte-Sainte-Catherine, Montréal (Québec), Canada H3T 2A7. bruno.remillard@hec.ca; url: neumann.hec.ca/pages/bruno.remillard/ Continuation refers to the operation by which the cumulative distribution function of a discontinuous random vector is made continuous through multilinear interpolation. The copula that results from the application of this technique to the classical empirical copula is either called the multilinear or the checkerboard copula. As shown by Genest and Nešlehová (Astin Bull. 37 (2007) ) and Nešlehová (J. Multivariate Anal. 98 (2007) ), this copula plays a central role in characterizing dependence concepts in discrete random vectors. In this paper, the authors establish the asymptotic behavior of the empirical process associated with the multilinear copula based on d-variate count data. This empirical process does not generally converge in law on the space C([0,1] d ) of continuous functions on [0,1] d, equipped with the uniform norm. However, the authors show that the process converges in C(K) for any compact K O, where O is a dense open subset of [0,1] d, whose complement is the Cartesian product of the ranges of the marginal distribution functions. This result is sufficient to deduce the weak limit of many functionals of the process, including classical statistics for monotone trend. It also leads to a powerful and consistent test of independence which is applicable even to sparse contingency tables whose dimension is sample size dependent. Keywords: checkerboard copula; contingency table; count data; empirical process; Kendall s tau; mid-ranks; multilinear extension copula; Spearman s rho; test of independence 1. Introduction This paper s central message is that there are advantages, both conceptual and technical, to viewing a contingency table as arising from a multivariate distribution having uniform margins on the unit interval, that is, a copula. As will be shown here, this approach This is an electronic reprint of the original article published by the ISI/BS in Bernoulli, 2014, Vol. 20, No. 3, This reprint differs from the original in pagination and typographic detail c 2014 ISI/BS

2 2 C. Genest, J.G. Nešlehová and B. Rémillard leads to new statistical methodology that can be used to analyze tables that are sparse or whose number of categories grows with the sample size. To go straight to the point, consider the simple case of a K L contingency table derived from a random sample of size n of ordinal or interval responses in ordered categories A 1 < <A K and B 1 < <B L. For arbitrary k 1,...,K} and l 1,...,L}, let f kl be the relative frequency of the pair (A k,b l ) and denote by f k+ and f +l the row-wise and column-wise totals, respectively. Further set F k+ = k f i+, F +l = i=1 l f +i, and let F 0+ =F +0 =0. A density ĉ n with respect to the Lebesgue measure can then be defined (almost everywhere) on [0,1] 2 by setting ĉ n (u,v)= f kl f k+ f +l whenever u (F (k 1)+,F k+ ) and v (F +(l 1),F +l ). As shown in Section 2, the corresponding distribution function Ĉ n is a copula, that is, its margins are uniform on [0,1]. Moreover, when f kl =f k+ f +l for all k 1,...,K} and l 1,...,L}, Ĉ n becomes the independence copula Π whose Lebesgue density is identically equal to 1 on [0,1] 2. More significantly, several standard measures of association in the pair (X, Y), and classical tests of independence between X and Y, are based on Ĉ n. For example, Pearson s χ 2 statistic and the likelihood ratio statistic G 2 are immediately seen to satisfy χ 2 =n G 2 =2n K L k=1 l=1 K L k=1 l=1 (f kl f k+ f l+ ) 2 f k+ f l+ =n ( ) fkl f kl ln =2n f k+ f +l i= ĉ n(u,v) 1} 2 dvdu, (1.1) 0 lnĉ n(u,v)}dĉ n (u,v). (1.2) With some additional work (Nešlehová [14]) it can also be shown that the well-known Spearman and Kendall statistics for testing monotone trend(agresti[1]) can be rewritten in terms of Ĉ n. Many other examples could be given. The introduction of the multilinear empirical copula Ĉ n in this context is not merely a neat way of unifying various known statistics for frequency data analysis. Because integral expressions such as (1.1) and (1.2) make sense even when the number of categories changeswith n, Ĉ n isratherakeytool forthe investigationofnew orexistingprocedures that can be used even in cases where the table is sparse or of varying dimension. Further, it may be seen that when X and Y are continuous, Ĉ n is a smoothed version of the classical empirical copula (Deheuvels [3]) from which it differs by at most a factor of 1/n uniformly. Statistical tools based on Ĉ n can thus bridge the gap between continuous and discrete outcomes. In particular, the problems associated with ties, which

3 Empirical copula process for count data 3 invalidate many of the procedures developed for continuous data (Genest, Nešlehová and Ruppert [8]), are then automatically taken care of. While it seems intuitively reasonable to base inference on Ĉ n, this new approach generally requires the knowledge of its limit C and the asymptotic behavior of the corresponding empirical process Ĉ n = n(ĉ n C ), (1.3) which has hitherto never been studied in the literature. This paper contributes to the problem by determining the asymptotic behavior of the process (1.3) in general dimension d 2 when the components of the underlying random vector X=(X 1,...,X d ) are either integer-valued or strictly increasing transformations thereof.aswillbeseen, Ĉ n isaconsistentestimatoroftheso-calledmultilinearextension (or checkerboard) copula C of X. This limiting copula, defined in Section 2, has been studied earlier, for example, by Genest and Nešlehová[7] and Nešlehová[14], who showed that it captures many important dependence properties of X when d = 2. In particular, when the components of X are independent, C is the independence copula Π. The main result, stated in Section 3, gives the asymptotic behavior of the process(1.3). Unless the components of X are mutually independent, Ĉ n does not generally converge on the space C([0,1] d ) of continuous functions on [0,1] d equipped with the uniform norm because C has discontinuous partial derivatives. Fortunately, Ĉ n converges without any regularity conditions in the subspace C(K) for any compact subset K O, where O is a dense open subset of [0,1] d whose complement is the Cartesian product of the ranges of the marginal distribution functions. The proof of the main result is involved; it is outlined in Section 4 and detailed in the Appendix. To illustrate the usefulness of the process (1.3) for inference, Section 5 provides a few initial examples of application. It is first shown that the main result is sufficient to deduce the limiting distribution of classical statistics for monotone trend such as Spearman s rho and Kendall s tau. Moreover, a new and consistent Cramér von Mises type test of independence is proposed that can be used whatever the margins. As illustrated through a small simulation study, it performs very well even for sparse contingency tables whose dimension is sample size dependent; in all cases considered, it is consistently more powerful than the classical chi-squared test. Section 6 concludes. 2. The multilinear extension copula Suppose that X =(X 1,...,X d ) is a vector of discrete random variables with joint cumulative distribution function H. For each j 1,...,d}, let F j denote the distribution function of X j and assume that there exists a strictly increasing function A j : N R such that supp(x j ) A j (k): k N}. Note that the inclusion may be strict; in particular, it is not assumed that PrX j = A j (k)} > 0 holds for all k N or that the support of X j is infinite. Furthermore, observe that the closure of the range of F j, viz. R j =0,1,F j A j (0)},F j A j (1)},...}, defines a partition of [0,1]. In what follows, A j ( 1)=A j (0) 1 for all j 1,...,d} by convention.

4 4 C. Genest, J.G. Nešlehová and B. Rémillard Definition 2.1. The multilinear extension copula C of H is the unique copula whose density with respect to the Lebesgue measure is given by c (u 1,...,u d )= PrX 1 =A 1 (k 1 ),...,X d =A d (k d )} PrX 1 =A 1 (k 1 )} PrX d =A d (k d )} whenever for all j 1,...,d}, F j A j (k j 1)}<u j F j A j (k j )} for some k j N. AnexplicitformofC,whichiseasilyverifiedbydifferentiation,isgiveninProposition 2.1 below. For each j 1,...,d} and u [0,1], let u j and u + j be, respectively, the greatest and the least element of R j such that u j u u+ j. Further let (u u λ Fj (u)= j )/(u + j u j ), if u j u+ j, 1, otherwise. Thus when k N is such that F j A j (k)} = PrX j = A j (k)} > 0, then for all u (F j A j (k 1)},F j A j (k)}), one has u j =F ja j (k 1)}, u + j =F ja j (k)} and λ Fj (u)= u F ja j (k 1)}. F j A j (k)} Furthermore, if Fj 1 is the pseudo-inverse of F j, then F j Fj 1 (u j )=F ja j (k 1)} and F j F 1 j (u + j )=F ja j (k)}. Finally, for any S 1,...,d} and u 1,...,u d [0,1], set λ H,S (u 1,...,u d )= l S λ Fl (u l ) l/ S1 λ Fl (u l )}, which depends on H only through its margins F 1,...,F d. Proposition 2.1. The multilinear extension copula C of H is given by C (u 1,...,u d )= λ H,S (u 1,...,u d )HF1 1 (u S1 ),...,F 1 d (u S d )}, S 1,...,d} where for each j 1,...,d}, u Sj =u + j if j S and u Sj =u j otherwise. In particular, C (u S1,...,u Sd )=HF1 1 (u S1 ),...,F 1 d (u S d )} for any S 1,...,d}. It is easily seen that C satisfies Sklar s representation, that is, for all x 1,...,x d R, H(x 1,...,x d )=C F 1 (x 1 ),...,F d (x d )}. This is because in effect, this identity needs only be verified if for all j 1,...,d}, x j = A j (k j ) for some k j N such that F j A j (k j )} > 0. In fact, C is precisely the construction used to extend a sub-copula to a copula in the proof of Sklar s theorem; see, for example, Nelsen [13] for details in the bivariate case.

5 Empirical copula process for count data 5 The copula C is known to capture many important dependence properties of H, as summarized by Genest and Nešlehová [7]. As shown by Nešlehová ([14], Corollary 6), C is invariant with respect to strictly increasing transformations of the margins. Now consider a random sample X =(X 11,...,X 1d ),...,(X n1,...,x nd )} from H and let H n be the corresponding empirical distribution function. Because H n is itself a discrete distribution, one can define its multilinear extension copula Ĉ n and its corresponding density ĉ n with respect to the Lebesgue measure as above. To be explicit, fix j 1,...,d} and denote by A nj (0)< <A nj (n j ) the distinct values of X 1j,...,X nj. Let also A nj ( 1)=A nj (0) 1. The range R nj of F nj then consists of 0=F nj A nj ( 1)}<F nj A nj (0)}< <1=F nj A nj (n j )}. If (u 1,...,u d ) is such that for all j 1,...,d}, F nj A nj (k j 1)}<u j F nj A nj (k j )} for some k j 0,...,n j }, then ĉ n(u 1,...,u d )= h n A n1 (k 1 ),...,A nd (k d )} F n1 A n1 (k 1 )} F nd A nd (k d )}, whose numerator is the proportion of data with X ij =A nj (k j ) for j 1,...,d}, and Ĉn (u 1,...,u d )= λ Hn,S(u 1,...,u d )H n Fn1 1 (u S 1 ),...,F 1 nd (u S d )}. S 1,...,d} Observe that ĉ n and Ĉ n are both functions of the component-wise ranks. As announced in the Introduction, Ĉn is a consistent estimator of the multilinear extension copula C of H. This fact will be a consequence of this paper s main result, Theorem 3.1, which characterizes the limit of the process Ĉ n defined in (1.3). Remark 2.1. When X 1,...,X d are continuous, Ĉ n was actually used by Deheuvels [4] to construct tests of independence. It is then asymptotically equivalent to the empirical copula Ĉn given, for all u 1,...,u d [0,1], by Ĉ n (u 1,...,u d )= 1 n n 1F n1 (X i1 ) u 1,...,F nd (X id ) u d }. i=1 Indeed, if for all j 1,...,d}, F nj A nj (k j 1)} u j < F nj A nj (k j )} for some k j 0,...,n j }, then Ĉn(u 1,...,u d )=H n A n1 (k 1 1),...,A nd (k d 1)}. Because the coefficients λ Hn,S are non-negative and add up to 1 by the multinomial formula, the fact that H n is non-decreasing component-wise implies that Ĉ n (u 1,...,u d ) Ĉ n (u 1,...,u d ) H n A n1 (k 1 ),...,A nd (k d )}. Hence, Ĉn(u 1,...,u d ) Ĉ n (u 1,...,u d ) is bounded above by H n A n1 (k 1 ),...,A nd (k d )} H n A n1 (k 1 1),...,A nd (k d 1)}

6 6 C. Genest, J.G. Nešlehová and B. Rémillard F nj A nj (k j )} F nj A nj (k j 1)}, from which it follows that Ĉn Ĉ n d/n almost surely. This also implies that Ĉ n is asymptotically equivalent to other versions of the empirical copula commonly used in the literature; see, for example, Fermanian, Radulović and Wegkamp [6]. To ease the notation, it will be assumed henceforth, without loss of generality, that X 1,...,X d are integer-valued. In this case, one has the following alternative representation of C, which is useful to study the process (1.3). Proposition 2.2. Let (X 1,...,X d ) be a random vector in N d with distribution function H. Let also U 1,...,U d be independent standard uniform random variables, independent of (X 1,...,X d ). Then C is the unique copula of the distribution function H of (X 1 +U 1 1,...,X d +U d 1) with margins F 1,...,F d, that is, for all u 1,...,u d [0,1], C (u 1,...,u d )=H F 1 1 (u 1 ),...,F 1 d (u d )}. Given an empirical distribution function H n based on a random sample from a multivariate integer-valued distribution H, one can proceed as in Proposition 2.2 to define a multilinear extension Hn whose margins F n1,...,f nd are continuous extensions of the margins F n1,...,f nd of H n. Furthermore, Ĉ n (u 1,...,u d )=H n F 1 n1 (u 1 ),...,F 1 nd (u d )} holds for all u 1,...,u d [0,1], which will come in handy in Section The empirical multilinear copula process In what follows, C(K) stands for the space of all continuous functions from a compact set K [0,1] d to R equipped with the uniform norm, that is, f K = sup f(u 1,...,u d ) : (u 1,...,u d ) K}. When K =[0,1] d, the index on is suppressed. Similarly, let l (K) denote the space of all bounded functions from K to R equipped with the uniform norm. For each j 1,...,d} and all u 1,...,u d (0,1) where the partial derivatives exist, set Ċ j (u 1,...,u d )= u j C (u 1,...,u d ). Furthermore, let B C be a C -Brownian bridge, that is, a centred Gaussian process on [0,1] d with covariance given, for all s 1,...,s d,t 1,...,t d [0,1], by C (s 1 t 1,...,s d t d ) C (s 1,...,s d )C (t 1,...,t d ). Here, a b=min(a,b) for arbitrary a,b R. The limit of Ĉ n can be expressed in terms of a transformation of B C involving the following operator.

7 Empirical copula process for count data 7 Definition 3.1. Let H be a multivariate distribution function with support included in N d and margins F 1,...,F d. The multilinear interpolation operator M H : l ([0,1] d ) l ([0,1] d ):g M H (g) is defined, for every g l ([0,1] d ), by M H (g)(u 1,...,u d )= λ H,S (u 1,...,u d )g(u S1,...,u Sd ). S 1,...,d} As was the case with λ H,S, the operator M H depends on H only through its margins. Although the paths of the process Ĉ n are continuous on [0,1] d for every n, it cannot possibly converge in C([0,1] d ) in general. This is because unless C = Π, its partial derivatives exist only on the open set O= (F 1 (k 1 1),F 1 (k 1 )) (F d (k d 1),F d (k d )). (k 1,...,k d ) N d Fortunately, the convergence of Ĉ n can be established in C(K) for any compact K O. The symbol is used henceforth to denote weak convergence. Theorem 3.1. Let C =M H (B C ) and let K be any compact subset of O. Then, as n, Ĉ n Ĉ in C(K), where, for all (u 1,...,u d ) O, Ĉ (u 1,...,u d )=C (u 1,...,u d ) Ċj (u 1,...,u d )C (1,...,1,u j,1,...,1). This theorem can be strengthened when X 1,...,X d are mutually independent, which is the case if and only if C is the independence copula Π. Corollary 3.1. Suppose that C =Π. Then, as n, Ĉ n Ĉ in C([0,1] d ). Remark 3.1. When X 1,...,X d are continuous, C =C is the unique copula of H and Ĉn is asymptotically equivalent to the empirical copula Ĉn by Remark 2.1. Rüschendorf [16] showed that under suitable regularity conditions on C, Ĉ n Ĉ as n, where Ĉ is defined in terms of a C-Brownian bridge B C, for all u 1,...,u d [0,1], by Ĉ(u 1,...,u d )=B C (u 1,...,u d ) Ċ j (u 1,...,u d )B C (1,...,1,u j,1,...,1). This result has since been refined in various ways; see Segers [18] and references therein. 4. Proof of Theorem 3.1 The proofofthe mainresult is quite involved.it restson aseriesofstepsand propositions that are described below. All proofs may be found in Appendix B.

8 8 C. Genest, J.G. Nešlehová and B. Rémillard Because C is a copula of H, it can be assumed without loss of generality that the sample X from H arises from a random sample V =(V 11,...,V 1d ),...,(V n1,...,v nd )} from C, that is, for every i 1,...,n}, one has X i1 =F1 1 (V i1 ),...,X id =F 1 d (V id). If B n denotes the empirical distribution function of this latent sample V, it is well known thatasn,the correspondingempiricalprocessb n = n(b n C ) convergesweakly in l ([0,1] d ) to the C -Brownian bridge B C (van der Vaart and Wellner [19]). The first step consists of considering the case where the margins of H are known. In contrast to the continuous case, the variables F 1 (X 1 ),...,F d (X d ) are not uniform and their joint distribution function D is not a copula. Observe that C = M H (D) and introduce Cn =M H(D n ), where D n denotes the empirical distribution function of the transformed data (F 1 (X 11 ),...,F d (X 1d )),...,(F 1 (X n1 ),...,F d (X nd )). Note that Cn cannot be computed in practice, because it relies on the unknown marginal distribution functions. As is easily seen by differentiation, Cn is a continuous distribution function on [0,1] d whose jth margin is given, for all u [0,1], by C nj(u)=λ Fj (u)d nj (u + )+1 λ Fj (u)}d nj (u ). Because its margins are not uniform, Cn is not a copula. The following proposition shows that the empirical process C n = n(cn C ) converges. Its proof rests on the factthat M H isacontinuouslinearcontraction.thisisbecausethe weightsλ H,S arenonnegativeandaddupto1,sothatforanyg,g l ([0,1] d ),onehas M H (g) M H (g ) g g. Proposition 4.1. As n, C n C =M H (B C ) in C([0,1] d ). Next, the process Ĉ n in which margins are unknown can be written in the form where the summands are defined, for all u 1,...,u d [0,1], by Ĉ n = C n + D n, (4.1) C n (u 1,...,u d )= n[h n F 1 n1 (u 1 ),...,F 1 nd (u d )} H F 1 n1 (u 1 ),...,F 1 nd (u d )}] and D n (u 1,...,u d )= n[c F 1 F 1 n1 (u 1 ),...,F d F 1 nd (u d )} C (u 1,...,u d )]. The next proposition shows that C n has the same asymptotic behavior as C n. Proposition 4.2. As n, C n C n p 0. Next, one needs to determine the limit of the second summand in (4.1). The following result first shows that D n has the same asymptotic behavior as that of the auxiliary

9 Empirical copula process for count data 9 process D n defined, for all u 1,...,u d [0,1], by D n (u 1,...,u d )= n [C u 1 C n1(u 1 ),...,u d C nd (u ] d) } C (u 1,...,u d ), n n where C n1,...,c nd are the margins of C n. Proposition 4.3. As n, D n D n p 0. Finally, fix an arbitrary compact subset K of O and consider the mapping D K : C([0,1] d ) C(K) defined, for all g C([0,1] d ) and (u 1,...,u d ) K, by D K (g)(u 1,...,u d )= Ċj (u 1,...,u d )g(1,...,1,u j,1,...,1). This mapping is clearly linear and continuous because for any g, g C([0,1] d ), D K (g) D K (g ) Ċj (u 1,...,u d ) g g d g g. For, when they exist, the partial derivatives of any copula take values in [0,1]. The Continuous Mapping theorem then implies that, as n, D K (C n) D K (C ) in C(K). As shown next, the difference between D n and D K (C n ) is asymptotically negligible. Proposition 4.4. As n, D n D K (C n) K p 0 for any compact K O. To complete the proof of Theorem 3.1, let K be any compact subset of O. Combining Propositions , one finds that, as n, Ĉ n C n D K(C n ) K p 0. The Continuous Mapping theorem can then be invoked together with Proposition 4.1 to conclude that Ĉ n Ĉ =C +D K (C ). To establish Corollary 3.1, first note that when C =Π, Ċ j is continuous on [0,1] d for all j 1,...,d}. One can then define D as D K with K =[0,1] d and use the following result to conclude. Proposition 4.5. When C =Π, D n D(C n) p 0 as n. Remark 4.1. Although the process Ĉ n fails to converge on C([0,1]d ) in general, the sequence Ĉ n is tight. Indeed, the definition of D n and the Lipschitz property of C imply that D n C n1 + + C nd d C n. From (4.1) and the triangle inequality, Ĉ n (d+1) C n + C n C n + D n D n. (4.2)

10 10 C. Genest, J.G. Nešlehová and B. Rémillard The result thus follows because the three summands form tight sequences. Indeed, C n converges weakly in C([0,1] d ) by Proposition 4.1 and the other two terms converge in probability to 0 by Propositions 4.2 and 4.3, respectively. It is further of interest to observe that because Ċ j O 1 for all j 1,...,d}, one has Ĉ O (d+1) C. Finally, note that Ĉ n is a uniformly consistent estimator of C. This follows immediately from (4.2), the Continuous Mapping theorem and Slutsky s lemma. Corollary 4.1. As n, Ĉ n C p Applications Theorem 3.1 characterizes the weak limit of the empirical process Ĉ n in C(K) for any compact subset K of O. To illustrate the usefulness of this result for inference, a few initial examples of application are provided below. They pertain to classical statistics for monotone trend and tests of independence, respectively Tests of monotone trend Kendall s tau and Spearman s rho are two classical measures of monotone trend for twoway cross-classifications of ordinal or interval data. As described, for example, in Agresti [1], powerful tests of independence can be based on these statistics. Both of them are functions of (mid-) ranks that can be expressed as functionals of Ĉ n (Nešlehová [14]). Given a random sample X =(X 11,X 12 ),...,(X n1,x n2 )} from a bivariate distribution function H, let R ij denote the component-wise mid-rank of X ij for i 1,...,n} and j 1,2}. Let also a n and b n, respectively, represent the number of strictly concordant and discordant pairs in the sample. The non-normalized versions of Kendall s and Spearman s coefficients then satisfy τ n = a n b ( n n 2 ρ n = 12 n 3 ) = n 1 n n ( i=1 R i1 n )( R i2 n+1 2 Ĉ n (u,v)dĉ n (u,v) }, ) =12 Ĉ n (u,v) uv}dπ(u,v). [0,1] 2 It is immediate from Corollary 4.1 that τ n and ρ n are consistent estimators of τ = C (u,v)dc (u,v), ρ=12 C (u,v) uv}dπ(u,v). [0,1] 2 It is well known that τ n is a U-statistic and hence asymptotically Gaussian (Lee [11]). Its limiting behavior can also be deduced from Theorem 3.1. To see this, first call on

11 Empirical copula process for count data 11 Hoeffding s identity (Nelsen [13], Corollary 5.1.2) to write C (u,v)dĉ n (u,v)= Ĉn (u,v)dc (u,v). [0,1] 2 [0,1] 2 Given that Ĉ n and C are absolutely continuous with respect to the Lebesgue measure, the fact that the complement of O in [0,1] d has Lebesgue measure 0 then implies that } n Ĉn (u,v)dĉ n (u,v) C (u,v)dc (u,v) [0,1] 2 [0,1] 2 = Ĉ n(u,v)dĉ n (u,v)+ Ĉ n(u,v)dc (u,v). O The following representation for the limit of n(τ n τ) can be deduced from this relation. Details are provided in Appendix C. Proposition 5.1. In dimension d=2, n(τ n τ) converges weakly, as n, to the centred Gaussian random variable T 2 =8 Ĉ (u,v)dc (u,v). O Similarly, the asymptotic normality of n(ρ n ρ) can be deduced from the theory of U-statistics; see, for example, Quessy [15]. The latter paper also considers several d- variate extensions of ρ n which mimic the multivariate versions of these coefficients for continuous data proposed by Schmid and Schmidt [17]. Recently, we proposed alternative estimators of ρ in the multivariate case and showed that they lead to powerful tests of independence and a graphical tool for visualizing dependence in discrete data (Genest, Nešlehová and Rémillard [9]). In particular, we considered ρ nd = d [ 1 2 d + 1 n O n d ( 2n+1 2n R ) }] ij, n i=1 where d =2 d (d+1)/2 d (d+1)} and for each i 1,...,n} and j 1,...,d}, R ij denotesthe mid-rankofx ij amongx 1j,...,X nj.thelatterreducesto ρ n inthe bivariate case and can be rewritten as ρ nd = d Ĉ n (u 1,...,u d ) Π(u 1,...,u d )}dπ(u 1,...,u d ). [0,1] d Furthermore, it is a consistent estimator of ρ d = d C (u 1,...,u d ) Π(u 1,...,u d )}dπ(u 1,...,u d ). [0,1] d

12 12 C. Genest, J.G. Nešlehová and B. Rémillard The asymptotic normality of n(ρ nd ρ d ), established by Genest, Nešlehová and Rémillard [9], can be shown alternatively using Theorem 3.1. A detailed proof of the following result is given in Appendix C. Proposition 5.2. In arbitrary dimension d 2, n(ρ nd ρ d ) converges weakly, as n, to the centred Gaussian random variable R d = d Ĉ (u 1,...,u d )dπ(u 1,...,u d ). O 5.2. Tests of independence When dealing with contingency tables that are sparse or whose dimension varies with the sample size, Theorem 3.1 can be used to construct consistent and powerful tests of independence. This is because random variables X 1,...,X d are mutually independent if and only if C = Π. To test the null hypothesis H 0 of mutual independence between X 1,...,X d, one could consider, for example, the Cramér von Mises statistic S n =n Ĉ n (u 1,...,u d ) Π(u 1,...,u d )} 2 dπ(u 1,...,u d ). [0,1] d Note that when X 1,...,X d are continuous, S n is equivalent to the statistic suggested by Deheuvels [4] and later studied by Genest and Rémillard [10]. The limiting distribution of S n under H 0 is easily deduced from Corollary3.1when the variablesareinteger-valued or increasing transformations thereof. In fact, a straightforward adaptation of the proof of Proposition 5.2 yields the following result. Proposition 5.3. Under H 0 one has, as n, S n S, where S = Ĉ (u 1,...,u d )} 2 dπ(u 1,...,u d ). [0,1] d If H 0 does not hold, then, as n, S n n p C (u 1,...,u d ) Π(u 1,...,u d )} 2 dπ(u 1,...,u d )>0. [0,1] d In particular, Proposition 5.3 implies that a test based on S n is consistent against any alternative, that is, when H 0 fails then, as n, Pr(S n >ε) 1 for all ε>0. Unfortunately, the limiting null distribution of S n depends on the margins of H which are generally unknown. To carry out the test, one must thus resort to resampling techniques, such as the multiplier bootstrap (van der Vaart and Wellner [19]). An illustration of how this can be done is presented below in the case d=2.

13 Empirical copula process for count data 13 Algorithm 5.1. Given a random sample X =(X 11,X 12 ),...,(X n1,x n2 )} from a bivariate distribution function H, define, for i 1,...,n} and j 1,2}, V nj,i (u)=λ Fnj (u)1x ij A nj (k j )}+1 λ Fnj (u)}1x ij A nj (k j 1)}, whenever F nj A nj (k j 1)}<u F nj A nj (k j )} for some k j 0,...,n j }. The test based on S n can now be carried out as follows. Step 1: For each m 1,...,M},generate an independent random sample ξ (m) 1,...,ξ n (m) of size n from a univariate distribution with mean zero and variance 1, and set ξ (m) =(ξ (m) 1 + +ξ n (m) )/n. Step 2: For each m 1,...,M}, define the process C (m) n at each u,v [0,1] by and compute C (m) n (u,v)= 1 n n (ξ (m) i i=1 S (m) n = 1 1 Step 3: Estimate the p-value for the test by 1 M 0 ξ (m) )V n1,i (u) u}v n2,i (v) v} 0 M m=1 C (m) n (u,v)} 2 dvdu. 1(S (m) n >S n ). An efficient implementation of this procedure is described in a companion paper in preparation, in which the validity of the multiplier bootstrap is established in this specific context. Here, the finite-sample properties of this test are merely illustrated through a small simulation study involving: five copulas: independence, Clayton (Cl) and Gaussian (Ga) with τ 0.1, 0.2}; four margins: Binomial(3, 0.5), Poisson(1), Poisson(20), Geometric(0.5), respectively, denoted by F 1, F 2, F 3 and F 4 ; three statistics: S n, the standard χ 2, and a modified version available in R in which the p-value is computed by a Monte Carlo method; sample size n=100 and nominal level α=5%; M = 1000 multiplier replicates and N = 1000 repetitions of the simulation. TheresultsofthestudyaredisplayedinTable1below.ThetestbasedonS n maintainsits nominal level very well in every scenario. In contrast, the standard χ 2 statistic performs rather poorly except when one of the margins is F 1. Resorting to the Monte Carlo χ 2 statistic improves the level, but the test is still slightly liberal in some cases. The power of the test based on S n is way better than that of its two competitors in columns 1 7 and 9. In columns 8 and 10, χ 2 is slightly better when τ =0.1. Note however that in these cases, the level of the χ 2 statistic is completely off. For a more thorough simulation study, see Murphy [12].

14 14 C. Genest, J.G. Nešlehová and B. Rémillard Table 1. Percentage of rejection of the null hypothesis H 0 of mutual independence for the three tests considered in the simulation study under various conditions τ C Test Distribution of X 1 F 1 F 1 F 2 F 1 F 2 F 3 F 1 F 2 F 3 F 4 Distribution of X 2 F 1 F 2 F 2 F 3 F 3 F 3 F 4 F 4 F 4 F 4 0 Π S n χ χ 2 -MC Cl S n χ χ 2 -MC Ga S n χ χ 2 -MC Cl S n χ χ 2 -MC Ga S n χ χ 2 -MC Conclusion This paper considered the empirical multilinear copula process Ĉ n based on count data. Its convergence was established in C(K) for any compact K O, where O is an open subset of [0,1] d avoiding the points at which the first order partial derivatives of C do not exist. The convergence of Ĉ n in C(K) is sufficient to deduce the asymptotic behavior of simple functionals thereof that are commonly used in statistical inference. This was demonstrated in Section 5 using two standard measures of association based on mid-ranks. While these specific results could have been obtained using the theory of U- statistics, knowledge of the limiting behavior of Ĉ n will be essential in other situations. The new consistent test of independence studied in Section 5 provides an example. It is natural to ask whether the present findings can be extended to the empirical multilinear copula process based on arbitrary discontinuous data. Such an extension may well be possible, given that the estimator Ĉn is defined in general. We are currently investigating this issue. Once this task has been completed, the process Ĉ n will provide a solid foundation for inference in copula models with arbitrary margins.

15 Empirical copula process for count data 15 Appendix A: Proofs from Section 2 Proof of Proposition 2.2. For all x 1,...,x d R, one has H (x 1,...,x d )= H(x 1 +u 1,...,x d +u d )du 1 du d. [0,1] d If (x 1,...,x d ) [k 1 1,k 1 ) [k d 1,k d ) for some k 1,...,k d N, one can replace each x j +u j by k j 1 or by k j, according as 0<u j <k j x j or k j x j u j <1 because H is supported on N d. After straightforward simplification, it follows that H (x 1,...,x d )= } } H(k S ) (k l x l ) (x l k l +1), S 1,...,d} where k S =(k S1,...,k Sd ) and k Sj =k j if j S and k Sj =k j 1 otherwise. If F is a generic margin of H, then F is a linear interpolation of F, that is, F (x)=0 for x< 1 while l/ S l S F (x)=f(k 1)+ F(k)(x k+1) (A.1) when x [k 1,k) for some k N. Thus when u (F(k 1),F(k)], one has F 1 (u)=k 1+ u F(k 1). (A.2) F(k) If u=0, one can set F 1 (0)= 1 for convenience, because the support of X is bounded belowby 0 byhypothesis. It isthen immediate that H F1 1 (u 1 ),...,F 1 d (u d )} yields the formula for C given in Proposition 2.1. Appendix B: Proofs from Section 4 The following elementary result is used in the sequel. Lemma B.1. If G is a cumulative distribution function, then for all u (0,1) and x R, one has u G(x) G 1 (u) x G G 1 (u) G(x). Proof of Proposition 4.1. First note that for fixed values of k 1,...,k d N, one has D n F 1 (k 1 ),...,F d (k d )}= 1 n = 1 n n i=1 n i=1 d 1F j (X ij ) F j (k j )} d 1F j F 1 j (V ij ) F j (k j )}.

16 16 C. Genest, J.G. Nešlehová and B. Rémillard In view of Lemma B.1, it follows that D n F 1 (k 1 ),...,F d (k d )}= 1 n n i=1 d 1V ij F j (k j )}=B n F 1 (k 1 ),...,F d (k d )}. Fromthe definition of M H, one then has M H (D n )=M H (B n ) and hence C n =M H (B n ). The linearity of M H and the fact that M H (C )=C further imply that C n =M H (B n ) from which it also follows that C n = M H(B n ) B n (B.1) because the operator M H is a contraction. Given that M H is a continuous mapping and that B n B C as n, the Continuous Mapping theorem yields the conclusion. The following auxiliary results are needed for the proof of Proposition 4.2. Lemma B.2. For all u 1,...,u d [0,1], C n (u 1,...,u d )=H n F 1 1 (u 1 ),...,F 1 d (u d )}. Proof. First note that the functions on both sides of the above identity are continuous on [0,1] d. This is the case for C n, as explained in Section 4. To see why this is true for the other one, fix arbitrary u 1,...,u d [0,1) and observe that H n F 1 1 (u 1 +),...,F 1 d (u d +)} Hn F 1 1 (u 1 ),...,F 1 d (u d )} Fnj F 1 j (u j +) Fnj F 1 j (u j ). Now each of the summands on the right-hand side must vanish. For, even if u j is a point of discontinuity of F 1 j for some j 1,...,d}, the fact that Fj is continuous implies that Fj F 1 j (u j )=Fj F 1 j (u j +)=u j. Now for arbitrary x,y R, one has F j (x)=f j (y) F nj (x)=f nj (y), (B.2) because F nj can only jump where F j does. Hence Fnj F 1 j (u j )=Fnj F 1 j (u j +). Therefore, it suffices to look at the case where u 1,...,u d (0,1). Suppose that for each j 1,...,d}, u j (F j (k j 1),F j (k j )] for some k j N. It then follows from (A.2) that, for all j 1,...,d}, F 1 j (u j )=k j 1+ u j F j (k j 1), F j (k j )

17 Empirical copula process for count data 17 and hence Consequently, F 1 k j F 1 j (u j ) = F j(k j ) u j, F j (k j ) j (u j ) k j +1= u j F j (k j 1). F j (k j ) Hn F1 1 (u 1 ),...,F 1 (u d )}= Now in view of Lemma B.1, one has Therefore, H n F 1 d H n (k 1,...,k d ) = 1 n = 1 n n i=1 n i=1 1 (u 1 ),...,F 1 (u d )}= d S 1,...,d} d 1(X ij k j )= 1 n λ H,S (u 1,...,u d )H n (k S1,...,k Sd ). n d i=1 1F 1 j (V ij ) k j } d 1V ij F j (k j )}=B n F 1 (k 1 ),...,F d (k d )}. S 1,...,d} λ H,S (u 1,...,u d )B n F 1 (k S1 ),...,F d (k Sd )}, which is M H (B n )(u 1,...,u d ). From the proof of Proposition 4.1, M H (B n )=C n. LemmaB.3. For arbitrary n N, G n =Fn F 1 is a continuous distribution function on [0,1] and Gn 1 =F F 1 n. Proof. As G n is the convolution of two non-decreasing functions, it is non-decreasing. Furthermore, G n (0)=0 and G n (1)=1 by construction. Proceeding as in the proof of Lemma B.2, one can show that G n is indeed continuous. Turning to G 1 n, fix u [0,1] and observe that for any x R such that Fn (x) u, one has F Fn 1 (u) F (x) becausef isnon-decreasing.nowsupposethat y Rissuchthatforallx R,Fn (x) u y F (x). By virtue of Lemma B.1, this is equivalent to saying that for all x R, Fn (x) u F 1 (y) x. This implies that F 1 (y) Fn 1 (u). Applying Lemma B.1 once again, one can see that y F Fn 1 (u). Consequently, F Fn 1 (u)=inff (x): Fn (x) u}. Next, F F 1 (u) = u by continuity of F. Hence, for all x R, F F 1 F (x) = F (x). Invoking implication (B.2), one deduces that F n F 1 F (x) =

18 18 C. Genest, J.G. Nešlehová and B. Rémillard F n (x), which implies inff (x): Fn (x) u}=inff (x): Fn F 1 F (x) u} =infv: Fn F 1 (v) u}=infv: G n (v) u}. In other words, F F 1 n =G 1 n. Proof of Proposition 4.2. First note that in view of Lemma B.2 and Proposition 2.2, one has, for all u 1,...,u d [0,1], C n(u 1,...,u d )= n[hn F1 1 (u 1 ),...,F 1 d (u d )} H F1 1 (u 1 ),...,F 1 d (u d )}]. Next observe that, for all u 1,...,u d [0,1], C n (u 1,...,u d )=C n F 1 F 1 n1 (u 1 ),...,F d F 1 nd (u d )}. (B.3) Indeed, one can write Furthermore, C nf 1 F 1 n1 (u 1 ),...,F d F 1 nd (u d )} = n[hn F1 1 F1 Fn1 1 (u 1 ),...,F 1 d Fd F 1 nd (u d )} H F1 1 F1 F 1 n1 (u 1 ),...,F 1 d Fd F 1 nd (u d )}]. H n F 1 1 F 1 F 1 n1 (u 1 ),...,F 1 d F d F 1 nd (u d )} Hn F 1 n1 (u 1 ),...,F 1 nd (u d )} Fnj F 1 j Fj F 1 nj (u j ) Fnj F 1 nj (u j ). Now the right-hand side is zero by Lemma B.3 and the fact that for all j 1,...,d} and u j [0,1], Fnj F 1 nj (u j )=u j because Fnj is a continuous distribution function. As F1,...,F d are also continuous distribution functions, one has H F 1 1 F 1 F 1 n1 (u 1 ),...,F 1 d F d F 1 nd (u d )} H Fn1 1 (u 1 ),...,F 1 nd (u d )} Fj F 1 j Fj F 1 nj (u j ) Fj F 1 nj (u j ) =0. Therefore, identity (B.3) holds and one can write C n C n = C n C n F 1 F 1 n1,...,f d F 1 nd }.

19 Empirical copula process for count data 19 Next, using (A.1) and (A.2) applied to F and F n, respectively, a direct calculation yields nuj Fj F 1 nj (u j )} Fnj (k j ) u j =B nj F j (k j 1)} F nj (k j ) } +B nj F j (k j )} whenever u j (F nj (k j 1),F nj (k j )] for some k j N. It follows that uj F nj (k j 1) F nj (k j ) }, sup Fj F 1 nj (u j ) u j 1 B nj 1 B n. u j [0,1] n n (B.4) As n, B n B C and hence B n / n p 0. Now for arbitrary ε>0, one has P ( C n C n >ε)=p C n C n (F 1 F 1 n1,...,f d F 1 nd ) >ε}, where P denotes outer probability. Given δ >0, the right-hand side is the same as P C n C n(f1 Fn1 1,...,Fd F 1 nd ) >ε, B } n <δ n +P C n C n (F 1 F 1 n1,...,fd F 1 nd ) >ε, B } n δ, n and in view of (B.4), the latter is bounded above by ( ) P ω n (C Bn n,δ)>ε}+p δ, n where Therefore, ω n (C n,δ)= lim sup n sup u j,v j [0,1]: u j v j <δ, j 1,...,d} C n (u 1,...,u d ) C n (v 1,...,v d ). P ( C n C n >ε) limsupp ω n (C n,δ)>ε}. n Finally, recall that C n converges weakly in C([0,1]d ) to a measurable random element M H (B C ). Because C([0,1] d ) is complete and separable, Theorem in Dudley [5] implies that M H (B C ) is tight. It then follows from Lemma and Theorem in van der Vaart and Wellner [19] that the sequence C n is asymptotically tight and hence asymptotically uniformly equicontinuous in probability, viz. limlimsupp ω n (C δ 0 n,δ)>ε}=0. n This means that as n, P ( C n C n >ε) 0 for all ǫ>0.

20 20 C. Genest, J.G. Nešlehová and B. Rémillard Proof of Proposition 4.3. For fixed j 1,...,d} and u j [0,1], first write Fj (u j ) in the form u j nu j Fj F 1 (u j )}/ n. Then F 1 nj D n D n = n sup u 1,...,u d [0,1] nj C u 1 C n1(u 1 ),...,u d C nd (u } d) n n C [u nu1 F1 1 F 1 n1 (u 1 )},...,u d n nud F d F 1 nd (u d )} n ]. The Lipschitz property of copulas further implies that D n D n sup u j [0,1] C nj (u j) nu j Fj Fnj 1 (u j )}. For any j 1,...,d}, one can now call upon Proposition 4.2 with d=1 and H =F j to conclude that, as n, sup uj [0,1] C nj (u j) nu j F j F 1 nj (u j )} p 0. The proof of Proposition 4.4 relies on the following lemma. Lemma B.4. Let u 1,...,u d [0,1] and v 1,...,v d [0,1] be such that for each j 1,...,d}, u j,v j (F j (k j 1),F j (k j )) for some k j N. Then C (v 1,...,v d ) C (u 1,...,u d )= (v m u m )Ċ m(w m1,...,w md ), m=1 where w mj equals u j or v j according as j <m or j m, respectively. Proof. First, write C (v 1,...,v d ) C (u 1,...,u d ) in the alternative form C (w m1,...,w md ) C (w (m+1)1,...,w (m+1)d )}. m=1 It must then be shown that for all m 1,...,d}, one has C (w m1,...,w md ) C (w (m+1)1,...,w (m+1)d ) =(v m u m )Ċ m (w m1,...,w md ). (B.5) To this end, observe that on the left-hand side of (B.5), C is evaluated at two vectors whose components are identical, except in position m. Let w 1,...,w m 1,w m+1,...,w d

21 Empirical copula process for count data 21 be the matching components and note that w mm = v m while w (m+1)m = u m. Given S 1,...,d}, let s m be the size of S m}. From the definition of λ H,S, one has and λ H,S (w m1,...,w md ) =λ H,S (w 1,...,w m 1,v m,w m+1,...,w d ) } sm vm F m (k m 1) Fm (k m ) v m = F m (k m ) F m (k m ) } F l (k l ) w l F l (k l ) l/ S l m l/ S l m l S l m } 1 sm w l F l (k l 1) F l (k l ) λ H,S (w (m+1)1,...,w (m+1)d )=λ H,S (w 1,...,w m 1,u m,w m+1,...,w d ) } sm } 1 sm um F m (k m 1) Fm (k m ) u m = F m (k m ) F m (k m ) } F l (k l ) w l } w l F l (k l 1). F l (k l ) F l (k l ) Consequently, their difference is equal to (v m u m ) ( 1)1 sm F m (k m ) l/ S l m l S l m F l (k l ) w l F l (k l ) It then follows from the definition of C that } l S l m } } w l F l (k l 1). F l (k l ) C (w m1,...,w md ) C (w (m+1)1,...,w (m+1)d ) = H(k S )λ H,S (w m1,...,w md ) λ H,S (w (m+1)1,...,w (m+1)d )} S 1,...,d} =(v m u m ) S 1,...,d} =(v m u m )Ċ m (w m1,...,w md ). H(k S ) ( 1)1 sm F m (k m ) l/ S l m } F l (k l ) w l F l (k l ) l S l m } w l F l (k l 1) F l (k l ) This completes the argument. Proof of Proposition 4.4. Recall from the definition of O that because K is compact, it can be covered by finitely many open cubes of the form O l =(F 1 (k 1l 1),F 1 (k l1 )) (F d (k dl 1),F d (k dl )),

22 22 C. Genest, J.G. Nešlehová and B. Rémillard where k 1l,...,k dl N for l 1,...,L}. Given that the sets O 1,...,O L are mutually disjoint, K l =K O l is compact for each l 1,...,L}. Therefore, K =K 1 K L is a union of finitely many disjoint compact sets. For arbitrary δ>0, let K l,δ = (u 1,...,u d ) R d : u 1 x u d x d <δ}. (x 1,...,x d ) K l Because K 1,...,K L are compact, there exists δ 0 > 0 such that K l,δ0 O l for all l 1,...,L}. Now fix δ <δ 0 and let K denote the closure of K δ =K 1,δ K d,δ, which is compact. Then for all δ (0,δ ), one has K K δ K O. For fixed δ (0,δ ) and ε>0, write P D n D K (C n ) K >ε} =P D n D K (C n) K >ε, B n < δ } n d +P D n D K (C n ) K >ε, B n δ }. n d When the event B n / n<δ/d} holds and (u 1,...,u d ) K l for some l 1,...,L}, ( (v 1,...,v d )= u 1 C n1 (u 1),...,u d C nd (u d) ) K l,δ n n because, for all j 1,...,d}, C nj C n B n by (B.1). From Lemma B.4, D n (u 1,...,u d ) D K (C n )(u 1,...,u d ) = C nj (uj)ċ j (u 1,...,u d ) Ċ j (u 1,...,u j 1,v j,...,v d )} B n Ċ j (u 1,...,u d ) Ċ j (u 1,...,u j 1,v j,...,v d ). Consequently, D n D K (C n ) K is bounded above by B n sup (u 1,...,u d ) K B n B n ω(δ), sup (u 1,...,u d ) K, (w 1,...,w d ) K δ, d m=1 um wm <δ Ċ j (u 1,...,u d ) Ċ j (u 1,...,u j 1,v j,...,v d ) Ċ j (u 1,...,u d ) Ċ j (w 1,...,w d )

23 Empirical copula process for count data 23 where ω(δ)= sup (u 1,...,u d ),(w 1,...,w d ) K, dm=1 u m w m <δ Ċ j (u 1,...,u d ) Ċ j (w 1,...,w d ). This observation implies that lim supp D n D K (C n ) K >ε, B n < δ } n n d limsupp B n ω(δ)>ε, B n < δ } n n d limsupp B n ω(δ)>ε}=p B C ω(δ)>ε}, n where the equality is justified by the fact that B n B C as n. Now ω(δ) 0 as δ 0 because Ċ j is absolutely continuous on K for all j 1,...,d}. Therefore, P B C ω(δ)>ε} 0 as δ 0. Finally, observe that ( lim supp D n D K (C n) K >ε, B n δ ) n n d ( limsupp Bn δ ) =0 n n d because B n / n p 0 as n. As ε>0 is arbitrary, one can conclude. The proof of Proposition 4.5 relies on the following lemma. Lemma B.5. Let G be the distribution function of a uniform random variable on (0,1). Then for every j 1,...,d} and as n, Y nj = n sup 0 u 1 Proof. Fix j 1,...,d} and write G u C nj (u) } n u+ C nj (u) n p 0. Y nj = [ C } nj (u) n sup u 1 u< C nj (u) } 0 u 1 n n + C nj (u) } (1 u) 1 1 u< C nj (u) }]. n n Observe that if C nj M for some constant M >0, then as n, Y nj sup C 0 u M/ nj n (u) + sup C 1 M/ nj (u) 0 p n u 1

24 24 C. Genest, J.G. Nešlehová and B. Rémillard because C nj C j in C([0,1]) and C j (0) = C j (1) =0. Now fix ε > 0 and invoke the tightness of C nj to find M >0 such that Pr( C nj >M)<ε/2 for all n N. Thus, Pr(Y nj >ε) Pr( C nj >M)+Pr ( sup C 0 u M/ nj n (u) + sup C 1 M/ nj ). (u) >ε n u 1 If n is large enough, the right-hand side of the above inequality is at most ε. Proof of Proposition 4.5. For all u 1,...,u d [0,1], let D n(u 1,...,u d )= n [ d u j C nj (u) n } d u j ]. Then D n D n n sup G u C nj (u) } n 0 u 1 u+ C nj (u) n because d a j d b j d a j b j for all a 1,...,a d, b 1,...,b d (0,1). Lemma B.5 thus implies that D n D n p 0 as n. Now by the multinomial formula, D n D(C n ) n S 1,...,d}, S 2j S C nj n p 0. Appendix C: Proofs from Section 5 The proofs of Propositions 5.1 and 5.2 have much in common. They both rely on the following straightforward consequence of Proposition in Brockwell and Davis [2]. Lemma C.1. Let Z n be a sequence of random variables. Suppose that for all δ, ǫ>0, there exists a sequence Y n,δ,ǫ of random variables such that for all n N, Pr( Z n Y n,δ,ǫ > δ)<ǫ and Y n,δ,ǫ Y δ,ǫ as n. Further assume that there exists a random variable Z such that for all δ,ǫ>0, Pr( Z Y δ,ǫ >δ)<ǫ. Then Z n Z as n. The convergence of Spearman s rho is presented first. Proof of Proposition 5.2. Because the complement of O in [0,1] d has Lebesgue measure zero, it suffices to show that Z n = Ĉ n dπ Z = Ĉ dπ. O O

25 Empirical copula process for count data 25 Given δ, ǫ >0, call on Remark 4.1 to pick M >0 such that Pr( Ĉ O > M)<ǫ and Pr( Ĉ n >M) <ǫ for all n N. Then choose a compact set K =K δ,ǫ O such that Π(O\K)<δ/M. Now define Y n,δ,ǫ = Ĉ n dπ, Y δ,ǫ = Ĉ dπ. K K Theorem 3.1 implies that Y n,δ,ǫ Y δ,ǫ as n. Furthermore, Z n Y n,δ,ǫ = Ĉ n dπ Ĉ n Π(O\K)< δ M Ĉ n, O\K while Z Y δ,ǫ = O\K Ĉ dπ Ĉ O Π(O\K)< δ M Ĉ O. For all n N, one then has Pr( Z n Y n,δ,ǫ >δ) Pr( Ĉ n δ/m >δ)<ǫ and similarly Pr( Z Y δ,ǫ >δ)<ǫ. The conclusion is then a consequence of Lemma C.1. The following lemma, needed for the proof of Proposition 5.1, is excerpted from Genest, Nešlehová and Rémillard [9]. Lemma C.2. Let H be a distribution function on R d and denote by H n its empirical counterpart corresponding to a random sample of size n. If the sequence of processes G n is tight with respect to the uniform norm on the space C b (R d ) of bounded and continuous functions on R d, then, as n, R n = G n dh n G n dh p 0. Proof of Proposition 5.1. Observe that n(τn τ)=4 Ĉ n (u,v)dĉ n (u,v)+4 First, it will be shown that, as n, Z n = Ĉ n(u,v)dc (u,v) Z = O O O O Ĉ n (u,v)dc (u,v). Ĉ (u,v)dc (u,v). Toseethis,fixarbitraryδ,ǫ>0anduseRemark4.1topickM >0suchthatPr( Ĉ O > M)<ǫ and Pr( Ĉ n >M)<ǫ for all n N. Then choose a compact set K =K δ,ǫ O such that C (O\K)<δ/M. Setting Y n,δ,ǫ = Ĉ n (u,v)dc (u,v), Y δ,ǫ = K K Ĉ (u,v)dc (u,v),

26 26 C. Genest, J.G. Nešlehová and B. Rémillard one can invoke Theorem 3.1 to deduce that Y n,δ,ǫ Y δ,ǫ as n. The rest of the argument rests on Lemma C.1, in analogy to the proof of Proposition 5.2. Secondly, to establish that, as n, Ĉ n (u,v)dĉ n (u,v) Ĉ (u,v)dc (u,v), (C.1) O use a change of variables and the definition of Hn to write Ĉ n (u,v)dĉ n (u,v)= Ĉ n F n1 (x 1),Fn2 (x 2)}dHn (x 1,x 2 ) [0,1] 2 R 2 = G n (x 1,x 2 )dh n (x 1,x 2 ), R 2 where, for all x 1,x 2 R, G n (x 1,x 2 )= Ĉ n F n1 (x 1 +u 1),Fn2 (x 2 +v 1)}dvdu. [0,1] 2 Itisclearthat G n Ĉ n andhence,byvirtueofremark4.1,thesequenceofprocesses G n is tight on C b (R 2 ). Lemma C.2 thus implies that, as n, G n (x 1,x 2 )dh n (x 1,x 2 ) G n (x 1,x 2 )dh(x 1,x 2 ) 0. p R 2 R 2 Undoing the change of variables and using the definitions of H and C, one finds G n (x 1,x 2 )dh(x 1,x 2 )= Ĉ n F n1 (x 1),Fn2 (x 2)}dH (x 1,x 2 ) R 2 R 2 = Ĉ nfn1 F1 1 (u),fn2 F2 1 (v)}dc (u,v) [0,1] 2 = Ĉ n F n1 F 1 1 (u),fn2 F 1 2 (v)}dc (u,v). Claim (C.1) is established if one can show that, as n, O O Ĉ n(f n1 F 1 1,F n2 F 1 2 ) Ĉ n K p 0 (C.2) for any fixed compact set K O. Given such a set, one can proceed exactly as in the proof of Proposition 4.4 to find δ > 0 and a compact set K O such that for all δ (0,δ ), K K δ K. Next, fix ǫ>0 and δ (0,δ ) and recall that C nj B n for j =1,2. As in the proof of Proposition 4.4, one has that when B n / n<δ/2} holds, (F n1 F 1 1 (u),f n2 F 1 2 (v))= ( u+ C n1(u) n,v+ C n2(v) n ) K δ

27 Empirical copula process for count data 27 whenever (u, v) K. Therefore, is bounded above by P P Ĉ n (F n1 F 1 1,F n2 F 1 2 ) Ĉ n K >ǫ} sup (u,v),(u,v ) K u u + v v <δ n Ĉ n (u,v ) Ĉ n (u,v) >ǫ }+P ( B n / n δ/2). Given that Ĉ n converges to Ĉ on C(K ), limlimsupp } sup Ĉ δ 0 n(u,v ) Ĉ n(u,v) >ǫ =0. (u,v),(u,v ) K u u + v v <δ Claim (C.2) now readily follows from the fact that B n / n 0, p as n. In conclusion, n(τ n τ) T 2 =8 OĈ (u,v)dc (u,v) as n, as claimed. Acknowledgments The authors thank the Editors and referees for many helpful comments. They acknowledge grants from the Canada Research Chairs Program, the Natural Sciences and Engineering Research Council and the Fonds de recherche du Québec Nature et technologies. References [1] Agresti, A. (2007). An Introduction to Categorical Data Analysis, 2nd ed. Wiley Series in Probability and Statistics. Hoboken, NJ: Wiley-Interscience. MR [2] Brockwell, P.J. and Davis, R.A. (1991). Time Series: Theory and Methods, 2nd ed. Springer Series in Statistics. New York: Springer. MR [3] Deheuvels, P. (1979). La fonction de dépendance empirique et ses propriétés. Un test non paramétrique d indépendance. Acad. Roy. Belg. Bull. Cl. Sci. (5) MR [4] Deheuvels, P. (1979). Non parametric tests of independence. In Statistique non paramétrique asymptotique (J.-P. Raoult, ed.) Berlin: Springer. [5] Dudley, R.M. (2002). Real Analysis and Probability, 2nd ed. Cambridge Studies in Advanced Mathematics 74. Cambridge: Cambridge Univ. Press. MR [6] Fermanian, J.-D., Radulović, D. and Wegkamp, M. (2004). Weak convergence of empirical copula processes. Bernoulli MR [7] Genest, C. and Nešlehová, J. (2007). A primer on copulas for count data. Astin Bull MR [8] Genest, C., Nešlehová, J. and Ruppert, M. (2011). Discussion: Statistical models and methods for dependence in insurance data. J. Korean Statist. Soc MR

Copula modeling for discrete data

Copula modeling for discrete data Copula modeling for discrete data Christian Genest & Johanna G. Nešlehová in collaboration with Bruno Rémillard McGill University and HEC Montréal ROBUST, September 11, 2016 Main question Suppose (X 1,

More information

A measure of radial asymmetry for bivariate copulas based on Sobolev norm

A measure of radial asymmetry for bivariate copulas based on Sobolev norm A measure of radial asymmetry for bivariate copulas based on Sobolev norm Ahmad Alikhani-Vafa Ali Dolati Abstract The modified Sobolev norm is used to construct an index for measuring the degree of radial

More information

Accounting for extreme-value dependence in multivariate data

Accounting for extreme-value dependence in multivariate data Accounting for extreme-value dependence in multivariate data 38th ASTIN Colloquium Manchester, July 15, 2008 Outline 1. Dependence modeling through copulas 2. Rank-based inference 3. Extreme-value dependence

More information

Copula-based tests of independence for bivariate discrete data. Orla Aislinn Murphy

Copula-based tests of independence for bivariate discrete data. Orla Aislinn Murphy Copula-based tests of independence for bivariate discrete data By Orla Aislinn Murphy A thesis submitted to McGill University in partial fulfillment of the requirements for the degree of Masters of Science

More information

Lecture Quantitative Finance Spring Term 2015

Lecture Quantitative Finance Spring Term 2015 on bivariate Lecture Quantitative Finance Spring Term 2015 Prof. Dr. Erich Walter Farkas Lecture 07: April 2, 2015 1 / 54 Outline on bivariate 1 2 bivariate 3 Distribution 4 5 6 7 8 Comments and conclusions

More information

Multivariate Extensions of Spearman s Rho and Related Statistics

Multivariate Extensions of Spearman s Rho and Related Statistics F. Schmid and R. Schmidt, On the Asymptotic Behavior of Spearman s Rho and Related Multivariate xtensions, Statistics and Probability Letters (to appear), 6. Multivariate xtensions of Spearman s Rho and

More information

Empirical Processes: General Weak Convergence Theory

Empirical Processes: General Weak Convergence Theory Empirical Processes: General Weak Convergence Theory Moulinath Banerjee May 18, 2010 1 Extended Weak Convergence The lack of measurability of the empirical process with respect to the sigma-field generated

More information

Séminaire de statistique IRMA Strasbourg

Séminaire de statistique IRMA Strasbourg Test non paramétrique de détection de rupture dans la copule d observations multivariées avec changement dans les lois marginales -Illustration des méthodes et simulations de Monte Carlo- Séminaire de

More information

Introduction to Empirical Processes and Semiparametric Inference Lecture 09: Stochastic Convergence, Continued

Introduction to Empirical Processes and Semiparametric Inference Lecture 09: Stochastic Convergence, Continued Introduction to Empirical Processes and Semiparametric Inference Lecture 09: Stochastic Convergence, Continued Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and

More information

Stanford Mathematics Department Math 205A Lecture Supplement #4 Borel Regular & Radon Measures

Stanford Mathematics Department Math 205A Lecture Supplement #4 Borel Regular & Radon Measures 2 1 Borel Regular Measures We now state and prove an important regularity property of Borel regular outer measures: Stanford Mathematics Department Math 205A Lecture Supplement #4 Borel Regular & Radon

More information

On a simple construction of bivariate probability functions with fixed marginals 1

On a simple construction of bivariate probability functions with fixed marginals 1 On a simple construction of bivariate probability functions with fixed marginals 1 Djilali AIT AOUDIA a, Éric MARCHANDb,2 a Université du Québec à Montréal, Département de mathématiques, 201, Ave Président-Kennedy

More information

Dependence. MFM Practitioner Module: Risk & Asset Allocation. John Dodson. September 11, Dependence. John Dodson. Outline.

Dependence. MFM Practitioner Module: Risk & Asset Allocation. John Dodson. September 11, Dependence. John Dodson. Outline. MFM Practitioner Module: Risk & Asset Allocation September 11, 2013 Before we define dependence, it is useful to define Random variables X and Y are independent iff For all x, y. In particular, F (X,Y

More information

Concentration behavior of the penalized least squares estimator

Concentration behavior of the penalized least squares estimator Concentration behavior of the penalized least squares estimator Penalized least squares behavior arxiv:1511.08698v2 [math.st] 19 Oct 2016 Alan Muro and Sara van de Geer {muro,geer}@stat.math.ethz.ch Seminar

More information

ON THE ERDOS-STONE THEOREM

ON THE ERDOS-STONE THEOREM ON THE ERDOS-STONE THEOREM V. CHVATAL AND E. SZEMEREDI In 1946, Erdos and Stone [3] proved that every graph with n vertices and at least edges contains a large K d+l (t), a complete (d + l)-partite graph

More information

Asymptotic Statistics-VI. Changliang Zou

Asymptotic Statistics-VI. Changliang Zou Asymptotic Statistics-VI Changliang Zou Kolmogorov-Smirnov distance Example (Kolmogorov-Smirnov confidence intervals) We know given α (0, 1), there is a well-defined d = d α,n such that, for any continuous

More information

An elementary proof of the weak convergence of empirical processes

An elementary proof of the weak convergence of empirical processes An elementary proof of the weak convergence of empirical processes Dragan Radulović Department of Mathematics, Florida Atlantic University Marten Wegkamp Department of Mathematics & Department of Statistical

More information

n [ F (b j ) F (a j ) ], n j=1(a j, b j ] E (4.1)

n [ F (b j ) F (a j ) ], n j=1(a j, b j ] E (4.1) 1.4. CONSTRUCTION OF LEBESGUE-STIELTJES MEASURES In this section we shall put to use the Carathéodory-Hahn theory, in order to construct measures with certain desirable properties first on the real line

More information

Modelling Dependence with Copulas and Applications to Risk Management. Filip Lindskog, RiskLab, ETH Zürich

Modelling Dependence with Copulas and Applications to Risk Management. Filip Lindskog, RiskLab, ETH Zürich Modelling Dependence with Copulas and Applications to Risk Management Filip Lindskog, RiskLab, ETH Zürich 02-07-2000 Home page: http://www.math.ethz.ch/ lindskog E-mail: lindskog@math.ethz.ch RiskLab:

More information

University of California San Diego and Stanford University and

University of California San Diego and Stanford University and First International Workshop on Functional and Operatorial Statistics. Toulouse, June 19-21, 2008 K-sample Subsampling Dimitris N. olitis andjoseph.romano University of California San Diego and Stanford

More information

Copulas and dependence measurement

Copulas and dependence measurement Copulas and dependence measurement Thorsten Schmidt. Chemnitz University of Technology, Mathematical Institute, Reichenhainer Str. 41, Chemnitz. thorsten.schmidt@mathematik.tu-chemnitz.de Keywords: copulas,

More information

Recall that if X is a compact metric space, C(X), the space of continuous (real-valued) functions on X, is a Banach space with the norm

Recall that if X is a compact metric space, C(X), the space of continuous (real-valued) functions on X, is a Banach space with the norm Chapter 13 Radon Measures Recall that if X is a compact metric space, C(X), the space of continuous (real-valued) functions on X, is a Banach space with the norm (13.1) f = sup x X f(x). We want to identify

More information

Maths 212: Homework Solutions

Maths 212: Homework Solutions Maths 212: Homework Solutions 1. The definition of A ensures that x π for all x A, so π is an upper bound of A. To show it is the least upper bound, suppose x < π and consider two cases. If x < 1, then

More information

Dependence. Practitioner Course: Portfolio Optimization. John Dodson. September 10, Dependence. John Dodson. Outline.

Dependence. Practitioner Course: Portfolio Optimization. John Dodson. September 10, Dependence. John Dodson. Outline. Practitioner Course: Portfolio Optimization September 10, 2008 Before we define dependence, it is useful to define Random variables X and Y are independent iff For all x, y. In particular, F (X,Y ) (x,

More information

Optimal exact tests for complex alternative hypotheses on cross tabulated data

Optimal exact tests for complex alternative hypotheses on cross tabulated data Optimal exact tests for complex alternative hypotheses on cross tabulated data Daniel Yekutieli Statistics and OR Tel Aviv University CDA course 29 July 2017 Yekutieli (TAU) Optimal exact tests for complex

More information

MAT 570 REAL ANALYSIS LECTURE NOTES. Contents. 1. Sets Functions Countability Axiom of choice Equivalence relations 9

MAT 570 REAL ANALYSIS LECTURE NOTES. Contents. 1. Sets Functions Countability Axiom of choice Equivalence relations 9 MAT 570 REAL ANALYSIS LECTURE NOTES PROFESSOR: JOHN QUIGG SEMESTER: FALL 204 Contents. Sets 2 2. Functions 5 3. Countability 7 4. Axiom of choice 8 5. Equivalence relations 9 6. Real numbers 9 7. Extended

More information

Financial Econometrics and Volatility Models Copulas

Financial Econometrics and Volatility Models Copulas Financial Econometrics and Volatility Models Copulas Eric Zivot Updated: May 10, 2010 Reading MFTS, chapter 19 FMUND, chapters 6 and 7 Introduction Capturing co-movement between financial asset returns

More information

Hybrid Copula Bayesian Networks

Hybrid Copula Bayesian Networks Kiran Karra kiran.karra@vt.edu Hume Center Electrical and Computer Engineering Virginia Polytechnic Institute and State University September 7, 2016 Outline Introduction Prior Work Introduction to Copulas

More information

A Goodness-of-fit Test for Copulas

A Goodness-of-fit Test for Copulas A Goodness-of-fit Test for Copulas Artem Prokhorov August 2008 Abstract A new goodness-of-fit test for copulas is proposed. It is based on restrictions on certain elements of the information matrix and

More information

CHAPTER 17 CHI-SQUARE AND OTHER NONPARAMETRIC TESTS FROM: PAGANO, R. R. (2007)

CHAPTER 17 CHI-SQUARE AND OTHER NONPARAMETRIC TESTS FROM: PAGANO, R. R. (2007) FROM: PAGANO, R. R. (007) I. INTRODUCTION: DISTINCTION BETWEEN PARAMETRIC AND NON-PARAMETRIC TESTS Statistical inference tests are often classified as to whether they are parametric or nonparametric Parameter

More information

A consistent test of independence based on a sign covariance related to Kendall s tau

A consistent test of independence based on a sign covariance related to Kendall s tau Contents 1 Introduction.......................................... 1 2 Definition of τ and statement of its properties...................... 3 3 Comparison to other tests..................................

More information

1 Introduction. On grade transformation and its implications for copulas

1 Introduction. On grade transformation and its implications for copulas Brazilian Journal of Probability and Statistics (2005), 19, pp. 125 137. c Associação Brasileira de Estatística On grade transformation and its implications for copulas Magdalena Niewiadomska-Bugaj 1 and

More information

Introduction to Empirical Processes and Semiparametric Inference Lecture 02: Overview Continued

Introduction to Empirical Processes and Semiparametric Inference Lecture 02: Overview Continued Introduction to Empirical Processes and Semiparametric Inference Lecture 02: Overview Continued Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations Research

More information

Estimation of the Bivariate and Marginal Distributions with Censored Data

Estimation of the Bivariate and Marginal Distributions with Censored Data Estimation of the Bivariate and Marginal Distributions with Censored Data Michael Akritas and Ingrid Van Keilegom Penn State University and Eindhoven University of Technology May 22, 2 Abstract Two new

More information

Clearly, if F is strictly increasing it has a single quasi-inverse, which equals the (ordinary) inverse function F 1 (or, sometimes, F 1 ).

Clearly, if F is strictly increasing it has a single quasi-inverse, which equals the (ordinary) inverse function F 1 (or, sometimes, F 1 ). APPENDIX A SIMLATION OF COPLAS Copulas have primary and direct applications in the simulation of dependent variables. We now present general procedures to simulate bivariate, as well as multivariate, dependent

More information

Goodness-of-Fit Testing with Empirical Copulas

Goodness-of-Fit Testing with Empirical Copulas with Empirical Copulas Sami Umut Can John Einmahl Roger Laeven Department of Econometrics and Operations Research Tilburg University January 19, 2011 with Empirical Copulas Overview of Copulas with Empirical

More information

P-adic Functions - Part 1

P-adic Functions - Part 1 P-adic Functions - Part 1 Nicolae Ciocan 22.11.2011 1 Locally constant functions Motivation: Another big difference between p-adic analysis and real analysis is the existence of nontrivial locally constant

More information

Modelling Dependent Credit Risks

Modelling Dependent Credit Risks Modelling Dependent Credit Risks Filip Lindskog, RiskLab, ETH Zürich 30 November 2000 Home page:http://www.math.ethz.ch/ lindskog E-mail:lindskog@math.ethz.ch RiskLab:http://www.risklab.ch Modelling Dependent

More information

MATHS 730 FC Lecture Notes March 5, Introduction

MATHS 730 FC Lecture Notes March 5, Introduction 1 INTRODUCTION MATHS 730 FC Lecture Notes March 5, 2014 1 Introduction Definition. If A, B are sets and there exists a bijection A B, they have the same cardinality, which we write as A, #A. If there exists

More information

Analysis Qualifying Exam

Analysis Qualifying Exam Analysis Qualifying Exam Spring 2017 Problem 1: Let f be differentiable on R. Suppose that there exists M > 0 such that f(k) M for each integer k, and f (x) M for all x R. Show that f is bounded, i.e.,

More information

Decomposition of Riesz frames and wavelets into a finite union of linearly independent sets

Decomposition of Riesz frames and wavelets into a finite union of linearly independent sets Decomposition of Riesz frames and wavelets into a finite union of linearly independent sets Ole Christensen, Alexander M. Lindner Abstract We characterize Riesz frames and prove that every Riesz frame

More information

Defining the Integral

Defining the Integral Defining the Integral In these notes we provide a careful definition of the Lebesgue integral and we prove each of the three main convergence theorems. For the duration of these notes, let (, M, µ) be

More information

3 Integration and Expectation

3 Integration and Expectation 3 Integration and Expectation 3.1 Construction of the Lebesgue Integral Let (, F, µ) be a measure space (not necessarily a probability space). Our objective will be to define the Lebesgue integral R fdµ

More information

A note on some approximation theorems in measure theory

A note on some approximation theorems in measure theory A note on some approximation theorems in measure theory S. Kesavan and M. T. Nair Department of Mathematics, Indian Institute of Technology, Madras, Chennai - 600 06 email: kesh@iitm.ac.in and mtnair@iitm.ac.in

More information

SOME MEASURABILITY AND CONTINUITY PROPERTIES OF ARBITRARY REAL FUNCTIONS

SOME MEASURABILITY AND CONTINUITY PROPERTIES OF ARBITRARY REAL FUNCTIONS LE MATEMATICHE Vol. LVII (2002) Fasc. I, pp. 6382 SOME MEASURABILITY AND CONTINUITY PROPERTIES OF ARBITRARY REAL FUNCTIONS VITTORINO PATA - ALFONSO VILLANI Given an arbitrary real function f, the set D

More information

Math212a1413 The Lebesgue integral.

Math212a1413 The Lebesgue integral. Math212a1413 The Lebesgue integral. October 28, 2014 Simple functions. In what follows, (X, F, m) is a space with a σ-field of sets, and m a measure on F. The purpose of today s lecture is to develop the

More information

STATISTICS SYLLABUS UNIT I

STATISTICS SYLLABUS UNIT I STATISTICS SYLLABUS UNIT I (Probability Theory) Definition Classical and axiomatic approaches.laws of total and compound probability, conditional probability, Bayes Theorem. Random variable and its distribution

More information

= 176 p-value < Source: Metropolitan Asylums Board: Small-pox epidemic (Pearson, 1900)

= 176 p-value < Source: Metropolitan Asylums Board: Small-pox epidemic (Pearson, 1900) 2 JOAKIM EKSTRÖM 1. Introduction The polychoric correlation coefficient is a measure of association for ordinal variables. It was first proposed by Karl Pearson in year 1900, and throughout his career,

More information

The main results about probability measures are the following two facts:

The main results about probability measures are the following two facts: Chapter 2 Probability measures The main results about probability measures are the following two facts: Theorem 2.1 (extension). If P is a (continuous) probability measure on a field F 0 then it has a

More information

Math 118B Solutions. Charles Martin. March 6, d i (x i, y i ) + d i (y i, z i ) = d(x, y) + d(y, z). i=1

Math 118B Solutions. Charles Martin. March 6, d i (x i, y i ) + d i (y i, z i ) = d(x, y) + d(y, z). i=1 Math 8B Solutions Charles Martin March 6, Homework Problems. Let (X i, d i ), i n, be finitely many metric spaces. Construct a metric on the product space X = X X n. Proof. Denote points in X as x = (x,

More information

Decomposition of Parsimonious Independence Model Using Pearson, Kendall and Spearman s Correlations for Two-Way Contingency Tables

Decomposition of Parsimonious Independence Model Using Pearson, Kendall and Spearman s Correlations for Two-Way Contingency Tables International Journal of Statistics and Probability; Vol. 7 No. 3; May 208 ISSN 927-7032 E-ISSN 927-7040 Published by Canadian Center of Science and Education Decomposition of Parsimonious Independence

More information

arxiv: v1 [math.pr] 24 Aug 2018

arxiv: v1 [math.pr] 24 Aug 2018 On a class of norms generated by nonnegative integrable distributions arxiv:180808200v1 [mathpr] 24 Aug 2018 Michael Falk a and Gilles Stupfler b a Institute of Mathematics, University of Würzburg, Würzburg,

More information

Local sensitivity analyses of goodness-of-fit tests for copulas

Local sensitivity analyses of goodness-of-fit tests for copulas Local sensitivity analyses of goodness-of-fit tests for copulas DANIEL BERG University of Oslo and The Norwegian Computing Center JEAN-FRANÇOIS QUESSY Département de mathématiques et d informatique, Université

More information

A copula goodness-of-t approach. conditional probability integral transform. Daniel Berg 1 Henrik Bakken 2

A copula goodness-of-t approach. conditional probability integral transform. Daniel Berg 1 Henrik Bakken 2 based on the conditional probability integral transform Daniel Berg 1 Henrik Bakken 2 1 Norwegian Computing Center (NR) & University of Oslo (UiO) 2 Norwegian University of Science and Technology (NTNU)

More information

SOME CONVERSE LIMIT THEOREMS FOR EXCHANGEABLE BOOTSTRAPS

SOME CONVERSE LIMIT THEOREMS FOR EXCHANGEABLE BOOTSTRAPS SOME CONVERSE LIMIT THEOREMS OR EXCHANGEABLE BOOTSTRAPS Jon A. Wellner University of Washington The bootstrap Glivenko-Cantelli and bootstrap Donsker theorems of Giné and Zinn (990) contain both necessary

More information

MATH 205C: STATIONARY PHASE LEMMA

MATH 205C: STATIONARY PHASE LEMMA MATH 205C: STATIONARY PHASE LEMMA For ω, consider an integral of the form I(ω) = e iωf(x) u(x) dx, where u Cc (R n ) complex valued, with support in a compact set K, and f C (R n ) real valued. Thus, I(ω)

More information

Measure Theory on Topological Spaces. Course: Prof. Tony Dorlas 2010 Typset: Cathal Ormond

Measure Theory on Topological Spaces. Course: Prof. Tony Dorlas 2010 Typset: Cathal Ormond Measure Theory on Topological Spaces Course: Prof. Tony Dorlas 2010 Typset: Cathal Ormond May 22, 2011 Contents 1 Introduction 2 1.1 The Riemann Integral........................................ 2 1.2 Measurable..............................................

More information

Lectures on. Sobolev Spaces. S. Kesavan The Institute of Mathematical Sciences, Chennai.

Lectures on. Sobolev Spaces. S. Kesavan The Institute of Mathematical Sciences, Chennai. Lectures on Sobolev Spaces S. Kesavan The Institute of Mathematical Sciences, Chennai. e-mail: kesh@imsc.res.in 2 1 Distributions In this section we will, very briefly, recall concepts from the theory

More information

Nonparametric tests for tail monotonicity

Nonparametric tests for tail monotonicity Nonparametric tests for tail monotonicity Betina Berghaus and Axel Bücher Ruhr-Universität Bochum Fakultät für Mathematik 44780 Bochum, Germany e-mail: betina.berghaus@ruhr-uni-bochum.de e-mail: axel.buecher@ruhr-uni-bochum.de

More information

Extreme Value Analysis and Spatial Extremes

Extreme Value Analysis and Spatial Extremes Extreme Value Analysis and Department of Statistics Purdue University 11/07/2013 Outline Motivation 1 Motivation 2 Extreme Value Theorem and 3 Bayesian Hierarchical Models Copula Models Max-stable Models

More information

Measurable functions are approximately nice, even if look terrible.

Measurable functions are approximately nice, even if look terrible. Tel Aviv University, 2015 Functions of real variables 74 7 Approximation 7a A terrible integrable function........... 74 7b Approximation of sets................ 76 7c Approximation of functions............

More information

II - REAL ANALYSIS. This property gives us a way to extend the notion of content to finite unions of rectangles: we define

II - REAL ANALYSIS. This property gives us a way to extend the notion of content to finite unions of rectangles: we define 1 Measures 1.1 Jordan content in R N II - REAL ANALYSIS Let I be an interval in R. Then its 1-content is defined as c 1 (I) := b a if I is bounded with endpoints a, b. If I is unbounded, we define c 1

More information

Bivariate Paired Numerical Data

Bivariate Paired Numerical Data Bivariate Paired Numerical Data Pearson s correlation, Spearman s ρ and Kendall s τ, tests of independence University of California, San Diego Instructor: Ery Arias-Castro http://math.ucsd.edu/~eariasca/teaching.html

More information

arxiv: v1 [math.st] 24 Aug 2017

arxiv: v1 [math.st] 24 Aug 2017 Submitted to the Annals of Statistics MULTIVARIATE DEPENDENCY MEASURE BASED ON COPULA AND GAUSSIAN KERNEL arxiv:1708.07485v1 math.st] 24 Aug 2017 By AngshumanRoy and Alok Goswami and C. A. Murthy We propose

More information

A note about the conjecture about Spearman s rho and Kendall s tau

A note about the conjecture about Spearman s rho and Kendall s tau A note about the conjecture about Spearman s rho and Kendall s tau V. Durrleman Operations Research and Financial Engineering, Princeton University, USA A. Nikeghbali University Paris VI, France T. Roncalli

More information

MAT 257, Handout 13: December 5-7, 2011.

MAT 257, Handout 13: December 5-7, 2011. MAT 257, Handout 13: December 5-7, 2011. The Change of Variables Theorem. In these notes, I try to make more explicit some parts of Spivak s proof of the Change of Variable Theorem, and to supply most

More information

List coloring hypergraphs

List coloring hypergraphs List coloring hypergraphs Penny Haxell Jacques Verstraete Department of Combinatorics and Optimization University of Waterloo Waterloo, Ontario, Canada pehaxell@uwaterloo.ca Department of Mathematics University

More information

Weak Convergence of Stationary Empirical Processes

Weak Convergence of Stationary Empirical Processes Weak Convergence of Stationary Empirical Processes Dragan Radulović Department of Mathematics, Florida Atlantic University Marten Wegkamp Department of Mathematics & Department of Statistical Science,

More information

Mathematical Methods for Physics and Engineering

Mathematical Methods for Physics and Engineering Mathematical Methods for Physics and Engineering Lecture notes for PDEs Sergei V. Shabanov Department of Mathematics, University of Florida, Gainesville, FL 32611 USA CHAPTER 1 The integration theory

More information

THEOREMS, ETC., FOR MATH 515

THEOREMS, ETC., FOR MATH 515 THEOREMS, ETC., FOR MATH 515 Proposition 1 (=comment on page 17). If A is an algebra, then any finite union or finite intersection of sets in A is also in A. Proposition 2 (=Proposition 1.1). For every

More information

Combinatorics in Banach space theory Lecture 12

Combinatorics in Banach space theory Lecture 12 Combinatorics in Banach space theory Lecture The next lemma considerably strengthens the assertion of Lemma.6(b). Lemma.9. For every Banach space X and any n N, either all the numbers n b n (X), c n (X)

More information

BALANCING GAUSSIAN VECTORS. 1. Introduction

BALANCING GAUSSIAN VECTORS. 1. Introduction BALANCING GAUSSIAN VECTORS KEVIN P. COSTELLO Abstract. Let x 1,... x n be independent normally distributed vectors on R d. We determine the distribution function of the minimum norm of the 2 n vectors

More information

van Rooij, Schikhof: A Second Course on Real Functions

van Rooij, Schikhof: A Second Course on Real Functions vanrooijschikhof.tex April 25, 2018 van Rooij, Schikhof: A Second Course on Real Functions Notes from [vrs]. Introduction A monotone function is Riemann integrable. A continuous function is Riemann integrable.

More information

Recall the Basics of Hypothesis Testing

Recall the Basics of Hypothesis Testing Recall the Basics of Hypothesis Testing The level of significance α, (size of test) is defined as the probability of X falling in w (rejecting H 0 ) when H 0 is true: P(X w H 0 ) = α. H 0 TRUE H 1 TRUE

More information

Subject CS1 Actuarial Statistics 1 Core Principles

Subject CS1 Actuarial Statistics 1 Core Principles Institute of Actuaries of India Subject CS1 Actuarial Statistics 1 Core Principles For 2019 Examinations Aim The aim of the Actuarial Statistics 1 subject is to provide a grounding in mathematical and

More information

Some SDEs with distributional drift Part I : General calculus. Flandoli, Franco; Russo, Francesco; Wolf, Jochen

Some SDEs with distributional drift Part I : General calculus. Flandoli, Franco; Russo, Francesco; Wolf, Jochen Title Author(s) Some SDEs with distributional drift Part I : General calculus Flandoli, Franco; Russo, Francesco; Wolf, Jochen Citation Osaka Journal of Mathematics. 4() P.493-P.54 Issue Date 3-6 Text

More information

Lecture 9 Metric spaces. The contraction fixed point theorem. The implicit function theorem. The existence of solutions to differenti. equations.

Lecture 9 Metric spaces. The contraction fixed point theorem. The implicit function theorem. The existence of solutions to differenti. equations. Lecture 9 Metric spaces. The contraction fixed point theorem. The implicit function theorem. The existence of solutions to differential equations. 1 Metric spaces 2 Completeness and completion. 3 The contraction

More information

Analysis Comprehensive Exam Questions Fall F(x) = 1 x. f(t)dt. t 1 2. tf 2 (t)dt. and g(t, x) = 2 t. 2 t

Analysis Comprehensive Exam Questions Fall F(x) = 1 x. f(t)dt. t 1 2. tf 2 (t)dt. and g(t, x) = 2 t. 2 t Analysis Comprehensive Exam Questions Fall 2. Let f L 2 (, ) be given. (a) Prove that ( x 2 f(t) dt) 2 x x t f(t) 2 dt. (b) Given part (a), prove that F L 2 (, ) 2 f L 2 (, ), where F(x) = x (a) Using

More information

Adaptive estimation of the copula correlation matrix for semiparametric elliptical copulas

Adaptive estimation of the copula correlation matrix for semiparametric elliptical copulas Adaptive estimation of the copula correlation matrix for semiparametric elliptical copulas Department of Mathematics Department of Statistical Science Cornell University London, January 7, 2016 Joint work

More information

Problem set 1, Real Analysis I, Spring, 2015.

Problem set 1, Real Analysis I, Spring, 2015. Problem set 1, Real Analysis I, Spring, 015. (1) Let f n : D R be a sequence of functions with domain D R n. Recall that f n f uniformly if and only if for all ɛ > 0, there is an N = N(ɛ) so that if n

More information

Stein s Method and Characteristic Functions

Stein s Method and Characteristic Functions Stein s Method and Characteristic Functions Alexander Tikhomirov Komi Science Center of Ural Division of RAS, Syktyvkar, Russia; Singapore, NUS, 18-29 May 2015 Workshop New Directions in Stein s method

More information

arxiv: v1 [math.st] 28 May 2013

arxiv: v1 [math.st] 28 May 2013 WHEN UNIFORM WEAK CONVERGENCE FAILS: EMPIRICAL PROCESSES FOR DEPENDENCE FUNCTIONS VIA EPI- AND HYPOGRAPHS 1 arxiv:1305.6408v1 [math.st] 28 May 2013 By Axel Bücher, Johan Segers and Stanislav Volgushev

More information

Robust Backtesting Tests for Value-at-Risk Models

Robust Backtesting Tests for Value-at-Risk Models Robust Backtesting Tests for Value-at-Risk Models Jose Olmo City University London (joint work with Juan Carlos Escanciano, Indiana University) Far East and South Asia Meeting of the Econometric Society

More information

Yale University Department of Computer Science. The VC Dimension of k-fold Union

Yale University Department of Computer Science. The VC Dimension of k-fold Union Yale University Department of Computer Science The VC Dimension of k-fold Union David Eisenstat Dana Angluin YALEU/DCS/TR-1360 June 2006, revised October 2006 The VC Dimension of k-fold Union David Eisenstat

More information

h(x) lim H(x) = lim Since h is nondecreasing then h(x) 0 for all x, and if h is discontinuous at a point x then H(x) > 0. Denote

h(x) lim H(x) = lim Since h is nondecreasing then h(x) 0 for all x, and if h is discontinuous at a point x then H(x) > 0. Denote Real Variables, Fall 4 Problem set 4 Solution suggestions Exercise. Let f be of bounded variation on [a, b]. Show that for each c (a, b), lim x c f(x) and lim x c f(x) exist. Prove that a monotone function

More information

ANALYSIS QUALIFYING EXAM FALL 2017: SOLUTIONS. 1 cos(nx) lim. n 2 x 2. g n (x) = 1 cos(nx) n 2 x 2. x 2.

ANALYSIS QUALIFYING EXAM FALL 2017: SOLUTIONS. 1 cos(nx) lim. n 2 x 2. g n (x) = 1 cos(nx) n 2 x 2. x 2. ANALYSIS QUALIFYING EXAM FALL 27: SOLUTIONS Problem. Determine, with justification, the it cos(nx) n 2 x 2 dx. Solution. For an integer n >, define g n : (, ) R by Also define g : (, ) R by g(x) = g n

More information

Chapter 1. Measure Spaces. 1.1 Algebras and σ algebras of sets Notation and preliminaries

Chapter 1. Measure Spaces. 1.1 Algebras and σ algebras of sets Notation and preliminaries Chapter 1 Measure Spaces 1.1 Algebras and σ algebras of sets 1.1.1 Notation and preliminaries We shall denote by X a nonempty set, by P(X) the set of all parts (i.e., subsets) of X, and by the empty set.

More information

Theorem 2.1 (Caratheodory). A (countably additive) probability measure on a field has an extension. n=1

Theorem 2.1 (Caratheodory). A (countably additive) probability measure on a field has an extension. n=1 Chapter 2 Probability measures 1. Existence Theorem 2.1 (Caratheodory). A (countably additive) probability measure on a field has an extension to the generated σ-field Proof of Theorem 2.1. Let F 0 be

More information

1/12/05: sec 3.1 and my article: How good is the Lebesgue measure?, Math. Intelligencer 11(2) (1989),

1/12/05: sec 3.1 and my article: How good is the Lebesgue measure?, Math. Intelligencer 11(2) (1989), Real Analysis 2, Math 651, Spring 2005 April 26, 2005 1 Real Analysis 2, Math 651, Spring 2005 Krzysztof Chris Ciesielski 1/12/05: sec 3.1 and my article: How good is the Lebesgue measure?, Math. Intelligencer

More information

1 Weak Convergence in R k

1 Weak Convergence in R k 1 Weak Convergence in R k Byeong U. Park 1 Let X and X n, n 1, be random vectors taking values in R k. These random vectors are allowed to be defined on different probability spaces. Below, for the simplicity

More information

LECTURE 3 Functional spaces on manifolds

LECTURE 3 Functional spaces on manifolds LECTURE 3 Functional spaces on manifolds The aim of this section is to introduce Sobolev spaces on manifolds (or on vector bundles over manifolds). These will be the Banach spaces of sections we were after

More information

Real Analysis Notes. Thomas Goller

Real Analysis Notes. Thomas Goller Real Analysis Notes Thomas Goller September 4, 2011 Contents 1 Abstract Measure Spaces 2 1.1 Basic Definitions........................... 2 1.2 Measurable Functions........................ 2 1.3 Integration..............................

More information

APPROXIMATE ISOMETRIES ON FINITE-DIMENSIONAL NORMED SPACES

APPROXIMATE ISOMETRIES ON FINITE-DIMENSIONAL NORMED SPACES APPROXIMATE ISOMETRIES ON FINITE-DIMENSIONAL NORMED SPACES S. J. DILWORTH Abstract. Every ε-isometry u between real normed spaces of the same finite dimension which maps the origin to the origin may by

More information

MULTIDIMENSIONAL POVERTY MEASUREMENT: DEPENDENCE BETWEEN WELL-BEING DIMENSIONS USING COPULA FUNCTION

MULTIDIMENSIONAL POVERTY MEASUREMENT: DEPENDENCE BETWEEN WELL-BEING DIMENSIONS USING COPULA FUNCTION Rivista Italiana di Economia Demografia e Statistica Volume LXXII n. 3 Luglio-Settembre 2018 MULTIDIMENSIONAL POVERTY MEASUREMENT: DEPENDENCE BETWEEN WELL-BEING DIMENSIONS USING COPULA FUNCTION Kateryna

More information

Integration on Measure Spaces

Integration on Measure Spaces Chapter 3 Integration on Measure Spaces In this chapter we introduce the general notion of a measure on a space X, define the class of measurable functions, and define the integral, first on a class of

More information

Weak convergence. Amsterdam, 13 November Leiden University. Limit theorems. Shota Gugushvili. Generalities. Criteria

Weak convergence. Amsterdam, 13 November Leiden University. Limit theorems. Shota Gugushvili. Generalities. Criteria Weak Leiden University Amsterdam, 13 November 2013 Outline 1 2 3 4 5 6 7 Definition Definition Let µ, µ 1, µ 2,... be probability measures on (R, B). It is said that µ n converges weakly to µ, and we then

More information

Testing Independence Based on Bernstein Empirical Copula and Copula Density

Testing Independence Based on Bernstein Empirical Copula and Copula Density Testing Independence Based on Bernstein Empirical Copula and Copula Density Mohamed Belalia a, Taoufi Bouezmarni a, Abderrahim Taamouti b a Département de mathématiques, Université de Sherbrooe, Canada

More information

A note on a construction of J. F. Feinstein

A note on a construction of J. F. Feinstein STUDIA MATHEMATICA 169 (1) (2005) A note on a construction of J. F. Feinstein by M. J. Heath (Nottingham) Abstract. In [6] J. F. Feinstein constructed a compact plane set X such that R(X), the uniform

More information

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages:

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages: Glossary The ISI glossary of statistical terms provides definitions in a number of different languages: http://isi.cbs.nl/glossary/index.htm Adjusted r 2 Adjusted R squared measures the proportion of the

More information

Analysis Finite and Infinite Sets The Real Numbers The Cantor Set

Analysis Finite and Infinite Sets The Real Numbers The Cantor Set Analysis Finite and Infinite Sets Definition. An initial segment is {n N n n 0 }. Definition. A finite set can be put into one-to-one correspondence with an initial segment. The empty set is also considered

More information

Measurable Choice Functions

Measurable Choice Functions (January 19, 2013) Measurable Choice Functions Paul Garrett garrett@math.umn.edu http://www.math.umn.edu/ garrett/ [This document is http://www.math.umn.edu/ garrett/m/fun/choice functions.pdf] This note

More information