On Parameter-Mixing of Dependence Parameters

Murray D. Smith and Xiangyuan Tommy Chen

Econometrics and Business Statistics, The University of Sydney

Incomplete Preliminary Draft, May 9, 2006 (NOT FOR QUOTATION)

Abstract: The method of parameter-mixing has served to introduce new distributions into statistical practice. These distributions are usually more flexible in the sense that they may contain an increased number of parameters compared to the unmixed, parent distribution. The classic example of parameter-mixing is the Beta-Binomial distribution, a univariate distribution formed by assigning a Beta distribution to the success probability of the Binomial distribution. The object of interest in this article is the dependence structure of a collection of two or more random variables, where this is measured by a copula function, the parameters of which are termed dependence parameters. The key issue is to establish to what extent parameter-mixing of dependence parameters contributes to, or enhances, the dependence structure, and promotes improved outcomes in statistical modelling. The results derived here may have further implications because of recent proposals to model dependence parameters in terms of covariates of the random variables of interest.

Keywords: Dependence parameter; Margin parameter; Copula; Copula representation; Fisher Information.

Address for correspondence: Murray D. Smith, Econometrics and Business Statistics, The University of Sydney, Sydney NSW 2006, Australia. (Murray.Smith@econ.usyd.edu.au; tommy_chen@student.usyd.edu.au)
1 Introduction

In multivariate contexts, the dependence parameters (i.e. the parameters of the copula function) typically enter the classic pairwise measures of association between random variables, such as Spearman's rho ρ_S and Kendall's tau τ. For example, the dependence parameter θ ∈ [−1, 1] of the Farlie-Gumbel-Morgenstern family of 2-copulas (FGM hereafter),

  C_θ(u, v) = uv(1 + θ(1 − u)(1 − v))    (1)

where (u, v) ∈ I² = [0, 1] × [0, 1], finds prominence in the relevant formulas: ρ_S = θ/3 and τ = 2θ/9. With data, estimation of dependence parameters (along with any other margin parameters) may often proceed using standard techniques, the point being that these unknown parameters are assumed fixed in the population of interest.

One way in which statistical models can be imbued with greater parametric flexibility is to relax the notion of parameter fixity. One such technique, well known throughout the statistical literature, is (termed here) parameter-mixing. Roughly, the basic idea is to assign the parameter of interest a distribution, such that the latter itself depends on a parameter set larger in dimension than that of the base, or parent, distribution. For extensive discussion and numerous examples see Johnson et al. [1, Chps 8-9]. Perhaps the best-known example of parameter-mixing is the generalisation of the 2-parameter univariate Binomial(n, p) distribution to the 3-parameter univariate Beta-Binomial(n, α, β) distribution, where the latter is formed by ascribing a Beta(α, β) distribution to the success probability of the former. Put formally, this parameter-mix is defined as the following expectation with respect to the parameter distribution:

  Beta-Binomial(n, α, β) = Binomial(n, P) ∧_P Beta(α, β) = E_P[Binomial(n, P)]

where the operator ∧ is a popularly used notation to denote parameter-mixing.
Here, the Binomial(n, p) parent is interpreted as a conditional distribution, where conditioning is on a value p assigned to the variable P. In this instance, parameter-mixing achieves a remarkable success, because what began as a 2-parameter statistical model has now, through mixing, been grown into a new statistical model whose three parameters are formally identified.
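As a numerical illustration of this parameter-mix (a sketch in plain Python; the function names are ours, and a midpoint rule stands in for the exact expectation), the closed-form Beta-Binomial pmf can be checked against a direct average of the Binomial pmf over the Beta density:

```python
from math import comb, exp, lgamma, log

def log_beta_fn(a, b):
    # log of the Beta function B(a, b)
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def betabinom_pmf(k, n, a, b):
    # Closed-form Beta-Binomial pmf: C(n,k) B(k+a, n-k+b) / B(a, b)
    return comb(n, k) * exp(log_beta_fn(k + a, n - k + b) - log_beta_fn(a, b))

def betabinom_pmf_by_mixing(k, n, a, b, m=50_000):
    # E_P[Binomial(n, P) pmf at k] with P ~ Beta(a, b), midpoint rule on (0, 1)
    total = 0.0
    for i in range(m):
        p = (i + 0.5) / m
        beta_pdf = exp((a - 1) * log(p) + (b - 1) * log(1 - p) - log_beta_fn(a, b))
        total += comb(n, k) * p**k * (1 - p) ** (n - k) * beta_pdf / m
    return total

# The mixed pmf matches the closed form, and the closed form sums to one
assert abs(betabinom_pmf(3, 10, 2.0, 3.0) - betabinom_pmf_by_mixing(3, 10, 2.0, 3.0)) < 1e-6
assert abs(sum(betabinom_pmf(k, 10, 2.0, 3.0) for k in range(11)) - 1.0) < 1e-10
```

Here all three parameters (n, α, β) of the mixed model are identified, which is the benchmark against which the dependence-parameter mixes of this article are judged.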
Parameter-mixing ideas applied to dependence parameters would at first sight appear to offer the same opportunity to enhance the flexibility of the resulting copula function. To establish notation, for a (possibly vector-valued) dependence parameter Θ = θ ascribed the distribution F(θ; γ), where the γ are themselves parameters assumed to have dimensionality at least that of θ (preferably more, because the desired aim is to enhance parametric flexibility), the F parameter-mix of the 2-copula C_θ is given by³,⁴

  C′(u, v) = E_Θ[C_Θ(u, v)]
           = ∫ C_θ(u, v) dF(θ; γ)
           = ∫ C_θ(u, v) f(θ; γ) dθ    (2)

where F(θ; γ) denotes the cdf of Θ and f(θ; γ) the pdf of Θ; the last line assumes continuity of the mixing distribution. The outcome C′(u, v) is a 2-copula. Finally, having averaged over the members of the C_θ family, it is clear that the dependence structure of C′ can cover no more than that of C_θ, and can be less if F(θ; γ) is used to represent prior information.

The first readily obvious result to flow from (2) occurs if the parent copula C_θ is linear in θ; for then, provided θ_F = E[Θ] exists, the parameter-mixed copula is equivalent to the parent copula, apart from its parameterisation. Thus, there can be no gain made from parameter-mixing. Indeed, if the parameter space induced through the new dependence parameter θ_F exceeds that of θ, then the additional parameters cannot be identified. To illustrate this, compare the functional form of the following parameter-mixed copula with that of the FGM parent (1), which is linear in θ. Let X ~ F = Beta(α, β), set θ = 2X − 1, and consider the parameter-mix:

  FGM(2X − 1) ∧_X Beta(α, β) = ∫₀¹ uv(1 + (2x − 1)(1 − u)(1 − v)) x^(α−1)(1 − x)^(β−1)/B(α, β) dx
                             = uv(1 + ((α − β)/(α + β))(1 − u)(1 − v)).

Clearly, the FGM dependence structure has been preserved under parameter-mixing, but under this specification of F both induced dependence parameters α and β cannot separately be identified.
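The linearity argument can be checked numerically. The sketch below (our function names; midpoint quadrature approximates the mixing integral) confirms that the Beta(α, β) mix of the FGM copula coincides with the FGM copula evaluated at θ* = (α − β)/(α + β):

```python
from math import exp, lgamma, log

def log_beta_fn(a, b):
    # log of the Beta function B(a, b)
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def fgm(u, v, theta):
    # FGM copula C_theta(u, v) = uv(1 + theta(1-u)(1-v)), theta in [-1, 1]
    return u * v * (1.0 + theta * (1.0 - u) * (1.0 - v))

def beta_mixed_fgm(u, v, a, b, m=20_000):
    # Midpoint-rule approximation of E_X[C_{2X-1}(u, v)], X ~ Beta(a, b)
    total = 0.0
    for i in range(m):
        x = (i + 0.5) / m
        pdf = exp((a - 1) * log(x) + (b - 1) * log(1 - x) - log_beta_fn(a, b))
        total += fgm(u, v, 2 * x - 1) * pdf / m
    return total

a, b = 2.0, 5.0
theta_star = (a - b) / (a + b)  # E[2X - 1]
for u, v in [(0.2, 0.7), (0.5, 0.5), (0.9, 0.1)]:
    assert abs(beta_mixed_fgm(u, v, a, b) - fgm(u, v, theta_star)) < 1e-6
```

Because (1) is linear in θ, the two sides agree exactly; only quadrature error separates them, so nothing beyond a reparameterisation is produced.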
³ Extending parameter-mixing to a copula of arbitrary dimension is straightforward, but as the main ideas can be conveniently expressed in two dimensions, only 2-copulas will be considered in this article.
⁴ An alternative nomenclature is the convex sum of C_θ with respect to F; see Nelsen [2, Sec. 3.2.4].

The re-parameterisation in X that sets λ = (α − β)/(α + β)
would serve to isolate this quantity as the dependence parameter and eliminate the superfluous second parameter, but in doing so nothing whatsoever is gained by parameter-mixing.

A second result, somewhat similar to the first, is that if the parameter-mix C′ is of the same family as C_θ, then there is no gain to be made from parameter-mixing. Consider, for example, Mardia's family of comprehensive copulas, indexed by θ ∈ [−1, 1]:

  C_θ(u, v) = (θ²(1 + θ)/2) M + (1 − θ²) Π + (θ²(1 − θ)/2) W

where M = min(u, v) and W = max(u + v − 1, 0) are the extremal 2-copulas, and Π = uv is the Product copula. Despite being non-linear in θ, parameter-mixing applied to the Mardia family serves merely to recover the parent, differing only in its parameterisation. If parameter-mixing is to enhance the flexibility of the resulting copula function, it is clear from this and the previous example that mixing must generate a family of copulas that is functionally different from the parent.

2 Example: The #9 Family of Copulas⁵

Consider the #9 family of copulas given by

  C_θ(u, v) = uv exp(−θ(log u)(log v))    (3)

where the family is indexed by values assigned to θ ∈ (0, 1]. Obviously, lim_{θ→0} C_θ(u, v) = Π and C_θ(u, v) < Π, so the #9 family covers a region of negative dependence. In terms of Spearman's rho, the #9 family is such that −0.523852 ≤ ρ_S < 0. The lower bound is found by substitution of θ = 1 into

  ρ_S = 12 ∫∫_{I²} C_θ(u, v) du dv − 3 = (12/θ) e^(4/θ) G(4/θ) − 3

where G(z) = ∫_z^∞ e^(−t) t^(−1) dt is a special case of the incomplete gamma function such that, for θ ∈ (0, 1], 0 < G(4/θ) ≤ G(4) = 0.00378.

In this example, parameter-mixing yields a family of copulas different to that of the parent family, with interest here centring on whether the number of dependence parameters can be increased from the original singleton, and be formally identified. For a

⁵ Lacking knowledge of the name of the proposer of this family of copulas, the name #9 is used instead, as this family was listed by Nelsen [2, Table 4.1] as equation number (4.2.9).
distribution F with pdf f(θ; γ), the F parameter-mix of the #9 copula is, applying (2),

  C′(u, v) = ∫₀¹ uv exp(−θ(log u)(log v)) f(θ; γ) dθ = uv mgf_F(−(log u)(log v))

where mgf_F denotes the moment generating function of F, i.e. E[exp(Θt)]. In particular, for F = Beta(α, β), where α > 0 and β > 0 are parameters, with mgf given by the confluent hypergeometric function ₁F₁(α; α + β; t), t ∈ ℝ, the copula of the Beta parameter-mix of the #9 copula is given by

  C′_{α,β}(u, v) = uv ₁F₁(α; α + β; −(log u)(log v))
                 = uv exp(−(log u)(log v)) ₁F₁(β; α + β; (log u)(log v))

where the second line uses Kummer's relation ₁F₁(p; q; x) = e^x ₁F₁(q − p; q; −x).

It is evident that limiting cases applied to the parameters (α, β) correspond to the extremes of dependence that C′_{α,β} covers. For instance, allowing β to be free and letting α → ∞ finds C′_{α,β}(u, v) → C_1(u, v); equally, β → 0 with α free finds C′_{α,β}(u, v) → C_1(u, v). Likewise, allowing β to be free but letting α → 0 finds C′_{α,β}(u, v) → Π, as too β → ∞ with α free finds C′_{α,β}(u, v) → Π. That we cannot distinguish between, for example, α becoming large and β becoming small is an indication that the introduction of additional dependence parameters does not lead to added flexibility, for those parameters are not identified. This can be formalised by examining the Fisher Information matrix for (α, β),

  I_{α,β} = [ i_{αα}  i_{αβ}
              i_{αβ}  i_{ββ} ]

where its elements can be obtained using Theorem 1 of Smith [5], valid under any pair of continuous margins bound together by an assigned copula. To illustrate, the element corresponding to α is given by

  i_{αα} = ∫∫_{I²} (1/c′_{α,β}(u, v)) (∂c′_{α,β}(u, v)/∂α)² du dv

where the copula density c′_{α,β}(u, v) = ∂²C′_{α,β}(u, v)/∂u∂v. Explicit expressions in terms of (α, β) for (i_{αα}, i_{αβ}, i_{ββ}) cannot be obtained; however, numerical integration can be used to approximate the Fisher Information matrix at given
values of (α, β). The following table lists a small selection of results in the parameter space:

  Table: Fisher Information matrix and associated eigenvalues (to 3dp)

  (α, β)     I_{α,β}            Eigenvalues
  (1, 1)     0.094  0.050       0.121, 0.000
             0.050  0.027
  (2, 4)     TBA                TBA
  (2, 10)    TBA                TBA

To numerical precision, the (scaled) second eigenvalue is so small in each case that this is strong evidence that both parameters (α, β) are not identified in the Beta parameter-mix of the #9 family of copulas.

3 Example: The Ali-Mikhail-Haq Family of Copulas

In this example, parameter-mixing yields a family of copulas different to that of the parent family, with interest here centring on measuring the improvement in modelling, if any, with idealised data. Consider the Ali-Mikhail-Haq family of copulas (AMH hereafter)

  C_θ(u, v) = uv/(1 − θ(1 − u)(1 − v))    (4)

where the family is indexed by values assigned to the dependence parameter θ ∈ [−1, 1]. Let F = Beta(α, b), where parameter α > 0 and scalar b > 0 is known. Setting θ = 2X − 1, the Beta(α, b) parameter-mix of the AMH family of copulas is, applying (2),

  C′(u, v) = ∫₀¹ [uv/(1 − (2x − 1)(1 − u)(1 − v))] x^(α−1)(1 − x)^(b−1)/B(α, b) dx
           = [uv/(1 + (1 − u)(1 − v))] ∫₀¹ (1 − sx)^(−1) x^(α−1)(1 − x)^(b−1)/B(α, b) dx
           = [uv/(1 + (1 − u)(1 − v))] ₂F₁(1, α; α + b; s)
           = [uv/(1 − (1 − u)(1 − v))] ₂F₁(1, b; α + b; s/(s − 1))    (5)
where

  s = 2(1 − u)(1 − v)/(1 + (1 − u)(1 − v))

is such that 0 ≤ s ≤ 1 when (u, v) ∈ I², and the solution to the integral can be deduced as a special case of the single-valued, analytic definition of the Gaussian hypergeometric function given by Euler (e.g. see Rainville [3, p. 47]):

  ₂F₁(p, q; r; s) = (1/B(q, r − q)) ∫₀¹ (1 − sx)^(−p) x^(q−1)(1 − x)^(r−q−1) dx

provided |arg(1 − s)| < π. Note that the last line of (5) is obtained by using Euler's transformation:

  ₂F₁(p, q; r; s) = (1 − s)^(−p) ₂F₁(p, r − q; r; s/(s − 1)).

To establish the dependence coverage of the Beta-mixed family of AMH copulas, observe that

  C′(u, v) = C_{−1}(u, v) ₂F₁(1, α; α + b; s)

with, in this case, the Gaussian hypergeometric function satisfying

  1 ≤ ₂F₁(1, α; α + b; s) ≤ ₁F₀(1; s)    (6)

where

  ₁F₀(1; s) = (1 − s)^(−1) = (1 + (1 − u)(1 − v))/(1 − (1 − u)(1 − v)) = C_{+1}(u, v)/C_{−1}(u, v).    (7)

For fixed u, v and b, ₂F₁(1, α; α + b; s) takes smaller (larger) values corresponding to α → 0 (α → ∞), so that the range of dependence coverage of the mixed family of copulas is, as expected, equivalent to that of the parent family; i.e.

  C_{−1}(u, v) ≤ C′(u, v) ≤ C_{+1}(u, v)

where this is found by substituting (7) into (6) and multiplying through by C_{−1}(u, v). In terms of Spearman's measure ρ_S, the mixed family covers the interval [−0.271, 0.4784], the same coverage as that of the parent AMH family of copulas.

[The following experiment ... AMH algorithm given in Nelsen [2]. Results in Table 1.]
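The coverage bounds can be illustrated numerically. The sketch below (our function names; midpoint quadrature stands in for the mixing integral in (5)) checks that the Beta(α, b) mix with θ = 2X − 1 lies between C_{−1} and C_{+1}, and that, unlike the linear FGM case, it differs from the parent evaluated at E[2X − 1]:

```python
from math import exp, lgamma, log

def log_beta_fn(a, b):
    # log of the Beta function B(a, b)
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def amh(u, v, theta):
    # Ali-Mikhail-Haq copula, theta in [-1, 1]
    return u * v / (1.0 - theta * (1.0 - u) * (1.0 - v))

def beta_mixed_amh(u, v, a, b, m=20_000):
    # Midpoint-rule approximation of E_X[C_{2X-1}(u, v)], X ~ Beta(a, b)
    total = 0.0
    for i in range(m):
        x = (i + 0.5) / m
        pdf = exp((a - 1) * log(x) + (b - 1) * log(1 - x) - log_beta_fn(a, b))
        total += amh(u, v, 2 * x - 1) * pdf / m
    return total

for u, v in [(0.3, 0.6), (0.8, 0.2)]:
    mixed = beta_mixed_amh(u, v, a=2.0, b=3.0)
    # Dependence coverage is bounded by the parent extremes, as in the text
    assert amh(u, v, -1.0) <= mixed <= amh(u, v, 1.0)
    # AMH is strictly convex in theta, so the mix exceeds AMH at E[2X-1] = -0.2
    assert mixed > amh(u, v, -0.2) + 1e-5
```

The strict inequality in the last assertion is the numerical counterpart of the AMH family being non-linear in θ: here the mix is genuinely a new functional form rather than a reparameterised parent.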
4 Empirical Application

It is interesting to observe that while the AMH family (4) nests the Product copula as a special case, i.e. Π = uv = C_0(u, v), the Beta-mixed family (5) does not. For C′(u, v) to nest Π, the Gaussian hypergeometric function ₂F₁(1, α; α + b; s) would have to be simplifiable to (1 − s/2)^(−1) = 1 + (1 − u)(1 − v) for some α at every given b, but no such set of pairings can be found. This implies that even though the point at which the mixed copula transits from negative to positive dependence corresponds to zero dependence, that point does not represent independence. For the mixed AMH copula to retain independence as a special case, a mixture that restricts attention to either positive or negative ranges of dependence is required. Such informative mixtures will retain the Product copula as a special case provided it represents an extreme of the parameter space. Furthermore, informative mixtures are of interest in their own right for modelling purposes, as will be illustrated.

To construct a positive informative mixture, contrast the derivation of (5) with the following Beta parameter-mix of the AMH copula:

  C′⁺(u, v) = ∫₀¹ [uv/(1 − x(1 − u)(1 − v))] x^(α−1)(1 − x)^(b−1)/B(α, b) dx
            = uv ₂F₁(1, α; α + b; (1 − u)(1 − v))    (8)

that covers positive dependence, Π(u, v) ≤ C′⁺(u, v) ≤ C_{+1}(u, v), where parameter α > 0 and lim_{α→0} C′⁺(u, v) = Π. The negative-only version of the previous mix is given by

  C′⁻(u, v) = ∫₀¹ [uv/(1 + x(1 − u)(1 − v))] x^(α−1)(1 − x)^(b−1)/B(α, b) dx
            = uv ₂F₁(1, α; α + b; −(1 − u)(1 − v))    (9)

where parameter α > 0 and lim_{α→0} C′⁻(u, v) = Π. Clearly, informative mixing is one means by which prior information can be imposed.

[The following bivariate data ...]
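The closed form (8) can be cross-checked against its defining integral. In the sketch below (our function names), the ₂F₁ series is summed via the Pochhammer-ratio recursion, valid here because 0 ≤ (1 − u)(1 − v) < 1 on the interior of I²:

```python
from math import exp, lgamma, log

def log_beta_fn(a, b):
    # log of the Beta function B(a, b)
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def hyp2f1_1a(a, c, z, terms=500):
    # 2F1(1, a; c; z) = sum_{n>=0} (a)_n / (c)_n * z^n, convergent for |z| < 1
    term, total = 1.0, 1.0
    for n in range(terms):
        term *= (a + n) / (c + n) * z
        total += term
    return total

def amh_pos_mix(u, v, a, b):
    # Positive informative mix (8): uv * 2F1(1, a; a+b; (1-u)(1-v))
    return u * v * hyp2f1_1a(a, a + b, (1.0 - u) * (1.0 - v))

def amh_pos_mix_quad(u, v, a, b, m=20_000):
    # Direct midpoint quadrature of the mixing integral in (8)
    total = 0.0
    for i in range(m):
        x = (i + 0.5) / m
        pdf = exp((a - 1) * log(x) + (b - 1) * log(1 - x) - log_beta_fn(a, b))
        total += u * v / (1.0 - x * (1.0 - u) * (1.0 - v)) * pdf / m
    return total

u, v, a, b = 0.3, 0.4, 1.5, 2.0
assert abs(amh_pos_mix(u, v, a, b) - amh_pos_mix_quad(u, v, a, b)) < 1e-6
# As a -> 0 the Beta(a, b) mass piles up at x = 0 and the Product copula is recovered
assert abs(amh_pos_mix(u, v, 1e-8, b) - u * v) < 1e-6
```

The final assertion illustrates the property claimed for (8): the Product copula, and hence independence, is retained at the α → 0 extreme of the parameter space.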
References

[1] Johnson, N.L., Kotz, S., and Kemp, A.W. (1993). Univariate Discrete Distributions. Wiley: New York.
[2] Nelsen, R.B. (2006). An Introduction to Copulas. 2nd edition. Springer-Verlag: New York.
[3] Rainville, E.D. (1960). Special Functions. Macmillan: New York.
[4] Slater, L.J. (1960). Confluent Hypergeometric Functions. Cambridge University Press: Cambridge.
[5] Smith, M.D. (2005). Invariance theorems for Fisher Information. Unpublished manuscript.
Table 1: Estimates of the AMH and Power-mixed AMH models

  (Columns: parameter value; ρ_S; estimate (standard error) and log L under the AMH model; estimate (standard error) and log L under the Power-mixed AMH model.)

Notes: (i) Sample size n = 1000. (ii) Estimates are averaged over 200 replications. (iii) Figures to 4dp.
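The Monte Carlo design behind Table 1 requires draws from the AMH family. A sketch of the conditional-distribution sampler follows (our function names; the closed-form conditional cdf comes from differentiating (4) with respect to u, and bisection is one illustrative way to invert it):

```python
import random

def amh(u, v, theta):
    # Ali-Mikhail-Haq copula (4), theta in [-1, 1]
    return u * v / (1.0 - theta * (1.0 - u) * (1.0 - v))

def cond_cdf(v, u, theta):
    # C_{2|1}(v | u) = dC/du = v(1 - theta(1-v)) / (1 - theta(1-u)(1-v))^2
    return v * (1.0 - theta * (1.0 - v)) / (1.0 - theta * (1.0 - u) * (1.0 - v)) ** 2

def sample_amh(theta, n, rng):
    # Conditional-distribution method: draw u, t ~ U(0, 1) and solve
    # C_{2|1}(v | u) = t for v by bisection (the conditional cdf is increasing in v)
    pairs = []
    for _ in range(n):
        u, t = rng.random(), rng.random()
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = 0.5 * (lo + hi)
            if cond_cdf(mid, u, theta) < t:
                lo = mid
            else:
                hi = mid
        pairs.append((u, 0.5 * (lo + hi)))
    return pairs

# Sanity checks: the conditional cdf matches a central difference of C in u,
# and simulated points fall in the open unit square
u, v, theta, h = 0.3, 0.6, 0.5, 1e-6
fd = (amh(u + h, v, theta) - amh(u - h, v, theta)) / (2 * h)
assert abs(fd - cond_cdf(v, u, theta)) < 1e-8
assert all(0.0 < x < 1.0 and 0.0 < y < 1.0 for x, y in sample_amh(0.9, 500, random.Random(1)))
```

Maximum-likelihood fitting of θ (or of α in the mixed models) then proceeds from the sampled pairs in the usual way.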