A family of multivariate distributions with prefixed marginals

Isidro R. Cruz-Medina, F. Garcia-Paez and J. R. Pablos-Tavares.
Recursos Naturales, Instituto Tecnológico de Sonora, Cinco de Febrero 818, Cd. Obregón, Son., CP 85 000, México.
Corresponding author. E-mail address: rcruz@itson.mx

Abstract

When dealing with multivariate data, it is convenient to have a stock of adaptable families flexible enough to be constructed from a specific set of univariate marginals. Such a family of multivariate distributions is constructed in this paper by a transformation of the multivariate normal distribution. It is shown that the marginal and conditional distributions belong to the same family, which has a simple expression. The dependence parameter for this family, in the bivariate case, is the well known product moment correlation coefficient. In contrast, most bivariate copulas have parameters that are difficult to interpret. An application to hydrological data shows that the bivariate version of the proposed family compares well with bivariate copulas.

Key words: Copulas; Multivariate distribution; Marginal and conditional distribution; Bivariate distributions.

1. Introduction

Multivariate distributions are needed to model complex events in everyday science applications. Although the multinormal distribution has been extensively used for the analysis of multivariate data, there are many important situations where the univariate components of complex phenomena are clearly non-normal. In areas like hydrology (Yue et al., 2001), life testing and biostatistics (Park, 2004), for example, variables are positive
and skewed, and distributions like the gamma, lognormal, inverse Gaussian and others are commonly used. In most cases it may be difficult to completely specify the joint distribution of the available random vector; however, it may be possible to specify the marginal distributions of its components and their correlation structure.

The construction of bivariate and multivariate distribution functions is a subject which has been studied intensively. In particular, the construction of bivariate distributions with specified marginals has been discussed by Moran (1969), Farlie (1960), Plackett (1965), Johnson & Tenenbein (1981), Genest & MacKay (1986) and Genest (1987). Marshall & Olkin (1988), Koehler & Symanowski (1995) and Arnold et al. (2006) have presented methods for constructing multivariate distributions. The general method for constructing multivariate distributions was provided by the concept of copula, given by Sklar (1959) with the following theorem.

Theorem. If $H$ is a distribution function on $R^p$ with one-dimensional marginal distribution functions $F_1, F_2, \ldots, F_p$, there is a copula $C$ such that

$$H(\mathbf{X}) = C\big(F_1(x_1), F_2(x_2), \ldots, F_p(x_p)\big) \tag{1}$$

If each $F_i$ is continuous, then the copula $C$ satisfying (1) is unique and is given by

$$C(\mathbf{U}) = H\big(F_1^{-1}(u_1), F_2^{-1}(u_2), \ldots, F_p^{-1}(u_p)\big) \tag{2}$$

for $\mathbf{U} \in (0,1)^p$, where $F_i^{-1}(V) = \inf\{W : F_i(W) \ge V\}$, $i = 1, 2, \ldots, p$, and where

$$\mathbf{X}^T = (x_1, x_2, \ldots, x_p) \quad \text{and} \quad \mathbf{U}^T = (u_1, u_2, \ldots, u_p) \tag{3}$$

are the transposes of the column vectors $\mathbf{X}$ and $\mathbf{U}$ respectively. Conversely, if $C$ is a copula on $[0,1]^p$ and $F_1, F_2, \ldots, F_p$ are distribution functions on $R$, then the function $H$ defined by (1) is a distribution function on $R^p$ with one-dimensional marginal distributions $F_1, F_2, \ldots, F_p$.
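As a small numerical illustration of the converse part of the theorem, the sketch below builds a bivariate distribution function from equation (1) and checks that a margin is recovered when the other argument goes to infinity. The choices here are assumptions made only for the example: the copula is the Cook-Johnson form, the two margins are exponential, and the parameter values and function names are hypothetical.

```python
import math


def exponential_cdf(x, scale):
    """CDF of an exponential distribution with the given scale (its mean)."""
    return 1.0 - math.exp(-x / scale) if x > 0 else 0.0


def cook_johnson_copula(u, v, theta):
    """Cook-Johnson copula C(u, v), theta > 0 (assumed copula for this example)."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def joint_cdf(x1, x2, theta=2.0, scale1=10.0, scale2=3.0):
    """Equation (1) in two dimensions: H(x1, x2) = C(F1(x1), F2(x2))."""
    u = exponential_cdf(x1, scale1)
    v = exponential_cdf(x2, scale2)
    return cook_johnson_copula(u, v, theta)


# Sending x2 to infinity should return the first margin F1(x1).
h_margin = joint_cdf(5.0, math.inf)
f_margin = exponential_cdf(5.0, 10.0)

# A joint CDF can never exceed either of its margins.
h_inner = joint_cdf(5.0, 2.0)
upper_bound = min(exponential_cdf(5.0, 10.0), exponential_cdf(2.0, 3.0))
```

Any other copula and any other continuous margins could be substituted in `joint_cdf` without changing the structure of the construction.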
The proof of this theorem can be seen in Nelsen (2006), who describes different methods for constructing copulas. In this paper, the inversion method is applied to obtain a multivariate family. The inversion method begins with a multivariate distribution function $G$, with margins $G_1, G_2, \ldots, G_p$, to obtain the copula:

$$C(\mathbf{U}) = G\big(G_1^{-1}(u_1), G_2^{-1}(u_2), \ldots, G_p^{-1}(u_p)\big)$$

With this copula, a new multivariate distribution with arbitrary margins $F_1, F_2, \ldots, F_p$ can be constructed using Sklar's theorem:

$$H(\mathbf{X}) = C\big(F_1(x_1), F_2(x_2), \ldots, F_p(x_p)\big)$$

The inversion method is quite popular; it was used by Moran (1969), Koehler & Symanowski (1995) and Arnold et al. (2006) to obtain multivariate distributions. In this paper, the multivariate normal distribution is used as the $G$ function in the inversion method. This distribution has the property that its dependence structure is completely specified by the covariance matrix. In section 2, the probability density function of this family, which is called the Generalized Moran Family (GMF), is obtained. In section 3, the conditional and marginal distributions of any subset of variables are derived and it is shown that they belong to the same GMF. In section 4, an application of this family is presented and some methods of estimation are discussed.

2. A family of multivariate distributions

For the construction of the multivariate GMF, the following well known results are used.

1. Dependence among variables in a multivariate standard normal distribution is completely specified by the correlation matrix (this property makes the multivariate normal distribution ideal for expressing the dependence among variables with other distributions).
2. The cumulative distribution function (CDF) of any continuous univariate variable, evaluated at that variable, has a uniform distribution $u(0,1)$ on the interval $(0,1)$. This property allows any continuous distribution to be transformed into another. Given a continuous variable $x$ with density $f(x)$, its CDF $F(x)$ has a uniform $u(0,1)$ distribution. Applying the quantile function $G^{-1}(u)$ to this uniform variable gives a variable $y$ with density $g(y)$.

3. The change of variable theorem (Casella, 1990) allows one to obtain a multivariate distribution by transformations on an original set of variables.

Let us start with a $p$-dimensional random vector $\mathbf{Z}$ with the multivariate standard normal distribution (Mardia et al., 1979) given by (4):

$$f_Z(\mathbf{Z}) = (2\pi)^{-p/2}\, |\Sigma|^{-1/2} \exp\left\{-\tfrac{1}{2}\, \mathbf{Z}^T \Sigma^{-1} \mathbf{Z}\right\} \tag{4}$$

where $\mathbf{Z}$ is a column vector whose components $Z_i$ follow a standard normal distribution, and $\Sigma^{-1}$ is the inverse of the covariance (correlation) matrix given by (5).

$$\Sigma = \begin{pmatrix} 1 & \rho_{12} & \cdots & \rho_{1p} \\ \rho_{21} & 1 & \cdots & \rho_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ \rho_{p1} & \rho_{p2} & \cdots & 1 \end{pmatrix} \tag{5}$$

Let $\mathbf{X}$ be the column vector defined in (3), whose components $x_i$ follow possibly different known marginal distributions with PDFs given by (6):

$$f_i(x_i; \theta_i) \quad \text{for } i = 1, 2, \ldots, p \tag{6}$$

where $\theta_i$ is a column vector of $t_i$ parameters. For example, for the gamma family, $t_i = 2$ and $\theta_i^T = (\alpha_i, \lambda_i)$; $\alpha_i$ and $\lambda_i$ are the scale and shape parameters of the gamma
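distribution. The corresponding CDFs of the variables $x_i$ are $F_i(x_i; \theta_i)$, given by equation (7):

$$F_i(x_i; \theta_i) = \int_{-\infty}^{x_i} f_i(t; \theta_i)\, dt, \quad i = 1, 2, \ldots, p \tag{7}$$

Vector $\mathbf{X}$ can be obtained from vector $\mathbf{Z}$ with the element-wise one-to-one transformation:

$$x_i = F_i^{-1}\left[\Phi(Z_i)\right], \quad \text{for } i = 1, 2, \ldots, p \tag{8}$$

where $\Phi(Z)$ represents the standard normal CDF given by (9).

$$\Phi(Z) = (2\pi)^{-1/2} \int_{-\infty}^{Z} \exp\left(-\tfrac{1}{2} t^2\right) dt \tag{9}$$

The inverse transformation of equation (8) is:

$$Z_i = \Phi^{-1}\left(F_i(x_i)\right), \quad \text{for } i = 1, 2, \ldots, p \tag{10}$$

Applying the change of variable theorem, the joint PDF of vector $\mathbf{X}$ is given by (11).

$$h(\mathbf{X}) = f_Z(\mathbf{Z})\, |J| \tag{11}$$

where $\mathbf{Z}^T = \left[\Phi^{-1}(F_1(x_1)), \Phi^{-1}(F_2(x_2)), \ldots, \Phi^{-1}(F_p(x_p))\right]$, and $J$ is the Jacobian of the inverse transformation, given by:

$$J = \left|\frac{\partial \mathbf{Z}}{\partial \mathbf{X}}\right| = \prod_{i=1}^{p} \frac{\partial Z_i}{\partial x_i} \tag{12}$$

This equality holds because each function $Z_i$ has only one argument, the variable $x_i$, so the matrix of partial derivatives is diagonal. The derivative $\partial Z_i / \partial x_i$ is obtained by applying the well known formulas for the derivative of a composition of functions and for the derivative of an inverse function:

$$\frac{\partial Z_i}{\partial x_i} = \frac{\partial\, \Phi^{-1}(F_i(x_i))}{\partial x_i} = \frac{f_i(x_i; \theta_i)}{\varphi\left(\Phi^{-1}(F_i(x_i))\right)} = (2\pi)^{1/2} \exp\left\{\tfrac{1}{2} Z_i^2\right\} f_i(x_i; \theta_i), \quad i = 1, 2, \ldots, p$$

where $\varphi$ stands for the PDF of the standard normal distribution and $Z_i = \Phi^{-1}(F_i(x_i))$. Therefore, the PDF of the GMF has the compact expression:

$$h(\mathbf{X}) = |\Sigma|^{-1/2} \exp\left\{-\tfrac{1}{2}\, \mathbf{Z}^T \left(\Sigma^{-1} - I\right) \mathbf{Z}\right\} \prod_{i=1}^{p} f_i(x_i; \theta_i) \tag{13}$$

where $I$ stands for the identity matrix of order $p$.

3. Marginal and conditional distributions

3.1 Marginal distributions

If the $p$-dimensional vector $\mathbf{X}$ belongs to the GMF, we will say that

$$\mathbf{X} \sim M_p\left(\mathbf{0}, \Sigma, \{f_i(x_i; \theta_i)\}_{i=1,\ldots,p}\right) \tag{14}$$

where the first element, the zero $p$-component vector, is the mean and $\Sigma$ is the covariance matrix of the multivariate normal distribution used in the transformation; the third set represents the marginal PDFs of the GMF. If the random vector $\mathbf{X}$ is partitioned into two vectors $\mathbf{X}_1$ and $\mathbf{X}_2$ with $r$ and $q = (p - r)$ components, the construction of the family suggests that:

$$\mathbf{X}_1 \sim M_r\left(\mathbf{0}, \Sigma_{11}, \{f_i(x_i; \theta_i)\}_{i=1,\ldots,r}\right)$$

where $\Sigma_{11}$ is defined in the corresponding partitioning of matrix $\Sigma$:

$$\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}$$

Proof: By definition:

$$h_1(\mathbf{X}_1) = \int h(\mathbf{X})\, d\mathbf{X}_2 \tag{15}$$

For the evaluation of this integral, the $q$-component vector $\mathbf{X}_2$ will be transformed into the vector $\mathbf{Z}_2$ of standard normal components, where the vector $\mathbf{Z}^T = (\mathbf{Z}_1^T, \mathbf{Z}_2^T)$ is partitioned in the same way as vector $\mathbf{X}$. The Jacobian of this transformation (8) is: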
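$$J = \prod_{i=r+1}^{p} \frac{\partial x_i}{\partial Z_i} = (2\pi)^{-q/2} \exp\left\{-\tfrac{1}{2} \sum_{i=r+1}^{p} Z_i^2\right\} \prod_{i=r+1}^{p} \frac{1}{f_i(x_i; \theta_i)} \tag{16}$$

therefore,

$$h_1(\mathbf{X}_1) = \int (2\pi)^{-q/2}\, |\Sigma|^{-1/2} \exp\left\{-\tfrac{1}{2}\, \mathbf{Z}^T \Sigma^{-1} \mathbf{Z} + \tfrac{1}{2} \sum_{i=1}^{r} Z_i^2\right\} \prod_{i=1}^{r} f_i(x_i; \theta_i)\, d\mathbf{Z}_2 \tag{17}$$

The expressions for the determinant and the inverse of a partitioned matrix (Graybill, 1969):

$$|\Sigma| = |\Sigma_{11}|\, |\Sigma_{22.1}|, \qquad \Sigma^{-1} = \begin{pmatrix} \Sigma_{11}^{-1} + \Sigma_{11}^{-1} \Sigma_{12} \Sigma_{22.1}^{-1} \Sigma_{21} \Sigma_{11}^{-1} & -\Sigma_{11}^{-1} \Sigma_{12} \Sigma_{22.1}^{-1} \\ -\Sigma_{22.1}^{-1} \Sigma_{21} \Sigma_{11}^{-1} & \Sigma_{22.1}^{-1} \end{pmatrix} \tag{18}$$

where $\Sigma_{22.1} = \Sigma_{22} - \Sigma_{21} \Sigma_{11}^{-1} \Sigma_{12}$, allow the quadratic form in (17) to be expressed as:

$$\mathbf{Z}^T \Sigma^{-1} \mathbf{Z} = \mathbf{Z}_1^T \Sigma_{11}^{-1} \mathbf{Z}_1 + \left(\mathbf{Z}_2 - \Sigma_{21} \Sigma_{11}^{-1} \mathbf{Z}_1\right)^T \Sigma_{22.1}^{-1} \left(\mathbf{Z}_2 - \Sigma_{21} \Sigma_{11}^{-1} \mathbf{Z}_1\right) \tag{19}$$

Therefore:

$$h_1(\mathbf{X}_1) = |\Sigma_{11}|^{-1/2} \exp\left\{-\tfrac{1}{2}\, \mathbf{Z}_1^T \Sigma_{11}^{-1} \mathbf{Z}_1 + \tfrac{1}{2} \sum_{i=1}^{r} Z_i^2\right\} \prod_{i=1}^{r} f_i(x_i; \theta_i)$$
$$\qquad \times \int (2\pi)^{-q/2}\, |\Sigma_{22.1}|^{-1/2} \exp\left\{-\tfrac{1}{2} \left(\mathbf{Z}_2 - \Sigma_{21} \Sigma_{11}^{-1} \mathbf{Z}_1\right)^T \Sigma_{22.1}^{-1} \left(\mathbf{Z}_2 - \Sigma_{21} \Sigma_{11}^{-1} \mathbf{Z}_1\right)\right\} d\mathbf{Z}_2$$

The marginal distribution of $\mathbf{X}_1$ is $M_r\left(\mathbf{0}, \Sigma_{11}, \{f_i(x_i; \theta_i)\}_{i=1,\ldots,r}\right)$ because the integral in the second line of this expression is equal to one. In that integrand, $\mathbf{Z}_2$ has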
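a multivariate normal distribution with mean $\Sigma_{21} \Sigma_{11}^{-1} \mathbf{Z}_1$ and covariance matrix $\Sigma_{22.1}$ (the conditional distribution $f(\mathbf{Z}_2 \mid \mathbf{Z}_1)$).

3.2 Conditional distributions

Results in the last section show that the conditional distribution of $\mathbf{X}_2$ given $\mathbf{X}_1$ is $M_q\left(\Sigma_{21} \Sigma_{11}^{-1} \mathbf{Z}_1, \Sigma_{22.1}, \{f_i(x_i; \theta_i)\}_{i=r+1,\ldots,p}\right)$, that is:

$$h(\mathbf{X}_2 \mid \mathbf{X}_1) = |\Sigma_{22.1}|^{-1/2} \exp\left\{-\tfrac{1}{2} \left(\mathbf{Z}_2 - \Sigma_{21} \Sigma_{11}^{-1} \mathbf{Z}_1\right)^T \Sigma_{22.1}^{-1} \left(\mathbf{Z}_2 - \Sigma_{21} \Sigma_{11}^{-1} \mathbf{Z}_1\right) + \tfrac{1}{2} \sum_{i=r+1}^{p} Z_i^2\right\} \prod_{i=r+1}^{p} f_i(x_i; \theta_i) \tag{20}$$

3.3 Parameter estimation

Although the parameters of the GMF could be estimated by the algorithm given by Karian and Dudewicz (2000), it is advisable to obtain maximum likelihood (ML) estimators because of their well known properties. The parameters of the marginal densities (which are GMF parameters) can be estimated by ML. By the invariance property of ML estimation (Casella & Berger, 1990), the ML estimator of $\Sigma$ (which is a function of the marginal densities) is the correlation matrix of the normalized variables (10). Therefore the ML estimators of the GMF can be obtained with the algorithm:

(i) Select the marginal distributions with a statistical method (Kolmogorov-Smirnov, Anderson-Darling, $\chi^2$, etc.).
(ii) Obtain the ML estimators of the marginal distributions.
(iii) Estimate $\Sigma$ as the correlation matrix of the normalized variables (10).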
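The three steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: step (i) is skipped by assuming an exponential margin for the first variable and a lognormal margin for the second (both chosen only because their ML estimators have closed forms), the data are simulated, and every function name is hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean

STD_NORMAL = NormalDist()


def fit_lognormal(xs):
    """Step (ii): ML estimates of a lognormal are the mean and ML std of log(x)."""
    logs = [math.log(x) for x in xs]
    mu = fmean(logs)
    sigma = math.sqrt(fmean([(v - mu) ** 2 for v in logs]))
    return mu, sigma


def normal_scores(us):
    """Equation (10): Z = Phi^{-1}(F(x)); u is clamped away from 0 and 1."""
    eps = 1e-12
    return [STD_NORMAL.inv_cdf(min(max(u, eps), 1.0 - eps)) for u in us]


def pearson(a, b):
    """Product moment correlation coefficient of two equal-length samples."""
    ma, mb = fmean(a), fmean(b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)


def fit_gmf(x1, x2):
    """Steps (ii) and (iii) for an assumed exponential x1 and lognormal x2."""
    scale = fmean(x1)  # ML estimate of the exponential scale is the sample mean
    mu, sigma = fit_lognormal(x2)
    z1 = normal_scores([1.0 - math.exp(-x / scale) for x in x1])
    z2 = normal_scores([STD_NORMAL.cdf((math.log(x) - mu) / sigma) for x in x2])
    return scale, (mu, sigma), pearson(z1, z2)


# Simulated data: transform correlated standard normals with equation (8).
rng = random.Random(0)
true_rho = 0.6
x1, x2 = [], []
for _ in range(2000):
    a, b = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    z1, z2 = a, true_rho * a + math.sqrt(1.0 - true_rho ** 2) * b
    # Exponential quantile: -scale * log(1 - Phi(z)), with 1 - Phi(z) = Phi(-z)
    x1.append(-10.0 * math.log(STD_NORMAL.cdf(-z1)))
    x2.append(math.exp(1.0 + 0.5 * z2))

scale_hat, (mu_hat, sigma_hat), rho_hat = fit_gmf(x1, x2)
```

On the simulated sample, `rho_hat` should land near the value of `true_rho` used to generate the data, and the marginal estimates near the generating values 10, 1.0 and 0.5.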
3.4 Some bivariate density contours

Density contours are displayed for selected bivariate distributions to show some of the distributional shapes provided by this family. In each pair of density contours, the figure on the left presents the joint density with independent components ($\rho = 0$), while the figure on the right presents the joint density with a correlation coefficient equal to 0.60 between the components. Figure 1 shows bivariate densities with gamma marginals and Figure 2 shows bivariate densities with Weibull marginals.

[Figure 1: two contour panels, $\rho = 0$ (left) and $\rho = 0.60$ (right).]
Figure 1. Bivariate density with gamma marginals G(.5,0) in X and G(.5, 8) in Y.

[Figure 2: two contour panels, $\rho = 0$ (left) and $\rho = 0.60$ (right).]
Figure 2. Bivariate density with Weibull marginals W(, 5) in X and W(3, 8) in Y.
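Contours of this kind can be produced by evaluating the bivariate case of density (13) on a grid. With $p = 2$ the exponent reduces to $-\left[\rho^2 (Z_1^2 + Z_2^2) - 2\rho Z_1 Z_2\right] / \left[2(1 - \rho^2)\right]$ and the leading factor to $(1 - \rho^2)^{-1/2}$. The sketch below is only an illustration: the Weibull (shape, scale) values are hypothetical and are not claimed to match the figures, and the function names are assumptions.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def weibull_pdf(x, shape, scale):
    """Weibull density in the (shape, scale) parameterization."""
    t = x / scale
    return (shape / scale) * t ** (shape - 1) * math.exp(-(t ** shape))


def weibull_cdf(x, shape, scale):
    """Weibull CDF, 1 - exp(-(x/scale)^shape), computed with expm1."""
    return -math.expm1(-((x / scale) ** shape))


def gmf_density(x1, x2, rho, m1, m2):
    """Bivariate case of equation (13); m1 and m2 are (shape, scale) pairs."""
    eps = 1e-15
    u1 = min(max(weibull_cdf(x1, *m1), eps), 1.0 - eps)
    u2 = min(max(weibull_cdf(x2, *m2), eps), 1.0 - eps)
    z1, z2 = STD_NORMAL.inv_cdf(u1), STD_NORMAL.inv_cdf(u2)
    # Z'(Sigma^{-1} - I)Z written out for a 2x2 correlation matrix
    q = (rho ** 2 * (z1 ** 2 + z2 ** 2) - 2.0 * rho * z1 * z2) / (1.0 - rho ** 2)
    dependence = math.exp(-0.5 * q) / math.sqrt(1.0 - rho ** 2)
    return dependence * weibull_pdf(x1, *m1) * weibull_pdf(x2, *m2)


def total_mass(rho, m1, m2, upper=20.0, steps=100):
    """Midpoint-rule integral of the density over [0, upper] x [0, upper]."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        for j in range(steps):
            total += gmf_density((i + 0.5) * h, (j + 0.5) * h, rho, m1, m2)
    return total * h * h


M1, M2 = (2.0, 5.0), (3.0, 8.0)  # hypothetical (shape, scale) values
independent = gmf_density(4.0, 7.0, 0.0, M1, M2)
mass = total_mass(0.6, M1, M2)
```

With $\rho = 0$ the dependence factor equals one and the density is the product of the two margins; for any admissible $\rho$ the grid integral should be close to one, which is a quick check that the density has been coded correctly.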
These figures, obtained with Mathematica (Wolfram, 1998), show that the GMF could be useful for expressing a linear relationship between variables, with the great advantage that only the correlation coefficient is needed to express the dependence between each pair of variables. If the relationship is not linear, another copula may be more appropriate.

4. Application to a hydrological study

Runoff modeling and forecasting is of great importance because it helps to estimate water availability and to plan diverse human activities like agriculture. In the particular case of the Yaqui river, which irrigates around 200 000 ha in the Yaqui Valley in northwest México, monthly runoff can be divided into three non-correlated periods: a) July, August and September (JAS), b) October and November, and c) December to June. July's monthly runoff is not autocorrelated, so runoff estimates for this month can only be obtained from its probability distribution. August's runoff is correlated with July's runoff; therefore a bivariate distribution can be fitted, and the conditional distribution of August's runoff given July's can be used to improve the estimates of August's runoff.

Bivariate distributions from two Archimedean copulas (Gumbel-Hougaard and Cook-Johnson families), the generalized lambda distribution (Karian and Dudewicz) and the GMF were fitted to runoff data consisting of the records of the last 48 years. Applying the algorithm for ML estimation, a gamma distribution G[4.05, 4.586] was estimated for July's runoff. For August's runoff the lognormal distribution LogN[6.5707, 0.4953] was chosen. The parameters of the Gumbel-Hougaard [21] and Cook-Johnson [22] copulas were estimated from the relationship between Kendall's coefficient for July's and August's runoff and the parameter of the copula (Zhang and Singh, 2007; Nelsen, 2006). The parameter $\Psi$ for the generalized lambda distribution [23] was
estimated with the procedure given in Karian and Dudewicz (2000). For the GMF, the correlation coefficient (equal to 0.475) was obtained from the transformed runoff variables (10).

Gumbel-Hougaard copula.

$$H(x_1, x_2) = \exp\left\{-\left[\left(-\log F_1(x_1)\right)^{\theta} + \left(-\log F_2(x_2)\right)^{\theta}\right]^{1/\theta}\right\} \tag{21}$$

where $\theta$ is the parameter of the copula, related to $\tau$, Kendall's coefficient of correlation, by the equation $\tau = 1 - \theta^{-1}$.

Cook-Johnson copula.

$$H(x_1, x_2) = \left[F_1(x_1)^{-\theta} + F_2(x_2)^{-\theta} - 1\right]^{-1/\theta} \tag{22}$$

where $\theta$ is the parameter of the copula, related to $\tau$, Kendall's coefficient of correlation, by the equation $\tau = \theta / (\theta + 2)$.

Generalized lambda (GL) copula.

$$H(x_1, x_2) = \frac{S - \sqrt{S^2 - 4\Psi(\Psi - 1) F_1(x_1) F_2(x_2)}}{2(\Psi - 1)} \tag{23}$$

where $S = 1 + \left(F_1(x_1) + F_2(x_2)\right)(\Psi - 1)$ and $\Psi$ is the parameter of the copula, which was estimated with the procedure given in Karian and Dudewicz (2000).

The empirical joint distribution of the runoff variables was obtained (Zhang and Singh, 2007) with equation (24).

$$H(x_{1i}, x_{2j}) = P(x_1 \le x_{1i},\, x_2 \le x_{2j}) = \frac{\sum_{m=1}^{i} \sum_{n=1}^{j} N_{mn} - 0.44}{N + 0.12} \tag{24}$$

where $N$ is the sample size and $N_{mn}$ is the number of pairs $(x_1, x_2)$ counted as $x_1 \le x_{1i}$ and $x_2 \le x_{2j}$, $i, j = 1, 2, \ldots, N$.
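The formulas of this section can be collected in a short Python sketch. It is an illustration only: the copula functions take the marginal probabilities $u = F_1(x_1)$ and $v = F_2(x_2)$ as inputs, the parameter solvers simply invert the two stated relations with Kendall's $\tau$, the empirical distribution counts the pairs at or below a point as in (24), and all function names and the toy data are hypothetical.

```python
import math


def gumbel_hougaard(u, v, theta):
    """Equation (21) for u, v in (0, 1) and theta >= 1."""
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-(s ** (1.0 / theta)))


def cook_johnson(u, v, theta):
    """Equation (22) for u, v in (0, 1] and theta > 0."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def gl_copula(u, v, psi):
    """Equation (23) for psi > 0, psi != 1."""
    s = 1.0 + (u + v) * (psi - 1.0)
    root = math.sqrt(s * s - 4.0 * psi * (psi - 1.0) * u * v)
    return (s - root) / (2.0 * (psi - 1.0))


def theta_gumbel_from_tau(tau):
    """Invert tau = 1 - 1/theta."""
    return 1.0 / (1.0 - tau)


def theta_cook_johnson_from_tau(tau):
    """Invert tau = theta / (theta + 2)."""
    return 2.0 * tau / (1.0 - tau)


def kendall_tau(xs, ys):
    """Kendall's coefficient by direct pair counting (assumes no ties)."""
    n = len(xs)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (xs[i] - xs[j]) * (ys[i] - ys[j])
            score += 1 if prod > 0 else -1
    return score / (n * (n - 1) / 2)


def empirical_joint(xs, ys, x, y):
    """Equation (24): (count of pairs <= (x, y) minus 0.44) / (N + 0.12)."""
    count = sum(1 for a, b in zip(xs, ys) if a <= x and b <= y)
    return (count - 0.44) / (len(xs) + 0.12)


# Toy data, not the runoff records.
xs, ys = [1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 2.0, 4.0]
tau = kendall_tau(xs, ys)
theta_gh = theta_gumbel_from_tau(tau)
theta_cj = theta_cook_johnson_from_tau(tau)
```

A useful sanity check is that each copula returns its first argument when the second equals one, for example `gl_copula(0.3, 1.0, 3.0)` and `cook_johnson(0.3, 1.0, 2.0)` should both be 0.3 up to rounding.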
P-P plots obtained with the four fitted copulas show that their fits are similar. The copulas were compared with the root mean square error (RMSE), the Kolmogorov-Smirnov (KS) test and the AIC criterion; Table 1 shows these statistics.

    Copula      RMSE      Kolmogorov-Smirnov P-value    AIC
    Gumbel-H    0.09      0.8794                        350.89
    Cook-J.     0.037     0.5909                        35.6
    GL          0.0336    0.6393                        350.6
    GMF         0.037     0.789                         348.6

Table 1. Goodness of fit statistics for the fitted copulas.

The RMSE was obtained with the cumulative probabilities as:

$$RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left[H_E(x_{1i}, x_{2i}) - H_C(x_{1i}, x_{2i})\right]^2}$$

where the subindex $S$ in the CDF $H_S(x_1, x_2)$ stands for the empirical distribution ($S = E$) or the distribution given by the selected copula ($S = C$). The AIC criterion is:

$$AIC = -2 \ln L(\hat{\theta}) + 2k$$

where $L(\hat{\theta})$ is the ML function for the model and $k$ is the number of fitted parameters. The ML function was evaluated for the GMF; for the rest of the models, the parameter of the copula was estimated by means of the relationship between the copula parameter and Kendall's coefficient.

The best fitted distribution using the AIC criterion is the GMF, but the RMSE shows that the GMF and the Gumbel-Hougaard copula have the best fit. This example
shows that the GMF competes well with other copulas and gives a flexible family for fitting multivariate distributions with fixed marginal distributions.

References

Arnold, B.C., Castillo, E. and Sarabia, J.M. 2006. Families of multivariate distributions involving the Rosenblatt construction. J. Amer. Statist. Assoc. 101(476), 1652-1662.
Casella, G. and Berger, R.L. 1990. Statistical inference. Duxbury Press, Belmont, USA.
Farlie, D.J.G. 1960. The performance of some correlation coefficients for a general bivariate distribution. Biometrika, 47, 307-323.
Genest, C. 1987. Frank's family of bivariate distributions. Biometrika, 74, 549-555.
Genest, C. and MacKay, R.J. 1986. The joy of copulas: Bivariate distributions with uniform marginals. Amer. Statist. 40, 280-283.
Graybill, F.A. 1969. Introduction to matrices with applications in statistics. Wadsworth Publishing Co., CA.
Johnson, M.E. and Tenenbein, A. 1981. A bivariate distribution family with specified marginals. J. Amer. Statist. Assoc. 76, 198-201.
Karian, Z.A. and Dudewicz, E.J. 2000. Fitting statistical distributions: The generalized lambda distribution and generalized bootstrap methods. Chapman & Hall, FL.
Koehler, K.J. and Symanowski, J.T. 1995. Constructing multivariate distributions with specific marginal distributions. J. of Multivariate Analysis, 55, 261-282.
Mardia, K.V., Kent, J.T. and Bibby, J.M. 1979. Multivariate analysis. Academic Press, London, U.K.
Marshall, A.W. and Olkin, I. 1988. Families of multivariate distributions. J. Amer. Statist. Assoc. 83(403), 834-841.
Moran, P.A.P. 1969. Statistical inference with bivariate gamma distributions. Biometrika, 56, 627-634.
Nelsen, R.B. 2006. An introduction to copulas. Springer, N.Y.
Park, C.P. 2004. Construction of random vectors of heterogeneous component variables under specified correlation structures. Computational Statistics & Data Analysis, 46, 621-630.
Plackett, R.L. 1965. A class of bivariate distributions. J. Amer. Statist. Assoc. 60, 516-522.
Sklar, A. 1959. Fonctions de répartition à n dimensions et leurs marges. Publ. Inst. Statist. Univ. Paris, 8, 229-231.
Wolfram, S. 1998. The Mathematica book. Fourth ed. University Press, Cambridge.
Yue, S., Ouarda, T.B.M.J. and Bobée, B. 2001. A review of bivariate gamma distributions for hydrological applications. J. of Hydrology, 246, 1-18.
Zhang, L. and Singh, V.P. 2007. Bivariate rainfall frequency distributions using Archimedean copulas. J. of Hydrology, 332, 93-109.