University of Lisbon, Portugal

Development and comparative study of two near-exact approximations to the distribution of the product of an odd number of independent Beta random variables Luís M. Grilo a,, Carlos A. Coelho b, a Dep. of Mathematics, Polytechnic Institute of Tomar, Portugal lgrilo@ipt.pt b Mathematics Department, Faculty of Sciences and Technology The New University of Lisbon, Portugal cmac@fct.unl.pt Abstract Using the concept of near-exact approximation to a distribution we developed two different near-exact approximations to the distribution of the product of an odd number of particular independent Beta random variables. One of them is a particular Generalized Near-Integer Gamma GNIG distribution and the other is a mixture of two GNIG distributions. These near-exact distributions are mostly adequate to be used as a basis for approximations of distributions of several statistics used in Multivariate Analysis. By factoring the characteristic function of the logarithm of the product of the Beta random variables, and then replacing a suitably chosen factor of that characteristic function by an adequate asymptotic result it is possible to obtain what we call a near-exact characteristic function, which gives rise to the near-exact approximation to the exact distribution. Depending on the asymptotic result used to replace the chosen parts of the characteristic function, one may obtain different near-exact approximations. Moments from the two near-exact approximations developed are compared with the exact ones. The two approximations are also compared with each other, namely in terms of moments and quantiles. Key words: Near-exact approximations, Generalized Near-integer Gamma distribution, independent random variables, mixtures, Multivariate Analysis. Corresponding author. Address: Dep. of Mathematics, Polytechnic Institute of Tomar, Est. da Serra - Qta. do Contador, 300-33 Tomar; Fax:35 4 93 836. This research was financially supported by the Portuguese Foundation for Science and Technology FCT. Accepted for publication in J. of Stat. Planning and Inference

Introduction It is our aim to obtain a close approximation to the exact distribution of the product of an odd number of independent r.v. s random variables with Beta distributions with parameters yielding particular relationships, in a concise manageable form, so that at least the computation of quantiles is rendered reasonably easy. Let Y j Beta a j, b j =,..., p be independent Beta r.v. s with parameters a j and b/, with p and b positive odd integers, where a j = c + p/ j/ j =,..., p, c IR + where IR + is the set of positive reals. We are then interested in the distribution of W = p Y j or p W = log W = log Y j. We may note that a few likelihood ratio test statistics used in Multivariate Analysis, such as the Wilks Λ statistic Wilks, 93, 935 have a distribution of this kind Anderson, 984. However, the exact form of the distribution of W and W is too complicated for practical use, since it usually involves hypergeometric functions of higher order. The exact form of the distribution of W, as well as the distribution of the sum of independent Gamma r.v. s with different rate parameters, have already been studied in the past by several authors but all results were obtained under the form of series expansions Kabe, 96; Tretter and Walster, 975; Nandi, 977; Gupta and Richards, 979, 983. When p and b are both even, or only one of them is odd, although some of the series expansions have very good convergence properties, namely those based on Chi-square mixtures Gupta and Richards, 983, the exact distribution for W and W was obtained in a concise manageable form, without any series expansions by Coelho 998, 999. In Coelho 998, 999 the distribution of W is presented as a particular case of the GIG Generalized Integer Gamma distribution, that is the distribution of the sum of independent Gamma r.v. s with different rate parameters and integer shape parameters. When p and b are both odd, Coelho 004 obtained a near-exact approximation for the distribution of W under the form of a GNIG Generalized Near-Integer Gamma distribution, that is the distribution of the sum of a GIG r.v. with an independent Gamma r.v. with non-integer shape parameter. These approximations were built in the following way: after factorizing the c.f. characteristic function of W we replace one of the factors, by an

approximation with good convergence properties, in such a way that we obtain a near-exact c.f. that corresponds to a known distribution. In this paper we obtain two other near-exact approximations for the distribution of W and W, both of them even closer to the exact distribution. The factor that is replaced corresponds to the c.f. of a Logbeta r.v. a r.v. whose exponential has a Beta distribution, being in one case approximated by the c.f. of the sum of two Gamma r.v. s that matches the first three exact moments and in the other case approximated by the c.f. of the mixture of two Gamma r.v. s that matches the first four exact moments. By joining this factor, with the remaining unchanged part of the c.f., we get what we call a near-exact c.f. for W. For the first case this c.f. corresponds to a GNIG distribution, while in the second case it corresponds to a mixture of two GNIG distributions. In the following section we set forth some distributions that will be used in the sections ahead. Some useful distributions. The GNIG distribution and the mixture of two GNIG distributions Let Z be a r.v. with a GIG distribution Coelho, 998 of depth g, with shape parameters r,..., r g IN where IN is the set of positive integers and all different rate parameters λ,..., λ g IR +, that is, Z GIG r,..., r g ; λ,..., λ g and let Z be a r.v. with a Gamma distribution with positive non-integer shape parameter r and rate parameter λ IR +, that is, Z Gr, λ. Furthermore, let Z and Z be independent and λ λ j j =..., g. Then the distribution of Z = Z + Z is a GNIG distribution Coelho, 004 of depth g +. We will denote this by Z GNIGr,..., r g, r; λ,..., λ g, λ. The p.d.f. probability density function of Z is given by r g j { } f Z z = Kλ r e λ Γk jz c j,k k= Γk+r zk+r F r, k+r, λ λ j z, z > 0 3

and the c.d.f. cumulative distribution function by z r F Z z = λ r Γr+ F r, r+, λz r g j k Kλ r e λ jz c z r+i λ i j j,k Γr++i F r, r++i, λ λ j z k= i=0 z > 0 where g K = λ r j j and c j,k = c j,k Γk λ k j with c jk given by through 3 in Coelho 998. In the above expressions F a, b; z is the Kummer confluent hypergeometric function Abramowitz and Stegun, 974. Such functions have usually very good convergence properties and are nowadays handled by a number of software packages. We may note that if r IN, the GNIG distribution turns itself into a GIG distribution, so that we may look at the GNIG distribution as a generalization of the GIG distribution. Let us suppose now that the r.v. W has a distribution that is a mixture of two GNIG distributions of depth g, the first one with shape parameters r,..., r g and rate parameters λ,..., λ g, with weight θ 0<θ < and the second one with shape parameters r,..., r g and rate parameters λ,..., λ g, with weight θ. We will denote this fact by W MGNIG θ; r,..., r g, λ,..., λ g; r,..., r g, λ,..., λ g.. The Logbeta distribution Let now X be a r.v. with a Beta distribution with parameters α > 0 and β > 0. We will denote this by X Betaα, β. The h-th moment of X is given by EX h = Bα + h, β Bα, β = Γα + β Γα Γα + h Γα + β + h α + h > 0. 3 4

If we take Y = log X then we will say that Y has a Logbeta distribution with parameters α and β Johnson, Kotz and Balakrishnan, 995, what is denoted by Y Logbetaα, β. The p.d.f. of Y is f Y y = Bα, β e αy e y β, y > 0. 4 Since the Gamma functions in 3 are still defined for any real or complex h V ɛ 0, the c.f. of Y is given by φ Y t = Ee ity = Ee it log X = EX it = Γα + β Γα Γα it Γα + β it, 5 where i = / and t IR where IR is the real set. Based on 5, we may obtain the first four moments of Y as µ = EY = ψα + β ψα µ = EY = ψ α ψ α + β + [ψα + β ψα] µ 3 = EY 3 = ψ α + β ψ α + [ψα + β ψα] 3 +3 [ψα + β ψα] [ψ α ψ α + β] µ 4 = EY 4 = ψ α ψ α + β + [ψα ψα + β] 4 +6 [ψα ψα + β] [ψ α ψ α + β] +4 [ψα ψα + β] [ψ α ψ α + β] +3 [ψ α ψ α + β], where ψx = d log Γx is the digamma function, dx ψ x = d ψx is the trigamma function and dx ψ x = d function, and so on. 6 d log Γx = dx dx ψ x is the quadrigamma 3 Two near-exact approximations to the distribution of the product of particular independent Beta random variables In this section we will obtain two near-exact approximations for the distribution of the product of an odd number of independent Beta r.v. s with same second parameter and whose first parameter evolves by /. 5

Theorem Let Y j Beta a j, b, j =,..., p 7 be p independent r.v. s where p and b are both positive odd integers, with b 3, and a j = c + p j j =,..., p, with c IR+ and let p p W = Y j and W = log W = log Y j. 8 Let us further consider Y j Beta a j, p, j =,..., b b independent r.v. s where a j = c + b j j =,..., b and let b b W = Yj and W = log W = log Yj. Then a near-exact approximation to the distribution of W and W is a GNIG distribution of depth p + b, symbolically, W k ne GNIG r,..., rp+b 3, rp+b =, rp+b ; λ,..., λ p+b 3, λ p+b, λ p+b, k =, 9 with rate parameters λ j = c + j, j =,..., p + b 3 0 and shape parameters r j j =,..., p rj j = p +,..., p + b 4 step = j = p + b 3 r j + j = p,..., p + b 5 step 6

with sequences indexed by j being used only if the upper bound is larger than the lower bound, where h j j =, r j = r j + h j j = 3,..., p + b 3 with j =,..., minp, b h j = 0 j = + minp, b,..., maxp, b j = + maxp, b,..., p + b 3 3 and yet r p+b, λ p+b and λ p+b obtained by numerical solution of the system of equations µ = λ p+b + r p+b λ p+b µ = λ p+b + λ p+b λ p+b r p+b + λ p+b r p+b +r p+b λ p+b λ p+b µ 3 = 6λ3 p+b + 6λ p+b λ p+b r p+b + 3λ p+b λ p+b r p+b +r p+b λ 3 p+b λ3 p+b 4 + λ3 p+b r p+b +3r p+b +r p+b λ 3 p+b λ3 p+b where on the left hand side we have the first three moments of a Logbeta r.v. with parameters a + b 3 and 3, obtained from 6 by replacing α and β by the appropriate values, and on the right hand side the first three moments of the sum of two Gamma r.v. s, the first one with shape parameter rp+b = and rate parameter λ p+b and the second one with shape parameter rp+b and rate parameter λ p+b. The other near-exact approximation to the distribution of W and W is obtained as a mixture of two GNIG distributions of depth p + b, symbolically, W k ne MGNIG θ; r,..., rp+b 3, r p+b ; λ,..., λ p+b 3, λ p+b ; r,..., r p+b 3, r p+b ; λ,..., λ p+b 3, λ p+b, k =, 5 7

where the shape parameters r,..., r p+b 3 are given by through 3, the rate parameters λ,... λ p+b 3 by 0, r p+b is set equal to r p+b, and θ, r p+b, λ p+b and λ p+b are obtained through the numerical solution of the system of equations µ = θ Γ r p+b + Γ r p+b µ = θ Γ r p+b + Γ r p+b µ 3 = θ Γ r p+b + 3 Γ r p+b µ 4 = θ Γ r p+b + 4 Γ r p+b + θ Γ r p+b + λ p+b Γ r p+b λ p+b λ 3 p+b λ 4 p+b + θ Γ r p+b + Γ r p+b + θ Γ r p+b + 3 Γ r p+b + θ Γ r p+b + 4 Γ r p+b λ p+b λ p+b λ 3 p+b λ 4 p+b 6 where on the left hand side we have now the first four moments of a Logbeta r.v. with parameters a + b 3 and 3, obtained from 6 by replacing α and β by the appropriate values, and on the right hand side the first four moments of a mixture with weights θ 0<θ < and θ, of two Gamma r.v. s, the first one with shape parameter r p+b and rate parameter λ p+b and the second one with shape parameter r p+b and rate parameter λ p+b. Proof: From 3, the h-th moment of Y j is given by E Γ a Yj h j + b = Γ a j Γ a j + h Γ a j + b + h, so that, given the independence of the p r.v. s Y j, we have E W h = E p Y h j p = E Yj h p = Γ a j + b Γ a j Γ a j + h Γ a j + b + h. The c.f. of W is thus given by φ W t = E e itw = E e it log W = E W it p = Γ a j + b Γ a j Γ a j it Γ a j + b it, 7 8

where i = / and t IR. But then, splitting the product and taking a j+ = c + p j = a j we may write the c.f. of W as φ W t = Γ a + b Γ a it p Γ a j + b Γ a Γ Γ a j it a + b it j= Γ a j Γ a j + b it = Γ a + b Γ a + b 3 Γ a Γ + b 3 a + b it 3 Γ a Γ Γ a it a + b it Γ a + b 3 it p Γ a j+ + b Γ a j+ it Γ a j+ Γ a j+ + b it = Γ a + b Γ a + b 3 Γ a Γ + b 3 it a + b 3 Γ a Γ Γ a it a + b it Γ a + b 3 it p Γ a j + b Γ a j Γ it a j Γ a j + b it. But then, since b is an odd integer and thus b 3 is an integer, we may use, for any real or complex a, the identity Γa + b 3 Γa = b 3 j=0 a + j, to write the c.f. of W as φ W t = Γ a + b Γ a + b 3 = Γ a + b Γ a + b 3 Γ a + b 3 it Γ a + b it Γ a + b 3 it Γ a + b it b 3 j=0 p b 5 j=0 p a +j Γ a j + b Γ a j b 3 j=0 a it + j Γ a j it Γ a j + b it a +j a it + j Γ a j + b Γ a j Γ a j it Γ. a j + b it 9

Finally, using, the identity Coelho, 004 p Γ a j + b Γ = a j p Γ c + p j + b Γ = c + p j p+b 3 c + j rj, valid for odd p, with r j j =,..., p + b 3 given by and 3, we have φ W t = Γ c + p + b Γ c + p + b 3 b 5 Γ c + p + b 3 it Γ c + p + b it c + p c + j + p + j it j=0 = Γ c + p + b Γ c + p + b 3 p+b 5 j=p step = Γ c + p + b Γ c + p + b 3 p+b 3 c + j rj c + j rj it Γ c + p + b 3 it Γ c + p + b it c + j p+b 3 c + j it c + j rj c + j rj it Γ c + p + b 3 it Γ c + p + b it p+b 3 c + j where r j j =,..., p + b 3 is given by through 3. r j c + j it r j, 8 Now, we keep unchanged the last product in 8 and we replace Γ c + p + b Γ c + p + b 3 Γ c + p + b 3 it Γ c + p + b it, 9 that is the c.f. of a Logbeta r.v. with parameters c + p + b 3 and 3, by λ p+b λ p+b it λ r p+b p+b λ p+b it r p+b, 0 0

that is the c.f. of the sum of two Gamma r.v. s, one with shape parameter and rate parameter λ p+b and the other with shape parameter r p+b and rate parameter λ p+b. The replacement is made in such a way that the c.f ṣ in 9 and 0 have the first three derivatives with respect to t at t = 0 equal. This means that the distributions to which they correspond will have the same first three moments. This leads us to obtain r p+b, λ p+b and λ p+b as the solutions of the system of equations 4. This way a near-exact c.f. of W is obtained under the form φ W t λ p+b λ p+b it λ r p+b p+b λ p+b it r p+b p+b 3 c + j r j c + j r it j, that is the c.f. of the GNIG distribution of depth p + b in 9, the sum of a GIG distribution of depth p + b 3 with a Gamma r.v. with shape parameter equal to and rate parameter λ p+b and another Gamma r.v. with shape parameter r p+b IR + \IN and rate parameter λ p+b. This turns out to be the sum of a GIG distribution of depth p + b with the latter Gamma distribution. If by chance r p+b IN then the near-exact approximation to the distribution of W we just obtained turns into a GIG distribution of depth p + b. In all the above we supposed that λ p+b and λ p+b were different from each other and different from all the other λ j j =,..., p + b 3. We may note that if by chance it happens that any of λ p+b or λ p+b is equal to any of the λ j j =,..., p + b 3 or equal to each other this will only reduce the depth of the GNIG distribution by one. In any case, this near-exact approximation will have the same first three moments as the exact distribution. In fact, depending on the asymptotic result used in 8, we may obtain different near-exact approximations to the distribution of W. This way, we decided now to replace the part of the c.f. of W in 9 by r p+b p+b λ r p+b p+b θ λ p+b it r + θ λ p+b λ p+b it, r p+b that is the c.f. of the mixture of two Gamma r.v. s with shape parameters r p+b and r p+b and rate parameters λ p+b and λ p+b. The approximation is done in such a way that the c.f. s in 9 and have the same first four derivatives with respect to t at t = 0, that is, the corresponding distributions have the same first four moments. We set r p+b = r p+b and then obtain the

four parameters θ, r p+b, λ p+b and λ p+b through the numerical solution of the system of equations 6. In this case we have λ r p+b p+b φ W t θ λ p+b it r + θ λ rp+b p+b p+b λ p+b it r p+b p+b 3 3 c + j r j c + j r it j, that is the c.f. of the mixture of two GNIG distributions of depth p + b in 5, with r p+b = r p+b. In the above we supposed that λ p+b and λ p+b were different from all the other λ j j =,..., p + b 3. If, by chance, this does not happen, this will only reduce the depth of the corresponding GNIG distribution by one. In any case, this near-exact approximation will have the same first four moments as the exact distribution. That the c.f. of W and W are the same that is, that p and b are interchangeable, may be seen through the use of the equality p Γ c + p j + b b Γ c + b Γ c + p = j + p j Γ c + b, j which is valid for any integer p and b Coelho, 998, 999, by writing the c.f. of W as φ W t = E b Γ a e itw j + p Γ a = Γ j it a j Γ a j + p it b Γ c + p = j + b Γ c + p j Γ c + p it j Γ c + p j + b it p Γ c + b = j + p Γ c + b j Γ c + b it j Γ c + b j + p it p Γ a j + b Γ a j it = Γ a j Γ a j + b = φ it W t. Since φ W t = φ W t, W and W have the same distribution and thus the same near-exact c.f. s and distributions. The near-exact distributions for W and W may then be obtained through the transformation W k = e W k k =,.

The near-exact p.d.f. and c.d.f. of W, for the first approximation in Theorem above, corresponding to the c.f. in, are given by and respectively, with Z = W, g = p + b, r j = r j j =,..., p + b 3, r g = r p+b =, r = r p+b, λ j = c + j j =,..., p + b 3, λ g = λ p+b and λ = λ p+b, that is, this near-exact p.d.f. of W is given by p+b fw = Kλ r r { j p+b p+b e λ jw Γk c wk+r j,k k= Γk+rp+b p+b F r p+b, k+r p+b, λ p+b λ j w }, w >0 and the near-exact c.d.f., by F w = λ r p+b p+b w r p+b Γr p+b + F r p+b, r p+b +, λ p+b w p+b Kλ r p+b p+b r j k e λ jw c j,k k= i=0 w r p+b +i λ i j Γr p+b + + i F r p+b, r p+b + + i, λ p+b λ j w w >0 with K = p+b λ r j j. The near-exact p.d.f. and c.d.f. of W, for the second approximation in Theorem above, corresponding to the c.f. in 3, are given by fw = θ Kλ r p+b 3 r { j p+b p+b e λ jw Γk c j,k k= Γk+r p+b wk+r p+b + θ Kλ r p+b 3 p+b p+b F r p+b, k+r p+b, λ p+b λ j w r { j e λ jw c j,k k= Γk Γk+r p+b wk+r p+b F r p+b, k+r p+b, λ p+b λ j w } } w >0 3

and with F w = θ K = { λ r p+b p+b w r p+b Γr p+b + F r p+b, r p+b +, λ p+b w Kλ r p+b 3 p+b p+b e λ jw + θ p+b 3 { λ r p+b p+b Kλ r p+b p+b λ r j j. p+b 3 r j k c j,k k= i=0 w r p+b +i λ i j Γr p+b ++i F r p+b, r p+b ++i, λ p+b λ j w w r p+b Γr p+b + F r p+b, r p+b +, λ p+b w r j k e λ jw c j,k k= i=0 w r p+b +i λ i j Γr p+b ++i F r p+b, r p+b ++i, λ p+b λ j w } } w >0 The corresponding near-exact p.d.f. s and c.d.f. s of W may then be obtained through the transformation W = e W. Although, at first sight, these expressions may look a bit complex, they are actually much manageable. In fact, from these c.d.f. s near-exact quantiles may be easily computed. 4 Evaluation of the quality and comparison of the two near-exact approximations It is now important to analyze how good the proposed approximations are. Let us start with the study of the proximity between a Logbeta c + p + b 3, 3 distribution and a GNIG, rp+b ; λ p+b, λ p+b distribution of depth, which correspond to the c.f ṣ in 9 and 0, respectively. The parameters rp+b, λ p+b, λ p+b are obtained as the numeric solution of the system of equations 4 where µ, µ and µ 3 are given by 6. If we take, for example, α = c + p + b 3 = 5 0, where p and b are both 4

positive odd integers, with b 3, we will obtain for r p+b, λ p+b, λ p+b the values in Table. Table Values for shape and rate parameters for the GNIG distribution r p+b = r p+b =0.4999896506888 λ p+b =5.676439460 λ p+b =5.00375558976448 Then, in order to analyze the closeness between a Logbeta c + p + b 3, 3 distribution and the mixture of two Gamma distributions, MG θ; r p+b, λ p+b ; r p+b, λ p+b, which correspond respectively to the c.f ṣ in 9 and, we will take θ, r p+b, λ p+b, λ p+b in as solutions of the system of equations 6 where µ, µ, µ 3 and µ 4 are given by 6 with α = 5/0 and β =3/. We obtain for θ, r p+b, λ p+b and λ p+b the values in Table. Table Values for the weight, shape and rate parameters for the MG distribution r p+b =.5000004557350 λ p+b =5.489083846704 r p+b = r p+b λ p+b =5.657453500 θ=0.576443645473 We may note that the choice of the value 3/ for the second parameter of the above Logbeta distribution, made in a given moment of the proof of Theorem, was not random but rather a judicious one. This choice was based on the assumption that this value should be of the form k/ where k is a positive odd integer. Besides, on one hand, for the sum of two Gamma distributions this value has to be greater than one, since the sum of the two shape parameters r p+b = and r p+b in 0 will always be greater than one, while, on the other hand, the value 3/ is, for both approximations, the value of the form k/, that leads to a closer match between the exact and approximate c.f ṣ. The p.d.f. s for the three distributions, that is, the Logbeta, GNIG and MG distributions, are indeed almost superimposed, as we may see by analyzing Tables 3 and 4. In Table 3 we consider the ordinates of the Logbeta p.d.f. in 4, with parameters α = 5/0 and β = 3/ and the ordinates of the p.d.f. of a GNIG distribution of depth with parameters given in Table and in Table 4 the ordinates of the Logbeta p.d.f. and the ordinates of the p.d.f. of an MG distribution with parameters given in Table. We may notice the close match between the three p.d.f. s, mainly between the p.d.f. s for the Logbeta and the MG distributions. 5

Table 3 Ordinates for the Logbeta and GNIG p.d.f. s and absolute value of the differences x-value Logbeta Ordinates α=5/0, β =3/ GNIG Ordinates Absolute value of difference 0.00 0.00 0.00 0.00 0.05 7.074907765483 7.07489356005670 0.0000640654453 0.0 4.679995750436 4.679963846476 0.0000003878570 0.5.650340866944.65038603400 0.000004557076 0.0.395967860079.3955739783440 0.00000779336 0.5 0.707396490084 0.7074050957 0.0000006089443 0.30 0.364858983899873 0.3648585349835 0.00000045480037 0.35 0.8375485839 0.836880665808 0.0000006674943 0.40 0.089999784336385 0.08999964354 0.00000059984975 0.45 0.04474873 0.04474974587 0.0000003039774 0.50 0.05048736464 0.05040535843 0.000000337334 Table 4 Ordinates for the Logbeta and MG p.d.f. s and absolute value of the differences x-value Logbeta Ordinates α=5/0, β =3/ MG Ordinates Absolute value of difference 0.00 0.00 0.00 0.00 0.05 7.074907765483 7.0749078984734 0.0000009999 0.0 4.679995750436 4.679995035653 0.000000044784 0.5.650340866944.6503404066706 0.00000004007735 0.0.395967860079.3959609558 0.00000000565050 0.5 0.707396490084 0.707396500356 0.000000009543 0.30 0.364858983899873 0.3648589940406 0.0000000049 0.35 0.8375485839 0.8375535075 0.00000000536036 0.40 0.089999784336385 0.0899997856776 0.0000000033533 0.45 0.04474873 0.04474755459 0.0000000007779 0.50 0.05048736464 0.050486074 0.000000009490 In Table 5 we have the moments for these three distributions. As we can see the first three moments of the GNIG distribution match the first three moments of the Logbeta distribution and the first four moments of the MG distribution match the first four moments of the Logbeta, as it was really meant by construction of the two approximations. The other moments are all very close to the exact Logbeta ones, once again with the MG distribution showing an even better approximation. Table 5 Comparison of moments of the three distributions Moment order Logbeta distribution α=5/0, β =3/ GNIG distribution MG distribution 0.09797449098 0.09797449098 0.09797449098 0.0576385953047 0.0576385953047 0.0576385953047 3 0.003565408035466 0.003565408035466 0.003565408035466 4 0.00039464045853 0.0003946587900 0.00039464045853 5 0.0003704547574 0.00037045650839 0.000370454755770 6 0.000560583794 0.000560605759 0.00056058047 7 0.0000758683908 0.000075870359673 0.00007586840093 8 0.0000480876978 0.00004807394 0.00004808784 9 0.0000575448769 0.00005756477006 0.00005754503685 0 0.000075379433 0.00007540003334 0.0000753794764 6

In Table 6 we may analyze the.90,.95 and.99 quantiles for the three distributions. Once again the mixture of two Gamma distributions shows a better approximation, although both distributions yield remarkably good approximations. Table 6 Comparison of quantiles of the three distributions Quantile Logbeta distribution α=5/0, β =3/ GNIG distribution MG distribution.90 0.0409649747600 0.04096394848 0.0409650573903.95 0.5304534566776 0.5304504347747 0.53045358736.99 0.36740954849 0.367409939 0.36740954048644 We should note that both approximations improve for larger values of α = c + p + b 3 being the approximation given by the mixture of two Gamma distributions always the best of the two. Finally, we compare the near-exact moments for W in 8 given by the two near-exact distributions whose c.f. s are given by and 3 with the exact ones, obtained from the c.f. in 7. In this case the exact distribution of W is the sum of p independent Logbeta r.v. s. We considered three possible different situations that yield α = c+ p + b 3 = 5/0. For all three situations we take c = /0. In Table 7 we have the moments for p = 33 and b =, what would yield a j = + 33 j j =,..., 33 0 in 7 or equivalently, p = and b = 33, what would yield just one Logbeta distribution, with a = c, in Table 8, the moments for p = 3 and b = 3, what would yield a j = + 3 j j =,..., 3 in 7, and in Table 9 for 0 p = 7 and b = 7, yielding a j = + 7 j j =,..., 7. 0 For all three cases it is possible to confirm the equality of the first three moments for the overall GNIG near-exact distribution and of the first four for the overall mixture of two GNIG distributions MGNIG and the very close proximity of all the other moments, of course with some relatively small degradation as the order of the moments progress. Table 7 Comparison of moments of order h for the sum of Logbetas exact and the two near-exact distributions: the GNIG in and the MGNIG in 3 for p = 33 and b =, or p = and b = 33, with c = /0 h Sum of Logbetas exact GNIG near-exact MGNIG near-exact 8.074097996 8.074097996 8.074097996 9.3985373409 9.3985373409 9.3985373409 3 4.65835568 4.65835568 4.65835568 4 8435.93374438 8435.933743580 8435.93374438 5 7807.70880094 7807.70880585 7807.70880097 6 35973.634465535 35973.634579048 35973.634465687 7 74760606.549607864 74760606.5496657388 74760606.5496077994 8 990446068.86070483366 990446068.8630756977 990446068.860705087549 9 34570404593.773404360869 34570404593.8803378760 34570404593.7734587730 0 678534364.408566753 678534369.745856866369 678534364.4034334033 7

Table 8 Comparison of moments of order h for the sum of Logbetas exact and the two near-exact distributions: the GNIG in and the MGNIG in 3 for p = 3 and b = 3, with c = /0 h Sum of Logbetas exact GNIG near-exact MGNIG near-exact 5.058578580 5.058578580 5.058578580 56.93844386038 56.93844386038 56.93844386038 3 5037.3783747536 5037.3783747536 5037.3783747536 4 467.05835558998 467.058355590340 467.05835558998 5 3036894.58503457579 3036894.58503544763 3036894.5850345758 6 9334543.5435305083 9334543.54357535609 9334543.5435305363 7 397990460.599706707560 397990460.5999463797 397990460.599706745 8 3370333.04045680975 3370333.04993904859 3370333.0404657635 9 596368886840.8650094834 59636888684.3969768349 596368886840.86505553897 0 989967638986.690909376 989967639009.954937333 989967638986.63477996 Table 9 Comparison of moments of order h for the sum of Logbetas exact and the two near-exact distributions: the GNIG in and the MGNIG in 3 for p = 7 and b = 7, with c = /0 h Sum of Logbetas exact GNIG near-exact MGNIG near-exact 9.809706388 9.809706388 9.809706388 9.3558757058 9.3558757058 9.3558757058 3 9738.684568847095 9738.684568847095 9738.684568847095 4 005464.87446936057 005464.8744693799 005464.87446936057 5 35888096.38668008775 35888096.3866805455 35888096.38668008778 6 36045465.97770464 36045465.97733694 36045465.9777004 7 5535595330.060307 5535595330.47384574 5535595330.0603803 8 445785793.9045895460 445785793.98697843553 445785793.904587559 9 490737806957.74596707 490737806963.059305576 490737806957.7455807598 0 5909753067795.076505055 590975306786.69757848850 5909753067795.04658780 However, even for the higher moments the error percentage remains at extremely low levels, of no more than 8 0 4 percent for the 0th moment, when computed taking the exact moments as a basis, as it may be seen in Table 0. Another remarkable feature is the fact that the error percentages, for a given near-exact distribution, remain quite stable for any combination of p and b values. Table 0 Error percentages for the 0th moment concerning the two near-exact distributions: the GNIG in and the MGNIG in 3 taking as a basis the exact moment GNIG near-exact MGNIG near-exact p = 33, b = -7.94 0 4-8.57 0 8 p = 3, b = 3-7.80 0 4-8.3 0 8 p = 7, b = 7-5.74 0 4-4.84 0 8 8

Although we have the analytic expressions for the exact and near-exact c.f. s of the logarithm of product of p independent Beta r.v. s, since some of the parameters in the near-exact approximations are computed by numerically solving systems of equations, it doesn t render possible the evaluation of analytic Berry-Esseen type of bounds Berry, 94; Esseen, 945; Loève, 977; Hwang, 998 for the differences between the exact and the approximate c.d.f. s. However, for any given case, once the values for all the parameters of the Beta distributions in the product are specified, we may compute a couple of measures based on the difference between the exact c.f. of Y and the c.f. corresponding to the near-exact approximation under study. These measures will help us in evaluating and comparing the performance of the near-exact distributions proposed. One of these measures is even indeed based on the Berry-Esseen bound. Let Y be a r.v. with support S and let Φ Y t and F Y y be respectively the exact c.f. and c.d.f. of Y and let Φt and F y represent respectively the c.f. and c.d.f. corresponding to the approximation under study. Further, let fy represent the p.d.f. corresponding to F y. One measure we may think of is = Φ Y t Φt dt while the other, related with this one and based directly on the Berry-Esseen upper bound on F Y y F y is = π Φ Y t Φt t dt. The Berry-Esseen inequality, which is indeed an upper bound on F Y y F y, may, for any b > /π and any T > 0, be written as T max F Y y F y b y S T Φ Y t Φt t dt + Cb M T 4 where M = max y S fy and Cb is a positive constant that only depends on b. But if in 4 above we take T then we will have since then we may take b = /π. In Table we may see the computed values of the two measures and for both near-exact distributions proposed. We considered the same three situations as above, that is, the sum of p Logbeta distributions for α = 5/0, with c = /0, and i p = 33, b =, ii p = 3, b = 3 and iii p = 7, b = 7. Once again we may see the better performance of the mixture of two GNIG distributions, with lower values for both measures. 9

Table Values for the two measures and for the two near-exact distributions: the GNIG in and the MGNIG in 3 for α = 5/0, with c = /0, and several values of p and b GNIG MGNIG p = 33, b =.30 0 0 7.887 0.30 0 3.078 0 4 p = 3, b = 3 3.474 0 5.54 0 3.94 0 5.940 0 6 p = 7, b = 7 8.080 0 3.94 0 3 3.377 0 6 6.853 0 7 5 Conclusions and final remarks The two near-exact approximations developed lay very close to the exact distribution in terms of c.f., moments, c.d.f. and quantiles and are highly adequate to be used in computing near-exact quantiles. Although some of the expressions of the near-exact distributions, namely for the c.d.f. s, may still seem a bit complicated they are not only far more manageable than the exact c.d.f. but also perfectly handled by a number of available softwares, rendering the computation of near-exact quantiles not too hard. The two near-exact distributions proposed, that is, the single GNIG distribution and the mixture of two GNIG distributions, illustrate well the trade-off between manageability and precision. While the single GNIG near-exact distribution is much simpler both analytically and computationally, the mixture of two GNIG distributions yields a much better approximation. The mixture of two GNIG distributions leads to the best results. This was somehow expected given that by construction this approximation has the same first four moments equal to the exact ones while the single GNIG distribution only has the first three. The near-exact distributions presented and the methods of derivation used may be readily applied to obtain near-exact distributions of the generalized Wilks Λ statistic in the case where two or more sets of variables have an odd number of variables, as well as to obtain near-exact approximations to the distributions of other related statistics as the one used to test the equality of two multivariate Normal distributions. 6 Acknowledgments The authors want to thank two referees whose comments contributed to improve the contents of the paper and the Portuguese Science Foundation FCT for financial support of the research. 0

References Abramowitz, M. and Stegun, I. A., 974. Handbook of Mathematical Functions, Eds., 9th print. Dover, New York. Anderson, T. W., 984. An Introduction to Multivariate Statistical Analysis, nd ed.. J. Wiley & Sons, New York. Berry, A., 94. The Accuracy of the Gaussian Approximation to the Sum of Independent Variates. Transactions of the American Mathematical Society 49, -36. Coelho, C. A., 998. The Generalized Integer Gamma Distribution A Basis for Distributions in Multivariate Statistics. J. Multiv. Anal. 64, 86-0. Coelho, C. A., 999. Addendum to the paper The Generalized Integer Gamma Distribution A Basis for Distributions in Multivariate Statistics. J. Multiv. Anal. 69, 8-85. Coelho, C. A., 004. The Generalized Near-Integer Gamma Distribution. A basis for near-exact approximations to the distributions of statistics which are the product of an odd number of independent Beta random variables. Journal of Multivariate Analysis 89, 9-8. Esseen, C.-G., 945. Fourier Analysis of Distribution Functions. A Mathematical Study of the Laplace-Gaussian Law. Acta Mathematica 77, -5. Gupta, R. D. and Richards, D., 979. Exact distributions of Wilks Λ under null and non-null linear hypotheses. Statistica 39, 333-34. Gupta, R. D. and Richards, D., 983. Application of results of Kotz, Johnson and Boyd to the null distribution of Wilks criterion. In: P. K. Sen Ed., Contributions to Statistics: Essays in honour of Norman L. Johnson, 05-0. Hwang, H.-K., 998. On convergence rates in the central limit theorems for combinatorial structures. European Journal of Combinatorics 9, 39-343. Johnson, N. L., Kotz, S. and Balakrishnan, N., 995. Continuous Univariate Distributions, Vol., nd ed.. J. Wiley & Sons, New York. Kabe, D. G., 96. On the exact distribution of a class of multivariate test criteria. Ann. Math. Statist. 33, 97-00. Loève, M., 977. Probability Theory, Vol. I, 4th ed.. Springer-Verlag, New York. Nandi, S. B., 977. The exact null distribution of Wilks criterion. Sankhyã 39, 307-35. Tretter, M. J. and Walster, G. W., 975. Central and noncentral distributions of Wilks statistic in Manova as mixtures of incomplete Beta functions. Ann. Statist. 3, 467-47. Wilks, S. S., 93. Certain generalizations in the analysis of variance. Biometrika 4, 47-494. Wilks, S. S., 935. On the independence of k sets of normally distributed statistical variables. Econometrica 3, 309-36.