Estimation of parametric functions in Downton's bivariate exponential distribution

George Iliopoulos
Department of Mathematics, University of the Aegean, 83200 Karlovasi, Samos, Greece
e-mail: geh@aegean.gr

Abstract

This paper considers estimation of the ratio of means and the regression function in Downton's (1970) bivariate exponential distribution. Unbiased estimators are given and, by presenting improved estimators, they are shown to be inadmissible in terms of mean squared error. The results are derived by conditioning on an unobserved random sample from a geometric distribution, which provides conditional independence for the statistics involved.

AMS 2000 subject classifications: 62F10, 62C99.

Key words and phrases: Downton's bivariate exponential distribution, unbiased estimation, ratio of means, regression function, mean squared error, inadmissibility.

1 Introduction

One of the most important bivariate distributions in reliability theory is the bivariate exponential. There are various bivariate exponential distributions in the literature; a recent review can be found in the book of Kotz, Balakrishnan and Johnson (2000).

In this paper we are interested in Downton's bivariate exponential distribution with probability density function (pdf)

$$ f(x, y; \lambda_1, \lambda_2, \rho) = \frac{\lambda_1 \lambda_2}{1-\rho} \exp\left\{ -\frac{\lambda_1 x + \lambda_2 y}{1-\rho} \right\} I_0\!\left( \frac{2(\rho \lambda_1 \lambda_2 x y)^{1/2}}{1-\rho} \right), \qquad (1.1) $$

where $x, y, \lambda_1, \lambda_2 > 0$, $0 \leq \rho < 1$, and $I_0(z) = \sum_{k=0}^{\infty} (z/2)^{2k}/(k!)^2$ is the modified Bessel function of the first kind of order zero. The above density was initially derived in a different form by Moran (1967). The form (1.1) was derived by Downton (1970) in a reliability context and is a special case of Kibble's (1941) bivariate gamma distribution.

Let $(X, Y)$ be an observation from (1.1). The marginal distributions of $X$ and $Y$ are exponential with means (scale parameters) $1/\lambda_1$ and $1/\lambda_2$ respectively. Since $I_0(0) = 1$, it is clear that $X$ and $Y$ are independent if and only if $\rho = 0$. Downton (1970) showed that $\rho$ is the correlation coefficient of the two variates. By expanding $I_0$ in a series, the pdf can be written in the form

$$ f(x, y; \lambda_1, \lambda_2, \rho) = \sum_{k=0}^{\infty} \pi(k; \rho)\, g_{k+1}\!\left(x; \frac{1-\rho}{\lambda_1}\right) g_{k+1}\!\left(y; \frac{1-\rho}{\lambda_2}\right), $$

where $g_\alpha(\,\cdot\,; \beta)$ denotes the pdf of a Gamma$(\alpha, \beta)$ random variable and $\pi(k; \rho) = (1-\rho)\rho^k$, $k = 0, 1, 2, \ldots$, is the geometric probability mass function. Let $K$ be a random variable having the above geometric distribution. Then, conditionally on $K = k$, $X$ and $Y$ are independent gamma variates with shape parameter $k+1$ and scale parameters $(1-\rho)/\lambda_1$, $(1-\rho)/\lambda_2$ respectively. The most common algorithm for generating observations from (1.1) (see Downton, 1970, and Al-Saadi, Scrimshaw and Young, 1979), as well as the extension of the above distribution to more than two dimensions (see Al-Saadi and Young, 1982), are based on this well known property.
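The mixture representation above translates directly into a simulation routine. The following is a minimal sketch in Python/NumPy (the function name `downton_sample` and its interface are ours, not from the paper): it draws the latent geometric variable and then the two conditionally independent gamma variates.

```python
import numpy as np

def downton_sample(n, lam1, lam2, rho, rng=None):
    """Draw n pairs from (1.1) via the geometric mixture representation:
    given K = k with P(K = k) = (1 - rho) * rho**k, X and Y are independent
    Gamma(k + 1, (1 - rho)/lam_i) variates."""
    rng = np.random.default_rng(rng)
    # NumPy's geometric counts trials up to the first success (support 1, 2, ...),
    # so subtract 1 to obtain the version supported on 0, 1, 2, ...
    k = rng.geometric(1.0 - rho, size=n) - 1
    x = rng.gamma(shape=k + 1, scale=(1.0 - rho) / lam1)
    y = rng.gamma(shape=k + 1, scale=(1.0 - rho) / lam2)
    return x, y
```

As a quick sanity check, the sample means of `x` and `y` should be close to `1/lam1` and `1/lam2`, and the sample correlation close to `rho`.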

Statistical inference for the parameters of (1.1) has mainly been restricted to the correlation coefficient $\rho$. Nagao and Kadoya (1971), Al-Saadi and Young (1980), and Balakrishnan and Ng (2001) considered the estimation problem of $\rho$, and Al-Saadi, Scrimshaw and Young (1979) the problem of testing the hypothesis $\rho = 0$. However, another interesting problem is the estimation of $\lambda = \lambda_2/\lambda_1$, which represents the ratio of the means of the two components. For example, an estimated value greater than one indicates that, on average, the first component is more reliable than the second. Note that $\lambda$ is also the ratio of the scale parameters of $X$ and $Y$. Estimation of $\lambda$ in general scale families, including among others the normal, exponential and inverse Gaussian, has been considered by many authors in the past. For a decision theoretic approach, see Gelfand and Dey (1988), Madi and Tsui (1990), Kubokawa (1994), Madi (1995), Ghosh and Kundu (1996), Kubokawa and Srivastava (1996) (who assume independence of the two components) and Iliopoulos (2001) (who considers the problem of estimation of the ratio of variances in the bivariate normal distribution).

Next, we outline the rest of the paper. In Section 2, an unbiased estimator, $\hat{\lambda}_U$, of $\lambda$ is derived based on a random sample from (1.1). Then, a class of inadmissible estimators with respect to the mean squared error is constructed and it is shown that this class contains $\hat{\lambda}_U$. Furthermore, some alternative biased estimators dominating $\hat{\lambda}_U$ are presented. In Section 3, unbiased estimators of the regression of $X$ on $Y$, as well as of the conditional variance of $X$ given $Y = y$, are given. They are also shown to be inadmissible; improved estimators are presented as well. Finally, an Appendix contains useful expressions for expectations of geometric and negative binomial distributions as well as of the statistics involved in the derivation of the results.

2 Estimation of the ratio of means

Let $(X_1, Y_1), \ldots, (X_n, Y_n)$, $n \geq 2$, be a random sample from (1.1) and $\mathbf{K} = (K_1, \ldots, K_n)$ be the associated (unobserved) random sample from the geometric distribution $\pi(\,\cdot\,; \rho)$ such that, given $K_i = k_i$, $X_i$ is independent of $Y_i$, $i = 1, \ldots, n$. Since each $K_i$ is related only to $(X_i, Y_i)$, it is clear that, conditionally on $\mathbf{K} = \mathbf{k} = (k_1, \ldots, k_n)$, all $X$'s and $Y$'s are independent. Set $K = \sum K_i$, $k = \sum k_i$ and note that $K$ follows a negative binomial distribution. By considering the joint distribution of the data, it is easily seen that the sufficient statistic is $(X_1 Y_1, \ldots, X_n Y_n, \sum X_i, \sum Y_i)$. Setting $S_1 = \sum X_i$, $S_2 = \sum Y_i$, $\mathbf{U} = (U_1, \ldots, U_n) = (X_1 S_1^{-1}, \ldots, X_n S_1^{-1})$ and $\mathbf{V} = (V_1, \ldots, V_n) = (Y_1 S_2^{-1}, \ldots, Y_n S_2^{-1})$, we obtain the one to one transformation $(U_1 V_1, \ldots, U_n V_n, S_1, S_2)$, which is also sufficient. Conditionally on $\mathbf{K} = \mathbf{k}$, $S_1$ and $S_2$ are independent and $S_i \sim \mathrm{Gamma}(n+k, (1-\rho)\lambda_i^{-1})$, $i = 1, 2$. Moreover, from a well known characterization of the gamma distribution, $(S_1, S_2)$ is independent of $(\mathbf{U}, \mathbf{V})$, and $\mathbf{U}, \mathbf{V}$ are iid from an $(n-1)$-variate Dirichlet distribution with parameters $k_1+1, \ldots, k_{n-1}+1; k_n+1$.

Consider the estimation problem of $\lambda = \lambda_2/\lambda_1$. Nagao and Kadoya (1971) showed that the maximum likelihood estimators (mles) of $\lambda_1$ and $\lambda_2$ are $1/\bar{X}$ and $1/\bar{Y}$ respectively, thus the mle of $\lambda$ is $\hat{\lambda}_{mle} = S_1/S_2$. Using Lemma 4.1(vii) in the Appendix, we obtain the expectation of this estimator,

$$ E[S_1/S_2] = E[E(S_1/S_2 \mid K)] = \lambda\, E\!\left[ \frac{n+K}{n+K-1} \right] = \lambda\, \frac{n-\rho}{n-1}. \qquad (2.1) $$
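For later use, here is a small helper computing the statistics $(S_1, S_2, T)$ used throughout, together with a Monte Carlo check of (2.1). This is a sketch reusing the hypothetical `downton_sample` from Section 1.

```python
import numpy as np

def sufficient_stats(x, y):
    """Return (S1, S2, T) with S1 = sum X_i, S2 = sum Y_i and
    T = sum X_i Y_i / (S1 S2), as defined in Section 2."""
    s1, s2 = x.sum(), y.sum()
    t = np.sum(x * y) / (s1 * s2)
    return s1, s2, t

# Monte Carlo check of (2.1): E[S1/S2] = lambda * (n - rho)/(n - 1).
rng = np.random.default_rng(0)
n, lam1, lam2, rho = 10, 1.0, 2.0, 0.5
ratios = []
for _ in range(20000):
    x, y = downton_sample(n, lam1, lam2, rho, rng)
    ratios.append(x.sum() / y.sum())
print(np.mean(ratios), (lam2 / lam1) * (n - rho) / (n - 1))  # both ~ 2.11
```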

Hence, $\hat{\lambda}_{mle}$ is biased. For deriving an unbiased estimator of $\lambda$ it is necessary to employ an estimator of the correlation coefficient $\rho$. There are two classes of estimators of $\rho$ in the literature: (i) estimators based on the statistic $T = \sum X_i Y_i / S_1 S_2 = \sum U_i V_i$ (such as the moment estimator) and (ii) estimators based on the sample correlation coefficient $R$; see Al-Saadi and Young (1980) and Balakrishnan and Ng (2001). However, $R$ is not a function of the sufficient statistic, whereas $T$ is. Therefore, $T$ has been chosen for our purposes. Note also that the problem of estimation of $\lambda$ remains invariant under the group of transformations $(X_i, Y_i) \to (c_1 X_i, c_2 Y_i)$, $i = 1, \ldots, n$, and equivariant estimators of $\lambda$ are of the form $\psi(U_1 V_1, \ldots, U_n V_n)\, S_1/S_2$. A particular choice for $\psi$ can be of the form $\psi(T)$, giving more justification to $T$.

The conditional expectation of $T$ given $\mathbf{K} = \mathbf{k}$ is

$$ E[T \mid \mathbf{K} = \mathbf{k}] = \sum_{i=1}^n E[U_i V_i \mid \mathbf{K} = \mathbf{k}] = \sum_{i=1}^n E(U_i \mid \mathbf{K} = \mathbf{k})^2 = \sum_{i=1}^n \frac{(k_i+1)^2}{(k+n)^2}. $$

Since $T$ is a function of $\mathbf{U}$ and $\mathbf{V}$ solely, it follows that, conditionally on $\mathbf{K}$, it is also independent of $S_1, S_2$. Therefore,

$$ \begin{aligned} E[T S_1/S_2] &= E[E(T S_1/S_2 \mid \mathbf{K})] = E[E(T \mid \mathbf{K})\, E(S_1/S_2 \mid \mathbf{K})] = \lambda\, E\!\left[ \frac{\sum_{i=1}^n (K_i+1)^2}{(n+K)(n+K-1)} \right] \\ &= \lambda\, E\!\left\{ \frac{n}{(n+K)(n+K-1)}\, E\!\left[ (K_1+1)^2 \mid K \right] \right\} = \lambda\, E\!\left[ \frac{n+1+2K}{(n+1)(n+K-1)} \right] = \lambda \left( \frac{1}{n-1} + \frac{n-3}{n^2-1}\,\rho \right) \qquad (2.2) \end{aligned} $$

(see Lemma 4.1). From (2.1), (2.2) it can be seen that each of $E[S_1/S_2]$, $E[T S_1/S_2]$ equals $\lambda$ times a first degree polynomial in $\rho$. The derivation of an unbiased estimator of $\lambda$ which is a function of $S_1$, $S_2$ and $T$ is equivalent to finding $c_0$, $c_1$ such that $E[c_0 S_1/S_2 + c_1 T S_1/S_2] = \lambda$. Solving the linear equations

$$ \frac{n}{n-1} c_0 + \frac{1}{n-1} c_1 = 1, \qquad -\frac{1}{n-1} c_0 + \frac{n-3}{n^2-1} c_1 = 0, $$

we obtain $c_0 = (n-3)/(n-1)$, $c_1 = (n+1)/(n-1)$. Thus, we have proved the following proposition.

Proposition 2.1. The estimator

$$ \hat{\lambda}_U = \frac{n-3+(n+1)T}{n-1} \cdot \frac{S_1}{S_2} \qquad (2.3) $$

is unbiased for $\lambda = \lambda_2/\lambda_1$.
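A direct transcription of (2.3) into code, building on the `sufficient_stats` helper sketched above (the function name is ours):

```python
def lambda_hat_U(x, y):
    """Unbiased estimator (2.3) of lambda = lambda_2 / lambda_1."""
    n = len(x)
    s1, s2, t = sufficient_stats(x, y)
    return (n - 3 + (n + 1) * t) / (n - 1) * s1 / s2
```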

For $n \geq 3$, the variance of $\hat{\lambda}_U$ is

$$ \mathrm{Var}(\hat{\lambda}_U) = E\!\left[ \left( \frac{n-3+(n+1)T}{n-1} \cdot \frac{S_1}{S_2} \right)^2 \right] - \lambda^2 = \left( \frac{n-3}{n-1} \right)^2 E\!\left[ S_1^2/S_2^2 \right] + \frac{2(n-3)(n+1)}{(n-1)^2} E\!\left[ T S_1^2/S_2^2 \right] + \left( \frac{n+1}{n-1} \right)^2 E\!\left[ T^2 S_1^2/S_2^2 \right] - \lambda^2, $$

and substituting the expectations from Lemma 4.2 we get

$$ \mathrm{Var}(\hat{\lambda}_U) = \left[ \frac{2n^2-5n+5}{(n-2)(n-1)^2} - \frac{2(n^3-3n+10)}{(n^2-4)(n-1)^2}\,\rho + \frac{n^3+6n^2-5n+38}{(n+3)(n^2-4)(n-1)^2}\,\rho^2 \right] \lambda^2. $$

Consider the class of estimators of $\lambda$,

$$ \mathcal{C} = \left\{ \hat{\lambda}_{a_1,a_2} = (a_1 + a_2 T)\, S_1/S_2,\ a_1, a_2 \in \mathbb{R} \right\}. $$

The unbiased estimator $\hat{\lambda}_U$ as well as the mle $\hat{\lambda}_{mle}$ are members of $\mathcal{C}$ for $a_1 = a_{1U} = (n-3)/(n-1)$, $a_2 = a_{2U} = (n+1)/(n-1)$ and $a_1 = 1$, $a_2 = 0$, respectively. We would like to characterize inadmissible estimators within $\mathcal{C}$ in terms of mean squared error (mse). By invariance, the (scaled) mse $\lambda^{-2} E_{\lambda_1,\lambda_2,\rho}(\hat{\lambda}_{a_1,a_2} - \lambda)^2$ does not depend on $\lambda_1, \lambda_2$. Thus, without loss of generality, we assume for the rest of the section that $\lambda_1 = \lambda_2 = 1$ and denote the mse of $\hat{\lambda}_{a_1,a_2}$ as $\mathrm{mse}(\mathbf{a}, \rho)$, where $\mathbf{a} = (a_1, a_2)$.

Fix $\rho \in [0, 1]$. Then, for $n \geq 3$, $\mathrm{mse}(\mathbf{a}, \rho)$ is strictly convex in $\mathbf{a}$ and there exists a minimizing point $\mathbf{a}_0(\rho) = (a_{10}(\rho), a_{20}(\rho))$ with

$$ a_{10}(\rho) = (n-2)\, q_1(\rho)/q_2(\rho), \qquad a_{20}(\rho) = 3(n-2)(n+1)(n+2)(n+3)\,\rho(1-\rho)^2/q_2(\rho), $$

where

$$ \begin{aligned} q_1(\rho) ={}& (n+1)(n+2)(n+3) + 4(n-6)(n+1)(n+3)\rho - (n-5)(3n^2+29n+30)\rho^2 + 2(n^3-11n-46)\rho^3, \\ q_2(\rho) ={}& (n+1)^2(n+2)(n+3) + 4(n-6)(n+1)^2(n+3)\rho - (3n^4+32n^3-77n^2-382n-312)\rho^2 \\ &+ 2(n^4+4n^3+25n^2-126n-256)\rho^3 - 3(n^3+13n-94)\rho^4. \end{aligned} $$

As expected, it holds that $\mathbf{a}_0(0) = ((n-2)/(n+1), 0)$, i.e., in the case of two independent exponential samples, the best estimator within $\mathcal{C}$ coincides with the best equivariant estimator of $\lambda$. On the other hand, $\mathbf{a}_0(1) = (1, 0)$, that is, the best estimator in this case is the mle.
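The closed-form variance above is easy to tabulate; a small sketch (the function name is ours) that can be compared against the empirical variance of `lambda_hat_U` over simulated samples:

```python
def var_lambda_hat_U(n, rho, lam=1.0):
    """Closed-form variance of the unbiased estimator (2.3), valid for n >= 3."""
    a = (2 * n**2 - 5 * n + 5) / ((n - 2) * (n - 1)**2)
    b = 2 * (n**3 - 3 * n + 10) / ((n**2 - 4) * (n - 1)**2)
    c = (n**3 + 6 * n**2 - 5 * n + 38) / ((n + 3) * (n**2 - 4) * (n - 1)**2)
    return (a - b * rho + c * rho**2) * lam**2
```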

Notice here that the mse of the mle tends to zero as $\rho \to 1$. To see this without evaluating it, observe that, since the support of $X_i, Y_i$ is $(0, \infty)$, $\rho = 1$ implies that $\lambda_1 X_i = \lambda_2 Y_i$ with probability one.

Setting

$$ B(\rho) = \begin{pmatrix} E_{\lambda_1=\lambda_2=1,\rho}(S_1^2/S_2^2) & E_{\lambda_1=\lambda_2=1,\rho}(T S_1^2/S_2^2) \\ E_{\lambda_1=\lambda_2=1,\rho}(T S_1^2/S_2^2) & E_{\lambda_1=\lambda_2=1,\rho}(T^2 S_1^2/S_2^2) \end{pmatrix}, $$

the mse of $\hat{\lambda}_{a_1,a_2}$ can be expressed as

$$ \mathrm{mse}(\mathbf{a}, \rho) = [\mathbf{a} - \mathbf{a}_0(\rho)]' B(\rho) [\mathbf{a} - \mathbf{a}_0(\rho)] + \mathrm{mse}(\mathbf{a}_0(\rho), \rho). $$

Let

$$ E(\mathbf{a}, \rho) = \left\{ \mathbf{c} \in \mathbb{R}^2 : [\mathbf{c} - \mathbf{a}_0(\rho)]' B(\rho) [\mathbf{c} - \mathbf{a}_0(\rho)] < \mathrm{mse}(\mathbf{a}, \rho) - \mathrm{mse}(\mathbf{a}_0(\rho), \rho) \right\} $$

be the interior of the ellipse that consists of the points $\mathbf{c} = (c_1, c_2)$ such that $\hat{\lambda}_{c_1,c_2}$ has equal mse with $\hat{\lambda}_{a_1,a_2}$ for the particular $\rho$. Then, $\hat{\lambda}_{a_1,a_2}$ is admissible within $\mathcal{C}$ if and only if

$$ \bigcap_{\rho \in [0,1)} E(\mathbf{a}, \rho) = \varnothing. $$

This condition is clearly satisfied by $\hat{\lambda}_{a_{10}(\rho), a_{20}(\rho)}$, $\rho \in [0, 1)$, implying that these estimators are admissible within $\mathcal{C}$. By the continuity of the mse, this holds also for the mle. However, the determination of the above intersection is in general a problem which does not seem to allow for an analytical solution. Instead, we can find a subclass of $\mathcal{C}$ containing inadmissible estimators $\hat{\lambda}_{a_1,a_2}$ by fixing $a_1$ or $a_2$ one at a time.

Fix first $a_1$. Then, the mse of $\hat{\lambda}_{a_1,a_2}$ is quadratic in $a_2$ and uniquely minimized at $a_2 = a_2^*(a_1, \rho)$ given by

$$ a_2^*(a_1, \rho) = \frac{(n+2)(n+3)\left\{ (n-2)[n+1+(n-3)\rho] - [(n+1)^2 + (n^2-5n-12)\rho - 3(n-5)\rho^2]\, a_1 \right\}}{(n+2)(n+3)^2 + 2(n+3)(n^2+n-26)\rho + (n^3-8n^2-27n+178)\rho^2}. \qquad (2.4) $$

Since the denominator in (2.4) is positive for every $\rho \in [0,1]$ and $n \geq 3$, $a_2^*(a_1, \rho)$ is bounded. Let $\underline{a}_2(a_1) = \inf_{\rho \in [0,1)} a_2^*(a_1, \rho)$ and $\bar{a}_2(a_1) = \sup_{\rho \in [0,1)} a_2^*(a_1, \rho)$. Then we have the following.

Proposition 2.2. (i) If $a_2 \notin A_2(a_1) = [\underline{a}_2(a_1), \bar{a}_2(a_1)]$ then $\hat{\lambda}_{a_1,a_2}$ is inadmissible, being dominated by $\hat{\lambda}_{a_1, \underline{a}_2(a_1)}$ if $a_2 < \underline{a}_2(a_1)$ or by $\hat{\lambda}_{a_1, \bar{a}_2(a_1)}$ if $a_2 > \bar{a}_2(a_1)$.

(ii) In particular, if $a_1 \leq a_{11} = (n-7)/(n-1)$ then

$$ \underline{a}_2(a_1) = a_2^*(a_1, 1) = \frac{(n+2)(n+3)}{2(n+5)} (1 - a_1), \qquad (2.5) $$

$$ \bar{a}_2(a_1) = a_2^*(a_1, 0) = \frac{(n+1)^2}{n+3} \left( \frac{n-2}{n+1} - a_1 \right), \qquad (2.6) $$

whereas, if $a_1 \geq a_{12} = (n^3+2n^2-41n-34)/[(n-1)(n^2+9n+10)]$,

$$ \underline{a}_2(a_1) = a_2^*(a_1, 0), \qquad \bar{a}_2(a_1) = a_2^*(a_1, 1), $$

where $a_2^*(a_1, 0)$, $a_2^*(a_1, 1)$ are as in (2.6), (2.5), respectively.

Proof. Part (i) is a consequence of the convexity of the mse in $a_2$. Part (ii) arises from the monotonicity of $a_2^*(a_1, \rho)$ with respect to $\rho$. Specifically, for $a_1 \leq a_{11}$, $a_2^*(a_1, \rho)$ is strictly decreasing in $\rho$, whereas for $a_1 \geq a_{12}$ it is strictly increasing. This can be seen by examining the sign of the derivative of $a_2^*(a_1, \rho)$ with respect to $\rho$, which is proportional to the quadratic

$$ \begin{aligned} & (n+3)[(n-1)(n^2+9n+10)a_1 - (n^3+2n^2-41n-34)] \\ &+ 2[(n-1)(n^3-35n-46)a_1 - (n+1)(n^3-8n^2-27n+178)]\rho \\ &+ [(n-1)(n^3-4n^2-19n+102)a_1 - (n-3)(n^3-8n^2-27n+178)]\rho^2. \end{aligned} $$

The rest of the proof is elementary (although messy) and therefore omitted.

Remark 2.1. When $a_2 < \underline{a}_2(a_1)$, by the convexity of the mean squared error, $\hat{\lambda}_{a_1,a_2}$ is dominated not only by $\hat{\lambda}_{a_1, \underline{a}_2(a_1)}$, but by any estimator $\hat{\lambda}_{a_1, a_2'}$ with $a_2' \in (a_2, \underline{a}_2(a_1)]$ (a similar argument holds when $a_2 > \bar{a}_2(a_1)$). Nevertheless, $\hat{\lambda}_{a_1, \underline{a}_2(a_1)}$ is the best among these estimators, therefore it is the only one mentioned in Proposition 2.2.
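The endpoints in Proposition 2.2 can also be approximated numerically by scanning $\rho$ over a fine grid, which gives a convenient cross-check of the monotonicity claims. A sketch based on formula (2.4) (helper names and the grid resolution are ours):

```python
import numpy as np

def a2_star(a1, rho, n):
    """Minimizing coefficient a2 for fixed a1, i.e. formula (2.4)."""
    num = (n + 2) * (n + 3) * (
        (n - 2) * (n + 1 + (n - 3) * rho)
        - ((n + 1)**2 + (n**2 - 5*n - 12) * rho - 3 * (n - 5) * rho**2) * a1)
    den = ((n + 2) * (n + 3)**2
           + 2 * (n + 3) * (n**2 + n - 26) * rho
           + (n**3 - 8*n**2 - 27*n + 178) * rho**2)
    return num / den

def A2_interval(a1, n, grid=100001):
    """Numerical [inf, sup] of a2_star(a1, rho) over rho in [0, 1)."""
    rho = np.linspace(0.0, 1.0, grid, endpoint=False)
    vals = a2_star(a1, rho, n)
    return vals.min(), vals.max()
```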

In a similar way, by fixing $a_2$ and letting $a_1$ vary, one can obtain an analogous result. In this case the mse is quadratic in $a_1$ and uniquely minimized at $a_1^*(a_2, \rho)$ given by

$$ a_1^*(a_2, \rho) = \frac{(n+1)(n-2)(n-\rho) - [(n+1)^2 + (n^2-5n-12)\rho - 3(n-5)\rho^2]\, a_2}{(n+1)[n(n+1) - 4(n+1)\rho + 6\rho^2]}. $$

The denominator is always positive, thus $a_1^*(a_2, \rho)$ is bounded for $\rho \in [0,1]$. Setting $\underline{a}_1(a_2) = \inf_{\rho \in [0,1)} a_1^*(a_2, \rho)$ and $\bar{a}_1(a_2) = \sup_{\rho \in [0,1)} a_1^*(a_2, \rho)$, we derive the following.

Proposition 2.3. (i) If $a_1 \notin A_1(a_2) = [\underline{a}_1(a_2), \bar{a}_1(a_2)]$ then $\hat{\lambda}_{a_1,a_2}$ is inadmissible, being dominated by $\hat{\lambda}_{\underline{a}_1(a_2), a_2}$ if $a_1 < \underline{a}_1(a_2)$ or by $\hat{\lambda}_{\bar{a}_1(a_2), a_2}$ if $a_1 > \bar{a}_1(a_2)$.

(ii) In particular, if $a_2 \leq a_{21} = 3n(n+1)/[(n-1)(n+2)]$ then

$$ \underline{a}_1(a_2) = a_1^*(a_2, 0) = \frac{n-2}{n+1} - \frac{a_2}{n}, \qquad (2.7) $$

$$ \bar{a}_1(a_2) = a_1^*(a_2, 1) = 1 - \frac{2 a_2}{n+1}, \qquad (2.8) $$

whereas, if $a_2 \geq a_{22} = 3(n+1)/(n-1)$,

$$ \underline{a}_1(a_2) = a_1^*(a_2, 1), \qquad \bar{a}_1(a_2) = a_1^*(a_2, 0), $$

where $a_1^*(a_2, 0)$, $a_1^*(a_2, 1)$ are as in (2.7), (2.8), respectively.

Propositions 2.2 and 2.3 provide necessary conditions for the admissibility of $\hat{\lambda}_{a_1,a_2}$ within $\mathcal{C}$, as stated in Corollary 2.1 below.

Corollary 2.1. Two necessary conditions for the admissibility of $\hat{\lambda}_{a_1,a_2}$ within $\mathcal{C}$ are $a_1 \in A_1(a_2)$ and $a_2 \in A_2(a_1)$.

Typically, unbiased estimators of scale parameters (as is $\lambda$ for the distribution of $S_1/S_2$) are inadmissible in terms of mean squared error. In our case, the inadmissibility of the unbiased estimator $\hat{\lambda}_U$ follows from Proposition 2.2, since $a_{1U} > a_{12}$ and $a_{2U} > \bar{a}_2(a_{1U}) = (n+2)(n+3)/[(n-1)(n+5)]$.

Corollary 2.2. The unbiased estimator $\hat{\lambda}_U$ is inadmissible in terms of mean squared error, being dominated by

$$ \hat{\lambda}_U^* = \hat{\lambda}_{a_{1U}, \bar{a}_2(a_{1U})} = \left( \frac{n-3}{n-1} + \frac{(n+2)(n+3)}{(n-1)(n+5)}\, T \right) \frac{S_1}{S_2}. \qquad (2.9) $$

Consider now the broader class of estimators $\mathcal{D} = \{\hat{\lambda}_\phi = \phi(T)\, S_1/S_2\}$, where $\phi(\cdot)$ is any function such that $\hat{\lambda}_\phi$ has finite mse. Using Stein's (1964) technique, originally presented for improving the best equivariant estimator of a normal variance when the mean is unknown, one concludes that $\hat{\lambda}_U^*$ in (2.9) as well as a large subset of $\mathcal{C}$ are inadmissible estimators. To be specific, consider the conditional mean squared error of $\hat{\lambda}_\phi$ given $T = t$, $\mathbf{K} = \mathbf{k}$,

$$ E\left\{ [\phi(T)\, S_1/S_2 - \lambda]^2 \mid T = t, \mathbf{K} = \mathbf{k} \right\}, $$

which is quadratic in $\phi(t)$ and uniquely minimized at

$$ \phi_k(t) = \lambda\, \frac{E[S_1/S_2 \mid T = t, \mathbf{K} = \mathbf{k}]}{E[S_1^2/S_2^2 \mid T = t, \mathbf{K} = \mathbf{k}]} = \frac{n+k-2}{n+k+1} = \phi_k, \text{ say}. $$

Note that it does not depend on $t$, since conditionally on $\mathbf{K} = \mathbf{k}$, $S_1$, $S_2$ and $T$ are mutually independent.

Moreover, $\phi_k$ is strictly increasing in $k$, with $\phi_0 = (n-2)/(n+1)$ and $\lim_{k\to\infty} \phi_k = 1$. As a consequence, each estimator of the form $\hat{\lambda}_\phi$ with $P[\phi(T) \notin [(n-2)/(n+1), 1]] > 0$ is inadmissible, being dominated by the estimator $\phi^*(T)\, S_1/S_2$, where $\phi^*(T) = \max\{(n-2)/(n+1), \min[\phi(T), 1]\}$. Application of the above argument to the class $\mathcal{C}$ leads to the following proposition.

Proposition 2.4. (i) If $a_1 \notin [(n-2)/(n+1), 1]$ or $a_2 \notin [(n-2)/(n+1) - a_1, 1 - a_1]$, then the estimator $\hat{\lambda}_{a_1,a_2}$ is inadmissible, being dominated by $\max\{(n-2)/(n+1), \min[a_1 + a_2 T, 1]\}\, S_1/S_2$.

(ii) In particular, $\hat{\lambda}_U^*$ in (2.9) is dominated by $\hat{\lambda}_U^{**} = \min\{\hat{\lambda}_U^*, \hat{\lambda}_{mle}\}$.

The mse of $\hat{\lambda}_U^{**}$ cannot be derived in a closed form, therefore an analytical comparison with $\hat{\lambda}_{mle}$ is impossible. However, it is easy to compare the latter with $\hat{\lambda}_U$. Table 1 shows, for selected sample sizes, the corresponding values of the correlation coefficient for which both estimators have equal mean squared errors.

$n$       3        5        10       20       50       100
$\rho$    0.8463   0.8317   0.8433   0.8663   0.9004   0.9239

Table 1. Values of $\rho$ for which $\hat{\lambda}_{mle}$ and $\hat{\lambda}_U$ have equal mean squared errors.

When $\rho$ is less than the reported value, $\hat{\lambda}_U$ is superior to $\hat{\lambda}_{mle}$, and vice versa. Since $\hat{\lambda}_U^{**}$ dominates $\hat{\lambda}_U$, it follows that for $\rho$ less than the reported value, $\hat{\lambda}_U^{**}$ dominates $\hat{\lambda}_{mle}$ as well. (In fact, a Monte Carlo study showed that $\hat{\lambda}_U^{**}$ and $\hat{\lambda}_{mle}$ have equal mean squared errors when $\rho$ is approximately 0.05 higher than the values given in Table 1.) It can be concluded that $\hat{\lambda}_U^{**}$ should be preferred, unless almost perfect linear correlation is suspected.
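For completeness, sketches of the two improved ratio estimators discussed in this section, again assuming the `sufficient_stats` helper from above (function names are ours):

```python
def lambda_hat_star(x, y):
    """Improved estimator (2.9), dominating the unbiased (2.3)."""
    n = len(x)
    s1, s2, t = sufficient_stats(x, y)
    a1 = (n - 3) / (n - 1)
    a2 = (n + 2) * (n + 3) / ((n - 1) * (n + 5))
    return (a1 + a2 * t) * s1 / s2

def lambda_hat_star_star(x, y):
    """Proposition 2.4(ii): truncate (2.9) from above at the mle S1/S2."""
    s1, s2, _ = sufficient_stats(x, y)
    return min(lambda_hat_star(x, y), s1 / s2)
```

A Monte Carlo loop over `downton_sample` draws can then reproduce Table-1-style comparisons of these estimators with the mle, up to simulation error.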

3 Estimation of the regression and the conditional variance

Consider now estimation of the regression of $X$ on $Y$ based on a random sample from (1.1). Downton (1970) showed that the conditional expectation of $X$ given $Y = y$ is linear in $y$; specifically,

$$ \eta(y) = E[X \mid Y = y] = \frac{1-\rho}{\lambda_1} + \rho\, \frac{\lambda_2}{\lambda_1}\, y. $$

Obviously, for deriving an unbiased estimator of $\eta(y)$ it suffices to derive unbiased estimators of $\eta_1 = (1-\rho)/\lambda_1$ and $\eta_2 = \rho \lambda_2/\lambda_1$.

Proposition 3.1. (i) The estimator

$$ \hat{\eta}_{1U} = \frac{2 - (n+1)T}{n-1}\, S_1 $$

is unbiased for $\eta_1 = (1-\rho)/\lambda_1$.

(ii) The estimator

$$ \hat{\eta}_{2U} = \frac{n+1}{n-1} (nT - 1)\, \frac{S_1}{S_2} $$

is unbiased for $\eta_2 = \rho \lambda_2/\lambda_1$.

Proof. (i) The problem is similar to that of the derivation of $\hat{\lambda}_U$ in (2.3). We have to find $c_0$, $c_1$ such that $E[(c_0 + c_1 T) S_1] = (1-\rho)\lambda_1^{-1}$. Using Lemma 4.2(i), (ii), it can be seen that it suffices to solve the equations

$$ n c_0 + c_1 = 1, \qquad \frac{n-1}{n+1} c_1 = -1, $$

for $c_0$ and $c_1$. The solution is $c_0 = 2/(n-1)$ and $c_1 = -(n+1)/(n-1)$, hence $\hat{\eta}_{1U}$ is an unbiased estimator of $\eta_1 = (1-\rho)/\lambda_1$.

(ii) Similarly, we need to find $c_0$, $c_1$ such that $E[(c_0 + c_1 T) S_1/S_2] = \rho \lambda_2/\lambda_1$. Using (2.1) and (2.2), we get the equations

$$ \frac{n}{n-1} c_0 + \frac{1}{n-1} c_1 = 0, \qquad -\frac{1}{n-1} c_0 + \frac{n-3}{n^2-1} c_1 = 1, $$

whose solution is $c_0 = -(n+1)/(n-1)$ and $c_1 = n(n+1)/(n-1)$, yielding $\hat{\eta}_{2U}$ as an unbiased estimator of $\eta_2 = \rho \lambda_2/\lambda_1$.

Corollary 3.1. The estimator

$$ \hat{\eta}_U(y) = \frac{2 - (n+1)T}{n-1}\, S_1 + \frac{n+1}{n-1} (nT - 1)\, \frac{S_1}{S_2}\, y $$

is unbiased for $\eta(y)$.

The estimator $\hat{\eta}_U(y)$ is inadmissible for every $y$, since it assumes negative values with positive probability. A rather crude improved estimator is its positive part, $\hat{\eta}_U^+(y) = \max\{0, \hat{\eta}_U(y)\}$, which has smaller risk for any convex loss function. However, the same occurs for $\hat{\eta}_{1U}$ and $\hat{\eta}_{2U}$, and it seems rational to improve first on them and use their improvements to estimate the regression.
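A sketch of the unbiased regression estimator of Corollary 3.1 (function name and interface are ours); as noted above, it can go negative, which the improved versions below repair:

```python
def eta_hat_U(x, y_data, y0):
    """Unbiased estimator of eta(y0) = E[X | Y = y0] (Corollary 3.1).
    May be negative with positive probability."""
    n = len(x)
    s1, s2, t = sufficient_stats(x, y_data)
    eta1 = (2 - (n + 1) * t) / (n - 1) * s1
    eta2 = (n + 1) / (n - 1) * (n * t - 1) * s1 / s2
    return eta1 + eta2 * y0
```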

An estimator dominating $\hat{\eta}_{1U}$ in terms of mean squared error can be derived using Stein's (1964) technique. Consider the conditional mean squared error of estimators of the form $\phi(T) S_1$ given $T = t$, $\mathbf{K} = \mathbf{k}$,

$$ E\left\{ [\phi(T) S_1 - (1-\rho)\lambda_1^{-1}]^2 \mid T = t, \mathbf{K} = \mathbf{k} \right\}, $$

which is quadratic in $\phi(t)$ and uniquely minimized at

$$ \phi_k(t) = \lambda_1^{-1} (1-\rho)\, \frac{E[S_1 \mid T = t, \mathbf{K} = \mathbf{k}]}{E[S_1^2 \mid T = t, \mathbf{K} = \mathbf{k}]} = \frac{1}{n+k+1} = \phi_k, \text{ say}. $$

Now, $\phi_k$ is positive, attaining its maximum when $k = 0$, i.e., $0 < \phi_k \leq \phi_0 = (n+1)^{-1}$. As a consequence, each estimator of the form $\phi(T) S_1$ with $P[\phi(T) \notin [0, (n+1)^{-1}]] > 0$ is inadmissible, being dominated by the estimator $\phi^*(T) S_1$, where $\phi^*(T) = \max\{0, \min[\phi(T), (n+1)^{-1}]\}$. Since $P[(2 - (n+1)T)/(n-1) \notin [0, (n+1)^{-1}]] > 0$, $\hat{\eta}_{1U}$ is dominated by the estimator

$$ \hat{\eta}_1^* = \begin{cases} S_1/(n+1), & T < (n+3)/(n+1)^2, \\ \hat{\eta}_{1U}, & (n+3)/(n+1)^2 \leq T \leq 2/(n+1), \\ 0, & T > 2/(n+1). \end{cases} \qquad (3.1) $$

In a similar fashion we can improve on $\hat{\eta}_{2U}$. Note that it contains the quantity $nT - 1$, which is the estimator of $\rho$ obtained by Nagao and Kadoya (1971) using the method of moments. Using the condition $0 \leq \rho < 1$, Al-Saadi and Young (1980) modified this estimator to

$$ \rho^* = \begin{cases} 0, & T < 1/n, \\ nT - 1, & 1/n \leq T \leq 2/n, \\ 1, & T > 2/n. \end{cases} $$

The replacement of $nT - 1$ in $\hat{\eta}_{2U}$ by $\max\{nT - 1, 0\}$ leads to its positive part, $\hat{\eta}_{2U}^+ = \max\{0, \hat{\eta}_{2U}\}$, which is an improved estimator of $\rho \lambda_2/\lambda_1$. Replacement of $nT - 1$ by $\rho^*$ seems also reasonable, leading to the estimator

$$ \tilde{\eta}_2 = \begin{cases} 0, & T < 1/n, \\ \hat{\eta}_{2U}, & 1/n \leq T \leq 2/n, \\ \dfrac{n+1}{n-1}\, \dfrac{S_1}{S_2}, & T > 2/n. \end{cases} \qquad (3.2) $$
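The truncated estimator (3.1) translates directly into code; a sketch (names ours):

```python
def eta1_hat_star(x, y_data):
    """Truncated estimator (3.1) of eta_1 = (1 - rho)/lambda_1."""
    n = len(x)
    s1, _, t = sufficient_stats(x, y_data)
    if t < (n + 3) / (n + 1) ** 2:
        return s1 / (n + 1)        # best equivariant estimator when rho = 0
    if t > 2 / (n + 1):
        return 0.0                 # eta_1 tends to 0 as rho -> 1
    return (2 - (n + 1) * t) / (n - 1) * s1   # the unbiased eta_hat_1U
```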

However, using Stein's (1964) technique we can find an estimator dominating all these estimators. Consider the class of estimators of $\rho \lambda_2/\lambda_1$ having the form $\psi(T)\, S_1/S_2$. The conditional mean squared error given $T = t$, $\mathbf{K} = \mathbf{k}$ of such an estimator is uniquely minimized with respect to $\psi(t)$ at

$$ \psi_k(t) = \rho \lambda_2 \lambda_1^{-1}\, \frac{E[S_1/S_2 \mid T = t, \mathbf{K} = \mathbf{k}]}{E[S_1^2/S_2^2 \mid T = t, \mathbf{K} = \mathbf{k}]} = \rho\, \frac{n+k-2}{n+k+1} = \psi_k(\rho), \text{ say}. $$

Since $0 \leq \psi_k(\rho) \leq \rho < 1$, any estimator of the form $\psi(T)\, S_1/S_2$ satisfying $P[\psi(T) \notin [0, 1]] > 0$ is inadmissible. Indeed, it is dominated by $\psi^*(T)\, S_1/S_2$, where $\psi^*(T) = \max\{0, \min[\psi(T), 1]\}$. Thus, $\hat{\eta}_{2U}$ and $\hat{\eta}_{2U}^+$ are dominated by

$$ \hat{\eta}_2^* = \begin{cases} 0, & T < 1/n, \\ \hat{\eta}_{2U}, & 1/n \leq T \leq 2/(n+1), \\ S_1/S_2, & T > 2/(n+1). \end{cases} \qquad (3.3) $$

From (3.2) and (3.3), it is obvious that $\hat{\eta}_2^*$ dominates $\tilde{\eta}_2$ as well.

Remark. The estimators $\hat{\eta}_1^*$, $\hat{\eta}_2^*$ in (3.1), (3.3) respectively have the property of pretesting for $\rho$. For example, when $T$ is small (smaller than $(n+3)/(n+1)^2$), indicating $\rho = 0$, $\hat{\eta}_1^*$ equals the best equivariant estimator of $1/\lambda_1$ with respect to squared error loss, $S_1/(n+1)$. On the other hand, when $T$ is large (greater than $2/(n+1)$), indicating $\rho$ to be very close to one, $\hat{\eta}_1^*$ equals zero. Analogous comments hold for $\hat{\eta}_2^*$.

The percentage improvements in terms of mean squared error of the estimators $\hat{\eta}_1^*$, $\hat{\eta}_2^*$ over $\hat{\eta}_{1U}$, $\hat{\eta}_{2U}$ respectively have been evaluated by Monte Carlo sampling from (1.1), for sample sizes $n = 10, 20, 50$ and $\rho = 0(.1).9$. We have taken 10000 replications for each pair $(n, \rho)$. The results are shown in Table 2. It can be seen that the improvements are remarkable even for $n = 50$. Generally, they are larger for extreme values of $\rho$. This can be explained by the nature of the improved estimators as indicated in the above remark.

$\rho$                      0.0   0.1   0.2   0.3   0.4   0.5   0.6   0.7   0.8   0.9
$n = 10$  $\hat{\eta}_1^*$  39.4  41.8  43.1  44.3  46.2  49.4  53.9  58.9  64.1  67.7
          $\hat{\eta}_2^*$  46.9  45.4  43.3  42.1  42.3  44.3  47.9  53.0  59.4  65.2
$n = 20$  $\hat{\eta}_1^*$  32.1  31.5  27.9  25.8  27.5  32.7  41.0  50.5  58.7  64.9
          $\hat{\eta}_2^*$  42.8  36.4  28.9  24.8  25.4  29.4  36.7  45.6  54.5  62.5
$n = 50$  $\hat{\eta}_1^*$  28.1  22.1  11.6   6.5   7.5  13.2  22.5  34.2  47.4  58.7
          $\hat{\eta}_2^*$  43.6  27.9  12.5   6.3   6.9  11.8  20.0  30.7  43.8  56.4

Table 2. Simulated percentage risk improvement in mean squared error of $\hat{\eta}_1^*$ in (3.1) and $\hat{\eta}_2^*$ in (3.3) over $\hat{\eta}_{1U}$, $\hat{\eta}_{2U}$ respectively.
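Entries of Table 2 can be reproduced up to Monte Carlo error along the following lines; this sketch covers the $\hat{\eta}_1^*$ rows only and assumes the `downton_sample`, `sufficient_stats` and `eta1_hat_star` helpers defined earlier:

```python
import numpy as np

def improvement_eta1(n, rho, reps=10000, seed=1):
    """Percentage MSE improvement of (3.1) over the unbiased estimator,
    with lambda_1 = lambda_2 = 1, in the spirit of the eta_1 rows of Table 2."""
    rng = np.random.default_rng(seed)
    true_eta1 = 1.0 - rho            # eta_1 = (1 - rho)/lambda_1 with lambda_1 = 1
    se_unb = se_imp = 0.0
    for _ in range(reps):
        x, y = downton_sample(n, 1.0, 1.0, rho, rng)
        s1, _, t = sufficient_stats(x, y)
        unb = (2 - (n + 1) * t) / (n - 1) * s1
        se_unb += (unb - true_eta1) ** 2
        se_imp += (eta1_hat_star(x, y) - true_eta1) ** 2
    return 100.0 * (1.0 - se_imp / se_unb)
```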

The conditional variance of $X$ given $Y = y$ is also linear in $y$. Specifically,

$$ \theta(y) = \mathrm{Var}(X \mid Y = y) = \left( \frac{1-\rho}{\lambda_1} \right)^2 + \frac{2\rho(1-\rho)\lambda_2}{\lambda_1^2}\, y. $$

Let $\theta_1 = (1-\rho)^2 \lambda_1^{-2}$, $\theta_2 = 2\rho(1-\rho)\lambda_2 \lambda_1^{-2}$. Then we have the following proposition.

Proposition 3.2. (i) The estimator $\hat{\theta}_{1U} = h_1(T)\, S_1^2$, where

$$ h_1(T) = \frac{4(n+5) - 4(n+1)(n+5)T + (n+1)(n+2)(n+3)T^2}{(n-1)(n^2+5n+2)}, $$

is unbiased for $\theta_1 = (1-\rho)^2 \lambda_1^{-2}$.

(ii) The estimator $\hat{\theta}_{2U} = h_2(T)\, S_1^2/S_2$, where

$$ h_2(T) = \frac{-4(n^2+7n+8) + 2(n+1)(3n^2+19n+18)T - 2(n+1)^2(n+2)(n+3)T^2}{(n-1)(n^2+5n+2)}, $$

is unbiased for $\theta_2 = 2\rho(1-\rho)\lambda_2 \lambda_1^{-2}$.

Proof. Similarly to the proof of Proposition 3.1, the problem reduces to finding $c_0, c_1, c_2, d_0, d_1, d_2$ such that $E[(c_0 + c_1 T + c_2 T^2) S_1^2] = \theta_1$ and $E[(d_0 + d_1 T + d_2 T^2) S_1^2/S_2] = \theta_2$ for parts (i), (ii) respectively. Using Lemma 4.2 and equating the coefficients of the appropriate second degree polynomials in $\rho$, we obtain $\hat{\theta}_{1U}$, $\hat{\theta}_{2U}$ as unbiased estimators of $\theta_1$, $\theta_2$.

Corollary 3.2. The estimator $\hat{\theta}_U(y) = h_1(T)\, S_1^2 + y\, h_2(T)\, S_1^2/S_2$ is unbiased for $\theta(y)$.

The estimators $\hat{\theta}_{1U}$, $\hat{\theta}_{2U}$, and hence $\hat{\theta}_U(y)$, assume negative values with positive probability. As in the estimation problem of $\eta(y)$, we can improve on them by truncating $h_1$, $h_2$ in suitable intervals. Omitting the details, an estimator of $\theta_1 = (1-\rho)^2 \lambda_1^{-2}$ of the form $\phi(T)\, S_1^2$ satisfying $P[\phi(T) \notin [0, 1/\{(n+2)(n+3)\}]] > 0$ is dominated by $\phi^*(T)\, S_1^2$, where $\phi^*(T) = \max\{0, \min[\phi(T), 1/\{(n+2)(n+3)\}]\}$, whereas an estimator of $\theta_2 = 2\rho(1-\rho)\lambda_2 \lambda_1^{-2}$ of the form $\psi(T)\, S_1^2/S_2$ with $P[\psi(T) \notin [0, 2(n-2)/\{(n+2)(n+3)\}]] > 0$ is dominated by $\psi^*(T)\, S_1^2/S_2$, where $\psi^*(T) = \max\{0, \min[\psi(T), 2(n-2)/\{(n+2)(n+3)\}]\}$, provided $n \geq 6$. The functions $h_1$, $h_2$ satisfy the above conditions for $n \geq 3$, thus $\hat{\theta}_{1U}$, $\hat{\theta}_{2U}$ are dominated by suitable estimators.
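A sketch of the unbiased conditional-variance estimator of Corollary 3.2 (names ours; as discussed above, its truncated modification should be preferred in practice):

```python
def theta_hat_U(x, y_data, y0):
    """Unbiased estimator of theta(y0) = Var(X | Y = y0) (Corollary 3.2)."""
    n = len(x)
    s1, s2, t = sufficient_stats(x, y_data)
    d = (n - 1) * (n**2 + 5*n + 2)
    h1 = (4*(n + 5) - 4*(n + 1)*(n + 5)*t
          + (n + 1)*(n + 2)*(n + 3)*t**2) / d
    h2 = (-4*(n**2 + 7*n + 8) + 2*(n + 1)*(3*n**2 + 19*n + 18)*t
          - 2*(n + 1)**2*(n + 2)*(n + 3)*t**2) / d
    return h1 * s1**2 + y0 * h2 * s1**2 / s2
```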

4 Appendix

Lemma 4.1. Let $K_1, K_2, \ldots, K_n$ be a random sample from a geometric distribution with probability mass function

$$ \pi_1(k_1; \rho) = P(K_1 = k_1; \rho) = (1-\rho)\rho^{k_1}, \quad k_1 = 0, 1, 2, \ldots, \qquad (4.1) $$

and $K = \sum K_i$. Then,

(i) $P(K_1 = k_1 \mid K = k) = \dbinom{n-2+k-k_1}{k-k_1} \dbinom{n+k-1}{k}^{-1}$, $0 \leq k_1 \leq k$,

(ii) $P(K_1 = k_1, K_2 = k_2 \mid K = k) = \dbinom{n-3+k-k_1-k_2}{k-k_1-k_2} \dbinom{n+k-1}{k}^{-1}$, $0 \leq k_1, k_2$, $k_1 + k_2 \leq k$,

(iii) $E[(K_1+1)^2 \mid K = k] = 1 + \dfrac{3n+1}{n(n+1)}\, k + \dfrac{2}{n(n+1)}\, k^2$,

(iv) $E[(K_1+1)^2 (K_1+2)^2 \mid K = k] = 4\left( 1 + \dfrac{8n^2+13n+3}{n(n+1)(n+3)}\, k + \dfrac{19n^2+41n+18}{n(n+1)(n+2)(n+3)}\, k^2 + \dfrac{18}{n(n+2)(n+3)}\, k^3 + \dfrac{6}{n(n+1)(n+2)(n+3)}\, k^4 \right)$,

(v) $E[(K_1+1)^2 (K_2+1)^2 \mid K = k] = 1 + \dfrac{(2n+3)(3n+1)}{n(n+1)(n+3)}\, k + \dfrac{13n^2+29n+14}{n(n+1)(n+2)(n+3)}\, k^2 + \dfrac{12}{n(n+2)(n+3)}\, k^3 + \dfrac{4}{n(n+1)(n+2)(n+3)}\, k^4$,

(vi) $EK = \dfrac{n\rho}{1-\rho}$, $\quad EK^2 = \dfrac{n\rho(1+n\rho)}{(1-\rho)^2}$,

(vii) $E\left[ \dfrac{n+K}{n+K-1} \right] = \dfrac{n-\rho}{n-1}$.

Proof. Parts (i), (ii) are applications of Bayes' theorem, whereas parts (iii)-(vi) are straightforward. We will prove only part (vii). Since $K = \sum K_i$ follows a negative binomial distribution with probability mass function

$$ \pi_n(k; \rho) = \binom{n+k-1}{k} \rho^k (1-\rho)^n, \quad k = 0, 1, 2, \ldots, $$

one has

$$ \begin{aligned} E\left[ \frac{n+K}{n+K-1} \right] &= \sum_{k=0}^{\infty} \frac{n+k}{n+k-1} \binom{n+k-1}{k} \rho^k (1-\rho)^n = \sum_{k=0}^{\infty} (n+k)\, \frac{(n+k-2)!}{k!\,(n-1)!}\, \rho^k (1-\rho)^n \\ &= \frac{n(1-\rho)}{n-1} \sum_{k=0}^{\infty} \binom{(n-1)+k-1}{k} \rho^k (1-\rho)^{n-1} + \rho \sum_{k=0}^{\infty} \binom{n+k-1}{k} \rho^k (1-\rho)^n \\ &= \frac{n(1-\rho)}{n-1} + \rho = \frac{n-\rho}{n-1}. \end{aligned} $$
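Part (vii) is easy to check numerically; a quick sketch (NumPy's `negative_binomial` counts failures before the $n$-th success, which matches the convention for $K$ here):

```python
import numpy as np

rng = np.random.default_rng(2)
n, rho = 5, 0.7
# K = sum of n geometric(1 - rho) variables supported on {0, 1, 2, ...}.
k = rng.negative_binomial(n, 1.0 - rho, size=200_000)
print(np.mean((n + k) / (n + k - 1.0)))   # ~ (n - rho)/(n - 1) = 1.075
print((n - rho) / (n - 1))
```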

Lemma 4.2. Let $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$ be a random sample from (1.1), and $S_1 = \sum X_i$, $S_2 = \sum Y_i$, $T = \sum X_i Y_i/(S_1 S_2)$. Then,

(i) $E[S_1] = n \lambda_1^{-1}$,

(ii) $E[T S_1] = \left( 1 + \dfrac{n-1}{n+1}\, \rho \right) \lambda_1^{-1}$,

(iii) $E[S_1^2] = n(n+1)\, \lambda_1^{-2}$,

(iv) $E[T S_1^2] = \left( n+1 + \dfrac{(n-1)(n+2)}{n+1}\, \rho - \dfrac{n-1}{n+1}\, \rho^2 \right) \lambda_1^{-2}$,

(v) $E[T^2 S_1^2] = \left( \dfrac{n+3}{n+1} + \dfrac{2(n-1)(n+6)}{(n+1)(n+2)}\, \rho + \dfrac{(n-1)(n^2+n-18)}{(n+1)(n+2)(n+3)}\, \rho^2 \right) \lambda_1^{-2}$,

(vi) $E[S_1^2/S_2] = \left( \dfrac{n(n+1)}{n-1} - \dfrac{2(n+1)}{n-1}\, \rho + \dfrac{2}{n-1}\, \rho^2 \right) \lambda_2 \lambda_1^{-2}$,

(vii) $E[T S_1^2/S_2] = \left( \dfrac{n+1}{n-1} + \dfrac{n^2-2n-7}{n^2-1}\, \rho - \dfrac{2(n-3)}{n^2-1}\, \rho^2 \right) \lambda_2 \lambda_1^{-2}$,

(viii) $E[T^2 S_1^2/S_2] = \left( \dfrac{n+3}{n^2-1} + \dfrac{2(n^2+3n-16)}{(n^2-1)(n+2)}\, \rho + \dfrac{n^3-4n^2-27n+78}{(n^2-1)(n+2)(n+3)}\, \rho^2 \right) \lambda_2 \lambda_1^{-2}$,

(ix) $E[S_1^2/S_2^2] = \left( \dfrac{n(n+1)}{(n-1)(n-2)} - \dfrac{4(n+1)}{(n-1)(n-2)}\, \rho + \dfrac{6}{(n-1)(n-2)}\, \rho^2 \right) \lambda_2^2 \lambda_1^{-2}$,

(x) $E[T S_1^2/S_2^2] = \left( \dfrac{n+1}{(n-1)(n-2)} + \dfrac{n^2-5n-12}{(n^2-1)(n-2)}\, \rho - \dfrac{3(n-5)}{(n^2-1)(n-2)}\, \rho^2 \right) \lambda_2^2 \lambda_1^{-2}$,

(xi) $E[T^2 S_1^2/S_2^2] = \left( \dfrac{n+3}{(n^2-1)(n-2)} + \dfrac{2(n^2+n-26)}{(n^2-1)(n^2-4)}\, \rho + \dfrac{n^3-8n^2-27n+178}{(n^2-1)(n^2-4)(n+3)}\, \rho^2 \right) \lambda_2^2 \lambda_1^{-2}$.
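Before turning to the proof, a quick Monte Carlo check of part (v), reusing the earlier sampling helpers (a sketch; moderate replication counts give roughly two-digit agreement):

```python
import numpy as np

rng = np.random.default_rng(3)
n, rho, reps = 8, 0.4, 50_000
acc = 0.0
for _ in range(reps):
    x, y = downton_sample(n, 1.0, 1.0, rho, rng)
    s1, s2, t = sufficient_stats(x, y)
    acc += t**2 * s1**2
closed = ((n + 3) / (n + 1)
          + 2 * (n - 1) * (n + 6) * rho / ((n + 1) * (n + 2))
          + (n - 1) * (n**2 + n - 18) * rho**2 / ((n + 1) * (n + 2) * (n + 3)))
print(acc / reps, closed)   # empirical vs closed form with lambda_1 = 1
```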

Proof. The marginal distribution of $S_1$ is Gamma$(n, 1/\lambda_1)$, thus (i), (iii) are immediate. Of the rest, we will prove only (v), since the proofs of the other parts are similar. Let $\mathbf{K} = (K_1, \ldots, K_n)$ be a random sample from the geometric distribution (4.1) and set $U_i = X_i S_1^{-1}$, $V_i = Y_i S_2^{-1}$, $i = 1, \ldots, n$. Then

$$ \begin{aligned} E[T^2 \mid \mathbf{K} = \mathbf{k}] &= E\left[ \left( \sum_{i=1}^n U_i V_i \right)^2 \,\Big|\, \mathbf{K} = \mathbf{k} \right] = E\left[ \sum_{i=1}^n U_i^2 V_i^2 + \sum_{i=1}^n \sum_{\substack{j=1 \\ j \neq i}}^n U_i V_i U_j V_j \,\Big|\, \mathbf{K} = \mathbf{k} \right] \\ &= (n+k)^{-2} (n+k+1)^{-2} \left[ \sum_{i=1}^n (k_i+1)^2 (k_i+2)^2 + \sum_{i=1}^n \sum_{\substack{j=1 \\ j \neq i}}^n (k_i+1)^2 (k_j+1)^2 \right], \\ E[S_1^2 \mid \mathbf{K} = \mathbf{k}] &= (n+k)(n+k+1)(1-\rho)^2 \lambda_1^{-2}, \end{aligned} $$

yielding

$$ \begin{aligned} E[T^2 S_1^2] &= E[E(T^2 S_1^2 \mid \mathbf{K})] = E[E(T^2 \mid \mathbf{K})\, E(S_1^2 \mid \mathbf{K})] \\ &= E\left\{ E\left[ \frac{n(K_1+1)^2(K_1+2)^2 + n(n-1)(K_1+1)^2(K_2+1)^2}{(n+K)(n+K+1)} \,\Big|\, K \right] \right\} \left( \frac{1-\rho}{\lambda_1} \right)^2 \\ &= E\left[ \frac{n+3}{n+1} + \frac{4(n+5)}{(n+1)(n+3)}\, K + \frac{4(n+5)}{(n+1)(n+2)(n+3)}\, K^2 \right] \left( \frac{1-\rho}{\lambda_1} \right)^2. \end{aligned} $$

Here the last equality follows from Lemma 4.1(iv), (v). Substituting the moments of $K$ from Lemma 4.1(vi) into the last expression, we obtain the desired result.

Acknowledgment

The author wishes to thank the referees for their suggestions which improved the results and the presentation of the paper.

References

Al-Saadi, S. D., Scrimshaw, D. G. and Young, D. H. (1979). Tests for independence of exponential variables. J. Statist. Comput. Simul., 9, 217-233.

Al-Saadi, S. D. and Young, D. H. (1980). Estimators for the correlation coefficient in a bivariate exponential distribution. J. Statist. Comput. Simul., 11, 13-20.

Al-Saadi, S. D. and Young, D. H. (1982). A test for independence in a multivariate exponential distribution with equal correlation coefficient. J. Statist. Comput. Simul., 14, 219-227.

Balakrishnan, N. and Ng, H. K. T. (2001). Improved estimation of the correlation coefficient in a bivariate exponential distribution. J. Statist. Comput. Simul., 68, 173-184.

Downton, F. (1970). Bivariate exponential distributions in reliability theory. J. Roy. Statist. Soc. B, 32, 408-417.

Gelfand, A. E. and Dey, D. K. (1988). On the estimation of a variance ratio. J. Statist. Plann. Inference, 19, 121-131.

Ghosh, M. and Kundu, S. (1996). Decision theoretic estimation of the variance ratio. Statist. Decisions, 14, 161-175.

Iliopoulos, G. (2001). Decision theoretic estimation of the ratio of variances in a bivariate normal distribution. Ann. Inst. Statist. Math., 53, 436-446.

Kibble, W. F. (1941). A two-variate gamma type distribution. Sankhyā, 5, 137-150.

Kotz, S., Balakrishnan, N. and Johnson, N. L. (2000). Continuous Multivariate Distributions, Vol. 1, Second edition. Wiley, New York.

Kubokawa, T. (1994). Double shrinkage estimation of ratio of scale parameters. Ann. Inst. Statist. Math., 46, 95-116.

Kubokawa, T. and Srivastava, M. S. (1996). Double shrinkage estimators of ratio of variances. In Multidimensional Statistical Analysis and Theory of Random Matrices (eds. A. K. Gupta and V. L. Girko), 139-154. VSP, Netherlands.

Madi, M. T. (1995). On the invariant estimation of a normal variance ratio. J. Statist. Plann. Inference, 44, 349-357.

Madi, M. T. and Tsui, K. W. (1990). Estimation of the ratio of the scale parameters of two exponential distributions with unknown location parameters. Ann. Inst. Statist. Math., 42, 77-87.

Moran, P. A. P. (1967). Testing for correlation between non-negative variates. Biometrika, 54, 385-394.

Nagao, M. and Kadoya, M. (1971). Two-variate exponential distribution and its numerical table for engineering application. Bulletin of the Disaster Prevention Research Institute, Kyoto University, 20, 183-215.

Stein, C. (1964). Inadmissibility of the usual estimator for the variance of a normal distribution with unknown mean. Ann. Inst. Statist. Math., 16, 155-160.