International Mathematical Forum, 3, 2008, no. 35, 1713-1725

Statistical Inference Using Progressively Type-II Censored Data with Random Scheme

Ammar M. Sarhan¹ and A. Abuammoh

Department of Statistics and O.R., Faculty of Science
King Saud University, P.O. Box 2455, Riyadh 11451, Saudi Arabia
asarhan@yahoo.com, abuammoh@ksu.edu.sa

Abstract

Statistical inference for an exponential distribution, based on progressively Type-II censored data with a random censoring scheme, is discussed in this paper. Maximum likelihood and Bayes procedures are used to derive both point and interval estimates of the parameters of the model. A Monte Carlo simulation method is used to generate progressively Type-II censored data from the exponential distribution; these data are then used to compute point and interval estimates of the parameters and to compare the two methods under different random schemes.

Keywords: Maximum likelihood procedure, Bayes procedure, exponential distribution

1 Introduction

There are several situations in life testing, reliability experiments, and survival analysis in which units are lost or removed from the experiment while they are still alive. The loss may occur either accidentally or by design. The accidental case can happen when an individual under study (testing) drops out, for example. The preassigned case may arise because of limited funds, or to save time and cost, etc. For more details we refer to Balakrishnan and Aggarwala (2000) and the references therein. In such situations, progressive censoring schemes take place. Recently, the estimation of parameters of different lifetime distributions based on progressively Type-II censored samples has been studied by several

¹ Home address: Department of Mathematics, Faculty of Science, Mansoura University, Mansoura 33516, Egypt. E-mail: ammar@mans.edu.eg
authors, including Childs and Balakrishnan (2000), Balakrishnan and Kannan (2001), Mousa and Jaheen (2002), Ng et al. (2002), Balakrishnan et al. (2003) and Soliman (2005). In some reliability experiments, however, the number of units dropped from the experiment cannot be pre-fixed; it is random. In such situations, progressive censoring schemes with random removals are needed. In this paper, we use progressively Type-II censored data with random removals. We assume that the lifetimes of the units tested are exponentially distributed. We derive both point and interval estimates of the unknown parameters using: (1) the maximum likelihood method; and (2) the Bayes method.

The rest of this paper is organized as follows. Section 2 presents the progressively Type-II scheme with random removals and the likelihood function. The maximum likelihood procedure is discussed in Section 3 and the Bayes procedure in Section 4. Numerical studies and conclusions are presented in Section 5.

2 The Model

Let the random variable X have an exponential distribution with parameter θ. The probability density function of X takes the following form

f(x) = θ exp{−θx},  x ≥ 0, θ > 0.  (2.1)

The survival function of X is

S(x) = exp{−θx},  x ≥ 0, θ > 0.  (2.2)

Let (X_1, R_1), (X_2, R_2), …, (X_m, R_m) denote a progressively Type-II censored sample, where X_1 < X_2 < ⋯ < X_m. With a pre-determined number of removals, say R_1 = r_1, R_2 = r_2, …, R_m = r_m, the conditional likelihood function can be written as, Cohen (1963),

L(θ; x | R = r) = c ∏_{i=1}^{m} f(x_i) [S(x_i)]^{r_i},  (2.3)

where c = n(n − r_1 − 1)(n − r_1 − r_2 − 2) ⋯ (n − r_1 − ⋯ − r_{m−1} − m + 1), and 0 ≤ r_i ≤ n − m − r_1 − ⋯ − r_{i−1}, for i = 1, 2, …, m − 1. Substituting (2.1) and (2.2) into (2.3), we get

L(θ; x | R = r) = c θ^m exp{−θ Σ_{i=1}^{m} (1 + r_i) x_i}.  (2.4)
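As a quick numerical check, the kernel of (2.4) can be evaluated directly. The sketch below (Python; the sample values are hypothetical, chosen only for illustration) computes the conditional log-likelihood up to the additive constant ln c.

```python
import math

def log_lik_theta(theta, x, r):
    # Log of (2.4) without the constant term ln(c):
    #   m * ln(theta) - theta * sum_{i=1}^{m} (1 + r_i) * x_i
    m = len(x)
    total = sum((1 + ri) * xi for xi, ri in zip(x, r))
    return m * math.log(theta) - theta * total

# Hypothetical progressively censored sample (x_i, r_i), illustration only.
x = [0.5, 1.2, 2.0, 3.1, 4.4]   # ordered failure times
r = [1, 0, 2, 0, 1]             # removals at each failure
print(log_lik_theta(0.25, x, r))
```

Maximizing this function over θ reproduces the closed-form MLE derived in Section 3.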
Suppose that an individual unit being removed from the test at the i-th failure, i = 1, 2, …, m − 1, is independent of the others but with the same probability p. That is, the number R_i of units removed at the i-th failure follows a binomial distribution with parameters n − m − Σ_{l=1}^{i−1} r_l and p. Therefore,

P(R_1 = r_1) = C(n − m, r_1) p^{r_1} (1 − p)^{n−m−r_1},  (2.5)

and, for i = 2, 3, …, m − 1,

P(R_i = r_i | R_{i−1} = r_{i−1}, …, R_1 = r_1) = C(n − m − Σ_{l=1}^{i−1} r_l, r_i) p^{r_i} (1 − p)^{n−m−Σ_{l=1}^{i} r_l}.  (2.6)

Now, we further suppose that R_i is independent of X_i for all i. Then the full likelihood function takes the following form

L(θ, p; x, r) = L(θ; x | R = r) P(R = r),  (2.7)

where

P(R = r) = P(R_1 = r_1) P(R_2 = r_2 | R_1 = r_1) P(R_3 = r_3 | R_2 = r_2, R_1 = r_1) ⋯ P(R_{m−1} = r_{m−1} | R_{m−2} = r_{m−2}, …, R_1 = r_1).  (2.8)

Substituting (2.5) and (2.6) into (2.8), we get

P(R = r) = [(n − m)! / ((n − m − Σ_{i=1}^{m−1} r_i)! ∏_{i=1}^{m−1} r_i!)] p^{Σ_{i=1}^{m−1} r_i} (1 − p)^{(m−1)(n−m) − Σ_{i=1}^{m−1} (m−i) r_i}.  (2.9)

Now, using (2.4), (2.7) and (2.9), we can write the full likelihood function in the following form

L(θ, p; x, r) = A L_1(θ) L_2(p),  (2.10)

where A = c (n − m)! / [(n − m − Σ_{i=1}^{m−1} r_i)! ∏_{i=1}^{m−1} r_i!] does not depend on the parameters θ and p, and

L_1(θ) = θ^m exp{−θ Σ_{i=1}^{m} (1 + r_i) x_i},  (2.11)

L_2(p) = p^{Σ_{i=1}^{m−1} r_i} (1 − p)^{(m−1)(n−m) − Σ_{i=1}^{m−1} (m−i) r_i}.  (2.12)

3 Maximum Likelihood Estimation

This section discusses the process of obtaining the maximum likelihood estimates of the parameters θ and p based on progressively Type-II censored data with binomial removals. Both point and interval estimates of the parameters are derived.
3.1 Point Estimation

Since L_1 does not involve p, the maximum likelihood estimator (MLE) of θ can be derived by maximizing (2.11) directly. The log-likelihood function of L_1 takes the following form

ℓ_1(θ) = m ln θ − θ Σ_{i=1}^{m} (1 + r_i) x_i.  (3.1)

The first partial derivative of ℓ_1(θ) with respect to θ is

∂ℓ_1(θ)/∂θ = m/θ − Σ_{i=1}^{m} (1 + r_i) x_i.  (3.2)

Setting ∂ℓ_1(θ)/∂θ = 0, we get the likelihood equation for θ. Solving it with respect to θ, the MLE of θ is given by

θ̂_MLE = m / Σ_{i=1}^{m} (1 + r_i) x_i.  (3.3)

Similarly, since L_2 does not involve θ, the maximum likelihood estimator of p can be derived by maximizing (2.12) directly. The log-likelihood function of L_2 takes the following form

ℓ_2(p) = (Σ_{i=1}^{m−1} r_i) ln p + [(m−1)(n−m) − Σ_{i=1}^{m−1} (m−i) r_i] ln(1 − p).  (3.4)

The first partial derivative of ℓ_2(p) with respect to p is

∂ℓ_2(p)/∂p = (Σ_{i=1}^{m−1} r_i)/p − [(m−1)(n−m) − Σ_{i=1}^{m−1} (m−i) r_i]/(1 − p).  (3.5)

Setting ∂ℓ_2(p)/∂p = 0, we get the likelihood equation for p. Solving it with respect to p, the MLE of p takes the following form

p̂_MLE = Σ_{i=1}^{m−1} r_i / [(m−1)(n−m) − Σ_{i=1}^{m−1} (m−1−i) r_i].  (3.6)

3.2 Interval Estimation

Approximate confidence intervals for the parameters θ and p, based on the asymptotic distributions of their MLEs, are derived in this subsection. For the observed information matrix of θ and p, we find
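The closed-form estimates (3.3) and (3.6) are straightforward to compute. The sketch below uses a small hypothetical sample (x_i, r_i), not the data analysed later in the paper.

```python
# Sketch: MLEs of theta and p from a progressively Type-II censored
# sample with binomial removals, per equations (3.3) and (3.6).
# The sample below is hypothetical, for illustration only.

def mle_theta(x, r):
    # theta_hat = m / sum_{i=1}^{m} (1 + r_i) x_i        -- eq. (3.3)
    return len(x) / sum((1 + ri) * xi for xi, ri in zip(x, r))

def mle_p(r, n):
    # p_hat = sum_{i=1}^{m-1} r_i /
    #         [(m-1)(n-m) - sum_{i=1}^{m-1} (m-1-i) r_i]  -- eq. (3.6)
    m = len(r)
    num = sum(r[:-1])  # removals at the first m-1 failures
    den = (m - 1) * (n - m) - sum((m - 1 - (i + 1)) * r[i] for i in range(m - 1))
    return num / den

x = [0.5, 1.2, 2.0, 3.1, 4.4]   # ordered failure times (hypothetical)
r = [1, 0, 2, 0, 1]             # removals at each failure
n = 9                           # total units on test: m + sum(r) = 5 + 4

theta_hat = mle_theta(x, r)
p_hat = mle_p(r, n)
```

Note that both estimators depend on the data only through the weighted total time on test Σ(1 + r_i)x_i and the removal counts.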
−∂²ℓ(θ, p)/∂θ² = m/θ² = A_1,

−∂²ℓ(θ, p)/∂θ∂p = 0,

−∂²ℓ(θ, p)/∂p² = (Σ_{i=1}^{m−1} r_i)/p² + [(m−1)(n−m) − Σ_{i=1}^{m−1} (m−i) r_i]/(1 − p)² = A_2.

Then the observed information matrix is given by

I = ( I_11  I_12 ) = ( A_1  0  )
    ( I_21  I_22 )   ( 0    A_2 ),

so that the variance-covariance matrix may be approximated as

V = ( V_11  V_12 ) = I⁻¹ = ( 1/A_1  0     )
    ( V_21  V_22 )         ( 0      1/A_2 ).

It is known that the asymptotic distribution of the MLE (θ̂, p̂) is given by, see Miller (1981),

(θ̂, p̂)′ ∼ N( (θ, p)′, V ),  V = ( V_11  V_12; V_21  V_22 ).  (3.7)

Since V involves the parameters θ and p, we replace the parameters by the corresponding MLEs in order to obtain an estimate of V, which is denoted by

V̂ = ( 1/Â_1  0 ; 0  1/Â_2 ).

By using (3.7), approximate 100(1−α)% confidence intervals for θ and p are determined, respectively, as

[ θ̂ − z_{α/2}/√Â_1 , θ̂ + z_{α/2}/√Â_1 ]  and  [ p̂ − z_{α/2}/√Â_2 , p̂ + z_{α/2}/√Â_2 ],  (3.8)

where z_α is the upper α-th percentile of the standard normal distribution.

4 Bayes Procedure

In this section, we use the Bayes procedure to derive point and interval estimates of the parameters θ and p based on progressively Type-II censored data with binomial removals. For this purpose, we need the following additional assumptions:
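The intervals in (3.8) can be computed with the standard library alone; `statistics.NormalDist` supplies the normal quantile z_{α/2}. The sketch below evaluates Â_1 and Â_2 at the MLEs (all inputs hypothetical).

```python
from statistics import NormalDist

def ci_theta(theta_hat, m, alpha=0.05):
    # A1_hat = m / theta_hat^2; half-width = z_{alpha/2} / sqrt(A1_hat)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half = z / (m / theta_hat ** 2) ** 0.5
    return theta_hat - half, theta_hat + half

def ci_p(p_hat, r, n, alpha=0.05):
    # A2_hat = sum r_i / p^2 + [(m-1)(n-m) - sum (m-i) r_i] / (1-p)^2
    m = len(r)
    s = sum(r[:-1])
    t = (m - 1) * (n - m) - sum((m - (i + 1)) * r[i] for i in range(m - 1))
    a2 = s / p_hat ** 2 + t / (1 - p_hat) ** 2
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half = z / a2 ** 0.5
    return p_hat - half, p_hat + half
```

For p near 0 or 1, or small Σr_i, the normal approximation can produce limits outside (0, 1); truncating to the unit interval is a common pragmatic fix.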
B.1) The parameters θ and p behave as independent random variables.

B.2) The random variable θ has an exponential prior distribution with known parameter κ. Namely, the prior pdf of θ takes the following form

g_1(θ) = κ exp{−κθ},  κ > 0, θ > 0.  (4.1)

The parameter p has a beta prior distribution with known parameters α and β. That is, the prior pdf of p is given by

g_2(p) = [1/B(α, β)] p^{α−1} (1 − p)^{β−1},  0 < p < 1, α, β > 0.  (4.2)

B.3) The loss function is

l((θ, p), (θ̂, p̂)) = ε_1 (θ − θ̂)² + ε_2 (p − p̂)²,  ε_1, ε_2 > 0.  (4.3)

Based on the assumptions (B.1) and (B.2), the joint prior pdf of (θ, p) is

g(θ, p) = g_1(θ) g_2(p) = [κ/B(α, β)] p^{α−1} (1 − p)^{β−1} exp{−κθ},  θ > 0, 0 < p < 1.  (4.4)

Given the available observations, when the joint prior pdf is (4.4), the joint posterior pdf of (θ, p) is

π(θ, p | x, r) = (θ^m / J) exp{−θκ*} p^{α*−1} (1 − p)^{β*−1},  θ > 0, 0 < p < 1,  (4.5)

where κ* = κ + Σ_{i=1}^{m} (1 + r_i) x_i, α* = α + Σ_{i=1}^{m−1} r_i, β* = β + (m−1)(n−m) − Σ_{i=1}^{m−1} (m−i) r_i, and J = Γ(m+1) B(α*, β*) / (κ*)^{m+1}.

Therefore, the marginal posterior pdfs of θ and p are given respectively by

π_1(θ | x, r) = [(κ*)^{m+1} / Γ(m+1)] θ^m exp{−θκ*},  θ > 0,  (4.6)

and

π_2(p | x, r) = [1/B(α*, β*)] p^{α*−1} (1 − p)^{β*−1},  0 < p < 1.  (4.7)

Note that the posterior distribution of θ is gamma with parameters m+1 and κ*, while the posterior distribution of p is beta with parameters α* and β*.
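The posterior hyperparameters κ*, α*, β* in (4.5) depend on the data only through simple sums, so they are cheap to compute. A sketch (function name and the prior values in the test are hypothetical):

```python
def posterior_hyperparams(x, r, n, kappa, alpha, beta):
    # kappa* = kappa + sum_{i=1}^{m}   (1 + r_i) x_i
    # alpha* = alpha + sum_{i=1}^{m-1} r_i
    # beta*  = beta  + (m-1)(n-m) - sum_{i=1}^{m-1} (m-i) r_i
    m = len(x)
    kappa_s = kappa + sum((1 + ri) * xi for xi, ri in zip(x, r))
    alpha_s = alpha + sum(r[:-1])
    beta_s = beta + (m - 1) * (n - m) - sum((m - (i + 1)) * r[i] for i in range(m - 1))
    return kappa_s, alpha_s, beta_s
```

With these three numbers in hand, the posterior of θ is Gamma(m+1, κ*) and the posterior of p is Beta(α*, β*), as noted above.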
4.1 Point Estimators

Under the squared error loss function, the Bayes estimator and its associated minimum posterior risk are the posterior mean and variance, respectively; see Martz and Waller (1982). Therefore, under assumption (B.3), the Bayes estimates of θ and p, say θ̃ and p̃, and the associated minimum posterior risks, say R(θ̃) and R(p̃), are given as follows:

θ̃ = ∫₀^∞ θ π_1(θ | x, r) dθ,  R(θ̃) = ∫₀^∞ θ² π_1(θ | x, r) dθ − [θ̃]²,  (4.8)

and

p̃ = ∫₀¹ p π_2(p | x, r) dp,  R(p̃) = ∫₀¹ p² π_2(p | x, r) dp − [p̃]².  (4.9)

From (4.6)-(4.9), we get

θ̃ = (m+1) / [κ + Σ_{i=1}^{m} (1 + r_i) x_i],  R(θ̃) = (m+1) / [κ + Σ_{i=1}^{m} (1 + r_i) x_i]²,  (4.10)

and

p̃ = α*/(α* + β*),  R(p̃) = α* β* / [(α* + β*)² (α* + β* + 1)],  (4.11)

where α* = α + Σ_{i=1}^{m−1} r_i and β* = β + (m−1)(n−m) − Σ_{i=1}^{m−1} (m−i) r_i.

4.2 Two-sided Bayes Probability Interval (TBPI)

The Bayesian approach to interval estimation is much more direct than the maximum likelihood approach. Once the marginal posterior pdf of θ has been obtained, a symmetric 100(1−ϑ)% two-sided Bayes probability interval [100(1−ϑ)% TBPI] estimate of θ, denoted [θ_L, θ_U], can be obtained by solving the following two equations, see Martz and Waller (1982):

∫₀^{θ_L} π_1(θ | x, r) dθ = ϑ/2,  ∫_{θ_U}^∞ π_1(θ | x, r) dθ = ϑ/2,  (4.12)

for the limits θ_L and θ_U. Similarly, a symmetric 100(1−ϑ)% TBPI estimate of p, denoted [p_L, p_U], can be obtained by solving the following two equations:

∫₀^{p_L} π_2(p | x, r) dp = ϑ/2,  ∫_{p_U}^1 π_2(p | x, r) dp = ϑ/2,  (4.13)

for the limits p_L and p_U. Neither of the two systems (4.12) and (4.13) has an analytical solution, so we have to use a mathematical package to obtain numerical solutions. We used Matlab 7 to solve these two systems.
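The point estimates (4.10)-(4.11) are closed-form, so no numerical integration is needed for them; only the TBPI limits require quantile routines. A sketch taking the posterior hyperparameters as inputs (the function name is hypothetical):

```python
def bayes_point_estimates(m, kappa_s, alpha_s, beta_s):
    # Posterior: theta ~ Gamma(m+1, kappa*),  p ~ Beta(alpha*, beta*).
    # Under squared error loss the Bayes estimates are the posterior
    # means and the minimum posterior risks are the posterior variances.
    theta_be = (m + 1) / kappa_s                      # (4.10)
    theta_risk = (m + 1) / kappa_s ** 2
    p_be = alpha_s / (alpha_s + beta_s)               # (4.11)
    p_risk = alpha_s * beta_s / ((alpha_s + beta_s) ** 2 * (alpha_s + beta_s + 1))
    return theta_be, theta_risk, p_be, p_risk
```

The TBPI limits in (4.12)-(4.13) are simply the ϑ/2 and 1 − ϑ/2 quantiles of these gamma and beta posteriors; outside Matlab, functions such as `scipy.stats.gamma.ppf` and `scipy.stats.beta.ppf` would serve the same purpose.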
5 Data Analysis

Example 1: In this example, we generate two progressively Type-II censored random samples (one small and one large) from the exponential distribution. The following algorithm is used to obtain such a sample.

1. Specify the value of n.
2. Specify the value of m.
3. Specify the values of the parameters θ and p.
4. Generate a random sample of size m from Exp(θ) and sort it.
5. Generate a random number r_1 from bino(n − m, p).
6. Generate a random number r_i from bino(n − m − Σ_{l=1}^{i−1} r_l, p), for each i, i = 2, 3, …, m − 1.
7. Set r_m according to the following relation

r_m = n − m − Σ_{l=1}^{m−1} r_l,  if n − m − Σ_{l=1}^{m−1} r_l > 0;  r_m = 0, otherwise.

In these two samples, we assumed that the exact values of θ and p are 0.3 and 0.2, respectively. In the first sample, n = 15 and m = 12, while in the second sample n = 30 and m = 12. The samples obtained are given as follows.

The first sample: (0.278,1), (2.9,1), (6.352,1), (8.286,0), (18.325,0), (19.332,0), (20.333,0), (24.727,0), (25.717,0), (25.877,0), (41.47,0), (84.676,0)

The second sample: (7.212,3), (12.473,2), (12.643,3), (20.369,0), (22.458,2), (35.462,1), (35.949,4), (45.429,1), (50.923,0), (53.898,1), (56.252,0), (56.5,1)

We used the above two samples to compute: (1) the MLE and the Bayes estimate (BE) of each parameter; (2) the 95% C.I. and 95% TBPI for each parameter; (3) the variance associated with the MLE and the minimum Bayes risk associated with the BE of each parameter; and (4) the percentage error (PE) associated with each estimate. The PE associated with the estimate φ̂ of the parameter φ is computed by the following relation

PE_φ = |φ̂ − φ_Exact| / φ_Exact × 100.  (5.1)

The results obtained are given in Tables 1 and 2.
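The generation steps of Example 1 can be sketched as follows (Python, standard library only; the helper name is hypothetical, and binomial draws are taken as sums of Bernoulli trials since the paper's Matlab `bino` routine is not available here).

```python
import random

def progressive_sample(n, m, theta, p, rng=None):
    # Sketch of steps 4-7 of the algorithm in Example 1.
    rng = rng or random.Random()

    def binom(trials, prob):
        # Binomial draw via a Bernoulli sum (stdlib only).
        return sum(rng.random() < prob for _ in range(trials))

    x = sorted(rng.expovariate(theta) for _ in range(m))   # step 4
    r, left = [], n - m
    for _ in range(m - 1):                                 # steps 5-6
        ri = binom(left, p)
        r.append(ri)
        left -= ri
    r.append(left if left > 0 else 0)                      # step 7
    return list(zip(x, r))
```

By construction the removals always sum to n − m, so every unit on test is accounted for at the end of the experiment.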
Table 1: The results obtained using the maximum likelihood procedure.

                       Point estimate   Var          PE        95% C.I.
Sample 1   θ           0.042            1.467×10⁻⁴   11.9551   [0.0182, 0.0657]
           p           0.5              0.0417       150.      [0.0999, 0.9001]
Sample 2   θ           0.0137           1.577×10⁻⁵   16.2713   [0.006, 0.0215]
           p           0.191            0.0017       4.4944    [0.1093, 0.2727]

Table 2: The results obtained using the Bayes procedure.

                       Point estimate   R            PE        95% TBPI
Sample 1   θ           0.0407           1.2747×10⁻⁴  1.773     [0.0217, 0.0656]
           p           0.2764           0.0342       38.1822   [0.1475, 0.8413]
Sample 2   θ           0.0143           1.5788×10⁻⁵  15.6735   [0.0076, 0.0231]
           p           0.384            0.0017       8.887     [0.1173, 0.2788]

Figures 1 and 2 show the prior and posterior pdfs of the parameters θ and p, respectively, for sample 1. Figures 3 and 4 show the prior and posterior pdfs of the parameters θ and p, respectively, for sample 2.

Based on the results shown in Tables 1 and 2, one can conclude that:

1. For sample 1, the Bayes method provides better estimates of θ and p, in the sense of having smaller PE.

2. For sample 2, the maximum likelihood method provides a better estimate of p, while the Bayes method provides a better estimate of θ.

Figure 1. The prior and posterior pdfs of θ.  Figure 2. The prior and posterior pdfs of p.
Figure 3. The prior and posterior pdfs of θ.  Figure 4. The prior and posterior pdfs of p.

Example 2: In this example, a large simulation study was carried out to investigate the properties of the estimates of the parameter θ. For this purpose, we use the following algorithm.

1. Specify the values of the model parameters θ and p.
2. Determine the value of the sample size n.
3. Determine the percentage of observed units, say per., which together with n determines the value of m according to the relation m = round(n × per.).
4. Generate progressively Type-II censored data from the model, using the algorithm illustrated in Example 1, say (X_1, R_1), …, (X_m, R_m).
5. Use the sample obtained in the previous step to compute:
   (a) the MLE of θ and the associated variance, say Var(θ̂);
   (b) the BE of θ and the associated minimum Bayes risk, say R(θ̃);
   (c) the (1−α)100% C.I. of the parameter θ;
   (d) the (1−α)100% TBPI of the parameter θ.
6. Repeat steps 4-5 1000 times.
7. Compute:
   (a) the mean squared error associated with each estimate according to the following relation

MSE_θ = (1/1000) Σ_{i=1}^{1000} (θ_Exact − θ_Estimated,i)²,
where θ, θ_Exact and θ_Estimated,i denote the unknown parameter, its exact value and its estimated value from sample i, respectively;
   (b) the means of the variance and of the minimum Bayes risk for each estimate according to the following relations

Var̄(θ̂) = (1/1000) Σ_{i=1}^{1000} Var_i(θ̂),  R̄(θ̃) = (1/1000) Σ_{i=1}^{1000} R_i(θ̃),

where Var_i(θ̂) is the variance of the MLE of θ and R_i(θ̃) is the minimum Bayes risk associated with the BE of θ, computed based on sample i, i = 1, 2, …, 1000;
   (c) the coverage probability for both types of confidence interval.
8. Steps 3-7 are carried out for n = 30, 40, …, 100 and for per. = 90%, 80%, 60% and 40%.

Table 3 gives the results obtained for the parameter θ. Based on the results shown in Table 3, one can conclude that:

1. The MSE associated with the MLE of the parameter θ decreases as the sample size n increases.

2. The MSE associated with the BE of the parameter θ decreases as the sample size n increases.

3. The value of Var̄(θ̂) increases when per. decreases.

4. Both CP_C.I and CP_TBPI approach the nominal confidence level when the percentage of censoring becomes small (m is large).

Table 3: The results for the parameter θ.
n    per.  MSE_θ̂      MSE_θ̃      Var̄(θ̂)     R̄(θ̃)       CP_C.I     CP_TBPI
30   90%   0.37785    0.339333   0.35842    0.343184   0.948571   0.958571
30   80%   0.395284   0.35975    0.39817    0.372581   0.9325     0.94125
30   60%   0.58436    0.46377    0.44856    0.423715   0.8675     0.8975
30   40%   0.13371    0.121555   0.46396    0.38312    0.534      0.64
40   90%   0.27655    0.259184   0.267877   0.25951    0.947143   0.944286
40   80%   0.36244    0.285465   0.29352    0.283374   0.945      0.9475
40   60%   0.358942   0.332587   0.34377    0.329575   0.895      0.92625
40   40%   0.867374   0.81673    0.364772   0.347628   0.653      0.713
50   90%   0.2977     0.199547   0.211496   0.26338    0.952857   0.957143
50   80%   0.22131    0.28898    0.2284     0.2227     0.95125    0.95625
50   60%   0.337314   0.315672   0.286554   0.27674    0.9125     0.915
50   40%   0.699889   0.65321    0.3735     0.295482   0.679      0.738
60   90%   0.14935    0.14349    0.172969   0.169556   0.968571   0.97
60   80%   0.18816    0.1815     0.188636   0.184528   0.9375     0.94
60   60%   0.267655   0.25317    0.238284   0.231576   0.9875     0.9237
60   40%   0.563292   0.53353    0.27828    0.268698   0.733      0.775
70   90%   0.141545   0.136959   0.146272   0.143829   0.954286   0.957143
70   80%   0.172746   0.166127   0.1674     0.163799   0.9425     0.9475
70   60%   0.227481   0.21719    0.2438     0.199425   0.9625     0.92
70   40%   0.456354   0.431254   0.258765   0.2555     0.785      0.82
80   90%   0.12639    0.117137   0.128975   0.12785    0.964286   0.965714
80   80%   0.139311   0.13472    0.14433    0.141924   0.9525     0.95375
80   60%   0.19165    0.183389   0.18892    0.17799    0.91875    0.9275
80   40%   0.365676   0.347558   0.23124    0.224835   0.816      0.846
90   90%   0.1873     0.15949    0.112954   0.11155    0.947143   0.954286
90   80%   0.128749   0.125122   0.126559   0.124731   0.9275     0.94125
90   60%   0.16944    0.16314    0.161386   0.158384   0.91625    0.925
90   40%   0.312487   0.297931   0.213173   0.27769    0.838      0.869
100  90%   0.14254    0.1187     0.12837    0.11637    0.947143   0.955714
100  80%   0.1995     0.16315    0.113928   0.112453   0.95625    0.96
100  60%   0.15134    0.14526    0.146889   0.144411   0.935      0.95125
100  40%   0.28481    0.273151   0.1915     0.185895   0.835      0.86

Acknowledgements

This research was supported by the Research Center, College of Science, King Saud University, under project number Stat/28/6.

References

Balakrishnan, N., Aggarwala, R., 2000. Progressive Censoring: Theory, Methods, and Applications, Birkhauser, Boston, Berlin.
Balakrishnan, N., Kannan, N., 2001. Point and interval estimation for parameters of the logistic distribution based on progressively Type-II censored samples, in Handbook of Statistics, N. Balakrishnan and C.R. Rao, Eds., Amsterdam: North-Holland, 20, 431-456.

Balakrishnan, N., Kannan, N., Lin, C.T., Ng, H., 2003. Point and interval estimation for Gaussian distribution based on progressively Type-II censored samples, IEEE Trans. Reliab., 52, 90-95.

Balakrishnan, N., Sandhu, R.A., 1995. A simple simulational algorithm for generating progressive Type-II censored samples, The American Statistician, 49, 229-230.

Carbone, P., Kellerthouse, L., Gehan, E., 1967. Plasmacytic myeloma: A study of the relationship of survival to various clinical manifestations and anomalous protein type in 112 patients, American Journal of Medicine, 42, 937-948.

Childs, A., Balakrishnan, N., 2000. Conditional inference procedures for the Laplace distribution when the observed samples are progressively censored, Metrika, 52, 253-265.

Cohen, A.C., 1963. Progressively censored samples in life testing, Technometrics, 5, 327-339.

Mousa, M., Jaheen, Z., 2002. Statistical inference for the Burr model based on progressively censored data, Computers & Mathematics with Applications, 43, 1441-1449.

Ng, K., Chan, P.S., Balakrishnan, N., 2002. Estimation of parameters from progressively censored data using EM algorithm, Computational Statistics and Data Analysis, 39, 371-386.

Sarhan, A.M., Al-Asbahi, I., 2008. Statistical inference for the linear exponential model using progressively censored data, Conference in Statistics and Economics, 25-26 March 2008, Cairo, Egypt.

Received: May 27, 2008