Investigating the Use of Stratified Percentile Ranked Set Sampling Method for Estimating the Population Mean

royecciones Journal of Mathematics Vol. 30, N o 3, pp. 351-368, December 011. Universidad Católica del Norte Antofagasta - Chile Investigating the Use of Stratified ercentile Ranked Set Sampling Method for Estimating the opulation Mean AMER IBRAHIM AL OMARI AL AL-BAYT UNIVERSITY, JORDAN KAMARULZAMAN IBRAHIM UNIVERSITY KEBANGSAAN MALAYSIA, MALAYSIA and MAHMOUD IBRAHIM SYAM UNIVERSITY KEBANGSAAN MALAYSIA, MALAYSIA Received : December 010. Accepted : September 011 Abstract Stratified percentile ranked set sampling (SRSS) method is suggested for estimating the population mean. The SRSS is compared with the simple random sampling (SRS), stratified simple random sampling (SSRS) and stratified ranked set sampling (SRSS). It is shown that SRSS estimator is an unbiased estimator of the population mean of symmetric distributions and is more efficient than its counterparts using SRS, SSRS and SRSS based on the same number of measured units. Keywords : Simple random sampling; ranked set sampling; percentile ranked set sampling; efficiency; stratified ranked set sampling.

35 Amer Ibrahim, Kamarulzaman Ibrahim and Mahmoud Ibrahim 1. Introduction In last years, the ranked set sampling method, which was proposed by McIntyre (195) to estimate mean pasture yields, was developed and modified by many authors to estimate the population parameters. Dell and Clutter (197) showed that the mean of the RSS is an unbiased estimator of the population mean, whether or not there are errors in ranking. Al-Saleh and Al-Kadiri (000) introduced double ranked set sampling for estimating the population mean. Al-Saleh and Al-Omari (00) suggested multistage ranked set sampling that increase the efficiency of estimating the population mean for specific value of the sample size. Muttlak (003b) suggested percentile ranked set sampling (RSS) to estimate the population mean and showed that using RSS procedure will reduce the errors in ranking comparing to RSS, since we only select and measure the pth or the qth percentile of the sample. Jemain and Al-Omari (006) suggested double percentile ranked set sampling (DRSS) for estimating the populationmeanandshowedthatthedrssmeanisanunbiasedestimator and more efficient than the SRS, RSS and RSS if the underlying distribution is symmetric. Jemain and Al-Omari (007) suggested multistage percentile ranked set sampling (MRSS) to estimate the population mean, they showed that the efficiency of the mean estimator using MRSS can be increased for specific valueofthesamplesizebyincreasingthenumber of stages. For more details about RSS and its modifications see Al-Omari and Jaber (008), Bouza (00), Muttlak (003a), Al-Nasser (007) and Ohyama et al. (008). In this paper, stratified percentile ranked set sampling is suggested to estimate the population mean of symmetric and asymmetric distributions. This paper is organized as follows: In Section, some sampling methods are presented. Estimation of the population mean is given in Section 3. A simulation study is considered in Section 4. Finally, conclusions on the suggested estimator are given in Section 5. Sampling Methods In stratified sampling method, the population of N units is divided into L non overlapping subpopulations each of N 1,N,...,N L units, respectively, such that N 1 + N +... + N L = N. These subpopulations are called strata. For full benefitfromstratification, the size of the hth subpopulation, denoted by N h for h =1,,...,L, must be known. Then the samples are

Investigating the Use of Stratified ercentile Ranked Set Sampling...353 drawn independently from each strata, producing samples sizes denoted by n 1,n,...,n L, such that the total sample size is n = L. If a simple random sample is taken from each stratum, the whole procedure is known as stratified simple random sampling (SSRS). The ranked set sampling (RSS) is suggested by McIntyre (195) can be conducted by selecting n random samples from the population of size n units each, and ranking each unit within each set with respect to the variable of interest. Then an actual measurement is taken of the unit with the smallest rank from the first sample. From the second sample an actual measurement is taken from the second smallest rank, and the procedure is continued until the unit with the largest rank is chosen for actual measurement from the nth sample. Thus we obtain a total of n measured units, one from each ordered sample of size n and this completed one cycle. The cycle may be repeated m times to obtain a sample of size nm units. The percentile ranked set sampling (RSS) procedure is proposed by Muttlak (003b). The RSS can be described as: select n random samples each of size n units from the population and rank each sample with respect to a variable of interest. If the sample size n is even, select for measurement from the first n/ samples the p(n+1)th smallest ranked unit and from the second n/ samples the q(n + 1)th smallest ranked unit where 0 p 1 and p + q = 1. If the sample size n is odd, select for measurement from the first (n 1)/ samples the p(n + 1)th smallest ranked unit and from the last (n 1)/ samples the q(n + 1)th smallest ranked unit, and the median from the middle sample. The cycle can be repeated m times if needed to get a sample of size nm units. Note that we will always take the nearest integer of p(n +1)thandq(n + 1)th. If the percentile ranked set sampling is used to select the sample units from each stratum, then the whole procedure is called a stratified percentile ranked set sampling (SRSS). To illustrate the method, let us consider the following example for even sample size. Example 1: Suppose that from the first stratum, we draw five samples each of 5 units, and from the second stratum, we draw seven samples each of seven units. These samples are ordered as given below. Stratum 1: Five samples, each of size 5 ordered units as given below: X 11(1),X 11(),X 11(3),X 11(4),X 11(5) X 1(1),X 1(),X 1(3),X 1(4),X 1(5)

354 Amer Ibrahim, Kamarulzaman Ibrahim and Mahmoud Ibrahim X 31(1),X 31(),X 31(3),X 31(4),X 31(5) X 41(1),X 41(),X 41(3),X 41(4),X 41(5) X 51(1),X 51(),X 51(3),X 51(4),X 51(5) Assume that p = 37% and q = 63%. For h = 1, select X ih(p(nh +1)) = X i1() for i = 1,, and select X ih(q(nh +1)) = X i1(4) for i = 4, 5 and X n1 +1 ih = X i1(3) for i = 3, the following units are chosen from the first stratum X 11(),X 1(),X 31(3),X 41(4),X 51(4) Stratum : Seven samples, each with 7 ordered units are as given below X 1(1),X 1(),X 1(3),X 1(4),X 1(5),X 1(6),X 1(7) X (1),X (),X (3),X (4),X (5),X (6),X (7) X 3(1),X 3(),X 3(3),X 3(4),X 3(5),X 3(6),X 3(7) X 4(1),X 4(),X 4(3),X 4(4),X 4(5),X 4(6),X 4(7) X 5(1),X 5(),X 5(3),X 5(4),X 5(5),X 5(6),X 5(7) X 6(1),X 6(),X 6(3),X 6(4),X 6(5),X 6(6),X 6(7) X 7(1),X 7(),X 7(3),X 7(4),X 7(5),X 7(6),X 7(7) For h =, select X ih(p(nh +1)) = X i1(3) for i = 1,, 3, and select X ih(q(nh +1)) = X i1(5) for i = 5, 6, 7andX n1 +1 ih = X i1(4) for i = 4. Then the following units are chosen from the second stratum X 1(3),X (3),X 3(3),X 4(4),X 5(5),X 6(5),X 7(5). Therefore, the selected SRSS units are X 11(),X 1(),X 31(3),X 41(4),X 51(4),X 1(3),X (3),X 3(3),X 4(4),X 5(5),X 6(5),X 7(5).

Investigating the Use of Stratified ercentile Ranked Set Sampling...355 3. Estimation of the population mean The SRS estimator of the population mean, µ, based on a sample of size n is given by X SRS = 1 nx X i, n with variance Var ³X SRS = σ n. The estimator of the population mean for a RSS of size n is given by X RSS = 1 nx X n i(i), with variance Var ³X RSS = σ n 1 n nx ³ µ (i) µ, where µ (i) is the mean of the ith order statistics, X (i) of a sample of size n. In the case of stratified percentile ranked set sampling (SRSS), when is even, the estimator of the population mean is defined as (3.1) X SRSS1 = LX X nx X ih(p(nh +1)) + X ih(q(nh +1)), i= +1 where = N h N, N h is the stratum size, N is the total population size and p + q =1. The variance of SRSS1 is given by Var ³X SRSS1 = Var L = L Wh n h = L Wh n h X ih(p(nh +1)) + n X ih(q(nh +1)) i= +1 + n ³ Var X ih(q(nh +1)) i= +1 + n ³ Var X ih(q(nh +1)) i= +1 ³ Var X ih(p(nh +1)) ³ Var X ih(p(nh +1))

356 Amer Ibrahim, Kamarulzaman Ibrahim and Mahmoud Ibrahim (3.) = LX W h n h X n σih(p) + X σ ih(q). i= +1 by When is odd, the SRSS estimator of the population mean is given X SRSS = LX 1 X (3.3) and the variance of SRSS is Var Var L ³X SRSS X ih nh +1 = 1 = L 1 Wh n h µ +Var (3.4) = LX X ih(p(nh +1)) + X ih(p(nh +1)) + ³ Var X ih(p(nh +1)) W h n h 1 X σ ih(p) + i= 1 + + X i= 1 + i= 1 + X i= 1 + X ih(q(nh +1)) + X ih nh +1 X ih(q(nh +1)) + X ih nh +1 ³ Var X ih(q(nh +1)) σih(q) + σ ih(q ) roperty 1: If the underlying distribution is symmetric about µ, ih(q) then E ³X SRSS1 = µ and E ³X SRSS = µ. roof: If is even, we have E ³X SRSS = E L X ih(p(nh +1)) + n X ih(q(nh +1)) i= +1 = L ³ E X ih(p(nh +1)) + n ³ E X ih(q(nh +1)) i= +1 = L µ h(p) + n µ h(q), i= +1

Investigating the Use of Stratified ercentile Ranked Set Sampling...357 where µ h(p) and µ h(q) are the means of the order statistics which correspond to the pth and qth percentiles, respectively. Since the distribution is symmetric about µ, thenµ h(p) + µ h(q) =µ. Therefore, we have E ³X SRSS1 = L = L = L ³ nh ³ ³ nh µ h(p) + µ h(q) nh (µ h ) = L µ h = µ. µ h(p) + µ h(q) If is odd, then E ³X SRSS E L = L = L = 1 X ih(p(nh +1)) + i= 1 + 1 ³ E X ih(p(nh +1)) + i= 1 + 1 µ h(p) + µ h(q) + µ h(q ), i= 1 + X ih(q(nh +1)) + X ih nh +1 ³ µ E X ih(q(nh +1)) + E X ih nh +1 ³ nh 1 where µ h(p) is the mean for the pth percentile in the first samples in ³ stratum h. µ h(q) is the mean for the qth percentile in the last nh 1 samples in stratum h. µ h is the mean for the stratum h. Since the distribution is symmetric about µ, thenwehaveµ h(p) + µ h(p) =µ h. Therefore,

358 Amer Ibrahim, Kamarulzaman Ibrahim and Mahmoud Ibrahim E ³X SRSS = L = L = L = L h³ nh 1 h³ nh 1 h³ nh 1 µ h(p) + i ³µ h(p) + µ h(q) + µ h µ h + µ h i (( 1) µ h + µ h ) = L ( µ h ) = L µ h = µ. ³ i nh 1 µ h(q) + µ h roperty : If the distribution is symmetric about µ, then Var ³X SRSS1 and Var ³X SRSS <Var ³X SRS.. <Var ³X SRS roof: Ifthesamplesizeiseven,thevarianceof X SRSS1 is given by Var ³X SRSS1 = L Wh σih(q). But σih(q) <σ h for each stratum h =1,,,L, which implies that Var ³X SRSS1 = LX W h σ ih(q) < L X The proof is the same for odd sample size. 4. Simulation study W h σ h = Var( SSRS ) <Var(X SRS ). In this section, a simulation study is designed for symmetric and asymmetric distributions with samples of sizes n =7, 1, 14, 15, 18 to compare the SRSS with the SRS, SSRS and SRSS methods. Without loss of generality, we assumed that the population is partitioned into two or three strata. Using 100000 replications, estimates of the means, variances and mean square

Investigating the Use of Stratified ercentile Ranked Set Sampling...359 errors are computed. The efficiency of SRSS relative to SRS, SSRS and SRSS when the parent distribution is symmetric is given by ³ eff ³X SRSS, X M = Var X M Var ³X SRSS, M = SRS,SSRS,SRSS, and when the distribution is asymmetric ³ eff ³X SRSS, X M = MSE X M MSE ³X SRSS, wheremseisthemeansquareerror. In Table 1 to Table 6 we summarized the efficiency values of the estimators, while in Table 7 the bias values of the estimators for the mean of asymmetric distributions considered in this study are presented.

360 Amer Ibrahim, Kamarulzaman Ibrahim and Mahmoud Ibrahim

Investigating the Use of Stratified ercentile Ranked Set Sampling...361

36 Amer Ibrahim, Kamarulzaman Ibrahim and Mahmoud Ibrahim

Investigating the Use of Stratified ercentile Ranked Set Sampling...363

364 Amer Ibrahim, Kamarulzaman Ibrahim and Mahmoud Ibrahim

Investigating the Use of Stratified ercentile Ranked Set Sampling...365

366 Amer Ibrahim, Kamarulzaman Ibrahim and Mahmoud Ibrahim 5. Conclusions In this paper, a new estimator of the population mean using SRSS is suggested. The performance of the estimator based on SRSS is compared with those found using SRSS, SSRS and SRS for the same number of measured units. It is found that SRSS produces an unbiased estimator of the population mean and it is more efficient than SRSS, SSRS and SRS. Thus, the SRSS should be more preferred than SRSS, SSRS and SRS for both symmetric and asymmetric distributions.

Investigating the Use of Stratified ercentile Ranked Set Sampling...367 References [1] Al-Nasser, D.A. L. ranked set sampling: a generalization procedure for robust visual sampling. Communication in Statistics-Simulation and Computation, 36: pp. 33-44, (007). [] Al-Saleh, M. F. & Al-Hadhrami, S. A. Estimation of the mean of exponential distribution using moving extremes ranked set sampling. Statistical apers, 44: pp. 367-38, (003). [3] Al-Saleh, M. F. and Al-Kadiri, M. Double ranked set sampling. Statistics robability Letters, 48: pp. 05-1, (000). [4] Al-Saleh, M. F. and Al-Omari, A. I. Multistage ranked set sampling. Journal of Statistical lanning and Inference, 10: pp. 73-86, (00). [5] Al-Omari, A. I. and Jaber, K. ercentile double ranked set sampling. Journal of Mathematics and Statistics, 4(1): pp. 60-64, (008). [6] Bouza, C. N. Ranked set subsampling the non response strata for estimating the difference of means. Biometrical Journal, 44: pp. 903-915, (00). [7] Dell, T. R. and Clutter, J. L. Ranked set sampling theory with order statistics background. Biometrika, 8: pp. 545-555, (197). [8] Jemain, A. A. and Al-Omari, A. I. Double quartile ranked set samples. akistan Journal of Statistics, (3): pp. 17-8, (006). [9] Jemain, A. A. and Al-Omari, A. I. (007). Multistage quartile ranked set samples. akistan Journal of Statistics, 3(1): pp. 11-, (007). [10] McIntyre, G. A. A method for unbiased selective sampling using ranked sets. Australian Journal of Agricultural Research, 3: pp. 385-390, (195). [11] Muttlak, H. A. Investigating the use of quartile ranked set samples for estimating the population mean. Journal of Applied Mathematics and Computation, 146: pp. 437-443, (003a). [1] Muttlak, H. A. Modified ranked set sampling methods. akistan Journal of Statistics, 19(3): pp. 315-33, (003b).

368 Amer Ibrahim, Kamarulzaman Ibrahim and Mahmoud Ibrahim [13] Ohyama, T., Doi, J., and Yanagawa, T. Estimating population characteristics by incorporating prior values in stratified random sampling/ranked set sampling. Journal of Statistical lanning and Inference, 138: pp. 401-403, (008). [14] Takahasi K. and Wakimoto, K. On unbiased estimates of the population mean based on the sample stratified by means of ordering. Annals of the Institute of Statistical Mathematics, 0: pp. 1-31, (1968). Amer Ibrahim Al-Omari Al al-bayt University, Faculty of Science, Department of Mathematics,. O. Box 130040, Mafraq 5113, Jordan e-mail : amerialomari@aabu.edu.jo, Kamarulzaman Ibrahim School of Mathematical Sciences, University Kebangsaan Malaysia 43600, UKM Bangi Selangor, Malaysia e-mail : kamarulz@ukm.my and Mahmoud Ibrahim Syam School of Mathematical Sciences, University Kebangsaan Malaysia 43600, UKM Bangi Selangor, Malaysia e-mail : msiam0@yahoo.com