Sampling Theory. A New Ratio Estimator in Stratified Random Sampling

Communications in Statistics - Theory and Methods, 34: 597-602, 2005
Copyright Taylor & Francis, Inc.
ISSN: 0361-0926 print/1532-415x online
DOI: 10.1081/STA-200052156

CEM KADILAR AND HULYA CINGI
Hacettepe University, Department of Statistics, Beytepe, Ankara, Turkey

Received February 7, 2003; Accepted July 21, 2004
Address correspondence to Cem Kadilar, Hacettepe University, Department of Statistics, Beytepe, Ankara, Turkey; E-mail: kadilar@hacettepe.edu.tr

In this article, we suggest a new ratio estimator in stratified random sampling based on the Prasad (1989) estimator. Theoretically, we obtain the mean square error (MSE) for this estimator and compare it with the MSE of the traditional combined ratio estimate. By this comparison, we demonstrate that the proposed estimator is more efficient than the combined ratio estimate in all conditions. In addition, this theoretical result is supported by a numerical example.

Keywords: Mean square errors; Ratio-type estimators; Stratified random sampling.

Mathematics Subject Classification: Primary 62D05.

1. Introduction

The combined ratio estimate is

$$\bar{y}_{RC} = \frac{\bar{y}_{st}}{\bar{x}_{st}}\,\bar{X} = \hat{R}_c\,\bar{X}, \tag{1}$$

where $\bar{X}$ is the population mean of the auxiliary variate and

$$\bar{y}_{st} = \sum_{h=1}^{k} \omega_h \bar{y}_h, \qquad \bar{x}_{st} = \sum_{h=1}^{k} \omega_h \bar{x}_h,$$

where $k$ is the number of strata, $\omega_h = N_h/N$ is the stratum weight, $N$ is the number of units in the population, $N_h$ is the number of units in stratum $h$, $\bar{y}_h$ is the sample mean of the variate of interest in stratum $h$, and $\bar{x}_h$ is the sample mean of the auxiliary variate in stratum $h$.

The variance of the combined ratio estimate is

$$V(\bar{y}_{RC}) = \sum_{h=1}^{k} \omega_h^2 \gamma_h \left( S_{yh}^2 - 2R S_{yxh} + R^2 S_{xh}^2 \right), \tag{2}$$

where $\gamma_h = \dfrac{1 - n_h/N_h}{n_h}$, $R = \bar{Y}/\bar{X}$ is the population ratio, $n_h$ is the number of units in the sample from stratum $h$, $S_{yh}^2$ is the population variance of the variate of interest in stratum $h$, $S_{xh}^2$ is the population variance of the auxiliary variate in stratum $h$, and $S_{yxh}$ is the population covariance between the auxiliary variate and the variate of interest in stratum $h$ (Cochran, 1977).

In stratified random sampling, Kadilar and Cingi (2003) developed ratio estimators as follows:

$$\bar{y}_{stSD} = \bar{y}_{st}\,\frac{\sum_{h=1}^{k} \omega_h \left( \bar{X}_h + C_{xh} \right)}{\sum_{h=1}^{k} \omega_h \left( \bar{x}_h + C_{xh} \right)} \tag{3}$$

based on the Sisodia and Dwivedi (1981) estimator;

$$\bar{y}_{stSK} = \bar{y}_{st}\,\frac{\sum_{h=1}^{k} \omega_h \left( \bar{X}_h + \beta_{2h}(x) \right)}{\sum_{h=1}^{k} \omega_h \left( \bar{x}_h + \beta_{2h}(x) \right)} \tag{4}$$

based on the Singh and Kakran (1993) estimator;

$$\bar{y}_{stUS1} = \bar{y}_{st}\,\frac{\sum_{h=1}^{k} \omega_h \left( \bar{X}_h \beta_{2h}(x) + C_{xh} \right)}{\sum_{h=1}^{k} \omega_h \left( \bar{x}_h \beta_{2h}(x) + C_{xh} \right)} \tag{5}$$

based on the first estimator of Upadhyaya and Singh (1999); and

$$\bar{y}_{stUS2} = \bar{y}_{st}\,\frac{\sum_{h=1}^{k} \omega_h \left( \bar{X}_h C_{xh} + \beta_{2h}(x) \right)}{\sum_{h=1}^{k} \omega_h \left( \bar{x}_h C_{xh} + \beta_{2h}(x) \right)} \tag{6}$$

based on the second estimator of Upadhyaya and Singh (1999). Here, $C_x$ is the population coefficient of variation and $\beta_2(x)$ is the population coefficient of kurtosis of the auxiliary variate $x$.

Kadilar and Cingi (2003) demonstrate that all of these estimators, presented in Eqs. (3)-(6), have a bigger MSE than the traditional combined ratio estimate in some conditions. Therefore, in the next section, we will propose a new ratio estimator in stratified random sampling, and in Sec. 3, we will prove that this proposed estimator is more efficient than the combined ratio estimate in all conditions. In Sec. 4, this theoretical proof will be supported by a numerical example.

2. Ratio Estimator and Its Mean Square Error

When the first degree approximation is used in obtaining the mean square error (MSE) of a ratio estimate, it is known that the MSE is equal to the variance, so the MSE of the combined ratio estimate can be written as follows:

$$MSE(\bar{y}_{RC}) \cong \sum_{h=1}^{k} \omega_h^2 \gamma_h \left( S_{yh}^2 - 2R S_{yxh} + R^2 S_{xh}^2 \right). \tag{7}$$

To obtain the bias of the combined ratio estimate, we write

$$E(\bar{y}_{RC} - \bar{Y}) = E\left\{ \frac{\bar{X}}{\bar{x}_{st}} \left( \bar{y}_{st} - R\,\bar{x}_{st} \right) \right\}, \tag{8}$$

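where $E$ symbolizes expected value. We can rewrite $1/\bar{x}_{st}$ as

$$\frac{1}{\bar{x}_{st}} = \frac{1}{\bar{X} + (\bar{x}_{st} - \bar{X})} = \frac{1}{\bar{X}} \left( 1 + \frac{\bar{x}_{st} - \bar{X}}{\bar{X}} \right)^{-1}$$

and expand this expression in a Taylor series. If we use the first degree approximation (omit the terms after the second term, i.e., square, cubic, etc., terms) in the Taylor series expansion, the equation will be

$$\frac{1}{\bar{x}_{st}} \cong \frac{1}{\bar{X}} \left( 1 - \frac{\bar{x}_{st} - \bar{X}}{\bar{X}} \right).$$

From Eq. (8),

$$E(\bar{y}_{RC} - \bar{Y}) \cong E\left\{ \left( 1 - \frac{\bar{x}_{st} - \bar{X}}{\bar{X}} \right) \left( \bar{y}_{st} - R\,\bar{x}_{st} \right) \right\} = E(\bar{y}_{st} - R\,\bar{x}_{st}) - \frac{1}{\bar{X}} \left\{ E\left[ \bar{y}_{st} (\bar{x}_{st} - \bar{X}) \right] - R\,E\left[ \bar{x}_{st} (\bar{x}_{st} - \bar{X}) \right] \right\}.$$

As $E(\bar{y}_{st} - R\,\bar{x}_{st}) = 0$, we can write

$$E(\bar{y}_{RC} - \bar{Y}) = \frac{1}{\bar{X}} \left\{ R\,E(\bar{x}_{st} - \bar{X})^2 - E\left[ (\bar{x}_{st} - \bar{X})(\bar{y}_{st} - \bar{Y}) \right] \right\} = \frac{1}{\bar{X}} \left\{ R \sum_{h=1}^{k} \omega_h^2 \gamma_h S_{xh}^2 - \sum_{h=1}^{k} \omega_h^2\,\mathrm{cov}(\bar{y}_h, \bar{x}_h) \right\}.$$

From this equation, the bias of the combined ratio estimate is (Cingi, 1994)

$$B(\bar{y}_{RC}) = \frac{1}{\bar{X}} \sum_{h=1}^{k} \omega_h^2 \gamma_h \left( R S_{xh}^2 - S_{yxh} \right). \tag{9}$$

2.1. The Suggested Ratio-Type Estimator

In simple random sampling, Prasad (1989) proposed a ratio estimator as

$$\bar{y}_p = \kappa\,\bar{y}_R = \kappa\,\frac{\bar{y}}{\bar{x}}\,\bar{X}, \tag{10}$$

where the coefficient $\kappa = \dfrac{1 + \gamma \rho C_y C_x}{\gamma C_y^2 + 1}$. In stratified random sampling, we suggest that

$$\bar{y}_{stp} = \kappa\,\bar{y}_{RC}. \tag{11}$$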
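As a rough illustration of Eqs. (1) and (11), the Python sketch below computes the combined ratio estimate from per-stratum sample means and then rescales it by the coefficient of Eq. (11). The `Stratum` container, the function names, the parameter name `kappa`, and the two-stratum toy data are assumptions made for this example only; none of them come from the article.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Stratum:
    """Per-stratum inputs (field names are illustrative, not from the article)."""
    N_h: int       # number of units in the stratum
    y_bar: float   # sample mean of the variate of interest
    x_bar: float   # sample mean of the auxiliary variate


def combined_ratio_estimate(strata: Sequence[Stratum], X_bar: float) -> float:
    """Eq. (1): (y_st / x_st) * X_bar, with stratum weights N_h / N."""
    N = sum(s.N_h for s in strata)
    y_st = sum(s.N_h / N * s.y_bar for s in strata)
    x_st = sum(s.N_h / N * s.x_bar for s in strata)
    return y_st / x_st * X_bar


def suggested_estimate(strata: Sequence[Stratum], X_bar: float, kappa: float) -> float:
    """Eq. (11): the combined ratio estimate multiplied by the coefficient."""
    return kappa * combined_ratio_estimate(strata, X_bar)


# Made-up toy data, for illustration only.
toy = [Stratum(N_h=60, y_bar=10.0, x_bar=100.0),
       Stratum(N_h=40, y_bar=20.0, x_bar=150.0)]
y_rc = combined_ratio_estimate(toy, X_bar=125.0)
y_stp = suggested_estimate(toy, X_bar=125.0, kappa=0.975)
print(y_rc, y_stp)
```

With a coefficient below 1, the suggested estimate is simply a shrunken version of the combined ratio estimate; the choice of the coefficient is what determines whether the shrinkage reduces the MSE.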

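Therefore, the MSE of this estimator is

$$\begin{aligned}
MSE(\bar{y}_{stp}) &= E(\bar{y}_{stp} - \bar{Y})^2 = E(\kappa\,\bar{y}_{RC} - \bar{Y})^2 = E\left( \kappa^2 \bar{y}_{RC}^2 - 2\kappa\,\bar{y}_{RC}\,\bar{Y} + \bar{Y}^2 \right) \\
&= \kappa^2 E(\bar{y}_{RC}^2) - 2\kappa\,\bar{Y}\,E(\bar{y}_{RC}) + \bar{Y}^2 \\
&\cong \kappa^2 E(\bar{y}_{RC}^2) - 2\kappa\,\bar{Y}^2 + \bar{Y}^2 + \kappa^2 \bar{Y}^2 - \kappa^2 \left[ E(\bar{y}_{RC}) \right]^2 \\
&= \kappa^2 \left\{ E(\bar{y}_{RC}^2) - \left[ E(\bar{y}_{RC}) \right]^2 \right\} + \bar{Y}^2 (1 - \kappa)^2 \\
&= \kappa^2\,Var(\bar{y}_{RC}) + \bar{Y}^2 (1 - \kappa)^2,
\end{aligned}$$

where the third line uses $E(\bar{y}_{RC}) \cong \bar{Y}$ to the first degree of approximation. From this equation, we obtain the MSE of the suggested estimate as follows:

$$MSE(\bar{y}_{stp}) \cong \kappa^2 \sum_{h=1}^{k} \omega_h^2 \gamma_h \left( S_{yh}^2 - 2R S_{yxh} + R^2 S_{xh}^2 \right) + (1 - \kappa)^2\,\bar{Y}^2. \tag{12}$$

The bias of this estimator is obtained as

$$\begin{aligned}
E(\bar{y}_{stp} - \bar{Y}) &= E(\kappa\,\bar{y}_{RC} - \bar{Y}) = E\left( \kappa\,\frac{\bar{y}_{st}}{\bar{x}_{st}}\,\bar{X} - \bar{Y} \right) \\
&= \kappa\,E(\bar{y}_{RC} - \bar{Y}) + (\kappa - 1)\,\bar{Y} \\
&= (\kappa - 1)\,\bar{Y} + \kappa\,\frac{1}{\bar{X}} \sum_{h=1}^{k} \omega_h^2 \gamma_h \left( R S_{xh}^2 - S_{yxh} \right).
\end{aligned}$$

In order to find the value of $\kappa$ that minimizes the MSE, we take the derivative of the MSE with respect to $\kappa$ and set it equal to zero:

$$\frac{\partial\,MSE(\bar{y}_{stp})}{\partial \kappa} = 2\kappa \sum_{h=1}^{k} \omega_h^2 \gamma_h \left( S_{yh}^2 - 2R S_{yxh} + R^2 S_{xh}^2 \right) + 2(\kappa - 1)\,\bar{Y}^2 = 0.$$

From this equation, we obtain

$$\kappa = \frac{\bar{Y}^2}{\bar{Y}^2 + \sum_{h=1}^{k} \omega_h^2 \gamma_h \left( S_{yh}^2 - 2R S_{yxh} + R^2 S_{xh}^2 \right)},$$

where $0 < \kappa < 1$.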
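The minimization above can be checked numerically. The sketch below evaluates Eq. (12) and its minimizing coefficient, taking as inputs the two figures reported in the article's numerical example: the first-order MSE of the combined ratio estimate (215710.432) and the population mean of the variate of interest (2930). The function and variable names are illustrative assumptions, not the authors' notation.

```python
def suggested_mse(kappa: float, var_rc: float, Y_bar: float) -> float:
    """Eq. (12): kappa^2 * Var(y_RC) + (1 - kappa)^2 * Y_bar^2."""
    return kappa ** 2 * var_rc + (1.0 - kappa) ** 2 * Y_bar ** 2


def optimal_kappa(var_rc: float, Y_bar: float) -> float:
    """Coefficient minimizing Eq. (12): Y_bar^2 / (Y_bar^2 + Var(y_RC))."""
    return Y_bar ** 2 / (Y_bar ** 2 + var_rc)


# Figures reported in the article's numerical example.
var_rc = 215710.432   # first-order MSE of the combined ratio estimate
Y_bar = 2930.0        # population mean of the variate of interest

kappa = optimal_kappa(var_rc, Y_bar)
mse = suggested_mse(kappa, var_rc, Y_bar)
print(round(kappa, 3))  # 0.975
print(mse)
```

The coefficient rounds to the 0.975 listed in the article, and the resulting MSE lands within about half a unit of the 210423.632 reported there; the small gap is presumably due to rounding of the published inputs.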

3. Efficiency Comparison

If we compare the MSE of the combined ratio estimator with the MSE of the proposed estimator, we obtain the following condition. Let

$$\xi = \sum_{h=1}^{k} \omega_h^2 \gamma_h \left( S_{yh}^2 - 2R S_{yxh} + R^2 S_{xh}^2 \right).$$

Then

$$\begin{aligned}
MSE(\bar{y}_{stp}) &< MSE(\bar{y}_{RC}) \\
(\kappa^2 - 1)\,\xi + \bar{Y}^2 (1 - \kappa)^2 &< 0 \\
(\kappa - 1) \left[ (\kappa + 1)\,\xi + (\kappa - 1)\,\bar{Y}^2 \right] &< 0.
\end{aligned}$$

From this condition, as $\kappa - 1 < 0$, it is clear that if

$$\kappa > \frac{\bar{Y}^2 - \xi}{\bar{Y}^2 + \xi} \tag{13}$$

the suggested estimator is more efficient than the combined ratio estimator. When we examine condition (13) in detail, we see that it is always satisfied, because the minimizing coefficient $\kappa = \bar{Y}^2/(\bar{Y}^2 + \xi)$ obtained in Sec. 2.1 exceeds $(\bar{Y}^2 - \xi)/(\bar{Y}^2 + \xi)$ whenever $\xi > 0$. Therefore, we can say that the suggested estimator is more efficient than the combined ratio estimator in all conditions.

4. Numerical Example

We have used the data of Kadilar and Cingi (2003) in this section. We have applied our proposed and combined ratio estimators to data on apple production amount (as the variate of interest) and number of apple trees (as the auxiliary variate) in 854 villages of Turkey in 1999 (Source: Institute of Statistics, Republic of Turkey). First, we have stratified the data by regions of Turkey, and from each stratum (region) we have randomly selected the samples (villages).

Table 1. Data statistics

$N = 854$  $N_1 = 106$  $N_2 = 106$  $N_3 = 94$  $N_4 = 171$  $N_5 = 204$  $N_6 = 173$
$n = 140$  $n_1 = 9$  $n_2 = 17$  $n_3 = 38$  $n_4 = 67$  $n_5 = 7$  $n_6 = 2$
$\bar{X} = 37600$  $\bar{X}_1 = 24375$  $\bar{X}_2 = 27421$  $\bar{X}_3 = 72409$  $\bar{X}_4 = 74365$  $\bar{X}_5 = 26441$  $\bar{X}_6 = 9844$
$\bar{Y} = 2930$  $\bar{Y}_1 = 1536$  $\bar{Y}_2 = 2212$  $\bar{Y}_3 = 9384$  $\bar{Y}_4 = 5588$  $\bar{Y}_5 = 967$  $\bar{Y}_6 = 404$
$S_x = 144794$  $S_{x1} = 49189$  $S_{x2} = 57461$  $S_{x3} = 160757$  $S_{x4} = 285603$  $S_{x5} = 45403$  $S_{x6} = 18794$
$S_y = 17106$  $S_{y1} = 6425$  $S_{y2} = 11552$  $S_{y3} = 29907$  $S_{y4} = 28643$  $S_{y5} = 2390$  $S_{y6} = 946$
$R = 0.07793$  $\rho_1 = 0.82$  $\rho_2 = 0.86$  $\rho_3 = 0.90$  $\rho_4 = 0.99$  $\rho_5 = 0.71$  $\rho_6 = 0.89$
$\kappa = 0.975$  $\gamma_1 = 0.102$  $\gamma_2 = 0.049$  $\gamma_3 = 0.016$  $\gamma_4 = 0.009$  $\gamma_5 = 0.138$  $\gamma_6 = 0.006$
$\xi = 215710.432$  $\omega_1^2 = 0.015$  $\omega_2^2 = 0.015$  $\omega_3^2 = 0.012$  $\omega_4^2 = 0.04$  $\omega_5^2 = 0.057$  $\omega_6^2 = 0.041$

By using the Neyman allocation (Cochran, 1977),

$$n_h = n\,\frac{N_h S_h}{\sum_{h=1}^{k} N_h S_h}, \tag{14}$$

we have computed the sample size in stratum $h$. Here we take the sample size as $n = 140$ (Cingi, 1994). From the results for $n_h$, we have decided to join two regions, so we take six strata (1: Marmara, 2: Aegean, 3: Mediterranean, 4: Central Anatolia, 5: Black Sea, 6: East and Southeast Anatolia) for this data. Then, by using this stratified random sampling, the MSEs of the combined and proposed ratio estimators have been computed by Eqs. (7) and (12), respectively. Finally, these estimators have been compared with each other with respect to their MSE values.

In Table 1, we observe the statistics about the population, strata, and sample size. Note that the correlation between the variates is 92%. In Table 2, the values of MSE are given. From these values, it is seen that the MSE value of the proposed ratio estimator is smaller than that of the combined ratio estimator. It is an expected result, since $\kappa = 0.975 > \dfrac{\bar{Y}^2 - \xi}{\bar{Y}^2 + \xi} = 0.951$, as mentioned in Sec. 3.

Table 2. MSE values of ratio estimators

Estimators        MSE values
Proposed          210423.632
Combined ratio    215710.432

5. Conclusion

We have derived a new ratio-type estimator in stratified random sampling from the estimator of Prasad (1989) and obtained its MSE equation. By this equation, the MSE of the proposed estimator has been compared with that of the combined ratio estimate in theory, and by this comparison it has been found that in all conditions the proposed estimator has a smaller MSE than the combined ratio estimate. This theoretical result has also been supported by a numerical example, whereas Kadilar and Cingi (2003) found that the combined ratio estimator was more efficient than the other estimators, such as those of Sisodia and Dwivedi, Singh and Kakran, and the first and second estimators of Upadhyaya and Singh, for the same data used in this article. In forthcoming studies, we hope to develop new estimators in other sampling methods.

References

Cingi, H. (1994). Sampling Theory. Ankara, Turkey: Hacettepe University Press.
Cochran, W. G. (1977). Sampling Techniques. New York: John Wiley and Sons.
Kadilar, C., Cingi, H. (2003). Ratio estimators in stratified random sampling. Biometrical J. 45(2):218-225.
Prasad, B. (1989). Some improved ratio type estimators of population mean and ratio in finite population sample surveys. Commun. Statist. Theor. Meth. 18(1):379-392.
Singh, H. P., Kakran, M. S. (1993). A modified ratio estimator using known coefficient of kurtosis of an auxiliary character. (unpublished).
Sisodia, B. V. S., Dwivedi, V. K. (1981). A modified ratio estimator using coefficient of variation of auxiliary variable. J. Indian Soc. Agricul. Statist. 33:13-18.
Upadhyaya, L. N., Singh, H. P. (1999). Use of transformed auxiliary variable in estimating the finite population mean. Biometrical J. 41(5):627-636.
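As a companion to the numerical example in Sec. 4, the following Python sketch retraces two of its figures from the published statistics. It applies the Neyman allocation of Eq. (14) under the assumption that $S_h$ is the stratum standard deviation of the variate of interest (the article does not say which standard deviation was used; this choice happens to reproduce the reported stratum sample sizes), and it evaluates the lower bound on the coefficient that follows from requiring Eq. (12) to be smaller than Eq. (7). The function name and the variable `xi` are labels chosen for this example.

```python
from typing import List, Sequence


def neyman_allocation(n: int, N_h: Sequence[int], S_h: Sequence[float]) -> List[int]:
    """Eq. (14): n_h = n * N_h * S_h / sum_h(N_h * S_h), rounded to whole units."""
    total = sum(N * S for N, S in zip(N_h, S_h))
    return [round(n * N * S / total) for N, S in zip(N_h, S_h)]


# Stratum sizes and standard deviations of the variate of interest (Table 1).
N_h = [106, 106, 94, 171, 204, 173]
S_yh = [6425, 11552, 29907, 28643, 2390, 946]

n_h = neyman_allocation(140, N_h, S_yh)
print(n_h)  # [9, 17, 38, 67, 7, 2]

# Lower bound on the coefficient from the efficiency comparison in Sec. 3:
# (Y_bar^2 - xi) / (Y_bar^2 + xi), with xi the MSE of the combined ratio estimate.
Y_bar, xi = 2930.0, 215710.432
bound = (Y_bar ** 2 - xi) / (Y_bar ** 2 + xi)
print(round(bound, 3))  # 0.951
```

Independent rounding does not guarantee in general that the stratum sample sizes add up to $n$; for these data they do.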