Open Journal of Optimization, 2014, 3, 68-78
Published Online December 2014 in SciRes. http://www.scirp.org/journal/ojop
http://dx.doi.org/10.4236/ojop.2014.34007

Compromise Allocation for Combined Ratio Estimates of Population Means of a Multivariate Stratified Population Using Double Sampling in Presence of Non-Response

Sana Iftekhar, Qazi Mazhar Ali, Mohammad Jameel Ahsan
Department of Statistics & Operations Research, Aligarh Muslim University, Aligarh, India
Email: iftekar.sana54@gmail.com

Received 8 September 2014; revised 6 October 2014; accepted November 2014

Copyright © 2014 by authors and Scientific Research Publishing Inc. This work is licensed under the Creative Commons Attribution International License (CC BY). http://creativecommons.org/licenses/by/4.0/

Abstract

This paper is an attempt to work out a compromise allocation to construct combined ratio estimates under a multivariate double sampling design in presence of non-response when the population mean of the auxiliary variable is unknown. The problem has been formulated as a multi-objective integer non-linear programming problem. Two solution procedures are developed using goal programming and fuzzy programming techniques. A numerical example is also worked out to illustrate the computational details. A comparison of the two methods is also carried out.

Keywords

Multivariate Stratified Sampling, Compromise Allocation, Non-Response, Double Sampling, Goal Programming, Fuzzy Programming

1. Introduction

Often in sample surveys the main variable is highly correlated with another variable, called an auxiliary variable, and the data on the auxiliary variable are either available or can be easily obtained. In this situation the auxiliary information can be used to enhance the precision of the estimates of the parameters of the main variable. Ratio and regression methods and the double sampling technique are some examples.
When data are collected on the sampled units of the main variable, data for all the selected units often cannot be obtained for one reason or another. The result is an incomplete and less informative sample. This phenomenon is termed non-response. [1] is the first one to consider this problem. Furthermore, when auxiliary parameters are unknown, they can be estimated from a preliminary large sample. Then a second sample is obtained in which both the main and the auxiliary variables are measured. Often the second sample is a subsample of the first. In such cases only the main variable is to be measured in the second sample. This technique is called Double Sampling or Two Phase Sampling. []-[] are some who used the auxiliary information in sample surveys. [10] has worked on the problem in which a ratio estimator has been considered for the population mean under double sampling in presence of non-response for a univariate population.

How to cite this paper: Iftekhar, S., Ali, Q.M. and Ahsan, M.J. (2014) Compromise Allocation for Combined Ratio Estimates of Population Means of a Multivariate Stratified Population Using Double Sampling in Presence of Non-Response. Open Journal of Optimization, 3, 68-78. http://dx.doi.org/10.4236/ojop.2014.34007

S. Iftekhar et al.

In the present paper, we consider combined ratio estimators of the population means of a multivariate stratified population using double sampling in presence of non-response. Compromise allocations at the first and second phases of double sampling are obtained by formulating the problems as multi-objective integer non-linear programming problems. Solution procedures are developed by using goal programming and fuzzy programming techniques. A numerical example is also worked out to illustrate the computational details. A comparison of the two methods is also carried out.

When auxiliary information is available, the use of the ratio method of estimation is well known in univariate stratified sampling. Formulae are also available to work out optimum allocations to various strata []. In the multivariate case, finding an allocation that gives optimum results for all the characteristics is not possible due to the conflicting nature of the characteristics. Compromise allocation is used in such situations. Furthermore, if the problem of non-response is also present, the situation becomes more complicated.
The paper is structured as follows. In Section 2, combined ratio estimates for the population means of the $p$ characteristics in presence of non-response using double sampling are constructed. Section 3 formulates the problem of obtaining compromise allocations for Phase I and Phase II of the double sampling as an integer nonlinear programming problem (INLPP). Sections 4 and 5 show how these INLPPs can be transformed so that the Goal Programming Technique (GPT) and the Fuzzy Programming Technique (FPT) can be applied to solve them. Section 6 provides an application of the techniques through numerical data. Section 7 summarizes the results, and Section 8 gives the conclusion and directions for future work for interested readers.

2. The Combined Ratio Estimate in Multivariate Stratified Double Sampling Design in Presence of Non-Response

Consider a multivariate stratified population of size $N$ with $L$ non-overlapping strata of sizes $N_1, N_2, \ldots, N_L$ with $\sum_{h=1}^{L} N_h = N$. Let $p$ characteristics be defined on each unit of the population. If $N_h$; $h = 1, 2, \ldots, L$ are not known in advance, then the strata weights $W_h = N_h/N$; $h = 1, 2, \ldots, L$ also remain unknown. In such a situation the double sampling technique may be used to estimate the unknown strata weights. For this, a large preliminary simple random sample of size $n'$ is obtained at the first phase of the double sampling, treating the population as unstratified. The number of sampled units $n'_h$; $h = 1, 2, \ldots, L$ falling in each stratum is recorded. The quantity $w_h = n'_h/n'$ gives an unbiased estimate of $W_h$. Simple random subsamples, without replacement, of sizes $n_h = \mu_h n'_h$; $h = 1, 2, \ldots, L$; $0 < \mu_h \le 1$ are then drawn out of $n'_h$ from each stratum for values of $\mu_h$ chosen in advance.

For the $j$-th characteristic and the $h$-th stratum denote by

$y_{jhi}$ the value of the $i$-th population (sample) unit of the main variable;

$x_{jhi}$ the value of the $i$-th population (sample) unit of the auxiliary variable;

$\bar{Y}_{jh} = \frac{1}{N_h}\sum_{i=1}^{N_h} y_{jhi}$ and $\bar{y}_{jh} = \frac{1}{n_h}\sum_{i=1}^{n_h} y_{jhi}$ the stratum mean and the sample mean respectively for the main variable;

$\bar{X}_{jh} = \frac{1}{N_h}\sum_{i=1}^{N_h} x_{jhi}$ and $\bar{x}_{jh} = \frac{1}{n_h}\sum_{i=1}^{n_h} x_{jhi}$ the same values for the auxiliary variable.

In double sampling for stratification the combined ratio estimate of the population mean of the $j$-th characteristic is given as

$$\bar{y}_{j(CRDS)} = \sum_{h=1}^{L} w_h r_{jh}; \quad j = 1, 2, \ldots, p \tag{1}$$
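where CR and DS stand for combined ratio and double sampling respectively. Further,

$$r_{jh} = \hat{R}_{j,st}\,\bar{X}_{jh}, \qquad \hat{R}_{j,st} = \frac{\bar{y}_{j,st}}{\bar{x}_{j,st}}, \qquad \bar{y}_{j,st} = \sum_{h=1}^{L} w_h \bar{y}_{jh}, \qquad \bar{x}_{j,st} = \sum_{h=1}^{L} w_h \bar{x}_{jh}.$$

The sampling variance of $\bar{y}_{j(CRDS)}$ is

$$V\left(\bar{y}_{j(CRDS)}\right) = \left(\frac{1}{n'} - \frac{1}{N}\right) S_{yj}^2 + \frac{1}{n'}\sum_{h=1}^{L} w_h \left(\frac{1}{\mu_h} - 1\right) S_{rjh}^2 \tag{2}$$

$S_{rjh}^2$ in (2) is defined as

$$S_{rjh}^2 = S_{yjh}^2 + R_j^2 S_{xjh}^2 - 2 R_j S_{xyjh} \tag{3}$$

where $R_j$ are the true population ratios given as $R_j = \bar{Y}_j/\bar{X}_j$; $S_{yjh}^2$, $S_{xjh}^2$ are the stratum variances of the $j$-th characteristic in the $h$-th stratum for the main and auxiliary variables respectively, and $S_{xyjh}$ are the stratum covariances of the $j$-th characteristic in the $h$-th stratum for $j = 1, 2, \ldots, p$; $h = 1, 2, \ldots, L$.

In the presence of non-response, let out of the $n_h$ units $n_{h1}$ units respond at the first call and $n_{h2} = n_h - n_{h1}$ units constitute the non-respondents group. Using [], a subsample of size $m_h = k_h n_{h2}$; $0 < k_h \le 1$ out of $n_{h2}$ is drawn and interviewed with extra efforts, where $k_h$; $h = 1, 2, \ldots, L$ are fixed in advance. A combined ratio estimate $\bar{y}^{*}_{j(CRDS)}$ of $\bar{Y}_j$ may be given as

$$\bar{y}^{*}_{j(CRDS)} = \sum_{h=1}^{L} w_h \bar{r}^{*}_{jh} \tag{4}$$

where

$$\bar{r}^{*}_{jh} = \frac{n_{h1}\,\bar{r}_{jh1} + n_{h2}\,\bar{r}_{jh2(m_h)}}{n_h}$$

and $\bar{r}_{jh1}$, $\bar{r}_{jh2(m_h)}$ are the sample means of the ratio estimates for the respondents group (based on $n_{h1}$ units) and the non-respondents group (based on $m_h$ units) respectively.

Using the results presented in [12] Sections 5A., .9 and 3.6 we get $V\left(\bar{y}^{*}_{j(CRDS)}\right)$ in presence of non-response as

$$V\left(\bar{y}^{*}_{j(CRDS)}\right) = V\left(\bar{y}_{j(CRDS)}\right) + \frac{1}{n'}\sum_{h=1}^{L} \frac{1-k_h}{k_h \mu_h}\, w_{h2}\, S_{rjh2}^2$$

$$= \left(\frac{1}{n'} - \frac{1}{N}\right) S_{yj}^2 + \frac{1}{n'}\sum_{h=1}^{L} w_h \left(\frac{1}{\mu_h} - 1\right) S_{rjh}^2 + \frac{1}{n'}\sum_{h=1}^{L} \frac{1-k_h}{k_h \mu_h}\, w_{h2}\, S_{rjh2}^2 \tag{5}$$

where

$$S_{rjh2}^2 = S_{yjh2}^2 + R_j^2 S_{xjh2}^2 - 2 R_j S_{xyjh2},$$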
$S_{yjh2}^2$, $S_{xjh2}^2$ are the stratum variances of the $j$-th characteristic of the non-respondents in the $h$-th stratum for the main variable and the auxiliary variable respectively, and $S_{xyjh2}$ is the stratum covariance of the $j$-th characteristic of the non-respondents in the $h$-th stratum [].

The total cost of the survey may be given as

$$C = c_0 n' + \sum_{h=1}^{L} c_h n_h + \sum_{h=1}^{L} c_{h1} n_{h1} + \sum_{h=1}^{L} c_{h2} m_h \tag{6}$$

where

$c_0$ is the per unit cost of getting information from the preliminary sample $n'$;

$c_h$ is the per unit cost of making the first attempt (Phase I);

$c_{h1}$ is the per unit cost of processing and analyzing the results of all the $p$ characteristics on the $n_{h1}$ respondent units in the $h$-th stratum at Phase I;

$c_{h2}$ is the per unit cost of measuring and processing the results of all the $p$ characteristics on the $m_h$ subsampled units from the non-respondents group in the $h$-th stratum at Phase II.

Since $n_{h1}$ is not known until the first attempt is made, the quantity $w_{h1} n_h$ may be used as its expected value. The total expected cost $\hat{C}$ of the survey is then given as

$$\hat{C} = c_0 n' + \sum_{h=1}^{L} \left(c_h + c_{h1} w_{h1}\right) n_h + \sum_{h=1}^{L} c_{h2} m_h. \tag{7}$$

3. Formulation of the Problem

In Phase I, we obtain the sample size $n_h$ in each stratum by minimizing the variance given in (5) for the fixed cost given in (7). At Phase II the subsample size $m_h$ from the non-respondents group is obtained by minimizing the sampling variance in (5) for the given cost in (7).

3.1. Formulation of the Problem at Phase I

Expression (5) can be expressed as

$$V\left(\bar{y}^{*}_{j(CRDS)}\right) = V_j = \sum_{h=1}^{L} \frac{\alpha_{jh}}{n_h}; \quad j = 1, 2, \ldots, p \tag{8}$$

where the terms independent of $n_h$ are ignored and

$$\alpha_{jh} = \frac{1}{n'}\left[ w_h n'_h S_{rjh}^2 + \frac{1-k_h}{k_h}\, w_{h2}\, n'_h S_{rjh2}^2 \right]. \tag{9}$$

The cost constraint (7) becomes

$$\sum_{h=1}^{L} \left(c_h + w_{h1} c_{h1}\right) n_h \le \hat{C}_0 \tag{10}$$

where

$$\hat{C}_0 = \hat{C} - c_0 n' - \sum_{h=1}^{L} c_{h2} m_h.$$

Thus the multi-objective formulation of the problem at Phase I becomes (see [])

$$\begin{aligned}
\text{Minimize } & V_j = \sum_{h=1}^{L} \frac{\alpha_{jh}}{n_h}; \quad j = 1, 2, \ldots, p \text{ simultaneously} \\
\text{Subject to } & \sum_{h=1}^{L} \left(c_h + w_{h1} c_{h1}\right) n_h \le \hat{C}_0 \\
& n_h \le n'_h \text{ and } n_h \text{ integers}; \quad h = 1, 2, \ldots, L
\end{aligned} \tag{11}$$
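As a rough cross-check on the Phase I problem of Section 3.1 for a single characteristic, the continuous relaxation (dropping the integer restriction and the upper bounds on $n_h$, and spending the whole budget) has a closed-form solution by the Lagrange multiplier method: $n_h = \hat{C}_0 \sqrt{\alpha_h/a_h} \big/ \sum_g \sqrt{\alpha_g a_g}$, where $a_h = c_h + w_{h1} c_{h1}$. The Python sketch below illustrates this under those assumptions. The function names and the toy inputs are hypothetical and are not taken from the paper, which solves the integer problem directly with an optimization package.

```python
import math

def relaxed_allocation(alpha, unit_cost, budget):
    """Continuous minimiser of sum(alpha_h / n_h) subject to
    sum(unit_cost_h * n_h) = budget (no integrality, no upper bounds)."""
    denom = sum(math.sqrt(a * c) for a, c in zip(alpha, unit_cost))
    return [budget * math.sqrt(a / c) / denom for a, c in zip(alpha, unit_cost)]

def variance(alpha, n):
    """Objective value sum(alpha_h / n_h) for an allocation n."""
    return sum(a / nh for a, nh in zip(alpha, n))

# Hypothetical two-stratum toy problem (not the survey data).
alpha = [4.0, 9.0]
unit_cost = [1.0, 1.0]
budget = 10.0

n = relaxed_allocation(alpha, unit_cost, budget)
v = variance(alpha, n)
spent = sum(c * nh for c, nh in zip(unit_cost, n))
```

For the toy inputs the relaxed allocation is proportional to $\sqrt{\alpha_h/a_h}$ and exhausts the budget. Rounding such a solution to integers is only a starting point; it need not coincide with the integer optimum.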
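3.2. Formulation of the Problem for Phase II

Ignoring the term independent of $m_h$ in (5) and substituting $k_h = m_h/n_{h2}$ and $v_h = n_h/n'_h$, for $h = 1, 2, \ldots, L$, expression (5) can be written as

$$V_j = \sum_{h=1}^{L} \frac{\beta_{jh}}{m_h}; \quad j = 1, 2, \ldots, p \tag{12}$$

where

$$\beta_{jh} = \frac{w_{h2}\, n'_h\, n_{h2}\, S_{rjh2}^2}{n'\, n_h}. \tag{13}$$

The cost constraint becomes

$$\sum_{h=1}^{L} c_{h2} m_h \le \hat{C}_0$$

where, at this phase,

$$\hat{C}_0 = \hat{C} - c_0 n' - \sum_{h=1}^{L} \left(c_h + w_{h1} c_{h1}\right) n_h. \tag{14}$$

Then the multi-objective formulation of the problem at Phase II becomes

$$\begin{aligned}
\text{Minimize } & V_j = \sum_{h=1}^{L} \frac{\beta_{jh}}{m_h}; \quad j = 1, 2, \ldots, p \text{ simultaneously} \\
\text{Subject to } & \sum_{h=1}^{L} c_{h2} m_h \le \hat{C}_0 \\
& m_h \le n_{h2} \text{ and } m_h \text{ integers}; \quad h = 1, 2, \ldots, L
\end{aligned} \tag{15}$$

4. Formulation as a Goal Programming Problem

4.1. Phase I

Let $V_j^{*}$ be the optimal value of $V_j$ under the optimum allocation for the $j$-th characteristic, obtained by solving the following integer non-linear programming problem for all the $j = 1, 2, \ldots, p$ characteristics separately:

$$\begin{aligned}
\text{Minimize } & V_j = \sum_{h=1}^{L} \frac{\alpha_{jh}}{n_h} \\
\text{Subject to } & \sum_{h=1}^{L} \left(c_h + w_{h1} c_{h1}\right) n_h \le \hat{C}_0 \\
& n_h \le n'_h \text{ and } n_h \text{ integers}; \quad h = 1, 2, \ldots, L
\end{aligned} \tag{16}$$

Further let

$$V_j = V_j(n_1, n_2, \ldots, n_L) = \sum_{h=1}^{L} \frac{\alpha_{jh}}{n_h} \tag{17}$$

denote the variance under the compromise allocation, where $n_h$; $h = 1, 2, \ldots, L$ are to be worked out. Obviously $V_j \ge V_j^{*}$, and $V_j - V_j^{*} \ge 0$; $j = 1, 2, \ldots, p$ gives the increase in the variance due to not using the individual optimum allocation for the $j$-th characteristic. Let $d_j \ge 0$ denote the tolerance limit specified for $\left(V_j - V_j^{*}\right)$; $j = 1, 2, \ldots, p$.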
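We have

$$V_j - V_j^{*} \le d_j; \quad j = 1, 2, \ldots, p$$

or

$$V_j - d_j \le V_j^{*}; \quad j = 1, 2, \ldots, p$$

or

$$\sum_{h=1}^{L} \frac{\alpha_{jh}}{n_h} - d_j \le V_j^{*}; \quad j = 1, 2, \ldots, p. \tag{18}$$

A suitable compromise criterion to work out a compromise allocation at Phase I is then to minimize the sum of the deviations $d_j$. Therefore the Goal Programming problem at Phase I may be given as (see [13])

$$\begin{aligned}
\text{Minimize } & \sum_{j=1}^{p} d_j \\
\text{Subject to } & \sum_{h=1}^{L} \frac{\alpha_{jh}}{n_h} - d_j \le V_j^{*} \\
& \sum_{h=1}^{L} \left(c_h + w_{h1} c_{h1}\right) n_h \le \hat{C}_0 \\
& n_h \le n'_h \\
& d_j \ge 0 \text{ and } n_h \text{ integers}; \quad h = 1, 2, \ldots, L; \; j = 1, 2, \ldots, p
\end{aligned} \tag{19}$$

where $d_j \ge 0$; $j = 1, 2, \ldots, p$ are the goal variables. The goal is now to minimize the sum of the deviations from the respective optimum variances.

4.2. Phase II

Similarly, at Phase II the Goal Programming formulation of the problem (15) will be

$$\begin{aligned}
\text{Minimize } & \sum_{j=1}^{p} d_j \\
\text{Subject to } & \sum_{h=1}^{L} \frac{\beta_{jh}}{m_h} - d_j \le V_j^{*} \\
& \sum_{h=1}^{L} c_{h2} m_h \le \hat{C}_0 \\
& m_h \le n_{h2} \\
& d_j \ge 0 \text{ and } m_h \text{ integers}; \quad h = 1, 2, \ldots, L; \; j = 1, 2, \ldots, p
\end{aligned} \tag{20}$$

5. Formulation as a Fuzzy Programming Problem

5.1. Phase I

To obtain the fuzzy solution we first compute the maximum value $U_k$ and the minimum value $L_k$ for each characteristic, where

$$U_k = \max_{j} V_k\left(n_j^{*}\right) \quad \text{and} \quad L_k = \min_{j} V_k\left(n_j^{*}\right); \quad j, k = 1, 2, \ldots, p \tag{21}$$

where $n_j^{*}$ denotes the optimum allocation for the $j$-th characteristic, and the maximum and minimum are taken over the values of $V_k$ at these individual optimum allocations for a particular $k$. The differences between the maximum values $U_k$ and the minimum values $L_k$ are denoted by $d_k = U_k - L_k$; $k = 1, 2, \ldots, p$.

The Fuzzy Programming Problem (FPP) corresponding to (11) at Phase I is given by the following NLPP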
$$\begin{aligned}
\text{Minimize } & \delta \\
\text{Subject to } & \sum_{h=1}^{L} \frac{\alpha_{kh}}{n_h} - \delta\, d_k \le V_k^{*} \\
& \sum_{h=1}^{L} \left(c_h + w_{h1} c_{h1}\right) n_h \le \hat{C}_0 \\
& n_h \le n'_h \\
& \delta \ge 0 \text{ and } n_h \text{ integers}; \quad h = 1, 2, \ldots, L; \; k = 1, 2, \ldots, p
\end{aligned} \tag{22}$$

where $\delta \ge 0$ is the decision variable representing the worst deviation level.

5.2. Phase II

Similarly, the Fuzzy Programming Problem corresponding to (15) at Phase II is given by the following NLPP

$$\begin{aligned}
\text{Minimize } & \delta \\
\text{Subject to } & \sum_{h=1}^{L} \frac{\beta_{kh}}{m_h} - \delta\, d_k \le V_k^{*} \\
& \sum_{h=1}^{L} c_{h2} m_h \le \hat{C}_0 \\
& m_h \le n_{h2} \\
& \delta \ge 0 \text{ and } m_h \text{ integers}; \quad h = 1, 2, \ldots, L; \; k = 1, 2, \ldots, p
\end{aligned} \tag{23}$$

where $\delta \ge 0$ is the decision variable representing the worst deviation level.

The NLPPs may be solved by using the optimization software LINGO [14]. For further information about LINGO one may visit the site: http://www.lindo.com.

6. A Numerical Example

The data in Table 1 are from [5]. A population of size $N = 3850$ is divided into four strata. Two characteristics $Y_1$ and $Y_2$ are defined on each unit of the population. The values of $X_1$ and $X_2$ are used as the auxiliary information corresponding to the main variables $Y_1$ and $Y_2$. The authors have assumed the values $R_1 = .48$ and $R_2 = 0.843$. Table 1 shows the other data. Each stratum is divided into respondents and non-respondents as shown in Table 2. It is assumed that $v_h$ and $k_h$ are known and the preliminary sample size is $n' = 1000$. In the last column of Table 2, $l = 1$ is for the respondents group and $l = 2$ is for the non-respondents group. The total cost for the survey is taken as $C = 3000$ units, out of which 1750 units are for the preliminary sample $n'$, 900 units are for Phase I and 350 units are for Phase II.

Table 1. Data for four strata and two characteristics.

h   w_h    v_h    k_h    c_h   c_h1   c_h2   S2_y1h   S2_x1h   S_xy1h   S2_y2h   S2_x2h   S_xy2h
1   0.32   0.4    0.5    1     2      3      784      4        34       444      48       68
2   0.21   0.5    0.6    1     3      4      576      9        50       676      55       94
3   0.27   0.6    0.7    1     4      5      04       34       445      936      645      84
4   0.20   0.65   0.75   1     5      6      96       97       68       6084     08       645
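To make the two compromise criteria of Sections 4 and 5 concrete, the following Python sketch evaluates, for a given candidate allocation, the goal programming objective (the sum of the smallest deviations $d_j \ge 0$ that satisfy $V_j - d_j \le V_j^{*}$) and the fuzzy deviation level (the smallest $\delta \ge 0$ that satisfies $V_k - \delta d_k \le V_k^{*}$ for every $k$). It only scores an allocation and does not search for the optimum. All function names and numbers are hypothetical toy values, not the survey data of this section.

```python
def variances(alphas, n):
    """V_j = sum_h alpha_jh / n_h for each characteristic j."""
    return [sum(a / nh for a, nh in zip(row, n)) for row in alphas]

def goal_objective(alphas, n, v_star):
    """Sum over j of the smallest d_j >= 0 with V_j - d_j <= V_j*."""
    return sum(max(0.0, v - vs) for v, vs in zip(variances(alphas, n), v_star))

def fuzzy_delta(alphas, n, v_star, spread):
    """Smallest delta >= 0 with V_k - delta * d_k <= V_k* for all k.
    spread[k] plays the role of d_k = U_k - L_k and must be positive."""
    ratios = [(v - vs) / d
              for v, vs, d in zip(variances(alphas, n), v_star, spread)]
    return max(0.0, max(ratios))

# Hypothetical toy inputs: two characteristics, two strata.
alphas = [[8.0, 2.0], [2.0, 8.0]]   # alpha_jh
n = [4, 4]                          # candidate compromise allocation
v_star = [2.25, 2.25]               # assumed individual optimum variances
spread = [1.125, 1.125]             # assumed d_k = U_k - L_k

gp_value = goal_objective(alphas, n, v_star)
delta = fuzzy_delta(alphas, n, v_star, spread)
```

In this toy case both characteristics have variance 2.5 at the candidate allocation, so each deviation is 0.25. The goal programming criterion adds the deviations, whereas the fuzzy criterion looks only at the largest deviation relative to its range $d_k$.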
Table 2. Data for groups of respondents and non-respondents.

h   Group            S2_y1hl   S2_x1hl   S_xy1hl   S2_y2hl   S2_x2hl   S_xy2hl   w_hl (l = 1, 2)
1   Respondent       36.06     03.6      57.04     767.8     55.76     333.93    w_11 = 0.70
1   Non-respondent   30.55     88.73     35.07     454.76    5.48      97.93     w_12 = 0.30
2   Respondent       373.79    4.60      6.4       449.9     69.7      95.67     w_21 = 0.80
2   Non-respondent   36.9      08.76     4.6       353.8     33.46     53.88     w_22 = 0.20
3   Respondent       930.5     309.75    404.      7.88      44.07     553.60    w_31 = 0.75
3   Non-respondent   560.8     86.85     43.48     65.98     388.46    507.0     w_32 = 0.25
4   Respondent       355.98    785.33    04.48     690.53    896.84    69.70     w_41 = 0.72
4   Non-respondent   03.08     337.69    440.53    403.55    80.8      044.93    w_42 = 0.28

Using the estimated values of the strata weights, the values of $n'_h = w_h n'$; $h = 1, 2, 3, 4$ are obtained as $n'_1 = 320$, $n'_2 = 210$, $n'_3 = 270$, $n'_4 = 200$ with $\sum_{h=1}^{4} n'_h = n' = 1000$.

6.1. Computation of Compromise Allocation Using Goal Programming Technique (GPT)

6.1.1. Individual Optimum Allocation (Phase I)

Using the data from Table 1 and Table 2, the individual optimum allocation for each characteristic is obtained as the solution to the following NLPPs.

For $j = 1$

$$\begin{aligned}
\text{Minimize } & V_1 = \frac{39.6074}{n_1} + \frac{3.87}{n_2} + \frac{55.7960}{n_3} + \frac{6.80995}{n_4} \\
\text{Subject to } & 2.4 n_1 + 3.4 n_2 + 4 n_3 + 4.6 n_4 \le 900 \\
& n_1 \le 320, \; n_2 \le 210, \; n_3 \le 270, \; n_4 \le 200 \\
& \text{and } n_h \text{ are integers}; \quad h = 1, 2, 3, 4
\end{aligned}$$

Using the optimization software LINGO we get the optimal solution as $(n_1, n_2, n_3, n_4) = (59, 80, 46, 44)$, $V_1^{*} = 2.236899$.

For $j = 2$

$$\begin{aligned}
\text{Minimize } & V_2 = \frac{09.68}{n_1} + \frac{.007}{n_2} + \frac{85.6067}{n_3} + \frac{5.4584}{n_4} \\
\text{Subject to } & 2.4 n_1 + 3.4 n_2 + 4 n_3 + 4.6 n_4 \le 900 \\
& n_1 \le 320, \; n_2 \le 210, \; n_3 \le 270, \; n_4 \le 200 \\
& \text{and } n_h \text{ are integers}; \quad h = 1, 2, 3, 4
\end{aligned}$$
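The coefficients of the Phase I cost constraint follow from the per unit expected cost $c_h + w_{h1} c_{h1}$ of (10). The short Python sketch below reproduces them under the assumption that $c_h = 1$ in every stratum, $c_{h1} = 2, 3, 4, 5$ and $w_{h1} = 0.70, 0.80, 0.75, 0.72$, which are the values implied by a constraint of the form $2.4 n_1 + 3.4 n_2 + 4 n_3 + 4.6 n_4 \le 900$. The variable names, the helper function and the trial allocation are illustrative only.

```python
# Assumed per unit costs and respondent proportions for the four strata.
c_first_attempt = [1, 1, 1, 1]        # c_h
c_respondent = [2, 3, 4, 5]           # c_h1
w_respondent = [0.70, 0.80, 0.75, 0.72]  # w_h1
phase1_budget = 900

# Expected Phase I cost per sampled unit: c_h + w_h1 * c_h1.
coeffs = [round(c + w * c1, 2)
          for c, c1, w in zip(c_first_attempt, c_respondent, w_respondent)]
# -> [2.4, 3.4, 4.0, 4.6]

def phase1_cost(n):
    """Expected Phase I cost of an allocation n = (n_1, ..., n_4)."""
    return sum(a * nh for a, nh in zip(coeffs, n))

# Hypothetical trial allocation used only to exercise the cost check.
trial = [50, 50, 50, 50]
feasible = phase1_cost(trial) <= phase1_budget
```

A check of this kind is a quick way to confirm that any reported allocation respects the budget before its variances are compared.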
Using the optimization software LINGO we get the optimal solution as $(n_1, n_2, n_3, n_4) = (83, 69, 6, 57)$, $V_2^{*} = 1.544852$.

6.1.2. Compromise Solution Using Goal Programming (Phase I)

Using the data from Table 1 and Table 2 the Goal Programming Problem (19) can be formulated as

$$\begin{aligned}
\text{Minimize } & d_1 + d_2 \\
\text{Subject to } & \frac{39.6074}{n_1} + \frac{3.87}{n_2} + \frac{55.7960}{n_3} + \frac{6.80995}{n_4} - d_1 \le 2.236899 \\
& \frac{09.68}{n_1} + \frac{.007}{n_2} + \frac{85.6067}{n_3} + \frac{5.4584}{n_4} - d_2 \le 1.544852 \\
& 2.4 n_1 + 3.4 n_2 + 4 n_3 + 4.6 n_4 \le 900 \\
& n_1 \le 320, \; n_2 \le 210, \; n_3 \le 270, \; n_4 \le 200 \\
& d_1 \ge 0, \; d_2 \ge 0 \text{ and } n_h \text{ are integers}; \quad h = 1, 2, 3, 4
\end{aligned}$$

Using the optimization software LINGO we get the optimal solution as $n_1 = 75$, $n_2 = 7$, $n_3 = 34$, $n_4 = 5$, with $d_1 = 0.0079$ and $d_2 = 0.00603434$, and the optimum value of the objective function $d_1 + d_2 = 0.0139343$.

6.1.3. Individual Optimum Allocation (Phase II)

As in Section 6.1.1, for the given data the individual optimum allocations for each of the two characteristics using NLPP (15) are:

For $j = 1$: $(m_1, m_2, m_3, m_4) = ( , , 9, 4)$, $V_1^{*} = 0.7599099$.

For $j = 2$: $(m_1, m_2, m_3, m_4) = ( , 9, 9, 6)$, $V_2^{*} = .7448$.

6.1.4. Compromise Solution Using Goal Programming (Phase II)

For the given data, as in Section 6.1.2, the Goal Programming Problem (20) gives the following optimal solution: $m_1 = $ , $m_2 = 0$, $m_3 = 0$, $m_4 = 4$, with $d_1 = 0.0075545$ and $d_2 = 0.0086650$, and the optimum value of the objective function $d_1 + d_2 = 0.00567695$.

6.2. Computations of Compromise Solution Using Fuzzy Programming Technique (FPT)

6.2.1. Compromise Solution Using Fuzzy Programming (Phase I)

To obtain the fuzzy solution we first obtain the maximum and minimum values given in (21) for each characteristic by using the individual optimum allocations worked out in Section 6.1.1:

$$U_1 = 2.256605, \quad L_1 = 2.236899, \quad U_2 = .58685, \quad L_2 = 1.544852$$
and

$$d_1 = 0.019706, \quad d_2 = 0.04954.$$

After computing the optimum allocations and optimum variances for the two characteristics, the compromise optimal solution for the above problem can be obtained by solving the Fuzzy Programming Problem (FPP) (22):

$$\begin{aligned}
\text{Minimize } & \delta \\
\text{Subject to } & \frac{39.6074}{n_1} + \frac{3.87}{n_2} + \frac{55.7960}{n_3} + \frac{6.80995}{n_4} - \delta\,(0.019706) \le 2.236899 \\
& \frac{09.68}{n_1} + \frac{.007}{n_2} + \frac{85.6067}{n_3} + \frac{5.4584}{n_4} - \delta\,(0.04954) \le 1.544852 \\
& 2.4 n_1 + 3.4 n_2 + 4 n_3 + 4.6 n_4 \le 900 \\
& n_1 \le 320, \; n_2 \le 210, \; n_3 \le 270, \; n_4 \le 200 \\
& \delta \ge 0 \text{ and } n_h \text{ are integers}; \quad h = 1, 2, 3, 4
\end{aligned}$$

Using the optimization software LINGO we get the optimal solution as $n_1 = 7$, $n_2 = 75$, $n_3 = 35$, $n_4 = 5$ with $\delta = 0.486958$.

6.2.2. Compromise Solution Using Fuzzy Programming (Phase II)

Similarly, using the data from Table 1 and Table 2 the Fuzzy Programming Problem (23) gives the following optimal solution: $m_1 = $ , $m_2 = 0$, $m_3 = 0$, $m_4 = 4$ with $\delta = 0.4337$.

7. Summary of the Results

The results obtained using the Goal Programming Technique and the Fuzzy Programming Technique are summarized in Table 3 and Table 4.

Table 3. Compromise solution at Phase I.

Technique   n_1   n_2   n_3   n_4   V_1        V_2        Trace V_1 + V_2   Cost incurred
GPT         75    7     34    5     2.244799   1.550886   3.795685          900
FPT         7     75    35    5     2.242011   1.555754   3.797765          900

Table 4. Compromise solution at Phase II.

Technique   m_1   m_2   m_3   m_4   V_1       V_2      Trace V_1 + V_2   Cost incurred
GPT               0     0     4     0.76664   .74584   .507945           350
FPT               0     0     4     0.76664   .74584   .507945           350

8. Conclusions

Table 3 and Table 4 show the values of the variances of the combined ratio estimates of the population means at Phase I and Phase II respectively, for the two characteristics. The figures show that both approaches, the Goal Programming Approach and the Fuzzy Programming Approach, give almost the same results. However, at Phase I the Goal Programming Approach is slightly more precise in terms of the trace value (see [6]).

The Goal Programming and Fuzzy Programming techniques, and some other techniques like Dynamic Programming and Separable Programming, can be used to solve a wide variety of mathematical programming problems. These techniques may also be of great help in solving multivariate sampling problems, such as determining the number of strata, strata boundaries and compromise allocations in multivariate stratified sampling. Little work has been done to solve the above mentioned optimization problems in real life situations, for example when the estimates of the population parameters used in formulating the problems are themselves treated as random variables with assumed or known distributions. In such cases the formulated problem becomes a multivariate stochastic programming problem. Further, apart from a linear cost function, nonlinear cost functions may be used that include travel cost, labour cost, rewards to the respondents, incentives to the investigators, etc. Interested researchers may explore these situations.

Acknowledgements

The authors are thankful to the Editor for his valuable remarks and suggestions that helped a lot in improving the standard of the paper. This research work is partially supported by the UGC grant of Emeritus Fellowship to the author Mohammad Jameel Ahsan, for which he is grateful to UGC.

References

[1] Hansen, M.H. and Hurwitz, W.N. (1946) The Problem of Non-Response in Sample Surveys. Journal of the American Statistical Association, 41, 517-529. http://dx.doi.org/10.1080/01621459.1946.10501894
[2] Rao, P.S.R.S. (1986) Ratio Estimation with Sub-Sampling the Non-Respondents. Survey Methodology, 12, 217-230.
[3] Rao, P.S.R.S. (1987) Ratio and Regression Estimates with Sub-Sampling the Non-Respondents. Special Contributed Session of the International Statistical Association Meeting, 2-16 September 1987, Tokyo.
[4] Khare, B.B. and Srivastva, S. (1993) Estimation of Population Mean Using Auxiliary Character in Presence of Non-Response. National Academy Science Letters, 16, 111-114.
[5] Khare, B.B. and Srivastva, S. (1997) Transformed Ratio Type Estimators for the Population Mean in the Presence of Non-Response.
Communications in Statistics: Theory and Methods, 26, 1779-1791. http://dx.doi.org/10.1080/03610929708832012
[6] Raiffa, H. and Schlaifer, R. (1961) Applied Statistical Decision Theory. Graduate School of Business Administration, Harvard University, Boston.
[7] Ericson, W.A. (1965) Optimum Stratified Sampling Using Prior Information. Journal of the American Statistical Association, 60, 750-771. http://dx.doi.org/10.1080/01621459.1965.10480825
[8] Ahsan, M.J. and Khan, S.U. (1982) Optimum Allocation in Multivariate Stratified Random Sampling with Overhead Cost. Metrika, 29, 71-78. http://dx.doi.org/10.1007/bf01893366
[9] Dayal, S. (1985) Allocation in Sample Using Values of Auxiliary Characteristics. Journal of Statistical Planning and Inference, 11, 321-328. http://dx.doi.org/10.1016/0378-3758(85)90037-0
[10] Khan, M.G.M., Maiti, T. and Ahsan, M.J. (2010) An Optimal Multivariate Stratified Sampling Design Using Auxiliary Information: An Integer Solution Using Goal Programming Approach. Journal of Official Statistics, 26, 695-708.
[11] Varshney, R., Najmussehar and Ahsan, M.J. (2012) An Optimum Multivariate Stratified Double Sampling Design in Non-Response. Optimization Letters, 6, 993-1008.
[12] Cochran, W.G. (1977) Sampling Techniques. 3rd Edition, John Wiley & Sons, New York.
[13] Schniederjans, M.J. (1995) Goal Programming: Methodology and Applications. Kluwer, Dordrecht. http://dx.doi.org/10.1007/978-1-4615-2229-4
[14] Lingo User's Guide (2013) Lingo-User's Guide. LINDO SYSTEM INC., Chicago.
[15] Haseen, S., Iftekhar, S., Ahsan, M.J. and Bari, A. (2012) A Fuzzy Approach for Solving Double Sampling Design in Presence of Non-Response. International Journal of Engineering Science and Technology, 4, 2542-2551.
[16] Sukhatme, P.V., Sukhatme, B.V., Sukhatme, S. and Asok, C. (1984) Sampling Theory of Surveys with Applications. 3rd Edition, Iowa State University Press, Iowa and Indian Society of Agricultural Statistics, New Delhi.