Open Journal of Statistics, 2016, 6, 85-95
Published Online February 2016 in SciRes. http://www.scirp.org/journal/ojs
http://dx.doi.org/10.4236/ojs.2016.610…

Improved Estimation of Rare Sensitive Attribute in a Stratified Sampling Using Poisson Distribution

Abdul Wakeel, Masood Anwar
Department of Mathematics, COMSATS Institute of Information Technology, Islamabad, Pakistan

Received December 2015; accepted February 2016; published February 2016

Copyright 2016 by authors and Scientific Research Publishing Inc. This work is licensed under the Creative Commons Attribution International License (CC BY). http://creativecommons.org/licenses/by/4.0/

Abstract

In this study, we propose a two stage randomized response model. Improved unbiased estimators of the mean number of persons possessing a rare sensitive attribute are proposed under two different situations. The proposed estimators are evaluated using a relative efficiency comparison. It is shown that our estimators are more efficient than the existing estimators when the parameter of the rare unrelated attribute is known and, in the unknown case, depending on the probability of selecting a question.

Keywords

Poisson Distribution, Rare Sensitive Attribute, Rare Unrelated Attribute, Stratified Sampling

1. Introduction

The collection of data through direct questioning on rare sensitive issues, such as extramarital affairs, family disturbances and declaring religious affiliation under conditions of extremism, is a far-reaching issue. Warner [1] introduced the randomized response procedure to procure trustworthy data for estimating $\pi$, the proportion of respondents in the population belonging to the sensitive group. Greenberg et al. [2] suggested an unrelated question randomized response model in which each individual selected in the sample was asked to reply "yes" or "no" to one of two statements: (a) Do you belong to Group A? (b) Do you belong to Group Y?
with respective probabilities $P$ and $(1-P)$. The second question asked in the sampling does not have any effect on the first question. Greenberg et al. [2] considered $\pi_A$ and $\pi_Y$, the proportions of persons possessing the sensitive and the unrelated characteristic respectively, and discussed both the cases when $\pi_Y$ was known and unknown. The probability of "yes"
responses is

$$\theta_0 = P\pi_A + (1-P)\pi_Y.$$

Mangat and Singh [3] proposed a two stage randomized response procedure which required the use of two randomization devices. The random device $R_1$ consists of two statements, namely (a) "I belong to the sensitive group", and (b) "Go to random device $R_2$", with probabilities $T$ and $(1-T)$ respectively. The random device $R_2$ uses two statements, (a) "I belong to the sensitive group", and (b) "I do not belong to the sensitive group", with known probabilities $P$ and $(1-P)$ respectively. The probability of "yes" responses, $\theta_0$, defined by them is

$$\theta_0 = T\pi + (1-T)\{P\pi + (1-P)(1-\pi)\}.$$

Later on, different modifications have been made to improve the methodology for the collection of information. Some of them are Lee et al. [4], Chaudhuri and Mukerjee [5], Mahmood et al. [6], Land et al. [7], and Bhargava and Singh [8]. Land et al. [7] proposed estimators for the mean number of persons possessing the rare sensitive attribute using the unrelated question randomized response model by utilizing a Poisson distribution. Recently, Lee et al. [4] extended the study of Land et al. [7] to stratified sampling and proposed estimators when the parameter of the rare unrelated attribute is known and unknown.

In this study, we propose improved estimators for the mean, and its variance, of the number of persons possessing a rare sensitive attribute, based on stratified sampling and using the Poisson distribution. The estimators are proposed when the parameter of the rare unrelated attribute is known and unknown. The proposed estimators are evaluated using a relative efficiency that compares their variances with those of the estimators reported in Lee et al. [4].

2. Improved Estimation of a Rare Sensitive Attribute in Stratified Sampling-Known Rare Unrelated Attributes

Consider a population of $N$ individuals which is divided into $L$ subpopulations (strata) of sizes $N_h$ ($h = 1, 2, \ldots, L$). All the subpopulations are disjoint and together comprise the whole population. In stratum $h$, $n_h$ respondents are selected by simple random sampling with replacement (SRSWR) and asked to use the pair of randomization devices $R_{1h}$ and $R_{2h}$, each consisting of two statements. The randomization device $R_{1h}$ is constructed as: (i) I possess the rare sensitive attribute $A$; (ii) Go to
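randomization device $R_{2h}$, with respective probabilities $P_{1h}$ and $(1-P_{1h})$. The randomization device $R_{2h}$ consists of two statements: (i) I possess the rare sensitive attribute $A$; (ii) I possess the rare unrelated attribute $Y$, with probabilities $P_{2h}$ and $(1-P_{2h})$ respectively. With this pair of randomization devices, the probability of a "yes" response in stratum $h$ is given by

$$\theta_{0h} = P_{1h}\pi_{Ah} + (1-P_{1h})\{P_{2h}\pi_{Ah} + (1-P_{2h})\pi_{Yh}\}, \tag{1}$$

where $\pi_{Ah}$ and $\pi_{Yh}$ are the population proportions of individuals possessing the rare sensitive and the rare unrelated attribute in the $h$th stratum, respectively. Here $\pi_{Yh}$ is assumed to be known. Since $A$ and $Y$ are very rare attributes, $n_h\theta_{0h} = \lambda_{0h}$ is finite, assuming $n_h \to \infty$ and $\theta_{0h} \to 0$. Let $x_{h1}, x_{h2}, \ldots, x_{hn_h}$ be a random sample in stratum $h$ from a Poisson distribution with parameter $\lambda_{0h}$. Then the maximum likelihood estimator for the mean number of persons who have the rare sensitive attribute in stratum $h$, $\lambda_{Ah} = n_h\pi_{Ah}$, is given by

$$\hat\lambda_{Ah} = \frac{1}{P_{1h}+(1-P_{1h})P_{2h}}\left[\frac{1}{n_h}\sum_{i=1}^{n_h}x_{hi} - (1-P_{1h})(1-P_{2h})\lambda_{Yh}\right], \tag{2}$$

where $\lambda_{Yh} = n_h\pi_{Yh}$ is the known mean number of persons who have the rare unrelated attribute in stratum $h$. The parameter $\lambda_A = \sum_{h=1}^{L}W_h\lambda_{Ah}$ is the mean number of persons possessing the rare sensitive attribute in a population of size $N$, and its estimator is given by

$$\hat\lambda_A = \sum_{h=1}^{L}W_h\hat\lambda_{Ah} = \sum_{h=1}^{L}\frac{W_h}{P_{1h}+(1-P_{1h})P_{2h}}\left[\frac{1}{n_h}\sum_{i=1}^{n_h}x_{hi} - (1-P_{1h})(1-P_{2h})\lambda_{Yh}\right], \tag{3}$$

where $W_h = N_h/N$.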
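As an illustration of Equations (2) and (3), the following Python sketch computes the stratified estimate from per-stratum response counts. It is only a minimal sketch under stated assumptions: the function and argument names (`selection_prob`, `lambda_a_known`, `samples`, `weights`) are hypothetical and do not come from the paper, and each stratum's responses are assumed to be supplied as a plain list of counts.

```python
from typing import Sequence


def selection_prob(p1: float, p2: float) -> float:
    """Chance that the sensitive statement is answered: P1 + (1 - P1) * P2."""
    return p1 + (1.0 - p1) * p2


def lambda_a_known(
    samples: Sequence[Sequence[float]],  # x_hi for each stratum h (hypothetical layout)
    weights: Sequence[float],            # W_h = N_h / N
    p1: Sequence[float],                 # P_1h
    p2: Sequence[float],                 # P_2h
    lam_y: Sequence[float],              # known lambda_Yh
) -> float:
    """Stratified estimate of lambda_A, following Equations (2) and (3)."""
    total = 0.0
    for x_h, w_h, p1h, p2h, lam_yh in zip(samples, weights, p1, p2, lam_y):
        mean_h = sum(x_h) / len(x_h)                        # sample mean in stratum h
        unrelated_part = (1.0 - p1h) * (1.0 - p2h) * lam_yh  # share due to attribute Y
        lam_ah = (mean_h - unrelated_part) / selection_prob(p1h, p2h)
        total += w_h * lam_ah                                # weight by stratum size
    return total
```

The stratum means are corrected for the known contribution of the unrelated attribute before being rescaled, so an estimate can be negative in small samples; the sketch does not truncate such values.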
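The variance of the estimator $\hat\lambda_{Ah}$ in each stratum is given by

$$V(\hat\lambda_{Ah}) = \frac{S_h^2}{n_h}, \tag{4}$$

where

$$S_h^2 = \frac{\lambda_{Ah}}{P_{1h}+(1-P_{1h})P_{2h}} + \frac{(1-P_{1h})(1-P_{2h})\lambda_{Yh}}{\{P_{1h}+(1-P_{1h})P_{2h}\}^2}.$$

Thus, the variance expression of the estimator $\hat\lambda_A$ may be derived as

$$V(\hat\lambda_A) = \sum_{h=1}^{L}W_h^2V(\hat\lambda_{Ah}) = \sum_{h=1}^{L}\frac{W_h^2S_h^2}{n_h}. \tag{5}$$

THEOREM 1. $\hat\lambda_A$ is an unbiased estimator of $\lambda_A$.

Proof. From (3), we have

$$E(\hat\lambda_A) = \sum_{h=1}^{L}\frac{W_h}{P_{1h}+(1-P_{1h})P_{2h}}\left[\frac{1}{n_h}\sum_{i=1}^{n_h}E(x_{hi}) - (1-P_{1h})(1-P_{2h})\lambda_{Yh}\right] = \sum_{h=1}^{L}\frac{W_h}{P_{1h}+(1-P_{1h})P_{2h}}\left[\lambda_{0h} - (1-P_{1h})(1-P_{2h})\lambda_{Yh}\right] = \sum_{h=1}^{L}W_h\lambda_{Ah} = \lambda_A.$$

THEOREM 2. The unbiased estimator for $V(\hat\lambda_A)$ is given by

$$\hat V(\hat\lambda_A) = \sum_{h=1}^{L}\frac{W_h^2}{n_h^2\{P_{1h}+(1-P_{1h})P_{2h}\}^2}\sum_{i=1}^{n_h}x_{hi}. \tag{6}$$

Proof.

$$E\{\hat V(\hat\lambda_A)\} = \sum_{h=1}^{L}\frac{W_h^2}{n_h^2\{P_{1h}+(1-P_{1h})P_{2h}\}^2}\sum_{i=1}^{n_h}E(x_{hi}) = \sum_{h=1}^{L}\frac{W_h^2\lambda_{0h}}{n_h\{P_{1h}+(1-P_{1h})P_{2h}\}^2} = V(\hat\lambda_A).$$

Now, we consider the proportional and optimal allocations of the total sample size $n$ into the different strata. The method of proportional allocation defines the sample size in each stratum according to the stratum size. Since the sample size in each stratum is defined as $n_h = n(N_h/N)$, the variance of the estimator $\hat\lambda_A$ under proportional allocation of the sample size is given by

$$V(\hat\lambda_A)_{prop} = \frac{1}{n}\sum_{h=1}^{L}W_hS_h^2. \tag{7}$$

However, the optimal allocation is a technique to define the sample sizes so as to minimize the variance for a given cost, or to minimize the cost for a specified variance. Here $n_h$ is proportional to the standard deviation $S_h$ of the variable. In stratified sampling, let the cost function be defined as $C = c_0 + \sum_{h=1}^{L}c_hn_h$, where $c_0$ is the fixed cost and $c_h$ is the cost per respondent in stratum $h$. Within each stratum the cost is proportional to the size of the sample, but the cost $c_h$ may vary from stratum to stratum. For fixed cost, using the Cauchy-Schwarz inequality, the sample size $n_h$ that minimizes $V(\hat\lambda_A)$ is given by

$$n_h = n\,\frac{W_hS_h/\sqrt{c_h}}{\sum_{h=1}^{L}W_hS_h/\sqrt{c_h}}. \tag{8}$$

So the minimum variance of the estimator $\hat\lambda_A$ for the specified cost $C$ under the optimum allocation of sample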
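size is given by

$$V(\hat\lambda_A)_{opt} = \frac{1}{n}\left[\sum_{h=1}^{L}W_hS_h\sqrt{c_h}\right]\left[\sum_{h=1}^{L}\frac{W_hS_h}{\sqrt{c_h}}\right], \tag{9}$$

where $S_h^2 = n_hV(\hat\lambda_{Ah})$ is the per-respondent variance term of Equation (4).

3. Improved Estimation of a Rare Sensitive Attribute in Stratified Sampling-Unknown Rare Unrelated Attributes

In this section, the estimators for the mean number of persons with the rare sensitive attribute are proposed under the assumption that the sizes of the strata are known; however, $\lambda_{Yh} = n_h\pi_{Yh}$, the mean of the rare unrelated attribute, is unknown. In this case each selected respondent from stratum $h$ is asked to use a sequential pair of randomization devices. That is, in the $h$th stratum, $n_h$ respondents are asked to use the randomization devices $R_{1h}$ and $R_{2h}$, each consisting of two statements. The device $R_{1h}$ consists of two statements: (i) I possess the rare sensitive attribute $A$; (ii) Go to randomization device $R_{2h}$. The statements occur with respective probabilities $P_{1h}$ and $(1-P_{1h})$. The two statements of the randomization device $R_{2h}$ are: (i) I possess the rare sensitive attribute $A$; (ii) I possess the rare unrelated attribute $Y$, presented with respective probabilities $P_{2h}$ and $(1-P_{2h})$. After using the first pair of randomization devices, the respondent is asked to use the same pair of devices $R_{1h}$ and $R_{2h}$ but with probabilities $T_{1h}$, $(1-T_{1h})$ and $T_{2h}$, $(1-T_{2h})$, respectively. The probabilities of the "yes" responses for the first and second use of the pair of randomization devices are respectively given by

$$\theta_{1h} = P_{1h}\pi_{Ah} + (1-P_{1h})\{P_{2h}\pi_{Ah} + (1-P_{2h})\pi_{Yh}\} \tag{10}$$

and

$$\theta_{2h} = T_{1h}\pi_{Ah} + (1-T_{1h})\{T_{2h}\pi_{Ah} + (1-T_{2h})\pi_{Yh}\}, \tag{11}$$

where $\pi_{Ah}$ and $\pi_{Yh}$ are the respective population proportions of the rare sensitive and the rare unrelated attribute in stratum $h$. As $n_h$ is large and $\pi_{Ah}, \pi_{Yh} \to 0$, therefore $\theta_{1h}, \theta_{2h} \to 0$. Now, obviously $n_h\theta_{1h} = \lambda_{1h}$ and $n_h\theta_{2h} = \lambda_{2h}$. Let $x_{1hi}$ and $x_{2hi}$ ($h = 1, 2, \ldots, L$; $i = 1, 2, \ldots, n_h$) be the pair of responses from the $i$th respondent selected in the $h$th stratum. We have

$$\mathrm{Var}(x_{1hi}) = E(x_{1hi}) = \lambda_{1h} = \{P_{1h}+(1-P_{1h})P_{2h}\}\lambda_{Ah} + (1-P_{1h})(1-P_{2h})\lambda_{Yh}, \tag{12}$$

$$\mathrm{Var}(x_{2hi}) = E(x_{2hi}) = \lambda_{2h} = \{T_{1h}+(1-T_{1h})T_{2h}\}\lambda_{Ah} + (1-T_{1h})(1-T_{2h})\lambda_{Yh}, \tag{13}$$

$$\mathrm{Cov}(x_{1hi}, x_{2hi}) = E(x_{1hi}x_{2hi}) - E(x_{1hi})E(x_{2hi}) = \{P_{1h}+(1-P_{1h})P_{2h}\}\{T_{1h}+(1-T_{1h})T_{2h}\}\lambda_{Ah} + (1-P_{1h})(1-P_{2h})(1-T_{1h})(1-T_{2h})\lambda_{Yh}. \tag{14}$$

Following the expressions given in Equations (12) and (13), we have the sample means for both sets of responses as

$$\frac{1}{n_h}\sum_{i=1}^{n_h}x_{1hi} = \{P_{1h}+(1-P_{1h})P_{2h}\}\hat\lambda_{Ah} + (1-P_{1h})(1-P_{2h})\hat\lambda_{Yh} \tag{15}$$

and

$$\frac{1}{n_h}\sum_{i=1}^{n_h}x_{2hi} = \{T_{1h}+(1-T_{1h})T_{2h}\}\hat\lambda_{Ah} + (1-T_{1h})(1-T_{2h})\hat\lambda_{Yh}. \tag{16}$$

By solving (15) and (16), we get the estimators of $\lambda_{Ah}$ and $\lambda_{Yh}$ as

$$\hat\lambda_{Ah} = \frac{1}{n_hB_h}\left[(1-T_{1h})(1-T_{2h})\sum_{i=1}^{n_h}x_{1hi} - (1-P_{1h})(1-P_{2h})\sum_{i=1}^{n_h}x_{2hi}\right], \tag{17}$$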
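where $B_h = \{P_{1h}+(1-P_{1h})P_{2h}\} - \{T_{1h}+(1-T_{1h})T_{2h}\}$, and

$$\hat\lambda_{Yh} = \frac{1}{n_hD_h}\left[\{T_{1h}+(1-T_{1h})T_{2h}\}\sum_{i=1}^{n_h}x_{1hi} - \{P_{1h}+(1-P_{1h})P_{2h}\}\sum_{i=1}^{n_h}x_{2hi}\right], \tag{18}$$

where $D_h = \{T_{1h}+(1-T_{1h})T_{2h}\} - \{P_{1h}+(1-P_{1h})P_{2h}\} = -B_h$. The variance of $\hat\lambda_{Ah}$ is

$$V(\hat\lambda_{Ah}) = \frac{1}{n_h^2B_h^2}\left[(1-T_{1h})^2(1-T_{2h})^2\sum_{i=1}^{n_h}\mathrm{Var}(x_{1hi}) + (1-P_{1h})^2(1-P_{2h})^2\sum_{i=1}^{n_h}\mathrm{Var}(x_{2hi}) - 2(1-T_{1h})(1-T_{2h})(1-P_{1h})(1-P_{2h})\sum_{i=1}^{n_h}\mathrm{Cov}(x_{1hi}, x_{2hi})\right]. \tag{19}$$

Putting (12), (13) and (14) in (19) we get

$$V(\hat\lambda_{Ah}) = \frac{1}{n_hB_h^2}\left[\Delta_{1h} + \Delta_{2h}\right], \tag{20}$$

where

$$\Delta_{1h} = \lambda_{Ah}\left[(1-T_{1h})^2(1-T_{2h})^2\{P_{1h}+(1-P_{1h})P_{2h}\} + (1-P_{1h})^2(1-P_{2h})^2\{T_{1h}+(1-T_{1h})T_{2h}\} - 2(1-T_{1h})(1-T_{2h})(1-P_{1h})(1-P_{2h})\{T_{1h}+(1-T_{1h})T_{2h}\}\{P_{1h}+(1-P_{1h})P_{2h}\}\right],$$

$$\Delta_{2h} = \lambda_{Yh}(1-T_{1h})(1-T_{2h})(1-P_{1h})(1-P_{2h})\left[1 - \{T_{1h}+(1-T_{1h})T_{2h}\}\{P_{1h}+(1-P_{1h})P_{2h}\} - (1-T_{1h})(1-T_{2h})(1-P_{1h})(1-P_{2h})\right].$$

The stratified estimators of $\lambda_A$ and $\lambda_Y$ are defined as

$$\hat\lambda_A = \sum_{h=1}^{L}W_h\hat\lambda_{Ah} \tag{21}$$

and

$$\hat\lambda_Y = \sum_{h=1}^{L}W_h\hat\lambda_{Yh}. \tag{22}$$

THEOREM 3. $\hat\lambda_A$ is an unbiased estimator for $\lambda_A$.

Proof.

$$E(\hat\lambda_A) = \sum_{h=1}^{L}W_hE(\hat\lambda_{Ah}) = \sum_{h=1}^{L}\frac{W_h}{n_hB_h}\left[(1-T_{1h})(1-T_{2h})\sum_{i=1}^{n_h}E(x_{1hi}) - (1-P_{1h})(1-P_{2h})\sum_{i=1}^{n_h}E(x_{2hi})\right] = \sum_{h=1}^{L}\frac{W_h}{B_h}\left[(1-T_{1h})(1-T_{2h})\lambda_{1h} - (1-P_{1h})(1-P_{2h})\lambda_{2h}\right].$$

Putting the values of $\lambda_{1h}$ and $\lambda_{2h}$ in this expression, we get the result.

THEOREM 4. The variance of $\hat\lambda_A$ is given by

$$V(\hat\lambda_A) = \sum_{h=1}^{L}\frac{W_h^2}{n_hB_h^2}\left[\Delta_{1h} + \Delta_{2h}\right], \tag{23}$$

where $\Delta_{1h}$ and $\Delta_{2h}$ are as defined after Equation (20).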
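Proof. Since $\hat\lambda_A = \sum_{h=1}^{L}W_h\hat\lambda_{Ah}$, we have

$$V(\hat\lambda_A) = V\left(\sum_{h=1}^{L}W_h\hat\lambda_{Ah}\right) = \sum_{h=1}^{L}W_h^2V(\hat\lambda_{Ah}). \tag{24}$$

On putting (20) in (24) we have the theorem.

Corollary 1: An unbiased estimator for the variance of the rare sensitive attribute estimator is given by

$$\hat V(\hat\lambda_A) = \sum_{h=1}^{L}\frac{W_h^2}{n_hB_h^2}\left[\hat\Delta_{1h} + \hat\Delta_{2h}\right], \tag{25}$$

where $\hat\Delta_{1h}$ and $\hat\Delta_{2h}$ are the two bracketed terms of Equation (20) with $\lambda_{Ah}$ and $\lambda_{Yh}$ replaced by their estimators $\hat\lambda_{Ah}$ and $\hat\lambda_{Yh}$. It can be proved easily.

THEOREM 5. $\hat\lambda_Y$ is an unbiased estimator of $\lambda_Y$.

Proof. From (18), we have

$$E(\hat\lambda_Y) = \sum_{h=1}^{L}W_hE(\hat\lambda_{Yh}) = \sum_{h=1}^{L}\frac{W_h}{n_hD_h}\left[\{T_{1h}+(1-T_{1h})T_{2h}\}\sum_{i=1}^{n_h}E(x_{1hi}) - \{P_{1h}+(1-P_{1h})P_{2h}\}\sum_{i=1}^{n_h}E(x_{2hi})\right] = \sum_{h=1}^{L}W_h\lambda_{Yh} = \lambda_Y.$$

Corollary 2: An unbiased estimator for $V(\hat\lambda_Y)$ is given by

$$\hat V(\hat\lambda_Y) = \sum_{h=1}^{L}\frac{W_h^2}{n_hD_h^2}\left[C_{1h}\hat\lambda_{Ah} + C_{2h}\hat\lambda_{Yh}\right], \tag{26}$$

where

$$C_{1h} = \{T_{1h}+(1-T_{1h})T_{2h}\}^2\{P_{1h}+(1-P_{1h})P_{2h}\} + \{P_{1h}+(1-P_{1h})P_{2h}\}^2\{T_{1h}+(1-T_{1h})T_{2h}\} - 2\{P_{1h}+(1-P_{1h})P_{2h}\}^2\{T_{1h}+(1-T_{1h})T_{2h}\}^2,$$

$$C_{2h} = \{T_{1h}+(1-T_{1h})T_{2h}\}^2(1-P_{1h})(1-P_{2h}) + \{P_{1h}+(1-P_{1h})P_{2h}\}^2(1-T_{1h})(1-T_{2h}) - 2\{P_{1h}+(1-P_{1h})P_{2h}\}\{T_{1h}+(1-T_{1h})T_{2h}\}(1-P_{1h})(1-P_{2h})(1-T_{1h})(1-T_{2h}),$$

$$D_h = \{T_{1h}+(1-T_{1h})T_{2h}\} - \{P_{1h}+(1-P_{1h})P_{2h}\}.$$

Now under proportional allocation of the sample size, the variance of $\hat\lambda_A$ is given by

$$V(\hat\lambda_A)_{prop} = \frac{1}{n}\sum_{h=1}^{L}\frac{W_h}{B_h^2}\left[\Delta_{1h} + \Delta_{2h}\right],$$

where $\Delta_{1h}$ and $\Delta_{2h}$ are the two bracketed terms of Equation (20). However, in optimum allocation, the sample size in stratum $h$ is

$$n_h = n\,\frac{W_h\sqrt{\Delta_{1h}+\Delta_{2h}}\big/\left(|B_h|\sqrt{c_h}\right)}{\sum_{h=1}^{L}W_h\sqrt{\Delta_{1h}+\Delta_{2h}}\big/\left(|B_h|\sqrt{c_h}\right)}$$

and the variance of $\hat\lambda_A$ is given by

$$V(\hat\lambda_A)_{opt} = \frac{1}{n}\left[\sum_{h=1}^{L}\frac{W_h\sqrt{\Delta_{1h}+\Delta_{2h}}\sqrt{c_h}}{|B_h|}\right]\left[\sum_{h=1}^{L}\frac{W_h\sqrt{\Delta_{1h}+\Delta_{2h}}}{|B_h|\sqrt{c_h}}\right].$$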
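The unknown-case estimators in Equations (17) and (18) amount to solving, in each stratum, the two linear equations (15) and (16) for the two unknown means. The Python sketch below illustrates that step and the weighting in Equations (21) and (22). It is a sketch under stated assumptions rather than a reference implementation: the names `stratum_estimates_unknown` and `stratified_estimates_unknown` and the list-based input layout are hypothetical.

```python
from typing import List, Sequence, Tuple


def stratum_estimates_unknown(
    x1: Sequence[float],  # first-pass responses x_1hi
    x2: Sequence[float],  # second-pass responses x_2hi
    p1: float, p2: float,  # P_1h, P_2h
    t1: float, t2: float,  # T_1h, T_2h
) -> Tuple[float, float]:
    """Return (lambda_Ah_hat, lambda_Yh_hat) from Equations (17) and (18)."""
    p_sel = p1 + (1.0 - p1) * p2   # P_1h + (1 - P_1h) P_2h
    t_sel = t1 + (1.0 - t1) * t2   # T_1h + (1 - T_1h) T_2h
    b = p_sel - t_sel              # B_h; note that D_h = -B_h
    if b == 0.0:
        raise ValueError("the two device settings must differ (B_h = 0)")
    m1 = sum(x1) / len(x1)
    m2 = sum(x2) / len(x2)
    lam_a = ((1.0 - t_sel) * m1 - (1.0 - p_sel) * m2) / b
    lam_y = (t_sel * m1 - p_sel * m2) / (-b)
    return lam_a, lam_y


def stratified_estimates_unknown(
    strata: List[Tuple[Sequence[float], Sequence[float], float, float, float, float]],
    weights: Sequence[float],  # W_h = N_h / N
) -> Tuple[float, float]:
    """Weighted sums of the stratum estimates, as in Equations (21) and (22)."""
    lam_a_total = 0.0
    lam_y_total = 0.0
    for (x1, x2, p1, p2, t1, t2), w in zip(strata, weights):
        lam_a, lam_y = stratum_estimates_unknown(x1, x2, p1, p2, t1, t2)
        lam_a_total += w * lam_a
        lam_y_total += w * lam_y
    return lam_a_total, lam_y_total
```

The guard on `b` reflects the fact that both estimators divide by $B_h$: if the two passes use the same selection probabilities, the two equations are identical and the means cannot be separated.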
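4. Relative Efficiency

Lee et al. [4] proposed the variance of the estimator of the rare sensitive attribute based on the Poisson distribution when the rare unrelated attribute is known and unknown, respectively, as

$$V(\hat\lambda_A)_{\mathrm{Lee}} = \sum_{h=1}^{L}\frac{W_h^2}{n_h}\left[\frac{\lambda_{Ah}}{P_h} + \frac{(1-P_h)\lambda_{Yh}}{P_h^2}\right], \tag{27}$$

$$V(\hat\lambda_A)_{\mathrm{Lee}} = \sum_{h=1}^{L}\frac{W_h^2}{n_h[P_h - T_h]^2}\left[\Lambda_{1h} + \Lambda_{2h}\right], \tag{28}$$

where

$$\Lambda_{1h} = \lambda_{Ah}\{P_h(1-T_h)^2 + T_h(1-P_h)^2 - 2P_hT_h(1-P_h)(1-T_h)\},$$

$$\Lambda_{2h} = \lambda_{Yh}(1-P_h)(1-T_h)\{1 - P_hT_h - (1-P_h)(1-T_h)\}.$$

For comparison of the proposed estimator with that of Lee et al. [4], the relative efficiency is given by

$$RE = \frac{V(\hat\lambda_A)_{\mathrm{Lee}}}{V(\hat\lambda_A)}.$$

Large samples are required to estimate the means of a rare sensitive attribute. So, in order to study the relative efficiency, we consider a large hypothetical population, setting $N = 10000$ with two strata having $N_1 = 4000$ and $N_2 = 6000$. We choose values of the parameters $\lambda_{A1}, \lambda_{Y1}, \lambda_{A2}, \lambda_{Y2}$ as (0.5, 5), (5, 0.5), (5, 5) and (0.5, 0.5), and we let the value of $P_1$ range from 0.3 to 0.7, and that of $P_2$ range from 0.6 to 0.9, when the weights are $W_1 = 0.4$ and $W_2 = 0.6$, and $W_1 = 0.6$ and $W_2 = 0.4$, which is proportional allocation. Also, let $\lambda_{A1} = \lambda_{A2}$ and $\lambda_{Y1} = \lambda_{Y2}$.

4.1. Relative Efficiency When Rare Unrelated Attribute Is Known

Let $V(\hat\lambda_A)$ be the variance of the proposed estimator for the rare sensitive attribute when the parameter of the rare unrelated attribute is known. The relative efficiency of the proposed estimator with respect to the Lee et al. [4] estimator is defined as

$$RE = \frac{V(\hat\lambda_A)_{\mathrm{Lee}}}{V(\hat\lambda_A)} = \frac{\sum_{h=1}^{L}W_h\left[\dfrac{\lambda_{Ah}}{P_h} + \dfrac{(1-P_h)\lambda_{Yh}}{P_h^2}\right]}{\sum_{h=1}^{L}W_h\left[\dfrac{\lambda_{Ah}}{P_{1h}+(1-P_{1h})P_{2h}} + \dfrac{(1-P_{1h})(1-P_{2h})\lambda_{Yh}}{\{P_{1h}+(1-P_{1h})P_{2h}\}^2}\right]}. \tag{29}$$

Under proportional allocation $n_h = nW_h$, so the factor $1/n$ cancels between numerator and denominator; from Equation (29) it is therefore evident that the relative efficiency of the proposed estimator is free from the sample size. We set the design probabilities as $P_{11} = P_{12}$ and $P_{21} = P_{22}$. In Table 1, the relative efficiencies are given with parameter values $\lambda_{A1}, \lambda_{Y1}, \lambda_{A2}, \lambda_{Y2}$ as (0.5, 5), (5, 0.5), (5, 5) and (0.5, 0.5); $P_1$ varies from 0.3 to 0.7, and $P_2$ from 0.6 to 0.9, with weights $W_1 = 0.4, 0.6$ ($W_1 + W_2 = 1$). It is evident that the proposed estimator has efficiency greater than 1 in all cases, and is always better than the Lee et al. [4] estimator. A study of Figure 1 confirms this.

4.2. Relative Efficiency When Rare Unrelated Attribute Is Unknown

Let $V(\hat\lambda_A)$ be the variance of the proposed estimator for the rare sensitive attribute when the parameter of the rare unrelated attribute is unknown. The relative efficiency of the proposed estimator with respect to the Lee et al. [4] estimator is defined as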
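$$RE = \frac{V(\hat\lambda_A)_{\mathrm{Lee}}}{V(\hat\lambda_A)} = \frac{\sum_{h=1}^{L}\dfrac{W_h}{[P_h - T_h]^2}\left[\Lambda_{1h} + \Lambda_{2h}\right]}{\sum_{h=1}^{L}\dfrac{W_h}{\left[\{P_{1h}+(1-P_{1h})P_{2h}\} - \{T_{1h}+(1-T_{1h})T_{2h}\}\right]^2}\left[\Delta_{1h} + \Delta_{2h}\right]}, \tag{30}$$

where $\Lambda_{1h}$ and $\Lambda_{2h}$ are the terms of Equation (28) and $\Delta_{1h}$ and $\Delta_{2h}$ are the two bracketed terms of Equation (20).

Figure 1. Relative efficiency (RE) of the proposed model with respect to Lee et al. [4] for $W_1 = 0.4$ and $P_1 = 0.3$ to 0.8.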
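The ratios in Equations (29) and (30) are straightforward to evaluate numerically under proportional allocation. The following Python sketch shows one way to do so. It is illustrative only: all function names are hypothetical, and because the text does not state how the single-stage probabilities $P_h$ and $T_h$ of Lee et al. [4] are tied to the two-stage design probabilities, they are passed in as separate arguments (`p_lee`, `t_lee`), which is an assumption of this sketch.

```python
from typing import Sequence


def selection_prob(p1: float, p2: float) -> float:
    """Two-stage chance of answering the sensitive statement: P1 + (1 - P1) * P2."""
    return p1 + (1.0 - p1) * p2


def known_term(lam_a: float, lam_y: float, p_sel: float) -> float:
    """Bracket of Equations (4)/(27): lam_A / p + (1 - p) * lam_Y / p**2."""
    return lam_a / p_sel + (1.0 - p_sel) * lam_y / p_sel ** 2


def unknown_term(lam_a: float, lam_y: float, p_sel: float, t_sel: float) -> float:
    """Bracket of Equations (20)/(28) divided by (p_sel - t_sel)**2."""
    qp, qt = 1.0 - p_sel, 1.0 - t_sel
    part_a = lam_a * (qt ** 2 * p_sel + qp ** 2 * t_sel - 2.0 * p_sel * t_sel * qp * qt)
    part_y = lam_y * qp * qt * (1.0 - p_sel * t_sel - qp * qt)
    return (part_a + part_y) / (p_sel - t_sel) ** 2


def re_known(weights: Sequence[float], lam_a: Sequence[float], lam_y: Sequence[float],
             p_lee: Sequence[float], p1: Sequence[float], p2: Sequence[float]) -> float:
    """Equation (29) under proportional allocation (n cancels)."""
    num = sum(w * known_term(a, y, p) for w, a, y, p in zip(weights, lam_a, lam_y, p_lee))
    den = sum(w * known_term(a, y, selection_prob(q1, q2))
              for w, a, y, q1, q2 in zip(weights, lam_a, lam_y, p1, p2))
    return num / den


def re_unknown(weights: Sequence[float], lam_a: Sequence[float], lam_y: Sequence[float],
               p_lee: Sequence[float], t_lee: Sequence[float],
               p1: Sequence[float], p2: Sequence[float],
               t1: Sequence[float], t2: Sequence[float]) -> float:
    """Equation (30) under proportional allocation (n cancels)."""
    num = sum(w * unknown_term(a, y, p, t)
              for w, a, y, p, t in zip(weights, lam_a, lam_y, p_lee, t_lee))
    den = sum(w * unknown_term(a, y, selection_prob(q1, q2), selection_prob(s1, s2))
              for w, a, y, q1, q2, s1, s2 in zip(weights, lam_a, lam_y, p1, p2, t1, t2))
    return num / den
```

Both variance brackets have the same algebraic form for the single-stage and the two-stage design; only the effective selection probability changes. This is why one helper per case serves both the numerator and the denominator.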
The relative efficiency of the proposed estimator is free from the sample size. For the analysis, the design probabilities are fixed as $P_{11} = P_{12}$, $P_{21} = P_{22}$, $T_{11} = T_{12}$, $T_{21} = T_{22}$. We set $\lambda_{A1} = \lambda_{A2}$ and $\lambda_{Y1} = \lambda_{Y2}$, with parameter values of $\lambda_{A1}, \lambda_{Y1}, \lambda_{A2}, \lambda_{Y2}$ as (0.5, 5), (5, 0.5), (5, 5), and $P = 0.6$, $T_1 = 0.3, 0.4$, $T_2 = 0, 0.3, 0.4, 0.5$ and $W_1 = 0.4, 0.5$ ($W_1 + W_2 = 1$). The relative efficiencies given in Table 2 show that the proposed estimator outperforms the Lee et al. [4] estimator, with efficiency greater than 1, if we set the probabilities as $P > T$. However, the relative efficiency starts decreasing as we take $P < T$. A study of Figure 2 confirms this. Also, when $W_1$ increases, the relative efficiency of the proposed estimator increases.

Table 1. Relative efficiency of the proposed estimator with respect to Lee et al. (2013), $W_1 = 0.4$ and $W_1 = 0.6$.
Columns: $P_1$; $\lambda_A$; $\lambda_Y$; RE for $P_2$ = 0.6, 0.7, 0.8, 0.9 with $W_1 = 0.4$; RE for $P_2$ = 0.6, 0.7, 0.8, 0.9 with $W_1 = 0.6$.
03 05 5 7346 589 4758 3966 5630 464 399 585
5 5 938 706 5439 466 7336 5334 39 855
5 05 98 973 6887 506 0003 777 57 353
04 05 5 873 6667 58 469 6863 508 373 768
5 5 435 8333 666 4574 936 650 4567 333
5 05 6070 568 85 565 349 9436 6447 407
05 05 5 0097 750 570 437 809 5779 448 95
5 5 375 9699 6908 4885 00 40 775 537 343
5 05 30537 445 977 638 757 848 7776 4633
06 05 5 6090 0489 07 090 9370 6545 4576 335
5 5 9600 404 35 377 3596 906 59 3698
5 05 677 636 596 64 377 4550 95 59
07 05 5 747 4383 464 063 064 735 5005 338
5 5 5 6900 3806 66 5897 0346 66 3984
5 05 33 95 758 350 3759 7587 0776 583

Figure 2. Relative efficiency (RE) of the proposed model with respect to Lee et al. [4] for indicated values.
Table 2. Relative efficiency of the proposed estimator with respect to Lee et al. (2013), $W_1 = 0.4$ and $W_1 = 0.5$.
Columns: $P_{11} = P_{12}$; $P_{21} = P_{22}$; $T_{11} = T_{12}$; $T_{21} = T_{22}$; $\lambda_{A1} = \lambda_{A2}$; $\lambda_{Y1} = \lambda_{Y2}$; RE ($W_1 = 0.4$); RE ($W_1 = 0.5$).
06 06 03 0 5 05 597 57464
5 5 6957 896
05 5 0005 5064
03 5 05 0396 9908
5 5 3985 7484
05 5 854 0378
04 5 05 888 035
5 5 086 3773
05 5 65033 89
05 5 05 59836 74795
5 5 8050 0065
05 5 4754 59405
06 06 04 0 5 05 3703 3969
5 5 44483 55603
05 5 7607 4509
03 5 05 5759 398
5 5 364 4578
05 5 43 8038
04 5 05 984 4768
5 5 780 3475
05 5 754 568
05 5 05 3870 7338
5 5 946 436
05 5 078 5098

5. Conclusion

In this study, a two stage randomized response model is proposed, with improved estimators for the mean, and its variance, of the number of persons possessing a rare sensitive attribute, based on stratified sampling and using the Poisson distribution. It is shown that the proposed method has better efficiency than the existing randomized response model when the parameter of the rare unrelated attribute is known and, in the unknown case, depending on the probability of selecting a question. For future work, more sensitive information could be obtained from respondents by using stratified double sampling with the proposed model.

References

[1] Warner, S.L. (1965) Randomized Response: A Survey Technique for Eliminating Evasive Answer Bias. Journal of
the American Statistical Association, 60, 63-69. http://dx.doi.org/10.1080/01621459.1965.10480775
[2] Greenberg, B.G., Abul-Ela, A.L.A., Simmons, W.R. and Horvitz, D.G. (1969) The Unrelated Question Randomized Response Model: Theoretical Framework. Journal of the American Statistical Association, 64, 520-539. http://dx.doi.org/10.1080/01621459.1969.10500991
[3] Mangat, N.S. and Singh, R. (1990) An Alternative Randomized Response Procedure. Biometrika, 77, 439-442.
[4] Lee, G.S., Uhm, D. and Kim, J.M. (2013) Estimation of a Rare Sensitive Attribute in a Stratified Sample Using Poisson Distribution. Statistics, 47, 685-709. http://dx.doi.org/10.1080/02331888.2011.625503
[5] Chaudhuri, A. and Mukerjee, R. (1988) Randomized Response: Theory and Techniques. Marcel Dekker, New York.
[6] Mahmood, M., Singh, S. and Horn, S. (1998) On the Confidentiality Guaranteed under Randomized Response Sampling: A Comparison with Several New Techniques. Biometrical Journal, 40, 237-242. http://dx.doi.org/10.1002/(SICI)1521-4036(199806)40:2<237::AID-BIMJ237>3.0.CO;2-N
[7] Land, M., Singh, S. and Sedory, S.A. (2012) Estimation of a Rare Sensitive Attribute Using Poisson Distribution. Statistics, 46, 351-360. http://dx.doi.org/10.1080/02331888.2010.524300
[8] Bhargava, M. and Singh, R. (2000) A Modified Randomization Device for Warner's Model. Statistica, 60, 315-321.