Estimation of Finite Population Total under PPS Sampling in Presence of Extra Auxiliary Information

Internatonal Journal of Stattc and Analy. ISSN 2248-9959 Volume 6, Number 1 (2016), pp. 9-16 Reearch Inda Publcaton http://www.rpublcaton.com Etmaton of Fnte Populaton Total under PPS Samplng n Preence of Extra Auxlary Informaton P. A. Patel 1 and Shraddha Bhatt 2 Department of Stattc, Sardar Patel nverty, Vallabh Vdyanagar 388120, Inda. 1,2 E-mal:patelpraful_a@yahoo.co.n, hraddha.bhatt83@gmal.com Abtract Th artcle deal wth probablty proporton to ze (pp) etmaton of the populaton total ncorporatng auxlarnformaton at etmaton tage va model-baed approach. An optmal etmator obtaned. Motvated from the optmal etmator few etmator are uggeted and compared emprcally wth the conventonal etmator. Keyword: Auxlarnformaton, Model-baed etmaton, Probablty proportonal to ze, Subject Clafcaton: 62D05 INTRODCTION It well-known that when the urvey varable and electon probablte are hghly correlated, pp etmator of the populaton mean lead to conderable gan n effcency a compared wth the cutomary etmator mple mean for equal probablty amplng. The pp etmator depend on multplcty but not on the order and hence nadmble. An mprove etmator then avalable by applyng the Rao-Blackwell theorem. Rao-Blackwellzaton of th etmator, condered by Pathak (1962), yeld a rather complcated etmator whch doe not admt a mple varance etmator a doe the pp etmator. Moreover the gan n effcenc condered to be mall, unle the amplng fracton large (ee, Cael et al., 1977). Thu, the reultng etmator le ueful n practce than the orgnal pp etmator. In th paper, alternatve etmator for etmatng populaton mean under pp amplng are uggeted.

10 P.A. Patel and Shraddha Bhatt Let be a fnte populaton of ze N. Let (, z ) be par of value of the tudy varable y and an auxlary varable z aocated wth each unt. Let be a ample of ze n drawn accordng to probablty proportonal to ze (pp) z, p z, and wth replacement (ppwr) amplng degn p = p(). Suppoe that the value,, and z,, are known. The problem how to ue th nformaton to make nference about the fnte populaton total Y =. If A, we wrte A for A and A for j A. The cutomary degn-baed unbaed etmator of Y whch make no ue of auxlarnformaton at the etmaton tage the Hanen-Hurwtz (HH) (1943) etmator Y HH = np (1) The amplng varance of Y HH gven by V(Y HH ) = 1 n p ( y 2 Y) p (2) A degn-unbaed etmator of V(Y HH ) gven by v(y HH ) = 1 2 n(n 1) ( Y HH ) p (3) In urvey amplng, auxlarnformaton about the fnte populaton often avalable at the etmaton tage. tlzng th nformaton more effcent etmator may be obtaned. There ext everal approache, uch a model-baed, calbraton, etc., each of whch provde a practcal approach to ncorporate auxlarnformaton at the etmaton tage. Here, we wll ue the followng model a workng model. Model GT. Aume that y 1,, y N are random varable havng jont dtrbuton ξ (ee, Cael et al., 1977, p.102) wth E ξ ( ) = a μ + b V ξ ( ) = a 2 σ 2 C ξ (, y j ) = a a j ρσ 2 ( j) whereμ, σ > 0 and ρ ( (N 1) 1, 1) are the parameter, and a > 0, b are known number. Here E ξ ( ), V ξ ( ) and C ξ (, ) denote ξ expectaton, ξ-varance and ξ-covarance, repectvely. We try to make a effcent ue of auxlarnformaton a poble through model. To fnd an optmal (n the ene mnmume ξ E p (T Y) 2 ) trategy (a combnaton of amplng degn and etmator), gven a model ξ, we mnmze the

Etmaton of Fnte Populaton Total under PPS Samplng n Preence of Extra 11 E ξ V p (T) = E ξ E p (T Y) 2 = E p {V ξ (T)} + E p {B ξ (T)} 2 (4) ubject to E ξ E p (T) = E ξ (Y), where B ξ (T) = E ξ (T Y). A degn-model-baed (.e. pξ-) etmaton of the populaton total Y under ppwr condered n Secton 2. The optmal pξ-unbaed etmator depend on unknown model parameter. Motvated by th n Secton 3 we have uggeted a few pp-type etmator for Y and obtaned ther approxmate ba and varance. In Secton 4 we demontrated through a lmted mulaton tudy the mproved performance of the propoed etmator over ther conventon counterpart. THE OPTIMAL ESTIMATOR Conder a lnear predctor T of Y a T = w 0 + w where w 0 and w are contant free from y-value. The predctor T p-unbaed f E p (T) = Y for all y = {y 1,, y N } R N and pξ-unbaed f, for every, E ξ E p (T) = E ξ (Y). nder Model GT, pξ-unbaedne mple that p w a = a n and E p (w 0 ) = b p w b Denotng y = b, = 1,, N, Y = andt = w y, under Model GT, we obtan E ξ V p (T ) = n[(σ 2 + μ 2 ) p (1 p )w 2 a 2 + (ρσ 2 + μ 2 ){ p 2 w 2 a 2 n( a ) 2 }] Mnmzaton of (4) (equvalently mnmzaton of E p {V ξ (T)} nce B ξ (T) = 0) ubject to the condton pξ-unbaedne yeld the optmal value of w a w = 1 nq 1 {(σ 2 + μ 2 )(1 p ) + (ρσ 2 + μ 2 )p } 1 a a Therefore, the optmal predctor of Y n the cla of pξ-unbaed lnear predctor obtaned a where T = w = np Q = [p {(σ 2 + μ 2 )(1 p ) + (ρσ 2 + μ 2 )p } ]

12 P.A. Patel and Shraddha Bhatt and p = Q[{(σ 2 + μ 2 )(1 p ) + (ρσ 2 + μ 2 )p }] a a Fnally, the optmal predctor of Y = Y + b obtaned a T = w ( b ) + b = np + ( b b ) np In partcularly, f ρ = 0, b = 0, a = p, then (5) uggeted by Arnab (2004) (5) reduce to the etmator T 1 = np 1 where p 1 = {δ(1 p ) + 1}p [p {δ(1 p ) + 1} ] wth δ = σ 2 μ 2. It nteretng to note that f we let ρ = 0, b = cx, a = p and c known calar, n (5) the reultng etmator can be vewed a a generalzed dfference etmator T 2 = np 2 + c (X x np 2 ) where p 1 = p 2. In th artcle we hall not focu on T 2. Remark 1.The optmal etmator T nvolve model parameter and hence ueful only when thee parameter are known. In uch tuaton auxlarnformaton can be effectvely ued through the ftted value. THE PROPOSED ESTIMATORS We aume further that the data {x, } are oberved. Here x the value for unt of an extra auxlary varable x whoe total, X = x, and coeffcent of varaton (cv), C x, are aumed to be known from a relable ource. In Model GT, nertng ρ = 0, a = x and b = 0, we obtan from T the modelated optmal etmator of Y a

Etmaton of Fnte Populaton Total under PPS Samplng n Preence of Extra 13 where Y 1 = n{δ(1 p ) + 1}(x X) [p {δ(1 p ) + 1} ] δ = σ 2 μ 2 = σ 2 a 2 μ 2 2 a = V ξ ( ) (E ξ ( )) 2 = (CV ξ ( )) 2 aumed to be known. Often th dffcult. Th motvate to ugget the followng etmator. n{c 2 x (1 p )+1}(x X) [p {C 2 x (1 p )+1}] Y 2 = (6) n{c 2 y (1 p )+1}(x X) [p {c 2 y (1 p )+1}] Y 3 = (7) n{(c y C x c x ) 2 (1 p )+1} x X [p {(c y C x c x ) 2 (1 p )+1}] Y 4 = (8) Y 5 = n{(c y +b(c x c x )) 2 (1 p )+1} x X [p (c y +b(c x c x )) 2 {(1 p )+1}] where c y and c x denote coeffcent of varaton of y and x varable and b denote regreon coeffcent between c y and c x baed on pp ample. Obvouly, the exact ba and exact mean quare error (MSE) of Y, = 3, 4, 5, are hard to obtan. Approxmate Ba and MSE of the Propoed Etmator Suppoe that p, are the reved probablte of electon and conder Y = a a pp etmator of the populaton total Y. The exact ba and exact varance of Y are obtaned a and np B(Y ) = p p Y (9) V(Y ) = 1 n [ 2 2 p p ( y 2 p p ) ] (10)

14 P.A. Patel and Shraddha Bhatt Snce, for uffcently large n, c y, c y C x c x and c y + b(c x c x ) are aymptotcally content and aymptotcally unbaed etmator for C y (Patel & Shah, 2009) the approxmate ba and varance of each of Y, = 2, 3, 4, 5, can be obtaned ung (9) and (10) wth p = {C 2 y (1 p ) + 1}(x X) [p {C 2 y (1 p ) + 1} ] AN EMPIRICAL STDY The etmator Y HH, Y 2, Y 3 and Y 4 gven at (1), (6), (7) and (8) were compared emprcally on 3 natural populaton gven below. For comparon of the etmator, a ample wa drawn ung pp amplng from each of the populaton and thee etmator were computed. Thee procedure wa repeated M = 5000 tme. For an etmator Y, t relatve percentage ba (RB%) wa calculated a RB(Y ) = 100 (Y Y) Y and the relatve effcencn percentage (RE%) a RE(Y ) = 100 MSE m (Y HH ) MSE m ( Y ) where Y = M j=1 M Y j M and MSE m (Y ) = (Y j Y) 2 j=1 (M 1) Data et I: Murthy (1967) y : Output for factore, x : Number of worker, z : Fxed captal ρ yz =.9149, ρ yx =.9413 Data et II: Murthy (1967) y : Number of cultvator, x : Number of peron, z : area n q. mle ρ yz =.6611, ρ yx =.8311 Data et III: Fher (1936) (combned all three data et) y : Patel wdth x : Sepal length z : Patel length ρ yz =0.8179, ρ yx =.9628

Etmaton of Fnte Populaton Total under PPS Samplng n Preence of Extra 15 The mulated reult are preented n the followng table. Table 1: Relatve ba and Mean Square Error Data Set Populaton Total N n Etmator Etmate RB% MSE m RE% Y HH 466632.5 12.54 1.1E+10 - I 414611 80 10 Y 2 413330.7-0.31 2.75E+09 400 Y 3 415116.5 0.12 2.94E+09 374 Y 4 411429.9-0.76 2.94E+09 374 Y HH 167792 53.59 5.6E+09 - II 109248 128 20 Y 2 109099-0.14 79893656 7011 Y 3 109141-0.10 79829722 7017 Y 4 109119-0.12 80058668 6997 Y HH 213.307 18.57 1437.936 - III 179.9 150 20 Y 2 180.126 0.125609 96.8654 1484 Y 3 180.127 0.126236 96.8312 1485 Y 4 180.117 0.120456 96.8243 1485 The above mulaton reveal that (1) the abolute RB % of the uggeted etmator are n reaonable range and (2) the effcency of the uggeted etmator ubtantal a compared to the conventonal etmator. REFERENCES [1] Arnab R. (2004). Optmum etmaton of a fnte populaton total n PPS amplng wth replacement for mult-character urvey, Jour. of Indan Soc. Agr. Statt., 58 (2), 231-43 [2] Cael, C. M., Särndal, C. E., and Wretman, J. H. (1977). Foundaton of Inferencen Survey Samplng., New York, John Wley. [3] Fher, R. A. (1936). The ue of multple meaurement n taxonomc problem, Annal of Eugenc, 7, 179-188. [4] Hanen, M. H. and Hurwtz, W. N. (1943). On the thory of amplng from fntepopulaton. Ann. Math. Statt. 14, 333 362.

16 P.A. Patel and Shraddha Bhatt [5] Murthy, M. N. (1967). Samplng Theory and Method, Stattcal Publhng Houe:Culcutta. [6] Patel, P. A. and Shah, R. M. (2009). A Monte Carlo comparon of ome uggeted etmator of coeffcent of varaton n fnte populaton, Journal of Stattc Scence, 1 (2), 137-48. [7] Pathak P. K. (1962). On amplng wth unequal probablte. Sankhya A 24, 315-326.