Estimation of Finite Population Total under PPS Sampling in Presence of Extra Auxiliary Information

Similar documents
Team. Outline. Statistics and Art: Sampling, Response Error, Mixed Models, Missing Data, and Inference

Additional File 1 - Detailed explanation of the expression level CPD

Confidence intervals for the difference and the ratio of Lognormal means with bounded parameters

Specification -- Assumptions of the Simple Classical Linear Regression Model (CLRM) 1. Introduction

Statistical Properties of the OLS Coefficient Estimators. 1. Introduction

Chapter 6 The Effect of the GPS Systematic Errors on Deformation Parameters

Improvement in Estimating the Population Mean Using Exponential Estimator in Simple Random Sampling

Estimation: Part 2. Chapter GREG estimation

Estimation of a proportion under a certain two-stage sampling design

Improvements on Waring s Problem

Natalie Shlomo, Chris J. Skinner and Barry Schouten Estimation of an indicator of the representativeness of survey response

Adaptive Centering with Random Effects in Studies of Time-Varying Treatments. by Stephen W. Raudenbush University of Chicago.

Introduction. Modeling Data. Approach. Quality of Fit. Likelihood. Probabilistic Approach

Efficient nonresponse weighting adjustment using estimated response probability

Population element: 1 2 N. 1.1 Sampling with Replacement: Hansen-Hurwitz Estimator(HH)

A note on regression estimation with unknown population size

On Efficiency of Midzuno-Sen Strategy under Two-phase Sampling

Small signal analysis

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction

2.3 Least-Square regressions

Introduction to Interfacial Segregation. Xiaozhe Zhang 10/02/2015

Improvements on Waring s Problem

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity

Parametric fractional imputation for missing data analysis. Jae Kwang Kim Survey Working Group Seminar March 29, 2010

Computation of Higher Order Moments from Two Multinomial Overdispersion Likelihood Models

On multivariate folded normal distribution

Method Of Fundamental Solutions For Modeling Electromagnetic Wave Scattering Problems

USE OF DOUBLE SAMPLING SCHEME IN ESTIMATING THE MEAN OF STRATIFIED POPULATION UNDER NON-RESPONSE

3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X

Sampling Theory MODULE V LECTURE - 17 RATIO AND PRODUCT METHODS OF ESTIMATION

MULTIPLE REGRESSION ANALYSIS For the Case of Two Regressors

BOOTSTRAP METHOD FOR TESTING OF EQUALITY OF SEVERAL MEANS. M. Krishna Reddy, B. Naveen Kumar and Y. Ramu

Exponential Type Product Estimator for Finite Population Mean with Information on Auxiliary Attribute

Two Approaches to Proving. Goldbach s Conjecture

Chapter 11. Supplemental Text Material. The method of steepest ascent can be derived as follows. Suppose that we have fit a firstorder

Multivariate Ratio Estimation With Known Population Proportion Of Two Auxiliary Characters For Finite Population

UNIVERSITY OF TORONTO Faculty of Arts and Science. December 2005 Examinations STA437H1F/STA1005HF. Duration - 3 hours

Optimal inference of sameness Supporting information

PROBABILITY-CONSISTENT SCENARIO EARTHQUAKE AND ITS APPLICATION IN ESTIMATION OF GROUND MOTIONS

AP Statistics Ch 3 Examining Relationships

Weighted Estimating Equations with Response Propensities in Terms of Covariates Observed only for Responders

The multivariate Gaussian probability density function for random vector X (X 1,,X ) T. diagonal term of, denoted

Potential gains from using unit level cost information in a model-assisted framework

1. Inference on Regression Parameters a. Finding Mean, s.d and covariance amongst estimates. 2. Confidence Intervals and Working Hotelling Bands

Here is the rationale: If X and y have a strong positive relationship to one another, then ( x x) will tend to be positive when ( y y)

On the SO 2 Problem in Thermal Power Plants. 2.Two-steps chemical absorption modeling

Conditional and unconditional models in modelassisted estimation of finite population totals

Estimation of Current Population Variance in Two Successive Occasions

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA

On The Estimation of Population Mean in Current Occasion in Two- Occasion Rotation Patterns

Linear Approximation with Regularization and Moving Least Squares

j) = 1 (note sigma notation) ii. Continuous random variable (e.g. Normal distribution) 1. density function: f ( x) 0 and f ( x) dx = 1

Statistics for Economics & Business

Predicting Random Effects from a Finite Population of Unequal Size Clusters based on Two Stage Sampling

Sampling Theory MODULE VII LECTURE - 23 VARYING PROBABILITY SAMPLING

Chapter 4: Regression With One Regressor

A LINEAR PROGRAM TO COMPARE MULTIPLE GROSS CREDIT LOSS FORECASTS. Dr. Derald E. Wentzien, Wesley College, (302) ,

Generalized Linear Methods

A FAMILY OF ESTIMATORS FOR ESTIMATING POPULATION MEAN IN STRATIFIED SAMPLING UNDER NON-RESPONSE

Separation Axioms of Fuzzy Bitopological Spaces

ECE559VV Project Report

Statistics Chapter 4

Predictors Using Partially Conditional 2 Stage Response Error Ed Stanek

Primer on High-Order Moment Estimators

Lecture 10 Support Vector Machines II

COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS

PHYS 450 Spring semester Lecture 02: Dealing with Experimental Uncertainties. Ron Reifenberger Birck Nanotechnology Center Purdue University

Chapter 12 Analysis of Covariance

Econometrics of Panel Data

Improved Class of Ratio Estimators for Finite Population Variance

princeton univ. F 17 cos 521: Advanced Algorithm Design Lecture 7: LP Duality Lecturer: Matt Weinberg

PROBABILITY PRIMER. Exercise Solutions

Chapter 7 Generalized and Weighted Least Squares Estimation. In this method, the deviation between the observed and expected values of

A Design Effect Measure for Calibration Weighting in Cluster Samples

Lab 4: Two-level Random Intercept Model

Supporting Information. Hydroxyl Radical Production by H 2 O 2 -Mediated. Conditions

The Essential Dynamics Algorithm: Essential Results

Variance Estimation for Measures of Income Inequality and Polarization ± The Estimating Equations Approach

On Outlier Robust Small Area Mean Estimate Based on Prediction of Empirical Distribution Function

A Kernel Particle Filter Algorithm for Joint Tracking and Classification

Multiple Choice. Choose the one that best completes the statement or answers the question.

A Bound for the Relative Bias of the Design Effect

Bias-corrected nonparametric correlograms for geostatistical radar-raingauge combination

DETERMINATION OF UNCERTAINTY ASSOCIATED WITH QUANTIZATION ERRORS USING THE BAYESIAN APPROACH

Economics 130. Lecture 4 Simple Linear Regression Continued

A Comparative Study for Estimation Parameters in Panel Data Model

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4)

LECTURE 9 CANONICAL CORRELATION ANALYSIS

Durban Watson for Testing the Lack-of-Fit of Polynomial Regression Models without Replications

Digital Signal Processing

Bayesian Variable Selection and Computation for Generalized Linear Models with Conjugate Priors

ECONOMICS 351*-A Mid-Term Exam -- Fall Term 2000 Page 1 of 13 pages. QUEEN'S UNIVERSITY AT KINGSTON Department of Economics

Lossy Compression. Compromise accuracy of reconstruction for increased compression.

APPROXIMATE FUZZY REASONING BASED ON INTERPOLATION IN THE VAGUE ENVIRONMENT OF THE FUZZY RULEBASE AS A PRACTICAL ALTERNATIVE OF THE CLASSICAL CRI

x i1 =1 for all i (the constant ).

Department of Quantitative Methods & Information Systems. Time Series and Their Components QMIS 320. Chapter 6

Chapter 2 - The Simple Linear Regression Model S =0. e i is a random error. S β2 β. This is a minimization problem. Solution is a calculus exercise.

3 Implementation and validation of analysis methods

Tornado and Luby Transform Codes. Ashish Khisti Presentation October 22, 2003

Nonparametric model calibration estimation in survey sampling

Transcription:

Internatonal Journal of Stattc and Analy. ISSN 2248-9959 Volume 6, Number 1 (2016), pp. 9-16 Reearch Inda Publcaton http://www.rpublcaton.com Etmaton of Fnte Populaton Total under PPS Samplng n Preence of Extra Auxlary Informaton P. A. Patel 1 and Shraddha Bhatt 2 Department of Stattc, Sardar Patel nverty, Vallabh Vdyanagar 388120, Inda. 1,2 E-mal:patelpraful_a@yahoo.co.n, hraddha.bhatt83@gmal.com Abtract Th artcle deal wth probablty proporton to ze (pp) etmaton of the populaton total ncorporatng auxlarnformaton at etmaton tage va model-baed approach. An optmal etmator obtaned. Motvated from the optmal etmator few etmator are uggeted and compared emprcally wth the conventonal etmator. Keyword: Auxlarnformaton, Model-baed etmaton, Probablty proportonal to ze, Subject Clafcaton: 62D05 INTRODCTION It well-known that when the urvey varable and electon probablte are hghly correlated, pp etmator of the populaton mean lead to conderable gan n effcency a compared wth the cutomary etmator mple mean for equal probablty amplng. The pp etmator depend on multplcty but not on the order and hence nadmble. An mprove etmator then avalable by applyng the Rao-Blackwell theorem. Rao-Blackwellzaton of th etmator, condered by Pathak (1962), yeld a rather complcated etmator whch doe not admt a mple varance etmator a doe the pp etmator. Moreover the gan n effcenc condered to be mall, unle the amplng fracton large (ee, Cael et al., 1977). Thu, the reultng etmator le ueful n practce than the orgnal pp etmator. In th paper, alternatve etmator for etmatng populaton mean under pp amplng are uggeted.

10 P.A. Patel and Shraddha Bhatt Let be a fnte populaton of ze N. Let (, z ) be par of value of the tudy varable y and an auxlary varable z aocated wth each unt. Let be a ample of ze n drawn accordng to probablty proportonal to ze (pp) z, p z, and wth replacement (ppwr) amplng degn p = p(). Suppoe that the value,, and z,, are known. The problem how to ue th nformaton to make nference about the fnte populaton total Y =. If A, we wrte A for A and A for j A. The cutomary degn-baed unbaed etmator of Y whch make no ue of auxlarnformaton at the etmaton tage the Hanen-Hurwtz (HH) (1943) etmator Y HH = np (1) The amplng varance of Y HH gven by V(Y HH ) = 1 n p ( y 2 Y) p (2) A degn-unbaed etmator of V(Y HH ) gven by v(y HH ) = 1 2 n(n 1) ( Y HH ) p (3) In urvey amplng, auxlarnformaton about the fnte populaton often avalable at the etmaton tage. tlzng th nformaton more effcent etmator may be obtaned. There ext everal approache, uch a model-baed, calbraton, etc., each of whch provde a practcal approach to ncorporate auxlarnformaton at the etmaton tage. Here, we wll ue the followng model a workng model. Model GT. Aume that y 1,, y N are random varable havng jont dtrbuton ξ (ee, Cael et al., 1977, p.102) wth E ξ ( ) = a μ + b V ξ ( ) = a 2 σ 2 C ξ (, y j ) = a a j ρσ 2 ( j) whereμ, σ > 0 and ρ ( (N 1) 1, 1) are the parameter, and a > 0, b are known number. Here E ξ ( ), V ξ ( ) and C ξ (, ) denote ξ expectaton, ξ-varance and ξ-covarance, repectvely. We try to make a effcent ue of auxlarnformaton a poble through model. To fnd an optmal (n the ene mnmume ξ E p (T Y) 2 ) trategy (a combnaton of amplng degn and etmator), gven a model ξ, we mnmze the

Etmaton of Fnte Populaton Total under PPS Samplng n Preence of Extra 11 E ξ V p (T) = E ξ E p (T Y) 2 = E p {V ξ (T)} + E p {B ξ (T)} 2 (4) ubject to E ξ E p (T) = E ξ (Y), where B ξ (T) = E ξ (T Y). A degn-model-baed (.e. pξ-) etmaton of the populaton total Y under ppwr condered n Secton 2. The optmal pξ-unbaed etmator depend on unknown model parameter. Motvated by th n Secton 3 we have uggeted a few pp-type etmator for Y and obtaned ther approxmate ba and varance. In Secton 4 we demontrated through a lmted mulaton tudy the mproved performance of the propoed etmator over ther conventon counterpart. THE OPTIMAL ESTIMATOR Conder a lnear predctor T of Y a T = w 0 + w where w 0 and w are contant free from y-value. The predctor T p-unbaed f E p (T) = Y for all y = {y 1,, y N } R N and pξ-unbaed f, for every, E ξ E p (T) = E ξ (Y). nder Model GT, pξ-unbaedne mple that p w a = a n and E p (w 0 ) = b p w b Denotng y = b, = 1,, N, Y = andt = w y, under Model GT, we obtan E ξ V p (T ) = n[(σ 2 + μ 2 ) p (1 p )w 2 a 2 + (ρσ 2 + μ 2 ){ p 2 w 2 a 2 n( a ) 2 }] Mnmzaton of (4) (equvalently mnmzaton of E p {V ξ (T)} nce B ξ (T) = 0) ubject to the condton pξ-unbaedne yeld the optmal value of w a w = 1 nq 1 {(σ 2 + μ 2 )(1 p ) + (ρσ 2 + μ 2 )p } 1 a a Therefore, the optmal predctor of Y n the cla of pξ-unbaed lnear predctor obtaned a where T = w = np Q = [p {(σ 2 + μ 2 )(1 p ) + (ρσ 2 + μ 2 )p } ]

12 P.A. Patel and Shraddha Bhatt and p = Q[{(σ 2 + μ 2 )(1 p ) + (ρσ 2 + μ 2 )p }] a a Fnally, the optmal predctor of Y = Y + b obtaned a T = w ( b ) + b = np + ( b b ) np In partcularly, f ρ = 0, b = 0, a = p, then (5) uggeted by Arnab (2004) (5) reduce to the etmator T 1 = np 1 where p 1 = {δ(1 p ) + 1}p [p {δ(1 p ) + 1} ] wth δ = σ 2 μ 2. It nteretng to note that f we let ρ = 0, b = cx, a = p and c known calar, n (5) the reultng etmator can be vewed a a generalzed dfference etmator T 2 = np 2 + c (X x np 2 ) where p 1 = p 2. In th artcle we hall not focu on T 2. Remark 1.The optmal etmator T nvolve model parameter and hence ueful only when thee parameter are known. In uch tuaton auxlarnformaton can be effectvely ued through the ftted value. THE PROPOSED ESTIMATORS We aume further that the data {x, } are oberved. Here x the value for unt of an extra auxlary varable x whoe total, X = x, and coeffcent of varaton (cv), C x, are aumed to be known from a relable ource. In Model GT, nertng ρ = 0, a = x and b = 0, we obtan from T the modelated optmal etmator of Y a

Etmaton of Fnte Populaton Total under PPS Samplng n Preence of Extra 13 where Y 1 = n{δ(1 p ) + 1}(x X) [p {δ(1 p ) + 1} ] δ = σ 2 μ 2 = σ 2 a 2 μ 2 2 a = V ξ ( ) (E ξ ( )) 2 = (CV ξ ( )) 2 aumed to be known. Often th dffcult. Th motvate to ugget the followng etmator. n{c 2 x (1 p )+1}(x X) [p {C 2 x (1 p )+1}] Y 2 = (6) n{c 2 y (1 p )+1}(x X) [p {c 2 y (1 p )+1}] Y 3 = (7) n{(c y C x c x ) 2 (1 p )+1} x X [p {(c y C x c x ) 2 (1 p )+1}] Y 4 = (8) Y 5 = n{(c y +b(c x c x )) 2 (1 p )+1} x X [p (c y +b(c x c x )) 2 {(1 p )+1}] where c y and c x denote coeffcent of varaton of y and x varable and b denote regreon coeffcent between c y and c x baed on pp ample. Obvouly, the exact ba and exact mean quare error (MSE) of Y, = 3, 4, 5, are hard to obtan. Approxmate Ba and MSE of the Propoed Etmator Suppoe that p, are the reved probablte of electon and conder Y = a a pp etmator of the populaton total Y. The exact ba and exact varance of Y are obtaned a and np B(Y ) = p p Y (9) V(Y ) = 1 n [ 2 2 p p ( y 2 p p ) ] (10)

14 P.A. Patel and Shraddha Bhatt Snce, for uffcently large n, c y, c y C x c x and c y + b(c x c x ) are aymptotcally content and aymptotcally unbaed etmator for C y (Patel & Shah, 2009) the approxmate ba and varance of each of Y, = 2, 3, 4, 5, can be obtaned ung (9) and (10) wth p = {C 2 y (1 p ) + 1}(x X) [p {C 2 y (1 p ) + 1} ] AN EMPIRICAL STDY The etmator Y HH, Y 2, Y 3 and Y 4 gven at (1), (6), (7) and (8) were compared emprcally on 3 natural populaton gven below. For comparon of the etmator, a ample wa drawn ung pp amplng from each of the populaton and thee etmator were computed. Thee procedure wa repeated M = 5000 tme. For an etmator Y, t relatve percentage ba (RB%) wa calculated a RB(Y ) = 100 (Y Y) Y and the relatve effcencn percentage (RE%) a RE(Y ) = 100 MSE m (Y HH ) MSE m ( Y ) where Y = M j=1 M Y j M and MSE m (Y ) = (Y j Y) 2 j=1 (M 1) Data et I: Murthy (1967) y : Output for factore, x : Number of worker, z : Fxed captal ρ yz =.9149, ρ yx =.9413 Data et II: Murthy (1967) y : Number of cultvator, x : Number of peron, z : area n q. mle ρ yz =.6611, ρ yx =.8311 Data et III: Fher (1936) (combned all three data et) y : Patel wdth x : Sepal length z : Patel length ρ yz =0.8179, ρ yx =.9628

Etmaton of Fnte Populaton Total under PPS Samplng n Preence of Extra 15 The mulated reult are preented n the followng table. Table 1: Relatve ba and Mean Square Error Data Set Populaton Total N n Etmator Etmate RB% MSE m RE% Y HH 466632.5 12.54 1.1E+10 - I 414611 80 10 Y 2 413330.7-0.31 2.75E+09 400 Y 3 415116.5 0.12 2.94E+09 374 Y 4 411429.9-0.76 2.94E+09 374 Y HH 167792 53.59 5.6E+09 - II 109248 128 20 Y 2 109099-0.14 79893656 7011 Y 3 109141-0.10 79829722 7017 Y 4 109119-0.12 80058668 6997 Y HH 213.307 18.57 1437.936 - III 179.9 150 20 Y 2 180.126 0.125609 96.8654 1484 Y 3 180.127 0.126236 96.8312 1485 Y 4 180.117 0.120456 96.8243 1485 The above mulaton reveal that (1) the abolute RB % of the uggeted etmator are n reaonable range and (2) the effcency of the uggeted etmator ubtantal a compared to the conventonal etmator. REFERENCES [1] Arnab R. (2004). Optmum etmaton of a fnte populaton total n PPS amplng wth replacement for mult-character urvey, Jour. of Indan Soc. Agr. Statt., 58 (2), 231-43 [2] Cael, C. M., Särndal, C. E., and Wretman, J. H. (1977). Foundaton of Inferencen Survey Samplng., New York, John Wley. [3] Fher, R. A. (1936). The ue of multple meaurement n taxonomc problem, Annal of Eugenc, 7, 179-188. [4] Hanen, M. H. and Hurwtz, W. N. (1943). On the thory of amplng from fntepopulaton. Ann. Math. Statt. 14, 333 362.

16 P.A. Patel and Shraddha Bhatt [5] Murthy, M. N. (1967). Samplng Theory and Method, Stattcal Publhng Houe:Culcutta. [6] Patel, P. A. and Shah, R. M. (2009). A Monte Carlo comparon of ome uggeted etmator of coeffcent of varaton n fnte populaton, Journal of Stattc Scence, 1 (2), 137-48. [7] Pathak P. K. (1962). On amplng wth unequal probablte. Sankhya A 24, 315-326.