On Efficiency of Midzuno-Sen Strategy under Two-phase Sampling

Similar documents
A CHAIN RATIO EXPONENTIAL TYPE ESTIMATOR IN TWO- PHASE SAMPLING USING AUXILIARY INFORMATION

Efficient and Unbiased Estimation Procedure of Population Mean in Two-Phase Sampling

A Chain Regression Estimator in Two Phase Sampling Using Multi-auxiliary Information

Admissible Estimation of a Finite Population Total under PPS Sampling

EXPONENTIAL CHAIN DUAL TO RATIO AND REGRESSION TYPE ESTIMATORS OF POPULATION MEAN IN TWO-PHASE SAMPLING

A note on a difference type estimator for population mean under two phase sampling design

Effective estimation of population mean, ratio and product in two-phase sampling in presence of random non-response

Eective estimation of population mean, ratio and product in two-phase sampling in presence of random non-response

International Journal of Scientific and Research Publications, Volume 5, Issue 6, June ISSN

On The Class of Double Sampling Ratio Estimators Using Auxiliary Information on an Attribute and an Auxiliary Variable

Improvement in Estimating the Finite Population Mean Under Maximum and Minimum Values in Double Sampling Scheme

On The Class of Double Sampling Exponential Ratio Type Estimator Using Auxiliary Information on an Attribute and an Auxiliary Variable

Improved General Class of Ratio Type Estimators

Songklanakarin Journal of Science and Technology SJST R3 LAWSON

Unequal Probability Designs

O.O. DAWODU & A.A. Adewara Department of Statistics, University of Ilorin, Ilorin, Nigeria

Asymptotic Normality under Two-Phase Sampling Designs

New Modified Ratio Estimator for Estimation of Population Mean when Median of the Auxiliary Variable is Known

A New Class of Generalized Exponential Ratio and Product Type Estimators for Population Mean using Variance of an Auxiliary Variable

Research Article A Generalized Class of Exponential Type Estimators for Population Mean under Systematic Sampling Using Two Auxiliary Variables

SOME IMPROVED ESTIMATORS IN MULTIPHASE SAMPLING ABSTRACT KEYWORDS

Research Article An Improved Class of Chain Ratio-Product Type Estimators in Two-Phase Sampling Using Two Auxiliary Variables

A GENERAL FAMILY OF ESTIMATORS FOR ESTIMATING POPULATION MEAN USING KNOWN VALUE OF SOME POPULATION PARAMETER(S)

A Generalized Class of Double Sampling Estimators Using Auxiliary Information on an Attribute and an Auxiliary Variable

SEARCH OF GOOD ROTATION PATTERNS THROUGH EXPONENTIAL TYPE REGRESSION ESTIMATOR IN SUCCESSIVE SAMPLING OVER TWO OCCASIONS

A GENERAL FAMILY OF ESTIMATORS FOR ESTIMATING POPULATION MEAN USING KNOWN VALUE OF SOME POPULATION PARAMETER(S)

ASYMPTOTIC NORMALITY UNDER TWO-PHASE SAMPLING DESIGNS

Dual To Ratio Cum Product Estimator In Stratified Random Sampling

Some Modified Unbiased Estimators of Population Mean

Use of Auxiliary Variables and Asymptotically Optimum Estimator to Increase Efficiency of Estimators in Double Sampling

Research Article Some Improved Multivariate-Ratio-Type Estimators Using Geometric and Harmonic Means in Stratified Random Sampling

Modi ed Ratio Estimators in Strati ed Random Sampling

LINEAR COMBINATION OF ESTIMATORS IN PROBABILITY PROPORTIONAL TO SIZES SAMPLING TO ESTIMATE THE POPULATION MEAN AND ITS ROBUSTNESS TO OPTIMUM VALUE

Sampling from Finite Populations Jill M. Montaquila and Graham Kalton Westat 1600 Research Blvd., Rockville, MD 20850, U.S.A.

Bootstrapping Heteroskedasticity Consistent Covariance Matrix Estimator

SUCCESSIVE SAMPLING OF TWO OVERLAPPING FRAMES

IMPROVED CLASSES OF ESTIMATORS FOR POPULATION MEAN IN PRESENCE OF NON-RESPONSE

Application of Parametric Homogeneity of Variances Tests under Violation of Classical Assumption

Fractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling

An Alternate approach to Multi-Stage Sampling: UV Cubical Circular Systematic Sampling Method

An improved class of estimators for nite population variance

Estimation of the Parameters of Bivariate Normal Distribution with Equal Coefficient of Variation Using Concomitants of Order Statistics

Sampling Theory. A New Ratio Estimator in Stratified Random Sampling

(3) (S) THE BIAS AND STABILITY OF JACK -KNIFE VARIANCE ESTIMATOR IN RATIO ESTIMATION

ESTIMATION OF FINITE POPULATION MEAN USING KNOWN CORRELATION COEFFICIENT BETWEEN AUXILIARY CHARACTERS

of being selected and varying such probability across strata under optimal allocation leads to increased accuracy.

VARIANCE ESTIMATION FOR COMBINED RATIO ESTIMATOR

HISTORICAL PERSPECTIVE OF SURVEY SAMPLING

ISSN X A Modified Procedure for Estimating the Population Mean in Two-occasion Successive Samplings

VARIANCE ESTIMATION IN PRESENCE OF RANDOM NON-RESPONSE

Some Practical Aspects in Multi-Phase Sampling

Combining data from two independent surveys: model-assisted approach

A Monte-Carlo study of asymptotically robust tests for correlation coefficients

REMAINDER LINEAR SYSTEMATIC SAMPLING

Research Article Ratio Type Exponential Estimator for the Estimation of Finite Population Variance under Two-stage Sampling

Estimators for the binomial distribution that dominate the MLE in terms of Kullback Leibler risk

Adaptive two-stage sequential double sampling

All variances and covariances appearing in this formula are understood to be defined in the usual manner for finite populations; example

BEST LINEAR UNBIASED ESTIMATORS OF POPULATION MEAN ON CURRENT OCCASION IN TWO-OCCASION ROTATION PATTERNS

COMSATS Institute of Information and Technology, Lahore Pakistan 2

MA 575 Linear Models: Cedric E. Ginestet, Boston University Non-parametric Inference, Polynomial Regression Week 9, Lecture 2

Research Article Efficient Estimators Using Auxiliary Variable under Second Order Approximation in Simple Random Sampling and Two-Phase Sampling

An Improved Generalized Estimation Procedure of Current Population Mean in Two-Occasion Successive Sampling

Saadia Masood, Javid Shabbir. Abstract

Design and Estimation for Split Questionnaire Surveys

SIMULTANEOUS CONFIDENCE INTERVALS AMONG k MEAN VECTORS IN REPEATED MEASURES WITH MISSING DATA

COS513: FOUNDATIONS OF PROBABILISTIC MODELS LECTURE 9: LINEAR REGRESSION

The New sampling Procedure for Unequal Probability Sampling of Sample Size 2.

Multicollinearity and A Ridge Parameter Estimation Approach

Improved ratio-type estimators using maximum and minimum values under simple random sampling scheme

Combining Biased and Unbiased Estimators in High Dimensions. (joint work with Ed Green, Rutgers University)

Research Article A Class of Estimators for Finite Population Mean in Double Sampling under Nonresponse Using Fractional Raw Moments

Efficiency of NNBD and NNBIBD using autoregressive model

Inference for P(Y<X) in Exponentiated Gumbel Distribution

AN EFFICIENT CLASS OF CHAIN ESTIMATORS OF POPULATION VARIANCE UNDER SUB-SAMPLING SCHEME

High-dimensional asymptotic expansions for the distributions of canonical correlations

A better way to bootstrap pairs

A Practical Guide for Creating Monte Carlo Simulation Studies Using R

Almost unbiased estimation procedures of population mean in two-occasion successive sampling

REPLICATION VARIANCE ESTIMATION FOR TWO-PHASE SAMPLES

Impact of serial correlation structures on random effect misspecification with the linear mixed model.

Research Article Estimation of Population Mean in Chain Ratio-Type Estimator under Systematic Sampling

A bias improved estimator of the concordance correlation coefficient

Optimal Calibration Estimators Under Two-Phase Sampling

The Use of Survey Weights in Regression Modelling

Simple Linear Regression Analysis

Mantel-Haenszel Test Statistics. for Correlated Binary Data. Department of Statistics, North Carolina State University. Raleigh, NC

Testing Some Covariance Structures under a Growth Curve Model in High Dimension

A General Class of Estimators of Population Median Using Two Auxiliary Variables in Double Sampling

An Unbiased Estimator For Population Mean Using An Attribute And An Auxiliary Variable

Cluster Sampling 2. Chapter Introduction

Sampling Theory. Improvement in Variance Estimation in Simple Random Sampling

On the Use of Compromised Imputation for Missing data using Factor-Type Estimators

Estimation Tasks. Short Course on Image Quality. Matthew A. Kupinski. Introduction

INSTRUMENTAL-VARIABLE CALIBRATION ESTIMATION IN SURVEY SAMPLING

Bayesian inference for sample surveys. Roderick Little Module 2: Bayesian models for simple random samples

PARAMETER ESTIMATION FOR FRACTIONAL BROWNIAN SURFACES

Introduction to Sample Survey

REPLICATION VARIANCE ESTIMATION FOR THE NATIONAL RESOURCES INVENTORY

SHOTA KATAYAMA AND YUTAKA KANO. Graduate School of Engineering Science, Osaka University, 1-3 Machikaneyama, Toyonaka, Osaka , Japan

Transcription:

International Journal of Statistics and Analysis. ISSN 2248-9959 Volume 7, Number 1 (2017), pp. 19-26 Research India Publications http://www.ripublication.com On Efficiency of Midzuno-Sen Strategy under Two-phase Sampling P. A. Patel 1 and Shraddha Bhatt 2 Department of Statistics, Sardar Patel University, Vallabh Vidyanagar 388 120 Abstract This article deals with the ratio estimation of the finite population total under the Midzuno-Sen sampling scheme when complete auxiliary information is not available but another auxiliary variable, closely related to first auxiliary variable but remotely related to the study variable, is available. A chain ratio estimator in two-phase sampling is suggested. An empirical study is carried out to investigate the performance of the suggested estimator relative to the convention two-phase estimator. key words: Two-phase Sampling, Midzuno-Sen sampling, Ratio estimator, Regression estimator. MS Classification: 62D05: 62F10 1. INTRODUCTION Consider a finite population U = {1,, i, N}. Let y and x be the study variable and auxiliary variable taking the values y i and x i respectively for the population unit i. When the two variables are strongly related but no information is available on the population total X = U x i (or mean X), we wish to estimate the population total Y = U y i of y from a sample s, obtained through a two-phase selection. Scheme I: The first-phase sample s of fixed size n is drawn using simple random sampling without replacement (srswor) to observe only x in order to estimate X. Given

20 P. A. Patel and Shraddha Bhatt s, the second-phase sample s of fixed size n is drawn again using srswor to observe y only. The two-phase sampling ratio estimator is defined as Y Rd = Nx y x = N x s y s n x s (1) where x (x s ) is the mean (total) of x for the sample s, y (y s ), x (x s ) are means (totals) of y and x for the sample s. Suppose that the population mean Z of another auxiliary variable z closely related to x but compare to x remotely related to y is available. For instance, y could be acres of corn harvested for grain in 1964, x acres under corn in 1964 and z acres under corn in 1959. Chand (1975), Kiregyera (1980, 1984), Srivastava et al. (1990) used a second auxiliary variable z closely related to x to suggest different improved estimators under Scheme I. Raj (1965) suggests the following two-phase sample selection procedure. Scheme II: The initial sample s of size n is selected with probability proportional to z with replacement and the second phase sample s of size n is a subsample of s, selected with equal probabilities without replacement. An unbiased estimator of Y is provided by Y DRd = (y i np i s ) + k [ (x i n p i s ) (x i np i )] (2) s where k is a suitably chosen scalar. To compute this estimator it is necessary to assess the value of k. Often this is difficult. Motivated by this Srivenkataramana and Tracy (1989) have modified the sampling scheme. Scheme III: Select the first phase sample s of size n with probability proportional to z and select a subsample s of size n units with probability proportional to x z, with replacement. Under this scheme an unbiased estimator of Y is Y STd = Z (y i nx i ) (x i n z i ) (3) s s Experience shows that of all the unequal probability sampling without replacement schemes available in literature those due to Midzuno (1952) and Sen (1953), and Rao et al. (1962) have wider practical applicability since these procedures are quite simple and involve less computations at estimation stage compared to others. In this article to estimate the population total we will use the following two-phase sampling procedure.

On Efficiency of Midzuno-Sen Strategy under Two-phase Sampling 21 Scheme IV: A first-phase sample s, of size n, is drawn from U according to a design srswor. Given s, a second-phase sample s, of size n, is drawn from s according to the Midzuno-Sen (MS) sampling. The MS scheme of sampling describes the selection of the first unit with probability proportional to a given size measure (x) and remaining (n-1) units of the sample by srswor. The size of the measure is an auxiliary variable and is positively correlated with the study variable. Under this scheme the ordinary ratio estimator is known to be unbiased for the population total. The variance of this estimator is not available in literature in a meaningful form. Rao and Vijayan (1977) investigated the problem of estimating the variance of the ratio estimator, given at (1), under the MS sampling. The same problem addressed by Patel and Patel (2008, 2010a, 2010b). In Section 2 we suggest a chain ratio estimator using two auxiliary variables. To study the performance of the suggested estimator a small scale simulation is presented in Section 3 and a comparison of strategy is given in Section 4. The conclusion is given in Section 5. 2. THE SUGGESTED ESTIMATOR In this section a chain ratio estimator Y Rd is suggested. = N(x Z z )(y x) (4) Let S uv (s uv ) denote the population (sample) covariance between u and v variables, C u = S u U, the coefficient of variation of u variable and ρ uv = S uv S u S v, the correlation coefficient of u and v (u, v = y, x, z). Let B( ) and Var( ) denote respectively the bias and variance of an estimator under a given design. Then, under srswor, for large sample approximations: y = Y(1 + δ 1 ), x = X(1 + δ 2 ), z = Z(1 + δ 3 ) such that E(δ i ) = 0 for i = 1,2,3 we have the following results. Here we assume that δ i < 1 i. Theorem. Under Scheme IV, approximate bias and variance, too(n 1 ), of Y Rd are given by B(Y Rd ) = 1 f n Y(C 2 z ρ yz C y C z ) (5) Var(Y Rd ) = 1 f n Y 2 (C 2 y + C 2 z 2ρ yz C y C z ) + V [1 + 3 1 f n C 2 z 2( ZV) ] (6) ZV

22 P. A. Patel and Shraddha Bhatt where = 1 N [[ a iiy 2 i z i + a ( a ii y 2 i z j + 2 a ij y i y j z j ) i U i j U i j U + b a ij y i y j z k ] i j k U with a ij sand V are defined below and a = (n 1), b = (n 1)(n 2) and (N 1) (N 1)(N 2) f = n N. Proof. The expectation of Y Rd is evaluated using the conditional argument E( ) = E 1 E 2 ( s ) as E(Y Rd ) = NE 1 [(Z z )E 2 (x y x)] = NE 1 [(Z z )y ] and consequently using the formula for approximate bias of the ordinary ratio estimator (see, Cochran, 1977) of the population total under SRSWOR we obtain the bias of Y Rd as given in (5). Next, the formula for variance of Y Rd is The first component is readily seen to be Var(Y Rd ) = V 1 E 2 (Y Rd ) + E 1 V 2 (Y Rd ) (7) V 1 E 2 (Y Rd ) = V 1 (Ny Z z ) 1 f n Y 2 (C y 2 + C z 2 2ρ yz C y C z ) (8) Note that at second phase the ratio estimator x y x is unbiased for y under MS sampling design with variance (see, Rao, 1972) where a ii = 2 v = a ii y i + a ij y i y j s s X n 1 ) 1 1 and a S i x ij = X s ( N 1 n 1 ) 1 1 S i&j x s ( N 1 Also, note that v is unbiased for V = n N a 2 iiy i + n (n 1) U N(N 1) a ij y i y j

On Efficiency of Midzuno-Sen Strategy under Two-phase Sampling 23 at first phase (i.e., under srswor). The second component is written as E 1 V 2 (Y Rd ) = E 1 [(Z z ) 2 V 2 (x y x) ] = E 1 [( Z 2 z ) v ] Inserting v = V(1 + δ 4 ), δ 4 < 1, and using the expansion (1 + δ 3 ) 2 = 1 2δ 3 + 3δ 3 2 + O(n 1 ), after omitting the terms of δ s having power greater than two, we obtain E 1 V 2 (Y Rd ) V[1 + 3Var(δ 3 ) 2Cov(δ 3, δ 4 )] (9) Noting that Var(δ 3 ) = 1 f 2 C n z Cov(δ 3, δ 4 ) = [E(z v ) ZV] ZV = [ ZV] ZV (9) is written as E 1 V 2 (Y Rd ) V [1 + 3 1 f C 2 2( ZV) n z ] (10) ZV Inserting (8) and (10) in (7) we obtain (6). 3. SIMULATION STUDY The preceding estimators were compared empirically on 3 natural populations given below. For comparison of the estimators,a two-phase sample was drawn using srswor and the M-S sampling scheme from each of the populations and these estimators were computed. This procedure was repeated M = 5000 times. For each estimator Y d, the relative percentage bias RB(Y d) = 100 (Y Y) Y and the relative efficiency ascompared to Y were calculated, where RE(Y d) = MSE sim (Y ) MSE sim ( Y d) Y = M j=1 Y dj M M and MSE sim (Y d) = (Y dj Y) 2 j=1 (M 1)

24 P. A. Patel and Shraddha Bhatt Data set I: Murthy (1967) y: No of cultivators x : Area in square miles z : No. of households Data set II: Murthy (1967) y: workers at household industry x : Cultivated area (in acres) z : No. of households Data set III: Fisher data (combined all three data sets) y: Patel width x : Sepal length z : Patel length The following table shows corresponding relative bias and MSE for the data sets. Data Set n n N Estimator RB% MSE sim RE% I 10 30 128 Y Rd 0.60 3.79E+08 - Y Rd 1.02 2.95E+08 128 II 10 30 128 Y Rd 0.89 1.46E+07 - Y Rd 0.71 1.28E+07 114 III 15 40 150 Y Rd -3.16 1.01E+03 - Y Rd -3.32 7.07E+02 143 The above simulation reveals that (1) the absolute RBs % of both the estimators are in reasonable range and (2) the efficiency of the suggested estimator is substantial as compared to the conventional estimator. 4. COMPARISON OF STRATEGIES Here, we compare empirically the suggested strategy (mean a pair of design and estimator) given at (4) with the conventional strategy given at (1), denoted respectively by H 2 (srswor MS sampling,, Y Rd ) and H 1 (srswor srswor, Y Rd ). For the comparison of two strategies H 1 (p 1, Y 1) and H 2 (p 2, Y 2) we have computed the percentage gain in efficiency of H 2 over H 1 as Gain (H 2, H 1 ) = ( V p 1 (Y 1) 1) 100% V p2 (Y 2)

On Efficiency of Midzuno-Sen Strategy under Two-phase Sampling 25 The following table presents gain due to MS sampling scheme over srswor in twophase sampling. Data Set I II III Gain(H 2, H 1 ) 33% -4% 26% Remark. We have compared empirically Des Raj Strategy, Srivenkataramana and Tracy strategy and suggested strategy given at (2), (3) and (4) with the conventional strategy given at (1), for many real data. It has been observed that for most for the populations considered under simulation study Des Raj strategy performed very well. For sake of brevity the simulated results are not presented here. 5. CONCLUSION It can be concluded that there may be situations in which it is considered important to select the sample units with probability proportionate to some measure of size x, information on which is not readily available but could be collected at moderate cost for a fairly large sample and if the population mean Z of another auxiliary variable z closely related to x but compare to x remotely related to y is available then incorporating this extra auxiliary information at the estimation stage improves the convention estimator. However, pps without replacement schemes for general n are not easy to implement but the MS scheme is easy to execute. REFERENCES [1] Chand, L. (1975). Some ratio-type estimator based on two or more auxiliary variables. Unpublished Ph. D. dissertation, Iowa State University, Iowa. [2] Cochran, W. G. (1977). Sampling Techniques 3 rd edn (New York, John Wiley) [3] Fisher, R. A. (1936). The use of multiple measurements in taxonomic problem, Annals of Eugenics, 7, 179-188. [4] Patel, J.S. and Patel, P.A. (2008). A Monte Carlo comparison of some estimators of finite population Total under Midzuno sampling, Journal of the Indian Statistical Association, 46, 2, 141-153. [5] Patel, J.S. and Patel, P.A. (2010a). On Non-negative and Improved variance estimation for ratio estimator under Midzuno-Sen sampling scheme, Statistics in Transition-new series, 10(3), 371-385 [6] Patel, J.S. and Patel, P.A. (2010b). Comparison of some variance estimators of the ratio estimator in presence of two auxiliary variables, Interstat.

26 P. A. Patel and Shraddha Bhatt [7] Kiregyera, B. (1980). A chain ratio-type estimator in finite population double sampling using two auxiliary variables, Metrika 27, 217-23. [8] Kiregyera, B. (1984). Regression-type estimators using two auxiliary variables and model of double sampling from finite populations, Metrika 31, 215-26. [9] Midzuno, H.(1952). On the sampling system with probability proportionate to sum of sizes. Ann. Inst. Statist. Math., Tokyo, 3, 99-107. [10] Murthy, M. N. (1967) Sampling Theory and Methods, Calcutta: Statistical Publishing Society. [11] Raj, D. (1964). On double sampling for pps estimation, Ann. Math. Statist. 35 (2), 900--902. [12] Raj, D. (1965). On sampling over two occasions with probability proportional to size. Ann. Math. Statisti. 36, 327-30 [13] Rao, J.N.K., Hartley, H.O. and Cochran, W.G. (1962). On a simple procedure ofunequal probability sampling without replacement. J. Roy. Statist. Soc. B24, 482-491. [14] Rao, J.N.K. and Vijayan, K. (1977). On estimating the variance in sampling with probability proportional to aggregate size. Journal of the American Statistical Association, 72, 579-584. [15] Sen, A. R. (1953). On estimate of the variance in sampling with varying probabilities, J. Ind. Soc. Statist., 7, 119-127 [16] Srivastava, R. S., Srivastava, S. R., Khare, B. B. (1990). A generalized chain ratio estimator for mean of finite population. J. Ind. Soc. Agricult. Statist. 42, 108-17 [17] Srivenkataramana, T. and Tracy, D. S. (1989). Two-phase sampling for selection with probability proportional to size in sample surveys. Biometrika 76 (4), 818-21