Sample Allocation under a Population Model and Stratified Inclusion Probability Proportionate to Size Sampling

Section on Survey Research Methods

Sun Woong Kim (1), Steven Heeringa (2), Peter Solenberger (3)
(1) Statistics, Dongguk University, Seoul, Korea, Republic of
(2) Institute for Social Research, University of Michigan, 426 Thompson, Ann Arbor, Michigan, 48104
(3) Institute for Social Research, University of Michigan

1. Introduction

In stratified sampling, a total sample of $n$ elements is allocated to each of $h = 1, \dots, H$ design strata and independent samples of $n_h$ elements are selected independently within strata. One of the important roles of the survey sampler is to determine the sample allocation to strata that will result in the greatest precision for sample estimates of population characteristics. Many studies have focused on sample allocation in stratified random sampling. The following approaches have been popular in survey sampling practice: (1) proportional sample allocation to strata, and (2) Neyman (1934) sample allocation.

Proportional sample allocation assigns sample sizes to strata in proportion to the stratum population size $N_h$. Proportional allocation can be used when information on stratum variability is lacking or stratum variances are approximately equal. Since proportional allocation results in a self-weighting sample, population estimates and their sampling variances are easily computed.

Neyman allocation can be used effectively to minimize the variance of an estimator if the survey cost per sampling unit is the same in all strata but element variances, $S_h^2$, differ across strata. This allocation method requires knowledge of the values of the standard deviations, $S_h$, of the variable of interest $y$ for each stratum. This information on stratum-specific variance is often not available in practice.

A sample allocation method with practical advantages over Neyman allocation is termed x-optimal allocation. The x-optimal allocation method uses an auxiliary variable $x$, highly correlated with $y$, and replaces the stratum standard deviations of $y$ with those of $x$ in the Neyman allocation formula. Of course, this allocation is not strictly optimal if the correlation between $x$ and $y$ is not perfect. As an alternative, Dayal (1985) showed that a linear model with respect to $x$ and $y$ can be appropriately used in the allocation of a stratified random sample.
This technique is called model-assisted allocation.

In fact, in many stratified sample designs, especially those employed in business surveys, simple random sampling without replacement can be employed to select elements within strata. But it is well known that sampling strategies with varying probabilities, such as probability proportional to size (PPS) sampling without replacement, are superior to simple random sampling with respect to the efficiency of estimators of population totals and related quantities. PPS sampling without replacement is often called inclusion probability proportional to size (IPPS) sampling or $\pi$PS sampling.

A number of $\pi$PS sampling schemes have been developed to select samples of size equal to or greater than two, and most of them are not easily applicable in practice. However, some techniques, such as Sampford's (1967) method, are not restricted to a stratum sample size of $n_h = 2$ and may be an attractive option for reducing sampling variance compared to alternative designs.

Rao (1968) discusses a sample allocation approach that minimizes the expected variance of the Horvitz and Thompson (H-T) (1952) estimator under $\pi$PS sampling and a superpopulation regression model without the intercept. Rao's method for sample allocation results in the same expected sampling variance for any $\pi$PS sampling design.

Rao's (1968) discussion raises several questions: (1) It may be desirable to introduce an intercept term to the superpopulation regression model. Considering the intercept term, what is the proper strategy for sample allocation in $\pi$PS sampling? (2) If we use Sampford's (1967) $\pi$PS sampling method, what sample allocation strategy would be appropriate?

In this paper, we attempt to answer these questions. We first review Rao's (1968) method. We show that the presence of the intercept in the model produces a more complicated allocation problem, but

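one that can be easily solved. In addition, we employ optimization theory to show how to optimally determine stratum sample sizes for Sampford's selection method.

2. Revisiting Rao's method

Consider a finite population consisting of $h = 1, \dots, H$ strata with $N_h$ units in stratum $h$. Let $s_h$ be a sample of size $n_h$ drawn from each stratum by a given sampling design $P(\cdot)$ and let $S_h$ be the set of all possible samples from each stratum. The total sample size is

$$n = \sum_{h=1}^{H} n_h. \qquad (2.1)$$

Then the probability that the unit $i$ in the stratum $h$ will be in a sample, denoted $\pi_{hi}$, is given by

$$\pi_{hi} = \sum_{s_h \ni i,\; s_h \in S_h} P(s_h), \quad i = 1, \dots, N_h, \quad h = 1, \dots, H, \qquad (2.2)$$

which are called the first-order inclusion probabilities. Also, the probability that both of the units $i$ and $j$ will be included in a sample, denoted $\pi_{hij}$, is obtained by

$$\pi_{hij} = \sum_{s_h \ni i, j,\; s_h \in S_h} P(s_h), \quad i, j = 1, \dots, N_h, \quad j \ne i, \quad h = 1, \dots, H. \qquad (2.3)$$

These are termed the joint selection probabilities or the second-order inclusion probabilities.

Let $y_{hi}$ be the value of $y$ for the unit $i$ in the stratum $h$. As an estimator of the population total $Y = \sum_{h=1}^{H}\sum_{i=1}^{N_h} y_{hi}$, consider the H-T estimator

$$\hat{Y}_{HT} = \sum_{h=1}^{H}\sum_{i \in s_h} \frac{y_{hi}}{\pi_{hi}}. \qquad (2.4)$$

If $\pi_{hi} > 0$, this estimator is an unbiased estimator of $Y$, with variance:

$$\mathrm{Var}(\hat{Y}_{HT}) = \sum_{h=1}^{H}\sum_{i=1}^{N_h}\sum_{j>i} (\pi_{hi}\pi_{hj} - \pi_{hij}) \left( \frac{y_{hi}}{\pi_{hi}} - \frac{y_{hj}}{\pi_{hj}} \right)^2. \qquad (2.5)$$

Rao (1968) considered the following superpopulation regression model without the intercept:

$$y_{hi} = \beta x_{hi} + \varepsilon_{hi}, \qquad (2.6)$$

where $x_{hi}$ is the value of $x$ for the unit $i$ in stratum $h$, $E_\xi(y_{hi} \mid x_{hi}) = \beta x_{hi}$, $V_\xi(y_{hi} \mid x_{hi}) = \sigma^2 x_{hi}^{g}$, and $\mathrm{Cov}_\xi(y_{hi}, y_{hj} \mid x_{hi}, x_{hj}) = 0$. Here $E_\xi$ denotes the model expectation over all the finite populations that can be drawn from the superpopulation. Then we have the following expected variance under the model (2.6):

$$E_\xi \mathrm{Var}(\hat{Y}_{HT}) = \sigma^2 \sum_{h=1}^{H}\sum_{i=1}^{N_h} x_{hi}^{g} \left( \frac{1}{\pi_{hi}} - 1 \right), \qquad (2.7)$$

where $\pi_{hi} = n_h p_{hi}$, $p_{hi} = x_{hi} / X_h$, and $X_h = \sum_{i=1}^{N_h} x_{hi}$.

To minimize (2.7) subject to the condition (2.1), using the Lagrange multiplier $\lambda$, consider

$$E_\xi \mathrm{Var}(\hat{Y}_{HT}) + \lambda \left( \sum_{h=1}^{H} n_h - n \right) = \sigma^2 \sum_{h=1}^{H}\sum_{i=1}^{N_h} x_{hi}^{g} \left( \frac{1}{n_h p_{hi}} - 1 \right) + \lambda \left( \sum_{h=1}^{H} n_h - n \right). \qquad (2.8)$$

Differentiating (2.8) with respect to $n_h$ and equating to zero, we have

$$n_h = \frac{\sigma}{\sqrt{\lambda}} \sqrt{ \sum_{i=1}^{N_h} \frac{x_{hi}^{g}}{p_{hi}} }. \qquad (2.9)$$

Substituting (2.9) into (2.1), we have

$$n = \frac{\sigma}{\sqrt{\lambda}} \sum_{h=1}^{H} \sqrt{ \sum_{i=1}^{N_h} \frac{x_{hi}^{g}}{p_{hi}} }. \qquad (2.10)$$

Replacing $\lambda$ in (2.9) with (2.10), we have the following sample allocation in each stratum:

$$n_h = n \, \frac{ \sqrt{ X_h \sum_{i=1}^{N_h} x_{hi}^{g-1} } }{ \sum_{h=1}^{H} \sqrt{ X_h \sum_{i=1}^{N_h} x_{hi}^{g-1} } }. \qquad (2.11)$$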
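As a rough numerical illustration of allocation (2.11), the following Python sketch computes continuous (unrounded) stratum sample sizes from the size measures $x_{hi}$. The function name `rao_allocation`, the argument layout, and the example values are assumptions made for illustration only; they do not come from the paper.

```python
import math

def rao_allocation(strata_x, n, g):
    """Continuous stratum sample sizes n_h following allocation (2.11)."""
    # Weight for stratum h: sqrt(X_h * sum_i x_hi**(g - 1)), with X_h = sum_i x_hi.
    weights = [math.sqrt(sum(x) * sum(v ** (g - 1) for v in x)) for x in strata_x]
    total = sum(weights)
    return [n * w / total for w in weights]

# Hypothetical size measures for two strata (illustrative values only).
strata_x = [[10.0, 20.0, 30.0], [5.0, 5.0, 10.0, 20.0]]
alloc = rao_allocation(strata_x, n=10, g=2)
print(alloc)
```

With $g = 2$ each weight equals the stratum total $X_h$ (here 60 and 40), so the sketch splits the sample in proportion to those totals. In practice the resulting values would still need to be rounded to integers.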

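Note that if $g = 2$, the allocation under the superpopulation model and $\pi$PS sampling reduces to:

$$n_h = n \, \frac{X_h}{\sum_{h=1}^{H} X_h}, \qquad (2.12)$$

which is a sample allocation proportional to the stratum total $X_h$. Also, Rao showed that in terms of expected variance, unstratified $\pi$PS sampling under the same superpopulation model is inferior to stratified $\pi$PS sampling with the allocation (2.11).

Looking at the expected variance (2.7) and the sample allocation (2.11), neither involves the joint probabilities $\pi_{hij}$ in each stratum. This indicates that under the model without the intercept (2.6) the specific properties of a given $\pi$PS sampling scheme (properties that determine the $\pi_{hij}$) are not reflected in the sample allocation, resulting in the same sample allocation for any $\pi$PS sampling. Hence the following issues, as mentioned in the Introduction, are of interest.

(1) The superpopulation regression model which we may wish to employ in many surveys may be:

$$y_{hi} = \alpha + \beta x_{hi} + \varepsilon_{hi}, \qquad (3.1)$$

which is a general form, and (2.6) is a special form of (3.1) when $\alpha = 0$. Considering the intercept term $\alpha$, we need to reexamine the most appropriate sample allocation strategy for $\pi$PS sampling.

(2) Although it will be shown in the following section that using (3.1) gives a sample allocation involving the joint probabilities $\pi_{hij}$, and these differ according to the chosen $\pi$PS sampling, if we focus on Sampford's (1967) method for $\pi$PS sampling, what sample allocation strategy would be appropriate?

Section 3 will address these issues of sample allocation.

3. Alternative Sample Allocations

We assume two different models involving an intercept term:

Model I:
$$y_{hi} = \alpha + \beta x_{hi} + \varepsilon_{hi}, \quad i = 1, \dots, N_h, \quad h = 1, \dots, H, \qquad (3.1)$$
where $\varepsilon_{hi}$ is numerically negligible, that is, $x$ explains $y$ well.

Model II:
$$y_{hi} = \alpha + \beta x_{hi} + \varepsilon_{hi}, \quad i = 1, \dots, N_h, \quad h = 1, \dots, H, \qquad (3.2)$$
where $E_\xi(y_{hi} \mid x_{hi}) = \alpha + \beta x_{hi}$, $V_\xi(y_{hi} \mid x_{hi}) = \sigma^2 x_{hi}^{g}$, and $\mathrm{Cov}_\xi(y_{hi}, y_{hj} \mid x_{hi}, x_{hj}) = 0$.

Instead of (2.5) we consider the following form of the variance of the H-T estimator:

$$\mathrm{Var}(\hat{Y}_{HT}) = \sum_{h=1}^{H}\sum_{i=1}^{N_h} \frac{1 - \pi_{hi}}{\pi_{hi}} \, y_{hi}^2 + 2 \sum_{h=1}^{H}\sum_{i=1}^{N_h}\sum_{j>i} \frac{\pi_{hij}}{\pi_{hi}\pi_{hj}} \, y_{hi} y_{hj} - 2 \sum_{h=1}^{H}\sum_{i=1}^{N_h}\sum_{j>i} y_{hi} y_{hj}. \qquad (3.3)$$

Theorem 3.1. Under Model I, the minimization of the expected variance of (2.4) under $\pi$PS sampling is equivalent to minimizing

$$\sum_{h=1}^{H} \frac{A_h}{n_h^2} + \sum_{h=1}^{H} \frac{B_h}{n_h}, \qquad (3.4)$$

where

$$A_h = 2 X_h^2 \sum_{i=1}^{N_h}\sum_{j>i} \pi_{hij} \, \frac{\alpha^2 + \alpha\beta (x_{hi} + x_{hj})}{x_{hi} x_{hj}} \qquad (3.5)$$

and

$$B_h = X_h \sum_{i=1}^{N_h} \frac{(\alpha + \beta x_{hi})^2}{x_{hi}} - \beta^2 X_h^2. \qquad (3.6)$$

Proof.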
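For the expected variance of (2.4) under Model I, the third term in (3.3) is a fixed value that does not involve $n_h$, and the other terms are given by:

$$\sum_{h=1}^{H} \frac{X_h}{n_h} \sum_{i=1}^{N_h} \frac{(\alpha + \beta x_{hi})^2}{x_{hi}} - \sum_{h=1}^{H}\sum_{i=1}^{N_h} (\alpha + \beta x_{hi})^2 + 2 \sum_{h=1}^{H} \frac{X_h^2}{n_h^2} \sum_{i=1}^{N_h}\sum_{j>i} \pi_{hij} \, \frac{\alpha^2 + \alpha\beta (x_{hi} + x_{hj})}{x_{hi} x_{hj}} + \beta^2 \sum_{h=1}^{H} X_h^2 - \beta^2 \sum_{h=1}^{H} \frac{X_h^2}{n_h}, \qquad (3.7)$$

by noting $\sum_{i=1}^{N_h}\sum_{j>i} \pi_{hij} = n_h (n_h - 1)/2$ in the second term of (3.3). Since $\sum_{h=1}^{H}\sum_{i=1}^{N_h} (\alpha + \beta x_{hi})^2$ and $\beta^2 \sum_{h=1}^{H} X_h^2$ are also fixed, the quantity to be minimized in (3.7) is: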

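$$\sum_{h=1}^{H} \frac{X_h}{n_h} \sum_{i=1}^{N_h} \frac{(\alpha + \beta x_{hi})^2}{x_{hi}} + 2 \sum_{h=1}^{H} \frac{X_h^2}{n_h^2} \sum_{i=1}^{N_h}\sum_{j>i} \pi_{hij} \, \frac{\alpha^2 + \alpha\beta (x_{hi} + x_{hj})}{x_{hi} x_{hj}} - \beta^2 \sum_{h=1}^{H} \frac{X_h^2}{n_h}. \qquad (3.8)$$

The proof follows from substitution of

$$A_h = 2 X_h^2 \sum_{i=1}^{N_h}\sum_{j>i} \pi_{hij} \, \frac{\alpha^2 + \alpha\beta (x_{hi} + x_{hj})}{x_{hi} x_{hj}} \quad \text{and} \quad B_h = X_h \sum_{i=1}^{N_h} \frac{(\alpha + \beta x_{hi})^2}{x_{hi}} - \beta^2 X_h^2$$

into (3.8).

Remark 3.1. Minimization of (3.4) is a simple problem in terms of $n_h$ because the $A_h$ and the $B_h$ are known values.

Consider Sampford's (1967) $\pi$PS sampling method for selecting elements in each stratum. Although we can use (3.4) to decide the stratum sample size, we still do not know the values of the joint probabilities. The following approximate expression for $\pi_{hij}$, correct to $O(N_h^{-4})$, may be useful:

$$\pi_{hij} \approx n_h (n_h - 1) p_{hi} p_{hj} \Big[ 1 + \Big\{ (p_{hi} + p_{hj}) - \sum_{k=1}^{N_h} p_{hk}^2 \Big\} + \Big\{ 2 (p_{hi}^2 + p_{hj}^2) - 2 \sum_{k=1}^{N_h} p_{hk}^3 - (n_h - 2) p_{hi} p_{hj} + (n_h - 3)(p_{hi} + p_{hj}) \sum_{k=1}^{N_h} p_{hk}^2 - (n_h - 3) \Big( \sum_{k=1}^{N_h} p_{hk}^2 \Big)^2 \Big\} \Big], \qquad (3.9)$$

which was derived by Asok and Sukhatme (1976). From (3.4) and (3.9) we obtain the following theorem.

Theorem 3.2. Under Model I, the sample allocation problem to minimize the expected variance of (2.4) under Sampford's method, when using the joint probabilities correct to $O(N_h^{-4})$ given in (3.9), is equivalent to minimizing

$$\sum_{h=1}^{H} C_h n_h + \sum_{h=1}^{H} \frac{D_h}{n_h}, \qquad (3.10)$$

where

$$C_h = 2 \sum_{i=1}^{N_h}\sum_{j>i} \{ \alpha^2 + \alpha\beta (x_{hi} + x_{hj}) \} \, \pi_{hij1}, \qquad (3.11)$$

$$\pi_{hij1} = (p_{hi} + p_{hj}) \sum_{k=1}^{N_h} p_{hk}^2 - p_{hi} p_{hj} - \Big( \sum_{k=1}^{N_h} p_{hk}^2 \Big)^2, \qquad (3.12)$$

$$D_h = B_h - 2 \sum_{i=1}^{N_h}\sum_{j>i} \{ \alpha^2 + \alpha\beta (x_{hi} + x_{hj}) \} \, \pi_{hij2}, \qquad (3.13)$$

and

$$\pi_{hij2} = 1 + (p_{hi} + p_{hj}) - \sum_{k=1}^{N_h} p_{hk}^2 + 2 (p_{hi}^2 + p_{hj}^2) - 2 \sum_{k=1}^{N_h} p_{hk}^3 + 2 p_{hi} p_{hj} - 3 (p_{hi} + p_{hj}) \sum_{k=1}^{N_h} p_{hk}^2 + 3 \Big( \sum_{k=1}^{N_h} p_{hk}^2 \Big)^2. \qquad (3.14)$$

Proof. Substituting $\pi_{hij}$ from (3.9) into (3.5) for the first term of (3.4), we get:

$$\sum_{h=1}^{H} \frac{A_h}{n_h^2} = 2 \sum_{h=1}^{H} \frac{n_h - 1}{n_h} \sum_{i=1}^{N_h}\sum_{j>i} \{ \alpha^2 + \alpha\beta (x_{hi} + x_{hj}) \} \, \pi_{hij0}, \qquad (3.15)$$

where:

$$\pi_{hij0} = 1 + \Big\{ (p_{hi} + p_{hj}) - \sum_{k=1}^{N_h} p_{hk}^2 \Big\} + \Big\{ 2 (p_{hi}^2 + p_{hj}^2) - 2 \sum_{k=1}^{N_h} p_{hk}^3 - (n_h - 2) p_{hi} p_{hj} + (n_h - 3)(p_{hi} + p_{hj}) \sum_{k=1}^{N_h} p_{hk}^2 - (n_h - 3) \Big( \sum_{k=1}^{N_h} p_{hk}^2 \Big)^2 \Big\}. \qquad (3.16)$$

Expressing (3.16) in terms of $n_h$, we have:

$$\pi_{hij0} = n_h \pi_{hij1} + \pi_{hij2}. \qquad (3.17)$$

Substituting (3.17) into (3.15), we obtain

$$\sum_{h=1}^{H} \frac{A_h}{n_h^2} = 2 \sum_{h=1}^{H} n_h \sum_{i=1}^{N_h}\sum_{j>i} \{ \alpha^2 + \alpha\beta (x_{hi} + x_{hj}) \} \pi_{hij1} + 2 \sum_{h=1}^{H}\sum_{i=1}^{N_h}\sum_{j>i} \{ \alpha^2 + \alpha\beta (x_{hi} + x_{hj}) \} \pi_{hij2} - 2 \sum_{h=1}^{H}\sum_{i=1}^{N_h}\sum_{j>i} \{ \alpha^2 + \alpha\beta (x_{hi} + x_{hj}) \} \pi_{hij1} - 2 \sum_{h=1}^{H} \frac{1}{n_h} \sum_{i=1}^{N_h}\sum_{j>i} \{ \alpha^2 + \alpha\beta (x_{hi} + x_{hj}) \} \pi_{hij2}. \qquad (3.18)$$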

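Since the second and third terms in (3.18) are known values, the minimization of (3.18) reduces to minimizing:

$$2 \sum_{h=1}^{H} n_h \sum_{i=1}^{N_h}\sum_{j>i} \{ \alpha^2 + \alpha\beta (x_{hi} + x_{hj}) \} \pi_{hij1} - 2 \sum_{h=1}^{H} \frac{1}{n_h} \sum_{i=1}^{N_h}\sum_{j>i} \{ \alpha^2 + \alpha\beta (x_{hi} + x_{hj}) \} \pi_{hij2}. \qquad (3.19)$$

Adding $\sum_{h=1}^{H} B_h / n_h$ to (3.19), we have the following minimization problem, equivalent to the minimization of (3.4):

$$2 \sum_{h=1}^{H} n_h \sum_{i=1}^{N_h}\sum_{j>i} \{ \alpha^2 + \alpha\beta (x_{hi} + x_{hj}) \} \pi_{hij1} + \sum_{h=1}^{H} \frac{1}{n_h} \Big[ B_h - 2 \sum_{i=1}^{N_h}\sum_{j>i} \{ \alpha^2 + \alpha\beta (x_{hi} + x_{hj}) \} \pi_{hij2} \Big]. \qquad (3.20)$$

This completes the proof.

Remark 3.2. (3.10) is a simple allocation problem in terms of $n_h$ because the $C_h$ and the $D_h$ are the known values.

Remark 3.3. We can define the following optimization problem with respect to $n_h$:

$$\text{Minimize} \quad \sum_{h=1}^{H} C_h n_h + \sum_{h=1}^{H} \frac{D_h}{n_h} \qquad (3.21)$$

subject to

[...], $h = 1, \dots, H$, (3.22)

[...], $h = 1, \dots, H$, (3.23)

and

$$\sum_{h=1}^{H} n_h = n. \qquad (3.24)$$

This problem may be easily handled by convex mathematical programming algorithms, and the solution provides an efficient sample allocation strategy when using Sampford's method under the model assumption of (3.1).

We obtain the following theorem regarding the minimization of the variance of the H-T estimator (2.4) in $\pi$PS sampling under the assumption of the model (3.2).

Theorem 3.3. Under Model II, minimizing the expected variance of (2.4) under $\pi$PS sampling amounts to minimizing:

$$\sum_{h=1}^{H} \frac{A_h}{n_h^2} + \sum_{h=1}^{H} \frac{B_h}{n_h}, \qquad (3.25)$$

where

$$A_h = -\alpha^2 X_h^2 \sum_{i=1}^{N_h}\sum_{j>i} \pi_{hij} \Big( \frac{1}{x_{hi}} - \frac{1}{x_{hj}} \Big)^2 \qquad (3.26)$$

and

$$B_h = \sigma^2 X_h \sum_{i=1}^{N_h} x_{hi}^{g-1}. \qquad (3.27)$$

Proof. Consider a different form of (2.5) using $\pi_{hi} = n_h p_{hi}$:

$$\mathrm{Var}(\hat{Y}_{HT}) = \sum_{h=1}^{H}\sum_{i=1}^{N_h}\sum_{j>i} \Big( p_{hi} p_{hj} - \frac{\pi_{hij}}{n_h^2} \Big) \Big( \frac{y_{hi}}{p_{hi}} - \frac{y_{hj}}{p_{hj}} \Big)^2. \qquad (3.28)$$

By using

$$E_\xi(y_{hi}^2) = \sigma^2 x_{hi}^{g} + \alpha^2 + \beta^2 x_{hi}^2 + 2 \alpha\beta x_{hi} \qquad (3.29)$$

and

$$E_\xi(y_{hi} y_{hj}) = \alpha^2 + \alpha\beta (x_{hi} + x_{hj}) + \beta^2 x_{hi} x_{hj}, \qquad (3.30)$$

we obtain

$$E_\xi \Big( \frac{y_{hi}}{p_{hi}} - \frac{y_{hj}}{p_{hj}} \Big)^2 = \sigma^2 \Big( \frac{x_{hi}^{g}}{p_{hi}^2} + \frac{x_{hj}^{g}}{p_{hj}^2} \Big) + \alpha^2 X_h^2 \Big( \frac{1}{x_{hi}} - \frac{1}{x_{hj}} \Big)^2. \qquad (3.31)$$

Then we get:

$$E_\xi \mathrm{Var}(\hat{Y}_{HT}) = EV + \alpha^2 \sum_{h=1}^{H}\sum_{i=1}^{N_h}\sum_{j>i} \frac{(x_{hi} - x_{hj})^2}{x_{hi} x_{hj}} - \alpha^2 \sum_{h=1}^{H} \frac{X_h^2}{n_h^2} \sum_{i=1}^{N_h}\sum_{j>i} \pi_{hij} \Big( \frac{1}{x_{hi}} - \frac{1}{x_{hj}} \Big)^2, \qquad (3.32)$$

with

$$EV = \sigma^2 \sum_{h=1}^{H}\sum_{i=1}^{N_h}\sum_{j>i} \Big( p_{hi} p_{hj} - \frac{\pi_{hij}}{n_h^2} \Big) \Big( \frac{x_{hi}^{g}}{p_{hi}^2} + \frac{x_{hj}^{g}}{p_{hj}^2} \Big)$$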

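$$= \sigma^2 \sum_{h=1}^{H}\sum_{i=1}^{N_h} x_{hi}^{g} \Big( \frac{1}{n_h p_{hi}} - 1 \Big) = \sigma^2 \sum_{h=1}^{H} \frac{X_h}{n_h} \sum_{i=1}^{N_h} x_{hi}^{g-1} - \sigma^2 \sum_{h=1}^{H}\sum_{i=1}^{N_h} x_{hi}^{g}. \qquad (3.33)$$

Since the second term in (3.32) and the second term in (3.33) are fixed in terms of $n_h$, the minimization of the model expectation of (3.28) reduces to minimizing:

$$-\alpha^2 \sum_{h=1}^{H} \frac{X_h^2}{n_h^2} \sum_{i=1}^{N_h}\sum_{j>i} \pi_{hij} \Big( \frac{1}{x_{hi}} - \frac{1}{x_{hj}} \Big)^2 + \sigma^2 \sum_{h=1}^{H} \frac{X_h}{n_h} \sum_{i=1}^{N_h} x_{hi}^{g-1}. \qquad (3.34)$$

Since (3.34) equals (3.25), the proof is completed.

Remark 3.4. Minimizing (3.25) is a simple problem in terms of $n_h$ because the $A_h$ and the $B_h$ are the known values.

Remark 3.4. (3.33) is a different form of (2.7). The model expectation of (3.28) involves (2.7) plus the other terms due to Model II with the intercept term, as shown in (3.32).

Theorem 3.4. Under Model II, the sample allocation problem under Sampford's sampling scheme to minimize the expected variance of (2.4), when using the joint probabilities correct to $O(N_h^{-4})$ given in (3.9), is equivalent to minimizing:

$$\sum_{h=1}^{H} C_h n_h + \sum_{h=1}^{H} \frac{D_h}{n_h}, \qquad (3.35)$$

where

$$C_h = -\alpha^2 \sum_{i=1}^{N_h}\sum_{j>i} \frac{(x_{hi} - x_{hj})^2}{x_{hi} x_{hj}} \, \pi_{hij1} \qquad (3.36)$$

and

$$D_h = B_h + \alpha^2 \sum_{i=1}^{N_h}\sum_{j>i} \frac{(x_{hi} - x_{hj})^2}{x_{hi} x_{hj}} \, \pi_{hij2}, \qquad (3.37)$$

with $\pi_{hij1}$ and $\pi_{hij2}$ as defined in (3.12) and (3.14), and $B_h = \sigma^2 X_h \sum_{i=1}^{N_h} x_{hi}^{g-1}$.

Proof. Substituting (3.9) into the first term of (3.25) and using (3.17) with (3.12) and (3.14), we obtain

$$\sum_{h=1}^{H} \frac{A_h}{n_h^2} = -\alpha^2 \sum_{h=1}^{H} \frac{n_h - 1}{n_h} \sum_{i=1}^{N_h}\sum_{j>i} \frac{(x_{hi} - x_{hj})^2}{x_{hi} x_{hj}} \, (n_h \pi_{hij1} + \pi_{hij2})$$

$$= -\alpha^2 \sum_{h=1}^{H} n_h \sum_{i=1}^{N_h}\sum_{j>i} \frac{(x_{hi} - x_{hj})^2}{x_{hi} x_{hj}} \pi_{hij1} - \alpha^2 \sum_{h=1}^{H}\sum_{i=1}^{N_h}\sum_{j>i} \frac{(x_{hi} - x_{hj})^2}{x_{hi} x_{hj}} \pi_{hij2} + \alpha^2 \sum_{h=1}^{H}\sum_{i=1}^{N_h}\sum_{j>i} \frac{(x_{hi} - x_{hj})^2}{x_{hi} x_{hj}} \pi_{hij1} + \alpha^2 \sum_{h=1}^{H} \frac{1}{n_h} \sum_{i=1}^{N_h}\sum_{j>i} \frac{(x_{hi} - x_{hj})^2}{x_{hi} x_{hj}} \pi_{hij2}. \qquad (3.38)$$

Since the second and third terms in (3.38) do not involve $n_h$, the minimization of (3.38) reduces to minimizing the other terms, that is,

$$-\alpha^2 \sum_{h=1}^{H} n_h \sum_{i=1}^{N_h}\sum_{j>i} \frac{(x_{hi} - x_{hj})^2}{x_{hi} x_{hj}} \pi_{hij1} + \alpha^2 \sum_{h=1}^{H} \frac{1}{n_h} \sum_{i=1}^{N_h}\sum_{j>i} \frac{(x_{hi} - x_{hj})^2}{x_{hi} x_{hj}} \pi_{hij2}. \qquad (3.39)$$

Thus, the minimization of (3.25) with (3.26) and (3.27) amounts to that of (3.39) with the term $\sum_{h=1}^{H} B_h / n_h$ added:

$$-\alpha^2 \sum_{h=1}^{H} n_h \sum_{i=1}^{N_h}\sum_{j>i} \frac{(x_{hi} - x_{hj})^2}{x_{hi} x_{hj}} \pi_{hij1} + \alpha^2 \sum_{h=1}^{H} \frac{1}{n_h} \sum_{i=1}^{N_h}\sum_{j>i} \frac{(x_{hi} - x_{hj})^2}{x_{hi} x_{hj}} \pi_{hij2} + \sum_{h=1}^{H} \frac{B_h}{n_h}.$$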

Accordingly, collecting the terms in $1/n_h$, the following reduced form can be obtained:

$$-\alpha^2 \sum_{h=1}^{H} n_h \sum_{i=1}^{N_h}\sum_{j>i} \frac{(x_{hi} - x_{hj})^2}{x_{hi} x_{hj}} \pi_{hij1} + \sum_{h=1}^{H} \frac{1}{n_h} \Big[ B_h + \alpha^2 \sum_{i=1}^{N_h}\sum_{j>i} \frac{(x_{hi} - x_{hj})^2}{x_{hi} x_{hj}} \pi_{hij2} \Big]. \qquad (3.40)$$

Hence, we have proved the theorem.

Remark 3.5. (3.35) is a simple allocation problem in terms of $n_h$ since the $C_h$ and the $D_h$ are the known values.

Remark 3.6. In order to find a solution for $n_h$, we may define the following optimization problem:

$$\text{Minimize} \quad \sum_{h=1}^{H} C_h n_h + \sum_{h=1}^{H} \frac{D_h}{n_h} \qquad (3.41)$$

subject to

[...], $h = 1, \dots, H$, (3.42)

and

[...], $h = 1, \dots, H$. (3.43)

It is noted that the condition (2.1) may not be used as the constraint, different from Remark 3.3.

Corollary 3.1. Under Model II without the intercept, the minimization of the expected variance of (2.4) under $\pi$PS sampling is equivalent to minimizing:

$$\sum_{h=1}^{H} \frac{G_h}{n_h}, \qquad (3.44)$$

where

$$G_h = X_h \sum_{i=1}^{N_h} x_{hi}^{g-1}. \qquad (3.45)$$

Proof. When $\alpha = 0$, (3.32) in Theorem 3.3 reduces to simply $EV$, which is expressed as (3.33). $\sigma^2$ and the second term in (3.33) are fixed values with respect to $n_h$, and the minimization of (3.33) reduces to the one of (3.44). Hence, we have the corollary.

Remark 3.7. (3.44) is quite a simple allocation problem in terms of $n_h$, not depending on the joint probabilities $\pi_{hij}$.

4. Discussion

We have addressed the topic of efficient sample allocation in stratified samples using more general superpopulation regression models than those investigated by Rao (1968). Under more general models that include an intercept term, we have developed several theorems to be useful for deciding sample allocation in $\pi$PS sampling designs. Also, through the theorems we have shown how to apply this sample allocation theory for Sampford's (1967) sampling method, one of the more common $\pi$PS sampling designs used in survey practice. We determined that the sample allocation approaches to minimizing the model expectation of the variance of the H-T estimator may depend on the expressions of the variance. Based on the theorems developed in this paper, the optimization problem with respect to the stratum sample sizes can be solved by using software involving convex mathematical programming algorithms. This is a straightforward approach for sample allocation when using more efficient $\pi$PS sampling methods. In addition to Sampford sampling, the approach can be applied to a variety of $\pi$PS sampling without replacement designs.
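An allocation problem of the form "minimize $\sum_h (C_h n_h + D_h / n_h)$ subject to $\sum_h n_h = n$" can be sketched numerically as follows. This is only a minimal illustration and not the software the authors refer to: it swaps in a plain bisection on the Lagrange multiplier for a general convex programming solver, it assumes every $D_h > 0$ so that each term is convex for $n_h > 0$, and the lower and upper bounds on $n_h$ are hypothetical box constraints chosen for the example. The function name `allocate` and all numeric inputs are likewise assumptions.

```python
import math

def allocate(C, D, n, lower, upper, iters=200):
    """Minimize sum(C[h]*n_h + D[h]/n_h) s.t. sum(n_h) = n, lower[h] <= n_h <= upper[h].

    Assumes D[h] > 0 and lower[h] > 0 for every stratum (illustrative assumption).
    """
    if not (sum(lower) <= n <= sum(upper)):
        raise ValueError("n is incompatible with the bounds")

    def n_of(lam):
        out = []
        for c, d, lo, hi in zip(C, D, lower, upper):
            slope = c + lam
            # Stationary point of c*n + d/n + lam*n; if slope <= 0 the term
            # keeps decreasing in n, so the upper bound is taken.
            val = math.sqrt(d / slope) if slope > 0 else hi
            out.append(min(max(val, lo), hi))
        return out

    # Multiplier values at which every stratum sits at its upper / lower bound.
    lam_lo = min(d / hi ** 2 - c for c, d, hi in zip(C, D, upper))
    lam_hi = max(d / lo ** 2 - c for c, d, lo in zip(C, D, lower))
    for _ in range(iters):
        lam = 0.5 * (lam_lo + lam_hi)
        if sum(n_of(lam)) > n:
            lam_lo = lam  # total too large: raise the multiplier
        else:
            lam_hi = lam
    return n_of(0.5 * (lam_lo + lam_hi))

# Hypothetical coefficients for three strata (illustrative values only).
C = [0.02, 0.05, 0.01]
D = [40.0, 90.0, 25.0]
sizes = allocate(C, D, n=60, lower=[2, 2, 2], upper=[30, 50, 20])
print(sizes)
```

The returned sizes are continuous, so rounding to integers is left to the user. When all $C_h$ are zero and no bound is active, the sketch reduces to sizes proportional to $\sqrt{D_h}$.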
In future work it will be important to extend the theory and methods described here to allocation problems under more complicated superpopulation models and situations where the superpopulation model can vary across strata.

References

Dayal, S. (1985). Allocation of sample using values of auxiliary characteristic, Journal of Statistical Planning and Inference, 11, 321-328.

Horvitz, D. G. and Thompson, D. J. (1952). A generalization of sampling without replacement from a finite universe, Journal of the American Statistical Association, 47, 663-685.

Neyman, J. (1934). On two different aspects of the representative method: the method of stratified sampling and the method of purposive selection, Journal of the Royal Statistical Society, 97, 558-606.

Rao, T. J. (1968). On the allocation of sample size in stratified sampling, Annals of the Institute of Statistical Mathematics, 20, 159-166.

Sampford, M. R. (1967). On sampling without replacement with unequal probabilities of selection, Biometrika, 54, 499-513.