Chapter 4

Element sampling: Part 2

4.1 Introduction

We now consider unequal probability sampling designs, which are very popular in practice. In unequal probability sampling, we can improve the efficiency of the resulting point estimator by properly incorporating the auxiliary information available in the sampling frame into the sampling design.

Example 4.1. Consider the following artificial example of a finite population of business companies. The total number of employees is available for the whole population, but the annual total income is available only for the sample.

Company   Size (number of employees)   y (income)
A           100                          11
B           200                          20
C           300                          24
D         1,000                         245

If we are going to select only one company, we can consider the following two approaches.

1. Equal probability selection

2. Unequal probability selection, with the selection probability proportional to size.

Under equal probability selection, each company is selected with probability 1/4 and the estimator of the total income is $\hat{Y} = 4 y_i$ when company $i$ is sampled. The sampling distribution of the total income estimator is then given by

\[ \hat{Y} = \begin{cases} 11 \times 4 & \text{if A is sampled} \\ 20 \times 4 & \text{if B is sampled} \\ 24 \times 4 & \text{if C is sampled} \\ 245 \times 4 & \text{if D is sampled,} \end{cases} \]

and so

\[ E(\hat{Y}) = 11 + 20 + 24 + 245 = 300, \qquad Var(\hat{Y}) = 154{,}488. \]

On the other hand, for unequal probability sampling with selection probabilities proportional to size, $(p_A, p_B, p_C, p_D) = (1/16, 2/16, 3/16, 10/16)$, the estimator is $\hat{Y} = y_i / p_i$ when company $i$ is sampled:

\[ \hat{Y} = \begin{cases} 11 \times 16 & \text{if A is sampled} \\ 20 \times (16/2) & \text{if B is sampled} \\ 24 \times (16/3) & \text{if C is sampled} \\ 245 \times (16/10) & \text{if D is sampled,} \end{cases} \]

and so

\[ E(\hat{Y}) = 11 + 20 + 24 + 245 = 300, \qquad Var(\hat{Y}) = 14{,}248. \]

Thus, not surprisingly, the unequal probability selection using the number of employees as the size measure is more efficient than the equal probability sampling mechanism: a company with many employees is likely to have more income than a company with a small number of employees.

In general, when the sampling frame contains information about the size of each sampling unit, the size information is often incorporated at the sampling design stage so that the selection probability of a unit is proportional to its size.
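The figures in Example 4.1 are easy to reproduce numerically. The following sketch (Python with numpy assumed; the helper name `moments` is illustrative) computes the mean and variance of the single-draw estimator $\hat{Y} = y_i / p_i$ under both designs.

```python
import numpy as np

y = np.array([11, 20, 24, 245], dtype=float)          # annual income
size = np.array([100, 200, 300, 1000], dtype=float)   # number of employees

def moments(p, y):
    """Mean and variance of the single-draw estimator y_i / p_i."""
    est = y / p                                 # value of Y-hat when unit i is drawn
    mean = np.sum(p * est)
    return mean, np.sum(p * est**2) - mean**2

print(moments(np.full(4, 0.25), y))     # equal probability: (300.0, 154488.0)
print(moments(size / size.sum(), y))    # proportional to size: (300.0, 14248.0)
```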

We can classify the unequal probability sampling designs into four categories, as in the following table.

Equal probability sampling    Unequal probability sampling    Features
Bernoulli sampling            Poisson sampling                $I_i$ independent
SRS with replacement          PPS sampling                    Allows for duplication
SRS without replacement       πps sampling                    Without-replacement sampling
Systematic sampling           Systematic PPS sampling         Sampling systematically

Stratified sampling is another popular way of achieving unequal probability sampling and will be covered in the next chapter.

4.2 Poisson sampling

Poisson sampling is a generalization of Bernoulli sampling that allows for unequal probability selection. In Poisson sampling, the sample selection indicator $I_i$ follows

\[ I_i \sim \text{independent Bernoulli}(\pi_i), \quad i = 1, 2, \ldots, N. \]

Here, $\pi_i$ is the first-order inclusion probability. Poisson sampling is rarely used in practice, but it is useful for understanding the basic nature of unequal probability sampling. Under Poisson sampling, the variance of the HT estimator is expressed as

\[ V(\hat{Y}_{HT}) = \sum_{i=1}^{N} \left( \frac{1}{\pi_i} - 1 \right) y_i^2. \tag{4.1} \]

The following theorem provides a result on the optimal choice of $\pi_i$ under Poisson sampling.

Theorem 4.1. Consider a Poisson sampling design with first-order inclusion probabilities $\pi_i$. Given the same (expected) sample size, the variance of the HT estimator under Poisson sampling is minimized when

\[ \pi_i \propto y_i. \tag{4.2} \]
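As a concrete illustration, the following sketch draws one Poisson sample and computes the HT estimator together with the variance formula (4.1); the population values and the seed are, of course, hypothetical.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
N = 1000
x = rng.uniform(1, 10, size=N)             # size measure available on the frame
y = 2 * x + rng.normal(0, 1, size=N)       # study variable, roughly proportional to x
pi = 100 * x / x.sum()                     # inclusion probabilities; expected n = 100

I = rng.random(N) < pi                     # independent Bernoulli(pi_i) indicators
Y_HT = np.sum(y[I] / pi[I])                # HT estimator of the total Y
V_HT = np.sum((1 / pi - 1) * y**2)         # variance formula (4.1)

print(I.sum(), Y_HT, y.sum(), V_HT)        # realized sample size n is random
```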

Proof. By the Cauchy-Schwarz inequality, we have

\[ \left( \sum_{i=1}^{N} y_i \right)^2 \le \left( \sum_{i=1}^{N} \frac{y_i^2}{\pi_i} \right) \left( \sum_{i=1}^{N} \pi_i \right), \]

so that $\sum_{i=1}^{N} y_i^2 / \pi_i \ge (\sum_{i=1}^{N} y_i)^2 / \sum_{i=1}^{N} \pi_i$, with equality when $\pi_i \propto y_i$. Using the fact that $\sum_{i=1}^{N} \pi_i$ is a fixed constant (the expected sample size), and noting that (4.1) equals $\sum_{i=1}^{N} y_i^2/\pi_i - \sum_{i=1}^{N} y_i^2$, we obtain that (4.1) is minimized when (4.2) holds.

Thus, under Poisson sampling, we have only to make $\pi_i$ proportional to $y_i$. However, since we never observe $y_i$ at the time of sampling design, we cannot use $y_i$ itself; instead we use an auxiliary value $x_i$, available throughout the population, which is believed to be roughly proportional to $y_i$. Poisson sampling, as with Bernoulli sampling, has the disadvantage that the sample size $n$ is random. In the extreme case, we may have $n$ equal to zero. Thus, Poisson sampling has limited use in practice but is useful in theory.
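Theorem 4.1 can also be checked numerically. The sketch below compares (4.1) for three Poisson designs with the same expected sample size, using a hypothetical population in which an auxiliary $x_i$ is roughly proportional to $y_i$; the cap at 1 is a guard, since exact proportionality may be impossible (see Section 4.4).

```python
import numpy as np

rng = np.random.default_rng(seed=2)
N, n = 500, 50
y = rng.gamma(shape=2.0, scale=5.0, size=N)    # hypothetical positive y-values
x = y * rng.uniform(0.8, 1.2, size=N)          # auxiliary size measure, roughly prop. to y

def poisson_var(pi, y):
    """Variance of the HT estimator under Poisson sampling, formula (4.1)."""
    return np.sum((1 / pi - 1) * y**2)

for label, s in [("pi prop. to y (optimal)", y),
                 ("pi prop. to x", x),
                 ("equal pi", np.ones(N))]:
    pi = np.minimum(n * s / s.sum(), 1.0)      # cap at 1 in case some s is very large
    print(label, poisson_var(pi, y))           # the first design gives the smallest variance
```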

4.3 PPS sampling

As seen in Example 4.1, a sampling design with selection probability proportional to the number of employees is quite efficient for estimating the total income of the companies in the population. In this case, the total number of employees serves as the measure of size (MOS), an auxiliary variable that reflects the magnitude of $y_i$ in the population. A design that selects elements with probability proportional to the MOS, with replacement, is called probability proportional to size (PPS) sampling. Since it is easy to select a sample of size one with selection probability proportional to the MOS, we can repeat the selection $n$ times independently, with replacement, to achieve a PPS sample of size $n$. PPS sampling is easy to implement, as it is a with-replacement sampling scheme, but it may contain duplicated sample elements.

Let $M_i$ be the value of the MOS associated with element $i$ in the population. In this case,

\[ p_i = \frac{M_i}{\sum_{i=1}^{N} M_i} \]

is the probability of selecting element $i$ in a single draw. Let $a_k$ be the index of the population element selected in the $k$-th draw of the PPS sampling. In this case, $a_k$ is a random variable with probability $P(a_k = i) = p_i$ for $i \in U$. The $y$-value observed in the $k$-th draw is $y_{a_k} = \sum_{i=1}^{N} I(a_k = i) y_i$. Note that $E(y_{a_k}) = \sum_{i=1}^{N} p_i y_i$, which is not necessarily equal to the population mean. If we define

\[ z_k = \frac{y_{a_k}}{p_{a_k}} = \sum_{i=1}^{N} I(a_k = i) \frac{y_i}{p_i}, \quad k = 1, 2, \ldots, n, \]

then $z_1, z_2, \ldots, z_n$ are independently and identically distributed with distribution

\[ z_k = \begin{cases} y_1/p_1 & \text{with probability } p_1 \\ y_2/p_2 & \text{with probability } p_2 \\ \vdots & \\ y_N/p_N & \text{with probability } p_N. \end{cases} \]

Thus, for each $k$ in the sample, we have

\[ E(z_k) = \sum_{i=1}^{N} y_i = Y, \qquad Var(z_k) = \sum_{i=1}^{N} p_i \left( \frac{y_i}{p_i} - Y \right)^2. \]

Thus, we can use

\[ \hat{Y}_{PPS} = \frac{1}{n} \sum_{k=1}^{n} z_k = \frac{1}{n} \sum_{k=1}^{n} \frac{y_{a_k}}{p_{a_k}} \tag{4.3} \]

as an estimator of $Y = \sum_{i=1}^{N} y_i$. The estimator in (4.3) is sometimes called the Hansen-Hurwitz (HH) estimator, as it was first proposed by Hansen and Hurwitz (1943).
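A minimal sketch of the HH estimator, reusing the population of Example 4.1 (so that $Y = 300$):

```python
import numpy as np

rng = np.random.default_rng(seed=3)
M = np.array([100, 200, 300, 1000], dtype=float)    # measure of size (employees)
y = np.array([11, 20, 24, 245], dtype=float)        # income
p = M / M.sum()                                     # single-draw selection probabilities
n = 200

a = rng.choice(len(y), size=n, replace=True, p=p)   # PPS draws a_1, ..., a_n (with replacement)
z = y[a] / p[a]                                     # z_k = y_{a_k} / p_{a_k}
print(z.mean())                                     # HH estimator (4.3); its expectation is Y = 300
```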

To discuss variance estimation for the HH estimator, we first prove the following theorem.

Theorem 4.2. Let $X_1, X_2, \ldots, X_n$ be independent random variables with $E(X_i) = \mu$ and $V(X_i) = \sigma_i^2$. An unbiased estimator of the variance of $\bar{X}_n = n^{-1} \sum_{i=1}^{n} X_i$ is given by

\[ \hat{V}(\bar{X}_n) = \frac{1}{n} \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X}_n)^2. \tag{4.4} \]

Proof. Since the $X_i$ are independent,

\[ V(\bar{X}_n) = \frac{1}{n^2} \sum_{i=1}^{n} \sigma_i^2. \]

Also, since we can express

\[ \hat{V}(\bar{X}_n) = \frac{1}{n-1} \left\{ \frac{1}{n} \sum_{i=1}^{n} (X_i - \mu)^2 - (\bar{X}_n - \mu)^2 \right\}, \]

taking expectations on both sides of the above expression gives

\[ E\{\hat{V}(\bar{X}_n)\} = \frac{1}{n-1} \left\{ \frac{1}{n} \sum_{i=1}^{n} \sigma_i^2 - \frac{1}{n^2} \sum_{i=1}^{n} \sigma_i^2 \right\} = \frac{1}{n^2} \sum_{i=1}^{n} \sigma_i^2, \]

which proves the unbiasedness of $\hat{V}(\bar{X}_n)$ in (4.4).

By Theorem 4.2, an unbiased variance estimator for the HH estimator is given by

\[ \hat{V}_{PPS} = \frac{1}{n} S_z^2 = \frac{1}{n} \frac{1}{n-1} \sum_{k=1}^{n} \left( \frac{y_{a_k}}{p_{a_k}} - \hat{Y}_{PPS} \right)^2. \tag{4.5} \]

In some situations, $y_i$ has the meaning of a total within unit $i$. For example, $y_i$ may be the total crop yield of farm $i$ and $M_i$ the total crop acreage of farm $i$; then $\bar{y}_i = y_i / M_i$ is the average crop yield per acre. In this case, we can express

\[ \hat{Y}_{PPS} = \left( \sum_{i=1}^{N} M_i \right) \frac{1}{n} \sum_{k=1}^{n} \bar{y}_{a_k} \]

and

\[ \hat{V}(\hat{Y}_{PPS}) = \left( \sum_{i=1}^{N} M_i \right)^2 \frac{1}{n} \frac{1}{n-1} \sum_{k=1}^{n} \left( \bar{y}_{a_k} - \hat{\bar{Y}}_{PPS} \right)^2, \]

where $\hat{\bar{Y}}_{PPS} = n^{-1} \sum_{k=1}^{n} \bar{y}_{a_k}$.
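The following sketch extends the PPS simulation above with the variance estimator (4.5) and compares it with the true variance $Var(\hat{Y}_{PPS}) = n^{-1} Var(z_k)$:

```python
import numpy as np

rng = np.random.default_rng(seed=4)
M = np.array([100, 200, 300, 1000], dtype=float)
y = np.array([11, 20, 24, 245], dtype=float)
p = M / M.sum()
n = 200

a = rng.choice(len(y), size=n, replace=True, p=p)
z = y[a] / p[a]                                     # z_k = y_{a_k} / p_{a_k}

Y_pps = z.mean()                                    # HH estimator (4.3)
V_hat = np.sum((z - Y_pps) ** 2) / (n * (n - 1))    # variance estimator (4.5)
V_true = np.sum(p * (y / p - y.sum()) ** 2) / n     # Var(z_k) / n
print(Y_pps, V_hat, V_true)
```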

If the parameter of interest is the mean

\[ \bar{Y} = \frac{\sum_{i=1}^{N} y_i}{\sum_{i=1}^{N} M_i} = \frac{\sum_{i=1}^{N} M_i \bar{y}_i}{\sum_{i=1}^{N} M_i}, \]

then the HH estimator is

\[ \hat{\bar{Y}}_{PPS} = \frac{1}{n} \sum_{k=1}^{n} \bar{y}_{a_k} \]

and its variance estimator is

\[ \hat{V}(\hat{\bar{Y}}_{PPS}) = \frac{1}{n} \frac{1}{n-1} \sum_{k=1}^{n} \left( \bar{y}_{a_k} - \hat{\bar{Y}}_{PPS} \right)^2. \]

That is, in mean estimation under PPS sampling, we can safely treat $\bar{y}_{a_1}, \bar{y}_{a_2}, \ldots, \bar{y}_{a_n}$ as an IID sample with $E(\bar{y}_{a_k}) = \bar{Y}$ and apply Theorem 4.2.

4.4 πps sampling

The PPS sampling introduced in the previous section has many advantages: it is very easy to implement, and the estimation formulas are simple. However, since it is a with-replacement scheme, it is inefficient in the sense that it allows for duplicated sample elements. Let $x_i$ be the size measure; we want to make $\pi_i \propto x_i$ hold as closely as possible. πps (π proportional to size) sampling refers to the set of sampling designs satisfying the following conditions:

1. The sampling design is a fixed-size design that does not allow duplication.

2. The first-order inclusion probabilities satisfy $\pi_i \propto x_i$.

3. The second-order inclusion probabilities satisfy $\pi_{ij} > 0$ and $\pi_{ij} < \pi_i \pi_j$ for $i \neq j$.

The third condition guarantees that the SYG variance estimator is always nonnegative. For a fixed-size design, $\pi_k \propto x_k$ and $\sum_{i=1}^{N} \pi_i = n$ lead to

\[ \pi_k = \frac{n x_k}{\sum_{i=1}^{N} x_i}. \]

If some $x_k$ satisfies $x_k > (N/n) \bar{X}_N$, where $\bar{X}_N = N^{-1} \sum_{i=1}^{N} x_i$, then this formula gives $\pi_k > 1$. Thus, the exact proportionality $\pi_i \propto x_i$ is not always achievable. For $n = 1$, πps sampling is the same as PPS sampling.

There are two approaches to implementing a PPS sampling of size $n = 1$: one is the cumulative total method and the other is Lahiri's method. The cumulative total method is described as follows:

[Step 1] Set $T_0 = 0$ and compute $T_k = T_{k-1} + x_k$, $k = 1, 2, \ldots, N$.

[Step 2] Draw $\varepsilon \sim \text{Unif}(0,1)$. If $\varepsilon \in (T_{k-1}/T_N, T_k/T_N]$, element $k$ is selected.

The cumulative total method is very popular because it is easy to understand, but it needs a list of all the $x_k$ in the population. The other method, developed by Lahiri (1951), can be described as follows:

[Step 0] Choose $M \ge \max\{x_1, x_2, \ldots, x_N\}$. Set $r = 1$.

[Step 1] Draw $k_r$ by SRS from $\{1, 2, \ldots, N\}$.

[Step 2] Draw $\varepsilon_r \sim \text{Unif}(0,1)$.

[Step 3] If $\varepsilon_r \le x_{k_r}/M$, then select element $k_r$ and stop. Otherwise, reject $k_r$ and go to Step 1 with $r \leftarrow r + 1$.

Lahiri's method does not need a list of all the $x_k$ in the population, but it requires knowledge of an upper bound $M$ of the $x_k$. Lahiri's method is a discrete version of the rejection algorithm due to von Neumann. To understand the rejection algorithm for a continuous random variable with target density $f$, suppose that there exist a density $g$ and a constant $M$ such that

\[ f(x) \le M g(x) \tag{4.6} \]

on the support of $f$. The rejection sampling method proceeds as follows:

1. Sample $Y \sim g$ and $U \sim U(0,1)$, where $U(0,1)$ denotes the uniform distribution on $(0,1)$.

2. Reject $Y$ if

\[ U > \frac{f(Y)}{M g(Y)}. \tag{4.7} \]

In this case, do not record the value of $Y$ as an element of the target random sample and return to Step 1.

3. Otherwise, keep the value of $Y$: set $X = Y$ and consider $X$ to be an element of the target random sample.
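A minimal sketch of this rejection algorithm, using an illustrative target $f(t) = 6t(1-t)$, the Beta(2,2) density, with a uniform proposal $g$ on $(0,1)$, for which $M = 1.5$ satisfies (4.6):

```python
import numpy as np

rng = np.random.default_rng(seed=5)

f = lambda t: 6 * t * (1 - t)             # target density: Beta(2, 2)
M = 1.5                                   # f(t) <= M * g(t) with g = Uniform(0, 1)

def rejection_sample(rng):
    while True:
        Y = rng.uniform(0, 1)             # step 1: sample Y ~ g and U ~ U(0, 1)
        U = rng.uniform(0, 1)
        if U <= f(Y) / M:                 # keep Y unless U > f(Y)/(M g(Y)), rule (4.7)
            return Y                      # step 3: X = Y

draws = np.array([rejection_sample(rng) for _ in range(10_000)])
print(draws.mean(), draws.var())          # approx. 0.5 and 0.05 for Beta(2, 2)
```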

Under the rejection sampling method,

\[ P(X \le y) = P\left( Y \le y \;\middle|\; U \le \frac{f(Y)}{M g(Y)} \right) = \frac{\int_{-\infty}^{y} \int_{0}^{f(x)/\{M g(x)\}} du \, g(x) \, dx}{\int_{-\infty}^{\infty} \int_{0}^{f(x)/\{M g(x)\}} du \, g(x) \, dx} = \frac{\int_{-\infty}^{y} f(x) \, dx}{\int_{-\infty}^{\infty} f(x) \, dx}. \]

Note that the rejection sampling method is applicable even when the density $f$ is known only up to a multiplicative factor, because the above equality still holds if $f(x) \propto f_1(x)$ with $f_1(x) \le M g(x)$ and the decision rule (4.7) uses $f_1(x)$ instead of $f(x)$.

In Lahiri's method, $g(\cdot)$ is the density of the discrete uniform distribution with support $\{1, \ldots, N\}$, and $f(\cdot)$ is the density of the discrete distribution with probabilities $p_i = x_i / \sum_{j=1}^{N} x_j$ for units $i = 1, \ldots, N$.

A formal justification of Lahiri's method can be given as follows. Let $R$ denote the draw at which the algorithm stops and let $K_r$ be the index drawn at the $r$-th step. Since the draws are independent,

\[ P(k \in A) = \sum_{r=1}^{\infty} P(k \in A, R = r) = \sum_{r=1}^{\infty} P\left( K_r = k, \; \varepsilon_r < \frac{x_{K_r}}{M}, \; \bigcap_{j=1}^{r-1} \left\{ \varepsilon_j > \frac{x_{K_j}}{M} \right\} \right) = \sum_{r=1}^{\infty} P(K_r = k) \, P\left( \varepsilon_r < \frac{x_k}{M} \right) \prod_{j=1}^{r-1} P\left( \varepsilon_j > \frac{x_{K_j}}{M} \right) = \sum_{r=1}^{\infty} \frac{1}{N} \frac{x_k}{M} \left( 1 - \frac{\bar{x}_U}{M} \right)^{r-1}, \]

where $\bar{x}_U = N^{-1} \sum_{k=1}^{N} x_k$ and we use $P(\varepsilon_j > x_{K_j}/M) = N^{-1} \sum_{k=1}^{N} (1 - x_k/M) = 1 - \bar{x}_U/M$. Summing the geometric series, we obtain

\[ \pi_k = \frac{1}{N} \frac{x_k}{M} \cdot \frac{1}{1 - (1 - \bar{x}_U/M)} = \frac{x_k}{N \bar{x}_U} = \frac{x_k}{\sum_{i=1}^{N} x_i}. \]
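In the discrete setting, the same pattern gives Lahiri's method. The sketch below implements both size-one selection methods (helper names are illustrative) and checks by Monte Carlo that each selects unit $k$ with probability $x_k / \sum_i x_i$, as just shown:

```python
import numpy as np

rng = np.random.default_rng(seed=6)
x = np.array([10, 20, 30, 40], dtype=float)      # MOS values (also used in Example 4.2)

def cumulative_total(x, rng):
    """Cumulative total method: invert the CDF of p_i = x_i / sum(x)."""
    T = np.cumsum(x)                             # T_1, ..., T_N
    return int(np.searchsorted(T / T[-1], rng.uniform(0, 1)))

def lahiri(x, rng):
    """Lahiri's method: rejection with a discrete-uniform proposal."""
    M = x.max()                                  # any M >= max(x) works
    while True:
        k = rng.integers(len(x))                 # SRS draw of a candidate unit
        if rng.uniform(0, 1) <= x[k] / M:        # accept with probability x_k / M
            return int(k)

B = 100_000
for method in (cumulative_total, lahiri):
    counts = np.bincount([method(x, rng) for _ in range(B)], minlength=len(x))
    print(method.__name__, counts / B)           # both approx. [0.1, 0.2, 0.3, 0.4]
```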

We now discuss πps sampling for $n \ge 2$. Most existing schemes for fixed-size πps sampling with $n > 2$ are quite complicated; the interested reader is referred to Brewer and Hanif (1983). To discuss πps sampling of size $n = 2$, let $\theta_i$ be the probability of selecting unit $i$ in the first draw, and let $\theta_{j|i}$ be the conditional probability of selecting unit $j$ in the second draw given that unit $i$ was selected in the first draw. Writing $p_i = x_i / \sum_{j=1}^{N} x_j$, the problem at hand is to find a set of $\theta_i$ and $\theta_{j|i}$ satisfying

\[ \pi_i = 2 p_i \tag{4.8} \]

and $\sum_i \theta_i = 1$, $\sum_{j \neq i} \theta_{j|i} = 1$. Since

\[ \pi_{ij} = \theta_i \theta_{j|i} + \theta_j \theta_{i|j} \tag{4.9} \]

and, as this is a fixed-size sampling design, we can use (2.2) to get $\sum_{j \neq i} \pi_{ij} = \pi_i$, which implies

\[ \pi_i = \theta_i + \sum_{j \neq i} \theta_j \theta_{i|j}. \]

Thus, constraint (4.8) reduces to

\[ \theta_i + \sum_{j \neq i} \theta_j \theta_{i|j} = 2 p_i. \tag{4.10} \]

There are many solutions to (4.10). Brewer (1963) proposed using

\[ \theta_i \propto \frac{p_i (1 - p_i)}{1 - 2 p_i} \quad \text{and} \quad \theta_{j|i} = \frac{p_j}{1 - p_i} \]

to obtain (4.10), while Durbin (1967) proposed using $\theta_i = p_i$ and

\[ \theta_{j|i} \propto p_j \left( \frac{1}{1 - 2 p_i} + \frac{1}{1 - 2 p_j} \right) \]

to achieve the same goal.
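Both proposals can be verified numerically. The sketch below (all names illustrative; it assumes $p_i < 1/2$ for every unit, and uses $1 + K$ with $K = \sum_i p_i (1-2p_i)^{-1}$ as Durbin's normalizing constant, which makes $\sum_{j \neq i} \theta_{j|i} = 1$) builds $\theta_i$ and $\theta_{j|i}$ for both schemes and confirms that they produce identical joint inclusion probabilities satisfying $\pi_i = 2p_i$; the closed form is given in (4.11) below.

```python
import numpy as np

p = np.array([10, 20, 30, 40], dtype=float)
p /= p.sum()                                   # p_i = x_i / sum(x); all p_i < 1/2 here
K = np.sum(p / (1 - 2 * p))

def joint(theta, cond):
    """pi_ij = theta_i * theta_{j|i} + theta_j * theta_{i|j}, see (4.9)."""
    pij = theta[:, None] * cond
    pij = pij + pij.T
    np.fill_diagonal(pij, 0.0)
    return pij

# Brewer (1963): theta_i prop. to p_i(1-p_i)/(1-2p_i), theta_{j|i} = p_j/(1-p_i)
theta_b = p * (1 - p) / (1 - 2 * p)
theta_b /= theta_b.sum()
cond_b = p[None, :] / (1 - p)[:, None]

# Durbin (1967): theta_i = p_i, theta_{j|i} = p_j((1-2p_i)^-1 + (1-2p_j)^-1)/(1+K)
cond_d = p[None, :] * (1 / (1 - 2 * p)[:, None] + 1 / (1 - 2 * p)[None, :]) / (1 + K)

pij_b, pij_d = joint(theta_b, cond_b), joint(p, cond_d)
print(np.allclose(pij_b, pij_d))               # True: both lead to the same pi_ij
print(pij_b.sum(axis=1), 2 * p)                # row sums equal pi_i = 2 p_i
```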

Using (4.9), we can show that both methods lead to

\[ \pi_{ij} = \frac{2 p_i p_j}{1 + K} \left( \frac{1}{1 - 2 p_i} + \frac{1}{1 - 2 p_j} \right), \tag{4.11} \]

where $K = \sum_{i=1}^{N} p_i (1 - 2 p_i)^{-1}$. Therefore, we have

\[ \pi_i = \sum_{j \neq i} \pi_{ij} = 2 p_i. \tag{4.12} \]

4.5 Systematic πps sampling

Systematic πps sampling is similar to systematic sampling but allows for unequal probabilities of sample selection. Let $a = \sum_{i=1}^{N} x_i / n$ be the sampling interval for the systematic sampling, and assume $x_k < a$ for all $k \in U$. (If some of the $x_k$ are greater than $a$, such elements are selected in advance and systematic sampling is applied to the reduced finite population.) Systematic πps sampling can be described as follows.

1. Choose $R \sim \text{Unif}(0, a]$.

2. Unit $k$ is selected if and only if $L_k < R + l \cdot a \le U_k$ for some $l = 0, 1, \ldots, n-1$, where $L_k = \sum_{j=1}^{k-1} x_j$ with $L_0 = 0$, and $U_k = L_k + x_k$.

Example 4.2. Consider the following artificial finite population of size $N = 4$.

ID   MOS ($x_i$)   L     U
1    10            0     10
2    20            10    30
3    30            30    60
4    40            60    100
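Before the exact selection probabilities are worked out (the example continues below), here is a small simulation sketch of the procedure for this population; unit IDs are 0-based in the code.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(seed=7)
x = np.array([10, 20, 30, 40], dtype=float)
n = 2
a = x.sum() / n                            # sampling interval a = 50
U = np.cumsum(x)                           # upper bounds U_k (with L_k = U_k - x_k)

def systematic_pps(rng):
    R = rng.uniform(0, a)                  # R ~ Unif(0, a]
    points = R + a * np.arange(n)          # R, R + a, ..., R + (n-1)a
    # unit k is selected iff L_k < point <= U_k for some point
    return tuple(int(np.searchsorted(U, pt)) for pt in points)

B = 100_000
freq = Counter(systematic_pps(rng) for _ in range(B))
print({s: c / B for s, c in freq.items()})
# approx. {(0, 2): 0.2, (1, 3): 0.4, (2, 3): 0.4}, matching the distribution derived below
```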

To obtain a systematic sample of size $n = 2$ with first-order inclusion probabilities proportional to $x_i$, note that $a = 100/2 = 50$. Thus, we first generate $R$ from the uniform distribution on $(0, 50]$. If $R$ belongs to $(0, 10]$, we select $A = \{1, 3\}$. If $R$ belongs to $(10, 30]$, we select $A = \{2, 4\}$. If $R$ belongs to $(30, 50]$, we select $A = \{3, 4\}$. The sampling distribution of the resulting sample is therefore

\[ P(A) = \begin{cases} 0.2 & \text{if } A = \{1, 3\} \\ 0.4 & \text{if } A = \{2, 4\} \\ 0.4 & \text{if } A = \{3, 4\}. \end{cases} \]

To compute the first-order inclusion probability of unit $k$, let $l$ be the integer satisfying $l \cdot a \le L_k < U_k \le (l+1) a$. Then

\[ P(k \in A) = P\{L_k < R + l \cdot a \le U_k\} = \int_{L_k - l a}^{U_k - l a} \frac{1}{a} \, dt = \frac{x_k}{a} = \frac{n x_k}{\sum_{k \in U} x_k}. \]

Systematic πps sampling is easy to implement, but it does not admit a design-unbiased variance estimator, as is the case with classical systematic sampling.

References

Brewer, K. R. W. (1963). A model of systematic sampling with unequal probabilities. Australian Journal of Statistics 5, 93-105.

Brewer, K. R. W. and Hanif, M. (1983). Sampling with Unequal Probabilities. New York: Springer-Verlag.

Durbin, J. (1967). Design of multi-stage surveys for the estimation of sampling errors. Applied Statistics 16, 152-164.

Hansen, M. H. and Hurwitz, W. N. (1943). On the theory of sampling from finite populations. Annals of Mathematical Statistics 14, 333-362.

Lahiri, D. B. (1951). A method of sample selection providing unbiased ratio estimates. Bulletin of the International Statistical Institute 33, 133-140.