Bootstrap inference for the finite population total under complex sampling designs

Zhonglei Wang (joint work with Dr. Jae Kwang Kim)
Center for Survey Statistics and Methodology, Iowa State University
Jan. 16, 2018

Outline

1. Introduction
2. A brief review of some sampling designs
3. Bootstrap methods for complex sampling designs
4. Simulation studies
5. Conclusions

Introduction

- The bootstrap is popular:
  - it is easy to implement;
  - it is more accurate than the Wald-type method (Hall, 1992, Section 3.3).
- The classical bootstrap method is not applicable under most sampling designs.
  - Rao and Wu (1988) discussed a rescaling bootstrap method under stratified random sampling.
  - Sitter (1992) considered a mirror-match bootstrap method for without-replacement sampling designs.

Introduction (cont'd)

The goals of this study:

- Propose bootstrap methods for three commonly used sampling designs: Poisson sampling, simple random sampling (SRS), and probability-proportional-to-size (PPS) sampling.
- Study the theoretical properties of the proposed methods.

A brief review of some sampling designs: Sampling designs

- Finite population $\mathcal{F}_N = \{y_1, \ldots, y_N\}$ with known size $N$.
- Parameter of interest: $Y = \sum_{i=1}^{N} y_i$ (or, equivalently, $\bar{Y} = N^{-1} Y$).
- For Poisson sampling and SRS:
  - $I_i$ denotes the sample indicator of element $i$;
  - $\pi_i = E(I_i)$ denotes the first-order inclusion probability.
- Poisson sampling: the sample is obtained from $N$ independent Bernoulli trials, that is, $I_i \sim \mathrm{Ber}(\pi_i)$. Denote $n_0 = \sum_{i=1}^{N} \pi_i$, the expected sample size.
- SRS: a without-replacement sample of size $n$ is selected with equal probabilities, that is, $\pi_i = n N^{-1}$.
- $\hat{Y}_{\mathrm{Poi}} = \sum_{i=1}^{N} I_i \pi_i^{-1} y_i$ is the Horvitz-Thompson estimator of $Y$ under Poisson sampling; $\hat{Y}_{\mathrm{SRS}}$ is defined similarly.
- $\hat{V}_{\mathrm{Poi}}$ and $\hat{V}_{\mathrm{SRS}}$ denote the Horvitz-Thompson variance estimators for the two designs.
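The Poisson-sampling quantities above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the variance formula $\sum_{i \in s} (1 - \pi_i) \pi_i^{-2} y_i^2$ is the standard unbiased estimator under Poisson sampling, which the slide refers to but does not spell out.

```python
import random

def poisson_sample(pi, rng):
    """Indices selected by independent Bernoulli trials, I_i ~ Ber(pi[i])."""
    return [i for i, p in enumerate(pi) if rng.random() < p]

def ht_total(y_s, pi_s):
    """Horvitz-Thompson estimate of the total: sum of y_i / pi_i over the sample."""
    return sum(y / p for y, p in zip(y_s, pi_s))

def ht_variance_poisson(y_s, pi_s):
    """Variance estimate under Poisson sampling (assumed standard form):
    sum of (1 - pi_i) * y_i^2 / pi_i^2 over the sample."""
    return sum((1.0 - p) * y * y / (p * p) for y, p in zip(y_s, pi_s))

if __name__ == "__main__":
    rng = random.Random(2018)
    y = [rng.expovariate(1.0 / 10.0) for _ in range(500)]  # toy population
    pi = [0.2] * 500                                       # equal inclusion probabilities
    s = poisson_sample(pi, rng)
    y_s = [y[i] for i in s]
    pi_s = [pi[i] for i in s]
    print(sum(y), ht_total(y_s, pi_s), ht_variance_poisson(y_s, pi_s))
```

Note that the realized sample size `len(s)` is random under Poisson sampling; only its expectation $n_0$ is fixed.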

A brief review of some sampling designs: Sampling designs (cont'd)

- For PPS sampling:
  - Let $p_i \in (0, 1)$ be the selection probability of $y_i$, with $\sum_{i=1}^{N} p_i = 1$.
  - A sample of size $n$ is obtained by independently selecting a single element from the same finite population $n$ times.
  - $\hat{Y}_{\mathrm{PPS}} = n^{-1} \sum_{i=1}^{n} z_i$ is the Hansen-Hurwitz estimator of $Y$, where $z_i = p_{a,i}^{-1} y_{a,i}$, $a_i$ is the index of the element selected on the $i$-th draw, and $p_{a,i} = p_k$ and $y_{a,i} = y_k$ if $a_i = k$.
  - $\hat{V}_{\mathrm{PPS}}$ denotes the design-unbiased variance estimator (Fuller, 2009, Section 1.2.5).
- Denote $T_{\mathrm{Poi}} = \hat{V}_{\mathrm{Poi}}^{-1/2} (\hat{Y}_{\mathrm{Poi}} - Y)$ for Poisson sampling; $T_{\mathrm{SRS}}$ and $T_{\mathrm{PPS}}$ are defined similarly.
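A minimal sketch of the Hansen-Hurwitz estimator for with-replacement PPS sampling follows. The function names are hypothetical, and the variance estimate $s^2 / n$, with $s^2$ the sample variance of the $z_i$, is assumed to be the design-unbiased estimator the slide cites from Fuller (2009).

```python
import random

def pps_draws(p, n, rng):
    """Draw n indices independently, index k chosen with probability p[k]."""
    return rng.choices(range(len(p)), weights=p, k=n)

def hansen_hurwitz(y, p, draws):
    """Return (estimate of the total, variance estimate) from PPS draws.

    z_i = y_k / p_k for the element k selected on draw i; the estimate is the
    mean of the z_i and the variance estimate is their sample variance over n.
    Requires at least two draws.
    """
    z = [y[k] / p[k] for k in draws]
    n = len(z)
    est = sum(z) / n
    s2 = sum((zi - est) ** 2 for zi in z) / (n - 1)
    return est, s2 / n

if __name__ == "__main__":
    rng = random.Random(7)
    y = [1.0, 4.0, 2.0, 8.0, 5.0]
    p = [0.1, 0.2, 0.1, 0.4, 0.2]
    draws = pps_draws(p, 3, rng)
    print(sum(y), hansen_hurwitz(y, p, draws))
```

When $p_i$ is exactly proportional to $y_i$, every $z_i$ equals $Y$ and the variance estimate is zero, which is the usual motivation for choosing a size measure correlated with $y$.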

Bootstrap methods for complex sampling designs: Bootstrap method for Poisson sampling

1. Based on the current sample of size $n$, generate $(N_1^*, \ldots, N_n^*)$ from a multinomial distribution $\mathrm{MN}(N; \rho)$, where $\rho = (\rho_1, \ldots, \rho_n)$ and $\rho_i \propto \pi_i^{-1}$.
2. For each $i = 1, \ldots, n$, generate $m_i^*$ independently from a binomial distribution $\mathrm{Bin}(N_i^*, \pi_i)$. The bootstrap sample, obtained under Poisson sampling, consists of $m_i^*$ replicates of $y_i$.
3. Repeat the two steps above independently $M$ times.
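The three steps can be sketched as follows, using only the Python standard library. This is an illustrative reading of the slide, not the authors' code: the helper names are hypothetical, the bootstrap total is taken to be $Y^* = \sum_i N_i^* y_i$, and the bootstrap variance estimate reuses the assumed Poisson form $\sum_i m_i^* (1 - \pi_i) \pi_i^{-2} y_i^2$.

```python
import math
import random

def multinomial(total, weights, rng):
    """Counts from MN(total; rho) with rho proportional to weights."""
    counts = [0] * len(weights)
    for k in rng.choices(range(len(weights)), weights=weights, k=total):
        counts[k] += 1
    return counts

def binomial(trials, prob, rng):
    """One draw from Bin(trials, prob) as a sum of Bernoulli trials."""
    return sum(1 for _ in range(trials) if rng.random() < prob)

def poisson_bootstrap_t(y_s, pi_s, N, M, rng):
    """Return the bootstrap replicates of the studentized statistic."""
    rho = [1.0 / p for p in pi_s]
    t_star = []
    for _ in range(M):
        # Step 1: sizes N_i^* of the bootstrap finite population.
        N_star = multinomial(N, rho, rng)
        # Step 2: Poisson sampling from the bootstrap finite population.
        m_star = [binomial(c, p, rng) for c, p in zip(N_star, pi_s)]
        Y_star = sum(c * y for c, y in zip(N_star, y_s))
        Y_hat = sum(m * y / p for m, y, p in zip(m_star, y_s, pi_s))
        V_hat = sum(m * (1.0 - p) * y * y / (p * p)
                    for m, y, p in zip(m_star, y_s, pi_s))
        # Replicates with a zero variance estimate are skipped (an assumption).
        if V_hat > 0.0:
            t_star.append((Y_hat - Y_star) / math.sqrt(V_hat))
    return t_star

if __name__ == "__main__":
    rng = random.Random(2018)
    t = poisson_bootstrap_t([3.0, 12.0, 7.5, 20.0], [0.1, 0.3, 0.2, 0.4], 40, 200, rng)
    print(len(t), min(t), max(t))
```

Each replicate is independent of the others, so the loop over `M` is a natural place to parallelize.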

Bootstrap methods for complex sampling designs: Theoretical results for Poisson sampling

- Let $(\mathcal{F}_N, \mathcal{B}_N, P_{N,\mathrm{Poi}})$ be a probability space, where $\mathcal{B}_N$ is the power set of $\mathcal{F}_N$ and $P_{N,\mathrm{Poi}}(\cdot)$ is a probability measure on $\mathcal{F}_N$ associated with Poisson sampling.
- For any set of positive integers $J \subset \mathbb{N}^+$, let $P_{J,\mathrm{Poi}} = \bigotimes_{j \in J} P_{j,\mathrm{Poi}}$ be the product probability measure on the product space $\prod_{j \in J} \mathcal{F}_j$.
- It can be shown that there exists a probability measure $P_{\mathrm{Poi}}$ on $U = \prod_{N=1}^{\infty} \mathcal{F}_N$, equipped with the product $\sigma$-algebra $\mathcal{B}$, such that $P_{J,\mathrm{Poi}} = P_{\mathrm{Poi}} \circ \xi_J^{-1}$ for every finite set of positive integers $J \subset \mathbb{N}^+$, where $\xi_J$ is the canonical projection from $U$ to $\prod_{j \in J} \mathcal{F}_j$ (Klenke, 2014, Section 14.1).

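Bootstrap methods for complex sampling designs: Theoretical results for Poisson sampling (cont'd)

Lemma 3.1. Under mild conditions, we have

$$\limsup_{N \to \infty} \left( n_0 N^{-2} V_{\mathrm{Poi}} \right) = O(1), \qquad n_0^{2} N^{-3} \mu^{(3)}_{\mathrm{Poi}} = O(1),$$

$$n_0 N^{-2} \left( \hat{V}_{\mathrm{Poi}} - V_{\mathrm{Poi}} \right) \to 0 \ \text{a.s.} \ (P_{\mathrm{Poi}}), \qquad n_0^{2} N^{-3} \left( \hat{\mu}^{(3)}_{\mathrm{Poi}} - \mu^{(3)}_{\mathrm{Poi}} \right) = o_p(1),$$

where

$$V_{\mathrm{Poi}} = \sum_{i=1}^{N} \pi_i^{-1} (1 - \pi_i) y_i^{2},$$

$$\mu^{(3)}_{\mathrm{Poi}} = \sum_{i=1}^{N} y_i^{3} (1 - \pi_i) \{ (1 - \pi_i)^{2} \pi_i^{-2} - 1 \},$$

$$\hat{\mu}^{(3)}_{\mathrm{Poi}} = \sum_{i=1}^{n} \pi_i^{-1} y_i^{3} (1 - \pi_i) \{ (1 - \pi_i)^{2} \pi_i^{-2} - 1 \}.$$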
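As a side note that is not on the slide, the forms of $V_{\mathrm{Poi}}$ and $\mu^{(3)}_{\mathrm{Poi}}$ can be checked directly, assuming only that the $I_i \sim \mathrm{Ber}(\pi_i)$ are independent. Writing $\hat{Y}_{\mathrm{Poi}} - Y = \sum_{i=1}^{N} y_i (I_i \pi_i^{-1} - 1)$ as a sum of independent mean-zero terms, the second and third central moments add across $i$, and for each term:

```latex
\begin{align*}
E\{(I_i \pi_i^{-1} - 1)^2\}
  &= \pi_i (\pi_i^{-1} - 1)^2 + (1 - \pi_i)(-1)^2
   = \pi_i^{-1} (1 - \pi_i), \\
E\{(I_i \pi_i^{-1} - 1)^3\}
  &= \pi_i (\pi_i^{-1} - 1)^3 + (1 - \pi_i)(-1)^3
   = (1 - \pi_i) \{ (1 - \pi_i)^2 \pi_i^{-2} - 1 \}.
\end{align*}
```

Multiplying by $y_i^2$ and $y_i^3$ respectively and summing over $i$ gives the two population quantities in the lemma; the sample version $\hat{\mu}^{(3)}_{\mathrm{Poi}}$ weights each sampled term by $\pi_i^{-1}$.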

Bootstrap methods for complex sampling designs: Theoretical results for Poisson sampling (cont'd)

Theorem 3.1. Under mild conditions, we have

$$\frac{\hat{\mu}^{(3)}_{\mathrm{Poi}}}{\hat{V}_{\mathrm{Poi}}^{3/2}} = O_p(n_0^{-1/2}). \tag{1}$$

Furthermore,

$$\hat{F}_{\mathrm{Poi}}(z) = \Phi(z) + \frac{\hat{\mu}^{(3)}_{\mathrm{Poi}}}{6 \hat{V}_{\mathrm{Poi}}^{3/2}} (1 - z^{2}) \phi(z) + o_p(n_0^{-1/2}) \tag{2}$$

a.s. $(P_{\mathrm{Poi}})$ for $z \in \mathbb{R}$, where $\hat{F}_{\mathrm{Poi}}(z)$ is the cumulative distribution function of $T_{\mathrm{Poi}} = \hat{V}_{\mathrm{Poi}}^{-1/2} (\hat{Y}_{\mathrm{Poi}} - Y)$ under Poisson sampling, and $\Phi(z)$ is the cumulative distribution function of the standard normal distribution with density function $\phi(z)$.

Bootstrap methods for complex sampling designs: Theoretical results for Poisson sampling (cont'd)

Theorem 3.2. Under mild conditions, we have

$$\hat{F}^{*}_{\mathrm{Poi}}(z) = \Phi(z) + \frac{\hat{\mu}^{(3)}_{\mathrm{Poi}}}{6 \hat{V}_{\mathrm{Poi}}^{3/2}} (1 - z^{2}) \phi(z) + o_p(n_0^{-1/2}) \tag{3}$$

a.s., conditional on the sample $\{y_1, \ldots, y_n\}$ obtained by Poisson sampling, in probability, for $z \in \mathbb{R}$, where $\hat{F}^{*}_{\mathrm{Poi}}(z)$ is the cumulative distribution function of the bootstrap statistic $T^{*}_{\mathrm{Poi}}$ conditional on the realized sample.

Bootstrap methods for complex sampling designs: Bootstrap method for SRS

1. Generate $(N_1^*, \ldots, N_n^*)$ from $\mathrm{MN}(N; \rho)$, where $\rho_i = n^{-1}$.
2. Generate a bootstrap sample of size $n$ from $\mathcal{F}_N^*$ using SRS.
3. Repeat the two steps above independently $M$ times.
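A minimal sketch of these steps is given below. It assumes that the bootstrap finite population $\mathcal{F}_N^*$ contains $N_i^*$ copies of $y_i$, that the estimator of the total is $N \bar{y}$, and that the variance estimate takes the usual SRS form $N^2 (1 - n/N) s^2 / n$; the function names are hypothetical.

```python
import math
import random
import statistics

def multinomial(total, weights, rng):
    """Counts from MN(total; rho) with rho proportional to weights."""
    counts = [0] * len(weights)
    for k in rng.choices(range(len(weights)), weights=weights, k=total):
        counts[k] += 1
    return counts

def srs_bootstrap_t(y_s, N, M, rng):
    """Return bootstrap replicates of the studentized statistic under SRS."""
    n = len(y_s)
    t_star = []
    for _ in range(M):
        # Step 1: build the bootstrap finite population of size N.
        N_star = multinomial(N, [1.0] * n, rng)
        pop = [y for y, c in zip(y_s, N_star) for _ in range(c)]
        # Step 2: SRS without replacement of size n from that population.
        boot = rng.sample(pop, n)
        Y_star = sum(pop)
        Y_hat = N * statistics.fmean(boot)
        V_hat = N * N * (1.0 - n / N) * statistics.variance(boot) / n
        # Replicates with a zero variance estimate are skipped (an assumption).
        if V_hat > 0.0:
            t_star.append((Y_hat - Y_star) / math.sqrt(V_hat))
    return t_star

if __name__ == "__main__":
    rng = random.Random(2018)
    y_s = [rng.expovariate(1.0 / 10.0) for _ in range(10)]
    t = srs_bootstrap_t(y_s, 500, 200, rng)
    print(len(t), min(t), max(t))
```

Materializing `pop` costs memory proportional to $N$; for large populations the same draw could be made from the counts `N_star` alone, at the price of slightly more bookkeeping.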

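Bootstrap methods for complex sampling designs: Theoretical results for SRS

Lemma 4.1. Under mild conditions, we have

$$\limsup_{N \to \infty} \sigma^{2}_{\mathrm{SRS}} = O(1), \qquad \mu^{(3)}_{\mathrm{SRS}} = O(1),$$

$$s^{2}_{\mathrm{SRS}} - \sigma^{2}_{\mathrm{SRS}} \to 0 \ \text{a.s.} \ (P_{\mathrm{SRS}}), \qquad \hat{\mu}^{(3)}_{\mathrm{SRS}} - \mu^{(3)}_{\mathrm{SRS}} = o_p(1),$$

where

$$\sigma^{2}_{\mathrm{SRS}} = N^{-1} \sum_{i=1}^{N} (y_i - \bar{Y})^{2}, \qquad \mu^{(3)}_{\mathrm{SRS}} = N^{-1} \sum_{i=1}^{N} (y_i - \bar{Y})^{3},$$

$$\hat{\mu}^{(3)}_{\mathrm{SRS}} = n^{-1} \sum_{i=1}^{n} y_i^{3} + 2 \bar{y}_n^{3} - 3 \bar{y}_n n^{-1} \sum_{i=1}^{n} y_i^{2}, \qquad \bar{y}_n = n^{-1} \sum_{i=1}^{n} y_i.$$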

Bootstrap methods for complex sampling designs: Theoretical results for SRS (cont'd)

Theorem 4.1. Under mild conditions, we have

$$\hat{F}_{\mathrm{SRS}}(z) = \Phi(z) + \frac{(1 - 2 n N^{-1}) \hat{\mu}^{(3)}_{\mathrm{SRS}}}{6 \{ n (1 - n N^{-1}) \}^{1/2} s^{3}_{\mathrm{SRS}}} (1 - z^{2}) \phi(z) + o_p(n^{-1/2}) \tag{4}$$

a.s. $(P_{\mathrm{SRS}})$ for $z \in \mathbb{R}$, where $\hat{F}_{\mathrm{SRS}}(z)$ is the cumulative distribution function of $T_{\mathrm{SRS}}$ under SRS; recall that $T_{\mathrm{SRS}} = \hat{V}_{\mathrm{SRS}}^{-1/2} (\hat{Y}_{\mathrm{SRS}} - Y)$.

Bootstrap methods for complex sampling designs: Theoretical results for SRS (cont'd)

Theorem 4.2. Under mild conditions, we have

$$\hat{F}^{*}_{\mathrm{SRS}}(z) = \Phi(z) + \frac{(1 - 2 n N^{-1}) \hat{\mu}^{(3)}_{\mathrm{SRS}}}{6 \{ n (1 - n N^{-1}) \}^{1/2} s^{3}_{\mathrm{SRS}}} (1 - z^{2}) \phi(z) + o_p(n^{-1/2}) \tag{5}$$

a.s., conditional on the sample $\{y_1, \ldots, y_n\}$ obtained by SRS, in probability, for $z \in \mathbb{R}$, where $\hat{F}^{*}_{\mathrm{SRS}}(z)$ is the cumulative distribution function of the bootstrap statistic $T^{*}_{\mathrm{SRS}}$ conditional on the realized sample.

Bootstrap methods for complex sampling designs: Bootstrap method for PPS

1. Obtain $(N_{a,1}^*, \ldots, N_{a,n}^*)$ from a multinomial distribution $\mathrm{MN}(N; \rho)$, where $\rho_i \propto p_{a,i}^{-1}$.
2. Based on $\mathcal{F}_N^*$, sample one element, with selection probability $(C_N^*)^{-1} p_i^*$ for the $i$-th element, independently $n$ times, where $C_N^* = \sum_{i=1}^{N} p_i^* = \sum_{i=1}^{n} N_{a,i}^* p_{a,i}$.
3. Repeat the two steps above independently $M$ times.
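The PPS steps can be sketched as below. The sketch groups the $N_{a,i}^*$ copies of each sampled element, so a draw lands on sampled element $i$ with probability $N_{a,i}^* p_{a,i} / C_N^*$ and contributes $z^* = C_N^* y_{a,i} / p_{a,i}$. The function names and the variance estimate $s^2 / n$ are assumptions made for illustration.

```python
import math
import random
import statistics

def multinomial(total, weights, rng):
    """Counts from MN(total; rho) with rho proportional to weights."""
    counts = [0] * len(weights)
    for k in rng.choices(range(len(weights)), weights=weights, k=total):
        counts[k] += 1
    return counts

def pps_bootstrap_t(y_s, p_s, N, M, rng):
    """Return bootstrap replicates of the studentized statistic under PPS."""
    n = len(y_s)
    rho = [1.0 / p for p in p_s]
    t_star = []
    for _ in range(M):
        # Step 1: number of copies of each sampled element in F_N^*.
        N_star = multinomial(N, rho, rng)
        C_star = sum(c * p for c, p in zip(N_star, p_s))
        Y_star = sum(c * y for c, y in zip(N_star, y_s))
        # Step 2: n independent draws; grouped probability N_i^* p_i / C_N^*.
        weights = [c * p for c, p in zip(N_star, p_s)]
        draws = rng.choices(range(n), weights=weights, k=n)
        z_star = [C_star * y_s[k] / p_s[k] for k in draws]
        Y_hat = statistics.fmean(z_star)
        V_hat = statistics.variance(z_star) / n
        # Near-zero variance estimates are skipped (an assumption).
        if V_hat > 1e-12 * max(1.0, Y_hat * Y_hat):
            t_star.append((Y_hat - Y_star) / math.sqrt(V_hat))
    return t_star

if __name__ == "__main__":
    rng = random.Random(2018)
    t = pps_bootstrap_t([3.0, 12.0, 7.5, 20.0], [0.01, 0.02, 0.02, 0.03], 100, 200, rng)
    print(len(t), min(t), max(t))
```

Under this grouping the bootstrap estimator is unbiased for $Y^* = \sum_i N_{a,i}^* y_{a,i}$ given the bootstrap population, since the factors $N_{a,i}^* p_{a,i} / C_N^*$ and $C_N^* / p_{a,i}$ cancel.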

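Bootstrap methods for complex sampling designs: Theoretical results for PPS

Lemma 5.1. Under mild conditions, we have

$$\limsup_{N \to \infty} \left( N^{-2} \sigma^{2}_{\mathrm{PPS}} \right) = O(1), \qquad N^{-3} \mu^{(3)}_{\mathrm{PPS}} = O(1),$$

$$N^{-2} \left( s^{2}_{\mathrm{PPS}} - \sigma^{2}_{\mathrm{PPS}} \right) \to 0 \ \text{a.s.} \ (P_{\mathrm{PPS}}), \qquad N^{-3} \left( \hat{\mu}^{(3)}_{\mathrm{PPS}} - \mu^{(3)}_{\mathrm{PPS}} \right) = o_p(1),$$

where

$$\sigma^{2}_{\mathrm{PPS}} = \sum_{i=1}^{N} p_i (p_i^{-1} y_i - Y)^{2}, \qquad \mu^{(3)}_{\mathrm{PPS}} = \sum_{i=1}^{N} p_i (p_i^{-1} y_i - Y)^{3},$$

$s^{2}_{\mathrm{PPS}}$ is the sample variance of $\{z_i : i = 1, \ldots, n\}$ with $z_i = p_{a,i}^{-1} y_{a,i}$, and

$$\hat{\mu}^{(3)}_{\mathrm{PPS}} = n^{-1} \sum_{i=1}^{n} z_i^{3} + 2 \bar{z}_n^{3} - 3 \bar{z}_n n^{-1} \sum_{i=1}^{n} z_i^{2}.$$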

Bootstrap methods for complex sampling designs: Theoretical results for PPS (cont'd)

Theorem 5.1. Under mild conditions, we have

$$\hat{F}_{\mathrm{PPS}}(z) = \Phi(z) + \frac{\hat{\mu}^{(3)}_{\mathrm{PPS}}}{6 \sqrt{n} \, s^{3}_{\mathrm{PPS}}} (1 - z^{2}) \phi(z) + o_p(n^{-1/2}) \tag{6}$$

a.s. $(P_{\mathrm{PPS}})$, where $\hat{F}_{\mathrm{PPS}}$ is the cumulative distribution function of $T_{\mathrm{PPS}} = \hat{V}_{\mathrm{PPS}}^{-1/2} (\hat{Y}_{\mathrm{PPS}} - Y)$ under PPS sampling.

Bootstrap methods for complex sampling designs: Theoretical results for PPS (cont'd)

Theorem 5.2. Under mild conditions, we have

$$\hat{F}^{*}_{\mathrm{PPS}}(z) = \Phi(z) + \frac{\hat{\mu}^{(3)}_{\mathrm{PPS}}}{6 \sqrt{n} \, s^{3}_{\mathrm{PPS}}} (1 - z^{2}) \phi(z) + o_p(n^{-1/2}) \tag{7}$$

a.s., conditional on the sample obtained by PPS sampling, in probability, for $z \in \mathbb{R}$, where $\hat{F}^{*}_{\mathrm{PPS}}(z)$ is the conditional distribution function of the bootstrap statistic $T^{*}_{\mathrm{PPS}}$ given the realized sample.

Simulation studies: Single-stage sampling designs

- A finite population $\mathcal{F}_N = \{y_1, \ldots, y_N\}$ is generated by $y_i \sim \mathrm{Ex}(10)$ with $N = 500$, where $\mathrm{Ex}(\lambda)$ is an exponential distribution with scale parameter $\lambda$.
- The size measure is simulated by $z_i = \log(3 + s_i)$ with $s_i \mid y_i \sim \mathrm{Ex}(y_i)$.
- The expected sample size is $n_0 \in \{10, 100\}$.
- Goal: construct a 90% confidence interval for $\bar{Y}$ under
  - Poisson sampling with $\pi_i \propto z_i$ and $\sum_{i=1}^{N} \pi_i = n_0$;
  - SRS with sample size $n_0$;
  - PPS sampling with $p_i \propto z_i$ and sample size $n_0$.
- Denote by $\tilde{Y}$ the design-unbiased estimate of $\bar{Y}$ under a specific sampling design, with variance estimator $\tilde{V}$.
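The population and design probabilities described above can be generated along the following lines. This is a sketch under the slide's stated distributions, with hypothetical function names and an arbitrary seed; since `random.expovariate` takes a rate, a scale of $\lambda$ is passed as `1 / λ`. No capping of $\pi_i$ at 1 is attempted.

```python
import math
import random

def simulate_population(N, rng):
    """Generate (y, z): y_i ~ Ex(10) and z_i = log(3 + s_i), s_i | y_i ~ Ex(y_i)."""
    y = [rng.expovariate(1.0 / 10.0) for _ in range(N)]
    # Guard against a zero scale in the (practically impossible) case y_i == 0.
    s = [rng.expovariate(1.0 / max(yi, 1e-12)) for yi in y]
    z = [math.log(3.0 + si) for si in s]
    return y, z

def design_probabilities(z, n0):
    """Return (pi, p): pi_i proportional to z_i with sum n0; p_i with sum 1."""
    total = sum(z)
    p = [zi / total for zi in z]
    pi = [n0 * pi_frac for pi_frac in p]
    return pi, p

if __name__ == "__main__":
    rng = random.Random(2018)
    y, z = simulate_population(500, rng)
    pi, p = design_probabilities(z, 100)
    print(sum(y) / len(y), max(pi), sum(p))
```

Because $s_i$ has mean $y_i$, the size measure is positively related to $y_i$, which is what makes the unequal-probability designs informative about the total.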

Simulation studies: Single-stage sampling designs (cont'd)

We consider two methods to obtain the 90% confidence interval.

- Proposed method with $M = 1{,}000$, that is,

  $$\left( \tilde{Y} - q^{*}_{B,0.95} \tilde{V}^{1/2}, \ \tilde{Y} - q^{*}_{B,0.05} \tilde{V}^{1/2} \right),$$

  where $q^{*}_{B,p}$ is the $p$-th quantile of $\{ T^{*(m)} : m = 1, \ldots, M \}$ and $T^{*(m)} = (\tilde{V}^{*(m)})^{-1/2} (\tilde{Y}^{*(m)} - \bar{Y}^{*(m)})$; $\tilde{V}^{*(m)}$, $\tilde{Y}^{*(m)}$ and $\bar{Y}^{*(m)}$ are the corresponding quantities in the $m$-th resampling.
- Wald-type method, that is,

  $$\left( \tilde{Y} - q_{0.95} \tilde{V}^{1/2}, \ \tilde{Y} - q_{0.05} \tilde{V}^{1/2} \right),$$

  where $q_p = \Phi^{-1}(p)$.

1,000 Monte Carlo simulations are conducted for each sampling design.
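The two intervals can be computed as sketched below. The function names are hypothetical, and the empirical quantile uses a simple order-statistic rule (the $\lceil pM \rceil$-th smallest replicate), which is an assumption since the slides do not say which quantile definition was used.

```python
import math
from statistics import NormalDist

def empirical_quantile(values, p):
    """The ceil(p * M)-th order statistic of the values (assumed definition)."""
    ordered = sorted(values)
    k = max(0, math.ceil(p * len(ordered)) - 1)
    return ordered[k]

def bootstrap_t_interval(est, var, t_star, level=0.90):
    """Interval (est - q_hi * se, est - q_lo * se) from bootstrap replicates."""
    alpha = (1.0 - level) / 2.0
    se = math.sqrt(var)
    q_hi = empirical_quantile(t_star, 1.0 - alpha)
    q_lo = empirical_quantile(t_star, alpha)
    return est - q_hi * se, est - q_lo * se

def wald_interval(est, var, level=0.90):
    """Interval based on standard normal quantiles q_p = Phi^{-1}(p)."""
    alpha = (1.0 - level) / 2.0
    se = math.sqrt(var)
    q = NormalDist().inv_cdf(1.0 - alpha)
    return est - q * se, est + q * se

if __name__ == "__main__":
    t_star = [-2.4, -1.1, -0.6, -0.2, 0.0, 0.3, 0.5, 0.9, 1.2, 1.6]
    print(bootstrap_t_interval(10.0, 4.0, t_star))
    print(wald_interval(10.0, 4.0))
```

When the replicates are left-skewed, as is typical for a studentized statistic from a right-skewed population, the bootstrap interval extends further to the right than the Wald-type interval, which is consistent with the longer bootstrap intervals reported in the tables.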

Simulation studies: Single-stage sampling designs (cont'd)

C.R.: coverage rate; C.L.: length of the confidence interval.

Design    Method       n_0 = 10         n_0 = 100
                       C.R.    C.L.     C.R.    C.L.
Poisson   Bootstrap    0.89    15.2     0.89    3.6
          Wald-type    0.83    11.9     0.87    3.5
SRS       Bootstrap    0.86    11.3     0.90    2.8
          Wald-type    0.82     8.9     0.90    2.8
PPS       Bootstrap    0.88    10.3     0.90    2.6
          Wald-type    0.82     7.5     0.89    2.5

Simulation studies: Two-stage sampling designs

- A finite population $\mathcal{F}_N = \{ y_{i,j} : i = 1, \ldots, H; \ j = 1, \ldots, N_i \}$ is generated by

  $$y_{i,j} = 50 + a_i + e_{i,j}, \qquad a_i \sim N(0, 50), \qquad e_{i,j} \sim \mathrm{Ex}(20), \qquad N_i \mid a_i \sim \mathrm{Po}(q_i) + c_0,$$

  where $H = 100$, $\mathrm{Po}(\lambda)$ is a Poisson distribution with rate parameter $\lambda$, $q_i = (a_i - 25)^{2} / 20$, and $c_0 = 40$ is the minimum cluster size.
- The finite population size is $N = 17{,}011$. The cluster sizes range from 40 to 542.
- We assume that $N$ and $N_1, \ldots, N_H$ are known.

Simulation studies: Two-stage sampling designs (cont'd)

- Goal: construct 90% confidence intervals for $\bar{Y}$ and $P = N^{-1} \sum_{i=1}^{H} \sum_{j=1}^{N_i} \delta_{(-\infty, q_y)}(y_{i,j})$.
- We consider two different sampling designs for the first stage:
  - Poisson sampling with $\pi_i \propto N_i$ and $\sum_{i=1}^{H} \pi_i = n_1$;
  - PPS sampling with $p_i \propto z_i$ and sample size $n_1$.
- We use SRS as the second-stage sampling design, with sample size $n_2$ for each sampled cluster.
- We consider $n_1 \in \{10, 30\}$ and $n_2 \in \{10, 30\}$.

Simulation studies: Two-stage sampling designs (cont'd)

We consider two methods to obtain the 90% confidence interval.

- The proposed method, extended to a two-stage sampling design with $M = 500$. The finite population is bootstrapped in two steps:
  1. Use the proposed method to bootstrap the $H$ clusters, treating them as elements; the original sample within each selected cluster is replicated accordingly.
  2. For each bootstrap cluster, apply the proposed method independently to bootstrap the cluster finite population.
- The Wald-type method, which is the same as the one discussed for the single-stage designs.

500 Monte Carlo simulations are conducted for each sampling design.

Simulation studies: Two-stage sampling designs (cont'd)

Table: Coverage rate (C.R.) and length (C.L.) of the 90% confidence interval for $\bar{Y}$.

Design    n_2    Method       n_1 = 10         n_1 = 30
                              C.R.    C.L.     C.R.    C.L.
Poisson   10     Bootstrap    0.91    74.8     0.90    34.4
                 Wald-type    0.89    69.6     0.90    33.9
          30     Bootstrap    0.90    74.5     0.90    34.2
                 Wald-type    0.89    69.4     0.90    33.6
PPS       10     Bootstrap    0.90     9.1     0.91     4.8
                 Wald-type    0.87     8.0     0.90     4.7
          30     Bootstrap    0.90     7.8     0.89     4.1
                 Wald-type    0.86     6.8     0.88     4.0

Simulation studies: Two-stage sampling designs (cont'd)

Table: Coverage rate (C.R.) and length (C.L.) of the 90% confidence interval for $P$.

Design    n_2    Method       n_1 = 10         n_1 = 30
                              C.R.    C.L.     C.R.    C.L.
Poisson   10     Bootstrap    0.91    0.6      0.89    0.3
                 Wald-type    0.87    0.5      0.88    0.3
          30     Bootstrap    0.88    0.6      0.90    0.2
                 Wald-type    0.87    0.5      0.91    0.2
PPS       10     Bootstrap    0.89    0.3      0.90    0.2
                 Wald-type    0.84    0.2      0.90    0.2
          30     Bootstrap    0.90    0.3      0.90    0.1
                 Wald-type    0.85    0.2      0.89    0.1

Simulation studies: Remarks on the simulation studies

- For the two-stage sampling designs:
  - the sampling distribution of $\tilde{Y}$ is approximately symmetric under both designs, even when the sample size is small;
  - the sampling distribution of the proportion estimator is slightly right-skewed when $n_1 = 10$.
- We have compared the proposed method with the nonparametric Bayesian bootstrap method (Dong et al., 2014) and with the method based on two-step inverse sampling (Sverchkov and Pfeffermann, 2004); the proposed method works better in terms of the coverage rate.

Conclusions

- We propose a bootstrap method for Poisson sampling, SRS and PPS sampling, and we show that the proposed method is second-order accurate.
- It is necessary to estimate the variance of the design-unbiased estimator, since the proposed method is based on an asymptotically pivotal statistic.
- Although the proposed method is discussed under single-stage sampling designs, simulation shows that it works well under some two-stage sampling designs.
- It may be extended to other complex sampling designs when the asymptotic distribution of the design-unbiased estimator exists, but second-order accuracy may not be guaranteed.
- The proposed method can be easily parallelized in practice.

Selected references

Dong, Q., Elliott, M. R. & Raghunathan, T. E. (2014). A nonparametric method to generate synthetic populations to adjust for complex sampling design features. Surv. Methodol. 40, 29-46.
Fuller, W. A. (2009). Sampling Statistics. Hoboken: John Wiley.
Hall, P. (1992). The Bootstrap and Edgeworth Expansion. New York: Springer Science & Business Media.
Klenke, A. (2014). Probability Theory: A Comprehensive Course. 2nd edition. London: Springer Verlag London Ltd.
Rao, J. N. K. & Wu, C. F. J. (1988). Resampling inference with complex survey data. J. Amer. Statist. Assoc. 83, 231-241.
Sitter, R. R. (1992). A resampling procedure for complex survey data. J. Amer. Statist. Assoc. 87, 755-765.
Sverchkov, M. & Pfeffermann, D. (2004). Prediction of finite population totals based on the sample distribution. Surv. Methodol. 30, 79-92.

Thank you!