Bootstrap inference for the finite population total under complex sampling designs
1 Bootstrap inference for the finite population total under complex sampling designs Zhonglei Wang (Joint work with Dr. Jae Kwang Kim) Center for Survey Statistics and Methodology Iowa State University Jan. 16, 2018 Zhonglei Wang Bootstrap for complex sampling Jan. 16, / 36
2 Outline
  1. Introduction
  2. A brief review of some sampling designs
  3. Bootstrap methods for complex sampling designs
  4. Simulation studies
  5. Conclusions
3 Introduction
4 Introduction
The bootstrap is popular.
- It is easy to implement.
- It achieves higher accuracy than the Wald-type method (Hall, 1992, Section 3.3).
The classical bootstrap method is not applicable under most sampling designs.
- Rao and Wu (1988) discussed a rescaling bootstrap method under stratified random sampling.
- Sitter (1992) considered a mirror-match bootstrap method for sampling designs without replacement.
5 Introduction (Cont'd)
The goal of this study:
- Propose bootstrap methods for three commonly used sampling designs: Poisson sampling, simple random sampling (SRS), and probability-proportional-to-size (PPS) sampling.
- Study the theoretical properties of the proposed methods.
6 A brief review of some sampling designs
7 Sampling designs
Finite population $\mathcal{F}_N = \{y_1, \ldots, y_N\}$ with a known size $N$.
Parameter of interest: $Y = \sum_{i=1}^{N} y_i$ (or, equivalently, $\bar{Y} = N^{-1} Y$).
For Poisson sampling and SRS,
- Denote by $I_i$ the sample indicator.
- Denote by $\pi_i = E(I_i)$ the first-order inclusion probability.
For Poisson sampling, a sample is obtained from $N$ independent Bernoulli trials; that is, $I_i \sim \mathrm{Ber}(\pi_i)$. Denote $n_0 = \sum_{i=1}^{N} \pi_i$.
For SRS, a without-replacement sample of size $n$ is selected with equal probabilities; that is, $\pi_i = n N^{-1}$.
Denote by $\hat{Y}_{\mathrm{Poi}} = \sum_{i=1}^{N} I_i \pi_i^{-1} y_i$ the Horvitz-Thompson estimator of $Y$ under Poisson sampling, and define $\hat{Y}_{\mathrm{SRS}}$ similarly.
Denote by $\hat{V}_{\mathrm{Poi}}$ and $\hat{V}_{\mathrm{SRS}}$ the Horvitz-Thompson variance estimators for the two designs.
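The Poisson-sampling quantities above can be sketched in a few lines of Python. This is only an illustration: the function names and the toy population are assumptions, and the variance estimator is written in the usual Horvitz-Thompson form for Poisson sampling, $\sum_{i \in s} (1 - \pi_i) \pi_i^{-2} y_i^2$, which the slide does not spell out.

```python
import random

def poisson_sample(pi, rng):
    """Indices selected by N independent Bernoulli(pi_i) trials."""
    return [i for i, p in enumerate(pi) if rng.random() < p]

def ht_total(y, pi, sample):
    """Horvitz-Thompson estimator of Y: sum of y_i / pi_i over the sample."""
    return sum(y[i] / pi[i] for i in sample)

def ht_variance_poisson(y, pi, sample):
    """Assumed variance estimator: sum of (1 - pi_i) y_i^2 / pi_i^2 over the sample."""
    return sum((1.0 - pi[i]) * y[i] ** 2 / pi[i] ** 2 for i in sample)

# Toy population (hypothetical values, not the simulation from the talk).
rng = random.Random(2018)
y = [rng.expovariate(1 / 10) for _ in range(500)]
pi = [0.2] * len(y)                 # equal probabilities, so n_0 = 100
s = poisson_sample(pi, rng)
y_hat = ht_total(y, pi, s)
v_hat = ht_variance_poisson(y, pi, s)
```

With all $\pi_i = 1$ the sample is a census, so the estimator returns $Y$ exactly and the variance estimate is zero.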
8 Sampling designs (Cont'd)
For PPS sampling,
- Let $p_i \in (0, 1)$ be the selection probability of $y_i$, with $\sum_{i=1}^{N} p_i = 1$.
- A sample of size $n$ is obtained by independently selecting a single element from the same finite population $n$ times.
- Denote by $\hat{Y}_{\mathrm{PPS}} = n^{-1} \sum_{i=1}^{n} z_i$ the Hansen-Hurwitz estimator of $Y$, where $z_i = p_{a,i}^{-1} y_{a,i}$, $a_i$ is the index of the element selected in the $i$-th draw, and $p_{a,i} = p_k$ and $y_{a,i} = y_k$ if $a_i = k$.
- Denote by $\hat{V}_{\mathrm{PPS}}$ the design-unbiased variance estimator (Fuller, 2009, Section 1.2.5).
Denote $T_{\mathrm{Poi}} = \hat{V}_{\mathrm{Poi}}^{-1/2} (\hat{Y}_{\mathrm{Poi}} - Y)$ for Poisson sampling, and define $T_{\mathrm{SRS}}$ and $T_{\mathrm{PPS}}$ similarly.
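A minimal sketch of the Hansen-Hurwitz estimator for with-replacement PPS sampling follows. The function names are hypothetical, and the variance estimator is assumed to be the standard with-replacement form $\hat{V}_{\mathrm{PPS}} = n^{-1} s_z^2$, where $s_z^2$ is the sample variance of the $z_i$.

```python
import random

def pps_sample(p, n, rng):
    """Draw n indices independently, with replacement, with probabilities p."""
    return rng.choices(range(len(p)), weights=p, k=n)

def hansen_hurwitz(y, p, draws):
    """Return (estimate of Y, assumed variance estimate s_z^2 / n)."""
    n = len(draws)
    z = [y[k] / p[k] for k in draws]          # z_i = y_{a,i} / p_{a,i}
    z_bar = sum(z) / n
    s2 = sum((zi - z_bar) ** 2 for zi in z) / (n - 1)
    return z_bar, s2 / n

# Toy example: p_i exactly proportional to y_i, so every z_i equals Y = 8.
rng = random.Random(16)
y = [1.0, 1.0, 2.0, 4.0]
p = [0.125, 0.125, 0.25, 0.5]
draws = pps_sample(p, 5, rng)
estimate, variance = hansen_hurwitz(y, p, draws)
```

The toy example shows why size measures correlated with $y_i$ are attractive: when $p_i \propto y_i$ exactly, the estimator has no sampling variability.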
9 Bootstrap methods for complex sampling designs
10 Bootstrap method for Poisson sampling
  1. Based on the current sample of size $n$, generate $(N_1^*, \ldots, N_n^*)$ from a multinomial distribution $\mathrm{MN}(N; \rho)$, where $\rho = (\rho_1, \ldots, \rho_n)$ and $\rho_i \propto \pi_i^{-1}$.
  2. For each $i = 1, \ldots, n$, generate $m_i^*$ independently from a binomial distribution $\mathrm{Bin}(N_i^*, \pi_i)$. The bootstrap sample consists of $m_i^*$ replicates of $y_i$ under Poisson sampling.
  3. Repeat the two steps above independently $M$ times.
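The three steps above can be sketched as follows. This is a hedged illustration, not the authors' implementation: the helper names are hypothetical, and the studentized statistic is assumed to be centered at the bootstrap population total $\sum_i N_i^* y_i$, mirroring the definition of $T_{\mathrm{Poi}}$.

```python
import random
from collections import Counter

def multinomial(total, weights, rng):
    """Counts from MN(total; rho), with rho proportional to weights."""
    picks = Counter(rng.choices(range(len(weights)), weights=weights, k=total))
    return [picks.get(i, 0) for i in range(len(weights))]

def binomial(trials, prob, rng):
    """Bin(trials, prob) as a sum of Bernoulli draws."""
    return sum(rng.random() < prob for _ in range(trials))

def poisson_bootstrap_replicate(y, pi, N, rng):
    """One replicate; y and pi hold the sampled values and their pi_i.
    Returns (N_star, m_star, t_star); t_star is None if the variance is zero."""
    n = len(y)
    N_star = multinomial(N, [1.0 / p for p in pi], rng)            # step 1
    m_star = [binomial(N_star[i], pi[i], rng) for i in range(n)]   # step 2
    y_pop = sum(N_star[i] * y[i] for i in range(n))   # bootstrap population total
    y_hat = sum(m_star[i] * y[i] / pi[i] for i in range(n))
    v_hat = sum(m_star[i] * (1 - pi[i]) * y[i] ** 2 / pi[i] ** 2 for i in range(n))
    t_star = (y_hat - y_pop) / v_hat ** 0.5 if v_hat > 0 else None
    return N_star, m_star, t_star

# Step 3: repeat M times (toy sample and toy N, chosen only for illustration).
rng = random.Random(1)
y_s, pi_s = [3.0, 5.0, 8.0], [0.5, 0.25, 0.2]
reps = [poisson_bootstrap_replicate(y_s, pi_s, 50, rng) for _ in range(200)]
t_values = [t for _, _, t in reps if t is not None]
```

Replicates with an empty bootstrap sample give a zero variance estimate and are skipped here; how such replicates should be handled is a choice the slides do not address.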
11 Theoretical results for Poisson sampling
Denote by $(\mathcal{F}_N, \mathcal{B}_N, P_{N,\mathrm{Poi}})$ a probability space, where $\mathcal{B}_N$ is the power set of $\mathcal{F}_N$ and $P_{N,\mathrm{Poi}}(\cdot)$ is a probability measure on $\mathcal{F}_N$ associated with Poisson sampling.
For any set of positive integers $J \subset \mathbb{N}^+$, denote by $P_{J,\mathrm{Poi}} = \bigotimes_{j \in J} P_{j,\mathrm{Poi}}$ the product probability measure on the product space $\prod_{j \in J} \mathcal{F}_j$.
It can be shown that there exists a probability measure $P_{\mathrm{Poi}}$ on $\mathcal{U} = \prod_{N=1}^{\infty} \mathcal{F}_N$, equipped with the product $\sigma$-algebra $\mathcal{B}$, such that $P_{J,\mathrm{Poi}} = P_{\mathrm{Poi}} \circ \xi_J^{-1}$ for every finite set of positive integers $J \subset \mathbb{N}^+$, where $\xi_J$ is the canonical projection from $\mathcal{U}$ to $\prod_{j \in J} \mathcal{F}_j$ (Klenke, 2014, Section 14.1).
12 Theoretical results for Poisson sampling (Cont'd)
Lemma (Lemma 3.1). Under mild conditions, we have
$\limsup_{N \to \infty} (n_0 N^{-2} V_{\mathrm{Poi}}) = O(1)$,
$n_0^2 N^{-3} \mu^{(3)}_{\mathrm{Poi}} = O(1)$,
$n_0 N^{-2} (\hat{V}_{\mathrm{Poi}} - V_{\mathrm{Poi}}) \to 0$ a.s. $(P_{\mathrm{Poi}})$,
$n_0^2 N^{-3} (\hat{\mu}^{(3)}_{\mathrm{Poi}} - \mu^{(3)}_{\mathrm{Poi}}) = o_p(1)$,
where
$V_{\mathrm{Poi}} = \sum_{i=1}^{N} \pi_i^{-1} (1 - \pi_i) y_i^2$,
$\mu^{(3)}_{\mathrm{Poi}} = \sum_{i=1}^{N} y_i^3 (1 - \pi_i) \{(1 - \pi_i)^2 \pi_i^{-2} - 1\}$,
$\hat{\mu}^{(3)}_{\mathrm{Poi}} = \sum_{i=1}^{n} \pi_i^{-1} y_i^3 (1 - \pi_i) \{(1 - \pi_i)^2 \pi_i^{-2} - 1\}$.
13 Theoretical results for Poisson sampling (Cont'd)
Theorem (Theorem 3.1). Under mild conditions, we have
$\hat{\mu}^{(3)}_{\mathrm{Poi}} / \hat{V}_{\mathrm{Poi}}^{3/2} = O_p(n_0^{-1/2})$. (1)
Furthermore,
$\hat{F}_{\mathrm{Poi}}(z) = \Phi(z) + \frac{\hat{\mu}^{(3)}_{\mathrm{Poi}}}{6 \hat{V}_{\mathrm{Poi}}^{3/2}} (1 - z^2) \phi(z) + o_p(n_0^{-1/2})$ (2)
a.s. $(P_{\mathrm{Poi}})$ for $z \in \mathbb{R}$, where $\hat{F}_{\mathrm{Poi}}(z)$ is the cumulative distribution function of $T_{\mathrm{Poi}} = \hat{V}_{\mathrm{Poi}}^{-1/2} (\hat{Y}_{\mathrm{Poi}} - Y)$ under Poisson sampling, and $\Phi(z)$ is the cumulative distribution function of the standard normal distribution with density function $\phi(z)$.
14 Theoretical results for Poisson sampling (Cont'd)
Theorem (Theorem 3.2). Under mild conditions, we have
$\hat{F}^*_{\mathrm{Poi}}(z) = \Phi(z) + \frac{\hat{\mu}^{(3)}_{\mathrm{Poi}}}{6 \hat{V}_{\mathrm{Poi}}^{3/2}} (1 - z^2) \phi(z) + o_p(n_0^{-1/2})$ (3)
a.s. conditional on the sample $\{y_1, \ldots, y_n\}$ obtained by Poisson sampling, in probability, for $z \in \mathbb{R}$, where $\hat{F}^*_{\mathrm{Poi}}(z)$ is the cumulative distribution function of the bootstrap statistic $T^*_{\mathrm{Poi}}$ conditional on the realized sample.
15 Bootstrap method for SRS
  1. Generate $(N_1^*, \ldots, N_n^*)$ from $\mathrm{MN}(N; \rho)$, where $\rho_i = n^{-1}$.
  2. Generate a bootstrap sample of size $n$ from $\mathcal{F}_N^*$ using SRS, where $\mathcal{F}_N^*$ is the bootstrap finite population containing $N_i^*$ copies of $y_i$.
  3. Repeat the two steps above independently $M$ times.
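A minimal sketch of one SRS bootstrap replicate is given below, assuming that the bootstrap population holds $N_i^*$ copies of $y_i$ and that the studentized statistic uses the usual SRS estimators $\hat{Y} = N \bar{y}$ and $\hat{V} = N^2 (1 - n/N) s^2 / n$. The function name and toy data are hypothetical.

```python
import random

def srs_bootstrap_replicate(y, N, rng):
    """One replicate from the SRS sample y; requires N >= len(y).
    Returns (bootstrap population, bootstrap sample, t_star or None)."""
    n = len(y)
    # Step 1: MN(N; rho) with rho_i = 1/n, drawn as N equiprobable picks,
    # which yields a population with N*_i copies of y_i.
    population = [y[rng.randrange(n)] for _ in range(N)]
    # Step 2: SRS without replacement of size n from the bootstrap population.
    boot = rng.sample(population, n)
    y_bar = sum(boot) / n
    s2 = sum((v - y_bar) ** 2 for v in boot) / (n - 1)
    y_hat = N * y_bar
    v_hat = N ** 2 * (1 - n / N) * s2 / n
    y_pop = sum(population)            # bootstrap population total
    t_star = (y_hat - y_pop) / v_hat ** 0.5 if v_hat > 0 else None
    return population, boot, t_star

# Step 3: repeat M times on a toy sample.
rng = random.Random(4)
sample = [1.0, 2.0, 4.0, 7.0, 11.0]
t_values = []
for _ in range(200):
    _, _, t = srs_bootstrap_replicate(sample, 40, rng)
    if t is not None:
        t_values.append(t)
```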
16 Theoretical results for SRS
Lemma (Lemma 4.1). Under mild conditions, we have
$\limsup_{N \to \infty} \sigma^2_{\mathrm{SRS}} = O(1)$,
$\mu^{(3)}_{\mathrm{SRS}} = O(1)$,
$s^2_{\mathrm{SRS}} - \sigma^2_{\mathrm{SRS}} \to 0$ a.s. $(P_{\mathrm{SRS}})$,
$\hat{\mu}^{(3)}_{\mathrm{SRS}} - \mu^{(3)}_{\mathrm{SRS}} = o_p(1)$,
where
$\sigma^2_{\mathrm{SRS}} = N^{-1} \sum_{i=1}^{N} (y_i - \bar{Y})^2$,
$\mu^{(3)}_{\mathrm{SRS}} = N^{-1} \sum_{i=1}^{N} (y_i - \bar{Y})^3$,
$\hat{\mu}^{(3)}_{\mathrm{SRS}} = n^{-1} \sum_{i=1}^{n} y_i^3 + 2 \bar{y}_n^3 - 3 \bar{y}_n n^{-1} \sum_{i=1}^{n} y_i^2$,
$\bar{y}_n = n^{-1} \sum_{i=1}^{n} y_i$.
17 Theoretical results for SRS (Cont'd)
Theorem (Theorem 4.1). Under mild conditions, we have
$\hat{F}_{\mathrm{SRS}}(z) = \Phi(z) + \frac{(1 - 2 n N^{-1}) \hat{\mu}^{(3)}_{\mathrm{SRS}}}{6 \{n (1 - n N^{-1})\}^{1/2} s^3_{\mathrm{SRS}}} (1 - z^2) \phi(z) + o_p(n^{-1/2})$ (4)
a.s. $(P_{\mathrm{SRS}})$ for $z \in \mathbb{R}$, where $\hat{F}_{\mathrm{SRS}}(z)$ is the cumulative distribution function of $T_{\mathrm{SRS}}$ under SRS; recall that $T_{\mathrm{SRS}} = \hat{V}_{\mathrm{SRS}}^{-1/2} (\hat{Y}_{\mathrm{SRS}} - Y)$.
18 Theoretical results for SRS (Cont'd)
Theorem (Theorem 4.2). Under mild conditions, we have
$\hat{F}^*_{\mathrm{SRS}}(z) = \Phi(z) + \frac{(1 - 2 n N^{-1}) \hat{\mu}^{(3)}_{\mathrm{SRS}}}{6 \{n (1 - n N^{-1})\}^{1/2} s^3_{\mathrm{SRS}}} (1 - z^2) \phi(z) + o_p(n^{-1/2})$ (5)
a.s. conditional on the sample $\{y_1, \ldots, y_n\}$ obtained by SRS, in probability, for $z \in \mathbb{R}$, where $\hat{F}^*_{\mathrm{SRS}}(z)$ is the cumulative distribution function of the bootstrap statistic $T^*_{\mathrm{SRS}}$ conditional on the realized sample.
19 Bootstrap method for PPS
  1. Obtain $(N_{a,1}^*, \ldots, N_{a,n}^*)$ from a multinomial distribution $\mathrm{MN}(N; \rho)$, where $\rho_i \propto p_{a,i}^{-1}$.
  2. Based on $\mathcal{F}_N^*$, sample one element, with selection probability $(C_N^*)^{-1} p_i^*$ for the $i$-th element, independently $n$ times, where $C_N^* = \sum_{i=1}^{N} p_i^* = \sum_{i=1}^{n} N_{a,i}^* p_{a,i}$.
  3. Repeat the two steps above independently $M$ times.
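One PPS bootstrap replicate can be sketched as below. The sketch assumes that the bootstrap population holds $N_{a,i}^*$ copies of each drawn pair $(y_{a,i}, p_{a,i})$, so drawing "some copy of draw $i$" has probability $N_{a,i}^* p_{a,i} / C_N^*$, and that the statistic is studentized with the with-replacement variance estimator $s_z^2 / n$. All names are hypothetical.

```python
import random
from collections import Counter

def pps_bootstrap_replicate(y, p, N, rng):
    """y, p: values and selection probabilities of the n original draws.
    Returns (counts, bootstrap draws, t_star or None)."""
    n = len(y)
    # Step 1: MN(N; rho) with rho_i proportional to 1 / p_{a,i}.
    picks = Counter(rng.choices(range(n), weights=[1.0 / pk for pk in p], k=N))
    counts = [picks.get(i, 0) for i in range(n)]
    # Step 2: n independent draws from the bootstrap population; each copy of
    # draw i has normalized selection probability p[i] / c_star.
    weights = [counts[i] * p[i] for i in range(n)]
    c_star = sum(weights)
    draws = rng.choices(range(n), weights=weights, k=n)
    z = [y[k] * c_star / p[k] for k in draws]
    z_bar = sum(z) / n
    s2 = sum((zi - z_bar) ** 2 for zi in z) / (n - 1)
    y_pop = sum(counts[i] * y[i] for i in range(n))   # bootstrap population total
    t_star = (z_bar - y_pop) / (s2 / n) ** 0.5 if s2 > 0 else None
    return counts, draws, t_star

# Step 3: repeat M times on a toy sample.
rng = random.Random(5)
y_s = [2.0, 9.0, 4.0, 6.0]
p_s = [0.01, 0.02, 0.015, 0.01]
reps = [pps_bootstrap_replicate(y_s, p_s, 100, rng) for _ in range(200)]
t_values = [t for _, _, t in reps if t is not None]
```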
20 Theoretical results for PPS
Lemma (Lemma 5.1). Under mild conditions, we have
$\limsup_{N \to \infty} (N^{-2} \sigma^2_{\mathrm{PPS}}) = O(1)$,
$N^{-3} \mu^{(3)}_{\mathrm{PPS}} = O(1)$,
$N^{-2} (s^2_{\mathrm{PPS}} - \sigma^2_{\mathrm{PPS}}) \to 0$ a.s. $(P_{\mathrm{PPS}})$,
$N^{-3} (\hat{\mu}^{(3)}_{\mathrm{PPS}} - \mu^{(3)}_{\mathrm{PPS}}) = o_p(1)$,
where
$\sigma^2_{\mathrm{PPS}} = \sum_{i=1}^{N} p_i (p_i^{-1} y_i - Y)^2$,
$\mu^{(3)}_{\mathrm{PPS}} = \sum_{i=1}^{N} p_i (p_i^{-1} y_i - Y)^3$,
$s^2_{\mathrm{PPS}}$ is the sample variance of $\{z_i : i = 1, \ldots, n\}$ with $z_i = p_{a,i}^{-1} y_{a,i}$,
$\hat{\mu}^{(3)}_{\mathrm{PPS}} = n^{-1} \sum_{i=1}^{n} z_i^3 + 2 \bar{z}_n^3 - 3 \bar{z}_n n^{-1} \sum_{i=1}^{n} z_i^2$.
21 Theoretical results for PPS (Cont'd)
Theorem (Theorem 5.1). Under mild conditions, we have
$\hat{F}_{\mathrm{PPS}}(z) = \Phi(z) + \frac{\hat{\mu}^{(3)}_{\mathrm{PPS}}}{6 \sqrt{n}\, s^3_{\mathrm{PPS}}} (1 - z^2) \phi(z) + o_p(n^{-1/2})$ (6)
a.s. $(P_{\mathrm{PPS}})$, where $\hat{F}_{\mathrm{PPS}}$ is the cumulative distribution function of $T_{\mathrm{PPS}} = \hat{V}_{\mathrm{PPS}}^{-1/2} (\hat{Y}_{\mathrm{PPS}} - Y)$ under PPS sampling.
22 Theoretical results for PPS (Cont'd)
Theorem (Theorem 5.2). Under mild conditions, we have
$\hat{F}^*_{\mathrm{PPS}}(z) = \Phi(z) + \frac{\hat{\mu}^{(3)}_{\mathrm{PPS}}}{6 \sqrt{n}\, s^3_{\mathrm{PPS}}} (1 - z^2) \phi(z) + o_p(n^{-1/2})$ (7)
a.s. conditional on the sample obtained by PPS sampling, in probability, for $z \in \mathbb{R}$, where $\hat{F}^*_{\mathrm{PPS}}(z)$ is the conditional distribution of the bootstrap statistic $T^*_{\mathrm{PPS}}$ given the realized sample.
23 Simulation studies
24 Single-stage sampling designs
A finite population $\mathcal{F}_N = \{y_1, \ldots, y_N\}$ is generated by $y_i \sim \mathrm{Ex}(10)$ with $N = 500$, where $\mathrm{Ex}(\lambda)$ is an exponential distribution with scale parameter $\lambda$.
The size measure is simulated by $z_i = \log(3 + s_i)$ with $s_i \mid y_i \sim \mathrm{Ex}(y_i)$.
The expected sample size is $n_0 \in \{10, 100\}$.
Goal: construct a 90% confidence interval for $\bar{Y}$ under
- Poisson sampling with $\pi_i \propto z_i$ and $\sum_{i=1}^{N} \pi_i = n_0$,
- SRS with sample size $n_0$,
- PPS sampling with $p_i \propto z_i$ and sample size $n_0$.
Denote by $\tilde{Y}$ the design-unbiased estimate of $\bar{Y}$ under a specific sampling design, with variance estimator $\tilde{V}$.
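The population and size measures described above can be generated as in the following sketch. The seed and variable names are arbitrary choices; `random.expovariate` takes a rate, so a scale of $\lambda$ is passed as `1 / lambda`.

```python
import math
import random

rng = random.Random(2018)      # arbitrary seed
N, n0 = 500, 10

# y_i ~ Ex(10): exponential with scale 10, i.e. rate 1/10.
y = [rng.expovariate(1 / 10) for _ in range(N)]
# s_i | y_i ~ Ex(y_i): exponential with scale y_i.
s = [rng.expovariate(1 / yi) for yi in y]
# Size measure z_i = log(3 + s_i).
z = [math.log(3 + si) for si in s]

z_total = sum(z)
pi = [n0 * zi / z_total for zi in z]   # Poisson sampling: pi_i prop. to z_i, sum = n0
p = [zi / z_total for zi in z]         # PPS sampling: p_i prop. to z_i, sum = 1
y_bar = sum(y) / N                     # target parameter
```

Since $z_i \ge \log 3$, the inclusion probabilities stay well below one for $n_0 = 10$; for larger $n_0$ it is worth checking `max(pi) < 1` before sampling.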
25 Single-stage sampling designs (Cont'd)
We consider two methods to obtain the 90% confidence interval.
- Proposed method with $M = 1{,}000$, that is,
  $(\tilde{Y} - q^*_{B,0.95} \tilde{V}^{1/2},\ \tilde{Y} - q^*_{B,0.05} \tilde{V}^{1/2})$,
  where $q^*_{B,p}$ is the $p$-th quantile of $\{T^{*(m)} : m = 1, \ldots, M\}$, $T^{*(m)} = (\tilde{V}^{*(m)})^{-1/2} (\tilde{Y}^{*(m)} - \bar{Y}^{*(m)})$, and $\tilde{V}^{*(m)}$, $\tilde{Y}^{*(m)}$ and $\bar{Y}^{*(m)}$ are the quantities in the $m$-th resampling.
- Wald-type method, that is,
  $(\tilde{Y} - q_{0.95} \tilde{V}^{1/2},\ \tilde{Y} - q_{0.05} \tilde{V}^{1/2})$,
  where $q_p = \Phi^{-1}(p)$.
1,000 Monte Carlo simulations are conducted for each sampling design.
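Both intervals share the same form and differ only in where the quantiles come from, as the sketch below illustrates. The function names are hypothetical, and the empirical quantile uses a simple order-statistic rule, which is one of several reasonable conventions.

```python
import math
from statistics import NormalDist

def empirical_quantile(values, p):
    """p-th sample quantile: the ceil(p * M)-th order statistic."""
    ordered = sorted(values)
    k = max(1, math.ceil(p * len(ordered)))
    return ordered[k - 1]

def bootstrap_t_interval(estimate, variance, t_star, lo=0.05, hi=0.95):
    """(est - q*_hi * se, est - q*_lo * se), quantiles taken from bootstrap T* values."""
    se = math.sqrt(variance)
    return (estimate - empirical_quantile(t_star, hi) * se,
            estimate - empirical_quantile(t_star, lo) * se)

def wald_interval(estimate, variance, lo=0.05, hi=0.95):
    """Same form, with standard normal quantiles q_p = Phi^{-1}(p)."""
    se = math.sqrt(variance)
    normal = NormalDist()
    return (estimate - normal.inv_cdf(hi) * se,
            estimate - normal.inv_cdf(lo) * se)

# Toy numbers: estimate 10, variance estimate 4, five bootstrap statistics.
boot_ci = bootstrap_t_interval(10.0, 4.0, [-2.0, -1.0, 0.0, 1.0, 2.0])
wald_ci = wald_interval(10.0, 4.0)
```

Note that the upper bootstrap quantile sets the lower endpoint, so a skewed distribution of $T^*$ yields an asymmetric interval, whereas the Wald-type interval is always symmetric about the estimate.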
26 Single-stage sampling designs (Cont'd)
Table: coverage rate (C.R.) and length (C.L.) of the 90% confidence interval for each design (Poisson, SRS, PPS) and method (Bootstrap, Wald-type), with $n_0 = 10$ and $n_0 = 100$. The numerical entries are missing from this transcription.
27 Two-stage sampling designs
A finite population $\mathcal{F}_N = \{y_{i,j} : i = 1, \ldots, H;\ j = 1, \ldots, N_i\}$ is generated by
$y_{i,j} = 50 + a_i + e_{i,j}$,
$a_i \sim N(0, 50)$,
$e_{i,j} \sim \mathrm{Ex}(20)$,
$N_i \mid a_i \sim \mathrm{Po}(q_i) + c_0$,
where $H = 100$, $\mathrm{Po}(\lambda)$ is a Poisson distribution with rate parameter $\lambda$, $q_i = (a_i - 25)^2 / 20$, and $c_0 = 40$ is the minimum cluster size.
The finite population size is $N = 17{,}011$. The cluster sizes range from 40 to 542.
We assume that $N$ and $N_1, \ldots, N_H$ are known.
28 Two-stage sampling designs (Cont'd)
Goal: construct 90% confidence intervals for $\bar{Y}$ and $P = N^{-1} \sum_{i=1}^{H} \sum_{j=1}^{N_i} \delta_{(-\infty, q_y)}(y_{i,j})$.
We consider two different sampling designs for the first stage:
- Poisson sampling with $\pi_i \propto N_i$ and $\sum_{i=1}^{H} \pi_i = n_1$,
- PPS sampling with $p_i \propto z_i$ and sample size $n_1$.
We use SRS as the second-stage sampling design, with sample size $n_2$ for each sampled cluster.
We consider $n_1 \in \{10, 30\}$ and $n_2 \in \{10, 30\}$.
29 Two-stage sampling designs (Cont'd)
We consider two methods to obtain the 90% confidence interval.
- The proposed method, extended to a two-stage sampling design with $M = 500$. That is, use the following two steps to bootstrap the finite population.
  1. Use the proposed method to bootstrap the $H$ clusters by treating them as elements; the original sample within each selected cluster is replicated accordingly.
  2. For each bootstrap cluster, apply the proposed method to bootstrap the cluster finite population independently.
- Wald-type method, the same as the one discussed before.
500 Monte Carlo simulations are conducted for each sampling design.
30 Two-stage sampling designs (Cont'd)
Table: Coverage rate (C.R.) and length (C.L.) of the 90% confidence interval for $\bar{Y}$, by first-stage design (Poisson, PPS), second-stage sample size ($n_2 = 10$, $n_2 = 30$), first-stage sample size ($n_1 = 10$, $n_1 = 30$), and method (Bootstrap, Wald-type). The numerical entries are missing from this transcription.
31 Two-stage sampling designs (Cont'd)
Table: Coverage rate (C.R.) and length (C.L.) of the 90% confidence interval for $P$, by first-stage design (Poisson, PPS), second-stage sample size ($n_2 = 10$, $n_2 = 30$), first-stage sample size ($n_1 = 10$, $n_1 = 30$), and method (Bootstrap, Wald-type). The numerical entries are missing from this transcription.
32 Remark for the simulation studies
For the two-stage sampling designs,
- The sampling distribution of $\tilde{Y}$ is approximately symmetric under both designs, even when the sample size is small.
- The sampling distribution of the proportion estimator is slightly right-skewed when $n_1 = 10$.
We have compared the proposed method with the nonparametric Bayesian bootstrap method (Dong et al., 2014) and with the method based on two-step inverse sampling (Sverchkov and Pfeffermann, 2004); the proposed method performs better in terms of the coverage rate.
33 Conclusions
34 Conclusions
- We propose a bootstrap method for Poisson sampling, SRS and PPS sampling, and we show that the proposed method is second-order accurate.
- It is necessary to estimate the variance of the design-unbiased estimator, since the proposed method is based on an asymptotically pivotal statistic.
- Although the proposed method is discussed under single-stage sampling designs, simulation shows that it works well under some two-stage sampling designs.
- It may be extended to other complex sampling designs when the asymptotic distribution of the design-unbiased estimator exists, but second-order accuracy may not be guaranteed.
- The proposed method can be easily parallelized in practice.
35 Selected references
Dong, Q., Elliott, M. R. & Raghunathan, T. E. (2014). A nonparametric method to generate synthetic populations to adjust for complex sampling design features. Surv. Methodol. 40,
Fuller, W. A. (2009). Sampling Statistics. Hoboken: John Wiley.
Hall, P. (1992). The Bootstrap and Edgeworth Expansion. New York: Springer Science & Business Media.
Klenke, A. (2014). Probability Theory: A Comprehensive Course. 2nd edition. London: Springer Verlag London Ltd.
Rao, J. N. K. & Wu, C. F. J. (1988). Resampling inference with complex survey data. J. Amer. Statist. Assoc. 83,
Sitter, R. R. (1992). A resampling procedure for complex survey data. J. Amer. Statist. Assoc. 20,
Sverchkov, M. & Pfeffermann, D. (2004). Prediction of finite population totals based on the sample distribution. Surv. Methodol. 30,
36 Thank you!
More informationBetter Bootstrap Confidence Intervals
by Bradley Efron University of Washington, Department of Statistics April 12, 2012 An example Suppose we wish to make inference on some parameter θ T (F ) (e.g. θ = E F X ), based on data We might suppose
More informationThe Nonparametric Bootstrap
The Nonparametric Bootstrap The nonparametric bootstrap may involve inferences about a parameter, but we use a nonparametric procedure in approximating the parametric distribution using the ECDF. We use
More information(3) (S) THE BIAS AND STABILITY OF JACK -KNIFE VARIANCE ESTIMATOR IN RATIO ESTIMATION
THE BIAS AND STABILITY OF JACK -KNIFE VARIANCE ESTIMATOR IN RATIO ESTIMATION R.P. Chakrabarty and J.N.K. Rao University of Georgia and Texas A M University Summary The Jack -Knife variance estimator v(r)
More informationThe exact bootstrap method shown on the example of the mean and variance estimation
Comput Stat (2013) 28:1061 1077 DOI 10.1007/s00180-012-0350-0 ORIGINAL PAPER The exact bootstrap method shown on the example of the mean and variance estimation Joanna Kisielinska Received: 21 May 2011
More informationChapter 5: Models used in conjunction with sampling. J. Kim, W. Fuller (ISU) Chapter 5: Models used in conjunction with sampling 1 / 70
Chapter 5: Models used in conjunction with sampling J. Kim, W. Fuller (ISU) Chapter 5: Models used in conjunction with sampling 1 / 70 Nonresponse Unit Nonresponse: weight adjustment Item Nonresponse:
More informationDiscussion Paper Series
INSTITUTO TECNOLÓGICO AUTÓNOMO DE MÉXICO CENTRO DE INVESTIGACIÓN ECONÓMICA Discussion Paper Series Size Corrected Power for Bootstrap Tests Manuel A. Domínguez and Ignacio N. Lobato Instituto Tecnológico
More informationContents. Preface to Second Edition Preface to First Edition Abbreviations PART I PRINCIPLES OF STATISTICAL THINKING AND ANALYSIS 1
Contents Preface to Second Edition Preface to First Edition Abbreviations xv xvii xix PART I PRINCIPLES OF STATISTICAL THINKING AND ANALYSIS 1 1 The Role of Statistical Methods in Modern Industry and Services
More informationLikelihood-based inference with missing data under missing-at-random
Likelihood-based inference with missing data under missing-at-random Jae-kwang Kim Joint work with Shu Yang Department of Statistics, Iowa State University May 4, 014 Outline 1. Introduction. Parametric
More informationSMOOTHED BLOCK EMPIRICAL LIKELIHOOD FOR QUANTILES OF WEAKLY DEPENDENT PROCESSES
Statistica Sinica 19 (2009), 71-81 SMOOTHED BLOCK EMPIRICAL LIKELIHOOD FOR QUANTILES OF WEAKLY DEPENDENT PROCESSES Song Xi Chen 1,2 and Chiu Min Wong 3 1 Iowa State University, 2 Peking University and
More informationTOLERANCE INTERVALS FOR DISCRETE DISTRIBUTIONS IN EXPONENTIAL FAMILIES
Statistica Sinica 19 (2009), 905-923 TOLERANCE INTERVALS FOR DISCRETE DISTRIBUTIONS IN EXPONENTIAL FAMILIES Tianwen Tony Cai and Hsiuying Wang University of Pennsylvania and National Chiao Tung University
More informationConfidence Regions For The Ratio Of Two Percentiles
Confidence Regions For The Ratio Of Two Percentiles Richard Johnson Joint work with Li-Fei Huang and Songyong Sim January 28, 2009 OUTLINE Introduction Exact sampling results normal linear model case Other
More informationThe Union and Intersection for Different Configurations of Two Events Mutually Exclusive vs Independency of Events
Section 1: Introductory Probability Basic Probability Facts Probabilities of Simple Events Overview of Set Language Venn Diagrams Probabilities of Compound Events Choices of Events The Addition Rule Combinations
More informationAsymptotic Statistics-III. Changliang Zou
Asymptotic Statistics-III Changliang Zou The multivariate central limit theorem Theorem (Multivariate CLT for iid case) Let X i be iid random p-vectors with mean µ and and covariance matrix Σ. Then n (
More informationSupplement to Quantile-Based Nonparametric Inference for First-Price Auctions
Supplement to Quantile-Based Nonparametric Inference for First-Price Auctions Vadim Marmer University of British Columbia Artyom Shneyerov CIRANO, CIREQ, and Concordia University August 30, 2010 Abstract
More informationA Practitioner s Guide to Cluster-Robust Inference
A Practitioner s Guide to Cluster-Robust Inference A. C. Cameron and D. L. Miller presented by Federico Curci March 4, 2015 Cameron Miller Cluster Clinic II March 4, 2015 1 / 20 In the previous episode
More informationMA 575 Linear Models: Cedric E. Ginestet, Boston University Bootstrap for Regression Week 9, Lecture 1
MA 575 Linear Models: Cedric E. Ginestet, Boston University Bootstrap for Regression Week 9, Lecture 1 1 The General Bootstrap This is a computer-intensive resampling algorithm for estimating the empirical
More informationSTATISTICS SYLLABUS UNIT I
STATISTICS SYLLABUS UNIT I (Probability Theory) Definition Classical and axiomatic approaches.laws of total and compound probability, conditional probability, Bayes Theorem. Random variable and its distribution
More informationA measurement error model approach to small area estimation
A measurement error model approach to small area estimation Jae-kwang Kim 1 Spring, 2015 1 Joint work with Seunghwan Park and Seoyoung Kim Ouline Introduction Basic Theory Application to Korean LFS Discussion
More informationModel Assisted Survey Sampling
Carl-Erik Sarndal Jan Wretman Bengt Swensson Model Assisted Survey Sampling Springer Preface v PARTI Principles of Estimation for Finite Populations and Important Sampling Designs CHAPTER 1 Survey Sampling
More informationOne-Sample Numerical Data
One-Sample Numerical Data quantiles, boxplot, histogram, bootstrap confidence intervals, goodness-of-fit tests University of California, San Diego Instructor: Ery Arias-Castro http://math.ucsd.edu/~eariasca/teaching.html
More informationSampling: A Brief Review. Workshop on Respondent-driven Sampling Analyst Software
Sampling: A Brief Review Workshop on Respondent-driven Sampling Analyst Software 201 1 Purpose To review some of the influences on estimates in design-based inference in classic survey sampling methods
More informationWeek 1 Quantitative Analysis of Financial Markets Distributions A
Week 1 Quantitative Analysis of Financial Markets Distributions A Christopher Ting http://www.mysmu.edu/faculty/christophert/ Christopher Ting : christopherting@smu.edu.sg : 6828 0364 : LKCSB 5036 October
More informationMonte Carlo Study on the Successive Difference Replication Method for Non-Linear Statistics
Monte Carlo Study on the Successive Difference Replication Method for Non-Linear Statistics Amang S. Sukasih, Mathematica Policy Research, Inc. Donsig Jang, Mathematica Policy Research, Inc. Amang S. Sukasih,
More informationMonte Carlo Simulations
Monte Carlo Simulations What are Monte Carlo Simulations and why ones them? Pseudo Random Number generators Creating a realization of a general PDF The Bootstrap approach A real life example: LOFAR simulations
More informationResearch Article A Nonparametric Two-Sample Wald Test of Equality of Variances
Advances in Decision Sciences Volume 211, Article ID 74858, 8 pages doi:1.1155/211/74858 Research Article A Nonparametric Two-Sample Wald Test of Equality of Variances David Allingham 1 andj.c.w.rayner
More informationTopic 16 Interval Estimation. The Bootstrap and the Bayesian Approach
Topic 16 Interval Estimation and the Bayesian Approach 1 / 9 Outline 2 / 9 The confidence regions have been determined using aspects of the distribution of the data, by, for example, appealing to the central
More informationAnalysis of incomplete data in presence of competing risks
Journal of Statistical Planning and Inference 87 (2000) 221 239 www.elsevier.com/locate/jspi Analysis of incomplete data in presence of competing risks Debasis Kundu a;, Sankarshan Basu b a Department
More informationEXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY
EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA, 00 MODULE : Statistical Inference Time Allowed: Three Hours Candidates should answer FIVE questions. All questions carry equal marks. The
More informationCalibration estimation using exponential tilting in sample surveys
Calibration estimation using exponential tilting in sample surveys Jae Kwang Kim February 23, 2010 Abstract We consider the problem of parameter estimation with auxiliary information, where the auxiliary
More informationHypothesis Testing For Multilayer Network Data
Hypothesis Testing For Multilayer Network Data Jun Li Dept of Mathematics and Statistics, Boston University Joint work with Eric Kolaczyk Outline Background and Motivation Geometric structure of multilayer
More informationIntroduction to Survey Data Integration
Introduction to Survey Data Integration Jae-Kwang Kim Iowa State University May 20, 2014 Outline 1 Introduction 2 Survey Integration Examples 3 Basic Theory for Survey Integration 4 NASS application 5
More informationStatistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach
Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Jae-Kwang Kim Department of Statistics, Iowa State University Outline 1 Introduction 2 Observed likelihood 3 Mean Score
More informationNumerical Analysis for Statisticians
Kenneth Lange Numerical Analysis for Statisticians Springer Contents Preface v 1 Recurrence Relations 1 1.1 Introduction 1 1.2 Binomial CoefRcients 1 1.3 Number of Partitions of a Set 2 1.4 Horner's Method
More informationStat 516, Homework 1
Stat 516, Homework 1 Due date: October 7 1. Consider an urn with n distinct balls numbered 1,..., n. We sample balls from the urn with replacement. Let N be the number of draws until we encounter a ball
More informationREMAINDER LINEAR SYSTEMATIC SAMPLING
Sankhyā : The Indian Journal of Statistics 2000, Volume 62, Series B, Pt. 2, pp. 249 256 REMAINDER LINEAR SYSTEMATIC SAMPLING By HORNG-JINH CHANG and KUO-CHUNG HUANG Tamkang University, Taipei SUMMARY.
More informationA JACKKNIFE VARIANCE ESTIMATOR FOR SELF-WEIGHTED TWO-STAGE SAMPLES
Statistica Sinica 23 (2013), 595-613 doi:http://dx.doi.org/10.5705/ss.2011.263 A JACKKNFE VARANCE ESTMATOR FOR SELF-WEGHTED TWO-STAGE SAMPLES Emilio L. Escobar and Yves G. Berger TAM and University of
More informationMonte Carlo Studies. The response in a Monte Carlo study is a random variable.
Monte Carlo Studies The response in a Monte Carlo study is a random variable. The response in a Monte Carlo study has a variance that comes from the variance of the stochastic elements in the data-generating
More informationEstimation of AUC from 0 to Infinity in Serial Sacrifice Designs
Estimation of AUC from 0 to Infinity in Serial Sacrifice Designs Martin J. Wolfsegger Department of Biostatistics, Baxter AG, Vienna, Austria Thomas Jaki Department of Statistics, University of South Carolina,
More informationRecent Advances in the analysis of missing data with non-ignorable missingness
Recent Advances in the analysis of missing data with non-ignorable missingness Jae-Kwang Kim Department of Statistics, Iowa State University July 4th, 2014 1 Introduction 2 Full likelihood-based ML estimation
More informationA union of Bayesian, frequentist and fiducial inferences by confidence distribution and artificial data sampling
A union of Bayesian, frequentist and fiducial inferences by confidence distribution and artificial data sampling Min-ge Xie Department of Statistics, Rutgers University Workshop on Higher-Order Asymptotics
More information