The R package sampling, a software tool for training in official statistics and survey sampling
Yves Tillé¹ and Alina Matei²

¹ Institute of Statistics, University of Neuchâtel, Switzerland, yves.tille@unine.ch
² Institute of Statistics, University of Neuchâtel, Switzerland, alina.matei@unine.ch

Summary. The R package sampling is a software tool for training in official statistics and survey sampling. It is a collection of tools for selecting and weighting samples. Equal and unequal probability sampling, balanced sampling, and calibration methods are implemented. A large number of examples are available in the software manual.

Key words: survey sampling, equal/unequal probability sampling, balanced sampling, calibration

1 Introduction

Training programmes for official statisticians vary across countries and institutions, and training is a permanent concern of official statistical agencies. The R package sampling [TM06] is a software tool for training in official statistics and survey sampling. It was developed for the training course "Advanced methods of survey sampling"³, organized by the Swiss Federal Statistical Office within the framework of the European Statistical Training Programme. This paper gives a general description of the package and an introduction for new users.

The sampling package is an R package containing a collection of tools related to survey sampling theory. The implemented functions select and weight samples. Several procedures select samples with equal or unequal probabilities. Balanced sampling is also possible: the function for selecting a balanced sample uses the cube method [DT04]. Two calibration methods are also implemented: the regression estimator, which uses a chi-square distance, and the raking ratio estimator, which uses the Kullback-Leibler divergence. Moreover, the package contains three datasets and a set of tools for computing inclusion probabilities and for rearranging strata.
A large number of examples are available in the software manual [TM06]. For a description of the sampling algorithms used, see [Til06]. A brief description of the package is given in Section 2. Function names are given in verbatim style, e.g. srswor. We give three examples involving methods for selecting and weighting samples (cf. Section 3). Conclusions are drawn in Section 4.

³ In April 2005, at Neuchâtel, Switzerland.

2 Package description

2.1 Some notations and basic concepts

Let $U = \{1, \dots, k, \dots, N\}$ be the finite population. The unit $k$ is the reference unit. A sample $s$ is a subset of $U$. It can be represented as a $\{0, 1\}$ vector $s = (s_k)_{k \in U}$, where

$$s_k = \begin{cases} 1 & \text{if unit } k \in s, \\ 0 & \text{if not.} \end{cases} \qquad (1)$$

Let $\mathcal{S}$ be the sample support, which is the set of all possible samples drawn from $U$. Thus $\mathcal{S}$ is the set of the $2^N$ subsets of $U$. A couple $(\mathcal{S}, p)$ is called a sampling design, where $p$ is a probability distribution on $\mathcal{S}$. For a given $p(\cdot)$, any $s \in \mathcal{S}$ is viewed as a realization of a random variable $S$ such that $\Pr(S = s) = p(s)$. For a unit $k$, the random event $S \ni k$ is the event "a sample containing $k$ is realized" [SSW92]. The cardinality of the set $s$ is the sample size of $s$, denoted by $n$.

Given $p(\cdot)$, the inclusion probability of a unit $k$ is the probability that unit $k$ will be in a sample. It is defined by

$$\pi_k = \Pr(k \in S) = \sum_{s \ni k,\; s \in \mathcal{S}} p(s).$$

The quantities $\pi_k$, $k \in U$, are called the first-order inclusion probabilities. Similarly, the second-order inclusion probabilities, or joint inclusion probabilities, are defined as

$$\pi_{kl} = \Pr(k \in S, l \in S) = \sum_{s \ni k,l,\; s \in \mathcal{S}} p(s).$$

When the sample size is fixed to $n$, the inclusion probabilities satisfy the conditions

$$\sum_{k \in U} \pi_k = n, \qquad \sum_{l \in U,\; l \neq k} \pi_{kl} = (n - 1)\,\pi_k.$$

Let $y$ be the variable of interest. The Horvitz-Thompson estimator [HT52] of the population total $t_y = \sum_{k \in U} y_k$ is $\hat{t}_\pi = \sum_{k \in s} y_k / \pi_k$.
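The definitions above can be checked numerically on a toy design. The following base-R sketch is only an illustration and does not use the package: the population values y and the design probabilities p are hypothetical numbers chosen for the example. It enumerates all samples of size 2 from a population of size 4, computes the first-order and joint inclusion probabilities, and checks the two fixed-size conditions and the design-unbiasedness of the Horvitz-Thompson estimator.

```r
# Toy population (hypothetical values) and all samples of fixed size n = 2
N <- 4
n <- 2
y <- c(10, 20, 30, 40)
samples <- t(combn(N, n))      # one row per sample, listing its units

# Indicator matrix: row = sample s, column = unit k, entry = s_k as in (1)
S <- t(apply(samples, 1, function(u) as.integer(1:N %in% u)))

# A hypothetical sampling design p(s) on the 6 possible samples
p <- c(0.10, 0.15, 0.25, 0.20, 0.10, 0.20)

# First-order inclusion probabilities: pi_k = sum of p(s) over samples containing k
pik <- colSums(S * p)

# Joint inclusion probabilities: pikl[k, l] = sum of p(s) over samples containing k and l
pikl <- t(S) %*% (S * p)

# Fixed-size conditions
sum(pik)                       # should equal n
rowSums(pikl) - diag(pikl)     # should equal (n - 1) * pik

# Horvitz-Thompson estimate for every possible sample, and its expectation
ht <- as.vector(S %*% (y / pik))
sum(p * ht)                    # should equal sum(y)
```

Because the expectation is taken over the whole design, the last line recovers the population total whatever the values of p, as long as every pi_k is positive.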
2.2 Sampling with equal or unequal probabilities

The package contains functions for drawing samples with or without replacement, and with equal or unequal probabilities. The implemented unequal probability sampling designs are: Brewer sampling [Bre63], maximum entropy sampling (or conditional Poisson sampling) [Háj64, Háj81], Midzuno sampling [Mid52], minimal support sampling [DT98], multinomial sampling [HH43], pivotal sampling [DT98], Poisson sampling [Háj58], systematic sampling [Mad49], Sampford sampling [Sam67], and Tillé sampling [Til96]. The first-order and the second-order inclusion probabilities are computed for the following sampling designs: maximum entropy, Midzuno, systematic, and Tillé. The names of the functions that implement unequal probability sampling designs start with the prefix UP (Unequal Probability), e.g. UPpoisson.

2.3 Balanced sampling

A balanced sampling design is defined by the property that the Horvitz-Thompson estimators of the population totals of a set of auxiliary variables equal the known totals of these variables:

$$\sum_{k \in s} \frac{x_{jk}}{\pi_k} = \sum_{k \in U} x_{jk}, \qquad (2)$$

for all $s \in \mathcal{S}$ such that $p(s) > 0$, where $j \in \{1, \dots, J\}$, and $x_k = (x_{1k}, \dots, x_{Jk})$ is a row vector of auxiliary variables. The cube method [DT04] is a general method for selecting approximately balanced samples with equal or unequal inclusion probabilities and any number of auxiliary variables. The main function for selecting a balanced sample by means of the cube method is samplecube. The two phases of the cube method, the flight phase (fastflightcube) and the landing phase (landingcube), can be run separately. Additional procedures can be used to select a balanced stratified sample (balancedstratification), a balanced cluster sample (balancedcluster), and a balanced two-stage sample (balancedtwostage).
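To make two of the designs of Section 2.2 concrete, here is a minimal base-R sketch of Poisson sampling and of systematic sampling with unequal probabilities. It is not the code of the package: the helper names poisson_sketch and systematic_sketch are hypothetical, and the sketch assumes that all inclusion probabilities lie strictly between 0 and 1 and sum to an integer.

```r
# Inclusion probabilities (they sum to 3)
pik <- c(0.2, 0.7, 0.8, 0.5, 0.4, 0.4)

# Poisson sampling: each unit is selected independently with probability pik[k],
# so the sample size is random, with expectation sum(pik)
poisson_sketch <- function(pik) as.integer(runif(length(pik)) < pik)

# Systematic sampling: draw one uniform start u, then place the points
# u, u + 1, u + 2, ... on the cumulated inclusion probabilities;
# unit k is selected when a point falls in its interval
systematic_sketch <- function(pik) {
  u <- runif(1)
  v <- c(0, cumsum(pik))
  hits <- floor(v - u) + 1     # number of points at or below each cumulated value
  diff(hits)                   # 0/1 indicator vector, as in expression (1)
}

set.seed(1)
s_pois <- poisson_sketch(pik)
s_sys <- systematic_sketch(pik)
sum(s_sys)                     # fixed sample size, equal to sum(pik)

# Monte Carlo check: selection frequencies should be close to pik
freq <- rowMeans(replicate(20000, systematic_sketch(pik)))
round(freq, 2)
```

The contrast between the two helpers is the point of the sketch: the Poisson sample has a random size, while the systematic sample always has size sum(pik).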
2.4 Calibration

The calibration estimator [DS92] is defined as $\hat{t}_{cal} = \sum_{k \in s} w_k y_k$, where

$$\sum_{k \in s} w_k x_k = \sum_{k \in U} x_k = t_x, \qquad (3)$$

for a row vector of auxiliary variables $x_k = (x_{1k}, \dots, x_{Jk})$ for which $t_x$ is known. Equation (3) is called the calibration equation. Let $d_k$ be the initial weights, usually equal to $1/\pi_k$. Deville and Särndal required the weights $w_k$, $k \in s$, satisfying equation (3) to be as close as possible to the sampling design weights $d_k$, in the sense that they minimize some distance function. The function to minimize is

$$\sum_{k \in s} d_k q_k G_k(w_k / d_k) - \lambda' \left( \sum_{k \in s} w_k x_k - t_x \right),$$

where $\lambda$ is the vector of Lagrange multipliers. Minimization leads to the calibration weights

$$w_k = d_k F_k(x_k' \lambda / q_k),$$

where $q_k$ is a weight associated with unit $k$, unrelated to $d_k$, that accounts for heteroscedastic residuals from fitting $y$ on $x = (x_k)$, and $F_k$ is the inverse of the function $dG_k(u)/du$, with the property that $F_k(0) = 1$, $F_k'(0) = q_k > 0$.

Two methods of calibration are implemented: the regression estimator (regressionestimator), which uses a chi-square distance, and the raking ratio estimator (rakingratio), which uses the Kullback-Leibler divergence [DS92]. The g-weights (equal to $w_k / d_k$) can be bounded for both methods by means of two additional procedures, boundedregressionestimator and boundedrakingratio. Since the calibration estimator does not always exist, the function checkcalibration can check the existence of the solution.

2.5 Additional functions and datasets

The package contains additional facilities such as:
- computation of inclusion probabilities for a πps sampling design (inclusionprobabilities),
- computation of inclusion probabilities for a stratified design (inclusionprobastrata),
- listing of all possible samples of fixed size (writesample),
- renumbering of the strata and removal of the empty strata of a stratification variable (cleanstrata),
- disjunctive coding of a stratification or factor variable (disjunctive).

Three datasets are supplied with the package: the MU284 dataset [SSW92], the Belgian municipalities dataset, and the Swiss municipalities dataset.

3 Demonstration

3.1 A small overview

R is an environment for statistical computing and graphics based on the S programming language from Bell Labs. The software provides a wide variety of statistical and graphical techniques.
R can be freely downloaded. Regardless of the operating system, the package sampling can be installed by typing the following command at the R prompt:

    install.packages("sampling")

This installs the latest version from the Comprehensive R Archive Network, http://CRAN.R-project.org/. A few examples of the capabilities of sampling are shown in the following transcript of an R session. A more extensive demonstration can be seen by loading the package with

    library(sampling)

The list of all functions is given by typing

    help(package=sampling)
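Before turning to the package functions, the calibration weights of Section 2.4 can be sketched in base R for the chi-square distance, where $F_k(u) = 1 + q_k u$. This is a minimal illustration under the assumption $q_k = 1$ for all units, not the implementation used by the package; the helper name linear_g_weights is hypothetical. With $q_k = 1$ the g-weights are $g_k = 1 + x_k'\lambda$, with $\lambda = (\sum_{k \in s} d_k x_k x_k')^{-1} (t_x - \sum_{k \in s} d_k x_k)$. The numbers are those used in Example 3 of this paper.

```r
# Auxiliary variables, one row per unit (two indicator columns and one numeric column)
X <- cbind(c(1, 1, 1, 1, 1, 0, 0, 0, 0, 0),
           c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1),
           1:10)
d <- rep(1 / 0.2, 10)          # initial weights d_k = 1 / pi_k
t_x <- c(24, 26, 280)          # known totals of the auxiliary variables

# Linear (chi-square distance) calibration, assuming q_k = 1 for all units
linear_g_weights <- function(X, d, t_x) {
  # lambda solves (sum_k d_k x_k x_k') lambda = t_x - sum_k d_k x_k
  lambda <- solve(crossprod(X, d * X), t_x - colSums(d * X))
  as.vector(1 + X %*% lambda)  # g_k = w_k / d_k
}

g <- linear_g_weights(X, d, t_x)
w <- d * g                     # calibrated weights
colSums(w * X)                 # calibration equation (3): should reproduce t_x
```

The linear case has a closed-form solution, which is why a single call to solve is enough here; the raking ratio case generally needs an iterative solver.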
3.2 Examples

Example 1

A simple example is given below. The vector of the first-order inclusion probabilities is defined and denoted by pik. A sample of fixed size 3 is selected by systematic sampling with unequal probabilities. The sample is represented as in expression (1). The Horvitz-Thompson estimator of $t_y$ is computed.

    #define the first-order inclusion probabilities
    pik=c(0.2,0.7,0.8,0.5,0.4,0.4)
    #the population size
    N=length(pik)
    #define the variable of interest (values for the 3 sampled units)
    y=c(23.4,5.64,31.45)
    #select a sample
    s=UPsystematic(pik)
    #the selected sample is
    (1:N)[s==1]
    #The Horvitz-Thompson estimator of the total is
    c((1/pik[s==1]) %*% y)

If the selected sample is {1, 3, 4}, the Horvitz-Thompson estimator is 23.4/0.2 + 5.64/0.8 + 31.45/0.5 = 186.95.

Example 2

A more complex example, given below, involves the selection of samples of fixed size or expected size equal to 200, with equal or unequal probabilities. The population is the Belgian municipalities dataset. The first-order inclusion probabilities are computed using auxiliary information (the 2004 total, variable Tot04). The following 9 sampling designs are considered: Poisson sampling, systematic sampling with random order of the units in the population (denoted in Fig. 1 as rsystematic), pivotal sampling with random order of the units in the population (denoted in Fig. 1 as rpivotal), Tillé sampling, Midzuno sampling, systematic sampling, pivotal sampling, multinomial sampling, and simple random sampling without replacement. The Horvitz-Thompson estimator of the total $t_y$ (where $y$ is the taxable income, variable TaxableIncome) is computed. Monte Carlo simulations are run in order to compare the accuracy of the Horvitz-Thompson estimator for these sampling designs. The number of simulations (given by the variable sim) is fixed to 1000. The simulation results can be interpreted via boxplots (see Fig. 1). Simple random sampling, multinomial sampling, and Poisson sampling are not accurate.
All the methods of unequal probability sampling seem to have the same accuracy, except for random systematic sampling and random pivotal sampling, whose variances depend on the order of the units in the file.

    data(belgianmunicipalities)
    attach(belgianmunicipalities)
    #compute the inclusion probabilities pik
    pik=inclusionprobabilities(Tot04,200)
    #the population size
    N=length(pik)
    #the sample size
    n=sum(pik)
    #number of simulations
    sim=1000
    ss=array(0,c(sim,9))
    #the variable of interest
    y=TaxableIncome
    #simulations and computation of the Horvitz-Thompson estimator
    for(i in 1:sim)
    {
    cat("Step ",i,"\n")
    ss[i,]=ss[i,]+c(
    HTestimator(y,pik,UPpoisson(pik)),
    HTestimator(y,pik,UPrandomsystematic(pik)),
    HTestimator(y,pik,UPrandompivotal(pik)),
    HTestimator(y,pik,UPtille(pik)),
    HTestimator(y,pik,UPmidzuno(pik)),
    HTestimator(y,pik,UPsystematic(pik)),
    HTestimator(y,pik,UPpivotal(pik)),
    HTestimator(y,pik,UPmultinomial(pik)),
    HTestimator(y,rep(n/N,N),srswor(n,N)))
    }
    #boxplots of the estimators
    colnames(ss) <- c("poisson","rsystematic","rpivotal","tille","midzuno",
    "systematic","pivotal","multinom","srswor")
    boxplot(data.frame(ss), las=3)

Fig. 1. Accuracy of the Horvitz-Thompson estimator (boxplots for the designs poisson, rsystematic, rpivotal, tille, midzuno, systematic, pivotal, multinom, and srswor)

Example 3

The third example computes the g-weights for the regression estimator. There are 3 auxiliary variables and 10 population units. The first two auxiliary variables are categorical, and the last one is numerical. The first-order inclusion probabilities are equal to 0.2. The known population totals for the auxiliary variables are 24,
26, and 280. A simple random sample without replacement of size 4 is drawn. The calibration estimator of $t_y$ is computed.

    #matrix of auxiliary variables defined by columns
    Xs=cbind(c(1,1,1,1,1,0,0,0,0,0),c(0,0,0,0,0,1,1,1,1,1),
    c(1,2,3,4,5,6,7,8,9,10))
    #the inclusion probabilities
    piks=rep(0.2,times=10)
    #the vector of totals
    t=c(24,26,280)
    #the g-weights
    g=regressionestimator(Xs,piks,t)
    #verify the calibration
    #in the affirmative case, the printed values are equal to t
    if(checkcalibration(Xs,piks,t,g)) c((g/piks) %*% Xs)
    #draw a srswor of size 4 from a population of size 10
    s=srswor(4,10)
    #the sample is
    (1:10)[s==1]
    #define the variable of interest
    y=c(23.4,5.64,31.45,10.23)
    #the calibration estimator is
    crossprod((g/piks)[s==1],y)

The resulting g-weights are [...], and the calibration is possible. For the selected sample {3, 6, 8, 10}, the calibration estimator is equal to [...].

4 Conclusions

The R sampling package is both a training and a teaching tool. It can be used in official statistics and survey sampling, as well as in biostatistics. It provides functions for selecting and weighting samples. Illustrative examples are available for each function of the package. Functions for variance estimation are forthcoming. The latest version of the package and its manual can be freely downloaded from http://cran.r-project.org/src/contrib/descriptions/sampling.html

References

[Bre63] K. R. W. Brewer. A model of systematic sampling with unequal probabilities. Australian Journal of Statistics, 5:5-13, 1963.
[DS92] J.-C. Deville and C.-E. Särndal. Calibration estimators in survey sampling. Journal of the American Statistical Association, 87, 1992.
[DT98] J.-C. Deville and Y. Tillé. Unequal probability sampling without replacement through a splitting method. Biometrika, 85:89-101, 1998.
[DT04] J.-C. Deville and Y. Tillé. Efficient balanced sampling: the cube method. Biometrika, 91, 2004.
[Háj58] J. Hájek. Some contributions to the theory of probability sampling. In ISI, editor, Bulletin of the International Statistical Institute: Proceedings of the 30th session (Stockholm), volume 36, book 3, The Hague, 1958.
[Háj64] J. Hájek. Asymptotic theory of rejective sampling with varying probabilities from a finite population. Annals of Mathematical Statistics, 35, 1964.
[Háj81] J. Hájek. Sampling from a Finite Population. Marcel Dekker, New York, 1981.
[HH43] M. H. Hansen and W. N. Hurwitz. On the theory of sampling from finite populations. Annals of Mathematical Statistics, 14, 1943.
[HT52] D. G. Horvitz and D. J. Thompson. A generalization of sampling without replacement from a finite universe. Journal of the American Statistical Association, 47, 1952.
[Mad49] W. G. Madow. On the theory of systematic sampling, II. Annals of Mathematical Statistics, 20, 1949.
[Mid52] H. Midzuno. On the sampling system with probability proportional to sum of size. Annals of the Institute of Statistical Mathematics, 3:99-107, 1952.
[Sam67] M. R. Sampford. On sampling without replacement with unequal probabilities of selection. Biometrika, 54, 1967.
[SSW92] C.-E. Särndal, B. Swensson, and J. H. Wretman. Model Assisted Survey Sampling. Springer Verlag, New York, 1992.
[Til96] Y. Tillé. An elimination procedure of unequal probability sampling without replacement. Biometrika, 83, 1996.
[Til06] Y. Tillé. Sampling Algorithms. Springer, 2006.
[TM06] Y. Tillé and A. Matei. The sampling package. Software manual, CRAN, 2006.
More informationNon-uniform coverage estimators for distance sampling
Abstract Non-uniform coverage estimators for distance sampling CREEM Technical report 2007-01 Eric Rexstad Centre for Research into Ecological and Environmental Modelling Research Unit for Wildlife Population
More informationA Note on the Asymptotic Equivalence of Jackknife and Linearization Variance Estimation for the Gini Coefficient
Journal of Official Statistics, Vol. 4, No. 4, 008, pp. 541 555 A Note on the Asymptotic Equivalence of Jackknife and Linearization Variance Estimation for the Gini Coefficient Yves G. Berger 1 The Gini
More informationThe ESS Sample Design Data File (SDDF)
The ESS Sample Design Data File (SDDF) Documentation Version 1.0 Matthias Ganninger Tel: +49 (0)621 1246 282 E-Mail: matthias.ganninger@gesis.org April 8, 2008 Summary: This document reports on the creation
More informationAdvanced Statistical Modelling
Markov chain Monte Carlo (MCMC) Methods and Their Applications in Bayesian Statistics School of Technology and Business Studies/Statistics Dalarna University Borlänge, Sweden. Feb. 05, 2014. Outlines 1
More informationSanjay Chaudhuri Department of Statistics and Applied Probability, National University of Singapore
AN EMPIRICAL LIKELIHOOD BASED ESTIMATOR FOR RESPONDENT DRIVEN SAMPLED DATA Sanjay Chaudhuri Department of Statistics and Applied Probability, National University of Singapore Mark Handcock, Department
More informationExact balanced random imputation for sample survey data
Exact balanced random imputation for sample survey data Guillaume Chauvet, Wilfried Do Paco To cite this version: Guillaume Chauvet, Wilfried Do Paco. Exact balanced random imputation for sample survey
More informationMaster of Science in Statistics A Proposal
1 Master of Science in Statistics A Proposal Rationale of the Program In order to cope up with the emerging complexity on the solutions of realistic problems involving several phenomena of nature it is
More informationSampling and Estimation in Agricultural Surveys
GS Training and Outreach Workshop on Agricultural Surveys Training Seminar: Sampling and Estimation in Cristiano Ferraz 24 October 2016 Download a free copy of the Handbook at: http://gsars.org/wp-content/uploads/2016/02/msf-010216-web.pdf
More informationF. Jay Breidt Colorado State University
Model-assisted survey regression estimation with the lasso 1 F. Jay Breidt Colorado State University Opening Workshop on Computational Methods in Social Sciences SAMSI August 2013 This research was supported
More informationWeighting in survey analysis under informative sampling
Jae Kwang Kim and Chris J. Skinner Weighting in survey analysis under informative sampling Article (Accepted version) (Refereed) Original citation: Kim, Jae Kwang and Skinner, Chris J. (2013) Weighting
More informationEmpirical likelihood inference for regression parameters when modelling hierarchical complex survey data
Empirical likelihood inference for regression parameters when modelling hierarchical complex survey data Melike Oguz-Alper Yves G. Berger Abstract The data used in social, behavioural, health or biological
More informationof being selected and varying such probability across strata under optimal allocation leads to increased accuracy.
5 Sampling with Unequal Probabilities Simple random sampling and systematic sampling are schemes where every unit in the population has the same chance of being selected We will now consider unequal probability
More informationContents. Preface to Second Edition Preface to First Edition Abbreviations PART I PRINCIPLES OF STATISTICAL THINKING AND ANALYSIS 1
Contents Preface to Second Edition Preface to First Edition Abbreviations xv xvii xix PART I PRINCIPLES OF STATISTICAL THINKING AND ANALYSIS 1 1 The Role of Statistical Methods in Modern Industry and Services
More informationNonparametric Regression Estimation of Finite Population Totals under Two-Stage Sampling
Nonparametric Regression Estimation of Finite Population Totals under Two-Stage Sampling Ji-Yeon Kim Iowa State University F. Jay Breidt Colorado State University Jean D. Opsomer Colorado State University
More informationA course in statistical modelling. session 09: Modelling count variables
A Course in Statistical Modelling SEED PGR methodology training December 08, 2015: 12 2pm session 09: Modelling count variables Graeme.Hutcheson@manchester.ac.uk blackboard: RSCH80000 SEED PGR Research
More informationUnequal Probability Designs
Unequal Probability Designs Department of Statistics University of British Columbia This is prepares for Stat 344, 2014 Section 7.11 and 7.12 Probability Sampling Designs: A quick review A probability
More informationEfficient estimators for adaptive two-stage sequential sampling
0.8Copyedited by: AA 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 Journal of Statistical Computation and
More informationOn Efficiency of Midzuno-Sen Strategy under Two-phase Sampling
International Journal of Statistics and Analysis. ISSN 2248-9959 Volume 7, Number 1 (2017), pp. 19-26 Research India Publications http://www.ripublication.com On Efficiency of Midzuno-Sen Strategy under
More informationUse of auxiliary information in the sampling strategy of a European area frame agro-environmental survey
Use of auxiliary information in the sampling strategy of a European area frame agro-environmental survey Laura Martino 1, Alessandra Palmieri 1 & Javier Gallego 2 (1) European Commission: DG-ESTAT (2)
More informationSome Material on the Statistics Curriculum
Some Material on the Curriculum A/Prof Ken Russell School of Mathematics & Applied University of Wollongong kgr@uow.edu.au involves planning the collection of data, collecting those data, describing, analysing
More informationExamination of approaches to calibration in survey sampling
A thesis submitted for the degree of Doctor of Philosophy March 2018 Examination of approaches to calibration in survey sampling Author: Gareth Davies Summary The analysis of sample surveys is one of the
More informationNew perspectives on sampling rare and clustered populations
New perspectives on sampling rare and clustered populations Abstract Emanuela Furfaro Fulvia Mecatti A new sampling design is derived for sampling a rare and clustered population under both cost and logistic
More informationTHE THEORY AND PRACTICE OF MAXIMAL BREWER SELECTION WITH POISSON PRN SAMPLING
THE THEORY AND PRACTICE OF MAXIMAL BREWER SELECTION WITH POISSON PRN SAMPLING Phillip S. Kott And Jeffrey T. Bailey, National Agricultural Statistics Service Phillip S. Kott, NASS, Room 305, 351 Old Lee
More informationGeneralized Linear Models
Generalized Linear Models Methods@Manchester Summer School Manchester University July 2 6, 2018 Generalized Linear Models: a generic approach to statistical modelling www.research-training.net/manchester2018
More informationCS570 Introduction to Data Mining
CS570 Introduction to Data Mining Department of Mathematics and Computer Science Li Xiong Data Exploration and Data Preprocessing Data and Attributes Data exploration Data pre-processing Data cleaning
More informationSample Survey Calibration: An Informationtheoretic
Southern Africa Labour and Development Research Unit Sample Survey Calibration: An Informationtheoretic perspective by Martin Wittenberg WORKING PAPER SERIES Number 41 This is a joint SALDRU/DataFirst
More informationSTATISTICS-STAT (STAT)
Statistics-STAT (STAT) 1 STATISTICS-STAT (STAT) Courses STAT 158 Introduction to R Programming Credit: 1 (1-0-0) Programming using the R Project for the Statistical Computing. Data objects, for loops,
More informationOne-phase estimation techniques
One-phase estimation techniques Based on Horwitz-Thompson theorem for continuous populations Radim Adolt ÚHÚL Brandýs nad Labem, Czech Republic USEWOOD WG2, Training school in Dublin, 16.-19. September
More informationWeight calibration and the survey bootstrap
Weight and the survey Department of Statistics University of Missouri-Columbia March 7, 2011 Motivating questions 1 Why are the large scale samples always so complex? 2 Why do I need to use weights? 3
More informationIntroduction to RStudio
Introduction to RStudio Carl Tony Fakhry Jie Chen April 4, 2015 Introduction R is a powerful language and environment for statistical computing and graphics. R is freeware and there is lot of help available
More informationMachine Learning using Bayesian Approaches
Machine Learning using Bayesian Approaches Sargur N. Srihari University at Buffalo, State University of New York 1 Outline 1. Progress in ML and PR 2. Fully Bayesian Approach 1. Probability theory Bayes
More informationA comparison of weighted estimators for the population mean. Ye Yang Weighting in surveys group
A comparison of weighted estimators for the population mean Ye Yang Weighting in surveys group Motivation Survey sample in which auxiliary variables are known for the population and an outcome variable
More informationDesign and Optimization of Energy Systems Prof. C. Balaji Department of Mechanical Engineering Indian Institute of Technology, Madras
Design and Optimization of Energy Systems Prof. C. Balaji Department of Mechanical Engineering Indian Institute of Technology, Madras Lecture - 13 Introduction to Curve Fitting Good afternoon. So, we will
More informationData Integration for Big Data Analysis for finite population inference
for Big Data Analysis for finite population inference Jae-kwang Kim ISU January 23, 2018 1 / 36 What is big data? 2 / 36 Data do not speak for themselves Knowledge Reproducibility Information Intepretation
More informationNew methods to handle nonresponse in surveys
New methods to handle nonresponse in surveys PhD Thesis submitted to the Faculty of Science Institute of Statistics University of Neuchâtel For the degree of PhD in Science by Caren Hasler Accepted by
More informationEstimation Techniques in the German Labor Force Survey (LFS)
Estimation Techniques in the German Labor Force Survey (LFS) Dr. Kai Lorentz Federal Statistical Office of Germany Group C1 - Mathematical and Statistical Methods Email: kai.lorentz@destatis.de Federal
More informationImproved Estimators of Mean of Sensitive Variables using Optional RRT Models
Improved Estimators of Mean of Sensitive Variables using Optional RRT Models Sat Gupta Department of Mathematics and Statistics University of North Carolina Greensboro sngupta@uncg.edu University of South
More informationIntroduction to Survey Data Analysis
Introduction to Survey Data Analysis JULY 2011 Afsaneh Yazdani Preface Learning from Data Four-step process by which we can learn from data: 1. Defining the Problem 2. Collecting the Data 3. Summarizing
More information