Deriving indicators from representative samples for the ESF

1 Deriving indicators from representative samples for the ESF
Brussels, June 17, 2014
Ralf Münnich and Stefan Zins, Lisa Borsi and Jan-Philipp Kolb
GESIS Mannheim and University of Trier

2 Outline
1 Choosing a Sample Size
2 Estimation methods for Indicators
3 Weighting
4 Non-response
5 ESF/YEI Sampling and Estimation Recommendations

3 Choosing a Sample Size

4 Variance and Sample Size
Estimates can be expected to be more accurate with a larger sample size, in the sense of being closer to the expected value of the estimator. We would like to determine a sample size which ensures that the estimation error does not exceed a given bound $e$ with probability $1 - \alpha$. That is,
$$\mathrm{Prob}\big(|\hat\theta(s) - \theta| < e\big) \ge 1 - \alpha.$$
If we can assume that the estimator $\hat\theta$ is approximately normally distributed, then the above condition is fulfilled if
$$\frac{e}{\sqrt{V(\hat\theta)}} \ge z_{1-\alpha/2}.$$

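5 Necessary Sample Size $\bar{y}$
The error $e$ is the half CI length for symmetric CIs:
$$e = z_{1-\alpha/2}\,\frac{\sigma_y}{\sqrt{n}} \quad \text{(SRSWR)}, \qquad e = z_{1-\alpha/2}\, S_y \sqrt{\frac{1}{n}\Big(1 - \frac{n}{N}\Big)} \quad \text{(SRSWoR)}.$$
With given prior information for $\sigma_y$ or $S_y$ we get
$$n \ge \frac{z_{1-\alpha/2}^2\,\sigma_y^2}{e^2} \quad \text{(SRSWR)}, \qquad n \ge \frac{z_{1-\alpha/2}^2\, S_y^2}{e^2 + z_{1-\alpha/2}^2\, S_y^2 / N} \quad \text{(SRSWoR)}.$$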
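The two bounds for the mean can be evaluated directly. The following is a minimal sketch, assuming a prior guess for the standard deviation is available; the function name `sample_size_mean` and the example values are illustrative and not part of the slides.

```python
import math
from statistics import NormalDist


def sample_size_mean(sigma, e, alpha=0.05, N=None):
    """Smallest n whose half CI length for a mean is at most e.

    N=None gives the SRSWR bound n >= z^2 sigma^2 / e^2.
    A finite N gives the SRSWoR bound n >= z^2 S^2 / (e^2 + z^2 S^2 / N).
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    n0 = (z * sigma / e) ** 2          # SRSWR bound
    if N is None:
        return math.ceil(n0)
    # Algebraically identical to z^2 S^2 / (e^2 + z^2 S^2 / N)
    return math.ceil(n0 / (1 + n0 / N))


# Illustrative values: prior standard deviation 10, maximal error 1
print(sample_size_mean(10, 1))           # with replacement
print(sample_size_mean(10, 1, N=1000))   # without replacement, N = 1000
```

The finite population correction only matters when the with-replacement size is not small relative to $N$.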

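6 Necessary Sample Size $p_y$ I
For binary response, i.e. $\mu = P_y$, we can write
$$e^2 \ge z_{1-\alpha/2}^2\, \frac{P_y(1-P_y)}{n}\Big(1 - \frac{n-1}{N-1}\Big) \quad \text{(SRSWoR)}.$$
Assuming $\frac{n-1}{N-1} \approx \frac{n}{N}$ we get
$$n \ge \frac{P_y(1-P_y)}{\dfrac{e^2}{z_{1-\alpha/2}^2} + \dfrac{P_y(1-P_y)}{N}} \quad \text{(SRSWoR)}, \qquad n \ge \frac{z_{1-\alpha/2}^2\, P_y(1-P_y)}{e^2} \quad \text{(SRSWR)}.$$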

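7 Necessary Sample Size $p_y$ II
Necessary sample size for different maximal errors to estimate $P_y = 0.5$, and $P_y = 0.2$ or $0.8$, given a confidence level of 95%.
[Table: rows are population sizes $N$; columns are the maximal errors $e$ (0.03, 0.05, ...) for $P_y = 0.5$ and for $P_y = 0.2$. The numeric entries were not preserved.]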
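A table of this kind can be recomputed from the sample size bounds for a proportion. The sketch below is an illustration only: the population sizes are hypothetical choices, not the values of the original table, and `sample_size_proportion` is an assumed helper name.

```python
import math
from statistics import NormalDist


def sample_size_proportion(P, e, alpha=0.05, N=None):
    """Necessary n to estimate a proportion P with maximal error e.

    N=None uses the SRSWR bound; a finite N applies the SRSWoR bound
    under the approximation (n - 1) / (N - 1) ~ n / N.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    n0 = z ** 2 * P * (1 - P) / e ** 2
    if N is None:
        return math.ceil(n0)
    return math.ceil(n0 / (1 + n0 / N))


# Hypothetical population sizes, chosen only for illustration.
for N in (1_000, 10_000, 100_000, None):
    row = [sample_size_proportion(P, e, N=N)
           for P in (0.5, 0.2) for e in (0.03, 0.05)]
    print("infinite" if N is None else N, row)
```

Because $P_y(1-P_y)$ is symmetric around 0.5, the results for $P_y = 0.2$ also apply to $P_y = 0.8$.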

8 Necessary Sample Size $p_y$ III
The maximum of $P_y(1-P_y)$ is 0.25. Therefore, if $p_y$ is used as an estimator for $P_y$, a sample size of
$$n \ge \frac{z_{1-\alpha/2}^2}{4e^2}$$
should always ensure the necessary precision at the chosen confidence level (95% for $\alpha = 0.05$).

9 Stratified Designs

10 Necessary Sample Size StrSRS I
For $\bar{y}_{StrSRS}$ under WoR, the problem is to minimize the sum
$$n = \sum_{h=1}^{H} n_h$$
under the constraints $0 < n_h \le N_h$, $h = 1, \ldots, H$, and the side condition
$$e^2 \ge z_{1-\alpha/2}^2 \sum_{h=1}^{H} \gamma_h^2\, \frac{S_{y_h}^2}{n_h}\Big(1 - \frac{n_h}{N_h}\Big),$$
where $\gamma_h$ denotes the stratum weights. In practice there are often minimal values $m_h$ and maximal values $M_h$ for all $n_h$ ($0 \le m_h \le M_h \le N_h$). Usually $m_h \ge 2$ if $N_h \ge 2$. Thus we have to solve the above problem under the constraints $m_h \le n_h \le M_h$, $h = 1, \ldots, H$.
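To make the box-constrained problem concrete, here is a minimal numerical sketch. It is not one of the published algorithms: it solves the continuous relaxation by bisection, using the fact that the optimal $n_h$ are proportional to $\gamma_h S_h$ and then clipped to $[m_h, M_h]$. It assumes $\gamma_h = N_h/N$, $S_h > 0$ and $m_h \ge 1$; all names and numbers are illustrative.

```python
from statistics import NormalDist


def strat_variance(n, gamma, S, N):
    """Variance of the stratified mean under SRSWoR within strata."""
    return sum(g ** 2 * s ** 2 / nh * (1 - nh / Nh)
               for nh, g, s, Nh in zip(n, gamma, S, N))


def box_allocation(N, S, e, m, M, alpha=0.05, iters=200):
    """Continuous n_h minimising sum(n_h) subject to the precision
    requirement and the box constraints m_h <= n_h <= M_h."""
    total = sum(N)
    gamma = [Nh / total for Nh in N]          # assumed stratum weights N_h / N
    z = NormalDist().inv_cdf(1 - alpha / 2)
    target = (e / z) ** 2                     # maximal admissible variance

    def alloc(mu):
        # Neyman-type allocation mu * gamma_h * S_h, clipped to the box
        return [min(max(mu * g * s, lo), hi)
                for g, s, lo, hi in zip(gamma, S, m, M)]

    hi_mu = max(hi / (g * s) for g, s, hi in zip(gamma, S, M))
    if strat_variance(alloc(hi_mu), gamma, S, N) > target:
        raise ValueError("precision not attainable within the bounds M_h")
    lo_mu = 0.0
    for _ in range(iters):                    # variance decreases in mu
        mid = (lo_mu + hi_mu) / 2
        if strat_variance(alloc(mid), gamma, S, N) <= target:
            hi_mu = mid
        else:
            lo_mu = mid
    return alloc(hi_mu)


# Hypothetical strata: sizes, prior standard deviations, bounds
N = [5000, 3000, 2000]
S = [10.0, 20.0, 40.0]
print(box_allocation(N, S, e=1.0, m=[2, 2, 2], M=N))
```

Rounding each $n_h$ up to the next integer keeps the precision requirement satisfied, but the result is then no longer guaranteed to be the optimal integer allocation.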

11 Necessary Sample Size StrSRS III
Necessary sample sizes for two-way stratification:

              Education
Employment    high      middle    low       Total
Employed      N_11      N_12      N_13      N_1.
Unemployed    N_21      N_22      N_23      N_2.
Total         N_.1      N_.2      N_.3      N

Minimize $\sum_{h=1}^{H} \sum_{g=1}^{G} n_{hg}$ under $m_{hg} \le n_{hg} \le M_{hg}$, $h = 1, \ldots, H$, $g = 1, \ldots, G$, and side conditions for the necessary sample size for each cell in the above table.

12 Necessary Sample Size StrSRS IV
Solution via specialised box-constraints optimization algorithms:
- Exact: Gabler, Ganninger and Münnich (2012), Metrika
- Numerical: Münnich, Sachs and Wagner (2012), Journal of Global Optimization
- For integer solutions: Friedrich, Münnich, de Vries, Wagner, in preparation
Remark: Overallocation of the optimal allocation may be handled easily!

13 Complex Designs

14 Design Effect
Working with a more complex sampling design, e.g. one with cluster sampling, will usually have an effect on the variance of the estimators used. The design effect describes this difference:
$$\mathrm{deff}_p(\theta) = \frac{V_p(\hat\theta_p)}{V_{SRS}(\hat\theta_{SRS})}$$
where $\hat\theta_p$ is the estimator used under the complex design $p(\cdot)$ and $\hat\theta_{SRS}$ the estimator under SRS, usually WoR. To have a fair comparison, the same (expected) sample size $n$ is assumed under design $p(\cdot)$ and the SRS design.

15 Effective Sample Size
The ratio
$$n_{eff} = \frac{n}{\mathrm{deff}_p(\theta)}$$
is referred to as the effective sample size. It is the sample size required in an SRS to yield the same precision for a certain estimator as under a given complex sample design. If we select $n_{eff}$ as the sample size under SRS to achieve a certain level of precision, multiplying it by the design effect of the complex sample gives the sample size $n_p$, which should ensure the same precision under the complex design. Hence,
$$n_p = n_{eff} \cdot \mathrm{deff}_p(\theta).$$

16 Behaviour of the Design Effect
[Figure: deff as a function of cluster size (b) and homogeneity (ρ)]
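How cluster size and homogeneity drive the design effect can be illustrated with the common approximation $\mathrm{deff} \approx 1 + (b - 1)\rho$ for cluster samples with equal cluster size $b$ and intra-cluster correlation $\rho$. This approximation is an assumption added here, not a formula taken from the slide, and the function names are illustrative.

```python
import math


def approx_deff(b, rho):
    """Approximate design effect for equal cluster size b and
    intra-cluster correlation rho (assumed approximation)."""
    return 1 + (b - 1) * rho


def effective_sample_size(n, deff):
    """n_eff = n / deff."""
    return n / deff


def required_sample_size(n_srs, deff):
    """n_p = n_eff * deff, rounded up to an integer."""
    return math.ceil(n_srs * deff)


# deff grows with the cluster size and with the homogeneity
for b in (1, 5, 20):
    print(b, [round(approx_deff(b, rho), 2) for rho in (0.0, 0.05, 0.2)])

# An SRS size of 385 inflated for clusters of 20 with rho = 0.05
print(required_sample_size(385, approx_deff(20, 0.05)))
```

With clusters of size one or with $\rho = 0$ the approximation returns a design effect of 1, so no inflation of the SRS sample size is needed.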

17 Estimation methods for Indicators

18 The main types of statistics that are of interest are totals and means of the indicator variables. In the following we show how $\tau_y$ and $P_y$ can be estimated for any sampling design, introducing the unifying concept of inclusion probabilities.

19 First order inclusion probabilities
The selection probability of element $k$ in draw $l$ is of special interest and is denoted by $\psi_{k,l}$. These probabilities vary from draw to draw when sampling without replacement is chosen. We now define:
First order inclusion probabilities: The first order inclusion probability $\pi_k$ is the probability that element $k$ is selected in the sample (independently of the draw!). Then:
$$\pi_k = \sum_{s \ni k} p(s)$$
In general, we use the inverse inclusion probability (IIP) $d_k = 1/\pi_k$ and refer to $d_k$ as the design weight.

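20 Properties of first order inclusion probabilities
The total probability over all samples:
$$\sum_{s \in S} p(s) = 1$$
For fixed sample size $n$ designs, we get
$$\sum_{k \in U} \pi_k = n \quad \text{and} \quad \mathrm{E}\Big(\sum_{k \in s} \frac{1}{\pi_k}\Big) = N,$$
where the second sum equals $N$ for every sample under equal-probability designs such as SRS.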

21 Second order inclusion probabilities
In order to obtain variances or variance estimates, we also need the second order inclusion probabilities:
Second order inclusion probabilities: The probability that both elements $k$ and $l$ are drawn in the sample is denoted by
$$\pi_{kl} = \sum_{s \supseteq \{k,l\}} p(s),$$
and is called the second order inclusion probability. From this definition, we can conclude that $\pi_{kk} = \pi_k$ holds.

22 SRS (revisited)
As inclusion probabilities for SRSWoR we get
$$\pi_k = \frac{n}{N} \quad \text{for all } k \in U, \qquad \pi_{kl} = \frac{n(n-1)}{N(N-1)} \quad \text{for all } k \ne l \in U.$$
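For a small population, the definitions of first and second order inclusion probabilities can be checked by enumerating all possible samples. The sketch below does this for SRSWoR, where every sample of size $n$ has the same probability $p(s)$; the function name is illustrative.

```python
from fractions import Fraction
from itertools import combinations


def srswor_inclusion_probabilities(N, n):
    """Enumerate all SRSWoR samples of size n from U = {0, ..., N-1} and
    sum p(s) over the samples containing k (and both k and l)."""
    samples = list(combinations(range(N), n))
    p = Fraction(1, len(samples))             # p(s) is constant under SRSWoR
    pi = {k: sum(p for s in samples if k in s) for k in range(N)}
    pi2 = {(k, l): sum(p for s in samples if k in s and l in s)
           for k in range(N) for l in range(N)}
    return pi, pi2


pi, pi2 = srswor_inclusion_probabilities(N=5, n=2)
print(pi[0])          # equals n / N
print(pi2[(0, 1)])    # equals n (n - 1) / (N (N - 1))
print(pi2[(0, 0)])    # equals pi_k, since pi_kk = pi_k
```

Exact fractions are used so that the enumerated values can be compared with the closed-form expressions without rounding error.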

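23 StrSRS (revisited)
As inclusion probabilities for StrSRSWoR we get
$$\pi_k = \frac{n_h}{N_h} \quad \text{for all } k \in U_h,$$
$$\pi_{kl} = \frac{n_h(n_h - 1)}{N_h(N_h - 1)} \quad \text{for all } k \ne l \in U_h,$$
$$\pi_{kl} = \frac{n_h\, n_g}{N_h\, N_g} \quad \text{for all } k \in U_h,\ l \in U_g, \text{ and } h \ne g.$$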

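24 The Horvitz-Thompson estimator (HT) I
The HT estimator for the total $\tau_y$ of a variable $y$ is
$$\hat{Y}_{HT} = \sum_{i \in s} \frac{y_i}{\pi_i} = \sum_{i \in s} d_i y_i.$$
The HT estimator is unbiased for any design with $\pi_k > 0$ for all $k \in U$! The variance of $\hat{Y}_{HT}$ is
$$V(\hat{Y}_{HT}) = \sum_{k \in U} \sum_{l \in U} (\pi_{kl} - \pi_k \pi_l)\, \frac{y_k}{\pi_k}\, \frac{y_l}{\pi_l}.$$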

25 Horvitz-Thompson estimator II
An unbiased estimate of the variance of the HT estimator is
$$\hat{V}(\hat{Y}_{HT}) = \sum_{i \in s} \sum_{j \in s} \frac{\pi_{ij} - \pi_i \pi_j}{\pi_{ij}}\, \frac{y_i}{\pi_i}\, \frac{y_j}{\pi_j}.$$
The above estimator is only unbiased for designs with strictly positive second order inclusion probabilities.
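The HT total and its variance estimator can be written down almost literally. The sketch below is an illustration with made-up data under SRSWoR, where the double sum should reproduce the familiar expression $N^2 (1 - n/N)\, s_y^2 / n$; the function names and the data are assumptions.

```python
def ht_total(y, pi):
    """Horvitz-Thompson estimate of a total: sum of y_i / pi_i over the sample."""
    return sum(yi / p for yi, p in zip(y, pi))


def ht_variance_estimate(y, pi, pi2):
    """Variance estimate of the HT total.

    pi2(i, j) returns the second order inclusion probability of the
    sampled elements i != j; for i == j, pi_ii = pi_i is used.
    """
    total = 0.0
    for i in range(len(y)):
        for j in range(len(y)):
            pij = pi[i] if i == j else pi2(i, j)
            total += (pij - pi[i] * pi[j]) / pij * (y[i] / pi[i]) * (y[j] / pi[j])
    return total


# Made-up SRSWoR sample of n = 5 out of N = 100
N, n = 100, 5
y = [3.0, 5.0, 7.0, 9.0, 11.0]
pi = [n / N] * n
pi2 = lambda i, j: n * (n - 1) / (N * (N - 1))

print(ht_total(y, pi))
print(ht_variance_estimate(y, pi, pi2))
```

For designs where some $\pi_{ij}$ are zero the division fails, which mirrors the requirement of strictly positive second order inclusion probabilities.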

26 Horvitz-Thompson estimator III
An HT-type estimator for a mean is given by
$$\hat{P}_{y,HT} = \frac{\sum_{i \in s} d_i y_i}{\sum_{i \in s} d_i} = \frac{\hat{Y}_{HT}}{\hat{N}}.$$
Variance estimation is more complex, as the estimator is not linear, but there exist approximate solutions.

27 Properties of the HT estimator
- HT estimators: a class of linear unbiased estimators.
- Negative variance estimates may result in some designs!
- The design effect should be considered.
Cassel, C.; Särndal, C.-E.; Wretman, J.H. (1977): Foundations of Inference in Survey Sampling. Wiley.
Gabler, S. (1990): Minimax Solutions in Sampling from Finite Populations. Lecture Notes in Statistics, 64. Springer.
Hedayat, A.S.; Sinha, B.K. (1991): Design and Inference in Finite Population Sampling. Wiley.

28 General Regression Estimator (GREG)
$$\hat{Y}_{GREG} = \hat{Y}_{HT} + (\tau_x - \hat{X}_{HT})^{\top} \hat{B}$$
(cf. Särndal et al. 1992, pp. 225) with $\hat{B}$ as the estimate of the coefficients of a linear regression model explaining the variable of interest $y$ with the auxiliary variables $x$. Or:
$$\hat{Y}_{GREG} = \sum_{i \in s} g_i d_i y_i$$
where $g_i$ is the adjustment to the design weight by the linear model.
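Both forms of the GREG estimator can be checked against each other in a toy case. The sketch below assumes a deliberately simple working model with a single auxiliary variable and no intercept, so that $\hat{B} = \sum d_i x_i y_i / \sum d_i x_i^2$; this model choice, the function name and the data are illustrative assumptions.

```python
def greg_total(y, x, d, tau_x):
    """GREG estimate of the total of y using one auxiliary variable x
    with known population total tau_x (working model without intercept).

    Returns the estimate and the adjustment factors g_i.
    """
    y_ht = sum(di * yi for di, yi in zip(d, y))
    x_ht = sum(di * xi for di, xi in zip(d, x))
    sxx = sum(di * xi * xi for di, xi in zip(d, x))
    b_hat = sum(di * xi * yi for di, xi, yi in zip(d, x, y)) / sxx
    # g_i adjusts the design weight so that the weighted x total hits tau_x
    g = [1 + (tau_x - x_ht) * xi / sxx for xi in x]
    return y_ht + (tau_x - x_ht) * b_hat, g


# Made-up sample: design weights 10, known auxiliary total 70
y = [2.0, 4.0, 6.0]
x = [1.0, 2.0, 3.0]
d = [10.0, 10.0, 10.0]
estimate, g = greg_total(y, x, d, tau_x=70.0)

print(estimate)
print(sum(gi * di * yi for gi, di, yi in zip(g, d, y)))   # same value via the weights
print(sum(gi * di * xi for gi, di, xi in zip(g, d, x)))   # reproduces tau_x
```

The last line shows the calibration property: the adjusted weights $g_i d_i$ reproduce the known total of the auxiliary variable.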

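29 The Idea of Regression Estimators
Sample data for two variables, y and x:
[Figure: scatter plot of y against x; the surviving labels mark the known mean $\mu_x$ and the sample mean of x on the horizontal axis, and the estimate $\hat\mu_y$ and the sample mean of y on the vertical axis]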

30 Weighting

31 The Horvitz-Thompson Estimator Revisited
The Horvitz-Thompson estimator
$$\hat{Y}_{HT} = \sum_{i \in s} d_i y_i$$
The HT estimator has very good properties. But:
- It does not gain in efficiency from any available auxiliary information.
- It cannot directly account for non-response or other disproportions.
Additional weights may help!

32 Poststratification
- The stratification is used after drawing a simple random sample (Lohr, 1999)
- SRS yields approximately equal proportions with respect to a stratification
- Correction of over- and under-representation of the categories (strata)
- Variance formulae of the proportional allocation are applied
Poststratification is applied when the necessary information for stratification is not available in the sampling frame but is available from other sources, e.g. registers.
Example: nationality (German, non-German) and gender
Note: overloading poststratification with many variables may lead to erroneous results

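35 Example
A simple random sample yields the sample size $n$, the mean $\bar{y}$ and the variance $s_y^2$ separately for women (W) and men (M); the numeric entries of the table and of the results below were not preserved. The proportion of women in the universe is 60%. Hence:
SRS: $\bar{y} = \ldots$, $\hat{V}(\bar{y})_{SRSWR} = \ldots$
PostStr: $\bar{y}_{PS} = \ldots$, $\hat{V}(\bar{y}_{PS})_{SRSWR} = \sum_{q=1}^{L} \gamma_q^2\, \frac{s_q^2}{n_q} = \ldots$
The variability of the sample sizes $n_q$ in the denominator of $\hat{V}(\bar{y}_{PS})_{SRSWR}$ was not considered (it may be ignored in practice if the $n_q$ are not too low)!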
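Since the numbers of this example are missing, the calculation can be replayed with made-up values. The sketch below uses a hypothetical sample of 40 women and 60 men and the stated population share of 60% women; only that share comes from the slides, everything else is an assumption.

```python
def poststratified_mean(means, shares):
    """y_PS = sum over strata of gamma_q * mean_q."""
    return sum(g * m for g, m in zip(shares, means))


def poststratified_variance(variances, sizes, shares):
    """V(y_PS) = sum of gamma_q^2 * s_q^2 / n_q, treating the n_q as fixed."""
    return sum(g ** 2 * s2 / n for g, s2, n in zip(shares, variances, sizes))


# Hypothetical sample summary (W, M); only the 60% share is from the slides
n = [40, 60]
ybar = [20.0, 30.0]
s2 = [16.0, 25.0]
gamma = [0.6, 0.4]

srs_mean = sum(nq * m for nq, m in zip(n, ybar)) / sum(n)
print(srs_mean)                                   # unadjusted sample mean
print(poststratified_mean(ybar, gamma))           # corrects the under-representation of women
print(poststratified_variance(s2, n, gamma))
```

In this made-up sample women are under-represented (40% instead of 60%), so the poststratified mean moves towards the women's mean.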

36 The Calibration and GREG Estimator
We learned that the GREG can be written as an HT estimator with specialised weights:
$$\hat{Y}_{GREG} = \sum_{i \in s} g_i d_i y_i.$$
The weights come from a regression model which assists the estimation in order to improve the estimates. The better the model, the better the estimates. At least aggregate information is needed for the covariates.

37 Weights and Estimation
Why are weights so essentially important? Münnich and Burgard (2012)
Weights can be used for
- Correction of the disproportions in the sample, e.g. poststratification for gender
- Controlling for response propensities: correction for non-response
- Control for the variation of design weights, cf. the Gelman (2007) discussion
Known estimators using weights:
- GREG family
- Calibration estimators
Expected solution: does there exist a weighting vector such that all (most / small area) estimates of the main variables are considered as calibration constraints?
Extension: Benchmarking by penalized calibration

38 Our goal is
to get weights $w_k$ for estimating
$$\hat\tau_y = \sum_{k \in s} w_k y_k = \sum_{k \in s} d_k g_k y_k, \qquad \tau_x = \sum_{k \in s} w_k x_k,$$
where these weights satisfy benchmarks for certain areas and domains and for certain auxiliary variables.
- $d_k = \pi_k^{-1}\ \forall k$: design weights
- $g_k$: the adjustment weight
- $w_k = d_k g_k\ \forall k$: calibrated weights
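One simple way to obtain weights that reproduce known benchmarks is raking (iterative proportional fitting) to the margins of two categorical variables. This is only one member of the calibration family and is shown here as an illustrative sketch; the variable names, categories and totals are hypothetical.

```python
def rake(d, row, col, row_totals, col_totals, iters=100, tol=1e-10):
    """Adjust design weights d so that the weighted counts match the known
    totals of two categorical variables (row and col), by alternately
    rescaling the weights within each category."""
    w = list(d)
    for _ in range(iters):
        for cats, totals in ((row, row_totals), (col, col_totals)):
            sums = {}
            for wk, c in zip(w, cats):
                sums[c] = sums.get(c, 0.0) + wk
            w = [wk * totals[c] / sums[c] for wk, c in zip(w, cats)]
        # Column margins are exact after the last step, so check the rows
        sums = {}
        for wk, c in zip(w, row):
            sums[c] = sums.get(c, 0.0) + wk
        if all(abs(sums[c] - row_totals[c]) < tol for c in row_totals):
            break
    return w


# Hypothetical sample of four persons with design weight 25 each
gender = ["f", "f", "m", "m"]
status = ["employed", "unemployed", "employed", "unemployed"]
w = rake([25.0] * 4, gender, status,
         {"f": 60.0, "m": 40.0}, {"employed": 70.0, "unemployed": 30.0})
print(w)                                  # calibrated weights w_k
print([wk / 25.0 for wk in w])            # adjustment weights g_k = w_k / d_k
```

The ratios $w_k / d_k$ play the role of the adjustment weights $g_k$; the two sets of benchmark totals must add up to the same population size.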

39 Non-response

40 Missing Data
Everybody has them, nobody wants them!
- Missing Completely at Random (MCAR): Completely random pattern. The pattern is neutral, i.e. not systematic.
- Missing at Random (MAR): The pattern is random but conditional on other data. The pattern is systematic, but can be explained by the observed auxiliary data.
- Missing Not at Random (MNAR): The pattern cannot be modelled as a random process. The pattern is systematic but cannot be explained.

41 Non-response Bias Source: Schnell, Rainer (2002): Antworten auf non-response.

42 Methods to Handle Missing Data
- Procedures based on the available cases only, i.e. only those cases that are completely recorded for the variables of interest
- Weighting procedures such as Horvitz-Thompson type estimators or raking estimators that adjust for non-response
- Single imputation and correction of the variance estimates to account for imputation uncertainty
- Multiple imputation (MI) according to Rubin (1978, 1987) and standard complete-data analysis
We regard multiple imputation as the most flexible approach for multipurpose complex surveys.

43 Complete Case Analysis
Delete the observations with missing values from the analysis/estimation.
- It is an elimination procedure.
- It results in a reduction of observations: all incompletely observed rows of a data matrix are deleted.
- Large loss of information in case of item non-response.
- The remaining sample can be distorted if the missing data pattern is systematic (example: income).
Estimates will be biased if the data are not MCAR!

44 Single Imputation Methods
Imputations are means or draws from a predictive distribution of the missing values, and require a method of creating a predictive distribution for the imputation based on the observed data (Little and Rubin, 2002, p. 59). Following Little and Rubin (2002), the single imputation methods can be divided into explicit and implicit modeling. When the modeling is explicit, a formal statistical model (e.g. multivariate normal) is used for the predictive distribution. When the modeling is implicit, the emphasis is on an algorithm, which implies an underlying model.

45 Explicit Modeling
Mean Imputation: The mean of the responding units is used as the imputed value.
Regression Imputation: We have auxiliary variables $x_i = (x_{i1}, \ldots, x_{iG})$ that are fully observed for all elements in the sample, and the variable of interest $y_i$, which has missing values for some elements in the sample. The imputed values $\hat{y}_i$ are obtained by using the auxiliary data of the nonrespondents in the regression equation:
$$\hat{y}_i = \hat\alpha + \sum_{g=1}^{G} \hat\beta_g x_{ig}, \quad \text{for all missing } y_i.$$
Ratio Imputation: Similar to regression imputation. For the responding units, the ratio of the sums of their values for the variable of interest $Y$ and an auxiliary variable $X$ is computed. The imputed value for a certain nonrespondent is obtained by:
$$\hat{y}_{mis} = \Big(\frac{y_{obs}}{x_{obs}}\Big)\, x_{mis}$$
where $y_{obs}$ and $x_{obs}$ denote these sums over the respondents.
cf. Little and Rubin (2002), Lee et al. (1994)
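Ratio and regression imputation can be illustrated with a single auxiliary variable. The sketch below marks missing values of $y$ with `None` and uses a simple least-squares line for the regression variant; the function names and data are illustrative assumptions.

```python
def ratio_impute(y, x):
    """Impute missing y as (sum of observed y / sum of their x) * x."""
    obs = [(yi, xi) for yi, xi in zip(y, x) if yi is not None]
    ratio = sum(yi for yi, _ in obs) / sum(xi for _, xi in obs)
    return [yi if yi is not None else ratio * xi for yi, xi in zip(y, x)]


def regression_impute(y, x):
    """Impute missing y from the least-squares line fitted to the respondents."""
    obs = [(yi, xi) for yi, xi in zip(y, x) if yi is not None]
    n = len(obs)
    x_bar = sum(xi for _, xi in obs) / n
    y_bar = sum(yi for yi, _ in obs) / n
    beta = (sum((xi - x_bar) * (yi - y_bar) for yi, xi in obs)
            / sum((xi - x_bar) ** 2 for _, xi in obs))
    alpha = y_bar - beta * x_bar
    return [yi if yi is not None else alpha + beta * xi for yi, xi in zip(y, x)]


# Made-up data: the third unit did not report y
y = [2.0, 4.0, None, 8.0]
x = [1.0, 2.0, 3.0, 4.0]
print(ratio_impute(y, x))
print(regression_impute(y, x))
```

Both variants impute deterministic predictions, so a variance estimate computed from the completed data as if it were fully observed would be too small.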

46 Implicit Modeling
Hot Deck Imputation: The imputed value is obtained from the observed data set. A commonly used procedure is to define a pool of so-called donors which are quite similar to the nonrespondent (possibly with recourse to auxiliary data). The imputed value is obtained by drawing recorded values with SRS with replacement from the pool of donors.
Nearest Neighbor Imputation: A special case of hot deck imputation. A distance measure (e.g. absolute deviation) based on auxiliary variables is used to define the distance between a nonrespondent and the responding units. The value of the respondent closest to the nonrespondent is chosen as the imputed value.
Cold Deck Imputation: The imputed value comes from an external source, e.g. a previous realization of the same survey.
cf. Little and Rubin (2002), Longford (2005)
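The two donor-based procedures can be sketched in a few lines. Missing values are marked with `None`, donor pools are formed by a class variable, and the distance is the absolute deviation of one auxiliary variable; all names and data are illustrative assumptions.

```python
import random


def hot_deck_impute(y, classes, rng):
    """Replace each missing y by a value drawn with replacement from the
    respondents (donors) in the same class."""
    pools = {}
    for yi, c in zip(y, classes):
        if yi is not None:
            pools.setdefault(c, []).append(yi)
    return [yi if yi is not None else rng.choice(pools[c])
            for yi, c in zip(y, classes)]


def nearest_neighbour_impute(y, x):
    """Replace each missing y by the y of the respondent whose auxiliary
    value x is closest in absolute deviation."""
    donors = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    return [yi if yi is not None
            else min(donors, key=lambda donor: abs(donor[0] - xi))[1]
            for xi, yi in zip(x, y)]


rng = random.Random(1)   # fixed seed so the draw is reproducible
print(hot_deck_impute([1.0, None, 5.0, None], ["a", "a", "b", "b"], rng))
print(nearest_neighbour_impute([10.0, None, 30.0], [1.0, 1.2, 5.0]))
```

A class without any respondent has no donors, so the class definition has to be coarse enough for every pool to be non-empty.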

47 Response Homogeneity Classes
In practice, the different imputation methods are often performed within different groups or classes.
- The groups consist of similar elements
- To identify such groups or classes, auxiliary variables may be used
- This procedure ensures that the imputed values are close to those of similar respondents
Cf. Särndal and Lundström (2005)

48 Weighting Methods

49 Calibration for Non-response
Using the calibration approach directly yields biased estimates.
Update of calibration weights:
- Correction of the response probability
- Response homogeneity classes before calibrating
- The first order inclusion probabilities $\pi_k \rightarrow \pi_{k;nr}$
We need auxiliary information with which we can explain the response behavior sufficiently well. For example, if we know that employed and unemployed persons have different response propensities, then we should calibrate on the variable employment.
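A basic correction of the response probability within response homogeneity classes divides each respondent's design weight by the weighted response rate of its class. The sketch below assumes that response is equally likely within a class; the function name, classes and response indicators are hypothetical.

```python
def rhc_adjusted_weights(d, classes, responded):
    """Design weights of respondents inflated by the inverse weighted
    response rate of their class; nonrespondents get weight 0."""
    total, resp = {}, {}
    for dk, c, r in zip(d, classes, responded):
        total[c] = total.get(c, 0.0) + dk
        if r:
            resp[c] = resp.get(c, 0.0) + dk
    return [dk * total[c] / resp[c] if r else 0.0
            for dk, c, r in zip(d, classes, responded)]


# Hypothetical sample: employed persons respond more often than unemployed
d = [10.0] * 6
classes = ["employed"] * 4 + ["unemployed"] * 2
responded = [True, True, True, False, True, False]

w = rhc_adjusted_weights(d, classes, responded)
print(w)
print(sum(w), sum(d))   # the adjusted weights keep the weighted class totals
```

The adjusted weights can then serve as starting weights for a subsequent calibration step.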

50 Example: Unit Non-response in volunteer panels
[Figure: sampling chain Population → Microcensus → Access Panel → D-SILC]
- 1% Microcensus (MC): 1% sample of the German population; 4 rotational quarters; unit non-response
- Access Panel (DSP): recruitment out of the latest MC quarter; unit non-response due to self-selection (recruitment rates vary enormously)
- D-SILC: stratified sample from the access panel, n = 14,100; population estimates using weights

51 Multiple Imputation

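52 The Multiple Imputation Principle (1)
[Figure: a data set with variables Y1, Y2, Y3 and missing values (NA) is imputed several times; each completed data set yields its own estimate (Estimate 1, Estimate 2, Estimate 3), and these are combined into the MI estimate used for MI inference]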

53 The Multiple Imputation Principle (2)
[Figure: complete data with $\hat\theta$, $\hat{V}(\hat\theta)$ → missing values → incomplete data → imputed data sets $1, 2, \ldots, m$ with $\hat\theta^{(j)}$, $\hat{V}(\hat\theta^{(j)})$ → combined results $\hat\theta_{MI}$, $\hat{V}_{MI}(\hat\theta)$]

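54 Variance Estimation under Multiple Imputation
Multiple imputation (Rubin, 1987) yields $\hat\theta^{(j)}$ and $\hat{V}(\hat\theta^{(j)})$ for the imputed data sets $j = 1, \ldots, m$.
Multiple imputation point estimate:
$$\hat\theta_{MI} = \frac{1}{m} \sum_{j=1}^{m} \hat\theta^{(j)}$$
Multiple imputation variance estimate:
$$T = W + \Big(1 + \frac{1}{m}\Big) B$$
with within-imputation variance and between-imputation variance
$$W = \frac{1}{m} \sum_{j=1}^{m} \hat{V}(\hat\theta^{(j)}) \quad \text{and} \quad B = \frac{1}{m-1} \sum_{j=1}^{m} \big(\hat\theta^{(j)} - \hat\theta_{MI}\big)^2.$$
Problem: the imputation has to be proper in Rubin's sense.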
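The combining rules translate directly into code. The sketch below takes the $m$ point estimates and their variance estimates as plain lists; the function name and the example numbers are illustrative.

```python
def rubin_combine(estimates, variances):
    """Combine m completed-data estimates and their variance estimates.

    Returns the MI point estimate, the within variance W, the between
    variance B and the total variance T = W + (1 + 1/m) B.
    """
    m = len(estimates)
    theta_mi = sum(estimates) / m
    within = sum(variances) / m
    between = sum((t - theta_mi) ** 2 for t in estimates) / (m - 1)
    total = within + (1 + 1 / m) * between
    return theta_mi, within, between, total


# Made-up results from m = 3 imputed data sets
theta_mi, W, B, T = rubin_combine([10.0, 12.0, 14.0], [4.0, 4.0, 4.0])
print(theta_mi, W, B, T)
```

At least two imputed data sets are needed, since the between-imputation variance divides by $m - 1$.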

55 Miscellaneous for Imputation
- Development for MI is on-going
- Hiding values to achieve anonymity is unlikely to be good!
- Regional patterns and other peculiarities have to be considered
- Imputer and analyst should not differ (future task)
- Software available: mice, BaBooN (R: language and environment for statistical computing and graphics)

56 Final methodological remarks

57 Recent developments in survey statistics
- Nowadays, survey optimization is under reconstruction
- Model-based estimation becomes more and more important
- Researchers want to use data (later)
- Sophisticated design weights yield problems
- Many sources of information are to be considered: administrative data, register data, big data
- Data are used for deriving regional policies: small area and domain estimation; unit- versus area-level models; design weights have to be considered properly
Münnich and Burgard (2012); Burgard, Münnich, and Zimmermann (2014, 2015)

58 Idea of small area estimation
- Parallel estimation of many areas
- Borrow strength from information from other areas (Rao, 2003)
Small area model (Battese, Harter, and Fuller, 1988): The model has two components, a design-based one similar to a GREG, and a model-based one. The two components are weighted according to the amount of variation that can be explained by the membership of an area/domain. If much of the variation can be explained by area/domain membership, more weight is given to the design part, and vice versa.
The area-level model of Fay and Herriot (1979) is similar.
Planned and unplanned areas/domains.
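The weighting of the two components can be sketched with the usual shrinkage factor of the unit-level model, $\gamma_d = \sigma_u^2 / (\sigma_u^2 + \sigma_e^2 / n_d)$, where $\sigma_u^2$ is the between-area variance and $\sigma_e^2$ the unit-level error variance. The sketch treats both variance components as known, which is a simplifying assumption, and all names and numbers are illustrative.

```python
def shrinkage_weight(sigma_u2, sigma_e2, n_d):
    """Weight of the design-based component for an area with n_d sampled
    units; areas without sample rely on the model part alone."""
    if n_d == 0:
        return 0.0
    return sigma_u2 / (sigma_u2 + sigma_e2 / n_d)


def composite_estimate(direct, synthetic, gamma):
    """gamma * design-based component + (1 - gamma) * model-based component."""
    if gamma == 0.0:
        return synthetic          # no direct information available
    return gamma * direct + (1 - gamma) * synthetic


# Hypothetical variance components: between areas 4, within areas 16
for n_d in (0, 4, 40):
    gamma = shrinkage_weight(4.0, 16.0, n_d)
    print(n_d, round(gamma, 3), composite_estimate(10.0, 20.0, gamma))
```

The larger the between-area variance relative to the within-area variance, and the larger the area sample, the more weight the design-based component receives.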

59 Small area estimation in practice
[Figure: variable of interest and auxiliary information]

60 ESF/YEI Sampling and Estimation Recommendations

61 Representative Requirements Representative samples are randomly selected at the level of the socio-economic characteristics (variables) of the participants as captured by the output indicators covering personal data (gender, employment status, age, educational attainment and household situation). Representativeness shall also relate to the regional dimension of the output indicators. Regional representativeness shall be ensured to one NUTS level lower than the level of the programme.

62 StrSRSWoR and Allocation
How could this requirement be addressed? An obvious choice would be to use a stratified random sample, with a simple random sample without replacement in each stratum and a proportional allocation of the sample size. To solve the rounding problem, a Cox algorithm could be used, together with a two-way stratification table in which region is one dimension and the cross-classification of all output indicators is the other. This approach would answer to the naïve understanding of representativeness.
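Proportional allocation and its rounding problem can be illustrated for a one-way stratification. The sketch below uses largest-remainder rounding as a simple stand-in; it is not the Cox algorithm, which handles controlled rounding of two-way tables, and the stratum sizes are hypothetical.

```python
import math


def proportional_allocation(N_h, n):
    """Allocate n proportionally to the stratum sizes N_h and round with
    the largest-remainder rule so that the n_h sum exactly to n."""
    total = sum(N_h)
    quotas = [n * Nh / total for Nh in N_h]
    alloc = [math.floor(q) for q in quotas]
    remaining = n - sum(alloc)
    # Give the leftover units to the strata with the largest fractions
    order = sorted(range(len(N_h)), key=lambda h: quotas[h] - alloc[h],
                   reverse=True)
    for h in order[:remaining]:
        alloc[h] += 1
    return alloc


print(proportional_allocation([333, 333, 334], 10))
print(proportional_allocation([990, 10], 20))   # the small stratum gets no unit
```

The second call shows that a sparsely populated stratum can end up with zero sampled elements under proportional allocation.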

63 StrSRSWoR and Allocation
Alternatively, the necessary or optimal stratum-specific sample sizes can be selected to obtain a certain degree of accuracy or error tolerance, using box-constraints optimization, as we have seen in the section on selecting sample sizes.

64 Problems with High Stratification
There could be strata that are very sparsely populated. For proportional allocation this could result in the selection of only one or zero elements in a stratum. Exact variance estimation then becomes impossible. However, approximations or conservative variance estimates (SRS) can be used instead. If stratum-specific minimal sample sizes are set, the required total sample size might be too high.

65 Domain Estimation
Stratification can be a tool to select at least a minimum number of elements for each domain of interest (e.g. region × all output indicators). This would be necessary in order to use the types of (design-based) estimators we saw to estimate the indicators for each stratum/domain of interest. If extensive stratification is not feasible, some of the strata resulting from crossing region and output indicators have to be collapsed. Estimates for each domain of interest can still be obtained by using so-called model-based estimators. This would, however, require auxiliary information on the elements in the non-sampled domains. Results will be more prone to bias.

66 Sampling Frame
- Is it possible to build sampling frames that cover the populations of interest and segregate them from each other?
- Are the frames enriched with the stratification variables?
- Are the investment priorities identifiable?
Non-overlapping Samples
- Selection of non-overlapping samples from the same frame.
- Selection of non-overlapping samples from different overlapping frames.
If multiple samples have to be selected that cover the same population, or parts of it, then this can lead to highly complex sampling designs. Inclusion probabilities can become very difficult to determine and to calculate. Select as few samples as possible to cover the same population!

67 Non-response
It is difficult to give specific recommendations in advance without having any knowledge of the response behavior. In general:
- If a survey has a single variable that is of major interest (e.g. a particular indicator), single imputation or calibration might be appropriate.
- If the survey has multiple equally important variables that are all affected by non-response, then it might be worth taking multiple imputation into consideration.

68 Cross National Comparability
Two things are important for the comparability of results across countries:
- Sample sizes should be planned with the same error in all countries.
- Translation of a common questionnaire (e.g. in English) into national languages.
For further good practice on cross-national comparability see ESS Methodology:


More information

Development of methodology for the estimate of variance of annual net changes for LFS-based indicators

Development of methodology for the estimate of variance of annual net changes for LFS-based indicators Development of methodology for the estimate of variance of annual net changes for LFS-based indicators Deliverable 1 - Short document with derivation of the methodology (FINAL) Contract number: Subject:

More information

Calibration Estimation of Semiparametric Copula Models with Data Missing at Random

Calibration Estimation of Semiparametric Copula Models with Data Missing at Random Calibration Estimation of Semiparametric Copula Models with Data Missing at Random Shigeyuki Hamori 1 Kaiji Motegi 1 Zheng Zhang 2 1 Kobe University 2 Renmin University of China Institute of Statistics

More information

Empirical Likelihood Methods for Sample Survey Data: An Overview

Empirical Likelihood Methods for Sample Survey Data: An Overview AUSTRIAN JOURNAL OF STATISTICS Volume 35 (2006), Number 2&3, 191 196 Empirical Likelihood Methods for Sample Survey Data: An Overview J. N. K. Rao Carleton University, Ottawa, Canada Abstract: The use

More information

Design and Estimation for Split Questionnaire Surveys

Design and Estimation for Split Questionnaire Surveys University of Wollongong Research Online Centre for Statistical & Survey Methodology Working Paper Series Faculty of Engineering and Information Sciences 2008 Design and Estimation for Split Questionnaire

More information

Combining multiple observational data sources to estimate causal eects

Combining multiple observational data sources to estimate causal eects Department of Statistics, North Carolina State University Combining multiple observational data sources to estimate causal eects Shu Yang* syang24@ncsuedu Joint work with Peng Ding UC Berkeley May 23,

More information

Minimax-Regret Sample Design in Anticipation of Missing Data, With Application to Panel Data. Jeff Dominitz RAND. and

Minimax-Regret Sample Design in Anticipation of Missing Data, With Application to Panel Data. Jeff Dominitz RAND. and Minimax-Regret Sample Design in Anticipation of Missing Data, With Application to Panel Data Jeff Dominitz RAND and Charles F. Manski Department of Economics and Institute for Policy Research, Northwestern

More information

Advanced Survey Sampling

Advanced Survey Sampling Lecture materials Advanced Survey Sampling Statistical methods for sample surveys Imbi Traat niversity of Tartu 2007 Statistical methods for sample surveys Lecture 1, Imbi Traat 2 1 Introduction Sample

More information

Some methods for handling missing values in outcome variables. Roderick J. Little

Some methods for handling missing values in outcome variables. Roderick J. Little Some methods for handling missing values in outcome variables Roderick J. Little Missing data principles Likelihood methods Outline ML, Bayes, Multiple Imputation (MI) Robust MAR methods Predictive mean

More information

On the bias of the multiple-imputation variance estimator in survey sampling

On the bias of the multiple-imputation variance estimator in survey sampling J. R. Statist. Soc. B (2006) 68, Part 3, pp. 509 521 On the bias of the multiple-imputation variance estimator in survey sampling Jae Kwang Kim, Yonsei University, Seoul, Korea J. Michael Brick, Westat,

More information

Statistical Methods. Missing Data snijders/sm.htm. Tom A.B. Snijders. November, University of Oxford 1 / 23

Statistical Methods. Missing Data  snijders/sm.htm. Tom A.B. Snijders. November, University of Oxford 1 / 23 1 / 23 Statistical Methods Missing Data http://www.stats.ox.ac.uk/ snijders/sm.htm Tom A.B. Snijders University of Oxford November, 2011 2 / 23 Literature: Joseph L. Schafer and John W. Graham, Missing

More information

Main sampling techniques

Main sampling techniques Main sampling techniques ELSTAT Training Course January 23-24 2017 Martin Chevalier Department of Statistical Methods Insee 1 / 187 Main sampling techniques Outline Sampling theory Simple random sampling

More information

Selection on Observables: Propensity Score Matching.

Selection on Observables: Propensity Score Matching. Selection on Observables: Propensity Score Matching. Department of Economics and Management Irene Brunetti ireneb@ec.unipi.it 24/10/2017 I. Brunetti Labour Economics in an European Perspective 24/10/2017

More information

Calibration Estimation of Semiparametric Copula Models with Data Missing at Random

Calibration Estimation of Semiparametric Copula Models with Data Missing at Random Calibration Estimation of Semiparametric Copula Models with Data Missing at Random Shigeyuki Hamori 1 Kaiji Motegi 1 Zheng Zhang 2 1 Kobe University 2 Renmin University of China Econometrics Workshop UNC

More information

Empirical Likelihood Methods for Two-sample Problems with Data Missing-by-Design

Empirical Likelihood Methods for Two-sample Problems with Data Missing-by-Design 1 / 32 Empirical Likelihood Methods for Two-sample Problems with Data Missing-by-Design Changbao Wu Department of Statistics and Actuarial Science University of Waterloo (Joint work with Min Chen and Mary

More information

Don t be Fancy. Impute Your Dependent Variables!

Don t be Fancy. Impute Your Dependent Variables! Don t be Fancy. Impute Your Dependent Variables! Kyle M. Lang, Todd D. Little Institute for Measurement, Methodology, Analysis & Policy Texas Tech University Lubbock, TX May 24, 2016 Presented at the 6th

More information

Survey Sample Methods

Survey Sample Methods Survey Sample Methods p. 1/54 Survey Sample Methods Evaluators Toolbox Refreshment Abhik Roy & Kristin Hobson abhik.r.roy@wmich.edu & kristin.a.hobson@wmich.edu Western Michigan University AEA Evaluation

More information

Weighting of Results. Sample weights

Weighting of Results. Sample weights Weighting of Results. Two inds of weights are used in processing the answers to qualitative questions. Here they are termed sample weights and size weights. Sample weights 2. Sample weights are the inverse

More information

Unequal Probability Designs

Unequal Probability Designs Unequal Probability Designs Department of Statistics University of British Columbia This is prepares for Stat 344, 2014 Section 7.11 and 7.12 Probability Sampling Designs: A quick review A probability

More information

A Note on the Effect of Auxiliary Information on the Variance of Cluster Sampling

A Note on the Effect of Auxiliary Information on the Variance of Cluster Sampling Journal of Official Statistics, Vol. 25, No. 3, 2009, pp. 397 404 A Note on the Effect of Auxiliary Information on the Variance of Cluster Sampling Nina Hagesæther 1 and Li-Chun Zhang 1 A model-based synthesis

More information

SAMPLE SIZE ESTIMATION FOR MONITORING AND EVALUATION: LECTURE NOTES

SAMPLE SIZE ESTIMATION FOR MONITORING AND EVALUATION: LECTURE NOTES SAMPLE SIZE ESTIMATION FOR MONITORING AND EVALUATION: LECTURE NOTES Joseph George Caldwell, PhD (Statistics) 1432 N Camino Mateo, Tucson, AZ 85745-3311 USA Tel. (001)(520)222-3446, E-mail jcaldwell9@yahoo.com

More information

Multidimensional Control Totals for Poststratified Weights

Multidimensional Control Totals for Poststratified Weights Multidimensional Control Totals for Poststratified Weights Darryl V. Creel and Mansour Fahimi Joint Statistical Meetings Minneapolis, MN August 7-11, 2005 RTI International is a trade name of Research

More information

REPLICATION VARIANCE ESTIMATION FOR THE NATIONAL RESOURCES INVENTORY

REPLICATION VARIANCE ESTIMATION FOR THE NATIONAL RESOURCES INVENTORY REPLICATION VARIANCE ESTIMATION FOR THE NATIONAL RESOURCES INVENTORY J.D. Opsomer, W.A. Fuller and X. Li Iowa State University, Ames, IA 50011, USA 1. Introduction Replication methods are often used in

More information

Analyzing Pilot Studies with Missing Observations

Analyzing Pilot Studies with Missing Observations Analyzing Pilot Studies with Missing Observations Monnie McGee mmcgee@smu.edu. Department of Statistical Science Southern Methodist University, Dallas, Texas Co-authored with N. Bergasa (SUNY Downstate

More information

Monte Carlo Study on the Successive Difference Replication Method for Non-Linear Statistics

Monte Carlo Study on the Successive Difference Replication Method for Non-Linear Statistics Monte Carlo Study on the Successive Difference Replication Method for Non-Linear Statistics Amang S. Sukasih, Mathematica Policy Research, Inc. Donsig Jang, Mathematica Policy Research, Inc. Amang S. Sukasih,

More information

Calibration Estimation for Semiparametric Copula Models under Missing Data

Calibration Estimation for Semiparametric Copula Models under Missing Data Calibration Estimation for Semiparametric Copula Models under Missing Data Shigeyuki Hamori 1 Kaiji Motegi 1 Zheng Zhang 2 1 Kobe University 2 Renmin University of China Economics and Economic Growth Centre

More information

Nonrespondent subsample multiple imputation in two-phase random sampling for nonresponse

Nonrespondent subsample multiple imputation in two-phase random sampling for nonresponse Nonrespondent subsample multiple imputation in two-phase random sampling for nonresponse Nanhua Zhang Division of Biostatistics & Epidemiology Cincinnati Children s Hospital Medical Center (Joint work

More information

No is the Easiest Answer: Using Calibration to Assess Nonignorable Nonresponse in the 2002 Census of Agriculture

No is the Easiest Answer: Using Calibration to Assess Nonignorable Nonresponse in the 2002 Census of Agriculture No is the Easiest Answer: Using Calibration to Assess Nonignorable Nonresponse in the 2002 Census of Agriculture Phillip S. Kott National Agricultural Statistics Service Key words: Weighting class, Calibration,

More information

Fractional Imputation in Survey Sampling: A Comparative Review

Fractional Imputation in Survey Sampling: A Comparative Review Fractional Imputation in Survey Sampling: A Comparative Review Shu Yang Jae-Kwang Kim Iowa State University Joint Statistical Meetings, August 2015 Outline Introduction Fractional imputation Features Numerical

More information

A decision theoretic approach to Imputation in finite population sampling

A decision theoretic approach to Imputation in finite population sampling A decision theoretic approach to Imputation in finite population sampling Glen Meeden School of Statistics University of Minnesota Minneapolis, MN 55455 August 1997 Revised May and November 1999 To appear

More information

How to Use the Internet for Election Surveys

How to Use the Internet for Election Surveys How to Use the Internet for Election Surveys Simon Jackman and Douglas Rivers Stanford University and Polimetrix, Inc. May 9, 2008 Theory and Practice Practice Theory Works Doesn t work Works Great! Black

More information

Statistical Education - The Teaching Concept of Pseudo-Populations

Statistical Education - The Teaching Concept of Pseudo-Populations Statistical Education - The Teaching Concept of Pseudo-Populations Andreas Quatember Johannes Kepler University Linz, Austria Department of Applied Statistics, Johannes Kepler University Linz, Altenberger

More information

SAMPLING BIOS 662. Michael G. Hudgens, Ph.D. mhudgens :55. BIOS Sampling

SAMPLING BIOS 662. Michael G. Hudgens, Ph.D.   mhudgens :55. BIOS Sampling SAMPLIG BIOS 662 Michael G. Hudgens, Ph.D. mhudgens@bios.unc.edu http://www.bios.unc.edu/ mhudgens 2008-11-14 15:55 BIOS 662 1 Sampling Outline Preliminaries Simple random sampling Population mean Population

More information

Imputation of rounded data

Imputation of rounded data 11 0 Imputation of rounded data Jan van der Laan and Léander Kuijvenhoven The views expressed in this paper are those of the author(s) and do not necessarily reflect the policies of Statistics Netherlands

More information

ANALYSIS OF ORDINAL SURVEY RESPONSES WITH DON T KNOW

ANALYSIS OF ORDINAL SURVEY RESPONSES WITH DON T KNOW SSC Annual Meeting, June 2015 Proceedings of the Survey Methods Section ANALYSIS OF ORDINAL SURVEY RESPONSES WITH DON T KNOW Xichen She and Changbao Wu 1 ABSTRACT Ordinal responses are frequently involved

More information

Combining data from two independent surveys: model-assisted approach

Combining data from two independent surveys: model-assisted approach Combining data from two independent surveys: model-assisted approach Jae Kwang Kim 1 Iowa State University January 20, 2012 1 Joint work with J.N.K. Rao, Carleton University Reference Kim, J.K. and Rao,

More information

Comments on Design-Based Prediction Using Auxilliary Information under Random Permutation Models (by Wenjun Li (5/21/03) Ed Stanek

Comments on Design-Based Prediction Using Auxilliary Information under Random Permutation Models (by Wenjun Li (5/21/03) Ed Stanek Comments on Design-Based Prediction Using Auxilliary Information under Random Permutation Models (by Wenjun Li (5/2/03) Ed Stanek Here are comments on the Draft Manuscript. They are all suggestions that

More information

What is Survey Weighting? Chris Skinner University of Southampton

What is Survey Weighting? Chris Skinner University of Southampton What is Survey Weighting? Chris Skinner University of Southampton 1 Outline 1. Introduction 2. (Unresolved) Issues 3. Further reading etc. 2 Sampling 3 Representation 4 out of 8 1 out of 10 4 Weights 8/4

More information

Imputation for Missing Data under PPSWR Sampling

Imputation for Missing Data under PPSWR Sampling July 5, 2010 Beijing Imputation for Missing Data under PPSWR Sampling Guohua Zou Academy of Mathematics and Systems Science Chinese Academy of Sciences 1 23 () Outline () Imputation method under PPSWR

More information

Successive Difference Replication Variance Estimation in Two-Phase Sampling

Successive Difference Replication Variance Estimation in Two-Phase Sampling Successive Difference Replication Variance Estimation in Two-Phase Sampling Jean D. Opsomer Colorado State University Michael White US Census Bureau F. Jay Breidt Colorado State University Yao Li Colorado

More information

Cluster Sampling 2. Chapter Introduction

Cluster Sampling 2. Chapter Introduction Chapter 7 Cluster Sampling 7.1 Introduction In this chapter, we consider two-stage cluster sampling where the sample clusters are selected in the first stage and the sample elements are selected in the

More information

Two Measures for Sample Size Determination

Two Measures for Sample Size Determination Survey Research Methods (2011) Vol.5, No.1, pp. 27-37 ISSN 1864-3361 http://www.surveymethods.org European Survey Research Association Two Measures for Sample Sie Determination Philippe Eichenberger Swiss

More information

INSTRUMENTAL-VARIABLE CALIBRATION ESTIMATION IN SURVEY SAMPLING

INSTRUMENTAL-VARIABLE CALIBRATION ESTIMATION IN SURVEY SAMPLING Statistica Sinica 24 (2014), 1001-1015 doi:http://dx.doi.org/10.5705/ss.2013.038 INSTRUMENTAL-VARIABLE CALIBRATION ESTIMATION IN SURVEY SAMPLING Seunghwan Park and Jae Kwang Kim Seoul National Univeristy

More information

5.3 LINEARIZATION METHOD. Linearization Method for a Nonlinear Estimator

5.3 LINEARIZATION METHOD. Linearization Method for a Nonlinear Estimator Linearization Method 141 properties that cover the most common types of complex sampling designs nonlinear estimators Approximative variance estimators can be used for variance estimation of a nonlinear

More information

Applied Microeconometrics (L5): Panel Data-Basics

Applied Microeconometrics (L5): Panel Data-Basics Applied Microeconometrics (L5): Panel Data-Basics Nicholas Giannakopoulos University of Patras Department of Economics ngias@upatras.gr November 10, 2015 Nicholas Giannakopoulos (UPatras) MSc Applied Economics

More information

Estimating and Using Propensity Score in Presence of Missing Background Data. An Application to Assess the Impact of Childbearing on Wellbeing

Estimating and Using Propensity Score in Presence of Missing Background Data. An Application to Assess the Impact of Childbearing on Wellbeing Estimating and Using Propensity Score in Presence of Missing Background Data. An Application to Assess the Impact of Childbearing on Wellbeing Alessandra Mattei Dipartimento di Statistica G. Parenti Università

More information

Basics of Modern Missing Data Analysis

Basics of Modern Missing Data Analysis Basics of Modern Missing Data Analysis Kyle M. Lang Center for Research Methods and Data Analysis University of Kansas March 8, 2013 Topics to be Covered An introduction to the missing data problem Missing

More information

of being selected and varying such probability across strata under optimal allocation leads to increased accuracy.

of being selected and varying such probability across strata under optimal allocation leads to increased accuracy. 5 Sampling with Unequal Probabilities Simple random sampling and systematic sampling are schemes where every unit in the population has the same chance of being selected We will now consider unequal probability

More information

MN 400: Research Methods. CHAPTER 7 Sample Design

MN 400: Research Methods. CHAPTER 7 Sample Design MN 400: Research Methods CHAPTER 7 Sample Design 1 Some fundamental terminology Population the entire group of objects about which information is wanted Unit, object any individual member of the population

More information

ICES training Course on Design and Analysis of Statistically Sound Catch Sampling Programmes

ICES training Course on Design and Analysis of Statistically Sound Catch Sampling Programmes ICES training Course on Design and Analysis of Statistically Sound Catch Sampling Programmes Sara-Jane Moore www.marine.ie General Statistics - backed up by case studies General Introduction to sampling

More information

New estimation methodology for the Norwegian Labour Force Survey

New estimation methodology for the Norwegian Labour Force Survey Notater Documents 2018/16 Melike Oguz-Alper New estimation methodology for the Norwegian Labour Force Survey Documents 2018/16 Melike Oguz Alper New estimation methodology for the Norwegian Labour Force

More information

SAS/STAT 14.2 User s Guide. Introduction to Survey Sampling and Analysis Procedures

SAS/STAT 14.2 User s Guide. Introduction to Survey Sampling and Analysis Procedures SAS/STAT 14.2 User s Guide Introduction to Survey Sampling and Analysis Procedures This document is an individual chapter from SAS/STAT 14.2 User s Guide. The correct bibliographic citation for this manual

More information

This module is part of the. Memobust Handbook. on Methodology of Modern Business Statistics

This module is part of the. Memobust Handbook. on Methodology of Modern Business Statistics This module is part of the Memobust Handbook on Methodology of Modern Business Statistics 26 March 2014 Method: Sample Co-ordination Using Simple Random Sampling with Permanent Random Numbers Contents

More information

Estimation of Parameters and Variance

Estimation of Parameters and Variance Estimation of Parameters and Variance Dr. A.C. Kulshreshtha U.N. Statistical Institute for Asia and the Pacific (SIAP) Second RAP Regional Workshop on Building Training Resources for Improving Agricultural

More information

CHOOSING THE RIGHT SAMPLING TECHNIQUE FOR YOUR RESEARCH. Awanis Ku Ishak, PhD SBM

CHOOSING THE RIGHT SAMPLING TECHNIQUE FOR YOUR RESEARCH. Awanis Ku Ishak, PhD SBM CHOOSING THE RIGHT SAMPLING TECHNIQUE FOR YOUR RESEARCH Awanis Ku Ishak, PhD SBM Sampling The process of selecting a number of individuals for a study in such a way that the individuals represent the larger

More information

Streamlining Missing Data Analysis by Aggregating Multiple Imputations at the Data Level

Streamlining Missing Data Analysis by Aggregating Multiple Imputations at the Data Level Streamlining Missing Data Analysis by Aggregating Multiple Imputations at the Data Level A Monte Carlo Simulation to Test the Tenability of the SuperMatrix Approach Kyle M Lang Quantitative Psychology

More information

NONLINEAR CALIBRATION. 1 Introduction. 2 Calibrated estimator of total. Abstract

NONLINEAR CALIBRATION. 1 Introduction. 2 Calibrated estimator of total.   Abstract NONLINEAR CALIBRATION 1 Alesandras Pliusas 1 Statistics Lithuania, Institute of Mathematics and Informatics, Lithuania e-mail: Pliusas@tl.mii.lt Abstract The definition of a calibrated estimator of the

More information

You are allowed 3? sheets of notes and a calculator.

You are allowed 3? sheets of notes and a calculator. Exam 1 is Wed Sept You are allowed 3? sheets of notes and a calculator The exam covers survey sampling umbers refer to types of problems on exam A population is the entire set of (potential) measurements

More information

A comparison of weighted estimators for the population mean. Ye Yang Weighting in surveys group

A comparison of weighted estimators for the population mean. Ye Yang Weighting in surveys group A comparison of weighted estimators for the population mean Ye Yang Weighting in surveys group Motivation Survey sample in which auxiliary variables are known for the population and an outcome variable

More information

Now we will define some common sampling plans and discuss their strengths and limitations.

Now we will define some common sampling plans and discuss their strengths and limitations. Now we will define some common sampling plans and discuss their strengths and limitations. 1 For volunteer samples individuals are self selected. Participants decide to include themselves in the study.

More information

Lecture 5: Sampling Methods

Lecture 5: Sampling Methods Lecture 5: Sampling Methods What is sampling? Is the process of selecting part of a larger group of participants with the intent of generalizing the results from the smaller group, called the sample, to

More information

Sampling and Estimation in Agricultural Surveys

Sampling and Estimation in Agricultural Surveys GS Training and Outreach Workshop on Agricultural Surveys Training Seminar: Sampling and Estimation in Cristiano Ferraz 24 October 2016 Download a free copy of the Handbook at: http://gsars.org/wp-content/uploads/2016/02/msf-010216-web.pdf

More information

Stochastic calculus for summable processes 1

Stochastic calculus for summable processes 1 Stochastic calculus for summable processes 1 Lecture I Definition 1. Statistics is the science of collecting, organizing, summarizing and analyzing the information in order to draw conclusions. It is a

More information

Discussing Effects of Different MAR-Settings

Discussing Effects of Different MAR-Settings Discussing Effects of Different MAR-Settings Research Seminar, Department of Statistics, LMU Munich Munich, 11.07.2014 Matthias Speidel Jörg Drechsler Joseph Sakshaug Outline What we basically want to

More information

Optimization Problems

Optimization Problems Optimization Problems The goal in an optimization problem is to find the point at which the minimum (or maximum) of a real, scalar function f occurs and, usually, to find the value of the function at that

More information

EMERGING MARKETS - Lecture 2: Methodology refresher

EMERGING MARKETS - Lecture 2: Methodology refresher EMERGING MARKETS - Lecture 2: Methodology refresher Maria Perrotta April 4, 2013 SITE http://www.hhs.se/site/pages/default.aspx My contact: maria.perrotta@hhs.se Aim of this class There are many different

More information

Implications of Ignoring the Uncertainty in Control Totals for Generalized Regression Estimators. Calibration Estimators

Implications of Ignoring the Uncertainty in Control Totals for Generalized Regression Estimators. Calibration Estimators Implications of Ignoring the Uncertainty in Control Totals for Generalized Regression Estimators Jill A. Dever, RTI Richard Valliant, JPSM & ISR is a trade name of Research Triangle Institute. www.rti.org

More information