EMOS 2015 Spring School Sampling I + II

Size: px
Start display at page:

Download "EMOS 2015 Spring School Sampling I + II"

Transcription

1 EMOS 2015 Spring School Sampling I + II quad Trier, 24 th March 2015 Ralf Münnich 1 (113) EMOS 2015 Spring School Sampling I + II

2 1. Introduction to Survey Sampling Trier, 24 th March 2015 Ralf Münnich 2 (113) EMOS 2015 Spring School Sampling I + II

3 1.1 Surveys and Samples 1.2 Examples for important surveys 1.3 Quality and error of surveys 1.4 Example: quality of statistical tables 1.5 R packages in survey statistics Business of statistics One major aim (business) of statistics, especially for official statistics, is gathering and provision of data for administrative, political, sociological, economic, and other research purposes. This information is obtained in universes (or samples) of units (persons, households, establishments, machinery etc.). The data are gained in total (complete inventory) or via pre-specified principles partially as samples (randomly, quota). The following shall be considered: Primary or secondary statistics (e.g. register data) Problems of adequacy Errors of different kinds Data protection (anonymisation) Trier, 24 th March 2015 Ralf Münnich 3 (113) EMOS 2015 Spring School Sampling I + II

4 1.1 Surveys and Samples 1.2 Examples for important surveys 1.3 Quality and error of surveys 1.4 Example: quality of statistical tables 1.5 R packages in survey statistics Surveys and Samples Definition of a survey A survey, in general, is the methodology for gathering information via samples on persons, households, or other units. In a survey planning and development, pretest, survey design, implementation, data gathering, editing and processing, as well as analysis play a vital role. Trier, 24 th March 2015 Ralf Münnich 4 (113) EMOS 2015 Spring School Sampling I + II

5 1.1 Surveys and Samples 1.2 Examples for important surveys 1.3 Quality and error of surveys 1.4 Example: quality of statistical tables 1.5 R packages in survey statistics Surveys and Samples Definition of a survey A survey, in general, is the methodology for gathering information via samples on persons, households, or other units. In a survey planning and development, pretest, survey design, implementation, data gathering, editing and processing, as well as analysis play a vital role. Trier, 24 th March 2015 Ralf Münnich 4 (113) EMOS 2015 Spring School Sampling I + II

6 1.1 Surveys and Samples 1.2 Examples for important surveys 1.3 Quality and error of surveys 1.4 Example: quality of statistical tables 1.5 R packages in survey statistics Areas of applications in survey statistics Economics statistics and empirical economic research Social statistics and empirical social research Demography and socio-demographic research Market and opinion research Quality control Medical statistics and biometry Meteorology Environmental statistiscs Forestry and environmental control Traffic control Inventory based on samples Natural science research and measurement Trier, 24 th March 2015 Ralf Münnich 5 (113) EMOS 2015 Spring School Sampling I + II

7 Data gathering in surveys Different kinds of data 1.1 Surveys and Samples 1.2 Examples for important surveys 1.3 Quality and error of surveys 1.4 Example: quality of statistical tables 1.5 R packages in survey statistics Cross sectional or longitudinal data (point of time, period) Random samples Systematic samples Time series Panel data (cross sectional and longitudinal) Data acquisition (Computer assisted) interviewing Telephone interviewing Use of register data Snowball sampling Quota sampling Trier, 24 th March 2015 Ralf Münnich 6 (113) EMOS 2015 Spring School Sampling I + II

8 Examples for important surveys 1.1 Surveys and Samples 1.2 Examples for important surveys 1.3 Quality and error of surveys 1.4 Example: quality of statistical tables 1.5 R packages in survey statistics EU-SILC: Statistics on Income and Living Conditions in Europe. http: //ec.europa.eu/eurostat/web/microdata/european_ union_statistics_on_income_and_living_conditions). ESS: European Social Survey The European Social Survey (the ESS) is an academically-driven social survey designed to chart and explain the interaction between Europe s changing institutions and the attitudes, beliefs and behaviour patterns of its diverse populations. Now in its third round, the survey covers over 20 nations and employs the most rigorous methodologies. It is funded via the European Commission s 5th and 6th Framework Programmes, the European Science Foundation and national funding bodies in each country. Source: Trier, 24 th March 2015 Ralf Münnich 7 (113) EMOS 2015 Spring School Sampling I + II

9 The German Microcensus (MZ) 1.1 Surveys and Samples 1.2 Examples for important surveys 1.3 Quality and error of surveys 1.4 Example: quality of statistical tables 1.5 R packages in survey statistics Survey of households via interviewer Structure of the population ILO unemployment and labour participation Special aspects (changing) 1% sample since 1957 Reform 1990 Microcensus law from 17. January 1996 Programme of questions, mandatory survey, max. 4 years Since 2005: short-term sampling Microcensus as rotational panel Microcensus law 2005: Trier, 24 th March 2015 Ralf Münnich 8 (113) EMOS 2015 Spring School Sampling I + II

10 Arrangement of statistical units 1.1 Surveys and Samples 1.2 Examples for important surveys 1.3 Quality and error of surveys 1.4 Example: quality of statistical tables 1.5 R packages in survey statistics 214 regional classes Federal states, districts Regional classes: min to inh. 5 house size classes Census 1987 and statistics on building activity 1 Small buildings: 1 to 4 dwellings 2 Mid size buildings: 5 to 10 dwellings 3 Large buildings: minimum 11 dwellings 4 Institutions: dw. = 0 or indiv. (dw. + 4) 4 6 New buildings Selection domain sequential arrangement 1 Approx. 12 dwellings ( 10 13); maximal 70 inhabitants 2 One building each dwellings; floor-wise selection 4 Initial letter of last name; approx. 15 inhabitants 6 Selection according 1 4 depending on the type of new building Trier, 24 th March 2015 Ralf Münnich 9 (113) EMOS 2015 Spring School Sampling I + II

11 1.1 Surveys and Samples 1.2 Examples for important surveys 1.3 Quality and error of surveys 1.4 Example: quality of statistical tables 1.5 R packages in survey statistics Sampling design of the Microcensus 4 -stage design Strata strata strata clusters 1% sample One selection domain from each zone Approx. 1% of individuals and households Semi-systematic random selection 4 zones block (rotation quarters) Replacement of 1 (4) rotation quarters each year Note: The dropped rotation quarter yields the basis for acquiring households for the access panel (EU-SILC and ICT). Trier, 24 th March 2015 Ralf Münnich 10 (113) EMOS 2015 Spring School Sampling I + II

12 Data quality: Eurostat definition 1.1 Surveys and Samples 1.2 Examples for important surveys 1.3 Quality and error of surveys 1.4 Example: quality of statistical tables 1.5 R packages in survey statistics Relevance of the statistical concept: End-user, user needs, hierarchical structure and contents Accuracy and reliability: Sampling errors: standard error, CI coverage Non-sampling errors: nonresponse, coverage error, measurement errors Timeliness and punctuality: Time and duration from data acquisition until publication Coherence and comparability: Preliminary and final statistics, annual and intermediate statistics (regions, domains, time) Accessibility and clarity: Publication of data, analysis and method reports Completeness (not part of the Code of Practice) Source: european-statistics-code-of-practice Trier, 24 th March 2015 Ralf Münnich 11 (113) EMOS 2015 Spring School Sampling I + II

13 1.1 Surveys and Samples 1.2 Examples for important surveys 1.3 Quality and error of surveys 1.4 Example: quality of statistical tables 1.5 R packages in survey statistics Errors and types of errors in surveys Frame error Interviewer error Processing error Coding error Nonresponse (NR) Unit-Nonresponse Item-Nonresponse Sampling error Trier, 24 th March 2015 Ralf Münnich 12 (113) EMOS 2015 Spring School Sampling I + II

14 1.1 Surveys and Samples 1.2 Examples for important surveys 1.3 Quality and error of surveys 1.4 Example: quality of statistical tables 1.5 R packages in survey statistics Evaluation of samples and surveys Main principles in survey sampling Practicability Costs of a survey Accuracy of results Standard errors Confidence interval coverage Disparity of subpopulations Robustness of results In order to adequately evaluate the estimates from samples, appropriate evaluation criteria have to be considered. Trier, 24 th March 2015 Ralf Münnich 13 (113) EMOS 2015 Spring School Sampling I + II

15 1.1 Surveys and Samples 1.2 Examples for important surveys 1.3 Quality and error of surveys 1.4 Example: quality of statistical tables 1.5 R packages in survey statistics Basic principles in statistics for surveys In addition to the analysis from survey data, we need to investigate the quality of the results. The aim is to generalize results from the sample to the universe. inferential statistics Sampling function and estimator Properties of estimators unbiasedness efficiency normality large sample properties (limit theorems) small sample properties Test procedures based on survey data Details can be drawn from Schaich and Münnich (2001), Chapter 4 or 6, Mittelhammer (2013), or others. Trier, 24 th March 2015 Ralf Münnich 14 (113) EMOS 2015 Spring School Sampling I + II

16 Unemployment in Saarland 1.1 Surveys and Samples 1.2 Examples for important surveys 1.3 Quality and error of surveys 1.4 Example: quality of statistical tables 1.5 R packages in survey statistics Unemployed Women τ Men τ τ True values in Saarland Estimates from the Microcensus Is the quality of the cell estimates identical? Trier, 24 th March 2015 Ralf Münnich 15 (113) EMOS 2015 Spring School Sampling I + II

17 Unemployment in Saarland 1.1 Surveys and Samples 1.2 Examples for important surveys 1.3 Quality and error of surveys 1.4 Example: quality of statistical tables 1.5 R packages in survey statistics Unemployed Women τ E τ Men τ E τ τ E τ True values in Saarland Estimates from the Microcensus Is the quality of the cell estimates identical? Trier, 24 th March 2015 Ralf Münnich 15 (113) EMOS 2015 Spring School Sampling I + II

18 Unemployment in Saarland 1.1 Surveys and Samples 1.2 Examples for important surveys 1.3 Quality and error of surveys 1.4 Example: quality of statistical tables 1.5 R packages in survey statistics Unemployed Women τ E τ Men τ E τ τ E τ True values in Saarland Estimates from the Microcensus Is the quality of the cell estimates identical? Trier, 24 th March 2015 Ralf Münnich 15 (113) EMOS 2015 Spring School Sampling I + II

19 Unemployment in Saarland 1.1 Surveys and Samples 1.2 Examples for important surveys 1.3 Quality and error of surveys 1.4 Example: quality of statistical tables 1.5 R packages in survey statistics Unemployed Women τ E τ Men τ E τ τ E τ True values in Saarland Estimates from the Microcensus Is the quality of the cell estimates identical? Trier, 24 th March 2015 Ralf Münnich 15 (113) EMOS 2015 Spring School Sampling I + II

20 Unemployed women, Raking estimator 1.1 Surveys and Samples 1.2 Examples for important surveys 1.3 Quality and error of surveys 1.4 Example: quality of statistical tables 1.5 R packages in survey statistics variance estimator NR rates: 1: 5%, 2: 10%, 3: 25%, 4: 40% 95% 90% Trier, 24 th March 2015 Ralf Münnich 16 (113) EMOS 2015 Spring School Sampling I + II

21 1.1 Surveys and Samples 1.2 Examples for important surveys 1.3 Quality and error of surveys 1.4 Example: quality of statistical tables 1.5 R packages in survey statistics Unemployed women, 25 44, distribution of point and variance estimator (25% NR) Dichte 0 e+00 1 e 04 2 e 04 3 e 04 4 e 04 Dichte 0 e+00 1 e 06 2 e 06 3 e 06 4 e Erwerbslose Varianz Trier, 24 th March 2015 Ralf Münnich 17 (113) EMOS 2015 Spring School Sampling I + II

22 Unemployed women, 65 + Raking estimator 1.1 Surveys and Samples 1.2 Examples for important surveys 1.3 Quality and error of surveys 1.4 Example: quality of statistical tables 1.5 R packages in survey statistics variance estimator NR rates: 1: 5%, 2: 10%, 3: 25%, 4: 40% 95% 90% Trier, 24 th March 2015 Ralf Münnich 18 (113) EMOS 2015 Spring School Sampling I + II

23 1.1 Surveys and Samples 1.2 Examples for important surveys 1.3 Quality and error of surveys 1.4 Example: quality of statistical tables 1.5 R packages in survey statistics Unemployed women, 65 +, distribution of point and variance estimator (25% NR) Dichte Dichte Erwerbslose Varianz Trier, 24 th March 2015 Ralf Münnich 19 (113) EMOS 2015 Spring School Sampling I + II

24 R packages in survey statistics 1.1 Surveys and Samples 1.2 Examples for important surveys 1.3 Quality and error of surveys 1.4 Example: quality of statistical tables 1.5 R packages in survey statistics survey Thomas Lumley R package for survey statistics focusing on estimation and accuracy measurement in complex surveys. Resampling methods and calibration are also considered. ReGenesees ISTAT http: // sampling Yves Tillé The package is focusing on sampling algorithms which help to draw samples from complex designs. Some estimators are also implemented. others Several packages exist on resampling methods (boot). Further emphasis is put on multiple imputation (several packages) and visualizing missing values (vim). Trier, 24 th March 2015 Ralf Münnich 20 (113) EMOS 2015 Spring School Sampling I + II

25 Definitions and notations 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling Universe (U) Sample (S) U = {1,..., N} S = {U 1,..., U n } Characteristic of interest Y y i Auxiliary variable X x i (i-th unit) Parameters of the universe Estimates from the sample Total τ τ Proportion θ θ Mean µ µ Ratio ψ = τ 1 τ 2 ψ General parameter π π A distinction between random variables and their outcomes follows from the context. Trier, 24 th March 2015 Ralf Münnich 21 (113) EMOS 2015 Spring School Sampling I + II

26 Population parameters 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling Total τ Y = Y = N y i i=1 Mean µ Y = Y = 1 N y i N i=1 Proportion θ Y = P Y = 1 N y i N i=1 Variance σy 2 = 1 N ( yi Y ) 2 N i=1, where y i {0; 1} i Variance (dichotomous variable) σ 2 Y = P Y (1 P Y ) Trier, 24 th March 2015 Ralf Münnich 22 (113) EMOS 2015 Spring School Sampling I + II

27 Sample values 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling Total y = n Mean y = 1 n y i i=1 n y i i=1 Proportion p Y = 1 n Variance s 2 y = 1 n 1 n y i i=1 n ( yi y ) 2 i=1, where y i {0; 1} i Variance (dichotomous variable) s 2 y = p Y (1 p Y ) Trier, 24 th March 2015 Ralf Münnich 23 (113) EMOS 2015 Spring School Sampling I + II

28 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling Sample selection scheme A sample S consists of elements from the universe U = {1, 2,..., N}: Sampling without replacement: The sample is a subset of the universe. Sampling with replacement: The sample consists of a index set with indices 1,..., N, in which elements may appear several times. Definition 1.1 Let U be a finite population with N elements. A sample S is generated with the help of a sample selection scheme that selects elements from the universe. The set of all possible samples with respect to the sample selection scheme is denoted by S. The probability of drawing a pre-specified sample i is P(S i ). Trier, 24 th March 2015 Ralf Münnich 24 (113) EMOS 2015 Spring School Sampling I + II

29 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling Example 2.1 (cf. Lohr, 1999, S. 25) A universe U consists of N = 4 elements. Sampling with replacement with a sample size n = 2 yields: S 1 = {1, 2} S 2 = {1, 3} S 3 = {1, 4} S 4 = {2, 3} S 5 = {2, 4} S 6 = {3, 4} 1. All samples are drawn with equal probability. 2. The probabilities for the samples are: P(S 1 ) = 1/3, P(S 2 ) = 1/6, P(S 6 ) = 1/2, P(S 3 ) = P(S 4 ) = P(S 5 ) = The probabilities for the samples are: P(S 1 ) = 1/3, P(S 2 ) = 1/6, P(S 4 ) = 1/2, P(S 3 ) = P(S 5 ) = P(S 6 ) = 0. What is the sample selection probability of a pre-specified element from the universe? How do the different probabilities affect the inclusion probabilities? Trier, 24 th March 2015 Ralf Münnich 25 (113) EMOS 2015 Spring School Sampling I + II

30 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling First order inclusion probabilities The probability π i for an element i from the universe to be included in a sample is called inclusion probability: S π i = I(i S i ) P(S i ). i=1 The statistic π to estimate the parameter π can be assessed from the set of possible samples S (which may be tedious in practice). We get the (design-based) expected value E( π) from S E( π) = π(s i ) P(S i ) i=1 The same procedure yields the (design-based) variance and MSE. Designs with unequal probabilities. Trier, 24 th March 2015 Ralf Münnich 26 (113) EMOS 2015 Spring School Sampling I + II

31 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling Examples for selection schemes SRS Simple random sampling with replacement SRSWOR Simple random sampling without replacement BERN Bernoulli-Sampling (each element of the universe will be selected with probability θ) SYS Systematic sampling One element c from 1,..., a N is drawn. Consequently the elements i a + c will be selected. Note: 1. SRS and SRSWOR yield fixed sample sizes n 2. BERN yields a random sample size with E(n) = θ N. Trier, 24 th March 2015 Ralf Münnich 27 (113) EMOS 2015 Spring School Sampling I + II

32 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling Examples for selection schemes SRS Simple random sampling with replacement SRSWOR Simple random sampling without replacement BERN Bernoulli-Sampling (each element of the universe will be selected with probability θ) SYS Systematic sampling One element c from 1,..., a N is drawn. Consequently the elements i a + c will be selected. Note: 1. SRS and SRSWOR yield fixed sample sizes n 2. BERN yields a random sample size with E(n) = θ N. Trier, 24 th March 2015 Ralf Münnich 27 (113) EMOS 2015 Spring School Sampling I + II

33 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling Simple random sampling Definition of a random sample A random process in which successively n elements are selected from a finite universe with N elements (n < N) is called random sample selection. The result from a random sample selection is called random sample. Simple random sample Drawing samples from an urn with or without replacement is called simple random sampling. The outcome is a simple random sample. Trier, 24 th March 2015 Ralf Münnich 28 (113) EMOS 2015 Spring School Sampling I + II

34 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling With or without replacement sampling N n many different simple random samples with replacement can be drawn. Each of them is drawn with identical probability. Each element can be drawn 0 to n times. The draws are stochastically independent. N!/(N n)! many different simple random samples without replacement can be drawn. Each of them is drawn with identical probability. Each element can be drawn 0 or 1 times. The draws are stochastically dependent. The first order inclusion probabilities are π i = n N for with and without replacement. Trier, 24 th March 2015 Ralf Münnich 29 (113) EMOS 2015 Spring School Sampling I + II

35 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling The sample mean µ for simple random sampling 1. E( µ) = µ (WR/WOR) 2. V( µ) = σ2 n σ2 (WR) and V( µ) = n N n N 1 3. If U N(µ; σ 2 ) (WR): µ N(µ; σ2 n ) 4. If U is normal (WR): µ µ S t n 1 distributed. n mit S 2 = 1 n 1 5. Without loss of generality for large n n; µ µ µ µ σ σ N n N 1 n (N n large); µ µ S are approximately standard normal. (WOR) resp. n (Y i Y ) 2 is i=1 n; µ µ S N n N 1 n Trier, 24 th March 2015 Ralf Münnich 30 (113) EMOS 2015 Spring School Sampling I + II

36 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling The estimator µ is unbiased for µ (WR/WOR) consistent for µ (only WR is relevant) is BLUE for µ is ML estimator for µ if U is normally distributed (WR) Similar results are derived for totals τ = N µ via the corresponding estimator τ = N µ. For SRSWOR, the random variables are slightly negatively correlated: σ2 CovX i ; X j = N 1 Trier, 24 th March 2015 Ralf Münnich 31 (113) EMOS 2015 Spring School Sampling I + II

37 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling Confidence intervals for the mean (WR) CI for µ; U normally distributed; σ 2 known: [ µ z(1 α/2) CI for µ; U normally distributed: [ µ t(1 α/2; n 1) σ n ; µ + z(1 α/2) ] σ n s n ; µ + t(1 α/2; n 1) CI for µ; σ 2 known; n > 30 (CLT): [ ] σ 2 σ 2 µ z(1 α/2) ; µ + z(1 α/2) n n CI for µ; n > 30 (CLT): [ µ z(1 α/2) s ; µ + z(1 α/2) s ] n n ] s n Trier, 24 th March 2015 Ralf Münnich 32 (113) EMOS 2015 Spring School Sampling I + II

38 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling Confidence interval for the mean (WOR) CI for µ; σ 2 known; n > 50; sample fraction not close to 1 (CLT): [ σ 2 N n µ z(1 α/2) n N 1 ; µ + z(1 α/2) σ 2 n ] N n N 1 CI for µ; n > 50; sample fraction not close to 1 (CLT): [ µ z(1 α/2) ] s N n n N ; µ + z(1 α/2) s N n n N Confidence intervals for variances are rarely considered in practice and, hence, are omitted here. Trier, 24 th March 2015 Ralf Münnich 33 (113) EMOS 2015 Spring School Sampling I + II

39 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling Necessary sample size I CI estimation for µ with confidence level (1 α); e is given; WR; no distributional assumptions are made. e is the half CI length for central CIs; this leads to e = z(1 α/2) σ n. With given prior information ˆσ for σ we get and finally n = z(1 α/2) ˆσ e, n (z(1 α/2)) 2 ˆσ2 e 2. Trier, 24 th March 2015 Ralf Münnich 34 (113) EMOS 2015 Spring School Sampling I + II

40 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling Necessary sample size II As before, but now considering WOR sampling: Analogously we have e = z(1 α/2) With (N 1 N) we get σ N n n N 1. and finally n = (z(1 α/2)) 2 ˆσ2 e 2 N n N n(1 + 1 N (z(1 α/2))2 ˆσ2 e 2 ) (z(1 α/2))2 ˆσ2 e 2, n (z(1 α/2))2 N ˆσ 2 (z(1 α/2)) 2 ˆσ 2 + Ne 2. Trier, 24 th March 2015 Ralf Münnich 35 (113) EMOS 2015 Spring School Sampling I + II

41 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling Necessary sample size III Again, with prior information θ (WR) we get θ(1 e z(1 α/2) θ) n, and hence n z(1 α/2) θ(1 θ) n (z(1 α/2)) 2 θ(1 θ) e 2. e WOR analogously. Trier, 24 th March 2015 Ralf Münnich 36 (113) EMOS 2015 Spring School Sampling I + II

42 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling Stratified random sampling The universe U is decomposed into L pairwise disjoint non-empty subpopulations (primary sampling units) G 1,..., G L : G q1 G q2 = for all q 1, q 2 = 1,..., L; q 1 q 2 and L G q = G q=1 In stratified random sampling the subpopulations are called strata. For the qth stratum we get: µ q = 1 Nq N q i=1 Nq y qi mean of stratum q σq 2 = 1 (y qi µ q ) 2 variance of stratum q N q i=1 Sampling from stratified populations is called stratified random sampling (StrRS). Trier, 24 th March 2015 Ralf Münnich 37 (113) EMOS 2015 Spring School Sampling I + II

43 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling For the universe U holds (γ q := N q /N): 1. Introduction to Survey Sampling µ = θ = σ 2 = σ 2 e = 1 L = 1 L L γ q µ q ; τ = N µ = q=1 L γ q θ q q=1 L γ q σq 2 + q=1 L N q µ q q=1 L γ q (µ q µ) 2 = σw 2 + σb 2 q=1 L (N q µ q Nµ L )2 = 1 L q=1 L Nqµ 2 2 q ( 1 L q=1 L Nqµ 2 2 q N2 µ 2 q=1 L N q µ q ) 2. q=1 Trier, 24 th March 2015 Ralf Münnich 38 (113) EMOS 2015 Spring School Sampling I + II L 2

44 Parameters in the sample Sample allocation 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling Given a decomposition of the universe U into L primary sampling units, the breakdown of the total sample size n into L stratum-specific sample sizes n 1,..., n q,..., n L with q n q = n is called allocation. For the qth primary sampling unit we get ȳ q = 1 n q n q i=1 s 2 q = 1 n q 1 y qi n q stratum mean (y qi ȳ q ) 2 stratum variance i=1 Trier, 24 th March 2015 Ralf Münnich 39 (113) EMOS 2015 Spring School Sampling I + II

45 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling Estimating means, totals, and proportions in StrRS Sample mean: µ StrRS = L γ q µ q = q=1 L q=1 γ q 1 n q n q i=1 y iq! = µ StrRSWOR The estimates do not differ from WR and WOR in StrRS. Lemma 2.1 The estimator µ StrRS is unbiased for µ (WR and WOR). The variance of the estimator µ is: V( µ StrRS ) = V( µ StrRSWOR ) = L q=1 L q=1 γ 2 q σ2 q n q γq 2 σ2 q Nq n q n q N q 1 (WR) (WOR) Trier, 24 th March 2015 Ralf Münnich 40 (113) EMOS 2015 Spring School Sampling I + II

46 Since E(S 2 q) = σ 2 q, we get V( µ StrRS ) = s 2 µ = 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling L q=1 γ 2 q s2 q n q as an unbiased estimate for V( µ StrRS ). Since E(Sq) 2 = N N 1 σ2 q, V( µ StrRSWOR ) = s 2 µ = L q=1 γ 2 q s2 q n q Nq n q N q (WR) (WOR) is an unbiased estimate for V( µ StrRSWOR ). In case of large stratum-specific sample sizes n q, the estimator µ StrRS is approximately normal (CLT). This yields the (1 α) 100% - CI for µ: [ µ StrRS z(1 α/2) V( µ StrRS ); µ StrRS + z(1 α/2) ] V( µ StrRS ) Trier, 24 th March 2015 Ralf Münnich 41 (113) EMOS 2015 Spring School Sampling I + II

47 Example Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling A universe is split into 4 strata with stratum sizes 10000, 40000, and Drawing a stratified random sample yields the following table: Stratum n q ȳ q s q a) An unbiased estimate for µ is derived as µ StrRS = γ q ȳ q = Trier, 24 th March 2015 Ralf Münnich 42 (113) EMOS 2015 Spring School Sampling I + II

48 Example 2.2 (continued) b) In sampling WR we get 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling V( µ StrRS ) = and thus L q=1 γ 2 q s2 q n q = V( µ StrRS ) 0,3791. The 95%-CI for µ is then: [ ; ] = [55.09; 56.59] In the case of small sample fractions n h /N h (h = 1,..., L), sampling WOR yields (approximately) the same figures. Trier, 24 th March 2015 Ralf Münnich 43 (113) EMOS 2015 Spring School Sampling I + II

49 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling Example 2.3 (cf. Example 2.2) The sample results remain now out of consideration. We assume known true stratum-specific variances σ 2 1 = 100, σ2 2 = 55, σ2 3 = 40 and σ 2 4 = 150. Calculate V(µ StrRS) under WR a) using the allocation given in example 2.2 and b) with stratum-specific sample sizes n 1 = 60, n 2 = 180, n 3 = 110, and n 4 = 150. Further, comment on the results. In a) we get V( µ StrRS ) = and for b) V( µ StrRS ) = Trier, 24 th March 2015 Ralf Münnich 44 (113) EMOS 2015 Spring School Sampling I + II

50 2.4.1 Equal allocation 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling n q = n L q = 1,..., L µ StrRS,eq = 1 L L q=1 ȳ q V( µ StrRS, eq ) = L n V( µ StrRS, eq ) = 1 n L γq 2 σq 2 q=1 L q=1 γ 2 q σ 2 q NqL n N q 1 (WR) (WOR) Trier, 24 th March 2015 Ralf Münnich 45 (113) EMOS 2015 Spring School Sampling I + II

51 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling Proportional allocation (Bowley) n q = γ q n µ StrRS, prop = 1 L n q ȳ q n V( µ StrRS, prop ) = 1 n V( µ StrRSWOR, prop ) = 1 n = q=1 L γ q σq 2 q=1 L q=1 ( γ q σq 2 1 n ) N q N N q 1 ( 1 n ) 1 N n q = 1,..., L (WR) (WOR) L γ q σq 2 (N q N q 1) q=1 Trier, 24 th March 2015 Ralf Münnich 46 (113) EMOS 2015 Spring School Sampling I + II

52 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling Optimal allocation (Neyman-Tschuprov) n q = n n q = n γ q σ q = n L γ h σ h h=1 N q σ q L N h σ h h=1 γ q θq (1 θ q ) L γ h θh (1 θ h ) h=1 V( µ StrRS, opt ) = 1 ( L ) 2 γ q σ q n V( µ StrRS, opt ) = 1 n q=1 ( L q=1 γ q σ q ) 2 1 N L γ q σq 2 q=1 q = 1,..., L (U is dichotomous) (WR) (WOR) Trier, 24 th March 2015 Ralf Münnich 47 (113) EMOS 2015 Spring School Sampling I + II

53 Example Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling N q σq Given a total sample size of n = 500, we calculate the Neyman-Tschuprov allocation via L γ q σ q = q=1 as n 1 = 60, n 2 = 179, n 3 = 114 and n 4 = 147. The variance estimate µ results in (WR) V( µ StrRS, opt ) = This variance is smaller than the variance in Example 2.3! Trier, 24 th March 2015 Ralf Münnich 48 (113) EMOS 2015 Spring School Sampling I + II

54 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling Problems in deriving the optimal allocation In sampling WOR one should consider the WOR correction: n q = n N q σ q L h=1 Nq N q 1 N h σ h Nh N h 1 In sampling WOR, an overallocation may result. In practice, the corresponding stratum is sampled completely and the remaining sample size will be reallocated optimally. Theory: box-constrained optimal allocation (later). Theoretically, the optimal allocation is an integer-optimization problem under constraints. Rounding may result in small errors since the sum of the rounded sample sizes is not equal to the total sample size. Trier, 24 th March 2015 Ralf Münnich 49 (113) EMOS 2015 Spring School Sampling I + II

55 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling Cost-optimal allocation (Yates-Zacopanay) n q = γ q σ q / cq (C C 0 ) L γ h σ h ch h=1 q = 1,..., L n q = γ q θq (1 θ q )/ cq L γ h θh (1 θ h ) c h h=1 (U is dichotomous) 1 ( L ) 2 V( µ StrRS, YZ ) = γ q σ q cq C C 0 q=1 1 ( L ) 2 1 V( µ StrRS, YZ ) = γ q σ q cq C C 0 N q=1 L γ q σq 2 q=1 (WR) Trier, 24 th March 2015 Ralf Münnich 50 (113) EMOS 2015 Spring School Sampling I + II (WOR)

56 Example 2.5 q N q σq 2 C q We calculate 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling A universe is split into L = 4 strata according to the table alongside. Calculate the cost-optimal allocation given C 0 = 2000 and C = n 1 = = n 2 = n 3 = n 4 = The total sample size results in n = Trier, 24 th March 2015 Ralf Münnich 51 (113) EMOS 2015 Spring School Sampling I + II

57 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling Problems in deriving the cost-optimal allocation Again, an overallocation may result in case of sampling WOR. As in the case of the optimal allocation, we have a integer-valued optimization under constraints. Rounding may lead to inadequate sample sizes that are (slightly) too expensive. Trier, 24 th March 2015 Ralf Münnich 52 (113) EMOS 2015 Spring School Sampling I + II

58 Accuracy comparisons I 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling V( µ SRS ) = σ2 n = 1 n (σ w 2 + σb 2 ) 1 = n = V( µ StrRS, prop ) + 1 n σ2 b ( L ) γ q σq 2 + σb 2 q=1 = V( µ SRS ) V( µ StrRS, prop ) After multiplying out we get L q=1 ( γ q σ q L ) 2 γ ι σ ι = ι=1 L γ q σq 2 q=1 }{{} n V( µ StrRS, prop ) and finally V( µ StrRS, prop ) V( µ StrRS, opt ) ( L ) 2 γ q σ q q=1 }{{} n V( µ StrRS, opt ) Trier, 24 th March 2015 Ralf Münnich 53 (113) EMOS 2015 Spring School Sampling I + II

59 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling Accuracy comparisons II Given the sample size n the following holds: Note: V( µ StrRS, opt ) V( µ StrRS, prop ) V( µ SRS ) 1. The more homogeneous strata are, the higher is the gain in efficiency by using StrRS instead of SRS. This results from σ 2 w (variance within) being considerably small in contrast to σ 2 b (variance between). This is called the effect of stratification. 2. The more heterogeneous the stratum-specific variances are, the higher is the gain in efficiency while applying the optimal rather than the proportional allocation. Trier, 24 th March 2015 Ralf Münnich 54 (113) EMOS 2015 Spring School Sampling I + II

60 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling Box-constrained optimal allocation Aim: Minimise 2-norm of the RRMSE-vector: RRMSE < > ( τ) 2 = G 2 RRMSE(τ <g> ) 2 subject to: g=1 Lower and upper sampling fractions are given in each stratum Maximum number of sampling units Solution via a specialised box-constraints optimization algorithm Exact: Gabler, Ganninger and Münnich (2012), Metrika Numerical: Münnich, Sachs and Wagner (2012), AStA Remark: Overallocation of optimal allocation may be handled easily! Trier, 24 th March 2015 Ralf Münnich 55 (113) EMOS 2015 Spring School Sampling I + II

61 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling Box-constraints optimization algorithm Idea: The set of strata is split into three groups. The first group L l uses the lower bounds, the second group L u uses the upper bounds, and the third group L R is allocated optimally. Given d h = N h S h, a necessary and sufficient condition of the box-constraints optimization problem are given by d h d h h L > R M h n m h M h h L l L u and d h m h < d h h L R n for all h L u h L l m h L u M h for all h L l. The optimal solution can be drawn while finding the right breakdown constant (r.h.s.) which splits the strata into the three sets Trier, 24with Marchlinear 2015 effort. Ralf Münnich 56 (113) EMOS 2015 Spring School Sampling I + II

62 Stratification 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling Stratification is the unique division of the universe into strata. Two problems arise: 1. The number of strata is given; 2. The stratum boundaries are given. Aim: Gain in efficiency of the estimator Problem: Prior information on universe level Official Statistics Sampling inventory Note: In many cases the number of strata is given in practice. Trier, 24 th March 2015 Ralf Münnich 57 (113) EMOS 2015 Spring School Sampling I + II

63 Stratification 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling Stratification is the unique division of the universe into strata. Two problems arise: 1. The number of strata is given; 2. The stratum boundaries are given. Aim: Gain in efficiency of the estimator Problem: Prior information on universe level Official Statistics Sampling inventory Note: In many cases the number of strata is given in practice. Trier, 24 th March 2015 Ralf Münnich 57 (113) EMOS 2015 Spring School Sampling I + II

64 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling Dalenius stratification Continuous variable of interest Y with density f (y); Number of strata L is given; Variable of interest is used for stratification; Strata are non-overlapping (division of D Y ) Stratification problem: The L 1 inner stratum boundaries ξ q (q = 1,..., L 1) are of interest, such that V ( µ StrRS ; ξ 1,..., ξ L 1 ) min. Under proportional allocation we get: V ( µ ) L ( ) ( ) StrRS ; ξ 1,..., ξ L 1 = γ q ξ1,..., ξ L 1 σ 2 q ξ1,..., ξ L 1 q=1 min ξ 1,...,ξ L 1 (cf. Münnich, 1997, S. 78 ff.). Trier, 24 th March 2015 Ralf Münnich 58 (113) EMOS 2015 Spring School Sampling I + II

65 Approximate solution 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling cum f rule (Dalenius and Hodges, 1957) The stratum boundaries will be set such that the cumulative values f (ξq ξ q 1 ) are approximately equal while applying the optimal allocation. equal aggregate σ rule (Wright, 1983) The stratum boundaries will be set such that i I q σ i = 1 L N σ i i=1 approximately holds. I q is the set of all indices in stratum q and σ i the individual standard deviation which are expected to be monotonous. This model-based approach uses the equal allocation (which is optimal for the variable of interest assuming the existence of an auxiliary variable). Trier, 24 th March 2015 Ralf Münnich 59 (113) EMOS 2015 Spring School Sampling I + II

66 Example 2.5 Class rel. freq. rel. freq. cum. over until Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling Empirical frequencies of the variable of interest (20 classes) with L = 5 strata according to the Dalenius-Hodges approximation. Here, we have / 5 = 873. The stratum boundaries can be allocated at 8,73; 17,64; 26,19; and 34,92. In case the class widthes differ from each other, the computation has to be corrected. Trier, 24 th March 2015 Ralf Münnich 60 (113) EMOS 2015 Spring School Sampling I + II

67 Example Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling 20 units were observed in a survey (Y is the variable of interest): The aim is to build 4 strata. The assumptions σ I y and σ II y 2 are to be considered as true. We get and as aggregate σ. This yields the stratum boundaries 11.43, and as well as 42.27, and respectively. i σ I,i σi,i σ II,i σii,i Note: the information was obtained from extremely few observations (teaching example!). The equal allocation will be applied. Trier, 24 th March 2015 Ralf Münnich 61 (113) EMOS 2015 Spring School Sampling I + II

68 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling Poststratification The stratification is used after drawing a simple random sample (Lohr, 1999) SRS yields approximately equal proportions with respect to a stratification Correction of over- and under-representation of the categories (strata) Variance formulae of the proportional allocation are applied Poststratification is applied when the necessary information for stratification is not available in the sampling frame but from other sources (e.g. register). Example: Nationality (German, non-german) and gender Note: Stressing poststratification with many variables may lead to erroneous results Trier, 24 th March 2015 Ralf Münnich 62 (113) EMOS 2015 Spring School Sampling I + II

69 Example 2.7 A simple random sample yields: 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling n y s 2 W M The proportion of women in the universe is 60%. Hence: SRS µ SRS = ( ) = V ( µ SRS ) = ( ) = PostStr µ PostStr = = V ( µ PostStr ) = L q=1 γ 2 q s2 q n q = The variability of the sample size in the denominator of V ( µ PostStr ) was not considered (may be ignored in practice if not too low)! Trier, 24 th March 2015 Ralf Münnich 63 (113) EMOS 2015 Spring School Sampling I + II

70 Example 2.7 A simple random sample yields: 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling n y s 2 W M The proportion of women in the universe is 60%. Hence: SRS µ SRS = ( ) = V ( µ SRS ) = ( ) = PostStr µ PostStr = = V ( µ PostStr ) = L q=1 γ 2 q s2 q n q = The variability of the sample size in the denominator of V ( µ PostStr ) was not considered (may be ignored in practice if not too low)! Trier, 24 th March 2015 Ralf Münnich 63 (113) EMOS 2015 Spring School Sampling I + II

71 Example 2.7 A simple random sample yields: 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling n y s 2 W M The proportion of women in the universe is 60%. Hence: SRS µ SRS = ( ) = V ( µ SRS ) = ( ) = PostStr µ PostStr = = V ( µ PostStr ) = L q=1 γ 2 q s2 q n q = The variability of the sample size in the denominator of V ( µ PostStr ) was not considered (may be ignored in practice if not too low)! Trier, 24 th March 2015 Ralf Münnich 63 (113) EMOS 2015 Spring School Sampling I + II

72 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling Further remarks on stratified random sampling In general, stratification is set up as regards content (cf. microcensus) Stratification and estimation variable have to be different (cf. Dalenius) Multidimensional allocation (cf. Schaich and Münnich, 1993): Possible conflict of targets while applying the optimal allocation decision problem Multidimensional stratification: rarely of interest Number of strata: Theory: L high ( variance reduction) integer valued solution! Practice: 5-8 strata High sensitivity to data (and estimators) Trier, 24 th March 2015 Ralf Münnich 64 (113) EMOS 2015 Spring School Sampling I + II

73 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling Single stage cluster sampling A universe is split into L primary sampling units (PSU; clusters). The selection of PSUs will follow SRS at the first stage; all secondary sampling units (SSU) in the selected PSUs (indicated by a superscript s) are sampled totally at the second stage. At the first-stage, l PSUs are selected by SRSWOR (single stage cluster sampling: SIC). The total sample size of sampling units is random. µ SIC = L l l q=1 γ sel q µ sel q = L l l q=1 N sel q L N r r=1 N 1 q sel Nq sel i=1 Y iq. is an unbiased estimator for the mean µ in the universe (here: µ sel q = µ sel q ) Trier, 24 th March 2015 Ralf Münnich 65 (113) EMOS 2015 Spring School Sampling I + II

74 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling Variance of SIC Single stage cluster sampling equals random sampling without replacement at the first stage; hence, the observed values can easily be summed up. The resulting survey is equivalent to stratified random sampling. It follows: V( µ SIC ) = L2 N 2 σ2 e l L l L 1 which can be (asymptotically unbiasedly) estimated via where V ( µ ) L 2 SIC = N 2 s2 e l s 2 e = 1 l 1 L l L l ( r=1 N sel r, µ sel r, N µ ) SIC 2. L Trier, 24 th March 2015 Ralf Münnich 66 (113) EMOS 2015 Spring School Sampling I + II

75 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling Estimation of totals under SIC We have τ SIC = N µ SIC and hence V( τ SIC ) = L 2 σ2 e l L l L 1 or V( τsic ) = L 2 s2 e l L l L respectively. Estimation of proportions under SIC Here, we have µ r q = p r q. Then, the relation of interest follows from σ 2 e;θ N 2 = 1 L (γ q θ q θ L L )2. q=1 Trier, 24 th March 2015 Ralf Münnich 67 (113) EMOS 2015 Spring School Sampling I + II

76 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling Accuracy comparisons I In general, we assume (l L): This yields V( µ SRS ) V( µ SIC ) V( µ SIC ) 1 l σ 2 b = 1 l (σ2 σ 2 w) 1. If σ 2 b 0 (and hence σ2 w = σ 2 ), the variance V( µ SIC ) becomes considerably small. SIC is then approximately equivalent to SRSWOR. 2. In contrast, if σ 2 w 0, the variance V( µ SIC ) may become very large. This effect is called cluster effect. Trier, 24 th March 2015 Ralf Münnich 68 (113) EMOS 2015 Spring School Sampling I + II

77 2.1 Notational background 2.2 Simple random sampling 2.3 Stratified random sampling 2.4 Single stage cluster sampling 2.5 Two-stage cluster sampling Accuracy comparisons II Whenever the PSUs are of approximately the same size (N q N), one can show that ( V( µ SIC ) = 1 l L N L q SSTOT = q=1 i=1 ) N SSB L 1 l (Y qi Y q ) 2 +, where L N q (Y q Y ) 2 = SSW + SSB. q=1 This leads to the intraclass correlation coefficient (ICC) and finally to ICC = 1 N N 1 SSW SSTOT V( µ SIC ) = LN 1 N(L 1) (1 + (N 1) ICC ) V( µ SRSWOR ) Trier, 24 th March 2015 Ralf Münnich 69 (113) EMOS 2015 Spring School Sampling I + II

Deriving indicators from representative samples for the ESF

Deriving indicators from representative samples for the ESF Deriving indicators from representative samples for the ESF Brussels, June 17, 2014 Ralf Münnich and Stefan Zins Lisa Borsi and Jan-Philipp Kolb GESIS Mannheim and University of Trier Outline 1 Choosing

More information

Main sampling techniques

Main sampling techniques Main sampling techniques ELSTAT Training Course January 23-24 2017 Martin Chevalier Department of Statistical Methods Insee 1 / 187 Main sampling techniques Outline Sampling theory Simple random sampling

More information

Taking into account sampling design in DAD. Population SAMPLING DESIGN AND DAD

Taking into account sampling design in DAD. Population SAMPLING DESIGN AND DAD Taking into account sampling design in DAD SAMPLING DESIGN AND DAD With version 4.2 and higher of DAD, the Sampling Design (SD) of the database can be specified in order to calculate the correct asymptotic

More information

Sample size and Sampling strategy

Sample size and Sampling strategy Sample size and Sampling strategy Dr. Abdul Sattar Programme Officer, Assessment & Analysis Do you agree that Sample should be a certain proportion of population??? Formula for sample N = p 1 p Z2 C 2

More information

Introduction to Survey Data Analysis

Introduction to Survey Data Analysis Introduction to Survey Data Analysis JULY 2011 Afsaneh Yazdani Preface Learning from Data Four-step process by which we can learn from data: 1. Defining the Problem 2. Collecting the Data 3. Summarizing

More information

Sampling from Finite Populations Jill M. Montaquila and Graham Kalton Westat 1600 Research Blvd., Rockville, MD 20850, U.S.A.

Sampling from Finite Populations Jill M. Montaquila and Graham Kalton Westat 1600 Research Blvd., Rockville, MD 20850, U.S.A. Sampling from Finite Populations Jill M. Montaquila and Graham Kalton Westat 1600 Research Blvd., Rockville, MD 20850, U.S.A. Keywords: Survey sampling, finite populations, simple random sampling, systematic

More information

Stat472/572 Sampling: Theory and Practice Instructor: Yan Lu

Stat472/572 Sampling: Theory and Practice Instructor: Yan Lu Stat472/572 Sampling: Theory and Practice Instructor: Yan Lu 1 Chapter 5 Cluster Sampling with Equal Probability Example: Sampling students in high school. Take a random sample of n classes (The classes

More information

Model Assisted Survey Sampling

Model Assisted Survey Sampling Carl-Erik Sarndal Jan Wretman Bengt Swensson Model Assisted Survey Sampling Springer Preface v PARTI Principles of Estimation for Finite Populations and Important Sampling Designs CHAPTER 1 Survey Sampling

More information

Sampling. Sampling. Sampling. Sampling. Population. Sample. Sampling unit

Sampling. Sampling. Sampling. Sampling. Population. Sample. Sampling unit Defined is the process by which a portion of the of interest is drawn to study Technically, any portion of a is a sample. However, not all samples are good samples. Population Terminology A collection

More information

BOOK REVIEW Sampling: Design and Analysis. Sharon L. Lohr. 2nd Edition, International Publication,

BOOK REVIEW Sampling: Design and Analysis. Sharon L. Lohr. 2nd Edition, International Publication, STATISTICS IN TRANSITION-new series, August 2011 223 STATISTICS IN TRANSITION-new series, August 2011 Vol. 12, No. 1, pp. 223 230 BOOK REVIEW Sampling: Design and Analysis. Sharon L. Lohr. 2nd Edition,

More information

Survey Sample Methods

Survey Sample Methods Survey Sample Methods p. 1/54 Survey Sample Methods Evaluators Toolbox Refreshment Abhik Roy & Kristin Hobson abhik.r.roy@wmich.edu & kristin.a.hobson@wmich.edu Western Michigan University AEA Evaluation

More information

Variance estimation on SILC based indicators

Variance estimation on SILC based indicators Variance estimation on SILC based indicators Emilio Di Meglio Eurostat emilio.di-meglio@ec.europa.eu Guillaume Osier STATEC guillaume.osier@statec.etat.lu 3rd EU-LFS/EU-SILC European User Conference 1

More information

A4. Methodology Annex: Sampling Design (2008) Methodology Annex: Sampling design 1

A4. Methodology Annex: Sampling Design (2008) Methodology Annex: Sampling design 1 A4. Methodology Annex: Sampling Design (2008) Methodology Annex: Sampling design 1 Introduction The evaluation strategy for the One Million Initiative is based on a panel survey. In a programme such as

More information

Chapter 3: Element sampling design: Part 1

Chapter 3: Element sampling design: Part 1 Chapter 3: Element sampling design: Part 1 Jae-Kwang Kim Fall, 2014 Simple random sampling 1 Simple random sampling 2 SRS with replacement 3 Systematic sampling Kim Ch. 3: Element sampling design: Part

More information

Lecture 5: Sampling Methods

Lecture 5: Sampling Methods Lecture 5: Sampling Methods What is sampling? Is the process of selecting part of a larger group of participants with the intent of generalizing the results from the smaller group, called the sample, to

More information

3. When a researcher wants to identify particular types of cases for in-depth investigation; purpose less to generalize to larger population than to g

3. When a researcher wants to identify particular types of cases for in-depth investigation; purpose less to generalize to larger population than to g Chapter 7: Qualitative and Quantitative Sampling Introduction Quantitative researchers more concerned with sampling; primary goal to get a representative sample (smaller set of cases a researcher selects

More information

EC969: Introduction to Survey Methodology

EC969: Introduction to Survey Methodology EC969: Introduction to Survey Methodology Peter Lynn Tues 1 st : Sample Design Wed nd : Non-response & attrition Tues 8 th : Weighting Focus on implications for analysis What is Sampling? Identify the

More information

Development of methodology for the estimate of variance of annual net changes for LFS-based indicators

Development of methodology for the estimate of variance of annual net changes for LFS-based indicators Development of methodology for the estimate of variance of annual net changes for LFS-based indicators Deliverable 1 - Short document with derivation of the methodology (FINAL) Contract number: Subject:

More information

Estimation Techniques in the German Labor Force Survey (LFS)

Estimation Techniques in the German Labor Force Survey (LFS) Estimation Techniques in the German Labor Force Survey (LFS) Dr. Kai Lorentz Federal Statistical Office of Germany Group C1 - Mathematical and Statistical Methods Email: kai.lorentz@destatis.de Federal

More information

Introduction to Survey Sampling

Introduction to Survey Sampling A three day short course sponsored by the Social & Economic Research Institute, Qatar University Introduction to Survey Sampling James M. Lepkowski & Michael Traugott Institute for Social Research University

More information

Stochastic calculus for summable processes 1

Stochastic calculus for summable processes 1 Stochastic calculus for summable processes 1 Lecture I Definition 1. Statistics is the science of collecting, organizing, summarizing and analyzing the information in order to draw conclusions. It is a

More information

Design of 2 nd survey in Cambodia WHO workshop on Repeat prevalence surveys in Asia: design and analysis 8-11 Feb 2012 Cambodia

Design of 2 nd survey in Cambodia WHO workshop on Repeat prevalence surveys in Asia: design and analysis 8-11 Feb 2012 Cambodia Design of 2 nd survey in Cambodia WHO workshop on Repeat prevalence surveys in Asia: design and analysis 8-11 Feb 2012 Cambodia Norio Yamada RIT/JATA Primary Objectives 1. To determine the prevalence of

More information

SAMPLING III BIOS 662

SAMPLING III BIOS 662 SAMPLIG III BIOS 662 Michael G. Hudgens, Ph.D. mhudgens@bios.unc.edu http://www.bios.unc.edu/ mhudgens 2009-08-11 09:52 BIOS 662 1 Sampling III Outline One-stage cluster sampling Systematic sampling Multi-stage

More information

104 Business Research Methods - MCQs

104 Business Research Methods - MCQs 104 Business Research Methods - MCQs 1) Process of obtaining a numerical description of the extent to which a person or object possesses some characteristics a) Measurement b) Scaling c) Questionnaire

More information

SYA 3300 Research Methods and Lab Summer A, 2000

SYA 3300 Research Methods and Lab Summer A, 2000 May 17, 2000 Sampling Why sample? Types of sampling methods Probability Non-probability Sampling distributions Purposes of Today s Class Define generalizability and its relation to different sampling strategies

More information

ICES training Course on Design and Analysis of Statistically Sound Catch Sampling Programmes

ICES training Course on Design and Analysis of Statistically Sound Catch Sampling Programmes ICES training Course on Design and Analysis of Statistically Sound Catch Sampling Programmes Sara-Jane Moore www.marine.ie General Statistics - backed up by case studies General Introduction to sampling

More information

ESTP course on Small Area Estimation

ESTP course on Small Area Estimation ESTP course on Small Area Estimation Statistics Finland, Helsinki, 29 September 2 October 2014 Topic 1: Introduction to small area estimation Risto Lehtonen, University of Helsinki Lecture topics: Monday

More information

CHOOSING THE RIGHT SAMPLING TECHNIQUE FOR YOUR RESEARCH. Awanis Ku Ishak, PhD SBM

CHOOSING THE RIGHT SAMPLING TECHNIQUE FOR YOUR RESEARCH. Awanis Ku Ishak, PhD SBM CHOOSING THE RIGHT SAMPLING TECHNIQUE FOR YOUR RESEARCH Awanis Ku Ishak, PhD SBM Sampling The process of selecting a number of individuals for a study in such a way that the individuals represent the larger

More information

FCE 3900 EDUCATIONAL RESEARCH LECTURE 8 P O P U L A T I O N A N D S A M P L I N G T E C H N I Q U E

FCE 3900 EDUCATIONAL RESEARCH LECTURE 8 P O P U L A T I O N A N D S A M P L I N G T E C H N I Q U E FCE 3900 EDUCATIONAL RESEARCH LECTURE 8 P O P U L A T I O N A N D S A M P L I N G T E C H N I Q U E OBJECTIVE COURSE Understand the concept of population and sampling in the research. Identify the type

More information

POPULATION AND SAMPLE

POPULATION AND SAMPLE 1 POPULATION AND SAMPLE Population. A population refers to any collection of specified group of human beings or of non-human entities such as objects, educational institutions, time units, geographical

More information

Part 7: Glossary Overview

Part 7: Glossary Overview Part 7: Glossary Overview In this Part This Part covers the following topic Topic See Page 7-1-1 Introduction This section provides an alphabetical list of all the terms used in a STEPS surveillance with

More information

Conservative variance estimation for sampling designs with zero pairwise inclusion probabilities

Conservative variance estimation for sampling designs with zero pairwise inclusion probabilities Conservative variance estimation for sampling designs with zero pairwise inclusion probabilities Peter M. Aronow and Cyrus Samii Forthcoming at Survey Methodology Abstract We consider conservative variance

More information

ECON1310 Quantitative Economic and Business Analysis A

ECON1310 Quantitative Economic and Business Analysis A ECON1310 Quantitative Economic and Business Analysis A Topic 1 Descriptive Statistics 1 Main points - Statistics descriptive collecting/presenting data; inferential drawing conclusions from - Data types

More information

Examine characteristics of a sample and make inferences about the population

Examine characteristics of a sample and make inferences about the population Chapter 11 Introduction to Inferential Analysis Learning Objectives Understand inferential statistics Explain the difference between a population and a sample Explain the difference between parameter and

More information

European Regional and Urban Statistics

European Regional and Urban Statistics European Regional and Urban Statistics Dr. Berthold Feldmann berthold.feldmann@ec.europa.eu Eurostat Structure of the talk Regional statistics in the EU The tasks of Eurostat Regional statistics Urban

More information

Lesson 6 Population & Sampling

Lesson 6 Population & Sampling Lesson 6 Population & Sampling Lecturer: Dr. Emmanuel Adjei Department of Information Studies Contact Information: eadjei@ug.edu.gh College of Education School of Continuing and Distance Education 2014/2015

More information

SAMPLING- Method of Psychology. By- Mrs Neelam Rathee, Dept of Psychology. PGGCG-11, Chandigarh.

SAMPLING- Method of Psychology. By- Mrs Neelam Rathee, Dept of Psychology. PGGCG-11, Chandigarh. By- Mrs Neelam Rathee, Dept of 2 Sampling is that part of statistical practice concerned with the selection of a subset of individual observations within a population of individuals intended to yield some

More information

New Developments in Nonresponse Adjustment Methods

New Developments in Nonresponse Adjustment Methods New Developments in Nonresponse Adjustment Methods Fannie Cobben January 23, 2009 1 Introduction In this paper, we describe two relatively new techniques to adjust for (unit) nonresponse bias: The sample

More information

The ESS Sample Design Data File (SDDF)

The ESS Sample Design Data File (SDDF) The ESS Sample Design Data File (SDDF) Documentation Version 1.0 Matthias Ganninger Tel: +49 (0)621 1246 282 E-Mail: matthias.ganninger@gesis.org April 8, 2008 Summary: This document reports on the creation

More information

Conducting Fieldwork and Survey Design

Conducting Fieldwork and Survey Design Conducting Fieldwork and Survey Design (Prepared for Young South Asian Scholars Participating in SARNET Training Programme at IHD) Dr. G. C. Manna Structure of Presentation Advantages of Sample Surveys

More information

Q.1 Define Population Ans In statistical investigation the interest usually lies in the assessment of the general magnitude and the study of

Q.1 Define Population Ans In statistical investigation the interest usually lies in the assessment of the general magnitude and the study of Q.1 Define Population Ans In statistical investigation the interest usually lies in the assessment of the general magnitude and the study of variation with respect to one or more characteristics relating

More information

Estimation of change in a rotation panel design

Estimation of change in a rotation panel design Int. Statistical Inst.: Proc. 58th World Statistical Congress, 2011, Dublin (Session CPS028) p.4520 Estimation of change in a rotation panel design Andersson, Claes Statistics Sweden S-701 89 Örebro, Sweden

More information

Lectures of STA 231: Biostatistics

Lectures of STA 231: Biostatistics Lectures of STA 231: Biostatistics Second Semester Academic Year 2016/2017 Text Book Biostatistics: Basic Concepts and Methodology for the Health Sciences (10 th Edition, 2014) By Wayne W. Daniel Prepared

More information

This module is part of the. Memobust Handbook. on Methodology of Modern Business Statistics

This module is part of the. Memobust Handbook. on Methodology of Modern Business Statistics This module is part of the Memobust Handbook on Methodology of Modern Business Statistics 26 March 2014 Method: Sample Co-ordination Using Simple Random Sampling with Permanent Random Numbers Contents

More information

Multivariate Optimal Allocation with Box-Constraints

Multivariate Optimal Allocation with Box-Constraints Austrian Journal of Statistics February 2018, Volume 47, 33 52. AJS http://www.ajs.or.at/ doi:10.17713/ajs.v47i2.764 Multivariate Optimal Allocation with Box-Constraints Ulf Friedrich Trier University

More information

Monte Carlo Study on the Successive Difference Replication Method for Non-Linear Statistics

Monte Carlo Study on the Successive Difference Replication Method for Non-Linear Statistics Monte Carlo Study on the Successive Difference Replication Method for Non-Linear Statistics Amang S. Sukasih, Mathematica Policy Research, Inc. Donsig Jang, Mathematica Policy Research, Inc. Amang S. Sukasih,

More information

Prepared by: DR. ROZIAH MOHD RASDI Faculty of Educational Studies Universiti Putra Malaysia

Prepared by: DR. ROZIAH MOHD RASDI Faculty of Educational Studies Universiti Putra Malaysia Prepared by: DR. ROZIAH MOHD RASDI Faculty of Educational Studies Universiti Putra Malaysia roziah_m@upm.edu.my Topic 11 Sample Selection Introduction Strength of quantitative research method its ability

More information

Lecturer: Dr. Adote Anum, Dept. of Psychology Contact Information:

Lecturer: Dr. Adote Anum, Dept. of Psychology Contact Information: Lecturer: Dr. Adote Anum, Dept. of Psychology Contact Information: aanum@ug.edu.gh College of Education School of Continuing and Distance Education 2014/2015 2016/2017 Session Overview In this Session

More information

CPT Section D Quantitative Aptitude Chapter 15. Prof. Bharat Koshti

CPT Section D Quantitative Aptitude Chapter 15. Prof. Bharat Koshti CPT Section D Quantitative Aptitude Chapter 15 Prof. Bharat Koshti Different Methods of Sampling 1. Probability Sampling Methods 2. Non-Probabilistic Sampling Methods 3. Mixed Sampling Methods (1) Probability

More information

Advanced Survey Sampling

Advanced Survey Sampling Lecture materials Advanced Survey Sampling Statistical methods for sample surveys Imbi Traat niversity of Tartu 2007 Statistical methods for sample surveys Lecture 1, Imbi Traat 2 1 Introduction Sample

More information

Sampling: What you don t know can hurt you. Juan Muñoz

Sampling: What you don t know can hurt you. Juan Muñoz Sampling: What you don t know can hurt you Juan Muñoz Outline of presentation Basic concepts Scientific Sampling Simple Random Sampling Sampling Errors and Confidence Intervals Sampling error and sample

More information

TYPES OF SAMPLING TECHNIQUES IN PHYSICAL EDUCATION AND SPORTS

TYPES OF SAMPLING TECHNIQUES IN PHYSICAL EDUCATION AND SPORTS TYPES OF SAMPLING TECHNIQUES IN PHYSICAL EDUCATION AND SPORTS Daksh Sharma Assistant Professor of Phy.Edu, SGGS Khalsa Mahilpur ABSTRACT In sports statistics, sampling techniques has a immense importance

More information

TECH 646 Analysis of Research in Industry and Technology

TECH 646 Analysis of Research in Industry and Technology TECH 646 Analysis of Research in Industry and Technology PART III The Sources and Collection of data: Measurement, Measurement Scales, Questionnaires & Instruments, Ch. 14 Lecture note based on the text

More information

Applied Statistics in Business & Economics, 5 th edition

Applied Statistics in Business & Economics, 5 th edition A PowerPoint Presentation Package to Accompany Applied Statistics in Business & Economics, 5 th edition David P. Doane and Lori E. Seward Prepared by Lloyd R. Jaisingh McGraw-Hill/Irwin Copyright 2015

More information

Explain the role of sampling in the research process Distinguish between probability and nonprobability sampling Understand the factors to consider

Explain the role of sampling in the research process Distinguish between probability and nonprobability sampling Understand the factors to consider Visanou Hansana Explain the role of sampling in the research process Distinguish between probability and nonprobability sampling Understand the factors to consider when determining sample size Understand

More information

Introduction to Sample Survey

Introduction to Sample Survey Introduction to Sample Survey Girish Kumar Jha gjha_eco@iari.res.in Indian Agricultural Research Institute, New Delhi-12 INTRODUCTION Statistics is defined as a science which deals with collection, compilation,

More information

Statistics for Managers Using Microsoft Excel 5th Edition

Statistics for Managers Using Microsoft Excel 5th Edition Statistics for Managers Using Microsoft Ecel 5th Edition Chapter 7 Sampling and Statistics for Managers Using Microsoft Ecel, 5e 2008 Pearson Prentice-Hall, Inc. Chap 7-12 Why Sample? Selecting a sample

More information

Sanjay Chaudhuri Department of Statistics and Applied Probability, National University of Singapore

Sanjay Chaudhuri Department of Statistics and Applied Probability, National University of Singapore AN EMPIRICAL LIKELIHOOD BASED ESTIMATOR FOR RESPONDENT DRIVEN SAMPLED DATA Sanjay Chaudhuri Department of Statistics and Applied Probability, National University of Singapore Mark Handcock, Department

More information

Part 3: Inferential Statistics

Part 3: Inferential Statistics - 1 - Part 3: Inferential Statistics Sampling and Sampling Distributions Sampling is widely used in business as a means of gathering information about a population. Reasons for Sampling There are several

More information

Data Integration for Big Data Analysis for finite population inference

Data Integration for Big Data Analysis for finite population inference for Big Data Analysis for finite population inference Jae-kwang Kim ISU January 23, 2018 1 / 36 What is big data? 2 / 36 Data do not speak for themselves Knowledge Reproducibility Information Intepretation

More information

How to Use the Internet for Election Surveys

How to Use the Internet for Election Surveys How to Use the Internet for Election Surveys Simon Jackman and Douglas Rivers Stanford University and Polimetrix, Inc. May 9, 2008 Theory and Practice Practice Theory Works Doesn t work Works Great! Black

More information

MN 400: Research Methods. CHAPTER 7 Sample Design

MN 400: Research Methods. CHAPTER 7 Sample Design MN 400: Research Methods CHAPTER 7 Sample Design 1 Some fundamental terminology Population the entire group of objects about which information is wanted Unit, object any individual member of the population

More information

COMBINING ENUMERATION AREA MAPS AND SATELITE IMAGES (LAND COVER) FOR THE DEVELOPMENT OF AREA FRAME (MULTIPLE FRAMES) IN AN AFRICAN COUNTRY:

COMBINING ENUMERATION AREA MAPS AND SATELITE IMAGES (LAND COVER) FOR THE DEVELOPMENT OF AREA FRAME (MULTIPLE FRAMES) IN AN AFRICAN COUNTRY: COMBINING ENUMERATION AREA MAPS AND SATELITE IMAGES (LAND COVER) FOR THE DEVELOPMENT OF AREA FRAME (MULTIPLE FRAMES) IN AN AFRICAN COUNTRY: PRELIMINARY LESSONS FROM THE EXPERIENCE OF ETHIOPIA BY ABERASH

More information

Day 8: Sampling. Daniel J. Mallinson. School of Public Affairs Penn State Harrisburg PADM-HADM 503

Day 8: Sampling. Daniel J. Mallinson. School of Public Affairs Penn State Harrisburg PADM-HADM 503 Day 8: Sampling Daniel J. Mallinson School of Public Affairs Penn State Harrisburg mallinson@psu.edu PADM-HADM 503 Mallinson Day 8 October 12, 2017 1 / 46 Road map Why Sample? Sampling terminology Probability

More information

Sampling. Jian Pei School of Computing Science Simon Fraser University

Sampling. Jian Pei School of Computing Science Simon Fraser University Sampling Jian Pei School of Computing Science Simon Fraser University jpei@cs.sfu.ca INTRODUCTION J. Pei: Sampling 2 What Is Sampling? Select some part of a population to observe estimate something about

More information

Formalizing the Concepts: Simple Random Sampling. Juan Muñoz Kristen Himelein March 2012

Formalizing the Concepts: Simple Random Sampling. Juan Muñoz Kristen Himelein March 2012 Formalizing the Concepts: Simple Random Sampling Juan Muñoz Kristen Himelein March 2012 Purpose of sampling To study a portion of the population through observations at the level of the units selected,

More information

TECH 646 Analysis of Research in Industry and Technology

TECH 646 Analysis of Research in Industry and Technology TECH 646 Analysis of Research in Industry and Technology PART III The Sources and Collection of data: Measurement, Measurement Scales, Questionnaires & Instruments, Sampling Ch. 14 Sampling Lecture note

More information

Introduction to Survey Data Integration

Introduction to Survey Data Integration Introduction to Survey Data Integration Jae-Kwang Kim Iowa State University May 20, 2014 Outline 1 Introduction 2 Survey Integration Examples 3 Basic Theory for Survey Integration 4 NASS application 5

More information

VARIANCE ESTIMATION FOR NEAREST NEIGHBOR IMPUTATION FOR U.S. CENSUS LONG FORM DATA

VARIANCE ESTIMATION FOR NEAREST NEIGHBOR IMPUTATION FOR U.S. CENSUS LONG FORM DATA Submitted to the Annals of Applied Statistics VARIANCE ESTIMATION FOR NEAREST NEIGHBOR IMPUTATION FOR U.S. CENSUS LONG FORM DATA By Jae Kwang Kim, Wayne A. Fuller and William R. Bell Iowa State University

More information

Labour Supply Responses and the Extensive Margin: The US, UK and France

Labour Supply Responses and the Extensive Margin: The US, UK and France Labour Supply Responses and the Extensive Margin: The US, UK and France Richard Blundell Antoine Bozio Guy Laroque UCL and IFS IFS INSEE-CREST, UCL and IFS January 2011 Blundell, Bozio and Laroque ( )

More information

You are allowed 3? sheets of notes and a calculator.

You are allowed 3? sheets of notes and a calculator. Exam 1 is Wed Sept You are allowed 3? sheets of notes and a calculator The exam covers survey sampling umbers refer to types of problems on exam A population is the entire set of (potential) measurements

More information

STA304H1F/1003HF Summer 2015: Lecture 11

STA304H1F/1003HF Summer 2015: Lecture 11 STA304H1F/1003HF Summer 2015: Lecture 11 You should know... What is one-stage vs two-stage cluster sampling? What are primary and secondary sampling units? What are the two types of estimation in cluster

More information

1 Outline. Introduction: Econometricians assume data is from a simple random survey. This is never the case.

1 Outline. Introduction: Econometricians assume data is from a simple random survey. This is never the case. 24. Strati cation c A. Colin Cameron & Pravin K. Trivedi 2006 These transparencies were prepared in 2002. They can be used as an adjunct to Chapter 24 (sections 24.2-24.4 only) of our subsequent book Microeconometrics:

More information

FINITE MIXTURES OF LOGNORMAL AND GAMMA DISTRIBUTIONS

FINITE MIXTURES OF LOGNORMAL AND GAMMA DISTRIBUTIONS The 7 th International Days of Statistics and Economics, Prague, September 9-, 03 FINITE MIXTURES OF LOGNORMAL AND GAMMA DISTRIBUTIONS Ivana Malá Abstract In the contribution the finite mixtures of distributions

More information

Formalizing the Concepts: Simple Random Sampling. Juan Muñoz Kristen Himelein March 2013

Formalizing the Concepts: Simple Random Sampling. Juan Muñoz Kristen Himelein March 2013 Formalizing the Concepts: Simple Random Sampling Juan Muñoz Kristen Himelein March 2013 Purpose of sampling To study a portion of the population through observations at the level of the units selected,

More information

Sampling and Estimation in Agricultural Surveys

Sampling and Estimation in Agricultural Surveys GS Training and Outreach Workshop on Agricultural Surveys Training Seminar: Sampling and Estimation in Cristiano Ferraz 24 October 2016 Download a free copy of the Handbook at: http://gsars.org/wp-content/uploads/2016/02/msf-010216-web.pdf

More information

REPLICATION VARIANCE ESTIMATION FOR THE NATIONAL RESOURCES INVENTORY

REPLICATION VARIANCE ESTIMATION FOR THE NATIONAL RESOURCES INVENTORY REPLICATION VARIANCE ESTIMATION FOR THE NATIONAL RESOURCES INVENTORY J.D. Opsomer, W.A. Fuller and X. Li Iowa State University, Ames, IA 50011, USA 1. Introduction Replication methods are often used in

More information

Two Measures for Sample Size Determination

Two Measures for Sample Size Determination Survey Research Methods (2011) Vol.5, No.1, pp. 27-37 ISSN 1864-3361 http://www.surveymethods.org European Survey Research Association Two Measures for Sample Sie Determination Philippe Eichenberger Swiss

More information

ANALYSIS OF SURVEY DATA USING SPSS

ANALYSIS OF SURVEY DATA USING SPSS 11 ANALYSIS OF SURVEY DATA USING SPSS U.C. Sud Indian Agricultural Statistics Research Institute, New Delhi-110012 11.1 INTRODUCTION SPSS version 13.0 has many additional features over the version 12.0.

More information

Jakarta, Indonesia,29 Sep-10 October 2014.

Jakarta, Indonesia,29 Sep-10 October 2014. Regional Training Course on Sampling Methods for Producing Core Data Items for Agricultural and Rural Statistics Jakarta, Indonesia,29 Sep-0 October 204. LEARNING OBJECTIVES At the end of this session

More information

Typical information required from the data collection can be grouped into four categories, enumerated as below.

Typical information required from the data collection can be grouped into four categories, enumerated as below. Chapter 6 Data Collection 6.1 Overview The four-stage modeling, an important tool for forecasting future demand and performance of a transportation system, was developed for evaluating large-scale infrastructure

More information

Weighting Missing Data Coding and Data Preparation Wrap-up Preview of Next Time. Data Management

Weighting Missing Data Coding and Data Preparation Wrap-up Preview of Next Time. Data Management Data Management Department of Political Science and Government Aarhus University November 24, 2014 Data Management Weighting Handling missing data Categorizing missing data types Imputation Summary measures

More information

Business Statistics: A First Course

Business Statistics: A First Course Business Statistics: A First Course 5 th Edition Chapter 7 Sampling and Sampling Distributions Basic Business Statistics, 11e 2009 Prentice-Hall, Inc. Chap 7-1 Learning Objectives In this chapter, you

More information

SAMPLE SIZE ESTIMATION FOR MONITORING AND EVALUATION: LECTURE NOTES

SAMPLE SIZE ESTIMATION FOR MONITORING AND EVALUATION: LECTURE NOTES SAMPLE SIZE ESTIMATION FOR MONITORING AND EVALUATION: LECTURE NOTES Joseph George Caldwell, PhD (Statistics) 1432 N Camino Mateo, Tucson, AZ 85745-3311 USA Tel. (001)(520)222-3446, E-mail jcaldwell9@yahoo.com

More information

Data Collection. Lecture Notes in Transportation Systems Engineering. Prof. Tom V. Mathew. 1 Overview 1

Data Collection. Lecture Notes in Transportation Systems Engineering. Prof. Tom V. Mathew. 1 Overview 1 Data Collection Lecture Notes in Transportation Systems Engineering Prof. Tom V. Mathew Contents 1 Overview 1 2 Survey design 2 2.1 Information needed................................. 2 2.2 Study area.....................................

More information

Basics of Modern Missing Data Analysis

Basics of Modern Missing Data Analysis Basics of Modern Missing Data Analysis Kyle M. Lang Center for Research Methods and Data Analysis University of Kansas March 8, 2013 Topics to be Covered An introduction to the missing data problem Missing

More information

Cross-sectional variance estimation for the French Labour Force Survey

Cross-sectional variance estimation for the French Labour Force Survey Survey Research Methods (007 Vol., o., pp. 75-83 ISS 864-336 http://www.surveymethods.org c European Survey Research Association Cross-sectional variance estimation for the French Labour Force Survey Pascal

More information

C. J. Skinner Cross-classified sampling: some estimation theory

C. J. Skinner Cross-classified sampling: some estimation theory C. J. Skinner Cross-classified sampling: some estimation theory Article (Accepted version) (Refereed) Original citation: Skinner, C. J. (205) Cross-classified sampling: some estimation theory. Statistics

More information

Abstract Title Page. Title: Degenerate Power in Multilevel Mediation: The Non-monotonic Relationship Between Power & Effect Size

Abstract Title Page. Title: Degenerate Power in Multilevel Mediation: The Non-monotonic Relationship Between Power & Effect Size Abstract Title Page Title: Degenerate Power in Multilevel Mediation: The Non-monotonic Relationship Between Power & Effect Size Authors and Affiliations: Ben Kelcey University of Cincinnati SREE Spring

More information

SAMPLING BIOS 662. Michael G. Hudgens, Ph.D. mhudgens :55. BIOS Sampling

SAMPLING BIOS 662. Michael G. Hudgens, Ph.D.   mhudgens :55. BIOS Sampling SAMPLIG BIOS 662 Michael G. Hudgens, Ph.D. mhudgens@bios.unc.edu http://www.bios.unc.edu/ mhudgens 2008-11-14 15:55 BIOS 662 1 Sampling Outline Preliminaries Simple random sampling Population mean Population

More information

An Overview of the Pros and Cons of Linearization versus Replication in Establishment Surveys

An Overview of the Pros and Cons of Linearization versus Replication in Establishment Surveys An Overview of the Pros and Cons of Linearization versus Replication in Establishment Surveys Richard Valliant University of Michigan and Joint Program in Survey Methodology University of Maryland 1 Introduction

More information

SURVEY SAMPLING. ijli~iiili~llil~~)~i"lij liilllill THEORY AND METHODS DANKIT K. NASSIUMA

SURVEY SAMPLING. ijli~iiili~llil~~)~ilij liilllill THEORY AND METHODS DANKIT K. NASSIUMA SURVEY SAMPLING THEORY AND METHODS DANKIT K. NASSIUMA ijli~iiili~llil~~)~i"lij liilllill 0501941 9 Table of Contents PREFACE 1 INTRODUCTION 1.1 Overview of researc h methods 1.2 Surveys and sampling 1.3

More information

The R package sampling, a software tool for training in official statistics and survey sampling

The R package sampling, a software tool for training in official statistics and survey sampling The R package sampling, a software tool for training in official statistics and survey sampling Yves Tillé 1 and Alina Matei 2 1 Institute of Statistics, University of Neuchâtel, Switzerland yves.tille@unine.ch

More information

New estimation methodology for the Norwegian Labour Force Survey

New estimation methodology for the Norwegian Labour Force Survey Notater Documents 2018/16 Melike Oguz-Alper New estimation methodology for the Norwegian Labour Force Survey Documents 2018/16 Melike Oguz Alper New estimation methodology for the Norwegian Labour Force

More information

PPS Sampling with Panel Rotation for Estimating Price Indices on Services

PPS Sampling with Panel Rotation for Estimating Price Indices on Services PPS Sampling with Panel Rotation for Estimating Price Indices on Services 2016 18 Sander Scholtus Arnout van Delden Content 1. Introduction 4 2. SPPI estimation using a PPS panel from a fixed population

More information

Microeconometrics. Bernd Süssmuth. IEW Institute for Empirical Research in Economics. University of Leipzig. April 4, 2011

Microeconometrics. Bernd Süssmuth. IEW Institute for Empirical Research in Economics. University of Leipzig. April 4, 2011 Microeconometrics Bernd Süssmuth IEW Institute for Empirical Research in Economics University of Leipzig April 4, 2011 Bernd Süssmuth (University of Leipzig) Microeconometrics April 4, 2011 1 / 22 Organizational

More information

BUSINESS STATISTICS. MBA, Pokhara University. Bijay Lal Pradhan, Ph.D.

BUSINESS STATISTICS. MBA, Pokhara University. Bijay Lal Pradhan, Ph.D. BUSINESS STATISTICS MBA, Pokhara University Bijay Lal Pradhan, Ph.D. WHY BUSINESS STATISTICS Most successful Manager and Decision makers understand the information and know how to use it effectively COURSE

More information

Statistical Education - The Teaching Concept of Pseudo-Populations

Statistical Education - The Teaching Concept of Pseudo-Populations Statistical Education - The Teaching Concept of Pseudo-Populations Andreas Quatember Johannes Kepler University Linz, Austria Department of Applied Statistics, Johannes Kepler University Linz, Altenberger

More information

Use of auxiliary information in the sampling strategy of a European area frame agro-environmental survey

Use of auxiliary information in the sampling strategy of a European area frame agro-environmental survey Use of auxiliary information in the sampling strategy of a European area frame agro-environmental survey Laura Martino 1, Alessandra Palmieri 1 & Javier Gallego 2 (1) European Commission: DG-ESTAT (2)

More information

Selection on Observables: Propensity Score Matching.

Selection on Observables: Propensity Score Matching. Selection on Observables: Propensity Score Matching. Department of Economics and Management Irene Brunetti ireneb@ec.unipi.it 24/10/2017 I. Brunetti Labour Economics in an European Perspective 24/10/2017

More information