Session 2.1: Terminology, Concepts and Definitions

Similar documents
Jakarta, Indonesia,29 Sep-10 October 2014.

Introduction to Survey Data Analysis

Survey of Smoking Behavior. Samples and Elements. Survey of Smoking Behavior. Samples and Elements

Sampling and Estimation in Agricultural Surveys

VALIDATING A SURVEY ESTIMATE - A COMPARISON OF THE GUYANA RURAL FARM HOUSEHOLD SURVEY AND INDEPENDENT RICE DATA

1. Regressions and Regression Models. 2. Model Example. EEP/IAS Introductory Applied Econometrics Fall Erin Kelley Section Handout 1

Error Analysis of Sampling Frame in Sample Survey*

Taking into account sampling design in DAD. Population SAMPLING DESIGN AND DAD

Lesson 6 Population & Sampling

A Review of Concept of Peri-urban Area & Its Identification

Part 7: Glossary Overview

Q.1 Define Population Ans In statistical investigation the interest usually lies in the assessment of the general magnitude and the study of

Special Topic: Bayesian Finite Population Survey Sampling

Intermediate Econometrics

The Use of Random Geographic Cluster Sampling to Survey Pastoralists. Kristen Himelein, World Bank Addis Ababa, Ethiopia January 23, 2013

Formalizing the Concepts: Simple Random Sampling. Juan Muñoz Kristen Himelein March 2012

S. LEIVANG, MD. H. ALI AND S. SAGOLSEM

Figure Figure

Formalizing the Concepts: Simple Random Sampling. Juan Muñoz Kristen Himelein March 2013

Sampling Distributions

MATH 1150 Chapter 2 Notation and Terminology

CHOOSING THE RIGHT SAMPLING TECHNIQUE FOR YOUR RESEARCH. Awanis Ku Ishak, PhD SBM

Introduction to Sample Survey

Application of Statistical Analysis in Population and Sampling Population

Statistical Inference, Populations and Samples

MN 400: Research Methods. CHAPTER 7 Sample Design

Estimation of Parameters and Variance

Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur

Conducting Fieldwork and Survey Design

SAMPLING BIOS 662. Michael G. Hudgens, Ph.D. mhudgens :55. BIOS Sampling

APPLICATION OF DEFINITION AND METHODS IN MEXICO

Survey of Smoking Behavior. Survey of Smoking Behavior. Survey of Smoking Behavior

STATISTICS INDEX NUMBER

4.4-Multiplication Rule: Basics

POPULATION AND SAMPLE

Sampling (Statistics)

Apéndice 1: Figuras y Tablas del Marco Teórico

1-1. Chapter 1. Sampling and Descriptive Statistics by The McGraw-Hill Companies, Inc. All rights reserved.

Notes 3: Statistical Inference: Sampling, Sampling Distributions Confidence Intervals, and Hypothesis Testing

Sampling from Finite Populations Jill M. Montaquila and Graham Kalton Westat 1600 Research Blvd., Rockville, MD 20850, U.S.A.

Regression analysis an example in quantitative methods

Summary and Statistical Report of the 2007 Population and Housing Census Results 1

Input Costs Trends for Arkansas Field Crops, AG -1291

CHAPTER 3 PROBABILITY: EVENTS AND PROBABILITIES

ILO Assessment Report: Community Based Emergency Employment Nabulini, Manu and Naibita Village.

Chisoni Mumba. Presentation made at the Zambia Science Conference 2017-Reseachers Symposium, th November 2017, AVANI, Livingstone, Zambia

Business Analytics and Data Mining Modeling Using R Prof. Gaurav Dixit Department of Management Studies Indian Institute of Technology, Roorkee

UNIVERSITY OF TORONTO MISSISSAUGA. SOC222 Measuring Society In-Class Test. November 11, 2011 Duration 11:15a.m. 13 :00p.m.

Lesson 6: Accuracy Assessment

EC969: Introduction to Survey Methodology

Review of the Normal Distribution

GEOGRAPHY 204: STATISTICAL PROBLEM SOLVING IN GEOGRAPHY

The World Bank. Key Dates. Project Development Objectives. Components. Overall Ratings. Public Disclosure Authorized

HYPERGEOMETRIC and NEGATIVE HYPERGEOMETIC DISTRIBUTIONS

An Internet-Based Integrated Resource Management System (IRMS)

Geographic Boundaries of Population Census of Japan 1

Total llo7. '1'1 999 'IOo'l. J 2113 '155 72o o61 'I

Agile Mind Mathematics 8 Scope and Sequence, Texas Essential Knowledge and Skills for Mathematics

Uganda - National Panel Survey

Applied Statistics in Business & Economics, 5 th edition

Introduction to Survey Data Integration

Essentials of Statistics and Probability

Tribhuvan University Institute of Science and Technology 2065

Prediction of RPO by Model Tree. Saleh Shahinfar Department of Dairy Science University of Wisconsin-Madison JAM2013

Rockefeller College University at Albany

Globally Estimating the Population Characteristics of Small Geographic Areas. Tom Fitzwater

LUCAS: A possible scheme for a master sampling frame. J. Gallego, MARS AGRI4CAST

Sample size and Sampling strategy

More on Roy Model of Self-Selection

SEASONAL AGRICULTURE SURVEY (SAS) The Overview of the Multiple Frame Sample Survey in Rwanda

Disaggregation according to geographic location the need and the challenges

CHE Chemical Engineering Operations

Stochastic calculus for summable processes 1

Week 1: Conceptual Framework of Microeconomic theory (Malivaud, September Chapter 6, 20151) / Mathemat 1 / 25. (Jehle and Reny, Chapter A1)

Representativeness. Sampling and. Department of Government London School of Economics and Political Science

The National Spatial Strategy

Sampling. Module II Chapter 3

The Thirteen Colonies

2. Linear regression with multiple regressors

Sampling. General introduction to sampling methods in epidemiology and some applications to food microbiology study October Hanoi

ANALYSIS OF SURVEY DATA USING SPSS

How to Develop Master Sampling Frames using Dot Sampling Method and Google Earth

0 0'0 2S ~~ Employment category

Probability and Statistics

SYLLOGISM CHAPTER 13.1 INTRODUCTION

2) There should be uncertainty as to which outcome will occur before the procedure takes place.

Describing distributions with numbers

SYLLOGISM CHAPTER 13.1 INTRODUCTION

Data Collection: What Is Sampling?

Nonlinear Regression. Summary. Sample StatFolio: nonlinear reg.sgp

New Mexico Public Education Department. 7th Grade Social Studies End-of-Course (EoC) Exam

COMPILATION OF ENVIRONMENTALLY-RELATED QUESTIONS IN CENSUSES AND SURVEY, AND SPECIALIZED ENVIRONMENTAL SURVEYS

IOP Conference Series: Earth and Environmental Science. Related content PAPER OPEN ACCESS

ROLE OF SINDH AGRICULTURE UNIVERSITY TANDOJAM IN AGRICULTURE INFORMATION SYSTEM

Fourth Grade Social Studies Crosswalk

ADePT Software platform for Automated Economic Analysis

Probability and Conditional Probability

Statistical inference


Higher Education FY Omnibus Appropriations, S.F. 943 Conf. Report 2017 Regular Session (dollars in 000s)

Transcription:

Second Regional Training Course on Sampling Methods for Producing Core Data Items for Agricultural and Rural Statistics Module 2: Review of Basics of Sampling Methods Session 2.1: Terminology, Concepts and Definitions 9 20 November 2015, Jakarta, Indonesia Contents Probability Sampling in Statistical Surveys Sampling unit and Sampling Frame Characteristics quantitative variables and categorical variable Important Definitions: Population parameter, statistic, estimator & estimate, sampling with and without replacement, Simple Random Sampling (SRS) Sampling error, Sampling variance and Sampling Distribution Unbiasedness, Efficiency and Consistency 2

Probability Sampling in Statistical Surveys 3 Probability Sampling Statistical Survey Design Issues involved 1. Determining survey objectives and data requirements 2. The population of interest or the target population 3. Reference period; Geographic and demographic boundaries 4. Sampling frame and sampling unit 5. Sample design 6. Selection of the sample (at different stages) 7. Survey management and field procedures 8. Data collection 9. Summary and analysis of the data 10. Dissemination 4

Probability Sampling Sample A subset of the population on which observations are taken for the purpose of obtaining information about the population. By studying a sample we hope to draw valid conclusions about the population. Thus, a sample should desirably be representative of the target population. 5 Probability Sampling Sample Design - two broad kinds Sample design specifies how to select the part of the population, i.e. sample, to be surveyed. Two broad kinds: Probability sampling or Random sampling: In which each element of the population is assigned a nonzero chance of being included in the sample. [focus of this course] Non probability sampling: Consists of a variety of procedures, including judgmentbased and purposive choice of elements considered representative of the population. 6

Probability Sampling Use of Random Numbers for Sampling (1) Consider a population whose population size is 1000. [Number of units in a population is called Population size] In the simplest kind of probability sampling, each unit is assigned equal chance of selection. In this case, the chance, i.e. the selection probability, of all the units is 1/1000. Selection probability of a unit is the probability of selecting it at the first draw, when the sample is required to contain more than one unit. A sample usually consists of more than 1 unit. The number of unit a sample should consist of is called its sample size. 7 Probability Sampling Use of Random Numbers for Sampling (2) Before selection, all the units in the population are first given a running serial number from 1 to 1000. This serial number is called Sampling Serial Number. Forselectingaunit,arandomnumberbetween1and1000is drawn. The unit having the sampling serial number same as the random number is selected for the sample. This ensures that every unit in the population has a non zero chance of selection and that all their selection probability is the same. We will later see how units from a population can be drawn with unequal probability. 8

Probability Sampling A Few Questions Which of the following are random or probability samples? A. From a class of 50 students, a sample consisting of those with roll numbers that are multiple of 7? B. From the same class, a sample of those with roll numbers same as first seven random numbers drawn by you? C. In a ranch with 35 grazing cows, a sample of the first five cows that you come across as you walk in? D. A ladle full of rice being cooked (in the traditional way) in a pot, after having stirred its contents well? E. A random sample of holdings from a randomly selected sample of villages? 9 Probability Sampling With and Without Replacement Sampling (1) With Replacement Sampling: in which, at each draw, the member of the population selected for the sample is returned to the population (i.e. replaced) before the next draw is made. In these, a unit may appear more than once (repeated) in a sample. Commonly not used in practice. Without Replacement: in which, once a unit is selected in one draw is excluded from the population for subsequent draws. In these, a unit can at most appear once in a sample. 10

Probability Sampling With and Without Replacement Sampling (2) Conceptually, with replacement sampling schemes are same as without replacement sampling from a notional population that has infinite number of each unit of the original population. Thus, with replacement schemes are notionally sampling from infinite population, while without replacement schemes are sampling from finite population. 11 Probability Sampling Simple Random Sampling (SRS) SRS is the simplest method of probability sampling Special type of equal probability selection method (epsem) SRSWR and SRSWOR SRSWR rarely used in practice for large scale surveys Theoretical basis for other sample designs 12

Probability Sampling Probability Sampling in Statistical Surveys Statistical surveys like those conducted government agencies are always conducted to obtain information about a population, such as Resident (human) population / households All villages Livestock population All agricultural holdings All fruit bearing mango trees etc. Note that the population in a sample surveys is always finite. 13 Sampling Unit and Sampling Frame 14

Sampling Unit and Sampling Frame Target population and Sampling frame Target population: The population intended to be studied or covered in the survey; also known as coverage universe. Sampling frame: A list of units (possibly with other information on each unit) from which selection of sample is made. Note that it is desirable that the sampling frame be exactly the same as the target population, but it is not always so. 15 Sampling Unit and Sampling Frame Sampling Unit The units constituting the sampling frame are called sampling units. These are entities that are selected in the sampling process. Each sampling unit in the frame should be distinguishable and have a unique identification number. Examples of sampling units: Individual in the sampling frame for a Labor force survey (LFS) Cattles in a livestock survey Holdings in agricultural surveys Wewilllatersee,thatavillagecanbesamplingunitinan Agricultural Survey with a multi stage sampling design. 16

Definitions Sampling Unit and Unit of Observation A sampling unit may be different from unit of observation. When we select a sample of households for LFS, target units of observation are persons living in the household. select a sample of households (target unit of observation) by selecting a sample of villages first, then selecting a sample of households within the sampled villages. 17 Characteristics quantitative variables and categorical variable 18

Definitions Characteristic (1) Different kinds of information on the elements of the population (or populations) are collected in a survey. Each of these items of information is called a characteristic. Each of the characteristics have different possible values for different individual units. Observations on several characteristics of the units are collected in a survey. 19 Definitions Characteristic (2) A characteristic can be a quantitative variable like age of a person income of a household number of cattle on a farm area of land under rice crop in an agricultural holding value of output of a manufacturing unit or an attribute or categorical variable like sexofapersonaged15+ employment status of a person economic activity code of a production unit whether agricultural produced sold 20

Definitions Characteristic (3) For a quantitative variable, we can talk of total and mean. For a categorical variable, the total and averages are NOT meaningful. For a categorical variable, what we are interested in is proportion of a category in the population or Percentage distribution of different categories in the population. 21 Important Definitions: Population parameter, statistic, estimator & estimate With & without replacement and SRS 22

Definitions Population Parameter and Sample Statistic A population parameter is a numerical summary of a population Any numerical measure computed from a subset of the population (typically a sample) is a sample statistic. Population Parameter Sample 23 Statistic (x ) Definitions Population Parameter A population parameter is a summary measure of a population, the value of which helps to describe a population. For example: Total population Average size of agricultural holdings Literacy rate 24

An Example the values of X and Y shown in the table below are the actual values (not known to the sampler). Milk Producers # milch animals (X) Milk output (Y) average yield (R) A 2 65 32.5 B 7 71 10.1 C 21 265 12.6 D 12 113 9.4 E 18 166 9.2 F 3 35 11.7 Average = 10.5 119 11.3 25 In this example Samples sample values of X sample values of Y sample ratio estimate of 1 st unit 2 nd unit 1 st unit 2 nd unit mean ( x ) 1 st unit 2 nd unit mean R C D 32 6 19 475 72 273.5 14.4 A B 6 7 6.5 65 71 68 10.5 Estimates Estimators Sample mean Sample ratio 26

Definitions Estimator An estimator is a sample statistic a function of sample observations that gives information about an unknown population parameter. Examples: sample mean is an estimator of the population mean. Sample proportion is an estimator of population proportion Sample ratio is an estimator of population ratio. 27 Definitions Estimate An estimate is an indication of the value of an unknown quantity based on observed data. More formally, an estimate is the particular value of an estimator that is obtained from a particular sample of data and used to indicate the value of a parameter. 28

Sampling error, Sampling variance and Sampling Distribution 29 Sampling Error Sampling Error The error in a sample estimate that owes to the selection of only a subset (sample) of the total population rather than the entire population. Sampling error represents the difference between the estimate and the value of the population parameter. All sample estimates are subject to sampling error. The most commonly used measure of sampling error is sampling variance. 30

Sampling Error Sampling Variance (1) Sampling variance is a measure of sampling error. Sampling error reflects the difference between an estimate derived from a sample and the "true value. It can be measured from the population values, if they are known. (But these are unknown otherwise there would be no need for a survey). 31 Sampling Error Sampling Variance (2) Definition: The average of squares of (value of the survey estimator obtained from a sample minus the value of the population parameter) over all possible samples that can be drawn from the population. The variance of an estimator contains information regarding how close the estimator is to the population parameter. Estimates from a survey may have different sampling variance. 32 32

Sampling Error Sampling Distribution A frequency distribution of the values of an estimator for each sample that can possibly be drawn from the population. 6 5 4 3 2 1 0 Sampling Distribution 0 3 3 7 7 11 11 15 15 19 19 23 23 27 27 31 31 35 35 39 39 43 Sample mean 33 33 Sampling Error Sampling Variance determining factors Sampling error (variance) is affected by a number of factors: variability within the population. sample size sampling fraction sample design, and If sampling principles are applied carefully within the constraints of available resources, sampling error can be accurately measured and kept to a minimum. 34

Unbiasedness, Efficiency and Consistency 35 Estimators Desirable Qualities Desirable Qualities of an Estimator Unbiasedness Consistency Efficiency 36

Estimators Desirable Qualities Unbiased Estimator An unbiased estimator of a population parameter is an estimator whose expected value is equal to that parameter. In the example, Sample means of average number of milch animals and average output are unbiased estimators of the respective population parameters. But, sample yield rate (which is a ratio) is not an unbiased estimator of the corresponding population parameter. 37 Estimators Desirable Qualities Consistency An estimator is said to be consistent if the difference between the estimator and the parameter grows smaller as the sample size grows larger. Sample ratio (in the example) is not unbiased but is a consistent estimator. 38

Estimators Desirable Qualities Efficiency Efficiency is defined as the reciprocal of sampling variance. If there are two unbiased estimators of a parameter, the one whose variance is smaller is said to be relatively efficient. 39 Thanks 40