Sampling Concepts. IUFRO-SPDC Snowbird, UT September 29 Oct 3, 2014 Drs. Rolfe Leary and John A. Kershaw, Jr.

Similar documents
Experimental Design. IUFRO-SPDC Snowbird, UT September 29 Oct 3, 2014 Drs. Rolfe Leary and John A. Kershaw, Jr.

9/2/2010. Wildlife Management is a very quantitative field of study. throughout this course and throughout your career.

Formalizing the Concepts: Simple Random Sampling. Juan Muñoz Kristen Himelein March 2012

BIOSTATISTICS. Lecture 4 Sampling and Sampling Distribution. dr. Petr Nazarov

Formalizing the Concepts: Simple Random Sampling. Juan Muñoz Kristen Himelein March 2013

Day 8: Sampling. Daniel J. Mallinson. School of Public Affairs Penn State Harrisburg PADM-HADM 503

Unequal Probability Designs

Sampling. Sampling. Sampling. Sampling. Population. Sample. Sampling unit

SYA 3300 Research Methods and Lab Summer A, 2000

Stochastic calculus for summable processes 1

Approach to Field Research Data Generation and Field Logistics Part 1. Road Map 8/26/2016

Section 8.1: Interval Estimation

Module 9: Sampling IPDET. Sampling. Intro Concepts Types Confidence/ Precision? How Large? Intervention or Policy. Evaluation Questions

Last week: Sample, population and sampling distributions finished with estimation & confidence intervals

We would like to describe this population. Central tendency (mean) Variability (standard deviation) ( X X ) 2 N

Lecture 5: Sampling Methods

Business Statistics: A First Course

Why do we Care About Forest Sampling?

Last two weeks: Sample, population and sampling distributions finished with estimation & confidence intervals

Data Mining Chapter 4: Data Analysis and Uncertainty Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University

Part 7: Glossary Overview

Part I. Sampling design. Overview. INFOWO Lecture M6: Sampling design and Experiments. Outline. Sampling design Experiments.

Sampling. Where we re heading: Last time. What is the sample? Next week: Lecture Monday. **Lab Tuesday leaving at 11:00 instead of 1:00** Tomorrow:

Notes 3: Statistical Inference: Sampling, Sampling Distributions Confidence Intervals, and Hypothesis Testing

You are allowed 3? sheets of notes and a calculator.

Survey Sample Methods

BALANCED HALF-SAMPLE REPLICATION WITH AGGREGATION UNITS

7 Estimation. 7.1 Population and Sample (P.91-92)

Survey of Smoking Behavior. Survey of Smoking Behavior. Survey of Smoking Behavior

Sampling Techniques. Esra Akdeniz. February 9th, 2016

Sampling in Space and Time. Natural experiment? Analytical Surveys

Explain the role of sampling in the research process Distinguish between probability and nonprobability sampling Understand the factors to consider

Main sampling techniques

2 and F Distributions. Barrow, Statistics for Economics, Accounting and Business Studies, 4 th edition Pearson Education Limited 2006

The variable θ is called the parameter of the model, and the set Ω is called the parameter space.

Why Sample? Selecting a sample is less time-consuming than selecting every item in the population (census).

SAMPLING- Method of Psychology. By- Mrs Neelam Rathee, Dept of Psychology. PGGCG-11, Chandigarh.

Algebra of Random Variables: Optimal Average and Optimal Scaling Minimising

Model Assisted Survey Sampling

Z score indicates how far a raw score deviates from the sample mean in SD units. score Mean % Lower Bound

Statistics for Managers Using Microsoft Excel 5th Edition

Define characteristic function. State its properties. State and prove inversion theorem.

ENGRG Introduction to GIS

Biostat 2065 Analysis of Incomplete Data

(DMSTT 01) M.Sc. DEGREE EXAMINATION, DECEMBER First Year Statistics Paper I PROBABILITY AND DISTRIBUTION THEORY. Answer any FIVE questions.

Sampling. General introduction to sampling methods in epidemiology and some applications to food microbiology study October Hanoi

Sampling Methods and the Central Limit Theorem GOALS. Why Sample the Population? 9/25/17. Dr. Richard Jerz

AP Statistics Chapter 7 Multiple Choice Test

UNIVERSITY OF TORONTO Faculty of Arts and Science

Empirical Evaluation (Ch 5)

Now we will define some common sampling plans and discuss their strengths and limitations.

Large n normal approximations (Central Limit Theorem). xbar ~ N[mu, sigma 2 / n] (sketch a normal with mean mu and sd = sigma / root(n)).

STA 291 Lecture 16. Normal distributions: ( mean and SD ) use table or web page. The sampling distribution of and are both (approximately) normal

2WB05 Simulation Lecture 7: Output analysis

AP Statistics Cumulative AP Exam Study Guide

Inferential Statistics. Chapter 5

The Random Effects Model Introduction

GROUPED DATA E.G. FOR SAMPLE OF RAW DATA (E.G. 4, 12, 7, 5, MEAN G x / n STANDARD DEVIATION MEDIAN AND QUARTILES STANDARD DEVIATION

Tribhuvan University Institute of Science and Technology 2065

Computer Science, Informatik 4 Communication and Distributed Systems. Simulation. Discrete-Event System Simulation. Dr.

EC969: Introduction to Survey Methodology

Section 9.4. Notation. Requirements. Definition. Inferences About Two Means (Matched Pairs) Examples

How do we compare the relative performance among competing models?

Interpret Standard Deviation. Outlier Rule. Describe the Distribution OR Compare the Distributions. Linear Transformations SOCS. Interpret a z score

LAB 2. HYPOTHESIS TESTING IN THE BIOLOGICAL SCIENCES- Part 2

the logic of parametric tests

POPULATION AND SAMPLE

Statistical Concepts. Distributions of Data

Design of 2 nd survey in Cambodia WHO workshop on Repeat prevalence surveys in Asia: design and analysis 8-11 Feb 2012 Cambodia

Chapter 23: Inferences About Means

Estimation of Parameters and Variance

Chapter 11. Output Analysis for a Single Model Prof. Dr. Mesut Güneş Ch. 11 Output Analysis for a Single Model

FCE 3900 EDUCATIONAL RESEARCH LECTURE 8 P O P U L A T I O N A N D S A M P L I N G T E C H N I Q U E

Practical Statistics for the Analytical Scientist Table of Contents

Stat472/572 Sampling: Theory and Practice Instructor: Yan Lu

Properties of the least squares estimates

Section 9 2B:!! Using Confidence Intervals to Estimate the Difference ( µ 1 µ 2 ) in Two Population Means using Two Independent Samples.

Monte Carlo Study on the Successive Difference Replication Method for Non-Linear Statistics

Instance Selection. Motivation. Sample Selection (1) Sample Selection (2) Sample Selection (3) Sample Size (1)

Analysis of Variance (ANOVA)

Training and Technical Assistance Webinar Series Statistical Analysis for Criminal Justice Research

POST GRADUATE DIPLOMA IN APPLIED STATISTICS (PGDAST) Term-End Examination June, 2016 MST-005 : STATISTICAL TECHNIQUES

TECH 646 Analysis of Research in Industry and Technology

3/9/2015. Overview. Introduction - sampling. Introduction - sampling. Before Sampling

Basic Data Analysis. Stephen Turnbull Business Administration and Public Policy Lecture 5: May 9, Abstract

Propensity Score Methods for Causal Inference

MVE055/MSG Lecture 8

What is sampling? shortcut whole population small part Why sample? not enough; time, energy, money, labour/man power, equipment, access measure

Georgia Kayser, PhD. Module 4 Approaches to Sampling. Hello and Welcome to Monitoring Evaluation and Learning: Approaches to Sampling.

Lecture 3: Inference in SLR

Empirical Likelihood Methods for Sample Survey Data: An Overview

Measurement And Uncertainty

CHOOSING THE RIGHT SAMPLING TECHNIQUE FOR YOUR RESEARCH. Awanis Ku Ishak, PhD SBM

Conservative variance estimation for sampling designs with zero pairwise inclusion probabilities

CPT Section D Quantitative Aptitude Chapter 15. Prof. Bharat Koshti

Four aspects of a sampling strategy necessary to make accurate and precise inferences about populations are:

arxiv: v1 [stat.ap] 7 Aug 2007

Advanced Methods for Agricultural and Agroenvironmental. Emily Berg, Zhengyuan Zhu, Sarah Nusser, and Wayne Fuller

Module 4 Approaches to Sampling. Georgia Kayser, PhD The Water Institute

From the help desk: It s all about the sampling

Transcription:

Sampling Concepts IUFRO-SPDC Snowbird, UT September 9 Oct 3, 014 Drs. Rolfe Leary and John A. Kershaw, Jr.

Sampling Concepts Simple Sampling Strategies: Random Sampling Systematic Sampling Stratified Sampling Blow-up Estimation

Why Sample? $$$$ Sample may actually provide a better estimate than a census Bias Measurement Error

Sampling Implications for Experimental Design Selection of Experimental Units Assignment of Treatments to Experimental Units

Population versus Sample Population The set of individuals we are interested in quantifying Sample The set of individuals, selected according to some rules of probability, that we use to represent the population

Today s Population Nut Population 10 9 8 7 Frequency 6 5 4 3 1 0 1 1.8.6 3.4 4. 5 5.8 6.6 7.4 8. 9 9.8 10.6 11.4 1. 13 13.8 Weight (gm)

Population and Sample Parameters: The Mean Population Mean Sample Mean N i1 X N i x n i1 n X i

Population Mean 4 i1 X i 4 13.0 10.3110.49.54.06.5 4 178.4 4 4.48

Our Population Distribution Nut Population 10 µ = 4.48 9 8 7 Frequency 6 5 4 3 1 0 1 1.8.6 3.4 4. 5 5.8 6.6 7.4 8. 9 9.8 10.6 11.4 1. 13 13.8 Weight (gm)

Population and Sample Parameters: Standard Deviation Population Standard Deviation Sample Standard Deviation N i1 X i N X x s n i1 i n 1

Population Standard Deviation.991 8.956 4 376.010 4 3.994 4.7878 54.0505 80.1366 4 4.48) (.5 4.48) (.06 4.48) (10.31 4.48) (13.0 4 4 1 i 4.48 X i 4 4 1 i μ X i σ

Mean and Standard Deviation Mean measures where the population is located The average value Standard Deviation measures dispersion of individuals about the mean The average distance individuals are from the mean

Our Population Distribution Nut Population Frequency 10 9 8 -σ 7 6 5 4 3 1 0 µ = 4.48 +σ 1 1.8.6 3.4 4. 5 5.8 6.6 7.4 8. 9 9.8 10.6 11.4 1. 13 13.8 Weight (gm)

Elements of a Sample Sample Frame Individual Sample Observations (Individuals selected for quantification) Error (Individuals not selected)

Sample Frame for Random Sampling {1}, {}, {3} {4}, {5}, {6} {7}, {8}, {9} {10}, {11}, {1} {13}, {14}, {15} {16}, {17}, {18} {19}, {0}, {1} {}, {3}, {4} {5}, {6}, {7} {8}, {9}, {30} {31}, {3}, {33} {34}, {35}, {36} {37}, {38}, {39} {40}, {41}, {4}

Probability of Selection under Random Sampling Each individual represents 1/4 nd of the sampling frame (population) Pr(selection) = 1/4 = 0.0381 Five individuals are being selected Pr(sampled) = 5*(1/4) = 0.11905

Example: sample of size 5 Select 5 elements from our sample frame Randomly sort the sample frame: Generate random numbers for sampling unit Rank and select the first 5

Our Sample Observation Nut # Wt 1 9 4.03 41.06 3 4.5 4 5 13. 5 6 7.64

Estimation of Population Mean: The Sample Mean x n i1 n X i Note: Formula assumes equal probability of sampling

Our Sample Mean X 5 X i i1 5 4.03 9.18 5 5.836.06.513. 5 7.64

Estimation of Population Standard Deviation: Sample Standard Deviation 1 n n n 1 i X i n 1 i i X 1 n n 1 i X X i s

Our Sample Standard Deviation 4.6867 1.9655 4 87.861 4 170.945 58.1566 4 5 9.18 58.1566 1 5 5 7.64 13..5.06 4.03 7.64 13..5.06 4.03 1 n n n 1 i X i n 1 i i X s

Population and Sample Nut Population 10 9 8 7 Frequency 6 5 4 3 1 0 1 1.8.6 3.4 4. 5 5.8 6.6 7.4 8. 9 9.8 10.6 11.4 1. 13 13.8 Weight (gm) Single Sample 9 8 7 6 Frequency 5 4 3 1 0 1 1.8.6 3.4 4. 5 5.8 6.6 7.4 8. 9 9.8 10.6 11.4 1. 13 13.8 Nut Weight (gm)

Sampling Error Now we only selected 5 of 4 elements from the population Our estimates have error associated with them If we sample another 5 elements, we most likely will get different answers We need to assess that error

Distribution of Means of size 5 Distribution of Means 9 8 7 Frequency 6 5 4 3 1 0 1 1.8.6 3.4 4. 5 5.8 6.6 7.4 8. 9 9.8 10.6 11.4 1. 13 13.8 Mean Weight (gm)

Standard Error Distribution of means is known (approximately) Estimate sampling error from a single sample: s x Standard Error of Estimate (Mean) s n

Our Sample of Size 5 Uncorrected s x 4.6867 5 4.6867.36.0960 Corrected s x 4.6867 5 45 4.0960 37 4.09600.938 1.967

Confidence Interval C.I. x t s s C.I. x t n x

Using Student s t-table Our probability is 1 - Confidence Degrees of Two-tailed probability of obtaining a large value freedom 0.5 0.4 0.3 0. 0.1 0.05 0.0 0.01 0.001 1 1 1.3764 1.966 3.0777 6.3137 1.706 31.81 63.6559 636.5776 0.8165 1.0607 1.386 1.8856.9 4.307 6.9645 9.95 31.5998 3 0.7649 0.9785 1.498 1.6377.3534 3.184 4.5407 5.8408 1.944 4 0.7407 0.941 1.1896 1.533.1318.7765 3.7469 4.6041 8.6101 5 0.767 0.9195 1.1558 1.4759.015.5706 3.3649 4.031 6.8685 6 0.7176 0.9057 1.134 1.4398 1.943.4469 3.147 3.7074 5.9587 7 0.7111 0.896 1.119 1.4149 1.8946.3646.9979 3.4995 5.4081 8 0.7064 0.8889 1.1081 1.3968 1.8595.306.8965 3.3554 5.0414 9 0.707 0.8834 1.0997 1.383 1.8331.6.814 3.498 4.7809 10 0.6998 0.8791 1.0931 1.37 1.815.81.7638 3.1693 4.5868 11 0.6974 0.8755 1.0877 1.3634 1.7959.01.7181 3.1058 4.4369 1 0.6955 0.876 1.083 1.356 1.783.1788.681 3.0545 4.3178 Our degrees of freedom are n-1 Where have we seen this before?

95% Confidence Interval for our Sample (Uncorrected) n = 5, CI =.95 T((1-.95),5-1) =.776 CI: C.I. 5.836 t.096 5.836.776.096 5.836 5.819 CI: Pr(0.017 11.655) =.95

Some Exploration of Random Sampling Results Repeated Sampling Effects of Sample Size How many samples should I measure? Let s Do Some Exploration

Sample Size s C.I. x t n E n t t E s s n

Sample Size Our sample of size 5 produced a sampling error of 5.819 at 95% confidence Error is almost 100% of the mean Not a very good sample We might want (or our boss might want) a sampling error of.5 grams with 95% confidence

Sample Size When we look at the sample size calculation we are calculating n (left side of equation) But.t depends on n (right side of equation) Must be solved iteratively n t s,n1 E

Sample Size Start with a guess n0 = 10 t(.05,9) =.6 S = 4.6867 N1 = 18 n0 So repeat with n1 Repeat until n(i) = n(i-1).05,17.05,15.6 776 n1 17.98 18.5 t.1098.1098 776 n 15.64 16.5 t.1314.1314 776 n3 15.97 16.5

Elements of a Sample Sample Frame Individual Sample Observations (Individuals selected for quantification) Error (Individuals not selected)

Sample Layout Random Individuals selected independent of each other No spatial or temporal pattern Systematic Individuals selected relative to one another Once first individual selected, all other sample units determined Spatial or temporal pattern

Random Sampling Statistically the most efficient Mean and standard error unbiased with single sample Can be logistically very hard to implement Hard to develop truly random samples

Systematic Sampling samples laid out along a systematic grid logistically the easiest unbiased mean biased standard deviation

Systematic Sample of our population {1}, {}, {3} {4}, {5}, {6} {7}, {8}, {9} {10}, {11}, {1} {13}, {14}, {15} {16}, {17}, {18} {19}, {0}, {1} {}, {3}, {4} {5}, {6}, {7} {8}, {9}, {30} {31}, {3}, {33} {34}, {35}, {36} {37}, {38}, {39} {40}, {41}, {4}

Questions How many independent simple random samples of size 5 are there? How many independent systematic samples of size 5 are there? For the sample illustrated, how many independent choices were made?

Properties of systematic samples Unbiased mean One sample selection (once 1 plot is located, all others are fixed) Sampling units are not independent Need at least two independent selections to calculate unbiased estimate of standard deviation (remember the n-1) s, as estimated from systematic sample, tends to be too large

Why is s too large? How many random samples of size 5? What is range of random samples of size 5? How many systematic samples of size 5? What is range of systematic samples of size 5? So what does systematic sampling do for us?

Systematic Sample of Size 5 Only 9 samples possible 3., 4.416, 3.168, 7.656, 5.476,.69, 3.018, 4.76, 5.88 Clearly not as much variance possible True standard error would be the standard deviation of these 9 means

Let s do some exploration

Stratified Sampling Our nut population is composed of 3 nut types: Walnuts (6) Filberts (15) Peanuts (1) Var(between nut types) >> Var(within nut types) Could use stratified sampling to reduce variability and/or sample size requirements

Stratified Sampling with Proportional Allocation Samples are allocated to each Stratum proportional to some measure of abundance in population Frequency Area Total Weight n j N A j k A j j1

Strata Summaries X j n j X ij i1 n j s s X j X j s n j i1 X j n j X ij X j n j 1

Population Summary XPop k A j X j j1 k A j j1 s k A j j1 A Total s XPop X j

Let s do some exploration

Implications of Sample Design Blow-up Estimation Want to sample 10 nuts from our current population of 4 Use that sample to estimate total for a new population 50 walnuts 300 filberts 000 peanuts real total weight = 6600 g