Sampling

Questions
- Define: sampling, statistical inference, statistical vs. biological population, accuracy, precision, bias, random sampling
- Why do people use sampling techniques in monitoring?
- How do you make an unbiased sample?
- Describe and compare the three types of random sampling: simple, stratified, and systematic
- Why are points or quadrats located on a single transect line not independent sampling units?

Last time
Cover measurements: methods range from objective to subjective
- line intercept
- point intercept
- visual estimates w/ cover classes
- visual estimates

Where we're heading
Tomorrow:
- 11:00 Lecture about riparian areas, riparian sampling methods
- 1:00 Lab @ Rancho San Rafael, Oxbow Park
- wear shoes that can get muddy
Next week:
- Ecological site types & Rangeland health
- Lecture Monday
- **Lab Tuesday leaving at 11:00 instead of 1:00**

What is the sample?
For both intercept and point methods:
- Sampling unit is always the transect
- All that work, and you only get a couple of samples
- Doesn't seem fair, does it?
Sometimes, each quadrat can count as a sample
- Depending on how you set up your sampling, each quadrat can be an independent sample
- Samples must be independent of each other!

Sampling: definition
- The act or process of selecting a part of something with the intent of showing the nature of the whole
- In our case, the whole is a population: the complete set of individuals that we are interested in
- Provides information about a population in such a way that inferences about the total population can be made

Why do you need to sample?
- Population is too big to count everything
- Counting or measuring may require that you destroy individuals (don't want to kill every single last plant out there)
- Sampling can even be more accurate than other methods...

Sampling can be the most accurate method!
- Some measurements are highly error-prone (counting the blades of grass in a football field)
- Dividing the task up into small, representative samples is MUCH more likely to be accurate
  - fewer judgement calls
  - not as likely to lose count

Statistical inference
- Definition: generalizations about the whole based on observations of a few
- These generalizations can never be made with complete certainty
- We can, however, present information about the degree of uncertainty: SD, confidence intervals
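The idea of reporting uncertainty alongside an estimate (SD, confidence intervals) can be sketched in a few lines of Python. This is only an illustration: the cover values are made up, the function name `summarize` is hypothetical, and the interval assumes independent samples that are roughly normally distributed.

```python
import math
import statistics

def summarize(sample, t_crit):
    """Return (mean, sd, ci_low, ci_high) for a list of measurements.

    t_crit is the critical t value for the chosen confidence level and
    n - 1 degrees of freedom, looked up from a t table.
    """
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)   # sample SD (n - 1 in the denominator)
    se = sd / math.sqrt(n)          # standard error of the mean
    half_width = t_crit * se
    return mean, sd, mean - half_width, mean + half_width

# Hypothetical percent-cover values from 5 independent transects.
cover = [22.0, 30.0, 26.0, 35.0, 27.0]

# 2.776 is the two-sided 95% t value for 4 degrees of freedom.
mean, sd, lo, hi = summarize(cover, t_crit=2.776)
print(f"mean = {mean:.1f}%, SD = {sd:.1f}, 95% CI = {lo:.1f}% to {hi:.1f}%")
```

The interval gets narrower as the SD shrinks or as the number of independent samples grows.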
Goals of sampling
1. Accurately estimate the true value of the population
2. Present information about your degree of certainty
3. Minimize time spent while maximizing accuracy and precision

Cost-benefit
- Sampling can reduce the amount of work and cost associated with describing populations
- In all biological sampling, there is a dynamic pull between more effort and more information
- Most of the time, constraints are set by outside forces

Why do they only ask 1000 people who they want to be president, when there are 300,000,000 people in the US?
- Margin of error
- Diminishing returns
- A 5% margin of error means we are confident (by convention, 95% certain) that the true value is within 5 percentage points of the estimate
- Have to poll roughly 8000 more people to shrink the margin of error from 2% to 1%
(Figure: margin of error vs. number of people polled)

Statistical population vs. biological population
- Biological population: the collection of organisms of a particular species living in a given geographical area
- Statistical population: the complete set of individual objects about which you want to make inferences
  - a.k.a. sampling units: individual plants, quadrats, points, or transects
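The diminishing returns in the polling example can be sketched with the usual large-sample formula for the margin of error of a proportion. The assumptions here are not from the slides: a simple random sample, 95% confidence (z = 1.96), the worst case p = 0.5, and a population much larger than the sample. The function names are hypothetical.

```python
import math

Z_95 = 1.96  # z value for 95% confidence (assumed confidence level)

def margin_of_error(n, p=0.5, z=Z_95):
    """Approximate margin of error (as a proportion) for a poll of n people."""
    return z * math.sqrt(p * (1 - p) / n)

def sample_size_for(margin, p=0.5, z=Z_95):
    """Approximate number of people needed to reach a given margin of error."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

# Margin of error shrinks slowly as the poll gets bigger.
for n in (100, 1000, 10000):
    print(f"n = {n:>6}: margin of error = {100 * margin_of_error(n):.1f} points")

# Extra people needed to go from a 2-point to a 1-point margin.
extra = sample_size_for(0.01) - sample_size_for(0.02)
print(f"extra people needed to go from 2 points to 1 point: {extra}")
```

Under these assumptions, 1000 people gives a margin of roughly 3 percentage points, and halving the margin takes about four times as many people. The size of the whole population barely matters once it is much larger than the sample.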
Review: accuracy vs. precision
- Accuracy: the closeness of a measured or computed value to its true value
- Precision: the closeness of repeated measurements of the same quantity
- Standard deviation will tell you how precise you are: smaller # means less variability, more precision
- Unfortunately, nothing will tell you how accurate you are!
- That is, it is possible to be very precise, but still wrong

Bias: systematic distortion arising from a flaw in measurement or inappropriate sampling
- (You're wrong)
- Worse than just being wrong: you are wrong but it looks like you are right (e.g. nice, tight standard deviations)

Accurately estimate population values
- What we want is an unbiased estimate of the true population value
  - Does not over- or underestimate the true value
- A biased estimate: an incorrect estimate of the mean
- There are many things you can do to bias your sampling

Sources of error
- Sampling errors: chance events
  - sample information does not match the true population
  - can happen when chance events place samples in unusually high or low density areas
  - can be estimated with statistics, improved with methodological changes
- Non-sampling errors: associated with human mistakes and are therefore, of course, more common :-)

Examples of non-sampling errors
- Selecting sampling units based on non-random methods
- Measuring and counting mistakes
- Inconsistent field sampling methods
- Poor handwriting and transcription errors
- Bad taxonomy
1. This stuff can't be detected or corrected after sampling is complete
2. Introduces bias into your dataset
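The difference between accuracy, precision, and bias described above can be illustrated with a small simulation. Everything in it is hypothetical: the true cover value, the observer labels, and the normally distributed measurement error are assumptions chosen only to make the point that a tight standard deviation says nothing about bias.

```python
import random
import statistics

TRUE_COVER = 40.0  # hypothetical true percent cover of the population

def simulate(bias, spread, n=30, seed=1):
    """Simulate n measurements with a systematic bias and random spread."""
    rng = random.Random(seed)
    return [rng.gauss(TRUE_COVER + bias, spread) for _ in range(n)]

observers = {
    "accurate and precise": simulate(bias=0.0, spread=1.0),
    "accurate but imprecise": simulate(bias=0.0, spread=8.0),
    "precise but biased": simulate(bias=10.0, spread=1.0),
}

for name, values in observers.items():
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    # The error column is only known here because we invented the true value.
    print(f"{name:<24} mean = {mean:5.1f}  SD = {sd:4.1f}  error = {mean - TRUE_COVER:+5.1f}")
```

In the field the true value is unknown, so the "precise but biased" observer looks just as trustworthy as the "accurate and precise" one. That is why bias has to be prevented by design rather than detected afterwards.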
How to make an unbiased sample
- Use trained observers
- Establish protocols and ground rules
- Take your time and be careful
- Use random sampling
- Sampling should be interspersed throughout population

Random sampling
- Every possible sampling location has an equal chance of being selected
- Sampling is non-random for many reasons:
  - areas close to the road get oversampled
  - key areas (purposeful non-random sampling)
  - etc.

Interspersion
- Goal is to have sampling units well interspersed throughout the area of the target population
- Common practice is to place all sampling units (points or quadrats) along a single or very few transects
  - this is bad!
  - better to have fewer samples per transect and more transects

Types of random sampling
1. Simple random sampling
2. Stratified random sampling
3. Systematic sampling

Simple random sampling
- Each sampling unit has the same probability of being selected
- Selection of each sampling unit is unrelated to the selection of any other unit

A couple methods
Random coordinate method
- Make X,Y axes along your population, randomly choose X,Y coordinates
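A minimal sketch of the random coordinate method, assuming a rectangular study area measured in meters. The function name, the area dimensions, and the number of points are all hypothetical.

```python
import random

def random_coordinates(x_max, y_max, n, seed=None):
    """Pick n random (x, y) locations inside a rectangle of x_max by y_max.

    Each location is drawn independently of the others, so every spot in
    the rectangle has the same chance of being selected.
    """
    rng = random.Random(seed)
    return [(rng.uniform(0, x_max), rng.uniform(0, y_max)) for _ in range(n)]

# Hypothetical 100 m x 50 m plot with 8 quadrat locations.
points = random_coordinates(x_max=100, y_max=50, n=8, seed=42)
for x, y in points:
    print(f"walk {x:5.1f} m along the X axis, then {y:5.1f} m along the Y axis")
```

Passing a seed only makes the draw repeatable for record keeping. For an irregularly shaped population, one common approach is to draw points in a bounding rectangle and discard those that fall outside the area.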
Grid cell method
1. Overlay population with conceptual (not actual) grid
2. Number each cell, and randomly pick a sample

Field-friendly axis method (tape from 0 m to 50 m)
1. Lay tape through middle of plot
2. Randomly select 5 points along meter tape
3. Randomly select side of tape
4. Randomly select distance from tape

Another option: randomly select points and use GPS to find them

Simple random sampling: good vs. bad
Good:
- Easiest statistical analysis
- Get lots of independent samples
- Good in small populations
Bad:
- Completely random sampling takes too long
- "I've been here for three days and I can tell I'm not sampling the whole area!"

Stratified random sampling (credit: Jeff Herrick, USDA-ARS Jornada Experimental Range)
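The grid cell and field-friendly axis methods can both be sketched with Python's `random` module. The grid dimensions, the maximum distance from the tape, and the function names are hypothetical. The 50 m tape and the 5 points come from the slide.

```python
import random

def grid_cell_sample(n_rows, n_cols, n_samples, rng):
    """Number every cell of a conceptual grid and pick cells at random.

    Cells are drawn without replacement, so no cell is sampled twice.
    Returns (row, column) pairs.
    """
    cell_numbers = range(n_rows * n_cols)          # cells numbered 0..N-1
    chosen = rng.sample(cell_numbers, n_samples)
    return [divmod(cell, n_cols) for cell in chosen]

def tape_sample(tape_length, max_offset, n_points, rng):
    """Pick random points along a tape, then a random side and distance."""
    locations = []
    for _ in range(n_points):
        position = rng.uniform(0, tape_length)     # where along the tape
        side = rng.choice(["left", "right"])       # which side of the tape
        offset = rng.uniform(0, max_offset)        # how far from the tape
        locations.append((position, side, offset))
    return locations

rng = random.Random(7)
cells = grid_cell_sample(n_rows=10, n_cols=10, n_samples=5, rng=rng)
spots = tape_sample(tape_length=50, max_offset=20, n_points=5, rng=rng)
print("grid cells (row, col):", cells)
for position, side, offset in spots:
    print(f"{position:4.1f} m along tape, {offset:4.1f} m to the {side}")
```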
Stratified random sampling
- Dividing the population into two or more subgroups (strata) prior to sampling
- Sampling units within the same subgroup are very similar, while units in different subgroups are very different

Random vs. stratified
(Figure: sampling units 1-3 placed by random vs. stratified random sampling)
- Strata should be defined by soil type, aspect, major vegetation type, soil moisture, etc.
- NOT by something that is likely to change, like species abundance

Stratified random
- Sampling units can be allocated equally to each stratum, relative to the size of each stratum, or in proportion to the variability, management interest, or cost of sampling in each stratum

Stratified random: good and bad
Good:
- Save me from wasting my time!
- Get nice interspersion of transects
- Reduced variability within strata
Bad:
- Still have to hunt around and find @!!@@ spots on the landscape
- Statistical analysis more complex

Systematic sampling
- Regular placement of sampling units from a randomly selected start point
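Two of the allocation options mentioned for stratified random sampling, equal allocation and allocation relative to the size of each stratum, can be sketched as follows. The strata names, areas, total number of transects, and function names are hypothetical, and the rounding rule (largest remainder) is just one reasonable choice.

```python
def equal_allocation(strata_areas, n_total):
    """Split n_total sampling units as evenly as possible among strata."""
    base, extra = divmod(n_total, len(strata_areas))
    # Any leftover units go to the first strata listed, one each.
    return {name: base + (1 if i < extra else 0)
            for i, name in enumerate(strata_areas)}

def proportional_allocation(strata_areas, n_total):
    """Split n_total sampling units in proportion to stratum area."""
    total_area = sum(strata_areas.values())
    exact = {name: n_total * area / total_area
             for name, area in strata_areas.items()}
    allocation = {name: int(share) for name, share in exact.items()}
    # Hand out units lost to rounding down, largest remainder first.
    leftover = n_total - sum(allocation.values())
    by_remainder = sorted(exact, key=lambda s: exact[s] - allocation[s],
                          reverse=True)
    for name in by_remainder[:leftover]:
        allocation[name] += 1
    return allocation

# Hypothetical strata defined by aspect and soil moisture, areas in hectares.
areas = {"north slope": 50, "south slope": 30, "wet meadow": 20}

print("equal:       ", equal_allocation(areas, 20))
print("proportional:", proportional_allocation(areas, 20))
```

Within each stratum, the allocated units would then be placed with simple random sampling. Note that proportional allocation can leave a very small stratum with no units, which is one reason to weigh management interest as well as size.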
Examples of systematic sampling
(Figure: layout diagram with 0 m and 50 m marks)
- Four 50 m line transects have been randomly placed at the 5, 25, 30, and 40 m marks, off of a permanent baseline
- Systematic quadrats with a random start point: quadrats are placed at 5 m intervals, but the starting point is randomly selected between 0 and 5 m

What is the sampling unit?
- Sampling unit is the randomly selected target:
  - individual plant
  - quadrat (if simple random sampling)
  - line transect
  - point frame (if randomly located)
- Even for systematic sampling, the transect is still the sample

Examples of systematic sampling
- Line transects: count each line transect as an independent sample
- Systematic quadrats: is each quadrat a sample? There is some gray area here: maybe, probably not
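The systematic quadrat layout from the example (5 m spacing, random start between 0 and 5 m, along a 50 m transect) can be sketched like this. The function name is hypothetical.

```python
import random

def systematic_positions(transect_length, interval, rng):
    """Return quadrat positions at a fixed interval from a random start.

    Only the start point is random. Every later position is fixed by it,
    which is why the quadrats on one transect are not independent draws.
    """
    start = rng.uniform(0, interval)
    n_positions = int((transect_length - start) // interval) + 1
    return [start + i * interval for i in range(n_positions)]

rng = random.Random(3)
positions = systematic_positions(transect_length=50, interval=5, rng=rng)
print(", ".join(f"{p:.1f} m" for p in positions))
```

Because a single random number determines every position, the whole transect behaves as one randomly placed unit, which matches the point that the transect is still the sample.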
Why do many measurements along a single line count as one sampling unit?
- Because sampling units cannot be correlated with each other
- Along a single line, or within a single quadrat, values are more likely to be similar to each other than if points were drawn totally at random
- So what? Then you've got a biased estimator... and you're going to get the wrong answer

Effort: What is a sample?
- Lines are likely to intersect similar vegetation; therefore, measurements along a line are not independent samples: N = 3
- Randomly located quadrats have an equal chance of landing anywhere; therefore, they are independent samples: N = 8
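One way to see the consequence of treating correlated measurements as independent is to compare the standard error computed both ways. The cover values below are made up, with 3 transects of 8 quadrats each, where quadrats on the same transect resemble each other. This is a sketch of the idea, not a full analysis.

```python
import math
import statistics

# Hypothetical percent cover from 8 quadrats on each of 3 transects.
transects = {
    "T1": [10, 12, 11, 9, 13, 10, 12, 11],
    "T2": [30, 28, 31, 29, 32, 30, 27, 33],
    "T3": [20, 22, 19, 21, 18, 20, 23, 21],
}

def standard_error(values):
    """Standard error of the mean, assuming the values are independent."""
    return statistics.stdev(values) / math.sqrt(len(values))

# Wrong: pool every quadrat and pretend N = 24 independent samples.
pooled = [value for values in transects.values() for value in values]
se_pooled = standard_error(pooled)

# Right: one value per transect, so N = 3 independent samples.
transect_means = [statistics.mean(values) for values in transects.values()]
se_transects = standard_error(transect_means)

print(f"pooled quadrats: N = {len(pooled)}, SE = {se_pooled:.2f}")
print(f"transect means:  N = {len(transect_means)}, SE = {se_transects:.2f}")
```

With these numbers, pooling makes the estimate look far more certain than the three transects justify. The mean is the same either way, but the apparent precision is not.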