THE SAMPLING DISTRIBUTION OF THE MEAN

Similar documents
Statistic: a that can be from a sample without making use of any unknown. In practice we will use to establish unknown parameters.

Probability Distributions

Additional practice with these ideas can be found in the problems for Tintle Section P.1.1

Inferential Statistics

Chapter 26: Comparing Counts (Chi Square)

Chapter 18. Sampling Distribution Models. Copyright 2010, 2007, 2004 Pearson Education, Inc.

Lecture 27. DATA 8 Spring Sample Averages. Slides created by John DeNero and Ani Adhikari

MIT BLOSSOMS INITIATIVE

Lesson 19: Understanding Variability When Estimating a Population Proportion

Unit 4 Probability. Dr Mahmoud Alhussami

Unit 22: Sampling Distributions

CHAPTER 10 Comparing Two Populations or Groups

3/30/2009. Probability Distributions. Binomial distribution. TI-83 Binomial Probability

Last week: Sample, population and sampling distributions finished with estimation & confidence intervals

green green green/green green green yellow green/yellow green yellow green yellow/green green yellow yellow yellow/yellow yellow

Sampling Distribution Models. Chapter 17

AIM HIGH SCHOOL. Curriculum Map W. 12 Mile Road Farmington Hills, MI (248)

where Female = 0 for males, = 1 for females Age is measured in years (22, 23, ) GPA is measured in units on a four-point scale (0, 1.22, 3.45, etc.

STAT/SOC/CSSS 221 Statistical Concepts and Methods for the Social Sciences. Random Variables

Concepts in Statistics

MATH 1150 Chapter 2 Notation and Terminology

Last two weeks: Sample, population and sampling distributions finished with estimation & confidence intervals

COSC 341 Human Computer Interaction. Dr. Bowen Hui University of British Columbia Okanagan

Probability and Statistics

Chapter 9 Inferences from Two Samples

Supporting Australian Mathematics Project. A guide for teachers Years 11 and 12. Probability and statistics: Module 25. Inference for means

Examine characteristics of a sample and make inferences about the population

Probability deals with modeling of random phenomena (phenomena or experiments whose outcomes may vary)

MA 1125 Lecture 15 - The Standard Normal Distribution. Friday, October 6, Objectives: Introduce the standard normal distribution and table.

MALLOY PSYCH 3000 MEAN & VARIANCE PAGE 1 STATISTICS MEASURES OF CENTRAL TENDENCY. In an experiment, these are applied to the dependent variable (DV)

Probability and Samples. Sampling. Point Estimates

20 Hypothesis Testing, Part I

Chapter 18. Sampling Distribution Models /51

MAT Mathematics in Today's World

Do students sleep the recommended 8 hours a night on average?

How to write maths (well)

Statistics Boot Camp. Dr. Stephanie Lane Institute for Defense Analyses DATAWorks 2018

Correlation. January 11, 2018

Chapter 8: Sampling Distributions. A survey conducted by the U.S. Census Bureau on a continual basis. Sample

appstats27.notebook April 06, 2017

Objectives. 2.1 Scatterplots. Scatterplots Explanatory and response variables Interpreting scatterplots Outliers

Chi Square Analysis M&M Statistics. Name Period Date

Two-sample Categorical data: Testing

the yellow gene from each of the two parents he wrote Experiments in Plant

THE SIMPLE PROOF OF GOLDBACH'S CONJECTURE. by Miles Mathis

M1-Lesson 8: Bell Curves and Standard Deviation

Exam #2 Results (as percentages)

Chapter 27 Summary Inferences for Regression


Chapter 6 The Standard Deviation as a Ruler and the Normal Model

ACMS Statistics for Life Sciences. Chapter 13: Sampling Distributions

Keppel, G. & Wickens, T.D. Design and Analysis Chapter 2: Sources of Variability and Sums of Squares

COMP6053 lecture: Sampling and the central limit theorem. Jason Noble,

8.1 Frequency Distribution, Frequency Polygon, Histogram page 326

Week 11 Sample Means, CLT, Correlation

Section 6.2 Hypothesis Testing

Chapter I, Introduction from Online Statistics Education: An Interactive Multimedia Course of Study comprises public domain material by David M.

STANDARDS OF LEARNING CONTENT REVIEW NOTES. ALGEBRA I Part II 1 st Nine Weeks,

MATH 3C: MIDTERM 1 REVIEW. 1. Counting

Warm-up Using the given data Create a scatterplot Find the regression line

4/1/2012. Test 2 Covers Topics 12, 13, 16, 17, 18, 14, 19 and 20. Skipping Topics 11 and 15. Topic 12. Normal Distribution

Chapter 6. The Standard Deviation as a Ruler and the Normal Model 1 /67

Now, with all the definitions we ve made, we re ready see where all the math stuff we took for granted, like numbers, come from.

Probability and Independence Terri Bittner, Ph.D.

Data 1 Assessment Calculator allowed for all questions

Statistics 511 Additional Materials

Data 1 Assessment Calculator allowed for all questions

Lab 5 for Math 17: Sampling Distributions and Applications

Two-sample inference: Continuous data

M 225 Test 1 B Name SHOW YOUR WORK FOR FULL CREDIT! Problem Max. Points Your Points Total 75

/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Asymptotic Analysis, recurrences Date: 9/7/17

Review Packet for Test 8 - Statistics. Statistical Measures of Center: and. Statistical Measures of Variability: and.

Experimental Design, Data, and Data Summary

Hypothesis testing. Data to decisions

Lesson 8: Graphs of Simple Non Linear Functions

Data Analysis and Statistical Methods Statistics 651

Math 140 Introductory Statistics

Toss 1. Fig.1. 2 Heads 2 Tails Heads/Tails (H, H) (T, T) (H, T) Fig.2

Math 140 Introductory Statistics

Section 7.1 How Likely are the Possible Values of a Statistic? The Sampling Distribution of the Proportion

Business Statistics:

Announcements. Lecture 1 - Data and Data Summaries. Data. Numerical Data. all variables. continuous discrete. Homework 1 - Out 1/15, due 1/22

2. Probability. Chris Piech and Mehran Sahami. Oct 2017

Measures of Central Tendency. Mean, Median, and Mode

Section 3.4 Normal Distribution MDM4U Jensen

Recap: Ø Distribution Shape Ø Mean, Median, Mode Ø Standard Deviations

Background to Statistics

MATH 3012 N Solutions to Review for Midterm 2

MATH 10 INTRODUCTORY STATISTICS

Linear Regression. Linear Regression. Linear Regression. Did You Mean Association Or Correlation?

Stat 20 Midterm 1 Review

You have 3 hours to complete the exam. Some questions are harder than others, so don t spend too long on any one question.

Math 243 Chapter 7 Supplement The Sampling Distribution of a Proportion

L6: Regression II. JJ Chen. July 2, 2015

= - = = 1 = -2 = 3. Jeremy can plant 10 trees in 4 hours. How many trees can he plant in 10 hours? A. 16

Math 7 /Unit 5 Practice Test: Statistics

Objective: Recognize halves within a circular clock face and tell time to the half hour.

Examples of frequentist probability include games of chance, sample surveys, and randomized experiments. We will focus on frequentist probability sinc

CENTRAL LIMIT THEOREM (CLT)

MAT Mathematics in Today's World

Transcription:

THE SAMPLING DISTRIBUTION OF THE MEAN COGS 14B JANUARY 26, 2017

TODAY Sampling Distributions Sampling Distribution of the Mean Central Limit Theorem

INFERENTIAL STATISTICS Inferential statistics: allows us to generalize beyond collections of actual observations and allows us to determine how confident we can be that our conclusions are correct

PROBABILITY AND STATISTICS Probability plays an important role in inferential statistics. There is always variability that comes along with any particular result, and so each result is viewed within the context of many possible results that could have occurred by chance. Rare Outcomes Common Outcomes Rare Outcomes

PROBABILITY AND STATISTICS Statisticians interpret different proportions of area under theoretical curves as probabilities of random outcomes. If I took all of your midterm scores and computed a z-score for each one: What is the probability that one quiz score taken at random will be less than -1.5? 6.7% What is the probability that one quiz score taken at random will be greater than 1? 15.9% What is the probability that one quiz score taken at random will be less than -1.5 OR greater than 1? ADDITION RULE! (they are mutually exclusive) 6.7% + 15.9% = 22.6%

REVIEW: NORMAL DISTRIBUTIONS Many distributions found in nature have some common properties: Symmetric, with mean, median and modes of equal value Distributions are bell-shaped Examples: heights and weights of people, physical characteristics of plants/ animals, errors in measurement Sampling error (as an error in measurement) is normally distributed.

PROBABILITY AND STATISTICS In inferential statistics, we are making an inferential leap from interpreting probability as providing an estimated relative frequency of an event in a future random sampling from the population. When is an event (sample) common vs rare?

PROBABILITY AND STATISTICS Common outcomes: nothing special has occurred Rare outcomes: something special has occurred

UCSD = 67 COGS 14B = 68

UCSD = 67 COGS 14B = 75

Imagine you measured the temperature (in F) each day in a week and got the following measurements: 70, 72, 68, 50, 90, 80, 72 The average of these temperatures is 70 degrees. If the standard deviation for temperature for that week is typically 25 degrees, is a temperature of 90 unusually warm? What if the standard deviation is typically 5 degrees?

SAMPLING DISTRIBUTIONS A sampling distribution acts as a frame of reference for statistical decision making. It is a theoretical probability distribution of the possible values of some sample statistic that would occur if we were to draw all possible samples of a fixed size from a given population. The sampling distribution allows us to determine whether, given the variability among all possible sample means, the one we observed is a common out come or a rare outcome.

SAMPLING DISTRIBUTIONS Imagine that each one of you asks a random sample of 10 people in this class what their height is. (assume the population is all COGS 14B students this quarter). You each calculate the average height of your sample to get the sample mean. When you report back, would you expect all of your sample means to be the same? How much would you expect them to differ?

SAMPLING DISTRIBUTION OF THE MEAN Random samples rarely exactly represent the underlying population. We rely on sampling distributions to give us a better idea whether the sample we ve observed represents a common or rare outcome. Sampling distribution of the mean: probability distribution of means for ALL possible random samples OF A GIVEN SIZE from some population ALL possible samples is a lot! Example: All possible samples of size 10 from a class of 90 = 5.72*10 12

EXAMPLE: SAT MATH SCORES Take a sample of 10 random students from a population of 100. You might get a mean of 502 for that sample. Then, you do it again with a new sample of 10 students. You might get a mean of 480 this time. Then, you do it again. And again. And again and get the following means for each of those three new samples of 10 people: 550, 517, 472

SAMPLING DISTRIBUTION OF THE MEAN Sampling distribution of the mean: probability distribution of means for ALL possible random samples OF A GIVEN SIZE from some population By taking a sample from a population, we don t know whether the sample mean reflects the population mean. From the sampling distribution, we can calculate the possibility of a particular sample mean: chances are that our observed sample mean originates from the middle of the true sampling distribution. The sampling distribution of the mean has a mean, standard deviation, etc. just like other distributions you ve encountered!

REVIEW: A particular observed sample mean: A) equals the population mean B) equals the mean of the sampling distribution C) most likely has a value in the vicinity of the population mean D) is equally likely to have a value either near to, or far from, the population mean.

SAMPLING DISTRIBUTION OF THE MEAN Keeping notation straight: Type of Distribution Sample Population Sampling Distribution of the mean Mean (mean of all sample means) Standard Deviation (standard error of the mean)

SAMPLING DISTRIBUTION OF THE MEAN The mean of the sampling distribution ALWAYS equals the mean of the population.

SAMPLING DISTRIBUTION OF THE MEAN Standard error of the mean: measures the variability in the sampling distribution (roughly represents the average amount the sample means deviate from the mean of the sampling distribution) Sample size: n Remember, there s a different distribution for each different sample size!

REVIEW: Assuming the standard deviation remains constant, which of the following sample sizes would result in the largest value of standard error? A) 100 B) 16 C) 25 D) 49

INTUITION BUILDER http://www.muelaner.com/uncertainty-of-measurement/

CENTRAL LIMIT THEOREM When the sample size is sufficiently large, the shape of the sampling distribution approximates a normal curve (regardless of the shape of the parent population)! The distribution of sample means is a more normal distribution than a distribution of scores, even if the underlying population is not normal.

CENTRAL LIMIT THEOREM Parent Populations (can be of any shape, size, etc.) Sampling Distribution of the Mean (approximates a normal curve, no matter the parent population) https://www.etsy.com/shop/nausicaadistribution

DEMO: PUTTING IT ALL TOGETHER In a bag of M&Ms, there are 6 colors: brown, yellow, red, orange, green and blue. In a single bag, which color do you think there is the most of? Guess the proportion of orange M&Ms. If each student took a random sample of the number of orange M&Ms in 10 M&Ms bags to get an average number of orange M&Ms, would you expect each sample to have the same average number of orange M&Ms? Why or why not?

DEMO: PUTTING IT ALL TOGETHER 1. Get into groups of 3-4 (20 groups total) 2. Each group will receive one bag of M&Ms. 3. Open your bag of M&Ms and sort them by color. 4. Count the number of orange M&Ms and the total number of M&Ms in your bag. 5. Send a member of each group to randomly sample 10 groups, asking how many orange M&Ms they have (you ll receive a piece of paper saying which specific groups to sample it s been randomly generated for you). 6. Calculate the average number of orange M&Ms in your sample of 10 groups. 7. Create a histogram of the number of orange M&Ms in your 10 samples. 8. The mean of your data represent a single sample mean (where n = 10). It is this one mean that will get added to the overall distribution of sample means, which represents the distribution of ALL possible sample means.

DEMO: PUTTING IT ALL TOGETHER Create a histogram of all of our sample MEANS

DEMO: PUTTING IT ALL TOGETHER What is a sample in this context? What is the parent population? What is the sampling distribution of means? How does the histogram of your particular sample differ from the sampling distribution of the mean? What is the population parameter? What is the statistic? Do we know the value(s) of the parameter(s)? Do we know the value(s) of the statistic(s)? Did each sample have the same average number of orange M&Ms? Describe the variability of the distribution of sample proportions (shape, central tendency, spread).

DEMO: PUTTING IT ALL TOGETHER Imagine that the proportions of different colors in a bag of M&Ms are based on consumer preference tests.* 24% blue, 20% orange, 16% green, 14% yellow, 13% red, 13% brown We ll base our simulation on these proportions. http://www.rossmanchance.com/applets/oneprop/oneprop.htm? candy=2

DEMO: REVIEW 1. As sample size increases, does how well the sample statistic resemble the population parameter change? 2. What value does the sampling distribution tend to center around? 3. Does this observation depend on sample size? 4. If M&M colors were uniformly distributed (~16.67% of each color in each bag), would the shape of the sampling distribution change? Why or why not? 5. How does this relate to the Central Limit Theorem?

DEMO: NEXT STEPS PREVIEW FOR WHAT S NEXT: Inferential statistical tests! Based on the claim that 20% of M&Ms in a bag are supposed to be orange, do we have evidence that our bags of M&Ms differ from that proportion?

SUMMARY A sampling distribution acts as a frame of reference for statistical decision making. Sampling distribution of the mean: Probability distribution of means for ALL possible random samples OF A GIVEN SIZE from some population The mean of sampling distribution of the mean is always equal to the mean of the population The standard error of the mean measures the variability in the sampling distribution Central limit theorem: regardless of the shape of the population, the shape of the sampling distribution approximates a normal curve IF THE SAMPLE SIZE IS SUFFICIENTLY LARGE

ONLINE DEMO 2 http://bcs.wiley.com/he-bcs/books? action=mininav&bcsid=8500&itemid=1118450531&assetid=353506&r esourceid=34768&newwindow=true

SAME PARENT DISTRIBUTION, DIFFERENT SAMPLE SIZES N=5 N=25

FOR NEXT TIME Play around with the sampling distribution demo online!