STAT 350 Handout 19 Sampling Distribution, Central Limit Theorem (6.6)


A random sample is a sequence of random variables $X_1, X_2, \ldots, X_n$ that are independent and identically distributed.
  o This property is often abbreviated as i.i.d.
  o The number $n$ is called the sample size.

A statistic is a function of the random variables in a random sample.
  o Each statistic is itself a random variable and therefore has its own probability distribution, describing how it would vary under repeated random sampling.
  o The probability distribution of a statistic is called a sampling distribution.

Example 19-1: Grade Point Average

Suppose that a student named Marius has a .3 probability of getting an A, a .5 probability of getting a B, and a .2 probability of getting a C in a class. Suppose further that this probability distribution holds independently for each of two classes that he is taking this term. Let $X_1$ denote the number of grade points (A = 4 points, B = 3 points, C = 2 points) that he receives in course 1, and similarly $X_2$ for course 2.

a) Calculate the expected value, variance, and standard deviation of Marius's grade points in a single course.

Now consider the statistic $\bar{X}$ = average (mean) grade points in the two courses. The following table lists all of Marius's possible grades in these two courses.

b) Determine the probabilities of these 9 possible outcomes, and record them in the table along with the value of the sample mean (GPA).

Grades        A, A   A, B   A, C   B, A   B, B   B, C   C, A   C, B   C, C
Probability
GPA

c) Report the probability (sampling) distribution of the sample mean grade points by listing its possible values and the probability of each:

$\bar{x}$
$p(\bar{x})$
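With only two courses, the nine outcomes are few enough to enumerate by computer. The following is a minimal R sketch of that enumeration, assuming the grade probabilities stated in the example (.3, .5, .2) and independence between the two courses. The names `pts`, `prob`, `grid`, and `samp_dist` are illustrative and do not come from the handout.

```r
# Grade points and their probabilities, as stated in the example
pts  <- c(A = 4, B = 3, C = 2)
prob <- c(A = 0.3, B = 0.5, C = 0.2)

# Single-course mean, variance, and SD
mu     <- sum(pts * prob)
sigma2 <- sum((pts - mu)^2 * prob)
sigma  <- sqrt(sigma2)

# All 9 ordered outcomes for two independent courses
grid <- expand.grid(first = names(pts), second = names(pts),
                    stringsAsFactors = FALSE)
grid$probability <- prob[grid$first] * prob[grid$second]   # independence
grid$gpa <- (pts[grid$first] + pts[grid$second]) / 2        # sample mean

# Sampling distribution of the sample mean: total probability at each GPA value
samp_dist <- tapply(grid$probability, grid$gpa, sum)

# Mean and variance of the sample mean, computed from the enumeration
mean_xbar <- sum(grid$gpa * grid$probability)
var_xbar  <- sum((grid$gpa - mean_xbar)^2 * grid$probability)

print(grid)
print(samp_dist)
print(c(mu = mu, sigma2 = sigma2, mean_xbar = mean_xbar, var_xbar = var_xbar))
```

Comparing `mean_xbar` with `mu` and `var_xbar` with `sigma2` gives a numerical check on the hand calculations for the two-course case.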

d) Determine the expected value of the sample mean grade points. How does it compare to the expected grade points in a single course?

e) Determine the variance and SD of the sample mean grade points. How do they compare to their counterparts for grade points in a single course?

Now suppose that you want to investigate Marius's academic performance over a year in which he takes 10 courses.

f) If you were to list all possible outcomes (grade permutations) for those 10 courses, how many would there be?

It's no longer feasible to enumerate all possible outcomes, but we can rely on simulation to approximate the sampling distributions of these statistics. The following R code performs such a simulation:

    # start with N = number of repetitions, n = number of courses
    # also start with pa = Pr(A), pb = Pr(B), pc = Pr(C)
    #
    grpts = rep(NA, times = n)
    GPA = rep(NA, times = N)
    for (i in 1:N) {
      rand = runif(n, 0, 1)
      for (j in 1:n) {
        # convert each uniform random number into a grade
        if (rand[j] < pa) {grpts[j] = 4}
        if ((rand[j] >= pa) & (rand[j] < pa+pb)) {grpts[j] = 3}
        if (rand[j] >= pa+pb) {grpts[j] = 2}
      }
      GPA[i] = mean(grpts)
    }
    # summarize the simulated sample means
    hist(GPA); table(GPA)
    mean(GPA); sd(GPA)
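The loop-based simulation above can also be written without explicit loops. The following is a sketch of one vectorized alternative, not the handout's code: it assumes N = 100,000 repetitions and n = 10 courses, uses the grade probabilities from the example, and draws grades with `sample()` rather than by thresholding uniform random numbers. The seed is arbitrary and only makes the run reproducible.

```r
set.seed(350)            # arbitrary seed (assumption), for reproducibility only
N  <- 100000             # number of simulated years
n  <- 10                 # courses per year
pa <- 0.3; pb <- 0.5; pc <- 0.2

# Draw all N * n grades at once; each row is one simulated year
grades <- matrix(sample(c(4, 3, 2), size = N * n, replace = TRUE,
                        prob = c(pa, pb, pc)),
                 nrow = N, ncol = n)
GPA <- rowMeans(grades)  # one sample mean per simulated year

mean(GPA); sd(GPA)
mean(GPA >= 3.0)         # simulated proportion of years with GPA at least 3.0
mean(GPA >= 3.25)        # same for a GPA of at least 3.25
```

Taking the mean of a logical vector such as `GPA >= 3.0` gives the proportion of simulated years that satisfy the condition, which approximates the corresponding probability. Changing `n` to 2 or 40 reruns the same simulation for the other sample sizes.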

g) Explain the difference between the (i in 1:N) and the (j in 1:n) loops.

h) Explain what the GPA vector does.

i) Run this code for 100,000 simulated years of 10 courses per year. What do you notice about the (approximate) sampling distribution of the sample mean GPA? Comment on its shape, mean, and SD. How do these compare to their counterparts with a sample size of n = 2?

j) Use the simulation results to approximate the probability that Marius's GPA will be at least 3.0. Then do the same for a GPA of 3.25.

k) Comment on how these probabilities in the n = 10 case compare to the n = 2 case.

l) Increase the sample size (number of courses) to 40, representing an entire college career. Before you run the simulation, predict what you will see with regard to the distribution of the sample mean (GPA).

m) Run a simulation with 100,000 simulated college careers. Comment on what the simulation reveals about the sampling distribution of the sample mean (GPA).

n) Again use the simulation results to approximate the probability that Marius's GPA will be at least 3.0. Then do the same for a GPA of 3.25. Comment on how these probabilities in the n = 40 case compare to the n = 10 case.

Example 19-2: Fast-food service time

Suppose again that the service time for a randomly selected customer at a particular fast-food restaurant follows an exponential distribution with mean .25 minutes. Let the random variable $T$ represent this service time, and let $\bar{T} = \frac{1}{n}\sum_{i=1}^{n} T_i$ denote the average service time in a random sample of $n$ customers.

a) Report the mean and standard deviation of $T$.

b) Simulate the waiting times for N = 100,000 samples, using each of the following sample sizes for number of customers: n = 1, n = 5, n = 25, n = 100. For each sample size, comment on the shape of the sampling distribution of $\bar{T}$ and report the mean and SD of the sample means.

c) Comment on how the sampling distribution of $\bar{T}$ changes as the sample size increases.
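The simulation in part (b) can be sketched in R as follows. This is one possible approach under stated assumptions, not the course's official solution. The mean service time is held in the variable `mean_time`, set here to the value printed in this transcription; a leading digit may have been lost in extraction, so treat it as an assumption and replace it with the handout's value. The helper `sim_means` and the seed are illustrative.

```r
set.seed(350)          # arbitrary seed (assumption), for reproducibility only
N <- 100000            # number of simulated samples
mean_time <- 0.25      # ASSUMED mean service time (minutes), as printed in the transcription

# Simulate N sample means, each based on n exponential service times
sim_means <- function(n, N, mean_time) {
  times <- matrix(rexp(N * n, rate = 1 / mean_time), nrow = N, ncol = n)
  rowMeans(times)
}

for (n in c(1, 5, 25, 100)) {
  tbar <- sim_means(n, N, mean_time)
  cat("n =", n,
      " mean =", round(mean(tbar), 4),
      " SD =", round(sd(tbar), 4),
      " mean_time/sqrt(n) =", round(mean_time / sqrt(n), 4), "\n")
}
```

An exponential distribution has SD equal to its mean, so the last column shows the value that the SD of the sample means should be close to. Calling `hist(sim_means(n, N, mean_time))` for each n shows how the shape changes with the sample size.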

Theoretical result: Let $X_1, X_2, \ldots, X_n$ be i.i.d. from any probability distribution. Denote $E(X_i)$ by $\mu$ and $Var(X_i)$ by $\sigma^2$. Let $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ for some positive integer $n$ (sample size).

a) Use properties of expectation to determine $E(\bar{X})$.

b) Use properties of variance to determine $Var(\bar{X})$ and $SD(\bar{X})$.

c) Now suppose that the $X_i$'s have a normal distribution. What do you know about the distribution of $\bar{X}$ in this case? Explain.

Your simulations and theoretical derivations from last time lead to the following result, the most important in all of probability and statistics:

Central Limit Theorem (CLT): Let $X_1, X_2, \ldots, X_n$ be i.i.d. with $\mu = E(X_i)$ and $\sigma^2 = Var(X_i)$. Also let $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ denote the sample mean. Then the sampling distribution of $\bar{X}$ has:
  o $E(\bar{X}) = \mu$
      - Be careful in reading this statement, which speaks of 3 different means:
          the sample mean, $\bar{X}$
          the population mean, $\mu$
          the mean of the sample means, $E(\bar{X})$
  o $Var(\bar{X}) = \sigma^2/n$, so $SD(\bar{X}) = \sigma/\sqrt{n}$
      - Averages vary less than individual values.
      - The SD decreases in inverse proportion to the square root of the sample size.
  o An approximately normal distribution for large values of $n$
      - Regardless of the distribution of the $X_i$'s
      - Exactly normal for any $n$ if the $X_i$'s are normally distributed
      - Becomes closer and closer to normal as the sample size increases
      - Also closer to normal for $X_i$'s that are closer to normal
  o Corollary: The distribution of the sum $\sum_{i=1}^{n} X_i$ of independent, identically distributed random variables also approaches a normal distribution as the sample size increases, with $E(\text{Sum}) = n\mu$ and $Var(\text{Sum}) = n\sigma^2$.
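The mean and variance stated in the theorem follow from the standard properties of expectation and variance. A short derivation is sketched below, using only the i.i.d. assumption and the symbols $\mu$, $\sigma^2$, and $n$ defined above; independence is needed only in the variance step.

```latex
\begin{align*}
E(\bar{X}) &= E\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
            = \frac{1}{n}\sum_{i=1}^{n} E(X_i)
            = \frac{1}{n}\, n\mu = \mu, \\[4pt]
Var(\bar{X}) &= Var\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
              = \frac{1}{n^2}\sum_{i=1}^{n} Var(X_i)
              = \frac{1}{n^2}\, n\sigma^2 = \frac{\sigma^2}{n}, \\[4pt]
SD(\bar{X}) &= \sqrt{Var(\bar{X})} = \frac{\sigma}{\sqrt{n}}.
\end{align*}
```

The same two properties applied to the sum without the factor $1/n$ give $E(\text{Sum}) = n\mu$ and $Var(\text{Sum}) = n\sigma^2$, as in the corollary.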

Example 19-3: Manufacturing potato chips

Suppose that the weights of bags of potato chips coming off an assembly line are normally distributed with mean μ = 12 ounces and standard deviation σ = 0.4 ounces.

a) Determine the probability that one randomly selected bag weighs less than 11.9 ounces.

b) If you take a random sample of 10 bags, would you expect the probability of their sample mean weight being less than 11.9 ounces to be greater or less than the probability found in (a)? Explain, without performing the calculation.

c) Calculate the probability asked about in the previous question. [Hint: Draw and label a sketch of the sampling distribution and shade the region whose area corresponds to this probability.] Does this probability indicate that a sample mean as small as 11.9 ounces would be surprising if the population mean were really 12 ounces?

d) Repeat this analysis for a sample of 100 randomly selected bags.

e) What is the smallest sample size for which the probability of the sample mean being less than 11.9 ounces is less than .01? [Hints: Find the first percentile of the standard normal distribution as the value z such that P(Z < z) = .01. Set this percentile equal to the z-score from standardizing 11.9 and solve for n.]

f) If you were told that a consumer group had weighed randomly selected bags and found a sample mean weight of 11.9 ounces, would you doubt the claim that the true mean weight of all of the potato chip bags is 12 ounces? On what unspecified information does your answer depend? Explain.

g) Which of your above answers would be affected if the distribution of the weights of the bags were not normal but rather skewed?

h) Find a value k such that the probability of the sample mean weight of 1000 randomly selected bags being between 12 - k and 12 + k is roughly 0.95. In other words, between what two values do the middle 95% of the $\bar{x}$ values fall?

i) Determine the smallest sample size for which the probability is .95 that the sample mean falls within ±.05 of 12 ounces (i.e., between 11.95 and 12.05).
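The normal-distribution calculations in this example can be checked in R with `pnorm` and `qnorm`. The sketch below assumes that bag weights are normal with mean 12 ounces and SD 0.4 ounces, that the threshold of interest is 11.9 ounces, and that "the probability is .95" in part (i) means at least .95. Variable names such as `p_mean`, `n_e`, and `n_i` are illustrative, not from the handout.

```r
mu    <- 12      # population mean weight (ounces)
sigma <- 0.4     # population SD (ounces)
cutoff <- 11.9   # threshold weight (ounces)

# (a) probability that a single bag weighs less than the cutoff
p_one <- pnorm(cutoff, mean = mu, sd = sigma)

# (c), (d) probability that the mean of n bags is below the cutoff;
# the sample mean is normal with SD sigma / sqrt(n)
p_mean <- function(n) pnorm(cutoff, mean = mu, sd = sigma / sqrt(n))
p_10  <- p_mean(10)
p_100 <- p_mean(100)

# (e) smallest n with P(sample mean < cutoff) < .01:
# solve (cutoff - mu) / (sigma / sqrt(n)) < qnorm(.01) for n
n_bound <- (qnorm(0.01) * sigma / (cutoff - mu))^2
n_e <- floor(n_bound) + 1

# (h) half-width k of the middle 95% of sample means when n = 1000
k <- qnorm(0.975) * sigma / sqrt(1000)

# (i) smallest n with at least .95 probability of falling within 0.05 of mu
n_i <- ceiling((qnorm(0.975) * sigma / 0.05)^2)

print(c(p_one = p_one, p_10 = p_10, p_100 = p_100))
print(c(n_e = n_e, k = k, n_i = n_i))
```

Evaluating `p_mean(n_e - 1)` and `p_mean(n_e)` is a quick way to confirm that `n_e` is the first sample size at which the probability drops below .01.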