Single Sample Means. SOCY601 Alan Neustadtl

The Central Limit Theorem. If we have a population measured by a variable with mean µ and standard deviation σ, and if all possible random samples of size n are drawn from this population, then as n becomes large, regardless of the shape of the population distribution, the distribution of sample means will be approximately normal, with a mean equal to the population mean and a standard error equal to the population standard deviation divided by the square root of n:

$$\mu_{\bar{X}} = \mu \quad \text{and} \quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$

Important Points. We repeatedly take random samples and calculate their means. Then we treat these means as a variable and create a frequency distribution. This distribution represents the mean of every sample that could possibly be selected. It is a sampling distribution. The distribution of sample means will be approximately normal (particularly if n is large), regardless of the shape of the population. The mean of the sampling distribution is equal to the mean of the population. As sample size increases, the standard error (that is, the standard deviation of the sampling distribution) decreases.
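The points above can be checked with a small simulation. This is a minimal sketch, not part of the original slides: the exponential population (mean 1, standard deviation 1), the sample size of 100, the number of samples, and the function name `simulate_sample_means` are all assumptions chosen for illustration.

```python
import random
import statistics


def simulate_sample_means(n, num_samples=2000, seed=0):
    """Draw repeated random samples of size n and return their means."""
    rng = random.Random(seed)
    # Assumed population: exponential with mean 1 and standard deviation 1,
    # which is strongly skewed rather than normal.
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(num_samples)
    ]


means = simulate_sample_means(n=100)
mean_of_means = statistics.fmean(means)   # should be close to mu = 1
observed_se = statistics.stdev(means)     # should be close to sigma / sqrt(n) = 0.1
print(mean_of_means, observed_se)
```

Even though the assumed population is skewed, the simulated sample means cluster around the population mean with a spread close to σ/√n.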

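Three Distributions. Three different types of distributions:

Population: central tendency $\mu = \frac{\sum X}{N}$; dispersion $\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$

Sample: central tendency $\bar{X} = \frac{\sum X}{n}$; dispersion $s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$

Sampling distribution: central tendency $\mu_{\bar{X}} = \mu$; dispersion $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$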

What Does This Mean? Suppose that we have a population with a mean equal to 100 (µ = 100) and a standard deviation equal to 15 (σ = 15). If we take a simple random sample of 400 cases (n = 400) from this population, we can immediately calculate the standard error of the sampling distribution and the 95% confidence limits:

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{15}{\sqrt{400}} = 0.750$$

$$CL_{95} = \mu \pm (1.96)(0.750), \quad \text{or} \quad 98.53 < \bar{X} < 101.47$$

The Effect of Sample Size. If the sample size were increased to 1,600, the standard error would be smaller and the confidence interval narrower:

$$\sigma_{\bar{X}} = \frac{15}{\sqrt{1{,}600}} = 0.375$$

$$CL_{95} = \mu \pm (1.96)(0.375), \quad \text{or} \quad 99.27 < \bar{X} < 100.74$$

The Effect of Confidence Level. If the sample size is held constant at 1,600 but we use a higher confidence level, 99% for example, we see an increase in the range of possible sample means:

$$\sigma_{\bar{X}} = \frac{15}{\sqrt{1{,}600}} = 0.375$$

$$CL_{99} = \mu \pm (2.58)(0.375), \quad \text{or} \quad 99.03 < \bar{X} < 100.97$$
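The three calculations above can be reproduced in a few lines. This is a sketch only; the function names `standard_error` and `interval_around_mean` are hypothetical, and the z values (1.96 and 2.58) are the ones quoted on the slides.

```python
import math


def standard_error(sigma, n):
    """Standard error of the mean: sigma divided by the square root of n."""
    return sigma / math.sqrt(n)


def interval_around_mean(mu, sigma, n, z):
    """Limits mu - z*SE and mu + z*SE for samples of size n."""
    se = standard_error(sigma, n)
    return mu - z * se, mu + z * se


# (n, z) pairs taken from the slides: n = 400 at 95%, n = 1,600 at 95% and 99%.
for n, z in [(400, 1.96), (1600, 1.96), (1600, 2.58)]:
    low, high = interval_around_mean(100, 15, n, z)
    print(n, z, standard_error(15, n), low, high)
```

Quadrupling n halves the standard error, while raising the confidence level widens the limits for a fixed n.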

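Confidence Intervals.

$$\bar{X} \pm (z)(\sigma_{\bar{X}}) = \bar{X} \pm (z)\frac{\sigma}{\sqrt{n}}$$

[Figure: sampling distribution of the mean. 90% of samples fall between $\mu - 1.645\sigma_{\bar{X}}$ and $\mu + 1.645\sigma_{\bar{X}}$; 95% of samples fall between $\mu - 1.96\sigma_{\bar{X}}$ and $\mu + 1.96\sigma_{\bar{X}}$; 99% of samples fall between $\mu - 2.58\sigma_{\bar{X}}$ and $\mu + 2.58\sigma_{\bar{X}}$.]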

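Intervals & Level of Confidence. [Figure: sampling distribution of the mean, centered on $\mu_{\bar{X}} = \mu$ with standard error $\sigma_{\bar{X}}$; the central area is $1 - \alpha$ and each tail has area $\alpha/2$. Confidence intervals drawn below the curve.] Intervals extend from $\bar{X} - Z\sigma_{\bar{X}}$ to $\bar{X} + Z\sigma_{\bar{X}}$. (1 - α)% of intervals contain µ; α% do not.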

Important Points. All else being equal: As sample size increases, the standard error decreases. As the standard error decreases, the confidence interval narrows. Conversely, small sample sizes are associated with larger standard errors, which in turn are associated with wider confidence intervals. Moving from a lower to a higher confidence level (e.g., 0.95 to 0.99), the confidence interval increases in size; it is more inclusive. Conversely, lower confidence levels (e.g., 0.95 versus 0.99) are associated with narrower confidence intervals; they are more exclusive. The smaller the population standard deviation (σ), the smaller the standard error and, in turn, the confidence interval. Conversely, the larger the population standard deviation, the larger the standard error and confidence interval.

Sample Point Estimates and Confidence Intervals. Symbolically, a point estimate of a mean is given as $\bar{X}$. We can place a confidence interval around this value. For example, using a 95% confidence interval (α = 0.05), we define boundaries approximately two standard errors below and above the point estimate:

$$\bar{X} \pm (1.96)\,\sigma_{\bar{X}}$$

Similarly, we can construct a 68% confidence interval:

$$\bar{X} \pm \sigma_{\bar{X}}$$

Or a 99% confidence interval:

$$\bar{X} \pm (2.58)\,\sigma_{\bar{X}}$$

In general, confidence intervals can be constructed for any desired level of confidence, 1 - α, using this formula:

$$\bar{X} \pm z_{\alpha/2}\,\sigma_{\bar{X}}$$
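The general formula can be sketched with Python's standard library, which provides the inverse of the standard normal cumulative distribution. The helper names `z_critical` and `z_interval` are hypothetical; the sketch assumes σ is known, so the z-distribution applies.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Two-tailed critical value z_(alpha/2) for a confidence level such as 0.95."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


def z_interval(x_bar, std_error, confidence):
    """Confidence interval x_bar +/- z_(alpha/2) * std_error."""
    z = z_critical(confidence)
    return x_bar - z * std_error, x_bar + z * std_error


for level in (0.68, 0.90, 0.95, 0.99):
    print(level, z_critical(level))
```

The computed critical values agree with the rounded table values used on the slides (about 1 for 68%, 1.645 for 90%, 1.96 for 95%, and 2.58 for 99%).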

Summary of Assumptions. We assume that:
1. the sample for estimating µ is drawn randomly;
2. we have chosen a sample where n is equal to or greater than 50;
3. we know σ.

Confidence Intervals when the Standard Error is Unknown. Typically, we will not know the population parameters. We may be in a position to make assumptions about the mean, but rarely about the standard deviation. We can usually make an estimate of the standard error using the following formula:

$$\hat{\sigma}_{\bar{X}} = \frac{s}{\sqrt{n - 1}}$$

When we use this formula, we have to use the t-distribution, not the z-distribution. In general, they are similar. For example, the general formula for confidence intervals becomes:

$$\bar{X} \pm t_{\alpha/2}\,\hat{\sigma}_{\bar{X}} = \bar{X} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n - 1}}$$

z- and t-distributions. Similarities to z: There are many t-distributions; their shape varies with the sample size (through the degrees of freedom, n - 1). The t-distribution is bell shaped and has a mean of zero. With large sample sizes (n of roughly 150 or more) the t- and z-distributions converge. Differences from z: The use of the t-distribution to test hypotheses assumes that the sample was drawn from a normally distributed population. The use of t is generally robust against violation of this assumption. A t-distribution for a given sample size has a larger variance than the z-distribution. Therefore, the standard error of a t-distribution is larger than that of a similar z-distribution.

Student's t Distribution. [Figure: the standard normal (Z) curve, bell-shaped and symmetric, plotted with t-distributions for df = 13 and df = 5, all centered at 0; the t curves have fatter tails.]

An Example Using t to Construct Confidence Intervals. Research in the 1970s indicated that city size had increased since World War I, but that this trend had reversed by 1970. Using data measuring the percentage change in city populations in 63 American cities, we find that the mean change is -1.26 with a standard deviation of 6.32. That is, the point estimate indicates that between 1960 and 1970 there was a decrease in average city size of 1.26%.

Using an alpha level of 0.05 with 62 degrees of freedom (n - 1), the tabled value of t is approximately 2.00. It is approximate because 62 df is not in the table; we use the entry for 60 df instead. The 95% confidence interval, then, is:

$$CL_{95} = \bar{X} \pm (t_{0.025})(\hat{\sigma}_{\bar{X}}) = -1.26 \pm (2.00)\frac{6.32}{\sqrt{63 - 1}} = -1.26 \pm (2.00)(0.8026) = -1.26 \pm 1.605$$

or: $-2.87 < \mu < 0.35$
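The city-size interval can be recomputed as follows. This is a sketch under stated assumptions: the function name `t_interval` is hypothetical, the critical value 2.00 is the tabled value quoted above (Python's standard library has no t-distribution quantile function), and the standard error is estimated as s/√(n - 1), following the formula given earlier in these slides.

```python
import math


def t_interval(x_bar, s, n, t_crit):
    """Confidence interval x_bar +/- t_crit * s / sqrt(n - 1)."""
    est_se = s / math.sqrt(n - 1)   # estimated standard error of the mean
    margin = t_crit * est_se
    return x_bar - margin, x_bar + margin


# City-size example: mean change -1.26, s = 6.32, n = 63, tabled t = 2.00.
low, high = t_interval(x_bar=-1.26, s=6.32, n=63, t_crit=2.00)
print(low, high)
```

Because the interval includes zero, the data do not rule out the possibility that average city size did not change.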

z- and t-tests. Besides placing confidence intervals around point estimates of the mean, we can also calculate standard z-tests and t-tests:

$$z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} \qquad t = \frac{\bar{X} - \mu}{\hat{\sigma}_{\bar{X}}}$$

An Example of Hypothesis Testing Using Point Estimates. If the difference is not equal to zero, do we reject the null hypothesis? To answer that question we need to know what chance or random error can do: what kind of differences is chance likely to produce? The central limit theorem provides a distribution based on chance. This allows us to see how chance operates on means.

We know that the mean score on an intelligence test in the general population is 100 with a standard deviation of 15. The mean based on a sample of size 100 from a program for accelerated students is 108. Clearly, there is a difference between the population and sample means. What could produce this difference?
1. The program is successful, or
2. random error, sampling error, or chance.
The real question we need to answer is how likely it is that chance produced this difference. Typically, we choose to assume #2 and call it the null hypothesis (H0). In other words, it is not likely that the difference between the sample mean and the population mean is exactly equal to zero; there will generally be some difference. The null hypothesis is the assumption that this difference is due to random error.

Hypothesis Testing. What could produce differences between observed and expected values? Either there actually is a difference, or the difference reflects random error, sampling error, or chance. There are five basic steps in hypothesis testing:
1. Assume the null hypothesis of no difference.
2. We have to have an idea about the range of outcomes if the null hypothesis is true. We obtain this from an appropriate sampling distribution.
3. We have to decide or set a criterion for enough evidence to be convinced that the null hypothesis is false. This is a significance level called alpha or α.
4. We have to go to the real world and collect data; that is, determine some sample statistic.
5. We compare the statistic from step 4 with the criterion from step 3 and reject or fail to reject the null hypothesis. If the value we calculate falls in the critical region, that is, exceeds the critical value associated with α, we reject the null hypothesis; otherwise we fail to reject it.

Null and Alternative Hypotheses. First we posit the null hypothesis:

$$H_0: \mu = 0$$

Next, we choose one of three different alternative hypotheses, depending on a priori expectations:

$$\text{2-tailed: } H_1: \mu \neq 0 \qquad \text{1-tailed: } H_1: \mu > 0 \ \text{ or } \ H_1: \mu < 0$$

Hypothesis Testing.
1. Assume the null hypothesis of no difference.
2. We have to have an idea about the range of outcomes if the null hypothesis is true. We obtain this from an appropriate sampling distribution.
3. We have to decide or set a criterion for enough evidence to be convinced that the null hypothesis is false. This is a significance level called alpha or α.
H0: there is no IQ difference between the population and the sample.
H1: there is a statistically significant difference in IQ between the population and the sample.
In this problem, we have a large sample size and we know the population standard deviation, so we can safely use the z-distribution. It is reasonable to assume that students in an accelerated program should have higher average IQ scores. Therefore, we choose to use a one-tailed test. Furthermore, since implementing a program like this universally would be expensive, we wish to minimize the probability of a Type I error. So, we select α = 0.01. In this case, z-critical is equal to 2.326.

Hypothesis Testing.
4. We have to go to the real world and collect data; that is, determine some sample statistic.
5. We compare the statistic from step 4 with the criterion from step 3 and reject or fail to reject the null hypothesis. If the value we calculate falls in the critical region, that is, exceeds the critical value associated with α, we reject the null hypothesis; otherwise we fail to reject it.

$$z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{108 - 100}{15 / \sqrt{100}} = 5.33$$

The calculated z of 5.33 exceeds the z-critical of 2.326. We reject the null hypothesis in favor of the alternative hypothesis, knowing that a test at this level rejects a true null hypothesis (a Type I error) only 1% of the time.
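The five steps, applied to the IQ example, can be sketched as follows. The function name `one_sample_z_test` is hypothetical; the sketch assumes a one-tailed (upper-tail) test with known σ, as in the example, and obtains the critical value from the standard library rather than from a table.

```python
import math
from statistics import NormalDist


def one_sample_z_test(x_bar, mu, sigma, n, alpha):
    """One-tailed (upper) z-test of a sample mean against a known population mean."""
    se = sigma / math.sqrt(n)                   # steps 1-2: sampling distribution under H0
    z_crit = NormalDist().inv_cdf(1 - alpha)    # step 3: criterion set by alpha
    z = (x_bar - mu) / se                       # step 4: sample statistic
    return z, z_crit, z > z_crit                # step 5: compare and decide


z, z_crit, reject = one_sample_z_test(x_bar=108, mu=100, sigma=15, n=100, alpha=0.01)
print(z, z_crit, reject)
```

For a two-tailed test, the criterion would instead use 1 - α/2 and the comparison would use the absolute value of z.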

Determining How Big a Sample You Need. You know that sample size affects the amount of error in parameter estimates: ceteris paribus, larger samples have less error. This is bound up in the following formula:

$$\text{error} = t_{\alpha/2}\,(\hat{\sigma}_{\bar{X}}) = t_{\alpha/2}\,\frac{s}{\sqrt{n}}$$

So, knowing α and either knowing the population standard deviation or making an estimate of it, you can solve this formula for n, the sample size:

$$n = \left(\frac{t_{\alpha/2}\,(\hat{\sigma})}{\text{error}}\right)^2$$

An Example. We know that the population mean and standard deviation for the Stanford-Binet intelligence test are 100 and 15, respectively. How large a sample do we need to produce an estimate of the mean within three points of the parameter? Since we know the actual parameters, we can use z:

$$n = \left(\frac{(1.96)(15)}{3}\right)^2 = 96.04, \text{ or roughly 100 cases}$$

What if we wanted to reduce the margin of error to one point? How big a sample do we need to draw?

$$n = \left(\frac{(1.96)(15)}{1}\right)^2 = 864.36, \text{ or about 865 cases}$$
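The sample-size formula is easy to turn into a small helper. This is a sketch; the name `required_sample_size` is hypothetical, and rounding up to the next whole case is an assumption made here so that the margin of error is not exceeded.

```python
import math


def required_sample_size(critical_value, sigma, error):
    """Smallest whole n with critical_value * sigma / sqrt(n) <= error."""
    return math.ceil((critical_value * sigma / error) ** 2)


n_three_points = required_sample_size(1.96, 15, 3)   # margin of error of three points
n_one_point = required_sample_size(1.96, 15, 1)      # margin of error of one point
print(n_three_points, n_one_point)
```

Cutting the margin of error to one third requires roughly nine times as many cases, because n grows with the square of the critical value times σ divided by the error.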

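Tests Involving Proportions.

$$\hat{p} \pm z_{\alpha/2}\,(\sigma_{\hat{p}}) = \hat{p} \pm z_{\alpha/2}\sqrt{\frac{(p)(1 - p)}{n}} \approx \hat{p} \pm z_{\alpha/2}\sqrt{\frac{(\hat{p})(1 - \hat{p})}{n}}$$

where $\hat{p}$ is the sample proportion: the number of sample cases with the attribute divided by n. Note: When n is large, $\hat{p}$ can approximate the value of p in the formula for $\sigma_{\hat{p}}$.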

An Example. In a sample of 1,000 American citizens, 637 respond that they trust the president. Using a 95% confidence interval, show the range of the population proportion that trusts the president.

$$CL_{95} = \hat{p} \pm z_{\alpha/2}\sqrt{\frac{(\hat{p})(1 - \hat{p})}{n}} = 0.637 \pm 1.96\sqrt{\frac{(0.637)(0.363)}{1{,}000}} = 0.637 \pm 0.030$$

$$0.607 < p < 0.667$$
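The same interval can be computed directly. This is a sketch; the function name `proportion_interval` is hypothetical, and it uses the large-sample approximation in which the sample proportion stands in for p in the standard error.

```python
import math


def proportion_interval(successes, n, z):
    """Large-sample interval p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)   # estimated standard error of p_hat
    margin = z * se
    return p_hat - margin, p_hat + margin


# Trust-in-the-president example: 637 of 1,000 respondents, z = 1.96 for 95%.
low, high = proportion_interval(successes=637, n=1000, z=1.96)
print(low, high)
```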

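Tests Involving Proportions. With $p_s$ the sample proportion and $p_u$ the population proportion under the null hypothesis:

$$z = \frac{p_s - p_u}{\sqrt{\frac{(p_u)(1 - p_u)}{n}}}$$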

An Example. In a sample of 40 students taking an examination, 70% earned a score of 80% or greater. The professor claims success if 80% meet or exceed the goal of mastering 80% of the examination material. Evaluate this examination using a 99% confidence level.

$$z = \frac{0.70 - 0.80}{\sqrt{\frac{(0.80)(0.20)}{40}}} = -1.58$$

z-critical is equal to ±2.575, and the absolute value of the calculated z (1.58) does not exceed it, so we fail to reject the null hypothesis.

Using a confidence interval, we get:

$$CL_{99} = \hat{p} \pm z_{\alpha/2}\sqrt{\frac{(p_u)(1 - p_u)}{n}} = 0.70 \pm 2.575\sqrt{\frac{(0.80)(0.20)}{40}} = 0.70 \pm 0.16$$

$$0.54 < p < 0.86$$

The interval contains 0.80, which agrees with the decision not to reject the null hypothesis.
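The examination example can be checked with a short function. This is a sketch; the name `proportion_z_test` is hypothetical, the test is two-tailed, and the critical value 2.575 is the tabled value quoted above. The standard error uses the hypothesized proportion, as in the z formula for proportions.

```python
import math


def proportion_z_test(p_sample, p_null, n, z_crit):
    """Two-tailed z-test of a sample proportion against a hypothesized proportion."""
    se = math.sqrt(p_null * (1 - p_null) / n)   # standard error under the null hypothesis
    z = (p_sample - p_null) / se
    return z, abs(z) > z_crit


# Examination example: 70% of 40 students met the goal, against a target of 80%.
z, reject = proportion_z_test(p_sample=0.70, p_null=0.80, n=40, z_crit=2.575)
print(z, reject)
```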