The t-test


So far:

- The benefit of a sampling distribution is that even if the original population is not normal, the sampling distribution of the mean based on that population will be normal (for sample sizes > 30).
- The benefit of a normal distribution is that we know what proportion of means lies within a certain distance of the mean. Distances from the mean are best understood in terms of z-scores.
- Example of a non-normal (skewed) distribution: annual income of Americans. Many people are in the low and medium income brackets; very few are ultra-rich, so the distribution is NOT normal. [Figure: frequency plotted against income, low to high, showing the skew.] Even so, the sampling distribution based on this skewed population distribution is normal.

Today: single-sample and independent-measures (between-subjects) tests.
Next week: repeated-measures (within-subjects) test.

A z-score for a sample mean tells us where in the distribution that particular mean lies.

[Figure: normal sampling distribution of the mean, axis marked µ − 3σ_X̄, µ − 2σ_X̄, µ − σ_X̄, µ, µ + σ_X̄, µ + 2σ_X̄, µ + 3σ_X̄, i.e. z = −2 to +2 and beyond; µ = mean, σ_X̄ = standard deviation of the sampling distribution.]

From the z-score we can determine what proportion of sample means lies above/below that particular mean.
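The proportion of sample means lying below a given z-score is the standard normal CDF evaluated at z, which the notes read off a table. A minimal sketch in Python (function names are my own, not from the notes) using only the standard library:

```python
from math import erf, sqrt

def normal_cdf(z):
    """Proportion of the standard normal distribution lying below z."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def z_for_sample_mean(xbar, mu, sigma, n):
    """z-score locating a sample mean in the sampling distribution,
    using the standard error sigma / sqrt(n)."""
    return (xbar - mu) / (sigma / sqrt(n))

# For z = +1: about 84.13% of sample means lie below it, 15.87% above.
below = normal_cdf(1.0)
print(round(below, 4), round(1.0 - below, 4))  # 0.8413 0.1587
```

The same idea locates any observed sample mean: convert it to z with `z_for_sample_mean`, then `normal_cdf` gives the proportion of means below it.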

For example, for z = +1, 15.87% of means lie above it and 84.13% of means lie below it.

[Figure: normal sampling distribution with z = −1.96, z = +1, and z = +1.96 marked on an axis running from µ − 3σ_X̄ to µ + 3σ_X̄.]

From last week: hypothesis testing

1. A small difference between the sample mean and the population mean: retain the null hypothesis (the difference has occurred by chance). A large difference: reject the null hypothesis in favour of the alternative hypothesis (the difference has not occurred by chance, it is real; OR, a large difference can occur by chance only very rarely).
2. "Large" is a difference that is likely to occur by chance only 5% of the time or less (p < .05), a compromise between Type 1 and Type 2 errors.
3. Directional hypothesis versus non-directional hypothesis.

Two new things today:

1) We use the z-test when we know the population SD. When we don't know the population SD, we calculate the SD from the sample and use the t-test:

   z = (X̄ − µ) / σ_X̄        t = (X̄ − µ) / est σ_X̄

2) We have been discussing single samples. Reason: learning how to work with single samples is easier (less computation). Today we'll learn how to deal with two samples.

TWO DIFFERENCES between a z-test and a t-test:

(1) When the population SD (σ) is known, we use σ_X̄ = σ / √N to compute the error term for the z-test. BUT if we don't know σ, we do a t-test, and we calculate what is called an unbiased sample SD (ŝ) as follows:

   ŝ = √( Σ(X_i − X̄)² / (N − 1) ),   where the sum of squares SS = Σ(X_i − X̄)²

Then we use ŝ to compute what is called the estimated error term (est σ_X̄) as follows:

   est σ_X̄ = ŝ / √N
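The unbiased sample SD ŝ divides by N − 1 rather than N; Python's `statistics.stdev` uses the same N − 1 denominator, so it can serve as a check. A sketch with hypothetical data (the sample values below are made up for illustration):

```python
from math import sqrt
import statistics

def unbiased_sd(sample):
    """Unbiased sample SD: sqrt(SS / (N - 1)),
    where SS = sum of squared deviations from the sample mean."""
    xbar = sum(sample) / len(sample)
    ss = sum((x - xbar) ** 2 for x in sample)
    return sqrt(ss / (len(sample) - 1))

def estimated_standard_error(sample):
    """Estimated error term for the t-test: est sigma_xbar = s_hat / sqrt(N)."""
    return unbiased_sd(sample) / sqrt(len(sample))

sample = [72, 81, 93, 68, 77, 85]  # hypothetical exam scores
print(unbiased_sd(sample))
print(statistics.stdev(sample))    # same value: stdev also divides by N - 1
print(estimated_standard_error(sample))
```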

(2) In performing a t-test we use a t-distribution. The t-distribution is used the same way as the z-distribution, BUT unlike the z-distribution there are many different t-distributions. Which one we use depends on the sample size, through the degrees of freedom:

   df = N − 1

Example: Suppose a city's year-four children take a maths exam and produce µ = 75. At the beginning of the year, 26 children are chosen at random to take part in an experiment, which involves special teaching methods. At the end of the year this group produces X̄ = 81. The population standard deviation is unknown; the unbiased sample standard deviation is ŝ = 16.317. Did the special teaching methods work? We choose α = 0.05.

WE WILL USE THE t-TEST. NOTE: THE POPULATION SD IS NOT KNOWN.

   H0: µ_s = 75 (no effect)
   H1: µ_s ≠ 75 (there is an effect)

Hypothesis testing from last week: if |z-observed| ≥ z-critical, then reject H0. Now it is: if |t-observed| ≥ t-critical, then reject H0.

Finding t-critical from the t-distribution: before we find t-critical we need to find out which distribution to look at.

[Figure: comparison of t-distributions for df = 4 and df = 25; α = 0.05.]

For sample size 26, df = 25. From a portion of a table of critical values of t (two-tailed): t-critical = 2.060. If |t-observed| ≥ 2.060, then reject H0.

Calculating t-observed: t is computed the same way as z, except we use an estimate of σ_X̄ (est σ_X̄):

   z = (X̄ − µ) / σ_X̄,       where σ_X̄ = σ / √N
   t = (X̄ − µ) / est σ_X̄,   where est σ_X̄ = ŝ / √N

   t = (81 − 75) / (16.317 / √26) = 6 / 3.2 = 1.88

Since this is smaller than t-critical (2.060), we fail to reject the null hypothesis.

If our hypothesis had been directional (one-tailed), "Special teaching methods will improve the maths score", then we would have been successful in rejecting the null hypothesis, as t-critical = 1.708 (for a one-tailed test, at α = .05).

   Two-tailed, df = 25: lower t-critical value = −2.060, upper t-critical value = +2.060 (0.025 in each tail).
   One-tailed, df = 25: t-critical value = 1.708 (0.05 in one tail).

Two-Sample Experiment: this involves more complex computations, BUT it is much more applicable in everyday life. The independent-measures t-test.
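The one-sample calculation above is short enough to script. A minimal sketch (function name is my own; the critical values are the table values quoted in the notes). Note that the notes round the standard error to 3.2 before dividing, giving 1.88; at full precision t comes out just under that:

```python
from math import sqrt

def one_sample_t(xbar, mu, s_hat, n):
    """One-sample t: (xbar - mu) / (s_hat / sqrt(n))."""
    return (xbar - mu) / (s_hat / sqrt(n))

# The maths-exam example: mu = 75, xbar = 81, s_hat = 16.317, N = 26.
t_obs = one_sample_t(81, 75, 16.317, 26)
t_crit_two_tailed = 2.060   # t table, df = 25, alpha = .05, two-tailed
t_crit_one_tailed = 1.708   # t table, df = 25, alpha = .05, one-tailed

print(round(t_obs, 3))                  # about 1.875
print(abs(t_obs) >= t_crit_two_tailed)  # False: fail to reject H0
print(t_obs >= t_crit_one_tailed)       # True: the one-tailed test rejects H0
```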

Working with two samples: using subscripts

- Stating the null hypothesis: µ1 − µ2 = 0
- Two sample means: X̄1, X̄2
- The error term: est σ_(X̄1 − X̄2)
- Two sample sizes: N1, N2
- Two degrees of freedom: df1, df2

EXAMPLE: An experiment on the effects of alcohol on task performance (time taken to type 20 words). Measure the time taken to perform the task for one set of subjects when under the influence of alcohol, and for a different set of subjects when sober.

Null hypothesis: alcohol has no effect on the time taken; the variation between the drunk sample mean and the sober sample mean is due to sampling variation, i.e. the drunk and sober mean performance times are based on samples from the same population.

   Participant   Group 1   Group 2
   1             13.0      11.1
   2             16.5      13.5
   3             16.9      11.0
   4             19.7       9.1
   5             17.6      13.3
   6             17.5      11.7
   7             18.1      14.3
   8             17.3      10.8
   9             14.5      12.6
   10            13.3      11.2

The test statistic:

   t = [ (X̄1 − X̄2) − (µ1 − µ2)hypothesized ] / est σ_(X̄1 − X̄2)

- (X̄1 − X̄2): the difference between sample means (should be close to zero if there is no difference between the two conditions).
- (µ1 − µ2)hypothesized: the predicted average difference between scores in our two samples (usually zero, since we assume the two samples don't differ).
- est σ_(X̄1 − X̄2): the estimated standard error of the difference between means (a measure of how much the difference between means might vary from one occasion to the next).

Step 1: the difference between means.

   t = [ (X̄1 − X̄2) − (µ1 − µ2)hypothesized ] / est σ_(X̄1 − X̄2)

   ΣX1 = 164.4, X̄1 = 16.44;   ΣX2 = 118.6, X̄2 = 11.86;   X̄1 − X̄2 = 4.58

Step 2a: calculating the pooled variance.

   s_p² = [ Σ(X1 − X̄1)² + Σ(X2 − X̄2)² ] / (df1 + df2) = (SS1 + SS2) / (df1 + df2)

Step 2b: calculating the two-sample standard error.

   est σ_(X̄1 − X̄2) = √( s_p²/N1 + s_p²/N2 )

Squared deviations from each group mean:

   Participant   Group 1: X   (X − X̄1)²   Group 2: X   (X − X̄2)²
   1             13.0         11.83        11.1         0.58
   2             16.5         0.004        13.5         2.69
   3             16.9         0.21         11.0         0.74
   4             19.7         10.63         9.1         7.62
   5             17.6         1.35         13.3         2.07
   6             17.5         1.12         11.7         0.03
   7             18.1         2.76         14.3         5.95
   8             17.3         0.74         10.8         1.12
   9             14.5         3.76         12.6         0.55
   10            13.3         9.86         11.2         0.44

   SS1 = Σ(X1 − X̄1)² = 42.26;   SS2 = Σ(X2 − X̄2)² = 21.78

With N1 = 10 (df1 = 9) and N2 = 10 (df2 = 9):

   s_p² = (SS1 + SS2) / (df1 + df2) = (42.26 + 21.78) / (9 + 9) = 3.56

   est σ_(X̄1 − X̄2) = √(3.56/10 + 3.56/10) = √0.712 = 0.843
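Steps 1 and 2 can be checked directly from the raw data in the table. A sketch (function name is my own) that reproduces SS1, SS2, the pooled variance, and the standard error; the last value differs from the notes' 0.843 only because the notes round s_p² to 3.56 first:

```python
from math import sqrt

group1 = [13.0, 16.5, 16.9, 19.7, 17.6, 17.5, 18.1, 17.3, 14.5, 13.3]
group2 = [11.1, 13.5, 11.0, 9.1, 13.3, 11.7, 14.3, 10.8, 12.6, 11.2]

def sum_of_squares(sample):
    """SS: sum of squared deviations from the sample mean."""
    xbar = sum(sample) / len(sample)
    return sum((x - xbar) ** 2 for x in sample)

ss1, ss2 = sum_of_squares(group1), sum_of_squares(group2)
df1, df2 = len(group1) - 1, len(group2) - 1

pooled_var = (ss1 + ss2) / (df1 + df2)   # s_p^2
se_diff = sqrt(pooled_var / len(group1) + pooled_var / len(group2))

print(round(ss1, 2), round(ss2, 2))      # 42.26 21.78
print(round(pooled_var, 2))              # 3.56
print(round(se_diff, 3))                 # 0.844 (0.843 with the notes' rounding)
```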

Step 3: the hypothesized difference. Frequently (µ1 − µ2)hypothesized = 0.

Steps 4 and 5: degrees of freedom and t-critical.

   df = (N1 − 1) + (N2 − 1) = 18

Finding t-critical from the table: t-critical = 2.10 for df = 18.

Step 6: t-observed.

   t = [ (X̄1 − X̄2) − (µ1 − µ2)hypothesized ] / est σ_(X̄1 − X̄2) = (4.58 − 0) / 0.843 = 5.429

Step 7: Decision. t-observed > t-critical, therefore reject the null hypothesis: alcohol has an impact on the time taken on the task. Report: t(18) = 5.429, p < 0.05.

Using SPSS to do an independent-measures t-test: data entry. [Screenshot slides.]
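The full seven-step procedure can be sketched as one function (the name `independent_t` is my own, not from the notes), run on the alcohol data:

```python
from math import sqrt

def independent_t(sample1, sample2, hypothesized_diff=0.0):
    """Independent-measures t with pooled variance, following the steps
    in the notes. Returns (t, df)."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = sum(sample1) / n1, sum(sample2) / n2
    ss1 = sum((x - m1) ** 2 for x in sample1)
    ss2 = sum((x - m2) ** 2 for x in sample2)
    df = (n1 - 1) + (n2 - 1)
    pooled_var = (ss1 + ss2) / df                      # s_p^2
    se = sqrt(pooled_var / n1 + pooled_var / n2)       # est standard error
    return (m1 - m2 - hypothesized_diff) / se, df

group1 = [13.0, 16.5, 16.9, 19.7, 17.6, 17.5, 18.1, 17.3, 14.5, 13.3]
group2 = [11.1, 13.5, 11.0, 9.1, 13.3, 11.7, 14.3, 10.8, 12.6, 11.2]

t_obs, df = independent_t(group1, group2)
t_crit = 2.10                  # t table, df = 18, alpha = .05, two-tailed
print(round(t_obs, 2), df)     # 5.43 18
print(t_obs > t_crit)          # True: reject the null hypothesis
```

If SciPy is available, `scipy.stats.ttest_ind(group1, group2)` uses the same pooled-variance formula by default and should give a matching t.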

Running SPSS (independent-measures t-test). [Screenshot slides.]

SPSS output (independent-measures t-test). [Screenshot slide.]