Hypothesis Testing in Action: t-tests


1 Hypothesis Testing in Action: t-tests Mark Muldoon School of Mathematics, University of Manchester Mark Muldoon, January 30, 2007 t-testing - p. 1/31

2 Overview
Today we'll examine four data sets and use hypothesis tests to explore them.
Differences in proportions: the Boston aspirin study
Differences in means: the t-tests and H.H. Koh's macular pigment data
Confidence intervals revisited: confidence with t
Are my data normally distributed? qq-plots

3 The Boston aspirin study
In a famous and very large study during the 1980s, several hospitals in the Boston area worked together to conduct a placebo-controlled, double-blind study of the efficacy of aspirin in preventing heart attacks. The results were:

Group     N-attacks   N Patients
Aspirin   104         11037
Placebo   189         11034

Is this an important difference?

4 Setup
The first question to ask is: how likely is this difference to have arisen by chance? We begin with a hypothesis test based on a z-score that addresses this question.
Null Hypothesis: The two proportions are the same.
Alternative Hypothesis: Either the two proportions differ (two-sided test) or a specific one of the proportions is larger (one-sided test).

5 Differences of proportions, large samples
The ingredients for this test are two experimentally observed proportions: $p_1 = r_1/n_1$ and $p_2 = r_2/n_2$.
(a) As the null hypothesis is that the proportions are the same, combine the data to get a single estimate of the underlying proportion:
\[ p = \frac{r_1 + r_2}{n_1 + n_2} \]
(b) Estimate the standard error of the difference between the two measured proportions:
\[ SE = \sqrt{p(1-p)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} \]

6 Differences of proportions, continued
(c) Compute
\[ z = \frac{p_1 - p_2}{SE} = \frac{p_1 - p_2}{\sqrt{p(1-p)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \]
(d) Consult the table for the standard normal.

7 Application to the aspirin data
(a) Under the null hypothesis our best estimate for $p$ is
\[ p = \frac{189 + 104}{11034 + 11037} \approx 0.0133 \]
(b) The standard error of the difference is then
\[ SE = \sqrt{p(1-p)\left(\frac{1}{11034} + \frac{1}{11037}\right)} \approx 0.00154 \]
(c) The z-score is
\[ z = \frac{(189/11034) - (104/11037)}{SE} \approx 5.00 \]

8 Application to the aspirin data
(d) This is a massively implausible z-score: we can reject the null hypothesis in favour of the alternative that the aspirin group has fewer heart attacks with confidence 99.99%.
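The recipe (a) to (d) for the aspirin data can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `two_proportion_z` and `two_sided_p_value` are made up here, and the p-value is computed from the standard normal tail with `math.erfc` rather than read from a table.

```python
import math

def two_proportion_z(r1, n1, r2, n2):
    """Return (pooled p, SE, z) for the difference of two proportions."""
    p1, p2 = r1 / n1, r2 / n2
    # (a) pooled estimate of the common proportion under the null hypothesis
    p = (r1 + r2) / (n1 + n2)
    # (b) standard error of the difference p1 - p2
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    # (c) z-score
    z = (p1 - p2) / se
    return p, se, z

def two_sided_p_value(z):
    """P(|Z| >= |z|) for a standard normal Z (replaces the table lookup)."""
    return math.erfc(abs(z) / math.sqrt(2))

# Counts taken from the z-score expression on the slide.
p, se, z = two_proportion_z(189, 11034, 104, 11037)
print(f"p = {p:.4f}, SE = {se:.5f}, z = {z:.2f}")
print(f"two-sided p-value = {two_sided_p_value(z):.2e}")
```

The z-score comes out at about 5.0, which is why the null hypothesis is rejected so firmly.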

9 Visual pigments and macular disease
The next two examples involve measurements of Macular Pigment Optical Density (MPOD) collected from two groups: patients suffering from macular disease and healthy control subjects.
Raw data are 10 total measurements per subject, collected in two sessions of 5 measurements with a break of around 30 minutes in between.
The MPOD is the difference between a measurement taken at central fixation and another taken in the periphery (5 degree visual angle).
All measurements are on healthy eyes, even among the patients, each of whom had only one diseased eye.
These data were collected by Ms. Hui Hiang Koh (now Dr. Koh) and her advisor, Dr. Ian Murray.

10 Are the two groups different?
[Table: MPOD and SEM for each subject in the Patients and Controls groups, with the group means $m_x$, $m_y$ and standard deviations $s_x$, $s_y$.]

11 Testing for differences
If anything, the patients seem to have more pigment than the controls. Is this apparent difference significant? Test with a new hypothesis test, the Two Sample t-test, designed for differences in the means of small samples.
Null Hypothesis: MPOD for Patients and Controls are drawn from the same normal distribution (same mean, same variance).
Alternative Hypothesis: MPOD for the two groups are drawn from normal distributions with different means, but the same variance.
This will involve a two-sided test based on a new statistic, t.

12 Folklore
The t-test was developed by W.S. Gosset, a statistician who worked for the Guinness brewing company. Employees of the firm were not allowed to publish under their own names, so he wrote under the pseudonym Student.
The t-statistic is similar to a z-score, but is applicable when the sample is too small to assume that $s_x^2$ and $s_y^2$ provide good estimates of the variances. This advantage comes at a small cost: the t-distribution (and hence the tables one consults to use it) is less straightforward than the normal distribution. The t-distribution depends on the size of the sample: when this grows large the distribution of t tends to the normal.

13 Distribution of t
[Figure: Student's t-distribution for ν = 2, 4 and 8, plotted against t. The dashed curve at the top is the standard normal distribution (µ = 0, σ = 1).]

14 Computing t for two samples
The ingredients are a confidence level $C$ and two lists of numbers, say, $\{x_1, x_2, \dots, x_{N_x}\}$ and $\{y_1, y_2, \dots, y_{N_y}\}$.
(a) Compute the two sample means, $m_x$ and $m_y$. Recall that, for example,
\[ m_x = \frac{\sum_{j=1}^{N_x} x_j}{N_x}. \]
(b) Compute the two standard deviations, $s_x$ and $s_y$. Recall that, for example,
\[ s_y^2 = \frac{\sum_{j=1}^{N_y} (y_j - m_y)^2}{N_y - 1}. \]

15 Two-sample t continued
(c) Compute the pooled standard deviation, $s$, which satisfies
\[ s^2 = \frac{(N_x - 1)s_x^2 + (N_y - 1)s_y^2}{(N_x - 1) + (N_y - 1)}. \]
(d) Last, one computes
\[ t = \frac{m_x - m_y}{s} \sqrt{\frac{N_x N_y}{N_x + N_y}}. \]
(e) Consult the t-table for $\nu = N_x + N_y - 2$ degrees of freedom.
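Steps (a) to (e) of the two-sample recipe translate directly into code. The sketch below uses only Python's standard library; the function names and the two short lists of numbers are hypothetical and are not the MPOD measurements. The critical value still has to be looked up in a t-table.

```python
import math
from statistics import mean, stdev

def pooled_variance(nx, sx, ny, sy):
    """Step (c): pooled variance from sample sizes and standard deviations."""
    return ((nx - 1) * sx**2 + (ny - 1) * sy**2) / ((nx - 1) + (ny - 1))

def two_sample_t(xs, ys):
    """Return (t, degrees of freedom) for the equal-variance two-sample t-test."""
    nx, ny = len(xs), len(ys)
    mx, my = mean(xs), mean(ys)        # step (a): sample means
    sx, sy = stdev(xs), stdev(ys)      # step (b): uses the N - 1 denominator
    s = math.sqrt(pooled_variance(nx, sx, ny, sy))
    t = (mx - my) / s * math.sqrt(nx * ny / (nx + ny))   # step (d)
    return t, nx + ny - 2              # step (e): nu = Nx + Ny - 2

# Hypothetical data, chosen only to exercise the function.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 3.0, 4.0, 5.0]
t, dof = two_sample_t(xs, ys)
print(f"t = {t:.3f} with {dof} degrees of freedom")
```

Calling `pooled_variance(9, 0.135, 9, 0.142)` applies step (c) to the summary statistics quoted for the MPOD data.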

16 Testing the MPOD data
Working through the recipe, $N_x = N_y = 9$ and:
(a) Patients had $m_x = 0.293$; Controls had $m_y = \dots$
(b) Patients had $s_x = 0.135$; Controls had $s_y = 0.142$.
(c) The pooled variance is thus
\[ s^2 = \frac{(N_x - 1)s_x^2 + (N_y - 1)s_y^2}{(N_x - 1) + (N_y - 1)} = \frac{8(0.135)^2 + 8(0.142)^2}{8 + 8} \approx 0.0192. \]

17 Testing the MPOD data, continued
(d) Thus
\[ t = \frac{m_x - m_y}{s} \sqrt{\frac{N_x N_y}{N_x + N_y}} = \dots \]
(e) This is smaller than the critical value, 2.120, for a two-sided test with $\nu = 16$ degrees of freedom at 95% confidence. We cannot reject the null hypothesis.

18 Paired sample design
The considerable variation within the groups may make it hard to see whether there is much systematic difference between groups.
Design a new type of study in which Patients and Controls are matched for age, gender, eye (left or right), iris colour and smoking habits.
Compare with the Paired Sample t-test.

19 Computing t for paired samples
The only ingredients are a confidence level $C$ and a list of $N$ pairs of numbers $\{(x_1, y_1), \dots, (x_N, y_N)\}$.
The null hypothesis is that the two members of each pair are drawn from normal distributions having the same mean.
All the distributions for the x's are assumed to share the same variance, as are all those for the y's, but the variance shared by the x's need not equal that shared by the y's.

20 Computing t for paired samples, continued
(a) Compute the differences $\delta_j = x_j - y_j$;
(b) Compute the mean of the differences
\[ m = \frac{\sum_{j=1}^{N} \delta_j}{N}; \]
(c) Estimate the variance of the differences
\[ s^2 = \frac{\sum_{j=1}^{N} (\delta_j - m)^2}{N - 1}; \]

21 Computing t for paired samples, concluded
(d) Compute the paired-sample t-statistic
\[ t = \frac{m \sqrt{N}}{s} \]
(e) Check against critical values in the t-table, here using $\nu = N - 1$ degrees of freedom.
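The paired recipe is short enough to write out in full. The following is a minimal sketch with a made-up function name, `paired_t`, and hypothetical matched values that are not the MPOD data.

```python
import math
from statistics import mean, stdev

def paired_t(xs, ys):
    """Return (t, degrees of freedom) for the paired-sample t-test."""
    if len(xs) != len(ys):
        raise ValueError("paired samples must have the same length")
    deltas = [x - y for x, y in zip(xs, ys)]   # (a) differences
    n = len(deltas)
    m = mean(deltas)                           # (b) mean of the differences
    s = stdev(deltas)                          # (c) sd with N - 1 denominator
    t = m * math.sqrt(n) / s                   # (d) paired-sample t-statistic
    return t, n - 1                            # (e) nu = N - 1

# Hypothetical matched pairs, for illustration only.
xs = [0.30, 0.25, 0.40, 0.35]
ys = [0.20, 0.20, 0.30, 0.20]
t, dof = paired_t(xs, ys)
print(f"t = {t:.3f} with {dof} degrees of freedom")
```

Because only the within-pair differences enter the statistic, variation between subjects that is shared by both members of a pair cancels out, which is the point of the matched design.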

22 Paired MPOD data, differences
[Table: MPOD for each matched Control and Patient, with the difference δ for each pair.]

23 Paired MPOD data, conclusions
The mean difference is $m = \dots$ with standard deviation $s = \dots$ This leads to
\[ t = \frac{m \sqrt{N}}{s} = \dots \]
This far exceeds the critical value, 2.306, for a two-sided test at the 95% confidence level ($\alpha/2 = 0.025$, $\nu = 8$). We can reject the null hypothesis and conclude that the difference is nonzero.

24 Confidence intervals for large samples: reprise
Recall: confidence intervals for a population mean based on large samples.
(1) Choose a confidence level $C$ ($C = 0.95$ for 95%) and define $\alpha = 1 - C$.
(2) Use the standard normal table to find that z-score, $z_{\alpha/2}$, such that $P(z \ge z_{\alpha/2}) = \alpha/2$.
(3) If your sample has mean $m$, the desired confidence interval for the population mean $\mu$ is:
\[ m - z_{\alpha/2}\,\mathrm{SEM} \le \mu \le m + z_{\alpha/2}\,\mathrm{SEM} \]

25 Confidence intervals: small samples
When the sample is small (say, $N < 30$) one cannot assume that the sample standard deviation $s$ is a good estimate of that for the population, $\sigma$: that's why Gosset developed the t-test. His statistic comes into small-sample confidence intervals too:
(1) Choose a confidence level $C$ and define $\alpha = 1 - C$.
(2) Use the t-table with $\nu = N - 1$ degrees of freedom to find that t-score, $t_{\alpha/2,\nu}$, such that $P(t \ge t_{\alpha/2,\nu}) = \alpha/2$.
(3) The desired confidence interval is:
\[ m - t_{\alpha/2,\nu}\,\mathrm{SEM} \le \mu \le m + t_{\alpha/2,\nu}\,\mathrm{SEM} \]
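Both confidence-interval recipes have the same shape: the sample mean plus or minus a critical value times the SEM. The sketch below assumes the usual definition SEM = s/√N. The function name `confidence_interval` and the nine data values are hypothetical. Python's standard library has no t-quantile function, so the critical value is passed in after being read from a t-table. The large-sample z-score can be obtained from `statistics.NormalDist`.

```python
import math
from statistics import NormalDist, mean, stdev

def confidence_interval(data, critical_value):
    """Return (lower, upper) = m -/+ critical_value * SEM, with SEM = s / sqrt(N)."""
    n = len(data)
    m = mean(data)
    sem = stdev(data) / math.sqrt(n)
    half_width = critical_value * sem
    return m - half_width, m + half_width

# Large samples: z_{alpha/2} from the standard normal, here for C = 0.95.
alpha = 1 - 0.95
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Small samples: t_{alpha/2, nu} must come from a t-table.
# 2.306 is the tabulated value for nu = 8 at 95% confidence.
t_crit = 2.306

# Hypothetical sample of N = 9 values, so nu = N - 1 = 8.
data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
lower, upper = confidence_interval(data, t_crit)
print(f"z critical value: {z_crit:.3f}")
print(f"95% t-interval: ({lower:.3f}, {upper:.3f})")
```

For the same data the t-interval is wider than the z-interval would be, reflecting the extra uncertainty in estimating σ from a small sample.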

26 Are my data normally distributed?
The hypotheses for the t-tests all involve normal distributions: how does one check whether the data are normally distributed?
Impossible to answer definitively: the exact distribution is a property beyond the reach of measurement.
Informative standard graphical methods are available.

27 Making a qq-plot
The sole ingredient is a list of, say, $N$ numbers.
(a) Sort the data into ascending order so that $x_1 \le x_2 \le \dots \le x_N$.
(b) Assign a cumulative probability to each $x_j$:
\[ p_j = \frac{j - 0.5}{N} \]

28 Making a qq-plot, continued
(c) Work out the z-score that would be associated with each cumulative probability $p_j$:
\[ \Phi(z_j) = p_j \]
One has to use the z-score table in reverse for this.
(d) Plot the pairs $(z_j, x_j)$.
If the data really are normally distributed then the points will lie near to the line
\[ x_j = \sigma z_j + \mu \]
where $\mu$ and $\sigma$ are the population mean and standard deviation for the x's.
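Steps (a) to (c) produce the coordinates of the qq-plot; only the plotting in step (d) needs a graphics library. A minimal sketch follows, in which `statistics.NormalDist().inv_cdf` stands in for reading the z-score table in reverse. The function name `qq_pairs` and the sample values are hypothetical.

```python
from statistics import NormalDist

def qq_pairs(data):
    """Return the list of (z_j, x_j) pairs for a normal qq-plot."""
    xs = sorted(data)                      # (a) ascending order
    n = len(xs)
    std_normal = NormalDist()              # mu = 0, sigma = 1
    pairs = []
    for j, x in enumerate(xs, start=1):
        p_j = (j - 0.5) / n                # (b) cumulative probability
        z_j = std_normal.inv_cdf(p_j)      # (c) solve Phi(z_j) = p_j
        pairs.append((z_j, x))
    return pairs

# Hypothetical data, for illustration only.
pairs = qq_pairs([3.0, 1.0, 2.0])
for z, x in pairs:
    print(f"z = {z:+.4f}, x = {x}")
```

The resulting pairs can be passed to any plotting tool as a scatter plot, with the line x = s z + m drawn through them for comparison.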

29 Example: uniformly distributed data
[Figure, left panel (pdf against x): A distribution that is uniform over the interval $0 \le x \le 1$ and the normal distribution with the same mean and variance.]
[Figure, right panel (samples $x_j$ against quantiles from the standard normal): The qq-plot for a sample of 50 values drawn from the uniform distribution at left. The dashed line is $x = s z + m$, where $m$ and $s$ are the sample mean and standard deviation for the x's.]

30 Example: normally distributed data
[Figure, left panel (pdf against x): The standard normal distribution: µ = 0, σ = 1.]
[Figure, right panel (samples $x_j$ against quantiles from the standard normal): The qq-plot for a sample of 50 values drawn from the normal distribution at left: notice that the dots are concentrated along the dashed line.]

31 The Shapiro-Wilk Test
One can take seriously the remark from a few slides back:
If the data really are normally distributed then the points will lie near to the line $x_j = \sigma z_j + \mu$, where $\mu$ and $\sigma$ are the population mean and standard deviation for the x's.
and, more-or-less, try to fit a line to the pairs $(z_j, x_j)$. A goodness-of-fit test on the line (about which we'll learn more later in the term) then yields a test statistic and a numerical p-value.
The evaluation of this statistic is somewhat more involved than my sketch suggests and the test requires special tables, so one normally resorts to software, for example the shapiro.test() function in R.
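To give a feel for the idea of measuring how straight the qq-plot is, the sketch below computes the Pearson correlation between the $z_j$ and the sorted $x_j$. This is not the Shapiro-Wilk statistic itself, which uses different weights and needs its own tables. It is only a rough, hypothetical stand-in, and the function name `qq_correlation` and both data sets are invented for illustration. For a real test one would use established software such as R's shapiro.test().

```python
import math
from statistics import NormalDist, mean

def qq_correlation(data):
    """Pearson correlation of the qq-plot pairs (z_j, x_j); near 1 means nearly straight."""
    xs = sorted(data)
    n = len(xs)
    std_normal = NormalDist()
    zs = [std_normal.inv_cdf((j - 0.5) / n) for j in range(1, n + 1)]
    mz, mx = mean(zs), mean(xs)
    sxz = sum((z - mz) * (x - mx) for z, x in zip(zs, xs))
    szz = sum((z - mz) ** 2 for z in zs)
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxz / math.sqrt(szz * sxx)

# Hypothetical data: exact normal quantiles (mu = 10, sigma = 2) versus a
# strongly skewed sequence.
straight = [NormalDist(10, 2).inv_cdf((j - 0.5) / 20) for j in range(1, 21)]
skewed = [2.0 ** k for k in range(20)]
r_straight = qq_correlation(straight)
r_skewed = qq_correlation(skewed)
print(f"normal-looking data: r = {r_straight:.4f}")
print(f"skewed data:         r = {r_skewed:.4f}")
```

A correlation close to 1 says only that the points lie near some straight line. Turning that into a p-value is exactly the step that requires the special tables mentioned above.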


More information

Probability and Statistics

Probability and Statistics Probability and Statistics Kristel Van Steen, PhD 2 Montefiore Institute - Systems and Modeling GIGA - Bioinformatics ULg kristel.vansteen@ulg.ac.be CHAPTER 4: IT IS ALL ABOUT DATA 4a - 1 CHAPTER 4: IT

More information

Sampling Distributions: Central Limit Theorem

Sampling Distributions: Central Limit Theorem Review for Exam 2 Sampling Distributions: Central Limit Theorem Conceptually, we can break up the theorem into three parts: 1. The mean (µ M ) of a population of sample means (M) is equal to the mean (µ)

More information

Multivariate Statistical Analysis

Multivariate Statistical Analysis Multivariate Statistical Analysis Fall 2011 C. L. Williams, Ph.D. Lecture 9 for Applied Multivariate Analysis Outline Addressing ourliers 1 Addressing ourliers 2 Outliers in Multivariate samples (1) For

More information

Inferences About Two Proportions

Inferences About Two Proportions Inferences About Two Proportions Quantitative Methods II Plan for Today Sampling two populations Confidence intervals for differences of two proportions Testing the difference of proportions Examples 1

More information

23. MORE HYPOTHESIS TESTING

23. MORE HYPOTHESIS TESTING 23. MORE HYPOTHESIS TESTING The Logic Behind Hypothesis Testing For simplicity, consider testing H 0 : µ = µ 0 against the two-sided alternative H A : µ µ 0. Even if H 0 is true (so that the expectation

More information

Hypothesis tests for two means

Hypothesis tests for two means Chapter 3 Hypothesis tests for two means 3.1 Introduction Last week you were introduced to the concept of hypothesis testing in statistics, and we considered hypothesis tests for the mean if we have a

More information

AMS7: WEEK 7. CLASS 1. More on Hypothesis Testing Monday May 11th, 2015

AMS7: WEEK 7. CLASS 1. More on Hypothesis Testing Monday May 11th, 2015 AMS7: WEEK 7. CLASS 1 More on Hypothesis Testing Monday May 11th, 2015 Testing a Claim about a Standard Deviation or a Variance We want to test claims about or 2 Example: Newborn babies from mothers taking

More information

79 Wyner Math Academy I Spring 2016

79 Wyner Math Academy I Spring 2016 79 Wyner Math Academy I Spring 2016 CHAPTER NINE: HYPOTHESIS TESTING Review May 11 Test May 17 Research requires an understanding of underlying mathematical distributions as well as of the research methods

More information

Harvard University. Rigorous Research in Engineering Education

Harvard University. Rigorous Research in Engineering Education Statistical Inference Kari Lock Harvard University Department of Statistics Rigorous Research in Engineering Education 12/3/09 Statistical Inference You have a sample and want to use the data collected

More information

Basics on t-tests Independent Sample t-tests Single-Sample t-tests Summary of t-tests Multiple Tests, Effect Size Proportions. Statistiek I.

Basics on t-tests Independent Sample t-tests Single-Sample t-tests Summary of t-tests Multiple Tests, Effect Size Proportions. Statistiek I. Statistiek I t-tests John Nerbonne CLCG, Rijksuniversiteit Groningen http://www.let.rug.nl/nerbonne/teach/statistiek-i/ John Nerbonne 1/46 Overview 1 Basics on t-tests 2 Independent Sample t-tests 3 Single-Sample

More information

Statistical methods for comparing multiple groups. Lecture 7: ANOVA. ANOVA: Definition. ANOVA: Concepts

Statistical methods for comparing multiple groups. Lecture 7: ANOVA. ANOVA: Definition. ANOVA: Concepts Statistical methods for comparing multiple groups Lecture 7: ANOVA Sandy Eckel seckel@jhsph.edu 30 April 2008 Continuous data: comparing multiple means Analysis of variance Binary data: comparing multiple

More information

McGill University. Faculty of Science MATH 204 PRINCIPLES OF STATISTICS II. Final Examination

McGill University. Faculty of Science MATH 204 PRINCIPLES OF STATISTICS II. Final Examination McGill University Faculty of Science MATH 204 PRINCIPLES OF STATISTICS II Final Examination Date: 20th April 2009 Time: 9am-2pm Examiner: Dr David A Stephens Associate Examiner: Dr Russell Steele Please

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Analysis and Statistical Methods Statistics 65 http://www.stat.tamu.edu/~suhasini/teaching.html Suhasini Subba Rao Comparing populations Suppose I want to compare the heights of males and females

More information

Introduction to Business Statistics QM 220 Chapter 12

Introduction to Business Statistics QM 220 Chapter 12 Department of Quantitative Methods & Information Systems Introduction to Business Statistics QM 220 Chapter 12 Dr. Mohammad Zainal 12.1 The F distribution We already covered this topic in Ch. 10 QM-220,

More information

Last two weeks: Sample, population and sampling distributions finished with estimation & confidence intervals

Last two weeks: Sample, population and sampling distributions finished with estimation & confidence intervals Past weeks: Measures of central tendency (mean, mode, median) Measures of dispersion (standard deviation, variance, range, etc). Working with the normal curve Last two weeks: Sample, population and sampling

More information

Chapter 27 Summary Inferences for Regression

Chapter 27 Summary Inferences for Regression Chapter 7 Summary Inferences for Regression What have we learned? We have now applied inference to regression models. Like in all inference situations, there are conditions that we must check. We can test

More information

Multiple Regression Analysis

Multiple Regression Analysis Multiple Regression Analysis y = β 0 + β 1 x 1 + β 2 x 2 +... β k x k + u 2. Inference 0 Assumptions of the Classical Linear Model (CLM)! So far, we know: 1. The mean and variance of the OLS estimators

More information

1 Hypothesis testing for a single mean

1 Hypothesis testing for a single mean This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

Statistics for IT Managers

Statistics for IT Managers Statistics for IT Managers 95-796, Fall 2012 Module 2: Hypothesis Testing and Statistical Inference (5 lectures) Reading: Statistics for Business and Economics, Ch. 5-7 Confidence intervals Given the sample

More information

Descriptive Statistics-I. Dr Mahmoud Alhussami

Descriptive Statistics-I. Dr Mahmoud Alhussami Descriptive Statistics-I Dr Mahmoud Alhussami Biostatistics What is the biostatistics? A branch of applied math. that deals with collecting, organizing and interpreting data using well-defined procedures.

More information

STA 2101/442 Assignment 2 1

STA 2101/442 Assignment 2 1 STA 2101/442 Assignment 2 1 These questions are practice for the midterm and final exam, and are not to be handed in. 1. A polling firm plans to ask a random sample of registered voters in Quebec whether

More information

Multiple samples: Modeling and ANOVA

Multiple samples: Modeling and ANOVA Multiple samples: Modeling and Patrick Breheny April 29 Patrick Breheny Introduction to Biostatistics (171:161) 1/23 Multiple group studies In the latter half of this course, we have discussed the analysis

More information

Independent Samples t tests. Background for Independent Samples t test

Independent Samples t tests. Background for Independent Samples t test Independent Samples t tests Dr. J. Kyle Roberts Southern Methodist University Simmons School of Education and Human Development Department of Teaching and Learning Background for Independent Samples t

More information

Warm-up Using the given data Create a scatterplot Find the regression line

Warm-up Using the given data Create a scatterplot Find the regression line Time at the lunch table Caloric intake 21.4 472 30.8 498 37.7 335 32.8 423 39.5 437 22.8 508 34.1 431 33.9 479 43.8 454 42.4 450 43.1 410 29.2 504 31.3 437 28.6 489 32.9 436 30.6 480 35.1 439 33.0 444

More information

Elementary Statistics Triola, Elementary Statistics 11/e Unit 17 The Basics of Hypotheses Testing

Elementary Statistics Triola, Elementary Statistics 11/e Unit 17 The Basics of Hypotheses Testing (Section 8-2) Hypotheses testing is not all that different from confidence intervals, so let s do a quick review of the theory behind the latter. If it s our goal to estimate the mean of a population,

More information

Chapter 9: Hypothesis Testing Sections

Chapter 9: Hypothesis Testing Sections Chapter 9: Hypothesis Testing Sections 9.1 Problems of Testing Hypotheses 9.2 Testing Simple Hypotheses 9.3 Uniformly Most Powerful Tests Skip: 9.4 Two-Sided Alternatives 9.6 Comparing the Means of Two

More information

Note that we are looking at the true mean, μ, not y. The problem for us is that we need to find the endpoints of our interval (a, b).

Note that we are looking at the true mean, μ, not y. The problem for us is that we need to find the endpoints of our interval (a, b). Confidence Intervals 1) What are confidence intervals? Simply, an interval for which we have a certain confidence. For example, we are 90% certain that an interval contains the true value of something

More information

4.1. Introduction: Comparing Means

4.1. Introduction: Comparing Means 4. Analysis of Variance (ANOVA) 4.1. Introduction: Comparing Means Consider the problem of testing H 0 : µ 1 = µ 2 against H 1 : µ 1 µ 2 in two independent samples of two different populations of possibly

More information

Probabilities & Statistics Revision

Probabilities & Statistics Revision Probabilities & Statistics Revision Christopher Ting Christopher Ting http://www.mysmu.edu/faculty/christophert/ : christopherting@smu.edu.sg : 6828 0364 : LKCSB 5036 January 6, 2017 Christopher Ting QF

More information

Lecture 8 Hypothesis Testing

Lecture 8 Hypothesis Testing Lecture 8 Hypothesis Testing Taylor Ch. 6 and 10.6 Introduction l The goal of hypothesis testing is to set up a procedure(s) to allow us to decide if a mathematical model ("theory") is acceptable in light

More information

CIVL /8904 T R A F F I C F L O W T H E O R Y L E C T U R E - 8

CIVL /8904 T R A F F I C F L O W T H E O R Y L E C T U R E - 8 CIVL - 7904/8904 T R A F F I C F L O W T H E O R Y L E C T U R E - 8 Chi-square Test How to determine the interval from a continuous distribution I = Range 1 + 3.322(logN) I-> Range of the class interval

More information

Hypothesis Testing One Sample Tests

Hypothesis Testing One Sample Tests STATISTICS Lecture no. 13 Department of Econometrics FEM UO Brno office 69a, tel. 973 442029 email:jiri.neubauer@unob.cz 12. 1. 2010 Tests on Mean of a Normal distribution Tests on Variance of a Normal

More information

Chapter 9. Hypothesis testing. 9.1 Introduction

Chapter 9. Hypothesis testing. 9.1 Introduction Chapter 9 Hypothesis testing 9.1 Introduction Confidence intervals are one of the two most common types of statistical inference. Use them when our goal is to estimate a population parameter. The second

More information

Statistical Analysis How do we know if it works? Group workbook: Cartoon from XKCD.com. Subscribe!

Statistical Analysis How do we know if it works? Group workbook: Cartoon from XKCD.com. Subscribe! Statistical Analysis How do we know if it works? Group workbook: Cartoon from XKCD.com. Subscribe! http://www.xkcd.com/552/ Significant Concepts We structure the presentation and processing of data to

More information

Are data normally normally distributed?

Are data normally normally distributed? Standard Normal Image source Are data normally normally distributed? Sample mean: 66.78 Sample standard deviation: 3.37 (66.78-1 x 3.37, 66.78 + 1 x 3.37) (66.78-2 x 3.37, 66.78 + 2 x 3.37) (66.78-3 x

More information

Department of Mathematics & Statistics STAT 2593 Final Examination 17 April, 2000

Department of Mathematics & Statistics STAT 2593 Final Examination 17 April, 2000 Department of Mathematics & Statistics STAT 2593 Final Examination 17 April, 2000 TIME: 3 hours. Total marks: 80. (Marks are indicated in margin.) Remember that estimate means to give an interval estimate.

More information

Chapter 7: Statistical Inference (Two Samples)

Chapter 7: Statistical Inference (Two Samples) Chapter 7: Statistical Inference (Two Samples) Shiwen Shen University of South Carolina 2016 Fall Section 003 1 / 41 Motivation of Inference on Two Samples Until now we have been mainly interested in a

More information

Confidence intervals

Confidence intervals Confidence intervals We now want to take what we ve learned about sampling distributions and standard errors and construct confidence intervals. What are confidence intervals? Simply an interval for which

More information

Question. Hypothesis testing. Example. Answer: hypothesis. Test: true or not? Question. Average is not the mean! μ average. Random deviation or not?

Question. Hypothesis testing. Example. Answer: hypothesis. Test: true or not? Question. Average is not the mean! μ average. Random deviation or not? Hypothesis testing Question Very frequently: what is the possible value of μ? Sample: we know only the average! μ average. Random deviation or not? Standard error: the measure of the random deviation.

More information

How do we compare the relative performance among competing models?

How do we compare the relative performance among competing models? How do we compare the relative performance among competing models? 1 Comparing Data Mining Methods Frequent problem: we want to know which of the two learning techniques is better How to reliably say Model

More information