Chapter 10: STATISTICAL INFERENCE FOR TWO SAMPLES. Part 1: Hypothesis tests on $\mu_1 - \mu_2$ for independent groups


Sections 10-1 & 10-2: Independent Groups

It is common to compare two groups and to do a hypothesis test regarding the parameters of the groups. We will discuss two data collection designs in this chapter, and we will discuss how the design choice affects how we analyze the data.

- Number of accidents for the 9am-5pm work shift compared to number of accidents for the midnight-8am work shift. (Comparison of means, $\mu_1$ vs. $\mu_2$)

- The weights of individuals before a diet starts compared to the weights of the individuals after 10 weeks on the diet. (Comparison of means, $\mu_1$ vs. $\mu_2$)

This year, we will not cover comparison of proportions, but they are also common:

- Proportion of defects in a device manufactured under process 1 vs. proportion of defects in a device manufactured under process 2. (Comparison of proportions, $p_1$ vs. $p_2$)
- Satisfied customer rate for AT&T cellular compared to satisfaction rate for T-Mobile. (Comparison of proportions, $p_1$ vs. $p_2$)

Hypothesis testing for comparison of means:

$$H_0: \mu_1 = \mu_2 \quad\text{(equivalently } H_0: \mu_1 - \mu_2 = 0\text{)}$$
$$H_1: \mu_1 \neq \mu_2 \quad\text{(equivalently } H_1: \mu_1 - \mu_2 \neq 0\text{)}$$

When comparing the means of two samples ($\bar{X}_1$ vs. $\bar{X}_2$), we must determine whether the data from the groups are totally independent or whether they are related, because this affects the type of analysis performed (more on this later).

We start with independent groups (sections 10-1 to 10-2)...

Example: Time spent exercising for males and females.

QUESTION: Do males and females at UI spend the same amount of time, on average, at the UI fitness center?

X = Time in minutes spent at the fitness center.

Women ($n_1 = 15$)        Men ($n_2 = 15$)
63, 32, 86, 53, 49        52, 75, 74, 68, 93
73, 39, 56, 45, 67        77, 41, 87, 72, 53
49, 51, 65, 54, 56        84, 65, 66, 69, 62
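The sample summaries used in the rest of this example can be recomputed directly from the table. The sketch below is only an illustration: it assumes the flattened table alternates women and men row by row (which reproduces the difference of 13.33 minutes quoted in the slides), and the variable names are chosen here, not taken from the slides.

```python
from statistics import mean, stdev

# Times in minutes at the fitness center, copied from the table above.
# Assumption: rows of the flattened table alternate women / men.
women = [63, 32, 86, 53, 49, 73, 39, 56, 45, 67, 49, 51, 65, 54, 56]
men = [52, 75, 74, 68, 93, 77, 41, 87, 72, 53, 84, 65, 66, 69, 62]

xbar1, xbar2 = mean(women), mean(men)   # sample means
s1, s2 = stdev(women), stdev(men)       # sample standard deviations (n - 1 divisor)
diff = xbar1 - xbar2                    # women minus men

print(f"women: n={len(women)}, mean={xbar1:.2f}, sd={s1:.2f}")
print(f"men:   n={len(men)}, mean={xbar2:.2f}, sd={s2:.2f}")
print(f"difference in sample means: {diff:.2f}")
```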

$\bar{x}_1 = 55.87$ minutes, $s_1 = 13.53$ minutes
$\bar{x}_2 = 69.20$ minutes, $s_2 = 13.79$ minutes

The sample of females spends, on average, 13.33 minutes less time at the fitness center than this sample of males.

$$\bar{x}_1 - \bar{x}_2 = 55.87 - 69.20 = -13.33 \text{ minutes}$$

Is this difference in sample means large enough to say that $\mu_1 \neq \mu_2$?

As before, to answer this from a statistical viewpoint, we need to consider the chance of getting a sample difference this large even when the two populations actually spend the same amount of time, on average, i.e. when $\mu_1 - \mu_2 = 0$.

To get at this probability, we need to know the behavior of the difference in sample means $\bar{X}_1 - \bar{X}_2$.

We looked at the distribution of this random variable ($\bar{X}_1 - \bar{X}_2$) in chapter 7, and we will revisit it here.

Difference in sample means:

1. If $\sigma_1^2$ and $\sigma_2^2$ are known, and the distribution of values from both groups is normal, we have

$$\bar{X}_1 - \bar{X}_2 \sim N\left(\mu_1 - \mu_2,\ \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right)$$

and

$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

where $Z$ has a $N(0,1)$ distribution.

If the original distributions are not normal but $n_1$ and $n_2$ are large, the above will also hold approximately, by the central limit theorem.

Difference in sample means:

2. If $\sigma_1^2$ and $\sigma_2^2$ are NOT known, we will have to estimate them. IF IT IS REASONABLE TO ASSUME BOTH GROUPS HAVE A COMMON $\sigma^2$, we will pool the information from both groups to estimate this common $\sigma^2$.

The pooled estimator of $\sigma^2$, denoted by $S_p^2$, is defined by

$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$

This value estimates both $\sigma_1^2$ and $\sigma_2^2$ because $\sigma_1^2 = \sigma_2^2 = \sigma^2$.

$S_1^2$ is the sample variance from group 1.
$S_2^2$ is the sample variance from group 2.

$S_p^2$ is a weighted average of the two sample variances.

If the distribution of values from both groups is normal, we have the random variable $T$ as

$$T = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_p^2}{n_1} + \dfrac{S_p^2}{n_2}}} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

and $T$ has a $t_{n_1+n_2-2}$ distribution, i.e. a t-distribution with $n_1 + n_2 - 2$ degrees of freedom.

Difference in sample means:

3. If $\sigma_1^2$ and $\sigma_2^2$ are NOT known, AND the groups have different variances, then we should not take a pooled estimate of the variability. We should instead leave the estimates separate as $S_1^2$ and $S_2^2$.

In this case we have the random variable $T$ as

$$T = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$$

where $T$ has a $t_\nu$ distribution, and the degrees of freedom $\nu$ for the t distribution is given by...

$$\nu = \frac{\left(\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}\right)^2}{\dfrac{\left(S_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(S_2^2/n_2\right)^2}{n_2 - 1}}$$

This is known as Welch's approximate t.
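Because the Welch degrees-of-freedom formula is awkward to evaluate by hand, a small helper can make it less error-prone. This is a minimal sketch: the function name `welch_df` and the numeric inputs are hypothetical and chosen only for illustration.

```python
def welch_df(s1_sq, n1, s2_sq, n2):
    """Welch's approximate degrees of freedom for two independent samples.

    s1_sq, s2_sq are the sample variances; n1, n2 are the sample sizes.
    """
    a = s1_sq / n1
    b = s2_sq / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

# Hypothetical inputs for illustration only (not taken from the slides).
# With equal variances and equal sample sizes, the formula reduces to n1 + n2 - 2.
print(welch_df(s1_sq=4.0, n1=15, s2_sq=4.0, n2=15))

# With unequal variances, the result falls between min(n1, n2) - 1 and n1 + n2 - 2.
print(welch_df(s1_sq=1.0, n1=19, s2_sq=9.0, n2=11))
```

The result is generally not an integer. Rounding it down (for example with `math.floor`) is the conservative choice.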

Thus, in working with $\bar{X}_1 - \bar{X}_2$ to make an inference on $\mu_1 - \mu_2$:

- if we know $\sigma_1^2$ and $\sigma_2^2$, we will use a Z-distribution.
- if we don't know them and we think $\sigma_1^2 = \sigma_2^2$, we will use a t-distribution with $n_1 + n_2 - 2$ degrees of freedom and use a pooled estimate of the common variance, $S_p^2$.
- if we don't know them and we think $\sigma_1^2 \neq \sigma_2^2$, we will use a t-distribution with $\nu$ degrees of freedom (ugly but useful, formula on previous slide) and have separate estimates for the variances, $S_1^2$ and $S_2^2$.

Back to the example where we will utilize a hypothesis test...

QUESTION: Do males and females at UI spend the same amount of time, on average, at the UI fitness center?

1. HYPOTHESES:

$$H_0: \mu_1 = \mu_2 \quad\text{(equivalently } H_0: \mu_1 - \mu_2 = 0\text{)}$$
$$H_1: \mu_1 \neq \mu_2 \quad\text{(equivalently } H_1: \mu_1 - \mu_2 \neq 0\text{)}$$

Group 1 is the females, group 2 is the males.

2. TEST STATISTIC: We have a small sample from each group, and $\sigma_1^2$ and $\sigma_2^2$ are not given to us. We will use a t-statistic. We will assume the groups have a common variance $\sigma^2$ (we can check this assumption later).

$n_1 = 15$, $\bar{x}_1 = 55.87$ min., $s_1 = 13.53$ min.
$n_2 = 15$, $\bar{x}_2 = 69.20$ min., $s_2 = 13.79$ min.

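Pooled estimate of common $\sigma^2$:

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{14(182.98) + 14(190.17)}{28} = 186.58 \quad\text{and}\quad s_p = \sqrt{186.58} = 13.66$$

The observed test statistic, computed under $H_0$ true, is

$$t_0 = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{(-13.33) - (0)}{13.66\sqrt{\dfrac{1}{15} + \dfrac{1}{15}}} = -2.67$$

and $T_0 \sim t_{28}$ under $H_0$ true ($n_1 + n_2 - 2 = 28$).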
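The pooled calculation can be sketched in a few lines using only the Python standard library. The data lists repeat the fitness-center table (assuming its rows alternate women and men), and the variable names are illustrative rather than anything from the slides.

```python
from math import sqrt
from statistics import mean, variance

# Fitness-center times in minutes (group 1 = women, group 2 = men).
women = [63, 32, 86, 53, 49, 73, 39, 56, 45, 67, 49, 51, 65, 54, 56]
men = [52, 75, 74, 68, 93, 77, 41, 87, 72, 53, 84, 65, 66, 69, 62]
n1, n2 = len(women), len(men)

# Pooled variance: a weighted average of the two sample variances.
sp_sq = ((n1 - 1) * variance(women) + (n2 - 1) * variance(men)) / (n1 + n2 - 2)
sp = sqrt(sp_sq)

# Observed statistic under H0: mu1 - mu2 = 0.
t0 = (mean(women) - mean(men) - 0) / (sp * sqrt(1 / n1 + 1 / n2))
df = n1 + n2 - 2

print(f"sp^2 = {sp_sq:.2f}, sp = {sp:.2f}, t0 = {t0:.2f}, df = {df}")
```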

3. P-VALUE: Under $H_0$ true, $T_0 \sim t_{28}$. Compute $P(T_0 < -2.67) \approx 0.006$ (from software, not your t-table). This is a 2-sided test, so P-value $\approx 2 \times 0.006 = 0.012$.

4. DECISION: Because the P-value is $< \alpha = 0.05$, we reject $H_0$. There IS statistically significant evidence that the mean time spent at the fitness center is not the same for men and women.

5. CHECK ANY ASSUMPTIONS: With the T test statistic (unknown $\sigma^2$), we'll check that the original distributions are nearly normal.
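When statistical software is not at hand, a two-sided t-test P-value can be approximated by integrating the t density numerically. The sketch below is one possible approach, not the method used in the slides: the function names are hypothetical, and it uses Simpson's rule with only the standard library.

```python
from math import gamma, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom.

    Note: math.gamma overflows for very large df, so this is meant for
    the small-sample settings discussed here.
    """
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t0, df, steps=2000):
    """Two-sided P-value 2*P(T > |t0|) via Simpson's rule on [0, |t0|].

    steps must be even for Simpson's rule.
    """
    a = abs(t0)
    if a == 0:
        return 1.0
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3          # approximates P(0 < T < |t0|)
    return 2 * (0.5 - central)

# Sanity check: 2.048 is the tabled t value with 0.025 in the upper tail for 28 df,
# so the two-sided P-value should be close to 0.05.
print(two_sided_p(2.048, 28))

# The observed statistic from the fitness-center example (about -2.67, 28 df).
print(two_sided_p(-2.67, 28))
```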

[Figures: normal probability plot for women; normal probability plot for men]

We could also check the constant variance assumption (we note that $s_1$ and $s_2$ are very similar, but there are specific tests and plots we can use to check this).

We should also make sure that we have independent random samples from the two populations we are interested in.

There are many choices for statistical software, Minitab being one of them. Here is the output from this analysis performed in the freely available software called R:

    > t.test(women, men, var.equal=TRUE)

        Two Sample t-test

    data: women and men
    t = , df = 28, p-value =
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval of difference:
    sample estimates:
    mean of x mean of y

Sometimes we want to do a test such as

$$H_0: \mu_1 - \mu_2 = \Delta_0$$

where we're interested in a specific difference $\Delta_0$ between the means...

Example: Viscosity

Fifteen batches of polymer are manufactured under the present process and the viscosity is measured:

724, 718, 776, 760, 745, 759, 795, 756, 742, 740, 761, 749, 739, 747, 742

A process change is made and eight batches are manufactured and the viscosity is measured:

755, 785, 729, 775, 783, 760, 738, ...

From a long history of viscosity measurements, they know the variability of viscosity is fairly stable, and they know $\sigma = 20$. They also know the viscosity measurements are normally distributed.

They would like to detect whether the mean viscosity of the new process is more than 10 units above the old mean (this could cause problems in manufacturing down the line). Perform a hypothesis test at the $\alpha = 0.10$ level.

Because $\sigma$ is known, the test statistic will be a Z-statistic.

$\bar{x}_2 = 765$ and $\bar{x}_1 = 750.2$ and $\bar{x}_2 - \bar{x}_1 = 14.8$, where group 2 is from the new manufacturing process.

The difference in sample means IS more than 10 units, but have we collected enough data to feel fairly confident that the population means are more than 10 units apart?

1. Hypotheses:

$$H_0: \mu_2 - \mu_1 = 10$$
$$H_1: \mu_2 - \mu_1 > 10$$

where $\mu_2$ is the average viscosity for the new process.

2. Test statistic:

$$z_0 = \frac{(\bar{x}_2 - \bar{x}_1) - (\mu_2 - \mu_1)}{\sqrt{\dfrac{\sigma_2^2}{n_2} + \dfrac{\sigma_1^2}{n_1}}} = \frac{(14.8) - (10)}{\sqrt{\dfrac{20^2}{8} + \dfrac{20^2}{15}}} = 0.55$$

3. P-value: $P(Z > 0.55) = 0.2912$ {1-sided test}

4. Decision: The p-value of 0.2912 is not less than $\alpha = 0.10$. We fail to reject $H_0$. There is NOT statistically significant evidence that the mean of the new process is more than 10 units above the mean of the old process.

5. Assumptions: We were given info that viscosity followed a normal distribution.

If the true difference in means is this large (14.8), increasing n will make this apparent and we would eventually reject. If the true difference is actually less than 10, this would also become apparent as we collect more data (we'll home in on the truth with larger n).
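The viscosity Z-test can be retraced with the standard library's `statistics.NormalDist`. This is a sketch under the values stated in the example ($\sigma = 20$, $n_1 = 15$, $n_2 = 8$, observed difference 14.8); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

sigma = 20.0        # known standard deviation for both processes
n1, n2 = 15, 8      # old process, new process
diff = 14.8         # observed x2bar - x1bar, as reported in the example
delta0 = 10.0       # hypothesized difference under H0
alpha = 0.10

se = sqrt(sigma**2 / n2 + sigma**2 / n1)   # standard error of the difference
z0 = (diff - delta0) / se
p_value = 1 - NormalDist().cdf(z0)         # upper-tail area for H1: mu2 - mu1 > 10

print(f"z0 = {z0:.2f}, p-value = {p_value:.4f}, reject H0: {p_value < alpha}")
```

The P-value from this sketch differs slightly in the third decimal place from the table-based value, because the slides round $z_0$ to 0.55 before looking up the tail area.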

Example: Differential Gene Expression

A gene that shows differential expression between two groups can be very informative from a biological perspective.

Comparisons:
- Cancer vs. healthy patients
- Fast running mice vs. lazy mice
- Obese individuals vs. healthy weight individuals
- High yield plants vs. low yield plants

In a plant genetics study, the gene called At5g50550 in the Arabidopsis plant showed the following two sample expression distributions for the genetic lines called Columbia and Landsberg.

Perform a hypothesis test for differential expression (i.e. for non-equality of mean expression).

The numerical summaries (expression values have been normalized):

Columbia: $n_1 = 19$, $\bar{x}_1 = 0.60$, $s_1^2 =$ [value missing]
Landsberg: $n_2 = 11$, $\bar{x}_2 = 0.16$, $s_2^2 =$ [value missing]

We do not know the population variances, so we will use a t-statistic.

We will NOT assume a common variance, so we will use Welch's approximate t and get the degrees of freedom $\nu$ from...

$$\nu = \frac{\left(\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}\right)^2}{\dfrac{\left(S_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(S_2^2/n_2\right)^2}{n_2 - 1}} = 14.78$$

Since more degrees of freedom means more information, and we don't want to imply we have more info than we do, we round down to 14 (to be conservative).

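1. Hypotheses:

$$H_0: \mu_1 = \mu_2 \quad\text{\{equal expression\}}$$
$$H_1: \mu_1 \neq \mu_2$$

where $\mu_1$ is the average gene expression for Columbia.

2. Test statistic:

$$t_0 = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{(0.60 - 0.16) - (0)}{\sqrt{\dfrac{s_1^2}{19} + \dfrac{s_2^2}{11}}} = 11.07$$

and under $H_0$ true, $T_0 \sim t_{14}$.

3. P-value: $2 \cdot P(T_0 > 11.07) \approx 0$. Very small.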

4. Decision: Reject $H_0$. There is strong, statistically significant evidence that these two groups have different mean expression for this gene.

5. Since we used a t-statistic, we should check normality of the two distributions (normal probability plots not shown here). There was one outlier in the Landsberg group, which may be of some concern.

Comparison of two independent groups, also called...

Two-sample t-test (for $H_0: \mu_1 - \mu_2 = 0$)

...but if we know $\sigma$ we actually do a Z-test.

This type of test is performed when the measurements in the first group are independent of the measurements in the second group.

In these comparative experiments, we should have a simple random sample from each population (or group).

Summary of which test statistic to use:

1. If $\sigma_1^2$ and $\sigma_2^2$ are KNOWN, we'll use a Z-statistic. (We should have the original distributions be normal, or $n$ large enough for the $\bar{X}$'s to be normal.)

2. If $\sigma_1^2$ and $\sigma_2^2$ are NOT KNOWN, we'll use a t-statistic:

(a) If it is reasonable to assume both groups have A COMMON $\sigma^2$, we will pool the information from both groups to estimate this common $\sigma^2$ with $S_p^2$, and the degrees of freedom for the t is $(n_1 + n_2 - 2)$.

(b) If the groups DO NOT HAVE A COMMON $\sigma^2$, we should not pool the information for a common estimate of $\sigma^2$. We will instead keep separate estimates for the variances, $S_1^2$ and $S_2^2$, and the degrees of freedom for the t will be $\nu$, where $\nu$ comes from Welch's approximate t degrees of freedom formula.

100(1-α)% Confidence interval for $\mu_1 - \mu_2$

The point estimate for $\mu_1 - \mu_2$ is $\bar{x}_1 - \bar{x}_2$.

We can form a 100(1-α)% confidence interval for the difference in parameters $\mu_1 - \mu_2$ using the same criteria as the previous pages:

1. If $\sigma_1^2$ and $\sigma_2^2$ are KNOWN and two independent samples are taken from two normal distributions:

$$\bar{x}_1 - \bar{x}_2 \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

2. If $\sigma_1^2$ and $\sigma_2^2$ are NOT KNOWN and two independent samples are taken from two normal distributions:

(a) with a common $\sigma^2$

$$\bar{x}_1 - \bar{x}_2 \pm t_{\alpha/2,\,n_1+n_2-2}\; s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

(b) without a common $\sigma^2$

$$\bar{x}_1 - \bar{x}_2 \pm t_{\alpha/2,\,\nu}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

where $\nu$ is from Welch's approximate t degrees of freedom.
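As an illustration of the known-variance interval (case 1), the sketch below applies it to the viscosity example, where $\sigma = 20$ is taken as known for both processes. The function name `z_interval` is hypothetical, and the 90% level is chosen here only to match $\alpha = 0.10$ from that example. The t-based intervals in case 2 have the same shape but need a t quantile, which the Python standard library does not provide.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(diff, sigma1, n1, sigma2, n2, conf=0.95):
    """Confidence interval for a difference in means with known sigmas.

    diff is the observed difference in sample means; the returned pair is
    (lower, upper) for the corresponding difference in population means.
    """
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)          # z_{alpha/2}
    margin = z * sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return diff - margin, diff + margin

# Viscosity example: observed x2bar - x1bar = 14.8, sigma = 20, n1 = 15, n2 = 8.
lo, hi = z_interval(14.8, 20, 15, 20, 8, conf=0.90)
print(f"90% CI for mu2 - mu1: ({lo:.2f}, {hi:.2f})")
```

The interval is two-sided, while the viscosity hypothesis test was one-sided, so the two are related but not interchangeable.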

Comparison of two independent groups... we do a

Two-sample t-test (for $H_0: \mu_1 - \mu_2 = 0$)

This type of test is performed when the measurements in the first group are independent of the measurements in the second group.

The $\mu_1 - \mu_2$ hypothesis examples so far fit this scenario:

- Time exercising for males and females: $n_1 = 15$ and $n_2 = 15$, and the men and women chosen didn't have anything in common; they were independent.

- Viscosity from old process and new process: $n_1 = 15$ and $n_2 = 8$, and the two groups of measurements were taken independently of each other.
- Differential gene expression in Columbia and Landsberg: $n_1 = 19$ and $n_2 = 11$ for two independent groups of plants.

If the two groups are not independent and we have paired data, we will perform a Paired t-test... next section.


More information

Statistics: CI, Tolerance Intervals, Exceedance, and Hypothesis Testing. Confidence intervals on mean. CL = x ± t * CL1- = exp

Statistics: CI, Tolerance Intervals, Exceedance, and Hypothesis Testing. Confidence intervals on mean. CL = x ± t * CL1- = exp Statistics: CI, Tolerance Intervals, Exceedance, and Hypothesis Lecture Notes 1 Confidence intervals on mean Normal Distribution CL = x ± t * 1-α 1- α,n-1 s n Log-Normal Distribution CL = exp 1-α CL1-

More information

Hypothesis Testing: Chi-Square Test 1

Hypothesis Testing: Chi-Square Test 1 Hypothesis Testing: Chi-Square Test 1 November 9, 2017 1 HMS, 2017, v1.0 Chapter References Diez: Chapter 6.3 Navidi, Chapter 6.10 Chapter References 2 Chi-square Distributions Let X 1, X 2,... X n be

More information

Chapter 9 Inferences from Two Samples

Chapter 9 Inferences from Two Samples Chapter 9 Inferences from Two Samples 9-1 Review and Preview 9-2 Two Proportions 9-3 Two Means: Independent Samples 9-4 Two Dependent Samples (Matched Pairs) 9-5 Two Variances or Standard Deviations Review

More information

Goodness of Fit Tests

Goodness of Fit Tests Goodness of Fit Tests Marc H. Mehlman marcmehlman@yahoo.com University of New Haven (University of New Haven) Goodness of Fit Tests 1 / 38 Table of Contents 1 Goodness of Fit Chi Squared Test 2 Tests of

More information

Purposes of Data Analysis. Variables and Samples. Parameters and Statistics. Part 1: Probability Distributions

Purposes of Data Analysis. Variables and Samples. Parameters and Statistics. Part 1: Probability Distributions Part 1: Probability Distributions Purposes of Data Analysis True Distributions or Relationships in the Earths System Probability Distribution Normal Distribution Student-t Distribution Chi Square Distribution

More information

Regression, part II. I. What does it all mean? A) Notice that so far all we ve done is math.

Regression, part II. I. What does it all mean? A) Notice that so far all we ve done is math. Regression, part II I. What does it all mean? A) Notice that so far all we ve done is math. 1) One can calculate the Least Squares Regression Line for anything, regardless of any assumptions. 2) But, if

More information

Wolf River. Lecture 19 - ANOVA. Exploratory analysis. Wolf River - Data. Sta 111. June 11, 2014

Wolf River. Lecture 19 - ANOVA. Exploratory analysis. Wolf River - Data. Sta 111. June 11, 2014 Aldrin in the Wolf River Wolf River Lecture 19 - Sta 111 Colin Rundel June 11, 2014 The Wolf River in Tennessee flows past an abandoned site once used by the pesticide industry for dumping wastes, including

More information

CHAPTER 9: HYPOTHESIS TESTING

CHAPTER 9: HYPOTHESIS TESTING CHAPTER 9: HYPOTHESIS TESTING THE SECOND LAST EXAMPLE CLEARLY ILLUSTRATES THAT THERE IS ONE IMPORTANT ISSUE WE NEED TO EXPLORE: IS THERE (IN OUR TWO SAMPLES) SUFFICIENT STATISTICAL EVIDENCE TO CONCLUDE

More information

T test for two Independent Samples. Raja, BSc.N, DCHN, RN Nursing Instructor Acknowledgement: Ms. Saima Hirani June 07, 2016

T test for two Independent Samples. Raja, BSc.N, DCHN, RN Nursing Instructor Acknowledgement: Ms. Saima Hirani June 07, 2016 T test for two Independent Samples Raja, BSc.N, DCHN, RN Nursing Instructor Acknowledgement: Ms. Saima Hirani June 07, 2016 Q1. The mean serum creatinine level is measured in 36 patients after they received

More information

Sampling Distributions: Central Limit Theorem

Sampling Distributions: Central Limit Theorem Review for Exam 2 Sampling Distributions: Central Limit Theorem Conceptually, we can break up the theorem into three parts: 1. The mean (µ M ) of a population of sample means (M) is equal to the mean (µ)

More information

Hypothesis Testing. We normally talk about two types of hypothesis: the null hypothesis and the research or alternative hypothesis.

Hypothesis Testing. We normally talk about two types of hypothesis: the null hypothesis and the research or alternative hypothesis. Hypothesis Testing Today, we are going to begin talking about the idea of hypothesis testing how we can use statistics to show that our causal models are valid or invalid. We normally talk about two types

More information

SMAM 314 Exam 3 Name. F A. A null hypothesis that is rejected at α =.05 will always be rejected at α =.01.

SMAM 314 Exam 3 Name. F A. A null hypothesis that is rejected at α =.05 will always be rejected at α =.01. SMAM 314 Exam 3 Name 1. Indicate whether the following statements are true (T) or false (F) (6 points) F A. A null hypothesis that is rejected at α =.05 will always be rejected at α =.01. T B. A course

More information

Analysis of variance (ANOVA) Comparing the means of more than two groups

Analysis of variance (ANOVA) Comparing the means of more than two groups Analysis of variance (ANOVA) Comparing the means of more than two groups Example: Cost of mating in male fruit flies Drosophila Treatments: place males with and without unmated (virgin) females Five treatments

More information

Chapter 9. Hypothesis testing. 9.1 Introduction

Chapter 9. Hypothesis testing. 9.1 Introduction Chapter 9 Hypothesis testing 9.1 Introduction Confidence intervals are one of the two most common types of statistical inference. Use them when our goal is to estimate a population parameter. The second

More information

Multiple Testing. Gary W. Oehlert. January 28, School of Statistics University of Minnesota

Multiple Testing. Gary W. Oehlert. January 28, School of Statistics University of Minnesota Multiple Testing Gary W. Oehlert School of Statistics University of Minnesota January 28, 2016 Background Suppose that you had a 20-sided die. Nineteen of the sides are labeled 0 and one of the sides is

More information

We know from STAT.1030 that the relevant test statistic for equality of proportions is:

We know from STAT.1030 that the relevant test statistic for equality of proportions is: 2. Chi 2 -tests for equality of proportions Introduction: Two Samples Consider comparing the sample proportions p 1 and p 2 in independent random samples of size n 1 and n 2 out of two populations which

More information

COSC 341 Human Computer Interaction. Dr. Bowen Hui University of British Columbia Okanagan

COSC 341 Human Computer Interaction. Dr. Bowen Hui University of British Columbia Okanagan COSC 341 Human Computer Interaction Dr. Bowen Hui University of British Columbia Okanagan 1 Last Topic Distribution of means When it is needed How to build one (from scratch) Determining the characteristics

More information

Warm-up Using the given data Create a scatterplot Find the regression line

Warm-up Using the given data Create a scatterplot Find the regression line Time at the lunch table Caloric intake 21.4 472 30.8 498 37.7 335 32.8 423 39.5 437 22.8 508 34.1 431 33.9 479 43.8 454 42.4 450 43.1 410 29.2 504 31.3 437 28.6 489 32.9 436 30.6 480 35.1 439 33.0 444

More information

Inference for Distributions Inference for the Mean of a Population

Inference for Distributions Inference for the Mean of a Population Inference for Distributions Inference for the Mean of a Population PBS Chapter 7.1 009 W.H Freeman and Company Objectives (PBS Chapter 7.1) Inference for the mean of a population The t distributions The

More information

Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institution of Technology, Kharagpur

Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institution of Technology, Kharagpur Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institution of Technology, Kharagpur Lecture No. # 36 Sampling Distribution and Parameter Estimation

More information

Inference and Regression

Inference and Regression Name Inference and Regression Final Examination, 2016 Department of IOMS This course and this examination are governed by the Stern Honor Code. Instructions Please write your name at the top of this page.

More information

[ z = 1.48 ; accept H 0 ]

[ z = 1.48 ; accept H 0 ] CH 13 TESTING OF HYPOTHESIS EXAMPLES Example 13.1 Indicate the type of errors committed in the following cases: (i) H 0 : µ = 500; H 1 : µ 500. H 0 is rejected while H 0 is true (ii) H 0 : µ = 500; H 1

More information

Hypothesis Testing in Action: t-tests

Hypothesis Testing in Action: t-tests Hypothesis Testing in Action: t-tests Mark Muldoon School of Mathematics, University of Manchester Mark Muldoon, January 30, 2007 t-testing - p. 1/31 Overview large Computing t for two : reprise Today

More information

7 Estimation. 7.1 Population and Sample (P.91-92)

7 Estimation. 7.1 Population and Sample (P.91-92) 7 Estimation MATH1015 Biostatistics Week 7 7.1 Population and Sample (P.91-92) Suppose that we wish to study a particular health problem in Australia, for example, the average serum cholesterol level for

More information

Comparison of Two Population Means

Comparison of Two Population Means Comparison of Two Population Means Esra Akdeniz March 15, 2015 Independent versus Dependent (paired) Samples We have independent samples if we perform an experiment in two unrelated populations. We have

More information

Standard normal distribution. t-distribution, (df=5) t-distribution, (df=2) PDF created with pdffactory Pro trial version

Standard normal distribution. t-distribution, (df=5) t-distribution, (df=2) PDF created with pdffactory Pro trial version t-ditribution In biological reearch the population variance i uually unknown and an unbiaed etimate,, obtained from the ample data, ha to be ued in place of σ. The propertie of t- ditribution are: -It

More information

The Components of a Statistical Hypothesis Testing Problem

The Components of a Statistical Hypothesis Testing Problem Statistical Inference: Recall from chapter 5 that statistical inference is the use of a subset of a population (the sample) to draw conclusions about the entire population. In chapter 5 we studied one

More information

Descriptive Statistics

Descriptive Statistics Descriptive Statistics Once an experiment is carried out and the results are measured, the researcher has to decide whether the results of the treatments are different. This would be easy if the results

More information

P-values and statistical tests 3. t-test

P-values and statistical tests 3. t-test P-values and statistical tests 3. t-test Marek Gierliński Division of Computational Biology Hand-outs available at http://is.gd/statlec Statistical test Null hypothesis H 0 : no effect Significance level

More information

Rejection regions for the bivariate case

Rejection regions for the bivariate case Rejection regions for the bivariate case The rejection region for the T 2 test (and similarly for Z 2 when Σ is known) is the region outside of an ellipse, for which there is a (1-α)% chance that the test

More information

W&M CSCI 628: Design of Experiments Homework 1

W&M CSCI 628: Design of Experiments Homework 1 W&M CSCI 68: Design of Experiments Homework 1 Megan Rose Bryant September, 014 1. Suppose that you want to investigate the factors that potentially affect cooking rice. a.)what would you use as a response

More information

Department of Mathematics & Statistics STAT 2593 Final Examination 17 April, 2000

Department of Mathematics & Statistics STAT 2593 Final Examination 17 April, 2000 Department of Mathematics & Statistics STAT 2593 Final Examination 17 April, 2000 TIME: 3 hours. Total marks: 80. (Marks are indicated in margin.) Remember that estimate means to give an interval estimate.

More information

The t-statistic. Student s t Test

The t-statistic. Student s t Test The t-statistic 1 Student s t Test When the population standard deviation is not known, you cannot use a z score hypothesis test Use Student s t test instead Student s t, or t test is, conceptually, very

More information

Contingency Tables. Safety equipment in use Fatal Non-fatal Total. None 1, , ,128 Seat belt , ,878

Contingency Tables. Safety equipment in use Fatal Non-fatal Total. None 1, , ,128 Seat belt , ,878 Contingency Tables I. Definition & Examples. A) Contingency tables are tables where we are looking at two (or more - but we won t cover three or more way tables, it s way too complicated) factors, each

More information

Chapter 8. Inferences Based on a Two Samples Confidence Intervals and Tests of Hypothesis

Chapter 8. Inferences Based on a Two Samples Confidence Intervals and Tests of Hypothesis Chapter 8 Inferences Based on a Two Samples Confidence Intervals and Tests of Hypothesis Copyright 2018, 2014, and 2011 Pearson Education, Inc. Slide - 1 Content 1. Identifying the Target Parameter 2.

More information

Visual interpretation with normal approximation

Visual interpretation with normal approximation Visual interpretation with normal approximation H 0 is true: H 1 is true: p =0.06 25 33 Reject H 0 α =0.05 (Type I error rate) Fail to reject H 0 β =0.6468 (Type II error rate) 30 Accept H 1 Visual interpretation

More information

Hypothesis Tests and Estimation for Population Variances. Copyright 2014 Pearson Education, Inc.

Hypothesis Tests and Estimation for Population Variances. Copyright 2014 Pearson Education, Inc. Hypothesis Tests and Estimation for Population Variances 11-1 Learning Outcomes Outcome 1. Formulate and carry out hypothesis tests for a single population variance. Outcome 2. Develop and interpret confidence

More information

Tests about a population mean

Tests about a population mean October 2 nd, 2017 Overview Week 1 Week 2 Week 4 Week 7 Week 10 Week 12 Chapter 1: Descriptive statistics Chapter 6: Statistics and Sampling Distributions Chapter 7: Point Estimation Chapter 8: Confidence

More information

Two sample Hypothesis tests in R.

Two sample Hypothesis tests in R. Example. (Dependent samples) Two sample Hypothesis tests in R. A Calculus professor gives their students a 10 question algebra pretest on the first day of class, and a similar test towards the end of the

More information

Chapter 27 Summary Inferences for Regression

Chapter 27 Summary Inferences for Regression Chapter 7 Summary Inferences for Regression What have we learned? We have now applied inference to regression models. Like in all inference situations, there are conditions that we must check. We can test

More information

Module 03 Lecture 14 Inferential Statistics ANOVA and TOI

Module 03 Lecture 14 Inferential Statistics ANOVA and TOI Introduction of Data Analytics Prof. Nandan Sudarsanam and Prof. B Ravindran Department of Management Studies and Department of Computer Science and Engineering Indian Institute of Technology, Madras Module

More information

Quantitative Analysis and Empirical Methods

Quantitative Analysis and Empirical Methods Hypothesis testing Sciences Po, Paris, CEE / LIEPP Introduction Hypotheses Procedure of hypothesis testing Two-tailed and one-tailed tests Statistical tests with categorical variables A hypothesis A testable

More information