Chapter 10: Inferences based on two samples

Similar documents
Mathematical statistics

CBA4 is live in practice mode this week exam mode from Saturday!

Review. December 4 th, Review

AMS7: WEEK 7. CLASS 1. More on Hypothesis Testing Monday May 11th, 2015

Class 24. Daniel B. Rowe, Ph.D. Department of Mathematics, Statistics, and Computer Science. Marquette University MATH 1700

Review: General Approach to Hypothesis Testing. 1. Define the research question and formulate the appropriate null and alternative hypotheses.

Test 3 Practice Test A. NOTE: Ignore Q10 (not covered)

T.I.H.E. IT 233 Statistics and Probability: Sem. 1: 2013 ESTIMATION AND HYPOTHESIS TESTING OF TWO POPULATIONS

χ L = χ R =

Lecture 15: Inference Based on Two Samples

INTERVAL ESTIMATION AND HYPOTHESES TESTING

As an example, consider the Bond Strength data in Table 2.1, atop page 26 of y1 y 1j/ n , S 1 (y1j y 1) 0.

PSY 307 Statistics for the Behavioral Sciences. Chapter 20 Tests for Ranked Data, Choosing Statistical Tests

SMAM 314 Exam 3 Name. F A. A null hypothesis that is rejected at α =.05 will always be rejected at α =.01.

Chapter 3. Comparing two populations

Randomized Complete Block Designs

Tables Table A Table B Table C Table D Table E 675

Confidence Intervals, Testing and ANOVA Summary

1 Statistical inference for a population mean

ME3620. Theory of Engineering Experimentation. Spring Chapter IV. Decision Making for a Single Sample. Chapter IV

Statistical Inference Theory Lesson 46 Non-parametric Statistics

EC2001 Econometrics 1 Dr. Jose Olmo Room D309

Multiple Regression Analysis: The Problem of Inference

Statistics for Business and Economics

Institute of Actuaries of India

Applied Statistics I

Dr. Maddah ENMG 617 EM Statistics 10/12/12. Nonparametric Statistics (Chapter 16, Hines)

Comparison of Two Population Means

Visual interpretation with normal approximation

Inference for the Regression Coefficient

Summary of Chapter 7 (Sections ) and Chapter 8 (Section 8.1)

For tests of hypotheses made about population, we consider. only tests having a hypothesis of. Note that under the assumption of

Comparing two independent samples

Tests about a population mean

Non-parametric Hypothesis Testing

Topic 22 Analysis of Variance

Statistical Inference

Lecture 9 Two-Sample Test. Fall 2013 Prof. Yao Xie, H. Milton Stewart School of Industrial Systems & Engineering Georgia Tech

Introductory Econometrics

GROUPED DATA E.G. FOR SAMPLE OF RAW DATA (E.G. 4, 12, 7, 5, MEAN G x / n STANDARD DEVIATION MEDIAN AND QUARTILES STANDARD DEVIATION

The t-distribution. Patrick Breheny. October 13. z tests The χ 2 -distribution The t-distribution Summary

Section 9.4. Notation. Requirements. Definition. Inferences About Two Means (Matched Pairs) Examples

Statistics and Sampling distributions

Summary of Chapters 7-9

The Chi-Square Distributions

Statistical inference (estimation, hypothesis tests, confidence intervals) Oct 2018

7.2 One-Sample Correlation ( = a) Introduction. Correlation analysis measures the strength and direction of association between

Unit 14: Nonparametric Statistical Methods

i=1 X i/n i=1 (X i X) 2 /(n 1). Find the constant c so that the statistic c(x X n+1 )/S has a t-distribution. If n = 8, determine k such that

CH.9 Tests of Hypotheses for a Single Sample

Solutions to Practice Test 2 Math 4753 Summer 2005

Hypothesis Testing One Sample Tests

Solution: First note that the power function of the test is given as follows,

BIO5312 Biostatistics Lecture 6: Statistical hypothesis testings

Confidence intervals

Using Tables and Graphing Calculators in Math 11

Design of Engineering Experiments Part 2 Basic Statistical Concepts Simple comparative experiments

Review of Statistics 101

TUTORIAL 8 SOLUTIONS #

Sampling distribution of t. 2. Sampling distribution of t. 3. Example: Gas mileage investigation. II. Inferential Statistics (8) t =

Statistical Analysis for QBIC Genetics Adapted by Ellen G. Dow 2017

Comparing Means from Two-Sample

Econ 325: Introduction to Empirical Economics

Quantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing

Hypothesis Tests and Estimation for Population Variances. Copyright 2014 Pearson Education, Inc.

Inference for Regression

Multivariate Statistical Analysis

The t-test Pivots Summary. Pivots and t-tests. Patrick Breheny. October 15. Patrick Breheny Biostatistical Methods I (BIOS 5710) 1/18

A discussion on multiple regression models

Lecture 2: Basic Concepts and Simple Comparative Experiments Montgomery: Chapter 2

An interval estimator of a parameter θ is of the form θl < θ < θu at a

Ch 2: Simple Linear Regression

Small-Sample CI s for Normal Pop. Variance

Statistical Inference

Purposes of Data Analysis. Variables and Samples. Parameters and Statistics. Part 1: Probability Distributions

Inference in Regression Analysis

STA 101 Final Review

1 Descriptive statistics. 2 Scores and probability distributions. 3 Hypothesis testing and one-sample t-test. 4 More on t-tests

Exam details. Final Review Session. Things to Review

UNIVERSITY OF TORONTO Faculty of Arts and Science

Inference About Two Means: Independent Samples

Comparing Two Variances. CI For Variance Ratio

Inferences for Proportions and Count Data

Math 423/533: The Main Theoretical Topics

One sample problem. sample mean: ȳ = . sample variance: s 2 = sample standard deviation: s = s 2. y i n. i=1. i=1 (y i ȳ) 2 n 1

Epidemiology Principles of Biostatistics Chapter 10 - Inferences about two populations. John Koval

Lectures 5 & 6: Hypothesis Testing

Measuring the fit of the model - SSR

POLI 443 Applied Political Research

ANOVA Situation The F Statistic Multiple Comparisons. 1-Way ANOVA MATH 143. Department of Mathematics and Statistics Calvin College

" M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2

17: INFERENCE FOR MULTIPLE REGRESSION. Inference for Individual Regression Coefficients

Statistical Inference for Means

Paired comparisons. We assume that

Chapter 7: Statistical Inference (Two Samples)

y ˆ i = ˆ " T u i ( i th fitted value or i th fit)

Inferences about central values (.)

Classroom Activity 7 Math 113 Name : 10 pts Intro to Applied Stats

7 Estimation. 7.1 Population and Sample (P.91-92)

Confidence intervals CE 311S

Transcription:

November 16 th, 2017

Overview Week 1 Week 2 Week 4 Week 7 Week 10 Week 12 Chapter 1: Descriptive statistics Chapter 6: Statistics and Sampling Distributions Chapter 7: Point Estimation Chapter 8: Confidence Intervals Chapter 9: Tests of Hypotheses Chapter 10: Two-sample inference

Overview 10.1 Difference between two population means z-test confidence intervals 10.2 The two-sample t test and confidence interval 10.3 Analysis of paired data

Two-sample inference: example Example Let µ 1 and µ 2 denote true average decrease in cholesterol for two drugs. From two independent samples X 1, X 2,..., X m and Y 1, Y 2,..., Y n, we want to test: H 0 : µ 1 = µ 2 H a : µ 1 µ 2

Settings This week: independent samples Assumption 1 X 1, X 2,..., X m is a random sample from a population with mean µ 1 and variance σ 2 1. 2 Y 1, Y 2,..., Y n is a random sample from a population with mean µ 2 and variance σ 2 2. 3 The X and Y samples are independent of each other. Next week: paired-sample test

Review Chapter 6 and Chapter 7 Problem Assume that X 1, X 2,..., X m is a random sample from a population with mean µ 1 and variance σ 2 1. Y 1, Y 2,..., Y n is a random sample from a population with mean µ 2 and variance σ 2 2. The X and Y samples are independent of each other. Compute (in terms of µ 1, µ 2, σ 1, σ 2, m, n) (a) E[ X Ȳ ] (b) Var[ X Ȳ ] and σ X Ȳ

Properties of X Ȳ Proposition

Normal distributions with known variances

Confidence intervals

Testing the difference between two population means Setting: independent normal random samples X 1, X 2,..., X m and Y 1, Y 2,..., Y n with known values of σ 1 and σ 2. Constant 0. Null hypothesis: Alternative hypothesis: (a) H a : µ 1 µ 2 > 0 (b) H a : µ 1 µ 2 < 0 (c) H a : µ 1 µ 2 0 H 0 : µ 1 µ 2 = 0 When = 0, the test (c) becomes H 0 : µ 1 = µ 2 H a : µ 1 µ 2

Testing the difference between two population means Proposition

Large-sample tests/confidence intervals (with unknown σ)

Principles Central Limit Theorem: X and Ȳ are approximately normal when n > 30 so is X Ȳ. Thus ( X Ȳ ) (µ 1 µ 2 ) σ1 2 m + σ2 2 n is approximately standard normal When n is sufficiently large S 1 σ 1 and S 2 σ 2 Conclusion: ( X Ȳ ) (µ 1 µ 2 ) S1 2 m + S2 2 n is approximately standard normal when n is sufficiently large If m, n > 40, we can ignore the normal assumption and replace σ by S

Large-sample tests Proposition

Large-sample CIs Proposition Provided that m and n are both large, a CI for µ 1 µ 2 with a confidence level of approximately 100(1 α)% is x ȳ ± z α/2 s 2 1 m + s2 2 n where gives the lower limit and + the upper limit of the interval. An upper or lower confidence bound can also be calculated by retaining the appropriate sign and replacing z α/2 by z α.

Example Example Let µ 1 and µ 2 denote true average tread lives for two competing brands of size P205/65R15 radial tires. (a) Test H 0 : µ 1 = µ 2 H a : µ 1 µ 2 at level 0.05 using the following data: m = 45, x = 42, 500, s 1 = 2200, n = 45, ȳ = 40, 400, and s 2 = 1900. (b) Construct a 95% CI for µ 1 µ 2.

The two-sample t test and confidence interval

Overview Section 8.1 Normal distribution σ is known Section 8.2 Normal distribution Using Central Limit Theorem needs n > 30 σ is known needs n > 40 Section 8.3 Normal distribution σ is known n is small Introducing t-distribution

Principles For one-sample inferences: For two-sample inferences: X µ s/ n t n 1 ( X Ȳ ) (µ 1 µ 2 ) S1 2 m + S2 2 n t ν where ν is some appropriate degree of freedom (which depends on m and n).

Chi-squared distribution Proposition If Z has standard normal distribution Z(0, 1) and X = Z 2, then X has Chi-squared distribution with 1 degree of freedom, i.e. X χ 2 1 distribution. If Z 1, Z 2,..., Z n are independent and each has the standard normal distribution, then Z 2 1 + Z 2 2 +... + Z 2 n χ 2 n

t distributions Definition Let Z be a standard normal rv and let W be a χ 2 ν rv independent of Z. Then the t distribution with degrees of freedom ν is defined to be the distribution of the ratio T = Z W /ν

2 plus 2 is 4 minus 1 that s 3 Definition of t distributions: Z W /ν t ν Our statistic: ( X Ȳ ) (µ 1 µ 2 ) S 2 1 m + S2 2 n = [ ( X Ȳ ) (µ 1 µ 2 ) ] σ1 / 2 ( ) ( ) S 2 1 m + S2 2 σ 2 n / 1 m + σ2 2 n m + σ2 2 n What we need: ( S 2 1 m + S 2 2 ) ( ) σ 2 / 1 n m + σ2 2 = W n ν

Quick maths What we need: ( S 2 1 m + S 2 2 ) ( ) σ 2 = 1 n m + σ2 2 W n ν What we have E[W ] = ν, V [W ] = 2ν E[S1 2] = σ2 1, V [S 1 2] = 2σ4 1 /(m 1) E[S2 2] = σ2 2, V [S 2 2] = 2σ4 2 /(n 1) Variance of the LHS [ S 2 V 1 m + S 2 2 ] 2σ1 4 = n (m 1)m 2 + 2σ2 4 (n 1)n 2 Variance of the RHS V [( σ 2 1 m + σ2 2 n ) W ν ] ( ) σ 2 2 = 1 m + σ2 2 2ν n ν 2

2-sample t test: degree of freedom

CIs for difference of the two population means

2-sample t procedures

Example Example A paper reported the following data on tensile strength (psi) of liner specimens both when a certain fusion process was used and when this process was not used: The authors of the article stated that the fusion process increased the average tensile strength. Carry out a test of hypotheses to see whether the data supports this conclusion (and provide the P-value of the test)

Solution

Solution