Two-Sample Inference for Proportions and Inference for Linear Regression


Kwonsang Lee, University of Pennsylvania, kwonlee@wharton.upenn.edu
STAT111, April 24, 2015

Announcement: Review Session
There is no mandatory recitation next Friday, but there is an optional review session next Friday at the same time and location. The professor will go over the material next Tuesday in class, so I'll focus on problem solving. I'll also hold office hours next Tuesday 1-2pm and Wednesday 3-4pm.

Announcement: Returning HW
Homework 6 and 7 are not yet graded. I'll return them during next Friday's review session. For those who cannot come to the review session, I'll set up a return box at the entrance of the Statistics Department on the 4th floor of JMHH. I'll let you know via email when the box is ready.

Hypothesis Test for Two Proportions
Assume that we have two independent samples and test whether the two proportions are the same or not.
$$H_0: p_1 = p_2 \ (\text{or } p_1 - p_2 = 0), \quad H_a: p_1 \neq p_2$$
The test statistic $Z_0$ is given by
$$Z_0 = \frac{\hat{p}_1 - \hat{p}_2}{SE(\hat{p}_1 - \hat{p}_2)} = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}_p (1 - \hat{p}_p) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$$
where $\hat{p}_1 = \frac{Y_1}{n_1}$, $\hat{p}_2 = \frac{Y_2}{n_2}$ and $\hat{p}_p = \frac{Y_1 + Y_2}{n_1 + n_2}$. Then, we compute the P-value.

Confidence Interval for Difference
This confidence interval is for the difference between Population 1 and Population 2, i.e. a CI for $p_1 - p_2$. Here, the point estimate of $p_1 - p_2$ is $\hat{p}_1 - \hat{p}_2$ and a confidence interval is
$$\hat{p}_1 - \hat{p}_2 \pm Z^* \sqrt{\frac{\hat{p}_1 (1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2 (1 - \hat{p}_2)}{n_2}}$$
where $Z^*$ is the standard normal critical value (1.96 for a 95% interval).

Example: Smoking
Time magazine reported the result of a telephone poll of 800 adult Americans. The question posed to the Americans who were surveyed was: "Should the federal tax on cigarettes be raised to pay for health care reform?" The results of the survey were:

              Non-Smokers                               Smokers
Sample size   $n_1 = 605$                               $n_2 = 195$
Said yes      $y_1 = 351$                               $y_2 = 41$
Proportion    $\hat{p}_1 = \frac{351}{605} = 0.58$      $\hat{p}_2 = \frac{41}{195} = 0.21$

Confidence Interval
1. What is the 95% confidence interval for $p_1$?
$$\hat{p}_1 \pm 1.96 \sqrt{\frac{\hat{p}_1 (1 - \hat{p}_1)}{n_1}} = (0.54, 0.62)$$
2. What is the 95% confidence interval for $p_2$?
$$\hat{p}_2 \pm 1.96 \sqrt{\frac{\hat{p}_2 (1 - \hat{p}_2)}{n_2}} = (0.15, 0.27)$$
3. What is the 95% confidence interval for $p_1 - p_2$?
$$(\hat{p}_1 - \hat{p}_2) \pm 1.96 \sqrt{\frac{\hat{p}_1 (1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2 (1 - \hat{p}_2)}{n_2}} = (0.30, 0.44)$$
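The three intervals can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names `proportion_ci` and `difference_ci` are illustrative and not part of the course material.

```python
from math import sqrt

def proportion_ci(y, n, z=1.96):
    """Large-sample CI for one proportion: p_hat +/- z * sqrt(p_hat(1 - p_hat)/n)."""
    p_hat = y / n
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def difference_ci(y1, n1, y2, n2, z=1.96):
    """Large-sample CI for p1 - p2 using the unpooled standard error."""
    p1, p2 = y1 / n1, y2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se

# Smoking example: 351 of 605 non-smokers and 41 of 195 smokers said yes.
ci_p1 = proportion_ci(351, 605)
ci_p2 = proportion_ci(41, 195)
ci_diff = difference_ci(351, 605, 41, 195)

for label, (low, high) in [("p1", ci_p1), ("p2", ci_p2), ("p1 - p2", ci_diff)]:
    print(f"95% CI for {label}: ({low:.2f}, {high:.2f})")
```

Note that the interval for the difference uses the unpooled standard error, whereas the hypothesis test uses the pooled estimate.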

Hypothesis Test
We want to test whether smokers and non-smokers have significantly different opinions. The null and alternative hypotheses are
$$H_0: p_1 = p_2, \quad H_a: p_1 \neq p_2$$
Here, $\hat{p}_p = \frac{351 + 41}{605 + 195} = 0.49$. The test statistic $Z_0$ is
$$Z_0 = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\hat{p}_p (1 - \hat{p}_p) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} = \frac{0.58 - 0.21}{\sqrt{0.49 (1 - 0.49) \left( \frac{1}{605} + \frac{1}{195} \right)}} = 8.99$$
The P-value is $2 P(Z > 8.99) \approx 0$. Therefore, we reject the null hypothesis.
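As a sketch of the same calculation in Python (standard library only), the pooled two-proportion test might look like the following. The function name `two_proportion_z_test` is an assumption made for illustration; the P-value comes from `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(y1, n1, y2, n2):
    """Two-sided z-test of H0: p1 = p2 using the pooled proportion."""
    p1, p2 = y1 / n1, y2 / n2
    p_pool = (y1 + y2) / (n1 + n2)          # pooled estimate under H0
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z0 = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z0)))  # two-sided P-value
    return z0, p_value

# Smoking example from the poll.
z0, p_value = two_proportion_z_test(351, 605, 41, 195)
print(f"Z0 = {z0:.2f}, P-value = {p_value:.3g}")
```

With unrounded proportions the statistic still rounds to 8.99, and the P-value is zero to machine precision, so the conclusion matches the hand calculation.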

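Linear Regression
Simple Linear Regression Model:
$$Y_i = \alpha + \beta X_i + e_i$$
where $\alpha$ is the intercept and $\beta$ is the slope. We estimate $\alpha$ and $\beta$, not $e_i$.
The best-fit line minimizes the sum of squared residuals (residual $= Y_i - (\alpha + \beta X_i)$):
$$SSR = \sum_{i=1}^{n} \left( Y_i - (\alpha + \beta X_i) \right)^2$$
The best estimators of $\alpha$, $\beta$ are $a$, $b$. Remember this formula!
$$b = r \frac{S_y}{S_x}, \quad a = \bar{Y} - b \bar{X}$$
where $r$ is the sample correlation and $S_x$, $S_y$ are the sample standard deviations of $X$ and $Y$.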
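The formulas for $a$ and $b$ can be sketched in a few lines of Python. The data below are hypothetical values chosen only for illustration, and `least_squares` is an illustrative name rather than anything from the course.

```python
from math import sqrt

def least_squares(xs, ys):
    """Return (a, b) with b = r * S_y / S_x and a = Y_bar - b * X_bar."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    r = sxy / sqrt(sxx * syy)        # sample correlation
    s_x = sqrt(sxx / (n - 1))        # sample standard deviation of X
    s_y = sqrt(syy / (n - 1))        # sample standard deviation of Y
    b = r * s_y / s_x                # slope estimate
    a = y_bar - b * x_bar            # intercept estimate
    return a, b

# Hypothetical data, not from the slides.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
a, b = least_squares(xs, ys)
print(f"a = {a:.2f}, b = {b:.2f}")
```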

Prediction Using Linear Regression
For a new value of $X$, we can predict $Y$ using the least-squares line:
$$Y_{\text{predicted}} = a + b X_{\text{new}}$$
The regression equation is obtained from the data $((X_1, Y_1), \ldots, (X_n, Y_n))$. This equation predicts $Y$ well for a new $X$ between $X_{\min}$ and $X_{\max}$, where $X_{\min} = \min(X_i)$ and $X_{\max} = \max(X_i)$. However, if the new $X$ is outside this interval, the prediction might not work well; that is, extrapolation can be unreliable.
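A minimal sketch of prediction with an extrapolation check follows. The coefficients and observed X values are hypothetical, and returning a flag alongside the prediction is a design choice made here for illustration, not something the slides prescribe.

```python
def predict(a, b, x_new, xs):
    """Return (prediction, is_extrapolation) for the line a + b * x.

    xs holds the X values used to fit the line; a new X outside
    [min(xs), max(xs)] is flagged as extrapolation.
    """
    is_extrapolation = not (min(xs) <= x_new <= max(xs))
    return a + b * x_new, is_extrapolation

# Hypothetical fitted line and observed X values.
xs = [1, 2, 3, 4, 5]
a, b = 0.05, 1.99

for x_new in (3.5, 8):
    y_pred, outside = predict(a, b, x_new, xs)
    note = "extrapolation, treat with caution" if outside else "within observed range"
    print(f"X_new = {x_new}: Y_predicted = {y_pred:.2f} ({note})")
```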

Inference for Linear Regression
We want to see if there is a linear relationship between $Y$ and $X$. This is equivalent to testing whether the (population) slope $\beta$ is zero or not.
$$H_0: \beta = 0, \quad H_a: \beta \neq 0$$
If the null hypothesis is rejected, then we can say that there is a linear relationship. We can also test whether the intercept $\alpha$ is zero or not, but usually we are not interested in this test.

Test for Slope $\beta = 0$
From the sample $(X_i, Y_i)$, we have the estimates $a$ and $b$.
(a) State the appropriate hypotheses:
$$H_0: \beta = 0, \quad H_a: \beta \neq 0$$
(b) Test statistic $T_0$:
$$T_0 = \frac{b - 0}{SE(b)}$$
$SE(b)$ is usually given or can be found in the JMP regression output.
(c) P-value: we can find a range for the P-value from the t-table with $n - 2$ degrees of freedom, or read the P-value from the JMP regression output.
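When $SE(b)$ is not given, it can be computed from the data. The sketch below assumes the standard formula $SE(b) = s / \sqrt{\sum (X_i - \bar{X})^2}$ with $s = \sqrt{SSR / (n - 2)}$, which the slides do not state, and uses hypothetical data. It stops at $T_0$ and the degrees of freedom, leaving the P-value to the t-table or software.

```python
from math import sqrt

def slope_t_statistic(xs, ys):
    """Return (b, se_b, t0, df) for testing H0: beta = 0."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx                    # algebraically equal to r * S_y / S_x
    a = y_bar - b * x_bar
    ssr = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s = sqrt(ssr / (n - 2))          # residual standard deviation (assumed formula)
    se_b = s / sqrt(sxx)             # standard error of the slope (assumed formula)
    t0 = (b - 0) / se_b
    return b, se_b, t0, n - 2

# Hypothetical data, not from the slides.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
b, se_b, t0, df = slope_t_statistic(xs, ys)
print(f"b = {b:.2f}, SE(b) = {se_b:.4f}, T0 = {t0:.2f}, df = {df}")
```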

Confidence Interval for $\beta$
The slope $\beta$ is interpreted as the average change in $Y$ when $X$ increases by one unit. We might be interested in a confidence interval for this average change, i.e. for $\beta$. A confidence interval for $\beta$ is
$$b \pm t^* \, SE(b)$$
where $t^*$ is the critical value with $n - 2$ degrees of freedom.
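As a small sketch, the interval can be computed once $b$, $SE(b)$ and $t^*$ are known. The numbers below are hypothetical: $b = 1.99$ and $SE(b) = 0.0597$ are made-up estimates from a sample of size 5, and $t^* = 3.182$ is the t-table value for a 95% interval with 3 degrees of freedom.

```python
def slope_ci(b, se_b, t_star):
    """Confidence interval for the slope: b +/- t_star * SE(b)."""
    margin = t_star * se_b
    return b - margin, b + margin

# Hypothetical inputs; t_star is read from a t-table with n - 2 = 3 df.
low, high = slope_ci(b=1.99, se_b=0.0597, t_star=3.182)
print(f"95% CI for beta: ({low:.3f}, {high:.3f})")
```

If the interval excludes zero, this agrees with rejecting $H_0: \beta = 0$ at the corresponding two-sided level.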