Review of Regression Basics


When describing a Bivariate Relationship:
Make a Scatterplot
  Strength, Direction, Form
Model: y-hat = a + bx
  Interpret slope in context
  Make Predictions
  Residual = Observed - Predicted
Assess the Model
  Interpret r
  Residual Plot

[Figure: The Endangered Manatee, scatterplots of Killed against Registrations (thousand). Killed = (.12 thousand^-1)registrations - 41.4; r^2 = .8]
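The model and residual steps above can be sketched in a few lines of Python. This is only a minimal illustration: the data values are hypothetical (they are not the manatee data), and the helper name `fit_lsrl` is an assumption introduced here, not something from the slides.

```python
# Hypothetical illustration data (NOT the manatee data from the slides).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]


def fit_lsrl(x, y):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx            # slope
    a = y_bar - b * x_bar    # intercept: the line passes through (x-bar, y-bar)
    return a, b


a, b = fit_lsrl(x, y)

# Make predictions, then Residual = Observed - Predicted.
predicted = [a + b * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]

print(f"y-hat = {a:.2f} + {b:.2f}x")
print("residuals:", [round(e, 2) for e in residuals])
```

Plotting `residuals` against `x` gives the residual plot used to assess the model; for a least-squares line with an intercept, the residuals always sum to zero.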

Reading Minitab Output

Regression Analysis: Fat gain versus NEA
The regression equation is FatGain = ****** + ******(NEA)
Predictor  Coef  SE Coef  T  P
Constant 3.1.33.4.
NEA -.344.74141-4.4.
S=.7383  R-Sq =.%  R-Sq(adj)=7.8%

Regression equations aren't always as easy to spot as they are on your TI-84. Can you find the slope and intercept above?

Outliers/Influential Points

Does the age of a child's first word predict his/her mental ability? Consider the following data on (age of first word, Gesell Adaptive Score) for 21 children.

[Table: Age at First Word and Gesell Score. Columns: Child, Age, Score]

[Figure: Age at First Word and Gesell Score, scatterplot of Score against Age with one point marked "Influential?". Score = (-1.13 ^-1)Age + 1; r^2 = .41]

[Figure: Age at First Word and Gesell Score, second scatterplot of Score against Age. Score = (-.77 ^-1)Age + ; r^2 = .]

Does the highlighted point markedly affect the equation of the LSRL? If so, it is influential. Test by removing the point and finding the new LSRL.
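The "remove the point and refit" test can be sketched as follows. The data are hypothetical (they are not the Gesell data), chosen so that one point lies far from the others in x; `fit_lsrl` is a helper name assumed for this illustration.

```python
def fit_lsrl(x, y):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx
    return y_bar - b * x_bar, b


# Hypothetical data; the last point sits far from the rest in the x direction.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 20.0]
y = [5.0, 4.0, 6.0, 5.0, 6.0, 0.0]

a_all, b_all = fit_lsrl(x, y)                    # LSRL with every point
a_without, b_without = fit_lsrl(x[:-1], y[:-1])  # LSRL with the suspect point removed

print(f"all points:    y-hat = {a_all:.2f} + ({b_all:.3f})x")
print(f"point removed: y-hat = {a_without:.2f} + ({b_without:.3f})x")
```

In this made-up example the slope changes sign when the point is dropped, so the point would be called influential. A point whose removal barely moves the line is not influential, even if it is an outlier in y.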

Explanatory vs. Response

The distinction between explanatory and response variables is essential in regression. Switching the two roles results in a different least-squares regression line.

[Figure: Hubble 12 data, v plotted against r. v = 44r - 38; r^2 = .3]

[Figure: Hubble 12 data, r plotted against v. r = .13v + .3; r^2 = .3]

Note: The correlation value, r, does NOT depend on the distinction between explanatory and response.
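A small numeric check of both claims, using hypothetical data (not the Hubble data) and helper names assumed for this sketch: regressing y on x and x on y gives two different lines, while the correlation is the same either way.

```python
from math import sqrt


def fit_lsrl(x, y):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx
    return y_bar - b * x_bar, b


def correlation(x, y):
    """Return the correlation coefficient r of the paired data."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical paired data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

a_yx, b_yx = fit_lsrl(x, y)   # x explanatory, y response
a_xy, b_xy = fit_lsrl(y, x)   # roles switched: y explanatory, x response

r_xy = correlation(x, y)
r_yx = correlation(y, x)

print(f"y-hat = {a_yx:.2f} + {b_yx:.2f}x")
print(f"x-hat = {a_xy:.2f} + {b_xy:.2f}y")
print(f"r = {r_xy:.4f} either way: {abs(r_xy - r_yx) < 1e-12}")
```

Solving the first line for x would give a slope of 1/0.6, not the 1.00 that the second fit produces, so the two lines really are different. The product of the two slopes equals r^2.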

Correlation

The correlation, r, describes the strength of the straight-line relationship between x and y.

Ex: There is a strong, positive, LINEAR relationship between # of beers and BAC.

[Figure: Beer and Blood Alcohol, scatterplot of BAC against Beers. BAC = .18Beers - .13; r^2 = .8]

[Figure: Collection 1, scatterplot of y against x. r^2 = .14]

There is a weak, positive, linear relationship between x and y. However, there is a strong nonlinear relationship. r measures the strength of linearity...
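To see that r only measures linear strength, the sketch below compares two hypothetical data sets (neither is from the slides): one exactly linear, one exactly quadratic. The helper name `correlation` is an assumption for this illustration.

```python
from math import sqrt


def correlation(x, y):
    """Return the correlation coefficient r of the paired data."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


x = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]

y_line = [2 * xi + 1 for xi in x]   # perfectly linear in x
y_curve = [xi ** 2 for xi in x]     # perfectly determined by x, but not linear

r_line = correlation(x, y_line)
r_curve = correlation(x, y_curve)

print(f"linear data:    r = {r_line:.3f}")
print(f"quadratic data: r = {r_curve:.3f}")
```

The quadratic data follow an exact rule, yet the correlation comes out at zero because the points do not fall along a straight line. A small r therefore says nothing about whether a curved relationship exists, which is why the scatterplot comes first.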

Coefficient of Determination

The coefficient of determination, r^2, describes the percent of variability in y that is explained by the linear regression on x.

[Figure: Wine Consumption and Heart Disease, scatterplot of DeathRate against Alcohol (L/yr). DeathRate = (-23. yr/l)alcohol; r^2 = .71]

71% of the variability in death rates due to heart disease can be explained by the LSRL on alcohol consumption. That is, alcohol consumption provides us with a fairly good prediction of death rate due to heart disease, but other factors contribute to this rate, so our prediction will be off somewhat.
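One way to make "percent of variability explained" concrete is to write r^2 in terms of the residuals. The labels SSE and SST are notation assumed here, not taken from the slides; the identity holds for a least-squares line fitted with an intercept.

```latex
\[
r^2 \;=\; 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}
    \;=\; 1 - \frac{\mathrm{SSE}}{\mathrm{SST}},
\qquad \hat{y}_i = a + b x_i .
\]
```

Read this way, r^2 = .71 says that the squared residuals left over from the LSRL total 29% of the squared deviations of the death rates from their mean; the other 71% is accounted for by the line.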

Cautions

Correlation and regression are NOT RESISTANT to outliers and influential points!
Correlations based on averaged data tend to be higher than correlations based on all raw data.
Extrapolating beyond the observed data can result in predictions that are unreliable.
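The caution about averaged data can be illustrated with a small made-up data set: three y values are recorded at each x, and the correlation is computed once on the raw points and once on the group means. All values and the helper name `correlation` are assumptions for this sketch.

```python
from math import sqrt


def correlation(x, y):
    """Return the correlation coefficient r of the paired data."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical raw data: three observations of y at each value of x.
raw_x = [1, 1, 1, 2, 2, 2, 3, 3, 3]
raw_y = [0, 2, 4, 1, 3, 5, 2, 4, 6]

# Average the y values within each x group.
groups = {}
for xi, yi in zip(raw_x, raw_y):
    groups.setdefault(xi, []).append(yi)
mean_x = sorted(groups)
mean_y = [sum(groups[g]) / len(groups[g]) for g in mean_x]

r_raw = correlation(raw_x, raw_y)
r_means = correlation(mean_x, mean_y)

print(f"raw data:    r = {r_raw:.3f}")
print(f"group means: r = {r_means:.3f}")
```

Averaging removes the scatter of individuals around each group mean, so the means line up much more tightly than the raw points do. A correlation reported for averages should not be read as describing individuals.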

Correlation vs. Causation

Consider the following historical data:

[Table: Collection 1. Columns: Year, Ministers, Rum]

[Figure: Collection 1, scatterplot of y against x with fitted line]

There is an almost perfect linear relationship between x and y. (r=.7)
x = # of Methodist Ministers in New England
y = # of Barrels of Rum Imported to Boston

CORRELATION DOES NOT IMPLY CAUSATION!

Summary

[Figure: The Endangered Manatee, scatterplots of Killed against Registrations (thousand). r^2 = .8]