Introduction to Regression. Myra O Regan Room 142 Lloyd Institute

1 Introduction to Regression Myra O Regan Myra.ORegan@tcd.ie Room 142 Lloyd Institute 1

2 Description of module: a practical module on regression, focussing on the application of multiple regression. Software: lots of computer output; will use R sometimes. 2 labs. Some mathematics but no linear algebra.

3 Topics to be covered Revision of Simple linear regression Introduction to Multiple regression Use of logs and other transformations Regression Diagnostics Use of Indicator Variables Polynomial regression Building a regression model Dealing with multicollinearity Introduction to Logistic regression Other fun techniques 3

4 Notes and Books: I use BlackBoard. Sheather, S. J. A Modern Approach to Regression with R. New York: Springer, 2009. Neter, J., Wasserman, W. & Kutner, M. H. Applied Linear Models, 2nd edition. Boston: Irwin, 1989. Kutner, M. H., Nachtsheim, C. J., Neter, J. & Li, W. Applied Linear Statistical Models, 5th edition. Boston: McGraw-Hill.

5 Purpose of regression To build a model for prediction purposes Price of diamond from number of carats Price of a house Time to process invoices Measuring the volume of wood in trees To look at relationships Factors relating to cot death 5

6 Netflix competition Variables were user, movie, date of grade, grade Grade was measured from 1 to 5 100,480,507 ratings 480,189 users 17,770 movies Movie, title and year of release 6

9 308 diamonds: price, colour, clarity and size

12 Initial examination of data Know the story behind the data Understand the background Understand meanings of variables Look at each variable separately Check the quality of data Summary statistics and graphs How much missing data? 12

13 Revision of simple linear regression Manager of a purchasing department of a large company would like to predict average amount of time it takes to process a given number of invoices. Data was collected over a sample of 30 days on the number of invoices and time taken in hours Three variables Time, Number of Invoices and Day 13

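14 Descriptive statistics for Invoices and Time. Columns: N, N*, Mean, SE Mean, StDev, Minimum, Q1, Median, Q3, Maximum (N* = 0 for both variables; the remaining values did not survive transcription).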

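16 Model to fit: $\text{Time}_i = \alpha + \beta\,\text{Invoices}_i + \varepsilon_i$. This is a linear model. We need estimates of $\alpha$ and $\beta$, and standard errors (SE) for those estimates. We use Minitab to calculate the estimates of $\alpha$ and $\beta$.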
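A minimal sketch of how the estimates of $\alpha$ and $\beta$ can be computed, assuming ordinary least squares is the fitting method. The function name and the data are hypothetical; they are not the invoice data from the lecture.

```python
from statistics import mean

def fit_simple_regression(x, y):
    """Least-squares estimates (alpha, beta) for y = alpha + beta * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta = sxy / sxx                 # slope
    alpha = y_bar - beta * x_bar     # intercept
    return alpha, beta

# Hypothetical data lying exactly on the line Time = 0.8 + 0.02 * Invoices
invoices = [10, 20, 30, 40, 50]
time_hours = [1.0, 1.2, 1.4, 1.6, 1.8]
alpha, beta = fit_simple_regression(invoices, time_hours)
```

Because the made-up points lie exactly on a line, the estimates recover that line's intercept and slope.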

18 What is going on here? What are the lines? More importantly, what are the differences?

19 Prediction vs confidence intervals. Confidence interval: for a given value $x_0$, this is an interval for the average value of the dependent variable: $\text{Point estimate} \pm t \cdot s \cdot \sqrt{\text{Distance value}}$, where $t$ has $n-(k+1)$ df and $k$ = number of predictors. $s$ = ? What does this measure? $\text{Distance value} = \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum (x_i - \bar{x})^2}$.

20 Prediction vs confidence intervals. Prediction interval: for a given value $x_0$, this is an interval for a particular value of the dependent variable: $\text{Point estimate} \pm t \cdot s \cdot \sqrt{1 + \text{Distance value}}$, where $t$ has $n-(k+1)$ df and $k$ = number of predictors. $s$ = ? What does this measure? $\text{Distance value} = \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum (x_i - \bar{x})^2}$.

21 Approximate intervals for reasonably large samples. Confidence interval: $\text{Point estimate} \pm 2 s \sqrt{\frac{1}{n}}$. Prediction interval: $\text{Point estimate} \pm 2 s \sqrt{\frac{n+1}{n}}$.
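A short sketch of the two interval formulas, to make the difference concrete. It assumes the $t$ multiplier is supplied by the user (for example from tables); the function names and the numbers are hypothetical.

```python
from math import sqrt
from statistics import mean

def distance_value(x, x0):
    """1/n + (x0 - x_bar)^2 / sum((x_i - x_bar)^2) for simple regression."""
    n, x_bar = len(x), mean(x)
    return 1 / n + (x0 - x_bar) ** 2 / sum((xi - x_bar) ** 2 for xi in x)

def intervals(point_estimate, s, t, dist):
    """Return (confidence interval, prediction interval) as (low, high) pairs."""
    ci_half = t * s * sqrt(dist)        # uncertainty in the mean response
    pi_half = t * s * sqrt(1 + dist)    # adds the scatter of a single case
    ci = (point_estimate - ci_half, point_estimate + ci_half)
    pi = (point_estimate - pi_half, point_estimate + pi_half)
    return ci, pi

x = [10, 20, 30, 40, 50]               # hypothetical predictor values
d_centre = distance_value(x, 30)       # at the mean of x
d_edge = distance_value(x, 50)         # far from the mean of x
ci, pi = intervals(point_estimate=10.0, s=1.0, t=2.0, dist=0.25)
```

The distance value is smallest at the mean of x, so both intervals are narrowest there, and the prediction interval is always wider than the confidence interval.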

22 Example Let number of invoices = 50 Where do these numbers come from roughly? 22

23 ANOVA table. Total sum of squares (SS): $\sum (Y_i - \bar{Y})^2$. Regression SS: $\sum (\hat{Y}_i - \bar{Y})^2$. Error SS: $\sum (Y_i - \hat{Y}_i)^2$. What is $R^2$?
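The three sums of squares can be checked numerically. This is a sketch on hypothetical data, assuming a least-squares fit; it uses the standard definition $R^2 = \text{Regression SS} / \text{Total SS}$, which the slide leaves as a question.

```python
from statistics import mean

def anova_sums_of_squares(x, y):
    """Return (total_ss, regression_ss, error_ss) for a least-squares line."""
    x_bar, y_bar = mean(x), mean(y)
    beta = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
            / sum((xi - x_bar) ** 2 for xi in x))
    alpha = y_bar - beta * x_bar
    fitted = [alpha + beta * xi for xi in x]
    total_ss = sum((yi - y_bar) ** 2 for yi in y)
    regression_ss = sum((fi - y_bar) ** 2 for fi in fitted)
    error_ss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return total_ss, regression_ss, error_ss

# Hypothetical data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
total_ss, regression_ss, error_ss = anova_sums_of_squares(x, y)
r_squared = regression_ss / total_ss   # proportion of variation explained
```

For a least-squares fit with an intercept, Total SS = Regression SS + Error SS, so $R^2$ can equally be written as $1 - \text{Error SS} / \text{Total SS}$.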

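24 What happens if we do the following? Let Invoices = $X$ and subtract $k$ from each case. What will change? Original model: $\text{Time} = \alpha + \beta X + \varepsilon$. Centred model: $\text{Time} = \alpha' + \beta (X - k) + \varepsilon = (\alpha' - \beta k) + \beta X + \varepsilon$, so $\alpha' = \alpha + \beta k$. The slope does not change but the intercept does: the new intercept is the expected value of Time when $X = k$. Normally we use $k$ = mean of the variable.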
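A quick numerical check of the centring result, on hypothetical data and assuming a least-squares fit: the slope is unchanged, and the intercept of the centred fit equals the original intercept plus slope times $k$.

```python
from statistics import mean

def fit_line(x, y):
    """Least-squares (intercept, slope) for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    return y_bar - slope * x_bar, slope

# Hypothetical data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
k = mean(x)                                   # centre at the mean of x

intercept, slope = fit_line(x, y)
intercept_c, slope_c = fit_line([xi - k for xi in x], y)
```

With $k$ equal to the mean of $X$, the centred intercept is simply the mean of the response.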

25 The regression equation is Time = Centered invoices 25

27 Trees data: a sample of 31 black cherry trees in the Allegheny National Forest in Pennsylvania. Volume in cubic feet; Height in feet; Diameter in inches, measured 54 inches above ground.

28 Descriptive statistics for Diameter, Height and Volume. Columns: N, N*, Mean, SE Mean, StDev, Minimum, Q1, Median, Q3, Maximum (values did not survive transcription).

35 What does the F-test mean? Testing a hypothesis. Null hypothesis $H_0: \beta_1 = \beta_2 = 0$. Alternative hypothesis $H_1$: not all $\beta$s equal 0. $F = 254.97$, df = (2, 28), $p < 0.001$. Enough evidence against the null hypothesis.
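A sketch of where the F statistic and its degrees of freedom come from, using the standard overall F-test definition (mean square for regression divided by mean square for error). The sums of squares passed in below are hypothetical, not the values from the trees output.

```python
def f_statistic(regression_ss, error_ss, n, k):
    """Overall F statistic for a regression with k predictors and n cases."""
    df_regression = k
    df_error = n - (k + 1)
    f = (regression_ss / df_regression) / (error_ss / df_error)
    return f, (df_regression, df_error)

# Hypothetical sums of squares: 5 cases, 1 predictor
f_small, df_small = f_statistic(regression_ss=3.6, error_ss=2.4, n=5, k=1)

# Degrees of freedom for 31 trees and 2 predictors (Height, Diameter)
_, df_trees = f_statistic(regression_ss=1.0, error_ss=1.0, n=31, k=2)
```

With 31 trees and 2 predictors the degrees of freedom are (2, 28), matching the slide.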

36 Interpretation of coefficients: $\text{Volume} = \beta_0 + \beta_1 \text{Height} + \beta_2 \text{Diameter} + \varepsilon$. E(Volume), or Predicted(Volume), sometimes written as $\hat{Y}$, is $-58.0 + 0.34\,\text{Height} + 4.71\,\text{Diameter}$. The constant ($-58.0$) is the mean response when Height = 0 and Diameter = 0. $\beta_1$ is the change in mean response per unit increase in Height when Diameter is held constant (at any value). Similarly, $\beta_2$ is the change in mean response per unit increase in Diameter when Height is held constant (at any value).

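37 And a little more. Example: let Diameter = 12. Then $E(\text{Volume}) = \hat{\beta}_0 + \hat{\beta}_1 \text{Height} + \hat{\beta}_2 \times 12 = (\hat{\beta}_0 + 12 \hat{\beta}_2) + \hat{\beta}_1 \text{Height}$. The intercept changes but $\beta_1$ stays the same: the effect of Height on the mean response does not depend on Diameter. We say the effects are additive, or do not interact. These are partial regression coefficients.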
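A small sketch of what "additive" means here. It uses the rounded coefficients quoted on the slides (constant -58.0, Diameter 4.71, and 0.34 for Height from the multiple regression); the heights and diameters plugged in are hypothetical.

```python
# Rounded coefficients as quoted in the slides
B0, B_HEIGHT, B_DIAMETER = -58.0, 0.34, 4.71

def predicted_volume(height, diameter):
    """Mean Volume from the additive model (no interaction term)."""
    return B0 + B_HEIGHT * height + B_DIAMETER * diameter

# Effect of one extra foot of Height at two different diameters
effect_at_12 = predicted_volume(76, 12) - predicted_volume(75, 12)
effect_at_15 = predicted_volume(76, 15) - predicted_volume(75, 15)

# Fixing Diameter = 12 only shifts the intercept
intercept_at_12 = B0 + B_DIAMETER * 12
```

The Height effect is the same at either diameter, while the intercept of the Height line depends on which diameter is fixed.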

38 Changing coefficients (standard errors in brackets). Height by itself: 1.54 (0.38). Diameter by itself: 5.07 (0.25). Multiple regression: Height given Diameter 0.34 (0.13); Diameter given Height 4.71 (0.26).

39 Sums of squares: same calculation as before. Sequential sums of squares, fitting Diameter then Height: Diameter; Height given Diameter. Sequential sums of squares, fitting Height then Diameter: Height; Diameter given Height.

40 Derived variables: create a new x from the given x-variables. It could be a transformation or a combination. Use background knowledge to create the new variable. A tree is crudely modelled by a cylinder: $\text{cylinder vol} = \pi r^2 \times \text{ht} = \frac{\pi}{4} (\text{Diam})^2 \times \text{ht} \propto \text{ht} \times (\text{Diam})^2$.
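A sketch of the derived variable, assuming Diameter is in inches and Height in feet as in the trees data, so the diameter is converted to feet before computing a volume in cubic feet. The function names and the example trees are hypothetical.

```python
from math import pi

def cylinder_volume_cubic_feet(diameter_in, height_ft):
    """Volume of a cylinder with diameter in inches and height in feet."""
    radius_ft = diameter_in / 12 / 2      # inches -> feet, diameter -> radius
    return pi * radius_ft ** 2 * height_ft

def derived_predictor(diameter_in, height_ft):
    """Derived x-variable: height times diameter squared."""
    return height_ft * diameter_in ** 2

# Two hypothetical trees
vol_a = cylinder_volume_cubic_feet(12, 10)
ratio_a = vol_a / derived_predictor(12, 10)
ratio_b = cylinder_volume_cubic_feet(18, 80) / derived_predictor(18, 80)
```

The ratio of cylinder volume to the derived predictor is the same constant for every tree, which is why a single new variable can stand in for Height and Diameter together.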

41 Plot first 41

46 Transform using logs: $y = \log_b a$ means $b^y = a$; for example $2^3 = 8$, so $\log_2 8 = 3$. $b$ is called the base. Typical bases are $e$ and 10; we are going to use base 10. $e$ is a mathematical constant, approximately 2.718. Logs to the base $e$ are called natural logs, often written as ln.

47 Basic rules for logs using base 10: $\log(10) = 1$; $\log(10^a) = a$; $\log(1) = 0$; $\log(0)$ is not defined; $\log(x^r) = r \log(x)$; $10^{\log(a)} = a$. The Richter scale for measuring earthquake strength is on a $\log_{10}$ scale.

48 And some more: $\log(ab) = \log(a) + \log(b)$; $\log\left(\frac{a}{b}\right) = \log(a) - \log(b)$; $10^{ab} = (10^a)^b$; $10^{a+b} = 10^a \, 10^b$; $10^{a-b} = \frac{10^a}{10^b}$.
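The log rules can be verified numerically. A minimal sketch using Python's standard `math` module; the values of `a`, `b` and `r` are arbitrary.

```python
from math import isclose, log, log10

a, b, r = 8.0, 2.0, 3.0   # arbitrary positive numbers

checks = {
    "log(10) = 1": log10(10) == 1.0,
    "log(1) = 0": log10(1) == 0.0,
    "log(10^a) = a": isclose(log10(10 ** a), a),
    "10^log(a) = a": isclose(10 ** log10(a), a),
    "log(x^r) = r log(x)": isclose(log10(a ** r), r * log10(a)),
    "log(ab) = log(a) + log(b)": isclose(log10(a * b), log10(a) + log10(b)),
    "log(a/b) = log(a) - log(b)": isclose(log10(a / b), log10(a) - log10(b)),
    "10^(a+b) = 10^a 10^b": isclose(10 ** (a + b), 10 ** a * 10 ** b),
    "10^(a-b) = 10^a / 10^b": isclose(10 ** (a - b), 10 ** a / 10 ** b),
    "log_2(8) = 3": isclose(log(8, 2), 3.0),
}

# log(0) is not defined: math.log10 raises ValueError for 0
try:
    log10(0)
    log_zero_defined = True
except ValueError:
    log_zero_defined = False
```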

49 What are we going to do with all this? In a linear model we can take logs of X, of Y, or of both. What we are interested in is interpreting the coefficients in the original scale. We will see later when each is appropriate. Let us start with the model $Y = \alpha + \beta \log(x) + \varepsilon$.

52 Interpretation of coefficients: a 1-unit increase in $\log(x)$ is associated with a $\beta$ increase in $Y$. Since $\log(x) + 1 = \log(x) + \log(10) = \log(10x)$, a 1-unit increase in $\log(x)$ means multiplying $X$ by 10, which as a percentage is a $(10 - 1) \times 100\% = 900\%$ increase in $X$. So $\beta$ is the expected change in $Y$ when $X$ is multiplied by 10, that is, when $X$ increases by 900%.

53 And more. For other percentage changes $p$: a $p\%$ increase in $X$ is associated with a $\beta \log\left(\frac{100 + p}{100}\right)$ increase in $Y$. A 10% increase in $X$ is associated with a $\beta \log\left(\frac{110}{100}\right) = \beta \log(1.1) = \beta \times 0.041$ increase in $Y$.

54 What does this mean? Fitted model: Volume regressed on logheight, with slope 262. An increase of 1 in logheight will increase Volume by 262, so multiplying height by 10 will increase Volume by 262. A 10% increase in height will increase Volume by $\beta \log\left(\frac{100 + p}{100}\right) = 262 \times \log(1.1) \approx 10.8$.
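A sketch of the percentage-change rule for the model $Y = \alpha + \beta \log_{10}(X)$, using the slope of 262 quoted on the slide. The function name is hypothetical.

```python
from math import log10

def change_in_y(beta, percent_increase_in_x):
    """Expected change in Y when X increases by p percent, for Y = a + b*log10(X)."""
    return beta * log10((100 + percent_increase_in_x) / 100)

beta = 262                                  # slope quoted on the slide
change_for_10_percent = change_in_y(beta, 10)
change_for_900_percent = change_in_y(beta, 900)   # X multiplied by 10
```

A 900% increase is the same as multiplying X by 10, so it returns the full slope, while a 10% increase returns only about 4% of it.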

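55 Next situation: $\log(Y) = \alpha + \beta X + \varepsilon$. A 1-unit increase in $X$ is associated with a $\beta$ increase in $\log Y$. Going from $\log Y$ to $\log Y + \beta$ on the log scale gives $10^{\log Y + \beta} = Y \times 10^{\beta}$ in the original units. Each 1-unit increase in $X$ multiplies the expected value of $Y$ by $10^{\beta}$. The effect of a $c$-unit increase in $X$ is to multiply the expected value of $Y$ by $10^{c\beta}$.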

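56 More. Calculate ch $= 10^{\beta}$, the factor by which $Y$ is multiplied. Calculate $(\text{ch} - 1) \times 100$ to get the percentage change. ch = 1.20 implies a 20% increase; ch = 0.7 implies a 30% decrease.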
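A sketch of the multiplier calculation for the model $\log_{10}(Y) = \alpha + \beta X$. The function names are hypothetical, and the slopes are chosen only to reproduce the two examples on the slide (ch = 1.20 and ch = 0.7).

```python
from math import log10

def multiplier(beta, c=1):
    """Factor by which Y is multiplied when X increases by c units."""
    return 10 ** (c * beta)

def percent_change(beta, c=1):
    """Percentage change in Y for a c-unit increase in X."""
    return (multiplier(beta, c) - 1) * 100

# Slopes chosen so that the multipliers are 1.20 and 0.7
increase = percent_change(log10(1.20))    # about +20%
decrease = percent_change(log10(0.7))     # about -30%
two_units = multiplier(log10(1.20), c=2)  # 1.2 * 1.2
```

Effects on the log scale add, so effects on the original scale multiply: a 2-unit increase applies the 1-unit multiplier twice.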

58 And now: logvolume regressed on Height, with slope $\hat{\beta}$. A 1-unit increase in Height increases logvolume by $\hat{\beta}$. Each unit increase in Height multiplies Volume by $10^{\hat{\beta}} = 1.055$, a 5.5% increase.

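59 Last situation: $\log Y = \alpha + \beta \log(x) + \varepsilon$. A 1-unit increase in $\log(x)$ is associated with a $\beta$ increase in $\log Y$. A $p\%$ increase in $X$ is associated with an increase of $a = \beta \log\left(\frac{100 + p}{100}\right)$ in $\log Y$. Going from $\log Y$ to $\log Y + a$ multiplies $Y$ by $10^{a}$.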

61 Again some interpretation. Fitted model: logvolume regressed on logheight, with slope 3.98. A 1-unit increase in logheight will increase logvolume by 3.98. Multiplying height by 10 multiplies Volume by $10^{3.98}$. A 10% increase in height multiplies Volume by $10^{3.98 \log(1.1)} = 1.46$, so a 10% increase in height is associated with a 46% increase in Volume.
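A sketch of the log-log calculation, using the slope of 3.98 quoted on the slide. The function name is hypothetical.

```python
from math import log10

def y_multiplier(beta, percent_increase_in_x):
    """Factor multiplying Y when X rises by p percent, for log10(Y) = a + b*log10(X)."""
    a = beta * log10((100 + percent_increase_in_x) / 100)  # shift on the log scale
    return 10 ** a

beta = 3.98                                  # slope quoted on the slide
mult = y_multiplier(beta, 10)                # effect of a 10% taller tree
percent_increase_in_volume = (mult - 1) * 100
```

Algebraically the multiplier is $(1 + p/100)^{\beta}$, here $1.1^{3.98}$.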

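62 Interpretation: writing the fitted model as logvolume $= \hat{\alpha} + 3.98\,\text{logheight}$, where $\hat{\alpha}$ is the fitted intercept, we can write $10^{\text{logvolume}} = 10^{\hat{\alpha} + 3.98\,\text{logheight}}$, that is, $\text{Volume} = 10^{\hat{\alpha}} \times \text{Height}^{3.98}$. This is sometimes called a multiplicative model. Using the above for prediction with Height = 85: remember to use $\log_{10}(85) = 1.929$. Using Minitab we get (1.2412, …) as the PI on the log scale.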

63 In the original units: $(10^{1.2412}, 10^{\ldots}) = (17.42, 99.38)$. The prediction for Height = 85 is not in the centre of this interval. We will return to when to use logs.

64 Interpret coefficients in the original scale. Calculate the predicted Sunday circulation for a weekday circulation of 300,000: both the prediction and the CI. You can just use the approximate solution.

65 Interpretations: a 10% increase in weekly circulation is associated with Sunday circulation being multiplied by $10^{1.05 \log(1.1)}$, and with the equivalent % increase in Sunday circulation. Table columns: increase in weekly; increase in Sunday; % increase in Sunday.

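66 Approximate confidence intervals. Calculate CIs for a weekly circulation of 300,000. Predicted value (intercept plus slope $\times \log(300000)$) = 5.62 on the log scale. $N = 89$; $s = 0.056$. Approximate 95% CI $= 5.617 \pm 2 \times 0.056 \times \sqrt{\frac{1}{89}} = (5.605, 5.629)$ on the log scale, which is 402,835 to 425,473 in the original units.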
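The arithmetic on this slide can be reproduced in a few lines. This sketch assumes the approximate formula, point estimate $\pm\, 2 s \sqrt{1/n}$, and that logs are base 10; the variable names are hypothetical.

```python
from math import sqrt

predicted_log = 5.617     # predicted log10(Sunday circulation) from the slide
s = 0.056                 # s from the slide
n = 89                    # number of cases from the slide

half_width = 2 * s * sqrt(1 / n)
ci_log = (predicted_log - half_width, predicted_log + half_width)

# Back-transform each end point to the original units
ci_original = (10 ** ci_log[0], 10 ** ci_log[1])
```

The interval is symmetric about 5.617 on the log scale but not about $10^{5.617}$ in the original units, because the back-transformation stretches the upper end more than the lower end.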

67 Great chapter on derived variables: Linoff, G. S. & Berry, M. J. A. Data Mining Techniques, 3rd edition. Indianapolis: Wiley.

68 Some summary thoughts: get to know the story of your data. Use simple plots and summary statistics. Does it look OK? Think about derived variables. Start simply. Don't forget your common sense.


Problem #1 #2 #3 #4 #5 #6 Total Points /6 /8 /14 /10 /8 /10 /56 STAT 391 - Spring Quarter 2017 - Midterm 1 - April 27, 2017 Name: Student ID Number: Problem #1 #2 #3 #4 #5 #6 Total Points /6 /8 /14 /10 /8 /10 /56 Directions. Read directions carefully and show all your

More information

appstats27.notebook April 06, 2017

appstats27.notebook April 06, 2017 Chapter 27 Objective Students will conduct inference on regression and analyze data to write a conclusion. Inferences for Regression An Example: Body Fat and Waist Size pg 634 Our chapter example revolves

More information

OPEN QUESTIONS FOR MIDDLE SCHOOL MATH. Marian Small NOVEMBER 2018

OPEN QUESTIONS FOR MIDDLE SCHOOL MATH. Marian Small NOVEMBER 2018 OPEN QUESTIONS FOR MIDDLE SCHOOL MATH Marian Small NOVEMBER 2018 1 LET S DO A LITTLE MATH The answer is 30% What might the question be? 2 maybe What is a percent less than half? What is 3/10? What is a

More information

Regression. Marc H. Mehlman University of New Haven

Regression. Marc H. Mehlman University of New Haven Regression Marc H. Mehlman marcmehlman@yahoo.com University of New Haven the statistician knows that in nature there never was a normal distribution, there never was a straight line, yet with normal and

More information

Nature vs. nurture? Lecture 18 - Regression: Inference, Outliers, and Intervals. Regression Output. Conditions for inference.

Nature vs. nurture? Lecture 18 - Regression: Inference, Outliers, and Intervals. Regression Output. Conditions for inference. Understanding regression output from software Nature vs. nurture? Lecture 18 - Regression: Inference, Outliers, and Intervals In 1966 Cyril Burt published a paper called The genetic determination of differences

More information

Summer Work Packet For Students Entering Algebra 1 Honors

Summer Work Packet For Students Entering Algebra 1 Honors June 2017 Summer Work Packet For Students Entering Algebra 1 Honors Dear Student, Welcome! I have prepared a summer work packet for you to help you better prepare for your upcoming course, Algebra 1 Honors.

More information

Lecture 18 Miscellaneous Topics in Multiple Regression

Lecture 18 Miscellaneous Topics in Multiple Regression Lecture 18 Miscellaneous Topics in Multiple Regression STAT 512 Spring 2011 Background Reading KNNL: 8.1-8.5,10.1, 11, 12 18-1 Topic Overview Polynomial Models (8.1) Interaction Models (8.2) Qualitative

More information

Analysis of Variance. Source DF Squares Square F Value Pr > F. Model <.0001 Error Corrected Total

Analysis of Variance. Source DF Squares Square F Value Pr > F. Model <.0001 Error Corrected Total Math 221: Linear Regression and Prediction Intervals S. K. Hyde Chapter 23 (Moore, 5th Ed.) (Neter, Kutner, Nachsheim, and Wasserman) The Toluca Company manufactures refrigeration equipment as well as

More information

Correlation and regression

Correlation and regression 1 Correlation and regression Yongjua Laosiritaworn Introductory on Field Epidemiology 6 July 2015, Thailand Data 2 Illustrative data (Doll, 1955) 3 Scatter plot 4 Doll, 1955 5 6 Correlation coefficient,

More information

Multiple Linear Regression

Multiple Linear Regression Andrew Lonardelli December 20, 2013 Multiple Linear Regression 1 Table Of Contents Introduction: p.3 Multiple Linear Regression Model: p.3 Least Squares Estimation of the Parameters: p.4-5 The matrix approach

More information

Stat 501, F. Chiaromonte. Lecture #8

Stat 501, F. Chiaromonte. Lecture #8 Stat 501, F. Chiaromonte Lecture #8 Data set: BEARS.MTW In the minitab example data sets (for description, get into the help option and search for "Data Set Description"). Wild bears were anesthetized,

More information

Formative Assignment PART A

Formative Assignment PART A MHF4U_2011: Advanced Functions, Grade 12, University Preparation Unit 2: Advanced Polynomial and Rational Functions Activity 2: Families of polynomial functions Formative Assignment PART A For each of

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression 1 Correlation indicates the magnitude and direction of the linear relationship between two variables. Linear Regression: variable Y (criterion) is predicted by variable X (predictor)

More information

ACOVA and Interactions

ACOVA and Interactions Chapter 15 ACOVA and Interactions Analysis of covariance (ACOVA) incorporates one or more regression variables into an analysis of variance. As such, we can think of it as analogous to the two-way ANOVA

More information

SMAM 314 Computer Assignment 5 due Nov 8,2012 Data Set 1. For each of the following data sets use Minitab to 1. Make a scatterplot.

SMAM 314 Computer Assignment 5 due Nov 8,2012 Data Set 1. For each of the following data sets use Minitab to 1. Make a scatterplot. SMAM 314 Computer Assignment 5 due Nov 8,2012 Data Set 1. For each of the following data sets use Minitab to 1. Make a scatterplot. 2. Fit the linear regression line. Regression Analysis: y versus x y

More information

Chapter 7 Student Lecture Notes 7-1

Chapter 7 Student Lecture Notes 7-1 Chapter 7 Student Lecture Notes 7- Chapter Goals QM353: Business Statistics Chapter 7 Multiple Regression Analysis and Model Building After completing this chapter, you should be able to: Explain model

More information

Los Angeles Unified School District Secondary Mathematics Branch

Los Angeles Unified School District Secondary Mathematics Branch Essential Standards in Mathematics (Grade 10, 11 or 12) Los Angeles Unified School District 310209 Essential Standards in Mathematics COURSE DESCRIPTION This one semester course is designed as a preparation

More information

The Model Building Process Part I: Checking Model Assumptions Best Practice (Version 1.1)

The Model Building Process Part I: Checking Model Assumptions Best Practice (Version 1.1) The Model Building Process Part I: Checking Model Assumptions Best Practice (Version 1.1) Authored by: Sarah Burke, PhD Version 1: 31 July 2017 Version 1.1: 24 October 2017 The goal of the STAT T&E COE

More information

Chapter 16. Simple Linear Regression and Correlation

Chapter 16. Simple Linear Regression and Correlation Chapter 16 Simple Linear Regression and Correlation 16.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will

More information

Chapter 5. Logistic Regression

Chapter 5. Logistic Regression Chapter 5 Logistic Regression In logistic regression, there is s categorical dependent variables, often coded 1=Yes and 0=No. Many important phenomena fit this framework. The patient survives the operation,

More information

Data Analyses in Multivariate Regression Chii-Dean Joey Lin, SDSU, San Diego, CA

Data Analyses in Multivariate Regression Chii-Dean Joey Lin, SDSU, San Diego, CA Data Analyses in Multivariate Regression Chii-Dean Joey Lin, SDSU, San Diego, CA ABSTRACT Regression analysis is one of the most used statistical methodologies. It can be used to describe or predict causal

More information

Lectures 5 & 6: Hypothesis Testing

Lectures 5 & 6: Hypothesis Testing Lectures 5 & 6: Hypothesis Testing in which you learn to apply the concept of statistical significance to OLS estimates, learn the concept of t values, how to use them in regression work and come across

More information

THE EFFECTS OF MULTICOLLINEARITY IN ORDINARY LEAST SQUARES (OLS) ESTIMATION

THE EFFECTS OF MULTICOLLINEARITY IN ORDINARY LEAST SQUARES (OLS) ESTIMATION THE EFFECTS OF MULTICOLLINEARITY IN ORDINARY LEAST SQUARES (OLS) ESTIMATION Weeraratne N.C. Department of Economics & Statistics SUSL, BelihulOya, Sri Lanka ABSTRACT The explanatory variables are not perfectly

More information

MFin Econometrics I Session 5: F-tests for goodness of fit, Non-linearity and Model Transformations, Dummy variables

MFin Econometrics I Session 5: F-tests for goodness of fit, Non-linearity and Model Transformations, Dummy variables MFin Econometrics I Session 5: F-tests for goodness of fit, Non-linearity and Model Transformations, Dummy variables Thilo Klein University of Cambridge Judge Business School Session 5: Non-linearity,

More information

SCHOOL OF MATHEMATICS AND STATISTICS

SCHOOL OF MATHEMATICS AND STATISTICS SHOOL OF MATHEMATIS AND STATISTIS Linear Models Autumn Semester 2015 16 2 hours Marks will be awarded for your best three answers. RESTRITED OPEN BOOK EXAMINATION andidates may bring to the examination

More information

Regression, part II. I. What does it all mean? A) Notice that so far all we ve done is math.

Regression, part II. I. What does it all mean? A) Notice that so far all we ve done is math. Regression, part II I. What does it all mean? A) Notice that so far all we ve done is math. 1) One can calculate the Least Squares Regression Line for anything, regardless of any assumptions. 2) But, if

More information

2.4.3 Estimatingσ Coefficient of Determination 2.4. ASSESSING THE MODEL 23

2.4.3 Estimatingσ Coefficient of Determination 2.4. ASSESSING THE MODEL 23 2.4. ASSESSING THE MODEL 23 2.4.3 Estimatingσ 2 Note that the sums of squares are functions of the conditional random variables Y i = (Y X = x i ). Hence, the sums of squares are random variables as well.

More information

Unit 11: Multiple Linear Regression

Unit 11: Multiple Linear Regression Unit 11: Multiple Linear Regression Statistics 571: Statistical Methods Ramón V. León 7/13/2004 Unit 11 - Stat 571 - Ramón V. León 1 Main Application of Multiple Regression Isolating the effect of a variable

More information

Biostatistics and Design of Experiments Prof. Mukesh Doble Department of Biotechnology Indian Institute of Technology, Madras

Biostatistics and Design of Experiments Prof. Mukesh Doble Department of Biotechnology Indian Institute of Technology, Madras Biostatistics and Design of Experiments Prof. Mukesh Doble Department of Biotechnology Indian Institute of Technology, Madras Lecture - 39 Regression Analysis Hello and welcome to the course on Biostatistics

More information

Chapter 9. Correlation and Regression

Chapter 9. Correlation and Regression Chapter 9 Correlation and Regression Lesson 9-1/9-2, Part 1 Correlation Registered Florida Pleasure Crafts and Watercraft Related Manatee Deaths 100 80 60 40 20 0 1991 1993 1995 1997 1999 Year Boats in

More information

Intro to Linear Regression

Intro to Linear Regression Intro to Linear Regression Introduction to Regression Regression is a statistical procedure for modeling the relationship among variables to predict the value of a dependent variable from one or more predictor

More information