Intro to Linear Regression


Introduction to Regression Regression is a statistical procedure for modeling the relationship among variables to predict the value of a dependent variable from one or more predictor variables. Imagine that I ask you to guess the weight of a college-aged male who is hidden from view. What would your best guess be?

Introduction to Regression [Figure: distribution of weights, with M_weight = 158.26 and s_weight = 18.64]

Introduction to Regression Regression is a statistical procedure for modeling the relationship among variables to predict the value of a dependent variable from one or more predictor variables. Imagine that I ask you to guess the weight of a college-aged male who is hidden from view. What would your best guess be? What if I also gave you his height? Intuitively, it should be clear that you can do better.

Introduction to Regression

Introduction to Regression The Pearson correlation, which we covered in the last lecture, measures the degree to which a set of data points forms a linear (straight-line) relationship. Simple regression describes the linear relationship between a dependent variable (Y) and one predictor variable (X). The resulting line is called the regression line.

Regression and Linear Equations You should remember the following from your high school algebra course: Any straight line can be represented by an equation of the form Y = bX + a, where b and a are constants. The value of b is called the slope and determines the direction and degree to which the line is tilted. The value of a is called the Y-intercept and determines the point where the line crosses the Y-axis. In the context of linear regression, a and b are called regression coefficients.

Regression and Linear Equations [Figure: line with b = 0.5 and a = 1.0, i.e. Ŷ = bX + a = 0.5X + 1]

Residuals: Errors of Prediction How well a regression line fits a set of data points can be measured by calculating the distance between the data points and the line. Using the formula Ŷ = bX + a, it is possible to find the predicted value Ŷ for any X. The residual, or error of prediction, between the predicted value and the actual value can be found by computing the difference Y − Ŷ. The regression line is selected to be the best fit in the least-squares sense. This means that we want to compute the line that minimizes the sum of squared residuals: SS_residual = Σ(Y − Ŷ)²
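This "best fit in the least-squares sense" claim can be checked directly. A small sketch in Python (the data here are made up for illustration): the slope from the least-squares formula gives a smaller sum of squared residuals than nearby alternative slopes.

```python
# Hypothetical data, for illustration only
X = [1, 2, 3, 4]
Y = [2, 3, 5, 6]
n = len(X)
mx, my = sum(X) / n, sum(Y) / n

# Least-squares slope and intercept
b = sum((x - mx) * (y - my) for x, y in zip(X, Y)) / sum((x - mx) ** 2 for x in X)
a = my - b * mx

def ss_residual(slope):
    """Sum of squared residuals for a line through (mx, my) with the given slope."""
    intercept = my - slope * mx
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(X, Y))

print(b, a, ss_residual(b))   # 1.4 0.5 0.2 (approximately)

# The least-squares slope beats nearby slopes
assert ss_residual(b) < ss_residual(b - 0.1)
assert ss_residual(b) < ss_residual(b + 0.1)
```

Any other slope (and the intercept that pairs with it) produces a larger SS_residual, which is exactly what "least squares" means.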

Residuals: Errors of Prediction [Figure: scatter plot with regression line Ŷ = bX + a; residuals shown as vertical distances Y − Ŷ]

The Standard Error of Estimate The measure of unpredicted variability or error for the regression line is called the standard error of estimate (s_e or s_{Y−Ŷ}). You can think of it as analogous to the standard deviation if we were to use the mean M as our estimate of the variable: s = √(SS/df) = √(Σ(Y − M)²/(n − 1)) ; s_{Y−Ŷ} = √(SS_residual/df_residual) = √(Σ(Y − Ŷ)²/(n − 2))

Computing Regression Coefficients b = (change in Y as a function of X)/(change in X) = SP/SS_X, or b = r·s_Y/s_X, where SP = Σ(X − M_X)(Y − M_Y) and SS_X = Σ(X − M_X)² ; a = M_Y − b·M_X
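The two formulas for b are equivalent, since r = cov_XY/(s_X·s_Y) and cov_XY = SP/(n − 1). A quick check in Python, with hypothetical numbers:

```python
# Hypothetical heights (X) and weights (Y), for illustration only
X = [66, 70, 74]
Y = [140, 155, 188]
n = len(X)
mx, my = sum(X) / n, sum(Y) / n

sp  = sum((x - mx) * (y - my) for x, y in zip(X, Y))  # SP, sum of products
ssx = sum((x - mx) ** 2 for x in X)                   # SS_X

b_sp = sp / ssx                                       # b = SP / SS_X

# Same slope via b = r * s_Y / s_X (sample SDs, n - 1 in the denominator)
sx = (ssx / (n - 1)) ** 0.5
sy = (sum((y - my) ** 2 for y in Y) / (n - 1)) ** 0.5
r = (sp / (n - 1)) / (sx * sy)
b_r = r * sy / sx

a = my - b_sp * mx
print(round(b_sp, 6), round(b_r, 6), a)   # 6.0 6.0 -259.0
assert abs(b_sp - b_r) < 1e-9
```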

Example
X (height)   Y (weight)
70           150
67           140
72           180
75           190
68           145
69           150
71.5         164
71           140
72           142
69           136
67           123
68           155
66           140
72           145
73.5         160
73           190
69           155
73           165
72           150
74           190
M_X = 70.6, M_Y = 155.5, s_X = 2.6, s_Y = 19.2, cov_XY = 36.8

Example: Computing Regression Coefficients M_X = 70.6, M_Y = 155.5, s_X = 2.6, s_Y = 19.2, cov_XY = 36.8. Compute r: r = cov_XY/(s_X·s_Y) = 36.8/(2.6 × 19.2) = 0.737. Compute b: b = r·s_Y/s_X = 0.737 × 19.2/2.6 = 5.44. Compute a: a = M_Y − b·M_X = 155.5 − 5.44 × 70.6 = −228.56. So, Ŷ = bX + a = 5.44X − 228.56

Example: Predicting Y from X You are told that a college-aged male is 74 inches tall. Given the computed regression coefficients, what is your best estimate of his weight? X: height, Y: weight. Ŷ = bX + a = 5.44X − 228.56 ; Ŷ = 5.44 × 74 − 228.56 = 402.56 − 228.56 = 174

Example: Computing Accuracy of Prediction M_X = 70.6, M_Y = 155.5, s_X = 2.6, s_Y = 19.2, cov_XY = 36.8, r = 0.737. Two measures for accuracy of prediction: the standard error of estimate (s_e or s_{Y−Ŷ}), interpreted as the standard deviation of the error around the regression line, and r², interpreted as the % of variance accounted for by the regression model. r² = cov²_XY/(s_X²·s_Y²) = variation explained / total variation ; r² = 0.737² = 0.54. s_{Y−Ŷ} = √(Σ(Y − Ŷ)²/(n − 2)) = s_Y·√((1 − r²)(n − 1)/(n − 2)) = 19.2·√(0.46 × 19/18) = 13.38
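The whole worked example can be reproduced in Python. Note that the slides round intermediate values (s_X = 2.6, r = 0.737, etc.), so the unrounded results below differ very slightly from the slide numbers (b ≈ 5.45 rather than 5.44, a ≈ −229.14 rather than −228.56, s_e ≈ 13.33 rather than 13.38); the prediction at 74 inches still comes out at 174.

```python
# Heights (X, inches) and weights (Y, pounds) from the example slide
X = [70, 67, 72, 75, 68, 69, 71.5, 71, 72, 69,
     67, 68, 66, 72, 73.5, 73, 69, 73, 72, 74]
Y = [150, 140, 180, 190, 145, 150, 164, 140, 142, 136,
     123, 155, 140, 145, 160, 190, 155, 165, 150, 190]
n = len(X)
mx, my = sum(X) / n, sum(Y) / n                       # 70.6, 155.5

sp  = sum((x - mx) * (y - my) for x, y in zip(X, Y))
ssx = sum((x - mx) ** 2 for x in X)
ssy = sum((y - my) ** 2 for y in Y)

b = sp / ssx                                          # slope
a = my - b * mx                                       # intercept
r = sp / (ssx * ssy) ** 0.5                           # correlation
r2 = r * r                                            # proportion of variance accounted for

ss_res = sum((y - (b * x + a)) ** 2 for x, y in zip(X, Y))
se = (ss_res / (n - 2)) ** 0.5                        # standard error of estimate

print(round(r, 3), round(r2, 3))    # 0.737 0.544
print(round(b, 2), round(a, 2))     # 5.45 -229.14
print(round(b * 74 + a, 1))         # predicted weight at 74 inches: 174.0
print(round(se, 2))                 # 13.33
```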

Example: Computing Accuracy of Prediction Just as σ or s can be used to compute confidence intervals for population means, s_{Y−Ŷ} can be used to compute predictive intervals for Y: Ŷ ± t_crit(df)·s_{Y−Ŷ}

Example: Computing Accuracy of Prediction Just as σ or s can be used to compute confidence intervals for population means, s_{Y−Ŷ} can be used to compute predictive intervals for Y: Ŷ ± t_crit(df)·s_{Y−Ŷ}·√(1 + 1/n + (x − M_X)²/SS_X) Note that the actual formula for the predictive interval is slightly more complicated and depends on x.
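A sketch of the predictive interval in Python, using summary values computed from the example data (SS_X ≈ 128.3 and s_e ≈ 13.33 are not stated on the slides; they follow from the raw data) and the standard two-tailed critical value t(18) = 2.101 for a 95% interval:

```python
# 95% predictive interval for the weight of a 74-inch-tall male
# (assumed summary values: SS_X = 128.3, s_e = 13.33, computed from the data)
n, mx, ssx, se = 20, 70.6, 128.3, 13.33
b, a = 5.44, -228.56           # slide-rounded regression coefficients
t_crit = 2.101                 # two-tailed t critical value, df = n - 2 = 18, alpha = .05

x = 74
y_hat = b * x + a                                  # approximately 174.0
factor = (1 + 1 / n + (x - mx) ** 2 / ssx) ** 0.5  # interval widens as x moves from M_X
margin = t_crit * se * factor

print(round(y_hat, 1), round(margin, 1))
print((round(y_hat - margin, 1), round(y_hat + margin, 1)))
```

The √(1 + 1/n + (x − M_X)²/SS_X) factor is why the interval is narrowest at x = M_X and fans out for x values far from the mean.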

Standardized Regression The standardized regression coefficient (β) is computed by first standardizing both the predictor and dependent variables (i.e., by converting both the X values and the Y values to z-scores) and then computing the regression coefficient (b) on the transformed scores. For standardized regression, the intercept is always zero. For standardized regression with a single predictor variable, β is always equal to r. Standardized regression coefficients are only really useful in multiple regression, where there are multiple predictor variables. In these cases, standardizing can make it easier to determine the relative contribution of the different predictor variables to the regression model.
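A quick Python check of the β = r claim, with hypothetical data: z-score both variables, refit the line on the transformed scores, and compare the slope to r.

```python
# Hypothetical data, for illustration only
X = [1, 2, 3, 4]
Y = [2, 3, 5, 6]

def zscores(v):
    """Convert to z-scores using the sample SD (n - 1 denominator)."""
    m = sum(v) / len(v)
    s = (sum((x - m) ** 2 for x in v) / (len(v) - 1)) ** 0.5
    return [(x - m) / s for x in v]

def fit(xs, ys):
    """Least-squares slope and intercept."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return b, my - b * mx

zx, zy = zscores(X), zscores(Y)
beta, a_z = fit(zx, zy)          # slope on z-scores is beta; intercept should be 0

# Pearson r for the raw scores (means are 2.5 and 4.0)
sp = sum((x - 2.5) * (y - 4.0) for x, y in zip(X, Y))
r = sp / (sum((x - 2.5) ** 2 for x in X) * sum((y - 4.0) ** 2 for y in Y)) ** 0.5

print(round(beta, 4), round(r, 4))   # beta equals r
```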

Multiple Regression Often, researchers measure several variables that are hypothesized to predict a particular dependent variable. For example, we might be interested in how well both SAT scores and high school GPAs predict college GPAs. Multiple regression is an appropriate tool for such situations. Multiple regression describes the linear relationship between multiple predictor variables (X₁, …, Xₙ) and one criterion variable (Y). The resulting surface is called the regression surface.

Multiple Regression with Two Predictor Variables In the same way that linear regression produces an equation that uses values of X to predict values of Y, multiple regression produces an equation that uses two different variables (X₁ and X₂) to predict values of Y. The equation is determined by a least-squares solution that minimizes the squared distances between the actual Y values and the predicted Y values. For two predictor variables, the general form of the multiple regression equation is: Ŷ = b₁X₁ + b₂X₂ + a. The resulting plane is called the regression plane.
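A minimal sketch of the two-predictor case in Python, solving the 2×2 normal equations on deviation scores. The data are made up so that Y = 2X₁ + 3X₂ + 1 exactly, so the least-squares fit should recover those coefficients with zero residual error.

```python
# Hypothetical data: Y = 2*X1 + 3*X2 + 1 exactly, for illustration
X1 = [1, 2, 3, 4, 5]
X2 = [2, 1, 4, 3, 5]
Y  = [2 * x1 + 3 * x2 + 1 for x1, x2 in zip(X1, X2)]
n = len(Y)
m1, m2, my = sum(X1) / n, sum(X2) / n, sum(Y) / n

# Sums of squares and cross-products of deviation scores
s11 = sum((x - m1) ** 2 for x in X1)
s22 = sum((x - m2) ** 2 for x in X2)
s12 = sum((x1 - m1) * (x2 - m2) for x1, x2 in zip(X1, X2))
s1y = sum((x - m1) * (y - my) for x, y in zip(X1, Y))
s2y = sum((x - m2) * (y - my) for x, y in zip(X2, Y))

# Normal equations: s11*b1 + s12*b2 = s1y ; s12*b1 + s22*b2 = s2y
# Solved here by Cramer's rule
det = s11 * s22 - s12 ** 2
b1 = (s22 * s1y - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det
a  = my - b1 * m1 - b2 * m2

print(b1, b2, a)   # 2.0 3.0 1.0
```

With more than two predictors the same idea generalizes to a larger linear system, which is what statistical packages solve internally.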

Multiple Regression
