Simple Linear Regression


Simple Linear Regression

Correlation indicates the magnitude and direction of the linear relationship between two variables. Linear Regression: variable Y (the criterion) is predicted from variable X (the predictor) using a linear equation. Advantages: scores on X allow prediction of scores on Y, and regression allows multiple predictors (continuous and categorical), so you can control for other variables.

Linear Regression Equation
Geometry equation for a line: y = mx + b
Regression equation for a line (population): y = β0 + β1x
β0: point where the line intercepts the y-axis
β1: slope of the line
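As a quick sketch of what the intercept and slope mean, here is a tiny Python example with made-up values for β0 and β1 (not the lecture's data):

```python
# Hypothetical line: intercept beta0 = 2, slope beta1 = 0.5 (made-up values).
beta0, beta1 = 2.0, 0.5

def line(x):
    """Population regression line: y = beta0 + beta1 * x."""
    return beta0 + beta1 * x

print(line(0))            # 2.0 -> the line crosses the y-axis at beta0
print(line(1) - line(0))  # 0.5 -> each 1-unit step in x changes y by beta1
```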

Regression: Finding the Best-Fitting Line [scatterplot: Grade in Class data]

Best-Fitting Line: minimize the squared distance from the line across all data points. [scatterplot: Grade in Class data with fitted line]

Slope and Intercept in Scatterplots
General form: y = b0 + b1x + e
Example 1: y = -4 + 1.33x + e
Example 2: slope is rise/run = -2/1 = -2, so y = 3 - 2x + e

Estimating the Equation from a Scatterplot
rise = 15, run = 50, so slope = 15/50 = .3
y = b0 + b1x + e becomes y = 5 + .3x + e
Predict price at quality = 90: y = 5 + .3(90) = 32
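The same estimate-and-predict steps can be sketched outside SPSS; a minimal Python/numpy example, using made-up (quality, price) points that lie exactly on the slide's line so the least-squares fit recovers it:

```python
import numpy as np

# Made-up data lying exactly on y = 5 + .3x (illustrative, not real prices).
quality = np.array([10.0, 30.0, 50.0, 70.0, 90.0])
price = 5 + 0.3 * quality

# np.polyfit with degree 1 minimizes the sum of squared vertical
# distances -- the ordinary least squares criterion.
b1, b0 = np.polyfit(quality, price, 1)
print(round(b0, 3), round(b1, 3))  # 5.0 0.3

# Predict price at quality = 90: y = 5 + .3(90) = 32
print(round(b0 + b1 * 90, 3))  # 32.0
```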

Example: Van Camp, Barden & Sloan (2010)
Contact with Blacks Scale, e.g.: What percentage of your neighborhood growing up was Black? (0%-100%)
Race-Related Reasons for College Choice, e.g.: To what extent did you come to Howard specifically because the student body is predominantly Black? 1 (not very much) to 10 (very much)
Your prediction: how would prior contact predict race-related reasons?

Results: Van Camp, Barden & Sloan (2010)
Regression equation (sample): y = b0 + b1x + e
Contact (x) predicts Reasons: y = 6.926 - .223x + e
b0: t(107) = 14.17, p < .01
b1: t(107) = -2.93, p < .01
df = N - k - 1 = 109 - 1 - 1 = 107, where k is the number of predictors entered

Unstandardized and Standardized b
Unstandardized b: in the original units of X and Y; tells us how much a change in X produces a change in Y in the original units (meters, scale points, ...); does not allow comparing the relative impact of multiple predictors.
Standardized b (beta): scores are first standardized to SD units; a +1 SD change in X produces a change of beta SDs in Y; indicates the relative importance of multiple predictors of Y.

Results: Van Camp, Barden & Sloan (2010)
Contact predicts Reasons:
Unstandardized: y = 6.926 - .223x + e (Mx = 5.89, SDx = 2.53; My = 5.61, SDy = 2.08)
Standardized: y = 0 - .272x + e (Mx = 0, SDx = 1.00; My = 0, SDy = 1.00)
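The standardized slope can be recovered from the unstandardized one via b_std = b × (SDx / SDy); a quick check in Python using the slide's rounded values (the small gap from SPSS's −.272 comes from rounding the inputs):

```python
b_unstd = -0.223         # unstandardized slope from the output
sd_x, sd_y = 2.53, 2.08  # SDs of Contact (x) and Reasons (y)

# Rescale the slope from raw units to SD units.
b_std = b_unstd * sd_x / sd_y
print(round(b_std, 3))   # -0.271, close to SPSS's -.272 (input rounding)
```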

SPSS: to save new variables that are standardized versions of current variables, use Analyze → Descriptive Statistics → Descriptives and check "Save standardized values as variables".

SPSS Chart Editor: add fit lines; add reference lines (may need to adjust to the mean); select the fit line.

Predicting Y from X
Once we have a straight line, we know the change in Y with each change in X.
Y prime (Y′) is the prediction of Y at a given X; it is the average Y score at that X score.
Warning: predictions can only be made (1) within the range of the sample and (2) for individuals taken from a similar population under similar circumstances.

Errors Around the Regression Line
The regression equation gives us the straight line that minimizes the error involved in making predictions (the least squares regression line).
Residual: the difference between an actual Y value and the predicted (Y′) value: Y − Y′. It is the amount of the original value left over after the prediction is subtracted out.
The amount of error above and below the line is the same.
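The claim that error above and below the line balances out can be checked directly: for an OLS line with an intercept, the residuals Y − Y′ sum to zero. A sketch in Python with made-up data:

```python
import numpy as np

# Made-up data (not the lecture's).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 2.5, 2.0, 4.0, 4.5])

b1, b0 = np.polyfit(x, y, 1)  # least-squares slope and intercept
y_hat = b0 + b1 * x           # predicted (Y') values
residuals = y - y_hat         # Y - Y'

# Positive and negative residuals cancel: the sum is 0 up to rounding error.
print(abs(residuals.sum()) < 1e-9)  # True
```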

[scatterplot: a residual is the vertical distance Y − Y′ from a data point to the line]

Dividing Up Variance
Total: deviation of individual data points from the sample mean
Explained: deviation of the regression line from the mean
Unexplained: deviation of individual data points from the regression line (error in prediction)
(Y − Ȳ) = (Y′ − Ȳ) + (Y − Y′)
total variance = explained variance + unexplained variance (residual)


Coefficient of determination (R²): the proportion of the total variance that is explained by the predictor variable.
R² = explained variance / total variance
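R² can be computed directly from the sums of squares that SPSS reports in the ANOVA table; using the Van Camp example's values:

```python
ss_explained = 34.582  # SS for Regression in the ANOVA table
ss_total = 466.321     # Total SS

# R^2 = explained variance / total variance
r_squared = ss_explained / ss_total
print(round(r_squared, 3))  # 0.074, matching the Model Summary R Square
```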

SPSS - Regression
Analyze → Regression → Linear
Select the criterion variable (Y) [SPSS calls it DV] [Racereas]
Select the predictor variable (X) [SPSS calls it IV] [ContactBlacks]
OK

SPSS Output

Model Summary(b)
Model   R        R Square   Adjusted R Square   Std. Error of the Estimate
1       .272(a)  .074       .066                2.00872
a. Predictors: (Constant), ContactBlacksperc124
b. Dependent Variable: RaceReasons
(R Square is the coefficient of determination.)

ANOVA(b)
Model        Sum of Squares   df    Mean Square   F       Sig.
Regression   34.582           1     34.582        8.571   .004(a)
Residual     431.739          107   4.035
Total        466.321          108
a. Predictors: (Constant), ContactBlacksperc124
b. Dependent Variable: RaceReasons
(The residual sum of squares, SS error, is what OLS minimizes.)

Coefficients(a)
Model                  Unstandardized B   Std. Error   Standardized Beta   t        Sig.
(Constant)             6.926              .489                             14.172   .000
ContactBlacksperc124   -.223              .076         -.272               -2.928   .004
a. Dependent Variable: RaceReasons

Unstandardized: y = 6.926 - .223x + e
Standardized: y = 0 - .272x + e
Reporting in Results: b = -.27, t(107) = -2.93, p < .01 (p. 240 in Van Camp 2010)

Assumptions Underlying Linear Regression
1. Independent random sampling
2. Normal distribution
3. Linear relationships (not curvilinear)
4. Homoscedasticity of errors (homogeneity)
Best way to check 2-4? Diagnostic plots.

Test for Normality
Normal distribution: no action needed
Right (positive) skew → solution: transform the data
Narrow distribution → not serious
Positive outliers → investigate further

Homoscedastic? Linear Appropriate?
Homoscedastic residual errors & linear relationship → linear regression appropriate
Heteroscedasticity (of residual errors) → solution: transform the data or use weighted least squares (WLS)
Curvilinear relationship (linear regression not appropriate) → solution: add x² as a predictor
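The curvilinear fix, adding x² as a predictor, can be sketched in Python with numpy; the data below are made up and exactly quadratic, so the fit recovers the coefficients:

```python
import numpy as np

# Made-up curvilinear data: y = 1 + 0.5x + 2x^2.
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = 1 + 0.5 * x + 2 * x**2

# A degree-2 polynomial fit is linear regression with x and x^2
# as predictors; np.polyfit returns the highest power first.
b2, b1, b0 = np.polyfit(x, y, 2)
print(round(b0, 3), round(b1, 3), round(b2, 3))  # 1.0 0.5 2.0
```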

SPSS Diagnostic Graphs

END