
Regression

Bivariate linear regression: estimation of the linear function (straight line) describing the linear component of the joint relationship between two variables $X$ and $Y$. Generally we describe $Y$ as a function of $X$. Covariance and correlation measure the strength of the relationship between $X$ and $Y$: covariance depends on the units of $X$ and $Y$, whereas correlation is dimensionless. Thus the regression line and the correlation (or covariance) provide complementary information about the joint distribution.
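A tiny sketch of this distinction, with assumed toy data (the arrays and scale factor are illustrative only): rescaling $X$ rescales the covariance but leaves the correlation unchanged.

```python
import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

for scale in (1.0, 100.0):                 # e.g., metres vs. centimetres
    Xs = scale * X
    cov = np.cov(Xs, Y, ddof=1)[0, 1]      # covariance scales with the units of X
    r = np.corrcoef(Xs, Y)[0, 1]           # correlation is dimensionless
    print(cov, r)
```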

Linear components of relationships are represented by linear equations (involving only the first power of $x$). Two parameters (pieces of information) are needed to uniquely specify a line: two points, or one point and the slope. There are two comparable formulations:

Geometric: $y = mx + b$
Statistical: $y = b_0 + b_1 x$

(1) Geometric formulation: $y = mx + b$.

$m$ = slope of the line (change in $y$ relative to change in $x$). It can be determined from any two points $(x_i, y_i)$ and $(x_j, y_j)$ lying on the line:

$m = \dfrac{y_j - y_i}{x_j - x_i}$

$b$ = $y$-intercept, the value of $y$ when $x = 0$. If we know the slope $m$, we can find the intercept: substitute $m$, and the coordinates of any point on the line, into the equation and solve for $b$. In particular:

$b = y_i - m x_i$
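A minimal sketch of this recipe in plain Python (the function name and example points are illustrative assumptions):

```python
def line_through_points(p1, p2):
    """Recover slope m and intercept b of the line through two points."""
    (x1, y1), (x2, y2) = p1, p2
    m = (y2 - y1) / (x2 - x1)   # slope: change in y relative to change in x
    b = y1 - m * x1             # intercept: solve y = m*x + b at (x1, y1)
    return m, b

m, b = line_through_points((1.0, 3.0), (4.0, 9.0))
print(m, b)  # 2.0 1.0, i.e. the line y = 2x + 1
```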

(2) Statistical formulation: $y = b_0 + b_1 x$.

$b_0$ = $y$-intercept; $b_1$ = slope. This form extends to polynomials of higher order:

$y = b_0 + b_1 x + b_2 x^2 + \dots + b_k x^k$

The statistics $b_0$ and $b_1$ are sample estimates of the population parameters $\beta_0$ and $\beta_1$. They are estimated with error, from a sample of individuals that don't lie exactly on a straight line. Each individual's residual, $\epsilon_i$, expresses individual residual variation: the part of the observation that doesn't fit the linear model. It comprises two sources of variation: (a) individual biological variation, and (b) measurement error.
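A hedged sketch of fitting this polynomial form with NumPy (the toy arrays and the degree are assumptions for illustration; np.polyfit returns coefficients from the highest degree down):

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 9.2, 19.1, 32.8])   # toy data, roughly 2x^2 + x + 1

coeffs = np.polyfit(x, y, deg=2)            # least-squares fit: [b2, b1, b0]
y_hat = np.polyval(coeffs, x)               # fitted values b0 + b1*x + b2*x^2
print(coeffs)
```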

[Figure: scatterplot of the response (dependent) variable $Y$ against the predictor (independent) variable $X$. The true linear relationship is drawn with true intercept $\beta_0$ and true slope $\beta_1$; labels mark the mean response at a given $X$ and the residual variation of individual points about the line (mean residual = 0).]

There are an infinite number of different regression models, corresponding to differing assumptions about the direction of residual variation. The two most common models are: (1) predictive (type I) regression, and (2) major-axis (type II) regression.

Predictive regression

(1a) Predictive regression of $Y$ on $X$: the true population relationship is assumed to be

$Y_i = \beta_0 + \beta_1 X_i + \epsilon_i$

All residual variation is assumed to be concentrated in $Y$, and none in $X$ (vertical residuals). This model arises from the analysis of variance (ANOVA), whose purpose is to test whether the means of two or more groups ($k$ groups) differ significantly. The groups ($X$) are assumed to be known, with no error; the within-group residual variation (in $Y$) is assumed to be normally distributed; and all groups are assumed to have the same amount of variation.

Step from ANOVA to regression: assume that the groups can be represented by the (exact) value of a predictor (independent) variable, $X$, and ask whether there is a linear trend in the response ($Y$) among groups. This is appropriate in experimental studies in which the predictor variable is set by the investigator, with no error.

[Figure: frequency distributions of the response at each fixed value of the predictor, illustrating normally distributed, equal-variance within-group variation around a linear trend.]

Thus $X$ and $Y$ have different roles in the regression analysis. $X$ is the predictor or independent variable: its values are assigned independently of the model, with no error. $Y$ is the response or dependent variable: it is assumed to be dependent on the value of $X$, with some individual (residual) variation in response. Estimation:

$b_1 = \dfrac{s_{XY}}{s_X^2}, \qquad b_0 = \bar{Y} - b_1 \bar{X}$

[Figure: two example scatterplots with fitted predictive regression lines of $Y$ on $X$, each annotated with its correlation $r$ and estimated slope $b_1$.]
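A minimal, self-contained sketch of these estimators (the simulated data and seed are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)            # toy data: Y = 2 + 0.5*X + noise
X = np.linspace(0.0, 10.0, 50)
Y = 2.0 + 0.5 * X + rng.normal(0.0, 1.0, X.size)

s_xy = np.cov(X, Y, ddof=1)[0, 1]         # sample covariance s_XY
b1 = s_xy / np.var(X, ddof=1)             # slope: b1 = s_XY / s_X^2
b0 = Y.mean() - b1 * X.mean()             # intercept: b0 = Ybar - b1*Xbar
print(b0, b1)                             # close to the true 2.0 and 0.5
```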

(1b) Predictive regression of $X$ on $Y$: reverse the roles of $X$ and $Y$. The true population relationship is then

$X_i = \gamma_0 + \gamma_1 Y_i + \delta_i$

All residual variation is assumed to be concentrated in $X$, and none in $Y$ (horizontal residuals). Estimation:

$d_1 = \dfrac{s_{XY}}{s_Y^2}, \qquad d_0 = \bar{X} - d_1 \bar{Y}$
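A companion sketch for the reverse regression (function name assumed; any paired arrays such as the X, Y above will work):

```python
import numpy as np

def regress_x_on_y(X, Y):
    """Predictive regression of X on Y: d1 = s_XY / s_Y^2, d0 = Xbar - d1*Ybar."""
    d1 = np.cov(X, Y, ddof=1)[0, 1] / np.var(Y, ddof=1)
    d0 = X.mean() - d1 * Y.mean()
    return d0, d1
```

Expressed as change in $Y$ per unit $X$, this line has slope $1/d_1$, which exceeds $b_1$ whenever the data are positively but imperfectly correlated.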

Major-axis regression

(2) Major-axis regression: the two predictive-regression models above are extreme cases. Predictive regression of $Y$ on $X$ assumes that $\mathrm{var}(\delta_i) = 0$, i.e., that all residual variation is in $Y$; predictive regression of $X$ on $Y$ assumes that $\mathrm{var}(\epsilon_i) = 0$, i.e., that all residual variation is in $X$. Instead of assuming that all residual variation ("error") is in either $X$ or $Y$, assume that both variables have some residual variation. If we knew the amounts of residual variation, we could express their relative errors as a ratio:

$\lambda = \dfrac{\mathrm{var}(\epsilon_i)}{\mathrm{var}(\delta_i)}$

So, for any value of $\lambda = \mathrm{var}(\epsilon_i)/\mathrm{var}(\delta_i)$, we can estimate the slope of a unique regression line:

$m = \dfrac{s_Y^2 - \lambda s_X^2 + \sqrt{(s_Y^2 - \lambda s_X^2)^2 + 4\lambda s_{XY}^2}}{2 s_{XY}}, \qquad b = \bar{Y} - m \bar{X}$

This general case is called functional regression.
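A hedged sketch of this slope formula (the function name is an assumption; $\lambda = 1$ recovers the major-axis case discussed next):

```python
import numpy as np

def functional_slope(X, Y, lam=1.0):
    """Functional-regression line for error-variance ratio lam = var(eps)/var(delta)."""
    s_xx = np.var(X, ddof=1)
    s_yy = np.var(Y, ddof=1)
    s_xy = np.cov(X, Y, ddof=1)[0, 1]
    diff = s_yy - lam * s_xx
    m = (diff + np.sqrt(diff**2 + 4.0 * lam * s_xy**2)) / (2.0 * s_xy)
    b = Y.mean() - m * X.mean()     # the line passes through (Xbar, Ybar)
    return m, b
```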

There are an infinite number of functional-regression lines, corresponding to the possible values of $\lambda$. In general, we don't know how much residual variation is in $X$ versus $Y$. If we are treating $X$ and $Y$ equivalently, a reasonable assumption is that they have approximately equal amounts of residual variation:

$\mathrm{var}(\epsilon_i) \approx \mathrm{var}(\delta_i)$, i.e., $\lambda = 1$

The slope then reduces to:

$m = \dfrac{s_Y^2 - s_X^2 + \sqrt{(s_Y^2 - s_X^2)^2 + 4 s_{XY}^2}}{2 s_{XY}}$

This special case is major-axis regression: the regression line is equivalent to the major axis of the confidence ellipse for the data.

The major-axis regression line is always intermediate in slope between the predictive regression of $Y$ on $X$ (slope $b_1$) and that of $X$ on $Y$ (slope $1/d_1$; $d_1$ is converted to its reciprocal so that all three slopes describe change in $Y$ relative to change in $X$). The geometric mean of the two predictive-regression slopes, $\sqrt{b_1 \cdot (1/d_1)} = s_Y / s_X$, also lies between them; this geometric-mean slope defines the closely related reduced-major-axis regression.
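A self-contained numeric check of these orderings (toy data and seed assumed), using the moment formulas from the sketches above:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(0.0, 1.0, 200)
Y = 1.0 + 2.0 * X + rng.normal(0.0, 1.0, 200)

s_xy = np.cov(X, Y, ddof=1)[0, 1]
b1 = s_xy / np.var(X, ddof=1)              # slope of Y on X
inv_d1 = np.var(Y, ddof=1) / s_xy          # reciprocal slope of X on Y
gm = np.sqrt(b1 * inv_d1)                  # geometric mean = sY/sX
diff = np.var(Y, ddof=1) - np.var(X, ddof=1)
m_ma = (diff + np.sqrt(diff**2 + 4.0 * s_xy**2)) / (2.0 * s_xy)

print(b1, gm, m_ma, inv_d1)   # b1 smallest, inv_d1 largest; gm and m_ma between
```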

[Figures: "Predictive and major-axis regressions." Paired scatterplots comparing the predictive regression of $Y$ on $X$, the major-axis (MA) line, and the predictive regression of $X$ on $Y$ at different correlations $r$; the three lines nearly coincide when $r$ is high and diverge as $r$ decreases.]

Major-axis regression is the bivariate case of principal component analysis. It corresponds to the case in which the residuals are assumed to be orthogonal to the regression line, and it can be extended to more than two variables.
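A minimal sketch of that PCA connection (function name assumed): the major-axis slope is the direction of the leading eigenvector of the sample covariance matrix of $(X, Y)$, and should agree with functional_slope(X, Y, lam=1.0) above.

```python
import numpy as np

def major_axis_slope(X, Y):
    """Major-axis slope via PCA of the 2x2 sample covariance matrix."""
    cov = np.cov(X, Y, ddof=1)
    vals, vecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    v = vecs[:, -1]                    # eigenvector of the largest eigenvalue
    return v[1] / v[0]                 # change in Y per unit X along the major axis
```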