Objectives of Multiple Regression

Objectives of Multiple Regression

Establish the linear equation that best predicts values of a dependent variable $Y$ using more than one explanatory variable from a large set of potential predictors $\{x_1, x_2, \ldots, x_k\}$.

Find that subset of all possible predictor variables that explains a significant and appreciable proportion of the variance of $Y$, trading off adequacy of prediction against the cost of measuring more predictor variables.

Expanding Simple Linear Regression

Quadratic model: $y = \beta_0 + \beta_1 x + \beta_2 x^2 + \varepsilon$

General polynomial model: $y = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3 + \cdots + \beta_k x^k + \varepsilon$

These models add one or more polynomial terms. An independent variable $x$ which appears in the polynomial regression model as $x^k$ is called a $k$th-degree term.
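
As a minimal sketch (not from the slides) of how a polynomial term enters the fit, the Python/NumPy snippet below builds a design matrix whose columns are $1$, $x$ and $x^2$ and estimates the quadratic model by least squares; the data values are made up for illustration.

```python
import numpy as np

# Hypothetical data, for illustration only
x = np.array([1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0])
y = np.array([2.1, 3.4, 5.2, 7.9, 11.3, 15.0, 19.8])

# Design matrix with columns 1, x, x^2 (a 2nd-degree polynomial model)
X = np.column_stack([np.ones_like(x), x, x**2])

# Least squares estimates of (beta0, beta1, beta2)
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print("beta_hat:", np.round(beta_hat, 3))
```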

Polynomial model shapes (figure): a linear fit and a quadratic fit to the same data. Adding one more term to the model significantly improves the model fit.

Incorporating Additional Predictors

Simple additive multiple regression model: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \cdots + \beta_k x_k + \varepsilon$

Additive (Effect) Assumption: the expected change in $y$ per unit increment in $x_j$ is constant and does not depend on the value of any other predictor. This change is equal to $\beta_j$.

Additive regression models (figure): for two independent variables, the response is modeled as a surface.

Interpreting Parameter Values (Model Coefficients)

Intercept, $\beta_0$: the value of $y$ when all predictors are 0.

Partial slopes, $\beta_1, \beta_2, \beta_3, \ldots, \beta_k$: $\beta_j$ describes the expected change in $y$ per unit increment in $x_j$ when all other predictors in the model are held at a constant value.
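
To make the partial-slope interpretation concrete, here is a small illustrative sketch (hypothetical data, not from the slides) that fits a two-predictor additive model by least squares and prints the intercept and the two partial slopes.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
x1 = rng.uniform(0, 10, n)
x2 = rng.uniform(0, 5, n)
# Hypothetical "true" additive model used to generate the data
y = 2.0 + 1.5 * x1 - 0.8 * x2 + rng.normal(0, 1, n)

X = np.column_stack([np.ones(n), x1, x2])          # columns: 1, x1, x2
b0, b1, b2 = np.linalg.lstsq(X, y, rcond=None)[0]
print(f"intercept      b0 = {b0:.2f}")
print(f"partial slope  b1 = {b1:.2f}  (change in y per unit x1, x2 held fixed)")
print(f"partial slope  b2 = {b2:.2f}  (change in y per unit x2, x1 held fixed)")
```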

Graphical depiction of $\beta_1$ and $\beta_2$ (figure): $\beta_1$ is the slope of the fitted surface in the direction of $x_1$; $\beta_2$ is the slope in the direction of $x_2$.

Multiple Regression with Interaction Terms

$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \cdots + \beta_k x_k + \beta_{12} x_1 x_2 + \beta_{13} x_1 x_3 + \cdots + \beta_{k-1,k}\, x_{k-1} x_k + \varepsilon$

The cross-product terms quantify the interaction among predictors.

Interactive (Effect) Assumption: the effect of one predictor, $x_j$, on the response, $y$, will depend on the value of one or more of the other predictors.

Interpreting Interaction

Interaction model: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \varepsilon$, or, defining $x_3 = x_1 x_2$, $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \varepsilon$. No difference.

$\beta_1$ is no longer the expected change in $Y$ per unit increment in $X_1$, so there is no easy interpretation. The effect on $y$ of a unit increment in $X_1$ now depends on $X_2$ (it equals $\beta_1 + \beta_3 x_2$).
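
The following sketch (hypothetical data, not from the slides) fits such an interaction model by adding the cross-product column $x_1 x_2$ to the design matrix, then shows how the estimated effect of a unit increase in $x_1$ changes with $x_2$.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 80
x1 = rng.uniform(0, 10, n)
x2 = rng.uniform(0, 5, n)
# Hypothetical "true" interaction model used to generate the data
y = 1.0 + 2.0 * x1 + 0.5 * x2 - 0.3 * x1 * x2 + rng.normal(0, 1, n)

X = np.column_stack([np.ones(n), x1, x2, x1 * x2])  # last column: interaction x1*x2
b0, b1, b2, b3 = np.linalg.lstsq(X, y, rcond=None)[0]

# Effect of a unit increase in x1 is b1 + b3*x2, so it depends on x2
for x2_val in (0.0, 2.5, 5.0):
    print(f"at x2 = {x2_val}: effect of +1 in x1 = {b1 + b3 * x2_val:.2f}")
```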

No interaction versus interaction (figure): without an interaction term the fitted lines of $y$ against $x_1$ at different values of $x_2$ stay a constant distance apart; with interaction they do not.

Multiple regression models with interaction (figure): the lines move apart, or the lines come together.

Effect of the Interaction Term (figure): the multiple regression surface is twisted.

A Protocol for Multiple Regression

1. Identify all possible predictors.
2. Establish a method for estimating model parameters and their standard errors.
3. Develop tests to determine if a parameter is equal to zero (i.e. no evidence of association).
4. Reduce the number of predictors appropriately.
5. Develop predictions and associated standard errors.

Estimating Model Parameters: Least Squares Estimation

Assume a random sample of $n$ observations $(y_i, x_{i1}, x_{i2}, \ldots, x_{ik})$, $i = 1, 2, \ldots, n$. The estimates of the parameters for the best predicting equation

$\hat{y} = \hat\beta_0 + \hat\beta_1 x_1 + \cdots + \hat\beta_k x_k$

are found by choosing the values $\hat\beta_0, \hat\beta_1, \ldots, \hat\beta_k$ which minimize the expression

$SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} \bigl(y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \cdots - \hat\beta_k x_{ik}\bigr)^2$

Normal Equations

Take the partial derivatives of the SSE function with respect to $\hat\beta_0, \hat\beta_1, \ldots, \hat\beta_k$ and equate each to 0. Solve this system of $k+1$ equations in $k+1$ unknowns to obtain the parameter estimates:

$\sum_i y_i = n\hat\beta_0 + \hat\beta_1 \sum_i x_{i1} + \cdots + \hat\beta_k \sum_i x_{ik}$

$\sum_i x_{i1} y_i = \hat\beta_0 \sum_i x_{i1} + \hat\beta_1 \sum_i x_{i1}^2 + \cdots + \hat\beta_k \sum_i x_{i1} x_{ik}$

$\vdots$

$\sum_i x_{ik} y_i = \hat\beta_0 \sum_i x_{ik} + \hat\beta_1 \sum_i x_{ik} x_{i1} + \cdots + \hat\beta_k \sum_i x_{ik}^2$
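
In matrix form the normal equations are $(X'X)\hat\beta = X'y$; the short sketch below (hypothetical data, not from the slides) solves them directly. In practice a numerically safer routine such as np.linalg.lstsq is preferred, but this mirrors the derivation above.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 60, 3
# Hypothetical design matrix: a column of ones followed by k predictors
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
beta_true = np.array([1.0, 0.5, -2.0, 3.0])
y = X @ beta_true + rng.normal(0, 1, n)

# Normal equations: (X'X) beta_hat = X'y, a (k+1) x (k+1) linear system
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print("beta_hat:", np.round(beta_hat, 3))
```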

An Overall Measure of How Well the Full Model Performs

Coefficient of Multiple Determination, denoted $R^2$: defined as the proportion of the variability in the dependent variable $y$ that is accounted for by the independent variables $x_1, x_2, \ldots, x_k$ through the regression model. With only one independent variable ($k = 1$), $R^2 = r^2$, the square of the simple correlation coefficient.

Computing the Coefficient of Determination, $R^2$

$R^2 = \dfrac{SSR}{S_{yy}} = 1 - \dfrac{SSE}{S_{yy}}$

where $TSS = S_{yy} = \sum_{i=1}^{n} (y_i - \bar{y})^2$ and $SSE = \sum_{i=1}^{n} \bigl(y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \cdots - \hat\beta_k x_{ik}\bigr)^2$.
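
A minimal sketch of this computation, assuming a design matrix X that already includes the column of ones (the helper name r_squared is mine, not from the slides):

```python
import numpy as np

def r_squared(X, y):
    """R^2 = 1 - SSE/TSS for a least squares fit of y on the columns of X."""
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    y_hat = X @ beta_hat
    sse = np.sum((y - y_hat) ** 2)        # residual sum of squares
    tss = np.sum((y - y.mean()) ** 2)     # total sum of squares, S_yy
    return 1.0 - sse / tss
```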

Multicollinearity

A further assumption in multiple regression (absent in SLR) is that the predictors $(x_1, x_2, \ldots, x_k)$ are statistically uncorrelated; that is, the predictors do not co-vary. When the predictors are significantly correlated (correlation greater than about 0.6), the multiple regression model is said to suffer from problems of multicollinearity.

(Figure: scatterplots of predictor pairs at increasing correlations, e.g. $r = 0.6$ and $r = 0.8$.)

Effect of Multicollinearity on the Fitted Surface (figure): extreme collinearity.

Multicollinearity leads to:
- Numerical instability in the estimates of the regression parameters: wild fluctuations in these estimates if a few observations are added or removed.
- No longer having simple interpretations for the regression coefficients in the additive model.

Ways to detect multicollinearity:
- Scatterplots of the predictor variables.
- The correlation matrix for the predictor variables; the higher these correlations, the worse the problem.
- Variance Inflation Factors (VIFs) reported by software packages; values larger than 10 usually signal a substantial amount of collinearity.

What can be done about multicollinearity:
- Regression estimates are still OK, but the resulting confidence/prediction intervals are very wide.
- Choose explanatory variables wisely! (E.g. consider omitting one of two highly correlated variables.)
- More advanced solutions: principal components analysis; ridge regression.
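
As an illustrative sketch (the helper name vif is mine), the variance inflation factors can be computed straight from the definition $\mathrm{VIF}_j = 1/(1 - R_j^2)$, where $R_j^2$ comes from regressing $x_j$ on the other predictors; here X holds only the predictor columns, without the intercept.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (predictors only, no intercept)."""
    n, k = X.shape
    out = []
    for j in range(k):
        xj = X[:, j]
        # Regress x_j on an intercept plus the remaining predictors
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, xj, rcond=None)
        resid = xj - others @ beta
        r2_j = 1.0 - np.sum(resid ** 2) / np.sum((xj - xj.mean()) ** 2)
        out.append(1.0 / (1.0 - r2_j))
    return np.array(out)
```

Values above roughly 10 would then flag the collinearity problem described above.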

Testing in Multiple Regression

- Testing individual parameters in the model.
- Computing predicted values and associated standard errors.

Model: $Y = \beta_0 + \beta_1 X_1 + \cdots + \beta_k X_k + \varepsilon$, with $\varepsilon \sim N(0, \sigma^2)$.

Overall ANOVA F-test. $H_0$: none of the explanatory variables is a significant predictor of $Y$ (equivalently, $\beta_1 = \beta_2 = \cdots = \beta_k = 0$).

$F = \dfrac{SSR/k}{SSE/(n-k-1)} = \dfrac{MSR}{MSE}$

Reject $H_0$ if $F > F_{k,\, n-k-1,\, \alpha}$.
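
A sketch of the overall F test under these assumptions (the function name overall_f_test is mine; scipy is used only for the F distribution):

```python
import numpy as np
from scipy import stats

def overall_f_test(X, y):
    """Overall ANOVA F test; X must include the leading column of ones."""
    n, p = X.shape                      # p = k + 1
    k = p - 1
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    y_hat = X @ beta_hat
    sse = np.sum((y - y_hat) ** 2)
    ssr = np.sum((y_hat - y.mean()) ** 2)
    f_stat = (ssr / k) / (sse / (n - k - 1))
    p_value = stats.f.sf(f_stat, k, n - k - 1)   # P(F_{k, n-k-1} > f_stat)
    return f_stat, p_value
```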

Standard Error for a Partial Slope Estimate

The estimated standard error for $\hat\beta_j$ is

$s_{\hat\beta_j} = \hat\sigma_\varepsilon \sqrt{\dfrac{1}{S_{x_j x_j}\,(1 - R_{x_j \mid x_1 \cdots x_k}^2)}}$

where $\hat\sigma_\varepsilon = \sqrt{\dfrac{SSE}{n - (k+1)}}$, $S_{x_j x_j} = \sum_i (x_{ij} - \bar{x}_j)^2$, and $R_{x_j \mid x_1 \cdots x_k}^2$ is the coefficient of determination for the model with $x_j$ as the dependent variable and all other $x$ variables as predictors.

What happens if all the predictors are truly independent of each other? Then $R_{x_j \mid x_1 \cdots x_k}^2 = 0$ and $s_{\hat\beta_j} = \hat\sigma_\varepsilon / \sqrt{S_{x_j x_j}}$. If there is high dependency, $R_{x_j \mid x_1 \cdots x_k}^2$ approaches 1 and $s_{\hat\beta_j}$ becomes very large.
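
Equivalently, the standard errors of all coefficients come from the diagonal of $\hat\sigma_\varepsilon^2 (X'X)^{-1}$; the sketch below uses that matrix form (the helper name coef_standard_errors is mine, and X is assumed to include the column of ones).

```python
import numpy as np

def coef_standard_errors(X, y):
    """Least squares estimates and their standard errors; X includes the 1s column."""
    n, p = X.shape
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta_hat
    sigma2_hat = resid @ resid / (n - p)           # SSE / (n - (k+1))
    cov_beta = sigma2_hat * np.linalg.inv(X.T @ X)
    return beta_hat, np.sqrt(np.diag(cov_beta))
```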

Confidence Interval

A $(1-\alpha)100\%$ confidence interval for $\beta_j$:

$\hat\beta_j \pm t_{n-(k+1),\,\alpha/2}\; s_{\hat\beta_j}$

The degrees of freedom, $n - (k+1)$, reflect the number of data points minus the number of parameters that have to be estimated; this is the df for SSE.

Testing whether a partial slope coefficient is equal to zero

$H_0: \beta_j = 0$ versus one of the alternatives $H_a: \beta_j \ne 0$, $\beta_j > 0$, or $\beta_j < 0$.

Test statistic: $t = \dfrac{\hat\beta_j}{s_{\hat\beta_j}}$

Rejection region: $|t| > t_{n-(k+1),\,\alpha/2}$ for the two-sided alternative; $t > t_{n-(k+1),\,\alpha}$ or $t < -t_{n-(k+1),\,\alpha}$ for the one-sided alternatives.
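
A sketch combining the two previous slides: t statistics, two-sided p-values and $(1-\alpha)$ confidence intervals for every coefficient (the function name coef_inference is mine; X includes the column of ones).

```python
import numpy as np
from scipy import stats

def coef_inference(X, y, alpha=0.05):
    n, p = X.shape                                     # p = k + 1
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta_hat
    sigma2_hat = resid @ resid / (n - p)               # SSE / (n - (k+1))
    se = np.sqrt(sigma2_hat * np.diag(np.linalg.inv(X.T @ X)))
    t_stats = beta_hat / se                            # test statistic for H0: beta_j = 0
    p_values = 2 * stats.t.sf(np.abs(t_stats), df=n - p)
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - p)
    ci = np.column_stack([beta_hat - t_crit * se, beta_hat + t_crit * se])
    return t_stats, p_values, ci
```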

Predicting Y

We use the least squares fitted value, $\hat{y}$, as our predictor of a single value of $y$ at a particular value of the explanatory variables $(x_1, x_2, \ldots, x_k)$. The corresponding interval about the predicted value of $y$ is called a prediction interval.

The least squares fitted value also provides the best predictor of $E(y)$, the mean value of $y$, at a particular value of $(x_1, x_2, \ldots, x_k)$. The corresponding interval for the mean prediction is called a confidence interval.

Formulas for these intervals are much more complicated than in the case of SLR; they cannot be calculated by hand (see the book).
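
For completeness, here is a sketch of the standard matrix formulas for both intervals at a new point x0 (written like a row of the design matrix, leading 1 included): the half-widths use $\hat\sigma^2\, x_0'(X'X)^{-1}x_0$ for the mean response and $\hat\sigma^2\,(1 + x_0'(X'X)^{-1}x_0)$ for a single new observation. The function name intervals_at is mine, not from the slides.

```python
import numpy as np
from scipy import stats

def intervals_at(X, y, x0, alpha=0.05):
    """Confidence interval for E(y|x0) and prediction interval for a new y at x0."""
    n, p = X.shape
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta_hat
    sigma2_hat = resid @ resid / (n - p)
    h0 = x0 @ np.linalg.inv(X.T @ X) @ x0        # x0'(X'X)^{-1} x0
    y0_hat = x0 @ beta_hat
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - p)
    half_ci = t_crit * np.sqrt(sigma2_hat * h0)          # mean response
    half_pi = t_crit * np.sqrt(sigma2_hat * (1.0 + h0))  # single new observation
    return (y0_hat - half_ci, y0_hat + half_ci), (y0_hat - half_pi, y0_hat + half_pi)
```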

Minimum $R^2$ for a Significant Regression

Since we have formulas for both $R^2$ and $F$ in terms of $n$, $k$, SSE and TSS, we can relate these two quantities. We can then ask: what is the minimum $R^2$ which will ensure the regression model is declared significant, as measured by the appropriate quantile of the F distribution? The answer, below, shows that this depends on $n$, $k$, and SSE/TSS:

$R^2_{\min} = F_{k,\, n-k-1,\, \alpha} \cdot \dfrac{k}{n-k-1} \cdot \dfrac{SSE}{TSS}$
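
Since $SSE/TSS = 1 - R^2$, solving this condition for $R^2$ gives the closed form $R^2_{\min} = k F_{k,n-k-1,\alpha} / (k F_{k,n-k-1,\alpha} + n - k - 1)$, which the sketch below evaluates (the function name min_r_squared is mine).

```python
from scipy import stats

def min_r_squared(n, k, alpha=0.05):
    """Smallest R^2 giving a significant overall F test with n observations, k predictors."""
    f_crit = stats.f.ppf(1 - alpha, k, n - k - 1)
    return k * f_crit / (k * f_crit + n - k - 1)

# Example: n = 20 observations, k = 3 predictors, alpha = 0.05
print(round(min_r_squared(20, 3), 3))
```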

Minimum $R^2$ for Simple Linear Regression, $k = 1$ (figure).