Simple Linear Regression. How To Study the Relation Between Two Quantitative Variables? Scatter Plot. Pearson's Sample Correlation.

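Correlation & Regression

How To Study the Relation Between Two Quantitative Variables?

Simple Linear Regression

A Simple Regression Problem
Is there a relation between the number of power boats in the area and the number of manatees killed?

Year   NPB (x)   Nkill (y)
77     447       13
78     460       21
79     481       24
80     498       16
81     513       24
82     512       20
83     526       15
84     559       34
85     585       33
86     614       33
87     645       39
88     675       43
89     711       50
90     719       47

[Figure: scatter plot of manatees killed (y) against number of power boats (x).]

Correlation

Pearson's Sample Correlation
The correlation, $\rho$, between two random variables, $X$ and $Y$, is defined as

$$\rho = \text{average}\left[\left(\frac{X-\mu_X}{\sigma_X}\right)\left(\frac{Y-\mu_Y}{\sigma_Y}\right)\right],$$

the average product of the standardized deviates of $X$ and $Y$. It quantifies the strength of the linear relationship.

The sample correlation is

$$r = \frac{1}{n-1}\sum_{i=1}^{n}\left(\frac{x_i-\bar{x}}{s_x}\right)\left(\frac{y_i-\bar{y}}{s_y}\right),$$

where $s_x$ is the sample standard deviation of $x$ and $s_y$ is the sample standard deviation of $y$.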

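Standardized values for the manatee data, with $z_x = (x-\bar{x})/s_x$ and $z_y = (y-\bar{y})/s_y$:

Year   x     y    z_x     z_y     z_x z_y
77     447   13   -1.31   -1.35    1.77
78     460   21   -1.17   -0.69    0.81
79     481   24   -0.94   -0.45    0.42
80     498   16   -0.76   -1.10    0.83
81     513   24   -0.59   -0.45    0.26
82     512   20   -0.60   -0.77    0.47
83     526   15   -0.45   -1.18    0.53
84     559   34   -0.09    0.38   -0.03
85     585   33    0.19    0.29    0.06
86     614   33    0.51    0.29    0.15
87     645   39    0.84    0.79    0.66
88     675   43    1.17    1.11    1.30
89     711   50    1.56    1.69    2.64
90     719   47    1.65    1.44    2.38
Total                             12.24

Mean: $\bar{x} = 567.5$, $\bar{y} = 29.43$. Standard deviation: $s_x = 91.9$, $s_y = 12.19$. Dividing the total by $n - 1 = 13$ gives $r = 0.941477289$.

[Figure: scatter plot with reference lines at $\bar{x} = 567.5$ and $\bar{y} = 29.43$ and the point (447, 13) marked; the axes are labeled "Number of Handguns Registered" and "Number of People Killed" in the source.]

One of the pairs, (447, 13):

$$z_x = \frac{447 - 567.5}{91.9} = -1.31, \qquad z_y = \frac{13 - 29.43}{12.19} = -1.35.$$

Shortcut Formula

$$S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n}, \qquad S_{yy} = \sum y^2 - \frac{(\sum y)^2}{n}, \qquad S_{xy} = \sum xy - \frac{(\sum x)(\sum y)}{n}, \qquad r = \frac{S_{xy}}{\sqrt{S_{xx}\,S_{yy}}}$$

Pearson's Sample Correlation in a Different Formula

$$r = \frac{\sum (x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum (x_i-\bar{x})^2 \,\sum (y_i-\bar{y})^2}} = \frac{S_{xy}}{\sqrt{S_{xx}\,S_{yy}}}$$

Correlation Coefficient

$\sum x = 7945$, $(\sum x)^2 = 63123025$, $\sum y = 412$, $(\sum y)^2 = 169744$, $\sum x^2 = 4618597$, $\sum y^2 = 14056$, $\sum xy = 247521$
$S_{xx} = 109809.5$, $S_{yy} = 1931.42857$, $S_{xy} = 13711$
$r = 0.941477289$, $r^2 = 0.886379485$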
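As a cross-check on the two formulas for r, here is a minimal sketch in plain Python (not the R Commander workflow used in the slides). It assumes the fourteen (x, y) pairs below are the manatee data; they reproduce the sums quoted in the slides (Σx = 7945, Σy = 412). The function names are illustrative only.

```python
import math

# Assumed manatee data: x = power boats registered, y = manatees killed.
boats = [447, 460, 481, 498, 513, 512, 526, 559, 585, 614, 645, 675, 711, 719]
kills = [13, 21, 24, 16, 24, 20, 15, 34, 33, 33, 39, 43, 50, 47]

def pearson_r_zscores(x, y):
    """r as the sum of products of standardized values, divided by n - 1."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sx = math.sqrt(sum((a - xbar) ** 2 for a in x) / (n - 1))
    sy = math.sqrt(sum((b - ybar) ** 2 for b in y) / (n - 1))
    return sum(((a - xbar) / sx) * ((b - ybar) / sy) for a, b in zip(x, y)) / (n - 1)

def pearson_r_shortcut(x, y):
    """r = Sxy / sqrt(Sxx * Syy), using the shortcut sums."""
    n = len(x)
    sxx = sum(a * a for a in x) - sum(x) ** 2 / n
    syy = sum(b * b for b in y) - sum(y) ** 2 / n
    sxy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
    return sxy / math.sqrt(sxx * syy)

r_z = pearson_r_zscores(boats, kills)
r_s = pearson_r_shortcut(boats, kills)
print(f"r (z-scores) = {r_z:.6f}, r (shortcut) = {r_s:.6f}")
```

Both routes should agree up to floating-point rounding, since they are algebraically the same quantity.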

Interpretation of r

$-1 \le r \le 1$
It measures the strength and direction of the linear relation between two quantitative variables.
$r = \pm 1$ if all points lie exactly on a straight line.
$\rho$ is the notation for the population correlation coefficient.

Correlation Coefficient
[Figure: four scatter plots, labeled "r close to 1", "r close to -1", "r close to zero", and "r close to zero".]

Correlation Does Not Imply Causation
Example: The number of powerboats registered may not be the direct cause of the deaths of manatees.

How to Model a Linear Relation?
[Figure: graph with a fitted line, manatees killed against number of power boats; the line is labeled y = ? + ? x.]

Least Squares Principle
Find the values of $\alpha$ and $\beta$ of a straight line that minimize the following variability measure:

$$\sum_{i=1}^{n}\left[y_i - (\hat{\alpha} + \hat{\beta}x_i)\right]^2, \qquad \hat{y} = \hat{\alpha} + \hat{\beta}x.$$

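Minimize

$$q = \sum e_i^2 = \sum \left[y_i - (\alpha + \beta x_i)\right]^2$$

$$\frac{\partial q}{\partial \alpha} = -2\sum \left[y_i - (\alpha + \beta x_i)\right] = 0$$

$$\frac{\partial q}{\partial \beta} = -2\sum x_i\left[y_i - (\alpha + \beta x_i)\right] = 0$$

$\alpha = ?$  $\beta = ?$

The Equation of the Fitted Line: y = ? + ? x
The least-squares estimates of $\alpha$ and $\beta$ are denoted $\hat{\alpha}$ and $\hat{\beta}$, and they are

$$\hat{\beta} = \frac{S_{xy}}{S_{xx}}, \qquad \hat{\alpha} = \bar{y} - \hat{\beta}\bar{x}.$$

Other formula:

$$\hat{\beta} = r\,\frac{s_y}{s_x}, \qquad \hat{\alpha} = \bar{y} - r\,\frac{s_y}{s_x}\,\bar{x},$$

where $s_x$ is the sample standard deviation of $x$ and $s_y$ is the sample standard deviation of $y$.

The Equation of a Fitted Line: $\hat{y} = \hat{\alpha} + \hat{\beta}x$
It can be used for estimation or prediction. It gives the estimate of the location of the mean response (the mean of y at x) for various x.

Manatee Example

$$\hat{\beta} = \frac{S_{xy}}{S_{xx}} = \frac{13711}{109809.5} = 0.12486, \qquad \hat{\alpha} = \frac{412}{14} - 0.12486 \times \frac{7945}{14} \approx -41.4304$$

The regression (prediction) equation:

$$\hat{y} = \hat{\alpha} + \hat{\beta}x = -41.4304 + 0.12486\,x$$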
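The least-squares estimates can be checked numerically. This is a minimal sketch in plain Python, assuming the same fourteen manatee pairs (Σx = 7945, Σy = 412); `least_squares_line` is a hypothetical helper name, not something from the slides. It also evaluates the two normal equations at the fitted values, which should both come out at essentially zero.

```python
# Assumed manatee data: x = power boats registered, y = manatees killed.
boats = [447, 460, 481, 498, 513, 512, 526, 559, 585, 614, 645, 675, 711, 719]
kills = [13, 21, 24, 16, 24, 20, 15, 34, 33, 33, 39, 43, 50, 47]

def least_squares_line(x, y):
    """Return (intercept, slope) using slope = Sxy / Sxx, intercept = ybar - slope * xbar."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((a - xbar) ** 2 for a in x)
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

alpha_hat, beta_hat = least_squares_line(boats, kills)

# Normal equations: the residuals sum to zero and are orthogonal to x.
residuals = [b - (alpha_hat + beta_hat * a) for a, b in zip(boats, kills)]
normal_eq_alpha = sum(residuals)
normal_eq_beta = sum(a * e for a, e in zip(boats, residuals))

print(f"intercept = {alpha_hat:.4f}, slope = {beta_hat:.5f}")
print(f"normal equations: {normal_eq_alpha:.2e}, {normal_eq_beta:.2e}")
```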

Correlato & Regreo Data R Commader R for Smple Lear Regreo 5 6 R Output Call: lm(formula MANKILL ~ NPOWERBT, data Dataet) Redual: M Q Meda 3Q Ma -9.468 -.066 0.07.3369 5.6375 Coeffcet: Etmate Std. Error t value Pr(> t ) (Itercept) -4.44 7.4-5.589 0.0008 *** NPOWERBT 0.49 0.09 9.675 5.e-07 *** --- Sgf. code: 0 '***' 0.00 '**' 0.0 '*' 0.05 '.' 0. ' ' y Redual tadard error: 4.76 o degree of freedom Multple R-Squared: 0.8864, Adjuted R-quared: 0.8769 Coeffcet of determato R 7 F-tattc: 93.6 o ad DF, p-value: 5.9e-07 8 Equato of the regreo le: ˆ α + ˆ β ; 4.44 +.49 A Etmato If at a certa year the umber of power boat regtered 0,000, etmate how may maatee o average would be klled. 4. 4439 +. 486 4. 4439 +. 486 0 45. 973 The average repoe at 0 45.973. 9 5

Graph with a Fitted Line
[Figure: manatees killed against number of power boats, with the fitted line.]

How long should you wait till the next eruption?

Duration and Inter-eruption Time
[Figures: scatter plots of inter-eruption time against duration of eruption; the duration axis runs from 1.5 to 5.5.]

Caution
Avoid unsure extrapolation.
Causality?

Problem of Extrapolation
[Figures: a fitted line over the scope of the data; an extrapolated result for a value outside the scope of x; a possible trend beyond the scope of the data; the estimate of y at x.]

Regression and Causality
Regression itself provides no information about causal patterns and must be supplemented by additional analysis (with designed and controlled experiments) to obtain insights about causal relationships.

Example: y = female life expectancy, x = GDP (gross domestic product)
[Figure: female life expectancy against GDP per capita, before transformation.]

Example: y = female life expectancy, x = GDP (gross domestic product)
[Figure: female life expectancy against the natural log of GDP, after the ln(GDP) transformation.]

$$\hat{y} = \hat{\alpha} + \hat{\beta}\ln(x)$$

Transformation
Circle of Powers: $x^p$ or $y^p$
[Figure: circle divided into four quadrants. Quadrant I: x up, y up. Quadrant II: x down, y up. Quadrant III: x down, y down. Quadrant IV: x up, y down.]

For x up or y up: try $p > 1$ for $x^p$ or $y^p$. Example: $x^2$, $y^2$, $x^3$, $y^3$, or $e^x$, $e^y$.
For x down or y down: try $p < 1$ for $x^p$ or $y^p$. Example: $x^{-1/2}$, $y^{-1/2}$, $x^{-1}$, $y^{-1}$, or $\ln(x)$, $\ln(y)$.

Simple Linear Regression

8.9 Tests Concerning Regression and Correlation

t-test for correlation
Hypothesis: $H_0: \rho = \rho_0$ vs. $H_a: \rho \ne \rho_0$
Test statistic (if the data are bivariate normal):

$$t = \frac{r - \rho_0}{\sqrt{(1 - r^2)/(n - 2)}} \sim t\text{-distribution with d.f.} = n - 2$$

(This statistic follows the t-distribution when $\rho_0 = 0$, which is the case tested in the example that follows.)

Decision rule: Reject $H_0$ if
C.V. approach: $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$
p-value approach: p-value $< \alpha$

Is there a significant correlation?

Example (Manatee): $H_0: \rho = 0$ vs. $H_a: \rho \ne 0$, with $r = 0.941$

$$t = \frac{0.941 - 0}{\sqrt{(1 - 0.886)/(14 - 2)}} = 9.65$$

d.f. = 14 - 2 = 12, p-value < 0.0005, reject $H_0$; there is a significant linear relation.

R for Correlation
[Screenshot: R Commander menu for the correlation test.]

R Output with t-test for Zero Correlation

    Pearson's product-moment correlation

    data:  Dataset$MANKILL and Dataset$NPOWERBT
    t = 9.6755, df = 12, p-value = 5.109e-07
    alternative hypothesis: true correlation is not equal to 0
    95 percent confidence interval:
     0.8210214 0.9816797
    sample estimates:
          cor
    0.9414773

Example: In an investigation, countries were included to study the relation between female life expectancy and the birth rate.
[Figure: female life expectancy against birth rate; r = -0.87.]

First Order Simple Linear Regression Model
Model assumption:

$$y = \alpha + \beta x + \varepsilon,$$

with errors $\varepsilon$ independent, identically and normally distributed as $N(0, \sigma^2)$, and mean of y at x

$$\mu_{y|x} = \alpha + \beta x.$$
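The t statistic for testing zero correlation can be reproduced as follows. This is a minimal sketch in plain Python that takes r = 0.9414773 and n = 14 from the R output as given; the critical value 2.179 is the usual two-sided 5% t-table value for 12 degrees of freedom and is hard-coded here as an assumption, because the Python standard library has no t distribution.

```python
import math

def correlation_t_statistic(r, n):
    """t = (r - 0) / sqrt((1 - r^2) / (n - 2)) for H0: rho = 0."""
    return (r - 0.0) / math.sqrt((1.0 - r * r) / (n - 2))

r = 0.9414773   # sample correlation for the manatee data
n = 14          # number of (x, y) pairs

t_stat = correlation_t_statistic(r, n)

# Assumed table value: upper 2.5% point of the t distribution with 12 d.f.
T_CRIT_12_DF = 2.179
reject_h0 = abs(t_stat) > T_CRIT_12_DF

print(f"t = {t_stat:.4f} on {n - 2} d.f., reject H0: {reject_h0}")
```

Using the unrounded r gives about 9.6755, as in the R output; the hand calculation with r rounded to 0.941 gives 9.65.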

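Model Assumption
[Figure: normal error distributions centered on the regression line at several values of x.]

Residuals

$$e_i = y_i - \hat{y}_i = y_i - (\hat{\alpha} + \hat{\beta}x_i)$$

Example: Find the residual at x = 460, where the observed y = 21.
Predicted $\hat{y} = -41.4304 + 0.12486 \times 460 = 16.01$.
The residual = 21 - 16.01 = 4.99.

Residual Sum of Squares (SSResid, or Error Sum of Squares, SSE)

$$SSE = \sum (y_i - \hat{y}_i)^2$$

Mean Square Error and Standard Deviation for Regression
Estimation of $\sigma^2$:

$$s_{y|x}^2 = MSE = \frac{SSE}{n - 2} = 18.287 \quad (\text{degrees of freedom} = n - 2)$$

Estimated standard error of the regression model: $s_{y|x} = 4.28$

Inference for Regression Coefficient $\beta$ (t-test)
Hypothesis: $H_0: \beta = \beta_0$ vs. $H_a: \beta \ne \beta_0$
(It is often a test of $H_0: \beta = 0$ vs. $H_a: \beta \ne 0$.)
Test statistic:

$$t = \frac{\hat{\beta} - \beta_0}{\widehat{se}(\hat{\beta})} \sim t\text{-distribution with d.f.} = n - 2, \qquad \text{where } \widehat{se}(\hat{\beta}) = \frac{s_{y|x}}{\sqrt{\sum (x_i - \bar{x})^2}} = 0.013$$

Decision rule: Reject $H_0$ if
C.V. approach: $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$
p-value approach: p-value $< \alpha$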
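The residual at x = 460, the mean square error, and the slope's standard error can be recomputed in a few lines. This is a minimal sketch in plain Python, assuming the same fourteen manatee pairs; variable names are illustrative.

```python
import math

# Assumed manatee data: x = power boats registered, y = manatees killed.
boats = [447, 460, 481, 498, 513, 512, 526, 559, 585, 614, 645, 675, 711, 719]
kills = [13, 21, 24, 16, 24, 20, 15, 34, 33, 33, 39, 43, 50, 47]

n = len(boats)
xbar, ybar = sum(boats) / n, sum(kills) / n
sxx = sum((x - xbar) ** 2 for x in boats)
sxy = sum((x - xbar) * (y - ybar) for x, y in zip(boats, kills))
beta_hat = sxy / sxx
alpha_hat = ybar - beta_hat * xbar

# Residuals e_i = y_i - (alpha_hat + beta_hat * x_i).
residuals = [y - (alpha_hat + beta_hat * x) for x, y in zip(boats, kills)]
residual_at_460 = residuals[boats.index(460)]

sse = sum(e * e for e in residuals)   # error sum of squares
mse = sse / (n - 2)                   # estimate of sigma^2
s = math.sqrt(mse)                    # residual standard error

se_beta = s / math.sqrt(sxx)          # estimated standard error of the slope
t_beta = (beta_hat - 0.0) / se_beta   # t statistic for H0: beta = 0

print(f"residual at x = 460: {residual_at_460:.2f}")
print(f"MSE = {mse:.3f}, s = {s:.3f}, se(slope) = {se_beta:.4f}, t = {t_beta:.3f}")
```

In simple linear regression the slope's t statistic equals the t statistic for testing zero correlation, which is why the same 9.675 appears in both R outputs.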

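Inference for Regression Coefficient $\alpha$ (t-test)
Hypothesis: $H_0: \alpha = \alpha_0$ vs. $H_a: \alpha \ne \alpha_0$
(It is often a test of $H_0: \alpha = 0$ vs. $H_a: \alpha \ne 0$.)
Test statistic:

$$t = \frac{\hat{\alpha} - \alpha_0}{\widehat{se}(\hat{\alpha})} \sim t\text{-distribution with d.f.} = n - 2, \qquad \text{where } \widehat{se}(\hat{\alpha}) = s_{y|x}\sqrt{\frac{1}{n} + \frac{\bar{x}^2}{\sum (x_i - \bar{x})^2}} = 7.41$$

Decision rule: Reject $H_0$ if
C.V. approach: $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$
p-value approach: p-value $< \alpha$

Predicting Mean Response
The $(1-\alpha)100\%$ confidence interval for predicting the mean response at $x = x^*$:

$$\hat{y} \pm t_{\alpha/2}\,\widehat{se}(\hat{y}), \qquad \text{where } \widehat{se}(\hat{y}) = s_{y|x}\sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{\sum (x_i - \bar{x})^2}}, \quad \text{d.f.} = n - 2$$

Predicted number of manatees killed on average at x = 460: 16.01 ± 3.92, that is (12.09, 19.92).

Predicting a Single New Response
The $(1-\alpha)100\%$ confidence interval for predicting an individual outcome at $x = x^*$:

$$\hat{y} \pm t_{\alpha/2}\,\widehat{se}(\tilde{y}), \qquad \text{where } \widehat{se}(\tilde{y}) = s_{y|x}\sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{\sum (x_i - \bar{x})^2}}, \quad \text{d.f.} = n - 2$$

Predicted number of manatees killed at x = 460: 16.01 ± 10.11, that is about (5.9, 26.1).

Confidence Interval Bands
[Figure: number of manatees killed against number of powerboats, with the fitted line and interval bands.]

Evaluation of the Model
Coefficient of Determination ($R^2$): It is the proportion of the variation observed in y that can be explained by the variable x with the linear regression model.
[Figure: inter-eruption time against duration of eruption.]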
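The two intervals at x = 460 and the coefficient of determination can be recomputed with the sketch below, written in plain Python under the same assumption about the fourteen manatee pairs. The multiplier 2.179 is the usual t-table value for a 95% interval with 12 degrees of freedom, hard-coded as an assumption; `interval_at` is a hypothetical helper name.

```python
import math

# Assumed manatee data: x = power boats registered, y = manatees killed.
boats = [447, 460, 481, 498, 513, 512, 526, 559, 585, 614, 645, 675, 711, 719]
kills = [13, 21, 24, 16, 24, 20, 15, 34, 33, 33, 39, 43, 50, 47]

n = len(boats)
xbar, ybar = sum(boats) / n, sum(kills) / n
sxx = sum((x - xbar) ** 2 for x in boats)
syy = sum((y - ybar) ** 2 for y in kills)
sxy = sum((x - xbar) * (y - ybar) for x, y in zip(boats, kills))
beta_hat = sxy / sxx
alpha_hat = ybar - beta_hat * xbar

sse = sum((y - (alpha_hat + beta_hat * x)) ** 2 for x, y in zip(boats, kills))
s = math.sqrt(sse / (n - 2))                        # residual standard error
se_alpha = s * math.sqrt(1 / n + xbar ** 2 / sxx)   # standard error of the intercept
r_squared = 1 - sse / syy                           # coefficient of determination

T_CRIT_12_DF = 2.179  # assumed table value: 95% two-sided, 12 d.f.

def interval_at(x_star, single_response=False):
    """95% interval for the mean response, or for one new response, at x_star."""
    y_hat = alpha_hat + beta_hat * x_star
    leverage = 1 / n + (x_star - xbar) ** 2 / sxx
    extra = 1.0 if single_response else 0.0  # extra variance of a single new y
    margin = T_CRIT_12_DF * s * math.sqrt(extra + leverage)
    return y_hat - margin, y_hat + margin

mean_lo, mean_hi = interval_at(460)
pred_lo, pred_hi = interval_at(460, single_response=True)

print(f"mean response at 460: ({mean_lo:.2f}, {mean_hi:.2f})")
print(f"single response at 460: ({pred_lo:.2f}, {pred_hi:.2f})")
print(f"se(intercept) = {se_alpha:.4f}, R^2 = {r_squared:.4f}")
```

The interval for a single new response is wider because it adds the variance of one new error term to the uncertainty in the estimated mean.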

R Output
The summary of lm(formula = MANKILL ~ NPOWERBT, data = Dataset) reports the coefficient of determination as Multiple R-Squared: 0.8864 (Adjusted R-squared: 0.8769), with residual standard error 4.276 on 12 degrees of freedom and F-statistic 93.61 on 1 and 12 DF, p-value 5.109e-07.

Residual Plot
A scatter plot of the residuals against the predicted values of the response variable, used to verify the assumptions behind the regression model:
Homogeneity of variance
Random normal error
Appropriateness of the linear model

[Figure: graph with a fitted line, manatees killed against number of power boats.]
[Figure: residual plot of regression standardized residuals against regression standardized predicted values; dependent variable: number of manatees killed.]

Residual Plot
[Figure: two residual plots. In the first the model is not a good linear fit; the second shows a violation of the equal variance assumption.]