SIMPLE LINEAR REGRESSION and CORRELATION

Size: px
Start display at page:

Download "SIMPLE LINEAR REGRESSION and CORRELATION"

Transcription

1 Expermental Desgn and Statstcal Methods Workshop SIMPLE LINEAR REGRESSION and CORRELATION Jesús Pedrafta Arlla Departament de Cènca Anmal dels Alments

2 Items Correlaton: degree of assocaton Regresson: predcton The model Assumptons Matrx notaton Protocol of analss Plottng data ANOVA n regresson Confdence ntervals Analss of resduals Influental observatons Basc commands cor.test lm anova seq nfluence.measures Lbrares car (scatterplot)

3 Analss of several varables Two man nterests:. Estmatng the degree of assocaton between two varables: CORRELATION analss.. Predctng the values of one varable gven that we know the realsed value of another varable(s): REGRESSION analss. Ths analss can also be used to understand the relatonshp among varables. a) A response varable and an ndependent varable: smple (lnear) regresson. b) A response varable and two or more ndependent varables: multple (lnear) regresson. c) When the relatonshp among varables s not lnear: nonlnear regresson. d) If the varable s a dchotomous or bnar varable: logstc regresson. 3

4 Data example Suppose we have recorded the age (ears) and blood pressure (mm Hg) of 0 people, obtanng the data presented n the table. Age Blood pressure

5 Smple statstcs > ## Importng data > BLOODP<-read.csv("bloodpress.csv", header=t) > attach(bloodp) > optons(na.acton=na.exclude) > summar(bloodp) AGE BLPRESS Mn. :9.0 Mn. :0.0 st Qu.:6.0 st Qu.:5.5 Medan :44.5 Medan :9.0 Mean :43. Mean :3.5 3rd Qu.:58.0 3rd Qu.:37.0 Max. :70.0 Max. :46.0 To avod problems n predcton when mssng values are present, we must use optons(na.acton=na.exclude). Wth the current data set t would be unnecessar. 5

6 BLPRESS Plot of raw data A plot for a par of varables gves us a frst mpresson about ther relatonshp. It s also useful for detectng some extreme values. > plot(age,blpress) > lbrar(car) > scatterplot(age,blpress) AGE Data not obvousl non lnear and no evdence of non-normalt (boxplots not asmmetrcal). No evdence of extreme values. 6

7 Correlaton (Pearson) The correlaton s a measure of the degree of assocaton between two varables. It s calculated as r s an estmator of, the populaton parameter. r cov( x, ) s x s ( x)( x) ) ) > cor.test(blpress, AGE) x ( x Pearson's product-moment correlaton ( data: BLPRESS and AGE t = 6.06, df = 8, p-value = 4.39e- alternatve hpothess: true correlaton s not equal to 0 95 percent confdence nterval: sample estmates: cor r The denomnator s the geometrc mean of the sample varances estmates. Ths makes r to range from - to. As close s an estmate to - or, the correlaton s larger. H 0 : ( = 0), s rejected 7

8 Correlaton sample sze - The sample sze requred to have a partcular correlaton statstcall dfferent from 0 depends upon the same correlaton coeffcent: z' 0.5ln r r Fsher s classc z-transformaton to normalze the dstrbuton of Pearson correlaton coeffcent. n z / z z' r z' 0 r 3 r 0 = 0 and r s the magntude of the coeffcent we want to estmate. Sample sze for a power of 80% 90%

9 Smple lnear regresson - the model - Dependent varable 0 0 x Intercept Regresson coeffcent (slope) = tg s the ncrease of the dependent varable when the ndependent varable ncreases unt Independent varable x Random error To estmate 0 and we resort to the Least Squares methodolog,.e., mnmze the sum of the squares of the devatons (red arrows) between actual (blue damond) an predcted values (on the slope). ˆ ˆ 0 cov( x, ) s x ˆ x 9

10 Assumptons n regresson analss. The varables x and are lnearl related (defnton of the model).. Both varables are measured for each of n observatons. 3. Varable x s measured wthout error (fxed). 4. Varable s a set of random observatons measured wth error. 5. The errors are ndependent and normall dstrbuted wth homogeneous varance: ε ~ N( 0, I e ) Some of the above condtons can be seen n the fgure. For each value (fxed) of x, there s a normal dstrbuton of (random), wth mean on the regresson lne. x x. x n 0

11 Matrx notaton n n x n x x x X X X β X X Xβ ' ) ' ( ˆ ' ˆ ' As n ANOVA, we can mnmze the sum of the squared errors and then we have the normal equatons: ε Xβ Now X X s not sngular and can be solved wthout need of a generalsed nverse (or restrctons). Note that X s not a matrx of 0 and, but contans the values of the ndependent varable x.

12 Smple lnear regresson Protocol -. Decde whch varable s to be and whch s to be x.. Plot data, n the vertcal axs. 3. Check evenness of x and varables b a box-plot. 4. Transform x and/or f not even. 5. Compute regresson, save resduals, ftted values and nfluence statstcs. Calculate Durbn-Watson statstc f data are n a logcal order. 6. Plot studentzed or standardzed resduals aganst ftted values (or x varable). Examne resdual plots for outlers, and consder rejecton of outlers wth studentzed or standardzed resduals > 3 and go to step Compare nfluence statstcs wth crtcal values: Leverage > p/n Dffts (absolute value) > (p/n) Cook s D > 4/n Dfbetas > /n Where p = number of parameters n the model (number of ) and n = number of data ponts n the regresson. If two or more nfluence statstcs (among the frst three) are greater than the crtcal values, consder rejectng ponts and return to step If outlers or leverage ponts are a problem, consder usng a robust regresson method. (Adapted from Fr, 993)

13 Smple lnear regresson - Results () - > BLOODP.REG <- lm(blpress ~ AGE); summar(bloodp.reg) Resduals: Mn Q Medan 3Q Max ˆ 0 ˆ Coeffcents: Estmate Std. Error t value Pr(> t ) (Intercept) < e-6 *** AGE e- *** --- Sgnf. codes: 0 *** 0.00 ** 0.0 * H 0 : =0, s rejected For each ncrement of ear, blood pressure ncreases mm Hg ˆ x t( ˆ ) ˆ s. e.( ˆ )

14 Smple lnear regresson - Results () - Resdual standard error:. on 8 degrees of freedom Multple R-squared: , Adjusted R-squared: F-statstc: 56.8 on and 8 DF, p-value: 4.39e- > anova(bloodp.reg) Analss of Varance Table F t H 0 ( =0) s rejected Response: BLPRESS Df Sum Sq Mean Sq F value Pr(>F) AGE e- *** Resduals Sgnf. codes: 0 *** 0.00 ** 0.0 * R-Squared s the square of the correlaton coeffcent. It represents the fracton of the total varaton n blood pressure that s explaned b the lnear relatonshp wth age. Adj R-Sq ncludes a correcton to overcome the ncrement n R-Squared wth the number of regressors (k). R R adj SS AGE / SS ( R TOTAL N ) N k 4

15 ANOVA n regresson ( ˆ) ( ˆ ) ( ) ( ˆ) ( ˆ ) Devated to regresson Due to regresson 0 x Squarng and and summng on both sdes of the equaton we can arrve at the followng ANOVA table: Source d.f. S.S. M.S. E(M.S.) F Due to regresson Devatons to regresson SP x ˆ SP x n- SS ˆ SPx ˆ SS x ( SS ˆ SP ) /( n ) x MS Reg / MS Error 5

16 Smple lnear regresson - Results (3) - Confdence ntervals for (for 0 s smlar): ( ˆ..( ˆ ), ˆ..( ˆ t / s e t / s e )) Ths can be done easl wth R (both for b 0 and b ): > confnt(bloodp.reg,level=0.95).5 % 97.5 % (Intercept) AGE

17 Smple lnear regresson - Results (4) - > data.frame(bloodp, Predcted=ftted(BLOODP.REG), Resduals=resd(BLOODP.REG), + RIstudent=rstandard(BLOODP.REG), Restudent=rstudent(BLOODP.REG)) AGE BLPRESS Predcted Resduals RIstudent Restudent ˆ ˆ ; r RIstudent s an Internall studentzed resdual,.e., the resdual dvded b the own standard error (not unform across observatons). REstudent s an Externall studentzed resdual. Onl observaton s a weak outler. 7

18 Some statstcs useful for regresson analss Internall studentzed resdual Weak outler, rs > (95% confdence) Strong outler, rs >3 (95% confdence) rs r MSE( h ) ~ t N k Externall studentzed resduals (-) Calculated as the prevous one, but removng the observaton to calculate the s. Under H 0, t follows a t dstrbuton wth N-k- df. Leverage (h ) Standardzed value of how much an observaton devates from the centre of the space of x values. Observatons wth hgh leverage can ndcate an outler n the x and are potentall nfluent. Computed as the dagonal elements of X(X X) - X. DFFITS ˆ s ˆ, h R student h h where c jj are the dagonal elements of (X X) -. Analse onl DFBETAS correspondng to hgh values of DFFITS. DFBETAS Cook s D Essentall a DFFITS statstc scaled and squared to make extreme values stand out more clearl. j, ˆ j ˆ s c j, jj 8

19 Smple lnear regresson - Results (5) - > nfluence.measures(bloodp.reg) dfb._ dfb.age dfft cov.r cook.d hat nf e e e e e e e e e e * e * e e e e e e e e e All values of DFFIT are below the crtcal value 0.63 (= (/0)),.e., not nfluental observatons on the predcted values. DFBETAS (dfb.) test nfluence on the parameter estmates, and do not need to be examned because DFFIT values are low. Cook s D values are below the crtcal value 0. (=4/0). Leverage s presented n hat. All values are lower than 0., the crtcal value. 9

20 Crtera to flag an observaton as nfluental n R In slde we presented some crtcal ponts to decde f an observaton can be nfluental or not. These crtcal ponts are not statstcal tests but rules of thumb. Furthermore, there are not agreement among statstcans on the values. In fact, R puts a flag (a star) on an observaton, when: an of ts absolute dfbetas value s greater than, or ts absolute dffts value s greater than 3(p/(n-p), or abs(-covrato) s greater than 3p/(n-p), or ts Cook s dstance s greater than the 50% percentle of an F- dstrbuton wth p and n-p degrees of freedom, or ts hat value s greater than 3p/n Where p denotes the number of model coeffcents, ncludng the ntercept. 0

21 Some graphcs about nfluental observatons Hgh leverage, nfluental Hgh leverage, not nfluental x x Low leverage, nfluental Low leverage, not nfluental x x

22 Standardzed resduals Standardzed resduals Resduals -4-0 Standardzed resduals Smple lnear regresson - dagnostcs - > laout(matrx(c(,,3,4),,)) # optonal 4 graphs/page > plot(bloodp.reg) Resduals are dstrbuted approxmatel at random: homogenet of varance met. No mportant devatons n Q-Q plot: response varable normal. None of the ponts approach the hgh Cook s dstance contour(s): none of the observatons are nfluental. 4 Resduals vs Ftted Ftted values Normal Q-Q 4 4 Scale-Locaton Ftted values Resduals vs Leverage Cook's dstance Theoretcal Quantles Leverage

23 Some plots of resduals Ideal resdual plot (random dstrbuton around 0) e 0 ŷ e 0 Model should nvolve curvature e 0 Heterogeneous varance ŷ ŷ 3

24 Blood pressure (mm Hg) Smple lnear regresson - Regresson lne and CL - 45 Blood presure = 0.44*Age+.3 R % upper lmt Regresson lne 95% lower lmt 40 Observatons ( ) ( x x) sˆ MSRES n SSxx Note that for greater values of x the standard error of predcted values s greater, and thus CL. Ths s lttle dstngushable when the predcton s made n the nterval of the observed x s Age (ears) 4

25 Smple lnear regresson program of the graphc - > ## Summar scatterplot > #cretate a plot wth sold dots (pch=6) and no axs or labels > plot(blpress~age, pch=6, axes=f, xlab="", lab="") > #put the x-axs (axs) wth smaller label font sze > axs(, cex.axs=.8) > #put the x-axs label 3 lnes down from the axs > mtext(text="age (ears)", sde=, lne=3) > #put the -axs (axs ) wth horzontal tck labels > axs(, las=) > #put the -ax label 3 lnes to the left of the axs > mtext(text= "Blood pressure (mm Hg)", sde=, lne=3) > #add the regresson lne from the ftted model > ablne(bloodp.reg) > #add the regresson formula > text(50,45,"blood presure = 0.44*Age+.3", pos=) > #add the r squared value > text(50,43,expresson(paste(r^==0.935)), pos=) > #create a sequence of 00 numbers spannng the range of ages > x<-seq(mn(age), max(age), l=000) > #for each value of x, calculate the upper and lower 95% confdence > <-predct(bloodp.reg, data.frame(age=x), nterval="c") > #plot the upper and lower 95% confdence lmts > matlnes(x,, lt=, col=) > #put an L-shaped box to complete the axs > box(bt="l") 5

26 References Fr J.C Bologcal Data Analss. IRL Press, Oxford. 6

Chapter 14 Simple Linear Regression

Chapter 14 Simple Linear Regression Chapter 4 Smple Lnear Regresson Chapter 4 - Smple Lnear Regresson Manageral decsons often are based on the relatonshp between two or more varables. Regresson analss can be used to develop an equaton showng

More information

Chapter 11: Simple Linear Regression and Correlation

Chapter 11: Simple Linear Regression and Correlation Chapter 11: Smple Lnear Regresson and Correlaton 11-1 Emprcal Models 11-2 Smple Lnear Regresson 11-3 Propertes of the Least Squares Estmators 11-4 Hypothess Test n Smple Lnear Regresson 11-4.1 Use of t-tests

More information

Comparison of Regression Lines

Comparison of Regression Lines STATGRAPHICS Rev. 9/13/2013 Comparson of Regresson Lnes Summary... 1 Data Input... 3 Analyss Summary... 4 Plot of Ftted Model... 6 Condtonal Sums of Squares... 6 Analyss Optons... 7 Forecasts... 8 Confdence

More information

Y = β 0 + β 1 X 1 + β 2 X β k X k + ε

Y = β 0 + β 1 X 1 + β 2 X β k X k + ε Chapter 3 Secton 3.1 Model Assumptons: Multple Regresson Model Predcton Equaton Std. Devaton of Error Correlaton Matrx Smple Lnear Regresson: 1.) Lnearty.) Constant Varance 3.) Independent Errors 4.) Normalty

More information

Statistics for Economics & Business

Statistics for Economics & Business Statstcs for Economcs & Busness Smple Lnear Regresson Learnng Objectves In ths chapter, you learn: How to use regresson analyss to predct the value of a dependent varable based on an ndependent varable

More information

Statistics for Managers Using Microsoft Excel/SPSS Chapter 13 The Simple Linear Regression Model and Correlation

Statistics for Managers Using Microsoft Excel/SPSS Chapter 13 The Simple Linear Regression Model and Correlation Statstcs for Managers Usng Mcrosoft Excel/SPSS Chapter 13 The Smple Lnear Regresson Model and Correlaton 1999 Prentce-Hall, Inc. Chap. 13-1 Chapter Topcs Types of Regresson Models Determnng the Smple Lnear

More information

Department of Quantitative Methods & Information Systems. Time Series and Their Components QMIS 320. Chapter 6

Department of Quantitative Methods & Information Systems. Time Series and Their Components QMIS 320. Chapter 6 Department of Quanttatve Methods & Informaton Systems Tme Seres and Ther Components QMIS 30 Chapter 6 Fall 00 Dr. Mohammad Zanal These sldes were modfed from ther orgnal source for educatonal purpose only.

More information

Chap 10: Diagnostics, p384

Chap 10: Diagnostics, p384 Chap 10: Dagnostcs, p384 Multcollnearty 10.5 p406 Defnton Multcollnearty exsts when two or more ndependent varables used n regresson are moderately or hghly correlated. - when multcollnearty exsts, regresson

More information

Regression. The Simple Linear Regression Model

Regression. The Simple Linear Regression Model Regresson Smple Lnear Regresson Model Least Squares Method Coeffcent of Determnaton Model Assumptons Testng for Sgnfcance Usng the Estmated Regresson Equaton for Estmaton and Predcton Resdual Analss: Valdatng

More information

Correlation and Regression. Correlation 9.1. Correlation. Chapter 9

Correlation and Regression. Correlation 9.1. Correlation. Chapter 9 Chapter 9 Correlaton and Regresson 9. Correlaton Correlaton A correlaton s a relatonshp between two varables. The data can be represented b the ordered pars (, ) where s the ndependent (or eplanator) varable,

More information

[The following data appear in Wooldridge Q2.3.] The table below contains the ACT score and college GPA for eight college students.

[The following data appear in Wooldridge Q2.3.] The table below contains the ACT score and college GPA for eight college students. PPOL 59-3 Problem Set Exercses n Smple Regresson Due n class /8/7 In ths problem set, you are asked to compute varous statstcs by hand to gve you a better sense of the mechancs of the Pearson correlaton

More information

Negative Binomial Regression

Negative Binomial Regression STATGRAPHICS Rev. 9/16/2013 Negatve Bnomal Regresson Summary... 1 Data Input... 3 Statstcal Model... 3 Analyss Summary... 4 Analyss Optons... 7 Plot of Ftted Model... 8 Observed Versus Predcted... 10 Predctons...

More information

Basic Business Statistics, 10/e

Basic Business Statistics, 10/e Chapter 13 13-1 Basc Busness Statstcs 11 th Edton Chapter 13 Smple Lnear Regresson Basc Busness Statstcs, 11e 009 Prentce-Hall, Inc. Chap 13-1 Learnng Objectves In ths chapter, you learn: How to use regresson

More information

x yi In chapter 14, we want to perform inference (i.e. calculate confidence intervals and perform tests of significance) in this setting.

x yi In chapter 14, we want to perform inference (i.e. calculate confidence intervals and perform tests of significance) in this setting. The Practce of Statstcs, nd ed. Chapter 14 Inference for Regresson Introducton In chapter 3 we used a least-squares regresson lne (LSRL) to represent a lnear relatonshp etween two quanttatve explanator

More information

Introduction to Regression

Introduction to Regression Introducton to Regresson Dr Tom Ilvento Department of Food and Resource Economcs Overvew The last part of the course wll focus on Regresson Analyss Ths s one of the more powerful statstcal technques Provdes

More information

Statistics MINITAB - Lab 2

Statistics MINITAB - Lab 2 Statstcs 20080 MINITAB - Lab 2 1. Smple Lnear Regresson In smple lnear regresson we attempt to model a lnear relatonshp between two varables wth a straght lne and make statstcal nferences concernng that

More information

Psychology 282 Lecture #24 Outline Regression Diagnostics: Outliers

Psychology 282 Lecture #24 Outline Regression Diagnostics: Outliers Psychology 282 Lecture #24 Outlne Regresson Dagnostcs: Outlers In an earler lecture we studed the statstcal assumptons underlyng the regresson model, ncludng the followng ponts: Formal statement of assumptons.

More information

Chapter 15 Student Lecture Notes 15-1

Chapter 15 Student Lecture Notes 15-1 Chapter 15 Student Lecture Notes 15-1 Basc Busness Statstcs (9 th Edton) Chapter 15 Multple Regresson Model Buldng 004 Prentce-Hall, Inc. Chap 15-1 Chapter Topcs The Quadratc Regresson Model Usng Transformatons

More information

STAT 3008 Applied Regression Analysis

STAT 3008 Applied Regression Analysis STAT 3008 Appled Regresson Analyss Tutoral : Smple Lnear Regresson LAI Chun He Department of Statstcs, The Chnese Unversty of Hong Kong 1 Model Assumpton To quantfy the relatonshp between two factors,

More information

Chapter 13: Multiple Regression

Chapter 13: Multiple Regression Chapter 13: Multple Regresson 13.1 Developng the multple-regresson Model The general model can be descrbed as: It smplfes for two ndependent varables: The sample ft parameter b 0, b 1, and b are used to

More information

Chapter 9: Statistical Inference and the Relationship between Two Variables

Chapter 9: Statistical Inference and the Relationship between Two Variables Chapter 9: Statstcal Inference and the Relatonshp between Two Varables Key Words The Regresson Model The Sample Regresson Equaton The Pearson Correlaton Coeffcent Learnng Outcomes After studyng ths chapter,

More information

28. SIMPLE LINEAR REGRESSION III

28. SIMPLE LINEAR REGRESSION III 8. SIMPLE LINEAR REGRESSION III Ftted Values and Resduals US Domestc Beers: Calores vs. % Alcohol To each observed x, there corresponds a y-value on the ftted lne, y ˆ = βˆ + βˆ x. The are called ftted

More information

SIMPLE LINEAR REGRESSION

SIMPLE LINEAR REGRESSION Smple Lnear Regresson and Correlaton Introducton Prevousl, our attenton has been focused on one varable whch we desgnated b x. Frequentl, t s desrable to learn somethng about the relatonshp between two

More information

18. SIMPLE LINEAR REGRESSION III

18. SIMPLE LINEAR REGRESSION III 8. SIMPLE LINEAR REGRESSION III US Domestc Beers: Calores vs. % Alcohol Ftted Values and Resduals To each observed x, there corresponds a y-value on the ftted lne, y ˆ ˆ = α + x. The are called ftted values.

More information

Linear Regression Analysis: Terminology and Notation

Linear Regression Analysis: Terminology and Notation ECON 35* -- Secton : Basc Concepts of Regresson Analyss (Page ) Lnear Regresson Analyss: Termnology and Notaton Consder the generc verson of the smple (two-varable) lnear regresson model. It s represented

More information

F statistic = s2 1 s 2 ( F for Fisher )

F statistic = s2 1 s 2 ( F for Fisher ) Stat 4 ANOVA Analyss of Varance /6/04 Comparng Two varances: F dstrbuton Typcal Data Sets One way analyss of varance : example Notaton for one way ANOVA Comparng Two varances: F dstrbuton We saw that the

More information

Biostatistics 360 F&t Tests and Intervals in Regression 1

Biostatistics 360 F&t Tests and Intervals in Regression 1 Bostatstcs 360 F&t Tests and Intervals n Regresson ORIGIN Model: Y = X + Corrected Sums of Squares: X X bar where: s the y ntercept of the regresson lne (translaton) s the slope of the regresson lne (scalng

More information

/ n ) are compared. The logic is: if the two

/ n ) are compared. The logic is: if the two STAT C141, Sprng 2005 Lecture 13 Two sample tests One sample tests: examples of goodness of ft tests, where we are testng whether our data supports predctons. Two sample tests: called as tests of ndependence

More information

Learning Objectives for Chapter 11

Learning Objectives for Chapter 11 Chapter : Lnear Regresson and Correlaton Methods Hldebrand, Ott and Gray Basc Statstcal Ideas for Managers Second Edton Learnng Objectves for Chapter Usng the scatterplot n regresson analyss Usng the method

More information

1. Inference on Regression Parameters a. Finding Mean, s.d and covariance amongst estimates. 2. Confidence Intervals and Working Hotelling Bands

1. Inference on Regression Parameters a. Finding Mean, s.d and covariance amongst estimates. 2. Confidence Intervals and Working Hotelling Bands Content. Inference on Regresson Parameters a. Fndng Mean, s.d and covarance amongst estmates.. Confdence Intervals and Workng Hotellng Bands 3. Cochran s Theorem 4. General Lnear Testng 5. Measures of

More information

Linear regression. Regression Models. Chapter 11 Student Lecture Notes Regression Analysis is the

Linear regression. Regression Models. Chapter 11 Student Lecture Notes Regression Analysis is the Chapter 11 Student Lecture Notes 11-1 Lnear regresson Wenl lu Dept. Health statstcs School of publc health Tanjn medcal unversty 1 Regresson Models 1. Answer What Is the Relatonshp Between the Varables?.

More information

Statistics for Business and Economics

Statistics for Business and Economics Statstcs for Busness and Economcs Chapter 11 Smple Regresson Copyrght 010 Pearson Educaton, Inc. Publshng as Prentce Hall Ch. 11-1 11.1 Overvew of Lnear Models n An equaton can be ft to show the best lnear

More information

Lecture Notes for STATISTICAL METHODS FOR BUSINESS II BMGT 212. Chapters 14, 15 & 16. Professor Ahmadi, Ph.D. Department of Management

Lecture Notes for STATISTICAL METHODS FOR BUSINESS II BMGT 212. Chapters 14, 15 & 16. Professor Ahmadi, Ph.D. Department of Management Lecture Notes for STATISTICAL METHODS FOR BUSINESS II BMGT 1 Chapters 14, 15 & 16 Professor Ahmad, Ph.D. Department of Management Revsed August 005 Chapter 14 Formulas Smple Lnear Regresson Model: y =

More information

Unit 10: Simple Linear Regression and Correlation

Unit 10: Simple Linear Regression and Correlation Unt 10: Smple Lnear Regresson and Correlaton Statstcs 571: Statstcal Methods Ramón V. León 6/28/2004 Unt 10 - Stat 571 - Ramón V. León 1 Introductory Remarks Regresson analyss s a method for studyng the

More information

Lecture 6: Introduction to Linear Regression

Lecture 6: Introduction to Linear Regression Lecture 6: Introducton to Lnear Regresson An Manchakul amancha@jhsph.edu 24 Aprl 27 Lnear regresson: man dea Lnear regresson can be used to study an outcome as a lnear functon of a predctor Example: 6

More information

Systematic Error Illustration of Bias. Sources of Systematic Errors. Effects of Systematic Errors 9/23/2009. Instrument Errors Method Errors Personal

Systematic Error Illustration of Bias. Sources of Systematic Errors. Effects of Systematic Errors 9/23/2009. Instrument Errors Method Errors Personal 9/3/009 Sstematc Error Illustraton of Bas Sources of Sstematc Errors Instrument Errors Method Errors Personal Prejudce Preconceved noton of true value umber bas Prefer 0/5 Small over large Even over odd

More information

ECONOMICS 351*-A Mid-Term Exam -- Fall Term 2000 Page 1 of 13 pages. QUEEN'S UNIVERSITY AT KINGSTON Department of Economics

ECONOMICS 351*-A Mid-Term Exam -- Fall Term 2000 Page 1 of 13 pages. QUEEN'S UNIVERSITY AT KINGSTON Department of Economics ECOOMICS 35*-A Md-Term Exam -- Fall Term 000 Page of 3 pages QUEE'S UIVERSITY AT KIGSTO Department of Economcs ECOOMICS 35* - Secton A Introductory Econometrcs Fall Term 000 MID-TERM EAM ASWERS MG Abbott

More information

Lecture 9: Linear regression: centering, hypothesis testing, multiple covariates, and confounding

Lecture 9: Linear regression: centering, hypothesis testing, multiple covariates, and confounding Recall: man dea of lnear regresson Lecture 9: Lnear regresson: centerng, hypothess testng, multple covarates, and confoundng Sandy Eckel seckel@jhsph.edu 6 May 8 Lnear regresson can be used to study an

More information

Lecture 9: Linear regression: centering, hypothesis testing, multiple covariates, and confounding

Lecture 9: Linear regression: centering, hypothesis testing, multiple covariates, and confounding Lecture 9: Lnear regresson: centerng, hypothess testng, multple covarates, and confoundng Sandy Eckel seckel@jhsph.edu 6 May 008 Recall: man dea of lnear regresson Lnear regresson can be used to study

More information

Economics 130. Lecture 4 Simple Linear Regression Continued

Economics 130. Lecture 4 Simple Linear Regression Continued Economcs 130 Lecture 4 Contnued Readngs for Week 4 Text, Chapter and 3. We contnue wth addressng our second ssue + add n how we evaluate these relatonshps: Where do we get data to do ths analyss? How do

More information

β0 + β1xi. You are interested in estimating the unknown parameters β

β0 + β1xi. You are interested in estimating the unknown parameters β Revsed: v3 Ordnar Least Squares (OLS): Smple Lnear Regresson (SLR) Analtcs The SLR Setup Sample Statstcs Ordnar Least Squares (OLS): FOCs and SOCs Back to OLS and Sample Statstcs Predctons (and Resduals)

More information

Here is the rationale: If X and y have a strong positive relationship to one another, then ( x x) will tend to be positive when ( y y)

Here is the rationale: If X and y have a strong positive relationship to one another, then ( x x) will tend to be positive when ( y y) Secton 1.5 Correlaton In the prevous sectons, we looked at regresson and the value r was a measurement of how much of the varaton n y can be attrbuted to the lnear relatonshp between y and x. In ths secton,

More information

STATISTICS QUESTIONS. Step by Step Solutions.

STATISTICS QUESTIONS. Step by Step Solutions. STATISTICS QUESTIONS Step by Step Solutons www.mathcracker.com 9//016 Problem 1: A researcher s nterested n the effects of famly sze on delnquency for a group of offenders and examnes famles wth one to

More information

Interval Estimation in the Classical Normal Linear Regression Model. 1. Introduction

Interval Estimation in the Classical Normal Linear Regression Model. 1. Introduction ECONOMICS 35* -- NOTE 7 ECON 35* -- NOTE 7 Interval Estmaton n the Classcal Normal Lnear Regresson Model Ths note outlnes the basc elements of nterval estmaton n the Classcal Normal Lnear Regresson Model

More information

2016 Wiley. Study Session 2: Ethical and Professional Standards Application

2016 Wiley. Study Session 2: Ethical and Professional Standards Application 6 Wley Study Sesson : Ethcal and Professonal Standards Applcaton LESSON : CORRECTION ANALYSIS Readng 9: Correlaton and Regresson LOS 9a: Calculate and nterpret a sample covarance and a sample correlaton

More information

Chapter 15 - Multiple Regression

Chapter 15 - Multiple Regression Chapter - Multple Regresson Chapter - Multple Regresson Multple Regresson Model The equaton that descrbes how the dependent varable y s related to the ndependent varables x, x,... x p and an error term

More information

Statistics II Final Exam 26/6/18

Statistics II Final Exam 26/6/18 Statstcs II Fnal Exam 26/6/18 Academc Year 2017/18 Solutons Exam duraton: 2 h 30 mn 1. (3 ponts) A town hall s conductng a study to determne the amount of leftover food produced by the restaurants n the

More information

NANYANG TECHNOLOGICAL UNIVERSITY SEMESTER I EXAMINATION MTH352/MH3510 Regression Analysis

NANYANG TECHNOLOGICAL UNIVERSITY SEMESTER I EXAMINATION MTH352/MH3510 Regression Analysis NANYANG TECHNOLOGICAL UNIVERSITY SEMESTER I EXAMINATION 014-015 MTH35/MH3510 Regresson Analyss December 014 TIME ALLOWED: HOURS INSTRUCTIONS TO CANDIDATES 1. Ths examnaton paper contans FOUR (4) questons

More information

17 - LINEAR REGRESSION II

17 - LINEAR REGRESSION II Topc 7 Lnear Regresson II 7- Topc 7 - LINEAR REGRESSION II Testng and Estmaton Inferences about β Recall that we estmate Yˆ ˆ β + ˆ βx. 0 μ Y X x β0 + βx usng To estmate σ σ squared error Y X x ε s ε we

More information

is the calculated value of the dependent variable at point i. The best parameters have values that minimize the squares of the errors

is the calculated value of the dependent variable at point i. The best parameters have values that minimize the squares of the errors Multple Lnear and Polynomal Regresson wth Statstcal Analyss Gven a set of data of measured (or observed) values of a dependent varable: y versus n ndependent varables x 1, x, x n, multple lnear regresson

More information

where I = (n x n) diagonal identity matrix with diagonal elements = 1 and off-diagonal elements = 0; and σ 2 e = variance of (Y X).

where I = (n x n) diagonal identity matrix with diagonal elements = 1 and off-diagonal elements = 0; and σ 2 e = variance of (Y X). 11.4.1 Estmaton of Multple Regresson Coeffcents In multple lnear regresson, we essentally solve n equatons for the p unnown parameters. hus n must e equal to or greater than p and n practce n should e

More information

x i1 =1 for all i (the constant ).

x i1 =1 for all i (the constant ). Chapter 5 The Multple Regresson Model Consder an economc model where the dependent varable s a functon of K explanatory varables. The economc model has the form: y = f ( x,x,..., ) xk Approxmate ths by

More information

Topic 7: Analysis of Variance

Topic 7: Analysis of Variance Topc 7: Analyss of Varance Outlne Parttonng sums of squares Breakdown the degrees of freedom Expected mean squares (EMS) F test ANOVA table General lnear test Pearson Correlaton / R 2 Analyss of Varance

More information

Resource Allocation and Decision Analysis (ECON 8010) Spring 2014 Foundations of Regression Analysis

Resource Allocation and Decision Analysis (ECON 8010) Spring 2014 Foundations of Regression Analysis Resource Allocaton and Decson Analss (ECON 800) Sprng 04 Foundatons of Regresson Analss Readng: Regresson Analss (ECON 800 Coursepak, Page 3) Defntons and Concepts: Regresson Analss statstcal technques

More information

ANOVA. The Observations y ij

ANOVA. The Observations y ij ANOVA Stands for ANalyss Of VArance But t s a test of dfferences n means The dea: The Observatons y j Treatment group = 1 = 2 = k y 11 y 21 y k,1 y 12 y 22 y k,2 y 1, n1 y 2, n2 y k, nk means: m 1 m 2

More information

Chapter 14 Simple Linear Regression Page 1. Introduction to regression analysis 14-2

Chapter 14 Simple Linear Regression Page 1. Introduction to regression analysis 14-2 Chapter 4 Smple Lnear Regresson Page. Introducton to regresson analyss 4- The Regresson Equaton. Lnear Functons 4-4 3. Estmaton and nterpretaton of model parameters 4-6 4. Inference on the model parameters

More information

The SAS program I used to obtain the analyses for my answers is given below.

The SAS program I used to obtain the analyses for my answers is given below. Homework 1 Answer sheet Page 1 The SAS program I used to obtan the analyses for my answers s gven below. dm'log;clear;output;clear'; *************************************************************; *** EXST7034

More information

Correlation and Regression

Correlation and Regression Correlaton and Regresson otes prepared by Pamela Peterson Drake Index Basc terms and concepts... Smple regresson...5 Multple Regresson...3 Regresson termnology...0 Regresson formulas... Basc terms and

More information

Reduced slides. Introduction to Analysis of Variance (ANOVA) Part 1. Single factor

Reduced slides. Introduction to Analysis of Variance (ANOVA) Part 1. Single factor Reduced sldes Introducton to Analss of Varance (ANOVA) Part 1 Sngle factor 1 The logc of Analss of Varance Is the varance explaned b the model >> than the resdual varance In regresson models Varance explaned

More information

β0 + β1xi. You are interested in estimating the unknown parameters β

β0 + β1xi. You are interested in estimating the unknown parameters β Ordnary Least Squares (OLS): Smple Lnear Regresson (SLR) Analytcs The SLR Setup Sample Statstcs Ordnary Least Squares (OLS): FOCs and SOCs Back to OLS and Sample Statstcs Predctons (and Resduals) wth OLS

More information

Biostatistics. Chapter 11 Simple Linear Correlation and Regression. Jing Li

Biostatistics. Chapter 11 Simple Linear Correlation and Regression. Jing Li Bostatstcs Chapter 11 Smple Lnear Correlaton and Regresson Jng L jng.l@sjtu.edu.cn http://cbb.sjtu.edu.cn/~jngl/courses/2018fall/b372/ Dept of Bonformatcs & Bostatstcs, SJTU Recall eat chocolate Cell 175,

More information

Lecture 3 Stat102, Spring 2007

Lecture 3 Stat102, Spring 2007 Lecture 3 Stat0, Sprng 007 Chapter 3. 3.: Introducton to regresson analyss Lnear regresson as a descrptve technque The least-squares equatons Chapter 3.3 Samplng dstrbuton of b 0, b. Contnued n net lecture

More information

The Ordinary Least Squares (OLS) Estimator

The Ordinary Least Squares (OLS) Estimator The Ordnary Least Squares (OLS) Estmator 1 Regresson Analyss Regresson Analyss: a statstcal technque for nvestgatng and modelng the relatonshp between varables. Applcatons: Engneerng, the physcal and chemcal

More information

Properties of Least Squares

Properties of Least Squares Week 3 3.1 Smple Lnear Regresson Model 3. Propertes of Least Squares Estmators Y Y β 1 + β X + u weekly famly expendtures X weekly famly ncome For a gven level of x, the expected level of food expendtures

More information

Introduction to Analysis of Variance (ANOVA) Part 1

Introduction to Analysis of Variance (ANOVA) Part 1 Introducton to Analss of Varance (ANOVA) Part 1 Sngle factor The logc of Analss of Varance Is the varance explaned b the model >> than the resdual varance In regresson models Varance explaned b regresson

More information

Statistics for Managers Using Microsoft Excel/SPSS Chapter 14 Multiple Regression Models

Statistics for Managers Using Microsoft Excel/SPSS Chapter 14 Multiple Regression Models Statstcs for Managers Usng Mcrosoft Excel/SPSS Chapter 14 Multple Regresson Models 1999 Prentce-Hall, Inc. Chap. 14-1 Chapter Topcs The Multple Regresson Model Contrbuton of Indvdual Independent Varables

More information

ANSWERS CHAPTER 9. TIO 9.2: If the values are the same, the difference is 0, therefore the null hypothesis cannot be rejected.

ANSWERS CHAPTER 9. TIO 9.2: If the values are the same, the difference is 0, therefore the null hypothesis cannot be rejected. ANSWERS CHAPTER 9 THINK IT OVER thnk t over TIO 9.: χ 2 k = ( f e ) = 0 e Breakng the equaton down: the test statstc for the ch-squared dstrbuton s equal to the sum over all categores of the expected frequency

More information

Lecture 6 More on Complete Randomized Block Design (RBD)

Lecture 6 More on Complete Randomized Block Design (RBD) Lecture 6 More on Complete Randomzed Block Desgn (RBD) Multple test Multple test The multple comparsons or multple testng problem occurs when one consders a set of statstcal nferences smultaneously. For

More information

Chapter 2 - The Simple Linear Regression Model S =0. e i is a random error. S β2 β. This is a minimization problem. Solution is a calculus exercise.

Chapter 2 - The Simple Linear Regression Model S =0. e i is a random error. S β2 β. This is a minimization problem. Solution is a calculus exercise. Chapter - The Smple Lnear Regresson Model The lnear regresson equaton s: where y + = β + β e for =,..., y and are observable varables e s a random error How can an estmaton rule be constructed for the

More information

e i is a random error

e i is a random error Chapter - The Smple Lnear Regresson Model The lnear regresson equaton s: where + β + β e for,..., and are observable varables e s a random error How can an estmaton rule be constructed for the unknown

More information

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction ECONOMICS 5* -- NOTE (Summary) ECON 5* -- NOTE The Multple Classcal Lnear Regresson Model (CLRM): Specfcaton and Assumptons. Introducton CLRM stands for the Classcal Lnear Regresson Model. The CLRM s also

More information

Lecture 4 Hypothesis Testing

Lecture 4 Hypothesis Testing Lecture 4 Hypothess Testng We may wsh to test pror hypotheses about the coeffcents we estmate. We can use the estmates to test whether the data rejects our hypothess. An example mght be that we wsh to

More information

Statistics Chapter 4

Statistics Chapter 4 Statstcs Chapter 4 "There are three knds of les: les, damned les, and statstcs." Benjamn Dsrael, 1895 (Brtsh statesman) Gaussan Dstrbuton, 4-1 If a measurement s repeated many tmes a statstcal treatment

More information

January Examinations 2015

January Examinations 2015 24/5 Canddates Only January Examnatons 25 DO NOT OPEN THE QUESTION PAPER UNTIL INSTRUCTED TO DO SO BY THE CHIEF INVIGILATOR STUDENT CANDIDATE NO.. Department Module Code Module Ttle Exam Duraton (n words)

More information

Regression Analysis. Regression Analysis

Regression Analysis. Regression Analysis Regresson Analyss Smple Regresson Multvarate Regresson Stepwse Regresson Replcaton and Predcton Error 1 Regresson Analyss In general, we "ft" a model by mnmzng a metrc that represents the error. n mn (y

More information

Lecture 2: Prelude to the big shrink

Lecture 2: Prelude to the big shrink Lecture 2: Prelude to the bg shrnk Last tme A slght detour wth vsualzaton tools (hey, t was the frst day... why not start out wth somethng pretty to look at?) Then, we consdered a smple 120a-style regresson

More information

Lecture 16 Statistical Analysis in Biomaterials Research (Part II)

Lecture 16 Statistical Analysis in Biomaterials Research (Part II) 3.051J/0.340J 1 Lecture 16 Statstcal Analyss n Bomaterals Research (Part II) C. F Dstrbuton Allows comparson of varablty of behavor between populatons usng test of hypothess: σ x = σ x amed for Brtsh statstcan

More information

IV. Modeling a Mean: Simple Linear Regression

IV. Modeling a Mean: Simple Linear Regression IV. Modelng a Mean: Smple Lnear Regresson We have talked about nference for a sngle mean, for comparng two means, and for comparng several means. What f the mean of one varable depends on the value of

More information

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4)

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4) I. Classcal Assumptons Econ7 Appled Econometrcs Topc 3: Classcal Model (Studenmund, Chapter 4) We have defned OLS and studed some algebrac propertes of OLS. In ths topc we wll study statstcal propertes

More information

Activity #13: Simple Linear Regression. actgpa.sav; beer.sav;

Activity #13: Simple Linear Regression. actgpa.sav; beer.sav; ctvty #3: Smple Lnear Regresson Resources: actgpa.sav; beer.sav; http://mathworld.wolfram.com/leastfttng.html In the last actvty, we learned how to quantfy the strength of the lnear relatonshp between

More information

Measuring the Strength of Association

Measuring the Strength of Association Stat 3000 Statstcs for Scentsts and Engneers Measurng the Strength of Assocaton Note that the slope s one measure of the lnear assocaton between two contnuous varables t tells ou how much the average of

More information

Chapter 3 Describing Data Using Numerical Measures

Chapter 3 Describing Data Using Numerical Measures Chapter 3 Student Lecture Notes 3-1 Chapter 3 Descrbng Data Usng Numercal Measures Fall 2006 Fundamentals of Busness Statstcs 1 Chapter Goals To establsh the usefulness of summary measures of data. The

More information

Econ Statistical Properties of the OLS estimator. Sanjaya DeSilva

Econ Statistical Properties of the OLS estimator. Sanjaya DeSilva Econ 39 - Statstcal Propertes of the OLS estmator Sanjaya DeSlva September, 008 1 Overvew Recall that the true regresson model s Y = β 0 + β 1 X + u (1) Applyng the OLS method to a sample of data, we estmate

More information

Reminder: Nested models. Lecture 9: Interactions, Quadratic terms and Splines. Effect Modification. Model 1

Reminder: Nested models. Lecture 9: Interactions, Quadratic terms and Splines. Effect Modification. Model 1 Lecture 9: Interactons, Quadratc terms and Splnes An Manchakul amancha@jhsph.edu 3 Aprl 7 Remnder: Nested models Parent model contans one set of varables Extended model adds one or more new varables to

More information

Number of cases Number of factors Number of covariates Number of levels of factor i. Value of the dependent variable for case k

Number of cases Number of factors Number of covariates Number of levels of factor i. Value of the dependent variable for case k ANOVA Model and Matrx Computatons Notaton The followng notaton s used throughout ths chapter unless otherwse stated: N F CN Y Z j w W Number of cases Number of factors Number of covarates Number of levels

More information

Professor Chris Murray. Midterm Exam

Professor Chris Murray. Midterm Exam Econ 7 Econometrcs Sprng 4 Professor Chrs Murray McElhnney D cjmurray@uh.edu Mdterm Exam Wrte your answers on one sde of the blank whte paper that I have gven you.. Do not wrte your answers on ths exam.

More information

STAT 3340 Assignment 1 solutions. 1. Find the equation of the line which passes through the points (1,1) and (4,5).

STAT 3340 Assignment 1 solutions. 1. Find the equation of the line which passes through the points (1,1) and (4,5). (out of 15 ponts) STAT 3340 Assgnment 1 solutons (10) (10) 1. Fnd the equaton of the lne whch passes through the ponts (1,1) and (4,5). β 1 = (5 1)/(4 1) = 4/3 equaton for the lne s y y 0 = β 1 (x x 0

More information

( )( ) [ ] [ ] ( ) 1 = [ ] = ( ) 1. H = X X X X is called the hat matrix ( it puts the hats on the Y s) and is of order n n H = X X X X.

( )( ) [ ] [ ] ( ) 1 = [ ] = ( ) 1. H = X X X X is called the hat matrix ( it puts the hats on the Y s) and is of order n n H = X X X X. ( ) ( ) where ( ) 1 ˆ β = X X X X β + ε = β + Aε A = X X 1 X [ ] E ˆ β β AE ε β so ˆ = + = β s unbased ( )( ) [ ] ˆ Cov β = E ˆ β β ˆ β β = E Aεε A AE ε ε A Aσ IA = σ AA = σ X X = [ ] = ( ) 1 Ftted values

More information

β0 + β1xi and want to estimate the unknown

β0 + β1xi and want to estimate the unknown SLR Models Estmaton Those OLS Estmates Estmators (e ante) v. estmates (e post) The Smple Lnear Regresson (SLR) Condtons -4 An Asde: The Populaton Regresson Functon B and B are Lnear Estmators (condtonal

More information

a. (All your answers should be in the letter!

a. (All your answers should be in the letter! Econ 301 Blkent Unversty Taskn Econometrcs Department of Economcs Md Term Exam I November 8, 015 Name For each hypothess testng n the exam complete the followng steps: Indcate the test statstc, ts crtcal

More information

First Year Examination Department of Statistics, University of Florida

First Year Examination Department of Statistics, University of Florida Frst Year Examnaton Department of Statstcs, Unversty of Florda May 7, 010, 8:00 am - 1:00 noon Instructons: 1. You have four hours to answer questons n ths examnaton.. You must show your work to receve

More information

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Analyss of Varance and Desgn of Experment-I MODULE VII LECTURE - 3 ANALYSIS OF COVARIANCE Dr Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur Any scentfc experment s performed

More information

Systems of Equations (SUR, GMM, and 3SLS)

Systems of Equations (SUR, GMM, and 3SLS) Lecture otes on Advanced Econometrcs Takash Yamano Fall Semester 4 Lecture 4: Sstems of Equatons (SUR, MM, and 3SLS) Seemngl Unrelated Regresson (SUR) Model Consder a set of lnear equatons: $ + ɛ $ + ɛ

More information

On the Influential Points in the Functional Circular Relationship Models

On the Influential Points in the Functional Circular Relationship Models On the Influental Ponts n the Functonal Crcular Relatonshp Models Department of Mathematcs, Faculty of Scence Al-Azhar Unversty-Gaza, Gaza, Palestne alzad33@yahoo.com Abstract If the nterest s to calbrate

More information

General Linear Models

General Linear Models General Lnear Models Revsed: 10/10/2017 Summary... 2 Analyss Summary... 6 Analyss Optons... 10 Model Coeffcents... 11 Scatterplot... 14 Table of Means... 15 Means Plot... 16 Interacton Plot... 19 Multple

More information

7.1. Single classification analysis of variance (ANOVA) Why not use multiple 2-sample 2. When to use ANOVA

7.1. Single classification analysis of variance (ANOVA) Why not use multiple 2-sample 2. When to use ANOVA Sngle classfcaton analyss of varance (ANOVA) When to use ANOVA ANOVA models and parttonng sums of squares ANOVA: hypothess testng ANOVA: assumptons A non-parametrc alternatve: Kruskal-Walls ANOVA Power

More information

Introduction to Generalized Linear Models

Introduction to Generalized Linear Models INTRODUCTION TO STATISTICAL MODELLING TRINITY 00 Introducton to Generalzed Lnear Models I. Motvaton In ths lecture we extend the deas of lnear regresson to the more general dea of a generalzed lnear model

More information

Scatter Plot x

Scatter Plot x Construct a scatter plot usng excel for the gven data. Determne whether there s a postve lnear correlaton, negatve lnear correlaton, or no lnear correlaton. Complete the table and fnd the correlaton coeffcent

More information

Durban Watson for Testing the Lack-of-Fit of Polynomial Regression Models without Replications

Durban Watson for Testing the Lack-of-Fit of Polynomial Regression Models without Replications Durban Watson for Testng the Lack-of-Ft of Polynomal Regresson Models wthout Replcatons Ruba A. Alyaf, Maha A. Omar, Abdullah A. Al-Shha ralyaf@ksu.edu.sa, maomar@ksu.edu.sa, aalshha@ksu.edu.sa Department

More information

4.3 Poisson Regression

4.3 Poisson Regression of teratvely reweghted least squares regressons (the IRLS algorthm). We do wthout gvng further detals, but nstead focus on the practcal applcaton. > glm(survval~log(weght)+age, famly="bnomal", data=baby)

More information