Chapter 11: Linear Regression and Correlation
- Josephine Mosley
Regression analysis is a statistical tool that uses the relation between two or more quantitative variables so that one variable can be predicted from the other, or others.

Some examples:
- Height and weight of people
- Income and expenses of people
- Production size and production time
- Soil pH and the rate of growth of plants

Correlation

An easy way to determine whether two quantitative variables are linearly related is to look at their scatterplot. Another way is to calculate the correlation coefficient, usually denoted by r.

The linear correlation measures the strength of the linear relationship between the explanatory variable (x) and the response variable (y). An estimate of this correlation parameter is provided by the Pearson sample correlation coefficient, r.

Note: $-1 \le r \le 1$.
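As a minimal sketch of what the Pearson sample correlation coefficient computes, the R code below evaluates r from its usual sums-of-squares definition and compares it with the built-in cor(). The short height and weight vectors are made-up illustration values, not data from the course file.

```r
# Hypothetical illustration data (not taken from healthexam.csv)
height <- c(60, 62, 65, 68, 70, 72, 75)
weight <- c(110, 120, 140, 155, 160, 180, 200)

# Pearson r = SSxy / sqrt(SSxx * SSyy)
ss_xy <- sum((height - mean(height)) * (weight - mean(weight)))
ss_xx <- sum((height - mean(height))^2)
ss_yy <- sum((weight - mean(weight))^2)
r_manual <- ss_xy / sqrt(ss_xx * ss_yy)

# Same quantity from base R
r_builtin <- cor(height, weight)
print(c(manual = r_manual, builtin = r_builtin))
```

Both values agree, and r always lies between -1 and 1.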
Example Scatterplots with Correlations

[Figure: example scatterplots with their correlations.]

If X and Y are independent, then their correlation is 0.

Correlation

If the correlation between X and Y is 0, it does not mean that they are independent. It only means that they are not linearly related.

One complaint about the correlation is that interpreting its value can be subjective. Some people are very happy with $r \approx 0.6$, while others are not.

Note: Correlation does not necessarily imply causation.

Some guidelines for interpreting r:

Value of |r|             Strength of linear relationship
|r| ≥ 0.95               Very strong
0.85 ≤ |r| < 0.95        Strong
0.65 ≤ |r| < 0.85        Moderate to strong
0.45 ≤ |r| < 0.65        Moderate
0.25 ≤ |r| < 0.45        Weak
|r| < 0.25               Very weak/close to none
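The two points above can be sketched in R. The helper interpret_r() is a hypothetical function written for this illustration; it assumes the guideline thresholds apply to the absolute value of r. The second part shows a made-up case where y is completely determined by x and yet the correlation is zero, because the relationship is not linear.

```r
# Hypothetical helper: map |r| to the strength labels in the guidelines above
interpret_r <- function(r) {
  a <- abs(r)
  if (a >= 0.95) {
    "Very strong"
  } else if (a >= 0.85) {
    "Strong"
  } else if (a >= 0.65) {
    "Moderate to strong"
  } else if (a >= 0.45) {
    "Moderate"
  } else if (a >= 0.25) {
    "Weak"
  } else {
    "Very weak/close to none"
  }
}

# Zero correlation without independence: y depends on x, but not linearly
x <- -3:3
y <- x^2
r_xy <- cor(x, y)

print(interpret_r(0.6))
print(r_xy)
```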
Computing Correlation in R

The data file healthexam.csv has the columns Gender, Age, Height, Weight, Waist, Pulse, SysBP, DiasBP, Cholesterol, BodyMass, Leg, Elbow, Wrist, and Arm.

    data.health = read.csv("healthexam.csv", header = TRUE)
    head(data.health)                 # first few rows of the data
    attach(data.health)               # make the columns available by name
    plot(Height, Weight, pch = 19, main = "Scatterplot")
    cor(Height, Weight)               # correlation between height and weight
    cor(Waist, Weight)                # correlation between waist and weight
    plot(Waist, Weight, pch = 19, main = "Scatterplot")

Simple Linear Regression

Model: $Y_i = (\beta_0 + \beta_1 x_i) + \varepsilon_i$, where $\varepsilon_i$ is the random error and
- $Y_i$ is the $i$th value of the response variable,
- $x_i$ is the $i$th value of the explanatory variable,
- the $\varepsilon_i$ are uncorrelated, with mean 0 and constant variance $\sigma^2$.

How do we determine the underlying linear relationship? Since the points follow a linear trend, why not look for the line that best fits the points? But what do we mean by "best fit"? We need a criterion that tells us which of two competing candidate lines is better.

[Figure: scatterplot with candidate lines $L_1$ to $L_4$, showing an observed point at $x_1$, its error term $\varepsilon_1$, and the expected point on the line $Y = \beta_0 + \beta_1 x$.]
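To make the model concrete, here is a small simulation sketch in R. All parameter values (intercept, slope, error standard deviation, sample size) are arbitrary choices for illustration, and the errors are drawn from a normal distribution, which is a stronger assumption than the model statement itself requires at this point.

```r
set.seed(1)        # arbitrary seed, for reproducibility
n     <- 50
beta0 <- 2         # assumed intercept
beta1 <- 0.8       # assumed slope
sigma <- 1.5       # assumed error standard deviation

x   <- runif(n, min = 0, max = 10)
eps <- rnorm(n, mean = 0, sd = sigma)   # random errors: mean 0, constant variance
expected <- beta0 + beta1 * x           # expected points on the underlying line
y   <- expected + eps                   # observed points

plot(x, y, pch = 19, main = "Simulated data from the model")
abline(a = beta0, b = beta1, col = "darkgreen", lwd = 2)   # underlying line
```

In real data the underlying line and the error terms are unknown; only x and y are observed.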
Method of Least Squares

Model: $Y_i = (\beta_0 + \beta_1 x_i) + \varepsilon_i$, where
- $Y_i$ is the $i$th value of the response variable,
- $x_i$ is the $i$th value of the explanatory variable,
- the $\varepsilon_i$ are uncorrelated, with mean 0 and constant variance $\sigma^2$.

Residual = (observed y-value) - (predicted y-value), for example $e_1 = y_1 - \hat{y}_1$.

[Figure: observed point $P_1(x_1, y_1)$, its predicted value on a candidate line, and the residuals $e_1$ and $e_2$. Example candidate line: $\hat{y} = 2 + 0.8x$.]

Method of Least Squares: Choose as the best line the one that minimizes the sum of squared residuals (SSE). This line is known as the Least-Squares Regression Line.

Question: There are infinitely many candidate lines, so how can we find the one that minimizes the SSE?

Answer: Since the SSE is a continuous function of two variables (the intercept and the slope), we can use methods from calculus to minimize it.

Obtaining the Regression Line in R

    data.health = read.csv("healthexam.csv", header = TRUE)
    head(data.health)
    attach(data.health)
    plot(Waist, Weight, pch = 19, main = "Scatterplot")
    result = lm(Weight ~ Waist)       # fit the least-squares line
    coef(result)                      # (Intercept) and Waist (slope)

As waist increases by 1 cm, weight goes up by about 2.4 pounds.

    abline(a = coef(result)[1], b = coef(result)[2], lwd = 2, col = "blue")

So, for the first person, the predicted weight is the fitted line evaluated at her waist size, and her residual is her actual weight minus that prediction:

    Predicted.1 = coef(result)[1] + coef(result)[2] * Waist[1]   # pounds
    Residual.1 = Weight[1] - Predicted.1                          # 13.1 pounds
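The calculus step mentioned in the answer can be sketched as follows. The symbols $b_0$ and $b_1$ for a candidate intercept and slope, and the shorthand $SS_{xy}$ and $SS_{xx}$, are notation introduced here for the derivation.

```latex
\begin{aligned}
SSE(b_0, b_1) &= \sum_{i=1}^{n} \bigl(y_i - b_0 - b_1 x_i\bigr)^2 \\[4pt]
\frac{\partial\, SSE}{\partial b_0} &= -2 \sum_{i=1}^{n} \bigl(y_i - b_0 - b_1 x_i\bigr) = 0 \\[4pt]
\frac{\partial\, SSE}{\partial b_1} &= -2 \sum_{i=1}^{n} x_i \bigl(y_i - b_0 - b_1 x_i\bigr) = 0 \\[4pt]
\hat{\beta}_1 &= \frac{SS_{xy}}{SS_{xx}}
  = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}
\end{aligned}
```

Setting both partial derivatives to zero gives two linear equations in $b_0$ and $b_1$; solving them yields the slope and intercept in the last line, which are the values that lm() reports.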
What else do we get from the lm function?

    data.health = read.csv("healthexam.csv", header = TRUE)
    attach(data.health)
    result = lm(Weight ~ Waist)
    attributes(result)

The $names attribute lists the components of the fitted object: "coefficients", "residuals", "effects", "rank", "fitted.values", "assign", "qr", "df.residual", "xlevels", "call", "terms", and "model".

    result$fitted.values[1]   # predicted weight for the first person
    result$residuals[1]       # residual for the first person
    summary(result)

For the fit lm(formula = Weight ~ Waist), summary() reports the coefficient table (Estimate, Std. Error, t value, and Pr(>|t|) for the intercept and for Waist; the p-value for Waist is < 2e-16), the residual standard error on 78 degrees of freedom, the multiple and adjusted R-squared, and the F-statistic on 1 and 78 DF with p-value < 2.2e-16.

Coefficient of Determination ($R^2$): This index measures the proportion of the variability in the dependent variable (y) that can be explained by the regression line. Here, about 82.51% of the variability in weight can be explained by the regression line involving waist size.

Testing $H_0: \beta_1 = 0$ vs. $H_1: \beta_1 \ne 0$: Since the p-value is extremely small (< 0.05), we reject the null hypothesis and conclude that waist has a significant effect on weight.

Model Assumptions

Model: $Y_i = (\beta_0 + \beta_1 x_i) + \varepsilon_i$, where
- the $\varepsilon_i$ are uncorrelated, with mean 0 and constant variance $\sigma_\varepsilon^2$,
- the $\varepsilon_i$ are normally distributed. (This is needed in the test for the slope.)

[Figure: the underlying (green) line $Y = \beta_0 + \beta_1 x$ at $x_1$, showing the observed point, the expected point, the predicted point, the error term $\varepsilon_1$, and the residual $e_1$.]

Since the underlying (green) line is unknown to us, we cannot calculate the values of the error terms ($\varepsilon_i$). The best we can do is study the residuals ($e_i$).
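As a rough check on how the coefficient of determination relates to the quantities above, the following R sketch computes it by hand on simulated data and compares it with what summary() reports. The data are generated with made-up parameter values that merely resemble a waist and weight relationship; they are not the course data.

```r
set.seed(2)                                # arbitrary seed
x <- runif(40, 60, 120)                    # hypothetical waist-like values
y <- -40 + 2.4 * x + rnorm(40, sd = 15)    # hypothetical weight-like values
fit <- lm(y ~ x)

sse  <- sum(resid(fit)^2)                  # variation left unexplained by the line
ssto <- sum((y - mean(y))^2)               # total variation in y
r2_manual <- 1 - sse / ssto                # proportion of variation explained

s <- summary(fit)
r2_builtin <- s$r.squared
slope_p <- s$coefficients["x", "Pr(>|t|)"] # p-value for H0: beta1 = 0

print(c(manual = r2_manual, builtin = r2_builtin, r_squared = cor(x, y)^2))
```

In simple linear regression the coefficient of determination also equals the square of the Pearson correlation, which is why the third printed value matches the first two.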
Estimating the Variance of the Error Terms

The unbiased estimator for $\sigma_\varepsilon^2$ is the mean squared error, $MSE = SSE/(n-2)$, with $n = 80$ here:

    sse = sum(result$residuals^2)     # sum of squared residuals
    mse = sse / (80 - 2)              # estimate of the error variance
    sigma.hat = sqrt(mse)             # residual standard error
    anova(result)

anova(result) prints the analysis of variance table for the response Weight, with columns Df, Sum Sq, Mean Sq, F value, and Pr(>F) and rows for Waist, Residuals, and Total; the p-value for Waist is < 2.2e-16. summary(result) reports the same F-statistic, on 1 and 78 DF, with p-value < 2.2e-16.

$R^2 = SSR/SSTO$, where $SSTO = SSE + SSR$.

[Figure: the point $P_i(x_i, y_i)$ and the line $Y = \beta_0 + \beta_1 x$.]

Since the p-value is less than 0.05, we conclude that the regression model accounts for a significant amount of the variability in weight.

Things that affect the slope estimate

Watch the regression podcast by Dr. Will posted on our course webpage.

Three things that affect the slope estimate:
1. Sample size (n).
2. Variability of the error terms ($\sigma_\varepsilon^2$).
3. Spread of the independent variable.

Testing $H_0: \beta_1 = 0$ vs. $H_1: \beta_1 \ne 0$:

    SS = function(x, y) { sum((x - mean(x)) * (y - mean(y))) }
    SSxy = SS(Waist, Weight)
    SSxx = SS(Waist, Waist)
    SSyy = SS(Weight, Weight)                  # = SSTO
    Beta1.hat = SSxy / SSxx                    # slope estimate
    MSE = anova(result)$"Mean Sq"[2]           # mean squared error
    SE.beta1 = sqrt(MSE / SSxx)                # standard error of the slope
    t.obs = (Beta1.hat - 0) / SE.beta1         # observed t statistic
    p.value = 2 * (1 - pt(19.18, df = 78))     # virtually 0

The smaller $\sigma_\varepsilon$ is, the smaller the standard error of the slope estimate. As n increases, the standard error of the slope estimate decreases.
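The three factors listed above can be illustrated with a short R sketch based on the standard error formula sqrt(MSE/SSxx), with the true error standard deviation used in place of its estimate. The function slope_se() and all design values are hypothetical and chosen only for illustration.

```r
# Theoretical standard error of the slope: sigma / sqrt(SSxx)
slope_se <- function(x, sigma) {
  sigma / sqrt(sum((x - mean(x))^2))
}

x_base   <- seq(70, 110, length.out = 20)   # baseline design
x_more_n <- seq(70, 110, length.out = 80)   # larger sample, same range
x_spread <- seq(50, 130, length.out = 20)   # same sample size, wider spread

se_base     <- slope_se(x_base,   sigma = 15)
se_more_n   <- slope_se(x_more_n, sigma = 15)  # effect of a larger n
se_less_var <- slope_se(x_base,   sigma = 5)   # effect of smaller error variability
se_spread   <- slope_se(x_spread, sigma = 15)  # effect of more spread in x

print(c(base = se_base, more_n = se_more_n,
        less_var = se_less_var, spread = se_spread))
```

Each of the three changes lowers the standard error relative to the baseline, which in turn makes the slope estimate more precise.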
Effect of Outliers on the Slope Estimate

Three types of outliers:
1. Outlier in the x direction. This type of outlier is said to be a high leverage point.
2. Outlier in the y direction.
3. Outlier in both the x and y directions. This point is said to be a high influence point.

[Figure: the effect of a high influence point.]

[Figure: the effect of a point with an outlying y value.]

Confidence Intervals

The $(1-\alpha)100\%$ C.I. for $\beta_1$ is $\hat{\beta}_1 \pm t_{1-\alpha/2,\,n-2}\; SE(\hat{\beta}_1)$.

Hence, the 90% C.I. for $\beta_1$ for our example is

    Lower = Beta1.hat - qt(0.95, df = 78) * SE.beta1
    Upper = Beta1.hat + qt(0.95, df = 78) * SE.beta1
    confint(result, level = 0.90)     # columns "5 %" and "95 %" for (Intercept) and Waist

Estimating the mean response ($\mu_y$) at a specified value of x:

    predict(result, newdata = data.frame(Waist = c(80, 90)))

Confidence interval for the mean response ($\mu_y$) at a specified value of x:

    predict(result, newdata = data.frame(Waist = c(80, 90)), interval = "confidence")   # columns fit, lwr, upr
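As a sketch of how the interval for the slope is assembled, the R code below builds a 90% confidence interval by hand and compares it with confint(). The data are simulated with made-up parameter values and a sample size of 80 so that the degrees of freedom match the example; they are not the course data.

```r
set.seed(3)                                 # arbitrary seed
x <- runif(80, 60, 120)                     # hypothetical predictor values
y <- -40 + 2.4 * x + rnorm(80, sd = 15)     # hypothetical responses
fit <- lm(y ~ x)

b1    <- coef(fit)["x"]                                  # slope estimate
se_b1 <- summary(fit)$coefficients["x", "Std. Error"]    # its standard error
df    <- df.residual(fit)                                # n - 2

# 90% interval: estimate -/+ (t quantile) * (standard error)
t_mult <- qt(0.95, df = df)
ci_manual  <- unname(c(b1 - t_mult * se_b1, b1 + t_mult * se_b1))
ci_builtin <- unname(confint(fit, "x", level = 0.90)[1, ])

print(rbind(manual = ci_manual, builtin = ci_builtin))
```

The quantile is taken at 0.95 rather than 0.90 because a 90% two-sided interval leaves 5% in each tail.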
Prediction Intervals

Predicting the value of the response variable at a specified value of x:

    predict(result, newdata = data.frame(Waist = c(80, 90)))

Prediction interval for a new response value ($y_{n+1}$) at a specified value of x:

    predict(result, newdata = data.frame(Waist = c(80, 90)), interval = "prediction")               # columns fit, lwr, upr
    predict(result, newdata = data.frame(Waist = c(80, 90)), interval = "prediction", level = 0.99)  # columns fit, lwr, upr

Note that the only difference between the prediction interval and the confidence interval for the mean response is the addition of 1 inside the square root. This makes the prediction intervals wider than the confidence intervals for the mean response.

Confidence and Prediction Bands

Working-Hotelling $(1-\alpha)100\%$ confidence band: $\hat{y} \pm W \cdot SE(\hat{\mu}_y)$, where $W = \sqrt{2\,F_{1-\alpha;\,2,\,n-2}}$.

    result = lm(Weight ~ Waist)
    CI = predict(result, se.fit = TRUE)      # CI$se.fit is the standard error of the estimated mean
    W = sqrt(2 * qf(0.95, 2, 78))            # Working-Hotelling multiplier
    band.lower = CI$fit - W * CI$se.fit
    band.upper = CI$fit + W * CI$se.fit
    plot(Waist, Weight, xlab = "Waist", ylab = "Weight", main = "Confidence Band")
    abline(result)
    ord = order(Waist)                       # draw the bands from left to right
    points(Waist[ord], band.lower[ord], type = "l", lwd = 2, lty = 2, col = "blue")
    points(Waist[ord], band.upper[ord], type = "l", lwd = 2, lty = 2, col = "blue")

The $(1-\alpha)100\%$ prediction band: $\hat{y} \pm W \sqrt{SE(\hat{\mu}_y)^2 + MSE}$.

    mse = anova(result)$"Mean Sq"[2]
    se.pred = sqrt(CI$se.fit^2 + mse)        # standard error for a new observation
    band.lower.pred = CI$fit - W * se.pred
    band.upper.pred = CI$fit + W * se.pred
    points(Waist[ord], band.lower.pred[ord], type = "l", lwd = 2, lty = 2, col = "red")
    points(Waist[ord], band.upper.pred[ord], type = "l", lwd = 2, lty = 2, col = "red")
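The "addition of 1 inside the square root" can be seen by building both intervals by hand, as in the R sketch below. The standard error expressions are the usual textbook formulas for simple linear regression, and the data are simulated with made-up parameter values; the point x0 = 90 is an arbitrary choice.

```r
set.seed(4)                                       # arbitrary seed
Waist  <- runif(80, 60, 120)                      # hypothetical data
Weight <- -40 + 2.4 * Waist + rnorm(80, sd = 15)
fit <- lm(Weight ~ Waist)

x0    <- 90                                       # value of x of interest
n     <- length(Waist)
mse   <- sum(resid(fit)^2) / (n - 2)
ssxx  <- sum((Waist - mean(Waist))^2)
y0hat <- unname(coef(fit)[1] + coef(fit)[2] * x0) # fitted value at x0
tq    <- qt(0.975, df = n - 2)                    # 95% two-sided t quantile

# Standard errors: note the extra "1 +" for a new observation
se_mean <- sqrt(mse * (1 / n + (x0 - mean(Waist))^2 / ssxx))
se_pred <- sqrt(mse * (1 + 1 / n + (x0 - mean(Waist))^2 / ssxx))

ci_manual <- c(y0hat - tq * se_mean, y0hat + tq * se_mean)
pi_manual <- c(y0hat - tq * se_pred, y0hat + tq * se_pred)

new <- data.frame(Waist = x0)
ci_builtin <- predict(fit, newdata = new, interval = "confidence")
pi_builtin <- predict(fit, newdata = new, interval = "prediction")
print(rbind(ci_manual, pi_manual))
```

The hand-built intervals match predict() at its default 95% level, and the prediction interval is always the wider of the two.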
Tests for Correlations

Testing $H_0: \rho = 0$ vs. $H_1: \rho \ne 0$:

    cor(Waist, Weight)                          # computes the Pearson correlation coefficient, r
    cor.test(Waist, Weight, conf.level = 0.99)  # tests H0: rho = 0 and also constructs a C.I. for rho

The output (Pearson's product-moment correlation, data: Waist and Weight) reports the t statistic with df = 78 and p-value < 2.2e-16, the alternative hypothesis that the true correlation is not equal to 0, and the 99 percent confidence interval. Note that the results are exactly the same as what we got when testing $H_0: \beta_1 = 0$ vs. $H_1: \beta_1 \ne 0$.

Testing $H_0: \rho = 0$ vs. $H_1: \rho \ne 0$ using the (nonparametric) Spearman's method:

    cor.test(Waist, Weight, method = "spearman")   # test of independence using Spearman's rank correlation rho

The output (Spearman's rank correlation rho, data: Waist and Weight) reports S = 8532 and p-value < 2.2e-16, the alternative hypothesis that the true rho is not equal to 0, and the sample estimate of rho.

Model Diagnostics

Model: $Y_i = (\beta_0 + \beta_1 x_i) + \varepsilon_i$, where
- the $\varepsilon_i$ are uncorrelated, with mean 0 and constant variance $\sigma_\varepsilon^2$,
- the $\varepsilon_i$ are normally distributed. (This is needed in the test for the slope.)

Assessing uncorrelatedness of the error terms:

    plot(result$residuals, type = 'b')

Assessing normality:

    qqnorm(result$residuals); qqline(result$residuals)
    shapiro.test(result$residuals)              # reports W and a p-value

Assessing constant variance:

    plot(result$fitted.values, result$residuals)
    # levene.test() is not in base R; the lawstat package provides one (assumed here)
    library(lawstat)
    levene.test(result$residuals, Waist)        # reports a test statistic and a p-value
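The remark that the correlation test and the slope test give exactly the same results can be checked numerically. The R sketch below uses simulated data with made-up parameter values and base R functions only; the closed-form expression for t in terms of r is the standard one for the Pearson test.

```r
set.seed(5)                                       # arbitrary seed
Waist  <- runif(80, 60, 120)                      # hypothetical data
Weight <- -40 + 2.4 * Waist + rnorm(80, sd = 15)
fit <- lm(Weight ~ Waist)

# t statistic for H0: beta1 = 0 from the regression
t_slope <- summary(fit)$coefficients["Waist", "t value"]

# t statistic for H0: rho = 0 from the Pearson correlation test
ct    <- cor.test(Waist, Weight)
t_cor <- unname(ct$statistic)

# The same statistic written in terms of r
r   <- cor(Waist, Weight)
t_r <- r * sqrt(80 - 2) / sqrt(1 - r^2)

# A residual-based normality check, as in the diagnostics above
sw <- shapiro.test(resid(fit))

print(c(slope = t_slope, cor_test = t_cor, from_r = t_r))
```

All three t values coincide, and both tests use n - 2 = 78 degrees of freedom.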
More informationIntroduction and Single Predictor Regression. Correlation
Introduction and Single Predictor Regression Dr. J. Kyle Roberts Southern Methodist University Simmons School of Education and Human Development Department of Teaching and Learning Correlation A correlation
More informationRegression Models - Introduction
Regression Models - Introduction In regression models there are two types of variables that are studied: A dependent variable, Y, also called response variable. It is modeled as random. An independent
More informationMultiple Regression Introduction to Statistics Using R (Psychology 9041B)
Multiple Regression Introduction to Statistics Using R (Psychology 9041B) Paul Gribble Winter, 2016 1 Correlation, Regression & Multiple Regression 1.1 Bivariate correlation The Pearson product-moment
More informationANOVA: Analysis of Variation
ANOVA: Analysis of Variation The basic ANOVA situation Two variables: 1 Categorical, 1 Quantitative Main Question: Do the (means of) the quantitative variables depend on which group (given by categorical
More informationMultiple linear regression
Multiple linear regression Course MF 930: Introduction to statistics June 0 Tron Anders Moger Department of biostatistics, IMB University of Oslo Aims for this lecture: Continue where we left off. Repeat
More informationLecture 8: Fitting Data Statistical Computing, Wednesday October 7, 2015
Lecture 8: Fitting Data Statistical Computing, 36-350 Wednesday October 7, 2015 In previous episodes Loading and saving data sets in R format Loading and saving data sets in other structured formats Intro
More informationSimple and Multiple Linear Regression
Sta. 113 Chapter 12 and 13 of Devore March 12, 2010 Table of contents 1 Simple Linear Regression 2 Model Simple Linear Regression A simple linear regression model is given by Y = β 0 + β 1 x + ɛ where
More informationLectures 5 & 6: Hypothesis Testing
Lectures 5 & 6: Hypothesis Testing in which you learn to apply the concept of statistical significance to OLS estimates, learn the concept of t values, how to use them in regression work and come across
More informationSwarthmore Honors Exam 2012: Statistics
Swarthmore Honors Exam 2012: Statistics 1 Swarthmore Honors Exam 2012: Statistics John W. Emerson, Yale University NAME: Instructions: This is a closed-book three-hour exam having six questions. You may
More informationChapter 14 Simple Linear Regression (A)
Chapter 14 Simple Linear Regression (A) 1. Characteristics Managerial decisions often are based on the relationship between two or more variables. can be used to develop an equation showing how the variables
More information9. Linear Regression and Correlation
9. Linear Regression and Correlation Data: y a quantitative response variable x a quantitative explanatory variable (Chap. 8: Recall that both variables were categorical) For example, y = annual income,
More informationUNIVERSITY OF MASSACHUSETTS. Department of Mathematics and Statistics. Basic Exam - Applied Statistics. Tuesday, January 17, 2017
UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Basic Exam - Applied Statistics Tuesday, January 17, 2017 Work all problems 60 points are needed to pass at the Masters Level and 75
More informationStatistiek II. John Nerbonne. March 17, Dept of Information Science incl. important reworkings by Harmut Fitz
Dept of Information Science j.nerbonne@rug.nl incl. important reworkings by Harmut Fitz March 17, 2015 Review: regression compares result on two distinct tests, e.g., geographic and phonetic distance of
More informationSTAT 4385 Topic 03: Simple Linear Regression
STAT 4385 Topic 03: Simple Linear Regression Xiaogang Su, Ph.D. Department of Mathematical Science University of Texas at El Paso xsu@utep.edu Spring, 2017 Outline The Set-Up Exploratory Data Analysis
More informationSimple Linear Regression
Simple Linear Regression ST 370 Regression models are used to study the relationship of a response variable and one or more predictors. The response is also called the dependent variable, and the predictors
More information