Linear Modelling: Simple Regression
1 Linear Modelling: Simple Regression 10th of May 2018 R. Nicholls / D.-L. Couturier / M. Fernandes
2 Introduction: ANOVA: used for testing hypotheses regarding differences between groups; considers the variation within and between groups. Regression: used for revealing and investigating relationships between input and output variables; model data, and extrapolate as much information as possible.
3 Correlation: How to measure the strength of a linear relationship between variables?
4-12 Correlation: (sequence of scatter plots illustrating linear relationships of varying strength and direction)
13 Correlation: Positively correlated: Negatively correlated: Uncorrelated: (example scatter plots)
14 Correlation: Pearson's product-moment correlation coefficient: r = Σᵢ (xᵢ − x̄)(yᵢ − ȳ) / √( Σᵢ (xᵢ − x̄)² · Σᵢ (yᵢ − ȳ)² ). Coefficient of determination (R² value): for simple linear regression, R² = r².
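As an illustrative sketch (not part of the slides), Pearson's r can be computed directly from this definition and checked against R's cor(), here using the built-in trees data that appears later in the course:

```r
# Pearson's product-moment correlation computed from its definition,
# then checked against the built-in cor().  Uses R's bundled 'trees' data.
x <- trees$Girth
y <- trees$Volume

r_manual <- sum((x - mean(x)) * (y - mean(y))) /
  sqrt(sum((x - mean(x))^2) * sum((y - mean(y))^2))

r_builtin <- cor(x, y)
all.equal(r_manual, r_builtin)   # TRUE

# For simple linear regression, the R^2 value is just r squared:
r_manual^2
```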
15-16 Correlation: (example scatter plots annotated with their r and R² values)
17 Correlation: Can I say whether my data are correlated? Is an observed correlation significant? cor.test() answers this: its output reports the t statistic, the degrees of freedom (df = 48 in these examples), the p-value (p < 2.2e-16 in the first example), the alternative hypothesis (true correlation is not equal to 0), a 95 percent confidence interval, and the sample correlation estimate (cor).
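A minimal sketch of the test behind this output, using R's bundled cars data (which also gives df = 48; the exact numbers on the slide may come from a different data set):

```r
# Significance test for a correlation: H0 is that the true correlation is 0.
ct <- cor.test(cars$speed, cars$dist)

ct$statistic   # t value
ct$parameter   # degrees of freedom (n - 2 = 48 here)
ct$p.value     # small p-value => the correlation is significant
ct$conf.int    # 95 percent confidence interval for the correlation
ct$estimate    # sample correlation, identical to cor(cars$speed, cars$dist)
```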
18 Simple Regression: Aims: to investigate linear correlation between two variables in more detail; to be able to predict the response given knowledge of the independent variable. Terminology: response variable = dependent variable (y); predictor variable = independent variable (x).
21 Simple Regression: For the i-th observation: yᵢ = β₀ + β₁xᵢ + εᵢ, where the εᵢ are the errors (residuals). Predictor variable = independent variable (x).
22 Simple Regression: So how do we fit the regression line? Suppose we know parameter estimates β̂₀ and β̂₁. Observations: yᵢ = β₀ + β₁xᵢ + εᵢ.
23 Simple Regression: Fitted values: ŷᵢ = β̂₀ + β̂₁xᵢ.
24 Simple Regression: Residuals: ε̂ᵢ = yᵢ − ŷᵢ.
25 Simple Regression: Residuals: ε̂ᵢ = yᵢ − ŷᵢ = yᵢ − (β̂₀ + β̂₁xᵢ).
26 Simple Regression: Error model: the εᵢ are assumed independent with εᵢ ~ N(0, σ²).
27 Simple Regression: Under the Gaussian error model, each error has density f(εᵢ; 0, σ) = (1 / (σ√(2π))) exp(−εᵢ² / (2σ²)).
28 Simple Regression: Obtain estimates β̂₀ and β̂₁ by maximising the likelihood of the parameters given the data: L(β₀, β₁, σ) = Πᵢ f(εᵢ; 0, σ) = Πᵢ (1 / (σ√(2π))) exp(−(yᵢ − β₀ − β₁xᵢ)² / (2σ²)).
29 Simple Regression: Taking logs gives the log-likelihood: ln L(β₀, β₁, σ) = −n ln(σ√(2π)) − (1 / (2σ²)) Σᵢ (yᵢ − β₀ − β₁xᵢ)².
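This log-likelihood can be checked numerically against R's logLik() on a fitted model; a sketch I have added (note the ML estimate of σ divides the residual sum of squares by n, not by n − 2):

```r
# Evaluate ln L at the fitted parameters and compare with logLik().
m1  <- lm(Volume ~ Girth, data = trees)
res <- residuals(m1)
n   <- length(res)

s_ml <- sqrt(sum(res^2) / n)   # ML estimate of sigma (divides by n)

# ln L = -n*ln(sigma*sqrt(2*pi)) - sum(residuals^2) / (2*sigma^2)
ll_manual <- -n * log(s_ml * sqrt(2 * pi)) - sum(res^2) / (2 * s_ml^2)

all.equal(ll_manual, as.numeric(logLik(m1)))   # TRUE
```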
30-31 Simple Regression: Maximising this likelihood over β₀ and β₁ amounts to minimising Σᵢ (yᵢ − β₀ − β₁xᵢ)²: the optimal parameters minimise the residual sum of squares, so the Maximum Likelihood and Least Squares estimates are equivalent (for the Gaussian error model).
32 Simple Regression: Minimising the sum of squared residuals gives the final answer: β̂₁ = Σᵢ (xᵢ − x̄)(yᵢ − ȳ) / Σᵢ (xᵢ − x̄)², and β̂₀ = ȳ − β̂₁x̄.
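These closed-form estimates are easy to verify against lm(); a sketch using the trees data:

```r
# Least-squares estimates from the closed-form expressions, checked
# against the estimates that lm() produces.
x <- trees$Girth
y <- trees$Volume

b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)  # slope
b0 <- mean(y) - b1 * mean(x)                                     # intercept

m1 <- lm(y ~ x)
all.equal(unname(coef(m1)), c(b0, b1))   # TRUE
```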
33 Simple Regression: Example: predicting timber volume of felled black cherry trees.
> cor(trees$Volume, trees$Girth)
> m1 = lm(Volume ~ Girth, data = trees)
> summary(m1)
The summary reports the call (lm(formula = Volume ~ Girth, data = trees)), the residual quartiles, and the coefficient table: both the intercept (p at the e-12 level) and Girth (p < 2e-16) are highly significant (*** significance codes). It also gives the residual standard error on 29 degrees of freedom, the multiple and adjusted R-squared, and the F-statistic on 1 and 29 DF with p-value < 2.2e-16. Response: y = Volume; predictor: x = Girth.
35-36 Simple Regression: A histogram of the residuals (frequency vs residual value) looks roughly Gaussian, with σ̂ = 4.252 (σ̂² ≈ 18.1): about 95% of the residuals fall within ±8.5 of zero. Response: y = Volume; predictor: x = Girth.
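Once the model is fitted, prediction (one of the stated aims) is a one-liner; a sketch with hypothetical new girth values of 10 and 15:

```r
# Predict timber volume for new girth values, with 95% prediction
# intervals for individual new trees.
m1 <- lm(Volume ~ Girth, data = trees)
predict(m1,
        newdata  = data.frame(Girth = c(10, 15)),
        interval = "prediction")
```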
37 Linear Regression: Assumptions: 1. Model is linear in parameters (e.g. y = β₀ + β₁x² is still linear in the parameters, even though it is non-linear in x).
45 Linear Regression: Assumptions: 1. Model is linear in parameters. 2. Gaussian error model. 3. Additive error model. 4. Independence of errors: no autocorrelation (one observation must not depend on the previous one). 5. Homoscedasticity: homogeneity / stability of the variance of the residuals.
46 Testing Assumptions: diagnostic plots. 1. Residuals vs Fitted Values: residuals and fitted values should not be related: no visible pattern, mean residual of zero, constant variance. (Plot: Residuals vs Fitted for lm(Volume ~ Girth).)
47 Testing Assumptions: diagnostic plots. 2. Normal Quantile-Quantile plot: a visual test for Normality of the residuals; points should follow the reference line with no strong trends or departures. (Plot: standardized residuals vs theoretical quantiles for lm(Volume ~ Girth).)
48 Testing Assumptions: diagnostic plots. 3. Scale-Location Plot: a test for homoscedasticity; √|standardized residuals| should stay roughly constant against the fitted values, with no trend. (Plot: Scale-Location for lm(Volume ~ Girth).)
49 Testing Assumptions: diagnostic plots. 4. Index Plot of Cook's Distance (Residuals vs Leverage): measures the influence of each particular observation; extreme x-values give high leverage; may inform outlier rejection. (Plot: standardized residuals vs leverage with Cook's distance contours, for lm(Volume ~ Girth).)
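In R all four plots come from plot() applied to the fitted model, and Cook's distances can also be extracted directly; a sketch (the 4/n cut-off is one common rule of thumb, not from the slides):

```r
m1 <- lm(Volume ~ Girth, data = trees)

# plot(m1) cycles through the four diagnostic plots discussed above;
# 'which' selects individual ones, e.g. which = 4 for Cook's distance.
# plot(m1, which = 4)

d <- cooks.distance(m1)
which(d > 4 / length(d))   # observations flagged as potentially influential
```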
50 Modelling Non-Linear Relationships: Linear models can be used to describe non-linear relationships. Applying transformations to response and/or predictor variables can be useful to: linearise the data, i.e. make the relationship between variables more linear; stabilise the variance of the residuals, so that σ² doesn't depend on the independent variable; normalise the distribution of the residuals.
51 Modelling Non-Linear Relationships: Example: stopping distance of cars versus speed (mph). Response: y = dist; predictor: x = speed.
52 Modelling Non-Linear Relationships: Fitting lm(dist ~ speed) directly: scatter plot and Residuals vs Fitted plot, with the R² of the fit.
53 Modelling Non-Linear Relationships: Square-root transformation of the response, lm(sqrt(dist) ~ speed): scatter plot, Residuals vs Fitted plot, and R² compared with the previous fit.
54 Modelling Non-Linear Relationships: Log-log transformation, lm(log(dist) ~ log(speed)): scatter plot of log(dist) vs log(speed), Residuals vs Fitted plot, and R² compared with both previous fits.
55 Modelling Non-Linear Relationships: summary() of the log-log fit: Call: lm(formula = log(dist) ~ log(speed), data = cars). The log(speed) coefficient is highly significant (p at the e-15 level, *** code); residual standard error on 48 degrees of freedom; F-statistic on 1 and 48 DF, p-value: 2.259e-15. The slide also compares the R² values of the three fits.
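The three fits in this example can be reproduced and compared in a few lines (note the R² values are computed on different response scales, so the comparison is only indicative):

```r
# Raw, square-root and log-log fits for the cars data.
m_raw  <- lm(dist ~ speed,           data = cars)
m_sqrt <- lm(sqrt(dist) ~ speed,     data = cars)
m_log  <- lm(log(dist) ~ log(speed), data = cars)

sapply(list(raw    = m_raw,
            sqrt   = m_sqrt,
            loglog = m_log),
       function(m) summary(m)$r.squared)
```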
56 Modelling Non-Linear Relationships: Can you use simple regression to fit this model? It is non-linear, with a multiplicative error model.
58 Modelling Non-Linear Relationships: Can you use simple regression to fit this model? Yes, so long as the error model is log-normal: taking logs turns the multiplicative model into a linear one with additive Gaussian errors. (Plot: dist vs speed.)
59 Modelling Non-Linear Relationships: Back on the original scale, the fitted log-log model corresponds to the power-law curve ŷ = e^(β̂₀) x^(β̂₁), drawn through the dist-vs-speed scatter plot; the summary output is the same log-log fit shown on slide 55. Response: y = distance; predictor: x = speed.
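A sketch of this back-transformation (note that exponentiating the fitted log-values estimates the median, not the mean, of dist under log-normal errors):

```r
# Fitted log-log model back on the original scale:
#   log(dist) = b0 + b1*log(speed)   =>   dist = exp(b0) * speed^b1
m_log <- lm(log(dist) ~ log(speed), data = cars)
b <- coef(m_log)

speed_grid <- seq(min(cars$speed), max(cars$speed), length.out = 100)
dist_hat   <- exp(b[1]) * speed_grid^b[2]

plot(cars$speed, cars$dist, xlab = "speed", ylab = "dist")
lines(speed_grid, dist_hat)   # fitted power-law curve
```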
60 Simple Regression in R: Correlation Coefficients: R functions: plot(x, y), cor(x, y), cor.test(x, y). The cor.test() output reports t, df, the p-value, the alternative hypothesis (true correlation is not equal to 0), a 95 percent confidence interval, and the sample correlation estimate.
61 Simple Regression in R: R functions: plot(x, y); m1 = lm(y ~ x); abline(m1); summary(m1). The summary() output is the trees fit shown earlier: lm(formula = Volume ~ Girth, data = trees), with significant intercept and Girth coefficients, residual standard error on 29 degrees of freedom, R-squared values, and an F-statistic on 1 and 29 DF (p-value < 2.2e-16).
62 Simple Regression in R: R functions: plot(x, y); m1 = lm(y ~ x); abline(m1); summary(m1); r1 = residuals(m1); hist(r1).
63 Simple Regression in R: R functions: plot(x, y); m1 = lm(y ~ x); abline(m1); summary(m1); r1 = residuals(m1); hist(r1); plot(m1). plot(m1) produces the diagnostic plots: Residuals vs Fitted, Normal Q-Q, Scale-Location, and Residuals vs Leverage (with Cook's distance contours).
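Putting the whole workflow from these slides together, a sketch of a complete session on the trees data:

```r
# Complete simple-regression workflow from these slides.
x <- trees$Girth
y <- trees$Volume

plot(x, y)        # inspect the relationship
cor(x, y)         # correlation coefficient
cor.test(x, y)    # is the correlation significant?

m1 <- lm(y ~ x)   # fit the regression line
abline(m1)        # draw it on the scatter plot
summary(m1)       # coefficient estimates, R^2, F-test

r1 <- residuals(m1)
hist(r1)          # residual distribution (should look Gaussian)
plot(m1)          # the four diagnostic plots
```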
More informationR 2 and F -Tests and ANOVA
R 2 and F -Tests and ANOVA December 6, 2018 1 Partition of Sums of Squares The distance from any point y i in a collection of data, to the mean of the data ȳ, is the deviation, written as y i ȳ. Definition.
More informationVariance Decomposition in Regression James M. Murray, Ph.D. University of Wisconsin - La Crosse Updated: October 04, 2017
Variance Decomposition in Regression James M. Murray, Ph.D. University of Wisconsin - La Crosse Updated: October 04, 2017 PDF file location: http://www.murraylax.org/rtutorials/regression_anovatable.pdf
More informationDealing with Heteroskedasticity
Dealing with Heteroskedasticity James H. Steiger Department of Psychology and Human Development Vanderbilt University James H. Steiger (Vanderbilt University) Dealing with Heteroskedasticity 1 / 27 Dealing
More informationLecture 2. Simple linear regression
Lecture 2. Simple linear regression Jesper Rydén Department of Mathematics, Uppsala University jesper@math.uu.se Regression and Analysis of Variance autumn 2014 Overview of lecture Introduction, short
More informationST430 Exam 2 Solutions
ST430 Exam 2 Solutions Date: November 9, 2015 Name: Guideline: You may use one-page (front and back of a standard A4 paper) of notes. No laptop or textbook are permitted but you may use a calculator. Giving
More informationIES 612/STA 4-573/STA Winter 2008 Week 1--IES 612-STA STA doc
IES 612/STA 4-573/STA 4-576 Winter 2008 Week 1--IES 612-STA 4-573-STA 4-576.doc Review Notes: [OL] = Ott & Longnecker Statistical Methods and Data Analysis, 5 th edition. [Handouts based on notes prepared
More informationFinal Exam. Name: Solution:
Final Exam. Name: Instructions. Answer all questions on the exam. Open books, open notes, but no electronic devices. The first 13 problems are worth 5 points each. The rest are worth 1 point each. HW1.
More informationHomework 2: Simple Linear Regression
STAT 4385 Applied Regression Analysis Homework : Simple Linear Regression (Simple Linear Regression) Thirty (n = 30) College graduates who have recently entered the job market. For each student, the CGPA
More informationInferences for Regression
Inferences for Regression An Example: Body Fat and Waist Size Looking at the relationship between % body fat and waist size (in inches). Here is a scatterplot of our data set: Remembering Regression In
More informationBinary Regression. GH Chapter 5, ISL Chapter 4. January 31, 2017
Binary Regression GH Chapter 5, ISL Chapter 4 January 31, 2017 Seedling Survival Tropical rain forests have up to 300 species of trees per hectare, which leads to difficulties when studying processes which
More informationExam 3 Practice Questions Psych , Fall 9
Vocabular Eam 3 Practice Questions Psch 3101-100, Fall 9 Rather than choosing some practice terms at random, I suggest ou go through all the terms in the vocabular lists. The real eam will ask for definitions
More informationChapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression
BSTT523: Kutner et al., Chapter 1 1 Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression Introduction: Functional relation between
More informationCorrelation and Regression
Correlation and Regression Marc H. Mehlman marcmehlman@yahoo.com University of New Haven All models are wrong. Some models are useful. George Box the statistician knows that in nature there never was a
More informationChapter 8: Correlation & Regression
Chapter 8: Correlation & Regression We can think of ANOVA and the two-sample t-test as applicable to situations where there is a response variable which is quantitative, and another variable that indicates
More informationTests of Linear Restrictions
Tests of Linear Restrictions 1. Linear Restricted in Regression Models In this tutorial, we consider tests on general linear restrictions on regression coefficients. In other tutorials, we examine some
More informationRef.: Spring SOS3003 Applied data analysis for social science Lecture note
SOS3003 Applied data analysis for social science Lecture note 05-2010 Erling Berge Department of sociology and political science NTNU Spring 2010 Erling Berge 2010 1 Literature Regression criticism I Hamilton
More informationChapter 11: Linear Regression and Correla4on. Correla4on
Chapter 11: Linear Regression and Correla4on Regression analysis is a sta3s3cal tool that u3lizes the rela3on between two or more quan3ta3ve variables so that one variable can be predicted from the other,
More informationIntroduction to Linear regression analysis. Part 2. Model comparisons
Introduction to Linear regression analysis Part Model comparisons 1 ANOVA for regression Total variation in Y SS Total = Variation explained by regression with X SS Regression + Residual variation SS Residual
More informationCorrelation and simple linear regression S5
Basic medical statistics for clinical and eperimental research Correlation and simple linear regression S5 Katarzyna Jóźwiak k.jozwiak@nki.nl November 15, 2017 1/41 Introduction Eample: Brain size and
More informationBIOSTATS 640 Spring 2018 Unit 2. Regression and Correlation (Part 1 of 2) R Users
BIOSTATS 640 Spring 08 Unit. Regression and Correlation (Part of ) R Users Unit Regression and Correlation of - Practice Problems Solutions R Users. In this exercise, you will gain some practice doing
More informationStat 401B Exam 2 Fall 2015
Stat 401B Exam Fall 015 I have neither given nor received unauthorized assistance on this exam. Name Signed Date Name Printed ATTENTION! Incorrect numerical answers unaccompanied by supporting reasoning
More informationUnit 6 - Introduction to linear regression
Unit 6 - Introduction to linear regression Suggested reading: OpenIntro Statistics, Chapter 7 Suggested exercises: Part 1 - Relationship between two numerical variables: 7.7, 7.9, 7.11, 7.13, 7.15, 7.25,
More informationSTAT 512 MidTerm I (2/21/2013) Spring 2013 INSTRUCTIONS
STAT 512 MidTerm I (2/21/2013) Spring 2013 Name: Key INSTRUCTIONS 1. This exam is open book/open notes. All papers (but no electronic devices except for calculators) are allowed. 2. There are 5 pages in
More informationTopics on Statistics 2
Topics on Statistics 2 Pejman Mahboubi March 7, 2018 1 Regression vs Anova In Anova groups are the predictors. When plotting, we can put the groups on the x axis in any order we wish, say in increasing
More information