Simple Linear Regression: Introduction to Diagnostics
- Carmella Edwards
- 5 years ago
Anscombe Data Frame (1973)

A data frame proposed by Anscombe in 1973, composed of 4 sets of 11 (x, y) points. All four sets give the same estimates of the intercept and the slope of the least-squares line, and the goodness of fit measured by the coefficient of determination ($R^2$) takes the same common value in the 4 sets. The data sets are nevertheless clearly different: a simple regression line is well suited to data set A, but it is not suitable for sets B, C and D.

Observations with outlying residuals (ordinary residuals or, more conveniently, studentized residuals, rstudent(model)), high-leverage observations (hatvalues(model)) and influential observations, identified by atypical values of Cook's distance (cooks.distance(model)), are diagnostic indicators that are easily understood in simple regression and are very useful in the diagnosis of general multiple regression models estimated by least squares.

    XA   YA      XB   YB      XC   YC      XD   YD
    10   8.04    10   9.14    10   7.46     8   6.58
     8   6.95     8   8.14     8   6.77     8   5.76
    13   7.58    13   8.74    13  12.74     8   7.71
     9   8.81     9   8.77     9   7.11     8   8.84
    11   8.33    11   9.26    11   7.81     8   8.47
    14   9.96    14   8.10    14   8.84     8   7.04
     6   7.24     6   6.13     6   6.08     8   5.25
     4   4.26     4   3.10     4   5.39    19  12.50
    12  10.84    12   9.13    12   8.15     8   5.56
     7   4.82     7   7.26     7   6.42     8   7.91
     5   5.68     5   4.74     5   5.73     8   6.89

For every data set A to D, plot and discuss the results for the following items:

1. Simple linear regression of Y vs. X. Write the equation. Plot the studentized residuals, the leverages and Cook's distance (cooks.distance(model)).
2. Identify any pattern in the residuals.
3. Check for outliers in the residuals (boxplot or statistical distribution).
4. Check for observations with high leverage.
5. Check for influential observations.
6. Copy the scatterplots of the (x, y) data and of residuals vs. fitted values. Draw the estimated regression line.
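The claim that the four sets share the same fitted line and the same $R^2$ can be checked directly. The following is a minimal sketch, assuming that R's built-in `anscombe` data frame (columns `x1..x4`, `y1..y4`) corresponds to sets A to D of this handout; the helper `fit_one` and the object names are illustrative only.

```r
# Sketch using R's built-in `anscombe` data frame (columns x1..x4, y1..y4),
# assumed here to correspond to sets A..D of the handout.
sets <- c(A = 1, B = 2, C = 3, D = 4)

# Hypothetical helper: fit y_k ~ x_k for set number k
fit_one <- function(k) {
  d <- data.frame(x = anscombe[[paste0("x", k)]],
                  y = anscombe[[paste0("y", k)]])
  lm(y ~ x, data = d)
}
fits <- lapply(sets, fit_one)

# One row per set: intercept, slope and R^2
comparison <- t(sapply(fits, function(m) {
  c(intercept = unname(coef(m)[1]),
    slope     = unname(coef(m)[2]),
    r.squared = summary(m)$r.squared)
}))
print(round(comparison, 2))

# Scatterplot with the fitted line for each set
op <- par(mfrow = c(2, 2))
for (s in names(fits)) {
  d <- fits[[s]]$model
  plot(d$x, d$y, main = paste("Set", s), xlab = "x", ylab = "y")
  abline(fits[[s]])
}
par(op)
```

The rows of `comparison` agree to about two decimal places, while the four scatterplots look very different, which is the point of the exercise.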
Worksheet for each data set (A to D): plot of Y vs. X and plot of residuals vs. fitted values; record r, $R^2$ and the equation of the fitted line.
Some pieces of the script

For example, for data set C of the Anscombe73 data in the simple regression example, the R script might be:

    # Set C
    cor(anscombe$xc, anscombe$yc)        # linear correlation coefficient
    par(mfrow = c(1, 1))
    plot(anscombe$xc, anscombe$yc)       # scatterplot of the original data
    anscombe.lmc <- lm(yc ~ xc, data = anscombe)
    summary(anscombe.lmc)                # simple linear model: results of the fit

    # Overlay the fitted line on the Y vs. X scatterplot,
    # identifying each observation by its id.
    lines(anscombe$xc, anscombe.lmc$fitted.values)
    text(x = anscombe$xc, y = anscombe$yc, labels = row.names(anscombe), adj = 1)

    par(mfrow = c(2, 2))
    plot(anscombe.lmc)                   # standard diagnostic plots
    par(mfrow = c(1, 1))

    # Compute leverage, Cook's distance and studentized residuals for model C
    # and store them as columns of the anscombe data frame.
    levc  <- hatvalues(anscombe.lmc)
    cooc  <- cooks.distance(anscombe.lmc)
    tresc <- rstudent(anscombe.lmc)
    anscombe <- data.frame(anscombe, levc, cooc, tresc)
    # attributes(anscombe)

    # Standard plots: studentized residuals vs. leverage (h_ii),
    # vs. Cook's distance, and vs. fitted values.
    plot(anscombe$levc, anscombe$tresc)
    text(x = anscombe$levc, y = anscombe$tresc, labels = row.names(anscombe), adj = 1)
    plot(anscombe$cooc, anscombe$tresc)
    text(x = anscombe$cooc, y = anscombe$tresc, labels = row.names(anscombe), adj = 1)
    plot(anscombe.lmc$fitted.values, anscombe$tresc)
    text(x = anscombe.lmc$fitted.values, y = anscombe$tresc,
         labels = row.names(anscombe), adj = 1)
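Beyond plotting, the three indicators can be turned into flags. The sketch below is one possible way to do it, assuming set C is taken from R's built-in `anscombe` data (`x3`, `y3`) rather than the handout's own data frame. The cut-offs (|studentized residual| > 2, leverage > 2p/n, Cook's distance > 4/n) are common rules of thumb, not values stated in this handout, and `diag_tab` is an illustrative name.

```r
# Set C taken from R's built-in `anscombe` data (x3, y3): an assumption,
# the handout's own data frame uses columns xc and yc.
dat <- data.frame(xc = anscombe$x3, yc = anscombe$y3)
fit <- lm(yc ~ xc, data = dat)

n <- nrow(dat)
p <- length(coef(fit))          # number of estimated coefficients (2 here)

diag_tab <- data.frame(
  id       = row.names(dat),
  rstudent = rstudent(fit),
  leverage = hatvalues(fit),
  cook     = cooks.distance(fit)
)

# Rule-of-thumb cut-offs (conventions, not values given in the handout)
diag_tab$outlier       <- abs(diag_tab$rstudent) > 2
diag_tab$high_leverage <- diag_tab$leverage > 2 * p / n
diag_tab$influential   <- diag_tab$cook > 4 / n

# Show only the observations flagged by at least one criterion
flagged <- diag_tab$outlier | diag_tab$high_leverage | diag_tab$influential
print(diag_tab[flagged, ])
```

The thresholds are only screening devices; a flagged observation should still be inspected on the scatterplot before deciding what to do with it.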
Multiple regression:

$Y_i = \beta_1 + \beta_2 X_{2,i} + \dots + \beta_p X_{p,i} + \varepsilon_i$, where the $\varepsilon_i \sim N(0, \sigma^2)$ are independent.

Example: Duncan data on the prestige of professions, or weight vs. height in Davis

Study the correlations between the numeric variables in the workspace. The explanatory variables are income and education. The response variable is prestige, and we have to propose a multiple regression model to explain the prestige of jobs.

Suggested steps

Correlation matrix in R: cor(duncan1, use="pairwise.complete.obs").

Matrix of pairwise scatterplots.

Forward regression from the null model, using the option direction="forward" in step().

    > duncan1.lm0 <- lm(prestige ~ 1, data = duncan1)
    > summary(duncan1.lm0)
    > step(duncan1.lm0, ~ income + education, direction = "forward")

Backward regression from the model with income + education, using the option direction="backward" in step().

    > duncan1.lm2 <- lm(prestige ~ income + education, data = duncan1)
    > summary(duncan1.lm2)
    > step(duncan1.lm2, direction = "backward")

Use step() in R, searching between the null model and the maximal model, with direction "both" (the default).

    > duncan1.lm1 <- lm(prestige ~ income + education, data = duncan1)
    > summary(duncan1.lm1)
    > duncan1.lm <- step(duncan1.lm1, ~ income + education)

A linear correlation between the response variable and an explanatory variable may no longer be significant once some of the other explanatory variables are already included in the model.

A touch on diagnostics: check for outliers in the residuals and for influential data in the selected model. Compute histograms of the studentized residuals (rstudent(model)), the leverages (hatvalues(model)) and Cook's distance (cooks.distance(model)).

1. $R^2$ and global regression test $H_0: \beta_2 = \dots = \beta_p = 0$.
2. Residual analysis: detection of outliers. Scatterplot of studentized residuals vs. Y. Scatterplot of studentized residuals vs. $X_i$.
3. Detection of a priori and a posteriori influential data. Scatterplot of studentized residuals vs. leverage. Scatterplot of studentized residuals vs. Cook's distance.
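The forward and backward searches and the global regression test can be tried end to end on any data frame with one response and two candidate predictors. The sketch below is a hedged illustration on a stand-in: it uses R's built-in `mtcars` data (`mpg` as response, `wt` and `hp` as predictors) because the `duncan1` data frame of this handout is not available here. The object names are illustrative only.

```r
# Stand-in data: built-in `mtcars`, with mpg playing the role of the response
# and wt, hp the roles of the two explanatory variables (assumption: the
# handout's duncan1 data frame is not available here).
null_model <- lm(mpg ~ 1, data = mtcars)
full_model <- lm(mpg ~ wt + hp, data = mtcars)

# Forward selection: start from the null model, upper scope = both predictors
fwd <- step(null_model, scope = ~ wt + hp, direction = "forward", trace = 0)

# Backward elimination: start from the full model
bwd <- step(full_model, direction = "backward", trace = 0)

# Global regression test H0: all slopes are zero, from the overall F statistic
fs <- summary(fwd)$fstatistic
p_global <- pf(fs["value"], fs["numdf"], fs["dendf"], lower.tail = FALSE)

print(formula(fwd))
print(formula(bwd))
print(unname(p_global))
```

With `trace = 0` the step-by-step AIC tables are suppressed; dropping that argument prints the sequence of models visited, which is usually what one wants to discuss in a report.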
Example: weight vs. height in Davis

The Davis data frame has 200 rows and 5 columns. The subjects were men and women engaged in regular exercise. There are some missing data. This data frame contains the following columns:

    sex:      A factor with levels F (female) and M (male).
    weight:   Measured weight in kg.
    height:   Measured height in cm.
    r_weight: Reported weight in kg.
    r_height: Reported height in cm.

Firstly, we examine the relationship between the reported weight and the actual weight in order to assess how the data behave. Pay attention to outliers. Secondly, we focus on the classical relationship between weight (Y) and height (X): does a quadratic fit hold? Why?
Suggested steps

Correlation matrix in R, restricted to the numeric columns because sex is a factor: cor(davis[sapply(davis, is.numeric)], use="pairwise.complete.obs").

Matrix of pairwise scatterplots.

Regression of weight (Y) on r_weight (X). Interpret the regression equation and the quality of the fit.

Regression of weight (Y) on height (X). Interpret the regression equation and the quality of the fit.

Regression of weight (Y) on poly(height,2) (X). Can you interpret the regression equation and the quality of the fit?
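Whether a quadratic term is worth adding can be assessed with a nested-model F test and a look at the residuals. The sketch below illustrates the comparison on a stand-in: R's built-in `women` data (height in inches, weight in pounds), used only because the Davis data frame is not part of base R. Its conclusions do not carry over to the Davis data.

```r
# Stand-in data: built-in `women` (height in inches, weight in pounds),
# used because the Davis data frame is not part of base R.
lin  <- lm(weight ~ height, data = women)
quad <- lm(weight ~ poly(height, 2), data = women)

# Nested-model F test: does the quadratic term improve the fit?
cmp <- anova(lin, quad)
print(cmp)

# Studentized residuals vs. fitted values for both fits; curvature left in
# the linear fit suggests a missing quadratic term
op <- par(mfrow = c(1, 2))
plot(fitted(lin), rstudent(lin), main = "Linear",
     xlab = "Fitted values", ylab = "Studentized residuals")
abline(h = 0, lty = 2)
plot(fitted(quad), rstudent(quad), main = "Quadratic",
     xlab = "Fitted values", ylab = "Studentized residuals")
abline(h = 0, lty = 2)
par(op)
```

By default poly(height, 2) builds orthogonal polynomials, so its coefficients are not the coefficients of height and height squared; poly(height, 2, raw = TRUE) gives the same fitted values with coefficients on the raw powers, which are easier to write as an equation.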