Psych 10 / Stats 60, Practice Problem Set 10 (Week 10 Material), Solutions

Part 1: Conceptual ideas about correlation and regression

Tintle: The association would be negative (as distance increases, final exam score decreases).

I would expect a positive relationship, such that as temperature increases, ice cream sales increase too. (You might also hypothesize that if temperatures get high enough, people stay indoors and sales actually decrease, or that the range of temperatures is so restricted that you would not see a strong relationship within this range of summer temperatures.)

The correlation would be +1.00. It is positive because students with above-average scores on the first exam will also be the students with above-average scores on the second exam. The magnitude is 1.00 because we could draw a straight line that goes through every single point, meaning there is a perfect linear relationship (described by a regression line with a positive slope of 1).

The relationship is not linear, and correlation only measures the linear relationship between two variables. The best-fit line through this curve does not capture the data very well.

D (there could still be a relationship, just not a linear relationship).

A (in general, although this won't be the case for every single observation).

a. There is a strong, negative relationship between GDP per capita and infant mortality, but the relationship is not linear. Specifically, infant mortality drops off very steeply as GDP per capita increases from about 0 to 10 (thousand dollars), and then drops off very shallowly as GDP per capita increases beyond that.

b. The regression line is a straight line and the relationship between the two variables is not linear, so the regression line will not give as good a prediction as a line that was allowed to curve. (Additionally, it will systematically under-predict mortality at low GDP, systematically over-predict mortality at middle GDP, and then systematically under-predict mortality or stop making sense at high GDP, since infant mortality cannot be negative. That is not a good quality in a regression line: we want the errors (residuals) to be similarly distributed at any point in our X variable.)

c. There is a strong, negative relationship between log(GDP per capita) and log(infant mortality), and now the relationship looks linear (it follows a straight line).

d. Yes, because the regression line is a straight line, the relationship between the two variables is linear, and the relationship is strong enough that we can make a good prediction for Y (log of infant mortality) based on X (log of GDP per capita).

1. The scatter plots below visualize sets of observations of (X, Y) pairs that have correlation coefficients of approximately r = -.50, r = 0, r = +.50, and r = +.80.

[Figure: four scatter plots of Y against X, labeled Panels A through D.]

Which panel has which correlation? (For additional practice with this, I recommend using the applet found here: …)

Panel A is .80 (the relationship is positive, and we can see it is the strongest of the relationships pictured here). Panel C is 0 (it looks like a circle of points, suggesting no relationship). We can then figure out that Panel B is -.50 and Panel D is +.50, since the relationship in Panel B is negative and the relationship in Panel D is positive. They also look weaker than Panel A but stronger than Panel C, suggesting they correspond to r = ±.50.

2. Tintle: Two researchers want to investigate whether there is a relationship between annual company profit (in dollars) and median annual salary paid by the company (in dollars). Researcher Bart collects data on a random sample of 40 companies, and researcher Lisa collects data on a random sample of 140 companies. After analyzing their respective data sets, each finds a correlation coefficient of r = .60. If they each calculate a p-value corresponding to the probability of finding a correlation coefficient as extreme as or more extreme than .60 (if the population correlation coefficient, ρ, is 0), who will have a smaller p-value and why?

a. Bart
b. Lisa
c. Both will have the same p-value
d. More information is needed to answer this question

The t-statistic increases in magnitude with r and with n, and the p-value shrinks as the t-statistic grows. Since r is the same for both samples but n is larger for Lisa's sample (140 vs. 40), we know that her t-statistic will be larger and her p-value will be smaller. (With ρ = 0, it is less likely to find a correlation coefficient as extreme as or more extreme than r = .60 by random chance in a sample of 140 than in a sample of 40.)

3. How (if at all) will a t-statistic corresponding to a correlation coefficient (r) change with:

a. the sample size (n): t will increase in magnitude as n increases.

b. the correlation coefficient (r): t will increase as r increases (and t will be positive if r is positive, and negative if r is negative).

c. a t-statistic corresponding to the slope of the regression line (b) for the same data: these t-statistics are identical.

d. a t-statistic corresponding to the intercept (a) of the regression line for the data: a t-statistic for the intercept would be testing whether the expected value of y when x = 0 is significantly different from some hypothesized value (usually 0). The intercept doesn't automatically give us information about the slope or the correlation coefficient, so we can't predict how a change in the t-statistic for the intercept would map to a change in the t-statistic for the slope or the correlation coefficient.

4. What's a residual, both in words and using a formula? How is the best fitting regression line selected (what makes a line the best fitting)?

A residual is the difference between an actual value of y that was observed and the predicted value of y from a regression line (ŷ, predicted from the corresponding x), i.e., a residual is (y_i - ŷ_i). The best fitting line is the line that minimizes the sum of the squared residuals when summing over all of the observations in the dataset, Σ_i (y_i - ŷ_i)^2, also called SS_unexplained, SS_error, or SS_residual (analogous to SS_within in an ANOVA).

5. In words, describe how to interpret the intercept and slope of a regression line.

The intercept is the expected value of y (our best guess for y, or the mean of y) when x = 0.
The slope is how much of a change in y we would expect when x increases by 1.

6. In baseball, the correlation between players' batting averages in two consecutive years has been estimated as r = .41. If a player has a batting average that is two standard deviations above the mean in 2016, what is our best guess for how many standard deviations above or below the mean the player's batting average will be in 2017?

We can use ẑ_y = r * z_x. Since the player had a batting average that was two standard deviations above the mean, z_x = +2.00, and thus ẑ_y = .41 * 2 = .82, or .82 standard deviations above the mean. (In other words, we would predict a substantial decrease in the player's relative standing in the league.)

7. If the correlation between schools' performances on a state-wide test in two consecutive years is r = .80, and a school performs one standard deviation below the mean in 2016, what is our best guess for how many standard deviations above or below the mean the school will perform in 2017? How would this change if the correlation was r = .50? What if the correlation was r = 0? And finally, what if the correlation was r = 1?

We can use ẑ_y = r * z_x. Since the school was one standard deviation below the mean, z_x = -1.00, and ẑ_y = .80 * (-1) = -.80, or .80 standard deviations below the mean. If r = .50, this prediction would change to ẑ_y = .50 * (-1) = -.50. If r = 0, this prediction would change to ẑ_y = 0 * (-1) = 0. If r = 1, this prediction would change to ẑ_y = 1 * (-1) = -1. (In other words, unless r = 1, we would expect an increase in the school's relative standing among other schools.)

8. If the correlation between two variables is 0, what is our best guess for the value of y (ŷ) for any observation?

Our best guess is the mean of y (ȳ). This is easiest to see if we think about regression with z-scores: ẑ_y = r * z_x = 0 * z_x = 0, regardless of the value of z_x. As a reminder, a z-score of 0 corresponds to the mean of the observations, so z_y = 0 corresponds to ȳ.

9. What general principle explains the answers to the last three questions and how?

These examples all illustrate regression to the mean. If there is no relationship between two variables, then our best guess for any y is ȳ, regardless of how many standard deviations above or below the mean x is (ẑ_y = 0). If there is a perfect relationship between two variables, then we expect a pair of y and x from the same observation to be the same number of standard deviations above or below their respective means (ẑ_y = z_x; note that this is our best guess, not that we would see this for all observations). And if the correlation coefficient is between 0 and 1, our best guess for y moves closer toward the mean of y (and away from a prediction based on z_x) as the relationship between the two variables gets weaker (ẑ_y = r * z_x).

10. Tintle: For a given data set, a test of association based on a slope is equivalent to a test of association based on a correlation coefficient. Being equivalent means which of the following is true?

a. the confidence intervals for the population correlation and population slope will be the same
b. the observed correlation will be the same as the observed slope of the regression line
c. the p-value will be the same whether you use correlation as the statistic or the slope of the regression line as the statistic
d. all of the above

The confidence intervals are in units of r and units of b, so it would not make sense for them to be equal. The observed correlation is only guaranteed to be the same as the observed slope of the regression line if all x and y have been converted to z-scores (z_x and z_y). When working with real-world units it would be very reasonable to have a slope such that b > 1, but it is impossible for r > 1. That leaves c: the t-statistics for the slope and for the correlation coefficient are identical for the same data, so the p-values are too.
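The answers about sample size and about regression to the mean can be checked numerically. The sketch below is only an illustration: it assumes the standard formula t = r * sqrt(n - 2) / sqrt(1 - r^2) with n - 2 degrees of freedom, which the solutions above rely on but do not write out, and the helper names (t_from_r, p_from_r, predict_z) are made up for this example.

```r
# t-statistic for H0: rho = 0, from a sample correlation r and sample size n.
# Assumption: the standard formula, not stated explicitly in the solutions.
t_from_r <- function(r, n) {
  r * sqrt(n - 2) / sqrt(1 - r^2)
}

# Two-sided p-value from a t distribution with n - 2 degrees of freedom.
p_from_r <- function(r, n) {
  2 * pt(abs(t_from_r(r, n)), df = n - 2, lower.tail = FALSE)
}

r <- 0.60
t_bart <- t_from_r(r, n = 40)    # Bart: 40 companies
t_lisa <- t_from_r(r, n = 140)   # Lisa: 140 companies
p_bart <- p_from_r(r, n = 40)
p_lisa <- p_from_r(r, n = 140)

print(c(t_bart = t_bart, t_lisa = t_lisa))  # same r, larger n, larger t
print(c(p_bart = p_bart, p_lisa = p_lisa))  # larger t, smaller p-value

# Regression to the mean with z-scores: the predicted z_y is r * z_x.
predict_z <- function(r, z_x) r * z_x

print(predict_z(r = 0.41, z_x = 2))                   # batting average question
print(predict_z(r = c(0.80, 0.50, 0, 1), z_x = -1))   # school test question
```

Because r is held fixed at .60, the only thing driving the difference between the two researchers is n, which is the point of the Bart and Lisa question.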

Part 2: Correlation and regression by hand

Calculate the correlation coefficient (r) and the best fitting regression line for the set of data below. To help, x̄ = 3, s_x = 2.74, ȳ = 4, and s_y = 1.58:

X   Y   (x - x̄)        (y - ȳ)        (x - x̄)(y - ȳ)
0   2   0 - 3 = -3     2 - 4 = -2     (-3)(-2) = 6
1   4   1 - 3 = -2     4 - 4 = 0      (-2)(0) = 0
4   5   4 - 3 = 1      5 - 4 = 1      (1)(1) = 1
3   3   3 - 3 = 0      3 - 4 = -1     (0)(-1) = 0
7   6   7 - 3 = 4      6 - 4 = 2      (4)(2) = 8

r = (1 / (n - 1)) * Σ(x - x̄)(y - ȳ) / (s_x * s_y) = (1 / (5 - 1)) * 15 / (2.74 * 1.58) = (1/4) * 15 / (2.74 * 1.58) = .87
(note, there are several formulas that can be used to calculate r; see lecture slides for additional options)

b = r * (s_y / s_x) = .87 * (1.58 / 2.74) = .50
a = ȳ - b * x̄ = 4 - (.50) * (3) = 2.5
The regression line is ŷ = 2.5 + .50 * x.

Calculate the correlation coefficient (r) and the best fitting regression line for the set of data below. Then, calculate the predicted value (ŷ) and the residual for each observed value of Y. To help, x̄ = 2, s_x = 0.816, ȳ = 3, and s_y = 1.63:

X   Y   (x - x̄)        (y - ȳ)        (x - x̄)(y - ȳ)   ŷ               residual (y - ŷ)
1   3   1 - 2 = -1     3 - 3 = 0      (-1)(0) = 0      1 + 1*1 = 2     3 - 2 = 1
3   5   3 - 2 = 1      5 - 3 = 2      (1)(2) = 2       1 + 1*3 = 4     5 - 4 = 1
2   1   2 - 2 = 0      1 - 3 = -2     (0)(-2) = 0      1 + 1*2 = 3     1 - 3 = -2
2   3   2 - 2 = 0      3 - 3 = 0      (0)(0) = 0       1 + 1*2 = 3     3 - 3 = 0

r = (1 / (n - 1)) * Σ(x - x̄)(y - ȳ) / (s_x * s_y) = (1 / (4 - 1)) * 2 / (0.816 * 1.63) = (1/3) * 2 / (0.816 * 1.63) = .50
(note, there are several formulas that can be used to calculate r; see lecture slides for additional options)

b = r * (s_y / s_x) = .50 * (1.63 / 0.816) = 1.00
a = ȳ - b * x̄ = 3 - (1) * (2) = 1
The regression line is ŷ = 1 + 1 * x.
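The two hand calculations above can be cross-checked in R. This is a minimal sketch, not part of the original solutions: the raw X and Y values are reconstructed from the stated means and deviations, so treat them as an assumption, and r_by_hand is a hypothetical helper that mirrors the hand formula.

```r
# First data set (values implied by x-bar = 3, y-bar = 4 and the deviations)
x1 <- c(0, 1, 4, 3, 7)
y1 <- c(2, 4, 5, 3, 6)

# Second data set (values implied by x-bar = 2, y-bar = 3 and the deviations)
x2 <- c(1, 3, 2, 2)
y2 <- c(3, 5, 1, 3)

# Hand formula: sum of cross-products divided by (n - 1) * s_x * s_y
r_by_hand <- function(x, y) {
  sum((x - mean(x)) * (y - mean(y))) / ((length(x) - 1) * sd(x) * sd(y))
}

# Data set 1: r, slope b, intercept a
r1 <- r_by_hand(x1, y1)
b1 <- r1 * sd(y1) / sd(x1)
a1 <- mean(y1) - b1 * mean(x1)
print(c(r = r1, b = b1, a = a1))

# Data set 2: same quantities, plus fitted values and residuals from lm()
r2 <- r_by_hand(x2, y2)
b2 <- r2 * sd(y2) / sd(x2)
a2 <- mean(y2) - b2 * mean(x2)
print(c(r = r2, b = b2, a = a2))

model2 <- lm(y2 ~ x2)
print(fitted(model2))  # predicted values, y-hat
print(resid(model2))   # residuals, y minus y-hat
```

Working from unrounded standard deviations, R may differ from the hand answers in the second or third decimal place; that is rounding, not an error in either calculation.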

Calculate the correlation coefficient (r) and the best fitting regression line for the set of data below. To help, x̄ = 2, s_x = 1.41, ȳ = 3, and s_y = 2.12:

X   Y   (x - x̄)        (y - ȳ)        (x - x̄)(y - ȳ)
1   6   1 - 2 = -1     6 - 3 = 3      (-1)(3) = -3
4   1   4 - 2 = 2      1 - 3 = -2     (2)(-2) = -4
1   4   1 - 2 = -1     4 - 3 = 1      (-1)(1) = -1
1   3   1 - 2 = -1     3 - 3 = 0      (-1)(0) = 0
3   1   3 - 2 = 1      1 - 3 = -2     (1)(-2) = -2

r = (1 / (n - 1)) * Σ(x - x̄)(y - ȳ) / (s_x * s_y) = (1 / (5 - 1)) * (-10) / (1.41 * 2.12) = (1/4) * (-10) / (1.41 * 2.12) = -.84
(note, there are several formulas that can be used to calculate r; see lecture slides for additional options)

b = r * (s_y / s_x) = -.84 * (2.12 / 1.41) = -1.26
a = ȳ - b * x̄ = 3 - (-1.26) * (2) = 5.52
The regression line is ŷ = 5.52 - 1.26 * x.

If you'd like more practice doing this by hand, an option is to generate a small set of data, create vectors x and y in R to get the means and standard deviations, do the rest of the calculations by hand, and then check your answers in R. Something like this:

    x <- c(1, 2, 3, 4, 5)
    y <- c(2, 3, 2, 4, 4)
    mean(x)
    sd(x)
    mean(y)
    sd(y)
    # stop, do calculations by hand
    # then check your work
    cor(x, y)  # get correlation coefficient
    lm(y ~ x)  # get regression coefficients

(Note that this call to lm() is working directly with vectors instead of with columns in a dataframe, and so the syntax does not include a dataframe.)

As an additional note, if you'd like more practice calculating ŷ and residuals, you can check your work with:

    model <- lm(y ~ x)
    fitted(model)  # gives fitted y values, ŷ
    resid(model)   # gives residuals

Part 3: Correlation and regression with R (case studies)

The remaining part of this problem set is in R (see P10_PracticePS_W10_R). This section of the problem set uses R to quickly generate correlation coefficients and regression equations from larger data sets, but primarily involves practice using this information to perform additional calculations and answer conceptual questions.
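As a bridge between the vector-based checks in Part 2 and the dataframe-based work in Part 3, here is a small sketch of the same lm() call using a dataframe. It reuses the third hand-calculation data set, with raw values reconstructed from the stated means and deviations (an assumption), and it also illustrates the Part 1 claim that the t-statistic for the slope and the t-statistic for the correlation coefficient are identical. The object names are chosen for this example only.

```r
# Third hand-calculation data set; raw values are reconstructed from the
# stated means and deviations, so treat them as an assumption.
d <- data.frame(
  x = c(1, 4, 1, 1, 3),
  y = c(6, 1, 4, 3, 1)
)

# With a dataframe, lm() takes column names in the formula plus data = ...
model <- lm(y ~ x, data = d)
print(coef(model))  # intercept and slope

# Test of H0: rho = 0 (Pearson correlation) and test of H0: slope = 0
r_test <- cor.test(d$x, d$y)
slope_row <- summary(model)$coefficients["x", ]

t_from_cor <- unname(r_test$statistic)
t_from_slope <- unname(slope_row["t value"])
p_from_cor <- r_test$p.value
p_from_slope <- unname(slope_row["Pr(>|t|)"])

print(c(t_from_cor = t_from_cor, t_from_slope = t_from_slope))  # identical
print(c(p_from_cor = p_from_cor, p_from_slope = p_from_slope))  # identical
```

R works from the unrounded values and reports a slope of -1.25 and an intercept of 5.5 for these data; the hand answers of -1.26 and 5.52 differ only because r was rounded to -.84 before computing b.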


Basic Business Statistics 6 th Edition Basic Business Statistics 6 th Edition Chapter 12 Simple Linear Regression Learning Objectives In this chapter, you learn: How to use regression analysis to predict the value of a dependent variable based

More information

Unit #2: Linear and Exponential Functions Lesson #13: Linear & Exponential Regression, Correlation, & Causation. Day #1

Unit #2: Linear and Exponential Functions Lesson #13: Linear & Exponential Regression, Correlation, & Causation. Day #1 Algebra I Name Unit #2: Linear and Exponential Functions Lesson #13: Linear & Exponential Regression, Correlation, & Causation Day #1 Period Date When a table of values increases or decreases by the same

More information

Chapter 8. Linear Regression. Copyright 2010 Pearson Education, Inc.

Chapter 8. Linear Regression. Copyright 2010 Pearson Education, Inc. Chapter 8 Linear Regression Copyright 2010 Pearson Education, Inc. Fat Versus Protein: An Example The following is a scatterplot of total fat versus protein for 30 items on the Burger King menu: Copyright

More information

Statistics 100 Exam 2 March 8, 2017

Statistics 100 Exam 2 March 8, 2017 STAT 100 EXAM 2 Spring 2017 (This page is worth 1 point. Graded on writing your name and net id clearly and circling section.) PRINT NAME (Last name) (First name) net ID CIRCLE SECTION please! L1 (MWF

More information

Chapter 4 Describing the Relation between Two Variables

Chapter 4 Describing the Relation between Two Variables Chapter 4 Describing the Relation between Two Variables 4.1 Scatter Diagrams and Correlation The is the variable whose value can be explained by the value of the or. A is a graph that shows the relationship

More information

Lecture 30. DATA 8 Summer Regression Inference

Lecture 30. DATA 8 Summer Regression Inference DATA 8 Summer 2018 Lecture 30 Regression Inference Slides created by John DeNero (denero@berkeley.edu) and Ani Adhikari (adhikari@berkeley.edu) Contributions by Fahad Kamran (fhdkmrn@berkeley.edu) and

More information

Chapter Learning Objectives. Regression Analysis. Correlation. Simple Linear Regression. Chapter 12. Simple Linear Regression

Chapter Learning Objectives. Regression Analysis. Correlation. Simple Linear Regression. Chapter 12. Simple Linear Regression Chapter 12 12-1 North Seattle Community College BUS21 Business Statistics Chapter 12 Learning Objectives In this chapter, you learn:! How to use regression analysis to predict the value of a dependent

More information

Regression Analysis IV... More MLR and Model Building

Regression Analysis IV... More MLR and Model Building Regression Analysis IV... More MLR and Model Building This session finishes up presenting the formal methods of inference based on the MLR model and then begins discussion of "model building" (use of regression

More information

y n 1 ( x i x )( y y i n 1 i y 2

y n 1 ( x i x )( y y i n 1 i y 2 STP3 Brief Class Notes Instructor: Ela Jackiewicz Chapter Regression and Correlation In this chapter we will explore the relationship between two quantitative variables, X an Y. We will consider n ordered

More information

AMS 315/576 Lecture Notes. Chapter 11. Simple Linear Regression

AMS 315/576 Lecture Notes. Chapter 11. Simple Linear Regression AMS 315/576 Lecture Notes Chapter 11. Simple Linear Regression 11.1 Motivation A restaurant opening on a reservations-only basis would like to use the number of advance reservations x to predict the number

More information

MATH 1070 Introductory Statistics Lecture notes Relationships: Correlation and Simple Regression

MATH 1070 Introductory Statistics Lecture notes Relationships: Correlation and Simple Regression MATH 1070 Introductory Statistics Lecture notes Relationships: Correlation and Simple Regression Objectives: 1. Learn the concepts of independent and dependent variables 2. Learn the concept of a scatterplot

More information

df=degrees of freedom = n - 1

df=degrees of freedom = n - 1 One sample t-test test of the mean Assumptions: Independent, random samples Approximately normal distribution (from intro class: σ is unknown, need to calculate and use s (sample standard deviation)) Hypotheses:

More information

Business Statistics. Lecture 10: Correlation and Linear Regression

Business Statistics. Lecture 10: Correlation and Linear Regression Business Statistics Lecture 10: Correlation and Linear Regression Scatterplot A scatterplot shows the relationship between two quantitative variables measured on the same individuals. It displays the Form

More information

SLR output RLS. Refer to slr (code) on the Lecture Page of the class website.

SLR output RLS. Refer to slr (code) on the Lecture Page of the class website. SLR output RLS Refer to slr (code) on the Lecture Page of the class website. Old Faithful at Yellowstone National Park, WY: Simple Linear Regression (SLR) Analysis SLR analysis explores the linear association

More information

Chapter 5 Least Squares Regression

Chapter 5 Least Squares Regression Chapter 5 Least Squares Regression A Royal Bengal tiger wandered out of a reserve forest. We tranquilized him and want to take him back to the forest. We need an idea of his weight, but have no scale!

More information

28. SIMPLE LINEAR REGRESSION III

28. SIMPLE LINEAR REGRESSION III 28. SIMPLE LINEAR REGRESSION III Fitted Values and Residuals To each observed x i, there corresponds a y-value on the fitted line, y = βˆ + βˆ x. The are called fitted values. ŷ i They are the values of

More information

22 Approximations - the method of least squares (1)

22 Approximations - the method of least squares (1) 22 Approximations - the method of least squares () Suppose that for some y, the equation Ax = y has no solutions It may happpen that this is an important problem and we can t just forget about it If we

More information

Simple Linear Regression

Simple Linear Regression 9-1 l Chapter 9 l Simple Linear Regression 9.1 Simple Linear Regression 9.2 Scatter Diagram 9.3 Graphical Method for Determining Regression 9.4 Least Square Method 9.5 Correlation Coefficient and Coefficient

More information

The Simple Linear Regression Model

The Simple Linear Regression Model The Simple Linear Regression Model Lesson 3 Ryan Safner 1 1 Department of Economics Hood College ECON 480 - Econometrics Fall 2017 Ryan Safner (Hood College) ECON 480 - Lesson 3 Fall 2017 1 / 77 Bivariate

More information

Regression Analysis. BUS 735: Business Decision Making and Research

Regression Analysis. BUS 735: Business Decision Making and Research Regression Analysis BUS 735: Business Decision Making and Research 1 Goals and Agenda Goals of this section Specific goals Learn how to detect relationships between ordinal and categorical variables. Learn

More information

9 Correlation and Regression

9 Correlation and Regression 9 Correlation and Regression SW, Chapter 12. Suppose we select n = 10 persons from the population of college seniors who plan to take the MCAT exam. Each takes the test, is coached, and then retakes the

More information

20. Ignore the common effect question (the first one). Makes little sense in the context of this question.

20. Ignore the common effect question (the first one). Makes little sense in the context of this question. Errors & Omissions Free Response: 8. Change points to (0, 0), (1, 2), (2, 1) 14. Change place to case. 17. Delete the word other. 20. Ignore the common effect question (the first one). Makes little sense

More information

SECTION I Number of Questions 42 Percent of Total Grade 50

SECTION I Number of Questions 42 Percent of Total Grade 50 AP Stats Chap 7-9 Practice Test Name Pd SECTION I Number of Questions 42 Percent of Total Grade 50 Directions: Solve each of the following problems, using the available space (or extra paper) for scratchwork.

More information

Statistics for Managers using Microsoft Excel 6 th Edition

Statistics for Managers using Microsoft Excel 6 th Edition Statistics for Managers using Microsoft Excel 6 th Edition Chapter 13 Simple Linear Regression 13-1 Learning Objectives In this chapter, you learn: How to use regression analysis to predict the value of

More information

Approximations - the method of least squares (1)

Approximations - the method of least squares (1) Approximations - the method of least squares () In many applications, we have to consider the following problem: Suppose that for some y, the equation Ax = y has no solutions It could be that this is an

More information

Inference for Regression

Inference for Regression Inference for Regression Section 9.4 Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c Department of Mathematics University of Houston Lecture 13b - 3339 Cathy Poliak, Ph.D. cathy@math.uh.edu

More information

Mrs. Poyner/Mr. Page Chapter 3 page 1

Mrs. Poyner/Mr. Page Chapter 3 page 1 Name: Date: Period: Chapter 2: Take Home TEST Bivariate Data Part 1: Multiple Choice. (2.5 points each) Hand write the letter corresponding to the best answer in space provided on page 6. 1. In a statistics

More information

Lectures on Simple Linear Regression Stat 431, Summer 2012

Lectures on Simple Linear Regression Stat 431, Summer 2012 Lectures on Simple Linear Regression Stat 43, Summer 0 Hyunseung Kang July 6-8, 0 Last Updated: July 8, 0 :59PM Introduction Previously, we have been investigating various properties of the population

More information

This gives us an upper and lower bound that capture our population mean.

This gives us an upper and lower bound that capture our population mean. Confidence Intervals Critical Values Practice Problems 1 Estimation 1.1 Confidence Intervals Definition 1.1 Margin of error. The margin of error of a distribution is the amount of error we predict when

More information

Association Between Variables Measured at the Interval-Ratio Level: Bivariate Correlation and Regression

Association Between Variables Measured at the Interval-Ratio Level: Bivariate Correlation and Regression Association Between Variables Measured at the Interval-Ratio Level: Bivariate Correlation and Regression Last couple of classes: Measures of Association: Phi, Cramer s V and Lambda (nominal level of measurement)

More information

3.2: Least Squares Regressions

3.2: Least Squares Regressions 3.2: Least Squares Regressions Section 3.2 Least-Squares Regression After this section, you should be able to INTERPRET a regression line CALCULATE the equation of the least-squares regression line CALCULATE

More information

Business Statistics. Lecture 9: Simple Regression

Business Statistics. Lecture 9: Simple Regression Business Statistics Lecture 9: Simple Regression 1 On to Model Building! Up to now, class was about descriptive and inferential statistics Numerical and graphical summaries of data Confidence intervals

More information

Topics on Statistics 2

Topics on Statistics 2 Topics on Statistics 2 Pejman Mahboubi March 7, 2018 1 Regression vs Anova In Anova groups are the predictors. When plotting, we can put the groups on the x axis in any order we wish, say in increasing

More information

Year 10 Mathematics Semester 2 Bivariate Data Chapter 13

Year 10 Mathematics Semester 2 Bivariate Data Chapter 13 Year 10 Mathematics Semester 2 Bivariate Data Chapter 13 Why learn this? Observations of two or more variables are often recorded, for example, the heights and weights of individuals. Studying the data

More information

Test 3A AP Statistics Name:

Test 3A AP Statistics Name: Test 3A AP Statistics Name: Part 1: Multiple Choice. Circle the letter corresponding to the best answer. 1. Other things being equal, larger automobile engines consume more fuel. You are planning an experiment

More information

Business Mathematics and Statistics (MATH0203) Chapter 1: Correlation & Regression

Business Mathematics and Statistics (MATH0203) Chapter 1: Correlation & Regression Business Mathematics and Statistics (MATH0203) Chapter 1: Correlation & Regression Dependent and independent variables The independent variable (x) is the one that is chosen freely or occur naturally.

More information

LECTURE 15: SIMPLE LINEAR REGRESSION I

LECTURE 15: SIMPLE LINEAR REGRESSION I David Youngberg BSAD 20 Montgomery College LECTURE 5: SIMPLE LINEAR REGRESSION I I. From Correlation to Regression a. Recall last class when we discussed two basic types of correlation (positive and negative).

More information

Quantitative Bivariate Data

Quantitative Bivariate Data Statistics 211 (L02) - Linear Regression Quantitative Bivariate Data Consider two quantitative variables, defined in the following way: X i - the observed value of Variable X from subject i, i = 1, 2,,

More information

1 A Review of Correlation and Regression

1 A Review of Correlation and Regression 1 A Review of Correlation and Regression SW, Chapter 12 Suppose we select n = 10 persons from the population of college seniors who plan to take the MCAT exam. Each takes the test, is coached, and then

More information

: The model hypothesizes a relationship between the variables. The simplest probabilistic model: or.

: The model hypothesizes a relationship between the variables. The simplest probabilistic model: or. Chapter Simple Linear Regression : comparing means across groups : presenting relationships among numeric variables. Probabilistic Model : The model hypothesizes an relationship between the variables.

More information

5. Linear Regression

5. Linear Regression 5. Linear Regression Outline.................................................................... 2 Simple linear regression 3 Linear model............................................................. 4

More information