y n 1 ( x i x )( y y i n 1 i y 2

Size: px
Start display at page:

Download "y n 1 ( x i x )( y y i n 1 i y 2"

Transcription

1 STP3 Brief Class Notes Instructor: Ela Jackiewicz Chapter Regression and Correlation In this chapter we will explore the relationship between two quantitative variables, X an Y. We will consider n ordered pairs of observations (x,y). Both x and y can be observed (observational study) or y can be observed for specific values of x that are selected by the researcher (experiment). Data must display a linear trend on the scatterplot for us to consider fitting a linear equation that will expres in terms of x. We will use the following vocabulary with respect to x and y: y response variable or dependent variable x predictor variable/explanatory variable or independent variable If our scatterplot indicatexistence of a linear trend (visually we can see our data points clustering closely to some line), we would be interested in expressing quantitatively the strength of that linear relationship. The numerical measure we will use is called Linear Correlation Coefficient and we will denote it's value by r. Linear correlation Coefficient of X and Y, r : r= where s x = x n i x y i y s x = x i x = y s i y y n ( x i x s x )( y y i ) = z n xi z yi (standard deviations of x-s and y-s respectively) n n Linear Correlation coefficient gives information about strength and direction of a linear trend in our data. Values close to or - indicate extremely strong linear association between X and Y. Example. Consider people on the diet program. Let X=# of times per week person eats out, Y= change in weight (in pounds),(before-after) after three months period of dieting. Compute and interpret linear correlation r. (Make sure to plot the data first, scatterplot shows reasonably strong negative linear trend) First we compute means and standard deviations of our x and y values: x i y i x i x (x i x) y i y ( y i y) x=5/5=5 s x= =.35 y=0/5=8 s x= 0 =5.8

2 STP3 Brief Class Notes Instructor: Ela Jackiewicz Now we are ready to compute our z-scores and finally r. x i y i z x i = x i x s x z yi = y i y z xi z yi (3-5)/.35= (0-8)/5.8= r= 3. = 0.5 This value indicates very strong linear relationship between x and y. Properties of r: r is unit-free r, the closer is r to - or the stronger is linear relationship between x and y, r= or - indicates a perfect linear relationship r is symmetric, if X and Y are reversed, value of r remains the same. sign of r matches the sign of the slope of the regression line Note: r=0 does not necessarily mean that there is no relationship between x and y, just that there is no linear relationship. Warning about correlation: strong (positive or negative) correlation does not have to imply causal relationship between x and y. Example: Researchers found that there is a very strong positive correlation between number of storks present an number of babies born. Does it imply that storks cause babies to be born? Of course not, both can be influenced by some other factors. Observational studies can't prove causation. Experiments can prove causation. Inference concerning correlation. r=sample linear correlation coefficient can be regarded as an estimate of a population correlation ρ (Greek letter rho). In order to regard r as an estimate of ρ it must be reasonable to assume that both x and y were selected at random from the population. When x-s are specified by the researcher, r is not an appropriate estimate of ρ. Bivariate Random Sampling Model assumeach pair (x,y) is sampled randomly from the population of all such pairs.

3 STP3 Brief Class Notes Instructor: Ela Jackiewicz Tests of hypotheses involving population correlation coefficient. Our hypotheses are: H 0 : =0 (x and y are not linearly correlated in the population) vs H a :ρ 0 or H a :ρ> 0 or H a :ρ< 0 Test statistics: t s =r n r Assuming null hypothesis is true, t s has a t distribution with df=n- In our example we may ask the question: Assuming that a linear model applies, test the hypothesis that there is no linear relationship between x and y against an alternative that y decreases with increasing x. Use 5% significance level. Our test hypotheses are: H 0 :ρ=0 and H a :ρ< 0 t s =. (5 ) = 3.8 P-value=0.05<0.05 (.83) Null is rejected, we have evidence for alternative. Yes, y decreases with increasing x, there is statistically significant negative linear relationship between x and y. Notes about correlation inference: There is a difference between statistical significance and importance, weak correlation may be statistically significant, but not at all useful or informative For a fixed value of r, test statistics, t s, increases as a sample size increases, if n is large enough, ts will be large and will show correlation to be significant no matter how small r is. CI for r may help determine a practical significance. r ixtremely sensitive to extreme data points, be aware of outliers. We have plotted our data to determine if there appear to be a linear relationship, we found the correlation coefficient and tested to see if r was statistically significant. Now we will discuss how to find and interpret the equation of the line that best summarizes the linear relationship between two quantitative variables. Our line is called Least Squares Regression Line and will have a form: y=b 0 b x, where y is a predicted value of y for given x. b 0 =y-intercept, b =slope, a rate of change of y with respect to x. The name of the line reflects the fact that our line has smallest possible sum of squared errors (of all lines that can be possibly fitted to the given data). Errors or residuals are the vertical deviations of observations (data points) from the line : e= y ŷ, SS(resid)= e = (y i ŷ i ) is minimized by LS regression line.

4 STP3 Brief Class Notes Instructor: Ela Jackiewicz Formulas for the slope and y-intercept of Least-Squares Regression equation: y =b 0 b x, b =r s x, b 0 =y b x In our example: x=5/5=5 y=0/5=8 b = = b 0=8 ( )(5)=8, so our equation is : ŷ=8 x Equation tells us that for every time increase in x (times to eat out in a week), y (weight change) decreases by pounds. The more often you go out to eat the less pounds are lost. Units of the slope are : (units of y)/ (units of x) Notes about the regression line: ( x, y) lies on the fitted line we can use the regression line to predict values of y for given values of x. Interpreting the slope: If x increases by unit, y increases (or decreases) by b units (direction of change depends on the sign of b. Regression line is based on the assumption that data points are scattered about the line. If data are scattered about the curve, or do not show any trend at all, avoid the linear regression. Linear regression can be strongly affected by outliers. Always plot data first. Making predictions using our equation: Predict number of pounds lost if dieting person eats out 5 times per week: y=8-0=8 lb (this prediction is for x within data range) Warning about making predictions: It is safe to predict y for x within data range. Extrapolation - making predictions for values of the predictor variable outside the range of the observed values of the predictor variable (x) Grossly incorrect predictions can result from extrapolation. For ex. Prediction for x=5 i=- (weight gain of lb). One may think that value is reasonable, weight gain is to be expected if you eat out so often, but because we have no data in that range, we may not be sure that our particular trend will hold as far out of the range of x values. Problems we may encounter with our data: Outlier a data point that lies far from the regression line, relative to other data points Influential observation a data point whose removal causes the regression to change considerably. It is usually separated in the x-direction from the other data points. It pulls the regression line towards itself.

5 STP3 Brief Class Notes Instructor: Ela Jackiewicz Important Sums of Squares : There are three summations that are important in assessment of the fit of our regression line. Total Sum of squares SS total = y i y gives total variability in y values Residual Sum of Squares: SS resid = y i y i by the regression equation gives total variability in y values unexplained Regression Sum of Squares: the regression equation SS reg = y i y gives total variability in y valuexplained by Regression Identity: SS(total)=SS(reg)+SS(resid) In our example SS(total) was already computed. We can now obtain residual sum of squares, and regression sum of squares by obtaining predicted values of y for each x, by using our equation. x i y i ( y i y) ŷ i y i ŷ ( y i ŷ) ŷ i y ( ŷ i y) SS(total) 8 SS(resid) 88 SS(reg) We can verify that in our example : 0=88+8 We will use the three sums of squares in farther definitions. Assessing the fit of our regression line: = We can use SS(resid) to compute Residual Standard Deviation: s SS(resid) e n This measure tells us how close are the data points to our fitted regression line, the smaller is this value, the better is the fit of our regression line. If is smaller than (standard deviation of y-s), it means a good fit. The units of are the same as units of y.

6 STP3 Brief Class Notes Instructor: Ela Jackiewicz measures variability of y values around the mean ( y ) measures variability around the regression line. In the nice data, if it is not too small, we expect roughly : 8% of observed y-s to be within ± of the regression line 5% of observed y-s points to be within ± of the regression line and.% of observed y-s points to be within ±3 of the regression line In other words these percentages (roughly) of data points are expected to be within a vertical distances (parallel to the y axis) above and below the regression line. = In our example = (8/3)=.5lb<5.5= indicating a good fit. ( s 0 y 5 =5.5 ) Another value that tells us about the fit of the regression line is: Coefficient of Determination: r SS reg SS resid = = SS total SS total (square of linear corr. coef. r) This measure gives us proportion of total variability in y values that ixplained by our regression line. If this is close to (00%), then that indicates a good fit. 0 r In our example r = 8 0 = 88 0 =0.83, it indicates a good fit, 83% of total variability in y-values ixplained by the regression line. There is an approximate relationship of r to and. This approximation is good for large data r If this ratio is small, it indicates a good fit Above formula can also be expressed as r s e and used to approximate r for large data. In our example 0.83 =0. and approximation, since our data is small. = =0.8, and r 0. not a great

7 STP3 Brief Class Notes Instructor: Ela Jackiewicz Hypothesis test and CI for the slope: Assuming that the following linear model applies to X and Y: Y = 0 X Term ϵ in our model represents random error, we include it in the model to reflect that Y varieven if X is fixed. We can test hypothesis related to the true slope β The terms we compute for Least Squares Regression are estimating parameters in our model: b 0 estimates 0, b estimates If we want to test if slope is zero (indicating no relationship between x and y), our hypotheses are: H 0 : =0 vs H a :β 0 or H a :β > 0 or H a :β < 0 Choice of alternative depends on the question. Test statistics: t s = b SE b df=n-, Standard Error of the slope b : SE b = (x i x) = s x n, the second formula iasier to compute, The test statistics we are using here iquivalent to the one we used before in a test for population correlation coefficient, so we do not need to repeat it. We can also compute Confidence interval for a true slope: 5% CI for : b ±t.05 SE b In our example SE b =.5.35 =.53 and 5% CI for is : ±3.8 (0.53), ±3.8 (0.53), (-3., -0.3) CI indicates that the slope is significantly different than 0 and negative.

8 STP3 Brief Class Notes Instructor: Ela Jackiewicz CALCULATOR ) To use above computational formulas and/or find basic statistics for x and y use STAT EDIT place x-s and y-s on L and L respectively, then STAT CALC, select -Var Stats <enter> -Var Stats L, L <enter> )To find the regression line: STAT EDIT place x-s and y-s on L and L respectively STAT CALC, select LinReg(ax+b) <enter> LinReg(ax+b) L, L <enter> Output should display slope, y intercept, r and r.if you do not see r and r, then go to the CATALOG, select DIAGNOSTICS ON <enter><enter>, next time you run the regression it will show both values. 3) To perform the test: STAT TESTS use LinRegT-Test, place Xlist L Ylist L, then select appropriate alternative hypothesis (They use β for β, test is for both β and ρ ) In the output value of s=, you also get values of s x and there - ) To compute CI for β use STAT TESTS use LinRegT-interval, place Xlist L Ylist L, then select appropriate C-level If you do not have the CI on your calculator, use s from the LinRegT-Test output to compute SE b = s x n and compute CI by hand.

9 STP3 Brief Class Notes Instructor: Ela Jackiewicz Least squares Linear Regression Example Is number of beers consumed a good predictor of BAC (blood alcohol content)? Can you drive legally after beers (legal limit is.08)? Random sample of students from Ohio State University participated in a study. Each student was assigned randomly number of beers to drink and after 30 minutes a police officer measured their BAC in grams of alcohol per deciliter of blood. The results are given in the table below. # of beers=x BAC=y Answer following questions: Graph these data to confirm existence of a linear trend. Obtain LS regression line for these data. What is the slope of your line telling you about change in BAC after you drink a next beer? Do you think y-intercept value is sensible? Compute linear correlation coefficient and coefficient of determination. What % of total variability in BAC ixplained by the regression line? Are r and r indicating a good fit? Clearly answer both questions posed at the beginning of the problem. Can we use our equation to safely predict BAC for one that drinks 5 beers? Given SS(resid)=.005, compute, compare it to and decide if it indicates a good fit. Test the hypothesis of no linear relationship between x and y against directional alternative hypothesis that y increases with increasing x. Obtain 5% CI for the true slope β

Prob/Stats Questions? /32

Prob/Stats Questions? /32 Prob/Stats 10.4 Questions? 1 /32 Prob/Stats 10.4 Homework Apply p551 Ex 10-4 p 551 7, 8, 9, 10, 12, 13, 28 2 /32 Prob/Stats 10.4 Objective Compute the equation of the least squares 3 /32 Regression A scatter

More information

4.1 Introduction. 4.2 The Scatter Diagram. Chapter 4 Linear Correlation and Regression Analysis

4.1 Introduction. 4.2 The Scatter Diagram. Chapter 4 Linear Correlation and Regression Analysis 4.1 Introduction Correlation is a technique that measures the strength (or the degree) of the relationship between two variables. For example, we could measure how strong the relationship is between people

More information

Analysis of Bivariate Data

Analysis of Bivariate Data Analysis of Bivariate Data Data Two Quantitative variables GPA and GAES Interest rates and indices Tax and fund allocation Population size and prison population Bivariate data (x,y) Case corr&reg 2 Independent

More information

Linear Correlation and Regression Analysis

Linear Correlation and Regression Analysis Linear Correlation and Regression Analysis Set Up the Calculator 2 nd CATALOG D arrow down DiagnosticOn ENTER ENTER SCATTER DIAGRAM Positive Linear Correlation Positive Correlation Variables will tend

More information

CHAPTER 4 DESCRIPTIVE MEASURES IN REGRESSION AND CORRELATION

CHAPTER 4 DESCRIPTIVE MEASURES IN REGRESSION AND CORRELATION STP 226 ELEMENTARY STATISTICS CHAPTER 4 DESCRIPTIVE MEASURES IN REGRESSION AND CORRELATION Linear Regression and correlation allows us to examine the relationship between two or more quantitative variables.

More information

10.1 Simple Linear Regression

10.1 Simple Linear Regression 10.1 Simple Linear Regression Ulrich Hoensch Tuesday, December 1, 2009 The Simple Linear Regression Model We have two quantitative random variables X (the explanatory variable) and Y (the response variable).

More information

Chapter 8. Linear Regression /71

Chapter 8. Linear Regression /71 Chapter 8 Linear Regression 1 /71 Homework p192 1, 2, 3, 5, 7, 13, 15, 21, 27, 28, 29, 32, 35, 37 2 /71 3 /71 Objectives Determine Least Squares Regression Line (LSRL) describing the association of two

More information

Least-Squares Regression

Least-Squares Regression MATH 203 Least-Squares Regression Dr. Neal, Spring 2009 As well as finding the correlation of paired data {{ x 1, y 1 }, { x 2, y 2 },..., { x n, y n }}, we also can plot the data with a scatterplot and

More information

Chapter 9. Correlation and Regression

Chapter 9. Correlation and Regression Chapter 9 Correlation and Regression Lesson 9-1/9-2, Part 1 Correlation Registered Florida Pleasure Crafts and Watercraft Related Manatee Deaths 100 80 60 40 20 0 1991 1993 1995 1997 1999 Year Boats in

More information

Chapter 6: Exploring Data: Relationships Lesson Plan

Chapter 6: Exploring Data: Relationships Lesson Plan Chapter 6: Exploring Data: Relationships Lesson Plan For All Practical Purposes Displaying Relationships: Scatterplots Mathematical Literacy in Today s World, 9th ed. Making Predictions: Regression Line

More information

Bivariate Data Summary

Bivariate Data Summary Bivariate Data Summary Bivariate data data that examines the relationship between two variables What individuals to the data describe? What are the variables and how are they measured Are the variables

More information

1 A Review of Correlation and Regression

1 A Review of Correlation and Regression 1 A Review of Correlation and Regression SW, Chapter 12 Suppose we select n = 10 persons from the population of college seniors who plan to take the MCAT exam. Each takes the test, is coached, and then

More information

Inferences for Regression

Inferences for Regression Inferences for Regression An Example: Body Fat and Waist Size Looking at the relationship between % body fat and waist size (in inches). Here is a scatterplot of our data set: Remembering Regression In

More information

Steps to take to do the descriptive part of regression analysis:

Steps to take to do the descriptive part of regression analysis: STA 2023 Simple Linear Regression: Least Squares Model Steps to take to do the descriptive part of regression analysis: A. Plot the data on a scatter plot. Describe patterns: 1. Is there a strong, moderate,

More information

Chapter 10. Correlation and Regression. Lecture 1 Sections:

Chapter 10. Correlation and Regression. Lecture 1 Sections: Chapter 10 Correlation and Regression Lecture 1 Sections: 10.1 10. You will now be introduced to important methods for making inferences based on sample data that come in pairs. In the previous chapter,

More information

23. Inference for regression

23. Inference for regression 23. Inference for regression The Practice of Statistics in the Life Sciences Third Edition 2014 W. H. Freeman and Company Objectives (PSLS Chapter 23) Inference for regression The regression model Confidence

More information

UNIT 12 ~ More About Regression

UNIT 12 ~ More About Regression ***SECTION 15.1*** The Regression Model When a scatterplot shows a relationship between a variable x and a y, we can use the fitted to the data to predict y for a given value of x. Now we want to do tests

More information

9 Correlation and Regression

9 Correlation and Regression 9 Correlation and Regression SW, Chapter 12. Suppose we select n = 10 persons from the population of college seniors who plan to take the MCAT exam. Each takes the test, is coached, and then retakes the

More information

BIVARIATE DATA data for two variables

BIVARIATE DATA data for two variables (Chapter 3) BIVARIATE DATA data for two variables INVESTIGATING RELATIONSHIPS We have compared the distributions of the same variable for several groups, using double boxplots and back-to-back stemplots.

More information

Describing Bivariate Relationships

Describing Bivariate Relationships Describing Bivariate Relationships Bivariate Relationships What is Bivariate data? When exploring/describing a bivariate (x,y) relationship: Determine the Explanatory and Response variables Plot the data

More information

Can you tell the relationship between students SAT scores and their college grades?

Can you tell the relationship between students SAT scores and their college grades? Correlation One Challenge Can you tell the relationship between students SAT scores and their college grades? A: The higher SAT scores are, the better GPA may be. B: The higher SAT scores are, the lower

More information

Reminder: Univariate Data. Bivariate Data. Example: Puppy Weights. You weigh the pups and get these results: 2.5, 3.5, 3.3, 3.1, 2.6, 3.6, 2.

Reminder: Univariate Data. Bivariate Data. Example: Puppy Weights. You weigh the pups and get these results: 2.5, 3.5, 3.3, 3.1, 2.6, 3.6, 2. TP: To review Standard Deviation, Residual Plots, and Correlation Coefficients HW: Do a journal entry on each of the calculator tricks in this lesson. Lesson slides will be posted with notes. Do Now: Write

More information

Lecture 11: Simple Linear Regression

Lecture 11: Simple Linear Regression Lecture 11: Simple Linear Regression Readings: Sections 3.1-3.3, 11.1-11.3 Apr 17, 2009 In linear regression, we examine the association between two quantitative variables. Number of beers that you drink

More information

AP Statistics Two-Variable Data Analysis

AP Statistics Two-Variable Data Analysis AP Statistics Two-Variable Data Analysis Key Ideas Scatterplots Lines of Best Fit The Correlation Coefficient Least Squares Regression Line Coefficient of Determination Residuals Outliers and Influential

More information

Chapter 12 : Linear Correlation and Linear Regression

Chapter 12 : Linear Correlation and Linear Regression Chapter 1 : Linear Correlation and Linear Regression Determining whether a linear relationship exists between two quantitative variables, and modeling the relationship with a line, if the linear relationship

More information

Chapter 12 Summarizing Bivariate Data Linear Regression and Correlation

Chapter 12 Summarizing Bivariate Data Linear Regression and Correlation Chapter 1 Summarizing Bivariate Data Linear Regression and Correlation This chapter introduces an important method for making inferences about a linear correlation (or relationship) between two variables,

More information

5.1 Bivariate Relationships

5.1 Bivariate Relationships Chapter 5 Summarizing Bivariate Data Source: TPS 5.1 Bivariate Relationships What is Bivariate data? When exploring/describing a bivariate (x,y) relationship: Determine the Explanatory and Response variables

More information

AMS 315/576 Lecture Notes. Chapter 11. Simple Linear Regression

AMS 315/576 Lecture Notes. Chapter 11. Simple Linear Regression AMS 315/576 Lecture Notes Chapter 11. Simple Linear Regression 11.1 Motivation A restaurant opening on a reservations-only basis would like to use the number of advance reservations x to predict the number

More information

Session 4 2:40 3:30. If neither the first nor second differences repeat, we need to try another

Session 4 2:40 3:30. If neither the first nor second differences repeat, we need to try another Linear Quadratics & Exponentials using Tables We can classify a table of values as belonging to a particular family of functions based on the math operations found on any calculator. First differences

More information

Quantitative Bivariate Data

Quantitative Bivariate Data Statistics 211 (L02) - Linear Regression Quantitative Bivariate Data Consider two quantitative variables, defined in the following way: X i - the observed value of Variable X from subject i, i = 1, 2,,

More information

Lecture 4 Scatterplots, Association, and Correlation

Lecture 4 Scatterplots, Association, and Correlation Lecture 4 Scatterplots, Association, and Correlation Previously, we looked at Single variables on their own One or more categorical variable In this lecture: We shall look at two quantitative variables.

More information

Lecture 4 Scatterplots, Association, and Correlation

Lecture 4 Scatterplots, Association, and Correlation Lecture 4 Scatterplots, Association, and Correlation Previously, we looked at Single variables on their own One or more categorical variables In this lecture: We shall look at two quantitative variables.

More information

Correlation Analysis

Correlation Analysis Simple Regression Correlation Analysis Correlation analysis is used to measure strength of the association (linear relationship) between two variables Correlation is only concerned with strength of the

More information

determine whether or not this relationship is.

determine whether or not this relationship is. Section 9-1 Correlation A correlation is a between two. The data can be represented by ordered pairs (x,y) where x is the (or ) variable and y is the (or ) variable. There are several types of correlations

More information

AP Statistics Unit 6 Note Packet Linear Regression. Scatterplots and Correlation

AP Statistics Unit 6 Note Packet Linear Regression. Scatterplots and Correlation Scatterplots and Correlation Name Hr A scatterplot shows the relationship between two quantitative variables measured on the same individuals. variable (y) measures an outcome of a study variable (x) may

More information

The following formulas related to this topic are provided on the formula sheet:

The following formulas related to this topic are provided on the formula sheet: Student Notes Prep Session Topic: Exploring Content The AP Statistics topic outline contains a long list of items in the category titled Exploring Data. Section D topics will be reviewed in this session.

More information

Chapter 4 Describing the Relation between Two Variables

Chapter 4 Describing the Relation between Two Variables Chapter 4 Describing the Relation between Two Variables 4.1 Scatter Diagrams and Correlation The is the variable whose value can be explained by the value of the or. A is a graph that shows the relationship

More information

Business Statistics. Lecture 10: Correlation and Linear Regression

Business Statistics. Lecture 10: Correlation and Linear Regression Business Statistics Lecture 10: Correlation and Linear Regression Scatterplot A scatterplot shows the relationship between two quantitative variables measured on the same individuals. It displays the Form

More information

AMS 7 Correlation and Regression Lecture 8

AMS 7 Correlation and Regression Lecture 8 AMS 7 Correlation and Regression Lecture 8 Department of Applied Mathematics and Statistics, University of California, Santa Cruz Suumer 2014 1 / 18 Correlation pairs of continuous observations. Correlation

More information

Introduction and Single Predictor Regression. Correlation

Introduction and Single Predictor Regression. Correlation Introduction and Single Predictor Regression Dr. J. Kyle Roberts Southern Methodist University Simmons School of Education and Human Development Department of Teaching and Learning Correlation A correlation

More information

Topic 10 - Linear Regression

Topic 10 - Linear Regression Topic 10 - Linear Regression Least squares principle Hypothesis tests/confidence intervals/prediction intervals for regression 1 Linear Regression How much should you pay for a house? Would you consider

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression OI CHAPTER 7 Important Concepts Correlation (r or R) and Coefficient of determination (R 2 ) Interpreting y-intercept and slope coefficients Inference (hypothesis testing and confidence

More information

Chapter 16. Simple Linear Regression and dcorrelation

Chapter 16. Simple Linear Regression and dcorrelation Chapter 16 Simple Linear Regression and dcorrelation 16.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will

More information

Correlation. A statistics method to measure the relationship between two variables. Three characteristics

Correlation. A statistics method to measure the relationship between two variables. Three characteristics Correlation Correlation A statistics method to measure the relationship between two variables Three characteristics Direction of the relationship Form of the relationship Strength/Consistency Direction

More information

Warm-up Using the given data Create a scatterplot Find the regression line

Warm-up Using the given data Create a scatterplot Find the regression line Time at the lunch table Caloric intake 21.4 472 30.8 498 37.7 335 32.8 423 39.5 437 22.8 508 34.1 431 33.9 479 43.8 454 42.4 450 43.1 410 29.2 504 31.3 437 28.6 489 32.9 436 30.6 480 35.1 439 33.0 444

More information

Lecture 14. Analysis of Variance * Correlation and Regression. The McGraw-Hill Companies, Inc., 2000

Lecture 14. Analysis of Variance * Correlation and Regression. The McGraw-Hill Companies, Inc., 2000 Lecture 14 Analysis of Variance * Correlation and Regression Outline Analysis of Variance (ANOVA) 11-1 Introduction 11-2 Scatter Plots 11-3 Correlation 11-4 Regression Outline 11-5 Coefficient of Determination

More information

Lecture 14. Outline. Outline. Analysis of Variance * Correlation and Regression Analysis of Variance (ANOVA)

Lecture 14. Outline. Outline. Analysis of Variance * Correlation and Regression Analysis of Variance (ANOVA) Outline Lecture 14 Analysis of Variance * Correlation and Regression Analysis of Variance (ANOVA) 11-1 Introduction 11- Scatter Plots 11-3 Correlation 11-4 Regression Outline 11-5 Coefficient of Determination

More information

6.1.1 How can I make predictions?

6.1.1 How can I make predictions? CCA Ch 6: Modeling Two-Variable Data Name: Team: 6.1.1 How can I make predictions? Line of Best Fit 6-1. a. Length of tube: Diameter of tube: Distance from the wall (in) Width of field of view (in) b.

More information

Basic Business Statistics 6 th Edition

Basic Business Statistics 6 th Edition Basic Business Statistics 6 th Edition Chapter 12 Simple Linear Regression Learning Objectives In this chapter, you learn: How to use regression analysis to predict the value of a dependent variable based

More information

Correlation. We don't consider one variable independent and the other dependent. Does x go up as y goes up? Does x go down as y goes up?

Correlation. We don't consider one variable independent and the other dependent. Does x go up as y goes up? Does x go down as y goes up? Comment: notes are adapted from BIOL 214/312. I. Correlation. Correlation A) Correlation is used when we want to examine the relationship of two continuous variables. We are not interested in prediction.

More information

Inference for Regression Simple Linear Regression

Inference for Regression Simple Linear Regression Inference for Regression Simple Linear Regression IPS Chapter 10.1 2009 W.H. Freeman and Company Objectives (IPS Chapter 10.1) Simple linear regression p Statistical model for linear regression p Estimating

More information

Homework 2: Simple Linear Regression

Homework 2: Simple Linear Regression STAT 4385 Applied Regression Analysis Homework : Simple Linear Regression (Simple Linear Regression) Thirty (n = 30) College graduates who have recently entered the job market. For each student, the CGPA

More information

Objectives Simple linear regression. Statistical model for linear regression. Estimating the regression parameters

Objectives Simple linear regression. Statistical model for linear regression. Estimating the regression parameters Objectives 10.1 Simple linear regression Statistical model for linear regression Estimating the regression parameters Confidence interval for regression parameters Significance test for the slope Confidence

More information

appstats27.notebook April 06, 2017

appstats27.notebook April 06, 2017 Chapter 27 Objective Students will conduct inference on regression and analyze data to write a conclusion. Inferences for Regression An Example: Body Fat and Waist Size pg 634 Our chapter example revolves

More information

Linear Regression. Simple linear regression model determines the relationship between one dependent variable (y) and one independent variable (x).

Linear Regression. Simple linear regression model determines the relationship between one dependent variable (y) and one independent variable (x). Linear Regression Simple linear regression model determines the relationship between one dependent variable (y) and one independent variable (x). A dependent variable is a random variable whose variation

More information

Chapter 27 Summary Inferences for Regression

Chapter 27 Summary Inferences for Regression Chapter 7 Summary Inferences for Regression What have we learned? We have now applied inference to regression models. Like in all inference situations, there are conditions that we must check. We can test

More information

Six Sigma Black Belt Study Guides

Six Sigma Black Belt Study Guides Six Sigma Black Belt Study Guides 1 www.pmtutor.org Powered by POeT Solvers Limited. Analyze Correlation and Regression Analysis 2 www.pmtutor.org Powered by POeT Solvers Limited. Variables and relationships

More information

Statistics Introductory Correlation

Statistics Introductory Correlation Statistics Introductory Correlation Session 10 oscardavid.barrerarodriguez@sciencespo.fr April 9, 2018 Outline 1 Statistics are not used only to describe central tendency and variability for a single variable.

More information

Unit 6 - Introduction to linear regression

Unit 6 - Introduction to linear regression Unit 6 - Introduction to linear regression Suggested reading: OpenIntro Statistics, Chapter 7 Suggested exercises: Part 1 - Relationship between two numerical variables: 7.7, 7.9, 7.11, 7.13, 7.15, 7.25,

More information

Chapter 10. Correlation and Regression. McGraw-Hill, Bluman, 7th ed., Chapter 10 1

Chapter 10. Correlation and Regression. McGraw-Hill, Bluman, 7th ed., Chapter 10 1 Chapter 10 Correlation and Regression McGraw-Hill, Bluman, 7th ed., Chapter 10 1 Example 10-2: Absences/Final Grades Please enter the data below in L1 and L2. The data appears on page 537 of your textbook.

More information

Chapter 7. Scatterplots, Association, and Correlation

Chapter 7. Scatterplots, Association, and Correlation Chapter 7 Scatterplots, Association, and Correlation Bin Zou (bzou@ualberta.ca) STAT 141 University of Alberta Winter 2015 1 / 29 Objective In this chapter, we study relationships! Instead, we investigate

More information

a. Length of tube: Diameter of tube:

a. Length of tube: Diameter of tube: CCA Ch 6: Modeling Two-Variable Data Name: 6.1.1 How can I make predictions? Line of Best Fit 6-1. a. Length of tube: Diameter of tube: Distance from the wall (in) Width of field of view (in) b. Make a

More information

Business Statistics. Lecture 9: Simple Regression

Business Statistics. Lecture 9: Simple Regression Business Statistics Lecture 9: Simple Regression 1 On to Model Building! Up to now, class was about descriptive and inferential statistics Numerical and graphical summaries of data Confidence intervals

More information

Review of Regression Basics

Review of Regression Basics Review of Regression Basics When describing a Bivariate Relationship: Make a Scatterplot Strength, Direction, Form Model: y-hat=a+bx Interpret slope in context Make Predictions Residual = Observed-Predicted

More information

Lecture 18: Simple Linear Regression

Lecture 18: Simple Linear Regression Lecture 18: Simple Linear Regression BIOS 553 Department of Biostatistics University of Michigan Fall 2004 The Correlation Coefficient: r The correlation coefficient (r) is a number that measures the strength

More information

STA 101 Final Review

STA 101 Final Review STA 101 Final Review Statistics 101 Thomas Leininger June 24, 2013 Announcements All work (besides projects) should be returned to you and should be entered on Sakai. Office Hour: 2 3pm today (Old Chem

More information

13 Simple Linear Regression

13 Simple Linear Regression B.Sc./Cert./M.Sc. Qualif. - Statistics: Theory and Practice 3 Simple Linear Regression 3. An industrial example A study was undertaken to determine the effect of stirring rate on the amount of impurity

More information

STAT Chapter 11: Regression

STAT Chapter 11: Regression STAT 515 -- Chapter 11: Regression Mostly we have studied the behavior of a single random variable. Often, however, we gather data on two random variables. We wish to determine: Is there a relationship

More information

Unit 6 - Simple linear regression

Unit 6 - Simple linear regression Sta 101: Data Analysis and Statistical Inference Dr. Çetinkaya-Rundel Unit 6 - Simple linear regression LO 1. Define the explanatory variable as the independent variable (predictor), and the response variable

More information

Objectives. 2.1 Scatterplots. Scatterplots Explanatory and response variables Interpreting scatterplots Outliers

Objectives. 2.1 Scatterplots. Scatterplots Explanatory and response variables Interpreting scatterplots Outliers Objectives 2.1 Scatterplots Scatterplots Explanatory and response variables Interpreting scatterplots Outliers Adapted from authors slides 2012 W.H. Freeman and Company Relationship of two numerical variables

More information

THE PEARSON CORRELATION COEFFICIENT

THE PEARSON CORRELATION COEFFICIENT CORRELATION Two variables are said to have a relation if knowing the value of one variable gives you information about the likely value of the second variable this is known as a bivariate relation There

More information

Ch Inference for Linear Regression

Ch Inference for Linear Regression Ch. 12-1 Inference for Linear Regression ACT = 6.71 + 5.17(GPA) For every increase of 1 in GPA, we predict the ACT score to increase by 5.17. population regression line β (true slope) μ y = α + βx mean

More information

2.1 Scatterplots. Ulrich Hoensch MAT210 Rocky Mountain College Billings, MT 59102

2.1 Scatterplots. Ulrich Hoensch MAT210 Rocky Mountain College Billings, MT 59102 2.1 Scatterplots Ulrich Hoensch MAT210 Rocky Mountain College Billings, MT 59102 Association Between Variables We now consider the situation where we have two variables. Example Let x be the age of a husband,

More information

SMAM 314 Exam 42 Name

SMAM 314 Exam 42 Name SMAM 314 Exam 42 Name Mark the following statements True (T) or False (F) (10 points) 1. F A. The line that best fits points whose X and Y values are negatively correlated should have a positive slope.

More information

Business Statistics. Chapter 14 Introduction to Linear Regression and Correlation Analysis QMIS 220. Dr. Mohammad Zainal

Business Statistics. Chapter 14 Introduction to Linear Regression and Correlation Analysis QMIS 220. Dr. Mohammad Zainal Department of Quantitative Methods & Information Systems Business Statistics Chapter 14 Introduction to Linear Regression and Correlation Analysis QMIS 220 Dr. Mohammad Zainal Chapter Goals After completing

More information

Correlation. Relationship between two variables in a scatterplot. As the x values go up, the y values go down.

Correlation. Relationship between two variables in a scatterplot. As the x values go up, the y values go down. Correlation Relationship between two variables in a scatterplot. As the x values go up, the y values go up. As the x values go up, the y values go down. There is no relationship between the x and y values

More information

11 Correlation and Regression

11 Correlation and Regression Chapter 11 Correlation and Regression August 21, 2017 1 11 Correlation and Regression When comparing two variables, sometimes one variable (the explanatory variable) can be used to help predict the value

More information

Let the x-axis have the following intervals:

Let the x-axis have the following intervals: 1 & 2. For the following sets of data calculate the mean and standard deviation. Then graph the data as a frequency histogram on the corresponding set of axes. Set 1: Length of bass caught in Conesus Lake

More information

Important note: Transcripts are not substitutes for textbook assignments. 1

Important note: Transcripts are not substitutes for textbook assignments. 1 In this lesson we will cover correlation and regression, two really common statistical analyses for quantitative (or continuous) data. Specially we will review how to organize the data, the importance

More information

Chapter 16. Simple Linear Regression and Correlation

Chapter 16. Simple Linear Regression and Correlation Chapter 16 Simple Linear Regression and Correlation 16.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will

More information

Chapter 12 - Lecture 2 Inferences about regression coefficient

Chapter 12 - Lecture 2 Inferences about regression coefficient Chapter 12 - Lecture 2 Inferences about regression coefficient April 19th, 2010 Facts about slope Test Statistic Confidence interval Hypothesis testing Test using ANOVA Table Facts about slope In previous

More information

STA 302 H1F / 1001 HF Fall 2007 Test 1 October 24, 2007

STA 302 H1F / 1001 HF Fall 2007 Test 1 October 24, 2007 STA 302 H1F / 1001 HF Fall 2007 Test 1 October 24, 2007 LAST NAME: SOLUTIONS FIRST NAME: STUDENT NUMBER: ENROLLED IN: (circle one) STA 302 STA 1001 INSTRUCTIONS: Time: 90 minutes Aids allowed: calculator.

More information

INFERENCE FOR REGRESSION

INFERENCE FOR REGRESSION CHAPTER 3 INFERENCE FOR REGRESSION OVERVIEW In Chapter 5 of the textbook, we first encountered regression. The assumptions that describe the regression model we use in this chapter are the following. We

More information

Sampling Distributions in Regression. Mini-Review: Inference for a Mean. For data (x 1, y 1 ),, (x n, y n ) generated with the SRM,

Sampling Distributions in Regression. Mini-Review: Inference for a Mean. For data (x 1, y 1 ),, (x n, y n ) generated with the SRM, Department of Statistics The Wharton School University of Pennsylvania Statistics 61 Fall 3 Module 3 Inference about the SRM Mini-Review: Inference for a Mean An ideal setup for inference about a mean

More information

Review 6. n 1 = 85 n 2 = 75 x 1 = x 2 = s 1 = 38.7 s 2 = 39.2

Review 6. n 1 = 85 n 2 = 75 x 1 = x 2 = s 1 = 38.7 s 2 = 39.2 Review 6 Use the traditional method to test the given hypothesis. Assume that the samples are independent and that they have been randomly selected ) A researcher finds that of,000 people who said that

More information

Lecture 15: Chapter 10

Lecture 15: Chapter 10 Lecture 15: Chapter 10 C C Moxley UAB Mathematics 20 July 15 10.1 Pairing Data In Chapter 9, we talked about pairing data in a natural way. In this Chapter, we will essentially be discussing whether these

More information

THE ROYAL STATISTICAL SOCIETY 2008 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE (MODULAR FORMAT) MODULE 4 LINEAR MODELS

THE ROYAL STATISTICAL SOCIETY 2008 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE (MODULAR FORMAT) MODULE 4 LINEAR MODELS THE ROYAL STATISTICAL SOCIETY 008 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE (MODULAR FORMAT) MODULE 4 LINEAR MODELS The Society provides these solutions to assist candidates preparing for the examinations

More information

AP STATISTICS Name: Period: Review Unit IV Scatterplots & Regressions

AP STATISTICS Name: Period: Review Unit IV Scatterplots & Regressions AP STATISTICS Name: Period: Review Unit IV Scatterplots & Regressions Know the definitions of the following words: bivariate data, regression analysis, scatter diagram, correlation coefficient, independent

More information

Chapter 14. Statistical versus Deterministic Relationships. Distance versus Speed. Describing Relationships: Scatterplots and Correlation

Chapter 14. Statistical versus Deterministic Relationships. Distance versus Speed. Describing Relationships: Scatterplots and Correlation Chapter 14 Describing Relationships: Scatterplots and Correlation Chapter 14 1 Statistical versus Deterministic Relationships Distance versus Speed (when travel time is constant). Income (in millions of

More information

9. Linear Regression and Correlation

9. Linear Regression and Correlation 9. Linear Regression and Correlation Data: y a quantitative response variable x a quantitative explanatory variable (Chap. 8: Recall that both variables were categorical) For example, y = annual income,

More information

Final Exam - Solutions

Final Exam - Solutions Ecn 102 - Analysis of Economic Data University of California - Davis March 17, 2010 Instructor: John Parman Final Exam - Solutions You have until 12:30pm to complete this exam. Please remember to put your

More information

Chapte The McGraw-Hill Companies, Inc. All rights reserved.

Chapte The McGraw-Hill Companies, Inc. All rights reserved. 12er12 Chapte Bivariate i Regression (Part 1) Bivariate Regression Visual Displays Begin the analysis of bivariate data (i.e., two variables) with a scatter plot. A scatter plot - displays each observed

More information

Lecture notes on Regression & SAS example demonstration

Lecture notes on Regression & SAS example demonstration Regression & Correlation (p. 215) When two variables are measured on a single experimental unit, the resulting data are called bivariate data. You can describe each variable individually, and you can also

More information

Review for Final Exam Stat 205: Statistics for the Life Sciences

Review for Final Exam Stat 205: Statistics for the Life Sciences Review for Final Exam Stat 205: Statistics for the Life Sciences Tim Hanson, Ph.D. University of South Carolina T. Hanson (USC) Stat 205: Statistics for the Life Sciences 1 / 20 Overview of Final Exam

More information

REVIEW 8/2/2017 陈芳华东师大英语系

REVIEW 8/2/2017 陈芳华东师大英语系 REVIEW Hypothesis testing starts with a null hypothesis and a null distribution. We compare what we have to the null distribution, if the result is too extreme to belong to the null distribution (p

More information

Conditions for Regression Inference:

Conditions for Regression Inference: AP Statistics Chapter Notes. Inference for Linear Regression We can fit a least-squares line to any data relating two quantitative variables, but the results are useful only if the scatterplot shows a

More information

s e, which is large when errors are large and small Linear regression model

s e, which is large when errors are large and small Linear regression model Linear regression model we assume that two quantitative variables, x and y, are linearly related; that is, the the entire population of (x, y) pairs are related by an ideal population regression line y

More information

Regression used to predict or estimate the value of one variable corresponding to a given value of another variable.

Regression used to predict or estimate the value of one variable corresponding to a given value of another variable. CHAPTER 9 Simple Linear Regression and Correlation Regression used to predict or estimate the value of one variable corresponding to a given value of another variable. X = independent variable. Y = dependent

More information

Nature vs. nurture? Lecture 18 - Regression: Inference, Outliers, and Intervals. Regression Output. Conditions for inference.

Nature vs. nurture? Lecture 18 - Regression: Inference, Outliers, and Intervals. Regression Output. Conditions for inference. Understanding regression output from software Nature vs. nurture? Lecture 18 - Regression: Inference, Outliers, and Intervals In 1966 Cyril Burt published a paper called The genetic determination of differences

More information

11.5 Regression Linear Relationships

11.5 Regression Linear Relationships Contents 11.5 Regression............................. 835 11.5.1 Linear Relationships................... 835 11.5.2 The Least Squares Regression Line........... 837 11.5.3 Using the Regression Line................

More information