INFERENCE FOR REGRESSION
- Aubrey Malone
- 5 years ago
CHAPTER 3 INFERENCE FOR REGRESSION

OVERVIEW

In Chapter 5 of the textbook, we first encountered regression. The assumptions that describe the regression model we use in this chapter are the following.

- We have n observations on an explanatory variable x and a response variable y. Our goal is to study or predict the behavior of y for given values of x.
- For any fixed value of x, the response y varies according to a Normal distribution. Repeated responses y are independent of each other.
- The mean response $\mu_y$ has a straight-line relationship with x given by a population regression line: $\mu_y = \alpha + \beta x$. The slope β and intercept α are unknown parameters.
- The standard deviation of y (call it σ) is the same for all values of x. The value of σ is unknown.

The true (population) regression line $\mu_y = \alpha + \beta x$ says that the mean response $\mu_y$ moves along a straight line as the explanatory variable x changes. The parameters β and α are estimated by the slope b and intercept a of the least-squares regression line. The formulas for these estimates are

$$b = r\,\frac{s_y}{s_x} \qquad \text{and} \qquad a = \bar{y} - b\,\bar{x}$$

where r is the correlation between y and x, $\bar{y}$ is the mean of the y observations, $s_y$ is the standard deviation of the y observations, $\bar{x}$ is the mean of the x observations, and $s_x$ is the standard deviation of the x observations.

The standard error about the least-squares line is

$$s = \sqrt{\frac{1}{n-2}\sum \text{residual}^2} = \sqrt{\frac{1}{n-2}\sum (y - \hat{y})^2}$$

where $\hat{y} = a + bx$ is the value we would predict for the response variable based on the least-squares regression line. We use s to estimate the unknown σ in the regression model.
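The estimates above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `least_squares_fit` and the six data points are hypothetical and are not taken from the textbook exercises.

```python
import math
import statistics


def least_squares_fit(x, y):
    """Return slope b, intercept a, standard error s, and the residuals."""
    n = len(x)
    x_bar, y_bar = statistics.mean(x), statistics.mean(y)
    s_x, s_y = statistics.stdev(x), statistics.stdev(y)
    # Correlation r: average product of standardized values (divisor n - 1).
    r = sum((xi - x_bar) / s_x * (yi - y_bar) / s_y
            for xi, yi in zip(x, y)) / (n - 1)
    b = r * s_y / s_x          # slope: b = r * s_y / s_x
    a = y_bar - b * x_bar      # intercept: a = y_bar - b * x_bar
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    # Standard error about the line uses n - 2 degrees of freedom.
    s = math.sqrt(sum(e * e for e in residuals) / (n - 2))
    return b, a, s, residuals


# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
b, a, s, residuals = least_squares_fit(x, y)
print(f"b = {b:.4f}, a = {a:.4f}, s = {s:.4f}")
print(f"sum of residuals = {sum(residuals):.2e}")
```

The residuals from a least-squares line sum to zero apart from floating-point rounding, which gives a quick check on the arithmetic.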
A level C confidence interval for β is

$$b \pm t^* SE_b$$

where t* is the critical value for the t distribution with n − 2 degrees of freedom with area C between −t* and t*, and

$$SE_b = \frac{s}{\sqrt{\sum (x - \bar{x})^2}}$$

is the standard error of the least-squares slope b. $SE_b$ is usually computed using a calculator or statistical software.

The test of the hypothesis $H_0: \beta = 0$ is based on the t statistic

$$t = \frac{b}{SE_b}$$

with P-values computed from the t distribution with n − 2 degrees of freedom. This test is also a test of the hypothesis that the correlation is 0 in the population.

A level C confidence interval for the mean response $\mu_y$ when x takes the value x* is

$$\hat{y} \pm t^* SE_{\hat{\mu}}$$

where $\hat{y} = a + bx^*$, t* is the critical value for the t distribution with n − 2 degrees of freedom and area C between −t* and t*, and

$$SE_{\hat{\mu}} = s\sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{\sum (x - \bar{x})^2}}$$

$SE_{\hat{\mu}}$ is usually computed using a calculator or statistical software.

A level C prediction interval for a single observation on y when x takes the value x* is

$$\hat{y} \pm t^* SE_{\hat{y}}$$

where t* is the critical value for the t distribution with n − 2 degrees of freedom and area C between −t* and t*, and

$$SE_{\hat{y}} = s\sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{\sum (x - \bar{x})^2}}$$

$SE_{\hat{y}}$ is usually computed using a calculator or statistical software.

Finally, it is always good practice to check that the data satisfy the linear regression model assumptions before doing inference. Scatterplots and residual plots are useful tools for checking these assumptions.
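As a rough sketch of how these inference formulas fit together, the Python below computes the slope's standard error, the t statistic, and the two kinds of interval. The function name `regression_inference`, the data, and the choice x* = 4 are hypothetical. The critical value t* is passed in by hand, as if read from a t table: 2.776 is the 95% value for 4 degrees of freedom. The slope is computed as Σ(x − x̄)(y − ȳ) / Σ(x − x̄)², which is algebraically the same as r·s_y/s_x.

```python
import math


def regression_inference(x, y, t_star, x_star):
    """Slope test, slope CI, mean-response CI, and prediction interval at x_star."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    s = math.sqrt(sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2))

    se_b = s / math.sqrt(sxx)                       # standard error of the slope
    y_hat = a + b * x_star                          # predicted response at x_star
    leverage = 1 / n + (x_star - x_bar) ** 2 / sxx
    se_mu = s * math.sqrt(leverage)                 # for the mean response
    se_pred = s * math.sqrt(1 + leverage)           # for a single new observation

    return {
        "s": s,
        "b": b,
        "t": b / se_b,                              # tests H0: beta = 0, df = n - 2
        "slope_ci": (b - t_star * se_b, b + t_star * se_b),
        "se_mu": se_mu,
        "se_pred": se_pred,
        "mean_ci": (y_hat - t_star * se_mu, y_hat + t_star * se_mu),
        "pred_int": (y_hat - t_star * se_pred, y_hat + t_star * se_pred),
    }


# Hypothetical data with n = 6, so df = n - 2 = 4 and t* = 2.776 for 95%.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
result = regression_inference(x, y, t_star=2.776, x_star=4)
for name, value in result.items():
    print(name, value)
```

The prediction interval is always wider than the confidence interval for the mean response at the same x*, because its standard error carries the extra 1 under the square root for the variation of a single observation about the line.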
GUIDED SOLUTIONS

Exercise 3.

KEY CONCEPTS: Scatterplots, correlation, linear regression, residuals, standard error of the least-squares line

(a) First, examine the data and judge whether the relationship between Distance and Days is positive or negative. Sketch your scatterplot on the axes provided, or use software.

[Figure: blank axes for a scatterplot of Days versus Distance]

Use your calculator (or statistical software) to compute the correlation r:

r = ____

(b) What does the slope β of the true regression line say about the number of days until group infection and a group's distance from the first infected group?

Enter your estimates of the slope β and intercept α of the true regression line. Use software or your calculator, or compute these values manually using the formulas in Chapter 6 of your textbook.

Estimate of β = ____
Estimate of α = ____

Although it isn't asked for in this part, write the equation of the least-squares regression line for predicting the number of days to infection for a gorilla group given its distance from the first group infected. You'll use this in part (c).

The least-squares regression line is: $\hat{y}$ = ____

(c) To compute the residuals, complete the table. Remember, to compute the predicted number of days until infection, use the least-squares regression line.

[Table to complete. Columns: Distance from first group infected; Predicted number of days until infection; Residual (prediction error)]

Compute the sum of residuals (sum of prediction errors). They should sum to zero.

$\sum \text{residual}$ = ____

Now estimate the standard deviation σ by computing

$\sum \text{residual}^2$ = ____

and then completing the following calculation. This is an estimate of σ.

$$s = \sqrt{\frac{1}{n-2}\sum \text{residual}^2} = $$
Exercise 3.

KEY CONCEPTS: Tests for the slope of the least-squares regression line

(a) The test of the hypothesis $H_0: \beta = 0$ is based on the t statistic $t = b / SE_b$. In the statement of the problem, we are told that b =.63 and $SE_b$ =.59. The value of b is slightly different from the value we found in Exercise 3., due to differences in how much rounding was done at intermediate stages of the calculations. Compute the test statistic:

$t = b / SE_b$ = ____

(b) What are the degrees of freedom for t? Refer to the original data in Exercise 3. of your textbook to determine the sample size n.

Degrees of freedom = n − 2 = ____

Now, use Table C to estimate the P-value for testing $H_0$ against the alternative hypothesis $H_a: \beta > 0$, which hypothesizes a positive linear association between Days and Distance.

P-value: ____

What do you conclude?

Exercise 3.38

KEY CONCEPTS: Scatterplots, examining residuals, confidence intervals for the slope

(a) Use software or a calculator to compute the correlation between Time and Calories:

Use software or a calculator to compute the equation of the least-squares regression line. Don't forget to have the computer or your calculator save the residuals, as we'll use them in part (b):

$\hat{y}$ = ____
Use software or the axes provided to make a scatterplot of Calories versus Time.

[Figure: blank axes for a scatterplot of Calories versus Time]

(b) Here, we'll check the conditions needed for regression inference. First, to check for a linear relationship, and to check whether the spread about the line stays the same for all values of the explanatory variable, plot the residuals against Time (the explanatory variable):

[Figure: blank axes for a plot of Residuals versus Time]

Does this plot show any systematic deviation from a roughly linear pattern?

Does this plot show any systematic change in spread as Time changes?
Are the observations independent? Is this obvious?

Finally, look for evidence that the variation about the line appears to be Normal. Use software or the axes that follow to make a histogram of the residuals.

[Figure: blank axes for a histogram of Residuals (vertical axis: Frequency)]

Does this plot have strong skewness or outliers which might suggest lack of Normality?

(c) In this problem, the rate of change in calories consumed as time at the table increases is the slope of the population line, β. Hence, we need to construct a 95% confidence interval for β. Recall that a level C confidence interval for β is

$$b \pm t^* SE_b$$

where t* is the critical value for the t distribution with n − 2 degrees of freedom with area C between −t* and t*, and

$$SE_b = \frac{s}{\sqrt{\sum (x - \bar{x})^2}}$$

is the standard error of the least-squares slope b.
In this exercise, b and $SE_b$ can be read directly from the output of statistical software. Record their values.

b = ____

$SE_b$ = ____

Now, find t* for a 95% confidence interval from Table C (what is n here?).

t* = ____

Compute the 95% confidence interval:

Interpret this confidence interval in the context of this problem.

Exercise 3.

KEY CONCEPTS: Prediction, prediction intervals

We used Minitab to compute a prediction of Calories when Time =. The output follows:

The regression equation is
Calories = Time

Predictor Coef Stdev t-ratio p
Constant
Time

s = 3. R-sq =.% R-sq(adj) = 38.9%

Analysis of Variance
SOURCE DF SS MS F p
Regression
Error
Total

Fit Stdev.Fit 95.% C.I. 95.% P.I.
(.3, 5.9) (386.6, 89.8)

Where in this output does one find the 95% prediction interval for Rachel's calorie consumption at lunch? Refer to Examples 3.7 and 3.8 in the textbook if you need help.

95% prediction interval:
COMPLETE SOLUTIONS

Exercise 3.

(a) If we look at the data, we see that as a gorilla group's distance from the first infection increases, so does the number of days until that group is infected. Thus, there is a positive association between Days and Distance. A scatterplot of the data with Distance as the explanatory variable follows.

[Figure: scatterplot of Days versus Distance]

The scatterplot indicates a strong positive linear association between Distance and Days. The correlation r is given by r =.96. This is consistent with the scatterplot, which suggests a strong linear relationship between Distance and Days.

(b) The slope of the population regression line, β, is the number of additional days (on average) required to infect a gorilla group one additional distance unit from the original infection group. You might think of this as a measure of the rate of the infection's spread: on average it takes β days for the infection to spread to an additional home range.

The estimate of β is b =.3 days per distance unit. The estimate of α is a = -8.9 days. The equation of the least-squares regression line for predicting days to infection for a gorilla group given its distance from the initial group infected is:

Days = Distance
(c) The residuals for the six data points are given in the table.

[Table: Distance from first group infected; Predicted number of days until infection; Residual (prediction error)]

The sum of the residuals listed is $\sum \text{residual}$ =.. The difference from zero is due to rounding in the parameter estimates above.

To estimate the standard deviation σ in the regression model, we first calculate the sum of the squares of the residuals listed:

$\sum \text{residual}^2$ =.8 (.7) (.3) = 96..

Our estimate of the standard deviation σ in the regression model is therefore

$s = \sqrt{\frac{1}{n-2}\sum \text{residual}^2} = \sqrt{\frac{1}{6-2}(96.)}$ =.9 days.

Exercise 3.

(a) b =.63 and $SE_b$ =.59, so

$t = b / SE_b$ = 7.79

(b) Referring to the original data in Exercise 3. of the textbook, we see that n = 6.

Degrees of freedom = n − 2 = 6 − 2 = 4

To estimate the P-value, we use Table C with df = 4 and refer to the P-values corresponding to the two values of t* that bracket the computed value of t = 7.79:

t*
One-sided P.5.

Because the test is one-sided,. < P-value <.5. Statistical software (Minitab) gives a P-value of.. There is extremely strong (overwhelming) evidence to support a positive linear association between the distance of a gorilla group from the primary infection group and the number of days it takes for the infection to reach the group.
Exercise 3.38

(a) Here is a scatterplot showing the relationship between time at the table and calories consumed.

[Figure: scatterplot of Calories versus Time]

The correlation between Calories and Time is r =.69. The overall pattern is roughly (perhaps weakly) linear with a negative slope. There appear to be no clear outliers or strongly influential data points. Using statistical software, we find that the equation of the least-squares line is

$\hat{y}$ = time

(b) A scatterplot of the residuals against Time follows.

[Figure: plot of Residuals versus Time]
This plot is useful for addressing the first two of the four conditions we check:

Does the relationship appear linear? This scatterplot magnifies deviations from the regression line, making it easier to detect any non-linear pattern in the data. Based on this plot, there is little reason to doubt that the relationship between Calories and Time is linear.

Does the spread about the line stay the same? The scatterplot of residuals versus Time suggests that the spread about the line is roughly constant. Points seem to lie consistently in a band between and +.

Are the observations independent? The answer is not clear. These are observations on different children rather than on a single child, and that is good. However, we do not know if the children were selected at random. In addition, we do not know if the children were all together, so that the behavior of one child could influence the behavior of another. Are there children from the same family in this group? These issues would affect the independence of the observations.

Does the variation about the line appear to be Normal? The histogram that follows has a gap and is not particularly bell-shaped. On the other hand, there do not appear to be any outliers or extreme skew. With only observations, it's difficult to assess non-Normality here.

[Figure: histogram of Residuals (vertical axis: Frequency)]

The conditions for inference (for a sample of size ) are approximately satisfied.
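When software is not drawing the histogram for you, the tally behind it can be sketched as follows. This is a minimal illustration only: the helper `residual_histogram`, the eight residuals, and the class width of 20 are all hypothetical and are not the values from this exercise.

```python
import math
from collections import Counter


def residual_histogram(residuals, width):
    """Count residuals in class intervals k*width <= residual < (k+1)*width."""
    counts = Counter(math.floor(e / width) for e in residuals)
    return {(k * width, (k + 1) * width): counts[k] for k in sorted(counts)}


# Hypothetical residuals (they sum to zero, as least-squares residuals do).
residuals = [-35.0, -12.5, -4.0, 3.0, 8.5, 15.0, 27.5, -2.5]
hist = residual_histogram(residuals, width=20)
for (low, high), count in hist.items():
    print(f"{low:>4} <= residual < {high:<4} {'*' * count}")
```

A roughly symmetric tally with no isolated bars far from zero is what we hope to see. With a small sample, the picture can only rule out gross departures from Normality.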
(c) From statistical software, we find that

b = 3.8
$SE_b$ =.85

For a 95% confidence interval from Table C with n = (and n = 8), t* =. We use these to compute the 95% confidence interval for the true slope of the regression line:

$b \pm t^* SE_b$ = 3.8 ± (.)(.85) = 3.8 ±.79

or.87 to.9 calories per minute. With 95% confidence, each additional minute spent at the table is associated with a reduction in calories consumed of between.9 calories and.87 calories, on average.

Exercise 3.

Using software (Minitab, in this case), we obtain the following output:

The regression equation is
Calories = Time

Predictor Coef Stdev t-ratio p
Constant
Time

s = 3. R-sq =.% R-sq(adj) = 38.9%

Analysis of Variance
SOURCE DF SS MS F p
Regression
Error
Total

Fit Stdev.Fit 95.% C.I. 95.% P.I.
(.3, 5.9) (386.6, 89.8)

The Fit entry gives the predicted calories. Minitab gives both the 95% confidence interval for the mean response and the prediction interval for a single observation. We are predicting a single observation, so the column labeled 95% P.I. contains the interval we want. We see that this 95% prediction interval is (386.6, 89.8). With 95% confidence, the number of calories consumed by Rachel at lunch is between 386 and 89 calories, roughly.
10 Simple Linear Regression (Chs 12.1, 12.2, 12.4, 12.5) Simple Linear Regression Rating 20 40 60 80 0 5 10 15 Sugar 2 Simple Linear Regression Rating 20 40 60 80 0 5 10 15 Sugar 3 Simple Linear Regression