
Chapter 27: Inferences for Regression

And so, there is one more thing which might vary, one more thing about which we might want to make some inference: the slope of the least squares regression line.

The Population and the Samples
If we had all of the data for an entire population, then we could construct the One True Regression Line: a model for the backbone of the data (the middle of the data; remember that the model predicts the average value of the response variable for a given value of the explanatory variable). Alas, we won't (ever) have an entire population, so a sample will have to do.

The One True Line
But when we use a sample of data, the least squares regression model that we construct will be only one of many(!) possible models, since every sample would result in slightly different data and a slightly different least squares model. Of course, that means that the regression lines that we've been constructing are like statistics for some yet-unseen parameter. What does that parameter look like? Well, there are actually two parameters (since a parameter is a single number, not an equation): the slope and the y-intercept.

The Mean Version
One way to write the One True Regression Line is:

Equation 1 - The One True Line
μ_y = β_0 + β_1 x

Note that the parameters are just subscripted versions of one letter. That's how real statisticians do it; there is a slightly different version for those who are just beginning their studies:

Equation 2 - The One True Line (again)
μ_y = α + β x

In both cases, focus less on the letter used and more on its function: is it a slope, or is it a y-intercept? Also, note that what the line produces is a mean; this is why we keep saying that the model predicts some average value.
The Observation Version
There is (of course) another way:

Equation 3 - The One True Line (yet again)
y = β_0 + β_1 x + ε

(and the Junior Version for those who are just starting their explorations...)

Equation 4 - The One True Line (one last time)
y = α + β x + ε

In this version, note that what is produced is the actual value of the response variable. The Greek letter ε stands for error.

HOLLOMAN S AP STATISTICS BVD CHAPTER 27, PAGE 1 OF 12
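The observation version lends itself to simulation: fix a One True Line (the α = 2 and β = 1.5 below are invented for illustration), generate samples with random ε, and fit a least squares line to each. Every sample produces a slightly different slope b, but the slopes average out to the true β. A minimal Python sketch:

```python
import random

random.seed(1)

# Hypothetical One True Line (invented for illustration): y = 2.0 + 1.5x + eps
alpha_true, beta_true = 2.0, 1.5

def sample_slope(n=20):
    """Draw one sample from the model and return its least squares slope b."""
    xs = [random.uniform(0, 10) for _ in range(n)]
    ys = [alpha_true + beta_true * x + random.gauss(0, 1) for x in xs]
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    return sxy / sxx

slopes = [sample_slope() for _ in range(1000)]
print(round(sum(slopes) / len(slopes), 2))  # the slopes average out near beta = 1.5
```

Each individual slope misses β a little; only their long-run average lands on it, which is exactly what "b is a statistic for the parameter β" means.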

This version is really nice, as it shows the relationship between the actual, measured variables broken into two parts: the explained part (α + βx) and the unexplained part (ε).

The Error is the Key!
The distribution of that unexplained part, the error, is the key to inference. If our model has done the best job possible, then the distribution of error ought to be random (if it wasn't random, if there was a pattern to the error, then we could predict that error and obtain a better model). Furthermore, a really good model ought to produce errors that are generally small; there should be many more small errors than there are large errors. Additionally, those errors ought to be symmetric about zero (just as many positive errors as negative errors).

Figure 1 - Scatterplot with a Good Linear Fit (Rock Crab Measurements: Carapace Width (mm) versus Carapace Length (mm))

Figure 2 - A Good Distribution of Error Terms (histogram of the rock crab residuals, roughly symmetric about zero)

Finally, the errors should not be correlated with the explanatory variable: if you take a thin slice of the scatterplot near one value of the explanatory variable, then the distribution of errors for points inside that slice should be the same no matter which slice you take!

Figure 3 - Visualizing Slices

Figure 4 - Checking All Slices at Once (Residual Plot for Rock Crabs: Residual versus Predicted Carapace Width (mm); no pattern)

Those of you that are quick or deep thinkers may have noticed a discrepancy there. When I talk about vertical slices, shouldn't that last graph have carapace length as the horizontal variable? Sure, but since predicted carapace width is a linear transformation of carapace length, this graph is equivalent. Statisticians use this version because they often have more than one explanatory variable. Trust me: this graph does what we need it to do.

Conditions for Inference

Linearity
The variables must exhibit some type of linear relationship. Why would you conduct inference for the slope if there wasn't a slope to be found?

Independence
The observations must be independent of one another. In other forms of inference, a random sample of observations did the same thing for us. A random sample really isn't required here: in quite a few applications of this inference, the values of the explanatory variable are determined beforehand, so the observations can't be random!

Equal Variance
For any single value of the explanatory variable, the variance in the errors must be the same. No matter where you make a vertical slice in the scatterplot, the amount of variation in the error must be the same.

Normal Distribution
For any single value of the explanatory variable, the distribution of error must be approximately normal. No matter where you make a vertical slice in the scatterplot, the distribution of error must be approximately normal.

Checking the Conditions
We can check the linearity condition with a scatterplot of the data.

For independence, a random sample will do just fine (despite my earlier comments). If you're not told that the sample was obtained randomly, then you'll have to assume that the observations are independent of one another. Don't assume that the sample was obtained randomly!

For the equal variance condition, check a residual plot. The plot should not show the triangle pattern that I discussed back in Chapter 08.

For the normal distribution condition, look at a histogram of the residuals. Remember that you probably won't see something that is instantly recognizable as normal! What we're really after is any evidence that the residuals couldn't possibly be normally distributed.

Example
(Actually, Example [9] from Chapter 08 does exactly this, but I'll make a new one just for you. You're welcome.)

[1.] A study in Western Australia measured the carapace length and carapace width (both measured in millimeters) for 200 crabs.
A random sample of the collected data is shown in Table 1. Let's check to see that we may safely conduct inference on these data.

Table 1 - Crab Data for Example 1
Crab:    1     2     3     4     5     6     7     8     9     10
Length:  47.6  47.2  37.8  26.1  33.9  29.0  37.9  33.8  26.6  37.6
Width:   52.8  52.1  41.9  31.6  39.5  32.1  42.2  39.0  29.6  43.9
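The least squares fit and its residuals for these data can also be computed directly. A minimal Python sketch using the Table 1 values (the rounding choices are mine):

```python
# Table 1 crab sample (carapace length and width, mm)
length = [47.6, 47.2, 37.8, 26.1, 33.9, 29.0, 37.9, 33.8, 26.6, 37.6]
width  = [52.8, 52.1, 41.9, 31.6, 39.5, 32.1, 42.2, 39.0, 29.6, 43.9]

n = len(length)
xbar, ybar = sum(length) / n, sum(width) / n
sxx = sum((x - xbar) ** 2 for x in length)
sxy = sum((x - xbar) * (y - ybar) for x, y in zip(length, width))

b = sxy / sxx        # least squares slope
a = ybar - b * xbar  # least squares intercept

residuals = [y - (a + b * x) for x, y in zip(length, width)]

print(round(b, 5))               # 1.04885
print(round(sum(residuals), 9))  # residuals sum to 0 (up to rounding)
print(round(min(residuals), 2), round(max(residuals), 2))  # roughly symmetric about 0
```

The residuals printed here are the same quantities plotted in the figures that follow.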

First, let's graph the data.

Figure 5 - Scatterplot for Example 1 (Crab Measurements: Width (mm) versus Length (mm))

The scatterplot appears linear, so that's good. I'm told that these data are a random sample of the collected data; I think that ought to handle the independence issue. Let's check for constant variation in the error terms with a residual plot.

Figure 6 - Residual Plot for Example 1 (Residuals versus Predicted Carapace Width (mm))

I don't see the triangle shape, so it appears that the variation about the line is constant; that's good. Now for a histogram of the residuals.

Figure 7 - Histogram of Residuals for Example 1

There is no evidence of skew here, so it's reasonable to think that these residuals come from a normally distributed population. Everything checks out; we should be OK to conduct inference on these data!

Another Sampling Distribution
Every time we take a sample, we get a new sample slope (b). Since this statistic varies, it has a distribution with some mean (μ_b) and some standard deviation (σ_b). Of course, we'll never know the value of those parameters, but that hasn't stopped us yet!

If you've been paying attention, then you should at least suspect that our conditions will cause the statistic to be unbiased; thus, the mean of all possible values of b should be the parameter that we're trying to estimate: β! Furthermore, you should be able to guess that we'll never know the value of σ_b; instead, we'll estimate its value, and give this estimate the symbol SE_b. There is a formula for that, but you'll probably never need to use it. Instead, you must either [a] coax your calculator into giving you this value, or [b] learn how to read generic computer output to obtain this value. I'll demonstrate both of those later in these notes.

So if the conditions are met, then the following statistic must have a t-distribution with n − 2 degrees of freedom.

Equation 5 - The T-Statistic for Regression Slope
t = (b − β) / SE_b

Note that the degrees of freedom have changed! This is because there are two parameters that the model is forced to estimate. For those of you who'd like an explanation that is more satisfying, but less true: this requires that there are at least three points before you start inference (because just two points would always make a perfect line!).
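That t-distribution claim can be checked by simulation: draw many samples from a known line, compute (b − β)/SE_b for each, and count how often it lands beyond the t critical value for n − 2 degrees of freedom. A Python sketch (the population line and the fixed x-values are invented for illustration):

```python
import math
import random

random.seed(42)

# Invented population line and fixed x-values, chosen for illustration
beta0, beta1, sigma = 1.0, 2.0, 1.0
xs = [1.0, 2.0, 3.0, 4.0, 5.0]   # n = 5, so n - 2 = 3 degrees of freedom
n = len(xs)
xbar = sum(xs) / n
sxx = sum((x - xbar) ** 2 for x in xs)

def standardized_slope():
    """Sample from the model, fit least squares, return (b - beta) / SE_b."""
    ys = [beta0 + beta1 * x + random.gauss(0, sigma) for x in xs]
    ybar = sum(ys) / n
    b = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
    a = ybar - b * xbar
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    se_b = math.sqrt(sse / (n - 2)) / math.sqrt(sxx)
    return (b - beta1) / se_b

trials = 20000
frac = sum(abs(standardized_slope()) > 3.182 for _ in range(trials)) / trials
print(round(frac, 3))  # near 0.05: 3.182 cuts off 5% (two-sided) of t with 3 df
```

If (b − β)/SE_b were normal instead, far fewer than 5% of samples would exceed 3.182; the heavier t tails are the price of estimating σ_b.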

A Test of Significance

Hypotheses
For us, the null will always be that the population slope is zero.

Equation 6 - Our Null Hypothesis
H_0: β = 0

Note that this is the equivalent of saying that there is no useful, predictive linear relationship between the variables. In fact, the test that we'll conduct is algebraically equivalent to a test about the correlation coefficient being zero! The alternate hypothesis has its usual three possibilities.

Mechanics
Because of our null hypothesis, our mechanics can be written in a slightly shorter version:

Equation 7 - The T-Statistic, Simplified
t = b / SE_b

The degrees of freedom are the same. The p-value is calculated in the usual way.

Conclusion
...and the conclusion is the same as it's been since the beginning.

Example
[2.] In the 1980s, researchers took various body measurements from over 500 women of Pima Indian heritage. Two of the measured variables were triceps skin fold thickness (measured in millimeters) and body mass index (kg/m²). The data from a random sample of these measurements are shown below in Table 2. Do these data provide evidence of a positive association between body mass index and triceps skin fold thickness?

Table 2 - Pima Indian Measurements
Woman:  1     2     3     4     5     6     7     8     9     10
BMI:    35.9  22.9  37.6  32.4  33.7  43.3  18.2  32.8  24.2  28.4
TSFT:   28    13    34    32    39    37    19    31    20    23

I'll let β be the slope of the least squares regression line for the population.

H_0: β = 0 versus H_a: β > 0

This calls for a t-test for slope. This requires evidence of a linear relationship between the variables, independent observations, and triceps skin fold thickness must vary normally about the least squares regression line with constant variation. Here's a scatterplot of the data.

Figure 8 - Scatterplot for Example 2 (Pima Indian Measurements: Skin Fold (mm) versus BMI (kg/m^2))

Looks linear! I'll assume that the random sample causes the observations to be independent. Here's a residual plot for the least squares model:

Figure 9 - Residual Plot for Example 2 (Residuals versus Predicted Skin Fold Thickness (mm))

I don't see any pattern that would suggest non-constant variation. Finally, a histogram of the residuals:

Figure 10 - Residual Histogram for Example 2

There is no clear skew, so normal variation appears to be satisfied. All requirements appear to be met, so I'll conduct the test with α = 0.05.

t = b / SE_b = 0.9632 / 0.2091 = 4.606, with 8 degrees of freedom.

Let me pause to explain where I got SE_b. Your calculator will conduct this t-test. It will report (and store) the values of t and b. Given those, what could you do (algebraically) to find SE_b? Don't write that down where a grader can see it! That is not a correct formula; it is a hack to coerce your calculator into giving you some pertinent information.

OK, back to the problem. With 8 degrees of freedom, P(b > 0.9632) = P(t > 4.606) ≈ 0.00087.

If the slope of the population least squares regression line is zero, then I can expect to find a sample of data with a sample slope of 0.9632 (the units are nasty: mm/(kg/m²), which is (mm·m²)/kg) or greater in about 0.087% of samples. Since 0.00087 < 0.05, this occurs too rarely to attribute to chance at the 5% significance level. This is significant, and I reject the null hypothesis. The data provide evidence of a positive association between body mass index and triceps skin fold thickness.

A Confidence Interval
This will basically be the same as every other confidence interval that we've constructed!
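For the curious, the p-value in Example 2 can be reproduced without a calculator by numerically integrating the t density with 8 degrees of freedom. A Python sketch (the integration range and step count are arbitrary choices of mine):

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, hi=60.0, steps=20000):
    """Approximate P(T > t) by the trapezoidal rule on [t, hi]."""
    h = (hi - t) / steps
    total = 0.5 * (t_pdf(t, df) + t_pdf(hi, df))
    total += sum(t_pdf(t + i * h, df) for i in range(1, steps))
    return total * h

p = t_upper_tail(4.606, 8)
print(round(p, 5))  # about 0.00087, the p-value from Example 2
```

The tail beyond 60 is negligible for 8 degrees of freedom, so truncating the integral there costs essentially nothing.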

Mechanics
Statistic, plus or minus some critical value times some measure of variation:

Equation 8 - A Confidence Interval for the Regression Slope
b ± t* · SE_b

The degrees of freedom are the same.

Conclusion
...and the conclusion is the same, also! (I am 95% confident...)

Example
[3.] Let's use the crab data from Example [1], and construct a 95% confidence interval for the population regression slope (when predicting width from length).

We've already checked the requirements back in Example [1], so there's no need to repeat that work. With 8 degrees of freedom, t* ≈ 2.306. The interval is

b ± t* · SE_b = 1.04885 ± 2.306 · 0.04756 = (0.9392, 1.1585).

I am 95% confident that the population slope when predicting crab width from crab length is between 0.9392 and 1.1585 (the units would be a little weird: mm per mm).

Those of you with older calculators don't have this particular procedure pre-programmed. You do have the test, though! Run the test in order to get SE_b, then manually work out the lower and upper limits of the interval.

Two More Things

Reading Computer Output
On the AP Exam, you will be required to read generic computer output from a least squares regression. I have some handouts for that. Note that there are two lines of particular importance, and the placement/formatting of these two lines is nearly identical in every program. You should be able to find the equation of the least squares regression line, SE_b, the t-statistic for a hypothesis test, the two-sided p-value of such a test, the degrees of freedom for the test, the correlation coefficient (or coefficient of determination), and the standard deviation of the error terms.

Note that the p-value is for the two-sided test! Given that, you should be able to find the p-value for a one-sided test.

Examples
[4.] A statistician measured the sepal length and petal length (in centimeters) for a number of irises, and wants to construct a least squares regression model to predict sepal length from petal length.
Use the computer output given below to write the equation of the least squares regression line.

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)   4.30660    0.07839   54.94   <2e-16
Petal.Length  0.40892    0.01889   21.65   <2e-16

Residual standard error: 0.4071 on 148 degrees of freedom
Multiple R-squared: 0.76, Adjusted R-squared: 0.7583
F-statistic: 468.6 on 1 and 148 DF, p-value: < 2.2e-16

The regression equation is ŷ = 4.3066 + 0.40892x, where x is the petal length (cm) and ŷ is the sepal length (cm).

[5.] Using the same data and output from Example [4], and assuming that the conditions for inference have been met, test H_0: β = 0 versus H_a: β > 0 at the 1% significance level.

Since the hypotheses are given, and I've been told to assume that the conditions are met, I can skip straight to the mechanics.

t = b / SE_b = 0.40892 / 0.01889 = 21.65

With 148 degrees of freedom, P(b > 0.40892) = P(t > 21.65) ≈ 0.

If there is no association between sepal length and petal length (β = 0), then I can expect to find a sample where b > 0.40892 in almost no samples. Since 0 < 0.01, this occurs too rarely to attribute to chance at the 1% significance level. This is significant, and I reject the null hypothesis. The data do provide evidence of a positive association between sepal length and petal length (that β > 0).

Two Kinds of Variation
Some of you are now scratching your heads from that last section. Standard deviation of the error terms? You're correct: I haven't mentioned that yet. I'm going to do that now.

We've already met SE_b, which measures how the slopes vary around the One True Slope. Remember that, in general, standard deviation (and standard error) can be thought of as a typical distance between the mean and a datum. In the case of slope, it's the typical difference between a sample slope (b) and the One True Slope (β).

This other standard deviation is for the error terms: the residuals. This standard deviation measures how the response variable varies about the least squares regression line.
It can be thought of as the typical vertical distance between a datum (a value of the response variable y) and the least squares regression line (the predicted value of the response variable ŷ). The standard deviation of the error terms is most often labeled s in computer output, but some other designations include "standard error," "estimate of sigma," and "residual standard error."
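Both kinds of variation can be computed side by side from the Example 1 crab sample; a Python sketch (the rounding is mine):

```python
import math

# Example 1 crab sample (carapace length and width, mm)
length = [47.6, 47.2, 37.8, 26.1, 33.9, 29.0, 37.9, 33.8, 26.6, 37.6]
width  = [52.8, 52.1, 41.9, 31.6, 39.5, 32.1, 42.2, 39.0, 29.6, 43.9]

n = len(length)
xbar, ybar = sum(length) / n, sum(width) / n
sxx = sum((x - xbar) ** 2 for x in length)
b = sum((x - xbar) * (y - ybar) for x, y in zip(length, width)) / sxx
a = ybar - b * xbar

sse = sum((y - (a + b * x)) ** 2 for x, y in zip(length, width))
s = math.sqrt(sse / (n - 2))   # standard deviation of the error terms (residuals)
se_b = s / math.sqrt(sxx)      # standard error of the slope

print(round(s, 2))      # typical vertical miss of the line, in mm
print(round(se_b, 5))   # 0.04756, the SE_b used in Example 3
```

Note the relationship: SE_b is just s divided by the square root of Σ(x − x̄)², so more spread in the explanatory variable means a better-pinned-down slope.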

Examples
[6.] Looking back at the iris data in Example [4], identify and interpret the standard error of the slope.

First of all, note that this is asking about SE_b, which is 0.01889 in this data set. This means that we can expect the typical sample slope (sepal length divided by petal length) to be about 0.01889 (cm/cm) above or below the population slope (β).

[7.] While we're at it: from Example [4], identify and interpret the standard deviation of the residuals.

In this particular output, that item is "residual standard error." It's 0.4071. This means that we can expect the typical sepal length to be 0.4071 cm above or below the predicted sepal length for any petal length; we expect an error of 0.4071 cm when predicting a sepal length from a petal length.
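As a closing check, the interval from Example [3] can be reproduced directly from b, t*, and SE_b:

```python
# Values from Example 3 of these notes
b, t_star, se_b = 1.04885, 2.306, 0.04756

lo, hi = b - t_star * se_b, b + t_star * se_b
print(round(lo, 4), round(hi, 4))  # 0.9392 1.1585
```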