MULTIPLE REGRESSION METHODS

DEPARTMENT OF POLITICAL SCIENCE AND INTERNATIONAL RELATIONS
Posc/Uapp 816 Class 17
MULTIPLE REGRESSION METHODS

I. AGENDA:
    A. Residuals
    B. Transformations
        1. A useful procedure for making transformations
    C. Reading: Agresti and Finlay, Statistical Methods for the Social Sciences, 3rd edition.

II. RESIDUALS:
    A. MINITAB identifies cases that exert leverage on (that is, disproportionately affect) the estimators, as well as very poorly fitting cases (that is, cases with large residuals).
        1. Normally one would look at each of these cases carefully to make sure there are no measurement errors or substantive reasons that should be taken into account.
    B. Partial regression plots: summary
        1. Assume a dependent variable Y and K independent (predictor) variables.
        2. A partial plot shows the relationship between Y and one of the predictors, $X_k$, after both have been adjusted for the influence of the remaining K - 1 variables.
        3. Method:
            i. Regress Y on the K - 1 variables and obtain residuals.
            ii. Regress $X_k$ on the K - 1 variables and obtain residuals.
            iii. Plot the first set of residuals against the second set to obtain the partial regression plot.
                1) This plot may indicate the need to transform the data. (See below.)
            iv. Regress the first set of residuals on the second.
                1) The intercept will be 0.
                2) The regression coefficient will equal the partial regression coefficient obtained when Y is regressed on all K variables.

III. TRANSFORMING DATA
    A. Frequently plots will reveal patterns that indicate one or more variables should be transformed in order to meet the assumptions and requirements of regression analysis.
        1. OLS regression assumes that the model has been correctly specified; in particular, Y should be a linear function of the X's.
        2. Moreover, variables sometimes need to be transformed to make their observed distributions more symmetrical.
        3. The "raw" or "original" data can sometimes be transformed to new values, Y' and/or X', in a way that creates linear relations and/or symmetry.
        4. One way to find an appropriate transformation is to use the so-called "ladder powers."
    B. Here's a motivating example:
        1. The next figure shows the relationship between sulfur dioxide and mortality.
        2. The relationship seems slightly curved, right?
    C. Sometimes a variable will be highly skewed.
        1. To see this let's switch to a new data set, one used last semester.
        2. It includes per capita crime rates and percent living in poverty (or classified as poor) for 506 districts in Boston.
            i. The data were drawn from the Data and Story Library at StatLib, located at Carnegie Mellon University.
        3. Here is a stem-and-leaf display of the per capita crime variable.

            [Stem-and-leaf display of per capita crime]
        4. It's clear that the data are highly skewed. Most values are below [value missing].
        5. Moreover, it is hard to plot 500-plus data points.
        6. So I took a random sample of 50 cases from the file to use in a preliminary analysis.
        7. Here is the plot of crime versus percent poor.

            i. We can see that a linear model may not be appropriate, partly because Y is so highly skewed and perhaps because the relationship is not linear.
    D. What to do?
        1. We need a systematic way to decide how to transform variables.
        2. First let's consider bivariate relationships.
        3. Basic idea:
            i. Rank the X scores from lowest to highest.
            ii. Divide them into three roughly equal batches (i.e., each batch has about 1/3 of the cases).
                1) If N is evenly divisible by 3, each batch has the same number of cases.
                2) If N divided by 3 has a remainder of 1, put the extra case in the middle batch.
                3) If N divided by 3 has a remainder of 2, put an extra case in each end batch.
            iii. Find the median X in each of the three batches. Call these medians $X_L$, $X_M$, and $X_H$.
            iv. Find the median Y's for the Y's that correspond to the X's in each batch. The median Y may or may not involve the same cases as the X median. In other words,
                1) The X's have been divided into three groups.
                2) Find the Y's that correspond to these X's.
                3) For each of the three batches of Y's find the medians: $Y_L$, $Y_M$, and $Y_H$.
                4) These medians need not be actual data points.
            v. Find the half slopes:
                1) The left or lower half slope is: $b_L = \dfrac{Y_M - Y_L}{X_M - X_L}$
                2) and the upper or right half slope is: $b_R = \dfrac{Y_H - Y_M}{X_H - X_M}$
        4. The half slopes can be used to check for linearity and to pick an appropriate transformation (if any exists) that will "straighten out" the relationship so that OLS can be applied:
            i. Find the half slopes and sketch them in the scatter plot. If the data are linear, the two half slopes will be roughly equal and their graph will be a nearly straight line. If, on the other hand, the relationship is not linear, then the graphs of the half slopes will form an "arrow" (see below) which you can use to pick a transformation.
            ii. Calculate the half slope ratio by dividing $b_L$ by $b_R$: if the relationship is linear, the ratio will be about 1.0; if not, it will be noticeably less than or greater than 1.0.
            iii. If the half slope ratio is negative, one slope is positive and the other is negative, and the ladder powers will not help.
    E. Using the half slopes. Consider the following:
        1. Suppose data points were dispersed roughly as shown.
        2. There is a relationship between X and Y, but it is not linear.
        3. You can imagine finding half slopes.
            i. I've sketched them in. They are of course not drawn to scale.
        4. You can also imagine obtaining their ratio, which in this figure is greater than zero.
            i. Both slopes have the same sign, here negative.
        5. The left slope is larger (steeper) than the right slope.
            i. So you can determine that the ratio is greater than 1.0.
            Figure 3
        6. You can imagine drawing an arrow using the two half slopes, as I have done.
            i. This arrow points down the Y and X axes.
            ii. That in turn suggests that we transform either Y or X or both by taking powers down the ladder.
                1) See below. For now, going down means taking the square root or logarithm or some other lower power of X and/or Y.
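The three-batch procedure in D.3 and the half slope ratio in D.4 can be sketched in Python. This is only an illustrative sketch, not part of the original notes: it assumes NumPy is available, the function names `batch_sizes` and `half_slopes` are hypothetical, and it assumes the batch medians of X are distinct so that neither denominator is zero.

```python
import numpy as np

def batch_sizes(n):
    """Sizes of the (low, middle, high) batches for n cases, following D.3.ii."""
    base, rem = divmod(n, 3)
    if rem == 0:
        return (base, base, base)          # even split
    if rem == 1:
        return (base, base + 1, base)      # extra case goes to the middle batch
    return (base + 1, base, base + 1)      # one extra case in each end batch

def half_slopes(x, y):
    """Return (b_L, b_R, b_L / b_R) computed from the three batch medians."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(x, kind="stable")   # rank the X scores from lowest to highest
    x, y = x[order], y[order]
    n_low, n_mid, _ = batch_sizes(len(x))
    cuts = (slice(0, n_low), slice(n_low, n_low + n_mid), slice(n_low + n_mid, len(x)))
    # Median X and median of the corresponding Y's within each batch
    x_l, x_m, x_h = (np.median(x[c]) for c in cuts)
    y_l, y_m, y_h = (np.median(y[c]) for c in cuts)
    b_left = (y_m - y_l) / (x_m - x_l)
    b_right = (y_h - y_m) / (x_h - x_m)
    return b_left, b_right, b_left / b_right

if __name__ == "__main__":
    xs = np.arange(1, 10)
    print(half_slopes(xs, xs ** 2))        # curved relationship
    print(half_slopes(xs, 2 * xs + 1))     # straight line
```

A ratio close to 1.0 is consistent with a linear relationship; a positive ratio well away from 1.0 points toward a move on the ladder, and a negative ratio signals that the ladder powers will not help.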

        7. It's possible that data would be related as indicated in Figure 4:
            i. Now there is a curved positive correlation.
            ii. The left half slope is smaller than the right, although they both have positive signs.
                1) So again the ratio is positive and a transformation of either X or Y or both might help.
            iii. The arrow formed by sketching the half slopes points up the X axis and down the Y axis.
                1) As we will see, this implies converting X by taking a higher power, such as $X^2$, and/or lower powers of Y, such as log(Y).
            Figure 4
        8. Now look at the next figure. We can analyze it in the same way by drawing half slopes and creating arrows.

            Figure 5
            i. The arrow points up the Y axis and down the X axis, so we would reverse the transformations mentioned above.
            ii. We might have to push Y up and/or pull X down.
    F. Each of these figures contains an implied arrow that represents the half slopes. Since there are "bends" in the line (hence the arrows), we can see that the relationships are nonlinear.
        1. The direction in which the implied arrow points indicates what transformations of X and/or Y may help make the relationship more nearly linear.
            i. The words "push up" mean take powers of the variable that are greater than 1.0. That is, "push up X" means transform X by squaring or cubing it, or perhaps taking the 2.5 power (that is, $X^{2.5}$); trial and error is necessary to see which transformation works best.
    G. The words "pull down" mean taking a power that is less than 1.0; for example, one can take the square root (the 1/2 power) or the logarithm (the 0 power) of X or Y or both. Again trial and error is necessary to give the best fit.
        1. Ladder Powers: when "pushing" or "pulling" a variable, one can use the so-called ladder powers (named by John Tukey, a statistician at Bell Labs):

            The Ladder Powers

            Power (step on ladder)   Transformation       Name              Result
            3                        $X^3$                cube              Pushes X "up"
            2                        $X^2$                square
            1                        $X$                  "raw" score       No change
            1/2                      $X^{1/2}$            square root
            0                        log(X) (base 10)     logarithm
            -1/2                     $-1/X^{1/2}$         reciprocal root
            -1                       $-1/X$               reciprocal        Pulls X "down"

            It is possible to take half steps or even more refined intermediate steps, such as raising X to the 3/4 power (i.e., $X^{3/4}$).

IV. AN EXAMPLE WITH SIMULATED DATA:
    A. Here is an example using simulated data.
        1. I created a population based on the model: $Y_i = \beta_0 + \beta_1 X_i^{2.9} + \varepsilon_i$
        2. Note that $\beta_0 = 0$ and $\beta_1 = 1.0$. Y is simply $X^{2.9}$ plus an error term. That is, X has been raised to the 2.9 power.
        3. I then sampled 100 cases from this population.
        4. Assume then that I have 100 X-Y pairs and am trying to find the best fitting model for them.
        5. Normally, I would plot Y against X. In this case the plot is:

            Figure 6
        6. Since I am assuming that the "true" model is not known, my first guess is a simple linear equation: $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
        7. But the plot suggests that there is a non-linear relationship between Y and X.
            i. Indeed, if one imagined half slopes forming the head of an arrow, one would think to transform X by going up the ladder powers (that is, by taking, say, X-squared) or to move down the ladder powers with Y (that is, by using the square root of Y).
        8. But for now I can proceed as if using raw X and Y were satisfactory.
            i. Here are the results from a bare-bones regression analysis.

                The regression equation is SampleY = ... + ... SampleX
                Predictor   Coef   StDev   T   P
                Constant    (values not recoverable)
                SampleX     (values not recoverable)
                S = 5840    R-Sq = .851
                Analysis of Variance
                Source           DF   SS   MS   F   P
                Regression       (values not recoverable)
                Residual Error   (values not recoverable)
                Total            (values not recoverable)

            ii. The sample data seem to fit the linear model quite well. Look at $R^2$ and s.
            iii. The estimated coefficient relating Y to X is 997.7, which we know is incorrect.
                1) Also the constant is [value missing], which we know is wrong since we created the population to have $\beta_0 = 0$.
            iv. Still, the data provide a good fit.
            v. But if we use the half slope ratios, or approximations of them, we can possibly improve the fit.
                1) The imaginary arrow suggests going up the ladder in X (or down in Y, but let's try X first), so we can create a variable, X*, which is simply $X^* = X^2$.
                2) The plot of it against Y follows.

                Figure 7
            vi. The points seem to lie on a straight line, so we use regression procedures to obtain the following.

                The regression equation is SampleY = ... + ... SampleX2
                Predictor   Coef   StDev   T   P
                Constant    (values not recoverable)
                SampleX2    (values not recoverable)
                S = 2311    R-Sq = 97.7%    R-Sq(adj) = 97.6%
                Analysis of Variance
                Source           DF   SS   MS   F   P
                Regression       (values not recoverable)
                Residual Error   (values not recoverable)
                Total            (values not recoverable)
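The comparison in section IV can be imitated with a small simulation. This is a hedged sketch rather than a reproduction of the analysis in these notes: the range of X, the error standard deviation, and the random seed are all assumptions (the notes do not state them), and `fit_line` is a hypothetical helper. Only the exponent 2.9, the sample size of 100, and the values $\beta_0 = 0$ and $\beta_1 = 1.0$ come from the text, so the printed numbers will not match the MINITAB output.

```python
import numpy as np

def fit_line(x, y):
    """OLS of y on a single predictor x; returns (intercept, slope, r_squared)."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    r_squared = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return coef[0], coef[1], r_squared

rng = np.random.default_rng(816)              # arbitrary seed (assumption)
x = rng.uniform(1.0, 20.0, size=100)          # assumed range for X
y = x ** 2.9 + rng.normal(0.0, 200.0, 100)    # beta0 = 0, beta1 = 1; assumed error sd

results = {}
for power in (1.0, 2.0, 2.9):                 # raw X, one step up the ladder, true power
    results[power] = fit_line(x ** power, y)
    b0, b1, r2 = results[power]
    print(f"power {power}: intercept {b0:10.2f}  slope {b1:10.3f}  R-sq {r2:.3f}")
```

Under these assumptions the fit improves as X is pushed up the ladder, and the slope approaches 1.0 only when the transformation matches the power used to build the population.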

            vii. Although the $R^2$ has become nearly perfect, we know (because we created the population model) that the estimated coefficients are off.
                1) Of course they are closer to the population values of $\beta_0 = 0$ and $\beta_1 = 1.0$.
                2) Were we to transform X still again, by raising it to a somewhat higher power, we would find the coefficients closer to the true values.
                    a) Actually the figure above hints at a slightly curved relationship.
                3) Also, don't forget that these data constitute a relatively small sample from the population in which $Y = X^{2.5} + \text{error}$.
                    a) So our transformation is not too bad.

V. NOTES ARE CONTINUED ON NEXT PAGES:
    A. The file is too large to fit on a single disk so I split it into two parts.
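The partial regression plot method summarized in section II.B can be checked numerically. The following is a minimal sketch on made-up data, assuming NumPy; the helper `ols`, the variable names, and the simulated coefficients are illustrative assumptions, not anything taken from the course data. It regresses Y and one predictor on the remaining predictors, then regresses the first set of residuals on the second.

```python
import numpy as np

def ols(predictors, y):
    """OLS with an intercept; returns (coefficients, residuals)."""
    design = np.column_stack([np.ones(len(y)), predictors])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef, y - design @ coef

rng = np.random.default_rng(17)                   # arbitrary seed (assumption)
n = 200
others = rng.normal(size=(n, 2))                  # the remaining K - 1 predictors
x_k = 0.5 * others[:, 0] - 0.3 * others[:, 1] + rng.normal(size=n)
y = 1.0 + 2.0 * x_k + 1.5 * others[:, 0] - 1.0 * others[:, 1] + rng.normal(size=n)

_, resid_y = ols(others, y)                       # step i: Y adjusted for the others
_, resid_x = ols(others, x_k)                     # step ii: X_k adjusted for the others
partial_coef, _ = ols(resid_x, resid_y)           # step iv: residuals on residuals
full_coef, _ = ols(np.column_stack([x_k, others]), y)

print("intercept of residual regression:", partial_coef[0])
print("slope of residual regression:    ", partial_coef[1])
print("coefficient of X_k in full model:", full_coef[1])
```

Plotting `resid_y` against `resid_x` would give the partial regression plot itself; the printed values illustrate the two properties listed under II.B.3.iv.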


More information

Topics. Estimation. Regression Through the Origin. Basic Econometrics in Transportation. Bivariate Regression Discussion

Topics. Estimation. Regression Through the Origin. Basic Econometrics in Transportation. Bivariate Regression Discussion 1/24 Topics Basic Econometrics in Transportation Bivariate Regression Discussion Amir Samimi Civil Engineering Department Sharif University of Technology First we consider the case of regression through

More information

Reteach 2-3. Graphing Linear Functions. 22 Holt Algebra 2. Name Date Class

Reteach 2-3. Graphing Linear Functions. 22 Holt Algebra 2. Name Date Class -3 Graphing Linear Functions Use intercepts to sketch the graph of the function 3x 6y 1. The x-intercept is where the graph crosses the x-axis. To find the x-intercept, set y 0 and solve for x. 3x 6y 1

More information

Multiple Regression an Introduction. Stat 511 Chap 9

Multiple Regression an Introduction. Stat 511 Chap 9 Multiple Regression an Introduction Stat 511 Chap 9 1 case studies meadowfoam flowers brain size of mammals 2 case study 1: meadowfoam flowering designed experiment carried out in a growth chamber general

More information

Problem Set 1 ANSWERS

Problem Set 1 ANSWERS Economics 20 Prof. Patricia M. Anderson Problem Set 1 ANSWERS Part I. Multiple Choice Problems 1. If X and Z are two random variables, then E[X-Z] is d. E[X] E[Z] This is just a simple application of one

More information

Secondary Math 2H Unit 3 Notes: Factoring and Solving Quadratics

Secondary Math 2H Unit 3 Notes: Factoring and Solving Quadratics Secondary Math H Unit 3 Notes: Factoring and Solving Quadratics 3.1 Factoring out the Greatest Common Factor (GCF) Factoring: The reverse of multiplying. It means figuring out what you would multiply together

More information

STUDY GUIDE Math 20. To accompany Intermediate Algebra for College Students By Robert Blitzer, Third Edition

STUDY GUIDE Math 20. To accompany Intermediate Algebra for College Students By Robert Blitzer, Third Edition STUDY GUIDE Math 0 To the students: To accompany Intermediate Algebra for College Students By Robert Blitzer, Third Edition When you study Algebra, the material is presented to you in a logical sequence.

More information

XVI. Transformations. by: David Scott and David M. Lane

XVI. Transformations. by: David Scott and David M. Lane XVI. Transformations by: David Scott and David M. Lane A. Log B. Tukey's Ladder of Powers C. Box-Cox Transformations D. Exercises The focus of statistics courses is the exposition of appropriate methodology

More information

Ø Set of mutually exclusive categories. Ø Classify or categorize subject. Ø No meaningful order to categorization.

Ø Set of mutually exclusive categories. Ø Classify or categorize subject. Ø No meaningful order to categorization. Statistical Tools in Evaluation HPS 41 Fall 213 Dr. Joe G. Schmalfeldt Types of Scores Continuous Scores scores with a potentially infinite number of values. Discrete Scores scores limited to a specific

More information

Lab 6 Forces Part 2. Physics 225 Lab

Lab 6 Forces Part 2. Physics 225 Lab b Lab 6 Forces Part 2 Introduction This is the second part of the lab that you started last week. If you happen to have missed that lab then you should go back and read it first since this lab will assume

More information

(4) 1. Create dummy variables for Town. Name these dummy variables A and B. These 0,1 variables now indicate the location of the house.

(4) 1. Create dummy variables for Town. Name these dummy variables A and B. These 0,1 variables now indicate the location of the house. Exam 3 Resource Economics 312 Introductory Econometrics Please complete all questions on this exam. The data in the spreadsheet: Exam 3- Home Prices.xls are to be used for all analyses. These data are

More information

Interpreting coefficients for transformed variables

Interpreting coefficients for transformed variables Interpreting coefficients for transformed variables! Recall that when both independent and dependent variables are untransformed, an estimated coefficient represents the change in the dependent variable

More information

Business Statistics. Lecture 10: Correlation and Linear Regression

Business Statistics. Lecture 10: Correlation and Linear Regression Business Statistics Lecture 10: Correlation and Linear Regression Scatterplot A scatterplot shows the relationship between two quantitative variables measured on the same individuals. It displays the Form

More information

Statistics and Data Analysis

Statistics and Data Analysis Statistics and Data Analysis Professor William Greene Phone: 212.998.0876 Office: KMC 7-90 Home page: http://people.stern.nyu.edu/wgreene Email: wgreene@stern.nyu.edu Course web page: http://people.stern.nyu.edu/wgreene/statistics/outline.htm

More information

Probability Distributions

Probability Distributions CONDENSED LESSON 13.1 Probability Distributions In this lesson, you Sketch the graph of the probability distribution for a continuous random variable Find probabilities by finding or approximating areas

More information

Experimental Design and Graphical Analysis of Data

Experimental Design and Graphical Analysis of Data Experimental Design and Graphical Analysis of Data A. Designing a controlled experiment When scientists set up experiments they often attempt to determine how a given variable affects another variable.

More information

SMAM 314 Computer Assignment 5 due Nov 8,2012 Data Set 1. For each of the following data sets use Minitab to 1. Make a scatterplot.

SMAM 314 Computer Assignment 5 due Nov 8,2012 Data Set 1. For each of the following data sets use Minitab to 1. Make a scatterplot. SMAM 314 Computer Assignment 5 due Nov 8,2012 Data Set 1. For each of the following data sets use Minitab to 1. Make a scatterplot. 2. Fit the linear regression line. Regression Analysis: y versus x y

More information

Pre-Calculus Multiple Choice Questions - Chapter S8

Pre-Calculus Multiple Choice Questions - Chapter S8 1 If every man married a women who was exactly 3 years younger than he, what would be the correlation between the ages of married men and women? a Somewhat negative b 0 c Somewhat positive d Nearly 1 e

More information

Intro to Linear Regression

Intro to Linear Regression Intro to Linear Regression Introduction to Regression Regression is a statistical procedure for modeling the relationship among variables to predict the value of a dependent variable from one or more predictor

More information

AP Statistics. Chapter 6 Scatterplots, Association, and Correlation

AP Statistics. Chapter 6 Scatterplots, Association, and Correlation AP Statistics Chapter 6 Scatterplots, Association, and Correlation Objectives: Scatterplots Association Outliers Response Variable Explanatory Variable Correlation Correlation Coefficient Lurking Variables

More information

Business 320, Fall 1999, Final

Business 320, Fall 1999, Final Business 320, Fall 1999, Final name You may use a calculator and two cheat sheets. You have 3 hours. I pledge my honor that I have not violated the Honor Code during this examination. Obvioiusly, you may

More information

WISE Regression/Correlation Interactive Lab. Introduction to the WISE Correlation/Regression Applet

WISE Regression/Correlation Interactive Lab. Introduction to the WISE Correlation/Regression Applet WISE Regression/Correlation Interactive Lab Introduction to the WISE Correlation/Regression Applet This tutorial focuses on the logic of regression analysis with special attention given to variance components.

More information

BIOSTATISTICS NURS 3324

BIOSTATISTICS NURS 3324 Simple Linear Regression and Correlation Introduction Previously, our attention has been focused on one variable which we designated by x. Frequently, it is desirable to learn something about the relationship

More information

appstats8.notebook October 11, 2016

appstats8.notebook October 11, 2016 Chapter 8 Linear Regression Objective: Students will construct and analyze a linear model for a given set of data. Fat Versus Protein: An Example pg 168 The following is a scatterplot of total fat versus

More information

Stat 101 Exam 1 Important Formulas and Concepts 1

Stat 101 Exam 1 Important Formulas and Concepts 1 1 Chapter 1 1.1 Definitions Stat 101 Exam 1 Important Formulas and Concepts 1 1. Data Any collection of numbers, characters, images, or other items that provide information about something. 2. Categorical/Qualitative

More information

Intro to Linear Regression

Intro to Linear Regression Intro to Linear Regression Introduction to Regression Regression is a statistical procedure for modeling the relationship among variables to predict the value of a dependent variable from one or more predictor

More information

Box-Cox Transformations

Box-Cox Transformations Box-Cox Transformations Revised: 10/10/2017 Summary... 1 Data Input... 3 Analysis Summary... 3 Analysis Options... 5 Plot of Fitted Model... 6 MSE Comparison Plot... 8 MSE Comparison Table... 9 Skewness

More information

22S39: Class Notes / November 14, 2000 back to start 1

22S39: Class Notes / November 14, 2000 back to start 1 Model diagnostics Interpretation of fitted regression model 22S39: Class Notes / November 14, 2000 back to start 1 Model diagnostics 22S39: Class Notes / November 14, 2000 back to start 2 Model diagnostics

More information

Mrs. Poyner/Mr. Page Chapter 3 page 1

Mrs. Poyner/Mr. Page Chapter 3 page 1 Name: Date: Period: Chapter 2: Take Home TEST Bivariate Data Part 1: Multiple Choice. (2.5 points each) Hand write the letter corresponding to the best answer in space provided on page 6. 1. In a statistics

More information

Ch 13 & 14 - Regression Analysis

Ch 13 & 14 - Regression Analysis Ch 3 & 4 - Regression Analysis Simple Regression Model I. Multiple Choice:. A simple regression is a regression model that contains a. only one independent variable b. only one dependent variable c. more

More information

Linear Regression. Linear Regression. Linear Regression. Did You Mean Association Or Correlation?

Linear Regression. Linear Regression. Linear Regression. Did You Mean Association Or Correlation? Did You Mean Association Or Correlation? AP Statistics Chapter 8 Be careful not to use the word correlation when you really mean association. Often times people will incorrectly use the word correlation

More information

Steps to take to do the descriptive part of regression analysis:

Steps to take to do the descriptive part of regression analysis: STA 2023 Simple Linear Regression: Least Squares Model Steps to take to do the descriptive part of regression analysis: A. Plot the data on a scatter plot. Describe patterns: 1. Is there a strong, moderate,

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression OI CHAPTER 7 Important Concepts Correlation (r or R) and Coefficient of determination (R 2 ) Interpreting y-intercept and slope coefficients Inference (hypothesis testing and confidence

More information

Models with qualitative explanatory variables p216

Models with qualitative explanatory variables p216 Models with qualitative explanatory variables p216 Example gen = 1 for female Row gpa hsm gen 1 3.32 10 0 2 2.26 6 0 3 2.35 8 0 4 2.08 9 0 5 3.38 8 0 6 3.29 10 0 7 3.21 8 0 8 2.00 3 0 9 3.18 9 0 10 2.34

More information

9 Correlation and Regression

9 Correlation and Regression 9 Correlation and Regression SW, Chapter 12. Suppose we select n = 10 persons from the population of college seniors who plan to take the MCAT exam. Each takes the test, is coached, and then retakes the

More information