Regression Models: Revised Teaching Suggestions and Alternative Examples


M04_REND6289_10_IM_C04.QXD 5/7/08 2:49 PM

CHAPTER 4: Regression Models

TEACHING SUGGESTIONS

Teaching Suggestion 4.1: Which Is the Independent Variable?
We find that students are often confused about which variable is independent and which is dependent in a regression model. For example, in Triple A's problem, clarify which variable is X and which is Y. Emphasize that the dependent variable (Y) is what we are trying to predict based on the value of the independent (X) variable. Use examples such as the time required to drive to a store and the distance traveled, the total number of units sold and the selling price of a product, and the cost of a computer and the processor speed.

Teaching Suggestion 4.2: Statistical Correlation Does Not Always Mean Causality.
Students should understand that a high r² doesn't always mean one variable will be a good predictor of the other. Explain that skirt lengths and stock market prices may be correlated, but raising one doesn't necessarily mean the other will go up or down. An interesting study indicated that, over a 10-year period, the salaries of college professors were highly correlated to the dollar sales volume of alcoholic beverages (both were actually correlated with inflation).

Teaching Suggestion 4.3: Give students a set of data and have them plot the data and manually draw a line through it. A discussion of which line is best can help them appreciate the least-squares criterion.

Teaching Suggestion 4.4: Select some randomly generated values for X and Y (you can use random numbers from the random number table in Chapter 15 or use the RAND function in Excel). Develop a regression line using Excel and discuss the coefficient of determination and the F-test. Students will see that a regression line can always be developed, but it may not necessarily be useful.

Teaching Suggestion 4.5: A discussion of the long formulas and short-cut formulas that are provided in the appendix is helpful.
The long formulas provide students with a better understanding of the meaning of the SSE and SST. Since many people use computers for regression problems, it helps to see the original formulas. The short-cut formulas are helpful if students are performing the computations on a calculator.

ALTERNATIVE EXAMPLES

Alternative Example 4.1: The sales manager of a large apartment rental complex feels the demand for apartments may be related to the number of newspaper ads placed during the previous month. She has collected the data shown in the accompanying table.

Ads purchased (X)   Apartments leased (Y)
15                  6
9                   4
40                  16
20                  6
25                  13
25                  9
15                  10
35                  16

We can find a mathematical equation by using the least-squares regression approach.

Leases, Y   Ads, X   (X - X̄)²   (X - X̄)(Y - Ȳ)
6           15       64          32
4           9        196         84
16          40       289         102
6           20       9           12
13          25       4           6
9           25       4           -2
10          15       64          0
16          35       144         72
ΣY = 80     ΣX = 184  Σ = 774     Σ = 306

Ȳ = 80/8 = 10        X̄ = 184/8 = 23
b₁ = 306/774 = 0.395
b₀ = 10 - 0.395(23) = 0.915

The estimated regression equation is

Ŷ = 0.915 + 0.395X

or

Apartments leased = 0.915 + 0.395(ads placed)

If the number of ads is 30, we can estimate the number of apartments leased with the regression equation:

0.915 + 0.395(30) = 12.76, or 13 apartments

Alternative Example 4.2: Given the data on ads and apartment rentals in Alternative Example 4.1, find the coefficient of determination. The following have been computed in the table that follows:

SST = 150; SSE = 29.02; SSR = 120.76

(Note: Round-off error may cause this to be slightly different from a computer solution.)
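The hand computations in Alternative Examples 4.1 and 4.2 can be reproduced with a short script. This is a pure-Python sketch, not part of the original manual; the comments show the values calculated above.

```python
# Alternative Examples 4.1 and 4.2 in code: least-squares fit of
# apartments leased (Y) on ads purchased (X), then SST, SSE, SSR, r^2.
ads    = [15, 9, 40, 20, 25, 25, 15, 35]   # X
leases = [6, 4, 16, 6, 13, 9, 10, 16]      # Y
n = len(ads)

x_bar = sum(ads) / n       # 184/8 = 23
y_bar = sum(leases) / n    # 80/8 = 10
sxx = sum((x - x_bar) ** 2 for x in ads)                           # 774
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ads, leases))  # 306

b1 = sxy / sxx             # 306/774 = 0.3953 (0.395 in the text)
b0 = y_bar - b1 * x_bar    # 0.907 (the text's 0.915 uses the rounded slope)
print(f"Yhat = {b0:.3f} + {b1:.3f}X")
print(f"30 ads -> {b0 + b1 * 30:.2f} apartments")  # about 12.8, or 13 leases

sst = sum((y - y_bar) ** 2 for y in leases)                        # 150
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(ads, leases))   # about 29.02
ssr = sst - sse            # using the identity SST = SSR + SSE
print(f"r^2 = {ssr / sst:.2f}")                    # 0.81, so r = +0.90
```

The small differences from the text's intercept of 0.915 and SSR of 120.76 are exactly the round-off the example itself warns about.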

Y      X       (Y - Ȳ)²   Ŷ = 0.915 + 0.395X   (Y - Ŷ)²     (Ŷ - Ȳ)²
6.00   15.00   16          6.84                 0.706        9.986
4.00   9.00    36          4.47                 0.221        30.581
16.00  40.00   36          16.715               0.511        45.091
6.00   20.00   16          8.815                7.924        1.404
13.00  25.00   9           10.79                4.884        0.624
9.00   25.00   1           10.79                3.204        0.624
10.00  15.00   0           6.84                 9.986        9.986
16.00  35.00   36          14.74                1.588        22.468
80.00  184.00  SST = 150.00  80.00              SSE = 29.02  SSR = 120.76

From this the coefficient of determination is

r² = SSR/SST = 120.76/150 = 0.81

Alternative Example 4.3: For Alternative Examples 4.1 and 4.2, dealing with ads, X, and apartments leased, Y, compute the correlation coefficient. Since r² = 0.81 and the slope is positive (+0.395), the correlation coefficient is the positive square root of 0.81: r = +0.90.

SOLUTIONS TO DISCUSSION QUESTIONS AND PROBLEMS

4-1. The term least-squares means that the regression line will minimize the sum of the squared errors (SSE). No other line will give a lower SSE.

4-2. Dummy variables are used when a qualitative factor such as the gender of an individual (male or female) is to be included in the model. Usually this is given a value of 1 when the condition is met (e.g., the person is male) and 0 otherwise. When there are more than two levels or values for the qualitative factor, more than one dummy variable must be used. The number of dummy variables is one less than the number of possible values or categories. For example, if students are classified as freshmen, sophomores, juniors, and seniors, three dummy variables would be necessary.

4-3. The coefficient of determination (r²) is the square of the coefficient of correlation (r). Both of these give an indication of how well a regression model fits a particular set of data. An r² value of 1 would indicate a perfect fit of the regression model to the points. This would also mean that r would equal +1 or -1.

4-4. A scatter diagram is a plot of the data.
This graphical image helps to determine if a linear relationship is present, or if another type of relationship would be more appropriate.

4-5. The adjusted r² value is used to help determine if a new variable should be added to a regression model. Generally, if the adjusted r² value increases when a new variable is added to the model, the variable should be included. If the adjusted r² value declines or does not increase when a new variable is added, then the variable should not be added to the model.

4-6. The F-test is used to determine if the overall regression model is helpful in predicting the value of the dependent variable (Y). If the F-value is large and the p-value or significance level is low, then we can conclude that there is a linear relationship and the model is useful, as these results would probably not occur by chance. If the significance level is high, then the model is not useful and the results in the sample could be due to random variations.

4-7. The SSE is the sum of the squared errors in a regression model. SST = SSR + SSE.

4-8. When the residuals (errors) are plotted after a regression line is found, the errors should be random and should not show any significant pattern. If a pattern does exist, then the assumptions may not be met or another model (perhaps nonlinear) would be more appropriate.

4-9.
a. Ŷ = 36 + 4.3(70) = 337
b. Ŷ = 36 + 4.3(80) = 380
c. Ŷ = 36 + 4.3(90) = 423

4-10.
a. [Scatter plot of demand (Y, 0 to 12) versus TV appearances (X, 0 to 10).]
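The hand computations for Problem 4-10 and the F-test in Problem 4-11 can be checked with a short script. This is a pure-Python sketch, not part of the original manual; the comments show the values worked out in the solutions.

```python
# Problems 4-10 and 4-11: demand (Y) regressed on TV appearances (X),
# followed by the ANOVA quantities MSE, MSR, and F.
x = [3, 4, 7, 6, 8, 5]     # TV appearances
y = [3, 6, 7, 5, 10, 8]    # demand
n = len(x)

x_bar, y_bar = sum(x) / n, sum(y) / n          # 5.5 and 6.5
sxx = sum((xi - x_bar) ** 2 for xi in x)       # 17.5
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))  # 17.5
b1 = sxy / sxx                                 # 1
b0 = y_bar - b1 * x_bar                        # 1

sst = sum((yi - y_bar) ** 2 for yi in y)                        # 29.5
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))   # 12
ssr = sst - sse                                                 # 17.5

k = 1                     # one independent variable
mse = sse / (n - k - 1)   # 12/4 = 3
msr = ssr / k             # 17.5
f = msr / mse             # 5.83
print(f"Yhat = {b0:g} + {b1:g}X, F = {f:.2f}")
# F(0.05, 1, 4) = 7.71 from the table; 5.83 < 7.71, so do not reject H0.
```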

b. (Y = demand; X = TV appearances)

Y      X      (X - X̄)²  (Y - Ȳ)²  (X - X̄)(Y - Ȳ)  Ŷ   (Y - Ŷ)²  (Ŷ - Ȳ)²
3      3      6.25      12.25     8.75             4    1         6.25
6      4      2.25      0.25      0.75             5    1         2.25
7      7      2.25      0.25      0.75             8    1         2.25
5      6      0.25      2.25      -0.75            7    4         0.25
10     8      6.25      12.25     8.75             9    1         6.25
8      5      0.25      2.25      -0.75            6    4         0.25
ΣY=39  ΣX=33  17.5      29.5      17.5                  12        17.5
                        (SST)                           (SSE)     (SSR)

Ȳ = 6.5; X̄ = 5.5
SST = 29.5; SSE = 12; SSR = 17.5
b₁ = 17.5/17.5 = 1
b₀ = 6.5 - 1(5.5) = 1
The regression equation is Ŷ = 1 + 1X.

c. Ŷ = 1 + 1X = 1 + 1(6) = 7.

4-11. See the table for the solution to Problem 4-10 to obtain some of these numbers.
MSE = SSE/(n - k - 1) = 12/(6 - 1 - 1) = 3
MSR = SSR/k = 17.5/1 = 17.5
F = MSR/MSE = 17.5/3 = 5.83
df₁ = k = 1
df₂ = n - k - 1 = 6 - 1 - 1 = 4
F(0.05, 1, 4) = 7.71
Do not reject H₀ since 5.83 < 7.71. Therefore, we cannot conclude there is a statistically significant relationship at the 0.05 level.

4-12. Using Excel, the regression equation is Ŷ = 1 + 1X. F = 5.83, and the significance level is 0.073. This is significant at the 0.10 level (0.073 < 0.10), but it is not significant at the 0.05 level. There is marginal evidence of a relationship between demand for drums and TV appearances.

4-13. (Y = final average; X = first test grade)

Y    X    (X - X̄)²  (Y - Ȳ)²  (X - X̄)(Y - Ȳ)  Ŷ     (Y - Ŷ)²  (Ŷ - Ȳ)²
93   98   285.235   196       236.444          91.5   2.264     156.135
78   77   16.901    1         4.111            76     4.168     9.252
84   88   47.457    25        34.444           84.1   0.009     25.977
73   80   1.235     36        6.667            78.2   26.811    0.676
84   96   221.679   25        74.444           90     36.188    121.345
64   61   404.457   225       301.667          64.1   0.015     221.396
64   66   228.346   225       226.667          67.8   14.592    124.994
95   95   192.901   256       222.222          89.3   32.766    105.592
76   69   146.679   9         36.333           70     35.528    80.291
711  730  1544.9    998       1143                    152.341   845.659

b₁ = 1143/1544.9 = 0.740
b₀ = (711/9) - 0.740(730/9) = 18.99
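The worktable for Problem 4-13 can be verified in code. This is a pure-Python sketch, not part of the original manual; the comments show the hand-calculated values.

```python
# Problem 4-13: final average (Y) regressed on first-test grade (X).
test1 = [98, 77, 88, 80, 96, 61, 66, 95, 69]   # X
final = [93, 78, 84, 73, 84, 64, 64, 95, 76]   # Y
n = len(test1)

x_bar, y_bar = sum(test1) / n, sum(final) / n   # 730/9 and 711/9 = 79
sxx = sum((x - x_bar) ** 2 for x in test1)      # about 1544.9
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(test1, final))  # 1143

b1 = sxy / sxx              # about 0.740
b0 = y_bar - b1 * x_bar     # about 18.99
print(f"Yhat = {b0:.2f} + {b1:.3f}X")
print(f"First test 83 -> final average {b0 + b1 * 83:.2f}")  # about 80.4
```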

a. Ŷ = 18.99 + 0.74X
b. Ŷ = 18.99 + 0.74(83) = 80.41
c. r² = SSR/SST = 845.659/998 = 0.85; r = 0.92. This means that 85% of the variability in the final average can be explained by the variability in the first test score.

4-14. See the table for the solution to Problem 4-13 to obtain some of these numbers.
MSE = SSE/(n - k - 1) = 152.341/(9 - 1 - 1) = 21.76
MSR = SSR/k = 845.659/1 = 845.659
F = MSR/MSE = 845.659/21.76 = 38.9
df₁ = k = 1
df₂ = n - k - 1 = 9 - 1 - 1 = 7
F(0.05, 1, 7) = 5.59
Because 38.9 > 5.59, we can conclude (at the 0.05 level) that there is a statistically significant relationship between the first test grade and the final average.

4-15. F = 38.86; the significance level is 0.0004 (which is extremely small), so there is definitely a statistically significant relationship.

4-16.
a. Ŷ = 13,473 + 37.65(1,860) = $83,502.
b. The predicted average selling price for a house this size would be $83,502. Some will sell for more and some will sell for less. There are other factors besides size that influence the price of a house.
c. Some other variables that might be included are the age of the house, the number of bedrooms, and the size of the lot. There are other factors in addition to these that one could identify.
d. The coefficient of determination is r² = (0.63)² = 0.3969.

4-17. The multiple regression equation is Ŷ = $90.00 + $48.50X₁ + $0.40X₂.
a. Number of days on the road: X₁ = 5; distance traveled: X₂ = 300 miles. The amount he may be expected to claim is Ŷ = 90.00 + 48.50(5) + 0.40(300) = $452.50.
b. The reimbursement request, according to the model, appears to be too high. However, this does not mean that it is not justified. The accountants should question Thomas Williams about his expenses to see if there are other explanations for the high cost.
c.
A number of other variables should be included, such as the type of travel (air or car), conference fees if any, expenses for entertainment of customers, and other transportation (cab and limousine) expenses. In addition, the coefficient of correlation is only 0.68 and r² = (0.68)² = 0.46. Thus, about 46% of the variability in the cost of the trip is explained by this model; the other 54% is due to other factors.

4-18. Using computer software to get the regression equation, we get Ŷ = 1.03 + 0.0034X, where Ŷ = predicted GPA and X = SAT score. If a student scores 450 on the SAT, we get Ŷ = 1.03 + 0.0034(450) = 2.56. If a student scores 800 on the SAT, we get Ŷ = 1.03 + 0.0034(800) = 3.75.

4-19.
a. A linear model is reasonable from the graph. [Scatter plot of ridership (100,000s, 0 to 50) versus tourists (millions, 0 to 25).]
b. Ŷ = 5.060 + 1.593X
c. Ŷ = 5.060 + 1.593(10) = 20.99, or 2,099,000 people.
d. If there are no tourists, the predicted ridership would be 5.06 (100,000s), or 506,000. Because X = 0 is outside the range of values that were used to construct the regression model, this number may be questionable.

4-20. The F-value for the F-test is 52.6 and the significance level is extremely small (0.00002), which indicates that there is a statistically significant relationship between the number of tourists and ridership. The coefficient of determination is 0.84, indicating that 84% of the variability in ridership from one year to the next can be explained by the variations in the number of tourists.

4-21.
a. Ŷ = 24,328 + 3026.67X₁ + 6684X₂, where Ŷ = predicted starting salary; X₁ = GPA; X₂ = 1 if business major, 0 otherwise.
b. Ŷ = 24,328 + 3026.67(3.0) + 6684(1) = $40,092.01.
c. The starting salary for business majors tends to be about $6,684 higher than for non-business majors in this sample, even after adjusting for variations in GPA.
d. The overall significance level is 0.099 and r² = 0.69.
Thus, the model is significant at the 0.10 level, and 69% of the variability in starting salary is explained by GPA and major. The model is useful in predicting starting salary.

4-22.
a. Let Ŷ = predicted selling price; X₁ = square footage; X₂ = number of bedrooms; X₃ = age.
The model with square footage: Ŷ = 2367.26 + 46.60X₁; r² = 0.65
The model with number of bedrooms: Ŷ = 1923.5 + 36,137.76X₂; r² = 0.36
The model with age: Ŷ = 147,670.9 - 2424.16X₃; r² = 0.78

All of these models are significant at the 0.01 level or less. The best model uses age as the independent variable: the coefficient of determination is highest for this model, and it is significant.

4-23. Ŷ = 5701.45 + 48.51X₁ - 2540.39X₂ and r² = 0.65. Ŷ = 5701.45 + 48.51(2000) - 2540.39(3) = 95,100.28. Notice the r² value is the same as it was in the previous problem with just square footage as the independent variable. Adding the number of bedrooms did not add any significant information that was not already captured by the square footage, so it should not be included in the model. The r² for this model is also lower than for age alone in the previous problem.

4-24. Ŷ = 82,185.5 + 25.94X₁ - 2151.7X₂ - 1711.5X₃ and r² = 0.89. Ŷ = 82,185.5 + 25.94(2000) - 2151.7(3) - 1711.5(10) = $110,495.40.

4-25. Ŷ = 3071.885 + 6.5326X, where Y = DJIA and X = S&P 500. r = 0.84 and r² = 0.70. Ŷ = 3071.885 + 6.5326(1100) = 10,257.8 (rounded).

4-26. With one independent variable, beds, in the model, r² = 0.88. With just admissions in the model, r² = 0.974. When both variables are in the model, r² = 0.975. Thus, the model with only admissions as the independent variable is the best. Adding the number of beds had virtually no impact on r², and the adjusted r² decreased slightly. Thus, the best model is Ŷ = 1.518 + 0.6686X, where Y = expense and X = admissions.

4-27. Using Excel with Y = MPG, X₁ = horsepower, and X₂ = weight, the models are:
Ŷ = 53.87 - 0.269X₁; r² = 0.77
Ŷ = 57.53 - 0.01X₂; r² = 0.73
Thus, the model with horsepower as the independent variable is better since its r² is higher.

4-28. Ŷ = 57.69 - 0.17X₁ - 0.005X₂, where Y = MPG, X₁ = horsepower, X₂ = weight, and r² = 0.82. This model is better because the coefficient of determination is much higher with both variables than it is with either one individually.

4-29. Let Y = MPG; X₁ = horsepower; X₂ = weight.
The model Ŷ = b₀ + b₁X₁ + b₂X₁² is Ŷ = 69.93 - 0.620X₁ + 0.001747X₁² and has r² = 0.798.
The model Ŷ = b₀ + b₃X₂ + b₄X₂² is Ŷ = 89.09 - 0.0337X₂ + 0.0000039X₂² and has r² = 0.800.
The model Ŷ = b₀ + b₁X₁ + b₂X₁² + b₃X₂ + b₄X₂² is Ŷ = 89.2 - 0.51X₁ + 0.001889X₁² - 0.01615X₂ + 0.00000162X₂² and has r² = 0.883. This model has a higher r² value than the model in 4-28. A graph of the data would show a nonlinear relationship.

4-30. If the SAT median score alone is used to predict the cost, we get Ŷ = 7793.1 + 21.8X₁ with r² = 0.22. If both SAT and a dummy variable (X₂ = 1 for private, 0 otherwise) are used to predict the cost, we get r² = 0.79. The model is Ŷ = 7121.8 + 5.16X₁ + 9354.99X₂. This says that a private school tends to be about $9,355 more expensive than a public school when the median SAT score is used to adjust for the quality of the school. The coefficient of determination indicates that about 79% of the variability in cost can be explained by these factors. The model is significant at the 0.001 level.

4-31. Ŷ = 67.8 + 0.0145X. There is a significant relationship between the number of victories (Y) and the payroll (X) at the 0.054 level, which is marginally significant. However, r² = 0.24, so the relationship is not very strong. Only about 24% of the variability in victories is explained by this model.

4-32.
a. Ŷ = 42.43 + 0.0004X
b. Ŷ = 31.54 + 0.0058X
c. The correlation coefficient for the first stock is only 0.19, while the correlation coefficient for the second is 0.96. Thus, there is a much stronger correlation between stock 2 and the DJIA than there is between stock 1 and the DJIA.
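Several of the solutions above (4-5, 4-23, 4-26) turn on the adjusted r², which penalizes r² for each added independent variable: adj r² = 1 - (1 - r²)(n - 1)/(n - k - 1). The sketch below uses the r² values from Problem 4-26 together with a hypothetical sample size of n = 15, since the manual does not state n for the hospital data.

```python
# Adjusted r^2 comparison in the spirit of Problem 4-26.
# n = 15 is an assumption for illustration only, not from the manual.
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Penalize r^2 for the number of independent variables k."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

n = 15                                               # hypothetical sample size
admissions_only = adjusted_r2(0.974, n, k=1)         # r^2 from 4-26
admissions_and_beds = adjusted_r2(0.975, n, k=2)     # r^2 with beds added
print(f"{admissions_only:.4f} vs {admissions_and_beds:.4f}")
# Adding beds nudges r^2 from 0.974 to 0.975, yet the adjusted r^2 falls,
# so the simpler admissions-only model is preferred.
```

The same comparison works for any n greater than k + 1; the tiny gain in r² is outweighed by the penalty for the extra variable.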
CASE STUDIES

SOLUTION TO NORTH-SOUTH AIRLINE CASE

Northern Airline Data
Year   Airframe Cost per Aircraft   Engine Cost per Aircraft   Average Age (Hours)
2001   51.80                        43.49                      6,512
2002   54.92                        38.58                      8,404
2003   69.70                        51.48                      11,077
2004   68.90                        58.72                      11,717
2005   63.72                        45.47                      13,275
2006   84.73                        50.26                      15,215
2007   78.74                        79.60                      18,390

Southeast Airline Data
Year   Airframe Cost per Aircraft   Engine Cost per Aircraft   Average Age (Hours)
2001   13.29                        18.86                      5,107
2002   25.15                        31.55                      8,145
2003   32.18                        40.43                      7,360
2004   31.78                        22.10                      5,773
2005   25.34                        19.69                      7,150
2006   32.78                        32.58                      9,364
2007   35.56                        38.07                      8,259

Utilizing QM for Windows, we can develop the following regression equations for the variables of interest.

Northern Airline airframe maintenance cost:
Cost = 36.10 + 0.0025 (airframe age)
Coefficient of determination = 0.7694
Coefficient of correlation = 0.8771

Northern Airline engine maintenance cost:
Cost = 20.57 + 0.0026 (airframe age)
Coefficient of determination = 0.6124
Coefficient of correlation = 0.7825

Southeast Airline airframe maintenance cost:
Cost = 4.60 + 0.0032 (airframe age)
Coefficient of determination = 0.3904
Coefficient of correlation = 0.6248

Southeast Airline engine maintenance cost:
Cost = 0.671 + 0.0041 (airframe age)
Coefficient of determination = 0.4599
Coefficient of correlation = 0.6782

The graphs below portray both the actual data and the regression lines for airframe and engine maintenance costs for both airlines. Note that the two graphs have been drawn to the same scale to facilitate comparisons between the two airlines.

Northern Airline: There seem to be modest correlations between maintenance costs and airframe age for Northern Airline. There is certainly reason to conclude, however, that airframe age is not the only important factor.

Southeast Airline: The relationships between maintenance costs and airframe age for Southeast Airline are much less well defined. It is even more obvious that airframe age is not the only important factor, and perhaps not even the most important factor.

Overall, it would seem that:
1. Northern Airline has the smallest variance in maintenance costs, indicating that the day-to-day management of maintenance is working pretty well.
2. Maintenance costs seem to be more a function of airline than of airframe age.
3. The airframe and engine maintenance costs for Southeast Airline are not only lower but also more nearly similar than those for Northern Airline; from the graphs, at least, they appear to be rising more sharply with age.
4. From an overall perspective, it appears that Southeast Airline may perform more efficiently on sporadic or emergency repairs, and Northern Airline may place more emphasis on preventive maintenance.

Ms. Young's report should conclude that:
1. There is evidence to suggest that maintenance costs could be made to be a function of airframe age by implementing more effective management practices.
2. The difference between the maintenance procedures of the two airlines should be investigated.
3. The data with which she is presently working do not provide conclusive results.

[Two graphs, drawn to the same scale: airframe and engine maintenance cost ($, 10 to 90) versus average airframe age (thousands of hours, 5 to 19), one panel for Northern Airline and one for Southeast Airline.]
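The case regressions were produced with QM for Windows, but the Northern Airline airframe equation can be checked directly from the data table. This pure-Python sketch reproduces the intercept and correlation; the slope prints as roughly 0.0026 where QM's display shows 0.0025, a display-rounding difference.

```python
# North-South Airline case: least-squares check of the Northern Airline
# airframe-cost regression from the data table above.
age  = [6512, 8404, 11077, 11717, 13275, 15215, 18390]   # hours
cost = [51.80, 54.92, 69.70, 68.90, 63.72, 84.73, 78.74]
n = len(age)

x_bar, y_bar = sum(age) / n, sum(cost) / n
sxx = sum((x - x_bar) ** 2 for x in age)
syy = sum((y - y_bar) ** 2 for y in cost)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(age, cost))

b1 = sxy / sxx               # about 0.0026 (QM displays 0.0025)
b0 = y_bar - b1 * x_bar      # about 36.10, matching the case solution
r = sxy / (sxx * syy) ** 0.5
print(f"Cost = {b0:.2f} + {b1:.4f} * age, r = {r:.4f}, r^2 = {r * r:.4f}")
```

Swapping in the other three cost/age columns reproduces the remaining three case equations the same way.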