YEAR 10 GENERAL MATHEMATICS 2017
STRAND: BIVARIATE DATA PART II
CHAPTER 12: RESIDUAL ANALYSIS, LINEARITY AND TIME SERIES

This topic includes:
Transformation of data to linearity to establish relationships between variables, for example $y$ and $x^2$, or $y$ and $\frac{1}{x}$.

Key knowledge
- The concepts of direct, inverse and joint variation
- The methods of transforming data
- The use of log (base 10) and other scales.

Key skills
- Solve problems which involve the use of direct, inverse or joint variation
- Model non-linear data by using suitable transformations
- Apply log (base 10) and other scales to solve variation problems.

Chapter sections and questions to be completed
12.2 Residual analysis: In booklet
12.3 Transforming to linearity: 2, 4, 5, 6, 7, 9, 11 (in textbook)
12.4 Time series and trend lines: In booklet
12.5 Fitting trend lines and forecasting: In booklet
TABLE OF CONTENTS
12.2 - RESIDUAL ANALYSIS
    EXAMPLE 1
    EXAMPLE 2
    PLOT RESIDUALS ON THE CAS CALCULATOR
    EXERCISE QUESTIONS 12.2 RESIDUAL ANALYSIS
12.3 - TRANSFORMING TO LINEARITY
    EXAMPLE 3
    EXAMPLE 4
    EXAMPLE 5
    CHOOSING THE CORRECT TRANSFORMATION
    TO TRANSFORM TO LINEARITY:
    QUADRATIC TRANSFORMATIONS
    LOGARITHMIC AND RECIPROCAL TRANSFORMATIONS
    USING THE TRANSFORMED LINE FOR PREDICTIONS
    EXAMPLE 6
    EXAMPLE 7
12.4 TIME SERIES AND TREND LINES
    TYPE OF TIME SERIES
    SEASONAL FLUCTUATIONS
    HERE ARE SOME COMMON SEASONAL PERIODS.
    CYCLIC FLUCTUATIONS
    IRREGULAR FLUCTUATIONS
    PLOTTING TIME SERIES
    EXAMPLE 8
    EXAMPLE 9
    FITTING TREND LINES
    EXAMPLE 10
    12.4 EXERCISE QUESTIONS TIME SERIES & TREND LINES
12.5 TREND LINES AND FORECASTING
    ASSOCIATION TABLES AND FORECASTING
    EXAMPLE 11
    EXAMPLE 12
    EXAMPLE 13
    12.5 EXERCISE QUESTIONS - FITTING TREND LINES AND FORECASTING
12.2 - RESIDUAL ANALYSIS
Fitting a regression line to a data set is not always enough to show that the data are truly linear. Even a correlation coefficient close to +1 or −1 may not be convincing. The next stage is to analyse the residuals, or deviations, of each data point from the straight line.

A residual is the vertical difference between a data point and the regression line.

To carry out a residual analysis:
1. Determine the equation of the least-squares regression line using CAS.
2. Calculate the predicted y-value ($y_{\text{pred}}$) for every x-value using the least-squares regression equation.
3. Calculate the residual for every x-value: the actual y-value minus the predicted y-value, $y - y_{\text{pred}}$.
4. Plot the residual values against the original x-values.

If the residual plot shows points scattered randomly above and below the x-axis, the original data most likely have a linear relationship. If the residual plot shows some sort of pattern, the original data are probably NOT linear.
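The four steps above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the CAS procedure used in this booklet; the helper name `least_squares` is made up for the sketch, and the data are those of Example 1.

```python
# Sketch of the residual-analysis steps (an illustration, not the CAS method).
# Data: Example 1 of this booklet.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [12, 20, 35, 40, 50, 67, 83, 88, 93]


def least_squares(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x (hypothetical helper)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    b = sxy / sxx
    a = y_mean - b * x_mean
    return a, b


# Step 1: equation of the least-squares regression line.
a, b = least_squares(xs, ys)

# Step 2: predicted y-value for every x-value.
y_pred = [a + b * x for x in xs]

# Step 3: residual = actual y minus predicted y.
residuals = [y - yp for y, yp in zip(ys, y_pred)]

# Step 4 would plot residuals against xs; here they are simply listed.
print(f"y = {a:.4f} + {b:.4f}x")
for x, y, yp, res in zip(xs, ys, y_pred, residuals):
    print(f"x={x}  y={y}  y_pred={yp:.2f}  residual={res:+.2f}")
print(f"sum of residuals = {sum(residuals):.6f}")
```

The final line is the same check suggested in Example 1: for a least-squares line the residuals should add to zero, apart from rounding.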
Example 1
x   1   2   3   4   5   6   7   8   9
y   12  20  35  40  50  67  83  88  93
Use the data above to produce a residual plot and comment on the likely linearity of the data.

Step 1. Redraw the table and add two more rows.

Step 2. Find the equation of the least-squares regression line, y = a + bx.
Use the CAS calculator to find the gradient, b = ______, and the y-intercept, a = ______.
Therefore, the least-squares regression line is: ______

Step 3. Calculate the predicted y-values using the equation.
$y_{\text{pred}}$ = ______
When x = 1: $y_{\text{pred}}$ = ______
When x = 2: $y_{\text{pred}}$ = ______
Or use the CAS calculator to get the $y_{\text{pred}}$ values from the regression equation:
- Go to the next available column (E).
- In the formula box type: = f1(a) (where a is the column of x-values) and press enter.
- The $y_{\text{pred}}$ values are displayed in the column.

Step 4. Calculate the residuals using residual = $y - y_{\text{pred}}$.
When x = 1: $y_{\text{pred}}$ = ______, residual = ______
When x = 2: $y_{\text{pred}}$ = ______, residual = ______
Calculate the rest of the residuals and enter them into the table. Add all the residuals to check that they sum to zero.

Step 5. Plot the residual values against the original x-values.
[Blank axes for the residual plot: residual (vertical) against x (horizontal).]
The residual plot shows ______

This example was slow and long, even using the CAS. We can use the CAS for even more of the steps than we did above. The next example shows how.

Example 2
x   1   2   3   4   5   6   7   8    9    10
y   5   6   8   15  24  47  77  112  187  309
Use the data above to produce a residual plot and comment on the likely linearity of the data.

Step 1. Redraw the table and add two more rows.
x                                   1         2         3        4        5
y                                   5         6         8        15       24
$y_{\text{pred}}$                   −50.0545  −21.3758  7.3030   35.9818  64.6606
Residual ($y - y_{\text{pred}}$)

x                                   6        7        8        9        10
y                                   47       77       112      187      309
$y_{\text{pred}}$                   93.3394  122.018  150.697  179.376  208.055
Residual ($y - y_{\text{pred}}$)

(a) Find the equation of the least-squares regression line.

CAS calculator
On a new Lists & Spreadsheet page, enter the data points:
- Name column A as x and enter the values of x into this column.
- Name column B as y and enter the values of y into this column.
On a new Calculator page, press:
MENU > 4: Statistics > 1: Stat Calculations > 4: Linear Regression (a + bx)
Choose the following options:
- X List: x (the name of the list in the spreadsheet given to the column)
- Y List: y (the name of the list in the spreadsheet given to the column)
- Save RegEqn to: f1 (the function name that the rule will be saved to)
Note: The highlighted number 1 may be different and will update based on how many times a LinReg calculation is performed in a particular document.
(b) Calculate the predicted values and residuals. Show at least two calculations for the predicted value and the residual.
$y_{\text{pred}}$ = ______          resid = $y - y_{\text{pred}}$
When x = 1: ______                  When x = 1: ______
When x = 2: ______                  When x = 2: ______

CAS calculator
On the same Lists & Spreadsheet page:
        A: x           B: y           C: ypred      D
=                                     =f1(a)        (formula goes here)
1       (data point)   (data point)   (autofill)
2       (data point)   (data point)   (autofill)
Etc.    Etc.           Etc.           Etc.
Go to the formula box under D, press the VAR key and select:
3: Link to: stat1.resid
Note: The highlighted number 1 may be different and will update based on how many times a LinReg calculation is performed in a particular document.

Fill in the table below.
x    y    $y_{\text{pred}}$   resid
1    5
2    6
3    8
4    15
5    24
6    47
7    77
8    112
9    187
10   309

(c) Plot the residual points below against the x-values and comment on the graph.
[Blank grid for the residual plot: horizontal axis x from 0 to 10, vertical axis from −50 to 100.]
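As a rough cross-check of part (b), the residuals can be computed directly from the $y_{\text{pred}}$ values tabulated in Step 1. The sketch below is an illustration only; the `signs` string is a convenience invented here to make any pattern easy to see, and is not part of the CAS procedure.

```python
# Residuals for Example 2, using the y_pred values given in the Step 1 table.
xs = list(range(1, 11))
ys = [5, 6, 8, 15, 24, 47, 77, 112, 187, 309]
y_pred = [-50.0545, -21.3758, 7.3030, 35.9818, 64.6606,
          93.3394, 122.018, 150.697, 179.376, 208.055]

# residual = y - y_pred, rounded to 4 decimal places like the table values
residuals = [round(y - yp, 4) for y, yp in zip(ys, y_pred)]

# One character per data point: "+" if the point lies above the line, "-" if below.
signs = "".join("+" if r > 0 else "-" for r in residuals)

for x, r in zip(xs, residuals):
    print(f"x={x:2d}  residual={r:+9.4f}")
print("sign pattern:", signs)
```

Long runs of the same sign, rather than signs that alternate unpredictably, are the kind of pattern that suggests the original data are not linear.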
Plot Residuals on the CAS Calculator
Press the Home button and choose Data & Statistics, then do the following:
- Go to the bottom, select "Click to add variable" and choose the x variable name.
- Go to the left side, select "Click to add variable" and choose the stat1.resid option.
To add the least-squares regression line to the plot, press:
MENU > 4: Analyze > 6: Regression > 2: Show Linear (a+bx)
Note: You can also calculate the residuals by subtracting $y_{\text{pred}}$ from y in a column of the Lists & Spreadsheet page, using the formula box (i.e. = $y - y_{\text{pred}}$ OR = column[] − column[]).

EXERCISE QUESTIONS 12.2 RESIDUAL ANALYSIS
1. Consider the data set shown. Find the equation of the least-squares regression line and calculate the residuals.
x   1   2   3   4   5   6   7   8   9
y   12  20  35  40  50  67  83  88  93

2. Using the same data from question 1, plot the residuals and discuss the features of the residual plot. Is your result consistent with the coefficient of determination?

3. Find the residuals for the following data.
x   1   2    3     4     5     6
y   1   9.7  12.7  13.7  14.4  14.5

4. For the results of question 3, plot the residuals and discuss whether the relationship between x and y is linear.

5. Consider the following table from a survey conducted at a new computer manufacturing factory. It shows the percentage of defective computers produced on 8 different days after the opening of the factory.
Day                  2   4   5   7  8  9  10  11
Defective rate (%)   15  10  12  4  9  7  3   4
a) The results of least-squares regression were: b = −1.19, a = 16.34, r = −0.87. Given y = a + bx, use the above information to calculate the predicted defective rates ($y_{\text{pred}}$).
b) Find the residuals ($y - y_{\text{pred}}$).
c) Plot the residuals and comment on the likely linearity of the data.
d) Estimate the defective rate after the first day of the factory's operation.
e) Estimate when the defective rate will be at zero. Comment on this result.

6. Find the residuals for the following data set.
m   12   37    35    41    55     69     77     90
P   2.5  21.6  52.3  89.1  100.7  110.3  112.4  113.7

7. For the data in question 6, plot the residuals and comment on whether the relationship between m and P is linear.
12.3 - TRANSFORMING TO LINEARITY
Although linear regression might produce a good fit (high r value) to a set of data, the data set may still be non-linear. To remove (as much as is possible) such non-linearity, the data can be transformed. The x-values or y-values may be transformed in some way so that the transformed data are more linear. This enables more accurate predictions (extrapolations and interpolations) from the regression equation.

In this course, the following transformations are studied:
Logarithmic transformations    Quadratic transformations    Reciprocal transformations
$y$ vs $\log_{10} x$           $y$ vs $x^2$                 $y$ vs $\frac{1}{x}$
$\log_{10} y$ vs $x$           $y^2$ vs $x$                 $\frac{1}{y}$ vs $x$

Example 3
Apply all the listed transformations to the data from the previous example.
x    y     $x^2$   $y^2$   $\frac{1}{x}$   $\frac{1}{y}$   $\log_{10} x$   $\log_{10} y$
1    5
2    6
3    8
4    15
5    24
6    47
7    77
8    112
9    187
10   309

Example 4
Using the transformed data from the previous example, plot each of the following on CAS:
(a) $y$ vs $x^2$
(b) $y$ vs $\frac{1}{x}$
(c) $y$ vs $\log_{10} x$
(d) $y^2$ vs $x$
(e) $\frac{1}{y}$ vs $x$
(f) $\log_{10} y$ vs $x$
Which of the transformations looks most linear?

Example 5
Perform a linear regression analysis for each of the transformed data sets in the previous example, and hence confirm the best linear transformation. Write $r$ to 3 decimal places and $r^2$ as a percentage to 1 decimal place.
Transformation            $r$     $r^2$
$y$ vs $x^2$
$y$ vs $\frac{1}{x}$
$y$ vs $\log_{10} x$
$y^2$ vs $x$
$\frac{1}{y}$ vs $x$
$\log_{10} y$ vs $x$

Choosing the correct transformation
To decide on an appropriate transformation, examine the points on a scatterplot with high values of x and/or y (that is, away from the origin) and decide for each axis whether it needs to be stretched or compressed to make the points line up. The best way to see which of the transformations to use is to look at a number of data patterns.
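The comparison asked for in Example 5 can also be sketched outside the CAS. The code below is an illustration under stated assumptions: it uses the Example 2 data, a hand-written `pearson_r` helper (a name chosen for this sketch), and it picks the transformation whose $r$ is closest to +1 or −1.

```python
import math

# Data from Example 2 of this booklet.
xs = list(range(1, 11))
ys = [5, 6, 8, 15, 24, 47, 77, 112, 187, 309]


def pearson_r(us, vs):
    """Correlation coefficient r between two equal-length lists (hypothetical helper)."""
    n = len(us)
    u_mean = sum(us) / n
    v_mean = sum(vs) / n
    suv = sum((u - u_mean) * (v - v_mean) for u, v in zip(us, vs))
    suu = sum((u - u_mean) ** 2 for u in us)
    svv = sum((v - v_mean) ** 2 for v in vs)
    return suv / math.sqrt(suu * svv)


# Each entry: (horizontal-axis values, vertical-axis values) after transforming.
transformations = {
    "y vs x^2": ([x ** 2 for x in xs], ys),
    "y vs 1/x": ([1 / x for x in xs], ys),
    "y vs log10(x)": ([math.log10(x) for x in xs], ys),
    "y^2 vs x": (xs, [y ** 2 for y in ys]),
    "1/y vs x": (xs, [1 / y for y in ys]),
    "log10(y) vs x": (xs, [math.log10(y) for y in ys]),
}

results = {name: pearson_r(us, vs) for name, (us, vs) in transformations.items()}
for name, r in results.items():
    print(f"{name:14s} r = {r:.3f}   r^2 = {100 * r * r:.1f}%")

# The "better" r value is the one closest to +1 or -1, so compare |r|.
best = max(results, key=lambda name: abs(results[name]))
print("closest to +1 or -1:", best)
```

Comparing $|r|$ alone is a shortcut; the residual plot of the transformed data should still be checked for patterns.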
To transform to linearity:
1. Plot the original data and least-squares regression line.
2. Examine the high values of x and/or y and decide if the data need to be compressed or stretched to make them linear (see diagrams).
3. Transform the data by:
   (a) compressing x- or y-values using the reciprocal or logarithmic functions.
   (b) stretching x- or y-values using the square function.
4. Plot the transformed data and its least-squares regression line.
5. Repeat steps 2 to 4 for all appropriate transformations. Examine the residuals or correlation coefficient r to see which one is a better fit.

There are at least two possible transformations for any given non-linear scatterplot. The decision as to which is best comes from the r value: the transformation with the better r value should be considered the most appropriate. (Note: The better r value is the one closest to either +1 or −1.)

Quadratic transformations
[Four scatterplot shapes: two captioned "Use $y$ vs $x^2$ transformation" and two captioned "Use $y^2$ vs $x$ transformation".]

Logarithmic and reciprocal transformations
[Two scatterplot shapes, each captioned "Use $y$ vs $\log_{10} x$ or $y$ vs $\frac{1}{x}$ transformation".]
[Two scatterplot shapes, each captioned "Use $\log_{10} y$ vs $x$ or $\frac{1}{y}$ vs $x$ transformation".]

Using the transformed line for predictions
Once the appropriate model has been established and the equation of the least-squares regression line has been found, the equation can be used for predictions.

Example 6
From the previous example, the best transformation was $\log_{10} y$ vs $x$. Write down the equation of the transformed regression line.

Example 7
(a) Apply a reciprocal transformation to the following data.
Temperature (°C), x                                5   10  15  20  25  30  35
No. of students in a class wearing jumper(s), y    18  10  6   5   3   2   2
(b) Use the transformed regression equation to predict the number of students wearing a jumper when the temperature is 12 °C.
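Predicting from a transformed line means transforming the input value first. The sketch below illustrates this for the Example 7 data. It assumes the reciprocal is applied to x (the $y$ vs $\frac{1}{x}$ form); the example does not say which reciprocal form is intended, so treat that choice, and the helper names, as assumptions.

```python
# Example 7 data.
temps = [5, 10, 15, 20, 25, 30, 35]    # x: temperature in degrees Celsius
jumpers = [18, 10, 6, 5, 3, 2, 2]      # y: students wearing a jumper

# ASSUMPTION: reciprocal transformation of x, i.e. regress y on 1/x.
recip_temps = [1 / t for t in temps]


def least_squares(us, vs):
    """Return (a, b) for the least-squares line v = a + b*u (hypothetical helper)."""
    n = len(us)
    u_mean = sum(us) / n
    v_mean = sum(vs) / n
    suv = sum((u - u_mean) * (v - v_mean) for u, v in zip(us, vs))
    suu = sum((u - u_mean) ** 2 for u in us)
    b = suv / suu
    return v_mean - b * u_mean, b


a, b = least_squares(recip_temps, jumpers)


def predict(temp):
    """Predicted y: substitute 1/temp, not temp, into the transformed line."""
    return a + b * (1 / temp)


print(f"y = {a:.3f} + {b:.3f} * (1/x)")
print(f"predicted number of students at 12 degrees: {predict(12):.1f}")
```

A common slip is to substitute 12 rather than $\frac{1}{12}$ into the transformed equation; writing the prediction as a function of the original temperature avoids it.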
12.4 TIME SERIES AND TREND LINES
A time series plot is a scatterplot where the x variable is time, such as hours, days, weeks and years. The main purpose of a time series is to see how some quantity varies with time. For example, a company may wish to record its daily sales figures over a 10-day period.

Type of time series

Seasonal fluctuations
Certain data seem to fluctuate during the year as the seasons change; this is called a seasonal time series. The most obvious example of this would be total rainfall during summer, autumn, winter and spring in a year. The name seasonal is not specific to the seasons of a year. It could also be related to other constant periods of highs and lows. A key feature of seasonal fluctuations is that the seasons occur at the same time each cycle.

Here are some common seasonal periods.
Period | Seasons | Cycle | Example
Seasons | Winter, spring, summer, autumn | Four seasons in a year | Rainfall
Months | Jan, Feb, Mar, ..., Nov, Dec | 12 months in a year | Grocery store monthly sales figures
Quarters | 1st Quarter (Q1), 2nd Quarter (Q2), 3rd Quarter (Q3), 4th Quarter (Q4) | Four quarters in a year | Quarterly expenditure figures of a company
Days | Monday to Friday | Five days in a week | Daily sales for a store open from Monday to Friday only
Days | Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday | Seven days in a week | Number of hamburgers sold at a takeaway store daily

Cyclic fluctuations
Like seasonal time series, cyclic time series show fluctuations upwards and downwards, but not according to season.

Irregular fluctuations
Fluctuations may seem to occur at random. This can be caused by external events such as floods, wars, new technologies or inventions, or anything else that results from random causes. There is no obvious way to predict the direction of the time series or even when it changes direction.
Plotting time series
1. The explanatory variable is always time and is always on the x-axis.
2. Plot the points and join them to form a time series, then identify the trend type.
Note: when plotting the time series using a CAS calculator, the time periods must be converted to numerical values. This can be done by setting up a table with an extra row for the time conversion, called the time code, where the first time point is converted to 1, the second to 2, and so on. Such a table is referred to as an association table. Two examples are shown:

Example 8
Period      Week 1 Mon  Week 1 Tues  Week 1 Wed  Week 1 Thurs  Week 1 Fri  Week 1 Sat  Week 1 Sun  Week 2 Mon  Week 2 Tues
Time code   1           2            3           4             5           6           7           8           9

Example 9
Period      Jan 2009  Feb 2009  Mar 2009  Apr 2009  May 2009  June 2009  July 2009  Aug 2009  Sep 2009
Time code   1         2         3         4         5         6          7          8         9

Fitting trend lines
After we have plotted a time series graph, if there is a noticeable trend (upward, downward or flat) we can add a trend line to the data, in the form of a straight line fitted using the least-squares regression method.

Example 10
The following table displays the school fees collected over a 10-week period. Plot the data and decide on the type of time-series pattern. If there is a trend, fit a straight line.
Week beginning   8 Jan  15 Jan  22 Jan  29 Jan  5 Feb  12 Feb  19 Feb  26 Feb  5 Mar  12 Mar
$ × 1000         1.5    2.5     14.0    4.5     13.0   4.5     8.5     0.5     5.0    1.0

Set up an association table with the time code.
Week beginning   8 Jan  15 Jan  22 Jan  29 Jan  5 Feb  12 Feb  19 Feb  26 Feb  5 Mar  12 Mar
$ × 1000         1.5    2.5     14.0    4.5     13.0   4.5     8.5     0.5     5.0    1.0
Time code

On a Lists & Spreadsheet page, enter the time code (week) values into column A and the school fees values into column B. Label the columns accordingly.
On a Data & Statistics page, draw a scatterplot of the data. To do this, press TAB to move to each axis and select "Click to add variable". Place week on the horizontal axis and schoolfees on the vertical axis.
To show as a time-series plot with data points joined, press:
MENU > 2: Plot Properties > 1: Connect Data Points
To fit a least-squares regression line, press:
MENU > 4: Analyse > 6: Regression > 2: Show Linear (a+bx)

12.4 EXERCISE QUESTIONS TIME SERIES & TREND LINES
1. Data was recorded about the number of families who moved from Melbourne to Ballarat over the last 10 years. Plot the data and decide on the type of time series pattern. If there is a trend, fit a straight line.
Year           2006  2007  2008  2009  2010  2011  2012  2013  2014  2015
Number moved   97    118   125   106   144   155   162   140   158   170

For questions 2 to 6, identify whether the time series are likely to be seasonal, cyclic or irregular, and whether they also display a trend:
2. the amount of rainfall, per month, in Western Victoria
3. the number of soldiers in the United States army, measured annually
4. the number of people living in Australia, measured annually
5. the share price of BHP Billiton, measured monthly
6. the number of seats held by the Liberal Party in Federal Parliament.

7. The following table shows the temperature in Victoria over a 10-day period.
Day                1   2   3   4   5   6   7   8   9   10
Temperature (°C)   38  35  34  30  28  27  23  20  19  18
Fit a trend line to the data.

8. The monthly share prices of a recently privatised telephone company were recorded as follows.
Date        Jan. 09  Feb. 09  Mar. 09  Apr. 09  May 09  June 09  July 09  Aug. 09
Price ($)   2.50     2.70     3.00     3.20     3.60    3.70     3.90     4.20
Graph the data (let 1 = Jan., 2 = Feb., and so on) and fit a trend line to the data. Comment on the feasibility of predicting share prices for the following year.

9. Consider the data in the table shown, which represent the price of oranges over a 19-week period.
Week            1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16  17  18  19
Price (cents)   40  45  53  46  40  45  62  58  67  60  72  60  64  78  74  66  78  81  80
a) Fit a straight trend line to the data and state the equation.
b) Predict the price in week 25.

10. The following table represents the quarterly sales figures (in $000s) of a popular software product. Plot the data and fit a trend line. Discuss the type of time series best reflected by these data.
Quarter   Q1-07  Q2-07  Q3-07  Q4-07  Q1-08  Q2-08  Q3-08  Q4-08  Q1-09  Q2-09  Q3-09  Q4-09
Sales     120    135    150    145    140    120    100    110    120    140    190    220
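The association-table and trend-line steps used in Example 10 can be sketched as follows. This is an illustration rather than the CAS method: the dictionary `association_table` and the helper `least_squares` are names invented for the sketch, and the fees are taken to be in thousands of dollars.

```python
# Example 10: school fees (assumed to be in $ x 1000) by week beginning.
weeks = ["8 Jan", "15 Jan", "22 Jan", "29 Jan", "5 Feb",
         "12 Feb", "19 Feb", "26 Feb", "5 Mar", "12 Mar"]
fees = [1.5, 2.5, 14.0, 4.5, 13.0, 4.5, 8.5, 0.5, 5.0, 1.0]

# Association table: the first period gets time code 1, the next 2, and so on.
time_codes = list(range(1, len(weeks) + 1))
association_table = dict(zip(weeks, time_codes))


def least_squares(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x (hypothetical helper)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    b = sxy / sxx
    return y_mean - b * x_mean, b


# Trend line fitted against the numeric time codes, not the date labels.
a, b = least_squares(time_codes, fees)

for label, code in association_table.items():
    print(f"{label:7s} -> time code {code}")
print(f"fees = {a:.4f} + ({b:.4f}) x time code")
```

A fitted line always exists, so the gradient on its own does not show that a trend is present; the plot still has to be inspected to decide whether the series is better described as irregular.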
12.5 TREND LINES AND FORECASTING
Association tables and forecasting
An association table is often required to convert period labels to numerical values, so that a straight-line equation can be calculated. It is best to set up an extra row if data are in tabular form, or to change the labels shown on the axis of a time-series plot to numerical values. Here are three examples.
[Three example association tables, labelled Example 1, Example 2 and Example 3, appeared here as figures.]

For forecasting, use the association table to devise a time code for any period in the future. This time code will then be used in the straight-line equation. From the three examples we can calculate that for:
Example 1: 2013 would have a time code of 8;
Example 2: 1st quarter 2010 would have a time code of 9;
Example 3: Monday week 4 would have a time code of 22.

Example 11
A new tanning salon has opened in a shopping centre, with customer numbers for the first days shown in the following table. Fit a straight line to the data set using the least-squares regression method.
                      Week 1                                      Week 2
Period                Mon.  Tue.  Wed.  Thu.  Fri.  Sat.  Sun.   Mon.  Tue.  Wed.
Number of customers   9     9     11    13    16    18    19     20    23    27
Use the equation of the straight line to predict the number of customers for:
(a) Monday week 4
(b) Thursday week 2

Solution
Set up an association table with Monday week 1 as the starting point for the time.
                      Week 1                                      Week 2
Period                Mon.  Tue.  Wed.  Thu.  Fri.  Sat.  Sun.   Mon.  Tue.  Wed.
Number of customers   9     9     11    13    16    18    19     20    23    27
Time code

Open a Lists & Spreadsheet page. Enter the time code values into column A and the number of customers values into column B. Label as shown.
Open a Data & Statistics page. Place timecode on the horizontal axis and customers on the vertical axis. Join the dots and fit a least-squares regression line.

The time-series plot shows ______

From the list generated, the r value of 0.9871 suggests a very strong linear relationship. Thus it is appropriate to use the least-squares regression line for performing predictions.

Write the least-squares regression equation: ______
Rewrite the equation using the correct variables: ______

Use the linear regression equation to answer (a) and (b).
(a) Monday week 4 is equivalent to the time code of ______. Therefore, we substitute ______ into the equation to predict the number of customers.
Number of customers = 1.9697 × time code + 5.6667
Number of customers = ______
(b) Thursday week 2 is equivalent to the time code of ______. Therefore, we substitute ______ into the equation to predict the number of customers.
Number of customers = 1.9697 × time code + 5.6667
Number of customers = ______

Alternatively, we can use the calculator to substitute these values into the equation and calculate the number of customers. Use the Calculator page and the saved equation in f1. Complete the entry lines as:
f1(22)
f1(11)
Press ENTER after each entry.

Note: Remember that forecasting is an extrapolation. If we go too far into the future, the prediction is not reliable, as the trend may change.
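The forecasting steps in Example 11 can be sketched as below. The function `time_code` is a hypothetical helper written for this illustration (it is not a calculator command); it assumes a seven-day week with Monday of week 1 as time code 1, as in the association table above.

```python
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def time_code(week, day):
    """Time code for a given week and day, with Monday of week 1 as 1 (hypothetical helper)."""
    return 7 * (week - 1) + DAYS.index(day) + 1


def least_squares(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x (hypothetical helper)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    b = sxy / sxx
    return y_mean - b * x_mean, b


# Example 11 data: Monday week 1 to Wednesday week 2.
customers = [9, 9, 11, 13, 16, 18, 19, 20, 23, 27]
codes = list(range(1, len(customers) + 1))
a, b = least_squares(codes, customers)


def forecast(week, day):
    """Predicted number of customers, like evaluating f1(time code) on the CAS."""
    return a + b * time_code(week, day)


print(f"Number of customers = {b:.4f} x time code + {a:.4f}")
print("Monday week 4:  time code", time_code(4, "Mon"), "->", round(forecast(4, "Mon"), 2))
print("Thursday week 2: time code", time_code(2, "Thu"), "->", round(forecast(2, "Thu"), 2))
```

Thursday of week 2 is only one day past the recorded data, while Monday of week 4 is twelve days past it, so the second forecast is the more trustworthy of the two.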
Once an equation has been determined for a time series, it can be used to analyse the situation. For the period given in Example 11, the equation is:
Number of customers = 1.9697 × time code + 5.6667
The y-intercept (5.67) has no real meaning, as it represents the time code of zero, which is the day before the opening of the salon. The gradient, or rate of change, is of more importance. It indicates that the number of customers is changing; in this instance, growing by approximately 2 customers per day (gradient of 1.97).

Example 12
The forecast equation to calculate share prices, y, in a sugar company was calculated from data of the share values over 5 years. The equation is y = 0.42t + 1.56, where t = 1 represents the year 2001, t = 2 represents the year 2002 and so on.
(a) Rewrite the equation putting it in the context of the question.
(b) Interpret the numerical values given in the relationship.
(c) Predict the share value in 2010.

Before we fit a straight line to a data set using least-squares regression, it is useful to draw a scatterplot. This is beneficial since it can:
1. Demonstrate how close the points are to a straight line, or if a curve is a better fit for the data.
2. Demonstrate if there is an outlier in the data set that could affect the least-squares regression.
If there is an outlier in the data and we are using the equation to make a prediction, then this can affect our prediction. Removing the outlier is likely to give a better prediction, since the least-squares regression line will then fit the remaining data more closely.
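The effect of an outlier on the fitted line can be illustrated numerically. The sketch below uses the sales data from Example 13 and, purely as an assumption for illustration, treats the point with the largest absolute residual as the outlier; in the booklet the outlier is meant to be identified from the plot.

```python
# Sales data from Example 13.
years = [1, 2, 3, 4, 5, 6, 7, 8]
sales = [1326, 1438, 1376, 1398, 1412, 1445, 1477, 1464]


def least_squares(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x (hypothetical helper)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    b = sxy / sxx
    return y_mean - b * x_mean, b


# Fit using all points.
a_all, b_all = least_squares(years, sales)

# ASSUMPTION: the point with the largest |residual| is treated as the outlier.
residuals = [s - (a_all + b_all * t) for t, s in zip(years, sales)]
worst = max(range(len(years)), key=lambda i: abs(residuals[i]))
outlier_year = years[worst]

# Refit with that single point removed.
kept_years = [t for i, t in enumerate(years) if i != worst]
kept_sales = [s for i, s in enumerate(sales) if i != worst]
a_cut, b_cut = least_squares(kept_years, kept_sales)

print(f"with all points:   sales = {a_all:.2f} + {b_all:.2f} x year")
print(f"candidate outlier: year {outlier_year}")
print(f"without outlier:   sales = {a_cut:.2f} + {b_cut:.2f} x year")
```

Comparing the two printed equations shows how much a single point can move both the gradient and the y-intercept.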
Example 13
The table below shows the sales for the first 8 years of a new business.
Year        1     2     3     4     5     6     7     8
Sales ($)   1326  1438  1376  1398  1412  1445  1477  1464
(a) Plot the data.
(b) If there is a trend, fit a straight line to the data using a least-squares regression method.
The equation is therefore: ______
(c) Identify whether there is any outlier and re-plot the data after removing any outliers. Explain the reason for removing the outlier.
(d) Fit a straight line to the set of data without the outlier using a least-squares regression method.
The equation is therefore: ______
(e) Comment on the gradient and y-intercept for the two sets of data, with and without the outlier.

12.5 EXERCISE QUESTIONS - FITTING TREND LINES AND FORECASTING
1. Data was recorded on the number of road fatalities in Australia in 2009 and 2010.
Month and year      01/09  02/09  03/09  04/09  05/09  06/09  07/09  08/09  09/09  10/09  11/09  12/09
Driver fatalities   115    117    132    153    130    131    109    117    113    143    107    124
Month and year      01/10  02/10  03/10  04/10  05/10  06/10  07/10  08/10  09/10  10/10  11/10  12/10
Driver fatalities   126    98     104    115    134    111    107    94     104    121    119    120
Fit a straight line to the data set using the least-squares regression method and use the straight line to predict the number of fatalities in:
a) June 2015
b) April 2018.

2. The forecast equation for calculating the share prices, y, of a mining company was obtained from the data of share prices over the past 4 years. The equation is y = 18.57 − 0.1t, where t = 1 represents the year 2010.
a) Rewrite the equation putting it in the context of the question.
b) Interpret the values of the gradient and y-intercept.
c) Predict the share price in 2019.

3. The following table shows the stock price for Apple in 2009-10.
Month 2010        Jan.    Feb.    Mar.    Apr.    May     June    July    Aug.    Sept.
Stock price ($)   192.06  204.62  235.00  261.09  256.88  251.53  257.25  243.10  283.75
a) Plot the data and decide if there are any outliers. Complete a least-squares regression including all data points.
b) Re-plot the data after removing any outliers.
c) If there is a trend, fit a straight line to the data using a least-squares regression method.
d) Compare the gradient and y-intercept of the two regressions.
4. The following table represents the number of cars remaining to be completed on an assembly line. Fit a straight line to the following data using the least-squares regression method.
Time (hours)     1   2   3   4   5   6   7   8   9
Cars remaining   32  26  27  23  16  17  13  10  9
a) Predict the number of cars remaining to be completed after 11 hours.
b) At what rate is the number of cars on the assembly line being reduced?

5. From the equation of the trend line, it should be possible to predict when there are no cars left on the assembly line. This is done by finding the value of t which makes y = 0. Using the equation from question 4, find the time when there will be no cars left on the assembly line.

6. The forecast equation for calculating prices, y, of shares in a steel company was obtained from data of the share prices of the past 6 years. The equation is y = 2.56 + 0.72t where t = 1 represents the year 2010, t = 2 represents the year 2011 and so on.
a) Rewrite the equation putting it in the context of the question.
b) Interpret the values of the gradient and the y-intercept.
c) Predict the share price in 2020.

7. A mathematics teacher gives her students a test each month for 10 months, and the class average is recorded. The tests are carefully designed to be of similar difficulty.
Test       Feb.  Mar.  Apr.  May  June  July  Aug.  Sept.  Oct.  Nov.
Mark (%)   57    63    62    67   65    68    70    72     74    77
a) Calculate the equation of the trend line for these data using the least-squares regression method.
b) Plot the data points and the trend line on the same set of axes.
c) Use the trend line equation to predict the results for the last exam in December.
d) Comment on the suitability of the trend line as a predictor of future trends, supporting your arguments with mathematical statements.

8. The average cost of a hotel room in Sydney in 2010 is shown in the table.
Month 2010        Jan.  Feb.  Mar.  Apr.  May  June  July  Aug.  Sept.
Hotel price ($)   250   240   235   237   239  230   228   237   332
a) Plot the data and decide if there are any outliers. Complete a least-squares regression including all data points.
b) Re-plot the data after removing any outliers.
c) If there is a trend, fit a straight line to the data using a least-squares regression method.
d) Compare the gradient and y-intercept of the two regressions and give a possible explanation for the outlier.