YEAR 10 GENERAL MATHEMATICS 2017 STRAND: BIVARIATE DATA PART II CHAPTER 12 RESIDUAL ANALYSIS, LINEARITY AND TIME SERIES


This topic includes:
- Transformation of data to linearity to establish relationships between variables, for example y and x², or y and 1/x

Key knowledge
- The concepts of direct, inverse and joint variation
- The methods of transforming data
- The use of log (base 10) and other scales

Key skills
- Solve problems which involve the use of direct, inverse or joint variation
- Model non-linear data by using suitable transformations
- Apply log (base 10) and other scales to solve variation problems

Chapter sections                           Questions to be completed
12.2 Residual analysis                     In booklet
12.3 Transforming to linearity             2, 4, 5, 6, 7, 9, 11 (in textbook)
12.4 Time series and trend lines           In booklet
12.5 Fitting trend lines and forecasting   In booklet

TABLE OF CONTENTS

12.2 - RESIDUAL ANALYSIS
    EXAMPLE 1
    EXAMPLE 2
    PLOT RESIDUALS ON THE CAS CALCULATOR
    EXERCISE QUESTIONS 12.2 RESIDUAL ANALYSIS
12.3 - TRANSFORMING TO LINEARITY
    EXAMPLE 3
    EXAMPLE 4
    EXAMPLE 5
    CHOOSING THE CORRECT TRANSFORMATION
    TO TRANSFORM TO LINEARITY
    QUADRATIC TRANSFORMATIONS
    LOGARITHMIC AND RECIPROCAL TRANSFORMATIONS
    USING THE TRANSFORMED LINE FOR PREDICTIONS
    EXAMPLE 6
    EXAMPLE 7
12.4 TIME SERIES AND TREND LINES
    TYPE OF TIME SERIES
    SEASONAL FLUCTUATIONS
    CYCLIC FLUCTUATIONS
    IRREGULAR FLUCTUATIONS
    PLOTTING TIME SERIES
    EXAMPLE 8
    EXAMPLE 9
    FITTING TREND LINES
    EXAMPLE 10
    12.4 EXERCISE QUESTIONS TIME SERIES & TREND LINES
12.5 TREND LINES AND FORECASTING
    ASSOCIATION TABLES AND FORECASTING
    EXAMPLE 11
    EXAMPLE 12
    EXAMPLE 13
    12.5 EXERCISE QUESTIONS - FITTING TREND LINES AND FORECASTING

12.2 - RESIDUAL ANALYSIS

Fitting a regression line to some data is not always enough to convince us that the data set is truly linear. Even a correlation close to +1 or -1 may not be convincing enough. The next stage is to analyse the residuals, or deviations, of each data point from the straight line.

A residual is the vertical difference between a data point and the regression line.

To carry out a residual analysis:
1. Determine the equation of the least-squares regression line using CAS.
2. Calculate the predicted y-value (y_pred) for every x-value using the least-squares regression equation.
3. Calculate the difference between the actual y-value and this predicted y-value (y - y_pred) for every x-value.
4. Plot the residual values against the original x-values.

If the residual plot shows random points scattered above and below the x-axis, then the original data most likely follow a linear relationship.
If the residual plot shows some sort of pattern, then the original data are probably NOT linear.
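Written as formulas, the steps above amount to the following. This is only a notational sketch: the symbols ŷ_i (for y_pred) and e_i (for the residual) are labels chosen here and do not appear in the booklet. The final identity holds for any least-squares line that includes an intercept, which is why the residuals can be added up as a check.

```latex
\hat{y}_i = a + b\,x_i, \qquad
e_i = y_i - \hat{y}_i, \qquad
\sum_{i=1}^{n} e_i = 0
```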

Example 1

x    1    2    3    4    5    6    7    8    9
y    12   20   35   40   50   67   83   88   93

Use the data above to produce a residual plot and comment on the likely linearity of the data.

Step 1. Redraw the table and add two more rows (y_pred and residual).

Step 2. Find the equation of the least-squares regression line, y = a + bx.
Use the CAS calculator to find the gradient, b = ______, and the y-intercept, a = ______.
Therefore, the least-squares regression line is: ______

Step 3. Calculate the predicted y-values using the equation.
y_pred = ______
When x = 1: y_pred = ______
When x = 2: y_pred = ______
Or use the CAS calculator to get the y_pred values from the regression equation. In the next available column (E), type = f1(a) in the formula box (where a is the column of x-values) and press enter. The y_pred values are displayed in the column.

Step 4. Calculate the residuals using Residual = y - y_pred.
When x = 1: y_pred = ______, residual = ______
When x = 2: y_pred = ______, residual = ______
Calculate the rest of the residuals and enter them into the table. Add all the residuals to check that they sum to zero.

Step 5. Plot the residual values against the original x-values.
[Blank axes for the residual plot, with Residual on the vertical axis]

The residual plot shows ______

This example was slow and long, even using the CAS. We can use the CAS for even more of the steps than we did above. The next example shows how.

Example 2

x    1    2    3    4    5    6    7    8    9    10
y    5    6    8    15   24   47   77   112  187  309

Use the data above to produce a residual plot and comment on the likely linearity of the data.

Step 1. Redraw the table and add two more rows.

x                      1          2          3         4         5
y                      5          6          8         15        24
y_pred                 -50.0545   -21.3758   7.3030    35.9818   64.6606
Residual (y - y_pred)

x                      6          7          8         9         10
y                      47         77         112       187       309
y_pred                 93.3394    122.018    150.697   179.376   208.055
Residual (y - y_pred)

(a) Find the equation of the least-squares regression line.

CAS calculator
On a new Lists & Spreadsheet page, enter the data points:
- Name column A as x and enter the values of x into this column.
- Name column B as y and enter the values of y into this column.
On a new Calculator page, press MENU and then select:
4: Statistics
1: Stat Calculations
4: Linear Regression (a + bx)
Choose the following options:
- X List: x (the name given to the column in the spreadsheet)
- Y List: y (the name given to the column in the spreadsheet)
- Save RegEqn to: f1 (the function name that the rule will be saved to)
Note: The number 1 in f1 may be different. It updates based on how many times a LinReg calculation has been performed in a particular document.

(b) Calculate the predicted values and residuals. Show at least two calculations for the predicted value and the residual.

y_pred = ______                      resid = y - y_pred
When x = 1: ______                   When x = 1: ______
When x = 2: ______                   When x = 2: ______

CAS calculator
On the same Lists & Spreadsheet page:

      A: x           B: y           C: ypred      D
=                                   =f1(a)        (here)
1     (data point)   (data point)   (autofill)
2     (data point)   (data point)   (autofill)
Etc.  Etc.           Etc.

Go to the formula box under D, press VAR and select 3: Link to:, then choose stat1.resid.
Note: The number 1 in stat1 may be different. It updates based on how many times a LinReg calculation has been performed in a particular document.

Fill in the table below.

x     y     y_pred    resid
1     5
2     6
3     8
4     15
5     24
6     47
7     77
8     112
9     187
10    309

(c) Plot the residual points below against the x-values and comment on the graph.
[Blank grid for the residual plot: x from 0 to 10 on the horizontal axis, residual from -50 to 100 on the vertical axis]
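For readers without a CAS calculator, the same residual analysis can be sketched in plain Python. This is an illustration only, not the booklet's method: the helper name least_squares and the variable names are assumptions made here, and the data are those of Example 2.

```python
# Sketch only: residual analysis for the Example 2 data without a CAS.
# The function and variable names are illustrative, not from the booklet.

def least_squares(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx             # gradient
    a = mean_y - b * mean_x   # y-intercept
    return a, b

# Data from Example 2
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [5, 6, 8, 15, 24, 47, 77, 112, 187, 309]

a, b = least_squares(x, y)
y_pred = [a + b * xi for xi in x]                # predicted values
resid = [yi - pi for yi, pi in zip(y, y_pred)]   # residual = y - y_pred

print(f"y = {a:.4f} + {b:.4f}x")
for xi, yi, pi, ri in zip(x, y, y_pred, resid):
    print(f"{xi:>2} {yi:>4} {pi:>10.4f} {ri:>10.4f}")

# Check from Step 4 of Example 1: the residuals should add to zero
print("sum of residuals:", round(sum(resid), 6))
```

The predicted values agree with the y_pred row printed in the Example 2 table. The residuals are positive at both ends of the x range and negative in the middle, which is a pattern rather than a random scatter.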

Plot Residuals on the CAS Calculator

Press the Home button and choose Data & Statistics, then do the following:
- Go to the bottom, select Click to add variable and choose the x variable name.
- Go to the left side, select Click to add variable and choose the stat1.resid option.
To add the least-squares regression line to the plot, press MENU and then select:
4: Analyze
6: Regression
2: Show Linear (a+bx)
Note: You can also calculate the residuals by subtracting y_pred from y in a column of the Lists & Spreadsheet page, using the formula box (i.e. = y - y_pred OR = column[] - column[]).

EXERCISE QUESTIONS 12.2 RESIDUAL ANALYSIS

1. Consider the data set shown. Find the equation of the least-squares regression line and calculate the residuals.

x    1    2    3    4    5    6    7    8    9
y    12   20   35   40   50   67   83   88   93

2. Using the same data from question 1, plot the residuals and discuss the features of the residual plot. Is your result consistent with the coefficient of determination?

3. Find the residuals for the following data.

x    1    2     3      4      5      6
y    1    9.7   12.7   13.7   14.4   14.5

4. For the results of question 3, plot the residuals and discuss whether the relationship between x and y is linear.

5. Consider the following table from a survey conducted at a new computer manufacturing factory. It shows the percentage of defective computers produced on 8 different days after the opening of the factory.

Day                  2    4    5    7    8    9    10   11
Defective rate (%)   15   10   12   4    9    7    3    4

a) The results of least-squares regression were: b = -1.19, a = 16.34, r = -0.87. Given y = a + bx, use the above information to calculate the predicted defective rates (y_pred).
b) Find the residuals (y - y_pred).
c) Plot the residuals and comment on the likely linearity of the data.
d) Estimate the defective rate after the first day of the factory's operation.
e) Estimate when the defective rate will be at zero. Comment on this result.

6. Find the residuals for the following data set.

m    12    37     35     41     55      69      77      90
P    2.5   21.6   52.3   89.1   100.7   110.3   112.4   113.7

7. For the data in question 6, plot the residuals and comment on whether the relationship between m and P is linear.

12.3 - TRANSFORMING TO LINEARITY

Although linear regression might produce a good fit (high r value) to a set of data, the data set may still be non-linear. To remove (as much as is possible) such non-linearity, the data can be transformed. The x-values or y-values may be transformed in some way so that the transformed data are more linear. This enables more accurate predictions (extrapolations and interpolations) from the regression equation.

In this course, the following transformations are studied:

Logarithmic transformations   Quadratic transformations   Reciprocal transformations
y vs log10(x)                 y vs x²                     y vs 1/x
log10(y) vs x                 y² vs x                     1/y vs x

Example 3
Apply all the listed transformations to the data from the previous example (Example 2).

x     y     x²    y²    1/x   1/y   log10(x)   log10(y)
1     5
2     6
3     8
4     15
5     24
6     47
7     77
8     112
9     187
10    309

Example 4
Using the transformed data from the previous example, plot each of the following on CAS:
(a) y vs x²
(b) y vs 1/x

(c) y vs log10(x)
(d) y² vs x
(e) 1/y vs x
(f) log10(y) vs x
Which of the transformations looks most linear?

Example 5
Perform a linear regression analysis for each of the transformed data sets in the previous example, and hence confirm the best linear transformation. Write r to 3 decimal places and r² as a percentage to 1 decimal place.

Transformation     r      r²
y vs x²
y vs 1/x
y vs log10(x)
y² vs x
1/y vs x
log10(y) vs x

Choosing the correct transformation
To decide on an appropriate transformation, examine the points on a scatterplot with high values of x and/or y (that is, away from the origin) and decide for each axis whether it needs to be stretched or compressed to make the points line up. The best way to see which of the transformations to use is to look at a number of data patterns.
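The comparison asked for in Example 5 can also be sketched in Python. This is a hedged illustration rather than the booklet's CAS procedure: the helper correlation and the dictionary labels are names invented here, and the data are those of Example 2. Each transformation is applied to one variable, and Pearson's r is then computed for the transformed pair.

```python
import math

# Sketch only: compare the six transformations from Example 5 by their r values.
# Helper and label names are illustrative, not from the booklet.

def correlation(us, vs):
    """Pearson's correlation coefficient r between two equal-length lists."""
    n = len(us)
    mean_u = sum(us) / n
    mean_v = sum(vs) / n
    suv = sum((u - mean_u) * (v - mean_v) for u, v in zip(us, vs))
    suu = sum((u - mean_u) ** 2 for u in us)
    svv = sum((v - mean_v) ** 2 for v in vs)
    return suv / math.sqrt(suu * svv)

# Data from Example 2
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [5, 6, 8, 15, 24, 47, 77, 112, 187, 309]

# Each entry is (horizontal-axis values, vertical-axis values)
transformations = {
    "y vs x^2": ([xi ** 2 for xi in x], y),
    "y vs 1/x": ([1 / xi for xi in x], y),
    "y vs log10(x)": ([math.log10(xi) for xi in x], y),
    "y^2 vs x": (x, [yi ** 2 for yi in y]),
    "1/y vs x": (x, [1 / yi for yi in y]),
    "log10(y) vs x": (x, [math.log10(yi) for yi in y]),
}

results = {name: correlation(u, v) for name, (u, v) in transformations.items()}
for name, r in results.items():
    print(f"{name:<14} r = {r:.3f}   r^2 = {100 * r * r:.1f}%")

# The better r value is the one closest to +1 or -1
best = max(results, key=lambda name: abs(results[name]))
print("closest to +1 or -1:", best)
```

Comparing absolute values of r matters because the reciprocal transformations of increasing data give a negative r.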

To transform to linearity:
1. Plot the original data and least-squares regression line.
2. Examine the high values of x and/or y and decide if the data need to be compressed or stretched to make them linear (see diagrams).
3. Transform the data by:
   (a) compressing x- or y-values using the reciprocal or logarithmic functions.
   (b) stretching x- or y-values using the square function.
4. Plot the transformed data and its least-squares regression line.
5. Repeat steps 2 to 4 for all appropriate transformations.

Examine the residuals or the correlation coefficient r to see which one is a better fit.
There are at least two possible transformations for any given non-linear scatterplot. The decision as to which is best comes from the r value: the transformation with the better r value should be considered the most appropriate. (Note: The better r value is the one closest to either +1 or -1.)

Quadratic transformations
[Four diagrams of scatterplot shapes: two labelled "Use y vs x² transformation" and two labelled "Use y² vs x transformation"]

Logarithmic and reciprocal transformations
[Two diagrams of scatterplot shapes, each labelled "Use y vs log10(x) or y vs 1/x transformation"]

[Two diagrams of scatterplot shapes, each labelled "Use log10(y) vs x or 1/y vs x transformation"]

Using the transformed line for predictions
Once the appropriate model has been established and the equation of the least-squares regression line has been found, the equation can be used for predictions.

Example 6
From the previous example, the best transformation was log10(y) vs x. Write down the equation of the transformed regression line.

Example 7
(a) Apply a reciprocal transformation to the following data.

Temperature (°C), x                                5    10   15   20   25   30   35
No. of students in a class wearing jumper(s), y    18   10   6    5    3    2    2

(b) Use the transformed regression equation to predict the number of students wearing a jumper when the temperature is 12 °C.
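A minimal Python sketch of Example 7 follows, assuming the reciprocal transformation is applied to the x-values (y vs 1/x). The booklet does not say which variable to transform, so that choice, like the names least_squares and predict, is an assumption made for illustration.

```python
# Sketch only: reciprocal transformation for Example 7, assuming y vs 1/x.
# Function and variable names are illustrative, not from the booklet.

def least_squares(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

temperature = [5, 10, 15, 20, 25, 30, 35]   # x, in degrees Celsius
jumpers = [18, 10, 6, 5, 3, 2, 2]           # y, students wearing a jumper

# Regress y on the transformed variable 1/x, not on x itself
recip = [1 / t for t in temperature]
a, b = least_squares(recip, jumpers)
print(f"jumpers = {a:.3f} + {b:.3f} * (1 / temperature)")

def predict(t):
    """Predicted number of jumpers at temperature t; the input is transformed first."""
    return a + b * (1 / t)

print(f"predicted jumpers at 12 degrees: {predict(12):.1f}")
```

The point to notice is in predict: the value substituted into the transformed equation is 1/12, not 12.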

12.4 TIME SERIES AND TREND LINES

A time series plot is a scatterplot where the x variable is time, such as hours, days, weeks and years. The main purpose of a time series is to see how some quantity varies with time. For example, a company may wish to record its daily sales figures over a 10-day period.

Types of time series

Seasonal fluctuations
Certain data seem to fluctuate during the year as the seasons change. This is called a seasonal time series. The most obvious example of this would be total rainfall during summer, autumn, winter and spring in a year.
The name seasonal is not specific to the seasons of a year. It could also be related to other constant periods of highs and lows. A key feature of seasonal fluctuations is that the seasons occur at the same time each cycle.

Here are some common seasonal periods.

Seasons    Cycle                                                    Example
Seasons    Winter, spring, summer, autumn                           Rainfall
           (four seasons in a year)
Months     Jan, Feb, Mar, ..., Nov, Dec                             Grocery store monthly sales figures
           (12 months in a year)
Quarters   1st Quarter (Q1), 2nd Quarter (Q2),                      Quarterly expenditure figures of a company
           3rd Quarter (Q3), 4th Quarter (Q4)
           (four quarters in a year)
Days       Monday to Friday                                         Daily sales for a store open from Monday
           (five days in a week)                                    to Friday only
Days       Monday, Tuesday, Wednesday, Thursday, Friday,            Number of hamburgers sold at a takeaway
           Saturday, Sunday (seven days in a week)                  store daily

Cyclic fluctuations
Like seasonal time series, cyclic time series show fluctuations upwards and downwards, but not according to season.

Irregular fluctuations
Fluctuations may seem to occur at random. This can be caused by external events such as floods, wars, new technologies or inventions, or anything else that results from random causes. There is no obvious way to predict the direction of the time series or even when it changes direction.

Plotting time series
1. The explanatory variable is always time and is always on the x-axis.
2. Plot the points and join them to form a time series, then identify the type of trend.

Note: when plotting a time series using a CAS calculator, the time periods must be converted to numerical values. This can be done by setting up a table with an extra row for the time conversion, called the time code, where the first time point is converted to 1 and so on. Such a table is referred to as an association table. Two examples are shown:

Example 8
Period      Week 1   Week 1   Week 1   Week 1   Week 1   Week 1   Week 1   Week 2   Week 2
            Mon      Tues     Wed      Thurs    Fri      Sat      Sun      Mon      Tues
Time code   1        2        3        4        5        6        7        8        9

Example 9
Period      Jan    Feb    Mar    Apr    May    June   July   Aug    Sep
            2009   2009   2009   2009   2009   2009   2009   2009   2009
Time code   1      2      3      4      5      6      7      8      9

Fitting trend lines
After we have plotted a time series graph, if there is a noticeable trend (upward, downward or flat) we can add a trend line to the data, in the form of a straight line found using the least-squares regression method.

Example 10
The following table displays the school fees collected over a 10-week period. Plot the data and decide on the type of time-series pattern. If there is a trend, fit a straight line.

Week beginning   8 Jan   15 Jan   22 Jan   29 Jan   5 Feb   12 Feb   19 Feb   26 Feb   5 Mar   12 Mar
Fees (x $1000)   1.5     2.5      14.0     4.5      13.0    4.5      8.5      0.5      5.0     1.0

Set up an association table with the time code.

Week beginning   8 Jan   15 Jan   22 Jan   29 Jan   5 Feb   12 Feb   19 Feb   26 Feb   5 Mar   12 Mar
Fees (x $1000)   1.5     2.5      14.0     4.5      13.0    4.5      8.5      0.5      5.0     1.0
Time code

On a Lists & Spreadsheet page, enter the time code (week) values into column A and the school fees values into column B. Label the columns accordingly.
On a Data & Statistics page, draw a scatterplot of the data. To do this, press TAB to move to each axis and select Click to add variable. Place week on the horizontal axis and schoolfees on the vertical axis.
To show the data as a time-series plot with data points joined, press MENU and then select:
2: Plot Properties
1: Connect Data Points

To fit a least-squares regression line, press MENU and then select:
4: Analyze
6: Regression
2: Show Linear (a+bx)

12.4 EXERCISE QUESTIONS TIME SERIES & TREND LINES

1. Data was recorded about the number of families who moved from Melbourne to Ballarat over the last 10 years. Plot the data and decide on the type of time series pattern. If there is a trend, fit a straight line.

Year           2006   2007   2008   2009   2010   2011   2012   2013   2014   2015
Number moved   97     118    125    106    144    155    162    140    158    170

For questions 2 to 6, identify whether the time series are likely to be seasonal, cyclic or irregular, and whether they also display a trend:
2. the amount of rainfall, per month, in Western Victoria
3. the number of soldiers in the United States army, measured annually
4. the number of people living in Australia, measured annually
5. the share price of BHP Billiton, measured monthly
6. the number of seats held by the Liberal Party in Federal Parliament.

7. The following table shows the temperature in Victoria over a 10-day period.

Day                1    2    3    4    5    6    7    8    9    10
Temperature (°C)   38   35   34   30   28   27   23   20   19   18

Fit a trend line to the data.

8. The monthly share prices of a recently privatised telephone company were recorded as follows.

Date        Jan. 09   Feb. 09   Mar. 09   Apr. 09   May 09   June 09   July 09   Aug. 09
Price ($)   2.50      2.70      3.00      3.20      3.60     3.70      3.90      4.20

Graph the data (let 1 = Jan., 2 = Feb., ... and so on) and fit a trend line to the data. Comment on the feasibility of predicting share prices for the following year.

9. Consider the data in the table shown, which represent the price of oranges over a 19-week period.

Week            1    2    3    4    5    6    7    8    9    10
Price (cents)   40   45   53   46   40   45   62   58   67   60

Week            11   12   13   14   15   16   17   18   19
Price (cents)   72   60   64   78   74   66   78   81   80

a) Fit a straight trend line to the data and state the equation.
b) Predict the price in week 25.

10. The following table represents the quarterly sales figures (in $000s) of a popular software product. Plot the data and fit a trend line. Discuss the type of time series best reflected by these data.

Quarter   Q1-07   Q2-07   Q3-07   Q4-07   Q1-08   Q2-08   Q3-08   Q4-08   Q1-09   Q2-09   Q3-09   Q4-09
Sales     120     135     150     145     140     120     100     110     120     140     190     220

12.5 TREND LINES AND FORECASTING

Association tables and forecasting
An association table is often required to convert period labels to numerical values, so that a straight-line equation can be calculated. It is best to set up an extra row if data are in tabular form, or to change the labels shown on the axis of a time-series plot to numerical values. Here are three examples.

[Three example association tables, labelled Example 1, Example 2 and Example 3, appeared here as figures]

For forecasting, use the association table to devise a time code for any period in the future. This time code will then be used in the straight-line equation. From the three examples we can calculate that for:
Example 1: 2013 would have a time code of 8;
Example 2: 1st quarter 2010 would have a time code of 9;
Example 3: Monday week 4 would have a time code of 22.

Example 11
A new tanning salon has opened in a shopping centre, with customer numbers for the first 10 days shown in the following table. Fit a straight line to the data set using the least-squares regression method.

                      Week 1                                       Week 2
Period                Mon.  Tue.  Wed.  Thu.  Fri.  Sat.  Sun.  Mon.  Tue.  Wed.
Number of customers   9     9     11    13    16    18    19    20    23    27

Use the equation of the straight line to predict the number of customers for:
(a) Monday week 4
(b) Thursday week 2

Solution
Set up an association table with Monday week 1 as the starting point for the time.

                      Week 1                                       Week 2
Period                Mon.  Tue.  Wed.  Thu.  Fri.  Sat.  Sun.  Mon.  Tue.  Wed.
Number of customers   9     9     11    13    16    18    19    20    23    27
Time code

Open a Lists & Spreadsheet page. Enter the time code values into column A and the number of customers values into column B. Label as shown.

Open a Data & Statistics page. Place timecode on the horizontal axis and customers on the vertical axis. Join the dots and fit a least-squares regression line.

The time-series plot shows ______

From the list generated, the r value of 0.9871 suggests a very strong linear relationship. Thus it is appropriate to use the least-squares regression line for performing predictions.

Write the least-squares regression equation: ______
Rewrite the equation using the correct variables: ______

Use the linear regression equation to answer questions (a) and (b).

(a) Monday week 4 is equivalent to the time code of ______. Therefore, we substitute ______ into the equation to predict the number of customers.
Number of customers = 1.9697 x time code + 5.6667
Number of customers = ______

(b) Thursday week 2 is equivalent to the time code of ______. Therefore, we substitute ______ into the equation to predict the number of customers.
Number of customers = 1.9697 x time code + 5.6667
Number of customers = ______

Alternatively, we can use the calculator to substitute these values into the equation. Use the Calculator page and the saved equation in f1. Complete the entry lines as:
f1(22)
f1(11)
Press ENTER after each entry.

Note: Remember that forecasting is an extrapolation. If we go too far into the future, the prediction is not reliable, as the trend may change.
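Example 11 can be reproduced end to end with a short Python sketch. The names time_code, least_squares and forecast are hypothetical helpers introduced here. The time-code rule assumes seven-day weeks with Monday of week 1 coded as 1, as in the association table of Example 8.

```python
import math

# Sketch only: time codes, trend line and forecasts for Example 11.
# All function names are illustrative, not from the booklet.

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

def time_code(week, day):
    """Time code with Monday of week 1 coded as 1 (seven-day weeks assumed)."""
    return (week - 1) * 7 + DAYS.index(day) + 1

def least_squares(xs, ys):
    """Return (a, b, r): intercept, gradient and correlation for y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return a, b, r

# Customers from Monday week 1 to Wednesday week 2
customers = [9, 9, 11, 13, 16, 18, 19, 20, 23, 27]
codes = list(range(1, len(customers) + 1))   # time codes 1 to 10

a, b, r = least_squares(codes, customers)
print(f"Number of customers = {b:.4f} x time code + {a:.4f}   (r = {r:.4f})")

def forecast(t):
    """Predicted number of customers for time code t."""
    return a + b * t

for week, day in [(4, "Mon"), (2, "Thu")]:
    t = time_code(week, day)
    print(f"{day} week {week}: time code {t}, forecast {forecast(t):.1f}")
```

The fitted coefficients and r agree with the values quoted in the example, and the two time codes match the calculator entries f1(22) and f1(11).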

Once an equation has been determined for a time series, it can be used to analyse the situation. For the period given in Example 11, the equation is:
Number of customers = 1.9697 x time code + 5.6667
The y-intercept (5.67) has no real meaning, as it represents the time code of zero, which is the day before the opening of the salon. The gradient, or rate of change, is of more importance. It indicates that the number of customers is changing; in this instance, growing by approximately 2 customers per day (gradient of 1.97).

Example 12
The forecast equation to calculate share prices, y, in a sugar company was calculated from data of the share values over 5 years. The equation is y = 0.42t + 1.56, where t = 1 represents the year 2001, t = 2 represents the year 2002 and so on.
(a) Rewrite the equation putting it in the context of the question.
(b) Interpret the numerical values given in the relationship.
(c) Predict the share value in 2010.

Before we fit a straight line to a data set using least-squares regression, it is useful to draw a scatterplot. This is beneficial since it can:
1. Demonstrate how close the points are to a straight line, or whether a curve is a better fit for the data.
2. Demonstrate whether there is an outlier in the data set that could affect the least-squares regression.

If there is an outlier in the data and we are using the equation to make a prediction, then this can affect our prediction. Removing the outlier from the data is more likely to give a better prediction, since the least-squares regression line will then fit the remaining data more closely.
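The effect of an outlier on the fitted line can be illustrated with the sales data of Example 13, which follows. This is a sketch under a stated assumption: the year-2 value (1438) is treated as the outlier here, whereas the booklet leaves that decision to the reader. The names least_squares and outlier_year are illustrative.

```python
# Sketch only: how removing one point changes the least-squares line.
# Assumption: the year-2 sales figure is the outlier; the booklet does not say so.

def least_squares(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

# Data from Example 13
years = [1, 2, 3, 4, 5, 6, 7, 8]
sales = [1326, 1438, 1376, 1398, 1412, 1445, 1477, 1464]

# Fit using every data point
a_all, b_all = least_squares(years, sales)

# Fit again with the assumed outlier removed
outlier_year = 2  # hypothetical choice, made for illustration only
kept = [(t, s) for t, s in zip(years, sales) if t != outlier_year]
years_kept = [t for t, _ in kept]
sales_kept = [s for _, s in kept]
a_trim, b_trim = least_squares(years_kept, sales_kept)

print(f"all points:      sales = {a_all:.2f} + {b_all:.2f} x year")
print(f"outlier removed: sales = {a_trim:.2f} + {b_trim:.2f} x year")
```

Comparing the two printed equations shows how much a single point can shift both the gradient and the y-intercept.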

Example 13
The table below shows the sales for the first 8 years of a new business.

Year        1      2      3      4      5      6      7      8
Sales ($)   1326   1438   1376   1398   1412   1445   1477   1464

(a) Plot the data.
(b) If there is a trend, fit a straight line to the data using a least-squares regression method.
The equation is therefore: ______
(c) Identify whether there is any outlier and re-plot the data after removing any outliers. Explain the reason for removing the outlier.
(d) Fit a straight line to the set of data without the outlier using a least-squares regression method.

The equation is therefore: ______
(e) Comment on the gradient and y-intercept for the two sets of data, with and without the outlier.

12.5 EXERCISE QUESTIONS - FITTING TREND LINES AND FORECASTING

1. Data was recorded on the number of road fatalities in Australia in 2009 and 2010.

Month and year      01/09  02/09  03/09  04/09  05/09  06/09  07/09  08/09  09/09  10/09  11/09  12/09
Driver fatalities   115    117    132    153    130    131    109    117    113    143    107    124

Month and year      01/10  02/10  03/10  04/10  05/10  06/10  07/10  08/10  09/10  10/10  11/10  12/10
Driver fatalities   126    98     104    115    134    111    107    94     104    121    119    120

Fit a straight line to the data set using the least-squares regression method and use the straight line to predict the number of fatalities in:
a) June 2015
b) April 2018.

2. The forecast equation for calculating the share prices, y, of a mining company was obtained from the data of share prices over the past 4 years. The equation is y = 18.57 - 0.1t, where t = 1 represents the year 2010.
a) Rewrite the equation putting it in the context of the question.
b) Interpret the values of the gradient and y-intercept.
c) Predict the share price in 2019.

3. The following table shows the stock price for Apple in 2009-10.

Month 2010        Jan.     Feb.     Mar.     Apr.     May      June     July     Aug.     Sept.
Stock price ($)   192.06   204.62   235.00   261.09   256.88   251.53   257.25   243.10   283.75

a) Plot the data and decide if there are any outliers. Complete a least-squares regression including all data points.
b) Re-plot the data after removing any outliers.
c) If there is a trend, fit a straight line to the data using a least-squares regression method.
d) Compare the gradient and y-intercept of the two regressions.

4. The following table represents the number of cars remaining to be completed on an assembly line. Fit a straight line to the following data using the least-squares regression method.

Time (hours)     1    2    3    4    5    6    7    8    9
Cars remaining   32   26   27   23   16   17   13   10   9

a) Predict the number of cars remaining to be completed after 11 hours.
b) At what rate is the number of cars on the assembly line being reduced?

5. From the equation of the trend line, it should be possible to predict when there are no cars left on the assembly line. This is done by finding the value of t which makes y = 0. Using the equation from question 4, find the time when there will be no cars left on the assembly line.

6. The forecast equation for calculating prices, y, of shares in a steel company was obtained from data of the share prices of the past 6 years. The equation is y = 2.56 + 0.72t, where t = 1 represents the year 2010, t = 2 represents the year 2011 and so on.
a) Rewrite the equation putting it in the context of the question.
b) Interpret the values of the gradient and the y-intercept.
c) Predict the share price in 2020.

7. A mathematics teacher gives her students a test each month for 10 months, and the class average is recorded. The tests are carefully designed to be of similar difficulty.

Test       Feb.   Mar.   Apr.   May   June   July   Aug.   Sept.   Oct.   Nov.
Mark (%)   57     63     62     67    65     68     70     72      74     77

a) Calculate the equation of the trend line for these data using the least-squares regression method.
b) Plot the data points and the trend line on the same set of axes.
c) Use the trend line equation to predict the results for the last exam in December.
d) Comment on the suitability of the trend line as a predictor of future trends, supporting your arguments with mathematical statements.

8. The average cost of a hotel room in Sydney in 2010 is shown in the table.

Month 2010        Jan.   Feb.   Mar.   Apr.   May   June   July   Aug.   Sept.
Hotel price ($)   250    240    235    237    239   230    228    237    332

a) Plot the data and decide if there are any outliers. Complete a least-squares regression including all data points.
b) Re-plot the data after removing any outliers.
c) If there is a trend, fit a straight line to the data using a least-squares regression method.
d) Compare the gradient and y-intercept of the two regressions and give a possible explanation for the outlier.