Components for Accurate Forecasting & Continuous Forecast Improvement
An ISIS Solutions White Paper
November 2009
Achieving forecast accuracy for business applications one year in advance requires four major components: a technology platform, a forecast methodology, collaboration and a Closed-Loop (for continuous forecast accuracy improvement). Statistical forecasting tools alone cannot produce consistent forecast accuracy one year out, as business is too organic to yield solely to mathematics. By contrast, this four-pronged approach has yielded 93%+ forecast accuracy in empirical studies across numerous industries.

INTRODUCTION

To enable effective decisions regarding the deployment of capital and human resources and the mitigation of risks, business requires 93%+ accurate forecasts one year out. Achieving this accuracy across the business requires the combination of four major components:

Technology
Methodology
Collaboration
Closed-Loop

Technology is an enabling platform for statistical forecasting across multiple dimensions. Methodology provides a thought process for using the technology to produce the most probable outcome. Collaboration tests the reasonableness of the forecast and provides for the input of known future events that can materially affect the forecast. Closed-Loop is a process that uses the technology and collaboration to evaluate forecast accuracy and provide iterative improvement to future forecasts.

Forecasting tools come in four varieties: Business Intelligence, Data Mining, Statistical Forecasting and Demand Planning. These tools share a common weakness: they cannot consistently provide accurate forecasts a year in advance because they do not approach forecasting holistically. For these tools, forecasting is largely a matter of their technology, and methodology, collaboration and Closed-Loop are applied only tangentially. Enterprise BI tools have few if any practical statistical formulas and treat forecast accuracy as a by-product of having the people who will be committed to the forecast spend more time looking at the data.
Data Mining and Statistical Forecasting tools have statistical forecast formulas but offer little in the way of methodology, collaboration or Closed-Loop. Demand Planning has limited forecast formulas and usually approaches forecasting as push-the-button (i.e. letting the tool do all the work), which is why its forecast accuracy horizon is limited to about 90 days.

This report explores the four components needed to consistently produce 93%+ accurate forecasts a year in advance.
FORECAST COMPONENTS OVERVIEW

The four main components needed to produce accurate forecasts one year out are:

Technology is the software platform used to generate a statistical forecast of data (e.g. sales, demand, expenses, efficiency, etc.).
Methodology is a process that employs statistical and analytical research to create a forecast with a high-probability outcome.
Collaboration is the ability within the technology to share, review and manage the forecast among the business for feedback and input.
Closed-Loop utilizes the technology along with collaboration to analyze the forecast against actual results and determine the changes to be made to improve future forecast accuracy.

The above four components combine for more accurate long-range forecasting as well as continuous forecast accuracy improvement.

TECHNOLOGY

Technology is the foundation of statistical forecasting and has a confluence of four variables that must be met in the order presented below:

Data is what is to be forecasted; e.g. sales, demand, expenses, efficiency, etc.
Dimension is the segmentation of the data for which the forecast is to be performed; e.g. Country, Region, State, Brand, Product, etc.
Formula is the statistical formula used to forecast the data at its dimension; e.g. Holt-Winters, Linear Regression, ARIMA, etc.
Time Series is the past contiguous span of time used by the Formula to forecast the Data at the selected Dimension; e.g. the last 12, 24 or 36 months.

The Data is the given element of the forecast, as it is the subject of interest. However, the importance of Dimensions to forecast accuracy is often lightly regarded. The number of Dimensions and the hierarchical organization of Dimensions are crucial to developing accurate forecasts six to twelve months out. Formula is the next element, and it needs to be matched to the characteristic behavior of the Data.
In modeling the future, the past is the guidepost; in addition, from the past we can correlate those influences that can affect the future forecast. As such, correlated past influences can be used to modify the output of the formula used to forecast the future. The final element is to choose a time series of data to forecast the future. A calculated statistical forecast can have a different outcome for each contiguous time series of data selected. Therefore, there needs to be a methodology for identifying the most probable forecast from a range of potential forecast scenarios.

METHODOLOGY

Methodology emanates from a statistical analysis of the past data to reveal the formula and time series of data that can best forecast the future. Methodology is used in conjunction with the technology to choose the most probable outcome from a variety of potential outcomes. The methodology focuses on looking at the past and future as follows:

Variance Analysis of the past performance to present
Leading Indicator of the direction of the future trend

Variance analysis consists of a qualitative and quantitative evaluation of the data to identify which statistical formula would be more applicable to forecasting (e.g. Holt-Winters, ARIMA, Linear Regression, etc.). The leading indicator uses at least two statistical approaches to calculate the direction of the future trend. The confluence of variance analysis and leading indicator is used to engage the appropriate statistical formula and select the time series of data that will yield the most probable outcome.

Once the forecast is complete, a Monte Carlo Simulation can be run to assess the most probable range of outcomes around the most probable forecast. In this manner, business has a targeted most probable band with which to assess its upside and downside risks.

COLLABORATION

Business Intelligence tools largely use collaboration (because they are fairly devoid of statistical formulas) to gather people together to guess the best forecast.
However, collaboration as used in this context is twofold:

Reasonableness Assessment
Knowledge of Future Events

While people are not inherently statistical, they are intuitive about their business. This intuition can be used to gauge the reasonableness of the forecast. Here collaboration is used for people to review and analyze the forecast. It is important that the technology give the reviewer the capability to report, compare, analyze, chart, trend and multi-dimensionally drill into the forecast. For collaboration to be successful, the forecast should NOT be emailed in static spreadsheets, as this does not provide the reviewer with the requisite capabilities to make a critical assessment of the forecast.

The second part of collaboration is the knowledge in the field of future events. For example, an Account Manager may know that a major store in his territory will be remodeled in the upcoming quarter. Remodeling will affect sales, as half the store will effectively be closed. As such, the technology needs to accommodate the systematic collection and management of field knowledge that could change the forecasted values.

CLOSED-LOOP

Closed-Loop Forecasting is the ability to adapt the forecast methodology for continuous forecast accuracy improvement. Closed-Loop starts with multiple dimensional forecasts, and over time these different models are compared to actual results. For example, three demand forecasts for the next 12 months are prepared for a seasonal manufacturing business and then run through a Monte Carlo Simulation. Each forecast model uses a different time series of data and dimensional hierarchy. For the next three months, the actual results are compared to the forecasts in each of the three models. In comparing the results, the technology must provide for the following:

Multi-Dimensional Segmentation
Database Comparisons
Performance Dimension Segmentation

Multi-dimensional segmentation (e.g. by region, state, city, manager, product, etc.) is used to drill into the forecast to find specifically where material deviations occur and assess the nature of the deviation.
For example, the forecast is off by 20% in Los Angeles for the Laser 8300 printer due to an auto accident involving the truck that carried those printers. In this case no adjustment to the model is needed, as the deviation was caused by a singular event.

However, suppose the cause was not an auto accident but the statistics. Comparing the three different models may show that one had better accuracy because of its selection of the time series (or a different dimensional hierarchy or formula). To provide productivity in analyzing different forecasts, the technology should be able to automatically calculate the difference between forecast and actual values as well as create Performance Dimensions to segment material deviations.

The concept of a Performance Dimension is exceptionally powerful (and not just for forecasting but for business analysis in general). A Performance Dimension is defined as:

The segmentation of data based on the arithmetical, mathematical or statistical performance of the data with respect to time

A simple example of a Performance Dimension is to segment those products that provide 80% of the revenue year to date. The segmentation is arithmetic (a simple division to obtain the products that are 80% of revenue) and the time frame is the year-to-date revenue. The segmentation would differ if the analysis were posed on a current month, quarter-to-date or rolling 12-month basis.

Closed-Loop forecasting seeks material deviations; i.e. those that have a material effect on the business. As such, nested Performance Dimensions are used. For example, find those deviations that are more than 10% from actual AND more than 1,000 units this month.

This Closed-Loop provides the means to continuously assess and improve the forecast accuracy. Through multi-dimensional analysis, the formula, dimensional hierarchy and time series of data that can yield the most probable forecasts are refined.
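The two Performance Dimension examples above (products supplying 80% of year-to-date revenue, and deviations of more than 10% from actual AND more than 1,000 units) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the product labels and all figures other than the Los Angeles / Laser 8300 case are hypothetical, and the paper does not prescribe how the segmentation is implemented.

```python
def top_revenue_segment(revenue_by_product, share=0.80):
    """Products that, ranked by revenue, together reach `share` of the total."""
    total = sum(revenue_by_product.values())
    ranked = sorted(revenue_by_product.items(), key=lambda kv: kv[1], reverse=True)
    segment, running = [], 0.0
    for product, revenue in ranked:
        if running >= share * total:
            break  # the target share is already covered
        segment.append(product)
        running += revenue
    return segment


def material_deviations(rows, pct_threshold=0.10, unit_threshold=1000):
    """Nested segmentation: keep rows that are off by more than pct_threshold
    of actual AND by more than unit_threshold units."""
    flagged = []
    for key, forecast, actual in rows:
        if actual == 0:
            continue  # percentage deviation is undefined; skipped in this sketch
        diff = forecast - actual
        if abs(diff) / abs(actual) > pct_threshold and abs(diff) > unit_threshold:
            flagged.append((key, diff))
    return flagged


# Hypothetical year-to-date revenue and monthly unit figures.
ytd_revenue = {"A": 500, "B": 350, "C": 100, "D": 50}
monthly_units = [
    (("Los Angeles", "Laser 8300"), 12000, 10000),  # 20% and 2,000 units off
    (("New York", "Laser 8300"), 50500, 49000),     # 1,500 units off but about 3%
    (("Reno", "Laser 8300"), 600, 400),             # 50% off but only 200 units
]

print(top_revenue_segment(ytd_revenue))
print(material_deviations(monthly_units))
```

Only the Los Angeles row passes both tests, which is the point of nesting: a large percentage on a small base, or a large unit count on a small percentage, is not treated as material.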
DIMENSIONAL CRITICALITY

Dimensionality is only lightly considered by most Forecasting, Data Mining, Demand Planning and Business Intelligence tools because of the Dimensional constriction inherent in their databases. However, forecast accuracy is critically dependent on routinely being able to engage 40 to 80 Dimensions for most Enterprise business applications.

Consistently achieving accurate forecasts a year in advance requires the modeling and testing of several Dimensional configurations from the highest through the lowest Dimensions, or what is referred to as a full Line-of-Sight. For example, a Line-of-Sight from the Region to State to City to Salesman to Brand to SKU is one such configuration. At least three Dimensional arrangements should be attempted and the results compared to determine the optimum configuration for forecast accuracy (overall accuracy may be equivalent at the high level but vary widely at intermediate levels).
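To illustrate the remark that accuracy can look equivalent at a high level while varying at lower levels, the following sketch measures accuracy at two levels of a Line-of-Sight. It assumes accuracy is defined as 1 minus the absolute percentage error of the aggregated forecast, which is one common convention and not necessarily the measure used in this paper; the records and the function name are hypothetical.

```python
from collections import defaultdict


def accuracy_by_level(records, level):
    """Aggregate forecast and actual over the dimensions named in `level`
    and return 1 - |forecast - actual| / actual for each group."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for r in records:
        key = tuple(r[d] for d in level)
        totals[key][0] += r["forecast"]
        totals[key][1] += r["actual"]
    return {k: 1.0 - abs(f - a) / a for k, (f, a) in totals.items() if a != 0}


# Hypothetical records: one over-forecast and one under-forecast city.
records = [
    {"region": "West", "city": "Los Angeles", "forecast": 1200, "actual": 1000},
    {"region": "West", "city": "San Diego", "forecast": 800, "actual": 1000},
]

print(accuracy_by_level(records, ("region",)))         # errors cancel at Region
print(accuracy_by_level(records, ("region", "city")))  # errors visible at City
```

In this toy data the Region total is forecast exactly while each City is off by 20%, so a configuration judged only at the top of the hierarchy would hide the deviation.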
FORECASTING FORMULA TOOL-KIT

There is a wide variety of forecasting formulas but no one magic formula that can forecast all outcomes accurately all the time. Accordingly, achieving accurate forecasts requires a tool-kit of forecast formulas that can span the following business data characteristics:

Seasonal
Non-linear
Linear
Correlative

The above formula types apply to the four main business data characteristics. Seasonal is the most common behavior of sales and demand data, and Holt-Winters is a well-recognized formula for this behavior. Non-linear has less general business application but has its use when the data is neither seasonal nor linear. Here ARIMA models work well. Linear has two characteristic forms, the flat line and the line of the trend. An Average Value formula responds to the flat-line behavior and a Linear Regression fits a straight line through the trend. Finally, correlative formulas use correlative models in conjunction with statistical forecasting to relate the dependencies between different data. Models such as Flex on Historical and Bayesian-Markov can be applied.

PREDICTIVE FORECASTING

Predictive Forecasting assesses the probability of achieving the forecast or, viewed another way, quantifies the uncertainty that surrounds the forecast. A forecast is a calculated value at a future point in time. However, the future has uncertainties and the probability of hitting a point value is relatively small. Therefore, once a forecast is created, Predictive Forecasting employs statistical probability formulas to determine a level of confidence and the most probable range of values about the forecast.

An example of a Predictive Forecast is presented on the following graph, calculated from a Monte Carlo Simulation. The white line is a forecast using a Holt-Winters seasonal model. The forecast represents a 50% probability that the future actual value will fall at or below the forecast value. The blue bars around the forecast line are the statistical standard deviations, representing 68%, 14% and 2% probabilities that future outcomes will fall within those ranges.

Using the chart enables assessing the probability of actions. For example, the top of the darkest blue bar represents an 84% probability that the future outcome will fall at or below that mark. If the business wants a 98% or 99% probability of outcome, it can use the top of the second or third blue bars, respectively, but there will be a disproportionately higher price for the increased level of confidence. As such, business has a quantitative assessment with which to balance risk and capital outlay.

CONCLUSION

Long-range forecast accuracy a year in advance requires technology, methodology, collaboration and Closed-Loop. Deep Dimensionality and the ability to model several Dimensional configurations are crucial to forecast accuracy. Apply collaboration and Closed-Loop Forecasting to Dimensional models to continuously analyze and improve the accuracy of the forecast method.
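As a rough check on the probability marks quoted under Predictive Forecasting (50% at the forecast line, 84% at the top of the first band, 98% and 99% at the second and third), the sketch below simulates outcomes around a point forecast and counts how many fall at or below each mark. It assumes normally distributed forecast errors, which the paper does not state; the forecast value, standard deviation and function names are hypothetical, and the paper's own Monte Carlo procedure may differ.

```python
import random
from statistics import NormalDist


def simulate_outcomes(point_forecast, std_dev, n=100_000, seed=0):
    """Draw n simulated outcomes, assuming normal errors around the forecast."""
    rng = random.Random(seed)
    return [rng.gauss(point_forecast, std_dev) for _ in range(n)]


def probability_at_or_below(simulated, mark):
    """Empirical share of simulated outcomes that fall at or below `mark`."""
    return sum(1 for x in simulated if x <= mark) / len(simulated)


forecast, sigma = 10_000.0, 500.0  # hypothetical point forecast and std. deviation
outcomes = simulate_outcomes(forecast, sigma)

for k in range(4):  # forecast line, then tops of the first, second and third bands
    mark = forecast + k * sigma
    simulated = probability_at_or_below(outcomes, mark)
    analytic = NormalDist().cdf(k)  # exact value under the normal assumption
    print(f"forecast + {k} std dev: simulated {simulated:.3f}, analytic {analytic:.3f}")
```

Under the normal assumption the analytic values are roughly 0.50, 0.84, 0.98 and 0.999, which matches the marks read from the chart and shows why each further gain in confidence requires moving a full standard deviation higher.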
ISIS Solutions, Inc.
www.isis-solution.com
(c) Copyright 2009 ISIS Solutions, Inc. All Rights Reserved.
Predictive Forecasting and ISIS Discovery & Predictive Analytics are Trademarks of ISIS Solutions, Inc. All other trademarks are the property of their respective companies.