The value of competitive information in forecasting FMCG retail product sales and category effects


The value of competitive information in forecasting FMCG retail product sales and category effects
Professor Robert Fildes, r.fildes@lancaster.ac.uk
Dr Tao Huang, t.huang@lancaster.ac.uk
Dr Didier Soopramanien, d.soopramanien@lancaster.ac.uk

Outline
- The research question
- Literature summary
- Our contributions
  - Incorporating competitive information
  - Accounting for the change of the market environment
- Data and experimental design
- Results and insights
- Conclusion

The research question
- We forecast retailer product sales (demand) at the product level (e.g. SKU/UPC).
- Accurate forecasts are important for inventory planning (e.g. to avoid over-stock and out-of-stock conditions).
- We want to improve forecast accuracy.
- But surely this has been done?

The shape of the data series [figure not reproduced]

What has been proposed?
Many retailers use simple statistical methods to generate baseline forecasts and then rely on managers to adjust them for promotional events.
- Cooper et al. (1999): PromoCast, which estimates the adjustments from historical information
- Fildes et al. (2009): mechanisms to help managers improve their adjustments
Other studies proposed technically sophisticated methods that try to use the price/promotional information of the focal product more effectively.
- Aburto and Weber (2007): ANN
- Ali et al. (2009): regression tree

How do we contribute?
- We incorporate competitive information
  - Competitive price and competitive promotions
  - These are strong influencing factors on product sales
  - The data are available
  - Previous studies all overlooked competitive information in forecasting
- We account for the change of the market environment
  - In reality the effects of price/promotions change over time
  - Ignoring this fact leads to forecast bias
- We validate our proposals

Define competitive information
The high dimensionality problem: there are too many predictors. Each product category typically contains 100-200 items, so a model that includes them all cannot be simplified or even estimated.
Method 1: we apply a variable selection method
- Stepwise regression is the best known, but it is heavily criticized for retaining irrelevant variables and ignoring relevant ones (see Harrell, 2001)
- Least Absolute Shrinkage and Selection Operator (LASSO) (Tibshirani 1996; Turlach 2000)
- Autometrics (Hendry and Krolzig)
We use a combination of stepwise regression and LASSO (but surely there are alternative algorithms!)
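To make the variable selection step concrete, here is a minimal sketch of LASSO fitted by cyclic coordinate descent on simulated data. It is not the authors' implementation: the data, the penalty value `alpha`, and the three "relevant" competitor series are assumptions chosen only for illustration, and the stepwise stage is omitted.

```python
import numpy as np

def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimise (1/2n)||y - Xb||^2 + alpha*||b||_1 by cyclic coordinate descent.
    Assumes the columns of X are standardised and y is centred."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y - X @ beta
    for _ in range(n_iter):
        for j in range(p):
            # covariance of column j with the residual that excludes column j
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            # soft-thresholding sets weak predictors exactly to zero
            new = np.sign(rho) * max(abs(rho) - alpha, 0.0) / col_sq[j]
            resid += X[:, j] * (beta[j] - new)
            beta[j] = new
    return beta

rng = np.random.default_rng(0)
n, p = 120, 150  # 120 weeks, 150 hypothetical competitor price/promotion series
X = rng.normal(size=(n, p))
X = (X - X.mean(axis=0)) / X.std(axis=0)
true_beta = np.zeros(p)
true_beta[[3, 17, 42]] = [0.8, -0.6, 0.5]  # only three series matter (assumed)
y = X @ true_beta + 0.3 * rng.normal(size=n)
y = y - y.mean()

beta_hat = lasso_coordinate_descent(X, y, alpha=0.15)
selected = np.flatnonzero(beta_hat)
print("selected competitor series:", selected)
```

The point of the sketch is that the model remains estimable even though there are more candidate predictors (150) than weeks (120), because the penalty keeps only a few of them.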

Define competitive information
The high dimensionality problem: there are too many predictors. Each product category typically contains 100-200 items, so a model that includes them all cannot be simplified or even estimated.
Method 2: we apply Principal Component Analysis (PCA)
- It condenses a large number of competitive explanatory variables into a handful of diffusion indexes (DI)
- Diffusion indexes perform well in forecasting macroeconomic variables (Stock and Watson, 2002)
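As an illustration of how diffusion indexes can be built, the sketch below extracts the first few principal components from a simulated panel of competitor variables using a singular value decomposition. The panel, the number of common drivers, and the choice of `k=3` are assumptions for the example, not values from the study.

```python
import numpy as np

def diffusion_indexes(Z, k):
    """Return the first k principal-component scores of Z (T x N)
    and the share of variance each one explains."""
    Zs = (Z - Z.mean(axis=0)) / Z.std(axis=0)   # standardise each series
    U, s, Vt = np.linalg.svd(Zs, full_matrices=False)
    scores = U[:, :k] * s[:k]                   # T x k diffusion indexes
    explained = (s ** 2) / (s ** 2).sum()
    return scores, explained[:k]

rng = np.random.default_rng(1)
T, N = 120, 150                                 # weeks x competitor variables
common = rng.normal(size=(T, 2))                # two hypothetical common drivers
loadings = rng.normal(size=(2, N))
Z = common @ loadings + 0.5 * rng.normal(size=(T, N))

di, share = diffusion_indexes(Z, k=3)
print("variance share of the first three indexes:", np.round(share, 3))
```

The columns of `di` would then enter the forecasting model in place of the 150 original series.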

Incorporate competitive information
We incorporate the competitive information (explanatory variables selected by LASSO/stepwise, OR diffusion indexes constructed by PCA) into Autoregressive Distributed Lag (ADL) models and then simplify the model following the general-to-specific modelling strategy (Hendry 1995).
The econometric model has good interpretability and has also proved effective in other areas: tourism data in Song and Witt (2003); airline passenger flow data in Fildes, Wei et al. (2009).

An example: The general ADL model
Start with a general model and then simplify it. The general model explains product sales (in logs) with:
- Lags of product sales
- Lags of own price/promotions
- Lags of competitive price/promotions (selected by LASSO/stepwise); OR lags of diffusion indexes (constructed by PCA)
For simplicity here we do not show weekly indicators and dummies for calendar events.
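The slide's equation did not survive transcription. A generic ADL form consistent with the components listed would look as follows; the symbols and lag orders ($S_t$, $\mathbf{x}_t$, $\mathbf{z}_t$, $p$, $q$) and the inclusion of contemporaneous terms ($j = 0$) are assumptions for illustration rather than the authors' exact specification.

```latex
\ln S_t = \alpha
  + \sum_{i=1}^{p} \phi_i \, \ln S_{t-i}
  + \sum_{j=0}^{q} \boldsymbol{\beta}_j' \, \mathbf{x}_{t-j}
  + \sum_{j=0}^{q} \boldsymbol{\gamma}_j' \, \mathbf{z}_{t-j}
  + \varepsilon_t
```

Here $S_t$ is unit sales of the focal product in week $t$, $\mathbf{x}_t$ collects its own price and promotion variables, and $\mathbf{z}_t$ collects either the selected competitive price/promotion variables or the diffusion indexes. General-to-specific simplification then removes terms that do not contribute.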

The change of the market environment
The effects of price and promotions (on product sales) change over time owing to:
- Economic conditions (consumers are more price/promotion sensitive during an economic crunch)
- Changes in consumer tastes
- Competitive activities
- New product entry
- Changes in any other driving factors that are related to price and promotions but not included in the model

What happens if we ignore it?
If we settle for a model with constant parameters when in fact the effects of price and promotions are changing over time:
- The model will be subject to structural break
- It will be exposed to forecast failure, i.e. forecasts are biased, forecast error variance is also slightly inflated, and overall forecasting performance is poor compared to the model's in-sample fit (Clements and Hendry, 1999)

An example of how structural break causes forecast bias
[Figure: simulated weekly sales, actual and predicted, over weeks 1-118; vertical axis Sales, horizontal axis Weeks]
- Simulated data: y is sales, x is price, x ~ Unif(0,1), u ~ Unif(0,1).
- Before the break, $y = 10 - 2x + u$; after the break, $y = 14 - 3x + u$. Consumer demand increases, but consumers also become more price sensitive. In reality, the timing of the change is UNKNOWN.
- Now we build a model with constant parameters: $y = 12.4 - 2.3x$. The deterministic mean of the model with constant parameters will be a WEIGHTED AVERAGE for the data before and after the structural break.
- The model obviously under-forecasts in the forecast period: this is the forecast bias.
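The simulated example can be reproduced roughly as follows. The two regimes and the Unif(0,1) draws come from the slide; the break week (60), the forecast origin (100) and the random seed are assumptions, so the fitted coefficients will not match the slide's 12.4 and 2.3 exactly.

```python
import numpy as np

rng = np.random.default_rng(42)
n_weeks, break_week, origin = 118, 60, 100  # break_week and origin are assumed

week = np.arange(1, n_weeks + 1)
x = rng.uniform(0, 1, n_weeks)              # price
u = rng.uniform(0, 1, n_weeks)              # disturbance
y = np.where(week <= break_week, 10 - 2 * x + u, 14 - 3 * x + u)  # sales

# Constant-parameter OLS fitted on weeks 1..origin, ignoring the break
A = np.column_stack([np.ones(origin), x[:origin]])
intercept, slope = np.linalg.lstsq(A, y[:origin], rcond=None)[0]

# Forecast weeks origin+1..n_weeks with the known future prices
forecast = intercept + slope * x[origin:]
bias = np.mean(y[origin:] - forecast)       # positive means under-forecasting
print(f"fitted: y = {intercept:.2f} + ({slope:.2f})x, mean forecast error = {bias:.2f}")
```

The fitted intercept and slope fall between the values of the two regimes, which is why the forecasts sit below the post-break sales.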

Test for structural break
ADL models with LASSO/stepwise OR diffusion factors are subject to structural break.
[Figure: percentage of models subject to structural break (Chow test, α = 0.05), shown for the specification and rolling samples; the plotted values range from 83% to 100%]
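For readers unfamiliar with the Chow test, a minimal sketch of its F statistic is given below. It compares the residual sum of squares of one pooled regression with that of two separate regressions split at a candidate break point. The simulated data and the split point are assumptions; the slides do not say how the candidate break date was chosen.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares of an OLS fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return float(r @ r)

def chow_f(X, y, split):
    """Chow F statistic for a break after the first `split` observations.
    Under no break it follows an F(k, n - 2k) distribution."""
    n, k = X.shape
    rss_pooled = rss(X, y)
    rss_split = rss(X[:split], y[:split]) + rss(X[split:], y[split:])
    return ((rss_pooled - rss_split) / k) / (rss_split / (n - 2 * k))

rng = np.random.default_rng(3)
n, split = 100, 60                      # sample size and candidate break (assumed)
x = rng.uniform(0, 1, n)
u = rng.uniform(0, 1, n)
X = np.column_stack([np.ones(n), x])

y_stable = 10 - 2 * x + u               # no break
y_break = np.where(np.arange(n) < split, 10 - 2 * x + u, 14 - 3 * x + u)

f_stable = chow_f(X, y_stable, split)
f_break = chow_f(X, y_break, split)
print(f"F without break = {f_stable:.2f}, F with break = {f_break:.2f}")
```

With two parameters and 100 observations, the 5% critical value of F(2, 96) is roughly 3.1, so a statistic far above that indicates a break at the chosen split.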

Offsetting the forecast bias
Models subject to structural break are exposed to forecast bias. If we can mitigate this bias, we may be able to improve the forecasting performance.
- One way is to allow the parameters to vary over time, e.g. as AR(1) processes: $y_t = \mathrm{int}_t + \beta_t x_t + u_t$, with $\beta_t = \eta \beta_{t-1} + e_t$ and $\mathrm{int}_t = r\,\mathrm{int}_{t-1} + \varepsilon_t$. Performance is poor: the presumed functional form can hardly explain how the effects of price and promotions change over time.
- Alternatively, we estimate and then offset the forecast bias: Intercept Correction.

An Example of Intercept Correction
[Figure: simulated weekly sales, actual and predicted, over weeks 1-118, with the forecast bias marked; vertical axis Sales, horizontal axis Weeks]
- Estimate the forecast bias based on the data around the forecast origin. E.g. we take an average of the errors, assuming they are ALL caused by forecast bias.
- Then we offset the bias in the forecast period using the estimated bias.
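The two steps above can be sketched on the same kind of simulated data. The number of recent errors averaged (`m = 8`), the break week and the forecast origin are assumptions; the slides do not state how many errors the authors averaged.

```python
import numpy as np

rng = np.random.default_rng(7)
n_weeks, break_week, origin, m = 118, 60, 100, 8  # all four values are assumed

week = np.arange(1, n_weeks + 1)
x = rng.uniform(0, 1, n_weeks)
u = rng.uniform(0, 1, n_weeks)
y = np.where(week <= break_week, 10 - 2 * x + u, 14 - 3 * x + u)

# Constant-parameter model estimated on weeks 1..origin
A = np.column_stack([np.ones(origin), x[:origin]])
coef = np.linalg.lstsq(A, y[:origin], rcond=None)[0]
fitted = A @ coef
raw_forecast = coef[0] + coef[1] * x[origin:]

# Step 1: estimate the bias as the average of the last m in-sample errors,
# treating those errors as if they were ALL caused by forecast bias
est_bias = np.mean(y[origin - m:origin] - fitted[origin - m:])

# Step 2: offset the bias in the forecast period
ic_forecast = raw_forecast + est_bias

mae_raw = np.mean(np.abs(y[origin:] - raw_forecast))
mae_ic = np.mean(np.abs(y[origin:] - ic_forecast))
print(f"estimated bias = {est_bias:.2f}, MAE raw = {mae_raw:.2f}, MAE with IC = {mae_ic:.2f}")
```

Averaging only a few recent errors makes the correction responsive to the break but also noisy, which is the usual trade-off when choosing `m`.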

A trade-off against forecast bias
Models subject to structural break are exposed to forecast bias. Rather than offsetting the forecast bias, we may trade off forecast bias against (reduced) forecast error variance by combining the forecasts generated by models with various lengths of estimation window: Estimation Window Combining (EWC) (Pesaran and Timmermann, 2007).

An example of combining forecasts
[Figure: simulated weekly sales, actual and predicted; vertical axis Sales, horizontal axis Weeks. Simulated data: y is sales, x is price, x ~ Unif(0,1), u ~ Unif(0,1)]
- The data follow $y = 10 - 2x + u$ before the break and $y = 14 - 3x + u$ after it. Ideally, we would use only the data AFTER the structural break, but the break time is UNKNOWN.
- Forecast 1: we use only the data close to the forecast origin. The model may not be subject to structural break, but it will have inflated forecast error variance (because less information is used).
- Forecast 2 (estimation with full sample data): at the other extreme, we use ALL the data in the sample. The forecasts are biased, but the forecast error variance is smaller than in the previous scenario.

Combining forecasts
Thus we can trade off (incurring) forecast bias against (reducing) forecast error variance by estimating the same model, $y = \mathrm{int} + \beta x$, with various estimation windows:
- Estimate the model using data [80, 100] and generate Forecast 1
- Estimate the model using data [1, 100] and generate Forecast 2
- Finally, take the average of Forecast 1 and Forecast 2. The final forecasts may be more accurate (explained by the philosophy of forecast combination)
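A minimal sketch of this two-window combination follows. The windows [80, 100] and [1, 100] and the equal-weight average come from the slide; the break week, the seed and the helper name `ols_forecast` are assumptions for illustration.

```python
import numpy as np

def ols_forecast(x_train, y_train, x_future):
    """Fit y = int + beta * x by OLS and forecast at x_future."""
    A = np.column_stack([np.ones(len(x_train)), x_train])
    intercept, beta = np.linalg.lstsq(A, y_train, rcond=None)[0]
    return intercept + beta * x_future

rng = np.random.default_rng(11)
n_weeks, break_week, origin = 118, 60, 100  # break_week is assumed
week = np.arange(1, n_weeks + 1)
x = rng.uniform(0, 1, n_weeks)
u = rng.uniform(0, 1, n_weeks)
y = np.where(week <= break_week, 10 - 2 * x + u, 14 - 3 * x + u)

x_future, y_future = x[origin:], y[origin:]
forecast_1 = ols_forecast(x[79:origin], y[79:origin], x_future)  # weeks 80..100
forecast_2 = ols_forecast(x[:origin], y[:origin], x_future)      # weeks 1..100
combined = (forecast_1 + forecast_2) / 2                         # equal weights

def mae(f):
    return float(np.mean(np.abs(y_future - f)))

print(f"MAE short window = {mae(forecast_1):.2f}, "
      f"full window = {mae(forecast_2):.2f}, combined = {mae(combined):.2f}")
```

In this simulation the break happens before week 80, so the short window alone is unbiased. The combination is useful precisely because that timing is not known in practice.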

Data
- Dominick's Finer Foods, a large retail chain in the Chicago area in the U.S. (data available from the University of Chicago website)
- Unit sales, price, and promotions at the UPC level; weekly data
- Promotions include simple price discount (75%), bonus buy (25%), and coupons (less than 1%); we use one variable to represent them
- Data are aggregated across 83 stores based on All Commodity Volume (the revenue of the store)
- 122 items in 6 product categories: Soft Drinks, Frozen Juices, Canned Soup, Bath Soap, Front-end Candies, and Bathroom Tissue
- Items are selected with relatively large sales volumes

Experiment design
- Fixed-window rolling forecasts: estimation period of 120 weeks; forecasts 1, 1-4, and 1-12 weeks ahead
- 70 rolling events for each item
- Model specification: 200 weeks. Ideally the model would be re-specified every time. This can be simplified by assuming foreknowledge of the data and of the model that would ideally be selected (Fildes et al. 2009)
- Error measures: MAPE, symmetric MAPE, MASE, and AvgRelMAE
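For reference, the error measures can be computed as in the sketch below. These are common textbook definitions and may differ in detail from the ones used in the study. In particular, the symmetric MAPE denominator and the weighted geometric mean used for AvgRelMAE are assumptions about the exact variant.

```python
import numpy as np

def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    return 100 * np.mean(np.abs(actual - forecast) / np.abs(actual))

def smape(actual, forecast):
    """Symmetric MAPE, in percent, with (|actual| + |forecast|) / 2 as the scale."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    return 100 * np.mean(2 * np.abs(actual - forecast)
                         / (np.abs(actual) + np.abs(forecast)))

def mase(actual, forecast, train):
    """MAE scaled by the in-sample MAE of the one-step naive forecast."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    scale = np.mean(np.abs(np.diff(np.asarray(train, float))))
    return np.mean(np.abs(actual - forecast)) / scale

def avg_rel_mae(mae_model, mae_benchmark, n_obs):
    """Weighted geometric mean across series of MAE(model) / MAE(benchmark)."""
    ratios = np.asarray(mae_model, float) / np.asarray(mae_benchmark, float)
    weights = np.asarray(n_obs, float)
    return float(np.exp(np.sum(weights * np.log(ratios)) / np.sum(weights)))

print(mape([100, 200], [110, 180]))
print(smape([100, 200], [110, 180]))
print(mase([8, 10], [7, 12], train=[1, 2, 4, 7]))
print(avg_rel_mae([1, 4], [2, 2], [1, 1]))
```

An AvgRelMAE below 1 means the model beats the benchmark on average across series.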

Candidate models and results
Candidate models vary along two dimensions: 1) competitive information and 2) offsetting forecast bias. Here we show the symmetric MAPE results for forecast horizon 1-12 weeks ahead; results based on other error measures are similar.

Competitive information | Ignoring the change of market environment | Intercept Correction (IC) | Estimation Window Combining (EWC)
No competitive information | ADL-OWN 25.6% | ADL-OWN-IC 23.9% | ADL-OWN-EWC 24.1%
LASSO/stepwise | ADL 23.8% | ADL-IC 23.0% | ADL-EWC 23.3%
Diffusion Factor | ADL-DI 23.0% | ADL-DI-IC 22.5% | ADL-DI-EWC 22.8%

Benchmark without competitive information: Base-times-lift 32.6%.

- Incorporating competitive information does improve forecasting accuracy. ADL and ADL-DI both outperform ADL-OWN.
- Accounting for the change of the market environment does improve forecasting accuracy. Models with IC and EWC all outperform their counterparts, with and without competitive information.

Results and insights
We can improve the forecasting accuracy by incorporating competitive information: PCA and LASSO/stepwise

ADL-DI versus ADL-OWN | 1-12 wks ahead | 1-4 wks ahead | 1 week ahead
Promoted | -6.1% | -4.6% | -3.5%
Non-promoted | -14.6% | -12.0% | -8.6%

ADL versus ADL-OWN | 1-12 wks ahead | 1-4 wks ahead | 1 week ahead
Promoted | -3.3% | -0.2% | 1.2%
Non-promoted | -10.2% | -7.0% | -4.6%

ADL and ADL-DI substantially outperform ADL-OWN when the focal product is not being promoted. A possible reason is that retailers try to avoid promoting competing products at the same time, so when the focal product is being promoted there tends to be less promotional information on other competitive items.

Results and insights
We can improve the forecasting accuracy by offsetting potential forecast bias using Intercept Correction (IC) and Estimation Window Combining (EWC)

ADL-OWN-IC versus ADL-OWN | 1-12 wks ahead | 1-4 wks ahead | 1 week ahead
Promoted | -2.0% | -0.8% | 0.7%
Non-promoted | -9.6% | -10.2% | -9.7%

ADL-OWN-EWC versus ADL-OWN | 1-12 wks ahead | 1-4 wks ahead | 1 week ahead
Promoted | -0.3% | 0.0% | -0.6%
Non-promoted | -8.3% | -8.0% | -7.1%

In the absence of competitive information, by offsetting the potential forecast bias of the ADL-OWN model, we achieve substantially higher forecasting accuracy, mainly for the forecast period when the focal product is not being promoted.

Results and insights
We can improve the forecasting accuracy by offsetting potential forecast bias using Intercept Correction (IC) and Estimation Window Combining (EWC)

ADL-EWC versus ADL | 1-12 wks ahead | 1-4 wks ahead | 1 week ahead
Promoted | 0.8% | 0.6% | -0.9%
Non-promoted | -3.2% | -4.2% | -4.4%

ADL-IC versus ADL | 1-12 wks ahead | 1-4 wks ahead | 1 week ahead
Promoted | -1.1% | -0.9% | -0.7%
Non-promoted | -4.6% | -5.1% | -4.6%

WITH competitive information, by offsetting the potential forecast bias of the ADL model, we achieve substantially higher forecasting accuracy, mainly for the forecast period when the focal product is not being promoted.

Results and insights
We can improve the forecasting accuracy by offsetting potential forecast bias using Intercept Correction (IC) and Estimation Window Combining (EWC)

ADL-DI-EWC versus ADL-DI | 1-12 wks ahead | 1-4 wks ahead | 1 week ahead
Promoted | -1.7% | -2.8% | -4.0%
Non-promoted | -6.2% | -7.3% | -6.8%

ADL-DI-IC versus ADL-DI | 1-12 wks ahead | 1-4 wks ahead | 1 week ahead
Promoted | -3.7% | -5.0% | -4.9%
Non-promoted | -7.8% | -8.3% | -6.9%

WITH competitive information, by offsetting the potential forecast bias of the ADL-DI model, we achieve substantially higher forecasting accuracy, mainly for the forecast period when the focal product is not being promoted.

Summary
We can improve the forecasting accuracy by:
- Incorporating competitive information: PCA and LASSO/stepwise
- Accounting for the change of the market environment: Intercept Correction (IC) and Estimation Window Combining (EWC)
The advantage of the new models mainly comes from the forecast period when the focal product is not on promotion.
The best model is the ADL model with diffusion indexes and intercept correction (i.e. ADL-DI-IC).

Thank you! Questions? Professor Robert Fildes r.fildes@lancaster.ac.uk Dr Tao Huang t.huang@lancaster.ac.uk Dr Didier Soopramanien d.soopramanien@lancaster.ac.uk