ISSN 1645-0647 Notes on Pragmatic Procedures and Exponential Smoothing 04-1998 António Ascensão Costa. This document may not be reproduced, in any form, without express authorization.

NOTES ON PRAGMATIC FORECASTING PROCEDURES AND EXPONENTIAL SMOOTHING ANTÓNIO ASCENSÃO COSTA ISEG - 1998

1 - INTRODUCTION

We use the term "pragmatic forecasting procedures" in opposition to model-based approaches to forecasting (such as ARIMA modelling and structural time series modelling). Pragmatic forecasting procedures were designed to solve the problem of generating good forecasts for certain general patterns of time series behaviour without clearly defining a data generating process model or the corresponding optimal forecast function.

Pragmatic forecasting procedures are extrapolative short-term forecasting methods. All univariate methods are extrapolative, whatever their statistical sophistication. All extrapolative methods carry the implicit assumption that the near future will follow the line of (at least) the recent past. But, as things are changing, it is better not to project current conditions too far ahead and to stay within short-term horizons. What "short term" means is more difficult to define: practical needs always press you to cover more periods ahead!

Pragmatic forecasting procedures all share the implicit (and also pragmatic) idea that a series is made up of a set of components (trend, seasonals, and an error). If you conceive the components as deterministic (an overall polynomial trend, a fixed seasonal pattern), a least squares regression on time and seasonal dummies is a good choice. But probably the error is red and the forecasts black! If you admit that the components evolve according to an unspecified pattern, then you need more flexible estimators. Estimators based on moving averages were a possible solution, now in oblivion, because estimators based on exponentially weighted moving averages - exponential smoothing - were found superior and easy to use (and you do not even have to know that you are working with exponentially weighted moving averages).

Exponential smoothing - the set of forecasting methods known under that name (for a comprehensive review see Gardner, 1985) - appeared in the late fifties and early sixties (Brown, 1959; Holt, 1960; Winters, 1960).
Easy to use, understand and implement, even for large systems, its popularity grew among users. Later, with the development of computational facilities, which also allowed more flexible use of exponential smoothing, more sophisticated methods entered the competition but, among extrapolative time series methods, exponential smoothing was found hard to beat in forecasting performance. After all, what came as a pragmatic forecasting procedure was found

optimal for some data generating processes and very robust in general. So, it continues in use and it continues to deserve attention, both for application in large routine forecasting systems and for providing a performance benchmark for more sophisticated methods. Its behaviour shows that naïves are not as naïve as they used to be.

2 - "CONSTANT" LEVEL SERIES: SIMPLE SMOOTHING

Assume you want to forecast future values of a series that behaves like the one shown in Graph 1.

[Graph 1: a series oscillating around a locally constant level]

The series has no trend and can be viewed as oscillations around a "locally constant level". The pragmatic solution for forecasting future values is to estimate the level at the end of the series, say $\hat{a}_T$, and, since there is no trend, to project it into the future, i.e. the forecast function is:

$\hat{Y}_{T+h} = \hat{a}_T$, for $h = 1, 2, \ldots$

Now, the problem is how to estimate the level. We can use means: an operator that filters erratic components. But since the level is changing, it seems reasonable to think that recent values of the series are more important for estimating the current level than values from the distant past. So, we want to discard some of the past information. This can be done by using moving averages: only the last N observations enter the level estimate. Each time a new observation arrives, the level estimate is revised: the oldest observation is dropped and replaced by the newcomer.
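The moving-average estimator just described can be sketched in a few lines (a minimal illustration; the function name and toy data are ours, not the author's):

```python
def moving_average_level(y, N):
    """Estimate the current level with a simple moving average of the
    last N observations; this is also the flat forecast for any horizon."""
    window = y[-N:]
    return sum(window) / len(window)

series = [101, 99, 103, 98, 102, 100]
level = moving_average_level(series, N=3)  # only the 3 newest values count
```

When a new observation arrives, append it to the series and call the function again: the oldest value simply falls out of the window.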

Simple exponential smoothing proposes another solution. Assume that at time $t-1$ you have an estimate of the level, $\hat{a}_{t-1}$, and that at time $t$ you observe $y_t$ and want to update the estimate of the level. Then the new estimate may be formed via the recursion:

$\hat{a}_t = \alpha y_t + (1-\alpha)\hat{a}_{t-1}$, with $0 < \alpha < 1$.

Actually, systematic use of this estimator is equivalent to an exponentially weighted moving average:

$\hat{a}_t = \alpha y_t + \alpha(1-\alpha) y_{t-1} + \alpha(1-\alpha)^2 y_{t-2} + \alpha(1-\alpha)^3 y_{t-3} + \cdots$

High values of the smoothing constant discount past information very quickly but filter erratic oscillations poorly; low values of the smoothing constant have the opposite effect.

3 - "LOCAL" LINEAR SERIES - HOLT'S METHOD

Assume now that you want to forecast a series like the one in Graph 2.

[Graph 2: a trended series, approximately linear over adjacent observations]

The series is trended and this trend can be approximated, at least locally, by a linear function, i.e. the trend is changing but linearity seems adequate for adjacent observations.
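Stepping back to the simple smoothing recursion of Section 2, it can be sketched in a few lines of Python (a minimal illustration; the function name, smoothing constant and toy data are ours):

```python
def ses(y, alpha, a0):
    """Simple exponential smoothing: returns the successive level estimates
    computed by a_t = alpha*y_t + (1 - alpha)*a_{t-1}, 0 < alpha < 1.
    The flat forecast for every horizon h is the last level estimate."""
    a = a0
    levels = []
    for yt in y:
        a = alpha * yt + (1 - alpha) * a
        levels.append(a)
    return levels

series = [100, 103, 98, 101, 99, 102]
levels = ses(series, alpha=0.3, a0=series[0])
forecast = levels[-1]  # same value projected for every horizon h = 1, 2, ...
```

Note how each new observation only requires one multiplication and one addition: nothing but the current level needs to be stored, which is what made the method attractive for large systems.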

To forecast future values of a linear trend we need two estimates: an estimate of the level of the trend ($\hat{a}_T$) and an estimate of its slope ($\hat{b}_T$) at the end of the period. The forecast function is then:

$\hat{Y}_{T+h} = \hat{a}_T + \hat{b}_T \cdot h$, for $h = 1, 2, \ldots$

How to get $\hat{a}_T$ and $\hat{b}_T$? Overall least squares is not adequate for a changing linear trend. "Moving" least squares or weighted least squares are better. Pragmatic solutions use estimators based on moving averages: simple and double moving averages, simple and double exponential smoothing, etc. Holt's solution uses the two following recursive estimators:

$\hat{a}_t = \alpha y_t + (1-\alpha)(\hat{a}_{t-1} + \hat{b}_{t-1})$, with $0 < \alpha < 1$
$\hat{b}_t = \beta(\hat{a}_t - \hat{a}_{t-1}) + (1-\beta)\hat{b}_{t-1}$, with $0 < \beta < 1$.

4 - DAMPED TRENDS

Empirical applications have shown that Holt's linear forecasts tend to overshoot at medium and long horizons. Gardner and McKenzie (1985) suggest introducing a parameter that moderates the extrapolation as the time horizon grows. This was found adequate for series like the one in Graph 3.

[Graph 3: a trended series whose growth flattens out]
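Holt's two recursions, together with the damped variant that Section 4 develops, can be sketched as follows (a minimal illustration under our own naming; with phi = 1 the damped version reduces to plain Holt):

```python
def holt(y, alpha, beta, a0, b0, phi=1.0):
    """Recursive level/slope estimates; phi = 1 gives Holt's linear method,
    0 < phi < 1 gives the damped-trend variant of Gardner and McKenzie."""
    a, b = a0, b0
    for yt in y:
        a_prev = a
        a = alpha * yt + (1 - alpha) * (a_prev + phi * b)
        b = beta * (a - a_prev) + (1 - beta) * phi * b
    return a, b

def forecast(a, b, h, phi=1.0):
    """Y_{T+h}: a + b*h for Holt; a + b*(phi + phi^2 + ... + phi^h) when damped."""
    return a + b * sum(phi ** j for j in range(1, h + 1))

a, b = holt([10.0, 12.0, 14.0], alpha=0.5, beta=0.5, a0=10.0, b0=2.0)
two_ahead = forecast(a, b, h=2)  # linear extrapolation from the last level/slope
```

With 0 < phi < 1 the geometric sum converges to phi/(1-phi), so distant forecasts flatten out instead of growing without bound.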

To damp the extrapolation, the linear forecast function is modified by introducing a parameter φ taking values between 0 and 1. The forecast function becomes:

$\hat{Y}_{T+h} = \hat{a}_T + \hat{b}_T \sum_{j=1}^{h} \phi^j$, for $h = 1, 2, \ldots$

Accordingly, the recursive estimators are now:

$\hat{a}_t = \alpha y_t + (1-\alpha)(\hat{a}_{t-1} + \phi\hat{b}_{t-1})$, with $0 < \alpha < 1$
$\hat{b}_t = \beta(\hat{a}_t - \hat{a}_{t-1}) + (1-\beta)\phi\hat{b}_{t-1}$, with $0 < \beta < 1$

The method can be viewed as a generalization of exponential smoothing forecasting patterns:

1) With φ = 0 the method collapses into simple exponential smoothing.
2) With 0 < φ < 1 the trend is damped and $\lim_{h\to\infty} \hat{Y}_{T+h} = \hat{a}_T + \hat{b}_T\,\phi/(1-\phi)$. For low and moderate values of φ the eventual forecast function becomes practically constant a few steps ahead.
3) With φ = 1 Holt's method emerges.
4) With φ > 1 the slope of the forecast function receives exponentially growing weight and forecasts become explosive.

5 - SEASONAL SERIES: HOLT-WINTERS METHODS

For series that exhibit seasonality, possibly non-deterministic, the same principle of recursive estimators was extended to the estimation of seasonal factors (Winters, 1960). As seasonal patterns can be judged additive or multiplicative, two sets of recursions are available. If additive seasonality is considered adequate (see Graph 4), the set of estimators and the forecast function are:

[Graph 4: a trended series with an additive seasonal pattern]

Additive seasonality

Estimators:

$\hat{a}_t = \alpha(y_t - \hat{s}_{t-L}) + (1-\alpha)(\hat{a}_{t-1} + \hat{b}_{t-1})$, with $0 < \alpha < 1$
$\hat{b}_t = \beta(\hat{a}_t - \hat{a}_{t-1}) + (1-\beta)\hat{b}_{t-1}$, with $0 < \beta < 1$
$\hat{s}_t = \gamma(y_t - \hat{a}_t) + (1-\gamma)\hat{s}_{t-L}$, with $0 < \gamma < 1$

Forecast function:

$\hat{Y}_{T+h} = \hat{a}_T + \hat{b}_T \cdot h + \hat{s}_{T+h-kL}$,
for $h = 1, 2, \ldots$, with $k = 1$ for $h \le L$, $k = 2$ for $L < h \le 2L$, ...

For series exhibiting multiplicative seasonality (see Graph 5) the set of estimators and the forecast function are:

[Graph 5: a trended series with a multiplicative seasonal pattern]

Multiplicative seasonality

Estimators:

$\hat{a}_t = \alpha(y_t / \hat{s}_{t-L}) + (1-\alpha)(\hat{a}_{t-1} + \hat{b}_{t-1})$, with $0 < \alpha < 1$
$\hat{b}_t = \beta(\hat{a}_t - \hat{a}_{t-1}) + (1-\beta)\hat{b}_{t-1}$, with $0 < \beta < 1$
$\hat{s}_t = \gamma(y_t / \hat{a}_t) + (1-\gamma)\hat{s}_{t-L}$, with $0 < \gamma < 1$

Forecast function:

$\hat{Y}_{T+h} = (\hat{a}_T + \hat{b}_T \cdot h)\,\hat{s}_{T+h-kL}$,
for $h = 1, 2, \ldots$, with $k = 1$ for $h \le L$, $k = 2$ for $L < h \le 2L$, ...

If the trend is more locally constant than locally linear, the slope estimator equation can be dropped. Damped trends can also be considered by introducing the φ parameter as above.
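The additive recursions can be sketched as follows (a compact illustration with our own naming and indexing conventions, not the author's; the seasonal factors are kept in a list of length L, one per season):

```python
def holt_winters_additive(y, L, alpha, beta, gamma, a0, b0, s0):
    """Additive Holt-Winters; s0 holds the L starting seasonal estimates,
    s[t % L] being the factor for the season observation t falls in."""
    a, b, s = a0, b0, list(s0)
    for t, yt in enumerate(y):
        a_prev = a
        a = alpha * (yt - s[t % L]) + (1 - alpha) * (a_prev + b)
        b = beta * (a - a_prev) + (1 - beta) * b
        s[t % L] = gamma * (yt - a) + (1 - gamma) * s[t % L]
    return a, b, s

def hw_forecast(a, b, s, L, T, h):
    """Y_{T+h} = a + b*h + the most recently updated factor for that season."""
    return a + b * h + s[(T + h - 1) % L]

# A toy series with period L = 2: level 0, no trend, seasonal swing of +/-1.
a, b, s = holt_winters_additive([1.0, -1.0, 1.0, -1.0], L=2,
                                alpha=0.5, beta=0.5, gamma=0.5,
                                a0=0.0, b0=0.0, s0=[1.0, -1.0])
nxt = hw_forecast(a, b, s, L=2, T=4, h=1)  # next period repeats the +1 swing
```

The multiplicative recursions have exactly the same shape, with $y_t/\hat{s}$ and $y_t/\hat{a}$ in place of the differences, and the forecast multiplying (rather than adding) the seasonal factor.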

6 - PRACTICAL ISSUES

6.1 - IDENTIFICATION

The presentation above shows that exponential smoothing is more than a collection of methods. Exponential smoothing should be viewed as a forecasting methodology, with different methods for different time series patterns. So, selection of the appropriate exponential smoothing procedure is important in order to achieve greater accuracy, a criticism often made of forecasting competitions. The simplest way to select the most appropriate method is visual inspection, i.e., graphical analysis of the patterns of the series. A more objective selection procedure, which can be used in automatic forecasting, was proposed by Gardner and McKenzie (1988). We reproduce their identification rules in Table 1.

Table 1 - Gardner and McKenzie identification rules

Case  Minimum variance series      Procedure selected
A     Original                     Constant level
B     First difference             Damped trend
C     Second difference            Linear or exponential trend
D     First seasonal difference    Constant level, seasonal
E     First difference of D        Damped trend, seasonal
F     Second difference of D       Linear or exponential, seasonal

Tashman and Kruk (1996) report results on the use of three selection protocols (Gardner-McKenzie's variance analysis, a visual protocol and a selection procedure using the Schwarz BIC).

6.2 - CHOOSING SMOOTHING PARAMETERS

When computational facilities were not as developed as they are now, work with exponential smoothing methods tended to rely on guesstimates of the smoothing parameters. Values between 0.1 and 0.3 were suggested. The most usual approach now is estimation by minimization of some function of the one-step-ahead forecast errors (usually the sum of squared

errors). This approach has shown that the suggested range for guesstimates is often violated.

6.3 - STARTING VALUES

Any recursive estimator needs starting values. To start simple exponential smoothing we need an estimate $\hat{a}_0$; to start Holt's method we need $\hat{a}_0$ and $\hat{b}_0$; and for the seasonal methods we need $\hat{a}_0$, $\hat{b}_0$ and $\hat{s}_i$ for $i = 0, -1, \ldots, -L+1$. Many heuristic solutions are in use. The current idea concerning starting values seems to be that, no matter how we start, starting values are no longer important after some recursions, for moderately long series. This line of reasoning can be supported by what happens with simple exponential smoothing. In fact, after t periods, the estimator of the level of the series will be:

$\hat{a}_t = \alpha y_t + \alpha(1-\alpha) y_{t-1} + \alpha(1-\alpha)^2 y_{t-2} + \cdots + \alpha(1-\alpha)^{t-1} y_1 + (1-\alpha)^t \hat{a}_0$

and we see that the effect of the starting value vanishes as t grows. For seasonal factors this will not happen so quickly, unless we have a very long series or use a very high smoothing parameter. And, more generally, even for the simpler methods, empirical applications show that parameter optimization (and, to some extent, the forecasts) is very sensitive to different starting values (Chatfield and Yar, 1988).

7 - OPTIMALITY OF EXPONENTIAL SMOOTHING METHODS

The optimality of exponential smoothing methods for some stochastic processes and ARIMA models was soon recognized (Muth, 1960; Nerlove and Wage, 1964; Theil and Wage, 1964; Roberts, 1982). We present some results.

Simple exponential smoothing

Note that $\hat{a}_t = \alpha y_t + (1-\alpha)\hat{a}_{t-1}$ can also be written (the error correction form) as

(1) $\hat{a}_t = \hat{a}_{t-1} + \alpha e_t$

where $e_t = y_t - \hat{a}_{t-1}$ is the last one-step-ahead forecast error.
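Two of the points above lend themselves to a quick sketch: fitting α by minimising the one-step squared errors (Section 6.2), and checking numerically that simple smoothing reproduces the one-step forecasts of the restricted ARIMA(0,1,1) with θ = 1 − α that the text derives next. Function names and the toy series are ours:

```python
def ses_forecasts(y, alpha, a0):
    """One-step forecasts from simple smoothing: the forecast of y_t is a_{t-1}."""
    a, f = a0, []
    for yt in y:
        f.append(a)
        a = alpha * yt + (1 - alpha) * a
    return f

def sse(y, alpha, a0):
    """Sum of squared one-step-ahead forecast errors."""
    return sum((yt - ft) ** 2 for yt, ft in zip(y, ses_forecasts(y, alpha, a0)))

def grid_search_alpha(y, a0, step=0.01):
    """Crude grid minimisation of the one-step SSE over alpha in (0, 1]."""
    grid = [k * step for k in range(1, int(round(1 / step)) + 1)]
    return min(grid, key=lambda alpha: sse(y, alpha, a0))

def arima011_forecasts(y, theta, f0):
    """One-step forecasts of an ARIMA(0,1,1): yhat_t = y_{t-1} - theta*e_{t-1}."""
    f, e_prev, y_prev = [f0], y[0] - f0, y[0]
    for yt in y[1:]:
        fh = y_prev - theta * e_prev
        f.append(fh)
        e_prev, y_prev = yt - fh, yt
    return f

y = [100.0, 103.0, 98.0, 101.0, 99.0, 102.0]
alpha = 0.4
same = all(abs(u - v) < 1e-9
           for u, v in zip(ses_forecasts(y, alpha, 100.0),
                           arima011_forecasts(y, 1 - alpha, 100.0)))
best = grid_search_alpha([1.0, 2.0, 3.0, 4.0, 5.0], a0=1.0)
```

On the steadily trending toy series the grid search pushes α to the top of the grid, illustrating how estimation can leave the 0.1-0.3 guesstimate range; in practice all smoothing parameters would be optimised jointly with a numerical optimiser rather than a grid.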

From (1) we have $\hat{a}_t = \alpha e_t/(1-B)$ and, from the error definition, $y_t = \hat{a}_{t-1} + e_t$. So, by substitution, $y_t = \alpha e_{t-1}/(1-B) + e_t$, or

$(1-B) y_t = e_t - (1-\alpha) e_{t-1}$.

Provided the errors are white noise, $y_t$ is an ARIMA(0,1,1) with parameter $\theta = 1-\alpha$. As $0 < \alpha < 1$, also $0 < \theta < 1$, and simple exponential smoothing is equivalent to fitting a restricted ARIMA(0,1,1). We can also see that simple exponential smoothing is stable (invertible) for $0 < \alpha < 2$ and not only for $0 < \alpha < 1$. Simple exponential smoothing is also optimal for the structural model:

$y_t = \mu_t + \varepsilon_t$
$\mu_t = \mu_{t-1} + \eta_t$

Holt's method

Holt's estimators in the error correction form are:

$\hat{a}_t = \hat{a}_{t-1} + \hat{b}_{t-1} + \alpha e_t$, and
$\hat{b}_t = \hat{b}_{t-1} + \alpha\beta e_t$, where $e_t = y_t - (\hat{a}_{t-1} + \hat{b}_{t-1})$.

So we have $\hat{b}_t = \alpha\beta e_t/(1-B)$ and $\hat{a}_t = \alpha\beta e_{t-1}/(1-B)^2 + \alpha e_t/(1-B)$. Substitution of $\hat{a}_{t-1}$ and $\hat{b}_{t-1}$ in $y_t = \hat{a}_{t-1} + \hat{b}_{t-1} + e_t$ gives

$(1-B)^2 y_t = e_t - (2-\alpha-\alpha\beta) e_{t-1} - (\alpha-1) e_{t-2}$.

Provided the errors are white noise, $y_t$ follows an ARIMA(0,2,2) with parameters $\theta_1 = 2-\alpha-\alpha\beta$ and $\theta_2 = \alpha-1$. Applying Holt's method is equivalent to fitting a restricted ARIMA(0,2,2). Holt's method is also optimal for the structural model:

$y_t = \mu_t + \varepsilon_t$
$\mu_t = \mu_{t-1} + \beta_{t-1} + \eta_t$
$\beta_t = \beta_{t-1} + \xi_t$

Damped exponential smoothing

A similar analysis shows that applying damped exponential smoothing with $0 < \phi < 1$ is equivalent to fitting an ARIMA(1,1,2) with $\theta_1 = 1 + \phi - \alpha - \phi\alpha\beta$ and $\theta_2 = \phi(\alpha - 1)$.

Holt-Winters

Additive Holt-Winters is equivalent to fitting an ARIMA$(0,1,L+1)(0,1,0)_L$ (see Roberts, 1982). The ARIMA equivalent of additive Holt-Winters is similar to the ARIMA equivalent of Harvey's basic structural model (Harvey, 1989). Multiplicative Holt-Winters is nonlinear and has no ARIMA equivalent.

8 - PREDICTION INTERVALS

When we do not define the model clearly, it is hard to construct prediction intervals for more than one step ahead (as we do not know how to obtain forecast variances for longer horizons). Intervals derived by assuming deterministic trends and seasonals seem "unhelpful and potentially misleading" (Chatfield and Yar, 1988). So, except for multiplicative Holt-Winters, intervals derived from the ARIMA models for which exponential smoothing methods are optimal are an option. For multiplicative Holt-Winters see Chatfield and Yar (1991). An empirical approach to prediction intervals, based on the computation of error variances for different lead times during model fitting, was proposed by Gardner (1988). He also proposed the use of Chebyshev's inequality in setting the confidence level. This seems to give rise to rather too wide prediction intervals, which would not be very helpful.

9 - ROBUSTNESS

Although easy to use and understand, and despite their limited range of optimality compared to ARIMA modelling, exponential smoothing methods

compared very well with model-based methods in forecasting competitions (see Makridakis et al., 1982). Recently, Chen (1997) investigated the robustness properties of four time series forecasting methods for seasonal series (which include Holt-Winters, ARIMA, structural components, and regression on polynomial time trends and dummies with stationary ARMA errors). He concluded that "the strongly robust properties of the Holt-Winters method are due to the reasonable structure of the predictor with parsimonious parametrization. This has flexibility to handle a wide class of time series having stochastic/deterministic linear trend and seasonal components. This flexibility and parsimony are even more important in the case where the time series has structural changes".

References:

Brown, R.G., 1959, Statistical Forecasting for Inventory Control, New York, McGraw-Hill.
Chatfield, C., Yar, M., 1988, Holt-Winters forecasting: some practical issues, The Statistician, 37, 129-140.
Chatfield, C., Yar, M., 1991, Prediction intervals for multiplicative Holt-Winters, International Journal of Forecasting, 7, 31-37.
Chen, C., 1997, Robustness properties of some forecasting methods for seasonal time series: a Monte Carlo study, International Journal of Forecasting, 13, 269-282.
Gardner, Jr., E.S., 1985, Exponential smoothing: The state of the art, Journal of Forecasting, 4, 1-28.
Gardner, Jr., E.S., 1988, A simple method of computing prediction intervals for time series forecasts, Management Science, 34, 541-546.
Gardner, Jr., E.S., McKenzie, E., 1985, Forecasting trends in time series, Management Science, 31, 1237-1246.
Gardner, Jr., E.S., McKenzie, E., 1988, Model identification in exponential smoothing, Journal of the Operational Research Society, 39, 863-867.
Harvey, A.C., 1989, Forecasting, Structural Time Series Models and the Kalman Filter, Cambridge University Press.

Holt, C.C. et al., 1960, Planning Production, Inventories and Work Force, Englewood Cliffs, NJ: Prentice-Hall, Chap. 14.
Makridakis, S. et al., 1982, The accuracy of extrapolation (time series) methods: results of a forecasting competition, Journal of Forecasting, 1, 111-153.
Muth, J.F., 1960, Optimal properties of exponentially weighted forecasts, Journal of the American Statistical Association, 55, 299-306.
Nerlove, M., Wage, S., 1964, On the optimality of adaptive forecasting, Management Science, 10, 207-224.
Roberts, S.A., 1982, A general class of Holt-Winters type forecasting models, Management Science, 28, 808-820.
Tashman, L.J., Kruk, J.M., 1996, The use of protocols to select exponential smoothing procedures: a reconsideration of forecasting competitions, International Journal of Forecasting, 12, 235-253.
Theil, H., Wage, S., 1964, Some observations on adaptive forecasting, Management Science, 10, 198-206.
Winters, P.R., 1960, Forecasting sales by exponentially weighted moving averages, Management Science, 6, 324-342.