Forecasting models and methods Giovanni Righini Università degli Studi di Milano Logistics
Forecasting methods
Forecasting methods are used to obtain information to support decision processes based on data. Forecasts may have different time horizons:
short term (e.g., the number of calls to a call center in the next weeks);
medium term (e.g., the sales of a product in the next year);
long term (e.g., the demand for electric cars in the next decades).
References
G. Ghiani, G. Laporte, R. Musmanno, Introduction to Logistics Systems Management, Wiley, 2013.
Classification
Forecasting methods can be classified into qualitative and quantitative methods.
Qualitative methods:
expert opinions;
market polls;
Delphi method.
Quantitative methods:
explanatory methods: a cause-effect relationship is assumed, to be discovered and modeled;
extrapolative methods: regularities are derived from available historical data.
Regression
Regression analysis
The goal is to identify a functional relationship between an effect and its causes. One observes a quantity y (dependent variable) and assumes it is a function of other quantities x (independent variables):
y = f(x).
If there is only one independent variable, the method is called simple regression; otherwise it is called multiple regression. From previous observations some pairs of values (x_i, y_i) are known; the aim is to find the function f() that best represents them.
Regression
Regression analysis
Instead of searching for a complicated function that represents the observations exactly, it is preferable to search for a simple function that represents them approximately. Hence we allow for a discrepancy between the computed values f(x_i) and the observed values y_i. If f() is linear, the method is called linear regression:
y = A + Bx + ε.
The difference ε between computed and observed values is called the residual; it is a random variable that must satisfy two requisites:
normal distribution with null mean;
independence between any two ε_i and ε_j for each i ≠ j.
Regression
The least squares method
The following quantity is taken as a measure of the approximation:
Q = Σ_{i=1}^N (f(x_i) − y_i)^2
and this is the objective function to be minimized. The unknowns, or decision variables, are the parameters of the line, i.e. A and B. To find their optimal values, it is sufficient to compute the partial derivatives of Q with respect to A and B and impose that they are null.
Regression
The least squares method
[Figure: scatter plot of observed data versus time.]
Regression
The least squares method
Indicating the average values with
x̄ = (Σ_{i=1}^N x_i)/N and ȳ = (Σ_{i=1}^N y_i)/N
and defining
S_xx = Σ_{i=1}^N (x_i − x̄)^2
S_xy = Σ_{i=1}^N (x_i − x̄)(y_i − ȳ)
S_yy = Σ_{i=1}^N (y_i − ȳ)^2
we have:
B = S_xy / S_xx and A = ȳ − B x̄.
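The closed-form estimates above can be sketched in Python (standard library only; the function name is mine, the formulas follow the slides):

```python
def linear_regression(x, y):
    """Least-squares fit of y = A + B*x via the closed-form estimates."""
    N = len(x)
    x_bar = sum(x) / N
    y_bar = sum(y) / N
    S_xx = sum((xi - x_bar) ** 2 for xi in x)
    S_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    B = S_xy / S_xx          # slope
    A = y_bar - B * x_bar    # intercept
    return A, B

# Toy example: points lying exactly on the line y = 1 + 2x
A, B = linear_regression([0, 1, 2, 3], [1, 3, 5, 7])
```

On exact linear data the fit recovers A = 1 and B = 2.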
Regression
Line through the origin
If we want to impose that the prediction line y = A + Bx pass through the origin, then we set A = 0 and we only estimate
B = (Σ_{i=1}^N x_i y_i) / (Σ_{i=1}^N x_i^2).
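A minimal sketch of the constrained estimate (function name is mine):

```python
def regression_through_origin(x, y):
    """Slope estimate for y = B*x, with the intercept forced to zero."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)

# Toy example: points lying exactly on y = 2x
B0 = regression_through_origin([1, 2, 3], [2, 4, 6])
```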
Regression
Model evaluation
A posteriori it is important to evaluate the reliability of the model used.
Slope of the line: the model is considered non-significant if a given confidence interval for B contains the value 0.
Linear correlation coefficient (Pearson index): r = S_xy / √(S_xx S_yy). It always holds that −1 ≤ r ≤ 1. If r > 0 the line increases; if r < 0 it decreases. If |r| is close to 1, linear correlation is strong; if r is close to 0, it is weak.
Variance estimator: s^2 = Σ_{i=1}^N (f(x_i) − y_i)^2 / (N − 2) = (S_yy − B S_xy) / (N − 2).
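These two diagnostics can be sketched as follows (function name is mine; A and B are the fitted parameters):

```python
from math import sqrt

def evaluate_fit(x, y, A, B):
    """Pearson correlation r and residual-variance estimator s^2
    for a fitted line y = A + B*x."""
    N = len(x)
    x_bar = sum(x) / N
    y_bar = sum(y) / N
    S_xx = sum((xi - x_bar) ** 2 for xi in x)
    S_yy = sum((yi - y_bar) ** 2 for yi in y)
    S_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    r = S_xy / sqrt(S_xx * S_yy)
    s2 = (S_yy - B * S_xy) / (N - 2)
    return r, s2

# Perfectly linear data: r = 1 and zero residual variance
r, s2 = evaluate_fit([0, 1, 2, 3], [1, 3, 5, 7], A=1, B=2)
```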
Time series
A time series is a sequence of values y_t taken by a quantity of interest at given points in time t. If these points in time form a discrete set, the time series is called a discrete time series. We consider discrete time series with points in time t uniformly spaced (years, weeks, days, ...).
Classification Extrapolative methods can be used to forecast a single period or multiple periods in the future. Time series decomposition Exponential smoothing: Brown model Holt model Winters model
Time series decomposition
Models of time series
We assume that the observed values y_t are the result of a combination of several components of different nature:
long-period trend m_t;
economic cycles v_t;
seasonal component s_t (with a given period L);
random residual r_t.
We consider two ways in which these components can interact: additive and multiplicative models.
Additive models: y_t = m_t + v_t + s_t + r_t
Multiplicative models: y_t = m_t v_t s_t r_t
In the next slides we will consider a multiplicative model, but the same concepts apply to additive ones, just replacing products with sums.
Time series decomposition
Moving average
If we know the period L of the seasonal component, we can remove that component by computing the average over periods of length L.
L odd:
(mv)_t = (1/L) Σ_{i=t−(L−1)/2}^{t+(L−1)/2} y_i
L even:
(mv)_t = (1/L) [ (1/2) y_{t−L/2} + Σ_{i=t−L/2+1}^{t+L/2−1} y_i + (1/2) y_{t+L/2} ]
To separate the trend component m from the economic cycle component v, we assume the former is linear and compute it via simple linear regression, with time as the independent variable.
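The centered moving average can be sketched as follows (function name is mine; indices are 0-based, and for even L the two extreme terms get weight 1/2, as in the formulas above):

```python
def moving_average(y, L):
    """Centered moving average of period L; returns {t: (mv)_t}.
    Boundary periods, where the window does not fit, are omitted."""
    mv = {}
    n = len(y)
    if L % 2 == 1:
        h = (L - 1) // 2
        for t in range(h, n - h):
            mv[t] = sum(y[t - h:t + h + 1]) / L
    else:
        h = L // 2
        for t in range(h, n - h):
            mv[t] = (0.5 * y[t - h] + sum(y[t - h + 1:t + h]) + 0.5 * y[t + h]) / L
    return mv

# On a pure linear series the moving average reproduces the trend exactly
mv = moving_average(list(range(10)), 4)
```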
Time series decomposition
Seasonal indices
The seasonal component and the random component are obtained from
(sr)_t = y_t / (mv)_t.
Raw seasonal indices s̄_1, ..., s̄_L are obtained as
s̄_t = (Σ_k (sr)_{t+kL}) / N_t
where the extreme values of the range of the sum are suitably chosen to cover all the (sr) values previously computed and N_t indicates the number of terms in the sum. The indices obtained in this way are then normalized:
s_t = L s̄_t / (Σ_{τ=1}^L s̄_τ),  t = 1, ..., L.
So we have s_{t+kL} = s_t for each t and for each integer k.
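A minimal sketch of the index computation (function name is mine; `sr` is the list of (sr)_t values, with None for the boundary periods where the moving average is undefined):

```python
def seasonal_indices(sr, L):
    """Average (sr)_t over all observations sharing the same position in the
    period, then normalize so that the L indices sum to L (multiplicative model)."""
    raw = []
    for t in range(L):
        vals = [v for k, v in enumerate(sr) if k % L == t and v is not None]
        raw.append(sum(vals) / len(vals))  # N_t = len(vals)
    scale = L / sum(raw)
    return [s * scale for s in raw]

# Toy example with period 2: positions alternate between 1.2 and 0.8
idx = seasonal_indices([1.2, 0.8, 1.2, 0.8], 2)
```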
Time series decomposition
Example: trend component
[Figure: estimated trend component plotted over the periods.]
Time series decomposition
Example: seasonal component
[Figure: estimated seasonal indices (between about 0.6 and 1.2) plotted over the periods.]
Time series decomposition
Forecast
The forecast is obtained by combining the trend component m and the seasonal component s.
[Figure: historical time series and forecast plotted over the periods.]
Exponential smoothing methods
Introduction
These are simple, versatile and accurate methods for forecasts based on time series. There are various models, which may or may not take into account trend and seasonal components in the time series. The basic idea is to give more weight to the most recent observations than to the remote ones. This makes smoothing methods able to adapt to unknown and sudden variations in the values of the time series, owing to events that change the regularity of the observed phenomenon (technical failures, special discounts, bankruptcy of competitors, financial crises, ...).
Exponential smoothing methods
Brown model
This is the simple exponential smoothing method.
Smoothed average:
s_t = α y_t + (1 − α) s_{t−1},  t ≥ 2
s_1 = y_1
Forecast: f_{t+1} = s_t, with 0 ≤ α ≤ 1.
For α close to 0 the model is more inertial; for α close to 1 it is more reactive. The optimal value for α is obtained by minimizing the mean square error.
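The recursion is a few lines of Python (function name is mine; indices are 0-based):

```python
def brown_smoothing(y, alpha):
    """Simple exponential smoothing: s_1 = y_1, s_t = alpha*y_t + (1-alpha)*s_{t-1}.
    Returns the one-step-ahead forecast f_{T+1} = s_T."""
    s = y[0]
    for yt in y[1:]:
        s = alpha * yt + (1 - alpha) * s
    return s

# Hand-checkable toy case
f = brown_smoothing([10, 12, 11, 13], alpha=0.5)
```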
Exponential smoothing methods
Holt model
This is the exponential smoothing method with trend correction.
Smoothed average:
s_t = α y_t + (1 − α)(s_{t−1} + m_{t−1}),  t ≥ 2
m_t = β(s_t − s_{t−1}) + (1 − β) m_{t−1},  t ≥ 2
s_1 = y_1
m_1 = y_2 − y_1
Forecast: f_{t+1} = s_t + m_t, with 0 ≤ β ≤ 1, optimized by minimizing the mean square error.
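A sketch of the level-and-trend recursion (function name is mine; on exactly linear data the forecast is exact for any α, β, which makes a convenient sanity check):

```python
def holt_smoothing(y, alpha, beta):
    """Holt's method: smoothed level s_t and trend m_t.
    Initialization: s_1 = y_1, m_1 = y_2 - y_1.
    Returns the one-step-ahead forecast f_{T+1} = s_T + m_T."""
    s, m = y[0], y[1] - y[0]
    for yt in y[1:]:
        s_prev = s
        s = alpha * yt + (1 - alpha) * (s + m)
        m = beta * (s - s_prev) + (1 - beta) * m
    return s + m

# Exactly linear series y_t = 2t - 1: the forecast of the next value is 9
f_next = holt_smoothing([1, 3, 5, 7], alpha=0.3, beta=0.4)
```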
Exponential smoothing methods
Winters model
This is the exponential smoothing method with trend and seasonality corrections.
Smoothed average:
s_t = α y_t / q_{t−L} + (1 − α)(s_{t−1} + m_{t−1}),  t ≥ 2
m_t = β(s_t − s_{t−1}) + (1 − β) m_{t−1},  t ≥ 2
q_t = γ y_t / s_t + (1 − γ) q_{t−L},  t ≥ L + 1
s_1 = y_1
m_1 = y_2 − y_1
q_t = y_t / (Σ_{τ=1}^L y_τ / L),  t = 1, ..., L
with 0 ≤ γ ≤ 1, optimized by minimizing the mean square error. The forecast is f_{t+1} = (s_t + m_t) q_{t−L+1}.
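A sketch of the full recursion (function name is mine; the slides leave the seasonal index q_{t−L} unspecified for 2 ≤ t ≤ L, so this implementation reuses the initial index for that position in the period, which is an assumption on my part):

```python
def winters_smoothing(y, L, alpha, beta, gamma):
    """Winters (Holt-Winters) multiplicative smoothing.
    Initialization per the slides: q_t = y_t / mean(first period),
    s_1 = y_1, m_1 = y_2 - y_1. Returns f_{T+1} = (s_T + m_T) * q_{T-L+1}."""
    T = len(y)
    base = sum(y[:L]) / L
    q = [yt / base for yt in y[:L]]          # initial indices q_1..q_L
    s, m = y[0], y[1] - y[0]
    for t in range(1, T):                    # 0-based index t = period t+1
        q_old = q[t - L] if t >= L else q[t] # assumed fallback in first period
        s_prev = s
        s = alpha * y[t] / q_old + (1 - alpha) * (s + m)
        m = beta * (s - s_prev) + (1 - beta) * m
        if t >= L:
            q.append(gamma * y[t] / s + (1 - gamma) * q[t - L])
    return (s + m) * q[T - L]                # q_{T-L+1} in 1-based notation

# Hand-checkable toy case with alpha = beta = 1, gamma = 0
f_next = winters_smoothing([2, 4, 6, 8], L=2, alpha=1.0, beta=1.0, gamma=0.0)
```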
Exponential smoothing methods
Removing trend and seasonality
To remove the trend component or the seasonality component from a time series one can:
compute the moving average, to remove the seasonality;
compute iterated differences B_t(h) = y_t − y_{t−h}, to remove the trend;
identify the trend with regression analysis;
identify the seasonality by decomposing the series.
Therefore it is possible:
to apply the Winters model to the original time series;
to apply the Holt model after removing seasonality;
to apply the Brown model after removing trend and seasonality.
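The differencing operation mentioned above is one line (function name is mine):

```python
def difference(y, h=1):
    """B_t(h) = y_t - y_{t-h}: with h = 1 a linear trend becomes a constant;
    with h equal to the period L an exactly periodic component vanishes."""
    return [y[t] - y[t - h] for t in range(h, len(y))]

detrended = difference([1, 3, 5, 7])          # linear trend -> constant series
deseasonalized = difference([5, 1, 5, 1], 2)  # period-2 pattern -> zero series
```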