Exponential Smoothing
INSR 260, Spring 2009
Bob Stine
Overview
- Smoothing
- Exponential smoothing
- Model behind exponential smoothing
  - Forecasts and estimates
  - Hidden state model
- Diagnostic: residual plots
- Examples (from Bowerman, Ch 8, 9)
  - Cod catch
  - Paper sales
Smoothing
Heuristic
- Data = Pattern + Noise
- Pattern is slowly changing, predictable
- Noise may have short-term dependence, but by-and-large is irregular and unpredictable
Idea
- Isolate the pattern from the noise by averaging data that are nearby in time. Noise should mostly cancel out, revealing the pattern.
- Example: moving average
  s_t = (y_{t-w} + ... + y_{t-1} + y_t + y_{t+1} + ... + y_{t+w}) / (2w + 1)
- Example: JMP's spline smoothing uses different weights
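The centered moving average above can be sketched in a few lines of Python. This is a minimal illustration, not the text's implementation; the function name and the choice to pad the uncomputable ends with None are my own.

```python
def moving_average(y, w):
    """Centered moving average with window half-width w.

    s_t = (y_{t-w} + ... + y_{t+w}) / (2w + 1).
    Positions within w of either end are left as None, since the
    smooth cannot be computed there (a problem noted on the next slide).
    """
    n = len(y)
    out = [None] * n
    for t in range(w, n - w):
        out[t] = sum(y[t - w:t + w + 1]) / (2 * w + 1)
    return out

print(moving_average([2.0, 4.0, 6.0, 8.0, 10.0], 1))
# → [None, 4.0, 6.0, 8.0, None]
```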
Simple Exponential Smoothing
Moving averages have a problem
- Not useful for prediction: the smooth s_t depends on observations in the future
- Cannot be computed near the ends of the data series
Exponential smoothing is one-sided
- Average of current and prior values
- Recent values are more heavily weighted than older ones
- Tuning parameter α = 1 - w controls the weights (0 ≤ w < 1)
Two expressions for the smoothed value
- Weighted average:
  l_t = (y_t + w y_{t-1} + w^2 y_{t-2} + ...) / (1 + w + w^2 + ...)
- Predictor/corrector:
  l_t = (1 - w) y_t + w l_{t-1} = α y_t + (1 - α) l_{t-1} = l_{t-1} + α (y_t - l_{t-1})
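The predictor/corrector form gives a one-line update that can be sketched directly. Initializing the level at the first observation is an assumption of this sketch (the text discusses other start-up choices, such as averaging early observations):

```python
def exp_smooth(y, alpha, l0=None):
    """Simple exponential smoothing via the predictor/corrector form:
    l_t = l_{t-1} + alpha * (y_t - l_{t-1}).

    l0 defaults to the first observation (one common convention).
    """
    level = y[0] if l0 is None else l0
    smoothed = []
    for obs in y:
        level = level + alpha * (obs - level)  # correct the level toward y_t
        smoothed.append(level)
    return smoothed

print(exp_smooth([10.0, 12.0, 8.0], 0.5))
# → [10.0, 11.0, 9.5]
```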
Smoothing Constant
α controls the amount of smoothing, via l_t = l_{t-1} + α (y_t - l_{t-1})
- α ≈ 0: very smooth
- α ≈ 1: little smoothing
Example (Bowerman, Table 6.1): monthly cod catch, in tons
[Overlay plot: Cod (tons), roughly 250-400, by row, with the smooth overlaid]
Your impression of the smooth?
Example: Splines
Interpolating polynomial
- It is always possible to find a polynomial for which p(x) = y, provided there is one y for each x
JMP interactive tool, applied to the cod catch
[Two smoothing-spline fits of Cod (tons) by Time:
 lambda = 0.095749: R-Square 0.859312, Sum of Squares Error 3702.189 (wiggly fit)
 lambda = 29.62771: R-Square 0.133897, Sum of Squares Error 22791.47 (nearly flat fit)]
Example: Exponential Smooth
JMP formula, similar to Excel: l_t = l_{t-1} + α (y_t - l_{t-1})
[Overlay plot of Cod (tons) by row, with exponential smooths for α = 0.05, 0.2, 0.5, 0.8]
Which is best? What value should we use for α?
Model
Need a statistical model to
- Express the source of randomness and uncertainty
- Choose an optimal estimate for α
- Define the predictor and quantify its probable accuracy; we want prediction intervals for exponential smoothing
Latent variable model ("state-space model")
- Each observation has mean L_{t-1}:   y_t = L_{t-1} + ε_t
- Mean values fluctuate over time:     L_t = L_{t-1} + α ε_t
- Errors are normally distributed:     ε_t ~ N(0, σ²)
Discussion
- L_t is the state and is not observed
- If α = 0, L_t is constant
- If α = 1, L_t is just as variable as the data
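The two model equations are easy to simulate, which makes the role of α concrete: α = 0 freezes the level, α = 1 lets it wander as much as the data. A minimal sketch (the function name and defaults are my own, not from the text):

```python
import random

def simulate_smooth_model(n, alpha, sigma=1.0, level0=100.0, seed=1):
    """Simulate the latent-state model behind exponential smoothing:
    y_t = L_{t-1} + eps_t,   L_t = L_{t-1} + alpha * eps_t,
    with eps_t ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    level = level0          # the unobserved state L_{t-1}
    ys = []
    for _ in range(n):
        eps = rng.gauss(0.0, sigma)
        ys.append(level + eps)          # observation y_t
        level = level + alpha * eps     # state update to L_t
    return ys
```

With sigma = 0 the series stays flat at level0; raising alpha makes the simulated level drift more.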
Predictions
The model implies a predictor and a method for finding prediction intervals
- Observations have mean L_{t-1}:   y_t = L_{t-1} + ε_t
- Means fluctuate over time:        L_t = L_{t-1} + α ε_t
- Errors are normally distributed:  ε_t ~ N(0, σ²)
The predictor is constant
- E y_{n+1} = L_n                                                         ŷ_{n+1} = L_n
- E y_{n+2} = L_{n+1} = L_n + α ε_{n+1}                                   ŷ_{n+2} = L_n
- E y_{n+3} = L_{n+2} = L_{n+1} + α ε_{n+2} = L_n + α(ε_{n+2} + ε_{n+1})  ŷ_{n+3} = L_n
- In general, set ŷ_{n+f} = L_n
The variance of the prediction errors grows
- E(y_{n+1} - ŷ_{n+1})² = E(ε_{n+1})² = σ²
- E(y_{n+f} - ŷ_{n+f})² = E(ε_{n+f} + α(ε_{n+f-1} + ... + ε_{n+1}))² = σ² (1 + (f-1) α²)
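The growing error variance translates directly into prediction intervals that widen with the horizon f. A sketch of a normal-theory interval based on the formula above (the function name, the 1.96 default, and the example inputs are my own illustration):

```python
import math

def forecast_interval(l_n, sigma, f, alpha, z=1.96):
    """f-step-ahead forecast and approximate 95% prediction interval.

    The point forecast is the last level l_n for every horizon; the
    forecast-error variance is sigma^2 * (1 + (f - 1) * alpha^2).
    Returns (lower, point, upper).
    """
    se = sigma * math.sqrt(1 + (f - 1) * alpha ** 2)
    return l_n - z * se, l_n, l_n + z * se

# Hypothetical values: last level 350 tons, sigma = 30, alpha = 0.2.
lo1, pt1, hi1 = forecast_interval(350.0, 30.0, f=1, alpha=0.2)
lo5, pt5, hi5 = forecast_interval(350.0, 30.0, f=5, alpha=0.2)
# The point forecast is the same, but the 5-step interval is wider.
```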
Model: Estimating α
- Observations have mean L_{t-1}:   y_t = L_{t-1} + ε_t,  ε_t ~ N(0, σ²)
- Mean values fluctuate over time:  L_t = L_{t-1} + α ε_t
Correspondence
- l_t is our estimate of L_t
- â is our estimate of α (the text writes α̂; see page 392)
Estimation
- Like doing least squares, but you don't get to see how well your model captures the underlying state, since it is not observed!
- Choose â based on forecasting: if L_{t-1} were observed, we'd use it to predict y_t, since it is the mean of y_t
- Pick â to minimize the sum of squared one-step errors, Σ(y_t - l_{t-1})²
- Estimation is not linear in the data
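Because the criterion Σ(y_t - l_{t-1})² is not linear in α, one simple approach is a grid search, in the spirit of the text's Excel-solver illustration (this grid-search sketch is my own, not the text's or JMP's method; level initialization at y_0 is an assumption):

```python
def one_step_sse(y, alpha):
    """Sum of squared one-step forecast errors, predicting y_t by l_{t-1}."""
    level = y[0]                 # initialize the level at the first observation
    sse = 0.0
    for obs in y[1:]:
        err = obs - level        # one-step error y_t - l_{t-1}
        sse += err * err
        level = level + alpha * err
    return sse

def estimate_alpha(y, grid_size=100):
    """Grid search over [0, 1] for the alpha minimizing one-step SSE."""
    grid = [i / grid_size for i in range(grid_size + 1)]
    return min(grid, key=lambda a: one_step_sse(y, a))

# On a steadily trending series, the smooth must chase the data,
# so the grid search picks the largest alpha.
trend = [float(i) for i in range(20)]
print(estimate_alpha(trend))  # → 1.0
```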
JMP Results
Techniques for estimating â
- The text illustrates using the Excel solver
- We'll use JMP's time series methods: Analyze > Modeling > Time Series, then Simple Exponential Smoothing
Lots of output; the results confirm little smoothing, a near-constant pattern
[Parameter estimates: Level Smoothing Weight 0.00001974 (â ≈ 0), Std Error 0]
[Model summary: DF 22; Sum of Squared Errors 26315.4779; Variance Estimate 1196.15809; Standard Deviation 34.5855184; AIC 232.424394; SBC 233.559888; RSquare -0.174024; RSquare Adj -0.174024; MAPE 9.14722694; MAE 31.0025784; -2LogLikelihood 230.424394; Stable: Yes; Invertible: Yes. JMP warns "Hessian is not positive definite."]
The book cheats a little by setting y_0 to the mean of the first 12 observations rather than smoothing after y_1; it finds â ≈ 0.03.
[Plot of predicted values by row, nearly constant, with forecast bands widening beyond the data]
Diagnostics
Sequence plot of residuals
- One-step-ahead prediction errors, y_t - ŷ_t
[Sequence plot of residuals (roughly -100 to 60) by row, and a normal quantile plot of the residuals]
- No visual pattern remains
- In a week or so we will see more elaborate diagnostic routines, associated with ARIMA models
- The text discusses tracking methods
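A numeric companion to the residual plots is the lag-1 autocorrelation of the one-step errors: values near zero suggest no pattern remains, which is what the plots show here. This small check is my own sketch, not a diagnostic from the text (the ARIMA machinery coming later is more elaborate):

```python
def lag1_autocorr(resids):
    """Lag-1 sample autocorrelation of a list of residuals.

    Near 0: residuals look patternless; far from 0: structure remains.
    """
    n = len(resids)
    mean = sum(resids) / n
    num = sum((resids[t] - mean) * (resids[t - 1] - mean) for t in range(1, n))
    den = sum((r - mean) ** 2 for r in resids)
    return num / den

# A perfectly alternating series has strong negative lag-1 autocorrelation.
print(lag1_autocorr([1.0, -1.0] * 5))  # → -0.9
```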
Example
Table 9.1
- Data are weekly sales of paper towels
- Goal is to forecast future sales
- Units of data are 10,000 rolls
[Sequence plot: Rolls (×10,000), roughly 5-20, by week]
The level appears to change over time, trending down then up.
Exponential Smooth
Apply simple exponential smoothing
Model results are not very satisfying
- Value for smoothing parameter: â = 1 (the maximum allowed)
- Forecasts are constant
- Motivates alternative smoothing methods
[Parameter estimates: Level Smoothing Weight 1.0000000, Std Error 0.1110102, t Ratio 9.01, Prob>|t| <.0001]
[Model summary: DF 118; Sum of Squared Errors 143.840613; Variance Estimate 1.21898825; Standard Deviation 1.10407801; AIC 362.267669; SBC 365.046793; RSquare 0.93712456; RSquare Adj 0.93712456; MAPE 18.7016851; MAE 0.86251008; -2LogLikelihood 360.267669; Stable: Yes; Invertible: Yes]
[Plot of predicted values by row, with a flat forecast beyond the data]
Summary
- Smoothing: locate patterns
- Exponential smoothing: uses the past
- Model for exponential smoothing: latent state
- Diagnostic: residual plots: patternless
Discussion
- Desire predictions that are more dynamic
  - Extrapolate trends
  - Linear patterns
  - Seasonal patterns