A new insight into prediction modeling system


Integrated Design and Process Technology, IDPT-2003
Printed in the United States of America, December, Society for Design and Process Science

A new insight into prediction modeling system

Sang C. Suh, Sam I. Saffer, Dan Li, Jingmiao Gao
Department of Computer Science, Texas A&M University-Commerce, Commerce, Texas, USA

ABSTRACT

Data mining and forecasting have attracted a great deal of research interest in time series analysis. This paper brings a new insight into how to select among different time series forecasting models according to different situations and data patterns. The contribution of this research is to facilitate the design of prediction modeling systems for a wide range of forecasting purposes.

Keywords: forecasting, prediction, time series, stochastic, data mining and knowledge discovery, intelligent database, smoothing, ARIMA, Box-Jenkins, regression, neural network, expert system

NOMENCLATURE

F_t = forecast for period t
I_t = smoothed seasonal index for period t
S_t = smoothed value at end of period t
S'_t = single smoothed value
S''_t = double smoothed value
SSE = sum of squared errors (explained)
SST = total sum of squares
SSU = unexplained variation
TC_t = the combined trend and cyclical component
Y_t = actual value in period t
Ŷ_t = fitted or forecasted value at time t
c = seasonal index smoothing constant
m = forecast horizon
t = period t
α = (level) smoothing constant
β = trend smoothing constant (beta)
σ = standard deviation

INTRODUCTION

Forecasting and prediction have attracted increasing attention from researchers, industries, and governments in recent years. Many studies and models have achieved great success in particular fields. Examples include the X-11 model [19] for forecasting unemployment rates and GNP, weather forecasts, neural networks for inventory forecasting [10], EMA for forecasting commodity sales and for web access prediction [2][4], decision trees from data mining for forecasting credit card risk, and Fourier analysis for pattern forecasting in a mobile stock system [7]. But model selection has been a painful issue in forecasting. When we select a model for a forecasting task, we need to consider the model from every angle, hypothesize how different variables affect each other, and so on. The rule of thumb is to follow the carpenter's rule: measure twice, cut once. Do not be in a rush to apply a model and run the program. There is no single best forecasting model; different forecasting models may generate quite different results when applied in different fields, situations, and environments. The following remark by G. Box gives us an insight into the problem of formalizing any type of prediction model: "All models are wrong, but some are useful." - George Box (1994)

The purpose of this paper is to describe different forecasting models and compare them under different conditions so as to bring a new insight into prediction modeling systems. Our goal is to facilitate forecasting system analysis and design, to help analysts and designers understand different prediction models, and to show which forecasting model is best suited and under what circumstances it should be used. The structure of the paper is as follows. In section 2, we present basic concepts of time series analysis and list some frequently used formulas. In section 3, we describe different time series forecasting models and present the advantages, disadvantages, and constraints of each model. In section 4, we compare the different models from several angles.
In section 5, we devise our new prediction modeling system and explain its procedure in detail. In section 6, we conclude the paper and discuss future work.

BACKGROUND NOTIONS

In time series analysis, there is a fundamental distinction between the terms process and realization. The actual values in an observed time series are the realization of some underlying process that generates those values.

We refer to the underlying process as the stochastic (a process whose outcomes are governed by probability) or probabilistic generating process (probabilistic and stochastic are synonyms). In time series analysis, data sequences are often decomposed into patterns, including random, trend, seasonal, and cyclical patterns, so that forecast = trend + season + cycle + random. Trends and seasons are deterministic, while cycles and the random component are non-deterministic.

Most models are based on sample statistics because a true population statistic is usually unobservable, and checking the entire population is expensive and statistically unnecessary. We therefore have to clearly understand the difference between samples and populations, and the difference between the number of samples and the number of samplings. Generally, it is recommended that at least 60 observations (i.e., five seasons) be stored for monthly series, regardless of the forecasting method.

When you select a forecasting model, determining whether it is appropriate for your specific problem or data series, and whether it is the best candidate, becomes the main issue in building the whole prediction modeling system. Many statistical criteria can be used for measuring the errors of different models under different circumstances; the commonly used criteria are shown in the Formulas in the appendix.

DIFFERENT METHODS

Univariate time series methods

SMA (simple moving averages)

SMA is useful in modeling a random series (i.e., one without trend or seasonality) because it averages or smoothes the most recent actual values to remove the unwanted randomness. Moving average models work well with patternless demands. A patternless series is one that does not have a trend or seasonality in it; such a series can have smooth or erratic variations (i.e., with or without high autocorrelation). SMA assumes that a future value will equal an average of past values:

SMA(m) = (X_t + X_{t-1} + ... + X_{t-m+1}) / m    (1)

The most distinct disadvantage of all moving average or smoothing methods is that they do not model seasonality or trend; a general forecasting method should be able to model seasonal and trending series. Historically, another disadvantage of moving averages has been that all data needed to calculate the average must be stored and processed. For example, if a 52-week moving average is used, a database of 52 observations must be maintained. Although this created a serious data storage problem several decades ago, when computer memory and calculations were expensive, it is no longer a matter of great concern. When using a moving average, it is difficult to determine the optimal number of periods to include in the average; this is not a problem for exponential smoothing, which is as accurate as moving averages while being more computationally efficient. In general, the optimal number of periods in an SMA is the one that minimizes the residual standard error (RSE, see equation 22 in the appendix). Longer moving averages smooth the randomness of the series more. A long-period moving average is the more accurate forecasting model, yielding the lowest RSE, when a series is very random and erratic (an erratic series is one not possessing high levels of autocorrelation). In contrast, if the randomness is very smooth, with highly autocorrelated walks away from an overall mean, the series should be modeled using a shorter-period average, which will yield a lower RSE.
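As a concrete illustration of equation (1), the following minimal Python sketch computes an m-period moving-average forecast; the function name and sample data are our own illustrations, not from the paper.

```python
# Minimal sketch of a simple moving average (SMA) forecast, equation (1).
# The function name and example data are illustrative, not from the paper.

def sma_forecast(series, m):
    """Forecast the next value as the average of the last m observations."""
    if len(series) < m:
        raise ValueError("need at least m observations")
    return sum(series[-m:]) / m

demand = [52, 48, 50, 53, 49, 51, 50, 52]   # a patternless (no trend, no season) series
print(sma_forecast(demand, m=4))             # average of the last 4 values
```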
A random walk is a typical example of such a smooth, highly autocorrelated series (i.e., a very common pattern of prices for items sold in large competitive markets, such as stock, bond, and commodity markets). It may also describe the demand for very high-volume products that are not trending or seasonal. The pattern is referred to as a random walk because it randomly and smoothly walks up and down without any repeating pattern. Random walks are characterized by high degrees of autocorrelation.

Exponential smoothing

As mentioned above, when computer storage capacity was expensive, exponentially weighted moving averages (i.e., exponential smoothing) were advantageous. They are still very popular today, even though storage cost is no longer of much concern. Exponential smoothing refers to a set of forecasting methods, several of which are widely used.

1. Single exponential smoothing (SES)

It is easy to apply because forecasts require only three pieces of data: the most recent forecast, the most recent actual, and a smoothing constant. The smoothing constant (α) determines the weight given to the most recent past observations and therefore controls the rate of smoothing or averaging.

F_{t+1} = F_t + α (Y_t - F_t)    (2)

The best alpha should be chosen on the basis of the minimal sum of squared errors. Note that (Y_t - F_t) is the forecast error.

GIVEN: F_{t+1} = α Y_t + (1 - α) F_t
THEN: F_{t+1} = α Y_t + α (1 - α) Y_{t-1} + α (1 - α)^2 Y_{t-2} + α (1 - α)^3 Y_{t-3} + ... + α (1 - α)^{t-1} Y_1 + (1 - α)^t F_1    (3)

SES works only for patterns with no trend and no seasonality.
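A minimal sketch of single exponential smoothing (equation 2) follows; the smoothing constant, the seed forecast, and the sample series are illustrative assumptions.

```python
# Single exponential smoothing (SES), equation (2): F[t+1] = F[t] + alpha * (Y[t] - F[t]).
# alpha, the seed forecast, and the data are illustrative choices, not prescribed by the paper.

def ses_forecast(y, alpha=0.3):
    """Return the one-step-ahead forecast after smoothing the whole series."""
    f = y[0]                       # seed the first forecast with the first actual
    for actual in y:
        f = f + alpha * (actual - f)
    return f

y = [20, 22, 21, 23, 24, 23, 25]
print(ses_forecast(y, alpha=0.3))
```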

2. Double moving averages / linear moving averages

This method calculates a second moving average from the original moving average so as to eliminate the systematic error. Advantages: When trend and randomness are the only significant demand patterns, this is a useful method; because it smoothes large random variations, it is less influenced by outliers than the method of first differences. Disadvantages: In general, it is too simplistic to be used by itself; it does not model the seasonality of the series; and we again face the problem of determining the optimal number of periods to use in the double moving averages.

3. Brown's Double Exponential Smoothing [1][17]

It uses a single coefficient (α) for both smoothing operations. The method computes the difference between the single and double smoothed values as a measure of trend, and then adds this value to the single smoothed value together with an adjustment for the current trend. Brown's double exponential smoothing formula:

S'_t = α Y_t + (1 - α) S'_{t-1}
S''_t = α S'_t + (1 - α) S''_{t-1}
a_t = S'_t + (S'_t - S''_t) = 2 S'_t - S''_t
b_t = [α / (1 - α)] (S'_t - S''_t)
F_{t+m} = a_t + b_t m    (4)

Advantages: It models the trend and level of a time series; it is computationally more efficient than double moving averages; it requires less data than double moving averages; and because only one parameter is used, parameter optimization is simple. Disadvantages: There is some loss of flexibility because the best smoothing constants for the level and trend may not be equal; it does not model the seasonality of a series, so it is not recommended unless the data is first de-seasonalized.

4. Holt's Two-Parameter Trend Model

It uses two smoothing constants (α and β), one for the trend and the other for the level of the series. Holt's two-parameter trend model formula:

S_t = α Y_t + (1 - α)(S_{t-1} + b_{t-1})
b_t = β (S_t - S_{t-1}) + (1 - β) b_{t-1}
F_{t+m} = S_t + b_t m    (5)

Advantages: It eliminates the natural lag of single smoothing, as does Brown's DES; it is more flexible in that the level and trend can be smoothed with different weights. Disadvantages: It requires that two parameters be optimized, so the search for the best combination of parameters is more complex than for a single parameter; it does not model the seasonality of a series.

5. Winters' Three-Parameter Exponential Smoothing

It uses three smoothing constants: one for the trend, one for the level of the series, and one for the seasonality adjustment. Winters' three-parameter exponential smoothing formula:

Y_{t+1} = (S_t + b_t) I_{t-L+1} + e_{t+1}    (6)

where
S_t = smoothed non-seasonal level of the series at end of period t
b_t = smoothed trend in period t
I_{t-L+1} = smoothed seasonal index for period t + 1

S_t = α (Y_t / I_{t-L}) + (1 - α)(S_{t-1} + b_{t-1})
b_t = β (S_t - S_{t-1}) + (1 - β) b_{t-1}
I_t = c (Y_t / S_t) + (1 - c) I_{t-L}    (7)

where
I_{t-L} = smoothed seasonal index L periods ago
L = length of the seasonal cycle (e.g., 12 months or four quarters)

Advantages: It models trend, seasonality, and randomness using an efficient exponential smoothing process; parameters can be updated using computationally efficient algorithms; and its forecasting equations are easily interpreted and understood by management, so it is popular in some commercial forecasting systems. Disadvantages: It may be too complex for data that do not have identifiable trends and seasonality; the simultaneous determination of the optimal values of the three smoothing parameters in Winters' model may take more computational time than regression, Fourier series analysis, etc.
Outliers can have dramatic effects on forecasts and should be eliminated. Data requirements: Because it models seasonality, its data requirements are greater than those of the previous methods above. To adequately measure seasonality, at least three seasons of monthly data (36 months), four or five seasons of quarterly data (16 to 20 quarters), and three seasons of weekly data (156 weeks) are suggested minimums.
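Before moving on to decomposition methods, here is a minimal sketch of Holt's two-parameter trend model (equation 5); the smoothing constants, the initialization of level and trend, and the sample data are illustrative assumptions. Winters' method extends the same recursion with a seasonal index.

```python
# Holt's two-parameter trend model, equation (5).
# alpha, beta, the initialization, and the data are illustrative assumptions.

def holt_forecast(y, alpha=0.5, beta=0.3, m=1):
    """Return the m-step-ahead forecast S_t + b_t * m."""
    level = y[0]
    trend = y[1] - y[0]            # simple initial trend estimate
    for actual in y[1:]:
        prev_level = level
        level = alpha * actual + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level + trend * m

y = [10, 12, 13, 15, 16, 18, 19, 21]
print(holt_forecast(y, m=3))       # forecast three periods ahead
```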

Decomposition - X-11

One of the most highly developed decomposition methods is the Census Method II, X-11 procedure. This is the primary method that the federal government uses to de-seasonalize the many macroeconomic time series (e.g., GDP, unemployment, etc.) reported in the media. The predecessor of this method was developed by the National Bureau of Economic Research during the 1920s to forecast economic time series. It is the computerization of the classical decomposition method; because of its complexity, the Census Method II is infeasible without a computer. The method decomposes a time series into three components, where TC is the combined effect of the trend and cycle components. The components are estimated as follows:

- Estimate the TC component via a centered moving average of Y
- Estimate SI as the ratio Y/TC
- Adjust SI for extreme points (find extreme points and smooth them by a moving average)
- Estimate S by averaging SI
- Estimate TCI by the ratio Y/S
- Estimate I by the ratio of TCI to TC

This yields a forecasting model using trend-cyclical and seasonal factors that can be the basis for a forecast; that is, Ŷ_t = TC_t S_t. The above six steps are repeated more than once for refinement during the following four major phases. First phase: the data are adjusted for trading days in a month or quarter. Second phase: preliminary estimates of the seasonal indexes are made. Third phase: the preliminary seasonal indexes are refined. Final phase: several summary statistics are generated as diagnostic tools.

In implementations of this method, the standard output involves 17 to 27 tables, the long output 27 to 39 tables, and the full output 44 to 59 tables. Plots of the seasonal indexes, the trend-cycle curves, and other graphs are extremely useful in tracking trends and estimating turning points. In addition, this method is a very good way to identify and adjust outliers in a time series. Because of its heavy computer requirements, the method is not used routinely in automated forecasting systems; however, it is a cost-effective method for important time series. It is also limited to time series with sufficient history (at least three years of monthly data). Its principal business usage is in monthly and quarterly forecasting of seasonal time series for 1 to 10 years ahead and in analyzing trend and seasonal factors for use in other forecasting methods.

Because of the manner in which seasonal indexes and trends are calculated, overfitting can be a problem with decomposition methods. That is, a large random error in only one period will influence the estimates of the seasonal and trend values. Therefore, when calculating seasonal indexes, it is important to eliminate outliers and extraordinarily high or low indexes; a modified mean or a median of the seasonal indexes should then be used. When de-seasonalizing data and calculating trends, division by abnormally low seasonal indexes might generate extremely large forecasts, which creates large errors when forecasting. Consequently, it is important to check for outliers in each step of the method. It is also very difficult to simultaneously decompose trend and seasonality in a series when only a few seasonal cycles exist.
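To make the six estimation steps concrete, the following sketch performs a heavily simplified classical (ratio-to-moving-average) decomposition; it is not the X-11 procedure itself, and the series, period length, and function name are illustrative assumptions.

```python
# Very simplified classical (ratio-to-moving-average) decomposition.
# Illustration of the six estimation steps only; NOT the X-11 procedure.
import numpy as np

def classical_decompose(y, period=4):
    y = np.asarray(y, dtype=float)
    half = period // 2
    # Step 1: estimate the trend-cycle TC with a centered moving average of Y.
    tc = np.convolve(y, np.ones(period) / period, mode="valid")
    y_mid = y[half: half + len(tc)]                  # actuals aligned with the TC estimate
    # Step 2: seasonal-irregular ratios SI = Y / TC.
    si = y_mid / tc
    # Steps 3-4: average SI season by season to get the seasonal indexes S
    # (a robust implementation would first trim extreme SI values).
    seasonal = np.array([si[i::period].mean() for i in range(period)])
    seasonal /= seasonal.mean()                      # normalize the indexes to average 1
    s = seasonal[(np.arange(len(y)) - half) % period]
    # Steps 5-6: de-seasonalized series TCI = Y / S; the leftover irregular is TCI / TC.
    tci = y / s
    return tc, s, tci

y = [10, 14, 8, 12, 11, 15, 9, 13, 12, 16, 10, 14]   # quarterly-like toy data
tc, s, tci = classical_decompose(y, period=4)
print(np.round(s[:4], 2))                             # one cycle of seasonal indexes
```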
Because of these problems (sensitivity to outliers and to short histories), the R² of the fitted decomposition model may be quite high relative to the actual R² values realized when forecasting. Also, the multiperiod-ahead forecasts of a decomposition model may be greatly influenced by future changes in the cyclical influences. Consequently, we forecast only one to two seasonal cycles into the future with decomposition methods, and very cautiously; we are even more cautious in forecasting several years into the future. Although this method has several disadvantages, it is useful in conjunction with other models of trend and cyclical variations. Thus, the more sophisticated the decomposition program, the better.

Fourier Series Analysis (FSA)

FSA is a method that models trend, seasonality, and cyclical movements using trigonometric sine and cosine functions. It is used in automated forecasting systems; however, it is not without its detractors. The combining function in Fourier series analysis is

Ŷ_t = a_0 + b_0 t + Σ_k [ a_k cos(kωt) + b_k sin(kωt) ]    (8)

where
Ŷ_t = fitted or forecasted value at time t
a_0 = constant used to set the level of the series
b_0 = trend estimate of the series
a_1, b_1, a_2, ... = coefficients defining the amplitudes and phases
ω = 2πf/n, known as omega
k = harmonic of ω
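A model of this form can be fitted by ordinary least squares on trigonometric regressors. The sketch below is a minimal illustration; the series, the number of harmonics, and the variable names are our assumptions.

```python
# Least-squares fit of a Fourier-series model (level + trend + k harmonics), cf. equation (8).
# The series, number of harmonics, and names are illustrative.
import numpy as np

def fit_fourier(y, n_harmonics=2):
    y = np.asarray(y, dtype=float)
    n = len(y)
    t = np.arange(n)
    omega = 2 * np.pi / n                        # fundamental frequency
    cols = [np.ones(n), t]                       # a0 (level) and b0 (trend) terms
    for k in range(1, n_harmonics + 1):
        cols.append(np.cos(k * omega * t))       # a_k terms
        cols.append(np.sin(k * omega * t))       # b_k terms
    X = np.column_stack(cols)
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs, X @ coeffs                    # coefficients and fitted values

y = 50 + 0.5 * np.arange(24) + 5 * np.sin(2 * np.pi * np.arange(24) / 12)
coeffs, fitted = fit_fourier(y, n_harmonics=2)
print(np.round(coeffs, 2))
```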

Disadvantages:

Overfitting and the number of parameters to include. As coefficients are added to a model, it becomes more complex, more time-consuming to estimate, and more accurate in fitting the past, but eventually less accurate in forecasting the future. The additional terms that increase historical fit might not improve accuracy, a situation called overfitting. Added trigonometric terms might only model events such as outliers that occurred in the past but are not going to occur in the future. Thus the significance test on the choice of frequencies to include in the model is an important tool in model selection; however, it is not an infallible one. Overfitting is a general problem no matter which forecasting approach is used.

Interpretability. The primary disadvantage of the FSA method is that it may be difficult to relate the model parameters to commonly used or intuitive descriptions of seasonal profiles. We have found graphs very useful in describing the seasonal behavior of FSA. Also, these seasonal profiles can be used to express the amplitude of each frequency as either an additive amount or a percentage of the trend.

Equal weight to all observations. An FSA model uses least squares regression to determine seasonality or cyclical variations. One of the attributes of least squares is that equal weight is given to all errors; a recent error (e.g., t = 46) is squared just as a distant error (e.g., t = 2) is. Consequently, FSA models do not give more weight to the more recent actuals than to the distant actuals, as is done in exponential smoothing. Exponential smoothing methods such as Winters' give more weight to the more recent actual values, and this is often advantageous because the recent past is normally a better predictor of the immediate future.

Advantages: An advantage of FSA over other trend-seasonal methods is that its coefficients are statistically independent of each other. In general, if all amplitudes are insignificant, then all will be dropped and the model will no longer include seasonal patterns. Finally, if the trend parameter is found to be insignificant, it too can be dropped to yield a simple level model. Most other methods require refitting the model when one or more terms is dropped. This makes FSA a versatile approach to modeling time series in operational forecasting systems. Periodograms are useful in discerning white noise, non-stationarity, and seasonality. However, FSA and periodograms are not commonly used in time series work, despite the fact that they provide useful information.

Auto-Regressive Integrated Moving Averages (ARIMA) [14][15][16][5][6]

ARIMA (Box-Jenkins) is a method that models a series using trend, seasonal, and smoothing coefficients based on moving averages, auto-regression, and difference equations. It is an accurate and very versatile approach. The purpose of time series analysis is to use the realization of a process (i.e., the sample) to identify a model of the ARIMA process (i.e., the population) that generates the series. The procedures used to build such models are broadly referred to as Box-Jenkins or ARIMA model-building methods. ARIMA model building is an empirically driven methodology of systematically identifying, estimating, diagnosing, and forecasting time series. The Box-Jenkins modeling procedure is shown in Fig. 1 in the appendix.
ARIMA(p, d, q) includes a class of models:

MA(q) = ARIMA(0, 0, q). Suppose that {Y_t} is a discrete purely random process with mean zero; then a process {X_t} is said to be a moving average process of order q (MA(q)) if

X_t = Y_t + β_1 Y_{t-1} + ... + β_q Y_{t-q}    (9)

where {β_i} are constants and {Y_t} is a sequence of random variables.

AR(p) = ARIMA(p, 0, 0). Suppose that {Y_t} is a discrete purely random process with mean zero; then a process {X_t} is said to be an autoregressive process of order p (AR(p)) if

X_t = a_1 X_{t-1} + ... + a_p X_{t-p} + Y_t    (10)

where {a_i} are constants; {X_t} is regressed on its own past values rather than on independent variables.

ARMA(p, q) = ARIMA(p, 0, q). Broadly speaking, an MA(q) process explains the present as the mixture of q random impulses, while an AR(p) process builds the present in terms of the past p events. A mixed auto-regressive moving-average process containing p AR terms and q MA terms is said to be an ARMA process of order (p, q). It is given by

X_t = a_1 X_{t-1} + ... + a_p X_{t-p} + Y_t + β_1 Y_{t-1} + ... + β_q Y_{t-q}    (11)

where {X_t} is the original series and {Y_t} is a series of unknown random errors assumed to follow the normal probability distribution.

ARIMA(p, d, q). Here d stands for differencing, which makes it possible to describe certain types of non-stationary time series. Such a model is called ARIMA (Auto-Regressive Integrated Moving Average) because the stationary model that is fitted to the differenced data has to be summed, or integrated, to provide a model for the non-stationary data. The general ARIMA(p, d, q) process is of the form

W_t = a_1 W_{t-1} + ... + a_p W_{t-p} + Y_t + β_1 Y_{t-1} + ... + β_q Y_{t-q}    (12)

where W_t is the d-th difference of the original series.
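Before turning to the practical limitations of the approach, the following sketch fits the autoregressive part (equation 10) by ordinary least squares on lagged values. A full Box-Jenkins treatment would also difference the data (the "I" in ARIMA) and estimate MA terms; the data and names here are illustrative.

```python
# Fitting an AR(p) model (equation 10) by ordinary least squares on lagged values.
# This sketch covers only the autoregressive part of ARIMA; names and data are illustrative.
import numpy as np

def fit_ar(x, p):
    """Return AR coefficients a_1..a_p estimated by least squares."""
    x = np.asarray(x, dtype=float)
    X = np.column_stack([x[p - i - 1: len(x) - i - 1] for i in range(p)])  # lags 1..p
    y = x[p:]
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs

def forecast_ar(x, coeffs):
    """One-step-ahead forecast a_1 * x[t] + a_2 * x[t-1] + ..."""
    p = len(coeffs)
    return float(np.dot(coeffs, np.asarray(x)[-1: -p - 1: -1]))

x = [1.0, 1.2, 0.9, 1.1, 1.0, 1.3, 1.1, 1.2, 1.0, 1.4, 1.2, 1.3]
a = fit_ar(x, p=2)
print(np.round(a, 3), round(forecast_ar(x, a), 3))
```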

Limitations:

Short-term forecasting. We emphasize short-term forecasting because most ARIMA models place heavy emphasis on the recent past rather than the distant past. This emphasis on the recent past means that long-term forecasts from ARIMA models are less reliable than short-term forecasts.

Data types. The UBJ method applies to either discrete or continuous data; however, it deals only with data measured at equally spaced, discrete time intervals. Data measured at discrete time intervals can arise in two ways. First, a variable may be accumulated through time and the total recorded periodically. Second, data of this type can arise when a variable is sampled periodically. Suppose an investment analyst records the closing price of a stock at the end of each week; the variable is being sampled at an instant in time rather than being accumulated through time.

Sample size. Building an ARIMA model requires an adequate sample size; 50 observations is the minimum required number.

Stationary data series only. A stationary time series has a mean, variance, and autocorrelation function that are essentially constant through time.

Advantages: The concepts associated with UBJ models are derived from a solid foundation of classical probability theory and mathematical statistics. ARIMA models are a family of models, not just a single model; the family includes the AR, MA, and ARMA classes as special cases. That is why it is often called the best model in stochastic modeling: it combines many of the historically popular univariate methods. Box and Jenkins developed a strategy that guides the analyst in choosing one or more appropriate models out of this larger family. It can be shown that an appropriate ARIMA model produces optimal univariate forecasts; that is, no other standard single-series model can give forecasts with a smaller mean-squared forecast error (i.e., forecast error variance). In sum, there seems to be general agreement among knowledgeable professionals that properly built ARIMA models can handle a wider variety of situations and provide more accurate short-term forecasts than any other standard single-series technique. However, the construction of proper ARIMA models may require more experience and computer time than some historically popular univariate methods.

Other statistical / quantitative / data-mining methods

Multiple linear regression

The general multiple-regression model is

Y = a + b_1 X_1 + b_2 X_2 + ... + b_n X_n + e    (13)

The BLUE rules describe the desirable properties of the estimates: Best, meaning minimum variance (efficient); Linear, meaning linear in the coefficients, not necessarily in the variables; and Unbiased Estimates. The multiple-regression modeling process is shown in Fig. 4 in the appendix. This method is hard to implement in an automatic modeling system: it may face multicollinearity, serial correlation, heteroscedasticity, and similar problems, which require numerous tests and human judgment.

Induction / decision trees [1]

This is a method (also referred to as decision trees) that generates rules (decision rules) for predicting dependencies. It is commonly used in target marketing, credit risk analysis, fraud detection, help desk / problem diagnosis, and customer retention. The following simple credit risk analysis forecasting example shows how the method generates useful information for credit card companies (see Table 1 in the appendix). According to the table, we can construct the cross-tabulation table of independent vs. dependent columns for the root node (Table 2 in the appendix) and draw the decision tree (see Fig. 2 in the appendix).
So we can generate the decision rules:

If Debt = High, then Risk = Poor (20%)
If Debt = Low && Income = High, then Risk = Poor (20%)
If Debt = Low && Income = Low && Married = No, then Risk = Poor (20%)
If Debt = Low && Income = Low && Married = Yes && Credit Card = No, then Risk = Good (10%)

Factor selection may cause either under-fitting or over-fitting; we recommend using statistical tools to analyze the correlations between the independent and dependent variables. It is hard to apply this method directly to a time series data sequence. It corresponds to a kind of semi-strong form of the efficient market hypothesis, which states that the dependent variable fully reflects all relevant publicly available information (the independent variables) rather than only its own past series.
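The decision rules listed above translate directly into code. The sketch below scores one applicant against those rules; the field names and the fallback behavior are our own illustrative choices.

```python
# The four decision rules generated above, written as a scoring function.
# Field names and the "Unknown" fallback are illustrative, not from the paper.

def credit_risk(applicant):
    debt, income = applicant["debt"], applicant["income"]
    married, card = applicant["married"], applicant["credit_card"]
    if debt == "High":
        return "Poor"
    if debt == "Low" and income == "High":
        return "Poor"
    if debt == "Low" and income == "Low" and not married:
        return "Poor"
    if debt == "Low" and income == "Low" and married and not card:
        return "Good"
    return "Unknown"   # combinations not covered by the extracted rules

print(credit_risk({"debt": "Low", "income": "Low", "married": True, "credit_card": False}))
```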

Artificial Neural Networks (ANNs) [8][9][10][18]

ANNs are a highly accurate class of models for prediction and forecasting. We introduce ANNs and use one of the most commonly used, versatile learning architectures: the multi-layer feed-forward, back-propagation network. Known data is fed through nodes to train the model; once trained, new data is input and scored. ANNs are typically used in sales forecasting, fraud detection, target marketing, credit card validation, credit application acceptance, stock and bond price prediction, etc.

Steps in the development of an artificial neural network:

1) Determine the structure of the ANN based on some underlying theory about what influences the dependent variable. This step is the most important part of the forecasting procedure. In general, it involves choosing the input variables, the number of input nodes, the number of hidden layers and nodes, the transfer function type, and the number of output nodes.

2) Divide the input and output data into two groups: the first to be used to train the network, the second to validate the network in an out-of-sample experiment or forecast.

3) Scale all input variables and the desired output variable to the range 0 to 1.

4) Set initial weights and start a training epoch using the training data set by repeating steps (5) to (13). An epoch is the calculation of errors and the adjustment of weights obtained by processing (i.e., taking one pass through) all observations in the training set. The initial values can influence the speed and the RMS that result from the training process; most programs allow weights to be initialized with all zeros or with random numbers from, for example, -1 to 1.

5) Input the scaled variables.

6) Distribute the scaled inputs to each hidden node. In general, each hidden node receives all scaled input variables, which results in parallel processing of all inputs at multiple nodes. In some cases, these inputs are distributed to each output-layer node as well as to the hidden-layer nodes. Note that the output O_j of an input node equals its input I_j; thus O_0 = I_0 and O_1 = I_1.

7) Weight and sum the inputs at the receiving nodes. At each hidden node, the outputs of the input nodes are weighted and summed. Thus the input to node 3 is

I_3 = W_03 O_0 + W_13 O_1    (14)

8) Transform hidden-node inputs to outputs. At each hidden node, the weighted inputs are transformed into an output in the range 0 to 1:

O_3 = 1 / (1 + e^{-I_3})    (15)

where, as always, e is the natural number 2.71828...

9) Weight and sum the hidden-node outputs as inputs to the output nodes. The outputs of the hidden nodes and of any bias nodes are weighted and summed. As we will see, a bias node operates much like the constant term in regression analysis.

I_4 = W_24 O_2 + W_34 O_3    (16)

10) Transform the inputs at the output nodes. At output node 4, the weighted input I_4 is transformed into the output O_4 in the range 0 to 1. This is the final output of the ANN:

O_4 = 1 / (1 + e^{-I_4})    (17)

11) Calculate the output errors. The scaled output value O_4 is compared to the scaled desired output value D_4 and the error is calculated:

e_i = D_4 - O_4    (18)

where i is the observation number in the training set.

12) Backpropagate the errors to adjust the weights. Based on the errors of step (11), the weights throughout the network are modified so as to move toward minimizing the RMS value.

13) Continue the epoch. Repeat steps (5) to (12) for all observations in the input data set. As mentioned, each pass through all observations is called an epoch. When all observations have been processed (i.e., one epoch is completed), go to step (14).

14) Calculate the epoch RMS value of the errors.
If this RMS is low enough, stop and go to step (15). If the RMS is not low enough, repeat steps (5) to (14) until the stopping condition is reached:

RMS = sqrt( Σ e_i^2 / n_t )    (19)

where
RMS = approximately the residual standard error
e_i = error for each observation in the latest epoch
n_t = number of observations in the epoch / training set

15) Judge out-of-sample validity. Having trained the ANN using one set of input data, it should now be validated using out-of-sample data; that is, use the ANN trained in steps (1) to (14) to predict the withheld output variables. If the out-of-sample RMS is consistent with the training RMS, the model appears valid. If the model is not valid, repeat the experiment after you: a) try different initial values for the weights; b) redesign the ANN (i.e., use fewer or more layers or nodes); c) try a different ANN method; or d) reject ANNs as a viable method.

16) Use the model in forecasting, being careful to monitor it with tracking signals and other devices.

There are few well-established guidelines on choosing the best ANN architecture. The architecture includes all of the considerations above: the number of input variables and nodes, the number of hidden layers and nodes, and the number of output nodes, as well as the choice of transfer function and the values of other coefficients. In general, there is nothing that precludes having more than one hidden layer.
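The training loop of steps (5) to (14) can be summarized in a few lines of NumPy. The sketch below uses a single hidden layer with logistic transfer functions (equations 15 and 17), the error of equation (18), and the epoch RMS of equation (19); for brevity it updates the weights once per epoch over the whole batch and omits bias nodes. The architecture sizes, learning rate, and data are illustrative assumptions, not the paper's configuration.

```python
# Minimal feed-forward / back-propagation sketch of steps (5)-(14).
# Batch (once-per-epoch) weight updates, no bias nodes; all sizes and data are illustrative.
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((20, 2))                       # scaled inputs in [0, 1]
D = ((X[:, 0] + X[:, 1]) / 2).reshape(-1, 1)  # scaled desired outputs in [0, 1]

W1 = rng.uniform(-1, 1, (2, 2))               # input -> hidden weights
W2 = rng.uniform(-1, 1, (2, 1))               # hidden -> output weights
lr = 0.5

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for epoch in range(2000):                     # one pass through all observations = one epoch
    H = sigmoid(X @ W1)                       # hidden outputs (eq. 15)
    O = sigmoid(H @ W2)                       # network output (eq. 17)
    E = D - O                                 # output errors (eq. 18)
    # Backpropagate: gradients of the squared error through the logistic units.
    dO = E * O * (1 - O)
    dH = (dO @ W2.T) * H * (1 - H)
    W2 += lr * H.T @ dO
    W1 += lr * X.T @ dH
    rms = np.sqrt(np.mean(E ** 2))            # epoch RMS (eq. 19)
    if rms < 0.01:                            # stop when the RMS is low enough
        break

print(epoch, round(float(rms), 4))
```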

It is hard to determine the number of hidden layers that gives the best-fitted model, so it is possible to end up with a model that is too complex or too simple. If the ANN is too complex, the fitted weights will match the input data too closely instead of modeling the underlying patterns; fortunately, this overfitting can be detected during step 15. If the neural network is too simplistic, the training and validation RMS values will both be too high.

COMPARISONS AND ANALYSIS

As we have described in the previous sections, there is no single best forecasting model. Each model may best fit a specific situation characterized by horizon length, degree of automation in development, pattern-recognition ability, number of observations required, etc. Table 4 and Table 5 in the appendix show a comparison of the different forecasting models. Every model has its own advantages, disadvantages, and constraints.

ANN models, like all forecasting methods, are not always the best approach to forecasting. In fact, the experience to date gives mixed reviews of the effectiveness of ANNs. Sharda and Patil (1990) used 75 time series from the M-competition and compared their results to those of an expert ARIMA forecasting system. Out of the 75 series, ANNs performed better on 39 series and worse on 36. We believe that it is difficult to consistently outperform a good human ARIMA model builder (which implies moderately heavy use of skilled manpower). In general, smoothing methods such as Winters', Fourier series analysis, and ARIMA can be more accurate for immediate- to intermediate-term forecasting than multivariate methods. Of course, the virtue of having a broad class of models from which to choose is that, on average, this should lead to more accurate forecasts. Certainly it would be expected that a sensible use of the ARIMA model-building approach should, in the aggregate, produce forecasts of higher quality than those deriving from the imposition of a single model structure on every time series. However, empirical studies by Makridakis and Hibon (1979) and by Makridakis et al. (1982) report little or no improvement in forecast accuracy when using ARIMA models rather than some simpler extrapolative techniques.

The degree to which the estimation of forecasting methods can be automated decreases for the methods lower in Fig. 3 in the appendix. This is because the complexity and subjectivity of the forecasting methods evolve from the highly programmable to the ill-structured qualitative methods; as the complexity of a method increases, its automation becomes less likely (e.g., it is not normally possible or cost-effective to automate multiple-regression modeling methods). The strengths of the ARIMA model-building approach lie in its flexibility, the relative accuracy of the resulting forecasts, and the possibility of extension to the analysis of multiple time series. It is not easy to develop automatic ARIMA model-building programs that can produce satisfactory forecasts.

PREDICTION MODELING SYSTEM

We have done extensive research and developed case studies on how to choose a model based on individual circumstances when designing a new prediction system, and we have developed a new methodology for a prediction modeling system. The modeling system is based on a couple of assumptions. First, heavy computation requirements are not a great concern: we can sacrifice a little more computation time in exchange for more accurate forecasting.
This is analogous to sacrificing a little more hard disk space in exchange for the higher speed and stability of an operating system. Second, according to Table 5 in the appendix, we recommend that at least 60 observations (i.e., five seasons) be stored for monthly series, regardless of the forecasting method.

Because there is no single best forecasting method, the methods used in good systems are, in general, less important than their intelligent usage. Several very good methodologies have been executed very poorly in automated systems; a good implementation of a less effective method can outperform a poor implementation of the best method. From the analysis and comparison of the different forecasting models, we can classify the models into different patterns (see Table 3 in the appendix). Based on that table, in order to simplify the problem, we set up a rule-based pattern table for classification and selected the following forecasting methods for building our demo prediction modeling system: SES, Holt/Winters, ARIMA, and neural networks.

Now let us look at our prediction modeling system procedure (Fig. 3 in the appendix):

1. Pattern Identification Module (PIM)

This module uses statistical tools to analyze the data sequence first for pattern classification. We built the autocorrelation function (ACF) and partial autocorrelation function (PACF) to facilitate this objective and use t-values to test the coefficients. The ACF measures the direction and strength of the statistical relationship between ordered pairs of observations on two random variables; it measures how closely the matched pairs are related to each other. The PACF measures the correlation between ordered pairs separated by various time spans (k = 1, 2, 3, ...) with the effects of intervening observations accounted for. We use the total sample size divided by 4 as the number of lags.
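A minimal sketch of the sample ACF used by the PIM is shown below. The n/4 lag rule follows the text; the data, names, and the approximate t-value check are our illustrative assumptions.

```python
# Sample autocorrelation function (ACF) as used by the Pattern Identification Module.
# The n/4 lag rule follows the text; the data and the t-value approximation are illustrative.
import numpy as np

def acf(y, max_lag=None):
    y = np.asarray(y, dtype=float)
    n = len(y)
    if max_lag is None:
        max_lag = n // 4                      # number of lags = sample size / 4
    y = y - y.mean()
    denom = np.sum(y ** 2)
    return np.array([np.sum(y[k:] * y[:n - k]) / denom for k in range(1, max_lag + 1)])

y = [12, 14, 13, 15, 16, 15, 17, 18, 17, 19, 20, 19, 21, 22, 21, 23]
r = acf(y)
print(np.round(r, 2))
# Approximate t-values for testing each coefficient against zero: r_k * sqrt(n).
print(np.round(r * np.sqrt(len(y)), 2))
```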

The PIM is highly important because it is the milestone of the whole modeling system. We also calculate the mean, standard deviation, etc., for later use, and perform a RUNS test (if applicable) to measure the likelihood that the series is a random occurrence.

2. Pattern Comparison Module (PCM)

This module performs the pattern comparison using a back-propagation neural network. We use a neural network instead of human experts to learn from and train on the statistical results from the PIM, eliminate outliers, and output the trend of the statistical data sequence as added pattern information. This neural-network-based module can intelligently generate pattern information such as "no season", "with trend", "stationary", etc.

3. Model Selection Module (MSM)

This module selects the appropriate available model(s) from the pattern information generated by the PCM, based on the rule-based pattern table. For example, for a "no season, with trend, stationary" time series data sequence, Holt's two-parameter trend model may be the right choice. This module is a pure knowledge-based expert system.

4. Error Check Module (ECM)

This module runs residual statistical analysis to see whether the errors are within an acceptable range. If so, we can accept the forecasting results. If not, we have to repeat the whole procedure, comparing the following statistical criteria: residual standard error (RSE), mean square error (MSE), root mean squared error (RMSE), and mean absolute percentage error (MAPE) (see the Formulas in the appendix), until the errors fall into the acceptable range.

3.5 Model Comparison Module

This module performs model comparison only when two or more model candidates have been selected by the MSM. It then runs the candidate models and compares their forecasting accuracy, again by residual analysis. A good forecasting model should not consistently over- or under-forecast; consequently, the mean error should not vary greatly from zero. The sum of the squared errors is used to calculate the residual standard error as in equation (22) in the Formulas in the appendix. It is used for calculating the standard deviation of the errors, commonly called the residual standard error or standard error of the estimate; we use the term residual standard error (RSE) to denote this concept. The k value in equation (22) is the usual degrees-of-freedom adjustment that makes the RSE a better estimate of the true population RSE. As shown, the RSE is different from the usual standard deviation: the RSE is a standard deviation that assumes the mean error is equal to zero, whether it is or not.

The selection of the best forecasting model is considerably more complex than choosing the model with the minimum RSE; there are a variety of criteria used to choose one forecasting model over another. The most important general objective in forecasting is to decrease the width of the confidence intervals used to make probability statements about future values. A probability statement for the next actual value is: Actual = Forecast ± Z · RSE. The probability that the actual value will lie in this interval is determined by the t- or Z-value. If Z = 1.96 is used and the errors are normally distributed, approximately 95% of the actual values for one-period-ahead forecasts will fall in this range, assuming the model of the past is accurate and the past repeats itself.
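The residual statistics compared by the ECM can be sketched as follows; k is the number of fitted parameters, and the function name and example values are our illustrative assumptions.

```python
# Residual statistics used by the Error Check Module: MSE, RMSE, MAPE, and the residual
# standard error RSE with the degrees-of-freedom adjustment (equation 22).
# Names and the example data are illustrative.
import numpy as np

def error_stats(actual, forecast, k=1):
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    e = actual - forecast
    n = len(e)
    mse = np.mean(e ** 2)
    rmse = np.sqrt(mse)
    mape = np.mean(np.abs(e / actual)) * 100
    rse = np.sqrt(np.sum(e ** 2) / (n - k))        # assumes the mean error is zero
    return {"MSE": mse, "RMSE": rmse, "MAPE": mape, "RSE": rse}

stats = error_stats([100, 102, 105, 103], [98, 103, 104, 101], k=2)
print({name: round(v, 2) for name, v in stats.items()})
# A rough 95% interval for the next actual value: forecast +/- 1.96 * RSE.
```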
As discussed above, our new prediction modeling system combines an expert system (ES), a conventional program system (CPS), and a neural network (NN), together with extensive statistics and computation. Our effort is to build an automatic forecasting modeling system that can be applied to any data sequence under different situations and conditions.

CONCLUSION AND FURTHER WORK

As we mentioned before, there is no one method that is best for all series or all forecast horizons; some methods are better for short-term patterns, others at long horizons. Today's world is a digital world, and computer-based automatic prediction modeling systems are the trend. We have pointed out that each forecasting method has its own advantages and disadvantages, and we have also defined constraints for the different prediction models. However, we cannot rely solely on Fig. 4 and Fig. 5 in the appendix, the 24 models of the M-Competition, or even our extensive hands-on experience in building and testing different models, because average rankings can be misleading, and by choosing different models for different horizons one can dramatically improve accuracy. Thus, how to make such intelligence-based model selections and how to combine different models seamlessly to solve more complex problems may bring a breakthrough in the development of prediction modeling systems.

REFERENCES

[1] A. Bellacicco, 2000, "A new insight into the algebraic structure of the exponential smoothing algorithm of Brown", in: N. Ebecken and C. A. Brebbia (Eds.), Data Mining II, WIT Press, Southampton, UK.
[2] A. Dhond, A. Gupta, S. Vadhavkar, 2000, "Data Mining Techniques for Optimizing Inventories for Electronic Commerce", KDD, Boston, MA, USA.

[3] D. Rafiei and A. O. Mendelzon, 2000, "Querying Time Series Data Based on Similarity", IEEE Transactions on Knowledge and Data Engineering, Vol. 12, No. 5.
[4] G. Antoniol, G. Casazza, G. Di Lucca, M. Di Penta, E. Merlo, 2001, "Predicting Web Site Access: an Application of Time Series", The 3rd International Workshop on Web Site Evolution (WSE'01), IEEE.
[5] G. Box, G. M. Jenkins and G. C. Reinsel, 1994, Time Series Analysis: Forecasting and Control, 3rd ed., Prentice Hall.
[6] G. Box and M. Jenkins, 1970, Time Series Analysis: Forecasting and Control, Holden Day, San Francisco, USA.
[7] Hillol Kargupta et al., 2001, "MobiMine: Monitoring the Stock Market from a PDA", SIGKDD Explorations, Vol. 3, Issue 2.
[8] J. Neves and P. Cortez, 1997, "An Artificial Neural Network Genetic Based Approach for Time Series Forecasting", The 4th Brazilian Symposium on Neural Networks (SBRN'97).
[9] J. Yao and C. L. Tan, 2000, "Time Dependent Directional Profit Model for Financial Time Series Forecasting", IEEE-INNS-ENNS International Joint Conference on Neural Networks (IJCNN'00).
[10] K. Bansal, S. Vadhavkar, A. Gupta, 1998, "Neural Networks Based Forecasting Techniques for Inventory Control Applications" (brief application description), Data Mining and Knowledge Discovery, Vol. 2.
[11] Makridakis, S., et al., 1982, "The Accuracy of Extrapolation (Time Series) Methods: Results of a Forecasting Competition", Journal of Forecasting, Vol. 1.
[12] Makridakis, S., S. C. Wheelwright, V. E. McGee, 1983, Forecasting Methods and Applications, 2nd ed., New York: John Wiley & Sons.
[13] Makridakis, S., and M. Hibon, 1979, "Accuracy of Forecasting: An Empirical Investigation (with discussion)", Journal of the Royal Statistical Society A, No. 14 (Part 2).
[14] Pankratz, Alan, 1983, Forecasting with Univariate Box-Jenkins Models.
[15] Paul Newbold, 1983, "ARIMA Model Building and the Time Series Analysis Approach to Forecasting", Journal of Forecasting, Vol. 2.
[16] Paul Newbold and J. Kenton Zumwalt, 1987, "Combining Forecasts to Improve Earnings Per Share Prediction", International Journal of Forecasting, Vol. 3.
[17] R. G. Brown, 1963, Smoothing, Forecasting and Prediction, Prentice Hall.
[18] Sharda, R., and R. Patil, 1990, "Neural Networks as Forecasting Experts: An Empirical Test", International Joint Conference on Neural Networks, Washington, D.C.: IEEE.
[19] Shiskin, J., A. H. Young, and J. C. Musgrave, 1967, "The X-11 Variant of the Census Method II Seasonal Adjustment Program", Technical Paper No. 15, U.S. Department of Commerce, Bureau of Economic Analysis.
[20] Stephen A. DeLurgio, 1998, Forecasting Principles and Applications, International Editions, McGraw-Hill.

FIGURES AND TABLES

Fig. 1 The Box-Jenkins modeling procedure
Table 1 Credit risk analysis for credit card companies
Table 2 Cross-tabulation table of independent vs. dependent columns

Fig. 2 Decision tree for credit risk analysis
Fig. 3 Prediction modeling system procedure
Fig. 4 Multiple regression modeling process [20]

Table 3 Model pattern classifications
Table 4 Comparisons of forecasting methods [20]
Table 5 Comparisons of forecasting methods (cont'd) [20]

FORMULAS

Frequently used general statistics: mean, variance, standard deviation, covariance, correlation, t-test, Z-normal, F-test, p-value, ACF, PACF, DW, VIF, RUNS test.

Residual / error statistics used in smoothing include the residual standard error with its degrees-of-freedom adjustment,

RSE = sqrt( Σ (Y_t - Ŷ_t)^2 / (n - k) )    (22)

In ARIMA: mean square error; residual autocorrelation function (RACF); chi-square test; RMSE (root mean squared error); MAPE (mean absolute percentage error); Akaike information criterion: AIC = n log(SSE) + 2k (28); Schwarz Bayesian information criterion: BIC = n log(SSE) + k log(n) (29).

In multiple regression: deviation DEV = [Y_i - E(Y)] (30); standard error of the regression coefficients; partial F-test for restricted and unrestricted models, F_calculated = (SSE_R - SSE_U) / m (35).

In neural networks: RSE (see equation 22 above).


More information

Forecasting: Methods and Applications

Forecasting: Methods and Applications Neapolis University HEPHAESTUS Repository School of Economic Sciences and Business http://hephaestus.nup.ac.cy Books 1998 Forecasting: Methods and Applications Makridakis, Spyros John Wiley & Sons, Inc.

More information

Econometric Forecasting

Econometric Forecasting Robert M. Kunst robert.kunst@univie.ac.at University of Vienna and Institute for Advanced Studies Vienna October 1, 2014 Outline Introduction Model-free extrapolation Univariate time-series models Trend

More information

TIME SERIES DATA PREDICTION OF NATURAL GAS CONSUMPTION USING ARIMA MODEL

TIME SERIES DATA PREDICTION OF NATURAL GAS CONSUMPTION USING ARIMA MODEL International Journal of Information Technology & Management Information System (IJITMIS) Volume 7, Issue 3, Sep-Dec-2016, pp. 01 07, Article ID: IJITMIS_07_03_001 Available online at http://www.iaeme.com/ijitmis/issues.asp?jtype=ijitmis&vtype=7&itype=3

More information

Chapter 8 - Forecasting

Chapter 8 - Forecasting Chapter 8 - Forecasting Operations Management by R. Dan Reid & Nada R. Sanders 4th Edition Wiley 2010 Wiley 2010 1 Learning Objectives Identify Principles of Forecasting Explain the steps in the forecasting

More information

Empirical Market Microstructure Analysis (EMMA)

Empirical Market Microstructure Analysis (EMMA) Empirical Market Microstructure Analysis (EMMA) Lecture 3: Statistical Building Blocks and Econometric Basics Prof. Dr. Michael Stein michael.stein@vwl.uni-freiburg.de Albert-Ludwigs-University of Freiburg

More information

TRANSFER FUNCTION MODEL FOR GLOSS PREDICTION OF COATED ALUMINUM USING THE ARIMA PROCEDURE

TRANSFER FUNCTION MODEL FOR GLOSS PREDICTION OF COATED ALUMINUM USING THE ARIMA PROCEDURE TRANSFER FUNCTION MODEL FOR GLOSS PREDICTION OF COATED ALUMINUM USING THE ARIMA PROCEDURE Mozammel H. Khan Kuwait Institute for Scientific Research Introduction The objective of this work was to investigate

More information

Some Time-Series Models

Some Time-Series Models Some Time-Series Models Outline 1. Stochastic processes and their properties 2. Stationary processes 3. Some properties of the autocorrelation function 4. Some useful models Purely random processes, random

More information

Chapter 3: Regression Methods for Trends

Chapter 3: Regression Methods for Trends Chapter 3: Regression Methods for Trends Time series exhibiting trends over time have a mean function that is some simple function (not necessarily constant) of time. The example random walk graph from

More information

CHAPTER 4: DATASETS AND CRITERIA FOR ALGORITHM EVALUATION

CHAPTER 4: DATASETS AND CRITERIA FOR ALGORITHM EVALUATION CHAPTER 4: DATASETS AND CRITERIA FOR ALGORITHM EVALUATION 4.1 Overview This chapter contains the description about the data that is used in this research. In this research time series data is used. A time

More information

Read Section 1.1, Examples of time series, on pages 1-8. These example introduce the book; you are not tested on them.

Read Section 1.1, Examples of time series, on pages 1-8. These example introduce the book; you are not tested on them. TS Module 1 Time series overview (The attached PDF file has better formatting.)! Model building! Time series plots Read Section 1.1, Examples of time series, on pages 1-8. These example introduce the book;

More information

DEPARTMENT OF QUANTITATIVE METHODS & INFORMATION SYSTEMS

DEPARTMENT OF QUANTITATIVE METHODS & INFORMATION SYSTEMS DEPARTMENT OF QUANTITATIVE METHODS & INFORMATION SYSTEMS Moving Averages and Smoothing Methods ECON 504 Chapter 7 Fall 2013 Dr. Mohammad Zainal 2 This chapter will describe three simple approaches to forecasting

More information

Development of Demand Forecasting Models for Improved Customer Service in Nigeria Soft Drink Industry_ Case of Coca-Cola Company Enugu

Development of Demand Forecasting Models for Improved Customer Service in Nigeria Soft Drink Industry_ Case of Coca-Cola Company Enugu International Journal of Scientific Research Engineering & Technology (IJSRET), ISSN 2278 882 Volume 5, Issue 4, April 26 259 Development of Demand Forecasting Models for Improved Customer Service in Nigeria

More information

Empirical Approach to Modelling and Forecasting Inflation in Ghana

Empirical Approach to Modelling and Forecasting Inflation in Ghana Current Research Journal of Economic Theory 4(3): 83-87, 2012 ISSN: 2042-485X Maxwell Scientific Organization, 2012 Submitted: April 13, 2012 Accepted: May 06, 2012 Published: June 30, 2012 Empirical Approach

More information

Evaluation of Some Techniques for Forecasting of Electricity Demand in Sri Lanka

Evaluation of Some Techniques for Forecasting of Electricity Demand in Sri Lanka Appeared in Sri Lankan Journal of Applied Statistics (Volume 3) 00 Evaluation of Some echniques for Forecasting of Electricity Demand in Sri Lanka.M.J. A. Cooray and M.Indralingam Department of Mathematics

More information

Dynamic Time Series Regression: A Panacea for Spurious Correlations

Dynamic Time Series Regression: A Panacea for Spurious Correlations International Journal of Scientific and Research Publications, Volume 6, Issue 10, October 2016 337 Dynamic Time Series Regression: A Panacea for Spurious Correlations Emmanuel Alphonsus Akpan *, Imoh

More information

Suan Sunandha Rajabhat University

Suan Sunandha Rajabhat University Forecasting Exchange Rate between Thai Baht and the US Dollar Using Time Series Analysis Kunya Bowornchockchai Suan Sunandha Rajabhat University INTRODUCTION The objective of this research is to forecast

More information

Cyclical Effect, and Measuring Irregular Effect

Cyclical Effect, and Measuring Irregular Effect Paper:15, Quantitative Techniques for Management Decisions Module- 37 Forecasting & Time series Analysis: Measuring- Seasonal Effect, Cyclical Effect, and Measuring Irregular Effect Principal Investigator

More information

An approach to make statistical forecasting of products with stationary/seasonal patterns

An approach to make statistical forecasting of products with stationary/seasonal patterns An approach to make statistical forecasting of products with stationary/seasonal patterns Carlos A. Castro-Zuluaga (ccastro@eafit.edu.co) Production Engineer Department, Universidad Eafit Medellin, Colombia

More information

FORECASTING OF COTTON PRODUCTION IN INDIA USING ARIMA MODEL

FORECASTING OF COTTON PRODUCTION IN INDIA USING ARIMA MODEL FORECASTING OF COTTON PRODUCTION IN INDIA USING ARIMA MODEL S.Poyyamozhi 1, Dr. A. Kachi Mohideen 2. 1 Assistant Professor and Head, Department of Statistics, Government Arts College (Autonomous), Kumbakonam

More information

STAT 115: Introductory Methods for Time Series Analysis and Forecasting. Concepts and Techniques

STAT 115: Introductory Methods for Time Series Analysis and Forecasting. Concepts and Techniques STAT 115: Introductory Methods for Time Series Analysis and Forecasting Concepts and Techniques School of Statistics University of the Philippines Diliman 1 FORECASTING Forecasting is an activity that

More information

Improved Holt Method for Irregular Time Series

Improved Holt Method for Irregular Time Series WDS'08 Proceedings of Contributed Papers, Part I, 62 67, 2008. ISBN 978-80-7378-065-4 MATFYZPRESS Improved Holt Method for Irregular Time Series T. Hanzák Charles University, Faculty of Mathematics and

More information

Decision 411: Class 9. HW#3 issues

Decision 411: Class 9. HW#3 issues Decision 411: Class 9 Presentation/discussion of HW#3 Introduction to ARIMA models Rules for fitting nonseasonal models Differencing and stationarity Reading the tea leaves : : ACF and PACF plots Unit

More information

Dennis Bricker Dept of Mechanical & Industrial Engineering The University of Iowa. Forecasting demand 02/06/03 page 1 of 34

Dennis Bricker Dept of Mechanical & Industrial Engineering The University of Iowa. Forecasting demand 02/06/03 page 1 of 34 demand -5-4 -3-2 -1 0 1 2 3 Dennis Bricker Dept of Mechanical & Industrial Engineering The University of Iowa Forecasting demand 02/06/03 page 1 of 34 Forecasting is very difficult. especially about the

More information

Forecasting the Prices of Indian Natural Rubber using ARIMA Model

Forecasting the Prices of Indian Natural Rubber using ARIMA Model Available online at www.ijpab.com Rani and Krishnan Int. J. Pure App. Biosci. 6 (2): 217-221 (2018) ISSN: 2320 7051 DOI: http://dx.doi.org/10.18782/2320-7051.5464 ISSN: 2320 7051 Int. J. Pure App. Biosci.

More information

Austrian Inflation Rate

Austrian Inflation Rate Austrian Inflation Rate Course of Econometric Forecasting Nadir Shahzad Virkun Tomas Sedliacik Goal and Data Selection Our goal is to find a relatively accurate procedure in order to forecast the Austrian

More information

Introduction to Forecasting

Introduction to Forecasting Introduction to Forecasting Introduction to Forecasting Predicting the future Not an exact science but instead consists of a set of statistical tools and techniques that are supported by human judgment

More information

Box-Jenkins ARIMA Advanced Time Series

Box-Jenkins ARIMA Advanced Time Series Box-Jenkins ARIMA Advanced Time Series www.realoptionsvaluation.com ROV Technical Papers Series: Volume 25 Theory In This Issue 1. Learn about Risk Simulator s ARIMA and Auto ARIMA modules. 2. Find out

More information

FORECASTING YIELD PER HECTARE OF RICE IN ANDHRA PRADESH

FORECASTING YIELD PER HECTARE OF RICE IN ANDHRA PRADESH International Journal of Mathematics and Computer Applications Research (IJMCAR) ISSN 49-6955 Vol. 3, Issue 1, Mar 013, 9-14 TJPRC Pvt. Ltd. FORECASTING YIELD PER HECTARE OF RICE IN ANDHRA PRADESH R. RAMAKRISHNA

More information

MGR-815. Notes for the MGR-815 course. 12 June School of Superior Technology. Professor Zbigniew Dziong

MGR-815. Notes for the MGR-815 course. 12 June School of Superior Technology. Professor Zbigniew Dziong Modeling, Estimation and Control, for Telecommunication Networks Notes for the MGR-815 course 12 June 2010 School of Superior Technology Professor Zbigniew Dziong 1 Table of Contents Preface 5 1. Example

More information

Lecture 2. Judging the Performance of Classifiers. Nitin R. Patel

Lecture 2. Judging the Performance of Classifiers. Nitin R. Patel Lecture 2 Judging the Performance of Classifiers Nitin R. Patel 1 In this note we will examine the question of how to udge the usefulness of a classifier and how to compare different classifiers. Not only

More information

The Identification of ARIMA Models

The Identification of ARIMA Models APPENDIX 4 The Identification of ARIMA Models As we have established in a previous lecture, there is a one-to-one correspondence between the parameters of an ARMA(p, q) model, including the variance of

More information

On the benefit of using time series features for choosing a forecasting method

On the benefit of using time series features for choosing a forecasting method On the benefit of using time series features for choosing a forecasting method Christiane Lemke and Bogdan Gabrys Bournemouth University - School of Design, Engineering and Computing Poole House, Talbot

More information

FE570 Financial Markets and Trading. Stevens Institute of Technology

FE570 Financial Markets and Trading. Stevens Institute of Technology FE570 Financial Markets and Trading Lecture 5. Linear Time Series Analysis and Its Applications (Ref. Joel Hasbrouck - Empirical Market Microstructure ) Steve Yang Stevens Institute of Technology 9/25/2012

More information

ANN and Statistical Theory Based Forecasting and Analysis of Power System Variables

ANN and Statistical Theory Based Forecasting and Analysis of Power System Variables ANN and Statistical Theory Based Forecasting and Analysis of Power System Variables Sruthi V. Nair 1, Poonam Kothari 2, Kushal Lodha 3 1,2,3 Lecturer, G. H. Raisoni Institute of Engineering & Technology,

More information

3 Time Series Regression

3 Time Series Regression 3 Time Series Regression 3.1 Modelling Trend Using Regression Random Walk 2 0 2 4 6 8 Random Walk 0 2 4 6 8 0 10 20 30 40 50 60 (a) Time 0 10 20 30 40 50 60 (b) Time Random Walk 8 6 4 2 0 Random Walk 0

More information

arxiv: v1 [stat.me] 5 Nov 2008

arxiv: v1 [stat.me] 5 Nov 2008 arxiv:0811.0659v1 [stat.me] 5 Nov 2008 Estimation of missing data by using the filtering process in a time series modeling Ahmad Mahir R. and Al-khazaleh A. M. H. School of Mathematical Sciences Faculty

More information

data lam=36.9 lam=6.69 lam=4.18 lam=2.92 lam=2.21 time max wavelength modulus of max wavelength cycle

data lam=36.9 lam=6.69 lam=4.18 lam=2.92 lam=2.21 time max wavelength modulus of max wavelength cycle AUTOREGRESSIVE LINEAR MODELS AR(1) MODELS The zero-mean AR(1) model x t = x t,1 + t is a linear regression of the current value of the time series on the previous value. For > 0 it generates positively

More information

Univariate linear models

Univariate linear models Univariate linear models The specification process of an univariate ARIMA model is based on the theoretical properties of the different processes and it is also important the observation and interpretation

More information

Forecasting Area, Production and Yield of Cotton in India using ARIMA Model

Forecasting Area, Production and Yield of Cotton in India using ARIMA Model Forecasting Area, Production and Yield of Cotton in India using ARIMA Model M. K. Debnath 1, Kartic Bera 2 *, P. Mishra 1 1 Department of Agricultural Statistics, Bidhan Chanda Krishi Vishwavidyalaya,

More information

Decision 411: Class 3

Decision 411: Class 3 Decision 411: Class 3 Discussion of HW#1 Introduction to seasonal models Seasonal decomposition Seasonal adjustment on a spreadsheet Forecasting with seasonal adjustment Forecasting inflation Log transformation

More information

ISSN Original Article Statistical Models for Forecasting Road Accident Injuries in Ghana.

ISSN Original Article Statistical Models for Forecasting Road Accident Injuries in Ghana. Available online at http://www.urpjournals.com International Journal of Research in Environmental Science and Technology Universal Research Publications. All rights reserved ISSN 2249 9695 Original Article

More information

The ARIMA Procedure: The ARIMA Procedure

The ARIMA Procedure: The ARIMA Procedure Page 1 of 120 Overview: ARIMA Procedure Getting Started: ARIMA Procedure The Three Stages of ARIMA Modeling Identification Stage Estimation and Diagnostic Checking Stage Forecasting Stage Using ARIMA Procedure

More information

Decision 411: Class 3

Decision 411: Class 3 Decision 411: Class 3 Discussion of HW#1 Introduction to seasonal models Seasonal decomposition Seasonal adjustment on a spreadsheet Forecasting with seasonal adjustment Forecasting inflation Poor man

More information

Econ 423 Lecture Notes: Additional Topics in Time Series 1

Econ 423 Lecture Notes: Additional Topics in Time Series 1 Econ 423 Lecture Notes: Additional Topics in Time Series 1 John C. Chao April 25, 2017 1 These notes are based in large part on Chapter 16 of Stock and Watson (2011). They are for instructional purposes

More information

Prof. Dr. Roland Füss Lecture Series in Applied Econometrics Summer Term Introduction to Time Series Analysis

Prof. Dr. Roland Füss Lecture Series in Applied Econometrics Summer Term Introduction to Time Series Analysis Introduction to Time Series Analysis 1 Contents: I. Basics of Time Series Analysis... 4 I.1 Stationarity... 5 I.2 Autocorrelation Function... 9 I.3 Partial Autocorrelation Function (PACF)... 14 I.4 Transformation

More information

Autoregressive Integrated Moving Average Model to Predict Graduate Unemployment in Indonesia

Autoregressive Integrated Moving Average Model to Predict Graduate Unemployment in Indonesia DOI 10.1515/ptse-2017-0005 PTSE 12 (1): 43-50 Autoregressive Integrated Moving Average Model to Predict Graduate Unemployment in Indonesia Umi MAHMUDAH u_mudah@yahoo.com (State Islamic University of Pekalongan,

More information

Available online at ScienceDirect. Procedia Computer Science 72 (2015 )

Available online at  ScienceDirect. Procedia Computer Science 72 (2015 ) Available online at www.sciencedirect.com ScienceDirect Procedia Computer Science 72 (2015 ) 630 637 The Third Information Systems International Conference Performance Comparisons Between Arima and Arimax

More information

Estimation and application of best ARIMA model for forecasting the uranium price.

Estimation and application of best ARIMA model for forecasting the uranium price. Estimation and application of best ARIMA model for forecasting the uranium price. Medeu Amangeldi May 13, 2018 Capstone Project Superviser: Dongming Wei Second reader: Zhenisbek Assylbekov Abstract This

More information

Decision 411: Class 3

Decision 411: Class 3 Decision 411: Class 3 Discussion of HW#1 Introduction to seasonal models Seasonal decomposition Seasonal adjustment on a spreadsheet Forecasting with seasonal adjustment Forecasting inflation Poor man

More information

FORECASTING OF ECONOMIC QUANTITIES USING FUZZY AUTOREGRESSIVE MODEL AND FUZZY NEURAL NETWORK

FORECASTING OF ECONOMIC QUANTITIES USING FUZZY AUTOREGRESSIVE MODEL AND FUZZY NEURAL NETWORK FORECASTING OF ECONOMIC QUANTITIES USING FUZZY AUTOREGRESSIVE MODEL AND FUZZY NEURAL NETWORK Dusan Marcek Silesian University, Institute of Computer Science Opava Research Institute of the IT4Innovations

More information

FORECASTING FLUCTUATIONS OF ASPHALT CEMENT PRICE INDEX IN GEORGIA

FORECASTING FLUCTUATIONS OF ASPHALT CEMENT PRICE INDEX IN GEORGIA FORECASTING FLUCTUATIONS OF ASPHALT CEMENT PRICE INDEX IN GEORGIA Mohammad Ilbeigi, Baabak Ashuri, Ph.D., and Yang Hui Economics of the Sustainable Built Environment (ESBE) Lab, School of Building Construction

More information

Basics: Definitions and Notation. Stationarity. A More Formal Definition

Basics: Definitions and Notation. Stationarity. A More Formal Definition Basics: Definitions and Notation A Univariate is a sequence of measurements of the same variable collected over (usually regular intervals of) time. Usual assumption in many time series techniques is that

More information

Advances in promotional modelling and analytics

Advances in promotional modelling and analytics Advances in promotional modelling and analytics High School of Economics St. Petersburg 25 May 2016 Nikolaos Kourentzes n.kourentzes@lancaster.ac.uk O u t l i n e 1. What is forecasting? 2. Forecasting,

More information

Exponential smoothing in the telecommunications data

Exponential smoothing in the telecommunications data Available online at www.sciencedirect.com International Journal of Forecasting 24 (2008) 170 174 www.elsevier.com/locate/ijforecast Exponential smoothing in the telecommunications data Everette S. Gardner

More information

10. Time series regression and forecasting

10. Time series regression and forecasting 10. Time series regression and forecasting Key feature of this section: Analysis of data on a single entity observed at multiple points in time (time series data) Typical research questions: What is the

More information

Page No. (and line no. if applicable):

Page No. (and line no. if applicable): COALITION/IEC (DAYMARK LOAD) - 1 COALITION/IEC (DAYMARK LOAD) 1 Tab and Daymark Load Forecast Page No. Page 3 Appendix: Review (and line no. if applicable): Topic: Price elasticity Sub Topic: Issue: Accuracy

More information

On Consistency of Tests for Stationarity in Autoregressive and Moving Average Models of Different Orders

On Consistency of Tests for Stationarity in Autoregressive and Moving Average Models of Different Orders American Journal of Theoretical and Applied Statistics 2016; 5(3): 146-153 http://www.sciencepublishinggroup.com/j/ajtas doi: 10.11648/j.ajtas.20160503.20 ISSN: 2326-8999 (Print); ISSN: 2326-9006 (Online)

More information

Inflation Revisited: New Evidence from Modified Unit Root Tests

Inflation Revisited: New Evidence from Modified Unit Root Tests 1 Inflation Revisited: New Evidence from Modified Unit Root Tests Walter Enders and Yu Liu * University of Alabama in Tuscaloosa and University of Texas at El Paso Abstract: We propose a simple modification

More information

Introduction to Regression Analysis. Dr. Devlina Chatterjee 11 th August, 2017

Introduction to Regression Analysis. Dr. Devlina Chatterjee 11 th August, 2017 Introduction to Regression Analysis Dr. Devlina Chatterjee 11 th August, 2017 What is regression analysis? Regression analysis is a statistical technique for studying linear relationships. One dependent

More information

Lab: Box-Jenkins Methodology - US Wholesale Price Indicator

Lab: Box-Jenkins Methodology - US Wholesale Price Indicator Lab: Box-Jenkins Methodology - US Wholesale Price Indicator In this lab we explore the Box-Jenkins methodology by applying it to a time-series data set comprising quarterly observations of the US Wholesale

More information