TRANSFER FUNCTION MODEL FOR GLOSS PREDICTION OF COATED ALUMINUM USING THE ARIMA PROCEDURE

Mozammel H. Khan
Kuwait Institute for Scientific Research

Introduction

The objective of this work was to investigate methods for using available test data to predict the percentage gloss values of coated aluminum exposed to the open environment. Because of the uncertainty of the weathering data and the lack of consensus over the predictive value of accelerated laboratory testing, mathematical modeling techniques were sought that would fit the existing test data and predict the gloss value at any specified future time. A model, in this case, is an algebraic statement of how the gloss value is statistically related to time and to other pertinent weathering variables such as temperature and relative humidity.

There are two distinct modeling techniques:

1. Regression Analysis
2. Time Series Analysis

Regression expresses the dependence of one variable on another at the same time. Time series modeling, on the other hand, expresses the dependence of a variable on itself at different times. Regression also requires that the observations be statistically independent. Aluminum coating degradation data do not fall into this category, since the gloss value at the present time step also depends on the gloss values at previous time steps.

The principal difference between the two systems stems from the fact that the regression system is static whereas the time series system is dynamic. A disturbance $\varepsilon_t$ entering a regression system at time $t$ affects only $Y_t$ but not $Y_{t+1}$. By the time the system response proceeds from $t$ to $t+1$, the disturbance $\varepsilon_t$ is forgotten; thus the system has no memory or dynamics. In a time series system, on the other hand, a disturbance $a_t$ affecting the system is remembered and continues to affect the system at subsequent times.
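The contrast between a static and a dynamic system can be illustrated with a minimal Python sketch. It assumes a first-order autoregressive response with a hypothetical coefficient `phi = 0.6` (an illustrative value, not one estimated in this paper) and traces how a single unit disturbance is carried forward in time.

```python
def shock_response(phi, n_steps):
    """Effect of a unit disturbance at time t on the output at t, t+1, ...

    phi = 0 mimics the static regression case (no memory);
    0 < |phi| < 1 mimics a stationary first-order dynamic system.
    """
    effects = []
    effect = 1.0  # the disturbance enters with full weight at time t
    for _ in range(n_steps):
        effects.append(effect)
        effect *= phi  # a fraction phi is carried into the next time step
    return effects


static = shock_response(0.0, 4)   # hypothetical static system
dynamic = shock_response(0.6, 4)  # hypothetical dynamic system
print(static)
print(dynamic)
```

In the static case the disturbance affects only the first time step; in the dynamic case its effect decays geometrically but never vanishes after one step.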
In this paper, univariate time series analysis was first applied to the gloss degradation data, and the Box-Jenkins transfer function methodology was then applied to forecast the gloss value. A multiple-input transfer function was specified to account explicitly for the effect of weather conditions on the gloss value.

Univariate Time Series Analysis

Time series data are observations on a variable that occur in a time sequence. The phrase "time series analysis" is used in several ways. Sometimes it refers to any kind of analysis involving time series data. At other times it is used more narrowly to describe attempts to explain the behavior of time series data using only past observations of the variable in question. This activity is referred to as single-series or univariate analysis. The underlying assumption is that the time-sequenced observations in a data series may be statistically dependent, as opposed to traditional regression analysis, where the observations within a single data series are assumed to be statistically independent.

The most widely used univariate time series approach is popularly known as the Box-Jenkins model, named after Box and Jenkins, who developed the strategy. Box-Jenkins models are also often referred to as Auto-Regressive Integrated Moving Average (ARIMA) models. They deal only with data measured at equally spaced, discrete time intervals. The model may be Auto-Regressive (AR), Moving Average (MA), or a combination of both.

One of the most stringent requirements of ARIMA modeling is that the data series be stationary. A stationary time series has a mean, variance, and autocorrelation function that are essentially constant through time. The stationarity requirement may seem quite restrictive. However, most nonstationary series that arise in practice can be transformed into stationary series through an appropriate degree of differencing. First differences are the value at $t$ minus the value at $t-1$. If the first differences do not have a constant mean, second differencing is done, which is the difference of the first differences. After a differenced series has been modeled, it is integrated $d$ times to return the model to the scale of the original data. A generalized Box-Jenkins model is then represented as ARIMA(p, d, q), where p is the order of the autoregressive part, q is the order of the moving average part, and d is the degree of differencing.

Univariate Gloss Performance

The measurements of aluminum gloss retention have been taken monthly since January 1985. Gloss values, as percentages, were recorded at the end of every month, while the maximum and minimum temperatures and relative humidities were recorded daily. The monthly average maximum and minimum temperatures and relative humidities were subsequently calculated. To develop the univariate time series model, only the gloss values and the time of exposure, in uniform time steps of one month, were considered.

Although it is obvious that the mean of the series is not stationary, the autocorrelation function (acf) and partial autocorrelation function (pacf) of the undifferenced data were estimated using the SAS/ETS ARIMA procedure. The estimated autocorrelations decay slowly; they do not cross the zero line even by the tenth lag. This supports the observation that the series is nonstationary. A check of this is to estimate an autoregressive model of order 1, AR(1). This model is implied by the decaying acf and the single significant pacf spike at lag 1. Estimation, although unstable, yields a $\phi_1$ value of 1, which confirms the nonstationary character of the data.
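Differencing, integration, and the sample autocorrelation function can be sketched in a few lines of plain Python. This is an illustrative sketch, not the computation performed by the ARIMA procedure; the function names are hypothetical, and the sample data are simply the twelve actual gloss values listed in Table 2 (time steps 37 to 48), used here only as convenient input.

```python
def difference(series, lag=1):
    """Return the lag-differenced series: w_t = z_t - z_{t-lag}."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def integrate(diffs, start):
    """Undo one round of first differencing, given the first original value."""
    out = [start]
    for d in diffs:
        out.append(out[-1] + d)
    return out


def sample_acf(series, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag about the series mean."""
    n = len(series)
    mean = sum(series) / n
    dev = [z - mean for z in series]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
            for k in range(max_lag + 1)]


# Actual gloss values for time steps 37-48, taken from Table 2.
gloss = [53.0, 52.0, 51.0, 49.5, 48.0, 47.0, 45.5, 44.0, 43.0, 42.0, 41.0, 39.5]

first_diff = difference(gloss)               # one value fewer than the input
restored = integrate(first_diff, gloss[0])   # recovers the original levels
acf_levels = sample_acf(gloss, 3)            # acf of the undifferenced values
print(first_diff)
print(acf_levels)
```

Second differencing is `difference(difference(series))`, and a seasonal difference of period 12 is `difference(series, lag=12)`.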
First differencing of the series is clearly needed. Table 1 shows the estimated acf and pacf of the first differences. The estimated acf suggests that the first differences are stationary and can be represented by an AR(1) model: spikes that tail off toward zero in the estimated acf indicate that an AR model is appropriate. The estimated pacf is consistent with an AR(1) model: pure AR models of order one are typically associated with pacfs that cut off to zero after lag 1, and the estimated pacf in Table 1 displays this behavior.

From the preceding analysis an ARIMA(1,1,0) model was tentatively selected. Estimation results and the residual acf appear in Table 1. The univariate model, in backshift notation, can then be represented as follows:

$$(1 - 0.597B)(1 - B)\,\mathrm{GLOSS}_t = -0.392 + a_t$$

All indications are that the AR(1) model is satisfactory. Both $\mu$ and $\phi_1$ are significant, judging by their large absolute t-values, and the stationarity condition $|\phi_1| < 1$ is satisfied. The residual acf in Table 1 supports the hypothesis that the shocks $a_t$ of the univariate model are independent. No absolute correlation value in the residual acf exceeds the relevant practical warning level (twice the standard error), and the chi-squared statistic is not significant even at the 10% level.

Transfer Function Model

A univariate model expresses a single dependent or output variable as a function of its own history and previous errors. A transfer function model, on the other hand, may have one or more inputs that possibly affect the system. The dynamic characteristics of a system can be understood explicitly only through a transfer function model. The dynamic nature of the transfer function relationship lies in its ability to account for the instantaneous and lagged effects of an input variable on the output variable.
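To make "instantaneous and lagged effects" concrete, the following minimal Python sketch passes an input series through a first-order transfer relationship. The function name and the values of `omega0`, `delta1`, and `delay` are hypothetical and chosen only for illustration; they are not estimates from this paper.

```python
def transfer_response(x, omega0, delta1, delay):
    """Output of y_t = delta1 * y_{t-1} + omega0 * x_{t-delay}.

    Values before the start of the series are taken as zero.
    omega0 scales the input, delta1 carries past output forward,
    and delay is the number of steps before the input first acts.
    """
    y = []
    for t in range(len(x)):
        previous_output = y[t - 1] if t >= 1 else 0.0
        delayed_input = x[t - delay] if t >= delay else 0.0
        y.append(delta1 * previous_output + omega0 * delayed_input)
    return y


pulse = [1.0, 0.0, 0.0, 0.0, 0.0]  # a single unit pulse in the input

# Hypothetical parameters: the input acts after one step and then decays.
lagged = transfer_response(pulse, omega0=2.0, delta1=0.5, delay=1)
# With no delay and no carry-over the effect is purely instantaneous.
instantaneous = transfer_response(pulse, omega0=2.0, delta1=0.0, delay=0)
print(lagged)         # [0.0, 2.0, 1.0, 0.5, 0.25]
print(instantaneous)  # [2.0, 0.0, 0.0, 0.0, 0.0]
```

The second call corresponds to an ordinary regression coefficient applied at the same time step, which is the special case with no lagged effects.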
This relationship can be used to improve the forecast of the output variable, and it also allows the model to be used for simulation analysis by plugging in alternate forecasts of the input series.

When simultaneous pairs of observations $(X_1, Y_1), (X_2, Y_2), \ldots, (X_N, Y_N)$ of the input and output variables are available at discrete, equispaced times $1, 2, \ldots, N$, a discrete linear transfer function model can be written as

$$(1 - \delta_1 B - \cdots - \delta_r B^r)\, Y_t = (\omega_0 - \omega_1 B - \cdots - \omega_s B^s)\, X_{t-b}$$

or

$$\delta(B)\, Y_t = \omega(B)\, X_{t-b}$$

where $B$ is the backshift operator and $b$ is the delay parameter. This equation is referred to as a transfer function model of order (r, s). If the system is affected by noise $N_t$, then the combined transfer function-noise model may be written as

$$Y_t = \delta^{-1}(B)\,\omega(B)\,X_{t-b} + N_t$$

The objective of the identification stage is to obtain some idea of the orders r and s of the transfer function model and to derive initial guesses for the parameters $\delta$ and $\omega$ and the delay parameter $b$. In the same way that the autocorrelation function is used to identify the p, d, q parameters of the univariate model, the r, s and b parameters of the transfer function model are identified from the cross correlation between the input and the output. Following Box-Jenkins, the whole process of identification, estimation, diagnostic checking and forecasting can be outlined as follows:

1. A univariate model is identified for each of the input variables being used.
2. The output series is next prewhitened using the univariate model for each input series.
3. The cross correlation between the prewhitened output series and each corresponding input series is calculated.
4. Based on the values and pattern of the cross correlation function (as with the acf in the univariate model), the r, s and b parameters are tentatively identified and estimated for the transfer function between each input series and the output series. The transfer functions for all the input series are then combined into a single model and the transfer function parameters are estimated.
5. The transfer function parameter estimates can be checked for significance and, if necessary, re-specified and re-estimated.
6. After the form of the transfer function has been determined, the residuals are examined to identify an appropriate stochastic model for the noise component.
7. The parameters of the complete transfer function-noise model are then re-estimated.
8. Diagnostic checks may be performed on the full model to determine its adequacy.
9. If the fitted model proves to be inadequate, then the identification and estimation stages must be repeated for the transfer function model, the noise model, or both.
10. If, however, the fitted model is found to be adequate based on the diagnostic checking, then it can be used in forecasting.

Transfer Function Model Results

The temperature and humidity data were used to construct univariate models for each of the inputs. To reduce the variance, the temperature and humidity data were subjected to a logarithmic transformation as follows:

Temperature Effect: $TE = \ln\left[(T_{max} + T_{min})/2\right]$
Humidity Effect: $HE = \ln\left[(RH_{max} + RH_{min})/2\right]$

where $T_{max}$ is the monthly mean maximum temperature, $T_{min}$ is the monthly mean minimum temperature, $RH_{max}$ is the monthly mean maximum relative humidity and $RH_{min}$ is the monthly mean minimum relative humidity.

Univariate model building for both TE and HE was carried out following the procedures outlined earlier for univariate time series analysis. Both series needed first-degree seasonal differencing to remove seasonal nonstationarity. Based on the acf and pacf of the differenced series, the following autoregressive models were selected for TE and HE respectively:

[model equations missing in the source]

Diagnostic checks were carried out, and the above models were found to be adequate representations of the temperature and humidity data. Following the steps outlined for building a transfer function model, the transfer function parameters were identified from the cross correlation functions for each input series.
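The prewhitening and cross-correlation steps of the identification stage can be sketched as follows. This is a plain-Python illustration under stated assumptions, not the computation performed by the ARIMA procedure: the function names are hypothetical, the prewhitening filter is assumed to be a simple first-order autoregressive filter with a user-supplied coefficient, and the data are made up.

```python
import math


def ar1_filter(series, phi):
    """Prewhiten with an assumed AR(1) filter: e_t = z_t - phi * z_{t-1}."""
    return [series[t] - phi * series[t - 1] for t in range(1, len(series))]


def cross_correlation(x, y, max_lag):
    """Sample cross-correlation between x_t and y_{t+k}, for k = 0 .. max_lag."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    sd_x = math.sqrt(sum(d * d for d in dx) / n)
    sd_y = math.sqrt(sum(d * d for d in dy) / n)
    ccf = []
    for k in range(max_lag + 1):
        cov = sum(dx[t] * dy[t + k] for t in range(n - k)) / n
        ccf.append(cov / (sd_x * sd_y))
    return ccf


# Hypothetical prewhitened input, and an output that responds only at lag zero.
x = [0.3, -1.2, 0.8, 0.1, -0.5, 1.4, -0.9, 0.2, 0.7, -1.1]
y = [2.0 * v + 1.0 for v in x]

ccf = cross_correlation(x, y, 3)
print(ccf)  # ccf[0] is 1 up to rounding, since y is an exact linear function of x
```

In practice the same filter would be applied to the input and the output before the cross correlations are computed, for example `cross_correlation(ar1_filter(x_raw, phi), ar1_filter(y_raw, phi), max_lag)` with `phi` taken from the fitted input model.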

There was no overall delay for either of the inputs, since the gloss measurements were taken at the end of the month during which the temperature and humidity measurements were recorded. A single cross correlation spike at lag zero for both input series also suggested a single overall regression factor, with no lag effects of either the input series or the output series. The transfer function parameters for the model with combined inputs were estimated. The residuals from this model were investigated, and a first-order autoregressive model was identified for the noise portion of the model. The two models were then combined and the parameters were re-estimated, giving

$$(1 - B)\,\mathrm{GLOSS}_t = -1.258 + 2.541\,(1 - B^{12})\,\mathrm{TE}_t$$

All the parameters are significant, judging from their large absolute t-values. The residuals of the final model were examined: none of the residual autocorrelations had an absolute value even approaching the significance level, and the chi-squared statistics were not significant.

Forecasting Results

The residual values for both the univariate and the transfer function models are shown in Figure 1. The transfer function residuals are slightly smaller than those of the univariate model over almost the entire range of observed values. The adjusted Root Mean Square Error (RMSE) of the transfer function model is 0.87, compared with 1.01 for the univariate model, which indicates that the transfer function model fits the data more closely, although both models are equally parsimonious. Moreover, the model with the smaller RMSE tends to have a smaller forecast-error variance. Likewise, the Mean Absolute Percent Error (MAPE) for the transfer function model is 0.62, as opposed to 0.77 for the univariate model.

Forecasts for the twelve-month lead period, for which the actual data are already available, are shown in Table 2. In this case also, the transfer function model slightly outperforms the univariate model.
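The MAPE figures for the twelve-month forecast period can be recomputed from the actual values and residuals listed in Table 2. The sketch below assumes that MAPE is the mean of |residual / actual| expressed in percent; the paper does not state its formula, so this definition is an assumption. The adjusted RMSE is not recomputed here because the adjustment used is not described.

```python
# Actual gloss values and forecast residuals for time steps 37-48 (Table 2).
actual = [53.0, 52.0, 51.0, 49.5, 48.0, 47.0,
          45.5, 44.0, 43.0, 42.0, 41.0, 39.5]
univariate_resid = [0.29054, 0.25469, 0.32258, -0.04754, -0.38062, -0.19157,
                    -0.48929, -0.77911, -0.56421, -0.34649, -0.12708, -0.40666]
transfer_resid = [0.29880, 0.27222, 0.34809, -0.01585, -0.34437, -0.15206,
                  -0.44748, -0.73568, -0.51965, -0.30113, -0.08115, -0.36030]


def mape(actual_values, residuals):
    """Mean absolute percent error: mean of |residual / actual|, times 100."""
    ratios = [abs(r / a) for a, r in zip(actual_values, residuals)]
    return 100.0 * sum(ratios) / len(ratios)


mape_univariate = mape(actual, univariate_resid)
mape_transfer = mape(actual, transfer_resid)
print(round(mape_univariate, 2), round(mape_transfer, 2))  # prints 0.77 0.71
```

Under this assumed definition the results agree with the MAPE values of 0.77 (univariate) and 0.71 (transfer function) reported with Table 2.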
The adjusted RMSE for the transfer function model is 1.01, compared with 1.10 for the univariate model, and the MAPE is 0.71 for the former as opposed to 0.77 for the latter. As expected from the adjusted RMSE values, the standard (STD) error of the individual forecasts is lower for the transfer function model at every lead time except the first (Table 2).

Several points deserve emphasis regarding forecasts for lead times of more than one. After the first forecast, bootstrap forecasts are produced, since they are based at least partly on forecast gloss values and forecast input values rather than on observed ones. The last eleven forecasts in Table 2 are bootstrap forecasts. In this situation, the transfer function model offers a distinct advantage over the univariate model. The errors in the transfer function model can be decomposed into two components: model error and error in the input forecasts. The input forecast error can be reduced or eliminated if the transfer function model is used as a simulation model driven by alternative or actual inputs. In that case no error is attributed to the inputs; it is all part of the model error.

Conclusions

The Box-Jenkins univariate time series approach and the transfer function methodology have been found suitable for gloss degradation data. Temperature and humidity effects were the inputs to the transfer function model. Both models fit the observed data with reasonable accuracy, although the transfer function model has a slight edge over the univariate model in both fitting and forecasting accuracy. Forecasts can be made for any number of lead periods, although they are more reliable in the short term. This limitation can, however, be relaxed for the transfer function model, since this model can be used for simulation analysis with alternate input forecasts or actual inputs.
This aspect of the transfer function model offers a considerable advantage for gloss prediction when the actual values of temperature and humidity are known, or when alternate forecasts of these variables can be made for environments with distinctly different weather conditions.

SAS/ETS is a registered trademark of SAS Institute Inc., Cary, NC, USA.

References

(1) Box, G.E.P. and G.M. Jenkins. Time Series Analysis: Forecasting and Control, Revised Edition. San Francisco: Holden-Day, 1976.
(2) Davis, S. Predicting Prison Population Using the SAS/ETS Product. Proc. of the 9th SUGI Conference. Cary, NC: SAS Institute Inc., 1984. 1059 pp.
(3) Jacob, M.F. Residential Energy Forecasting: a Pragmatic Application of Box-Jenkins. Proc. of the 9th SUGI Conference. Cary, NC: SAS Institute Inc., 1984. 1059 pp.
(4) Pandit, S.M. and S.M. Wu. Time Series and System Analysis with Applications. New York: Wiley, 1983.
(5) Pankratz, A. Forecasting with Univariate Box-Jenkins Models: Concepts and Cases. New York: Wiley, 1983.
(6) Rehfeldt, T.K. Evaluation of Degradation Data by Time Series Analysis. Progress in Organic Coatings, 15 (1987) 261-268.
(7) SAS Institute Inc. SAS/ETS User's Guide, Version 5 Edition. Cary, NC: SAS Institute Inc., 1984. 738 pp.

TABLE 1. IDENTIFICATION AND ESTIMATION FOR DIFFERENCED GLOSS DATA
[ARIMA procedure output (acf and pacf plots by lag, conditional least squares estimates, residual acf) is not legible in the source.]

Figure 1. Model residuals versus time step (months) for the transfer function and univariate models.
[Plot not reproduced in the source.]

Table 2. Prediction Performance of Univariate and Transfer Function Models

TIME   ACTUAL   UNIVARIATE   TRANSFER    UNIVARIATE   TRANSFER    UNIVARIATE   TRANSFER
STEP   GLOSS    FORECAST     FUNCTION    RESIDUAL     FUNCTION    STD ERROR    FUNCTION
       VALUE                 FORECAST                 RESIDUAL                 STD ERROR
 37    53.0     52.7095      52.7012      0.29054      0.29880    0.70704      0.71500
 38    52.0     51.7453      51.7278      0.25469      0.27222    1.33256      1.32181
 39    51.0     50.6774      50.6519      0.32258      0.34809    1.91978      1.87927
 40    49.5     49.5475      49.5159     -0.04754     -0.01585    2.45664      2.38211
 41    48.0     48.3806      48.3444     -0.38062     -0.34437    2.94433      2.83517
 42    47.0     47.1916      47.1521     -0.19157     -0.15206    3.38833      3.24569
 43    45.5     45.9893      45.9475     -0.48929     -0.44748    3.79493      3.62068
 44    44.0     44.7791      44.7357     -0.77911     -0.73568    4.16993      3.96615
 45    43.0     43.5642      43.5197     -0.56421     -0.51965    4.51829      4.28700
 46    42.0     42.3465      42.3011     -0.34649     -0.30113    4.84412      4.58718
 47    41.0     41.1271      41.0811     -0.12708     -0.08115    5.15077      4.86983
 48    39.5     39.9067      39.8603     -0.40666     -0.36030    5.44096      5.13749

ADJUSTED RMSE(Transfer) = 1.01    ADJUSTED RMSE(Univariate) = 1.10
MAPE(Transfer) = 0.71             MAPE(Univariate) = 0.77

The author may be contacted at:
Computer Center
Kuwait Institute for Scientific Research
P.O. Box 24885 Safat, 13109 Safat, Kuwait
BITNET: TES258@KUKISROO