Forecasting SO2 air pollution in Salamanca, Mexico, using an ADALINE

Innovative Production Machines and Systems, D.T. Pham, E.E. Eldukhri and A.J. Soroka (eds), 2008 MEC. Cardiff University, UK.

M.G. Cortina (a), U.S. Mendoza (a), J.M. Barrón-Adame (a), D. Andina (b), A. Vega-Corona (a)

(a) Universidad de Guanajuato, Facultad de Ingeniería Mecánica, Eléctrica y Electrónica, México.
(b) Universidad Politécnica de Madrid, ETSI Telecomunicación, España.

Abstract

A comparison between a linear regression model and a non-linear regression model is presented in this work for forecasting pollution levels due to SO2 in Salamanca city, Gto. Prediction is performed by means of an Adaptive Linear Neural Network (ADALINE) and a Generalized Regression Neural Network (GRNN). Prediction experiments are carried out for 1, 12 and 24 hours in advance, and the results for linear regression have been satisfactory. The performance of both models is estimated using the Root Mean Squared Error (RMSE) and the Mean Absolute Error (MAE), and the results obtained are compared. The final results indicate that ADALINE outperforms the earlier approach using GRNN.

Keywords: ADALINE, GRNN, SO2 concentration

1. Introduction

Salamanca is catalogued as one of the most polluted cities in Mexico. The main causes of pollution in Salamanca are fixed emission sources such as the chemical industry and electricity generation. The most important air pollutants are sulphur dioxide (SO2), measured in parts per billion (PPB), and particulate matter less than 10 micrometers in diameter (PM10), measured in micrograms per cubic metre. This article focuses on forecasting SO2 concentration.

In an effort to fight pollution in the zone, the Program of Environmental Contingency was launched in July 2005 with the purpose of protecting the health of the population, especially that of vulnerable groups. This program contemplates the urgent and immediate reduction of SO2 and PM10 emissions when measurements of these pollutants register levels above those established by the health authorities.
To accomplish this, three phases were established: Pre-contingency, Contingency Phase I and Contingency Phase II, for sulphur dioxide, for PM10 particles and for a combination of both [1].

Prediction of pollutant concentrations in the atmosphere would allow preventive measures to be taken, reducing the emission of pollutants before the levels of an environmental contingency are reached. In this work, the use of an ADALINE (ADA) neural network is proposed to predict pollution levels 1, 12 and 24 hours in advance for the zone of Salamanca before an environmental contingency occurs, and the results obtained are compared with those obtained with a Generalized Regression Neural Network (GRNN) [2].

2. Methodology

Figure 1 shows the flow diagram of the methodology followed in this work, which consists of three main phases: i) selection of training and test data sets, ii) neural network design and iii) simulation and evaluation of results.

Fig. 1. Flow Diagram for SO2

2.1. Training and Test Data Sets

The data base used for this experiment was previously processed as in [2]. The data selected to train the net are those corresponding to the months of January and February 2005, and the data selected to make the predictions are those of March 2005.

2.2. Neural Network Design

ADA is a generalisation of the perceptron training algorithm. The main functional difference from the perceptron training rules is the way the output of the system is used in the learning rule. The ADA transfer function is a linear function instead of the hard-limit transfer function of the perceptron. ADA and its multiple version, Madaline (MADA), use a learning mechanism known as the Delta Rule of Widrow and Hoff, also known as the Least Mean Square (LMS) rule [3], based on the search for the minimum error between the desired output and the linear output obtained.

2.2.1 Network Structure

In general terms, the output function of the network is given by equation (1)

$$a = W^{T} p \qquad (1)$$

where $a$ is the output vector of the linear neurons, $W$ is the weight matrix, and $p$ is the input vector.

2.2.2 Learning Rule

ADA is a supervised learning network that needs a priori knowledge of the values associated with each input. Its learning rule is the Widrow-Hoff rule, also known as LMS, which in general terms is expressed as indicated in equation (2)

$$W(k+1) = W(k) + 2\alpha\, e(k)\, p^{T}(k) \qquad (2)$$

where $k$ represents the current iteration of the weight-updating process, $W(k+1)$ is the next value that $W$ is going to take, and $W(k)$ is the current weight vector; $e$ is the vector of current error, defined as the difference between the desired response and the network output given by equation (1); and $\alpha$ is the learning rate. The gain updating process is given by equation (3)

$$b(k+1) = b(k) + 2\alpha\, e(k) \qquad (3)$$

where $e$ is the vector representing the error, $b(k+1)$ is the updated gain vector, and $b(k)$ is the current gain vector.
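The update rules in equations (2) and (3) can be illustrated with a minimal Python sketch for a single linear neuron. This is not the authors' implementation: the function names, the learning rate value, the number of epochs and the toy data are all assumptions made for illustration, and the learning rate is written `alpha` here.

```python
from typing import List, Sequence, Tuple


def adaline_output(weights: Sequence[float], bias: float, p: Sequence[float]) -> float:
    """Linear (ADALINE) output of one neuron: a = w . p + b."""
    return sum(w * x for w, x in zip(weights, p)) + bias


def lms_train(
    patterns: Sequence[Sequence[float]],
    targets: Sequence[float],
    alpha: float = 0.05,  # learning rate (assumed value, not from the paper)
    epochs: int = 1000,   # passes over the training set (assumed value)
) -> Tuple[List[float], float]:
    """Train one linear neuron with the Widrow-Hoff (LMS) rule."""
    weights = [0.0] * len(patterns[0])
    bias = 0.0
    for _ in range(epochs):
        for p, t in zip(patterns, targets):
            # Error = desired response minus current linear output.
            e = t - adaline_output(weights, bias, p)
            # Equation (2): w(k+1) = w(k) + 2 * alpha * e(k) * p(k)
            weights = [w + 2.0 * alpha * e * x for w, x in zip(weights, p)]
            # Equation (3): b(k+1) = b(k) + 2 * alpha * e(k)
            bias += 2.0 * alpha * e
    return weights, bias


# Toy data generated by t = 2*x1 - x2 + 0.5 (illustrative only, not SO2 data).
toy_patterns = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_targets = [0.5, 2.5, -0.5, 1.5]
w, b = lms_train(toy_patterns, toy_targets)
```

On this noise-free toy problem the weights approach 2 and -1 and the gain approaches 0.5. With real concentration data the learning rate would need to be chosen relative to the magnitude of the inputs, since too large a value makes the updates diverge.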

2.3. Training and Simulation of the net

The number of input neurons is equal to the number of observations. In this work there are 1344 input neurons, which is the total number of observations in the training period (January and February 2005). The test group consists of 672 observations corresponding to March 2005. The training vectors were built only from SO2 concentration levels; that is, no other variables were used to perform the forecast. Vector X is the input vector at times $t=i-n-1, \ldots, t=i-1, t=i$, where $i$ is the current hour and $n$ is the number of hours forecast. Vector Y is the output vector, whose elements correspond to the estimated SO2 levels at times $t=i+1, t=i+2, \ldots, t=i+n$.

Training and simulation for ADA and GRNN were performed using two different pattern schemes, since the scheme used in [2] produces an apparent time shift in the prediction made by ADA, due to how the patterns for the training and test matrices were formed. The second scheme was introduced to correct this apparent time shift in the forecast of the ADALINE network. The schemes were named ADA I and ADA II.

In Training Scheme I (ADA I), input patterns are formed as indicated in [2]: at time $t=i$, each pattern is $X = \{x_{i-n-1}, \ldots, x_{i-1}, x_i\}$, where $x_i$ is the SO2 concentration in the current hour and $n$ is the number of hours forecast. This means that input patterns are formed with the current and past concentrations. However, the first pattern was formed by all zeros, since no a priori information was available for the first data point. Output training patterns are formed with the concentrations of the following hours, $Y = \{y_{i+1}, y_{i+2}, \ldots, y_{i+n}\}$.

In Training Scheme II (ADA II), patterns were formed as in ADA I, with the difference that ADA II $x_1$ is equal to ADA I $x_2$, ADA II $x_2$ is equal to ADA I $x_3$, and so on. Another difference is that ADA I was formed with N patterns and ADA II with N-1 patterns, which the structure of the patterns makes necessary.
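The text does not fully specify how input and output patterns are indexed, so the following Python sketch shows only one possible reading of the two schemes, not the authors' code. It assumes that Scheme I places an all-zero pattern in front of the sliding windows, so that every input lags its target by one pattern, and that Scheme II drops that zero pattern and re-pairs the remaining N-1 inputs with the first N-1 targets. The function names and the parameters `n_lags` and `horizon` are hypothetical.

```python
from typing import List, Sequence, Tuple

Pattern = List[float]


def sliding_windows(
    series: Sequence[float], n_lags: int, horizon: int
) -> Tuple[List[Pattern], List[Pattern]]:
    """Aligned pairs: a window ending at hour i and the next `horizon` hours."""
    inputs: List[Pattern] = []
    targets: List[Pattern] = []
    for i in range(n_lags - 1, len(series) - horizon):
        inputs.append(list(series[i - n_lags + 1 : i + 1]))
        targets.append(list(series[i + 1 : i + 1 + horizon]))
    return inputs, targets


def scheme_one(
    series: Sequence[float], n_lags: int, horizon: int
) -> Tuple[List[Pattern], List[Pattern]]:
    """Assumed Scheme I: all-zero first input, so inputs lag targets by one pattern."""
    windows, targets = sliding_windows(series, n_lags, horizon)
    inputs = [[0.0] * n_lags] + windows[:-1]
    return inputs, targets  # N patterns


def scheme_two(
    inputs_one: List[Pattern], targets_one: List[Pattern]
) -> Tuple[List[Pattern], List[Pattern]]:
    """Assumed Scheme II: input k is Scheme I input k+1, giving N-1 patterns."""
    return inputs_one[1:], targets_one[:-1]


# Illustrative series only (not SO2 measurements).
series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x_one, y_one = scheme_one(series, n_lags=2, horizon=1)
x_two, y_two = scheme_two(x_one, y_one)
```

Under this reading, Scheme I pairs the window `[1.0, 2.0]` with the target `[4.0]`, leaving a gap of one hour, while Scheme II pairs it with `[3.0]`, the hour that immediately follows the window.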
Forecasting performance was evaluated using the Root Mean Squared Error (RMSE) and the Mean Absolute Error (MAE). Mean squared error is the most commonly used measure of success of numeric prediction, and the root mean squared error is its square root, taken to give it the same dimensions as the predicted values themselves. This measure exaggerates the prediction error (the difference between the predicted value and the actual value of a test case) of those test cases in which the prediction error is larger than in the others. If this number is significantly greater than the mean absolute error, it means that there are test cases in which the prediction error is significantly greater than the average prediction error. Balaguer et al. [4] have used RMSE as an indicator of the relationship between predicted and observed data. The Root Mean Squared Error is computed according to equation (4)

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^{2}} \qquad (4)$$

where $\hat{y}_i$ is the predicted value for a given time $t=i$, $y_i$ is the real value for the same time, and $n$ is the number of observations. The Mean Absolute Error is the average of the absolute differences over all test cases, and is computed according to equation (5)

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\hat{y}_i - y_i\right| \qquad (5)$$

where $\hat{y}_i$ is the predicted value for a given time $t=i$, $y_i$ is the real value for the same time, and $n$ is the number of observations.

3. Results

Table 1 shows the results obtained for schemes ADA I and ADA II, and for the GRNN proposed in [2]. The GRNN was trained with the Scheme I patterns.
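The two error measures of equations (4) and (5) can be computed as in the following minimal Python sketch. The function names and the example values are illustrative assumptions, not data from this work.

```python
import math
from typing import Sequence


def rmse(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Root Mean Squared Error, equation (4)."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)


def mae(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Mean Absolute Error, equation (5)."""
    n = len(observed)
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / n


# Hypothetical values in PPB, chosen only to show the effect of one large error.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 30.0, 50.0]
example_rmse = rmse(predicted, observed)  # square root of 27
example_mae = mae(predicted, observed)    # 3.5
```

Because squaring weights large errors more heavily, RMSE is never smaller than MAE on the same data, and the gap between the two grows when a few test cases are predicted much worse than the rest.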

Table 1
Performance of a GRNN against an ADA (errors in PPB)

Scheme   Measure   1 hour ahead   12 hours ahead   24 hours ahead
GRNN     RMSE      59.6           58.05            73.94
GRNN     MAE       34.67          43.72            53.1
ADA I    RMSE      19.92          38.84            140.00
ADA I    MAE       16.67          81.53            98.76
ADA II   RMSE      19.92          58.36            58.36
ADA II   MAE       16.67          36.13            36.13

The best results were achieved for the prediction of SO2 concentrations 1 hour ahead with both the GRNN and the ADA networks, which agrees with the results obtained by Mendoza [2] and Turias [5]. There is a significant improvement using the ADA II network, since both MAE and RMSE are much lower than those obtained with the GRNN. Results of the 1-hour prediction are shown in Figures 2 and 3.

Fig. 2. 1-hour forecasting with a GRNN.

Fig. 3. 1-hour forecasting with an ADA.

In Figure 5, for the prediction of SO2 levels 12 hours ahead with ADA I, the prediction apparently presents a time shift, which prevents satisfactory results from being obtained. This is due to the organization of the patterns in this scheme. The results obtained with ADA II were much better than those obtained with ADA I and the GRNN: comparing them in Table 1, both MAE and RMSE were reduced, and ADA II showed no time shift in the prediction of SO2. Figures 4, 5 and 6 show the results of the 12-hour-ahead prediction for the different networks that were used.

Fig. 4. 12-hours forecasting with a GRNN.

Fig. 5. 12-hours forecasting with an ADA I.

Fig. 6. 12-hours forecasting with an ADA II.

For the case of 24-hour prediction, again, the ADA II scheme showed a better performance than the GRNN and ADA I. Figures 7 and 9 show the results of the 24-hour prediction using the GRNN and ADA II. The results for ADA I are shown in Figure 8, where the forecast is also time-shifted, as in the 12-hour forecast.

Fig. 7. 24-hours forecasting with a GRNN.

Fig. 8. 24-hours forecasting with ADA I.

In both cases, the error increases as the number of hours forecast increases.

Fig. 9. 24-hours forecasting with ADA II.

The results obtained in forecasting SO2 concentration levels with ADA show that the pattern scheme plays an important role in obtaining acceptable results.

4. Conclusions

This work compares the performance of a linear regression neural network (ADALINE) and a non-linear regression network (GRNN) in forecasting SO2 concentration levels. One of the main differences is that a linear regression network needs fewer parameter adjustments than a non-linear regression network, which facilitates its implementation. However, to obtain better results with a linear regression network, it is necessary to search for a pattern scheme that reduces the error in the prediction of SO2 concentration levels. ADA II outperformed the GRNN in all cases, showing that an appropriate pattern scheme must be used. It has been shown that the use of a linear regression neural network improves the prediction of SO2 concentration levels, reducing the error obtained with a non-linear regression neural network.

References

[1] Instituto Estatal de Ecología. Calle Aldana N.12 esquina calle Republica, Pueblito de Rocha, c.p. 36040, Guanajuato, Gto.

[2] U.S. Mendoza-Camarena, F. Ambriz-Colin, D.M. Arteaga-Jauregui and A. Vega-Corona. SO2 concentrations forecasting for different hours in advance for the city of Salamanca, Gto., Mexico. 2005.

[3] Madan M. Gupta, Liang Jin and Noriyasu Homma. Static and Dynamic Neural Networks, 2003.

[4] Emili Balaguer Ballester, Emilio Soria Olivas and Jose Luis Carrasco Rodríguez. Forecasting of surface ozone concentrations 24 hours in advance using neural networks. International Conference on Neural Networks and Applications, 2001.

[5] I. Turias, F.J. Gonzalez and P.L. Galindo. Application of neural techniques to the modelling of time-series of atmospheric pollution data in the Campo de Gibraltar region. Neural Network Engineering Experiences, Proc. of the 8th International Conference on Application of Neural Network: 9-16, 2003.