Sensors and Actuators B 44 (1997) 423–428

Multicomponent analysis on polluted waters by means of an electronic tongue

C. Di Natale a,*, A. Macagnano a, F. Davide a, A. D'Amico a, A. Legin b, Y. Vlasov b, A. Rudnitskaya b, B. Selezenev b

a Department of Electronic Engineering, University of Rome Tor Vergata, via della Ricerca Scientifica, 00133 Rome, Italy
b Department of Chemistry, University of St. Petersburg, St. Petersburg, Russia

Accepted 14 May 1997

Abstract

In this paper the simultaneous measurement of the concentrations of a number of chemical species in solution, performed by a sensor array of ion-sensitive electrodes, is presented and discussed. By analogy with the well-known electronic nose, this sensor array operating in solutions will be called here an electronic tongue. In order to extract optimized information from the electronic tongue output data, many different techniques have been applied; they were based on chemometrics, non-linear least squares and neural networks. The best results have been achieved by the introduction of modular models which make use, at the same time, of both qualitative and quantitative information. © 1997 Elsevier Science S.A.

Keywords: Electronic tongue; Multicomponent analysis; Chemometrics

1. Introduction

Environmental monitoring requires on-site simultaneous measurements of a number of different chemical species. A classic approach to this problem is to consider a number of selective sensors, one for each chemical species. In the presence of complex solutions, however, sensors tend to lose their specificity: their response is no longer directly related to the concentration of the species for which they have been designed, and it becomes influenced by the presence of the other species. This process can be represented as a sort of convolution between the sensor selectivity (namely the sensitivity to the species present in the environment) and the chemical pattern occurring in the environment under measurement.
Multicomponent analysis is an analytical procedure allowing the extraction of qualitative and quantitative information from an array of non-selective sensors. It is based on the utilization of an array of sensors matched with a suitable data analysis procedure. Multicomponent analysis provides a sensor array model from a calibration data-set, which should be large enough to cover the concentration range of each species and to cope with the non-linearity in the sensor responses. Many data analysis techniques can be utilized to disentangle the information from sensor array outputs; they can be grouped in four classes: chemometrics, artificial neural networks, non-linear least squares and some other exotic methods (such as genetic algorithms or multi-dimensional splines).

In past years, measurements of the concentrations of a number of chemical species in complex solutions have been attempted [1–3]. By analogy with natural olfaction, where sensor arrays operating in air are labelled as electronic noses, sensor arrays operating in liquid media will be called here electronic tongues.

In this paper the measurement of quantities relevant for pollution monitoring of internal waters by means of an array of chemically sensitive electrodes is presented and discussed. To this end, samples of water from the Neva river (a river flowing through the town of St. Petersburg) have been artificially polluted with ionic metals in order to simulate generic industrial waste pollution.

* Corresponding author. Tel.: +39 6 7254408; fax: +39 6 2020519; e-mail: dinatale@eln.utovrm.it

0925-4005/97/$17.00 © 1997 Elsevier Science S.A. All rights reserved. PII S 0925-4005(97)00169-X
Different data analysis methods have been utilized in order to select the most fitting one. Substantial improvement in accuracy has been obtained with modular models that make use, at the same time, of qualitative and quantitative information. It is worth noting that regression methods generally do not make any use of qualitative information on the data, although an investigation of the data distribution may sometimes reveal the existence of different qualitative classes. In these cases it is reasonable to suppose that developing several regression models, one for each class, should perform better than a unique regression model holding over the whole concentration range. Data classification can be achieved in several ways; in this paper principal component analysis and self-organizing maps have been considered.

2. Experimental

To simulate real conditions, samples of Neva river water have been artificially polluted with ionic metals in order to reproduce generic industrial wastes. Neva river waters were taken at three different locations. In each sample the following elements were added in ionic form: Cu, Cd, Fe, Cr and Zn. The solutions were left at room temperature for one week to approach, as much as possible, an equilibrium state. The scope of the work was to measure the total concentrations of Cu, Cd, Fe, Cr, Zn, Cl, SO4 and H. It should be noted that no addition of Cl, SO4 and H was done, and that their concentrations are therefore those actually occurring in the Neva river at the sampling locations. The range of concentrations for each species is shown in Table 1.

Table 1
Ranges of the total concentrations of the eight chemical species in solution

Species   Concentration range (-log[c])
Cu        6–8
Cd        6–7
Zn        3–5
Cr        4–7
Fe        4–7
Cl        3–5.3
SO4       3–3.9
H         2.6–8

Samples have been measured by an electronic tongue formed by 22 electrodes, mainly based on variously doped chalcogenide glasses, together with conventional electrodes. The following chalcogenide glass systems have been used for the sensor preparation: AgI–Sb2S3, Ag2S–As2S3, GeS–GeS2–Ag2S and Ge–Sb–Se–Ag, with different component contents. Commercial copper-, lead- and cadmium-selective chalcogenide glass sensors were also incorporated. Glasses were prepared in evacuated quartz ampoules at 1000 K from high-purity components. Disks of 5–7 mm diameter and 3–5 mm thickness were cut and sealed into plastic tubes. Both liquid and solid-state inner contacts were used. The details of glass synthesis and sensor preparation are described in Ref. [4]. Commercial chalcogenide glass sensors were provided by Analytical Systems (St. Petersburg, Russia). Their compositions are summarized in Ref. [4]. A conventional crystalline sensor was used basically for chloride ion determination. Two sensors of every type have been used in the array.

In order to determine a robust regression model for the sensor array, about 150 chemical solutions of different concentrations were prepared and measured. All sensors, including those with a nominal ion selectivity, have shown a strong cross-selectivity that did not allow any direct measurement of concentrations.

3. Data analysis

In order to utilize the sensor array to retrieve an estimate of the concentrations of the relevant species in unknown samples, a calibration of the sensor array is required. For this purpose it is necessary to collect a calibration data-set of measurements performed at known environmental conditions and then carry out a data analysis procedure. As mentioned in the previous section, about 150 calibration measurements were performed.
The whole data-set has been split in two parts, one for calibration (to determine the model) and one to test the capability of the model to predict correct concentrations from unknown samples.

Many different approaches to the data analysis have been utilized. An extensive comparison of the performances of many techniques has been carried out in order to obtain the best possible estimation of the concentrations. The following methods were utilized: multiple linear regression (MLR), partial least squares (PLS), non-linear least squares (NLLS) and back-propagation neural network (BP NN).

Table 2
Mean relative absolute error (RAE) obtained analysing the data with various methods

Method   Cu (%)  Cd (%)  Zn (%)  Cr (%)  Fe (%)  Cl (%)  SO4 (%)  H (%)
MLR      40      25      47      208     73      6       11       11
PLS      7       5       7       15      8       2       4        4
NLLS     9       4       7       13      6       3       3        6
BP NN    7       5       7       15      7       1       1        4

It is worth noting that, apart from MLR, which is only a reference to evaluate the non-linearities, the other methods show about the same performances.

Fig. 1. PCA plot of concentration values in which the occurrence of two classes of data is observed.

The MLR approach has been utilized to evaluate how far from linearity the sensor response was. Electrodes operating in single-component solutions behave according to the Nernst law, and cross-selectivities are taken into account for mixed solutions by the Nikolskij extension. In solutions characterized by the presence of many different compounds a deviation from linearity is expected, even though models for the sensor response are not currently available. The errors found by linear modelling therefore give a measure of the extent of the deviation from linear behaviour.

PLS is a tool extensively utilized for quantitative analysis in chemometrics and in multisensor applications. Although it is based on a linear approach, it achieves results which are substantially better than those obtained by MLR. Nevertheless the non-linearities involved can sometimes be so large that different approaches, non-linear in character, are required.

From an analytical point of view the non-linear approach is represented by non-linear least squares. The sensor array is modelled as a system of non-linear equations where the number of equations is equal to the number of sensors and the variables are the concentrations. Since no theoretical models are available, each electrode has been fitted by a polynomial function of the third order:
\[ f(c_1,\dots,c_8) = k_0 + \sum_{j=1}^{8} \left( k_{1,j}\,c_j + k_{2,j}\,c_j^{2} + k_{3,j}\,c_j^{3} \right) \qquad (1) \]

The calibration of the array then consists in the estimation of 25 parameters for each electrode, for a total of 550 parameters. The whole array is represented by a vectorial function describing the relation between each sensor output and the concentrations of the eight species. Since each electrode is represented by Eq. (1), the array function is written as:

\[ \mathbf{F}(c_1,\dots,c_8) = \mathbf{k}_0 + \mathbf{K}\,\left( c_1,\dots,c_8,\; c_1^{2},\dots,c_8^{2},\; c_1^{3},\dots,c_8^{3} \right)^{T} \qquad (2) \]

where \mathbf{k}_0 collects the offsets k_0 of the 22 electrodes and \mathbf{K} is the matrix of the remaining coefficients, one row per electrode.

The inversion of this equation allows the concentrations of interest to be estimated from the output of the sensors. These operations (parameter estimation and array function inversion) can be performed using the Levenberg–Marquardt algorithm, which provides an iterative solution of redundant systems of non-linear equations [5].

The non-linear least squares (NLLS) approach requires much computational effort and, furthermore, the Levenberg–Marquardt algorithm can have convergence problems, which limit the accuracy of the estimations.

In recent years artificial neural networks have gained popularity for solving non-linear modelling problems. Their success is mainly due to the fact that, from the user's point of view, they can be utilized as a regression machine able to establish correlations between blocks of data. Among the neural networks able to perform regression, back-propagation based networks are currently widely utilized. Basically these networks provide non-linear models whose parameters are optimized by an algorithm whose convergence is strongly favoured by the particular arrangement of the network in feed-forward layers [6]. The network utilized here was a 22:15:8 feed-forward network where the neuron transfer function was the hyperbolic tangent.

MLR, PLS and NLLS were implemented in Mathematica. For the ANN the Professional II software tool by NeuralWare was utilized. All software ran on an Apple Power Macintosh 7500.
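As an illustration of Eq. (1), the short Python sketch below evaluates the third-order polynomial response of one electrode and checks the parameter count quoted in the text (25 per electrode, 550 for the 22-electrode array). It is only a sketch under stated assumptions: the function and constant names are hypothetical, the coefficient layout is one possible choice, and the original work was implemented in Mathematica, not Python.

```python
# Illustrative sketch of the polynomial electrode model of Eq. (1).
# All names below are hypothetical; they do not come from the original work.

N_SPECIES = 8      # Cu, Cd, Zn, Cr, Fe, Cl, SO4, H
N_ELECTRODES = 22  # size of the sensor array
ORDER = 3          # third-order polynomial


def electrode_response(c, k0, k):
    """Return k0 + sum_j (k[0][j]*c_j + k[1][j]*c_j**2 + k[2][j]*c_j**3).

    c  : sequence of the 8 concentration variables
    k0 : offset of the electrode
    k  : ORDER x N_SPECIES nested list of coefficients
    """
    if len(c) != N_SPECIES:
        raise ValueError("expected one value per species")
    total = k0
    for j, cj in enumerate(c):
        for p in range(ORDER):
            total += k[p][j] * cj ** (p + 1)
    return total


def parameters_per_electrode():
    """One offset plus ORDER coefficients for each species."""
    return 1 + ORDER * N_SPECIES


def array_response(c, offsets, coefficients):
    """Stack the responses of all electrodes, as in the array function."""
    return [electrode_response(c, k0, k) for k0, k in zip(offsets, coefficients)]


print(parameters_per_electrode())                 # parameters of one electrode
print(parameters_per_electrode() * N_ELECTRODES)  # parameters of the array
```

Because the model is linear in its coefficients, the calibration step alone could also be posed as an ordinary least-squares problem; the iterative Levenberg–Marquardt step is needed for the inversion, where the concentrations are the unknowns.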
PLS was implemented following the kernel approach described in Ref. [7]. The performances of the methods in predicting unknown concentrations have been evaluated through the percentage relative absolute error (RAE) in retrieving the concentrations in the test data-set. Results are listed in Table 2. Disregarding MLR which, as said before, gives only information on the extent of the non-linearities involved in the array, the other three methods exhibit similar performances for all the concentrations. It is remarkable that, although the results are comparable, the effort in terms of calculation and theoretical complication is rather different. Indeed PLS can be considered the least expensive technique: it can be implemented in any mathematically oriented programming language (such as Mathematica or Matlab) and the calculation is rather fast even on medium-sized personal computers. NLLS and BP NN, on the other hand, require longer calculation times and, due to their iterative nature, present convergence problems.

Fig. 2. Radar plot of the concentration ranges in the two classes. The main difference among the classes is represented by the value of pH.

A substantial improvement of the data analysis performances can be obtained by taking into consideration the qualitative character of the data. This can easily be done using PCA, a method widely utilized in chemometrics and in electronic nose data analysis to display multidimensional data in a sub-space formed by the principal components, namely those directions along which the variance of the data is maximum. Fig. 1 shows the PCA plot of the concentrations; the distribution of points reveals that the concentrations were not homogeneously distributed in the eight-dimensional space of concentrations but are rather clustered in two classes. The concentration range inside each class is shown as a radar plot in Fig. 2.
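The way a PCA projection can expose such a clustering may be sketched as follows. This is a minimal example on synthetic data, not the data of this work: the eight columns stand for the eight species, the two groups differ only in the last column (mimicking the pH split), and all names and numerical values are assumptions made for illustration.

```python
# Minimal PCA sketch on SYNTHETIC data (not the measurements of this work).
import numpy as np


def pca_scores(data, n_components=2):
    """Project mean-centred data on its first principal components.

    Returns the scores and the fraction of variance explained by each
    retained component.
    """
    centred = data - data.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, s, vt = np.linalg.svd(centred, full_matrices=False)
    scores = centred @ vt[:n_components].T
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]


rng = np.random.default_rng(0)
n = 20  # samples per synthetic class

# Two synthetic classes sharing the same spread in every variable, but with
# a different centre in the last column (the hypothetical pH-like variable).
low_class = 5.0 + rng.normal(0.0, 0.3, size=(n, 8))
low_class[:, 7] += -2.0
high_class = 5.0 + rng.normal(0.0, 0.3, size=(n, 8))
high_class[:, 7] += 2.2

scores, explained = pca_scores(np.vstack([low_class, high_class]))
pc1_low, pc1_high = scores[:n, 0], scores[n:, 0]
print("variance explained by PC1 and PC2:", explained)
```

In this synthetic case the first principal component is dominated by the variable that separates the two groups, so the sign of the first score is enough to tell the classes apart.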
The main difference between the two classes of concentrations lies in the pH value which, in one class, spans from 2.6 to 3.5 and, in the other, from 6.5 to 8. Fig. 3 shows the PCA plot of the sensor outputs. It is clear that, from a qualitative point of view, the sensors are able to discriminate between the two classes. This circumstance suggests that it should be possible to obtain a better performing sensor array model by including this qualitative information and using one sub-model for each qualitative class.

Fig. 3. PCA plot of the sensor outputs. The electronic tongue correctly classifies the data in two classes.

Fig. 4. SOM lattice plot of the sensor outputs. A clear distinction of the two classes is observed.
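The idea of one sub-model per qualitative class can be sketched in a few lines of Python. The example is deliberately simplified and rests on several assumptions: the data are synthetic and noise-free, the classifier is the sign of the first principal component score, and ordinary least squares stands in for the PLS blocks. The class and function names are hypothetical.

```python
# Sketch of a modular model: a classifier selects one of two regressors.
# Ordinary least squares is used here as a STAND-IN for PLS; data are synthetic.
import numpy as np


def fit_linear(X, y):
    """Least-squares fit of y = b0 + X @ b."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef


class ModularModel:
    """Classify on the first principal component, then regress per class."""

    def fit(self, X, y):
        self.mean_ = X.mean(axis=0)
        _, _, vt = np.linalg.svd(X - self.mean_, full_matrices=False)
        self.pc1_ = vt[0]
        labels = self.classify(X)
        self.models_ = {k: fit_linear(X[labels == k], y[labels == k])
                        for k in (0, 1)}
        return self

    def classify(self, X):
        # The side of the PC1 axis on which a sample falls gives its class.
        return ((X - self.mean_) @ self.pc1_ > 0).astype(int)

    def predict(self, X):
        labels = self.classify(X)
        y_hat = np.empty(len(X))
        for k, coef in self.models_.items():
            mask = labels == k
            y_hat[mask] = predict_linear(coef, X[mask])
        return y_hat


# Synthetic two-class data: the second feature marks the class, and the
# relation between the first feature and the target differs between classes.
s1 = np.linspace(0.0, 1.0, 20)
X = np.vstack([np.column_stack([s1, np.zeros(20)]),
               np.column_stack([s1, np.full(20, 5.0)])])
y = np.concatenate([2.0 * s1, -3.0 * s1 + 4.0])

global_err = np.abs(predict_linear(fit_linear(X, y), X) - y).mean()
modular_err = np.abs(ModularModel().fit(X, y).predict(X) - y).mean()
print("single model error:", global_err, "modular model error:", modular_err)
```

On this toy problem a single linear model must compromise between the two slopes, whereas each sub-model fits its own class. The size of the gain on real data depends on how strongly the sensor response differs between the classes.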
Two modular models have been considered and compared. The first was based on chemometrics blocks, composed of one PCA and two PLS blocks, and the second was based on neural networks, using a Self Organizing Map (SOM) and two BP NN [8]. In the neural network model a SOM is utilized as a classifier. Besides its many features [9,10], SOM can be considered as a data modelling tool which gives a two-dimensional representation. It has been demonstrated that SOM is a neural implementation of a sort of principal curve analysis. Fig. 4 shows the SOM lattice plot of the sensor outputs; as in PCA, the two classes are clearly separated. Fig. 5 shows the architecture of the two modular models, while their results, expressed by the mean RAE, are listed in Table 3. Modular models strongly reduce the error in the concentration estimates with respect to the corresponding models operating without the qualitative information. Furthermore, the modular model formed by two kinds of neural networks behaves better than that based on chemometrics blocks.

Fig. 5. Architecture of the two modular models. On the left side the chemometrics-based one is shown. It is formed by a PCA layer as classifier and by two PLS blocks, one for each class. Depending on the side of the PCA plot plane on which a datum falls, one of the two blocks is activated. On the right side the neural network based model is represented. In this case the classification is accomplished by a SOM while two BP NN produce the estimation of the concentrations.

4. Conclusions

The electronic tongue, developed in the St. Petersburg–Rome collaboration, has been shown to be a useful tool for simultaneous measurement of several species in environmental applications, such as the waters of a river. The accuracy of the measurement can be improved by choosing the proper data analysis technique.
A comparison among several methods, based on conceptually different approaches, shows that the performances obtained by PLS, NLLS and BP NN are basically similar. A decisive improvement of the predictions is obtained by enlarging the amount of information considered, taking into account also the qualitative aspects of the data. In this regard two different modular models, based on chemometrics (PCA+PLS) and neural networks (SOM+BP NN), have been introduced. It has been shown that these models significantly improve the accuracy of the estimations with respect to models operating without qualitative information.

Table 3
Mean RAE obtained by modular models

Model       Cu (%)  Cd (%)  Zn (%)  Cr (%)  Fe (%)  Cl (%)  SO4 (%)  H (%)
PCA+PLS     4       2       3       6       4       1       2        2
SOM+BP NN   2       1       1       3       1       1       1        1

A gain of a factor of two with respect to the values listed in Table 2 is obtained for the PCA+PLS model, while more than a factor of four is reached by the SOM+BP NN model.
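The error figure used in Tables 2 and 3 can be illustrated with a small Python sketch. The exact formula is not spelled out in the text, so the definition below (mean of |estimate − true| / |true|, in percent) is an assumption, as are the function names. The numerical rows are the PLS and BP NN values as read from Table 2 and the modular-model values as read from Table 3.

```python
# Sketch of a mean relative absolute error (RAE) and of the gain between
# two sets of RAE values. The RAE formula is an ASSUMED definition.

def mean_rae_percent(true_values, estimates):
    """Mean of |estimate - true| / |true|, expressed in percent."""
    if not true_values or len(true_values) != len(estimates):
        raise ValueError("need two non-empty sequences of equal length")
    total = 0.0
    for t, e in zip(true_values, estimates):
        total += abs(e - t) / abs(t)
    return 100.0 * total / len(true_values)


def mean_gain(before, after):
    """Average ratio between the errors before and after an improvement."""
    ratios = [b / a for b, a in zip(before, after)]
    return sum(ratios) / len(ratios)


# RAE values (%) for Cu, Cd, Zn, Cr, Fe, Cl, SO4, H as read from the tables.
PLS = [7, 5, 7, 15, 8, 2, 4, 4]
BP_NN = [7, 5, 7, 15, 7, 1, 1, 4]
PCA_PLS = [4, 2, 3, 6, 4, 1, 2, 2]
SOM_BP_NN = [2, 1, 1, 3, 1, 1, 1, 1]

gain_chemometrics = mean_gain(PLS, PCA_PLS)
gain_neural = mean_gain(BP_NN, SOM_BP_NN)
print(gain_chemometrics, gain_neural)
```

With these values the average gain is about 2.1 for the chemometrics-based modular model and about 4.2 for the neural one, in line with the factors quoted in the caption of Table 3. Averaging the per-species ratios is only one possible way of summarizing the gain.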
References

[1] C. Di Natale, F. Davide, A. D'Amico, A. Legin, Y. Vlasov, Multicomponent analysis of heavy metal cations and inorganic anions in liquids by a non-selective chalcogenide glass sensor array, Sensors and Actuators B 34 (1996) 539–542.
[2] M. Otto, J.D.R. Thomas, Model studies on multiple channel analysis of free magnesium, calcium, sodium and potassium at physiological concentration levels with ion-selective electrode, Anal. Chem. 57 (1985) 2647.
[3] M. Bos, A. Bos, W.E. Van der Linden, Processing of signals from ion selective electrode array by a neural network, Anal. Chem. 233 (1990) 31.
[4] Yu.G. Vlasov, E.A. Bychkov, Chalcogenide glass ion-selective electrodes, Ion Selective Electrode Rev. 9 (1987) 5.
[5] C. Di Natale, F. Davide, in: A. D'Amico and G. Sberveglieri (Eds.), Proceedings of the 1st European School on Sensors, World Scientific, Singapore, 1995.
[6] D.E. Rumelhart, J.L. McClelland (Eds.), Parallel Distributed Processing: Explorations in the Microstructures of Cognition, vol. 1, Foundations, MIT Press, Cambridge MA, 1986.
[7] F. Lindgren, P. Geladi, S. Wold, A kernel algorithm for PLS, J. Chemometrics 7 (1993) 45.
[8] C. Di Natale, F. Davide, A. D'Amico, W. Göpel, U. Weimar, Sensor arrays calibration with enhanced neural networks, Sensors and Actuators B 19 (1994) 654–657.
[9] T. Kohonen, Self Organizing Map, Springer-Verlag, Berlin, 1995.
[10] F. Davide, C. Di Natale, A. D'Amico, Self organizing multisensor systems for odour classification: internal categorization, adaptation and drift rejection, Sensors and Actuators B 18 (1994) 244–258.