Applied Mathematics and Computation 219 (2012)

Contents lists available at SciVerse ScienceDirect
Applied Mathematics and Computation
journal homepage: www.elsevier.com/locate/amc

Finding all real roots of nonlinear algebraic systems using neural networks

Konstantinos Goulianas (a), Athanasios Margaris (b,*), Miltiades Adamopoulos (c)
(a) TEI of Thessaloniki, Department of Informatics, Thessaloniki, Greece
(b) TEI of Larissa, Department of Computer Science and Telecommunications, Larissa, Greece
(c) University of Macedonia, Thessaloniki, Greece
* Corresponding author. E-mail addresses: gouliana@it.teithe.gr (K. Goulianas), amarg@teilar.gr, amarg@uom.gr (A. Margaris), miltos@uom.gr (M. Adamopoulos).

Keywords: Nonlinear algebraic systems; Neural networks; Numerical analysis; Computational methods

Abstract. The objective of this research is the description of a feed-forward neural network capable of solving nonlinear algebraic systems of polynomial equations. The basic features of the proposed structure include, among other things, product units trained by the back-propagation algorithm and a fixed input unit with a constant input of unity. The presented theory is demonstrated by solving complete nonlinear algebraic system paradigms, and the accuracy of the method is tested by comparing the experimental results produced by the network with the theoretical values of the system roots. (c) 2012 Elsevier Inc. All rights reserved.

1. Introduction

A typical nonlinear algebraic system is defined as F(z) = 0, where the mapping F : R^n -> R^n (n > 1) is described by the n-dimensional vector F = [f_1, f_2, ..., f_n]^T with f_i : R^n -> R (i = 1, 2, ..., n). Generally speaking, there are no good universal methods for solving such systems: even in the simple case of only two equations of the form f_1(z_1, z_2) = 0 and f_2(z_1, z_2) = 0, the estimation of the system roots reduces to the identification of the common points of the zero contours of the functions f_1(z_1, z_2) and f_2(z_1, z_2). This is a very difficult task, since in general these two functions have no relation to each other at all. In the general case of N nonlinear equations, solving the system requires the identification of points that are mutually common to N unrelated zero-contour hypersurfaces, each of dimension N - 1 [28].
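As a minimal illustration of this "common zero contours" picture (not part of the original presentation, with an example system chosen here only for demonstration), one can tabulate both residuals on a grid and look for the points where they vanish simultaneously:

import numpy as np

# Illustrative sketch: locate grid points where the zero contours of two
# example polynomials f1, f2 come close to intersecting (i.e. candidate roots).
f1 = lambda z1, z2: z1**2 + z2**2 - 1.0        # example equation, chosen here
f2 = lambda z1, z2: z1**2 - z2                 # example equation, chosen here

zs = np.linspace(-2.0, 2.0, 401)
Z1, Z2 = np.meshgrid(zs, zs)
residual = np.hypot(f1(Z1, Z2), f2(Z1, Z2))    # ||F(z)|| on the grid

i, j = np.unravel_index(np.argmin(residual), residual.shape)
print(Z1[i, j], Z2[i, j], residual[i, j])      # a coarse approximation to one root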

2. Nonlinear algebraic systems

According to the basic principles of nonlinear algebra [26], a complete nonlinear algebraic system of n polynomial equations with n unknowns z = (z_1, z_2, ..., z_n) is identified completely by the number of equations n and by their degrees (s_1, s_2, ..., s_n); it is expressed mathematically as

    F_i(\vec z) = \sum_{j_1, j_2, \dots, j_{s_i} = 1}^{n} A_i^{j_1 j_2 \dots j_{s_i}} \, z_{j_1} z_{j_2} \cdots z_{j_{s_i}} = 0,   i = 1, 2, ..., n,    (1)

and it has at least one non-vanishing solution (i.e. at least one z_j != 0) if and only if the equation

    \mathcal{R}_{s_1, s_2, \dots, s_n}\{A_i^{j_1 j_2 \dots j_{s_i}}\} = 0    (2)

holds. In this equation the function R is called the resultant, and it is a straightforward generalization of the determinant of a linear system. The resultant is a polynomial in the coefficients of A of degree

    d_{s_1, s_2, \dots, s_n} = \deg_A \mathcal{R}_{s_1, s_2, \dots, s_n} = \sum_{i=1}^{n} \prod_{j \neq i} s_j.    (3)

When all degrees coincide, i.e. s_1 = s_2 = ... = s_n = s, the resultant R_{n|s} reduces to a simple polynomial of degree

    d_{n|s} = \deg_A \mathcal{R}_{n|s} = n s^{n-1}    (4)

and it is described completely by the values of the parameters n and s. It can be proven that the coefficients of the matrix A, which is actually a tensor for n > 2, are not all independent of each other. More specifically, for the simple case s_1 = s_2 = ... = s_n = s, the matrix A is symmetric in the last s contravariant indices and it contains only n M_{n|s} independent coefficients, with

    M_{n|s} = \frac{(n + s - 1)!}{(n - 1)!\, s!}.    (5)

An interesting description concerning the existence of solutions of a nonlinear algebraic system can be found in [30]. Even though the notion of the resultant has been defined for homogeneous nonlinear equations, it can describe nonhomogeneous algebraic equations as well. In the general case the resultant satisfies the nonlinear Cramer rule

    \mathcal{R}_{s_1, s_2, \dots, s_n}\{A^{(k)}(Z_k)\} = 0,    (6)

where Z_k is the kth component of the solution of the nonhomogeneous system and A^{(k)} is the kth column of the coefficient matrix A.

3. Review of previous work

The solution of nonlinear algebraic systems is generally possible only with numerical, not analytical, algorithms. Besides the well-known fixed-point based methods, (quasi-)Newton and gradient descent methods, a well-known class of such algorithms is the class of ABS algorithms introduced in 1984 by Abaffy, Broyden, and Spedicato [10] to solve linear systems as well as nonlinear equations and systems of equations [11,4]. The basic function of the initial ABS algorithms is to solve a determined or under-determined linear system Az = b (z in R^n, b in R^m, m <= n) by using special matrices known as Abaffians. The choice of those matrices, as well as of the quantities used in their defining equations, determines particular subclasses of the ABS algorithms, the most important of them being the conjugate direction subclass, the orthogonally scaled subclass, and the optimally stable subclass. The extension of the ABS methods to nonlinear algebraic systems is straightforward and can be found in many sources such as [9,8]. It can be proven that, under appropriate conditions, the ABS methods are locally convergent with a speed of Q-order two, while the computational cost of one iteration is O(n^3) flops plus one function and one Jacobian matrix evaluation. To save the cost of the Jacobian matrix evaluations, Huang [5] introduced quasi-Newton based ABS methods known as row update methods. These methods do not require the a priori computation of the Jacobian matrix, and therefore their computational cost per iteration is lower. Galantai and Jeney [1] have proposed alternative methods for solving nonlinear systems of equations that are combinations of the nonlinear ABS methods and quasi-Newton methods.
Another class of methods has been proposed by Kublanovskaya and Simonova [27] for estimating the roots of m nonlinear coupled algebraic equations with two unknowns lambda and mu. In their work, the nonlinear system under consideration is described by the algebraic equation F(lambda, mu) = [f_1(lambda, mu), f_2(lambda, mu), ..., f_m(lambda, mu)]^T = 0, with each function f_k(lambda, mu) (k = 1, ..., m) a polynomial of the form

    f_k(\lambda, \mu) = [a^{(k)}_{ts} \mu^t + \dots + a^{(k)}_{0s}] \lambda^s + \dots + [a^{(k)}_{t0} \mu^t + \dots + a^{(k)}_{00}].    (7)

In this equation the coefficients a^{(k)}_{ij} (i = 0, ..., t and j = 0, ..., s) are, in general, complex numbers, while s and t are the maximum degrees in lambda and mu, respectively, found in F(lambda, mu) = 0. The algorithms proposed by Kublanovskaya and Simonova are capable of finding the zero-dimensional roots (lambda*, mu*), i.e. the pairs of fixed numbers satisfying the nonlinear system, as well as the one-dimensional roots defined as (lambda, mu) = [phi(mu), mu] and (lambda, mu) = [lambda, phi~(lambda)], whose components are functionally related. The first method of Kublanovskaya and Simonova consists of two stages. At the first stage, the process passes from the system F(lambda, mu) = 0 to the spectral problem for a pencil D(lambda, mu) = A(mu) - lambda B(mu) of polynomial matrices A(mu) and B(mu), whose zero-dimensional and one-dimensional eigenvalues coincide with the zero-dimensional and one-dimensional roots of the nonlinear system under consideration. At the second stage the spectral problem for D(lambda, mu) is solved, i.e. all zero-dimensional eigenvalues of D(lambda, mu) are found, together with a regular polynomial matrix pencil whose spectrum gives all one-dimensional eigenvalues of D(lambda, mu). The second method is based on the factorization of F(lambda, mu) into irreducible factors and on the estimation of the roots (mu, lambda) one after the other, since the polynomials produced by this factorization are polynomials of only one variable.

Emiris, Mourrain, and Vrahatis [7] developed a method for counting and identifying the roots of a nonlinear algebraic system based on the concept of topological degree and on bisection techniques (for the application of those techniques see also [22]). A method for solving such systems using linear programming techniques can be found in [29], and another interesting method that uses evolutionary algorithms is described in [6]. There are many other methods concerning this subject; some of them are described in [2,12,24,25]. In the concluding section, the proposed neural model is compared against some of those methods, namely the trust-region-dogleg [3] and trust-region-reflective [3] algorithms, the Levenberg-Marquardt algorithm [15,20], and the multivariate Newton-Raphson method with partial derivatives.

4. ANNs as nonlinear system solvers

Artificial neural networks (ANNs, in short) have been used, among other methods, for solving linear algebra problems and simple linear systems and equations [13,14], as well as nonlinear equations and systems of nonlinear equations; however, they are not used as frequently as the methods described in the previous section, especially in the case of nonlinear systems. Mathia and Saeks [21] used recurrent neural networks composed of linear Hopfield networks to solve nonlinear equations, which are approximated by a multilayer perceptron. Mishra and Kalra [23] used a modified Hopfield network with a properly selected energy function to solve a nonlinear algebraic system of m equations with n unknowns. Hopfield neural networks were also used by Luo and Han [17] to solve a nonlinear system of equations which had first been transformed into a quadratic optimization problem. Finally, Li and Zeng [16] used the gradient descent rule with a variable step size to solve nonlinear algebraic systems with very rapid convergence and very high accuracy.

The main idea behind the proposed approach for solving a nonlinear algebraic system of p equations with p unknowns [18] is to construct a network with p output neurons, with the desired output of the ith neuron (1 <= i <= p) equal to zero; in this way the nonlinear algebraic system under consideration is simulated completely by the neural model. An efficient approach to constructing such a network is shown in Fig. 1 for the case of a complete 2 x 2 nonlinear algebraic system [19]. In this figure the proposed model is composed of four layers: an input layer with one summation unit with a constant input equal to unity; a hidden layer of linear units that generate the linear terms x and y; a hidden layer of summation and product units that generate all the terms of the left-hand side of the system equations except the fixed term, namely the terms x, y, x^2, xy and y^2, by using the appropriate activation function (for example, the activation function of the neuron producing the x^2 term is the function f(x) = x^2); and, finally, an output layer of summation units whose total input is the expression of the left-hand side of the corresponding equation.

Fig. 1. The structure of the complete 2 x 2 nonlinear algebraic system neural solver.
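A minimal sketch of this 2 x 2 idea (not from the paper; the coefficients of the demonstration system are chosen here for illustration) makes the mechanism explicit: the only trainable weights are x and y, the output neurons compute tanh(F_1(x, y)) and tanh(F_2(x, y)), and training toward the target [0, 0] is plain gradient descent that drives (x, y) toward a root:

import numpy as np

def F(x, y):
    F1 = x**2 + y**2 - 4.0          # hypothetical complete 2x2 system
    F2 = x*y - 1.0
    return np.array([F1, F2])

def jacobian(x, y):
    return np.array([[2*x, 2*y],    # dF1/dx, dF1/dy
                     [y,   x  ]])   # dF2/dx, dF2/dy

x, y, beta = 1.5, 0.5, 0.05         # initial conditions and learning rate
for _ in range(5000):
    o = np.tanh(F(x, y))            # network outputs
    delta = o * (1.0 - o**2)        # back-propagated output deltas (targets are 0)
    gx, gy = jacobian(x, y).T @ delta
    x, y = x - beta * gx, y - beta * gy
print(x, y, F(x, y))                # (x, y) approximates one root of the system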

These output neurons use the hyperbolic tangent as activation function and have an additional input from a bias unit whose fixed weight is the constant term of the associated nonlinear equation. To simulate the system, the synaptic weights are configured as shown in the figure. It is not difficult to note that the weights of the synapses joining the neurons of the second layer to the neurons of the third layer are fixed and equal to unity, while the weights joining the summation and product units of the third layer to the neurons of the fourth (output) layer have been set to the fixed values of the coefficients of the system equations associated with the linear and nonlinear terms provided by those neurons (it is clear that if the algebraic system is not complete but has missing terms, the corresponding synaptic weights are set to zero). The only variable-weight synapses are the two synapses that join the input neuron with the two linear neurons of the first hidden layer. Since this network structure has to produce a zero output, a training operation with the traditional back propagation algorithm and [0 0]^T as the desired output vector will assign to the weights of the two variable-weight synapses the components x and y of one of the system roots. Training this network from many initial conditions is capable of identifying all roots of the traditional complete 2 x 2 nonlinear algebraic systems with four distinct roots, as well as the basin of attraction of each one of them and the basin of infinity that leads to nonconverging behavior [19].

5. The neural model for the complete 3 x 3 nonlinear algebraic system

The complete 3 x 3 nonlinear algebraic system is defined as

    F_i(x_1, x_2, x_3) = a_{i,1} x_1^3 + a_{i,2} x_2^3 + a_{i,3} x_3^3 + a_{i,4} x_1^2 x_2 + a_{i,5} x_1 x_2^2 + a_{i,6} x_1^2 x_3 + a_{i,7} x_1 x_3^2
                       + a_{i,8} x_2^2 x_3 + a_{i,9} x_2 x_3^2 + a_{i,10} x_1 x_2 x_3 + a_{i,11} x_1 x_2 + a_{i,12} x_1 x_3 + a_{i,13} x_2 x_3
                       + a_{i,14} x_1^2 + a_{i,15} x_2^2 + a_{i,16} x_3^2 + a_{i,17} x_1 + a_{i,18} x_2 + a_{i,19} x_3 - a_{i,20} = 0,   i = 1, 2, 3,

and it is a straightforward generalization of the complete 2 x 2 nonlinear system studied in [19]. The neural structure that solves this type of system is based on the approach described in the previous section; it is summarized in Table 1 and shown in Fig. 2. In complete accordance with the proposed design there are full connections between two consecutive layers; however, for the sake of clarity, the figure shows only the connections with nonzero synaptic weights. These weights are defined as follows: W1 = [W1^(1), W1^(2), W1^(3)] = [x_1, x_2, x_3] is the (1 x 3) table of variable weights, W2 = {W2^(i,j)} (i = 1, 2, 3; j = 1, ..., 9) is the (3 x 9) table of fixed unit weights joining the second to the third layer, and W3 = {W3^(i,j)} (i = 1, ..., 9; j = 1, ..., 19) is the (9 x 19) table of fixed unit weights joining the third to the fourth layer.

Table 1. The structure of the neural solver for the complete 3 x 3 nonlinear algebraic system.
Layer 1:  1 summation neuron;   weight table W1 (1 x 3)
Layer 2:  3 summation neurons;  weight table W2 (3 x 9)
Layer 3:  9 summation neurons;  weight table W3 (9 x 19)
Layer 4: 19 neurons;            weight table W4 (19 x 3)
         neurons 1, 2, 3, 8, 9, 10, 17, 18, 19: summation units
         neurons 4, 5, 6, 7, 11, 12, 13, 14, 15, 16: product units
Layer 5:  3 summation neurons;  no weight table
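The coefficient layout of this system can be made concrete with a short sketch (the demonstration system and its root below are hypothetical, chosen here only to illustrate the ordering of the 20 coefficients per equation):

import numpy as np

# Row i of `a` holds a_{i,1}, ..., a_{i,20} in the order of the definition above;
# eval_F plugs a point into the three polynomials.
def eval_F(a, x1, x2, x3):
    terms = np.array([x1**3, x2**3, x3**3, x1**2*x2, x1*x2**2, x1**2*x3,
                      x1*x3**2, x2**2*x3, x2*x3**2, x1*x2*x3, x1*x2, x1*x3,
                      x2*x3, x1**2, x2**2, x3**2, x1, x2, x3])
    return a[:, :19] @ terms - a[:, 19]

a = np.zeros((3, 20))
a[0, [13, 14, 15]] = 1.0; a[0, 19] = 3.0      # F1 = x1^2 + x2^2 + x3^2 - 3
a[1, 9] = 1.0;            a[1, 19] = 1.0      # F2 = x1*x2*x3 - 1
a[2, [16, 17, 18]] = 1.0; a[2, 19] = 3.0      # F3 = x1 + x2 + x3 - 3
print(eval_F(a, 1.0, 1.0, 1.0))               # [0. 0. 0.] at the root (1, 1, 1)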

Fig. 2. The structure of the complete 3 x 3 nonlinear algebraic system neural solver.

Since the back propagation algorithm requires full connections between consecutive layers, the missing synapses are modeled as synapses with zero weight values. For the sake of simplicity, the parameters x_1, x_2 and x_3 are also denoted as x, y and z, respectively, while the constant terms a_{i,20} (i = 1, 2, 3) are denoted as a0, b0 and c0. The weight table W4 joining the fourth layer to the output layer contains the coefficients of the three equations, one column per equation, with its rows ordered according to the terms produced by the 19 neurons of the fourth layer:

    W4 =
    [ a_{1,17}  a_{2,17}  a_{3,17} ]   (neuron 1:  x_1)
    [ a_{1,14}  a_{2,14}  a_{3,14} ]   (neuron 2:  x_1^2)
    [ a_{1,1}   a_{2,1}   a_{3,1}  ]   (neuron 3:  x_1^3)
    [ a_{1,11}  a_{2,11}  a_{3,11} ]   (neuron 4:  x_1 x_2)
    [ a_{1,4}   a_{2,4}   a_{3,4}  ]   (neuron 5:  x_1^2 x_2)
    [ a_{1,5}   a_{2,5}   a_{3,5}  ]   (neuron 6:  x_1 x_2^2)
    [ a_{1,10}  a_{2,10}  a_{3,10} ]   (neuron 7:  x_1 x_2 x_3)
    [ a_{1,18}  a_{2,18}  a_{3,18} ]   (neuron 8:  x_2)
    [ a_{1,15}  a_{2,15}  a_{3,15} ]   (neuron 9:  x_2^2)
    [ a_{1,2}   a_{2,2}   a_{3,2}  ]   (neuron 10: x_2^3)
    [ a_{1,13}  a_{2,13}  a_{3,13} ]   (neuron 11: x_2 x_3)
    [ a_{1,9}   a_{2,9}   a_{3,9}  ]   (neuron 12: x_2 x_3^2)
    [ a_{1,8}   a_{2,8}   a_{3,8}  ]   (neuron 13: x_2^2 x_3)
    [ a_{1,12}  a_{2,12}  a_{3,12} ]   (neuron 14: x_1 x_3)
    [ a_{1,7}   a_{2,7}   a_{3,7}  ]   (neuron 15: x_1 x_3^2)
    [ a_{1,6}   a_{2,6}   a_{3,6}  ]   (neuron 16: x_1^2 x_3)
    [ a_{1,19}  a_{2,19}  a_{3,19} ]   (neuron 17: x_3)
    [ a_{1,16}  a_{2,16}  a_{3,16} ]   (neuron 18: x_3^2)
    [ a_{1,3}   a_{2,3}   a_{3,3}  ]   (neuron 19: x_3^3)

The activation functions for the neurons of the network structure are defined as follows:
- For the second layer: all neurons use the identity function, f_2^(i)(t) = t, i = 1, 2, 3.
- For the third layer: the activation functions of the nine neurons are
  f_3 = [f_3^(1), f_3^(2), ..., f_3^(9)] = [t, t^2, t^3, t, t^2, t^3, t, t^2, t^3].
- For the fourth layer: all neurons use the identity function, f_4^(i)(t) = t, i = 1, 2, ..., 19 (the product units form the product of their weighted inputs before this activation is applied).
- For the fifth (output) layer: all neurons use the hyperbolic tangent function, f_5^(i)(t) = tanh(t), i = 1, 2, 3.
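The fixed part of this setup can be sketched as follows (an illustrative reading of the mapping between neuron order and coefficient index given above, not code from the paper; the random coefficients are placeholders):

import numpy as np

# W4 holds the equation coefficients, one column per equation, with rows
# permuted into the layer-4 neuron order listed above (coefficient indices
# are 1-based in the text, 0-based here).
NEURON_TO_COEFF = [17, 14, 1, 11, 4, 5, 10, 18, 15, 2,
                   13, 9, 8, 12, 7, 6, 19, 16, 3]       # coefficient index of neuron k

def build_W4(a):
    """a has shape (3, 20); returns the (19, 3) fixed weight table W4."""
    rows = [a[:, k - 1] for k in NEURON_TO_COEFF]        # coefficient k of each equation
    return np.array(rows)

# Third-layer activations: identity / square / cube, repeated for x1, x2, x3.
f3 = [lambda t: t, lambda t: t**2, lambda t: t**3] * 3

a = np.random.rand(3, 20)                                # hypothetical coefficients
print(build_W4(a).shape)                                 # (19, 3)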

6. Building the back propagation equations

By following the traditional back propagation approach, the equations for training the neural network are constructed as follows.

6.1. Forward pass

The inputs to the three neurons of the second layer are estimated as

    t_2^{(i)} = W_1^{(i)} \cdot 1 = x_i,   i = 1, 2, 3

while the outputs produced by them have the values

    v_2^{(i)} = f_2^{(i)}(t_2^{(i)}) = f_2^{(i)}(x_i) = x_i,   i = 1, 2, 3.

These outputs are sent as inputs to the neurons of the third layer, whose total input is thus given by the equation

    t_3^{(i)} = \sum_{j=1}^{3} W_2^{(j,i)} v_2^{(j)},   i = 1, 2, ..., 9

or, in expanded form,

    t_3^{(1)} = t_3^{(2)} = t_3^{(3)} = x_1,
    t_3^{(4)} = t_3^{(5)} = t_3^{(6)} = x_2,
    t_3^{(7)} = t_3^{(8)} = t_3^{(9)} = x_3,

since the only nonzero weights of W2 are the unit weights joining x_1 to neurons 1-3, x_2 to neurons 4-6, and x_3 to neurons 7-9. The outputs of the neurons of the third layer are then estimated as

    v_3^{(1)} = f_3^{(1)}(x_1) = x_1,    v_3^{(2)} = f_3^{(2)}(x_1) = x_1^2,    v_3^{(3)} = f_3^{(3)}(x_1) = x_1^3,
    v_3^{(4)} = f_3^{(4)}(x_2) = x_2,    v_3^{(5)} = f_3^{(5)}(x_2) = x_2^2,    v_3^{(6)} = f_3^{(6)}(x_2) = x_2^3,
    v_3^{(7)} = f_3^{(7)}(x_3) = x_3,    v_3^{(8)} = f_3^{(8)}(x_3) = x_3^2,    v_3^{(9)} = f_3^{(9)}(x_3) = x_3^3.

These outputs are sent as inputs to the neurons of the fourth layer. For the summation neurons the total input is estimated as

    t_4^{(i)} = \sum_{j=1}^{9} W_3^{(j,i)} v_3^{(j)},   i = 1, 2, 3, 8, 9, 10, 17, 18, 19

or, in expanded form,

    t_4^{(1)} = W_3^{(1,1)} v_3^{(1)} = x_1,      t_4^{(2)} = W_3^{(2,2)} v_3^{(2)} = x_1^2,    t_4^{(3)} = W_3^{(3,3)} v_3^{(3)} = x_1^3,
    t_4^{(8)} = W_3^{(4,8)} v_3^{(4)} = x_2,      t_4^{(9)} = W_3^{(5,9)} v_3^{(5)} = x_2^2,    t_4^{(10)} = W_3^{(6,10)} v_3^{(6)} = x_2^3,
    t_4^{(17)} = W_3^{(7,17)} v_3^{(7)} = x_3,    t_4^{(18)} = W_3^{(8,18)} v_3^{(8)} = x_3^2,   t_4^{(19)} = W_3^{(9,19)} v_3^{(9)} = x_3^3,

while the total inputs of the product units of this fourth layer are estimated as

    t_4^{(i)} = \prod_{j \in S_i} W_3^{(j,i)} v_3^{(j)},   i = 4, 5, 6, 7, 11, 12, 13, 14, 15, 16,

where S_i is the set of indices of the neurons of the third layer that are connected to the ith neuron of the fourth layer. If we expand this equation we get a set of equations of the form

    t_4^{(4)}  = (W_3^{(1,4)} v_3^{(1)})(W_3^{(4,4)} v_3^{(4)}) = x_1 x_2,
    t_4^{(5)}  = (W_3^{(2,5)} v_3^{(2)})(W_3^{(4,5)} v_3^{(4)}) = x_1^2 x_2,
    t_4^{(6)}  = (W_3^{(1,6)} v_3^{(1)})(W_3^{(5,6)} v_3^{(5)}) = x_1 x_2^2,
    t_4^{(7)}  = (W_3^{(1,7)} v_3^{(1)})(W_3^{(4,7)} v_3^{(4)})(W_3^{(7,7)} v_3^{(7)}) = x_1 x_2 x_3,
    t_4^{(11)} = (W_3^{(4,11)} v_3^{(4)})(W_3^{(7,11)} v_3^{(7)}) = x_2 x_3,
    t_4^{(12)} = (W_3^{(4,12)} v_3^{(4)})(W_3^{(8,12)} v_3^{(8)}) = x_2 x_3^2,
    t_4^{(13)} = (W_3^{(5,13)} v_3^{(5)})(W_3^{(7,13)} v_3^{(7)}) = x_2^2 x_3,
    t_4^{(14)} = (W_3^{(1,14)} v_3^{(1)})(W_3^{(7,14)} v_3^{(7)}) = x_1 x_3,
    t_4^{(15)} = (W_3^{(1,15)} v_3^{(1)})(W_3^{(8,15)} v_3^{(8)}) = x_1 x_3^2,
    t_4^{(16)} = (W_3^{(2,16)} v_3^{(2)})(W_3^{(7,16)} v_3^{(7)}) = x_1^2 x_3,

with the outputs of the fourth layer estimated as

    v_4^{(1)} = x_1,         v_4^{(2)} = x_1^2,        v_4^{(3)} = x_1^3,
    v_4^{(4)} = x_1 x_2,     v_4^{(5)} = x_1^2 x_2,    v_4^{(6)} = x_1 x_2^2,
    v_4^{(7)} = x_1 x_2 x_3,
    v_4^{(8)} = x_2,         v_4^{(9)} = x_2^2,        v_4^{(10)} = x_2^3,
    v_4^{(11)} = x_2 x_3,    v_4^{(12)} = x_2 x_3^2,   v_4^{(13)} = x_2^2 x_3,
    v_4^{(14)} = x_1 x_3,    v_4^{(15)} = x_1 x_3^2,   v_4^{(16)} = x_1^2 x_3,
    v_4^{(17)} = x_3,        v_4^{(18)} = x_3^2,       v_4^{(19)} = x_3^3,

since all fourth-layer neurons use the identity activation. Finally, the total input to the three summation neurons of the output layer is estimated as

    t_5^{(i)} = \sum_{j=1}^{19} W_4^{(j,i)} v_4^{(j)} - a_{i,20}
              = a_{i,17} x_1 + a_{i,14} x_1^2 + a_{i,1} x_1^3 + a_{i,11} x_1 x_2 + a_{i,4} x_1^2 x_2 + a_{i,5} x_1 x_2^2 + a_{i,10} x_1 x_2 x_3
              + a_{i,18} x_2 + a_{i,15} x_2^2 + a_{i,2} x_2^3 + a_{i,13} x_2 x_3 + a_{i,9} x_2 x_3^2 + a_{i,8} x_2^2 x_3
              + a_{i,12} x_1 x_3 + a_{i,7} x_1 x_3^2 + a_{i,6} x_1^2 x_3 + a_{i,19} x_3 + a_{i,16} x_3^2 + a_{i,3} x_3^3 - a_{i,20}
              = F_i(x_1, x_2, x_3),   i = 1, 2, 3,

where the constant term enters through the bias unit, while the real output of the neural network is given by the equations

    o_1 = tanh[t_5^{(1)}] = tanh[F_1(x_1, x_2, x_3)] = f(F_1),
    o_2 = tanh[t_5^{(2)}] = tanh[F_2(x_1, x_2, x_3)] = f(F_2),
    o_3 = tanh[t_5^{(3)}] = tanh[F_3(x_1, x_2, x_3)] = f(F_3),

where, for the sake of brevity, we define f(x) = tanh(x) and use the convention F_i = F_i(x_1, x_2, x_3), i = 1, 2, 3.
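A compact layer-by-layer sketch of this forward pass (an illustration written here under the conventions of Section 6.1, with the fixed unit weights of W2 and W3 left implicit) could look as follows:

import numpy as np

def forward(a, x):
    v2 = np.asarray(x, dtype=float)                      # layer 2: x1, x2, x3
    v3 = np.array([v2[0], v2[0]**2, v2[0]**3,            # layer 3: powers of x1
                   v2[1], v2[1]**2, v2[1]**3,            #          powers of x2
                   v2[2], v2[2]**2, v2[2]**3])           #          powers of x3
    # Layer 4: summation units pass a single power through; product units
    # multiply the layer-3 outputs listed in their index set S_i.
    S = {4: (1, 4), 5: (2, 4), 6: (1, 5), 7: (1, 4, 7), 11: (4, 7), 12: (4, 8),
         13: (5, 7), 14: (1, 7), 15: (1, 8), 16: (2, 7)}            # 1-based, as in the text
    single = {1: 1, 2: 2, 3: 3, 8: 4, 9: 5, 10: 6, 17: 7, 18: 8, 19: 9}
    v4 = np.array([np.prod(v3[[j - 1 for j in S[k]]]) if k in S else v3[single[k] - 1]
                   for k in range(1, 20)])
    perm = [16, 13, 0, 10, 3, 4, 9, 17, 14, 1, 12, 8, 7, 11, 6, 5, 18, 15, 2]
    F = a[:, perm] @ v4 - a[:, 19]                        # layer 5 inputs: F_i(x)
    return np.tanh(F), F                                  # network outputs o_i, and F_i

o, F = forward(np.random.rand(3, 20), np.array([0.5, -1.0, 2.0]))   # hypothetical run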

6.2. Backward pass: estimation of the delta parameters

In this stage we use the vectors

    \delta_5 = [\delta_5^{(1)}, \delta_5^{(2)}, \delta_5^{(3)}],
    \delta_4 = [\delta_4^{(1)}, \delta_4^{(2)}, ..., \delta_4^{(19)}],
    \delta_3 = [\delta_3^{(1)}, \delta_3^{(2)}, ..., \delta_3^{(9)}],
    \delta_2 = [\delta_2^{(1)}, \delta_2^{(2)}, \delta_2^{(3)}]

to denote the delta values of the fifth, the fourth, the third and the second layer, respectively. Since the desired outputs of the neural network are the values d_1 = d_2 = d_3 = 0, it can easily be seen that

    \delta_5^{(1)} = (d_1 - o_1) f'(t_5^{(1)}) = -o_1 f'(F_1) = -f(F_1)[1 - f^2(F_1)],
    \delta_5^{(2)} = (d_2 - o_2) f'(t_5^{(2)}) = -o_2 f'(F_2) = -f(F_2)[1 - f^2(F_2)],
    \delta_5^{(3)} = (d_3 - o_3) f'(t_5^{(3)}) = -o_3 f'(F_3) = -f(F_3)[1 - f^2(F_3)].

Having estimated the components of the delta_5 vector, we can now estimate the components of the delta_4 vector (namely, the delta values of the fourth layer) by using the equation

    \delta_4^{(k),x_i} = \Big( \sum_{j=1}^{3} W_4^{(k,j)} \delta_5^{(j)} \Big) \, n^{(k)}_{x_i},   k = 1, 2, ..., 19,   where   n^{(k)}_{x_i} = \partial v_4^{(k)} / \partial x_i.

Since these neurons produce nonlinear terms, we generally have to estimate more than one delta value for each of them, one for each independent variable participating in the nonlinear term they produce. The partial derivatives n^{(k)}_{x_i} with respect to x_1, x_2 and x_3 are estimated as follows:

- n^{(k)}_{x_i} = 1 for (k, i) in {(1,1), (8,2), (17,3)}, with n^{(k)}_{x_j} = 0 for j != i.
- n^{(k)}_{x_i} = 2 x_i for (k, i) in {(2,1), (9,2), (18,3)}, with n^{(k)}_{x_j} = 0 for j != i.
- n^{(k)}_{x_i} = 3 x_i^2 for (k, i) in {(3,1), (10,2), (19,3)}, with n^{(k)}_{x_j} = 0 for j != i.
- n^{(7)}_{x_i} = x_j x_k for (i, j, k) in {(1,2,3), (2,1,3), (3,1,2)}.
- n^{(k)}_{x_i} = x_j for (k, i, j) in {(4,1,2), (4,2,1), (11,2,3), (11,3,2), (14,1,3), (14,3,1)}.
- n^{(k)}_{x_i} = 0 for (k, i) in {(4,3), (11,1), (14,2)}.
- n^{(k)}_{x_i} = x_j^2 for (k, i, j) in {(5,2,1), (6,1,2), (12,2,3), (13,3,2), (15,1,3), (16,3,1)}.
- n^{(k)}_{x_i} = 2 x_i x_j for (k, i, j) in {(5,1,2), (6,2,1), (12,3,2), (13,2,3), (15,3,1), (16,1,3)}.
- n^{(k)}_{x_i} = 0 for (k, i) in {(5,3), (6,3), (12,1), (13,1), (15,2), (16,2)}.
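The output-layer deltas just defined are straightforward to evaluate; the following small sketch (illustrative only) computes them directly from the function values and shows that they vanish exactly at a root, where F_i = 0:

import numpy as np

def delta5(F):
    o = np.tanh(F)
    return -o * (1.0 - o**2)        # delta_5^{(i)} = -tanh(F_i) * (1 - tanh(F_i)^2)

print(delta5(np.array([0.0, 0.5, -2.0])))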

Based on the above expressions we get the results

    \delta_4^{(1),x_1} = \sum_j a_{j,17} \delta_5^{(j)},        \delta_4^{(2),x_1} = 2x_1 \sum_j a_{j,14} \delta_5^{(j)},       \delta_4^{(3),x_1} = 3x_1^2 \sum_j a_{j,1} \delta_5^{(j)},
    \delta_4^{(4),x_1} = x_2 \sum_j a_{j,11} \delta_5^{(j)},    \delta_4^{(4),x_2} = x_1 \sum_j a_{j,11} \delta_5^{(j)},
    \delta_4^{(5),x_1} = 2x_1 x_2 \sum_j a_{j,4} \delta_5^{(j)},    \delta_4^{(5),x_2} = x_1^2 \sum_j a_{j,4} \delta_5^{(j)},
    \delta_4^{(6),x_1} = x_2^2 \sum_j a_{j,5} \delta_5^{(j)},       \delta_4^{(6),x_2} = 2x_1 x_2 \sum_j a_{j,5} \delta_5^{(j)},
    \delta_4^{(7),x_1} = x_2 x_3 \sum_j a_{j,10} \delta_5^{(j)},    \delta_4^{(7),x_2} = x_1 x_3 \sum_j a_{j,10} \delta_5^{(j)},    \delta_4^{(7),x_3} = x_1 x_2 \sum_j a_{j,10} \delta_5^{(j)},
    \delta_4^{(8),x_2} = \sum_j a_{j,18} \delta_5^{(j)},        \delta_4^{(9),x_2} = 2x_2 \sum_j a_{j,15} \delta_5^{(j)},       \delta_4^{(10),x_2} = 3x_2^2 \sum_j a_{j,2} \delta_5^{(j)},
    \delta_4^{(11),x_2} = x_3 \sum_j a_{j,13} \delta_5^{(j)},   \delta_4^{(11),x_3} = x_2 \sum_j a_{j,13} \delta_5^{(j)},
    \delta_4^{(12),x_2} = x_3^2 \sum_j a_{j,9} \delta_5^{(j)},      \delta_4^{(12),x_3} = 2x_2 x_3 \sum_j a_{j,9} \delta_5^{(j)},
    \delta_4^{(13),x_2} = 2x_2 x_3 \sum_j a_{j,8} \delta_5^{(j)},   \delta_4^{(13),x_3} = x_2^2 \sum_j a_{j,8} \delta_5^{(j)},
    \delta_4^{(14),x_1} = x_3 \sum_j a_{j,12} \delta_5^{(j)},   \delta_4^{(14),x_3} = x_1 \sum_j a_{j,12} \delta_5^{(j)},
    \delta_4^{(15),x_1} = x_3^2 \sum_j a_{j,7} \delta_5^{(j)},      \delta_4^{(15),x_3} = 2x_1 x_3 \sum_j a_{j,7} \delta_5^{(j)},
    \delta_4^{(16),x_1} = 2x_1 x_3 \sum_j a_{j,6} \delta_5^{(j)},   \delta_4^{(16),x_3} = x_1^2 \sum_j a_{j,6} \delta_5^{(j)},
    \delta_4^{(17),x_3} = \sum_j a_{j,19} \delta_5^{(j)},       \delta_4^{(18),x_3} = 2x_3 \sum_j a_{j,16} \delta_5^{(j)},      \delta_4^{(19),x_3} = 3x_3^2 \sum_j a_{j,3} \delta_5^{(j)},

where all sums run over j = 1, 2, 3 and all remaining components \delta_4^{(k),x_i} are equal to zero, since the corresponding derivatives n^{(k)}_{x_i} vanish.

In the next step we use these values to estimate the parameters \delta_3^{(i)} (i = 1, 2, ..., 9) via the expression

    \delta_3^{(i)} = \sum_{j=1}^{19} W_3^{(i,j)} \delta_4^{(j),x_1}   for i = 1, 2, 3,
    \delta_3^{(i)} = \sum_{j=1}^{19} W_3^{(i,j)} \delta_4^{(j),x_2}   for i = 4, 5, 6,
    \delta_3^{(i)} = \sum_{j=1}^{19} W_3^{(i,j)} \delta_4^{(j),x_3}   for i = 7, 8, 9.

Since the only nonzero weights of W3 are those joining each third-layer neuron to the fourth-layer neurons that use its output, the results are

    \delta_3^{(1)} = \delta_4^{(1),x_1} + \delta_4^{(4),x_1} + \delta_4^{(6),x_1} + \delta_4^{(7),x_1} + \delta_4^{(14),x_1} + \delta_4^{(15),x_1},
    \delta_3^{(2)} = \delta_4^{(2),x_1} + \delta_4^{(5),x_1} + \delta_4^{(16),x_1},
    \delta_3^{(3)} = \delta_4^{(3),x_1},
    \delta_3^{(4)} = \delta_4^{(4),x_2} + \delta_4^{(5),x_2} + \delta_4^{(7),x_2} + \delta_4^{(8),x_2} + \delta_4^{(11),x_2} + \delta_4^{(12),x_2},
    \delta_3^{(5)} = \delta_4^{(6),x_2} + \delta_4^{(9),x_2} + \delta_4^{(13),x_2},
    \delta_3^{(6)} = \delta_4^{(10),x_2},
    \delta_3^{(7)} = \delta_4^{(7),x_3} + \delta_4^{(11),x_3} + \delta_4^{(13),x_3} + \delta_4^{(14),x_3} + \delta_4^{(16),x_3} + \delta_4^{(17),x_3},
    \delta_3^{(8)} = \delta_4^{(12),x_3} + \delta_4^{(15),x_3} + \delta_4^{(18),x_3},
    \delta_3^{(9)} = \delta_4^{(19),x_3},

where each \delta_4^{(k),x_i} is substituted from the expressions given above.

Finally, the values of \delta_2^{(i)} (i = 1, 2, 3) are estimated via the equation

    \delta_2^{(i)} = \sum_{j=1}^{9} W_2^{(i,j)} \delta_3^{(j)},   i = 1, 2, 3

and the results are

    \delta_2^{(1)} = \delta_3^{(1)} + \delta_3^{(2)} + \delta_3^{(3)} = \sum_{j=1}^{3} \frac{\partial F_j(x_1, x_2, x_3)}{\partial x_1} \, \delta_5^{(j)},
    \delta_2^{(2)} = \delta_3^{(4)} + \delta_3^{(5)} + \delta_3^{(6)} = \sum_{j=1}^{3} \frac{\partial F_j(x_1, x_2, x_3)}{\partial x_2} \, \delta_5^{(j)},
    \delta_2^{(3)} = \delta_3^{(7)} + \delta_3^{(8)} + \delta_3^{(9)} = \sum_{j=1}^{3} \frac{\partial F_j(x_1, x_2, x_3)}{\partial x_3} \, \delta_5^{(j)}.

6.3. Update of the synaptic weights

According to the back propagation algorithm, the update of the three variable weights W_1^{(i)} = x_i (i = 1, 2, 3) is performed as

    \{W_1^{(i)}\}^{(k+1)} = \{W_1^{(i)}\}^{(k)} + \beta \, \delta_2^{(i)}

or equivalently, using the expressions for the \delta_2^{(i)} and \delta_5^{(j)} parameters, as

    \{W_1^{(i)}\}^{(k+1)} = \{W_1^{(i)}\}^{(k)} - \beta \sum_{j=1}^{3} \frac{\partial F_j}{\partial x_i} \, f(F_j) [1 - f^2(F_j)].

This update rule is exactly a gradient descent step on the mean square error of the back propagation algorithm. Indeed, from the definition

    E = \frac{1}{2} \sum_{j=1}^{3} (d_j - o_j)^2 = \frac{1}{2} \sum_{j=1}^{3} o_j^2 = \frac{1}{2} \sum_{j=1}^{3} f^2(F_j)

it can easily be seen that

    \frac{\partial E}{\partial W_1^{(i)}} = \sum_{j=1}^{3} f(F_j) f'(F_j) \frac{\partial F_j}{\partial x_i} = \sum_{j=1}^{3} f(F_j) [1 - f^2(F_j)] \frac{\partial F_j}{\partial x_i},

and therefore the weight update equation takes the form

    \{W_1^{(i)}\}^{(k+1)} = \{W_1^{(i)}\}^{(k)} - \beta \, \frac{\partial E}{\partial W_1^{(i)}},

which shows that the proposed training rule coincides with gradient descent on E.
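A complete training-iteration sketch of this update rule is given below. It is an illustration written under the coefficient ordering of Section 5, not the authors' implementation; the partial derivatives dF_j/dx_i are evaluated analytically, which is a shortcut equivalent to back-propagating the layer deltas through the fixed weights, and the names F_and_J, solve, beta and tol are placeholders chosen here.

import numpy as np

def F_and_J(a, x):
    x1, x2, x3 = x
    v4 = np.array([x1**3, x2**3, x3**3, x1**2*x2, x1*x2**2, x1**2*x3, x1*x3**2,
                   x2**2*x3, x2*x3**2, x1*x2*x3, x1*x2, x1*x3, x2*x3,
                   x1**2, x2**2, x3**2, x1, x2, x3])               # terms 1..19
    dv4 = np.array([
        [3*x1**2, 0, 0, 2*x1*x2, x2**2, 2*x1*x3, x3**2, 0, 0, x2*x3, x2, x3, 0, 2*x1, 0, 0, 1, 0, 0],
        [0, 3*x2**2, 0, x1**2, 2*x1*x2, 0, 0, 2*x2*x3, x3**2, x1*x3, x1, 0, x3, 0, 2*x2, 0, 0, 1, 0],
        [0, 0, 3*x3**2, 0, 0, x1**2, 2*x1*x3, x2**2, 2*x2*x3, x1*x2, 0, x1, x2, 0, 0, 2*x3, 0, 0, 1],
    ])                                                             # d(term_k)/dx_i
    F = a[:, :19] @ v4 - a[:, 19]
    J = a[:, :19] @ dv4.T                                          # J[j, i] = dF_j/dx_i
    return F, J

def solve(a, x0, beta=0.005, max_iter=100000, tol=1e-16):
    """Gradient descent on E = 0.5 * sum_j tanh(F_j)^2, i.e. the derived update."""
    x = np.array(x0, dtype=float)
    for it in range(max_iter):
        F, J = F_and_J(a, x)
        o = np.tanh(F)
        if 0.5 * np.sum(o**2) < tol:                               # MSE convergence test
            return x, it
        x -= beta * J.T @ (o * (1.0 - o**2))                       # the derived update rule
    return x, max_iter

At a root every tanh(F_j) factor vanishes, so the update stops by itself, mirroring the MSE-based convergence criterion used in the experiments below.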

7. Experimental results and comparison with other methods

To test the validity of the neural model and to evaluate its precision and performance, typical nonlinear systems were constructed and solved using the network described above. The results and the analysis of the associated simulations are presented below.

Example 1. The first system we present has eight real roots; it consists of three polynomial equations F_1(x_1, x_2, x_3) = 0, F_2(x_1, x_2, x_3) = 0 and F_3(x_1, x_2, x_3) = 0 of the complete form defined in Section 5. The solution of this system by means of symbolic packages such as Mathematica reveals the existence of eight discrete roots with values

1. (x_1, x_2, x_3) = (-1, -1, -1)
2. (x_1, x_2, x_3) = (-1, -1, +1)
3. (x_1, x_2, x_3) = (-1, +1, -1)
4. (x_1, x_2, x_3) = (+1, -1, +1)
5. (x_1, x_2, x_3) = (+1, +1, -1)
6. (x_1, x_2, x_3) = (+1, +1, +1)
7. (x_1, x_2, x_3) = (-1.7, ~1E-8, +1.7)
8. (x_1, x_2, x_3) = (+1.7, ~1E-8, -1.7)

To solve the above system, the neural model was run many times with different initial conditions taken from a three-dimensional grid of points (x_1, x_2, x_3), with a variation step Dx_1 = Dx_2 = Dx_3 = 0.1 and one run per grid point, in order to identify the basin of attraction of each root and to calculate the percentage of the initial conditions associated with each root. The operation of the neural network was simulated in MATLAB, and the components x_1, x_2 and x_3 of each root were estimated by applying recurrently the weight update equation of the previous section. The learning rate for the back propagation algorithm was fixed to a small value, and, since the speed of convergence was high, the maximum number of iterations was set to N = 10000; after that many iterations a run was characterized as nonconvergent if it had been unable to reach a root, and the process moved to the next point of initial conditions. For each such point the network either reached one of the eight system roots (it is important to note that the network was able to identify all of them) or it was unable to estimate any of those roots. The basins of attraction of the eight roots, as well as the number of points belonging to each basin, are depicted graphically in Figs. 3-6. Since these basins are three-dimensional, the plots show the projections x_1 x_2, x_1 x_3 and x_2 x_3. Even though MATLAB supports the construction of 3D plots, the 2D graphs are more suitable for preview and they can easily be extended to three dimensions. The basin of attraction of a root is defined as the set of parameter-space points (x_1, x_2, x_3), namely the values of the initial conditions, for which the neural network converged to that root; the ratio of the number of these basin points to the total number of tested initial-condition points, multiplied by 100, gives the percentage mentioned above. For each one of the estimated roots, the best simulation run with respect to the minimal iteration number was also identified. The experimental results associated with this system are shown in Tables 2 and 3. Table 2 contains, for each one of the identified roots, the number of runs that converged to that root, the average number of iterations and the average mean square error together with the standard deviations of those values, and the average values of the x_1, x_2 and x_3 components of the root. The data of Table 3 are associated with the best run for each root, the criterion being the minimum number of iterations; the values shown for each best run are the initial conditions (x_1^0, x_2^0, x_3^0), the number of iterations, and the values of the functions F_1, F_2 and F_3 at the estimated root. For all best runs the value of the MSE was of the order of 1E-16.
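The basin-of-attraction scan described above can be sketched as follows (an illustration, not the authors' MATLAB code): the solver is started once from every grid point of initial conditions and each run is classified by the root it converges to, if any. Here solve_fn(x0) is assumed to behave like the training loop sketched in Section 6.3 and roots is the list of accepted roots; the bounds, step and tolerance are placeholders.

import itertools
import numpy as np

def scan_basins(solve_fn, roots, lo=-2.0, hi=2.0, step=0.1, tol=1e-4):
    grid = np.arange(lo, hi + step / 2, step)
    counts = np.zeros(len(roots) + 1, dtype=int)           # last slot: nonconvergent runs
    for x0 in itertools.product(grid, repeat=3):
        x, _ = solve_fn(np.array(x0))
        hits = [k for k, r in enumerate(roots) if np.linalg.norm(x - r) < tol]
        counts[hits[0] if hits else -1] += 1
    total = counts.sum()
    return counts, 100.0 * counts / total                   # basin sizes and percentages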
Fig. 3. Number of basin points for each one of the roots of System 1 and for the nonconverging behavior, and the associated pie chart.

Fig. 4. The basin of attraction for the roots 1, 2 and 3 of System 1 in the intervals defined above. For each root, the projections x_1 x_2, x_1 x_3 and x_2 x_3 are shown in the top, middle and bottom figure, respectively.

Fig. 5. The basin of attraction for the roots 4, 5 and 6 of System 1 in the intervals defined above. For each root, the projections x_1 x_2, x_1 x_3 and x_2 x_3 are shown in the top, middle and bottom figure, respectively.

Fig. 6. The basin of attraction for the roots 7 and 8 of System 1 in the intervals defined above. For each root, the projections x_1 x_2, x_1 x_3 and x_2 x_3 are shown in the top, middle and bottom figure, respectively.

The variation of the iteration number with the learning rate for the first example system is shown in Fig. 7. Due to the large variation of the iteration number over the eight roots of the system, the vertical axis is plotted on a logarithmic scale, while the horizontal axis remains linear. From this figure it is clear that in most cases the number of iterations decreases as the learning rate increases, a fact that is of course expected. However, it has to be mentioned that for larger values of the learning rate the neural network was not able to converge to some roots; therefore the allowable learning rate values are much smaller than the typical values associated with classical back propagation. These relatively small learning rate values lead to a relatively large number of iterations compared with the iteration counts of the other methods, as pointed out in the concluding section.

Example 2. The second system we present has four real roots; it consists of three polynomial equations F_1(x_1, x_2, x_3) = 0, F_2(x_1, x_2, x_3) = 0 and F_3(x_1, x_2, x_3) = 0, again of the complete form defined in Section 5. The solution of this system with traditional methods reveals the existence of four roots with values

1. (x_1, x_2, x_3) = (1.80, 0.81, 1.8866)
2. (x_1, x_2, x_3) = (1.107007, 1.0087, 0.68966)

3. (x_1, x_2, x_3) = (-0.00060, 0.8690, 1.1860)
4. (x_1, x_2, x_3) = (0, 1, 1)

Table 2. The experimental results for the eight roots of the first algebraic system: for each root, the number of converging runs, the average and standard deviation of the iteration count and of the MSE, and the average values of the x_1, x_2 and x_3 components.

Table 3. The results for the best run for each one of the eight roots of the first algebraic system: the initial conditions (x_1^0, x_2^0, x_3^0), the number of iterations, and the values of F_1, F_2 and F_3 at the estimated root.

Fig. 7. The variation of the iteration number as a function of the learning rate for the first example system.

Fig. 8. Number of basin points for each one of the roots of System 2 and for the nonconverging behavior, and the associated pie chart.

Fig. 9. The basin of attraction for the roots 1 and 2 of System 2 in the intervals defined above. For each root, the projections x_1 x_2, x_1 x_3 and x_2 x_3 are shown in the top, middle and bottom figure, respectively.

Fig. 10. The basin of attraction for the roots 3 and 4 of System 2 in the intervals defined above. For each root, the projections x_1 x_2, x_1 x_3 and x_2 x_3 are shown in the top, middle and bottom figure, respectively.

Table 4. The experimental results for the four roots of the second algebraic system: for each root, the number of converging runs, the average and standard deviation of the iteration count and of the MSE, and the average values of the x_1, x_2 and x_3 components.

As in the previous case, the neural model solved the system with a large set of different initial conditions in order to identify all four roots and to estimate the basin of attraction of each one of them. More specifically, the parameters x_1, x_2 and x_3 were varied on a three-dimensional grid of initial conditions with a fixed variation step Dx_i (i = 1, 2, 3). The learning rate was again fixed to a small value and the maximum number of iterations was set to N = 100,000. The percentages of the basin points for the four roots and for the nonconverging behavior are shown in Fig. 8, while the projections x_1 x_2, x_1 x_3 and x_2 x_3 are plotted in Figs. 9 and 10. Tables 4 and 5 are similar to Tables 2 and 3, respectively, and contain the data of the statistical analysis applied to the second system.

Fig. 11. The variation of the iteration number as a function of the learning rate for the second example system.

Table 5. The results for the best run for each one of the four roots of the second algebraic system: the initial conditions (x_1^0, x_2^0, x_3^0), the number of iterations, and the values of F_1, F_2 and F_3 at the estimated root.

Table 6. Simulation results for the proposed neural method and for the trust-region-dogleg and trust-region-reflective algorithms, for the eight roots of the first example system. In the results associated with the neural method the last column gives the learning rate of the run that produced the presented results; for the other methods the last column gives the order of magnitude 10^a of the function values F_1, F_2 and F_3 at the estimated root.

The variation of the iteration number with the learning rate for the second example system is shown in Fig. 11, and it is characterized by the same features as in the case of the first example system discussed above. The effect of the learning rate is seen more clearly in Fig. 11 than in Fig. 7, since the curves are drawn on lin-lin axes rather than on lin-log axes (as in Fig. 7, where the logarithmic scale was required by the large spread of the iteration numbers over the available roots).
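Learning-rate curves like those of Figs. 7 and 11 can be produced with a simple sweep, sketched below under the assumption that solve_fn(x0, beta) behaves like the training loop of Section 6.3 and returns the root estimate and the iterations used; the candidate rates and the initial condition are placeholders.

import numpy as np

def sweep_learning_rates(solve_fn, x0, betas, max_iter=100000):
    iterations = []
    for beta in betas:
        _, its = solve_fn(np.array(x0, dtype=float), beta)
        iterations.append(its if its < max_iter else None)    # None marks a nonconvergent run
    return list(zip(betas, iterations))

# e.g. sweep_learning_rates(lambda x0, b: solve(a, x0, beta=b),
#                           x0=[1.0, 1.0, -1.0], betas=[0.001, 0.002, 0.005, 0.01])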

Table 7. Simulation results for the Levenberg-Marquardt algorithm and for the multivariate Newton-Raphson method with partial derivatives, for the eight roots of the first example system. The lambda parameter of the Levenberg-Marquardt algorithm has the default value lambda = 0.01; the last column gives the order of magnitude 10^a of the function values F_1, F_2 and F_3 at the estimated root.

After the presentation of the experimental results that demonstrate the application of the proposed neural solver, let us now proceed to a comparison between the proposed method and other well-known methods with respect to their performance. Since the proposed algorithm is an iterative one, we chose for this comparison well-known iterative numerical algorithms, namely the trust-region-dogleg [3], the trust-region-reflective [3] and the Levenberg-Marquardt [15,20] algorithms, as well as the multivariate Newton-Raphson method with partial derivatives. The first three algorithms are supported by the fsolve function of the Optimization Toolbox of the MATLAB programming environment, while a Newton-Raphson MATLAB implementation can be found in the literature; for a theoretical description of this algorithm see for example [28]. These algorithms were used to solve the problem using the initial conditions (x_0, y_0, z_0) of Table 3 associated with the best run of each root (to save space, the comparison is restricted to the first example system with eight roots, since its extension to the second example system is straightforward). The simulation results of these comparisons can be found in Tables 6 and 7: Table 6 contains the results of the proposed neural solver as well as of the trust-region-dogleg and trust-region-reflective algorithms, while Table 7 contains the results of the Levenberg-Marquardt and Newton-Raphson methods. Regarding the results associated with the proposed algorithm, the last column contains the learning rate value that gave the reported iteration number for the specified initial conditions, while for the other methods this column contains the order of magnitude of the function values at the position of the estimated root. From Tables 6 and 7 it can easily be seen that, even though the neural based approach gave the correct results with the same accuracy, it requires many more iterations than the other methods, due to the small learning rate values needed for the network to converge (see the discussion above). Of course, this situation can be improved to some extent by adjusting the learning rate appropriately, as can be seen by comparing the number of iterations in Table 2 (a fixed learning rate) and Table 6 (the optimum learning rate identified via experimentation). In some cases (namely, for some methods and some roots) the neural method gave more accurate results than the other methods. From the last row of Table 7 it can be seen that the Newton-Raphson method converged to root 1 instead of root 8, possibly because the associated initial condition is located near the boundary of the basins of attraction of these two roots.
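An analogous cross-check can be reproduced outside MATLAB; the sketch below uses scipy.optimize.root as a stand-in for fsolve (the Powell hybrid method 'hybr' and Levenberg-Marquardt 'lm'), applied to a small hypothetical system rather than to the paper's example systems.

import numpy as np
from scipy.optimize import root

def F(x):
    x1, x2, x3 = x
    return np.array([x1**2 + x2**2 + x3**2 - 3.0,
                     x1*x2*x3 - 1.0,
                     x1 + x2 + x3 - 3.0])

x0 = np.array([1.2, 0.8, 1.1])                  # an initial condition, chosen here
for method in ("hybr", "lm"):                   # Powell hybrid and Levenberg-Marquardt
    sol = root(F, x0, method=method)
    print(method, sol.x, sol.nfev, np.abs(sol.fun).max())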
Besides this disadvantage of the neural based approach (which is actually not a major problem, since modern computing systems are characterized by very high speeds and in most cases the duration of the simulation is not an issue), the proposed method is easy to implement: since it uses the classical back propagation approach, it computes the function values in the forward pass and the values of the partial derivatives via the estimation of the delta parameters. This is in contrast with the other methods, which are more difficult to implement since they require a lot of mathematics as well as complicated operations such as Jacobian evaluations at specified points.

8. Conclusions and future work

The objective of this research was the numerical estimation of the roots of complete nonlinear algebraic systems of polynomial equations using back-propagation neural networks. The main advantage of this approach is the simple and straightforward solution of the system, achieved by building a structure that simulates exactly the nonlinear system under consideration and finds its roots via the classical back propagation approach. Depending on the position in the parameter space of the initial condition used in each case, each run converged to one of the eight possible solutions or diverged to infinity; therefore,

the proposed neural structure is capable of finding all the roots of such a system, although this cannot be done in only one step. One of the tasks associated with future work on this subject is to improve or redesign the neural solver in order to find all roots in only one run. This research is also going to be extended to cover systems with more dimensions and different types of equations of non-polynomial nature. Special cases have to be investigated, such as the case of double roots (and, in general, of roots with arbitrary multiplicity), and the associated theory has to be formulated. Finally, this structure can be extended to cover the case of complex roots and complex coefficients in nonlinear algebraic systems.

References

[1] A. Galantai, A. Jeney, Quasi-Newton ABS methods for solving nonlinear algebraic systems of equations, J. Optim. Theory Appl. 89 (1996).
[2] C. Jager, D. Ratz, A combined method for enclosing all solutions of nonlinear systems of polynomial equations, Reliab. Comput. 1 (1) (1995).
[3] A.R. Conn, N.I.M. Gould, P.L. Toint, Trust-Region Methods, MPS/SIAM Series on Optimization, SIAM, 2000.
[4] E. Spedicato, E. Bodon, A. Del Popolo, N. Mahdavi-Amiri, ABS methods and ABSPACK for linear systems and optimization, a review, in: Proceedings of the Third Seminar of Numerical Analysis, Zahedan.
[5] E. Spedicato, Z. Huang, Numerical experience with Newton-like methods for nonlinear algebraic systems, Computing 58 (1997).
[6] C. Grosan, A. Abraham, Solving nonlinear equation systems using evolutionary algorithms, IEEE Trans. Syst. Man Cybernet. A 38 (3) (2008).
[7] I. Emiris, B. Mourrain, M.N. Vrahatis, Sign methods for counting and computing real roots of algebraic systems, Technical Report RR-3669, INRIA, Sophia Antipolis.
[8] J. Abaffy, A. Galantai, E. Spedicato, The local convergence of ABS methods for nonlinear algebraic equations, Numer. Math. 51 (1987).
[9] J. Abaffy, A. Galantai, Conjugate direction methods for linear and nonlinear systems of algebraic equations, in: P. Rozsa, D. Greenspan (Eds.), Numerical Methods, Colloquia Mathematica Societatis, North-Holland, Amsterdam, 1987.
[10] J. Abaffy, C.G. Broyden, E. Spedicato, A class of direct methods for linear systems, Numer. Math. 45 (1984).
[11] J. Abaffy, E. Spedicato, ABS Projection Algorithms: Mathematical Techniques for Linear and Nonlinear Equations, Ellis Horwood, 1989.
[12] J. Xue, Y. Zi, Z. Lin, et al., An intelligent hybrid algorithm for solving nonlinear polynomial systems, in: Proceedings of the International Conference on Computational Science, LNCS, 2004.
[13] K.G. Margaritis, M. Adamopoulos, K. Goulianas, D.J. Evans, Artificial neural networks and iterative linear algebra methods, Parallel Algorithms Appl. (1).
[14] K.G. Margaritis, M. Adamopoulos, K. Goulianas, Solving linear systems by artificial neural network energy minimisation, University of Macedonia Annals, vol. XII.
[15] K. Levenberg, A method for the solution of certain problems in least squares, Quart. Appl. Math. 2 (1944) 164-168.
[16] G. Li, Z. Zeng, A neural network algorithm for solving nonlinear equation systems, in: Proceedings of the 2008 International Conference on Computational Intelligence and Security, Los Alamitos, USA, 2008.
[17] D. Luo, Z. Han, Solving nonlinear equation systems by neural networks, in: Proceedings of the International Conference on Systems, Man and Cybernetics, vol. 1.
[18] A. Margaris, M. Adamopoulos, Solving nonlinear algebraic systems using artificial neural networks, in: Proceedings of the 10th International Conference on Engineering Applications of Artificial Neural Networks, Thessaloniki, Greece, 2007.
[19] A. Margaris, K. Goulianas, Finding all roots of 2 x 2 nonlinear algebraic systems using back-propagation neural networks, Neural Comput. Appl. 21 (2012).
[20] D. Marquardt, An algorithm for least-squares estimation of nonlinear parameters, SIAM J. Appl. Math. 11 (1963) 431-441.
[21] K. Mathia, R. Saeks, Solving nonlinear equations using recurrent neural networks, in: World Congress on Neural Networks, Washington DC, USA, 1995.
[22] P. Mejzlik, A bisection method to find all solutions of a system of nonlinear equations, in: Proceedings of the Seventh International Conference on Domain Decomposition, The Pennsylvania State University, 1993.
[23] D. Mishra, P.K. Kalra, Modified Hopfield neural network approach for solving nonlinear algebraic equations, Eng. Lett. 14 (1) (2007).
[24] M.W. Smiley, C. Chun, An algorithm for finding all solutions of a nonlinear system, J. Comput. Appl. Math. 137 (2) (2001).
[25] T.-F. Tsai, M.-H. Lin, Finding all solutions of systems of nonlinear equations with free variables, Eng. Optim. 39 (6) (2007).
[26] V. Dolotin, A. Morozov, Introduction to Nonlinear Algebra, World Scientific Publishing Company, 2007.
[27] V.N. Kublanovskaya, V.N. Simonova, An approach to solving nonlinear algebraic systems, J. Math. Sci. (1996).
[28] W.H. Press, S.A. Teukolsky, W.T. Vetterling, B.P. Flannery, Numerical Recipes in C: The Art of Scientific Computing, 2nd ed., Cambridge University Press, 1992.
[29] Y. Nakaya, S. Oishi, Finding all solutions of nonlinear systems of equations using linear programming with guaranteed accuracies, J. Univ. Comput. Sci. 4 (2) (1998).
[30] G. Zhang, L. Bi, Existence of solutions for a nonlinear algebraic system, Discrete Dynamics in Nature and Society, 2009.


More information

Haar wavelet method for nonlinear integro-differential equations

Haar wavelet method for nonlinear integro-differential equations Applied Mathematics and Computation 176 (6) 34 333 www.elsevier.com/locate/amc Haar wavelet method for nonlinear integro-differential equations Ülo Lepik Institute of Applied Mathematics, University of

More information

18.6 Regression and Classification with Linear Models

18.6 Regression and Classification with Linear Models 18.6 Regression and Classification with Linear Models 352 The hypothesis space of linear functions of continuous-valued inputs has been used for hundreds of years A univariate linear function (a straight

More information

Dynamics of four coupled phase-only oscillators

Dynamics of four coupled phase-only oscillators Communications in Nonlinear Science and Numerical Simulation 13 (2008) 501 507 www.elsevier.com/locate/cnsns Short communication Dynamics of four coupled phase-only oscillators Richard Rand *, Jeffrey

More information

Keywords- Source coding, Huffman encoding, Artificial neural network, Multilayer perceptron, Backpropagation algorithm

Keywords- Source coding, Huffman encoding, Artificial neural network, Multilayer perceptron, Backpropagation algorithm Volume 4, Issue 5, May 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Huffman Encoding

More information

Determinants and polynomial root structure

Determinants and polynomial root structure International Journal of Mathematical Education in Science and Technology, Vol 36, No 5, 2005, 469 481 Determinants and polynomial root structure L G DE PILLIS Department of Mathematics, Harvey Mudd College,

More information

Artificial Neural Networks

Artificial Neural Networks Introduction ANN in Action Final Observations Application: Poverty Detection Artificial Neural Networks Alvaro J. Riascos Villegas University of los Andes and Quantil July 6 2018 Artificial Neural Networks

More information

Artificial Intelligence

Artificial Intelligence Artificial Intelligence Jeff Clune Assistant Professor Evolving Artificial Intelligence Laboratory Announcements Be making progress on your projects! Three Types of Learning Unsupervised Supervised Reinforcement

More information

APPLICATION OF RECURRENT NEURAL NETWORK USING MATLAB SIMULINK IN MEDICINE

APPLICATION OF RECURRENT NEURAL NETWORK USING MATLAB SIMULINK IN MEDICINE ITALIAN JOURNAL OF PURE AND APPLIED MATHEMATICS N. 39 2018 (23 30) 23 APPLICATION OF RECURRENT NEURAL NETWORK USING MATLAB SIMULINK IN MEDICINE Raja Das Madhu Sudan Reddy VIT Unversity Vellore, Tamil Nadu

More information

Reading Group on Deep Learning Session 1

Reading Group on Deep Learning Session 1 Reading Group on Deep Learning Session 1 Stephane Lathuiliere & Pablo Mesejo 2 June 2016 1/31 Contents Introduction to Artificial Neural Networks to understand, and to be able to efficiently use, the popular

More information

Data Fitting and Uncertainty

Data Fitting and Uncertainty TiloStrutz Data Fitting and Uncertainty A practical introduction to weighted least squares and beyond With 124 figures, 23 tables and 71 test questions and examples VIEWEG+ TEUBNER IX Contents I Framework

More information

Backpropagation: The Good, the Bad and the Ugly

Backpropagation: The Good, the Bad and the Ugly Backpropagation: The Good, the Bad and the Ugly The Norwegian University of Science and Technology (NTNU Trondheim, Norway keithd@idi.ntnu.no October 3, 2017 Supervised Learning Constant feedback from

More information

Stable Adaptive Momentum for Rapid Online Learning in Nonlinear Systems

Stable Adaptive Momentum for Rapid Online Learning in Nonlinear Systems Stable Adaptive Momentum for Rapid Online Learning in Nonlinear Systems Thore Graepel and Nicol N. Schraudolph Institute of Computational Science ETH Zürich, Switzerland {graepel,schraudo}@inf.ethz.ch

More information

CSC 411 Lecture 10: Neural Networks

CSC 411 Lecture 10: Neural Networks CSC 411 Lecture 10: Neural Networks Roger Grosse, Amir-massoud Farahmand, and Juan Carrasquilla University of Toronto UofT CSC 411: 10-Neural Networks 1 / 35 Inspiration: The Brain Our brain has 10 11

More information

Learning and Memory in Neural Networks

Learning and Memory in Neural Networks Learning and Memory in Neural Networks Guy Billings, Neuroinformatics Doctoral Training Centre, The School of Informatics, The University of Edinburgh, UK. Neural networks consist of computational units

More information

An artificial neural networks (ANNs) model is a functional abstraction of the

An artificial neural networks (ANNs) model is a functional abstraction of the CHAPER 3 3. Introduction An artificial neural networs (ANNs) model is a functional abstraction of the biological neural structures of the central nervous system. hey are composed of many simple and highly

More information

Chapter 6. Nonlinear Equations. 6.1 The Problem of Nonlinear Root-finding. 6.2 Rate of Convergence

Chapter 6. Nonlinear Equations. 6.1 The Problem of Nonlinear Root-finding. 6.2 Rate of Convergence Chapter 6 Nonlinear Equations 6. The Problem of Nonlinear Root-finding In this module we consider the problem of using numerical techniques to find the roots of nonlinear equations, f () =. Initially we

More information

A new nonmonotone Newton s modification for unconstrained Optimization

A new nonmonotone Newton s modification for unconstrained Optimization A new nonmonotone Newton s modification for unconstrained Optimization Aristotelis E. Kostopoulos a George S. Androulakis b a a Department of Mathematics, University of Patras, GR-265.04, Rio, Greece b

More information

A STATE-SPACE NEURAL NETWORK FOR MODELING DYNAMICAL NONLINEAR SYSTEMS

A STATE-SPACE NEURAL NETWORK FOR MODELING DYNAMICAL NONLINEAR SYSTEMS A STATE-SPACE NEURAL NETWORK FOR MODELING DYNAMICAL NONLINEAR SYSTEMS Karima Amoura Patrice Wira and Said Djennoune Laboratoire CCSP Université Mouloud Mammeri Tizi Ouzou Algeria Laboratoire MIPS Université

More information

Lecture 4: Perceptrons and Multilayer Perceptrons

Lecture 4: Perceptrons and Multilayer Perceptrons Lecture 4: Perceptrons and Multilayer Perceptrons Cognitive Systems II - Machine Learning SS 2005 Part I: Basic Approaches of Concept Learning Perceptrons, Artificial Neuronal Networks Lecture 4: Perceptrons

More information

Journal of Computational Physics

Journal of Computational Physics Journal of Computational Physics 23 (211) 8573 862 Contents lists available at SciVerse ScienceDirect Journal of Computational Physics journal homepage: www.elsevier.com/locate/jcp An eigen-based high-order

More information

Commun Nonlinear Sci Numer Simulat

Commun Nonlinear Sci Numer Simulat Commun Nonlinear Sci Numer Simulat 16 (2011) 2730 2736 Contents lists available at ScienceDirect Commun Nonlinear Sci Numer Simulat journal homepage: www.elsevier.com/locate/cnsns Homotopy analysis method

More information

EAD 115. Numerical Solution of Engineering and Scientific Problems. David M. Rocke Department of Applied Science

EAD 115. Numerical Solution of Engineering and Scientific Problems. David M. Rocke Department of Applied Science EAD 115 Numerical Solution of Engineering and Scientific Problems David M. Rocke Department of Applied Science Taylor s Theorem Can often approximate a function by a polynomial The error in the approximation

More information

Artificial Neuron (Perceptron)

Artificial Neuron (Perceptron) 9/6/208 Gradient Descent (GD) Hantao Zhang Deep Learning with Python Reading: https://en.wikipedia.org/wiki/gradient_descent Artificial Neuron (Perceptron) = w T = w 0 0 + + w 2 2 + + w d d where

More information

Artifical Neural Networks

Artifical Neural Networks Neural Networks Artifical Neural Networks Neural Networks Biological Neural Networks.................................. Artificial Neural Networks................................... 3 ANN Structure...........................................

More information

Data and prognosis for renewable energy

Data and prognosis for renewable energy The Hong Kong Polytechnic University Department of Electrical Engineering Project code: FYP_27 Data and prognosis for renewable energy by Choi Man Hin 14072258D Final Report Bachelor of Engineering (Honours)

More information

A novel variable structure control scheme for chaotic synchronization

A novel variable structure control scheme for chaotic synchronization Chaos, Solitons and Fractals 18 (2003) 275 287 www.elsevier.com/locate/chaos A novel variable structure control scheme for chaotic synchronization Chun-Chieh Wang b,c, Juhng-Perng Su a, * a Department

More information

Number of Complete N-ary Subtrees on Galton-Watson Family Trees

Number of Complete N-ary Subtrees on Galton-Watson Family Trees Methodol Comput Appl Probab (2006) 8: 223 233 DOI: 10.1007/s11009-006-8549-6 Number of Complete N-ary Subtrees on Galton-Watson Family Trees George P. Yanev & Ljuben Mutafchiev Received: 5 May 2005 / Revised:

More information

Computational Intelligence Winter Term 2017/18

Computational Intelligence Winter Term 2017/18 Computational Intelligence Winter Term 207/8 Prof. Dr. Günter Rudolph Lehrstuhl für Algorithm Engineering (LS ) Fakultät für Informatik TU Dortmund Plan for Today Single-Layer Perceptron Accelerated Learning

More information

DETECTING PROCESS STATE CHANGES BY NONLINEAR BLIND SOURCE SEPARATION. Alexandre Iline, Harri Valpola and Erkki Oja

DETECTING PROCESS STATE CHANGES BY NONLINEAR BLIND SOURCE SEPARATION. Alexandre Iline, Harri Valpola and Erkki Oja DETECTING PROCESS STATE CHANGES BY NONLINEAR BLIND SOURCE SEPARATION Alexandre Iline, Harri Valpola and Erkki Oja Laboratory of Computer and Information Science Helsinki University of Technology P.O.Box

More information

Neural Networks Learning the network: Backprop , Fall 2018 Lecture 4

Neural Networks Learning the network: Backprop , Fall 2018 Lecture 4 Neural Networks Learning the network: Backprop 11-785, Fall 2018 Lecture 4 1 Recap: The MLP can represent any function The MLP can be constructed to represent anything But how do we construct it? 2 Recap:

More information

Analysis of Multilayer Neural Network Modeling and Long Short-Term Memory

Analysis of Multilayer Neural Network Modeling and Long Short-Term Memory Analysis of Multilayer Neural Network Modeling and Long Short-Term Memory Danilo López, Nelson Vera, Luis Pedraza International Science Index, Mathematical and Computational Sciences waset.org/publication/10006216

More information

2015 Todd Neller. A.I.M.A. text figures 1995 Prentice Hall. Used by permission. Neural Networks. Todd W. Neller

2015 Todd Neller. A.I.M.A. text figures 1995 Prentice Hall. Used by permission. Neural Networks. Todd W. Neller 2015 Todd Neller. A.I.M.A. text figures 1995 Prentice Hall. Used by permission. Neural Networks Todd W. Neller Machine Learning Learning is such an important part of what we consider "intelligence" that

More information

Design of adaptive load shedding by artificial neural networks

Design of adaptive load shedding by artificial neural networks Design of adaptive load shedding by artificial neural networks C-T Hsu, M-S Kang and C-S Chen Abstract: The design of an adaptive load-shedding strategy by executing an artificial neural network (ANN)

More information

Lecture 5: Logistic Regression. Neural Networks

Lecture 5: Logistic Regression. Neural Networks Lecture 5: Logistic Regression. Neural Networks Logistic regression Comparison with generative models Feed-forward neural networks Backpropagation Tricks for training neural networks COMP-652, Lecture

More information

(Feed-Forward) Neural Networks Dr. Hajira Jabeen, Prof. Jens Lehmann

(Feed-Forward) Neural Networks Dr. Hajira Jabeen, Prof. Jens Lehmann (Feed-Forward) Neural Networks 2016-12-06 Dr. Hajira Jabeen, Prof. Jens Lehmann Outline In the previous lectures we have learned about tensors and factorization methods. RESCAL is a bilinear model for

More information

Computational Intelligence

Computational Intelligence Plan for Today Single-Layer Perceptron Computational Intelligence Winter Term 00/ Prof. Dr. Günter Rudolph Lehrstuhl für Algorithm Engineering (LS ) Fakultät für Informatik TU Dortmund Accelerated Learning

More information

"APPENDIX. Properties and Construction of the Root Loci " E-1 K ¼ 0ANDK ¼1POINTS

APPENDIX. Properties and Construction of the Root Loci  E-1 K ¼ 0ANDK ¼1POINTS Appendix-E_1 5/14/29 1 "APPENDIX E Properties and Construction of the Root Loci The following properties of the root loci are useful for constructing the root loci manually and for understanding the root

More information

STUDYING THE OPTIMAL PERFORMANCE OF THE ZnO BASED VARISTORS USING ARTIFICIAL INTELLIGENCE TECHNIQUES

STUDYING THE OPTIMAL PERFORMANCE OF THE ZnO BASED VARISTORS USING ARTIFICIAL INTELLIGENCE TECHNIQUES Journal of Electron Devices, Vol. 9, 04, pp. 637-64 JED [ISSN: 68-347 ] STUDYING THE OPTIMAL PERFORMANCE OF THE ZnO BASED VARISTORS USING ARTIFICIAL INTELLIGENCE TECHNIQUES Mohammed AL Mohammed, Khalaf

More information

Artificial Neural Networks. Edward Gatt

Artificial Neural Networks. Edward Gatt Artificial Neural Networks Edward Gatt What are Neural Networks? Models of the brain and nervous system Highly parallel Process information much more like the brain than a serial computer Learning Very

More information

Neural Modelling of a Yeast Fermentation Process Using Extreme Learning Machines

Neural Modelling of a Yeast Fermentation Process Using Extreme Learning Machines Neural Modelling of a Yeast Fermentation Process Using Extreme Learning Machines Maciej Ławryńczu Abstract This wor details development of dynamic neural models of a yeast fermentation chemical reactor

More information

Neural networks. Chapter 20. Chapter 20 1

Neural networks. Chapter 20. Chapter 20 1 Neural networks Chapter 20 Chapter 20 1 Outline Brains Neural networks Perceptrons Multilayer networks Applications of neural networks Chapter 20 2 Brains 10 11 neurons of > 20 types, 10 14 synapses, 1ms

More information

Numerical solution of a Fredholm integro-differential equation modelling _ h-neural networks

Numerical solution of a Fredholm integro-differential equation modelling _ h-neural networks Available online at www.sciencedirect.com Applied Mathematics and Computation 195 (2008) 523 536 www.elsevier.com/locate/amc Numerical solution of a Fredholm integro-differential equation modelling _ h-neural

More information

VC-dimension of a context-dependent perceptron

VC-dimension of a context-dependent perceptron 1 VC-dimension of a context-dependent perceptron Piotr Ciskowski Institute of Engineering Cybernetics, Wroc law University of Technology, Wybrzeże Wyspiańskiego 27, 50 370 Wroc law, Poland cis@vectra.ita.pwr.wroc.pl

More information

An efficient method to compute singularity induced bifurcations of decoupled parameter-dependent differential-algebraic power system model

An efficient method to compute singularity induced bifurcations of decoupled parameter-dependent differential-algebraic power system model Applied Mathematics and Computation 167 (2005) 435 453 www.elsevier.com/locate/amc An efficient method to compute singularity induced bifurcations of decoupled parameter-dependent differential-algebraic

More information

Improved Lebesgue constants on the triangle

Improved Lebesgue constants on the triangle Journal of Computational Physics 207 (2005) 625 638 www.elsevier.com/locate/jcp Improved Lebesgue constants on the triangle Wilhelm Heinrichs Universität Duisburg-Essen, Ingenieurmathematik (FB 10), Universitätsstrasse

More information

Neural Networks. CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington

Neural Networks. CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington Neural Networks CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington 1 Perceptrons x 0 = 1 x 1 x 2 z = h w T x Output: z x D A perceptron

More information

Numerical Analysis: Solving Systems of Linear Equations

Numerical Analysis: Solving Systems of Linear Equations Numerical Analysis: Solving Systems of Linear Equations Mirko Navara http://cmpfelkcvutcz/ navara/ Center for Machine Perception, Department of Cybernetics, FEE, CTU Karlovo náměstí, building G, office

More information

Chapter 4 Neural Networks in System Identification

Chapter 4 Neural Networks in System Identification Chapter 4 Neural Networks in System Identification Gábor HORVÁTH Department of Measurement and Information Systems Budapest University of Technology and Economics Magyar tudósok körútja 2, 52 Budapest,

More information

Estimation of Inelastic Response Spectra Using Artificial Neural Networks

Estimation of Inelastic Response Spectra Using Artificial Neural Networks Estimation of Inelastic Response Spectra Using Artificial Neural Networks J. Bojórquez & S.E. Ruiz Universidad Nacional Autónoma de México, México E. Bojórquez Universidad Autónoma de Sinaloa, México SUMMARY:

More information

A null space method for solving system of equations q

A null space method for solving system of equations q Applied Mathematics and Computation 149 (004) 15 6 www.elsevier.com/locate/amc A null space method for solving system of equations q Pu-yan Nie 1 Department of Mathematics, Jinan University, Guangzhou

More information

Gradient Descent Training Rule: The Details

Gradient Descent Training Rule: The Details Gradient Descent Training Rule: The Details 1 For Perceptrons The whole idea behind gradient descent is to gradually, but consistently, decrease the output error by adjusting the weights. The trick is

More information

Computers and Electrical Engineering

Computers and Electrical Engineering Computers and Electrical Engineering 36 (2010) 56 60 Contents lists available at ScienceDirect Computers and Electrical Engineering journal homepage: wwwelseviercom/locate/compeleceng Cryptanalysis of

More information

STRUCTURED NEURAL NETWORK FOR NONLINEAR DYNAMIC SYSTEMS MODELING

STRUCTURED NEURAL NETWORK FOR NONLINEAR DYNAMIC SYSTEMS MODELING STRUCTURED NEURAL NETWORK FOR NONLINEAR DYNAIC SYSTES ODELING J. CODINA, R. VILLÀ and J.. FUERTES UPC-Facultat d Informàtica de Barcelona, Department of Automatic Control and Computer Engineeering, Pau

More information

Commun Nonlinear Sci Numer Simulat

Commun Nonlinear Sci Numer Simulat Commun Nonlinear Sci Numer Simulat xxx ) xxx xxx Contents lists available at ScienceDirect Commun Nonlinear Sci Numer Simulat journal homepage: wwwelseviercom/locate/cnsns A note on the use of Adomian

More information

Optimization. Totally not complete this is...don't use it yet...

Optimization. Totally not complete this is...don't use it yet... Optimization Totally not complete this is...don't use it yet... Bisection? Doing a root method is akin to doing a optimization method, but bi-section would not be an effective method - can detect sign

More information

CSC321 Lecture 5: Multilayer Perceptrons

CSC321 Lecture 5: Multilayer Perceptrons CSC321 Lecture 5: Multilayer Perceptrons Roger Grosse Roger Grosse CSC321 Lecture 5: Multilayer Perceptrons 1 / 21 Overview Recall the simple neuron-like unit: y output output bias i'th weight w 1 w2 w3

More information

New design of estimators using covariance information with uncertain observations in linear discrete-time systems

New design of estimators using covariance information with uncertain observations in linear discrete-time systems Applied Mathematics and Computation 135 (2003) 429 441 www.elsevier.com/locate/amc New design of estimators using covariance information with uncertain observations in linear discrete-time systems Seiichi

More information

GENERAL SOLUTION OF FULL ROW RANK LINEAR SYSTEMS OF EQUATIONS VIA A NEW EXTENDED ABS MODEL

GENERAL SOLUTION OF FULL ROW RANK LINEAR SYSTEMS OF EQUATIONS VIA A NEW EXTENDED ABS MODEL U.P.B. Sci. Bull., Series A, Vol. 79, Iss. 4, 2017 ISSN 1223-7027 GENERAL SOLUTION OF FULL ROW RANK LINEAR SYSTEMS OF EQUATIONS VIA A NEW EXTENDED ABS MODEL Leila Asadbeigi 1, Mahmoud Paripour 2, Esmaeil

More information

Convergence of Hybrid Algorithm with Adaptive Learning Parameter for Multilayer Neural Network

Convergence of Hybrid Algorithm with Adaptive Learning Parameter for Multilayer Neural Network Convergence of Hybrid Algorithm with Adaptive Learning Parameter for Multilayer Neural Network Fadwa DAMAK, Mounir BEN NASR, Mohamed CHTOUROU Department of Electrical Engineering ENIS Sfax, Tunisia {fadwa_damak,

More information

Introduction to Natural Computation. Lecture 9. Multilayer Perceptrons and Backpropagation. Peter Lewis

Introduction to Natural Computation. Lecture 9. Multilayer Perceptrons and Backpropagation. Peter Lewis Introduction to Natural Computation Lecture 9 Multilayer Perceptrons and Backpropagation Peter Lewis 1 / 25 Overview of the Lecture Why multilayer perceptrons? Some applications of multilayer perceptrons.

More information

Artificial Neural Networks

Artificial Neural Networks Artificial Neural Networks 鮑興國 Ph.D. National Taiwan University of Science and Technology Outline Perceptrons Gradient descent Multi-layer networks Backpropagation Hidden layer representations Examples

More information

POWER SYSTEM DYNAMIC SECURITY ASSESSMENT CLASSICAL TO MODERN APPROACH

POWER SYSTEM DYNAMIC SECURITY ASSESSMENT CLASSICAL TO MODERN APPROACH Abstract POWER SYSTEM DYNAMIC SECURITY ASSESSMENT CLASSICAL TO MODERN APPROACH A.H.M.A.Rahim S.K.Chakravarthy Department of Electrical Engineering K.F. University of Petroleum and Minerals Dhahran. Dynamic

More information