STRUCTURED NEURAL NETWORK FOR NONLINEAR DYNAMIC SYSTEMS MODELING


J. CODINA, R. VILLÀ and J.M. FUERTES
UPC-Facultat d'Informàtica de Barcelona, Department of Automatic Control and Computer Engineering, Pau Gargallo 5, 08028 Barcelona, Catalonia

Abstract. The use of artificial neural networks (ANN) for nonlinear system modeling is a field where much theoretical work remains to be done. A structured ANN which obtains neural models of nonlinear systems is presented. These neural models are Fourier-series based. To check the goodness of the method, conventional difference equations are re-modeled via ANN and their respective input/output behaviors are compared; their Fourier series expansions are also compared. Since the Fourier coefficients are optimal under series truncation, this allows the goodness of the obtained models to be estimated. Preliminary tests give encouraging results.

Key Words. Neural nets; Nonlinear systems; Modeling; Control systems; State-space methods; Fourier analysis.

1. INTRODUCTION

Artificial neural networks (ANN) can be used, and are used, to model the dynamics of a system or plant (Yamada and Yabuta, 1993). Usually the ANN and the plant are fed with the same input signal (or input signals, for multi-input systems) in order to train the neural network to model the plant behavior. The network output is then compared with the output from the plant, and the error is used to update the weights of the synapses (see Fig. 1). With this configuration an input-output model of the plant is obtained, instead of a state-space model. The advantage of this model is that a series-parallel model can be used while training the network. In a series-parallel model (Fig. 2) the previous outputs from the plant are used as inputs to the ANN instead of the outputs from the neural network itself. Once the model is obtained, the inputs can then be taken from the previous outputs of the ANN, yielding an autonomous system.
Fig. 1. ANN training structure for system modeling.

Fig. 2. Series-parallel training model.

A simple feed-forward neural network can be used (Narendra and Parthasarathy, 1990; Wu et al., 1992) if the ANN is fed with the current input, the previous inputs and the previous outputs. Using the series-parallel model configuration, the obtained model has the following equation:

y(k) = F(u(k), u(k-1), u(k-2), ..., y(k-1), y(k-2), ...)    (1)

Different numbers of layers, activation functions and adaptation algorithms can be used; Qin et al. (1992) compare some configurations.

To obtain a state-space model of a plant, a feed-forward ANN can also be used, assuming that the state of the system is accessible at any moment (Nguyen and Widrow, 1990; Anderson, 1989). The resulting model is then a state-space representation of the system dynamics:

[y(k), x(k+1)] = F(u(k), x(k))    (2)

In both cases the learning method is static, and usually the backpropagation algorithm is used.

A dynamic discrete-time ANN can also be used. This is the case of the Jordan (1986) or Elman (1988) models, where the outputs from the network or from the hidden layer, respectively, are fed back as new inputs to the network (Fig. 3).

The ability to extract useful information about the real system from the simulated neural network model is of primary interest. Although the existence of a neural network structure that allows the extraction of a set of equations is useful for the study of the system properties, it does not relate those equations to physical phenomena. A neural network with a structure inspired by the state-space representation of a system could allow the application of modern control theory to SISO and MIMO systems.

Beginning with linear discrete-time systems, we first obtained a neural network (Codina et al., 1992) able to learn the matrices of the system representation from the input-output behavior (Fig. 4). This model is a neural network with three layers: the input, the state and the output layers. The state and the output layers are connected to the input layer, which is composed of the actual inputs and states of the system.

Fig. 3. a) Jordan network. b) Elman network.

The Jordan network is very similar to the series-parallel model, but it needs as many outputs as the system order.
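The distinction between series-parallel (teacher-forced) training and parallel (autonomous) evaluation can be sketched numerically. The plant below is a hypothetical first-order linear example chosen for clarity, not the paper's test system, and the "network" is reduced to a least-squares linear map so the two evaluation modes are easy to compare:

```python
import numpy as np

# Hypothetical plant: y(k+1) = 0.5*y(k) + 0.3*u(k)  (an illustrative choice,
# not the paper's example system).
def plant_step(y_prev, u_prev):
    return 0.5 * y_prev + 0.3 * u_prev

rng = np.random.default_rng(0)
u = rng.uniform(-1.0, 1.0, size=200)
y = np.zeros(201)
for k in range(200):
    y[k + 1] = plant_step(y[k], u[k])

# Series-parallel training: the regressor uses the PLANT's past output y(k-1),
# so the model (here a linear map fitted by least squares) sees correct history.
X = np.column_stack([y[:-1], u])              # rows: [y(k-1), u(k-1)]
theta, *_ = np.linalg.lstsq(X, y[1:], rcond=None)

# Parallel (autonomous) evaluation: the model's OWN past outputs are fed back.
y_hat = np.zeros(201)
for k in range(200):
    y_hat[k + 1] = theta[0] * y_hat[k] + theta[1] * u[k]

mismatch = np.max(np.abs(y_hat - y))          # small only when the fit is good
```

Because the fit is exact here, the parallel simulation reproduces the plant output; with an imperfect model, feeding back the model's own outputs lets one-step errors accumulate, which is why the series-parallel form is convenient during training.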
The Elman structure has some problems learning systems where there is no direct connection between the input and the output, as happens when there are physical delays.

After the ANN training, we obtain a black box which acts as a system simulator. This neural network is often obtained as a first step in the design of neural controllers. There are difficulties in studying the properties of the plant from such a model: the kind of activation functions, the number of layers, and the topology of the network are more a hindrance than a help for this study.

2. STATE-SPACE MODEL

Fig. 4. Proposed linear structure.

To train the network, the backpropagation-through-time learning algorithm has been used (Werbos, 1990), where the error is back-propagated from the output, through the present states, to the previous states. The weight updating is done in batch mode; that is, the weights are changed after a number N of input-output pairs have been processed. The number N is taken to be equal to the length of the input signal. With this methodology, and using linear neurons, we can obtain from the synapse weights the matrices A, B, C and D of the state-space representation of the system:

x(k+1) = Ax(k) + Bu(k)
y(k) = Cx(k) + Du(k)    (3)

As the main interest of applying neural networks to dynamic systems is their ability to learn and map nonlinear functions, we have expanded our previous model to deal with nonlinear systems. Expanding the linear model to nonlinear systems using sigmoids hinders further study of the obtained model. To avoid this drawback we used a structured neural network based on Fourier series, able to approximate any nonlinear function in a bounded interval (Fig. 5).
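A minimal sketch of the batch backpropagation-through-time update for the linear model (3), with a scalar state so the adjoint recursion can be written out explicitly. The plant parameters, initial guesses, learning rate and epoch count below are illustrative assumptions, not values from the paper:

```python
import numpy as np

# Scalar linear state-space plant: x(k+1) = a*x(k) + b*u(k), y(k) = c*x(k) + d*u(k).
rng = np.random.default_rng(1)
N = 200
u = rng.uniform(-1.0, 1.0, size=N)
a_t, b_t, c_t, d_t = 0.8, 1.0, 0.5, 0.1   # assumed "true" plant
x = 0.0
y = np.zeros(N)
for k in range(N):
    y[k] = c_t * x + d_t * u[k]
    x = a_t * x + b_t * u[k]

a, b, c, d = 0.5, 0.5, 0.5, 0.0           # initial model weights
lr = 0.05
for epoch in range(4000):                  # batch mode: one update per full pass
    # forward pass through the whole input signal
    xs = np.zeros(N + 1)
    for k in range(N):
        xs[k + 1] = a * xs[k] + b * u[k]
    e = (c * xs[:N] + d * u) - y           # output errors, L = mean(e**2)
    # backward pass: adjoint lam[k] = dL/dx(k), propagated back through time
    lam = np.zeros(N + 1)
    for k in range(N - 1, -1, -1):
        lam[k] = (2.0 / N) * e[k] * c + a * lam[k + 1]
    ga = np.dot(lam[1:], xs[:N])           # dL/da
    gb = np.dot(lam[1:], u)                # dL/db
    gc = (2.0 / N) * np.dot(e, xs[:N])     # dL/dc
    gd = (2.0 / N) * np.dot(e, u)          # dL/dd
    a -= lr * ga; b -= lr * gb; c -= lr * gc; d -= lr * gd
    a = np.clip(a, -0.99, 0.99)            # pragmatic stabilizer (our addition)

mse = np.mean(e ** 2)                      # error of the last forward pass
```

Note that b and c are only identifiable up to a rescaling of the state, so a sensible check is the transition coefficient and the output error rather than the individual input/output weights.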

A nonlinear discrete-time system can be modeled by the following difference equations:

x(k+1) = F(x(k), u(k))
y(k) = G(x(k), u(k))    (4)

Fig. 5. Proposed nonlinear structure.

The Fourier series expansion allows us to express F and G as weighted sums of sinusoidal functions in a bounded interval:

x̃(k+1) = Σ_{n=0}^{N} Σ_m [ Ã_Fn,m cos(n wn x̃(k) + m wm u(k)) + B̃_Fn,m sin(n wn x̃(k) + m wm u(k)) ]
ỹ(k) = Σ_{n=0}^{N} Σ_m [ Ã_Gn,m cos(n wn x(k) + m wm u(k)) + B̃_Gn,m sin(n wn x(k) + m wm u(k)) ]    (5)

Such series have the advantage that they form an orthogonal basis of functions and, in particular, any coefficient can be added without changing the previous ones. So the network is scalable, in the sense that new sinusoidal elements can be added while starting from the previously learned structure and weights. This model has similarities with a functional-link net (Pao, 1989), but neurons with fixed weights are used in order to allow the application of the backpropagation-through-time algorithm.

Once the linear part of the model has been approximated, the network is expanded to include the nonlinear terms. This expansion causes only a small change to the linear coefficients. If the weights of the sine and cosine neurons are fixed to be n·w0, then this training method can be compared with the direct calculation of the Fourier coefficients (Codina et al., 1994). In other words, the ANN is calculating the discrete Fourier transform of the nonlinear transfer function in a bounded interval. To improve the results, the number of points should be at least the number of Fourier coefficients being calculated, to avoid multiple solutions, and the points should form a regular sampling of the state space.
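The equivalence claimed above, that with fixed sinusoidal neurons the learned output weights are just Fourier coefficients, can be checked directly: on a regular sampling of the interval, least squares over a truncated harmonic basis recovers the discrete Fourier coefficients. The target nonlinearity g below is an illustrative smooth function, not one of the paper's F or G:

```python
import numpy as np

# g is an illustrative smooth nonlinearity on a bounded interval.
g = lambda x: np.exp(np.sin(x))

N = 6                                       # number of harmonics kept
M = 256                                     # regular sampling of the interval
x = -np.pi + 2 * np.pi * np.arange(M) / M   # regular grid on [-pi, pi)

# Fixed-frequency feature matrix: [1, cos(n x), sin(n x)], n = 1..N.
# Only the output weights w are "learned" (here: by least squares).
Phi = np.column_stack(
    [np.ones(M)]
    + [np.cos(n * x) for n in range(1, N + 1)]
    + [np.sin(n * x) for n in range(1, N + 1)]
)
w, *_ = np.linalg.lstsq(Phi, g(x), rcond=None)

# The learned weight for cos(x) matches the discrete Fourier coefficient,
# because the harmonics are orthogonal on a regular grid.
a1_fourier = (2.0 / M) * np.sum(g(x) * np.cos(x))
err = np.max(np.abs(Phi @ w - g(x)))        # truncation error of the series
```

Because exp(sin(x)) is smooth and periodic, its Fourier coefficients decay quickly and six harmonics already reproduce it almost exactly; for a non-periodic nonlinearity the linear term of the hybrid expansion below becomes important.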
In order to obtain the linear part of the system and to minimize the Gibbs phenomenon, we have included a linear part in the Fourier expansion, through the c coefficients. This gives a hybrid expansion:

x̃(k+1) = c1 x(k) + c2 u(k) + Σ_{n=0}^{N} Σ_m [ Ã_Fn,m cos(n wn x̃(k) + m wm u(k)) + B̃_Fn,m sin(n wn x̃(k) + m wm u(k)) ]
ỹ(k) = c1 x(k) + c2 u(k) + Σ_{n=0}^{N} Σ_m [ Ã_Gn,m cos(n wn x(k) + m wm u(k)) + B̃_Gn,m sin(n wn x(k) + m wm u(k)) ]    (6)

3. EXAMPLE

To test the ANN structure, a discrete-time nonlinear system has been simulated:

x(k+1) = (3x(k) + 2)/4 + 0.5 + 0.25·u²(k)/(1 + u²(k))
y(k) = (4/π)·arctan(x(k)/2) + (log(u(k) + 3) + 1)/1.61    (7)

The learning procedure cannot use a series-parallel model, where the previous outputs of the real system are used instead of those from the simulated system. That approach is not feasible with a state-space structure, because neither the states of the real system nor the internal representation adopted by the ANN are known. Using the parallel model increases the number of training examples needed, and the results are highly sensitive to initial conditions.

The methodology consists of the following steps:
1) A first approximation is obtained by a linear network, with a fixed training signal.
2) The model is then expanded using the Fourier terms, and the network is trained to learn the coefficients for the same training signal. This step is repeated until the mean square error (MSE) decreases to a fixed order.
3) The ANN is trained with different signals in order to make the system evolve through the whole working region of the state space.
4) If the number of Fourier coefficients is too small, or the training signal in step 2 was not rich enough, then the third step results in important differences in the MSE between signals. If this happens, return to step 2 with a new training signal.
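The benefit of the hybrid expansion (6) can be illustrated on a non-periodic nonlinearity: adding a plain linear feature to the truncated Fourier basis removes the jump in the periodic extension of the target, and with it most of the Gibbs error. Here arctan is used as a stand-in target; the interval, harmonic count and sample count are our assumptions:

```python
import numpy as np

N, M = 8, 400
x = np.linspace(-np.pi, np.pi, M)
t = np.arctan(x)                              # non-periodic target on [-pi, pi]

feats = [np.ones(M)]
feats += [np.cos(n * x) for n in range(1, N + 1)]
feats += [np.sin(n * x) for n in range(1, N + 1)]
Phi = np.column_stack(feats)                  # pure truncated Fourier basis
Phi_h = np.column_stack(feats + [x])          # ...plus the linear term of (6)

w, *_ = np.linalg.lstsq(Phi, t, rcond=None)
wh, *_ = np.linalg.lstsq(Phi_h, t, rcond=None)

mse_fourier = np.mean((Phi @ w - t) ** 2)     # suffers Gibbs ringing at the edges
mse_hybrid = np.mean((Phi_h @ wh - t) ** 2)   # linear term absorbs the jump
```

The linear feature absorbs the discontinuity of the periodic extension, so the remaining residual is much smoother and the same number of harmonics fits it far better.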

In our example we used as input a signal made up of 100 samples of a random step signal (Fig. 6), from which we obtained the desired output. As we use batching, the weight increments were calculated for each input-output pair but only applied at the end of the input signal processing.

Fig. 6. Input signal used during training.

The first approximation was linear. After presenting the training signal to the ANN one hundred times, the linear model was considered correct. Different input signals will produce different linear models, depending on which areas of the state space the system evolves in.

After the linear approximation was obtained, the ANN was expanded with successive Fourier coefficients and trained, for each one, one hundred times. The MSE curve (Fig. 7) reflects this procedure, and shows a fast decrease of the error each time a new frequency element is added.

Fig. 7. Error curve: logarithm of the mean square error obtained during training. Every hundred trainings a new frequency element is added.

The difference between the desired output and the one obtained from the network is negligible, for the training signal, when using eight Fourier terms (Fig. 8).

Fig. 8. Plant output (continuous line) and ANN output (dashed line) with the training signal as input.

The test signal used for the example was

u(k) = 0.5 sin(k/10) + 0.5 sin(k²/10)    (8)

When the test signal is presented to the ANN the MSE increases, but only by a small amount, as can be seen in Fig. 9.

Fig. 9. Plant and ANN output with a test signal as input.

There are differences (Fig. 10) between the evolution of the states of the plant and their representation by the ANN. This is not surprising, as a system may have multiple state-space descriptions. The ANN will evolve towards the state-space representation which is most easily obtainable. The whole learning process and the initial weights, together with the training signal used, modulate the shape of this representation.
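The two excitation signals can be generated as below. The paper specifies only "100 samples of a random step signal", so the step durations and amplitudes here are our assumptions, and the test signal follows our reading of Eq. (8) from this transcription:

```python
import numpy as np

rng = np.random.default_rng(2)

# Training input: random step signal, 100 samples.
# Step duration and amplitude ranges are assumed, not given in the paper.
u_train = np.zeros(100)
k = 0
while k < 100:
    hold = rng.integers(5, 15)                 # assumed step duration range
    u_train[k:k + hold] = rng.uniform(-1, 1)   # assumed amplitude range
    k += hold

# Test signal of Eq. (8), as read from the transcription.
k = np.arange(100)
u_test = 0.5 * np.sin(k / 10) + 0.5 * np.sin(k ** 2 / 10)
```

Both signals stay within [-1, 1], keeping the plant inside the bounded interval on which the Fourier expansion is valid.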
When the test signal is presented to the resulting ANN, it can happen that this new signal makes the system evolve through areas of the state space not reached when the training signal was used. As those areas could not be learned during the training phase, the result is an increase of the error.

Fig. 10. State evolution of the real system (continuous line) and the trained system (dotted line).

4. CONCLUSIONS

The absence of a methodology for the use of ANN in nonlinear systems modeling is an important factor impeding their widespread use. The difficulty of finding the right number of neurons or layers, and the uncertainty of the results (local minima, random initial values, ...), are still deterrents to their industrial application. In this paper we present an ANN topology designed to help resolve those uncertainties.

This paper presents a structured ANN based on state-space models of the system equations, together with the use of sine, cosine and linear activation functions. The ANN calculates the Fourier coefficients of the nonlinear functions that relate the states and inputs with the outputs and the new states. As the Fourier coefficients can also be calculated analytically or numerically, the results from the ANN can be compared with those expected from classical methods. The use of state-space equations for system description minimizes the number of inputs to the neural network and allows a further study of the system properties. Here the model has been presented together with one example of its viability for the modeling of nonlinear discrete-time systems.

Acknowledgment. This research work has received funding from the CICYT ref. 89-0278 and support from CERCA, Col·lectiu d'Estudis i Recerca en Control i Automàtica.

REFERENCES

Anderson, C.W. (1989). Learning to Control an Inverted Pendulum Using Neural Networks. IEEE Control Systems Mag., April, 31-37.

Codina, J., B. Morcego, J.M. Fuertes and A. Català (1992). A Novel Neural Network Structure for Control. IEEE Int. Conf. on Systems, Man and Cybernetics, Chicago, pp. 1339-1344.

Codina, J., J.C. Aguado and J.M. Fuertes (1994). Capabilities of a Structured Neural Network. Learning and Comparison with Classical Techniques. To appear in Proc. of the ESANN Euroconference.

Elman, J.L. (1988). Finding Structure in Time. Report 8801, University of California, San Diego.

Jordan, M.I. (1986). Serial Order: A Parallel Distributed Processing Approach. Institute for Cognitive Science, Report 8604, University of California, San Diego.

Narendra, K.S. and K. Parthasarathy (1990). Identification and Control of Dynamical Systems Using Neural Networks. IEEE Trans. on Neural Networks, March, 4-27.

Nguyen, D.H. and B. Widrow (1990). Neural Networks for Self-Learning Control Systems. IEEE Control Systems Mag., April, 18-23.

Pao, Y. (1989). Adaptive Pattern Recognition and Neural Networks. Addison-Wesley, Reading, MA.

Qin, S., H. Su and T.J. McAvoy (1992). Comparison of Four Neural Net Learning Methods for Dynamic Systems Identification. IEEE Trans. on Neural Networks, January, 122-130.

Werbos, P.J. (1990). Backpropagation Through Time: What it Does and How to Do it. Proc. of the IEEE, Vol. 78, No. 10, 1550-1560.

Wu, Q.H., B.W. Hogg and G.W. Irwin (1992). A Neural Network Regulator for Turbogenerators. IEEE Trans. on Neural Networks, January, 95-100.

Yamada, T. and T. Yabuta (1993). Dynamic System Identification Using Neural Networks. IEEE Trans. on Systems, Man, and Cybernetics, January/February, 204-211.