Neural Network Identification of Non Linear Systems Using State Space Techniques.


Neural Network Identification of Non Linear Systems Using State Space Techniques

Joan Codina, J. Carlos Aguado, Josep M. Fuertes
Automatic Control and Computer Engineering Department
Universitat Politècnica de Catalunya
C/ Pau Gargallo, 5 - 08028 Barcelona - Spain

Abstract

This paper focuses on the identification of non linear systems by means of discrete-time Artificial Neural Networks (ANN). A new ANN architecture, with the structure of the state space description of systems, is compared to the input-output ANN structure. Through it, the results obtained by our neural network are compared with those presented by Narendra in [1]. The purpose of this comparison is to demonstrate the viability of the proposed model and to evaluate the characteristics and possibilities of each method.

I. INTRODUCTION

System identification is a fundamental question in systems theory. ANN can be used to identify dynamic systems in order to design an ANN controller based on the system model. The problems of using ANN in this field arise from the general properties of neural networks: the distribution of information inside the network, which hinders the study of the model, and the lack of methodologies for the application of neural networks (number of layers or neurons, learning rate). Because of this, the identification of systems using ANN is not a self-contained subject, but a tool needed by other applications of ANN such as the design of non linear controllers.

The use of ANN in the identification of dynamic systems has been carried out mainly in two ways: either using classical dynamic neural networks or using feed-forward neural networks with external delays. The use of standard dynamic neural networks such as those of Jordan [2] or Elman [3] presents some problems: an example is given when some of the system states are not connected directly to the output signals, for instance in systems with physical delays.
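The feed-forward-with-external-delays approach can be sketched as follows. This is an illustrative example, not the authors' code: the network sizes, the random weights, and running the untrained network in parallel mode are all assumptions made for the sketch.

```python
import numpy as np

# Illustrative sketch of a feed-forward network with external delays:
# the network input stacks delayed plant inputs and outputs, and the
# network predicts the next output y(k+1).

rng = np.random.default_rng(0)

N_U, N_Y, HIDDEN = 3, 3, 10          # delayed inputs/outputs, hidden units

def regressor(u, y, k):
    """Stack [u(k), u(k-1), u(k-2), y(k), y(k-1), y(k-2)]."""
    return np.array([u[k], u[k - 1], u[k - 2], y[k], y[k - 1], y[k - 2]])

# One hidden layer of sigmoid neurons and a linear output neuron
# (assumed sizes; sigmoids are the activations used by Narendra).
W1 = rng.normal(scale=0.5, size=(HIDDEN, N_U + N_Y))
b1 = np.zeros(HIDDEN)
W2 = rng.normal(scale=0.5, size=HIDDEN)

def predict(x):
    h = 1.0 / (1.0 + np.exp(-(W1 @ x + b1)))   # sigmoid hidden layer
    return float(W2 @ h)                        # linear output neuron

# Run the (untrained) network in parallel mode on a sinusoidal input:
# its own past predictions are fed back as the delayed outputs.
K = 100
u = np.sin(2 * np.pi * np.arange(K) / 25)
y = np.zeros(K)
for k in range(2, K - 1):
    y[k + 1] = predict(regressor(u, y, k))
```

Training would adjust W1, b1 and W2 so that the predicted y(k+1) tracks the plant output for the same regressor.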
In Jordan nets we also need the number of outputs to be at least the same as the system order, a condition that can sometimes be unattainable. Research in the field of system identification has developed models based on feed-forward neural networks where the dynamic behavior is obtained through external delays of the inputs and the previous outputs (Fig. 1):

Fig. 1. Neural network based on the input-output description of systems. The dynamic behavior is obtained with external delays (z⁻¹).

y(k+1) = F[ u(k), u(k-1), u(k-2), ..., y(k), y(k-1), y(k-2), ... ]   (1)

One of the most extensive studies with this methodology, in which different configurations are examined, was done by Narendra. To show the viability of the model, five different systems were used as examples, indicating the signals used to train and test the system together with the number of learning iterations. Sigmoids were used as the activation functions of the neurons.

This network is compared to a neural network with a structure inspired by the state space representation of a system. Such a structure could allow the application of modern control theory to single-input single-output and to multivariable systems. In this network we use sines and cosines as activation functions in order to allow further studies on the ANN application capabilities and methodologies.

NEURAL NETWORK STRUCTURE

From our point of view, an ANN that learns the system dynamics efficiently should also allow a further study of the

learned system. With this approach, the model can be validated and used in combination with techniques other than ANN. For this reason, we began working in this direction by modeling linear discrete systems. We first obtained a neural network, Codina [4], able to learn the matrices of the state representation of a system from pairs of input-output signals.

The neural network structure has three layers: input, state and output. The state and the output layers are connected to the input layer, which is composed of the actual inputs and state of the system (Fig. 2). The back-propagation through time learning algorithm [5] has been used, where the error is back-propagated from the actual state to the previous state. With this methodology, and using linear neurons, we can obtain from the system the matrices A, B, C and D:

x(k+1) = A x(k) + B u(k)   (2)
y(k)   = C x(k) + D u(k)

The main interest of applying neural networks to dynamic systems arises from their ability to learn and map non linear functions. We have extended our previous model to deal with non linear systems, although expanding the linear model using sigmoids impedes further study of the resulting model. To overcome this drawback we have used a structured neural network based on Fourier theory, which approximates any non linear function within a bounded interval by means of a weighted sum of sines and cosines. A non linear discrete system can be expressed by the following difference equations:

x(k+1) = F( x(k), u(k) )   (3)
y(k)   = G( x(k), u(k) )

where F and G can be approximated in a bounded interval by a Fourier series, if we assume that both functions satisfy the Dirichlet conditions (which is a usual restriction). For one input and one state:

Fig. 2. Neural network structure based on the system state representation.
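As an illustration of the structured-network idea for one state and one input, the state update can be represented as a linear term plus a truncated sum of sines and cosines of combined phases of x(k) and u(k). This is a sketch, not the authors' implementation: the truncation orders, the base frequencies, letting the second phase index run over both signs, the toy update function, and fitting the coefficients by least squares (they enter linearly) instead of by network training are all assumptions made here.

```python
import numpy as np

# Sketch: one-state, one-input structured model. The next state is a
# linear term in (x, u) plus truncated sine/cosine terms of the
# combined phases n*wx*x + m*wu*u.

rng = np.random.default_rng(0)
N, M = 3, 2                    # truncation orders (assumed)
wx, wu = 1.0, 1.0              # base frequencies (assumed)

def features(x, u):
    """Linear, cosine and sine features for the pair (x(k), u(k))."""
    phases = np.array([n * wx * x + m * wu * u
                       for n in range(N + 1) for m in range(-M, M + 1)])
    return np.concatenate(([x, u], np.cos(phases), np.sin(phases)))

# Toy nonlinear state update F(x, u), exactly representable in this
# basis since sin(x)cos(u) = (sin(x+u) + sin(x-u)) / 2.
F = lambda x, u: 0.5 * x + np.sin(x) * np.cos(u)

# Sample the bounded working region and solve for the coefficients.
XU = rng.uniform(-2.0, 2.0, size=(400, 2))
Phi = np.array([features(x, u) for x, u in XU])
target = np.array([F(x, u) for x, u in XU])
coef, *_ = np.linalg.lstsq(Phi, target, rcond=None)

errs = np.abs(Phi @ coef - target)   # fit error on the sampled region
```

In the paper the same coefficients are instead learned as network weights; the least-squares fit above corresponds to the "classical methods" against which the ANN results are compared in [6].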
x(k+1) = Σ(n=0..N) Σ(m=0..M) [ AF(n,m) cos( n wx x(k) + m wu u(k) ) + BF(n,m) sin( n wx x(k) + m wu u(k) ) ]   (4)

y(k)   = Σ(n=0..N) Σ(m=0..M) [ AG(n,m) cos( n wx x(k) + m wu u(k) ) + BG(n,m) sin( n wx x(k) + m wu u(k) ) ]   (5)

where AF, BF, AG and BG are the Fourier coefficients of F and G, and wx, wu are the base frequencies of the expansion. The coefficients can also be calculated by classical methods, and those results can be compared with the ones obtained by the ANN, work carried out in [6]. A linear term is added in order to improve convergence and to reduce the number of Fourier coefficients needed.

IDENTIFICATION PROCESS

To demonstrate the viability of our ANN architecture we have tested it with the five system models that Narendra used in [1]. Each of the five models has been simulated and a neural network has been trained to learn the system behavior. After the training, the ANN is fed with a test signal to evaluate the learning capabilities of the proposed structure.

Learning procedure

The learning procedure used here differs from the one presented by Narendra, who used a series-parallel model, in which the previous outputs of the real system are used instead of those of the simulated system. That approach is not feasible with a state space structure, because neither the state of the real system nor the internal state representation adopted by the ANN is known. Using the parallel model increases the number of training examples, and they are highly sensitive to initial conditions.

The methodology used has the following steps.

1) A first approximation is obtained by a linear network, with a fixed training signal.
2) The model is then expanded using the Fourier terms, and the network is trained to learn the coefficients for the same training signal.
3) The NN is trained with different signals in order to make the system evolve through the whole working region of the state space.
4) If the number of Fourier coefficients is too small, the third step results in a significant increase of the MSE, and we return to step 2.

The system models

In his paper, Narendra presented six experiments in model identification, but the sixth was only a slight variation of the first. Thus we have replicated exactly his five examples.

Example 1. The nonlinear plant to be identified was:

y(k+1) = 0.3 y(k) + 0.6 y(k-1) + u³(k) + 0.3 u²(k) - 0.4 u(k)   (6)

and the test input chosen:

u(k) = sin(2πk/250) + sin(2πk/25)   (7)

Example 2. The next nonlinear plant was second order:

y(k+1) = [ y(k) y(k-1) ( y(k) + 2.5 ) ] / [ 1 + y²(k) + y²(k-1) ] + u(k)   (8)

and the test input tried was:

u(k) = sin(2πk/25)   (9)

Example 3. The third nonlinear plant to be identified had the form:

y(k+1) = y(k) / ( 1 + y²(k) ) + u³(k)   (10)

and was tested with the input function:

u(k) = sin(2πk/25) + sin(2πk/10)   (11)

Example 4. The next nonlinear plant assumed was:

y(k+1) = f[ y(k), y(k-1), y(k-2), u(k), u(k-1) ]   (12)

f(x1, x2, x3, x4, x5) = [ x1 x2 x3 x5 ( x3 - 1 ) + x4 ] / ( 1 + x2² + x3² )

and the test input provided:

u(k) = sin(2πk/250)                           if k ≤ 500   (13)
u(k) = 0.8 sin(2πk/250) + 0.2 sin(2πk/25)     if k > 500

Example 5. The last nonlinear plant tried was:

[ y1(k+1) ]   [ y1(k) / ( 1 + y2²(k) )        ]   [ u1(k) ]
[ y2(k+1) ] = [ y1(k) y2(k) / ( 1 + y2²(k) )  ] + [ u2(k) ]   (14)

where the inputs used to test the trained network were:

[ u1(k), u2(k) ]ᵀ = [ sin(2πk/25), cos(2πk/25) ]ᵀ   (15)

Simulation results

In all five cases, the normalized MSE between the desired output and the output of the network decreases during learning to values between 10⁻⁴ and 10⁻⁶. If, when tested, the error increased significantly, more coefficients were added and new training signals were used. The test signals are the same ones used by Narendra.

In Fig. 3 to Fig. 8 the outputs of the real system (continuous trace) and the outputs of the trained ANN (dashed trace) are compared.

Fig. 3. Example 1. Output of the plant and identification model.
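For reference, the plants above are straightforward to simulate directly. A minimal sketch (not the authors' code) of the Example 2 plant driven by its sinusoidal test input:

```python
import numpy as np

# Direct simulation of the Example 2 plant,
#   y(k+1) = y(k) y(k-1) (y(k) + 2.5) / (1 + y²(k) + y²(k-1)) + u(k),
# driven by the test input u(k) = sin(2πk/25). The horizon K and zero
# initial conditions are assumptions for the sketch.

K = 200
u = np.sin(2 * np.pi * np.arange(K) / 25)
y = np.zeros(K)
for k in range(1, K - 1):
    y[k + 1] = (y[k] * y[k - 1] * (y[k] + 2.5)
                / (1.0 + y[k] ** 2 + y[k - 1] ** 2) + u[k])

# y now holds the "real system" trace that an identification model
# must reproduce when fed the same input.
```

The same loop, with the right-hand side replaced by the corresponding difference equation, generates the traces for the other examples.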

Fig 4. Example. Output of the plant and identification model. Fig 8. Example 5. Second output of the plant and second output of the identification model CONCLUSIONS Fig 5. Example 3. Output of the plant and identification model. Fig 6. Example 4. Output of the plant and identification model. Using structured ANN with sine, cosine, and linear activation functions allows us to extract information from the network about the system. Input-output models with sigmoids as activation functions allow the use of neural networks for identification purposes, but no information can be obtained from the resulting structure. In addition, the use of a network structure related with state space description of systems allows us to obtain the equations. With the equations we can apply many of the classical methodologies dealing with non linear systems, usually related with the state space description of systems. The use of state space representation has as a drawback the difficulty to know the initial state, and that the neural network learns his own state representation, that perhaps has no match with the real systems states. On the other hand, if we use input-output models for systems with multiple inputs and multiple outputs we will obtain different equations for every output. The use of structured neural networks has shown to be as useful for the identification of non linear dynamic systems as some of the other approaches. As classical methodologies, our structure allows the extraction of a mathematical model of the system, but it adds the advantages of neural networks to learn and generalize. Our model can also be used to train neural network controllers. REFERENCES Fig 7. Example 5. First output of the plant and of the identification model. [1] Narendra K.S., Parthasarathy K. Identification and Control of Dynamical Systems Using Neural Networks. IEEE Transactions on Neural Networks. Vol 1. N 1. March 1990. [] Jordan, M.I. 
Serial Order: A Parallel Distributed Processing approach. Institute for Cognitive Science Report 8604. University of California, Sand Diego. 1986

[3] Elman, J.L. Finding Structure in Time. Center For Research in Language. Report 8801. University of California. San Diego 1988. [4] Codina J., Morcego B., Fuertes J.M., Català A. A Novel Neural Network Structure for Control. IEEE Int. Conf. on Systems, Man and Cybernetics. pp. 1339-1344. Chicago.199 [5] Werbos P.J. Backpropagation Through Time: What it Does and How to Do it. Proceedings of the IEEE. Vol. 78 N. 10 pp. 1550-1560 October 1990. [6] Codina J., Aguado J. C., Fuertes J. M. Capabilities of a Structured Neural Network. Learning and Comparison with Classical Techniques. Unpublished.