Model Reference Adaptive Control for Multi-Input Multi-Output Nonlinear Systems Using Neural Networks

Jiunshian Phuah, Jianming Lu, and Takashi Yahagi
Graduate School of Science and Technology, Chiba University, Chiba 263-8522, Japan
Email: jsphuah@graduate.chiba-u.jp

Abstract
This paper presents a method of model reference adaptive control (MRAC) for multi-input multi-output (MIMO) nonlinear systems using neural networks (NNs). The control input is given by the sum of the output of a model reference adaptive controller and the output of the NN. The NN compensates for the nonlinearity of the plant dynamics, which is not taken into consideration in the usual MRAC. The role of the NN is to construct a linearized model by minimizing the output error caused by nonlinearities in the control system.

INTRODUCTION
MRAC is an important class of adaptive control schemes [1],[2]. In the direct MRAC scheme, the regulator is updated online so that the plant output follows the output of a reference model. In the MRAC of a linear plant, the reference model and the controller structure are chosen in such a way that a parameter set of the regulator exists that ensures perfect model following [3],[4]. However, for nonlinear plants with unknown structures, it may not be possible to ensure perfect model following [5].

This paper presents a structure of an MRAC system for MIMO nonlinear systems using NNs. The control input is given by the sum of the output of a model reference adaptive controller and the output of the NN. The role of the NN is to construct a linearized model by minimizing the output error caused by nonlinearities in the control system. The role of the model reference adaptive controller is to match the uncertain linearized system to a given linear reference model.
One of the distinctive features of the proposed structure is an efficient method for calculating the derivative of the system output with respect to the input, using one identified parameter of the linearized model and the internal variables of the NN. This allows the backpropagation algorithm to be performed very efficiently. Furthermore, a unique property of the proposed method is that the NN does not need to operate if the plant is linear. Finally, computer simulations are carried out and the effectiveness of the control system is confirmed.

LINEAR MRAC
In this section, we briefly describe MIMO linear discrete-time MRAC, in which the controller is designed so that the plant output $Y(k)$ converges to the reference model output $Y_m(k)$. Let us consider the MIMO linear discrete-time system described by

$$A(z)Y(k) = \mathrm{diag}(z^{-d_i})B(z)U(k) \tag{1}$$

where

$$A(z) = \mathrm{diag}[A_1(z), \ldots, A_p(z)], \qquad B(z) = \begin{bmatrix} z^{-d_{11}+d_1}B_{11}(z) & \cdots & z^{-d_{1p}+d_1}B_{1p}(z) \\ \vdots & \ddots & \vdots \\ z^{-d_{p1}+d_p}B_{p1}(z) & \cdots & z^{-d_{pp}+d_p}B_{pp}(z) \end{bmatrix}$$

and $\mathrm{diag}(z^{-d_i}) = \mathrm{diag}[z^{-d_1}, \ldots, z^{-d_p}]$. $A_i(z)$ and $B_{ij}(z)$ $(i = 1, \ldots, p;\ j = 1, \ldots, p)$ are scalar polynomials, and $d_{ij}$ $(i = 1, \ldots, p;\ j = 1, \ldots, p)$ represent the known time delays. Furthermore, $U(k) \in R^p$ is the system input vector, $Y(k) \in R^p$ is the system output vector, and $d_i = \min_{1 \le j \le p} d_{ij}$ $(i = 1, \ldots, p)$. The matrices $A(z)$ and $B(z)$ are given by

$$A(z) = I_p + \sum_{i=1}^{n} A_i z^{-i}, \qquad B(z) = \sum_{j=0}^{m} B_j z^{-j}$$

where the coefficient matrices $A_i$ and $B_j$ are assumed to be unknown, and $\det B(z) \neq 0$ for $|z| < 1$. The upper bounds on the degree of each polynomial in (1) are known. The control system attempts to make the plant output $Y(k)$ match the reference model output asymptotically, i.e.

$$\lim_{k \to \infty} |y_i(k) - y_{mi}(k)| \le \varepsilon \tag{2}$$

for some specified constant $\varepsilon$ and $i = 1, 2, \ldots, p$. The output $Y_m(k)$ of the reference model to the command input $R(k)$ is given by

$$A_m(z)Y_m(k) = \mathrm{diag}(z^{-d_i})B_m(z)R(k) \tag{3}$$

where $A_m(z)$ and $B_m(z)$ are left coprime and can be given in advance. Let $D(z)$ be an asymptotically stable matrix polynomial.
Then there exist unique matrix polynomials $R(z)$ and $S(z)$ that satisfy

$$D(z) = A(z)S(z) + \mathrm{diag}(z^{-d_i})R(z) \tag{4}$$

where $S(z) = \mathrm{diag}[S_1(z), \ldots, S_p(z)]$ and $R(z) = \mathrm{diag}[R_1(z), \ldots, R_p(z)]$. Here $R_i(z)$ and $S_i(z)$ are scalar polynomials with $\deg R_i(z) = \deg A_i(z) - 1$ and $\deg S_i(z) = d_i - 1$, $i = 1, 2, \ldots, p$.
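As a minimal scalar illustration of (4), the sketch below computes $S(z)$ and $R(z)$ by long division in powers of $z^{-1}$. The helper names `solve_diophantine` and `poly_mul`, the coefficient-list representation, and the numerical example are assumptions made for illustration only; they are not taken from the paper, which treats the diagonal matrix case.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists in powers of z^-1."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            out[i + j] += p_i * q_j
    return out


def solve_diophantine(a, d_poly, delay):
    """Solve D(z) = A(z) S(z) + z^-delay R(z) for scalar polynomials.

    Polynomials are coefficient lists in ascending powers of z^-1,
    e.g. [1.0, -0.5] means 1 - 0.5 z^-1. Assumes a[0] != 0 and delay >= 1.
    Hypothetical helper covering the scalar case only.
    """
    n = len(a) - 1                       # degree of A
    size = max(len(d_poly), delay + n)   # room for every term touched below
    rem = list(d_poly) + [0.0] * (size - len(d_poly))
    s = []
    for i in range(delay):               # S has delay coefficients (degree delay - 1)
        coeff = rem[i] / a[0]
        s.append(coeff)
        for j, a_j in enumerate(a):      # subtract coeff * z^-i * A(z)
            rem[i + j] -= coeff * a_j
    r = rem[delay:]                      # what is left is z^-delay * R(z)
    return s, r


# Assumed example: A(z) = 1 - 0.5 z^-1, D(z) = 1, delay d = 2
s, r = solve_diophantine([1.0, -0.5], [1.0], 2)
print(s, r)  # [1.0, 0.5] [0.25]
```

In this example $S(z) = 1 + 0.5z^{-1}$ has degree $d - 1 = 1$ and $R(z) = 0.25$ has degree $\deg A - 1 = 0$, and $A(z)S(z) + z^{-2}R(z) = 1 = D(z)$.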

Using (1) and (4), we obtain

$$D(z)\,(Y(k) - Y_m(k)) = \mathrm{diag}(z^{-d_i})B(z)S(z)U(k) + \mathrm{diag}(z^{-d_i})R(z)Y(k) - D(z)Y_m(k) \tag{5}$$

When the control input $U(k)$ is given by

$$U(k) = B^{-1}(z)S^{-1}(z)\,[D(z)Y_m(k + d_i) - R(z)Y(k)] \tag{6}$$

the right-hand side of (5) vanishes, so that $D(z)(Y(k) - Y_m(k)) = 0$. Since $D(z)$ is asymptotically stable, $\lim_{k \to \infty}(y_i(k) - y_{mi}(k)) = 0$ holds, and the control objective is achieved.

When the coefficients of $A(z)$ and $B(z)$ in (1) are unknown, the unknown plant parameters must be estimated. The system equation (1) can be written as

$$Y(k) = -\sum_{i=1}^{n} A_i Y(k - i) + \sum_{j=0}^{m} B_j U(k - d_i - j) = \alpha X^T(k) \tag{7}$$

where $T$ denotes the transpose, and

$$\alpha = [A_1, A_2, \ldots, A_n, B_0, B_1, \ldots, B_m]$$

$$X(k) = [-Y^T(k-1), \ldots, -Y^T(k-n),\ U^T(k-d_i), \ldots, U^T(k-d_i-m)]$$

The matrix $\alpha$ represents the unknown plant parameters to be estimated. This is accomplished by using an identification model described by

$$\hat{Y}(k) = \hat{\alpha}(k)X^T(k) \tag{8}$$

where $\hat{\alpha}(k) = [\hat{A}_1(k), \ldots, \hat{A}_n(k), \hat{B}_0(k), \ldots, \hat{B}_m(k)]$, $\hat{Y}(k)$ is an estimate of $Y(k)$ at time $k$, and $\hat{\alpha}(k)$ is an adjustable parameter matrix. The parameter adjustment law, which ensures that the estimated parameters can converge to their true values, is given by

$$\hat{\alpha}(k) = \hat{\alpha}(k-1) - \frac{[\hat{\alpha}(k-1)X^T(k) - Y(k)]\,X(k)\Gamma(k-1)}{1 + X(k)\Gamma(k-1)X^T(k)} \tag{9}$$

$$\Gamma(k) = \frac{1}{\sigma}\left[\Gamma(k-1) - \frac{\lambda\,\Gamma(k-1)X^T(k)X(k)\Gamma(k-1)}{\sigma + \lambda X(k)\Gamma(k-1)X^T(k)}\right] \tag{10}$$

$$\Gamma(0) = \delta I, \quad \delta > 0 \tag{11}$$

where $0 < \sigma \le 1$ and $0 < \lambda < 2$ [2]-[4]. The control input $U(k)$ in the adaptive case is given by

$$U(k) = \hat{B}^{-1}(z)\hat{S}^{-1}(z)\,[D(z)Y_m(k + d_i) - \hat{R}(z)Y(k)] \tag{12}$$

where $\hat{R}(z)$, $\hat{B}(z)$ and $\hat{S}(z)$ are the estimates of $R(z)$, $B(z)$ and $S(z)$, respectively.

NONLINEAR MRAC
When the input-output characteristic of the controlled object is nonlinear, it cannot be expressed in the form of Eq. (1).

[Figure 1. Structure of nonlinear adaptive control system. Blocks: Reference Model, Adaptive Controller, Parameter Calculation, NN, Parameter Estimation, Nonlinear System. Signals: $R(k)$, $Y_m(k)$, $E(k)$, $U(k)$, $Y(k)$.]

[Figure 2. System configuration with NN. Labels: $x_i(k)$, $w_{ji}(k)$, $p_j(k)$, $w_{lj}(k)$, $q_l(k)$, $U(k)$, Nonlinear system, $Y_m(k)$, $Y(k)$, $E(k)$.]

Then, let the unknown system be expressed by a nonlinear discrete-time system as

$$Y(k) = H(Y^T(k-1), \ldots, Y^T(k-n),\ U^T(k-d_i), \ldots, U^T(k-d_i-m)) \tag{13}$$

where $H(\cdot)$ is the unknown nonlinear function vector, $Y(k)$ is the plant output, $U(k)$ is the control signal, and $n$ and $m$ are the numbers of past outputs and inputs of the plant, which depend on the plant order. If the input (6) is used to control the nonlinear discrete-time system (13), an output error arises. To make the plant output $Y(k)$ converge to the reference model output $Y_m(k)$, we synthesize the control input $U(k)$ as

$$U(k) = V(k) + \bar{V}(k) \tag{14}$$

where $V(k)\ (= [v_1(k), \ldots, v_p(k)]^T)$ is the output vector of the adaptive controller and $\bar{V}(k)\ (= [\bar{v}_1(k), \ldots, \bar{v}_p(k)]^T)$ is the output vector of the NN. $V(k)$ and $\bar{V}(k)$ can be written as

$$V(k) = \hat{B}^{-1}(z)\hat{S}^{-1}(z)\,[D(z)Y_m(k + d_i) - \hat{R}(z)Y(k)] \tag{15}$$

$$\bar{V}(k) = \hat{H}(V^T(k-d_i),\ Y_m^T(k-d_i),\ Y^T(k-1), \ldots, Y^T(k-n),\ U^T(k-d_i-1), \ldots, U^T(k-d_i-m)) \tag{16}$$
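The parameter adjustment law (9)-(11) of the linear MRAC section, on which the adaptive controller output $V(k)$ relies, can be sketched for a single-output plant as follows. The code reads (9) as a prediction-error correction scaled by $X\Gamma(k-1)/(1 + X\Gamma(k-1)X^T)$ and (10) as a gain update with forgetting factor $\sigma$ and step factor $\lambda$. The function names, the first-order test plant, the input sequence, and the choice $\sigma = \lambda = 1$ in the demo are assumptions for illustration, not values from the paper.

```python
import math


def rls_step(alpha, gamma, x, y, sigma=1.0, lam=1.0):
    """One step of the adjustment law (9)-(10) for a single-output plant.

    alpha: current estimates (list of N floats)
    gamma: N x N gain matrix Gamma(k-1) (list of lists)
    x:     regressor X(k) (list of N floats)
    y:     measured output Y(k)
    """
    n = len(x)
    gx = [sum(gamma[i][j] * x[j] for j in range(n)) for i in range(n)]  # Gamma X^T
    xg = [sum(x[i] * gamma[i][j] for i in range(n)) for j in range(n)]  # X Gamma
    xgx = sum(x[i] * gx[i] for i in range(n))                           # X Gamma X^T
    err = sum(alpha[i] * x[i] for i in range(n)) - y                    # alpha X^T - Y
    new_alpha = [alpha[j] - err * xg[j] / (1.0 + xgx) for j in range(n)]
    new_gamma = [
        [(gamma[i][j] - lam * gx[i] * xg[j] / (sigma + lam * xgx)) / sigma
         for j in range(n)]
        for i in range(n)
    ]
    return new_alpha, new_gamma


def identify_demo(steps=60, delta=1.0e4):
    """Identify the assumed plant y(k) = 0.5 y(k-1) + 2 u(k-1).

    In the convention of (7) this is a_1 = -0.5, b_0 = 2 with
    X(k) = [-y(k-1), u(k-1)].
    """
    alpha = [0.0, 0.0]
    gamma = [[delta, 0.0], [0.0, delta]]          # Gamma(0) = delta * I, as in (11)
    y_prev, u_prev = 0.0, 0.0
    for k in range(1, steps + 1):
        y = 0.5 * y_prev + 2.0 * u_prev           # plant response
        x = [-y_prev, u_prev]                     # regressor X(k)
        alpha, gamma = rls_step(alpha, gamma, x, y)
        u = math.sin(0.7 * k) + 0.5 * math.sin(1.9 * k + 1.0)  # exciting input
        y_prev, u_prev = y, u
    return alpha


a1_hat, b0_hat = identify_demo()
print(a1_hat, b0_hat)  # estimates of a_1 and b_0
```

With $\sigma = \lambda = 1$ the update reduces to ordinary recursive least squares, so on noise-free data the estimates approach $a_1 = -0.5$ and $b_0 = 2$. Other values of $\sigma$ and $\lambda$ within the stated ranges are accepted by `rls_step` but are not exercised here.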

The block diagram of the MIMO nonlinear MRAC system with the NN is shown in Figure 1. Using the above approach, the NN is trained by adjusting its weights until the output error satisfies $\lim_{k \to \infty} |e_i(k)| = \lim_{k \to \infty} |y_i(k) - y_{mi}(k)| \le \varepsilon$.

COMPOSITION OF THE NN
Figure 2 shows the input-output configuration of the system with the NN. The NN consists of three layers: an input layer, an intermediate (hidden) layer, and an output layer. Let $x_i(k)$ be the input to the $i$th node in the input layer, $p_j(k)$ the input to the $j$th node in the hidden layer, and $q_l(k)$ the input to the $l$th node in the output layer. Furthermore, let $w_{ji}$ be the weight between the input layer and the hidden layer, and $w_{lj}$ the weight between the hidden layer and the output layer. In Figure 2, the control input is given by the sum of the output of a model reference adaptive controller and the output of the NN. The NN compensates for the nonlinearity of the plant dynamics, which is not taken into consideration in the usual MRAC, and constructs a linearized model by minimizing the output error caused by nonlinearities in the control system. The input $x_i(k)$ to the NN is given as

$$x_i(k) \in \{V^T(k-d_i),\ Y_m^T(k-d_i),\ Y^T(k-1), \ldots, Y^T(k-n),\ U^T(k-d_i-1), \ldots, U^T(k-d_i-m)\} \tag{17}$$

Therefore, the nonlinear function of a MIMO nonlinear system can be approximated by the NN, and the number of components of the input layer is $(n + m + 2) \times p$.

LEARNING OF THE NN
From Figure 2, we obtain

$$p_j(k) = \sum_i w_{ji}(k)\,x_i(k) \tag{18}$$

$$q_l(k) = \sum_j w_{lj}(k)\,f(p_j(k)) \tag{19}$$

$$\bar{v}_l(k) = f(q_l(k)) \tag{20}$$

where $f(\cdot)$ is the sigmoid function and $l = 1, 2, \ldots, p$. The sigmoid function $f(\cdot)$ is chosen as

$$f(x) = \frac{2a}{1 + \exp(-\mu x)} - a \tag{21}$$

where $\mu > 0$, $a$ is a specified positive constant, and $f(x)$ satisfies $-a < f(x) < a$.
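The forward pass (18)-(20) with the sigmoid (21) can be sketched as below. This is a minimal illustration, not the authors' implementation: the function names, the nested-list weight layout (`w_hidden[j][i]` for $w_{ji}$, `w_out[l][j]` for $w_{lj}$), and the sample numbers are assumptions. The sigmoid is evaluated as $a\tanh(\mu x/2)$, which is algebraically equal to (21) and avoids overflow of the exponential for large negative arguments.

```python
import math


def sigmoid(x, a, mu):
    """Bipolar sigmoid of (21): 2a / (1 + exp(-mu x)) - a, written as a*tanh(mu x / 2)."""
    return a * math.tanh(0.5 * mu * x)


def forward(x, w_hidden, w_out, a, mu):
    """Three-layer forward pass.

    x:        NN inputs x_i(k)
    w_hidden: w_hidden[j][i] = w_ji(k), input -> hidden
    w_out:    w_out[l][j]    = w_lj(k), hidden -> output
    Returns (p, q, v_bar) as in (18), (19), (20).
    """
    p = [sum(w * x_i for w, x_i in zip(row, x)) for row in w_hidden]   # (18)
    hidden = [sigmoid(p_j, a, mu) for p_j in p]                        # f(p_j(k))
    q = [sum(w * h_j for w, h_j in zip(row, hidden)) for row in w_out]  # (19)
    v_bar = [sigmoid(q_l, a, mu) for q_l in q]                         # (20)
    return p, q, v_bar


# Assumed toy network: 2 inputs, 3 hidden nodes, 2 outputs
w_h = [[0.1, -0.2], [0.3, 0.0], [-0.1, 0.4]]
w_o = [[0.5, -0.5, 0.2], [0.1, 0.1, -0.3]]
p, q, v_bar = forward([1.0, 2.0], w_h, w_o, a=5.0, mu=0.2)
print(v_bar)  # each component lies strictly between -a and a
```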
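The derivative of the sigmoid function $f(\cdot)$ is as follows:

$$f'(x) = \frac{\mu}{2a}\,(a + f(x))(a - f(x)) \tag{22}$$

Equation (18) gives the relation between the input layer and the intermediate layer, and (19) gives the relation between the intermediate layer and the output layer. The output of the NN is obtained from (20). The error function (evaluation function) is defined as

$$E_i(k) = \frac{1}{2}\,[y_{mi}(k) - y_i(k)]^2 \tag{23}$$

where $i = 1, 2, \ldots, p$. The objective is to minimize the error function $E_i(k)$ by following the error gradient with respect to the weight vector $w(k)$ to be adapted. The weights are then updated by using

$$\Delta w_{lj}(k) = -\eta\,\frac{\partial E_l(k)}{\partial w_{lj}(k)} + \alpha(k)\,\Delta w_{lj}(k-1) \tag{24}$$

$$\Delta w_{ji}(k) = -\eta\,\frac{\partial E_l(k)}{\partial w_{ji}(k)} + \alpha(k)\,\Delta w_{ji}(k-1) \tag{25}$$

where $\eta$ and $\alpha(k)$ are the learning rate and the momentum, respectively, $\alpha(k) = \alpha(k-1) + \Delta\alpha$, and $l = 1, 2, \ldots, p$. The upper limit of $\alpha(k)$ is set to $A$. To obtain $\partial E_l(k)/\partial w_{lj}(k)$ and $\partial E_l(k)/\partial w_{ji}(k)$ in (24) and (25), we can write

$$\frac{\partial E_l(k)}{\partial w_{lj}(k)} = \frac{\partial E_l(k)}{\partial y_l(k)}\,\frac{\partial y_l(k)}{\partial u_l(k-d_l)}\,\frac{\partial u_l(k-d_l)}{\partial \bar{v}_l(k-d_l)}\,\frac{\partial \bar{v}_l(k-d_l)}{\partial q_l(k)}\,\frac{\partial q_l(k)}{\partial w_{lj}(k)} \tag{26}$$

$$\frac{\partial E_l(k)}{\partial w_{ji}(k)} = \frac{\partial E_l(k)}{\partial y_l(k)}\,\frac{\partial y_l(k)}{\partial u_l(k-d_l)}\,\frac{\partial u_l(k-d_l)}{\partial \bar{v}_l(k-d_l)}\,\frac{\partial \bar{v}_l(k-d_l)}{\partial q_l(k)}\,\frac{\partial q_l(k)}{\partial f(p_j(k))}\,\frac{\partial f(p_j(k))}{\partial p_j(k)}\,\frac{\partial p_j(k)}{\partial w_{ji}(k)} \tag{27}$$

where

$$\frac{\partial E_l(k)}{\partial y_l(k)} = -(y_{ml}(k) - y_l(k)), \quad \frac{\partial u_l(k-d_l)}{\partial \bar{v}_l(k-d_l)} = 1, \quad \frac{\partial q_l(k)}{\partial w_{lj}(k)} = f(p_j(k)), \quad \frac{\partial q_l(k)}{\partial f(p_j(k))} = w_{lj}(k), \quad \frac{\partial p_j(k)}{\partial w_{ji}(k)} = x_i(k),$$

$$\frac{\partial \bar{v}_l(k-d_l)}{\partial q_l(k)} = \frac{\mu}{2a}\,(a - f(q_l(k)))(a + f(q_l(k))), \quad \frac{\partial f(p_j(k))}{\partial p_j(k)} = \frac{\mu}{2a}\,(a - f(p_j(k)))(a + f(p_j(k)))$$

Since the gradients $\partial E_l(k)/\partial w_{lj}(k)$ and $\partial E_l(k)/\partial w_{ji}(k)$ can be calculated, it is possible to train the NN. The remaining factor $\partial y_l(k)/\partial u_l(k-d_l)$ is given by

$$\frac{\partial y_l(k)}{\partial u_l(k-d_l)} = \frac{\partial y_l(k)}{\partial v_l(k-d_l)}\,\frac{\partial v_l(k-d_l)}{\partial u_l(k-d_l)} = \frac{\partial y_l(k)}{\partial v_l(k-d_l)} \bigg/ \frac{\partial u_l(k-d_l)}{\partial v_l(k-d_l)} \tag{28}$$

From Figure 2, we obtain

$$u_l(k-d_l) = v_l(k-d_l) + \bar{v}_l(k-d_l) \tag{29}$$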
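The update (24) for a single hidden-to-output weight, using the chain rule (26), the sigmoid slope (22), and the capped momentum schedule, can be sketched as follows. The function names and the numbers in the example call are hypothetical. The plant sensitivity $\partial y_l(k)/\partial u_l(k-d_l)$ is simply passed in as `dy_du`, and applying the increment as $w_{lj} + \Delta w_{lj}$ is an assumption about how the increment is used.

```python
import math


def sigmoid(x, a, mu):
    """Sigmoid (21), written as a*tanh(mu x / 2), which equals 2a/(1+exp(-mu x)) - a."""
    return a * math.tanh(0.5 * mu * x)


def sigmoid_slope(f_val, a, mu):
    """Derivative (22), expressed through the sigmoid output f_val = f(x)."""
    return mu / (2.0 * a) * (a + f_val) * (a - f_val)


def next_momentum(alpha_prev, delta_alpha, alpha_max):
    """alpha(k) = alpha(k-1) + delta_alpha, capped at the upper limit A."""
    return min(alpha_prev + delta_alpha, alpha_max)


def output_weight_step(w_lj, dw_prev, p_j, q_l, y_m, y, dy_du, eta, alpha, a, mu):
    """Momentum update (24) for one weight w_lj, with the gradient from (26)."""
    dE_dy = -(y_m - y)                                   # dE_l/dy_l
    du_dvbar = 1.0                                       # from u = v + v_bar
    dvbar_dq = sigmoid_slope(sigmoid(q_l, a, mu), a, mu)  # f'(q_l(k))
    dq_dw = sigmoid(p_j, a, mu)                          # f(p_j(k))
    grad = dE_dy * dy_du * du_dvbar * dvbar_dq * dq_dw
    dw = -eta * grad + alpha * dw_prev
    return w_lj + dw, dw


# Hypothetical numbers: output below the reference, positive plant sensitivity
w_new, dw = output_weight_step(w_lj=0.0, dw_prev=0.0, p_j=1.0, q_l=0.0,
                               y_m=1.0, y=0.0, dy_du=2.0,
                               eta=0.25, alpha=0.2, a=5.0, mu=0.2)
print(w_new, dw)  # the weight moves in the direction that raises the NN output
```

The hidden-layer update (25) follows the same pattern with the longer chain (27).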

From (29), $\partial u_l(k-d_l)/\partial v_l(k-d_l) = 1 + \partial \bar{v}_l(k-d_l)/\partial v_l(k-d_l)$, and $\partial \bar{v}_l(k-d_l)/\partial v_l(k-d_l)$ is given by

$$\frac{\partial \bar{v}_l(k-d_l)}{\partial v_l(k-d_l)} = \frac{\partial f(q_l(k))}{\partial x_l(k)} \tag{30}$$

$$= \sum_j \frac{\partial f(q_l(k))}{\partial p_j(k)}\,\frac{\partial p_j(k)}{\partial x_l(k)} \tag{31}$$

$$\frac{\partial p_j(k)}{\partial x_l(k)} = w_{jl}(k) \tag{32}$$

where $x_l(k)$ is the NN input that carries $v_l(k-d_l)$. Furthermore, the linear model of the plant is constructed using the estimated parameters. The output of this linear model is $\hat{y}_l(k)$. Assuming that the nonlinearity of the plant is relatively small, we can approximate $y_l(k) \approx \hat{y}_l(k)$, and an approximate value of $\partial y_l(k)/\partial u_l(k-d_l)$ can be calculated as follows. From (8), the following equation holds:

$$\hat{y}_l(k) = -\hat{a}_{l1}(k)\,y_l(k-1) - \cdots - \hat{a}_{ln}(k)\,y_l(k-n) + \hat{b}_{l10}(k)\,v_1(k-d_1) + \cdots + \hat{b}_{l1m}(k)\,v_1(k-d_1-m) + \cdots + \hat{b}_{lp0}(k)\,v_p(k-d_p) + \cdots + \hat{b}_{lpm}(k)\,v_p(k-d_p-m) \tag{33}$$

From (33), we obtain

$$\frac{\partial y_l(k)}{\partial v_l(k-d_l)} \approx \frac{\partial \hat{y}_l(k)}{\partial v_l(k-d_l)} = \hat{b}_{ll0}(k) \tag{34}$$

Using (30), (31), and (34), it is possible to write (28) as

$$\frac{\partial y_l(k)}{\partial u_l(k-d_l)} = \frac{\hat{b}_{ll0}(k)}{1 + H(k)}, \qquad H(k) = \sum_j \frac{\partial f(q_l(k))}{\partial p_j(k)}\,\frac{\partial p_j(k)}{\partial x_l(k)} \tag{35}$$

COMPUTER SIMULATION
Two examples of nonlinear systems are considered. In all cases $\lambda = 1$, $\sigma$ = .98, and $\delta$ = 4 are fixed.

Example 1: Let us consider the MIMO nonlinear discrete-time system described by

y (k) =.3y (k ).3y 2 (k ) u (k d ).8u (k d ).45u 2 (k d ) u (k d )u 2 (k d ).5 u 2 (k d )
y 2 (k) =.6y 2 (k ).62u (k d 2 ) u 2 (k d 2 ).7u 2 (k d 2 ) u 2 2(k d 2 )

In this example, we assume $A_m(z) = B_m(z) = D(z) = I$ $(2 \times 2)$ and $\mathrm{diag}(z^{-d_i}) = \mathrm{diag}[z^{-1}, z^{-1}]$, so that the output $Y_m(k)$ of the reference model to the command input $R(k)$ is given by $Y_m(k) = \mathrm{diag}[z^{-1}, z^{-1}]R(k)$.

[Figure 3. $Y_m(k)$ and $Y(k)$ before learning by NN (legend: y, ym).]

[Figure 4. $Y_m(k)$ and $Y(k)$ after learning by NN (legend: y, ym).]

Figure 3 shows the desired output $Y_m(k)$ and the plant output $Y(k)$ before learning by the NN. The error between $Y(k)$ and $Y_m(k)$ is large. Figure 4 shows $Y_m(k)$ and $Y(k)$ after learning by the NN, where the number of nodes was 8 in the input layer, 8 in the hidden layer, and 2 in the output layer, and the momentum increment $\Delta\alpha$, $\alpha(0)$ =.2, $A$ =.8, $\eta$ =.25, $a$ = 5, and $\mu$ =.2 are fixed. The results of Figure 4 show that $Y(k)$ converges to $Y_m(k)$ after learning by the NN.

Example 2: Let us consider the MIMO nonlinear discrete-time system described by

y (k) = y (k ) y 2 2 (k ) u (k d )
y 2 (k) = y (k )y 2 (k ) y 2 2 (k ) u 2 (k d 2 )

The output $Y_m(k)$ of the reference model to the command input $R(k)$ is given by $Y_m(k) = \mathrm{diag}[z^{-1}, z^{-1}]R(k)$.

[Figure 5. $Y_m(k)$ and $Y(k)$ before learning by NN (legend: y, ym).]

[Figure 6. $Y_m(k)$ and $Y(k)$ after learning by NN (legend: y, ym).]

Figure 5 shows $Y_m(k)$ and $Y(k)$ before learning by the NN. It can be seen from Figure 5 that $Y(k)$ diverges. Figure 6 shows $Y_m(k)$ and $Y(k)$ after learning by the NN, where the number of nodes was 6 in the input layer, 6 in the hidden layer, and 2 in the output layer, and the momentum increment $\Delta\alpha$, $\alpha(0)$ =.23, $A$ =.8, $\eta$ =.8, $a$ = 5, and $\mu$ are fixed. It can be seen from Figure 6 again that $Y(k)$ converges to $Y_m(k)$ after learning by the NN.

CONCLUSION
We have proposed a method of MRAC for MIMO nonlinear systems using NNs. The control input is given by the sum of the output of a model reference adaptive controller and the output of the NN. The NN compensates for the nonlinearity of the plant dynamics, which is not taken into consideration in the usual MRAC. The simulation results show that, for nonlinear discrete-time systems, the plant output $Y(k)$ converges to the desired output $Y_m(k)$ after learning by the NN.

REFERENCES
[1] K. J. Åström and B. Wittenmark, Adaptive Control, Addison-Wesley, 1989.
[2] J. Lu and T. Yahagi, Discrete-time MRAC for nonminimum phase systems with disturbances using approximate inverse systems, IEE Proc. D, vol. 144, no. 5, pp. 447-454, 1997.
[3] J. Lu and T. Yahagi, New design method for MRAC for nonminimum phase discrete-time systems with disturbances, IEE Proc. D, vol. 140, no. 1, pp. 34 4, 1993.
[4] J. Lu, M. Shafiq, and T. Yahagi, A method for adaptive control of nonminimum phase continuous-time systems based on pole-zero placement, IEICE Trans. Fundamentals, vol. E80-A, no. 6, pp. 9 5, 1997.
[5] K. S. Narendra and K. Parthasarathy, Identification and control of dynamical systems using neural networks, IEEE Trans. Neural Networks, vol. 1, no. 1, pp. 4-27, 1990.