Optimal Design of CMAC Neural-Network Controller for Robot Manipulators

IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS, PART C: APPLICATIONS AND REVIEWS, VOL. 30, NO. 1, FEBRUARY 2000

Young H. Kim and Frank L. Lewis, Fellow, IEEE

Abstract: This paper is concerned with the application of quadratic optimization for motion control to feedback control of robotic systems using cerebellar model arithmetic computer (CMAC) neural networks. Explicit solutions to the Hamilton-Jacobi-Bellman (HJB) equation for optimal control of robotic systems are found by solving an algebraic Riccati equation. It is shown how the CMACs can cope with nonlinearities through optimization, with no preliminary off-line learning phase required. The adaptive-learning algorithm is derived from Lyapunov stability analysis, so that both system-tracking stability and error convergence can be guaranteed in the closed-loop system. The filtered-tracking error or critic gain and the Lyapunov function for the nonlinear analysis are derived from the user input in terms of a specified quadratic-performance index. Simulation results from a two-link robot manipulator show the satisfactory performance of the proposed control schemes even in the presence of large modeling uncertainties and external disturbances.

Index Terms: CMAC neural network, optimal control, robotic control.

I. INTRODUCTION

There has been some work on applying optimal-control techniques to the nonlinear robotic manipulator. These approaches often combine feedback linearization and optimal-control techniques. Johansson [6] showed explicit solutions to the Hamilton-Jacobi-Bellman (HJB) equation for optimal control of robot motion and how optimal control and adaptive control may act in concert in the case of unknown or uncertain system parameters. Dawson et al.
[5] used a general control law known as modified computed-torque control (MCTC) and quadratic optimal-control theory to derive a parameterized proportional-derivative (PD) form for an auxiliary input to the controller. However, in actual situations, the robot dynamics are rarely known completely, and thus it is difficult to express real robot dynamics in exact mathematical equations or to linearize the dynamics with respect to the operating point.

Neural networks have been used for approximation of nonlinear systems, for classification of signals, and for associative memory. For control engineers, the approximation capability of neural networks is usually used for system identification or identification-based control. More work is now appearing on the use of neural networks in direct, closed-loop controllers that yield guaranteed performance [13]. A robotic application of neural-network based, closed-loop control can be found in [12]. For indirect or identification-based robotic-system control, several neural-network and learning schemes can be found in the literature. Most of these approaches consider neural networks as very general computational models. Although a pure neural-network approach without knowledge of the robot dynamics may be promising, it is important to note that this approach will not be very practical due to the high dimensionality of the input-output space. The training or off-line learning process by pure connectionist models would require a neural network of impractical size and an unreasonable number of repetition cycles. The pure connectionist approach also has poor generalization properties.

Manuscript received June 2, 1997; revised June 23, 1999. This research was supported by NSF Grant ECS-9521673. The authors are with the Automation and Robotics Research Institute, University of Texas at Arlington, Fort Worth, TX 76118-7115 USA (e-mail: ykim50@hotmail.com; flewis@arri.uta.edu). Publisher Item Identifier S 1094-6977(00)00364-3.
In this paper, we propose a nonlinear optimal-design method that integrates linear optimal-control techniques and CMAC neural-network learning methods. Linear optimal control has an inherent robustness against a certain range of model uncertainties [9]. However, nonlinear dynamics cannot be taken into consideration in linear optimal-control design. We use the CMAC neural networks to adaptively estimate nonlinear uncertainties, yielding a controller that can tolerate a wider range of uncertainties. The salient feature of this HJB control design is that we can use a priori knowledge of the plant dynamics as the system equation in the corresponding linear optimal-control design. The neural network is used to improve performance in the face of unknown nonlinearities by adding nonlinear effects to the linear optimal controller.

The paper is organized as follows. In Section II, we review some fundamentals of the CMAC neural networks. In Section III, we give a new control design for rigid robot systems using the HJB equation. In Section IV, a CMAC controller combined with the optimal-control signal is proposed. In Section V, a two-link robot controller is designed and simulated in the face of large uncertainties and external disturbances.

II. BACKGROUND

Let $\mathbb{R}$ denote the real numbers, $\mathbb{R}^n$ the real $n$-vectors, and $\mathbb{R}^{m \times n}$ the real $m \times n$ matrices. We define the norm of a vector $x \in \mathbb{R}^n$ as $\|x\| = \sqrt{x^T x}$ and the norm of a matrix $A \in \mathbb{R}^{m \times n}$ as $\|A\| = \sqrt{\lambda_{\max}(A^T A)}$, where $\lambda_{\max}(\cdot)$ and $\lambda_{\min}(\cdot)$ are the largest and smallest eigenvalues of a matrix. The absolute value is denoted as $|\cdot|$. Given $A = [a_{ij}]$ and $B \in \mathbb{R}^{m \times n}$, the Frobenius norm is defined by $\|A\|_F^2 = \mathrm{tr}(A^T A) = \sum_{i,j} a_{ij}^2$, with $\mathrm{tr}(\cdot)$ as the trace operator. The associated inner product is $\langle A, B \rangle_F = \mathrm{tr}(A^T B)$. The Frobenius norm is compatible with the two-norm, so that $\|Ax\| \le \|A\|_F \|x\|$ with $A \in \mathbb{R}^{m \times n}$ and $x \in \mathbb{R}^n$.

1094-6977/00$10.00 © 2000 IEEE

Fig. 1. Architecture of a CMAC neural network.

A. CMAC Neural Networks

Fig. 1 shows the architecture and operation of the CMAC. The CMAC can be used to approximate a nonlinear mapping $y(x): X \to Y$, where $X \subset \mathbb{R}^n$ is the application in the $n$-dimensional input space and $Y \subset \mathbb{R}^m$ in the application output space. The CMAC algorithm consists of two primary functions for determining the value of a complex function, as shown in Fig. 1:

$$R: X \to A, \qquad P: A \to Y \tag{1}$$

where $X$ is the continuous $n$-dimensional input space, $A$ is the $N_A$-dimensional association space, and $Y$ is the $m$-dimensional output space. The function $R(x)$ is fixed and maps each point $x$ in the input space onto the association space $A$. The function $P(\alpha)$ computes an output $y \in Y$ by projecting the association vector $\alpha$ determined by $R(x)$ onto a vector of adjustable weights $w$ such that

$$y = P(\alpha) = w^T R(x). \tag{2}$$

$R(x)$ in (1) is the multidimensional receptive-field function.

1) Receptive-Field Function: Given $x = [x_1, \ldots, x_n]^T \in \mathbb{R}^n$, let $[x_{i,\min}, x_{i,\max}] \subset \mathbb{R}$, $1 \le i \le n$, be the domain of interest. For this domain, select integers $N_i$ and strictly increasing partitions

$$\pi_i = \{x_{i,1}, x_{i,2}, \ldots, x_{i,N_i}\}, \qquad 1 \le i \le n.$$

For each component of the input space, the receptive-field basis function $\mu_{i,j}(x_i)$ can be defined as rectangular [1], triangular [4], or any continuously bounded function, e.g., Gaussian [3].

2) Multidimensional Receptive-Field Functions: Given any $x \in \mathbb{R}^n$, the multidimensional receptive-field functions are defined as

$$R_{j_1, \ldots, j_n}(x) = \mu_{1,j_1}(x_1)\, \mu_{2,j_2}(x_2) \cdots \mu_{n,j_n}(x_n) \tag{3}$$

with $j_i = 1, \ldots, N_i$ and $i = 1, \ldots, n$. The output of the CMAC is given by

$$y_j(x) = \sum_{k=1}^{N_A} w_{jk} R_k(x), \qquad j = 1, \ldots, m \tag{4}$$

where $w_{jk}$ are the output-layer weight values, $R_k(x)$ is the continuous, multidimensional receptive-field function (with $k$ running over the index tuples $(j_1, \ldots, j_n)$), and $N_A$ is the number of association points. The effect of the receptive-field basis function type and of the partition number along each dimension on the CMAC performance has not yet been systematically studied.

The output of the CMAC can be expressed in vector notation as

$$y(x) = W^T \Gamma(x) \tag{5}$$

where $W$ is the matrix of adjustable weight values and $\Gamma(x)$ is the vector of receptive-field functions. Based on the approximation property of the CMAC, there exist ideal weight values $W$, so that the function to be approximated can be represented as

$$g(x) = W^T \Gamma(x) + \varepsilon(x) \tag{6}$$

with $\varepsilon(x)$ the functional reconstruction error, which is bounded.
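The forward computation described in this subsection can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the Gaussian width, the partition points, and the function names `gaussian_fields`, `cmac_basis`, and `cmac_output` are all assumptions introduced here. It builds the vector of multidimensional receptive-field functions as a product of one-dimensional fields and evaluates the linear output layer.

```python
import numpy as np

def gaussian_fields(x_i, centers, width):
    """1-D Gaussian receptive-field values for one input component.

    `centers` plays the role of the partition points; `width` is an
    assumed spread, not a value taken from the paper.
    """
    return np.exp(-((x_i - centers) ** 2) / (2.0 * width ** 2))

def cmac_basis(x, partitions, width):
    """Vector of multidimensional receptive-field functions.

    Each entry is a product of one 1-D field per input dimension,
    enumerated in row-major order over the index tuple (j_1, ..., j_n).
    """
    per_dim = [gaussian_fields(xi, c, width) for xi, c in zip(x, partitions)]
    gamma = per_dim[0]
    for mu in per_dim[1:]:
        gamma = np.outer(gamma, mu).ravel()  # tensor product over dimensions
    return gamma

def cmac_output(W, x, partitions, width):
    """CMAC output W^T Gamma(x), with W of shape (N_A, m)."""
    return W.T @ cmac_basis(x, partitions, width)

# Hypothetical setup: n = 2 inputs on [-1, 1] with 5 partition points each,
# m = 2 outputs, and weights initialized at zero.
partitions = [np.linspace(-1.0, 1.0, 5), np.linspace(-1.0, 1.0, 5)]
N_A = 5 * 5
W = np.zeros((N_A, 2))
y = cmac_output(W, np.array([0.2, -0.4]), partitions, width=0.5)
```

Only the weight matrix `W` is adjustable here; the receptive fields stay fixed, which is what keeps the output linear in the tunable parameters.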

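Then, an estimate of $g(x)$ can be given by

$$\hat g(x) = \hat W^T \Gamma(x) \tag{7}$$

where $\hat W$ are estimates of the ideal weight values. The Lyapunov method is applied to derive reinforcement adaptive learning rules for the weight values. Since these adaptive learning rules are formulated from the stability analysis of the controlled system, the system performance can be guaranteed for closed-loop control.

B. Robot Arm Dynamics and Properties

The dynamics of an $n$-link robot manipulator may be expressed in the Lagrange form [9]

$$M(q)\ddot q + V_m(q, \dot q)\dot q + G(q) + F_v \dot q + f_c(\dot q) + \tau_d(t) = \tau(t) \tag{8}$$

with $q(t) \in \mathbb{R}^n$ the joint variable, $M(q)$ the inertia, $V_m(q, \dot q)$ the Coriolis/centripetal forces, $G(q)$ the gravitational forces, $F_v$ the diagonal matrix of viscous friction coefficients, $f_c(\dot q)$ the Coulomb friction coefficients, and $\tau_d(t)$ the external disturbances. The external control torque to each joint is $\tau(t)$.

Given a desired trajectory $q_d(t)$, the tracking errors are

$$e(t) = q_d(t) - q(t) \quad \text{and} \quad \dot e(t) = \dot q_d(t) - \dot q(t) \tag{9}$$

and the instantaneous performance measure is defined as

$$r(t) = \dot e(t) + \Lambda e(t) \tag{10}$$

where $\Lambda$ is the constant-gain matrix or critic (not necessarily symmetric). The robot dynamics (8) may be written as

$$M(q)\dot r = -V_m(q, \dot q)\, r - \tau(t) + h(x) + \tau_d(t) \tag{11}$$

where the robot nonlinear function is

$$h(x) = M(q)(\ddot q_d + \Lambda \dot e) + V_m(q, \dot q)(\dot q_d + \Lambda e) + G(q) + F_v \dot q + f_c(\dot q) \tag{12}$$

and, for instance,

$$x = [e^T, \dot e^T, q_d^T, \dot q_d^T, \ddot q_d^T]^T. \tag{13}$$

This key function $h(x)$ captures all the unknown dynamics of the robot arm. Now define a control-input torque as

$$\tau(t) = h(x) - u(t) \tag{14}$$

with $u(t)$ an auxiliary control input to be optimized later. The closed-loop system becomes

$$M(q)\dot r = -V_m(q, \dot q)\, r + u(t) + \tau_d(t). \tag{15}$$

Property 1 (Inertia): The inertia matrix $M(q)$ is uniformly bounded, and

$$m_1 I \le M(q) \le m_2 I, \qquad m_1, m_2 > 0. \tag{16}$$

Property 2 (Skew Symmetry): The matrix

$$N(q, \dot q) = \dot M(q) - 2 V_m(q, \dot q) \tag{17}$$

is skew-symmetric.

III. OPTIMAL-COMPUTED TORQUE-CONTROLLER DESIGN

A. HJB Optimization

Define the velocity-error dynamics

$$\dot e(t) = -\Lambda e(t) + r(t). \tag{18}$$

With $\tau_d = 0$ in (15) for the nominal design, the following augmented system is obtained:

$$\dot z = \begin{bmatrix} \dot e \\ \dot r \end{bmatrix} = \begin{bmatrix} -\Lambda & I \\ 0 & -M^{-1} V_m \end{bmatrix} \begin{bmatrix} e \\ r \end{bmatrix} + \begin{bmatrix} 0 \\ M^{-1} \end{bmatrix} u(t) \tag{19}$$

or, with shorter notation,

$$\dot z(t) = A(q, \dot q)\, z(t) + B(q)\, u(t) \tag{20}$$

with $A \in \mathbb{R}^{2n \times 2n}$, $B \in \mathbb{R}^{2n \times n}$, and $z \in \mathbb{R}^{2n}$, where $z$ is defined as $z = [e^T, r^T]^T$. A quadratic performance index is as follows:

$$J(u) = \int_{t_0}^{\infty} L(z, u)\, dt \tag{21}$$

with the Lagrangian

$$L(z, u) = \tfrac{1}{2} z^T Q z + \tfrac{1}{2} u^T R u = \tfrac{1}{2} \begin{bmatrix} e^T & r^T \end{bmatrix} \begin{bmatrix} Q_{11} & Q_{12} \\ Q_{12}^T & Q_{22} \end{bmatrix} \begin{bmatrix} e \\ r \end{bmatrix} + \tfrac{1}{2} u^T R u. \tag{22}$$

Given the performance index $J(u)$, the control objective is to find the auxiliary control input $u(t)$ that minimizes (21) subject to the differential constraints imposed by (19). The optimal control that achieves this objective will be denoted by $u^*(t)$. It is worth noting for now that only the part of the control input to the robotic system denoted by $u(t)$ in (14) is penalized. This is reasonable from a practical standpoint, since the gravity, Coriolis, and friction-compensation terms in (12) cannot be modified by the optimal-design phase.

A necessary and sufficient condition for $u^*(t)$ to minimize (21) subject to (20) is that there exist a function $V = V(z, t)$ satisfying the HJB equation [10]

$$\frac{\partial V(z, t)}{\partial t} + \min_{u} \left[ H\!\left(z, u, \frac{\partial V(z, t)}{\partial z}, t\right) \right] = 0 \tag{23}$$

where the Hamiltonian of optimization is defined as

$$H\!\left(z, u, \frac{\partial V(z, t)}{\partial z}, t\right) = L(z, u) + \left(\frac{\partial V(z, t)}{\partial z}\right)^{T} \dot z. \tag{24}$$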
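As a hedged numerical illustration of the filtered error (10) and the augmented system (19), (20), the sketch below assembles $A$ and $B$ for a fixed inertia matrix and checks that the top block of $\dot z$ reproduces $\dot e$. The matrices `M`, `Vm`, and `Lam` and all numerical values are hypothetical and chosen only to make the example runnable; they are not taken from the paper.

```python
import numpy as np

def filtered_error(q, dq, q_d, dq_d, Lam):
    """Tracking errors e, de and performance measure r = de + Lam e."""
    e = q_d - q
    de = dq_d - dq
    r = de + Lam @ e
    return e, de, r

def augmented_matrices(M, Vm, Lam):
    """A and B of the augmented error system z_dot = A z + B u."""
    n = M.shape[0]
    Minv = np.linalg.inv(M)
    A = np.block([[-Lam, np.eye(n)],
                  [np.zeros((n, n)), -Minv @ Vm]])
    B = np.vstack([np.zeros((n, n)), Minv])
    return A, B

# Hypothetical two-joint values, frozen at one instant.
M = np.array([[2.0, 0.3], [0.3, 1.0]])     # inertia matrix (assumed)
Vm = np.array([[0.0, -0.1], [0.1, 0.0]])   # Coriolis matrix (assumed)
Lam = 5.0 * np.eye(2)                      # critic gain (assumed)

e, de, r = filtered_error(np.array([0.1, -0.2]), np.zeros(2),
                          np.array([0.3, 0.1]), np.array([0.5, -0.5]), Lam)
z = np.concatenate([e, r])
u = -r                                     # any auxiliary input works here
A, B = augmented_matrices(M, Vm, Lam)
z_dot = A @ z + B @ u
```

The first $n$ rows of `z_dot` equal `de` by construction, and the last $n$ rows satisfy the nominal closed-loop relation $M \dot r = -V_m r + u$.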

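The function $V(z, t)$ is referred to as the value function. It satisfies the partial differential equation

$$-\frac{\partial V(z, t)}{\partial t} = L(z, u^*) + \left(\frac{\partial V(z, t)}{\partial z}\right)^{T} \dot z. \tag{25}$$

The minimum is attained for the optimal control $u(t) = u^*(t)$, and the Hamiltonian is then given by

$$H^* = \min_{u} \left[ L(z, u) + \left(\frac{\partial V(z, t)}{\partial z}\right)^{T} \dot z \right] = H\!\left(z, u^*, \frac{\partial V(z, t)}{\partial z}, t\right) = -\frac{\partial V(z, t)}{\partial t}. \tag{26}$$

Lemma 1: The following function $V$, composed of $z$, $M(q)$, and a positive symmetric matrix $K = K^T > 0$, satisfies the HJB equation:

$$V = \tfrac{1}{2} z^T P(q)\, z = \tfrac{1}{2} z^T \begin{bmatrix} K & 0 \\ 0 & M(q) \end{bmatrix} z \tag{27}$$

where $K$ and $\Lambda$ in (10) and (27) can be found from the Riccati differential equation

$$P A + A^T P - P B R^{-1} B^T P + \dot P + Q = 0. \tag{28}$$

The optimal control $u^*(t)$ that minimizes (21) subject to (20) is

$$u^*(t) = -R^{-1} B^T P z = -R^{-1} r(t). \tag{29}$$

See Appendix A for proof.

Theorem 1: Let the symmetric weighting matrices $Q$, $R$ be chosen such that

$$Q = \begin{bmatrix} Q_{11} & Q_{12} \\ Q_{12}^T & Q_{22} \end{bmatrix} > 0, \qquad R^{-1} = Q_{22} \tag{30}$$

with $Q_{12} + Q_{12}^T < 0$. Then the $K$ and $\Lambda$ required in Lemma 1 can be determined from the following relations:

$$K = K^T = -\tfrac{1}{2}\left(Q_{12} + Q_{12}^T\right) > 0 \tag{31}$$

$$\Lambda^T K + K \Lambda = Q_{11} \tag{32}$$

with (32) solved for $\Lambda$ using Lyapunov equation solvers (e.g., MatLab [15]). See Appendix B for proof.

Remarks:

1) In order to guarantee positive definiteness of the constructed matrix $Q$, the inequality (33) from [7] must be satisfied.

2) With the optimal-feedback control law $u^*(t)$ calculated using Theorem 1, the torques $\tau(t)$ to apply to the robotic system are calculated according to the control input

$$\tau^*(t) = h(x) - u^*(t) \tag{34}$$

where $h(x)$ is given by (12). It is referred to as an optimal-computed torque controller (OCTC).

B. Stability Analysis

Theorem 2: Suppose that matrices $K$ and $\Lambda$ exist that satisfy the hypotheses of Lemma 1, and in addition, there exist constants $k_1$ and $k_2$ such that $0 < k_1 < k_2 < \infty$, and the spectrum of $P(q)$ is bounded in the sense that $k_1 I < P(q) < k_2 I$ on $(t_0, \infty)$. Then using the feedback control (29) in (20) results in the controlled nonlinear system

$$\dot z(t) = \left[ A - B R^{-1} B^T P \right] z(t). \tag{35}$$

This is globally exponentially stable (GES) regarding the origin in $\mathbb{R}^{2n}$.

Proof: The quadratic function $V(z, t)$ is a suitable Lyapunov function candidate, because it is positive and radially growing with $\|z\|$. It is continuous and has a unique minimum at the origin of the error space. It remains to show that $dV/dt < 0$ for all $\|z\| \neq 0$. From the solution of the HJB equation (A12), it follows that

$$\frac{dV(z, t)}{dt} = -L(z, u^*). \tag{36}$$

Substituting (29) into (36) gives

$$\frac{dV(z, t)}{dt} = -\tfrac{1}{2} \left\{ z^T Q z + (B^T P z)^T R^{-1} (B^T P z) \right\} < 0 \quad \text{for all } \|z\| \neq 0. \tag{37}$$

The time derivative of the Lyapunov function is negative definite, and the assertion of the theorem then follows directly from the properties of the Lyapunov function [9].

IV. CMAC NEURAL-CONTROLLER DESIGN

The block diagram in Fig. 2 shows the major components that embody the CMAC neural controller. The external-control torques to the joints are composed of the optimal-feedback control law given in Theorem 1 plus the CMAC neural-network output components. The nonlinear robot function $h(x)$ can be represented by a CMAC neural network

$$h(x) = W^T \Gamma(x) + \varepsilon(x) \tag{38}$$

where $\Gamma(x)$ is a multidimensional receptive-field function for the CMAC. Then a functional estimate $\hat h(x)$ of $h(x)$ can be written as

$$\hat h(x) = \hat W^T \Gamma(x). \tag{39}$$

The external torque is given by

$$\tau(t) = \hat W^T \Gamma(x) - u(t) - v(t) \tag{40}$$

where $v(t)$ is a robustifying vector. Then (11) becomes

$$M(q)\dot r = -V_m(q, \dot q)\, r + \tilde W^T \Gamma(x) + \varepsilon(x) + \tau_d(t) + u(t) + v(t). \tag{41}$$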
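The gain selection of Theorem 1 can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' MatLab procedure: it computes $K$ from (31), solves $K\Lambda + \Lambda K = Q_{11}$ with a Kronecker-product linear solve (which yields a symmetric $\Lambda$ when $K$ and $Q_{11}$ are symmetric, so that (32) holds), and evaluates the residual of (28) for a constant inertia matrix so that $\dot P = 0$. The weighting blocks, `M`, and `Vm` are hypothetical values.

```python
import numpy as np

def gains_from_weights(Q11, Q12, Q22):
    """Return K, Lam, R from the blocks of Q, following Theorem 1."""
    K = -0.5 * (Q12 + Q12.T)
    n = K.shape[0]
    I = np.eye(n)
    # Row-major vectorization:
    # vec(K Lam) = (K kron I) vec(Lam), vec(Lam K) = (I kron K^T) vec(Lam).
    lhs = np.kron(K, I) + np.kron(I, K.T)
    Lam = np.linalg.solve(lhs, Q11.reshape(-1)).reshape(n, n)
    R = np.linalg.inv(Q22)  # Theorem 1 sets R^{-1} = Q22
    return K, Lam, R

def riccati_residual(K, Lam, R, Q, M, Vm):
    """Left-hand side of the Riccati equation with P_dot = 0 (constant M)."""
    n = M.shape[0]
    Minv = np.linalg.inv(M)
    A = np.block([[-Lam, np.eye(n)], [np.zeros((n, n)), -Minv @ Vm]])
    B = np.vstack([np.zeros((n, n)), Minv])
    P = np.block([[K, np.zeros((n, n))], [np.zeros((n, n)), M]])
    return P @ A + A.T @ P - P @ B @ np.linalg.inv(R) @ B.T @ P + Q

# Hypothetical weighting blocks (Q is positive definite for these values).
Q11, Q12, Q22 = 10.0 * np.eye(2), -np.eye(2), np.eye(2)
Q = np.block([[Q11, Q12], [Q12.T, Q22]])
K, Lam, R = gains_from_weights(Q11, Q12, Q22)

# Constant inertia, so the skew-symmetry property forces Vm + Vm^T = 0.
M = np.array([[2.0, 0.3], [0.3, 1.0]])
Vm = np.array([[0.0, -0.1], [0.1, 0.0]])
residual = riccati_residual(K, Lam, R, Q, M, Vm)

r = np.array([1.5, 1.0])
u_star = -np.linalg.inv(R) @ r  # optimal auxiliary input, u* = -R^{-1} r
```

For these values the sketch gives $K = I$ and $\Lambda = 5I$, and the residual vanishes to rounding error. The residual check relies on a symmetric $Q_{12}$; a general $Q_{12}$ would only make the quadratic form $z^T(\cdot)z$ vanish.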

Fig. 2. CMAC neural controller based on the HJB optimization.

In (41), $\tilde W = W - \hat W$ is the weight-estimation error. The state-space description of (41) can be given by

$$\dot z = A z + B \left[ \tilde W^T \Gamma(x) + \varepsilon(x) + \tau_d(t) + u(t) + v(t) \right] \tag{42}$$

with $A$, $B$, and $z$ given in (19) and (20). Inserting the optimal-feedback control law (29) into (42), we obtain

$$\dot z = \left( A - B R^{-1} B^T P \right) z + B \left[ \tilde W^T \Gamma(x) + \varepsilon(x) + \tau_d(t) + v(t) \right]. \tag{43}$$

Theorem 3: Let the control action $u^*(t)$ be provided by the optimal controller (29), with the robustifying term $v(t)$ given by (44), where $r(t)$ is defined as the instantaneous-performance measure (10). Let the adaptive learning rule for the neural-network weights be given by (45). Then the errors $e(t)$, $r(t)$, and $\tilde W(t)$ are uniformly ultimately bounded. Moreover, the errors $e(t)$ and $r(t)$ can be made arbitrarily small by adjusting the weighting matrices.

Proof: Consider the Lyapunov function (46), where $K$ is positive definite and symmetric, given by (31). The time derivative of the Lyapunov function becomes (47). Evaluating (47) along the trajectory of (43) yields (48). Using the Riccati equation (28), we have (49). Then the time derivative of the Lyapunov function becomes (50). Applying the robustifying term (44) and the adaptive learning rule (45), we obtain (51). The inequality (52) is used in the previous derivation. Completing the square terms yields (53).
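As a hedged illustration of the kind of on-line tuning Theorem 3 refers to, the sketch below performs one explicit-Euler step of a weight update and evaluates a robustifying term. The specific forms used, $\dot{\hat W} = F \Gamma(x) r^T - k \|z\| F \hat W$ and $v = -k_z r / \|r\|$, are assumptions: they are forms commonly used for this family of Lyapunov-based neural controllers, and the text of (44) and (45) is not available here to confirm them. The names `F`, `k`, `k_z`, and `dt` and every numerical value are hypothetical.

```python
import numpy as np

def robustifying_term(r, k_z, eps=1e-9):
    """ASSUMED form v = -k_z * r / ||r||; eps guards against division by zero."""
    return -k_z * r / max(np.linalg.norm(r), eps)

def tune_weights(W_hat, gamma, r, z, F, k, dt):
    """One explicit-Euler step of the ASSUMED tuning law
    dW_hat/dt = F gamma r^T - k ||z|| F W_hat.

    W_hat: (N_A, n) weight estimates, gamma: (N_A,) receptive-field vector,
    r: (n,) performance measure, z: (2n,) augmented error state.
    """
    dW = F @ np.outer(gamma, r) - k * np.linalg.norm(z) * (F @ W_hat)
    return W_hat + dt * dW

# Hypothetical sizes and gains: N_A = 3 association points, n = 2 joints.
W_hat = np.zeros((3, 2))            # weights may start at zero
gamma = np.array([1.0, 0.0, 0.0])   # only the first receptive field is active
r = np.array([0.5, -0.5])
z = np.array([0.1, 0.1, 0.5, -0.5])
F = 2.0 * np.eye(3)
W_new = tune_weights(W_hat, gamma, r, z, F, k=0.1, dt=0.01)
v = robustifying_term(r, k_z=0.2)
```

Because the update is driven by the performance measure $r(t)$ rather than by a separate training set, the weights can be adapted while the controller runs.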

The time derivative in (53) is guaranteed negative as long as either (54) or (55) holds, where the bounds in (54) and (55) are the convergence regions. According to a standard Lyapunov theory extension [11], this demonstrates uniformly ultimate boundedness of $e(t)$, $r(t)$, and $\tilde W(t)$.

Remarks:

1) The OCTC is globally asymptotically stable if $h(x)$ is fully known, whereas the neural-adaptive controller is uniformly ultimately bounded (UUB). In both cases, there is a convergence of tracking errors. UUB is a notion of stability in the practical sense that is usually sufficient for the performance of closed-loop systems, provided that the bound on system states is small enough.

2) Robotic manipulators are subjected to structured and/or unstructured uncertainties in all applications. Structured uncertainty is defined as the case of a correct dynamical model but with parameter uncertainty due to tolerance variations in the manipulator-link properties, unknown loads, and so on. Unstructured uncertainty describes the case of unmodeled dynamics that result from the presence of high-frequency modes in the manipulator and nonlinear friction. The adaptive optimizing feature of the proposed neural controller is suitable even without full knowledge of the system dynamics.

3) From Barron's results [2], there exist lower bounds on the approximation error if only the parameters of a linear combination of basis functions are adjusted. Our stability proof shows that the effect of the bounds on the approximation error can be alleviated by the judicious choice of the weighting matrices $Q$ and $R$.

4) It is emphasized that the neural-weight values may be initialized at zero, and stability will be maintained by the optimal controller in the performance-measurement loop until the neural network learns. This means that there is no off-line learning or trial-and-error phase, which often requires a long time in other works.

5) The advantage of the CMAC control scheme over other existing neural-network architectures is that the number of adjustable parameters (i.e., weight values) is significantly smaller, since only weights in the output layer are to be adjusted. It is very suitable for closed-loop control.

V. SIMULATION RESULTS

The dynamic equations for an $n$-link manipulator can be found in [9]. The cost functional to be minimized is given in (56). The external disturbance and frictions are given in (57) and (58), where sgn is the signum function. The weighting matrices are then specified, and solving for the matrices $K$ and $\Lambda$ using MatLab [15] yields (59) and (60).

The motion problem considered is for the robot end-effector to track a point on a circle centered at 0.05 m with radius 0.05 m, which turns 1/2 times per second in slow motion and two times per second in fast motion. It was pointed out that control-system performance may be quite different in low-speed and high-speed motion. Therefore, we carry out our simulation for two circular trajectories. The desired positions in low speed are given in (61), and the high-speed position profiles are given in (62). By solving the inverse kinematics, we obtain the desired joint-angle trajectory in fast motion.

The responses of the OCTC, where all nonlinearities are exactly known, are shown in Fig. 3 without disturbances and friction. The simulation was performed in low speed and high speed. After a transient due to error in initial conditions, the position errors tend asymptotically toward zero.

To show the effect of unstructured uncertainties, we dropped a term in the gravity forces. The simulation results are shown in Fig. 4(a) in low speed. Note that there is a steady-state error with the OCTC. Fig. 4(b) shows the effect of external disturbances and friction forces, which are difficult to model and compensate. This is corrected by adding a CMAC neural network. The CMAC is characterized by the number of input spaces, the number of partitions for each space, the number of association points, and the receptive-field basis functions.
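The trajectory generation described in this section, a point moving on a circle at a fixed number of turns per second followed by inverse kinematics, can be sketched as follows. The link lengths, the circle centre coordinates, the starting phase, and the function names are assumptions made for illustration; the paper's own values in (61) and (62) are not reproduced here. The sketch uses the standard closed-form inverse kinematics of a planar two-link arm.

```python
import numpy as np

def circle_point(t, center, radius, turns_per_s):
    """End-effector position on a circle traversed turns_per_s times per second."""
    w = 2.0 * np.pi * turns_per_s
    return np.array([center[0] + radius * np.cos(w * t),
                     center[1] + radius * np.sin(w * t)])

def inverse_kinematics(p, l1, l2):
    """Joint angles (q1, q2) of a planar two-link arm reaching point p.

    Returns the solution with q2 in [0, pi]; assumes p is reachable.
    """
    x, y = p
    c2 = (x * x + y * y - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    q2 = np.arccos(np.clip(c2, -1.0, 1.0))
    q1 = np.arctan2(y, x) - np.arctan2(l2 * np.sin(q2), l1 + l2 * np.cos(q2))
    return np.array([q1, q2])

def forward_kinematics(q, l1, l2):
    """End-effector position for joint angles q, used to check the inverse."""
    return np.array([l1 * np.cos(q[0]) + l2 * np.cos(q[0] + q[1]),
                     l1 * np.sin(q[0]) + l2 * np.sin(q[0] + q[1])])

# Hypothetical geometry: link lengths and circle centre are assumed values.
l1, l2 = 0.1, 0.1
center, radius = (0.05, 0.05), 0.05

times = np.linspace(0.0, 2.0, 9)
q_slow = np.array([inverse_kinematics(circle_point(t, center, radius, 0.5), l1, l2)
                   for t in times])   # 1/2 turn per second
q_fast = np.array([inverse_kinematics(circle_point(t, center, radius, 2.0), l1, l2)
                   for t in times])   # two turns per second
```

Desired joint velocities and accelerations, which the controller also needs, could be obtained from these samples by differentiation; that step is left out of the sketch.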

The remaining settings are the learning rates in the weight-tuning law and the simulation time of 20 s.

The results in Figs. 5 and 6 clearly show the ability of the CMAC neural-network controller to overcome uncertainties, both structured and unstructured. Note that the problem noted in Fig. 4 with the OCTC does not arise here, as all the nonlinearities are assumed unknown to the CMAC neural controller.

Fig. 3. Performance of OCTC (34): (a) tracking error for slow motion and (b) tracking error for fast motion (solid: joint 1, dotted: joint 2).

Fig. 4. Performance of OCTC (34): (a) tracking error with modeling error for slow motion and (b) tracking error with disturbance and friction for slow motion (solid: joint 1, dotted: joint 2).

Fig. 5. Performance of CMAC neural-network controller (40): (a) tracking error for slow motion and (b) tracking error for fast motion (solid: joint 1, dotted: joint 2).

Fig. 6. Performance of CMAC neural-network controller (40): (a) tracking error with disturbance and friction for fast motion and (b) tracking error of mass variation ($m$: 2.3 → 4.0 kg at 5 s, $m$: 4.0 → 2.3 kg at 12 s) with disturbance and friction for fast motion (solid: joint 1, dotted: joint 2).

VI. CONCLUSION

We have developed a hierarchical, intelligent control scheme for a robotic manipulator using the HJB optimization process and the CMAC neural network. It has been shown that the entire closed-loop system behavior depends on the user-specified performance index, $Q$ and $R$, through the critic-gain matrix $\Lambda$. The Lyapunov function for the stability of the overall system is automatically generated by the weighting matrices. In the derivation of the optimal-computed torque controller, it has been assumed that the nonlinearities in the robotic manipulator are completely known. However, even with knowledge of the nonlinearities, it is difficult to achieve the control objective in the presence of modeling uncertainties and frictional forces. The salient feature of the CMAC neural-HJB design is that the control objective is obtained with completely unknown nonlinearities in the robotic manipulator. The proposed neural-adaptive learning shows both robustness and adaptation to changing system dynamics. To that end, a critic signal is incorporated into the adaptive-learning scheme. The application potential of the proposed methodology lies in control design in areas such as robotics and flight control and in motion-control analysis (e.g., of biomechanics).

APPENDIX A
PROOF OF LEMMA 1

The theorem claims that the HJB equation (A1) is satisfied for a function (A2), where (A3) holds. To derive the optimal-control law, the partial derivatives of the function $V$ need to be evaluated. Here, we have the time derivative of the function, (A4). The gradient of $V$ with respect to the error state $z$ is (A5), with (A6). In (A6), the zero entry is a zero vector, and a shorthand notation is used to represent the matrix whose elements are element-wise partial derivatives.

A candidate for the Hamiltonian (24) is the sum of (A5) and the Lagrangian (22). Now we are ready to evaluate how $H$ depends on $u$. The $u$ for which $H$ has its minimum values is obtained from the partial derivative w.r.t. $u$. Since $u$ is unconstrained, (A3) requires that (A7) holds, which gives a candidate for the optimal control, (A8). Since (A9) holds, we know that (A3) is satisfied by $u^*$, given (A8). Inserting (A5) and (A6) into (A8) gives (A10). Notice that the relation (A11) is used.

A necessary and sufficient condition for optimality is that the chosen value function $V$ satisfies (23). Substituting (24) into (23) yields (A12), where it is understood that the partial derivatives of $V$ are being evaluated along the optimal control $u^*$ in (A12). Inserting (A4) into (A12), we obtain (A13).

Inserting (20), (22), and (A10) into (A13) gives (A14). Using the relation stated after (A14), (A14) can be written as (A15). We can summarize by stating that if a matrix $P$ can be found that satisfies (A15), then the value function $V$ given in (A2) satisfies the HJB equation (A1). In this case, the desired optimal control is given by (A10). Note that if the matrix $P$ satisfies the algebraic Riccati equation (28), then $P$ satisfies (A15). This completes the proof.

APPENDIX B
PROOF OF THEOREM 1

From Lemma 1, it is known that the function (A16) solves the HJB equation for $P$ solving the matrix equation from the quadratic form (A17). The optimal-feedback control law that minimizes $J$ is (A18). Let the weighting matrices be given by (30). Inserting the expressions for the matrices $A$, $B$ in (20) and $P$ in (27) into (A2), we have (A19). Whence the application of robot Property 2, (17), shows that the matrices $K$ and $\Lambda$ of (31) and (32) solve the algebraic Riccati equation (A20). This completes the proof.

REFERENCES

[1] J. S. Albus, "A new approach to manipulator control: The cerebellar model articulation controller (CMAC)," J. Dynamic Syst., Meas., Contr., vol. 97, no. 3, pp. 220-227, 1975.
[2] A. R. Barron, "Universal approximation bounds for superpositions of a sigmoidal function," IEEE Trans. Inform. Theory, vol. 39, pp. 930-945, Mar. 1993.
[3] C.-T. Chiang and C.-S. Lin, "CMAC with general basis functions," Neural Networks, vol. 9, no. 7, pp. 1199-1211, 1996.
[4] S. Commuri, F. L. Lewis, S. Q. Zhu, and K. Liu, "CMAC neural networks for control of nonlinear dynamical systems," Proc. Neural, Parallel and Scientific Computing, vol. 1, pp. 119-124, 1995.
[5] D. Dawson, M. Grabbe, and F. L. Lewis, "Optimal control of a modified computed-torque controller for a robot manipulator," Int. J. Robot. Automat., vol. 6, no. 3, pp. 161-165, 1991.
[6] R. Johansson, "Quadratic optimization of motion coordination and control," IEEE Trans. Automat. Contr., vol. 35, pp. 1197-1208, Nov. 1990.
[7] D. E. Koditschek, "Quadratic Lyapunov functions for mechanical systems," Yale Univ., Tech. Rep. 703, Mar. 1987.
[8] S. H. Lane, D. A. Handelman, and J. J. Gelfand, "Theory and development of higher-order CMAC neural networks," IEEE Contr. Syst. Mag., pp. 23-30, Apr. 1992.
[9] F. L. Lewis, C. T. Abdallah, and D. M. Dawson, Control of Robot Manipulators. New York: Macmillan, 1993.
[10] F. L. Lewis and V. L. Syrmos, Optimal Control, 2nd ed. New York: Wiley, 1995.
[11] K. S. Narendra and A. M. Annaswamy, "A new adaptive law for robust adaptation without persistent excitation," IEEE Trans. Automat. Contr., vol. AC-32, pp. 134-145, Feb. 1987.
[12] F. L. Lewis, A. Yesildirek, and K. Liu, "Multilayer neural-net robot controller with guaranteed tracking performance," IEEE Trans. Neural Networks, vol. 7, pp. 388-399, Mar. 1996.
[13] M. M. Polycarpou, "Stable adaptive neural control scheme for nonlinear systems," IEEE Trans. Automat. Contr., vol. 41, pp. 447-451, Mar. 1996.
[14] Y.-F. Wong and A. Sideris, "Learning convergence in the cerebellar model articulation controller," IEEE Trans. Neural Networks, vol. 3, pp. 115-121, Jan. 1992.
[15] MatLab Users Guide, Control System Toolbox. Natick, MA: Mathworks, 1990.

Young Ho Kim was born in Taegu, Korea, in 1960. He received the B.S. degree in physics from Korea Military Academy in 1983, the M.S. degree in electrical engineering from the University of Central Florida, Orlando, in 1988, and the Ph.D. degree in electrical engineering from the University of Texas at Arlington, Fort Worth, in 1997. From 1994 to 1997, he was a Research Assistant at the Automation and Robotics Research Institute, University of Texas, Arlington. He has published extensively in the fields of feedback control using neural networks and fuzzy systems. He authored the book High-Level Feedback Control with Neural Networks.
His research interests include optimal control, neural networks, dynamic recurrent neural networks, fuzzy-logic systems, real-time adaptive critics for intelligent control of robotics, and nonlinear systems. Dr. Kim received the Korean Army Overseas Scholarship. He received the Sigma Xi Doctoral Research Award in 1997. He is a member of Sigma Xi.

Frank L. Lewis (S'78–M'81–SM'86–F'94) was born in Würzburg, Germany. He received the B.S. degree in physics and electrical engineering and the M.S. degree in electrical engineering at Rice University, Houston, TX, in 1971. He received the M.S. degree in aeronautical engineering from the University of West Florida, Pensacola, in 1977. He received the Ph.D. degree from Georgia Institute of Technology, Atlanta, in 1981. In 1981, he was employed as a Professor of Electrical Engineering with the University of Texas, Arlington. He spent six years in the United States Navy, serving as Navigator aboard the frigate USS Trippe (FF-1075) and Executive Officer and Acting Commanding Officer aboard USS Salinan (ATF-161). He has studied the geometric, analytic, and structural properties of dynamical systems and feedback control automation. His current interests include robotics, intelligent control, neural and fuzzy systems, nonlinear systems, and manufacturing process control. He is the author/coauthor of two U.S. patents, 124 journal papers, 20 chapters and encyclopedia articles, 210 refereed conference papers, and 7 books. Dr. Lewis is a registered Professional Engineer in the State of Texas and was selected to the Editorial Boards of International Journal of Control, Neural Computing and Applications, and International Journal of Intelligent Control Systems. He is the recipient of an NSF Research Initiation Grant and has been continuously funded by NSF since 1982. Since 1991, he has received $1.8 million in funding from NSF and upwards of $1 million in SBIR/industry/state funding. He was awarded the Moncrief-O'Donnell Endowed Chair in 1990 at the Automation and Robotics Research Institute, Arlington, TX. He received a Fulbright Research Award, the American Society of Engineering Education F. E. Terman Award, three Sigma Xi Research Awards, the UTA Halliburton Engineering Research Award, the UTA University-Wide Distinguished Research Award, the ARRI Patent Award, various Best Paper Awards, the IEEE Control Systems Society Best Chapter Award, and the National Sigma Xi Award for Outstanding Chapter (as President). He was selected as Engineer of the Year in 1994 by the Ft. Worth, TX, IEEE Section. He was appointed to the NAE Committee on Space Station in 1995 and to the IEEE Control Systems Society Board of Governors in 1996. In 1998, he was selected as an IEEE Control Systems Society Distinguished Lecturer. He is a Founding Member of the Board of Governors of the Mediterranean Control Association.