LYAPUNOV-BASED ROBUST AND ADAPTIVE CONTROL OF NONLINEAR SYSTEMS USING A NOVEL FEEDBACK STRUCTURE


LYAPUNOV-BASED ROBUST AND ADAPTIVE CONTROL OF NONLINEAR SYSTEMS USING A NOVEL FEEDBACK STRUCTURE

By

PARAG PATRE

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

2009

© 2009 Parag Patre

To my parents, Madhukar and Surekha Patre, and my sister, Aparna

ACKNOWLEDGMENTS

I would like to express sincere gratitude to my advisor, Dr. Warren E. Dixon, whose experience and motivation were instrumental in the successful completion of my PhD. I appreciate his patience with me and his willingness to allow me the freedom to work independently. I would also like to extend my gratitude to my committee members, Dr. Norman Fitz-Coy, Dr. Rick Lind, Dr. Pramod Khargonekar, and Dr. Frank Lewis, for the time and help they provided. I would like to thank my coworkers, family, and friends for their support and encouragement.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT

CHAPTER

1 INTRODUCTION
   Motivation and Problem Statement
   Contributions

2 ASYMPTOTIC TRACKING FOR SYSTEMS WITH STRUCTURED AND UNSTRUCTURED UNCERTAINTIES
   Introduction
   Dynamic Model
   Error System Development
   Stability Analysis
   Experimental Results
   Discussion
   Conclusions

3 ASYMPTOTIC TRACKING FOR UNCERTAIN DYNAMIC SYSTEMS VIA A MULTILAYER NEURAL NETWORK FEEDFORWARD AND RISE FEEDBACK CONTROL STRUCTURE
   Introduction
   Dynamic Model
   Control Objective
   Feedforward NN Estimation
   RISE Feedback Control Development
      Open-Loop Error System
      Closed-Loop Error System
   Stability Analysis
   Experiment
   Discussion
   Conclusions

4 A NEW CLASS OF MODULAR ADAPTIVE CONTROLLERS
   Introduction
   Dynamic System
   Control Objective
   Control Development
   Modular Adaptive Update Law Development
   Stability Analysis
   Neural Network Extension to Non-LP Systems
      RISE Feedback Control Development
      Modular Tuning Law Development
      Stability Analysis
   Application to Euler-Lagrange Systems
   Experiment
      Modular Adaptive Update Law
      Modular Neural Network Update Law
   Discussion
   Conclusion

5 COMPOSITE ADAPTIVE CONTROL FOR SYSTEMS WITH ADDITIVE DISTURBANCES
   Introduction
   Dynamic System
   Control Objective
   Control Development
      RISE-based Swapping
      Composite Adaptation
      Closed-Loop Prediction Error System
      Closed-Loop Tracking Error System
   Stability Analysis
   Experiment
   Discussion
   Conclusion

6 COMPOSITE ADAPTATION FOR NN-BASED CONTROLLERS
   Introduction
   Dynamic System
   Control Objective
   Control Development
      Swapping
      Composite Adaptation
      Closed-Loop Error System
   Stability Analysis
   Experiment
   Discussion
   Conclusion

7 CONCLUSIONS AND FUTURE WORK
   Conclusions
   Future Work

APPENDIX

REFERENCES

BIOGRAPHICAL SKETCH

LIST OF TABLES

2-1 t-test: two samples assuming equal variances for RMS error
2-2 t-test: two samples assuming equal variances for RMS torque
LP case: average RMS values for 10 trials
Non-LP case: average RMS values for 10 trials

LIST OF FIGURES

2-1 The experimental testbed consists of a circular disk mounted on a NSK direct-drive switched reluctance motor
2-2 Desired trajectory used for the experiment
2-3 Position tracking error without the adaptive feedforward term
2-4 Torque input without the adaptive feedforward term
2-5 Position tracking error for the control structure that includes the adaptive feedforward term
2-6 Parameter estimates of the adaptive feedforward component: (a) γ̂1, (b) γ̂4, (c) γ̂6, (d) Ĵ
2-7 Torque input for the control structure that includes the adaptive feedforward term
2-8 The contribution of the RISE term for the control structure that includes the adaptive feedforward term
2-9 RMS position tracking errors and torques for the two cases: (1) without the adaptation term in the control input, (2) with the adaptation term in the control input
3-1 Tracking error for the RISE control law with no NN adaptation
3-2 Control torque for the RISE control law with no NN adaptation
3-3 Tracking error for the proposed RISE+NN control law
3-4 Control torque for the proposed RISE+NN control law
3-5 Average RMS errors (degrees) and torques (N-m). 1- RISE, 2- RISE+NN (proposed)
4-1 The experimental testbed consists of a two-link robot. The links are mounted on two NSK direct-drive switched reluctance motors
4-2 Link position tracking error with a gradient-based adaptive update law
4-3 Torque input for the modular adaptive controller with a gradient-based adaptive update law
4-4 Adaptive estimates for the gradient update law
4-5 Link position tracking error with a least-squares adaptive update law
4-6 Torque input for the modular adaptive controller with a least-squares adaptive update law
4-7 Adaptive estimates for the least-squares update law
4-8 Link position tracking error for the modular NN controller with a gradient-based tuning law
4-9 Torque input for the modular NN controller with a gradient-based tuning law
4-10 Link position tracking error for the modular NN controller with a Hebbian tuning law
4-11 Torque input for the modular NN controller with a Hebbian tuning law
5-1 Block diagram of the proposed RISE-based composite adaptive controller
5-2 Actual and desired trajectories for the proposed composite adaptive control law (RISE+CFF)
5-3 Tracking error for the proposed composite adaptive control law (RISE+CFF)
5-4 Prediction error for the proposed composite adaptive control law (RISE+CFF)
5-5 Control torque for the proposed composite adaptive control law (RISE+CFF)
5-6 Contribution of the RISE term in the proposed composite adaptive control law (RISE+CFF)
5-7 Adaptive estimates for the proposed composite adaptive control law (RISE+CFF)
5-8 Average RMS errors (degrees) and torques (N-m). 1- RISE, 2- RISE+FF, 3- RISE+CFF (proposed)
6-1 Block diagram of the proposed RISE-based composite NN controller
6-2 Tracking error for the proposed composite adaptive control law (RISE+CNN)
6-3 Prediction error for the proposed composite adaptive control law (RISE+CNN)
6-4 Control torque for the proposed composite adaptive control law (RISE+CNN)
6-5 Average RMS errors (degrees) and torques (N-m). 1- RISE, 2- RISE+NN, 3- RISE+CNN (proposed)

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

LYAPUNOV-BASED ROBUST AND ADAPTIVE CONTROL OF NONLINEAR SYSTEMS USING A NOVEL FEEDBACK STRUCTURE

Chair: Warren E. Dixon
Major: Mechanical Engineering
By Parag Patre
August 2009

The focus of this research is an examination of the interplay between different intelligent feedforward mechanisms and a recently developed continuous robust feedback mechanism, coined the Robust Integral of the Sign of the Error (RISE), to yield asymptotic tracking in the presence of generic disturbances. This result solves a decades-long open problem: how to obtain asymptotic stability for nonlinear systems with general, sufficiently smooth disturbances using a continuous control method. Further, it is shown that the developed technique can be fused with other feedforward methods such as function approximation and adaptive control methods. The addition of feedforward elements adds system knowledge to the control structure, which, heuristically, yields better performance and reduces control effort. This heuristic notion is supported by the experimental results in this research.

One key element in the development of the novel feedforward mechanisms presented in this dissertation is the modularity between the controller and the update law. This modularity provides flexibility in the selection of different update laws that could be easier to implement or help to achieve faster parameter convergence and better tracking performance. The efficacy of the feedforward mechanisms is further enhanced by including a prediction error in the learning process. The prediction error, which directly relates to the actual function mismatch, is used along with the system tracking errors to develop a composite adaptation law. Each result is supported through rigorous Lyapunov-based stability proofs and experimental demonstrations.

CHAPTER 1
INTRODUCTION

1.1 Motivation and Problem Statement

The control of systems with uncertain nonlinear dynamics has been a mainstream area of focus for decades. For systems with uncertainties that can be linearly parameterized, a variety of adaptive feedforward controllers (e.g., [1–3]) can be utilized to achieve an asymptotic result. Some recent results have also targeted the application of adaptive controllers to systems that are not linear in the parameters [4]. Learning controllers have been developed for systems with periodic disturbances [5–7], and recent research has focused on the use of exosystems [8–11] to compensate for disturbances that are the solution of a linear time-invariant system with unknown coefficients. A variety of methods have also been proposed to compensate for systems with unstructured uncertainty, including various sliding mode controllers (e.g., [3, 12]), robust control schemes [13], and neural network and fuzzy logic controllers [14–18]. From a review of these approaches, a general trend is that controllers developed for systems with more unstructured uncertainty require more control effort (i.e., high gain or high frequency feedback) and yield reduced performance (e.g., uniformly ultimately bounded stability).

Recently, a new robust control strategy, coined the Robust Integral of the Sign of the Error (RISE) in [19, 20], was developed in [21, 22] that can accommodate sufficiently smooth bounded disturbances. A significant outcome of this new control structure is that asymptotic stability is obtained despite a fairly general uncertain disturbance. This technique was used in [23] to develop a tracking controller for nonlinear systems in the presence of additive disturbances and parametric uncertainties under the assumption that the disturbances are C^2 with bounded time derivatives. In [24], Xian et al. utilized this strategy to propose a new output feedback discontinuous tracking controller for a general class of second-order nonlinear systems whose uncertain dynamics are first-order differentiable. In [25], Zhang et al. combined the high gain feedback structure with a

14 high gain observer at the sacrifice of yielding a semi-global uniformly ultimately bounded result. This particular high gain feedback method has also been used as an identification technique. For example, the method has been applied to identify friction (e.g., [26]), for range identification in perspective and paracatadioptric vision systems (e.g., [27], [28]), and for fault detection and identification (e.g., [29]). The development in Chapter 2 is motivated by the desire to include some knowledge of the dynamics in the control design as a means to improve the performance and reduce the control effort. Specifically, for systems that include some dynamics that can be segregated into structured (i.e., linear parameterizable) and unstructured uncertainty, this chapter illustrates how a new controller, error system, and stability analysis can be crafted to include a model-based adaptive feedforward term in conjunction with the high gain RISE feedback technique to yield an asymptotic tracking result. Another learning method that has been extensively investigated by control researchers over the last fifteen years is the use of neural networks (NNs) as a feedforward element in the control structure. The focus on NN-based control methods is spawned from the ramification of the fact that NNs are universal approximators [3]. That is, NNs can be used as a black-box estimator for a general class of systems. Examples include: nonlinear systems with parametric uncertainty that do not satisfy the linear-in-the-parameters assumption required in most adaptive control methods; systems with deadzones or discontinuities; and systems with backlash. Typically, NN-based controllers yield global uniformly ultimately bounded (UUB) stability results (e.g., see [15, 31, 32] for examples and reviews of literature) due to residual functional reconstruction inaccuracies and an inability to compensate for some system disturbances. Motivated by the desire to eliminate the residual steady-state errors, several researchers have obtained asymptotic tracking results by combining the NN feedforward element with discontinuous feedback methods such as variable structure controllers (VSC) (e.g., [33, 34]) or sliding mode (SM) controllers (e.g., [34, 35]). A clever VSC-like controller was also proposed in [36] where 14

15 the controller is not initially discontinuous, but exponentially becomes discontinuous as an exogenous control element exponentially vanishes. Well known limitations of VSC and SM controllers include a requirement for infinite control bandwidth and chattering. Unfortunately, ad hoc fixes for these effects result in a loss of asymptotic stability (i.e., UUB typically results). Motivated by issues associated with discontinuous controllers and the typical UUB stability result, an innovative continuous NN-based controller was recently developed in [37] to achieve partial asymptotic stability for a particular class of systems. The result in Chapter 3 is motivated by the question: Can a NN feedforward controller be modified by a continuous feedback element to achieve an asymptotic tracking result for a general class of systems? Despite the pervasive development of NN controllers in literature and the widespread use of NNs in industrial applications, the answer to this fundamental question has remained an open problem. To provide an answer to the fundamental motivating question, the result in Chapter 3 focuses on augmenting a multi-layer NN-based feedforward method with the RISE control strategy Like most of the research in adaptive control, the results in Chapter 2 and 3 exploit Lyapunov-based techniques (i.e., the controller and the adaptive update law are designed based on a Lyapunov analysis); however, Lyapunov-based methods restrict the design of the adaptive update law. For example, many of the previous adaptive controllers are restricted to utilizing gradient update laws to cancel cross terms in a Lyapunov-based stability analysis. Gradient update laws can potentially exhibit slower parameter convergence which could lead to a degraded transient performance of the tracking error in comparison to other possible adaptive update laws (e.g., least-squares update law). Several results have been developed in literature that aim to augment the typical position/velocity tracking error-based gradient update law including: composite adaptive update laws [3, 38, 39]; prediction error-based update laws [1, 4 43]; and various least-squares update laws [44 46]. The adaptive update law in these results are all still designed to cancel cross terms in the Lyapunov-based stability analysis. In contrast to these results, 15

researchers have also developed a class of modular adaptive controllers (cf. [1, 4, 42, 43]) where a feedback mechanism is used to stabilize the error dynamics provided certain conditions are satisfied by the adaptive update law. For example, nonlinear damping [41, 47] is typically used to yield an input-to-state stability (ISS) result with respect to the parameter estimation error, where it is assumed a priori that the update law yields bounded parameter estimates. Often the modular adaptive control development exploits a prediction error in the update law (e.g., see [3, 40–43]), where the prediction error is often required to be square integrable (e.g., [4, 42, 43]). A brief survey of modular adaptive control results is provided in [4]. Since the RISE feedback mechanism alone can yield an asymptotic result without a feedforward component to cancel cross terms in the stability analysis, the research in Chapter 4 is motivated by the following question: Can the RISE control method be used to yield a new class of modular adaptive controllers?

Typical adaptive, robust adaptive, and function approximation methods, including the ones used in Chapters 2-4, use tracking error feedback to update the adaptive estimates. As mentioned earlier, the use of the tracking error is motivated by the need for the adaptive update law to cancel cross terms in the closed-loop tracking error system within a Lyapunov-based analysis. As the tracking error converges, the rate of the update law also converges, but drawing conclusions about the convergent value (if any) of the parameter update law is problematic. Ideally, the adaptive update law would include some estimate of the parameter estimation error as a means to prove the parameter estimates converge to the actual values; however, the parameter estimation error is unknown. The desire to include some measurable form of the parameter estimation error in the adaptation law resulted in the development of adaptive update laws that are driven, in part, by a prediction error [3, 42, 48–50]. The prediction error is defined as the difference between the value of the system uncertainty predicted from the parameter estimates and the actual system uncertainty. Including feedback of the estimation error in the adaptive update law enables improved parameter estimation. For example, some classic results [1, 3, 41] have proven

the parameter estimation error is square integrable and that the parameter estimates may converge to the actual uncertain parameters. Since the prediction error depends on the unmeasurable system uncertainty, the swapping lemma [3, 42, 48–50] is central to the prediction error formulation. The swapping technique (also described as input or torque filtering in some literature) transforms a dynamic parametric model into a static form where standard parameter estimation techniques can be applied. In [1] and [41], a nonlinear extension of the swapping lemma was derived, which was used to develop the modular z-swapping and x-swapping identifiers via an input-to-state stable (ISS) controller for systems in parametric strict feedback form. The advantages provided by prediction error-based adaptive update laws led to several results that use either the prediction error or a composite of the prediction error and the tracking error (cf. [43, 51–56] and the references therein).

Although prediction error-based adaptive update laws have existed for approximately two decades, no stability result has been developed for systems with additive bounded disturbances. In general, the inclusion of disturbances reduces the steady-state performance of continuous controllers to a uniformly ultimately bounded (UUB) result. In addition to a UUB result, the inclusion of disturbances may cause unbounded growth of the parameter estimates [57] for tracking error-based adaptive update laws without the use of projection algorithms or other update law modifications such as σ-modification [58]. Problems associated with the inclusion of disturbances are magnified for control methods based on prediction error-based update laws because the formulation of the prediction error requires the swapping (or control filtering) method. Applying the swapping approach to dynamics with additive disturbances is problematic because the unknown disturbance terms also get filtered and included in the filtered control input. This problem motivates the question of how a prediction error-based adaptive update law can be developed for systems with additive disturbances. To address this motivating question, a general Euler-Lagrange-like MIMO system is considered in Chapter

18 5 with structured and unstructured uncertainties, and a gradient-based composite adaptive update law is developed that is driven by both the tracking error and the prediction error. The swapping procedure used in standard adaptive control cannot be extended to NN controllers directly. The presence of a NN reconstruction error has impeded the development of composite adaptation laws for NNs. Specifically, the reconstruction error gets filtered and included in the prediction error destroying the typical prediction error formulation. Using the techniques developed in Chapter 5, the result in Chapter 6 presents the first ever attempt to develop a prediction error-based composite adaptive NN controller for an Euler-Lagrange second-order dynamic system using the RISE feedback. A usual concern about the RISE feedback is the presence of high-gain and high-frequency components in the control structure. However, in contrast to a typical widely used discontinuous high-gain sliding mode controller, the RISE feedback offers a continuous alternative. Moreover, the proposed control designs are not purely high gain as an adaptive element is used as a feedforward component that learns and incorporates the knowledge of system dynamics in the control structure. 1.2 Contributions This research focuses on combining various feedforward terms with the RISE feedback method for the control of uncertain nonlinear dynamic systems. The contributions of Chapters 2-6 are as follows. Chapter 2: Asymptotic Tracking for Systems with Structured and Unstructured Uncertainties: The development in this chapter is motivated by the desire to include some knowledge of the dynamics in the control design as a means to improve the performance and reduce the control effort. Specifically, for systems that include some dynamics that can be segregated into structured (i.e., linear parameterizable) and unstructured uncertainty, this chapter illustrates how a new controller, error system, and stability analysis can be crafted to include a model-based adaptive feedforward term in conjunction with the RISE feedback technique to yield an asymptotic tracking result. This chapter 18

19 presents the first result that illustrates how the amalgamation of these compensation methods can be used to yield an asymptotic result. Chapter 3: Asymptotic Tracking for Uncertain Dynamic Systems via a Multilayer Neural Network Feedforward and RISE Feedback Control Structure: The use of a NN as a feedforward control element to compensate for nonlinear system uncertainties has been investigated for over a decade. Typical NN-based controllers yield uniformly ultimately bounded (UUB) stability results due to residual functional reconstruction inaccuracies and an inability to compensate for some system disturbances. Several researchers have proposed discontinuous feedback controllers (e.g., variable structure or sliding mode controllers) to reject the residual errors and yield asymptotic results. The research in this chapter describes how the RISE feedback term can be incorporated with a NN-based feedforward term to achieve the first ever asymptotic tracking result. To achieve this result, the typical stability analysis for the RISE method is modified to enable the incorporation of the NN-based feedforward terms, and a projection algorithm is developed to guarantee bounded NN weight estimates. Experimental results are presented to demonstrate the performance of the proposed controller. Chapter 4: A New Class of Modular Adaptive Controllers: A novel adaptive nonlinear control design is developed which achieves modularity between the controller and the adaptive update law. Modularity between the controller/update law design provides flexibility in the selection of different update laws that could potentially be easier to implement or used to obtain faster parameter convergence and/or better tracking performance. For a general class of linear-in-the-parameters (LP) uncertain multi-input multi-output systems subject to additive bounded non-lp disturbances, the developed controller uses a model-based feedforward adaptive term in conjunction with the RISE feedback term. Modularity in the adaptive feedforward term is made possible by considering a generic form of the adaptive update law and its corresponding parameter estimate. This generic form of the update law is used to develop a new closed-loop error 19

20 system and stability analysis that does not depend on nonlinear damping to yield the modular adaptive control result. The result is then extended by considering uncertain dynamic systems that are not necessarily LP, and have additive non-lp bounded disturbances. A multilayer NN structure is used in the non-lp extension as a feedforward element to compensate for the non-lp dynamics in conjunction with the RISE feedback term. A NN-based controller is developed with modularity in NN weight tuning laws and the control law. An extension is provided that describes how the control development for the general class of systems can be applied to a class of dynamic systems modeled by Euler-Lagrange formulation. Experimental results on a two-link robot are included to illustrate the concept. Chapter 5: Composite Adaptive Control for Systems with Additive Disturbances: In a typical adaptive update law, the rate of adaptation is generally a function of the state feedback error. Ideally, the adaptive update law would also include some feedback of the parameter estimation error. The desire to include some measurable form of the parameter estimation error in the adaptation law resulted in the development of composite adaptive update laws that are functions of a prediction error and the state feedback. In all previous composite adaptive controllers, the formulation of the prediction error is predicated on the critical assumption that the system uncertainty is linear in the uncertain parameters (LP uncertainty). The presence of additive disturbances that are not LP would destroy the prediction error formulation and stability analysis arguments in previous results. In this chapter, a new prediction error formulation is constructed through the use of the RISE technique. The contribution of this design and associated stability analysis is that the prediction error can be developed even with disturbances that do not satisfy the LP assumption (e.g., additive bounded disturbances). A composite adaptive controller is developed for a general MIMO Euler-Lagrange system with mixed structured (i.e., LP) and unstructured uncertainties. A Lyapunov-based stability analysis is used to 2

21 derive sufficient gain conditions under which the proposed controller yields semi-global asymptotic tracking. Experimental results are presented to illustrate the approach. Chapter 6: Composite Adaptation for NN-Based Controllers: With the motivation of using more information to update the parameter estimates, composite adaptation that uses both the system tracking errors and a prediction error containing parametric information to drive the update laws, has become widespread in adaptive control literature. However, despite its obvious benefits, composite adaptation has not been implemented in NN-based control, primarily due to the NN reconstruction error that destroys a typical prediction error formulation required for the composite adaptation. This chapter presents the first ever attempt to design a composite adaptation law for NNs by devising an innovative swapping procedure that uses the RISE feedback method. Semi-global asymptotic tracking for the system errors is proved, while all other signals and control input are shown to be bounded. 21

CHAPTER 2
ASYMPTOTIC TRACKING FOR SYSTEMS WITH STRUCTURED AND UNSTRUCTURED UNCERTAINTIES

2.1 Introduction

The development in this chapter is motivated by the desire to include some knowledge of the dynamics in the control design as a means to improve the performance and reduce the control effort. Specifically, for systems that include some dynamics that can be segregated into structured (i.e., linear parameterizable) and unstructured uncertainty, this chapter illustrates how a new controller and error system can be crafted to include a model-based adaptive feedforward term in conjunction with the RISE feedback technique to yield an asymptotic tracking result. This chapter presents the first result that illustrates how the amalgamation of these compensation methods can be used to yield an asymptotic result. Heuristically, the addition of the model-based adaptive feedforward term should reduce the overall control effort because some of the disturbance has been isolated and compensated for by a non-high-gain feedforward element. Moreover, the addition of the adaptive feedforward term injects some knowledge of the dynamics into the control structure, which leads to improved performance. Experimental results are presented to reinforce these heuristic notions. Specifically, the presented controller was implemented on a simple rotating circular disk testbed and demonstrated reduced tracking error and control effort. For this testbed, the dynamics included in the feedforward term were the inertia of the linkage assembly and the friction present in the system.

2.2 Dynamic Model

The class of nonlinear dynamic systems considered in this chapter is assumed to be modeled by the following Euler-Lagrange formulation, which describes the behavior of a large class of engineering systems:

M(q)q̈ + V_m(q, q̇)q̇ + G(q) + f(q̇) + τ_d(t) = τ(t).  (2-1)

In (2-1), M(q) ∈ R^{n×n} denotes a generalized inertia matrix, V_m(q, q̇) ∈ R^{n×n} denotes a generalized centripetal-Coriolis matrix, G(q) ∈ R^n denotes a generalized gravity vector, f(q̇) ∈ R^n denotes a generalized friction vector, τ_d(t) ∈ R^n denotes a generalized nonlinear disturbance (e.g., unmodeled effects), τ(t) ∈ R^n represents the generalized torque input control vector, and q(t), q̇(t), q̈(t) ∈ R^n denote the generalized link position, velocity, and acceleration vectors, respectively. The friction term f(q̇) in (2-1) is assumed to have the following form [26], [59]:

f(q̇) = γ_1(tanh(γ_2 q̇) − tanh(γ_3 q̇)) + γ_4 tanh(γ_5 q̇) + γ_6 q̇,  (2-2)

where γ_i ∈ R, i = 1, 2, ..., 6, denote unknown positive constants. The friction model in (2-2) has the following properties: 1) it is symmetric about the origin, 2) it has a static coefficient of friction, 3) it exhibits the Stribeck effect, where the friction coefficient decreases from the static coefficient of friction with increasing slip velocity near the origin, 4) it includes a viscous dissipation term, and 5) it has a Coulombic friction coefficient in the absence of viscous dissipation. To a good approximation, the static friction coefficient is given by γ_1 + γ_4, and the Stribeck effect is captured by tanh(γ_2 q̇) − tanh(γ_3 q̇). The Coulombic friction coefficient is given by γ_4 tanh(γ_5 q̇), and the viscous dissipation is given by γ_6 q̇. For further details regarding the friction model, see [26] and [59].

The subsequent development is based on the assumption that q(t) and q̇(t) are measurable and that M(q), V_m(q, q̇), G(q), f(q̇), and τ_d(t) are unknown. The error systems utilized in this chapter assume that the generalized coordinates q(t) of the Euler-Lagrange dynamics in (2-1) allow additive and not multiplicative errors. Moreover, the following assumptions will be exploited in the subsequent development:

Assumption 2-1 (Symmetric and Positive-Definite Inertia Matrix): The inertia matrix M(q) is symmetric, positive definite, and satisfies the following inequality for all ξ(t) ∈ R^n:

m_1 ‖ξ‖^2 ≤ ξ^T M(q) ξ ≤ m̄(q) ‖ξ‖^2,  (2-3)

where m_1 ∈ R is a known positive constant, m̄(q) ∈ R is a known positive function, and ‖·‖ denotes the standard Euclidean norm.

Assumption 2-2: If q(t), q̇(t) ∈ L_∞, then V_m(q, q̇), f(q̇), and G(q) are bounded. Moreover, if q(t), q̇(t) ∈ L_∞, then the first and second partial derivatives of the elements of M(q), V_m(q, q̇), and G(q) with respect to q(t) exist and are bounded, and the first and second partial derivatives of the elements of V_m(q, q̇) and f(q̇) with respect to q̇(t) exist and are bounded.

Assumption 2-3: The nonlinear disturbance term and its first two time derivatives, i.e., τ_d(t), τ̇_d(t), τ̈_d(t), are bounded by known constants.

Assumption 2-4: The desired trajectory is designed such that q_d^{(i)}(t) ∈ R^n (i = 0, 1, ..., 4) exist and are bounded.

2.3 Error System Development

The control objective is to ensure that the system tracks a desired time-varying trajectory despite structured and unstructured uncertainties in the dynamic model. To quantify this objective, a position tracking error, denoted by e_1(t) ∈ R^n, is defined as

e_1 = q_d − q.  (2-4)

To facilitate the subsequent analysis, filtered tracking errors, denoted by e_2(t), r(t) ∈ R^n, are also defined as

e_2 = ė_1 + α_1 e_1  (2-5)
r = ė_2 + α_2 e_2,  (2-6)

where α_1, α_2 ∈ R denote positive constants. The subsequent development is based on the assumption that q(t) and q̇(t) are measurable, so the filtered tracking error r(t) is not measurable since the expression in (2-6) depends on q̈(t).
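As a concrete illustration of the error definitions in (2-4)-(2-6), the short Python sketch below computes e_1 and e_2 from position and velocity measurements for a single-DOF example. The gain values, signal names, and the sampled trajectory are illustrative assumptions, not values from the experiments.

```python
import numpy as np

# Gains from (2-5)-(2-6); illustrative values only.
alpha1, alpha2 = 4.0, 3.0

def tracking_errors(q, dq, qd, dqd):
    """Position error e1 = qd - q and filtered error e2 = de1 + alpha1*e1.

    Only q and dq are assumed measurable, so r = de2 + alpha2*e2 is not
    computed here (it would require the unmeasurable acceleration)."""
    e1 = qd - q
    de1 = dqd - dq
    e2 = de1 + alpha1 * e1
    return e1, e2

# Example: evaluate the errors along a sampled sinusoidal desired trajectory
# with the plant held at rest (placeholder measurements).
t = np.linspace(0.0, 1.0, 1001)
qd, dqd = np.sin(1.2 * t), 1.2 * np.cos(1.2 * t)
q, dq = np.zeros_like(t), np.zeros_like(t)
e1, e2 = tracking_errors(q, dq, qd, dqd)
```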

The open-loop tracking error system can be developed by premultiplying (2-6) by M(q) and utilizing the expressions in (2-1)-(2-5) to obtain the following expression:

M(q)r = Y_d θ + S + W_d + τ_d(t) − τ(t),  (2-7)

where Y_d(q_d, q̇_d, q̈_d)θ ∈ R^n is defined as

Y_d θ ≜ M(q_d)q̈_d + V_m(q_d, q̇_d)q̇_d + G(q_d) + γ_1(tanh(γ̄_2 q̇_d) − tanh(γ̄_3 q̇_d)) + γ_4 tanh(γ̄_5 q̇_d) + γ_6 q̇_d.  (2-8)

In (2-8), θ ∈ R^p contains the constant unknown system parameters, Y_d(q_d, q̇_d, q̈_d) ∈ R^{n×p} is the desired regression matrix that contains known functions of the desired link position, velocity, and acceleration, q_d(t), q̇_d(t), q̈_d(t) ∈ R^n, respectively, and γ̄_2, γ̄_3, γ̄_5 ∈ R are the best-guess estimates for γ_2, γ_3, and γ_5, respectively. In (2-7), the auxiliary function S(q, q̇, q_d, q̇_d, q̈_d) ∈ R^n is defined as

S ≜ M(q)(α_1 ė_1 + α_2 e_2) + M(q)q̈_d − M(q_d)q̈_d + V_m(q, q̇)q̇ − V_m(q_d, q̇_d)q̇_d + G(q) − G(q_d) + γ_6 q̇ − γ_6 q̇_d + γ_4 tanh(γ_5 q̇) − γ_4 tanh(γ_5 q̇_d) + γ_1(tanh(γ_2 q̇) − tanh(γ_3 q̇)) − γ_1(tanh(γ_2 q̇_d) − tanh(γ_3 q̇_d)),  (2-9)

and the auxiliary function W_d(q̇_d) ∈ R^n is defined as

W_d ≜ γ_4 tanh(γ_5 q̇_d) − γ_4 tanh(γ̄_5 q̇_d) + γ_1(tanh(γ_2 q̇_d) − tanh(γ_3 q̇_d)) − γ_1(tanh(γ̄_2 q̇_d) − tanh(γ̄_3 q̇_d)).  (2-10)

Based on the expression in (2-7), the control torque input is designed as

τ = Y_d θ̂ + µ.  (2-11)

In (2-11), µ(t) ∈ R^n denotes the RISE term defined as

µ(t) ≜ (k_s + 1)e_2(t) − (k_s + 1)e_2(0) + ∫_0^t [(k_s + 1)α_2 e_2(σ) + β sgn(e_2(σ))] dσ,  (2-12)

where k_s, β ∈ R are positive, constant control gains, and θ̂(t) ∈ R^p denotes a parameter estimate vector generated on-line according to the following update law:

θ̂̇ = Γ Ẏ_d^T r,  (2-13)

with Γ ∈ R^{p×p} being a known, constant, diagonal, positive-definite adaptation gain matrix. Since Ẏ_d(t) is only a function of the known desired time-varying trajectory, (2-13) can be integrated by parts as follows:

θ̂(t) = θ̂(0) + Γ Ẏ_d^T e_2(σ)|_0^t − Γ ∫_0^t [Ÿ_d^T e_2(σ) − α_2 Ẏ_d^T e_2(σ)] dσ,  (2-14)

so that the parameter estimate vector θ̂(t) implemented in (2-11) does not depend on the unmeasurable signal r(t).

Remark 2.1. The control design in (2-11) is similar to the results in [22]. However, previous designs based on [22] could only compensate for uncertainty in the system through the high gain RISE feedback term µ(t). Through the new development presented in the current result, an adaptive feedforward term can also be used to compensate for system uncertainty. This flexibility presents a significant advantage because it allows more system dynamics to be incorporated in the control design. Specifically, if some of the system uncertainty can be segregated into a linear parameterizable form, then the model-based adaptive feedforward term can be injected to compensate for that uncertainty instead of relying solely on the non-model-based high gain RISE feedback term. Heuristically, this contribution should improve the tracking performance and reduce the control effort. Experimental results on a simple one-link robot manipulator provide some validation of this heuristic idea.

The closed-loop tracking error system can be developed by substituting (2-11) into (2-7) as

M(q)r = Y_d θ̃ + S + W_d + τ_d − µ(t),  (2-15)

where θ̃(t) ∈ R^p represents the parameter estimation error vector defined as

θ̃ = θ − θ̂.  (2-16)
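To make the structure of (2-11)-(2-14) concrete, the sketch below gives a minimal discrete-time implementation in Python. It is an illustrative approximation only: the class name, gain values, and the simple rectangular accumulation of the integrals are assumptions (the experiments described later used a trapezoidal rule at 1 kHz), and the user must supply the desired regression matrix and its first two time derivatives.

```python
import numpy as np

class RiseAdaptiveController:
    """Sketch of tau = Yd @ theta_hat + mu from (2-11), with mu given by the
    RISE term (2-12) and theta_hat by the integrated-by-parts law (2-14)."""

    def __init__(self, ks, beta, alpha2, Gamma, theta0, dt):
        self.ks, self.beta, self.alpha2 = ks, beta, alpha2
        self.Gamma, self.dt = np.asarray(Gamma, float), dt
        self.theta_hat = np.asarray(theta0, float)
        self.mu_int = 0.0        # running integral in (2-12)
        self.th_int = 0.0        # running integral in (2-14)
        self.e2_0 = None         # e2(0), latched on the first call

    def update(self, e2, Yd, Yd_dot, Yd_ddot):
        if self.e2_0 is None:
            # Latch initial conditions: theta0_corr = theta_hat(0) - Gamma*Yd_dot(0)^T*e2(0)
            self.e2_0 = e2.copy()
            self.theta0_corr = self.theta_hat - self.Gamma @ (Yd_dot.T @ e2)
        # RISE term (2-12); the integral is accumulated with a rectangular rule.
        self.mu_int += self.dt * ((self.ks + 1.0) * self.alpha2 * e2
                                  + self.beta * np.sign(e2))
        mu = (self.ks + 1.0) * (e2 - self.e2_0) + self.mu_int
        # Update law (2-14); note r is never needed.
        self.th_int += self.dt * (Yd_ddot.T @ e2 - self.alpha2 * (Yd_dot.T @ e2))
        self.theta_hat = (self.theta0_corr + self.Gamma @ (Yd_dot.T @ e2)
                          - self.Gamma @ self.th_int)
        return Yd @ self.theta_hat + mu   # control torque (2-11)
```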

To facilitate the subsequent stability analysis (and to illustrate some insight into the structure of the design for µ(t)), the time derivative of (2-15) is determined as

M(q)ṙ = −(1/2)Ṁ(q)r + Ẏ_d θ̃ + Ñ(t) + N_d(t) − µ̇(t) − e_2,  (2-17)

where the unmeasurable auxiliary term Ñ(e_1, e_2, r) ∈ R^n is defined as

Ñ(t) ≜ −Y_d Γ Ẏ_d^T r + Ṡ − (1/2)Ṁ(q)r + e_2,  (2-18)

where (2-13) was used. In (2-17), the unmeasurable auxiliary term N_d(q_d, q̇_d, q̈_d) ∈ R^n is defined as

N_d(t) ≜ Ẇ_d + τ̇_d.  (2-19)

The time derivative of (2-12) is given as

µ̇(t) = (k_s + 1)r + β sgn(e_2).  (2-20)

In a similar manner as in [60], the Mean Value Theorem can be used to develop the following upper bound¹:

‖Ñ(t)‖ ≤ ρ(‖z‖) ‖z‖,  (2-21)

where z(t) ∈ R^{3n} is defined as

z(t) ≜ [e_1^T  e_2^T  r^T]^T.  (2-22)

The following inequalities can be developed based on the expression in (2-19) and its time derivative:

‖N_d(t)‖ ≤ ζ_{N_d},  ‖Ṅ_d(t)‖ ≤ ζ_{N_d2},  (2-23)

¹ See Lemma 1 of the Appendix for the proof of the inequality in (2-21).

28 where ζ Nd, ζ Nd2 R are known positive constants. 2.4 Stability Analysis Theorem 2-1: The controller given in (2 11), (2 12), and (2 14) ensures that all system signals are bounded under closed-loop operation and that the position tracking error is regulated in the sense that e 1 (t) as t provided the control gain k s introduced in (2 12) is selected sufficiently large based on the initial conditions of the system (see the subsequent proof for details), α 1, α 2 are selected according to the following sufficient condition α 1 > 1 2, α 2 > 1 (2 24) and β is selected according to the following sufficient condition β > ζ Nd + 1 α 2 ζ Nd2 (2 25) where ζ Nd and ζ Nd2 are introduced in (2 23). Proof: Let D R 3n+p+1 be a domain containing y(t) =, where y(t) R 3n+p+1 is defined as y(t) [ z T (t) θ T (t) P(t)] T (2 26) and the auxiliary function P(t) R is defined as P(t) β n e 2i () e 2 () T N d () i=1 t L(τ)dτ (2 27) where the subscript i = 1, 2,.., n denotes the ith element of the vector. In (2 27), the auxiliary function L(t) R is defined as L(t) r T (N d (t) βsgn(e 2 )). (2 28) 28

29 The derivative P(t) R can be expressed as P(t) = L(t) = r T (N d (t) βsgn(e 2 )). (2 29) Provided the sufficient condition introduced in (2 25) is satisfied, the following inequality can be obtained 2 : t L(τ)dτ β n e 2i () e 2 () T N d (). (2 3) i=1 Hence, (2 3) can be used to conclude that P(t). Let V (y, t) : D [, ) R be a continuously differentiable, positive definite function defined as V (y, t) e T 1 e et 2 e rt M(q)r + P θ T Γ 1 θ (2 31) which satisfies the following inequalities: U 1 (y) V (y, t) U 2 (y) (2 32) provided the sufficient condition introduced in (2 25) is satisfied. In (2 32), the continuous, positive definite functions U 1 (y), U 2 (y) R are defined as U 1 (y) η 1 y 2 U 2 (y) η 2 (q) y 2 (2 33) where η 1, η 2 (q) R are defined as η min { { 1, m 1, λ }} min Γ 1 { 1 η 2 (q) max 2 m(q), 1 2 λ { } } max Γ 1, 1 where m 1, m(q) are introduced in (2 3) and λ min { }, λ max { } denote the minimum and maximum eigenvalues, respectively, of the argument. After taking the time derivative of 2 The inequality in (2 3) can be obtained in a similar manner as in Lemma 2 of the Appendix. 29

30 (2 31), V (y, t) can be expressed as V (y, t) = r T M(q)ṙ rt Ṁ(q)r + e T 2 ė2 + 2e T 1 ė1 + P θ T Γ 1 ˆθ. Remark 2.2. From (2 17), (2 27) and (2 28), some of the differential equations describing the closed-loop system for which the stability analysis is being performed have discontinuous right-hand sides as Mṙ = 1 2Ṁ(q)r + Ẏd θ + Ñ (t) + N d (t) (k s + 1)r β 1 sgn(e 2 ) e 2 (2 34) P(t) = L(t) = r T (N d (t) βsgn(e 2 )) (2 35) Let f (y, t) R n+1 denote the right-hand side of (2 34) (2 35). Since the subsequent analysis requires that a solution exists for ẏ = f (y, t), it is important to show the existence and uniqueness of the solution to (2 34) (2 35). As described in [13, 61], the existence of Filippov s generalized solution can be established for (2 34) (2 35). First, note that f (y, t) is continuous except in the set {(y, t) e 2 = }. Let F (y, t) be a compact, convex, upper semicontinuous set-valued map that embeds the differential equation ẏ = f (y, t) into the differential inclusions ẏ F (y, t). From Theorem 27 of [13], an absolute continuous solution exists to ẏ F (y, t) that is a generalized solution to ẏ = f (y, t). A common choice for F (y, t) that satisfies the above conditions is the closed convex hull of f (y, t) [13, 61]. A proof that this choice for F (y, t) is upper semicontinuous is given in [62]. Moreover, note that the differential equation describing the original closed-loop system (i.e., after substituting (2 11) into (2 1)) has a continuous right-hand side; thus, satisfying the condition for existence of classical solutions. Similar arguments are used for all the results in this dissertation. After utilizing (2 5), (2 6), (2 13), (2 17), (2 2), and (2 29), V (y, t) can be simplified as V (y, t) = r T Ñ(t) (k s + 1) r 2 α 2 e 2 2 2α 1 e e T 2 e 1. (2 36) 3

31 Because e T 2 (t)e 1 (t) can be upper bounded as e T 2 e e e 2 2 V (y, t) can be upper bounded using the squares of the components of z(t) as follows: V (y, t) r T Ñ(t) (k s + 1) r 2 α 2 e 2 2 2α 1 e e e 2 2. By using (2 21), the expression in (2 36) can be rewritten as follows: V (y, t) η 3 z 2 ( k s r 2 ρ( z ) r z ) (2 37) where η 3 min{2α 1 1, α 2 1, 1}, and the bounding function ρ( z ) R is a positive, globally invertible, nondecreasing function; hence, α 1, and α 2 must be chosen according to the sufficient conditions in (2 24). After completing the squares for the parenthetic terms in (2 37), the following expression can be obtained: V (y, t) η 3 z 2 + ρ2 ( z ) z 2. (2 38) 4k s The expression in (2 38) can be further upper bounded by a continuous, positive semi-definite function V (y, t) U(y) = c z 2 y D (2 39) for some positive constant c R, where D {y R 3n+p+1 y ρ 1 (2 η 3 k s )}. Larger values of k will expand the size of the domain D. The inequalities in (2 32) and (2 39) can be used to show that V (y, t) L in D; hence, e 1 (t), e 2 (t), r(t), and θ(t) L in D. Given that e 1 (t), e 2 (t), and r(t) L in D, standard linear analysis methods can be used to prove that ė 1 (t), ė 2 (t) L in D from (2 5) and (2 6). Since θ R p contains the constant unknown system parameters and θ(t) L in D, (2 16) can be used to 31

32 prove that ˆθ(t) L in D. Since e 1 (t), e 2 (t), r(t) L in D, the assumption that q d (t), q d (t), q d (t) exist and are bounded can be used along with (2 4)-(2 6) to conclude that q(t), q(t), q(t) L in D. The assumption that q d (t), q d (t), q d (t),... q d (t),... q d (t) exist and are bounded along with (2 8) can be used to show that Y d (q d, q d, q d ), Ẏ d (q d, q d, q d,... q d ), and Ÿd (q d, q d, q d,... q d,... q d ) L in D. Since q(t), q(t) L in D, Assumption 2-2 can be used to conclude that M(q), V m (q, q), G(q), and F( q) L in D. Thus from (2 1) and Assumption 2-3, we can show that τ(t) L in D. Given that r(t) L in D, (2 2) can be used to show that µ(t) L in D. Since q(t), q(t) L in D, Assumption 2-2 can be used to show that V m (q, q), Ġ(q), F(q) and Ṁ(q) L in D; hence, (2 17) can be used to show that ṙ(t) L in D. Since ė 1 (t), ė 2 (t), ṙ(t) L in D, the definitions for U(y) and z(t) can be used to prove that U(y) is uniformly continuous in D. Let S D denote a set defined as follows: 3 S {y(t) D U (y(t)) < η 1 (ρ 1 (2 η 3 k s )) 2 }. (2 4) Theorem 8.4 of [63] can now be invoked to state that c z(t) 2 as t y() S. (2 41) Based on the definition of z(t), (2 41) can be used to show that e 1 (t) as t y() S. 2.5 Experimental Results The testbed depicted in Figure 2-1 was used to implement the developed controller. The testbed consists of a circular disc of unknown inertia mounted on a NSK direct-drive 3 The region of attraction in (2 4) can be made arbitrarily large to include any initial conditions by increasing the control gain k s (i.e., a semi-global type of stability result) [22]. 32

Figure 2-1. The experimental testbed consists of a circular disk mounted on a NSK direct-drive switched reluctance motor.

switched reluctance motor (24. Nm, Model YS524-GN1). The NSK motor is controlled through power electronics operating in torque control mode. The motor resolver provides rotor position measurements with a resolution of 614,400 pulses/revolution. A Pentium 2.8 GHz PC operating under QNX hosts the control algorithm, which was implemented via Qmotor 3.0, a graphical user interface, to facilitate real-time graphing, data logging, and adjustment of control gains without recompiling the program (for further information on Qmotor 3.0, the reader is referred to [64]). Data acquisition and control implementation were performed at a frequency of 1.0 kHz using the ServoToGo I/O board. A rectangular nylon block was mounted on a pneumatic linear thruster to apply an external friction load to the rotating disk. A pneumatic regulator maintained a constant pressure of 2 pounds per square inch on the circular disk. The dynamics for the testbed are given as follows:

J q̈ + f(q̇) + τ_d(t) = τ(t),  (2-42)

where J ∈ R denotes the combined inertia of the circular disk and rotor assembly, the friction torque f(q̇) ∈ R is defined in (2-2), and τ_d(t) ∈ R denotes a general nonlinear disturbance (e.g., unmodeled effects). The parameters γ_2, γ_3, γ_5 are embedded inside

the nonlinear hyperbolic tangent functions and hence cannot be linearly parameterized. Since these parameters cannot be compensated for by an adaptive algorithm, best-guess estimates γ̄_2 = 5, γ̄_3 = 1, γ̄_5 = 5 are used. The values for γ̄_2, γ̄_3, γ̄_5 are based on previous experiments concerned with friction identification. Significant errors in these static estimates could degrade the performance of the system.

Figure 2-2. Desired trajectory used for the experiment.

The control torque input τ(t) is given by (2-11), where Y_d(q̇_d, q̈_d) ∈ R^{1×4} is the regression matrix defined as

Y_d ≜ [q̈_d   tanh(γ̄_2 q̇_d) − tanh(γ̄_3 q̇_d)   tanh(γ̄_5 q̇_d)   q̇_d],

and θ̂(t) ∈ R^4 is the vector consisting of the unknown parameters defined as

θ̂ ≜ [Ĵ  γ̂_1  γ̂_4  γ̂_6]^T.  (2-43)

The parameter estimate vector in (2-43) is generated on-line using the adaptive update law in (2-14). The desired link trajectory (see Figure 2-2) was selected as follows (in degrees):

q_d(t) = 45.0 sin(1.2t)(1 − exp(−0.1t^3)).  (2-44)
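The following Python sketch builds the desired trajectory (2-44) and one row of the desired regression matrix Y_d for the testbed model. The best-guess friction shape estimates and the trajectory constants are taken from the text as extracted and should be treated as placeholders rather than the exact experimental values; the derivatives are obtained numerically, which is sufficient for an off-line illustration.

```python
import numpy as np

# Best-guess friction shape estimates as reported in the text (placeholders).
G2_BAR, G3_BAR, G5_BAR = 5.0, 1.0, 5.0

def desired_trajectory(t):
    """Desired trajectory (2-44) in degrees, with numerically differentiated
    velocity and acceleration."""
    qd = 45.0 * np.sin(1.2 * t) * (1.0 - np.exp(-0.1 * t**3))
    dqd = np.gradient(qd, t)
    ddqd = np.gradient(dqd, t)
    return qd, dqd, ddqd

def regression_row(dqd, ddqd):
    """Single row of Y_d(dq_d, ddq_d) so that Y_d @ [J, g1, g4, g6]
    reproduces the modeled part of the testbed dynamics (2-42)."""
    return np.array([ddqd,
                     np.tanh(G2_BAR * dqd) - np.tanh(G3_BAR * dqd),
                     np.tanh(G5_BAR * dqd),
                     dqd])

t = np.linspace(0.0, 10.0, 10001)
qd, dqd, ddqd = desired_trajectory(t)
Yd_t0 = regression_row(dqd[0], ddqd[0])   # Y_d evaluated at t = 0
```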

For all experiments, the rotor velocity signal is obtained by applying a standard backwards difference algorithm to the position signal. The integral structure of the adaptive term in (2-14) and the RISE term in (2-12) was computed on-line via a standard trapezoidal algorithm. In addition, all the states and unknown parameters were initialized to zero. The signum function for the control scheme in (2-12) was defined as

sgn(e_2(t)) = 1 if e_2 > .5;  0 if −.5 ≤ e_2 ≤ .5;  −1 if e_2 < −.5.

The RISE controller is composed of proportional-integral-derivative (PID) elements and the integral of the sign term. There are several tuning techniques available in the literature for the PID elements in the controller; however, the control gains were obtained by choosing initial gains and then adjusting them based on performance. If the response exhibited a prolonged transient (compared with the response obtained with other gains), the proportional and integral gains were adjusted. If the response exhibited overshoot, the derivative gains were adjusted. To fine-tune the performance, the adaptive gains were adjusted after the feedback gains were tuned as described to yield the best performance. In contrast to this approach of adjusting the control and adaptation gains, the control gains could potentially be adjusted using more methodical approaches. For example, the nonlinear system in [65] was linearized at several operating points and a linear controller was designed for each point, and the gains were chosen by interpolating, or scheduling, the linear controllers. In [66], a neural network is used to tune the gains of a PID controller. In [67], a genetic algorithm was used to fine-tune the gains after an initialization. Additionally, in [68], the tuning of a PID controller for robot manipulators is discussed.
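A minimal sketch of the discrete-time implementation details described above is given below, assuming a fixed 1 kHz sample rate. The deadband threshold mirrors the value in the text as extracted, and the gains in the usage example are illustrative assumptions only.

```python
import numpy as np

DT = 1.0e-3          # 1 kHz sample period, as stated for the testbed
DEADBAND = 0.5       # signum threshold from the text as extracted (assumption)

def sgn_deadband(e2, eps=DEADBAND):
    """Signum with a small deadband, as implemented for (2-12)."""
    return np.where(e2 > eps, 1.0, np.where(e2 < -eps, -1.0, 0.0))

class TrapezoidIntegrator:
    """Running trapezoidal integral, mirroring the on-line computation of the
    integral terms in (2-12) and (2-14)."""
    def __init__(self, n):
        self.value = np.zeros(n)
        self.prev = None
    def step(self, x):
        if self.prev is not None:
            self.value += 0.5 * DT * (x + self.prev)
        self.prev = np.array(x, dtype=float)
        return self.value

# Example: accumulate the RISE integrand for a scalar error signal.
ks, alpha2, beta = 1.0, 3.0, 115.0           # illustrative gains
integ = TrapezoidIntegrator(1)
for e2 in np.sin(np.linspace(0.0, 1.0, 1000)):
    e2 = np.array([e2])
    integ.step((ks + 1.0) * alpha2 * e2 + beta * sgn_deadband(e2))
```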

Experiment 1

In the first experiment, the controller in (2-11) was implemented without including the adaptation term. Thus the control torque input given in (2-11) takes the following form [22]: τ(t) = µ(t). The gains for the controller that yielded the best steady-state performance were determined as follows:

k_s = 1,  β = 115,  α_1 = 4,  α_2 = 3.  (2-45)

The position tracking error obtained from the controller is plotted in Figure 2-3, and the torque input by the controller is depicted in Figure 2-4.

Figure 2-3. Position tracking error without the adaptive feedforward term.

Experiment 2

In the second experiment, the control input given in (2-11) was used. The update law defined in (2-14) was used to update the parameter estimates defined in (2-43). The following control gains and best-guess estimates were used to implement the controller in (2-11):

k_s = 1,  β = 115,  α_1 = 4,  α_2 = 3,  Γ = diag{1, 1, 1, 1}.

Figure 2-4. Torque input without the adaptive feedforward term.

Figure 2-5. Position tracking error for the control structure that includes the adaptive feedforward term.

The position tracking error obtained from the controller is plotted in Figure 2-5, the parameter estimates are depicted in Figure 2-6, the contribution of the RISE term is shown in Figure 2-8, and the torque input by the controller is depicted in Figure 2-7.

Figure 2-6. Parameter estimates of the adaptive feedforward component: (a) γ̂_1, (b) γ̂_4, (c) γ̂_6, (d) Ĵ.

Figure 2-7. Torque input for the control structure that includes the adaptive feedforward term.

Figure 2-8. The contribution of the RISE term for the control structure that includes the adaptive feedforward term.

2.6 Discussion

Figure 2-5 illustrates that the incorporation of a model-based feedforward term eliminates the spikes present in Figure 2-3 that occur when the motor changes direction. The spikes are initially present in Figure 2-5, but reduce in magnitude and vanish as the adaptive update converges. These figures illustrate exactly how the addition of the adaptive feedforward element injects model knowledge into the control design to improve the overall performance. Figure 2-8 indicates that the contribution of the RISE term in the overall torque decreases with time as the feedforward adaptation term begins to compensate for part of the disturbances.

Both experiments were repeated 10 consecutive times with the same gain values to check the repeatability and accuracy of the results. For each run, the root mean squared (RMS) values of the position tracking errors and torques were calculated. The averages of these RMS values for the two cases (with adaptation and without adaptation) obtained over the 10 sets are plotted in Figure 2-9, where the bars indicate the variance about the mean. An unpaired t-test assuming equal variances was performed using a statistical

package (Microsoft Office Excel 2003) with a significance level of α = 0.05. The results of the t-test for the RMS error and the RMS torque are shown in Table 2-1 and Table 2-2, respectively. Table 2-1 indicates that the P value obtained for the one-tailed test is less than the significance level α. Thus, the mean RMS error for case 2 is lower than that of case 1, and this difference is statistically significant. Similarly, from Table 2-2, the mean RMS torque for case 2 is lower than that of case 1. The results indicate that the mean RMS value of the position tracking error when the adaptive feedforward term is used is about 43.5% less than in the case when no adaptation term is used. This improvement in performance by the proposed controller was obtained while using 17.6% less input torque, as shown in Figure 2-9.

While the developed controller is a continuous controller, it can exhibit some high frequency content due to the presence of the integral sign function. However, the frequency content is finite, unlike current discontinuous nonlinear control methods. The experimental results show some chattering in the input/output signals, but the mechanical system acts as a low-pass filter because the actuator bandwidth is lower than the bandwidth produced by the controller. Also, the controller requires full-state feedback (i.e., both position and velocity measurements are needed), but as mentioned earlier, only the position is measured and the velocity is obtained by an unfiltered backward difference algorithm. The need for velocity feedback is also a source of noise, especially for the sub-degree errors that the controller yields.
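The statistical comparison described above (per-run RMS values followed by a two-sample, equal-variance t-test) can be reproduced with the short Python sketch below. The per-run RMS arrays are placeholders for illustration only; the actual values come from the ten experimental runs of each controller.

```python
import numpy as np
from scipy import stats

def rms(x):
    """Root-mean-square of a sampled signal."""
    return float(np.sqrt(np.mean(np.square(x))))

# Placeholder per-run RMS tracking errors (degrees) for the two cases.
rms_no_adapt = np.array([0.21, 0.20, 0.22, 0.19, 0.21, 0.20, 0.22, 0.21, 0.20, 0.21])
rms_adapt    = np.array([0.12, 0.11, 0.13, 0.12, 0.11, 0.12, 0.13, 0.12, 0.11, 0.12])

# Two-sample t-test assuming equal variances (as in Tables 2-1 and 2-2).
# The one-tailed P value is half the two-tailed value when the observed
# difference has the hypothesized sign.
t_stat, p_two_tail = stats.ttest_ind(rms_no_adapt, rms_adapt, equal_var=True)
p_one_tail = p_two_tail / 2.0
print(f"t = {t_stat:.3f}, one-tail P = {p_one_tail:.4g}")
```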

Table 2-1. t-test: two samples assuming equal variances for RMS error

RMS Error                      Variable 1   Variable 2
Mean
Variance
Observations                   10           10
Pooled Variance
Hypothesized Mean Difference
df                             18
t Stat
P(T<=t) one-tail
t Critical one-tail
P(T<=t) two-tail
t Critical two-tail            2.101

Figure 2-9. RMS position tracking errors and torques for the two cases: (1) without the adaptation term in the control input, (2) with the adaptation term in the control input.

Table 2-2. t-test: two samples assuming equal variances for RMS torque

RMS Torque                     Variable 1   Variable 2
Mean
Variance
Observations                   10           10
Pooled Variance
Hypothesized Mean Difference
df                             18
t Stat
P(T<=t) one-tail
t Critical one-tail
P(T<=t) two-tail
t Critical two-tail            2.101

2.7 Conclusions

A new class of asymptotic controllers is developed that contains an adaptive feedforward term to account for linear parameterizable uncertainty and a high gain feedback term that accounts for unstructured disturbances. In comparison with previous results that used a similar high gain feedback control structure, new control development, error systems, and stability analysis arguments were required to include the additional adaptive feedforward term. The motivation for injecting the adaptive feedforward term is that improved tracking performance and reduced control effort result from including more knowledge of the system dynamics in the control structure. This heuristic idea was verified by our experimental results, which indicate reduced control effort and reduced RMS tracking errors.

CHAPTER 3
ASYMPTOTIC TRACKING FOR UNCERTAIN DYNAMIC SYSTEMS VIA A MULTILAYER NEURAL NETWORK FEEDFORWARD AND RISE FEEDBACK CONTROL STRUCTURE

3.1 Introduction

The contribution in this chapter is motivated by the question: Can a NN feedforward controller be modified by a continuous feedback element to achieve an asymptotic tracking result for a general class of systems? Despite the pervasive development of NN controllers in the literature and the widespread use of NNs in industrial applications, the answer to this fundamental question has remained an open problem. To provide an answer to the fundamental motivating question, the result in this chapter focuses on augmenting a multi-layer NN-based feedforward method with a recently developed [21] high gain control strategy coined the Robust Integral of the Sign of the Error (RISE) in [19, 20]. The RISE control structure is advantageous because it is a differentiable control method that can compensate for additive system disturbances and parametric uncertainties under the assumption that the disturbances are C^2 with bounded time derivatives. Due to the advantages of the RISE control structure, a flurry of results has recently been developed (e.g., [22, 23, 25–27]).

A RISE feedback controller can be directly applied to yield asymptotic stability for the class of systems described in this chapter. However, the RISE method is a high-gain feedback tool, and hence, clear motivation exists (as with any other feedback controller) to combine a feedforward control element with the feedback controller for potential gains such as improved transient and steady-state performance and reduced control effort. That is, it is well accepted that a feedforward component can be used to cancel out some dynamic effects without relying on high-gain feedback. Given this motivation, some results have already been developed that combine the RISE feedback element with feedforward terms. In [60], a remark is provided regarding the use of a constant best-guess feedforward component in conjunction with the RISE method to yield a UUB result. In [19, 20], the

44 RISE feedback controller was combined with a standard gradient feedforward term for systems that satisfy the linear-in-the-parameters assumption. The experimental results in [19] illustrate significant improvement in the root-mean-squared tracking error with reduced root-mean-squared control effort. However, for systems that do not satisfy the linear-in-the-parameters assumption, motivation exists to combine the RISE controller with a new feedforward method such as the NN. To blend the NN and RISE methods, several technical challenges must be addressed. One (lesser) challenge is that the NN must be constructed in terms of the desired trajectory instead of the actual trajectory (i.e., a DCAL-based NN structure [36]) to remove the dependence on acceleration. The development of a DCAL-based NN structure is challenging for a multi-layer NN because the adaptation law for the weights is required to be state-dependent. Straightforward application of the RISE method would yield an acceleration dependent adaptation law. One method to resolve this issue is to use a dirty derivative (as in the UUB result in [69]; see also [25]). In lieu of a dirty derivative, the result in this chapter uses a Lyapunov-based stability analysis approach for the design of an adaptation law that is only velocity dependent. In comparison with the efforts in [19, 2], a more significant challenge arises from the fact that since a multi-layer NN includes the first layer weight estimate inside of a nonlinear activation function, the previous methods (e.g., [19, 2]) can not be applied. That is, because of the unique manner in which the NN weight estimates appear, the stability analysis and sufficient conditions developed in previous works are violated. Previous RISE methods have a restriction (encapsulated by a sufficient gain condition) that terms in the stability analysis that are upper bounded by a constant must also have time derivatives that are upper bounded by a constant (these terms are usually denoted by N d (t) in RISE control literature, see [6]). The norm of the NN weight estimates can be bounded by a constant (due to a projection algorithm) but the time derivative is state-dependent (i.e., the norm of N d (t) can be bounded by a constant but the norm of Ṅ d (t) is state dependent). To 44

45 address this issue, modified RISE stability analysis techniques are developed that result in modified (but not more restrictive) sufficient gain conditions. By addressing this issue through stability analysis methods, the standard NN weight adaptation law does not need to be modified. Through unique modifications to the stability analysis that enable the RISE feedback controller to be combined with the NN feedforward term, the result in this chapter provides an affirmative answer for the first time to the aforementioned motivating question. Since the NN and the RISE control structures are model independent (black box) methods, the resulting controller is a universal reusable controller [36] for continuous systems. Because of the manner in which the RISE technique is blended with the NN-based feedforward method, the structure of the NN is not altered from textbook examples [15] and can be considered a somewhat modular element in the control structure. Hence, the NN weights and thresholds are automatically adjusted on-line, with no off-line learning phase required. Compared to standard adaptive controllers, the current asymptotic result does not require linearity in the parameters or the development and evaluation of a regression matrix. For systems with linear-in-the-parameters uncertainty, an adaptive feedforward controller has the desirable characteristics that the controller is continuous, can be proven to yield global asymptotic tracking, and includes the specific dynamics of the system in the feedforward path. Continuous feedback NN controllers don t include the specific dynamics in a regression matrix and have a degraded steady-state stability result (i.e., UUB tracking); however, they can be applied when the uncertainty in the system is unmodeled, can not be linearly parameterized, or the development and implementation of a regression matrix is impractical. Sliding mode feedback NN controllers have the advantage that they can achieve global asymptotic tracking at the expense of implementing a discontinuous feedback controller (i.e., infinite bandwidth, exciting structural modes, etc.). In comparison to these controllers, the development 45

46 in this chapter has the advantage of asymptotic tracking with a continuos feedback controller for a general class of uncertainty; however, these advantages are at the expense of semi-global tracking instead of the typical global tracking results. 3.2 Dynamic Model The dynamic model and its properties are the same as in Chapter 2; however, the dynamics are not assumed to satisfy the linear-in-the-parameters assumption. 3.3 Control Objective The control objective is to ensure that the system tracks a desired time-varying trajectory, denoted by q d (t) R n, despite uncertainties in the dynamic model. To quantify this objective, a position tracking error, denoted by e 1 (t) R n, is defined as e 1 q d q. (3 1) To facilitate the subsequent analysis, filtered tracking errors, denoted by e 2 (t), r(t) R n, are also defined as e 2 ė 1 + α 1 e 1 (3 2) r ė 2 + α 2 e 2 (3 3) where α 1, α 2 R denote positive constants. The filtered tracking error r(t) is not measurable since the expression in (3 3) depends on q(t). 3.4 Feedforward NN Estimation NN-based estimation methods are well suited for control systems where the dynamic model contains unstructured nonlinear disturbances as in (2 1). The main feature that empowers NN-based controllers is the universal approximation property. Let S be a compact simply connected set of R N1+1. With map f : S R n, define C n (S) as the space where f is continuous. There exist weights and thresholds such that some function f(x) C n (S) can be represented by a three-layer NN as [15, 32] f (x) = W T σ ( V T x ) + ε (x) (3 4) 46

47 for some given input x(t) R N1+1. In (3 4), V R (N 1+1) N 2 and W R (N2+1) n are bounded constant ideal weight matrices for the first-to-second and second-to-third layers respectively, where N 1 is the number of neurons in the input layer, N 2 is the number of neurons in the hidden layer, and n is the number of neurons in the third layer. The activation function 1 in (3 4) is denoted by σ ( ) R N2+1, and ε (x) R n is the functional reconstruction error. Note that, augmenting the input vector x(t) and activation function σ ( ) by 1 allows us to have thresholds as the first columns of the weight matrices [15, 32]. Thus, any tuning of W and V then includes tuning of thresholds as well. If ε (x) =, then f (x) is in the functional range of the NN. In general for any positive constant real number ε N >, f (x) is within ε N of the NN range if there exist finite hidden neurons N 2, and constant weights so that for all inputs in the compact set, the approximation holds with ε < ε N. For various activation functions, results such as the Stone-Weierstrass theorem indicate that any sufficiently smooth function can be approximated by a suitable large network. Therefore, the fact that the approximation error ε (x) is bounded follows from the Universal Approximation Property of the NNs [3]. Based on (3 4), the typical three-layer NN approximation for f(x) is given as [15, 32] ˆf (x) Ŵ T σ(ˆv T x) (3 5) where ˆV (t) R (N 1+1) N 2 and Ŵ(t) R(N 2+1) n are subsequently designed estimates of the ideal weight matrices. The estimate mismatch for the ideal weight matrices, denoted by Ṽ (t) R (N 1+1) N 2 and W(t) R (N 2+1) n, are defined as Ṽ V ˆV, W W Ŵ 1 A variety of activation functions (e.g., sigmoid, hyperbolic tangent or radial basis) could be used for the control development in this dissertation. 47

48 and the mismatch for the hidden-layer output error for a given x(t), denoted by σ(x) R N2+1, is defined as σ σ ˆσ = σ(v T x) σ(ˆv T x). (3 6) The NN estimate has several properties that facilitate the subsequent development. These properties are described as follows. Assumption 3-1: (Boundedness of the Ideal Weights) The ideal weights are assumed to exist and be bounded by known positive values so that V 2 F = tr ( V T V ) = vec (V ) T vec (V ) V B (3 7) W 2 F = tr ( W T W ) = vec (W) T vec (W) W B, (3 8) where F is the Frobenius norm of a matrix, tr ( ) is the trace of a matrix, and the operator vec ( ) stacks the columns of a matrix A R m n to form a vector vec(a) R mn as [ vec(a) A 11 A A m1 A 12 A A 1n... A mn ] T. Assumption 3-2: (Convex Regions) Based on (3 7) and (3 8), convex regions (e.g., see Section 4.3 of [7]) can be defined. Specifically, the convex region Λ V can be defined as 2 Λ V { v : v T v V B }, (3 9) where V B was given in (3 7). In addition, the following definitions concerning the region ) Λ V and the parameter estimate vector vec (ˆV R (N 1+1)N 2 (i.e., the dynamic estimate of vec (V ) Λ V ) are provided as follows: int(λ V ) denotes the interior of the region Λ V, ) (Λ V ) denotes the boundary for the region Λ V, vec (ˆV R (N 1 +1)N 2 is a unit vector ) normal to (Λ V ) at the point of intersection of the boundary surface (Λ V ) and vec (ˆV ) where the positive direction for vec (ˆV is defined as pointing away from int(λv ) (note 2 See Lemma 3 of the Appendix for the proof of convexity. 48

49 ) ) that vec (ˆV is only defined for vec (ˆV (Λ V )), Pr t (ψ) is the component of the vector ψ R (N 1+1)N 2 that is tangent to (Λ V ) at the point of intersection of the boundary surface ) (Λ V ) and the vector vec (ˆV, and P r (ψ) = ψ P t r(ψ) R (N 1+1)N 2 (3 1) is the component of the vector ψ R (N 1+1)N 2 that is perpendicular to (Λ V ) at the point ) of intersection of the boundary surface (Λ V ) and the vector vec (ˆV. Similar to (3 9), the convex region Λ W is defined as Λ W { v : v T v W B }, (3 11) where W B was given in (3 8). 3.5 RISE Feedback Control Development The contribution of this chapter is the control development and stability analysis that illustrates how the aforementioned textbook (e.g., [15]) NN feedforward estimation strategy can be fused with a RISE feedback control method as a means to achieve an asymptotic stability result for general Euler-Lagrange systems described by (2 1). In this section, the open-loop and closed-loop tracking error is developed for the combined control system Open-Loop Error System The open-loop tracking error system can be developed by premultiplying (3 3) by M(q) and utilizing the expressions in (2 1), (3 1), and (3 2) to obtain the following expression: M(q)r = f d + S + τ d τ (3 12) where the auxiliary function f d (q d, q d, q d ) R n is defined as f d M(q d ) q d + V m (q d, q d ) q d + G(q d ) + F ( q d ) (3 13) 49

50 and the auxiliary function S (q, q, q d, q d, q d ) R n is defined as S M (q)(α 1 ė 1 + α 2 e 2 ) + M (q) q d M(q d ) q d + V m (q, q) q V m (q d, q d ) q d (3 14) + G(q) G(q d ) + F ( q) F ( q d ). The expression in (3 13) can be represented by a three-layer NN as f d = W T σ(v T x d ) + ε (x d ). (3 15) In (3 15), the input x d (t) R 3n+1 is defined as x d (t) [1 q T d (t) qt d (t) qt d (t)]t so that N 1 = 3n where N 1 was introduced in (3 4). Based on the assumption that the desired trajectory is bounded, the following inequalities hold ε (x d ) ε b1, ε (x d, ẋ d ) ε b2, ε (x d, ẋ d, ẍ d ) ε b3 (3 16) where ε b1, ε b2, ε b3 R are known positive constants Closed-Loop Error System Based on the open-loop error system in (3 12), the control torque input is composed of a three-layer NN feedforward term plus the RISE feedback terms as τ ˆf d + µ. (3 17) Specifically, the RISE feedback control term µ(t) R n is defined as [22] µ(t) (k s + 1)e 2 (t) (k s + 1)e 2 () + t [(k s + 1)α 2 e 2 (σ) + β 1 sgn(e 2 (σ))]dσ (3 18) where k s, β 1 R are positive constant control gains. The feedforward NN component in (3 17), denoted by ˆf d (t) R n, is generated as ˆf d Ŵ T σ(ˆv T x d ). (3 19) 5

51 The estimates for the NN weights in (3 19) are generated on-line (there is no off-line learning phase) using a smooth projection algorithm (e.g., see Section 4.3 of [7]) as Ŵ = proj (µ 1 ) = ˆV = proj (µ 2 ) = ) µ 1 if vec (Ŵ int (Λ W ) ) (Ŵ) µ 1 {if vec (Ŵ (Λ W ) and vec (µ 1 ) T vec { ) (Ŵ) PMr t (µ 1) if vec (Ŵ (Λ W ) and vec (µ 1 ) T vec > (3 2) ) µ 2 if vec (ˆV int (Λ V ) ) ) µ 2 {if vec (ˆV (Λ V ) and vec (µ 2 ) T vec (ˆV (3 21) { ) ) PMr t (µ 2) if vec (ˆV (Λ V ) and vec (µ 2 ) T vec (ˆV > where ) vec (Ŵ () int (Λ W ), ) vec (ˆV () int (Λ V ) and the auxiliary terms µ 1 (t) R (N 2+1) n, µ 2 (t) R (N 1+1) N 2 are defined as µ 1 Γ 1ˆσ ˆV Tẋ d e T 2, µ 2 Γ 2 ẋ d (ˆσ T Ŵe 2 ) T (3 22) where Γ 1 R (N 2+1) (N 2 +1), Γ 2 R (N 1+1) (N 1 +1) are constant, positive definite, symmetric matrices. In (3 2) and (3 21), PMr t (A) = devec (P r t (vec (A))) for a matrix A, where the operation devec ( ) is the reverse of vec ( ). Remark 3.1. The use of the projection algorithm in (3 2) and (3 21) is to ensure that Ŵ(t) and ˆV (t) remain bounded inside the convex regions defined in (3 9), and (3 11). This fact will be exploited in the subsequent stability analysis. The closed-loop tracking error system can be developed by substituting (3 17) into (3 12) as M(q)r = f d ˆf d + S + τ d µ. (3 23) To facilitate the subsequent stability analysis, the time derivative of (3 23) is determined as M(q)ṙ = Ṁ(q)r + f d ˆf d + Ṡ + τ d µ. (3 24) 51

52 Taking the time derivative of the closed-loop error system is typical of the RISE stability analysis. In our case, the time differentiation also facilitates the design of NN weight adaptation laws instead of using the typical (as in [15, 32]) Taylor series approximation method to obtain a linear form for the estimation error Ṽ. Using (3 15) and (3 19), the closed-loop error system in (3 24) can be expressed as M(q)ṙ = Ṁ(q)r + W T σ ( V T x d ) V Tẋ d Ŵ T σ(ˆv T x d ) (3 25) Ŵ T σ (ˆV T x d ) ˆV T x d Ŵ T σ (ˆV T x d )ˆV T ẋ d + ε + Ṡ + τ d µ where σ (ˆV T x) dσ ( V T x ) /d ( V T x ) V T x=ˆv T x. After adding and subtracting the terms W T ˆσ ˆV Tẋ d + Ŵ T ˆσ Ṽ T ẋ d to (3 25), the following expression can be obtained: M(q)ṙ = Ṁ(q)r + Ŵ T ˆσ Ṽ T ẋ d + W T ˆσ ˆV Tẋ d + W T σ V T ẋ d W T ˆσ ˆV Tẋ d (3 26) Ŵ T ˆσ Ṽ T ẋ d + Ṡ Ŵ T ˆσ Ŵ T ˆσ ˆV T x d + τ d + ε µ where the notations ˆσ and σ are introduced in (3 6). Using the NN weight tuning laws in (3 2), (3 21), the expression in (3 26) can be rewritten as M(q)ṙ = 1 2Ṁ(q)r + Ñ + N e 2 (k s + 1)r β 1 sgn(e 2 ) (3 27) where the fact that the time derivative of (3 18) is given as µ(t) = (k s + 1)r + β 1 sgn(e 2 ) (3 28) was utilized, and where the unmeasurable auxiliary terms Ñ(e 1, e 2, r, t), N(Ŵ, ˆV, x d, ẋ d, t) R n are defined as Ñ(t) 1 2Ṁ(q)r proj(γ 1ˆσ ˆV Tẋ d e T 2 )T ˆσ (3 29) Ŵ T ˆσ proj(γ 2 ẋ d (ˆσ T Ŵe 2 ) T ) T x d + Ṡ + e 2 52

53 and In (3 3), N d (x d, ẋ d, t) R n is defined as N N d + N B. (3 3) N d W T σ V T ẋ d + ε + τ d (3 31) while N B (Ŵ, ˆV, x d, ẋ d, t) R n is further segregated as N B N B1 + N B2 (3 32) where N B1 (Ŵ, ˆV, x d, ẋ d, t) R n is defined as N B1 W T ˆσ ˆV Tẋ d Ŵ T ˆσ Ṽ T ẋ d (3 33) and the term N B2 (Ŵ, ˆV, x d, ẋ d, t) R n is defined as N B2 Ŵ T ˆσ Ṽ T ẋ d + W T ˆσ ˆV Tẋ d. (3 34) Motivation for segregating the terms in (3 3) is derived from the fact that the different components in (3 3) have different bounds. Segregating the terms as in (3 3)-(3 34) facilitates the development of the NN weight update laws and the subsequent stability analysis. For example, the terms in (3 31) are grouped together because the terms and their time derivatives can be upper bounded by a constant and rejected by the RISE feedback, whereas the terms grouped in (3 32) can be upper bounded by a constant but their derivatives are state dependent. The state dependency of the time derivatives of the terms in (3 32) violates the assumptions given in previous RISE-based controllers (e.g., [19, 2, 22, 23, 25 27]), and requires additional consideration in the adaptation law design and stability analysis. The terms in (3 32) are further segregated because N (Ŵ, ˆV B1, x d ) will be rejected by the RISE feedback, whereas N (Ŵ, ˆV B2, x d ) will be partially rejected by the RISE feedback and partially canceled by the adaptive update law for the NN weight estimates. 53

54 In a similar manner as in Chapter 2, the Mean Value Theorem can be used to develop the following upper bound Ñ(t) ρ ( z ) z (3 35) where z(t) R 3n is defined as z(t) [e T 1 e T 2 r T ] T (3 36) and the bounding function ρ( z ) R is a positive globally invertible nondecreasing function. The following inequalities can be developed based on Assumption 2-3, (3 7), (3 8), (3 16), (3 32)-(3 34): N d ζ 1, N B1 ζ 2, N B2 ζ 3, Ṅd ζ 4. (3 37) From (3 3), (3 32) and (3 37), the following bound can be developed N 1 N d + N 1B N d + N 1Ba + N 1Bb ζ 1 + ζ 2 + ζ 3. (3 38) By using (3 2), (3 21), the time derivative of N B (Ŵ, ˆV, x d ) can be bounded as ṄB ζ 5 + ζ 6 e 2. (3 39) In (3 37) and (3 39), ζ i R, (i = 1, 2,..., 6) are known positive constants. 3.6 Stability Analysis Theorem 3-1: The combined NN and RISE controller given in (3 17)-(3 21) ensures that all system signals are bounded under closed-loop operation and that the position tracking error is regulated in the sense that e 1 (t) as t provided the control gain k s introduced in (3 18) is selected sufficiently large (see the subsequent proof), α 1, α 2 are selected according to the following sufficient condition α 1 > 1 2, α 2 > β 2 + 1, (3 4) 54

55 and β 1 and β 2 are selected according to the following sufficient conditions: { β 1 > max ζ 1 + ζ 2 + ζ 3, ζ 1 + ζ 2 + ζ 4 + ζ } 5, β 3 > ζ 6 (3 41) α 2 α 2 where ζ i R, i = 1, 2,..., 5 are introduced in (3 37)-(3 39) and β 2 is introduced in (3 44). as Proof: Let D R 3n+2 be a domain containing y(t) =, where y(t) R 3n+2 is defined In (3 42), the auxiliary function P(t) R is defined as y(t) [z T (t) P(t) Q(t)] T. (3 42) n P(t) β 1 e 2i () e 2 () T N() i=1 t L(τ)dτ (3 43) where the subscript i = 1, 2,.., n denotes the ith element of the vector, and the auxiliary function L(t) R is defined as L(t) r T (N B1 (t) + N d (t) β 1 sgn(e 2 )) + ė 2 (t) T N B2 (t) β 2 e 2 (t) 2 (3 44) where β 2 R is a positive constant chosen according to the second sufficient condition in (3 41). The derivative P(t) R can be expressed as P(t) = L(t) = r T (N B1 (t) + N d (t) β 1 sgn(e 2 )) ė 2 (t) T N B2 (t) + β 2 e 2 (t) 2. (3 45) Provided the sufficient conditions introduced in (3 41) are satisfied, the following inequality can be obtained in a similar fashion as in Lemma 2 of the Appendix t n L(τ)dτ β 1 e 2i () e 2 () T N(). (3 46) i=1 Hence, (3 46) can be used to conclude that P(t). The auxiliary function Q(t) R in (3 42) is defined as Q(t) α 2 2 tr( W T Γ 1 1 W) + α 2 2 tr(ṽ T Γ 1 2 Ṽ ). (3 47) Since Γ 1 and Γ 2 are constant, symmetric, and positive definite matrices and α 2 >, it is straightforward that Q(t). 55

56 Let V L (y, t) : D [, ) R be a continuously differentiable positive definite function defined as V L (y, t) e T 1 e et 2 e rt M(q)r + P + Q (3 48) which satisfies the following inequalities: U 1 (y) V L (y, t) U 2 (y) (3 49) provided the sufficient conditions introduced in (3 41) are satisfied. In (3 49), the continuous positive definite functions U 1 (y), U 2 (y) R are defined as U 1 (y) λ 1 y 2, U 2 (y) λ 2 (q) y 2 (3 5) where λ 1, λ 2 (q) R are defined as λ min {1, m 1}, λ 2 (q) max{ 1 m(q), 1} 2 where m 1, m(q) are introduced in (2 3). After utilizing (3 2), (3 3), (3 27), (3 28), the time derivative of (3 48) can be expressed as Based on the fact that V L (y, t) = 2α 1 e e T 2 e 1 + r T Ñ(t) (k s + 1) r 2 α 2 e 2 2 (3 51) + β 2 e α 2 e T 2 [Ŵ T ˆσ Ṽ T ẋ d + W T ˆσ ˆV Tẋ ] d + tr(α 2 W T Γ 1 1 W) + tr(α 2 Ṽ T Γ 1 2 Ṽ ). e T 2 e e e 2 2 and using (3 2), (3 21), the expression in (3 51) can be simplified as 3 V L (y, t) r T Ñ(t) (k s + 1) r 2 (2α 1 1) e 1 2 (α 2 β 2 1) e 2 2. (3 52) 3 See Lemma 4 of the Appendix for the details of obtaining the inequality in (3 52). 56

57 By using (3 35), the expression in (3 52) can be further bounded as V L (y, t) λ 3 z 2 ( k s r 2 ρ( z ) r z ) (3 53) where λ 3 min{2α 1 1, α 2 β 2 1, 1}; hence, λ 3 is positive if α 1, α 2 are chosen according to the sufficient conditions in (3 4). After completing the squares for the second and third term in (3 53), the following expression can be obtained: V L (y, t) λ 3 z 2 + ρ2 (z) z 2. (3 54) 4k s The expression in (3 54) can be further upper bounded by a continuous, positive semi-definite function V L (y, t) U(y) = c z 2 y D (3 55) for some positive constant c R, where D {y R 3n+2 y ρ 1 (2 λ 3 k s )}. Larger values of k will expand the size of the domain D. The inequalities in (3 49) and (3 55) can be used to show that V L (y, t) L in D; hence, e 1 (t), e 2 (t), r(t), P(t), and Q(t) L in D. Given that e 1 (t), e 2 (t), and r(t) L in D, standard linear analysis methods can be used to prove that ė 1 (t), ė 2 (t) L in D from (3 2) and (3 3). Since e 1 (t), e 2 (t), r(t) L in D, the assumption that q d (t), q d (t), q d (t) exist and are bounded can be used along with (3 1)-(3 3) to conclude that q(t), q(t), q(t) L in D. Since q(t), q(t) L in D, Assumption 2-2 can be used to conclude that M(q), V m (q, q), G(q), and F( q) L in D. Therefore, from (2 1) and Assumption 2-3, we can show that τ(t) L in D. Given that r(t) L in D, (3 28) can be used to show that µ(t) L in D. Since q(t), q(t) L in D, Assumption 2-2 can be used to show that V m (q, q), Ġ(q), F(q) and Ṁ(q) L in D; hence, (3 24) can be used to show that ṙ(t) L in D. Since ė 1 (t), 57

ė_2(t), ṙ(t) ∈ L_∞ in D, the definitions for U(y) and z(t) can be used to prove that U(y) is uniformly continuous in D. Let S ⊂ D denote a set defined as follows:

S \triangleq \left\{ y(t) \subset D \mid U_2(y(t)) < \lambda_1 \left( \rho^{-1}\!\left(2\sqrt{\lambda_3 k_s}\right) \right)^2 \right\}.   (3-56)

Theorem 8.4 of [63] can now be invoked to state that

c \left\| z(t) \right\|^2 \to 0 \quad \text{as} \quad t \to \infty \qquad \forall y(0) \in S.   (3-57)

Based on the definition of z(t), (3-57) can be used to show that

\left\| e_1(t) \right\| \to 0 \quad \text{as} \quad t \to \infty \qquad \forall y(0) \in S.   (3-58)

3.7 Experiment

As in Chapter 2, the testbed depicted in Figure 2-1 was used to implement the developed controller; however, no external friction is applied to the circular disk. The desired link trajectory is selected as follows (in degrees):

q_d(t) = 60.0 \sin(3.0t)\left(1 - \exp(-0.01t^3)\right).   (3-59)

For all experiments, the rotor velocity signal is obtained by applying a standard backwards difference algorithm to the position signal. The integral structure for the RISE term in (3-18) was computed on-line via a standard trapezoidal algorithm. The NN input vector x_d(t) ∈ R^4 is defined as x_d = [1 q_d q̇_d q̈_d]^T. The initial values of Ŵ(0) were chosen to be a zero matrix; however, the initial values of V̂(0) were selected randomly between -1.0 and 1.0 to provide a basis [71].

4 The region of attraction in (3-56) can be made arbitrarily large to include any initial conditions by increasing the control gain k_s (i.e., a semi-global type of stability result) [22].
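For reference, the signal generation described above might be implemented as in the following sketch. The sampling period and the reconstructed trajectory constants are placeholders and should be adjusted to the actual experimental values.

```python
import numpy as np

dt = 1.0e-3                                  # assumed 1 kHz sampling period (illustrative)
t = np.arange(0.0, 10.0, dt)

# Desired trajectory of the form in (3-59), in degrees (constants as reconstructed here).
qd = 60.0 * np.sin(3.0 * t) * (1.0 - np.exp(-0.01 * t**3))

def backwards_difference(q, dt):
    """Standard backwards-difference estimate of the derivative from sampled data."""
    qdot = np.zeros_like(q)
    qdot[1:] = (q[1:] - q[:-1]) / dt
    return qdot

qd_dot = backwards_difference(qd, dt)        # velocity estimate, as used for the rotor signal
qd_ddot = backwards_difference(qd_dot, dt)   # here qd_ddot could also be computed analytically

# NN input vector at a given time step k: x_d = [1, q_d, qd_dot, qd_ddot]^T.
k = 100
xd_k = np.array([1.0, qd[k], qd_dot[k], qd_ddot[k]])
```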

Figure 3-1. Tracking error for the RISE control law with no NN adaptation.

A different transient response could be obtained if the NN weights are initialized differently. Ten hidden layer neurons were chosen based on trial and error (i.e., N_2 = 10). In addition, all the states were initialized to zero. The following control gains were used to implement the controller in (3-17) in conjunction with the NN update laws in (3-20) and (3-22):

k_s = 3, \quad \beta_1 = 1, \quad \alpha_1 = 1, \quad \alpha_2 = 1, \quad \Gamma_1 = 5 I_{11}, \quad \Gamma_2 = 0.5 I_4.   (3-60)

Discussion

Two different experiments were conducted to demonstrate the efficacy of the proposed controller. The control gains were chosen to obtain an arbitrary tracking error accuracy (not necessarily the best performance). For each controller, the gains were not retuned (i.e., the common control gains remain the same for both controllers). For the first experiment, no adaptation was used and the controller with only the RISE feedback was implemented. The tracking error is shown in Figure 3-1, and the control torque is shown in Figure 3-2. For the second experiment, the proposed NN controller was used (hereinafter denoted as RISE+NN). The tracking error is shown in Figure 3-3, and the control torque is shown in Figure 3-4.

Figure 3-2. Control torque for the RISE control law with no NN adaptation.

Figure 3-3. Tracking error for the proposed RISE+NN control law.

Figure 3-4. Control torque for the proposed RISE+NN control law.

Figure 3-5. Average RMS errors (degrees) and torques (N-m). 1- RISE, 2- RISE+NN (proposed).

Each experiment was performed five times, and the average RMS error and torque values are shown in Figure 3-5, which indicate that the proposed RISE+NN controller yields a lower RMS error with a similar control effort.

3.8 Conclusions

The results in this chapter illustrate how a multilayer NN feedforward term can be fused with a RISE feedback term in a continuous controller to achieve semi-global

62 asymptotic tracking. Improved weight tuning laws are presented which guarantee boundedness of NN weights. To blend the NN and RISE methods, several technical challenges were addressed through Lyapunov-based techniques. These challenges include developing adaptive update laws for the NN weight estimates that do not depend on acceleration, and developing new RISE stability analysis methods and sufficient gain conditions to accommodate the incorporation of the NN adaptive updates in the RISE structure. Experimental results are presented that indicate reduced RMS tracking errors while requiring slightly higher RMS control effort. 62

63 CHAPTER 4 A NEW CLASS OF MODULAR ADAPTIVE CONTROLLERS 4.1 Introduction The results in this chapter provide the first investigation of the ability to yield controller/update law modularity using the RISE feedback. First, we consider a general class of multi-input multi-output (MIMO) dynamic systems with structured (i.e., LP) and unstructured uncertainties and develop a controller with modularity between the controller/update law, where a model-based adaptive feedforward term is used in conjunction with the RISE feedback term [19]. The RISE-based modular adaptive approach is different than previous work (cf. [4, 41, 43]) in the sense that it does not rely on nonlinear damping. The use of the RISE method in lieu of nonlinear damping has several advantages that motivate this investigation including: an asymptotic modular adaptive tracking result can be obtained for nonlinear systems with non-lp additive bounded disturbances; the dual objectives of asymptotic tracking and controller/update law modularity are achieved in a single step unlike the two stage analysis required in some results (cf., [4, 43]); the development does not require that the adaptive estimates are a priori bounded; and the development does not require a positive definite estimate of the inertia matrix or a square integrable prediction error as in [4, 43]. Modularity in the adaptive feedforward term is made possible by considering a generic form of the adaptive update law and its corresponding parameter estimate. The general form of the adaptive update law includes examples such as gradient, least-squares, and etc. This generic form of the update law is used to develop a new closed-loop error system, and the typical RISE stability analysis is modified to accommodate the generic update law. New sufficient gain conditions are derived to prove an asymptotic tracking result. The class of RISE-based modular adaptive controllers is then extended to include uncertain dynamic systems that do not satisfy the LP assumption. Neural networks (NNs) have gained popularity as a feedforward adaptive control method that can compensate 63

64 for non-lp uncertainty in nonlinear systems. A limiting factor in previous NN-based feedforward control results is that a residual function approximation error exists that limits the steady-state performance to a uniformly ultimately bounded result, rather than an asymptotic result. Some results (cf. [33 36, 72 74]) have been developed to augment the NN feedforward component with a discontinuous feedback element to achieve asymptotic tracking. Motivated by the practical limitations of discontinuous feedback, a multilayer NN-based controller was augmented by RISE feedback in [75] to yield the first asymptotic tracking result using a continuous controller. However, in most NN-based controllers, the NN adaptation is governed by a gradient update law to facilitate the Lyapunov-based stability analysis. Since multilayer NNs are nonlinear in the weights, a challenge is to derive weight tuning laws in closed-loop feedback control systems that yield stability as well as bounded weights. The development in the current chapter illustrates how to extend the class of modular adaptive controllers for NNs. Specifically, the result allows the NN weight tuning laws to be determined from a developed generic update law (rather than be restricted to a gradient update law). We are not aware of any modular multilayer NN-based controller in literature with modularity in the tuning laws/controller. The NN feedforward structure adaptively compensates for the non-lp uncertain dynamics. For the tuning laws that could be used in this result, the NN weights can be initialized randomly, and no off-line training is required. The modular adaptive control development for the general class of multi-input systems is then applied to a class of dynamic systems modeled by the Euler-Lagrange formulation. The Euler-Lagrange formulation describes the behavior of a large class of engineering systems (e.g., robot manipulators, satellites, vehicular systems). An experimental section is included that illustrates that different adaptation laws can be included with the feedback controller through examples including a gradient update law and a least squares update law for the LP dynamics case, and a common gradient weight 64

65 tuning law based on backpropagated error [76] and a simplified tuning law based on the Hebbian algorithm [77] for the non-lp dynamics case. While the current result encompasses a large variety of adaptive update laws, an update law design based on the prediction error is not possible because the formulation of a prediction error requires the system dynamics to be completely LP. Future efforts can focus on developing a RISE-based adaptive controller for a completely LP system that could also use a prediction error/torque filtering approach. Also, one of the shortcomings of current work is that only a semi-global asymptotic stability is achieved, and further investigation is needed to achieve a global stability result. Inroads to solve the global tracking problem are provided in [78] under a set of assumptions. 4.2 Dynamic System Consider a class of MIMO nonlinear systems of the following form: x (m) = f(x, ẋ,..., x (m 1) ) + G(x, ẋ,..., x (m 1) )u + h (t), (4 1) where ( ) (i) (t) denotes the i th derivative with respect to time, x (i) (t) R n, i =,..., m 1 are the system states, u (t) R n is the control input, f ( ) R n and G ( ) R n n are unknown nonlinear C 2 functions, and h (t) R n denotes a general nonlinear disturbance (e.g., unmodeled effects). The outputs of the system are the system states. Throughout the chapter denotes the absolute value of the scalar argument, denotes the standard Euclidean norm for a vector or the induced infinity norm for a matrix, and F denotes the Frobenius norm of a matrix. The subsequent development is based on the assumption that all the system states are measurable. Moreover, the following properties and assumptions will be exploited in the subsequent development. Assumption 4-1: G ( ) is symmetric positive definite, and satisfies the following inequality y(t) R n : g y 2 y T G 1 ( )y ḡ(x, ẋ,..., x (m 1) ) y 2, (4 2) 65

66 where g R is a known positive constant, ḡ(x, ẋ,..., x (m 1) ) R is a known positive function. Assumption 4-2: The functions G 1 ( ) and f( ) are second order differentiable such that G 1 ( ), Ġ 1 ( ), G 1 ( ), f( ), f( ), f( ) L if x (i) (t) L, i =, 1,..., m + 1. Assumption 4-3: The nonlinear disturbance term and its first two time derivatives (i.e., h (t),ḣ(t), ḧ(t)) are bounded by known constants. Assumption 4-4: The unknown nonlinearities G 1 ( ) and f( ) are linear in terms of unknown constant system parameters (i.e., LP). Assumption 4-5: The desired trajectory x d (t) R n is assumed to be designed such that x (i) d (t) L, i =, 1,..., m Control Objective The objective is to design a continuous modular adaptive controller which ensures that the system tracks a desired time-varying trajectory x d (t) despite uncertainties and bounded disturbances in the dynamic model. To quantify this objective, a tracking error, denoted by e 1 (t) R n, is defined as e 1 x d x. (4 3) To facilitate a compact presentation of the subsequent control development and stability analysis, auxiliary error signals denoted by e i (t) R n, i = 2, 3,..., m are defined as e 2 ė 1 + α 1 e 1 e 3 ė 2 + α 2 e 2 + e 1 e 4 ė 3 + α 3 e 3 + e 2. e i ė i 1 + α i 1 e i 1 + e i 2 (4 4). 66

67 e m ė m 1 + α m 1 e m 1 + e m 2, where α i R, i = 1, 2,..., m 1 denote constant positive control gains. The error signals e i (t), i = 2, 3,..., m can be expressed in terms of e 1 (t) and its time derivatives as i 1 e i = a i,j e (j) 1, (4 5) j= where the constant coefficients a i,j R can be evaluated by substituting (4 5) in (4 4), and comparing coefficients [79]. A filtered tracking error [57], denoted by r(t) R n, is also defined as r ė m + α m e m, (4 6) where α m R is a positive, constant control gain. The filtered tracking error r(t) is not measurable since the expression in (4 6) depends on x (m) (t). 4.4 Control Development The open-loop tracking error system is developed by premultiplying (4 6) by G ( 1 x, ẋ,..., x (m 1)) and utilizing the expressions in (4 1), (4 4), (4 5) as G 1 r = Y d θ + S G 1 d h u (4 7) where the fact that a m,m 1 = 1, was used. In (4 7), Y d θ R n is defined as Y d θ G 1 d x(m) d G 1 d f d, (4 8) where Y d (x d, ẋ d,..., x (m) d ) R n p is a desired regression matrix, and θ R p contains the constant unknown system parameters. In (4 8), the functions G 1 d (x d, ẋ d,..., x (m 1) d ) R n n, f d (x d, ẋ d,..., x (m 1) d ) R n are defined as G 1 d G 1 (x d, ẋ d,..., x (m 1) d ) (4 9) f d f(x d, ẋ d,..., x (m 1) d ). 67
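The recursion in (4-4) and the representation (4-5) can be made concrete with the following sketch, which builds the coefficients a_{i,j} and the signals e_i(t) and r(t) from e_1(t) and its time derivatives. The function names and the example values are illustrative only.

```python
import numpy as np

def chain_coefficients(alphas):
    """Coefficients a_{i,j} of (4-5), e_i = sum_j a_{i,j} e_1^{(j)}, obtained by substituting
    (4-5) into (4-4) and matching terms.  alphas = [alpha_1, ..., alpha_{m-1}]."""
    m = len(alphas) + 1
    a = np.zeros((m + 1, m))               # a[i, j] for i = 0..m, j = 0..m-1 (row 0 unused)
    a[1, 0] = 1.0                           # e_1 = e_1
    for i in range(2, m + 1):
        alpha = alphas[i - 2]               # alpha_{i-1}
        a[i, 1:] += a[i - 1, :-1]           # the e_{i-1}-dot term shifts the derivative order
        a[i, :] += alpha * a[i - 1, :]      # + alpha_{i-1} e_{i-1}
        if i >= 3:
            a[i, :] += a[i - 2, :]          # + e_{i-2} (absent for i = 2)
    return a[1:, :]                         # rows for e_1, ..., e_m

def filtered_errors(e1_derivs, alphas, alpha_m):
    """Sketch of (4-4)-(4-6): returns [e_1, ..., e_m] and r = e_m-dot + alpha_m e_m.
    e1_derivs[j] is the j-th derivative of e_1, j = 0..m (the m-th is needed only for r,
    which is why r itself is not implementable)."""
    a = chain_coefficients(alphas)
    m = a.shape[0]
    E = np.stack(e1_derivs)                            # (m+1) x n array of derivatives of e_1
    e = [a[i - 1, :] @ E[:m] for i in range(1, m + 1)]
    em_dot = a[m - 1, :] @ E[1:m + 1]                  # shift derivatives by one for e_m-dot
    r = em_dot + alpha_m * e[-1]
    return e, r

# Example with m = 2 (e.g., a second-order mechanical system), n = 1:
e, r = filtered_errors([np.array([0.1]), np.array([0.0]), np.array([-0.2])],
                       alphas=[2.0], alpha_m=3.0)
```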

68 Also in (4 7), the auxiliary function S ( x, ẋ,..., x (m 1), t ) R n is defined as ( m 2 ) S G 1 a m,j e (j+1) 1 + α m e m j= + G 1 x (m) d G 1 d x(m) d G 1 f + G 1 d f d G 1 h + G 1 d h. Based on the open-loop error system in (4 7), the control input is composed of an adaptive feedforward term plus the RISE feedback term as (4 1) u Y dˆθ + µ. (4 11) In (4 11), µ(t) R n denotes the RISE feedback term defined as [19, 22] µ (t) (k s + 1)e m (t) (k s + 1)e m () + t [(k s + 1)α m e m (σ) + β 1 sgn(e m (σ))]dσ, (4 12) where k s, β 1 R are positive constant control gains, and α m R was introduced in (4 6). In (4 11), ˆθ (t) R p denotes a subsequently designed parameter estimate vector. The closed-loop tracking error system is developed by substituting (4 11) into (4 7) as G 1 r = Y d (θ ˆθ ) + S G 1 d h µ. (4 13) To facilitate the subsequent modular adaptive control development and stability analysis, the time derivative of (4 13) is expressed as G 1 ṙ = 1 2Ġ 1 r + Ñ(t) + N B (t) (k s + 1)r β 1 sgn(e m ) e m, (4 14) where the fact that the time derivative of (4 12) is given as µ(t) = (k s + 1)r + β 1 sgn(e m ) (4 15) was utilized. In (4 14), the unmeasurable/unknown auxiliary terms Ñ(e 1, e 2,..., e m, r, t), N B (t) R n are defined as Ñ(t) 1 2Ġ 1 r + Ṡ + e m + Ñ (4 16) 68

69 N B (t) N B1 (t) + N B2 (t), (4 17) where N B1 (t) R n is given by N B1 Ẏdθ Ġ 1 d h G 1 ḣ, (4 18) and the sum of the auxiliary terms Ñ(t), N B2 (t) R n is given by d N B2 (t) + Ñ = Ẏdˆθ Y d ˆθ. (4 19) Specific definitions for Ñ(t), N B2 (t) are provided subsequently based on the definition of the adaptive update law for ˆθ (t). The structure of (4 14) and the introduction of the auxiliary terms in (4 16)-(4 19) is motivated by the desire to segregate terms that can be upper bounded by state-dependent terms and terms that can be upper bounded by constants. Specifically, depending on how the adaptive update law is designed, analysis is provided in the next section to upper bound Ñ(t) by state-dependent terms and N B (t) by a constant. The need to further segregate N B (t) is that some terms in N B (t) have time derivatives that are upper bounded by a constant, while other terms have time-derivatives that are upper-bounded by state dependent terms. The segregation of these terms based on the structure of the adaptive update law (see (4 19)) is key for the development of a stability analysis for the modular RISE-based adaptive update law/controller. 4.5 Modular Adaptive Update Law Development A key difference between the traditional modular adaptive controllers that use nonlinear damping (cf., [1, 41, 63]) and the current RISE-based approach is that the RISE-based method does not exploit the ISS property with respect to the parameter estimation error. The current approach does not rely on nonlinear damping, but instead uses the ability of the RISE technique to compensate for smooth bounded disturbances. In general, previous nonlinear damping-based modular adaptive controllers first prove an ISS stability result provided the adaptive update law yields bounded parameter estimates (e.g., ˆθ (t) L via a projection algorithm), and then use additional analysis along with 69

70 assumptions (PD estimate of the inertia matrix, square integrable prediction error, etc.) to conclude asymptotic convergence. In contrast, since the RISE-based modular adaptive control approach in this chapter does not exploit an ISS analysis, the assumptions regarding the parameter estimate are modified. The following development requires some general bounds on the structure of the adaptive update law ˆθ (t) and the corresponding parameter estimate ˆθ (t) to segregate the components of the auxiliary terms introduced in (4 16)-(4 19). Specifically, instead of assuming that ˆθ (t) L, the subsequent development is based on the less restrictive assumption that the parameter estimate ˆθ (t) can be described as ˆθ (t) = f 1 (t) + Φ(x, ẋ,..., x (m 1), e 1, e 2,..., e m, t). (4 2) In (4 2), f 1 (t) R p is a known function such that f 1 (t) γ 1 (4 21) f m 1 (t) γ 2 + γ i+2 e i + γ m+3 r, i=1 where γ i R, i = 1, 2,..., m + 3 are known non-negative constants (i.e., the constants can be set to zero for different update laws), and Φ ( x, ẋ,..., x (m 1), e 1, e 2,..., e m, t ) R p is a known function that satisfies the following bound: Φ (t) ρ 1 ( ē ) ē, (4 22) where the bounding function ρ 1 ( ) R is a positive, globally invertible, nondecreasing function, and ē(t) R nm is defined as ē(t) [ e T 1 e T 2... e T m] T. (4 23) 7
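Before stating the corresponding update-law form, the intended controller/update-law modularity can be visualized with the following sketch (written for m = 2), in which the control law (4-11)-(4-12) is implemented once and the adaptive update is supplied as an interchangeable callable. The class and function names are hypothetical, the integration is a simple Euler step, and the gradient-like rule shown is only one admissible plug-in; it is driven by e_m for simplicity, whereas the experimental laws later in this chapter are written in terms of r and rely on integration by parts and projection.

```python
import numpy as np

class ModularRISEController:
    """Sketch of (4-11)-(4-12): u = Y_d * theta_hat + mu, with the RISE integral propagated
    numerically.  The adaptive update law is supplied from outside (modularity)."""
    def __init__(self, ks, alpha_m, beta1, theta0, update_law, dt):
        self.ks, self.alpha_m, self.beta1, self.dt = ks, alpha_m, beta1, dt
        self.update_law = update_law
        self.theta_hat = np.array(theta0, dtype=float)
        self.em0 = None
        self.integral = None

    def control(self, Yd, em):
        if self.em0 is None:                           # first call: mu(0) = 0
            self.em0, self.integral = em.copy(), np.zeros_like(em)
        self.integral += self.dt * ((self.ks + 1.0) * self.alpha_m * em
                                    + self.beta1 * np.sign(em))
        mu = (self.ks + 1.0) * (em - self.em0) + self.integral
        self.theta_hat += self.dt * self.update_law(Yd, em, self.theta_hat)
        return Yd @ self.theta_hat + mu

# One admissible plug-in: a gradient-like rule driven only by measurable signals.
def make_gradient_update(Gamma):
    return lambda Yd, em, theta_hat: Gamma @ (Yd.T @ em)
```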

71 The estimate in (4 2) is assumed to be generated according to an update law of the following general form ˆθ (t) = g 1 (t) + Ω(x, ẋ,..., x (m 1), e 1, e 2,..., e m, r, t). (4 24) In (4 24), g 1 (t) R p is a known function such that g 1 (t) δ 1 (4 25) m ġ 1 (t) δ 2 + δ i+2 e i + δ m+3 r, i=1 where δ i R, i = 1, 2,..., m + 3 are known non-negative constants, and Ω(x, ẋ,..., x (m 1), e 1, e 2,..., e m, r, t) R p satisfies the following bound: Ω (t) ρ 2 ( z ) z, (4 26) where the bounding function ρ 2 ( ) R is a positive, globally invertible, nondecreasing function, and z(t) R n(m+1) is defined as z(t) [ e T 1 e T 2... e T m r T] T. (4 27) Remark 4.1. The update law in (4 24) depends on the unmeasurable signal r(t). But it is assumed that the update law in (4 24) is of the form which upon integration yields an estimate ˆθ (t) that is independent of r(t). Thus the controller needs only the measurable signals for implementation. The structure of the adaptive estimate and the adaptive update law is flexible in the sense that any of the terms in (4 2) and (4 24) can be removed for any specific update law and estimate. For example if all the error-dependent terms in (4 2) are removed, then the condition on ˆθ (t) is the same as in the standard nonlinear damping-based modular adaptive methods (i.e., ˆθ (t) L ). In this sense, the ISS property with respect to the parameter estimation error is automatically proven by considering this special case of ˆθ (t). The results in this chapter are not proven for estimates or update 71

72 laws with additional terms that are not included in the generic structure in (4 2) and (4 24). For example, a standard gradient-based update law is of the form (4 24), but the corresponding estimate (obtained via integration by parts) is not of the form (4 2) due to the presence of some terms that are bounded by the integral of the error instead of being bounded by the error. However, the same gradient-based update law and its corresponding estimate can be used in (4 11) if a smooth projection algorithm is used that keeps the estimates bounded. As shown in [19], the standard gradient-based update law can be used in (4 11) without a projection algorithm, yet including this structure in the modular adaptive analysis is problematic because the integral of the error could be unbounded (so this update law could not be used in nonlinear damping-based modular adaptive laws without a projection either). Since the goal in this chapter is to develop a modular update law, a specific update law cannot be used to inject terms in the stability analysis to cancel the terms containing the parameter mismatch error. Instead, the terms containing the parameter mismatch error are segregated depending on whether they are state-dependent or bounded by a constant (see (4 19)). Based on the development given in (4 2)-(4 25), the terms Ñ(t) and N B2 (t) introduced in (4 16)-(4 19) are defined as Ñ (t) ẎdΦ Y d Ω (4 28) N B2 (t) Ẏdf 1 Y d g 1. (4 29) In a similar manner as in Lemma 1 of the Appendix, by applying the Mean Value Theorem along with the inequalities in (4 22) and (4 26) yields an upper bound for the expression in (4 16) as Ñ(t) ρ ( z ) z, (4 3) where the bounding function ρ( ) R is a positive, globally invertible, nondecreasing function, and z(t) R n(m+1) is defined in (4 27). The following inequalities are developed based on the expressions in (4 17), (4 18), (4 29), their time derivatives, 72

73 and the inequalities in (4 21) and (4 25): N B (t) ζ 1 ṄB 2 (t) ζ 3 + Ṅ B1 (t) ζ 2 (4 31) m ζ i+3 e i + ζ m+4 r, i=1 where ζ i R, i = 1, 2,..., m + 4 are known positive constants. 4.6 Stability Analysis Theorem 4-1: The controller given in (4 11), (4 12), (4 2), and (4 24) ensures that all system signals are bounded under closed-loop operation and that the position tracking error is regulated in the sense that e 1 (t) as t provided the control gain k s introduced in (4 12) is selected sufficiently large (see the subsequent proof), α i, i = 1, 2,..., m are selected according to the following conditions α i > 1 2 β i+1, i = 1, 2,..., m 2 α m 1 > 1 2 β m α m > β m β m m 1 β i , i=1 (4 32) and β i, i = 1, 2,..., m + 2 are selected according to the following sufficient conditions: β 1 > ζ α m ζ α m ζ 3 (4 33) β i+1 > ζ i+3, i = 1, 2,..., m + 1, where β 1 was introduced in (4 12), and β 2,..., β m+2 are introduced in (4 36). Proof: Let D R n(m+1)+1 be a domain containing y(t) =, where y(t) R n(m+1)+1 is defined as y(t) [z T (t) P(t)] T. (4 34) 73

74 In (4 34), the auxiliary function P(t) R is defined as n P(t) β 1 e mi () e m () T N B () t i=1 L(τ)dτ, (4 35) where e mi () denotes the ith element of the vector e m (), and the auxiliary function L(t) R is defined as L(t) r T (N B (t) β 1 sgn(e m )) m β i+1 e i (t) e m (t) β m+2 e m (t) r(t), (4 36) i=1 where β i R, i = 2, 3,..., m + 2 are positive constants chosen according to the sufficient conditions in (4 33). Provided the sufficient conditions introduced in (4 33) are satisfied, the following inequality is obtained 1 : t n L(τ)dτ β 1 e mi () e m () T N B (). (4 37) i=1 Hence, (4 37) indicates that P(t). Let V L (y, t) : D [, ) R be a continuously differentiable, positive definite function defined as V L (y, t) 1 m e T i 2 e i rt G 1 r + P, (4 38) which satisfies the following inequalities: i=1 U 1 (y) V L (y, t) U 2 (y) (4 39) provided the sufficient conditions introduced in (4 32)-(4 33) are satisfied. The Rayleigh-Ritz theorem was used to develop the inequalities in (4 39), where the continuous positive definite functions U 1 (y),u 2 (y) R are defined as U 1 (y) λ 1 y 2 and U 2 (y) λ 2 (x, ẋ,..., x (m 1) ) y 2, where λ 1, λ 2 (x, ẋ,..., x (m 1) ) R are defined as λ min { 1, g } λ 2 (x, ẋ,..., x (m 1) ) max{ 1 2ḡ(x, ẋ,..., x(m 1) ), 1}, (4 4) 1 Details of the bound in (4 37) are provided in the Lemma 1 of the Appendix. 74

75 where g, ḡ(x, ẋ,..., x (m 1) ) are introduced in (4 2). After taking the time derivative of (4 38), V (y, t) is expressed as V L (y, t) = r T G 1 ṙ rt Ġ 1 r + m e T i ė i + P. i=1 The derivative P(t) R is given by P(t) = L(t) = r T (N B (t) β 1 sgn(e m )) + m β i+1 e i (t) e m (t) + β m+2 e m (t) r(t). After utilizing (4 4), (4 6), (4 14), (4 15), and (4 41), V (y, t) is expressed as i=1 (4 41) V L (y, t) = 1 2 rt Ġ 1 r m α i e T i e i + e T m 1 e m 1 2 rt Ġ 1 r i=1 + r T Ñ + r T N B r T r k s r T r β 1 r T sgn(e m ) m r T (N B β 1 sgn(e m )) + β i+1 e i e m + β m+2 e m r. i=1 After canceling similar terms, V (y, t) is simplified as V L (y, t) = + m α i e T i e i + e T m 1e m r T r k s r T r + r T Ñ i=1 m β i+1 e i e m + β m+2 e m r. i=1 Based on the fact that a T b 1 2 ( a 2 + b 2 ) for some a, b R n, VL (y, t) is upper bounded using the squares of the components of z(t) as m 2 V L (y, t) (α i 1 2 β i+1) e i 2 (α m β m 1 2 ) e m 1 2 i=1 (α m β m β m+2 1 m 1 β i ) e m 2 r 2 (4 42) (k s 1 2 β m+2) r 2 + r T Ñ. i=1 75

76 By using (4 3), the expression in (4 42) is rewritten as where V L (y, t) λ 3 z 2 [ (k s β ] m+2 2 ) r 2 ρ( z ) r z, (4 43) λ 3 min{α β 2, α β 3,..., α m β m 1, α m β m 1 2, α m β m β m+2 1 m 1 β i , 1}. After completing the squares for the terms inside the brackets in (4 43), the following expression can be obtained, provided the sufficient gain conditions in (4 32) and (4 33) are satisfied: V L (y, t) λ 3 z 2 + ρ2 ( z ) z 2 ( ). (4 44) 4 k s β m+2 2 The expression in (4 44) can be further upper bounded by a continuous, positive semi-definite function i=1 V L (y, t) U(y) = c z 2 y D (4 45) for some positive constant c R, where ( ( D {y R n(m+1)+1 y ρ 1 2 λ 3 k s β ) )} m+2. 2 Larger values of k will expand the size of the domain D. The inequalities in (4 39) and (4 45) indicate that V L (y, t) L in D; hence, e i (t) L, and r(t) L in D. Given that e i (t) L, and r(t) L in D, then ė i (t) L in D from (4 4) and (4 6). Since e i (t) L, and r(t) L in D, the assumption that x (i) d (t) exist and are bounded and (4 3)-(4 6) indicate that x (i) (t) L in D. Since x (i) (t) L in D, (4 2)-(4 25) indicate that ˆθ(t), ˆθ(t) L in D, and G 1 ( ) and f( ) L in D from Property 2. Thus, from (4 1) and Property 3, we can show that u(t) L in D. Given that r(t) L in D, (4 15) indicates that µ(t) L in D. Since x (i) (t) L in D, then Ġ 1 ( ) and f( ) L 76

77 in D based on Property 2; hence, (4 14) indicates that ṙ(t) L in D. Since ė i (t) L and ṙ(t) L in D, then U(y) is uniformly continuous in D based on the definitions for U(y) and z(t). Let S D denote a set defined as ( ( S y(t) D U 2(y(t)) < λ 1 (ρ 1 2 λ 3 k s β ) )) 2 m+2 2. (4 46) The region of attraction in (4 46) is arbitrarily large and can include any initial condition by increasing the control gain k s (i.e., a semi-global stability result). By invoking Theorem 8.4 of [63] c z(t) 2 as t y() S. (4 47) Based on the definition of z(t), (4 47) indicates that e 1 (t) as t y() S. (4 48) 4.7 Neural Network Extension to Non-LP Systems The class of RISE-based modular adaptive controllers developed in the preceding sections are extended to include uncertain dynamic systems that do not satisfy the LP assumption (i.e., Assumption 4-4 is not satisfied). NN-based estimation methods are well suited for control systems where the dynamic model contains unstructured nonlinear disturbances as in (4 1). The main feature that empowers NN-based controllers is the universal approximation property as described in Section 3.4 of Chapter RISE Feedback Control Development The modular control development and stability analysis is provided to illustrate how the aforementioned textbook (e.g., [15]) NN feedforward estimation strategy can be fused with a RISE feedback control method as a means to achieve asymptotic stability for general class of MIMO systems described by (4 1) while using generic NN weight update laws. The open-loop and closed-loop tracking error is developed for the combined control system. 77

78 Similar to (4 7), the open-loop tracking error system is developed by premultiplying (4 6) by G 1 and utilizing the expressions in (4 1) and (4 4) to obtain: G 1 r = f NN + S G 1 d h u, (4 49) where the auxiliary function f NN (x d, ẋ d,..., x (m) d ) R n is defined as f NN G 1 d x(m) d G 1 d f d, (4 5) where G 1 d (x d, ẋ d,..., x (m 1) d ) R n n and f d (x d, ẋ d,..., x (m 1) d ) R n are defined in (4 9). In (4 49), the auxiliary function S ( x, ẋ,..., x (m 1), t ) R n is defined similar to (4 1). The expression in (4 5) can be represented by a three-layer NN as f NN = W T σ ( V T x d ) + ε ( xd ). (4 51) In (4 51), the input x d (t) R (m+1)n+1 is defined as x d (t) [1 x T d (t) ẋt d (t)... T x(m) d (t)] T so that N 1 = (m + 1)n where N 1 was introduced in (3 4). Based on the assumption that the desired trajectory is bounded, the inequalities in (3 16) hold. Based on the open-loop error system in (4 49), the control torque input is composed of a three-layer NN feedforward term plus the RISE feedback term as u ˆf NN + µ (4 52) where the RISE feedback term µ R n is defined in (4 12), and the feedforward NN component denoted by ˆf NN (t) R n, is defined as ˆf NN Ŵ T σ(ˆv T x d ). (4 53) 78

79 4.7.2 Modular Tuning Law Development The estimates for the NN weights in (4 53) are generated on-line (there is no off-line learning phase) using a smooth projection algorithm as ϱ 1 if vec(ŵ) int(λ W) Ŵ proj (ϱ 1 ) = ϱ 1 if vec(ŵ) (Λ W) and vec (ϱ 1 ) T vec(ŵ) PMr t (ϱ 1) if vec(ŵ) (Λ W) and vec (ϱ 1 ) T vec(ŵ) > ˆV proj (ϱ 2 ) = ϱ 2 if vec(ˆv ) int(λ V ) ϱ 2 if vec(ˆv ) (Λ V ) and vec (ϱ 2 ) T vec(ˆv ) P t Mr (ϱ 2) if vec(ˆv ) (Λ V ) and vec (ϱ 2 ) T vec(ˆv ) >, (4 54) (4 55) where proj ( ) is the projection operator and vec(ŵ ()) int (Λ W), vec(ˆv ()) int (Λ V ). In (4 54) and (4 55), the auxiliary terms ϱ 1 (t) R (N 2+1) n and ϱ 2 (t) R (N 1+1) N 2 denote adaptation rules of the following general form: proj (ϱ 1 ) = w 1 (t) + Ξ W (x, ẋ,..., x (m 1), e 1, e 2,..., e m, r, t) (4 56) proj (ϱ 2 ) = v 1 (t) + Ξ V (x, ẋ,..., x (m 1), e 1, e 2,..., e m, r, t). In (4 56), w 1 (t) R (N 2+1) n and v 1 (t) R (N 1+1) N 2 are known functions such that w 1 (t) γ 1 (4 57) m ẇ 1 (t) γ 2 + γ i+2 e i + γ m+3 r i=1 v 1 (t) δ 1 (4 58) m v 1 (t) δ 2 + δ i+2 e i + δ m+3 r, i=1 79

80 and Ξ W R (N 2+1) n and Ξ V R (N 1+1) N 2 satisfy the following bounds: Ξ W (t) m γ i+m+3 e i + γ 2m+4 r (4 59) i=1 Ξ V (t) m δ i+m+3 e i + δ 2m+4 r, i=1 where γ i, δ i R, i = 1, 2,..., 2m + 4 are known non-negative constants (i.e., the constants can be set to zero for different update laws). In (4 54) and (4 55), P t Mr (A) = devec (Pr t (vec (A))) for a matrix A, where the operation devec ( ) is the reverse of vec ( ). The use of the projection algorithm in (4 54) and (4 55) is to ensure that Ŵ(t) and ˆV (t) remain bounded inside the convex regions defined in (3 9) and (3 11). This fact will be exploited in the subsequent stability analysis. Thus, unlike the general form for parameter estimate ˆθ (t) in (4 2) for the LP case, the NN weight estimates are bounded by constants. The NN weight adaptation laws are restrictive compared to the adaptive laws for the parameter estimates in the LP case. This is because the first layer weight estimates ˆV (t) are embedded inside the nonlinear activation function σ ( ) (i.e., σ(ˆv T x d )). Typically the NN activation functions are bounded over the entire domain; however, their time derivatives depend on the adaptation law ˆV (t), which could be state-dependent. Similar to the LP case, it is assumed that only the NN adaptation rules depend on the unmeasurable signal r (t) but the corresponding weight estimate obtained after integration is independent of r (t). The closed-loop tracking error system can be developed by substituting (4 52) into (4 49) as G 1 r = f NN ˆf NN + S G 1 d h µ. (4 6) To facilitate the subsequent stability analysis, the time derivative of (4 6) is determined as G 1 ṙ = Ġ 1 r + f NN ˆf NN + Ṡ Ġ 1 d h G 1 d ḣ µ. (4 61) 8

81 Using (4 51), (4 53), the closed-loop error system in (4 61) can be expressed as ( ) G 1 ṙ = Ġ 1 r + W T σ V T x d V T x d Ŵ T σ(ˆv T x d ) (4 62) Ŵ T σ (ˆV T x d )( ˆV T x d + ˆV T x d ) + ε + Ṡ Ġ 1 d h G 1 d ḣ µ, where σ (ˆV T x) dσ ( V T x ) /d ( V T x ) V T x=ˆv T x. After adding and subtracting the term W T ˆσ T ˆV x d + Ŵ T ˆσ Ṽ T x d to (4 62), the following expression can be obtained: G 1 ṙ = Ġ 1 r + Ŵ T ˆσ Ṽ T x d + W T ˆσ ˆV T x d + W T σ V T x d W T ˆσ ˆV T x d (4 63) Ŵ T ˆσ Ṽ T x d Ŵ T ˆσ Ŵ T ˆσ ˆV T x d + ε + Ṡ Ġ 1 d h G 1ḣ µ, where the notations ˆσ and σ are introduced in (3 6). Substituting the NN weight adaptation laws in (4 54), (4 55) in (4 63) yields d G 1 ṙ = 1 2Ġ 1 r + Ñ + N B e m (k s + 1)r β 1 sgn(e m ), (4 64) where (4 15) was utilized, and the unmeasurable auxiliary terms Ñ(e 1, e 2,..., e m, r, t), N B (Ŵ, ˆV, x d, x d, t) R n are defined as Ñ 1 2Ġ 1 r Ξ T W ˆσ Ŵ T ˆσ Ξ T V x d + Ṡ + e m (4 65) In (4 66), N B1 ( x d, x d, t), N B2 (Ŵ, ˆV, x d, x d, t) R n are given by N B N B1 + N B2. (4 66) N B1 = W T σ V T x d + ε Ġ 1 d h G 1 d ḣ (4 67) N B2 = Ŵ T ˆσ Ṽ T x d + W T ˆσ ˆV T x d w T 1 ˆσ Ŵ T ˆσ v T 1 x d (4 68) In a similar manner as before, application of the Mean Value Theorem will yield an upper bound for Ñ (t) as in (4 3). The following inequalities are developed based on Assumptions 4-2 and 4-3, (3 7), (3 8), (3 16), (4 56)-(4 59), and (4 66)-(4 68): N B ζ 1 Ṅ B1 ζ2 (4 69) 81

82 ṄB 2 ζ3 + where ζ i R, i = 1, 2,..., m + 4 are known positive constants. m ζ i+3 e i + ζ m+4 r, (4 7) i=1 4.8 Stability Analysis Theorem 4-2: The combined NN and RISE controller given in (4 52)-(4 55) ensures that all system signals are bounded under closed-loop operation and that the position tracking error is regulated in the sense that e 1 (t) as t provided similar gains conditions as in Theorem 4-1 are satisfied. The proof of Theorem 4-2 is similar to Theorem Application to Euler-Lagrange Systems The Euler-Lagrange formulation describes the behavior of a large class of engineering systems. In this section, the modular adaptive control development for the general class of MIMO dynamic systems is applied to dynamic systems modeled by the Euler-Lagrange formulation M(q) q + V m (q, q) q + G(q) + F ( q) + τ d (t) = τ(t). (4 71) In (4 71), M(q) R n n denotes the inertia matrix, V m (q, q) R n n denotes the centripetal-coriolis matrix, G(q) R n denotes the gravity vector, F ( q) R n denotes friction, τ d (t) R n denotes a general nonlinear disturbance (e.g., unmodeled effects), τ(t) R n represents the torque input control vector, and q(t), q(t), q(t) R n denote the link position, velocity, and acceleration vectors, respectively. The control development is based on the assumption that q(t) and q(t) are measurable and that M(q), V m (q, q), G(q), F ( q) and τ d (t) are unknown. The modular adaptive controller developed in the earlier sections for the a general class of MIMO can be easily applied to Euler-Lagrange systems. Please see [8], [81] for complete details of the control development. 82
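To see how (4-71) fits the general form (4-1), note that with m = 2 and x = q the dynamics can be written as q̈ = f + Ḡτ + h, where Ḡ = M⁻¹ (symmetric positive definite, matching Assumption 4-1), f = −M⁻¹(V_m q̇ + G + F), and h = −M⁻¹τ_d. A minimal sketch of this identification is given below; the gravity vector is renamed Gvec to avoid a clash with the input matrix of (4-1), and the arguments are the dynamic terms evaluated at the current state.

```python
import numpy as np

def as_general_form(M, Vm, q_dot, Gvec, Fvec, tau_d):
    """Cast the Euler-Lagrange dynamics (4-71) into the form (4-1) with m = 2 and x = q:
        q_ddot = f(q, q_dot) + Gbar(q) * tau + h(t),
    where Gbar = M^{-1}, f = -M^{-1}(Vm*q_dot + Gvec + Fvec), and h = -M^{-1}*tau_d."""
    M_inv = np.linalg.inv(M)
    f = -M_inv @ (Vm @ q_dot + Gvec + Fvec)
    Gbar = M_inv
    h = -M_inv @ tau_d
    return f, Gbar, h
```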

Figure 4-1. The experimental testbed consists of a two-link robot. The links are mounted on two NSK direct-drive switched reluctance motors.

4.10 Experiment

To investigate the performance of the modular controller developed in this chapter, an experiment was performed on a two-link robot testbed as depicted in Fig. 4-1. The testbed is composed of a two-link direct-drive revolute robot consisting of two aluminum links, mounted on 240 Nm (base joint) and 20 Nm (second joint) switched reluctance motors. The motors are controlled through power electronics operating in torque control mode. The motor resolvers provide rotor position measurements with a resolution of 614,400 pulses/revolution, and a standard backwards difference algorithm is used to numerically determine velocity from the encoder readings. A Pentium 2.8 GHz PC operating under QNX hosts the control algorithm, which was implemented via a custom graphical user interface [64] to facilitate real-time graphing, data logging, and the ability to adjust control gains without recompiling the program. Data acquisition and control implementation were performed at a frequency of 1.0 kHz using the ServoToGo I/O board.

The dynamics for the testbed are

\begin{bmatrix} \tau_1 \\ \tau_2 \end{bmatrix} =
\begin{bmatrix} p_1 + 2p_3 c_2 & p_2 + p_3 c_2 \\ p_2 + p_3 c_2 & p_2 \end{bmatrix}
\begin{bmatrix} \ddot{q}_1 \\ \ddot{q}_2 \end{bmatrix}
+ \begin{bmatrix} -p_3 s_2 \dot{q}_2 & -p_3 s_2 (\dot{q}_1 + \dot{q}_2) \\ p_3 s_2 \dot{q}_1 & 0 \end{bmatrix}
\begin{bmatrix} \dot{q}_1 \\ \dot{q}_2 \end{bmatrix}
+ f(\dot{q}) + \begin{bmatrix} \tau_{d1} \\ \tau_{d2} \end{bmatrix},   (4-72)

where the nonlinear friction term is assumed to be modeled as [19, 26]

f(\dot{q}) = \begin{bmatrix}
\gamma_1 \left( \tanh(\gamma_2 \dot{q}_1) - \tanh(\gamma_3 \dot{q}_1) \right) + \gamma_4 \tanh(\gamma_5 \dot{q}_1) + \gamma_6 \dot{q}_1 \\
\gamma_1 \left( \tanh(\gamma_2 \dot{q}_2) - \tanh(\gamma_3 \dot{q}_2) \right) + \gamma_4 \tanh(\gamma_5 \dot{q}_2) + \gamma_6 \dot{q}_2
\end{bmatrix}.   (4-73)

In (4-72) and (4-73), p_1, p_2, p_3, γ_i ∈ R (i = 1, 2, ..., 6) are unknown positive constant parameters, c_2 denotes cos(q_2), s_2 denotes sin(q_2), and τ_d1, τ_d2 ∈ R denote general nonlinear disturbances (e.g., unmodeled effects). Part of the dynamics in (4-72) and (4-73) is linear in the following parameters: θ = [p_1 p_2 p_3 γ_1 γ_4 γ_6]^T. The parameters γ_2, γ_3, γ_5 are embedded inside the nonlinear hyperbolic tangent functions and hence cannot be linearly parameterized. Since these parameters cannot be compensated for by an adaptive algorithm, the best-guess estimates γ_2 = 5, γ_3 = 1, γ_5 = 5 are used. The values for γ_2, γ_3, γ_5 are based on our previous experiments concerned with friction identification. Significant errors in these static estimates could degrade the performance of the system. An advantage of the NN-based controller developed in the non-LP extension section is that the NN can compensate for the non-LP dynamics. Specifically, for the NN-based controllers tested in this section, the NN is used to estimate the friction model, and the best-guess values for γ_2, γ_3, γ_5 are not required.
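A minimal sketch of the testbed model (4-72)-(4-73), useful for simulation or for checking a regression matrix, is given below. Parameter values are left symbolic since p_1, p_2, p_3 and γ_1, ..., γ_6 are unknown; the off-diagonal structure of the centripetal-Coriolis matrix follows the reconstruction of (4-72) given above.

```python
import numpy as np

def two_link_dynamics(q, q_dot, p, gamma):
    """Evaluate the inertia matrix, centripetal-Coriolis matrix, and friction vector of
    (4-72)-(4-73).  p = (p1, p2, p3); gamma = (g1, g2, g3, g4, g5, g6)."""
    p1, p2, p3 = p
    g1, g2, g3, g4, g5, g6 = gamma
    c2, s2 = np.cos(q[1]), np.sin(q[1])
    M = np.array([[p1 + 2.0 * p3 * c2, p2 + p3 * c2],
                  [p2 + p3 * c2,       p2]])
    Vm = np.array([[-p3 * s2 * q_dot[1], -p3 * s2 * (q_dot[0] + q_dot[1])],
                   [ p3 * s2 * q_dot[0],  0.0]])
    f = np.array([g1 * (np.tanh(g2 * q_dot[0]) - np.tanh(g3 * q_dot[0]))
                  + g4 * np.tanh(g5 * q_dot[0]) + g6 * q_dot[0],
                  g1 * (np.tanh(g2 * q_dot[1]) - np.tanh(g3 * q_dot[1]))
                  + g4 * np.tanh(g5 * q_dot[1]) + g6 * q_dot[1]])
    return M, Vm, f

# Disturbance-free torque required for a given acceleration:
# tau = M @ q_ddot + Vm @ q_dot + f
```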

The control objective is to track the desired time-varying trajectory by using the proposed modular adaptive control law. Two different update laws were chosen for both the LP and non-LP cases. To achieve this control objective, the control gains α_1, α_2, k_s, and β_1, defined as scalars in (4-4), (4-6), and (4-12), were implemented (with non-consequential implications to the stability result) as diagonal gain matrices. Specifically, the control gains for both adaptive update laws for both the LP and non-LP cases were selected as

\alpha_1 = \mathrm{diag}\{7, 7\}, \quad \alpha_2 = \mathrm{diag}\{25, 25\}, \quad \beta_1 = \mathrm{diag}\{1, 0.1\}, \quad k_s = \mathrm{diag}\{1, 2\}.   (4-74)

4.10.1 Modular Adaptive Update Law

The desired trajectories for this experiment were chosen as

q_{d1} = q_{d2} = 60 \sin(2.0t)\left(1 - \exp(-0.01t^3)\right).   (4-75)

To test the modularity of the controller, two separate adaptive update laws were tested, including a standard gradient update law defined as

\dot{\hat{\theta}} = \mathrm{proj}\left( \Gamma Y_d^T r \right),

where Γ ∈ R^{6×6} is a diagonal positive-definite gain matrix, and a least squares update law defined as

\dot{\hat{\theta}} = \mathrm{proj}\left( P Y_d^T r \right), \quad \dot{P} = -P Y_d^T Y_d P,

where P(t) ∈ R^{6×6} is a time-varying symmetric matrix. The parameter estimates were all initialized to zero. In practice, the adaptive estimates would be initialized to a best-guess estimate of the values. Initializing the estimates to zero was done to test a scenario of no parameter knowledge. For the gradient and least squares update laws, the adaptation gains and the initial value of P(t) were selected as

\Gamma = P(0) = \mathrm{diag}\left([0.15, 0.1, 0.1, 0.1, 0.1, 0.2]\right).   (4-76)
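The two update laws might be realized as in the following sketch. The class names are hypothetical, the projection and the handling of the unmeasurable signal r (via integration by parts) are omitted for brevity, and a simple Euler step propagates the least-squares gain matrix; either object can supply the estimate rate to the same controller (4-11), which is the modularity being tested here.

```python
import numpy as np

class GradientUpdate:
    """Gradient rule of Section 4.10.1: theta_hat_dot = proj(Gamma * Yd^T * r)."""
    def __init__(self, Gamma):
        self.Gamma = np.asarray(Gamma, dtype=float)

    def rate(self, Yd, r):
        return self.Gamma @ (Yd.T @ r)

class LeastSquaresUpdate:
    """Least squares rule: theta_hat_dot = proj(P * Yd^T * r), P_dot = -P * Yd^T * Yd * P."""
    def __init__(self, P0):
        self.P = np.asarray(P0, dtype=float)

    def rate(self, Yd, r, dt):
        self.P += dt * (-self.P @ Yd.T @ Yd @ self.P)   # propagate the gain matrix
        return self.P @ (Yd.T @ r)
```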

Table 4-1. LP case: average RMS values over 10 trials for the gradient and least-squares update laws. The table reports, for each update law, the average RMS tracking error (deg) and average RMS torque (Nm) for Links 1 and 2, along with the corresponding error and torque standard deviations.

The adaptation gain in (4-76) was gradually increased through a trial-and-error tuning procedure, based on our previous experience, to achieve faster adaptation, up to the point where no further significant performance improvement was observed and no unnecessary oscillations appeared in the parameter estimates. Each experiment was performed ten times, and the resulting statistical data are provided in Table 4-1. Figure 4-2 depicts the tracking errors for one experimental trial with the gradient update law. The control torques and adaptive estimates for the same experimental trial are shown in Figs. 4-3 and 4-4, respectively. The tracking errors for a representative experimental trial with the least-squares update law are depicted in Figure 4-5. The torques for the least-squares update law are shown in Figure 4-6, and the adaptive estimates for the least-squares update law are shown in Figure 4-7.

4.9.2 Modular Neural Network Update Law

The desired trajectories for this experiment were chosen as

q_{d1} = q_{d2} = 60 sin(2.5t)(1 − exp(−0.01t³)).  (4-77)

To test the modularity of the controller, two separate neural network tuning laws were tested: a standard gradient tuning law based on backpropagated errors [76]

Figure 4-2. Link position tracking error with a gradient-based adaptive update law.

Figure 4-3. Torque input for the modular adaptive controller with a gradient-based adaptive update law.

Figure 4-4. Adaptive estimates for the gradient update law.

Figure 4-5. Link position tracking error with a least-squares adaptive update law.

Figure 4-6. Torque input for the modular adaptive controller with a least-squares adaptive update law.

Figure 4-7. Adaptive estimates for the least-squares update law.

Table 4-2. Non-LP case: average RMS values over 10 trials for the gradient and Hebbian tuning laws. The table reports, for each tuning law, the average RMS tracking error (deg) and average RMS torque (Nm) for Links 1 and 2, along with the corresponding error and torque standard deviations.

and a Hebbian tuning law [77]. The gradient tuning law is defined as

Ŵ̇ = proj(Fσ̂′V̂ᵀx̄_d e₂ᵀ),  V̂̇ = proj(Gx̄_d(σ̂′ᵀŴe₂)ᵀ),

where F ∈ R^{11×11} and G ∈ R^{7×7} are gain matrices, σ(·) ∈ R^{11} is a sigmoidal activation function, and the input vector x̄_d(t) ∈ R^{7} is defined as x̄_d = [1 q_{d1} q_{d2} q̇_{d1} q̇_{d2} q̈_{d1} q̈_{d2}]ᵀ. The Hebbian tuning law is defined as

Ŵ̇ = proj(Fσ̂e₂ᵀ),  V̂̇ = proj(Gx̄_dσ̂ᵀ).

The adaptation gains for both estimates were selected as F = 2I₁₁ and G = 0.5I₇, where I₁₁ ∈ R^{11×11} and I₇ ∈ R^{7×7} are identity matrices. As with the LP case, the initial values of Ŵ(0) were chosen to be a zero matrix; however, the initial values of V̂(0) were selected randomly between −1.0 and 1.0 to provide a basis [71]. A different transient response could be obtained if the NN weights were initialized differently.
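As a rough illustration of the two tuning-law choices, the sketch below performs one Euler step of each. To keep the dimensions self-consistent in this sketch, the hidden layer is taken as 11 sigmoidal units with no separate bias entry, which is an assumption of the sketch rather than the exact architecture above, and the projection operator is omitted.

    import numpy as np

    def nn_tuning_step(W_hat, V_hat, xd_bar, e2, F, G, dt, hebbian=False):
        # Hidden-layer activation (elementwise sigmoid) and, for the gradient law,
        # its Jacobian with respect to the hidden-layer input
        z = V_hat.T @ xd_bar                        # (11,)
        sig = 1.0 / (1.0 + np.exp(-z))              # sigma_hat, (11,)
        if hebbian:
            # Hebbian law:  W_dot = F sigma e2^T,  V_dot = G x_d sigma^T
            W_dot = F @ np.outer(sig, e2)
            V_dot = G @ np.outer(xd_bar, sig)
        else:
            # Gradient (backprop) law:
            #   W_dot = F sigma' V^T x_d e2^T,  V_dot = G x_d (sigma'^T W e2)^T
            sig_prime = np.diag(sig * (1.0 - sig))  # (11, 11)
            W_dot = F @ sig_prime @ np.outer(V_hat.T @ xd_bar, e2)
            V_dot = G @ np.outer(xd_bar, sig_prime.T @ W_hat @ e2)
        return W_hat + dt * W_dot, V_hat + dt * V_dot

    # Example call with the gains quoted above and placeholder signals
    rng = np.random.default_rng(0)
    W_hat, V_hat = np.zeros((11, 2)), rng.uniform(-1.0, 1.0, (7, 11))
    F, G = 2.0 * np.eye(11), 0.5 * np.eye(7)
    xd_bar = np.array([1.0, 0.5, 0.5, 0.2, 0.2, 0.0, 0.0])
    e2 = np.array([0.01, -0.02])
    W_hat, V_hat = nn_tuning_step(W_hat, V_hat, xd_bar, e2, F, G, dt=0.001, hebbian=True)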

Figure 4-8. Link position tracking error for the modular NN controller with a gradient-based tuning law.

Figure 4-9. Torque input for the modular NN controller with a gradient-based tuning law.

Figure 4-8 depicts the tracking errors for one experimental trial with the gradient tuning law. The control torques for the same experimental trial are shown in Figure 4-9. The tracking errors for a representative experimental trial with the Hebbian tuning law are depicted in Figure 4-10. The torques for the Hebbian tuning law are shown in Figure 4-11.

Figure 4-10. Link position tracking error for the modular NN controller with a Hebbian tuning law.

Figure 4-11. Torque input for the modular NN controller with a Hebbian tuning law.

4.10 Discussion

Data from the two sets of experiments illustrate that different adaptive update laws (which are specific instances of the more general update laws considered in the control development) can be used with the proposed modular adaptive controller for both the LP and non-LP cases. The different update laws can be used with no change to the overall structure of the controller, and stability is guaranteed through the Lyapunov analysis. The stability results for previous continuous modular adaptive controllers would

not have been valid for the dynamic model developed for this robot. Specifically, the advanced friction model derived from [26] contains non-LP disturbances that cannot be considered in previous results. In Experiment 1, the system dynamics are assumed to be partially LP. The model-based adaptive controller in (4-11) is implemented with two different update laws: the standard gradient-based law and the least-squares law. Our experimental results confirm the widely acknowledged fact that least-squares update laws yield faster convergence of the parameter estimates than gradient-based update laws. The parameter estimates converge much faster with the least-squares law than with the gradient-based law (cf. Figure 4-4 and Figure 4-7). However, this faster convergence of the parameter estimates does not appear to be beneficial here, since the overall tracking performance and the required control effort are similar for both cases (see Table 4-1). In Experiment 2, the system dynamics are not assumed to be LP. The NN-based controller in (4-52) is implemented with two different update laws, specifically the gradient-based law and the Hebbian law. One outcome of the experimental results is that the Hebbian tuning law gives tracking performance on par with that of the gradient-based law (see Table 4-2). This result is significant because, in general, it is difficult to prove stability and guarantee performance using Hebbian laws [15]. In contrast, the gradient-based laws are designed to cancel certain cross terms in the Lyapunov stability analysis [32, 75], thus facilitating the analysis and guaranteeing stability. However, the gradient-based laws have a complicated form and require computation of the Jacobian of the activation function, whereas Hebbian laws have a simple structure and do not require this Jacobian. The fact that a simpler update law yields better performance in our experiments is interesting and may lead to greater use of Hebbian (or other simpler) update laws in NN-based control.

4.11 Conclusion

A RISE-based approach was presented to achieve modularity in the controller/update law for a general class of multi-input systems. Specifically, for systems with structured and unstructured uncertainties, a controller was employed that uses a model-based adaptive feedforward term in conjunction with the RISE feedback term (see [19]). The adaptive feedforward term was made modular by considering a generic form of the adaptive update law and its corresponding parameter estimate. This generic form of the update law was used to develop a new closed-loop error system, and the typical RISE stability analysis was modified. New sufficient gain conditions were derived to show asymptotic tracking of the desired link position. The class of RISE-based modular adaptive controllers was then extended to include uncertain dynamic systems that do not satisfy the LP assumption. Specifically, the result allows the NN weight tuning laws to be determined from a developed generic update law (rather than being restricted to a gradient update law). The modular adaptive control development was then applied to Euler-Lagrange dynamic systems, and an experimental section was included to illustrate the concepts.

CHAPTER 5
COMPOSITE ADAPTIVE CONTROL FOR SYSTEMS WITH ADDITIVE DISTURBANCES

5.1 Introduction

Applying the swapping approach to dynamics with additive disturbances is problematic because the unknown disturbance terms also get filtered and included in the filtered control input. This problem motivates the question of how a prediction error-based adaptive update law can be developed for systems with additive disturbances. To address this motivating question, a general Euler-Lagrange-like MIMO system is considered with structured and unstructured uncertainties, and a gradient-based composite adaptive update law is developed that is driven by both the tracking error and the prediction error. The control development is based on the recent continuous Robust Integral of the Sign of the Error (RISE) [19] technique that was originally developed in [21] and [22]. The RISE architecture is adopted since this method can accommodate C² disturbances and yield asymptotic stability. For example, the RISE technique was used in [23] to develop a tracking controller for nonlinear systems in the presence of additive disturbances and parametric uncertainties. Based on the well-accepted heuristic notion that the addition of system knowledge to the control structure yields better performance and reduces control effort, model-based adaptive and neural network feedforward elements were added to the RISE controller in [19] and [75], respectively. In comparison to these approaches, which used the RISE method in the feedback component of the controller, here the RISE structure is used in both the feedback and feedforward elements of the control structure to enable, for the first time, the construction of a prediction error in the presence of additive disturbances. Specifically, since the swapping method will result in disturbances in the prediction error (the main obstacle that has previously limited this development), an innovative use of the RISE structure is also employed in the prediction error update (i.e., the filtered control input estimate). A block diagram indicating the unique use of the RISE method in the control and the prediction error formulation is

provided in Figure 5-1. Sufficient gain conditions are developed under which this unique double-RISE controller guarantees semi-global asymptotic tracking. Experimental results are presented to illustrate the performance of the proposed approach. The asymptotic stability of the proposed RISE-based composite adaptive controller comes at the expense of a semi-global result, which requires the initial condition to lie within a specified region of attraction that can be made larger by increasing certain gains, as subsequently discussed in Section 5.5. Development is also provided that proves the prediction error is square integrable; however, no conclusion can be drawn about the convergence of the parameter estimation error due to the presence of filtered additive disturbances in the prediction error. The proposed method uses a gradient-based composite adaptive law with a fixed adaptation gain. Future efforts could focus on designing a composite law with least-squares estimation and a time-varying adaptation gain for the considered class of systems.

5.2 Dynamic System

Consider a class of MIMO nonlinear Euler-Lagrange systems of the following form:

x^{(m)} = f(x, ẋ, ..., x^{(m−1)}) + G(x, ẋ, ..., x^{(m−2)})u + h(t),  (5-1)

where (·)^{(i)}(t) denotes the i-th derivative with respect to time, x^{(i)}(t) ∈ R^n, i = 0, ..., m − 1 are the system states, u(t) ∈ R^n is the control input, f(x, ẋ, ..., x^{(m−1)}) ∈ R^n and G(x, ẋ, ..., x^{(m−2)}) ∈ R^{n×n} are unknown nonlinear C² functions, and h(t) ∈ R^n denotes a general nonlinear disturbance (e.g., unmodeled effects). Throughout the chapter, |·| denotes the absolute value of a scalar argument, and ‖·‖ denotes the standard Euclidean norm for a vector or the induced infinity norm for a matrix. The subsequent development is based on the assumption that all the system states are measurable outputs. Moreover, the following assumptions will be exploited in the subsequent development.

Assumption 5-1: G(·) is symmetric and positive definite, and satisfies the following inequality for all y(t) ∈ R^n:

g‖y‖² ≤ yᵀG⁻¹y ≤ ḡ(x, ẋ, ..., x^{(m−2)})‖y‖²,  (5-2)

where g ∈ R is a known positive constant, and ḡ(x, ẋ, ..., x^{(m−2)}) ∈ R is a known positive function.

Assumption 5-2: The functions G⁻¹(·) and f(·) are second-order differentiable such that G⁻¹, Ġ⁻¹, G̈⁻¹, f, ḟ, f̈ ∈ L∞ if x^{(i)}(t) ∈ L∞, i = 0, 1, ..., m + 1.

Assumption 5-3: The nonlinear disturbance term and its first two time derivatives (i.e., h, ḣ, ḧ) are bounded by known constants.

Assumption 5-4: The unknown nonlinearities G⁻¹(·) and f(·) are linear in terms of unknown constant system parameters (i.e., LP).

Assumption 5-5: The desired trajectory x_d(t) ∈ R^n is assumed to be designed such that x_d^{(i)}(t) ∈ L∞, i = 0, 1, ..., m + 2. The desired trajectory x_d(t) need not be persistently exciting and can be set to a constant value for the regulation problem.

5.3 Control Objective

The objective is to design a continuous composite adaptive controller which ensures that the system state x(t) tracks a desired time-varying trajectory x_d(t) despite uncertainties and bounded disturbances in the dynamic model. To quantify this objective, a tracking error, denoted by e₁(t) ∈ R^n, is defined as

e₁ ≜ x_d − x.  (5-3)

To facilitate a compact presentation of the subsequent control development and stability analysis, auxiliary error signals denoted by e_i(t) ∈ R^n, i = 2, 3, ..., m are defined as

e₂ ≜ ė₁ + α₁e₁
e₃ ≜ ė₂ + α₂e₂ + e₁
e₄ ≜ ė₃ + α₃e₃ + e₂
⋮
e_i ≜ ė_{i−1} + α_{i−1}e_{i−1} + e_{i−2}
⋮
e_m ≜ ė_{m−1} + α_{m−1}e_{m−1} + e_{m−2},  (5-4)

where α_i ∈ R, i = 1, 2, ..., m − 1 denote constant positive control gains. The error signals e_i(t), i = 2, 3, ..., m can be expressed in terms of e₁(t) and its time derivatives as

e_i = Σ_{j=0}^{i−1} b_{i,j} e₁^{(j)},  b_{i,i−1} = 1,  (5-5)

where the constant coefficients b_{i,j} ∈ R can be evaluated by substituting (5-5) into (5-4) and comparing coefficients. A filtered tracking error [57], denoted by r(t) ∈ R^n, is also defined as

r ≜ ė_m + α_m e_m,  (5-6)

where α_m ∈ R is a positive, constant control gain. The filtered tracking error r(t) is not measurable since the expression in (5-6) depends on x^{(m)}.

5.4 Control Development

To develop the open-loop tracking error system, the filtered tracking error in (5-6) is premultiplied by G⁻¹(·) to yield

G⁻¹r = G⁻¹ė_m + G⁻¹α_m e_m.  (5-7)
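As a side illustration of how the auxiliary errors in (5-4)-(5-6) can be formed in practice, the sketch below computes the coefficients b_{i,j} of (5-5) recursively and then assembles e₂, ..., e_m and r from e₁ and its derivatives; the recursion for b_{i,j} follows directly from substituting (5-5) into (5-4), and all numerical inputs are placeholders.

    import numpy as np

    def b_coefficients(alphas):
        # From e_i = e_{i-1}_dot + alpha_{i-1} e_{i-1} + e_{i-2} and (5-5):
        #   b_{i,j} = b_{i-1,j-1} + alpha_{i-1} b_{i-1,j} + b_{i-2,j}
        m = len(alphas)
        b = {(1, 0): 1.0}
        for i in range(2, m + 1):
            for j in range(i):
                b[(i, j)] = (b.get((i - 1, j - 1), 0.0)
                             + alphas[i - 2] * b.get((i - 1, j), 0.0)
                             + b.get((i - 2, j), 0.0))
        return b

    def error_signals(e1_derivs, alphas):
        # e1_derivs = [e1, e1_dot, ..., e1^(m)];  alphas = [alpha_1, ..., alpha_m]
        m = len(alphas)
        b = b_coefficients(alphas)
        e = {1: e1_derivs[0]}
        for i in range(2, m + 1):
            e[i] = sum(b[(i, j)] * e1_derivs[j] for j in range(i))
        e_m_dot = sum(b[(m, j)] * e1_derivs[j + 1] for j in range(m))
        r = e_m_dot + alphas[-1] * e[m]          # filtered tracking error (5-6)
        return e, r

    # Example: m = 3, scalar channel, placeholder derivatives of e1
    e, r = error_signals([np.array([0.1]), np.array([0.0]),
                          np.array([-0.2]), np.array([0.05])], alphas=[2.0, 1.0, 1.0])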

99 Substituting (5 5) in to (5 7) for ė m (t) yields m 1 G 1 r = G 1 b m,j e (j+1) 1 + G 1 α m e m. (5 8) j= By separating the last term from the summation, (5 8) can also be expressed as m 2 G 1 r = G 1 b m,m 1 e (m) 1 + G 1 b m,j e (j+1) 1 + G 1 α m e m. (5 9) Using the fact that b m,m 1 = 1 and making substitutions from (5 1) and (5 3), the expression in (5 9) is rewritten as G 1 r = G 1 x (m) d which can be rearranged as j= m 2 G 1 f G 1 h u + G 1 b m,j e (j+1) 1 + G 1 α m e m j= G 1 r = Y d θ + S 1 G 1 d h u. (5 1) In (5 1), the auxiliary function S 1 ( x, ẋ,..., x (m 1), t ) R n is defined as ( m 2 ) S 1 G 1 b m,j e (j+1) 1 + α m e m j= Also in (5 1), Y d θ R n is defined as + G 1 x (m) d G 1 d x(m) d G 1 f + G 1 d f d G 1 h + G 1 d h. (5 11) Y d θ G 1 d x(m) d G 1 d f d (5 12) where Y d (x d, ẋ d,..., x (m) d ) R n p is a desired regression matrix, and θ R p contains the constant unknown system parameters. In (5 12), the functions G 1 d (x d, ẋ d,..., x (m 2) d ) R n n and f d (x d, ẋ d,..., x (m 1) d ) R n are defined as G 1 d G 1 (x d, ẋ d,..., x (m 2) d ) (5 13) f d f(x d, ẋ d,..., x (m 1) d ). 99

The open-loop error system in (5-10) is typically written in a form similar to

G⁻¹ė_m = Y_dθ + S₁ − G⁻¹α_m e_m − G_d⁻¹h − u,  (5-14)

which can be obtained by substituting (5-6) into (5-10) for the filtered tracking error. Although (5-10) and (5-14) are equivalent, the atypical form in (5-10) is used to facilitate the subsequent closed-loop error system development and stability analysis. Specifically, the subsequent RISE control method is designed based on the time derivative of the control input. The design of the filtered tracking error in (5-6) is not necessary, but it simplifies the subsequent development by allowing the closed-loop error system to be expressed in terms of ṙ(t) rather than ë_m(t).

5.4.1 RISE-based Swapping

A measurable form of the prediction error ε(t) ∈ R^n is defined as the difference between the filtered control input u_f(t) ∈ R^n and the estimated filtered control input û_f(t) ∈ R^n as

ε ≜ u_f − û_f,  (5-15)

where the filtered control input u_f(t) ∈ R^n is generated by [3]

u̇_f + ωu_f = ωu,  u_f(0) = 0,  (5-16)

where ω ∈ R is a known positive constant, and û_f(t) ∈ R^n is subsequently designed. The differential equation in (5-16) can be directly solved to yield

u_f = v ∗ u,  (5-17)

where ∗ is used to denote the standard convolution operation, and the scalar function v(t) is defined as

v ≜ ωe^{−ωt}.  (5-18)
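The filter in (5-16) is a standard first-order low-pass filter on the control input; the sketch below shows one possible discrete-time realization using a backward-Euler step, with the sample period and filter constant chosen arbitrarily for illustration.

    def filtered_input_step(u_f, u, omega, dt):
        # Backward-Euler discretization of  u_f_dot + omega*u_f = omega*u,  u_f(0) = 0
        return (u_f + dt * omega * u) / (1.0 + dt * omega)

    # Example: filtering a constant input at 1 kHz with omega = 10 (placeholder values)
    u_f = 0.0
    for _ in range(1000):
        u_f = filtered_input_step(u_f, 1.0, omega=10.0, dt=0.001)
    print(u_f)   # approaches 1.0 as the filter settles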

101 Using (5 1), the expression in (5 17) can be rewritten as u f = v ( G 1 x (m) G 1 f G 1 h ). (5 19) Since the system dynamics in (5 1) include non-lp bounded disturbances h (t), they also get filtered and included in the filtered control input in (5 19). To compensate for the effects of these disturbances, the typical prediction error formulation is modified to include a RISE-like structure in the design of the estimated filtered control input. With this motivation, the structure of the open-loop prediction error system is engineered to facilitate the RISE-based design of the estimated filtered control input. yields Adding and subtracting the term G 1 d x(m) d + G 1 d f d + G 1 h to the expression in (5 19) d u f = v (G 1 d x(m) d +G 1 d f d+g 1 x (m) G 1 d x(m) d Using (5 12), the expression in (5 2) is simplified as G 1 f G 1 d f d G 1 h+g 1 d h G 1 d h). (5 2) u f = v ( Y d θ + S S d G 1 d h) (5 21) where S(x, ẋ,..., x (m) ), S d (x d, ẋ d,..., x (m) d ) R n are defined as S G 1 x (m) G 1 f G 1 h (5 22) The expression in (5 21) is further simplified as S d G 1 d x(m) d G 1 d f d G 1 d h. (5 23) u f = Y df θ + v S v S d + h f (5 24) where the filtered regressor matrix Y df (x d, ẋ d,..., x (m) d ) R n p is defined as Y df v Y d (5 25) 11

102 and the disturbance h f (t) R n is defined as h f v G 1 d h. The term v S(x, ẋ,..., x (m) ) R n in (5 24) depends on x (m). Using the following property of convolution [57]: g 1 ġ 2 = ġ 1 g 2 + g 1 ()g 2 g 1 g 2 () (5 26) an expression independent of x (m) can be obtained. Consider v S = v ( G 1 x (m) G 1 f G 1 h ) which can be rewritten as v S = v ( d dt (G 1 x (m 1) ) Ġ 1 x (m 1) G 1 f G 1 h). (5 27) Applying the property in (5 26) to the first term of (5 27) yields v S = S f + W (5 28) where the state-dependent terms are included in the auxiliary function S f (x, ẋ,..., x (m 1) ) R n, defined as S f v ( G 1 x (m 1)) + v ()G 1 x (m 1) v Ġ 1 x (m 1) v G 1 f v G 1 h (5 29) and the terms that depend on the initial states are included in W (t) R n, defined as W vg 1 ( x (), ẋ(),..., x (m 2) () ) x (m 1) (). (5 3) Similarly, following the procedure in (5 27)-(5 3), the expression v S d in (5 24) is evaluated as v S d = S df + W d (5 31) 12

where S_df(x_d, ẋ_d, ..., x_d^{(m−1)}) ∈ R^n is defined as

S_df ≜ v̇ ∗ (G_d⁻¹x_d^{(m−1)}) + v(0)G_d⁻¹x_d^{(m−1)} − v ∗ Ġ_d⁻¹x_d^{(m−1)} − v ∗ G_d⁻¹f_d − v ∗ G_d⁻¹h,  (5-32)

and W_d(t) ∈ R^n is defined as

W_d ≜ −vG_d⁻¹(x_d(0), ẋ_d(0), ..., x_d^{(m−2)}(0))x_d^{(m−1)}(0).  (5-33)

Substituting (5-28)-(5-33) into (5-24), and then substituting the resulting expression into (5-15) yields

ε = Y_dfθ + S_f − S_df + W − W_d + h_f − û_f.  (5-34)

5.4.2 Composite Adaptation

The composite adaptation for the adaptive estimates θ̂(t) ∈ R^p in (5-47) is given by

θ̂̇ ≜ ΓẎ_dᵀr + ΓẎ_dfᵀε,  (5-35)

where Γ ∈ R^{p×p} is a positive-definite, symmetric, constant gain matrix and the filtered regressor matrix Y_df(x_d, ẋ_d, ..., x_d^{(m)}) ∈ R^{n×p} is defined in (5-25).

Remark 5.1. The parameter estimate update law in (5-35) depends on the unmeasurable signal r(t), but the parameter estimates are independent of r(t), as can be shown by directly solving (5-35) as

θ̂(t) = θ̂(0) + ΓẎ_dᵀ(σ)e_m(σ)|₀ᵗ + ∫₀ᵗ ΓẎ_dfᵀ(σ)ε(σ)dσ − ∫₀ᵗ {ΓŸ_dᵀ(σ)e_m(σ) − α_mΓẎ_dᵀ(σ)e_m(σ)}dσ.

5.4.3 Closed-Loop Prediction Error System

Based on (5-34) and the subsequent analysis, the filtered control input estimate is designed as

û_f ≜ Y_dfθ̂ + μ₂,  (5-36)

where μ₂(t) ∈ R^n is a RISE-like term defined as

μ₂(t) ≜ ∫₀ᵗ [k₂ε(σ) + β₂sgn(ε(σ))]dσ,  (5-37)

where k₂, β₂ ∈ R denote constant positive control gains. In a typical prediction error formulation, the estimated filtered control input is designed to include just the first term Y_dfθ̂ in (5-36). But as previously discussed, the presence of non-LP disturbances in the system model results in filtered disturbances in the unmeasurable form of the prediction error in (5-34). Hence, the estimated filtered control input is augmented with an additional RISE-like term μ₂(t) to cancel the effects of the disturbances in the prediction error, as illustrated in Figure 5-1 and the subsequent design and stability analysis. Substituting (5-36) into (5-34) yields the following closed-loop prediction error system:

ε = Y_dfθ̃ + S_f − S_df + W − W_d + h_f − μ₂,  (5-38)

where θ̃(t) ∈ R^p denotes the parameter estimate mismatch defined as

θ̃ ≜ θ − θ̂.  (5-39)

To facilitate the subsequent composite adaptive control development and stability analysis, the time derivative of (5-38) is expressed as

ε̇ = Ẏ_dfθ̃ − Y_dfΓẎ_dfᵀε + Ñ₂ + N_{2B} − k₂ε − β₂sgn(ε),  (5-40)

where (5-35) and the fact that

μ̇₂ = k₂ε + β₂sgn(ε)  (5-41)

were utilized. In (5-40), the unmeasurable/unknown auxiliary term Ñ₂(e₁, e₂, ..., e_m, r, t) ∈ R^n is defined as

Ñ₂ ≜ Ṡ_f − Ṡ_df − Y_dfΓẎ_dᵀr,  (5-42)

where the update law in (5-35) was utilized, and the term N_{2B}(t) ∈ R^n is defined as

N_{2B} ≜ Ẇ − Ẇ_d + ḣ_f.  (5-43)

In a similar manner as in Lemma 1 of the Appendix, the Mean Value Theorem can be applied to develop the following upper bound for the expression in (5-42):

‖Ñ₂(t)‖ ≤ ρ₂(‖z‖)‖z‖,  (5-44)

where the bounding function ρ₂(·) ∈ R is a positive, globally invertible, nondecreasing function, and z(t) ∈ R^{n(m+1)} is defined as

z(t) ≜ [e₁ᵀ e₂ᵀ ... e_mᵀ rᵀ]ᵀ.  (5-45)

Using Assumption 5-3, and the fact that v(t) is a linear, strictly proper, exponentially stable transfer function, the following inequality can be developed based on the expression in (5-43) with a similar approach as in Lemma 2 of [42]:

‖N_{2B}(t)‖ ≤ ξ,  (5-46)

where ξ ∈ R is a known positive constant.

5.4.4 Closed-Loop Tracking Error System

Based on the open-loop error system in (5-10), the control input is composed of an adaptive feedforward term plus the RISE feedback term as

u ≜ Y_dθ̂ + μ₁,  (5-47)

where μ₁(t) ∈ R^n denotes the RISE feedback term defined as

μ₁(t) ≜ (k₁ + 1)e_m(t) − (k₁ + 1)e_m(0) + ∫₀ᵗ {(k₁ + 1)α_m e_m(σ) + β₁sgn(e_m(σ))}dσ,  (5-48)

where k₁, β₁ ∈ R are positive constant control gains, and α_m ∈ R was introduced in (5-6). In (5-47), θ̂(t) ∈ R^p denotes a parameter estimate vector for the unknown system parameters

θ ∈ R^p, generated by a subsequently designed gradient-based composite adaptive update law [38, 39, 82].

Figure 5-1. Block diagram of the proposed RISE-based composite adaptive controller.

The closed-loop tracking error system can be developed by substituting (5-47) into (5-10) as

G⁻¹r = Y_dθ̃ + S₁ − G_d⁻¹h − μ₁.  (5-49)

To facilitate the subsequent composite adaptive control development and stability analysis, the time derivative of (5-49) is expressed as

G⁻¹ṙ = −(1/2)Ġ⁻¹r + Ẏ_dθ̃ − Y_dΓẎ_dfᵀε + Ñ₁ + N_{1B} − (k₁ + 1)r − β₁sgn(e_m) − e_m,  (5-50)

where (5-35) and the fact that the time derivative of (5-48) is given as

μ̇₁ = (k₁ + 1)r + β₁sgn(e_m)  (5-51)

were utilized. In (5-50), the unmeasurable/unknown auxiliary terms Ñ₁(e₁, e₂, ..., e_m, r, t) and N_{1B}(t) ∈ R^n are defined as

Ñ₁ ≜ −(1/2)Ġ⁻¹r + Ṡ₁ + e_m − Y_dΓẎ_dᵀr,  (5-52)

where (5-35) was used, and

N_{1B} ≜ −Ġ_d⁻¹h − G_d⁻¹ḣ.  (5-53)

The structure of (5-50) and the introduction of the auxiliary terms in (5-52) and (5-53) are motivated by the desire to segregate the terms that can be upper bounded by state-dependent functions from the terms that can be upper bounded by constants. In a similar fashion as in (5-44), the following upper bound can be developed for the expression in (5-52):

‖Ñ₁(t)‖ ≤ ρ₁(‖z‖)‖z‖,  (5-54)

where the bounding function ρ₁(·) ∈ R is a positive, globally invertible, nondecreasing function, and z(t) ∈ R^{n(m+1)} was defined in (5-45). Using Assumptions 5-2 and 5-3, the following inequalities can be developed based on the expression in (5-53) and its time derivative:

‖N_{1B}(t)‖ ≤ ζ₁,  ‖Ṅ_{1B}(t)‖ ≤ ζ₂,  (5-55)

where ζᵢ ∈ R, i = 1, 2 are known positive constants. The RISE controller in (5-47) and (5-48) and sliding mode controllers (SMCs) (e.g., see the classic results in [3, 12]) are the only methods that have been proven to yield an asymptotic tracking result for an open-loop error system such as (5-10), where additive bounded disturbances are present that are upper bounded by a constant. Other approaches lack a mechanism to cancel the disturbance terms (these terms are typically eliminated through nonlinear damping and yield a uniformly ultimately bounded (UUB) result). SMC is a discontinuous control method that requires infinite control bandwidth and a known upper bound on the disturbance term. Continuous modifications of SMC reduce the stability result to UUB. The RISE control method in (5-47) and (5-48) is continuous/differentiable (i.e., finite bandwidth) and requires a known upper bound on the disturbance and the time derivative of the disturbance. Regardless of which feedback control method is selected, the control challenge is that the disturbance term h(t) in (5-1) will be included in the swapping (or torque filtering) method in the prediction error formulation

as shown in (5-34). This technical obstacle has prevented the development of any previous composite adaptive controller for a dynamic system with additive disturbances such as h(t). The contribution of the current result is the development of the RISE-based swapping method to enable the design of composite adaptive controllers for systems such as (5-1). Specifically, the RISE-based swapping technique provides a means to cancel the filtered disturbance terms in (5-34).

5.5 Stability Analysis

Theorem 5-1: The controller given in (5-47) and (5-48), in conjunction with the composite adaptive update law in (5-35), where the prediction error is generated from (5-15), (5-16), (5-36), and (5-37), ensures that all system signals are bounded under closed-loop operation and that the position tracking error and the prediction error are regulated in the sense that

‖e₁(t)‖ → 0 and ‖ε(t)‖ → 0 as t → ∞,

provided the control gains k₁ and k₂ introduced in (5-48) and (5-37) are selected sufficiently large based on the initial conditions of the system (see the subsequent proof), and the following conditions are satisfied:

α_{m−1} > 1/2,  α_m > 1/2,  (5-56)
β₁ > ζ₁ + (1/α_m)ζ₂,  β₂ > ξ,  (5-57)

where the gains α_{m−1} and α_m were introduced in (5-4), β₁ was introduced in (5-48), β₂ was introduced in (5-37), ζ₁ and ζ₂ were introduced in (5-55), and ξ was introduced in (5-46).

Proof: Let D ⊂ R^{n(m+2)+p+2} be a domain containing y(t) = 0, where y(t) ∈ R^{n(m+2)+p+2} is defined as

y ≜ [zᵀ εᵀ √P₁ √P₂ θ̃ᵀ]ᵀ.  (5-58)

109 In (5 58), the auxiliary function P 1 (t) R is defined as n P 1 (t) β 1 e mi () e m () T N 1B () t i=1 L 1 (τ)dτ, (5 59) where e mi () R denotes the ith element of the vector e m (), and the auxiliary function L 1 (t) R is defined as L 1 r T (N 1B β 1 sgn(e m )), (5 6) where β 1 R is a positive constant chosen according to the sufficient condition in (5 57). Provided the sufficient condition introduced in (5 57) is satisfied, the following inequality is obtained [22]: t n L 1 (τ)dτ β 1 e mi () e m () T N 1B (). (5 61) i=1 Hence, (5 61) can be used to conclude that P 1 (t). Also in (5 58), the auxiliary function P 2 (t) R is defined as P 2 (t) t where the auxiliary function L 2 (t) R is defined as L 2 (τ)dτ, (5 62) L 2 ε T (N 2B β 2 sgn(ε)), (5 63) where β 2 R is a positive constant chosen according to the sufficient condition in (5 57). Provided the sufficient condition introduced in (5 57) is satisfied, then P 2 (t). Let V L (y, t) : D [, ) R be a continuously differentiable, positive definite function defined as V L (y, t) 1 m e T i 2 e i rt G 1 r εt ε + P 1 + P θ T Γ 1 θ (5 64) i=1 which satisfies the inequalities U 1 (y) V L (y, t) U 2 (y) (5 65) 19

110 provided the sufficient conditions introduced in (5 57) are satisfied. In (5 65), the continuous positive definite functions U 1 (y), U 2 (y) R are defined as U 1 (y) λ 1 y 2 and U 2 (y) λ 2 (x, ẋ,..., x (m 2) ) y 2, where λ 1, λ 2 (x, ẋ,..., x (m 2) ) R are defined as λ min { 1, g, λ min { Γ 1 }} (5 66) λ 2 max{ 1 2ḡ(x, ẋ,..., x(m 2) ), 1 2 λ max { Γ 1 }, 1} where g, ḡ(x, ẋ,..., x (m 2) ) are introduced in (5 2), and λ min { } and λ max { } denote the minimum and maximum eigenvalue of the arguments, respectively. After using (5 4), (5 6), (5 35), (5 4), (5 5), (5 59), (5 6), (5 62) and (5 63), the time derivative of (5 64) can be expressed as V L (y, t) = m α i e T i e i + e T m 1 e m r T r k 1 r T r + r T Ẏ d θ + r T Ñ 1 i=1 + r T N 1B r T Y d ΓẎ T df ε β 1r T sgn(e m ) + ε T Ẏ df θ (5 67) + ε T Ñ 2 + ε T N 2B k 2 ε T ε ε T Y df ΓẎ T df ε β 2ε T sgn(ε) r T (N 1B β 1 sgn(e m )) ε T N 2B + ε T β 2 sgn(ε) θ T Γ 1 (ΓẎ T d r + ΓẎ T df ε). After canceling the similar terms and using the fact that a T b 1 2 ( a 2 + b 2 ) for some a, b R n, the expression in (5 67) is upper bounded as V L (y, t) m α i e T i e i e m e m 2 r 2 i=1 k 1 r 2 + r T Ñ 1 r T Y d ΓẎ T df ε + εt Ñ 2 k 2 ε T ε ε T Y df ΓẎ T df ε. Using the following upper bounds: Y d ΓẎ T c 1, df Y df ΓẎ T c 2, df 11

111 where c 1, c 2 R are positive constants, VL (y, t) is upper bounded using the squares of the components of z(t) as V L (y, t) λ 3 z 2 k 1 r 2 + r Ñ1 + c 1 ε r + ε Ñ2 (k 2 c 2 ) ε 2, (5 68) where λ 3 min{α 1, α 2,..., α m 2, α m 1 1 2, α m 1 2, 1}. Letting k 2 = k 2a + k 2b where k 2a, k 2b R are positive constants, and using the inequalities in (5 54) and (5 44), the expression in (5 68) is upper bounded as V L (y, t) λ 3 z 2 k 2b ε 2 [ k 1 r 2 ρ 1 ( z ) r z ] (5 69) [ (k 2a c 2 ) ε 2 (ρ 2 ( z ) + c 1 ) ε z ]. Completing the squares for the terms inside the brackets in (5 69) yields V L (y, t) λ 3 z 2 k 2b ε 2 + ρ2 1 ( z ) z 2 + (ρ 2( z ) + c 1 ) 2 z 2 4k 1 4 (k 2a c 2 ) λ 3 z 2 + ρ2 ( z ) z 2 k 2b ε 2, (5 7) 4k where k R is defined as k k 1 (k 2a c 2 ) max {k 1, (k 2a c 2 )}, k 2a > c 2 (5 71) and ρ( ) R is a positive, globally invertible, nondecreasing function defined as ρ 2 ( z ) ρ 2 1 ( z ) + (ρ 2( z ) + c 1 )

112 The expression in (5 7) can be further upper bounded by a continuous, positive semi-definite function V L (y, t) U(y) = c [ z T ε T] T 2 y D (5 72) for some positive constant c, where D ( {y (t) R n(m+2)+p+2 y ρ 1 2 )} λ 3 k. Larger values of k will expand the size of the domain D. The inequalities in (5 65) and (5 7) can be used to show that V L (y, t) L in D; hence, e i (t) L and ε (t), r(t), θ (t) L in D. Given that e i (t) L and r(t) L in D, standard linear analysis methods can be used to prove that ė i (t) L in D from (5 4) and (5 6). Since e i (t) L, and r(t) L in D, Assumption 5-5 can be used along with (5 3)-(5 6) to conclude that x (i) (t) L, i =, 1,..., m in D. Since θ (t) L in D, (5 39) can be used to prove that ˆθ(t) L in D. Since x (i) (t) L, i =, 1,..., m in D, Assumption 5-2 can be used to conclude that G 1 ( ) and f( ) L in D. Thus, from (5 1) and Assumption 5-3, we can show that u(t) L in D. Therefore, u f (t) L in D, and hence, from (5 15), û f (t) L in D. Given that r(t) L in D, (5 51) can be used to show that µ 1 (t) L in D, and since Ġ 1 ( ) and f( ) L in D, (5 5) can be used to show that ṙ(t) L in D, and (5 4) can be used to show that ε(t) L in D. Since ė i (t) L, ṙ(t), and ε (t) L in D, the definitions for U(y) and z(t) can be used to prove that U(y) is uniformly continuous in D. Let S D denote a set defined as S ( {y(t) D U 2 (y(t)) < λ 1 (ρ 1 2 )) } 2 λ 3 k. (5 73) The region of attraction in (5 73) can be made arbitrarily large to include any initial conditions by increasing the control gain k (i.e., a semi-global stability result). Theorem 112

8.4 of [63] can now be invoked to state that

c‖[zᵀ εᵀ]ᵀ‖² → 0 as t → ∞,  ∀ y(0) ∈ S.  (5-74)

Based on the definition of z(t), (5-74) can be used to show that

‖e₁(t)‖ → 0 as t → ∞,  ∀ y(0) ∈ S,  (5-75)
‖ε(t)‖ → 0 as t → ∞,  ∀ y(0) ∈ S.

5.6 Experiment

As in Chapter 2, the testbed depicted in Figure 2-1 was used to implement the developed controller. The desired link trajectory is selected as follows (in degrees):

q_d(t) = 60.0 sin(1.2t)(1 − exp(−0.01t³)).  (5-76)

For all experiments, the rotor velocity signal is obtained by applying a standard backwards difference algorithm to the position signal. The integral structure for the RISE term in (5-48) was computed on-line via a standard trapezoidal algorithm. The parameter estimates were all initialized to zero. In practice, the adaptive estimates would be initialized to best-guess estimates of the parameter values; initializing them to zero was done to test a scenario of no parameter knowledge. In addition, all the states were initialized to zero. The following control gains and best-guess estimates were used to implement the controller in (5-47) and (5-48) in conjunction with the composite adaptive update law in (5-35), where the prediction error is generated from (5-15), (5-16), (5-36), and (5-37):

k₁ = 7, β₁ = 5, α₁ = 2, α₂ = 1, Γ = diag{4, 0.9, 0.9, 2},
k₂ = 4, β₂ = 1, ω = 1, γ₂ = 5, γ₃ = 1, γ₅ = 5.  (5-77)
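To show how the composite update (5-35) might be stepped forward in software alongside these gains, here is a minimal Euler-integration sketch; the regressor derivatives, filtered regressor derivative, tracking error, and prediction error passed in are placeholder values, and in an actual implementation the dependence on the unmeasurable r(t) would be removed using the integrated form in Remark 5.1.

    import numpy as np

    def composite_update_step(theta_hat, Gamma, Yd_dot, Ydf_dot, r, eps, dt):
        # Composite law (5-35): theta_hat_dot = Gamma*Yd_dot^T*r + Gamma*Ydf_dot^T*eps
        theta_dot = Gamma @ (Yd_dot.T @ r + Ydf_dot.T @ eps)
        return theta_hat + dt * theta_dot

    # Example with the adaptation gain quoted in (5-77) and placeholder signals
    Gamma = np.diag([4.0, 0.9, 0.9, 2.0])
    theta_hat = composite_update_step(np.zeros(4), Gamma,
                                      Yd_dot=np.random.randn(1, 4),
                                      Ydf_dot=np.random.randn(1, 4),
                                      r=np.array([0.01]), eps=np.array([0.005]),
                                      dt=0.001)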

Figure 5-2. Actual and desired trajectories for the proposed composite adaptive control law (RISE+CFF).

Figure 5-3. Tracking error for the proposed composite adaptive control law (RISE+CFF).

Figure 5-4. Prediction error for the proposed composite adaptive control law (RISE+CFF).

Figure 5-5. Control torque for the proposed composite adaptive control law (RISE+CFF).

Figure 5-6. Contribution of the RISE term in the proposed composite adaptive control law (RISE+CFF).

Figure 5-7. Adaptive estimates for the proposed composite adaptive control law (RISE+CFF).

Figure 5-8. Average RMS errors (degrees) and torques (N-m). 1- RISE, 2- RISE+FF, 3- RISE+CFF (proposed).

Discussion

Three different experiments were conducted to demonstrate the efficacy of the proposed controller. For each controller, the gains were not retuned (i.e., the common control gains remained the same for all controllers). First, no adaptation was used and the controller with only the RISE feedback was implemented. For the second experiment, the prediction error component of the update law in (5-35) was removed, resulting in a standard gradient-based update law (hereinafter denoted RISE+FF). For the third experiment, the proposed composite adaptive controller in (5-47)-(5-48) (hereinafter denoted RISE+CFF) was implemented. Figure 5-2 depicts the actual position compared with the desired trajectory for the RISE+CFF controller, while the tracking error and the prediction error are shown in Figure 5-3 and Figure 5-4, respectively. The control torque is shown in Figure 5-5, and the contribution of the RISE term to the overall torque is depicted in Figure 5-6. The contribution of the feedback RISE term decreases as the adaptive estimates converge, as shown in Figure 5-7. Each experiment was performed five times, and the average RMS error and torque values are shown in Figure 5-8, which indicates that the proposed RISE+CFF controller yields the lowest RMS error with a similar control effort.

5.7 Conclusion

A novel approach for the design of a gradient-based composite adaptive controller was proposed for generic MIMO systems subject to bounded disturbances. A model-based adaptive feedforward component was used in conjunction with the RISE feedback, where the adaptive estimates were generated using a composite update law driven by both the tracking error and the prediction error, with the motivation of using more information in the adaptive update law. To account for the effects of non-LP disturbances, the typical prediction error formulation was modified to include a second RISE-like term in the estimated filtered control input design. Using a Lyapunov stability analysis, sufficient gain conditions were derived under which the proposed controller yields semi-global asymptotic stability. The current development, as well as all previous RISE controllers, requires full-state feedback; the development of an output feedback result remains an open problem. Experiments on a rotating disk with externally applied friction indicate that the proposed method yields better tracking performance with a similar control effort.

CHAPTER 6
COMPOSITE ADAPTATION FOR NN-BASED CONTROLLERS

6.1 Introduction

This chapter presents the first attempt to develop a prediction error-based composite adaptive NN controller for an Euler-Lagrange second-order dynamic system using the recent continuous Robust Integral of the Sign of the Error (RISE) [19] technique that was originally developed in [21] and [22]. The RISE architecture is adopted since this method can accommodate C² disturbances and yield asymptotic stability. The RISE technique was used in [75] to prove the first asymptotic result for a NN-based controller using continuous feedback. In this chapter, the RISE feedback is used in conjunction with a NN feedforward element similar to [75]; however, unlike the typical tracking error-based gradient update law used in [75], the result in this chapter uses a composite update law driven by both the tracking error and the prediction error. To compensate for the effect of the NN reconstruction error, an innovative use of the RISE structure is also employed in the prediction error update (i.e., the filtered control input estimate). A block diagram indicating the unique use of the RISE method in the control and the prediction error formulation is provided in Figure 6-1. Sufficient gain conditions are derived using a Lyapunov-based stability analysis under which this unique double-RISE control strategy yields semi-global asymptotic stability of the system tracking errors and the prediction error, while all other signals and the control input are shown to be bounded. Since a multi-layer NN includes the first-layer weight estimate inside a nonlinear activation function, proving that the NN weight estimates are bounded is a challenging task; a projection algorithm is used to guarantee the boundedness of the weight estimates. However, if a single-layer NN is used instead, projection is not required and the weight estimates can be shown to be bounded via the stability analysis. The control development in this chapter can be easily simplified for a single-layer NN by choosing fixed first-layer weights.

A common concern about the RISE feedback is the presence of high-gain and high-frequency components in the control structure. However, in contrast to typical, widely used discontinuous high-gain sliding mode controllers, the RISE feedback offers a continuous alternative. Moreover, the proposed controller is not purely high-gain, since a multi-layer NN is used as a feedforward component that learns and incorporates knowledge of the system dynamics into the control structure.

6.2 Dynamic System

Consider a class of second-order nonlinear systems of the following form:

ẍ = f(x, ẋ) + G(x)u,  (6-1)

where x(t), ẋ(t) ∈ R^n are the system states, u(t) ∈ R^n is the control input, and f(x, ẋ) ∈ R^n and G(x) ∈ R^{n×n} are unknown nonlinear C² functions. The control development for the dynamic system in (6-1) can be easily extended to a second-order Euler-Lagrange system of the following form:

M(q)q̈ + V_m(q, q̇)q̇ + G(q) + F(q̇) = τ(t),

where M(q) ∈ R^{n×n} denotes the inertia matrix, V_m(q, q̇) ∈ R^{n×n} denotes the centripetal-Coriolis matrix, G(q) ∈ R^n denotes the gravity vector, F(q̇) ∈ R^n denotes friction, τ(t) ∈ R^n represents the torque input control vector, and q(t), q̇(t), q̈(t) ∈ R^n denote the link position, velocity, and acceleration vectors, respectively. Throughout the chapter, |·| denotes the absolute value of a scalar argument, ‖·‖ denotes the standard Euclidean norm for a vector or the induced infinity norm for a matrix, and ‖·‖_F denotes the Frobenius norm of a matrix. The following properties and assumptions will be exploited in the subsequent development.
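To make the remark about rewriting the Euler-Lagrange dynamics in the form of (6-1) concrete, the following sketch performs the rearrangement ẍ = f(x, ẋ) + G(x)u with f = −M⁻¹(V_m q̇ + G + F) and G = M⁻¹; the dynamics callbacks and numerical values are placeholders supplied only for illustration.

    import numpy as np

    def el_to_state_space(q, qdot, M_fn, Vm_fn, grav_fn, fric_fn):
        # Rearranges  M(q) qddot + Vm(q, qdot) qdot + grav(q) + fric(qdot) = tau
        # into the form of (6-1):  qddot = f(q, qdot) + G(q) tau
        Minv = np.linalg.inv(M_fn(q))
        f = -Minv @ (Vm_fn(q, qdot) @ qdot + grav_fn(q) + fric_fn(qdot))
        return f, Minv   # f(x, xdot) and G(x) = M(q)^(-1)

    # Example with a trivial 2-DOF placeholder model
    M_fn = lambda q: np.diag([2.0, 1.0])
    Vm_fn = lambda q, qd: np.zeros((2, 2))
    grav_fn = lambda q: np.zeros(2)
    fric_fn = lambda qd: 0.5 * qd
    f, G = el_to_state_space(np.zeros(2), np.array([0.1, -0.2]),
                             M_fn, Vm_fn, grav_fn, fric_fn)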

Assumption 6-1: G(·) is symmetric and positive definite, and satisfies the following inequality for all ξ(t) ∈ R^n:

g‖ξ‖² ≤ ξᵀG⁻¹ξ ≤ ḡ(x)‖ξ‖²,  (6-2)

where g ∈ R is a known positive constant, and ḡ(x) ∈ R is a known positive function.

Assumption 6-2: The functions G⁻¹(·) and f(·) are second-order differentiable such that G⁻¹(·), Ġ⁻¹(·), G̈⁻¹(·), f(·), ḟ(·), f̈(·) ∈ L∞ if x^{(i)}(t) ∈ L∞, i = 0, 1, 2, 3, where (·)^{(i)}(t) denotes the i-th derivative with respect to time.

Assumption 6-3: The desired trajectory x_d(t) ∈ R^n is designed such that x_d^{(i)}(t) ∈ L∞, i = 0, 1, ..., 4, with known bounds.

6.3 Control Objective

The objective is to design a continuous composite adaptive NN controller which ensures that the system state x(t) tracks a desired time-varying trajectory x_d(t) despite uncertainties in the dynamic model. To quantify this objective, a tracking error, denoted by e₁(t) ∈ R^n, is defined as

e₁ ≜ x_d − x.  (6-3)

To facilitate the subsequent analysis, filtered tracking errors, denoted by e₂(t), r(t) ∈ R^n, are also defined as

e₂ ≜ ė₁ + α₁e₁,  (6-4)
r ≜ ė₂ + α₂e₂,  (6-5)

where α₁, α₂ ∈ R denote positive constants. The subsequent development is based on the assumption that the system states x(t), ẋ(t) are measurable. Hence, the filtered tracking error r(t) is not measurable since the expression in (6-5) depends on ẍ(t).

6.4 Control Development

The open-loop tracking error system is developed by premultiplying (6-5) by G⁻¹(·) and utilizing the expressions in (6-1), (6-3), and (6-4) as

G⁻¹r = ψ + S₁ − u.  (6-6)

In (6-6), ψ(x_d, ẋ_d, ẍ_d) ∈ R^n is defined as

ψ ≜ G_d⁻¹ẍ_d − G_d⁻¹f_d,  (6-7)

where the functions G_d⁻¹(x_d) ∈ R^{n×n} and f_d(x_d, ẋ_d) ∈ R^n are defined as

G_d⁻¹ ≜ G⁻¹(x_d),  f_d ≜ f(x_d, ẋ_d).  (6-8)

Also in (6-6), the auxiliary function S₁(x, ẋ, t) ∈ R^n is defined as

S₁ ≜ G⁻¹α₂e₂ + G⁻¹ẍ_d − G_d⁻¹ẍ_d − G⁻¹f + G_d⁻¹f_d + G⁻¹α₁ė₁.  (6-9)

The unknown dynamics in (6-7) can be represented by a three-layer NN [15, 32] using the universal approximation property, as described in Section 3.4 of Chapter 3:

ψ = Wᵀσ(Vᵀx̄_d) + ε(x̄_d).  (6-10)

Based on (6-10), the typical three-layer NN approximation for ψ(x̄_d) is given as [15, 32]

ψ̂ ≜ Ŵᵀσ(V̂ᵀx̄_d),  (6-11)

where V̂(t) ∈ R^{(N₁+1)×N₂} and Ŵ(t) ∈ R^{(N₂+1)×n} are subsequently designed estimates of the ideal weight matrices. The estimate mismatches for the ideal weight matrices, denoted by Ṽ(t) ∈ R^{(N₁+1)×N₂} and W̃(t) ∈ R^{(N₂+1)×n}, are defined as

Ṽ ≜ V − V̂,  W̃ ≜ W − Ŵ,

and the mismatch for the hidden-layer output error for a given x̄_d(t), denoted by σ̃(x̄_d) ∈ R^{N₂+1}, is defined as

σ̃ ≜ σ − σ̂ = σ(Vᵀx̄_d) − σ(V̂ᵀx̄_d).  (6-12)

Property 6-1: (Taylor Series Approximation) The Taylor series expansion of σ(Vᵀx̄_d) for a given x̄_d(t) may be written as [15, 32]

σ(Vᵀx̄_d) = σ(V̂ᵀx̄_d) + σ′(V̂ᵀx̄_d)Ṽᵀx̄_d + O(Ṽᵀx̄_d)²,  (6-13)

where σ′(V̂ᵀx̄_d) ≜ dσ(Vᵀx̄_d)/d(Vᵀx̄_d)|_{Vᵀx̄_d = V̂ᵀx̄_d}, and O(Ṽᵀx̄_d)² denotes the higher-order terms. After substituting (6-13) into (6-12), the following expression can be obtained:

σ̃ = σ̂′Ṽᵀx̄_d + O(Ṽᵀx̄_d)²,  (6-14)

where σ̂′ ≜ σ′(V̂ᵀx̄_d). Based on the open-loop error system in (6-6), the control input is composed of a NN estimate term plus the RISE feedback term as [75]

u ≜ ψ̂ + μ₁,  (6-15)

where ψ̂(t) ∈ R^n denotes the subsequently designed, prediction error-based NN feedforward term, which is the first of its kind. In (6-15), μ₁(t) ∈ R^n denotes the RISE feedback term defined as [21, 22, 75]

μ₁(t) ≜ (k₁ + 1)e₂(t) − (k₁ + 1)e₂(0) + ∫₀ᵗ {(k₁ + 1)α₂e₂(σ) + β₁sgn(e₂(σ))}dσ,  (6-16)

where k₁, β₁ ∈ R are positive constant control gains, α₂ ∈ R was introduced in (6-5), and sgn(·) denotes the signum function defined as

sgn(e₂) ≜ [sgn(e₂₁) sgn(e₂₂) ... sgn(e₂ᵢ) ... sgn(e₂ₙ)]ᵀ.
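For illustration, the sketch below evaluates the two pieces of the control law (6-15): a three-layer NN feedforward of the form (6-11) and a RISE term implemented by trapezoidal integration of the bracketed integrand in (6-16). The layer sizes, sigmoid choice, gains, and sample period are placeholder assumptions rather than the values used later in the experiments.

    import numpy as np

    def nn_feedforward(W_hat, V_hat, xd_bar):
        # psi_hat = W_hat^T sigma(V_hat^T xd_bar), as in (6-11); sigmoid activation
        z = V_hat.T @ xd_bar
        sigma = 1.0 / (1.0 + np.exp(-z))
        sigma_aug = np.concatenate(([1.0], sigma))   # hidden-layer bias entry
        return W_hat.T @ sigma_aug

    class RiseTerm:
        # Trapezoidal implementation of mu_1 in (6-16)
        def __init__(self, k1, beta1, alpha2, e2_initial):
            self.k1, self.beta1, self.alpha2 = k1, beta1, alpha2
            self.e2_0 = np.asarray(e2_initial, dtype=float)
            self.integral = np.zeros_like(self.e2_0)
            self.prev = None

        def update(self, e2, dt):
            e2 = np.asarray(e2, dtype=float)
            integrand = (self.k1 + 1.0) * self.alpha2 * e2 + self.beta1 * np.sign(e2)
            if self.prev is not None:
                self.integral += 0.5 * dt * (integrand + self.prev)
            self.prev = integrand
            return (self.k1 + 1.0) * (e2 - self.e2_0) + self.integral

    # Example: n = 2 outputs, 7 NN inputs (with leading 1), 10 hidden neurons
    rng = np.random.default_rng(1)
    W_hat, V_hat = np.zeros((11, 2)), rng.uniform(-1.0, 1.0, (7, 10))
    rise = RiseTerm(k1=7.0, beta1=5.0, alpha2=1.0, e2_initial=[0.0, 0.0])
    xd_bar = np.concatenate(([1.0], rng.standard_normal(6)))
    e2 = np.array([0.02, -0.01])
    u = nn_feedforward(W_hat, V_hat, xd_bar) + rise.update(e2, dt=0.001)   # (6-15)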

124 The closed-loop tracking error system can be developed by substituting (6 15) into (6 6) as G 1 r = ψ ˆψ + S 1 µ 1. (6 17) To facilitate the subsequent composite adaptive control development and stability analysis, (6 1) and (6 11) are used to express the time derivative of (6 17) as G 1 ṙ = Ġ 1 r + W T σ ( V T x d ) V T x d Ŵ T σ (ˆV T x d )ˆV T x d + ε + Ṡ1 µ 1. Ŵ T σ(ˆv T x d ) Ŵ T σ (ˆV T x d ) ˆV T x d (6 18) After adding and subtracting the terms W T ˆσ ˆV T x d + Ŵ T ˆσ Ṽ T x d to (6 18), the following expression can be obtained: G 1 ṙ = Ġ 1 r + Ŵ T ˆσ Ṽ T x d + W T ˆσ ˆV T x d + W T σ V T x d W T ˆσ ˆV T x d (6 19) Ŵ T ˆσ Ṽ T x d + Ṡ1 Ŵ T ˆσ Ŵ T ˆσ ˆV T x d + ε µ 1 ) ) where the notations ˆσ (ˆV T x d and ˆσ (ˆV T x d are introduced in (6 12) and (6 14), respectively Swapping In this section, the swapping procedure is used to generate a measurable form of a prediction error that relates to the function mismatch error (i.e., ψ (t) ˆψ (t)). A measurable form of the prediction error η (t) R n is defined as the difference between a filtered control input u f (t) R n and an estimated filtered control input û f (t) R n as η u f û f (6 2) where the filtered control input is generated from the stable first order differential equation u f + ωu f = ωu, u f () = (6 21) 124

125 where ω R is a known positive constant. The differential equation in (6 21) can be expressed as a convolution as u f = v u (6 22) where is used to denote the standard convolution operation, and the scalar function v (t) is defined as v ωe ωt. (6 23) Using (6 1), the expression in (6 22) can be rewritten as u f = v ( G 1 ẍ G 1 f ). (6 24) The construction of a NN-based controller to approximate the unknown system dynamics in (6 24) will inherently result in a residual function reconstruction error ε ( x d ). The presence of the reconstruction error has been the technical obstacle that has prevented the development of composite adaptation laws for NNs. To compensate for the effects of the reconstruction error, the typical prediction error formulation is modified to include a RISE-like structure in the design of the estimated filtered control input. With this motivation, the open-loop prediction error system is engineered to facilitate the RISE-based design of the estimated filtered control input. Adding and subtracting the term v ( G 1 d ẍd + G 1 d f d) to the expression in (6 24), and using (6 7) yields where S(x, ẋ, ẍ), S d (x d, ẋ d, ẍ d ) R n are defined as u f = v (ψ + S S d ) (6 25) S G 1 ẍ G 1 f (6 26) The expression in (6 25) is further simplified as S d G 1 d ẍd G 1 d f d. (6 27) u f = v ψ + v S v S d. (6 28) 125

126 The term v S(x, ẋ, ẍ) R n in (6 28) depends on ẍ (t). Using the following property of convolution [57]: g 1 ġ 2 = ġ 1 g 2 + g 1 ()g 2 g 1 g 2 () (6 29) an expression independent of ẍ (t) can be obtained as v S = S f + D (6 3) where the state-dependent terms are included in the auxiliary function S f (x, ẋ) R n, defined as S f v ( G 1 ẋ ) + v () G 1 ẋ v Ġ 1 ẋ v G 1 f (6 31) and the terms that depend on the initial states are included in D (t) R n, defined as D vg 1 (x ()) ẋ (). (6 32) Similarly, the expression v S d (x d, ẋ d, ẍ d ) in (6 28) is evaluated as v S d = S df + D d (6 33) where S df (x d, ẋ d ) R n is defined as S df v (G 1 d and D d (t) R n is defined as ẋ d ) + v ()G 1 d ẋ d v Ġ 1 d ẋ d v G 1 d f d (6 34) D d vg 1 d (x d ())ẋ d (). (6 35) Substituting (6 3)-(6 35) into (6 28), and then substituting the resulting expression into (6 2) yields η = v ψ + S f S df + D D d û f. (6 36) 126

127 Based on (6 36) and the subsequent analysis, the filtered control input estimate is designed as û f = ˆψ f + µ 2 (6 37) where the filtered NN estimate ˆψ f (t) R n is generated by from the stable first order differential equation ˆψ f + ω ˆψ f = ω ˆψ, ˆψf () = which can be expressed as a convolution ˆψ f = v ˆψ. In (6 37), µ 2 (t) R n is a RISE-like term defined as µ 2 (t) t [k 2 η(σ) + β 2 sgn(η(σ))]dσ (6 38) where k 2, β 2 R denote constant positive control gains. In a typical prediction error formulation, the estimated filtered control input is designed to include just the first term ˆψ f (t) in (6 37). But as discussed earlier, due to the presence of the NN reconstruction error, the unmeasurable form of the prediction error in (6 36) also includes the filtered reconstruction error. Hence, the estimated filtered control input is augmented with an additional RISE-like term µ 2 (t) to cancel the effects of reconstruction error in the prediction error measurement as illustrated in Figure 6-1 and the subsequent design and stability analysis. Substituting (6 37) into (6 36) yields the following closed-loop prediction error system: η = v ( ψ ˆψ ) + S f S df + D D d µ 2. (6 39) To facilitate the subsequent composite adaptive control development and stability analysis, the time derivative of (6 39) is expressed as η = v ( ψ ˆψ ) ( + ω ψ ˆψ ) + Ṡf Ṡdf + Ḋ Ḋd µ 2 (6 4) 127

128 Figure 6-1. Block diagram of the proposed RISE-based composite NN controller. where the property d dt (6 11) into (6 4) yields (f g) = ( f g) (t) + f () g (t) was used. Substituting (6 1) and η = v ( ) ( ) W T σ(v T x d ) + ε Ŵ T σ(ˆv T x d ) + ω W T σ(v T x d ) + ε Ŵ T σ(ˆv T x d ) (6 41) + Ṡf Ṡdf + Ḋ Ḋd µ 2. Adding and subtracting v (t) (Ŵ T σ + W T ˆσ) + ω(ŵ T σ + W T ˆσ) to (6 41) and rearranging the terms yields η = v (Ŵ T σ + W T ˆσ + W ) T σ + ε + ω (Ŵ T σ + W T ˆσ + W ) T σ + ε (6 42) + Ṡf Ṡdf + Ḋ Ḋd µ 2. By using the Taylor series expansion in (6 13), (6 42) can be expressed as ( ) ( ) ) 2 η = v W T ˆσ + Ŵ T ˆσ Ṽ T x d + v Ŵ T O (Ṽ T x d + W T σ + ε (6 43) ( ) ( ) ) 2 + ω W T ˆσ + Ŵ T ˆσ Ṽ T x d + ω Ŵ T O (Ṽ T x d + W T σ + ε + Ṡf Ṡdf + Ḋ Ḋd µ

129 Using the commutativity and distributivity property of the convolution, and rearranging the terms in (6 43) yields η = W T (ˆσ v + ωˆσ) + Ŵ T ˆσ Ṽ T ( x d v + ω x d ) + Ñ2 + N 2B k 2 η β 2 sgn(η) (6 44) where the fact that µ 2 = k 2 η + β 2 sgn(η) (6 45) was utilized. In (6 44), the unmeasurable/unknown auxiliary term Ñ2(e 1, e 2, r, t) R n is defined as and the term N 2B (t) R n is defined as N 2B Ḋ Ḋd + v Ñ 2 Ṡf Ṡdf (6 46) ( ) ) ( 2 ) ) 2 Ŵ T O (Ṽ T x d + W T σ + ε + ω Ŵ T O (Ṽ T x d + W T σ + ε. In a similar manner as in Lemma 1 of the Appendix, by applying the Mean Value Theorem can be used to develop the following upper bound for the expression in (6 46): (6 47) Ñ2(t) ρ 2 ( z ) z (6 48) where the bounding function ρ 2 ( ) R is a positive, globally invertible, nondecreasing function, and z(t) R 3n is defined as z(t) [ e T 1 e T 2 r T] T. (6 49) Using Assumption 6-3, and the fact that v (t) is a linear, strictly proper, exponentially stable transfer function, the following inequality can be developed based on the expression in (6 47) with a similar approach as in Lemma 2 of [42]: N 2B (t) ξ (6 5) where ξ R is a known positive constant. 129

6.4.2 Composite Adaptation

The composite adaptation for the NN weight estimates is given by

Ŵ̇ ≜ Γ₁ proj(α₂σ̂′V̂ᵀx̄_d e₂ᵀ + σ̂_f ηᵀ),  (6-51)
V̂̇ ≜ Γ₂ proj(α₂x̄_d e₂ᵀŴᵀσ̂′ + x̄_{df} ηᵀŴᵀσ̂′),  (6-52)

where Γ₁ ∈ R^{(N₂+1)×(N₂+1)} and Γ₂ ∈ R^{(N₁+1)×(N₁+1)} are constant, positive-definite, symmetric control gain matrices, and proj(·) denotes a smooth projection operator (see [7, 75]) that is used to ensure that Ŵ(t) and V̂(t) remain inside a bounded convex region. Also, η(t) ∈ R^n denotes the measurable prediction error defined in (6-20), and the scalar function v(t) was defined in (6-23). The filtered activation function σ̂_f(t) ∈ R^{N₂+1} is generated from the stable first-order differential equation

σ̂̇_f + ωσ̂_f = ωσ̂,  σ̂_f(0) = 0,  (6-53)

while the filtered NN input vector x̄_{df}(t) ∈ R^{3n+1} is generated by

ẋ̄_{df} + ωx̄_{df} = ωx̄_d,  x̄_{df}(0) = 0.  (6-54)

In a typical NN weight adaptation law, only the system tracking errors are used to update the weights, and no information about the actual estimate mismatch (i.e., ψ(t) − ψ̂(t)) is utilized, since the mismatch is unmeasurable and hence cannot be used in the control implementation. The proposed method uses the swapping procedure in Section 6.4.1 to generate a measurable prediction error η(t) that contains information related to the estimate mismatch error.
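As a rough sketch of how such a composite NN update could be stepped in software, the code below applies one Euler step of a tuning law with the same two-term structure as (6-51)-(6-52) (a tracking-error term plus a prediction-error term). The dimensions, gains, filtered signals, and the omission of the projection operator are all simplifying assumptions of the sketch.

    import numpy as np

    def composite_nn_step(W_hat, V_hat, Gam1, Gam2, alpha2,
                          sig_prime, xd_bar, e2, sig_f, xdf_bar, eta, dt):
        # Tracking-error-driven term plus prediction-error-driven term,
        # mirroring the structure of (6-51)-(6-52); projection omitted.
        W_dot = Gam1 @ (alpha2 * sig_prime @ V_hat.T @ np.outer(xd_bar, e2)
                        + np.outer(sig_f, eta))
        V_dot = Gam2 @ (alpha2 * np.outer(xd_bar, e2) @ W_hat.T @ sig_prime
                        + np.outer(xdf_bar, eta) @ W_hat.T @ sig_prime)
        return W_hat + dt * W_dot, V_hat + dt * V_dot

    # Example shapes: 7 NN inputs, 10 hidden neurons, n = 2 outputs
    rng = np.random.default_rng(2)
    W_hat, V_hat = np.zeros((11, 2)), rng.uniform(-1.0, 1.0, (7, 10))
    z = V_hat.T @ np.ones(7)
    s = 1.0 / (1.0 + np.exp(-z))
    sig_prime = np.vstack([np.zeros((1, 10)), np.diag(s * (1.0 - s))])  # (11, 10)
    W_hat, V_hat = composite_nn_step(
        W_hat, V_hat, np.eye(11), np.eye(7), alpha2=1.0,
        sig_prime=sig_prime, xd_bar=np.ones(7), e2=np.array([0.01, -0.02]),
        sig_f=np.concatenate(([1.0], s)), xdf_bar=np.ones(7),
        eta=np.array([0.005, 0.0]), dt=0.001)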

131 The projection used in the NN weight adaptation laws in (6 51) and (6 52) can be decomposed into two terms as 1 Ŵ = χ W η + χ W e 2, ˆV = χ V η + χ V e 2 (6 55) ) such that the auxiliary functions χ W η (ˆσ f, η), χ W e 2 (ˆV, xd, x d, e 2 R (N2+1) n and ( x df, Ŵ, ˆV ) ), η, χ V e 2 (Ŵ, ˆV, xd, x d, e 2 R (N 1+1) N 2 satisfy the following bounds χ V η χ W η b1 η, χ V η b 1 η, χ V e2 b 2 e 2 χ W e2 b2 e 2 (6 56) where b 1, b 2, b 1, and b 2 R are known positive constants. To facilitate the subsequent stability analysis, the following inequality is developed based on (6 56) and the fact that the NN weight estimates are bounded by the smooth projection algorithm: χ W η ˆσ + Ŵ T ˆσ χ V η x d c 1 η (6 57) where c 1 R is a positive constant Closed-Loop Error System Substituting for Ŵ (t) and ˆV (t) from (6 55), the expression in (6 19) can be rewritten as G 1 ṙ = 1 2Ġ 1 r χ W η ˆσ Ŵ T ˆσ χ V η x d + Ñ1 + N 1 (k 1 + 1)r β 1 sgn(e 2 ) e 2 (6 58) where the fact that the time derivative of (6 16) is given as µ 1 = (k 1 + 1)r + β 1 sgn(e 2 ) (6 59) 1 See Lemma 5 of the Appendix for the proof of the decomposition in (6 55) and the inequalities in (6 56). 131

132 was utilized. In (6 58), the unmeasurable/unknown auxiliary terms Ñ1(e 1, e 2, r, t) and N 1 (t) R n are defined as Ñ 1 1 2Ġ 1 r + Ṡ1 + e 2 χ W e 2 ˆσ Ŵ T ˆσ χ V e 2 x d (6 6) In (6 61), N d ( x d, x d, t) R n is defined as while N 1B (Ŵ, ˆV, x d, x d, t) R n is further segregated as N 1 N d + N 1B. (6 61) N d W T σ V T x d + ε (6 62) where N 1Ba (Ŵ, ˆV, x d, x d, t) R n is defined as and the term N 1Bb (Ŵ, ˆV, x d, x d, t) R n is defined as N 1B N 1Ba + N 1Bb (6 63) N 1Ba W T ˆσ ˆV T x d Ŵ T ˆσ Ṽ T x d (6 64) N 1Bb Ŵ T ˆσ Ṽ T x d + W T ˆσ ˆV T x d. (6 65) Motivation for segregating the terms in (6 61) is derived from the fact that the different components in (6 61) have different bounds. Segregating the terms as in (6 61)-(6 65) facilitates the development of the NN weight update laws and the subsequent stability analysis. For example, the terms in (6 62) are grouped together because the terms and their time derivatives can be upper bounded by a constant and rejected by the RISE feedback, whereas the terms grouped in (6 63) can be upper bounded by a constant but their derivatives are state dependent. The state dependency of the time derivatives of the terms in (6 63) violates the assumptions given in previous RISE-based controllers (e.g., [19, 2, 22, 23, 25 27]), and requires additional consideration in the adaptation law design and stability analysis. The terms in (6 63) are further segregated because 132

133 N 1Ba (Ŵ, ˆV, x d, x d ) will be rejected by the RISE feedback, whereas N 1Bb (Ŵ, ˆV, x d, x d ) will be partially rejected by the RISE feedback and partially canceled by the adaptive update law for the NN weight estimates. In a similar manner as in (6 48), the following upper bound is developed for the expression in (6 6): Ñ1(t) ρ 1 ( z ) z (6 66) where the bounding function ρ 1 ( ) R is a positive, globally invertible, nondecreasing function, and z(t) R 3n was defined in (6 49). The following inequalities can be developed based on Assumption 6-3, (3 7), (3 8), (3 16), (6 63)-(6 65): N d ζ 1, N 1Ba ζ 2, N 1Bb ζ 3, Ṅd ζ 4. (6 67) From (6 61), (6 63) and (6 67), the following bound can be developed N 1 N d + N 1B N d + N 1Ba + N 1Bb ζ 1 + ζ 2 + ζ 3. (6 68) By using (6 51) and (6 52), the time derivative of N 1B (Ŵ, ˆV, x d ) can be bounded as Ṅ1B ζ 5 + ζ 6 e 2 + ζ 7 η. (6 69) In (6 67) and (6 69), ζ i R, (i = 1, 2,..., 7) are known positive constants. For the subsequent stability analysis, let D R 4n+3 be a domain containing y(t) =, where y(t) R 4n+3 is defined as y [z T η T P1 P2 Q] T. (6 7) In (6 7), the auxiliary function P 1 (t) R is defined as n P 1 (t) β 1 e 2i () e 2 () T N 1 () i=1 t L 1 (τ)dτ (6 71) 133

134 where e 2i () R denotes the ith element of the vector e 2 (), and the auxiliary function L 1 (t) R is defined as L 1 r T (N 1Ba + N d β 1 sgn(e 2 )) + ė T 2 N 1Bb β 3 e 2 2 β 4 η 2 (6 72) where β 1, β 3, β 4 R are positive constants chosen according to the sufficient conditions { β 1 > max ζ 1 + ζ 2 + ζ 3, ζ 1 + ζ 2 + ζ 4 + ζ } 5, β 3 > ζ 6 + ζ 7 α 2 α 2 2, β 4 > ζ 7 2 where ζ 1 ζ 7 were introduced in (6 67) and (6 69). Provided the sufficient conditions introduced in (6 73) are satisfied, the following inequality is obtained [22], [2]: t (6 73) n L 1 (τ)dτ β 1 e 2i () e 2 () T N 1 (). (6 74) i=1 Hence, (6 74) can be used to conclude that P 1 (t). Also in (6 7), the auxiliary function P 2 (t) R is defined as where the auxiliary function L 2 (t) R is defined as t P 2 (t) L 2 (τ)dτ (6 75) L 2 η T (N 2B β 2 sgn(η)) (6 76) where β 2 R is a positive constant chosen according to the sufficient condition β 2 > ξ (6 77) where ξ was introduced in (6 5). Provided the sufficient condition introduced in (6 77) is satisfied, then P 2 (t). The auxiliary function Q(t) R in (6 7) is defined as Q(t) 1 2 tr ( W T Γ 1 1 ) W + 1 ) (Ṽ 2 tr T Γ 1 2 Ṽ. (6 78) Since Γ 1 and Γ 2 are constant, symmetric, and positive definite matrices, it is straightforward that Q(t). 134

6.5 Stability Analysis

Theorem 6-1: The controller given in (6-15) and (6-16), in conjunction with the composite NN adaptation laws in (6-51) and (6-52), where the prediction error is generated from (6-20), (6-21), (6-37), and (6-38), ensures that all system signals are bounded under closed-loop operation and that the position tracking error and the prediction error are regulated in the sense that

    ‖e₁(t)‖ → 0  and  ‖η(t)‖ → 0  as  t → ∞

provided the control gains k₁ and k₂ introduced in (6-16) and (6-38) are selected sufficiently large based on the initial conditions of the system states (see the subsequent semi-global stability proof), the sufficient conditions in (6-73) and (6-77) are satisfied, and the following conditions are satisfied:

    α₁ > ½,   α₂ > β₃ + ½    (6-79)

where the gains α₁ and α₂ were introduced in (6-4) and (6-5).

Proof: Let V_L(y, t) : D × [0, ∞) → ℝ be a continuously differentiable, positive definite function defined as

    V_L ≜ ½e₁ᵀe₁ + ½e₂ᵀe₂ + ½rᵀG⁻¹r + ½ηᵀη + P₁ + P₂ + Q    (6-80)

which satisfies the inequalities

    U₁(y) ≤ V_L(y, t) ≤ U₂(y)    (6-81)

provided the sufficient conditions introduced in (6-73) and (6-77) are satisfied. In (6-81), the continuous positive definite functions U₁(y), U₂(y) ∈ ℝ are defined as U₁(y) ≜ λ₁‖y‖²

and U₂(y) ≜ λ₂(x)‖y‖², where λ₁, λ₂(x) ∈ ℝ are defined as

    λ₁ ≜ ½ min{1, g̲},   λ₂(x) ≜ max{½ḡ(x), 1}    (6-82)

where g̲, ḡ(x) are introduced in (6-2). Using (6-4), (6-5), (6-44), (6-58), (6-71), (6-72), (6-75) and (6-76), the time derivative of (6-80) can be expressed as

    V̇_L = e₁ᵀ(e₂ − α₁e₁) + e₂ᵀ(r − α₂e₂)
          + rᵀ(−½Ġ⁻¹r − χ_W‖η‖σ̂ − Ŵᵀσ̂′χ_V‖η‖ẋ_d + Ñ₁ + N₁ − (k₁ + 1)r − β₁sgn(e₂) − e₂) + ½rᵀĠ⁻¹r
          + ηᵀ(W̃ᵀσ̂_f + Ŵᵀσ̂′Ṽᵀẋ_df + Ñ₂ + N_2B − k₂η − β₂sgn(η))
          − rᵀ(N_1Ba + N_d − β₁sgn(e₂)) − ė₂ᵀN_1Bb + β₃‖e₂‖² + β₄‖η‖² − ηᵀ(N_2B − β₂sgn(η))
          − tr(W̃ᵀΓ₁⁻¹Ŵ̇) − tr(ṼᵀΓ₂⁻¹V̂̇).    (6-83)

Expanding the terms in (6-83) and using (6-61) and (6-63) yields

    V̇_L = e₁ᵀe₂ − α₁e₁ᵀe₁ + e₂ᵀr − α₂e₂ᵀe₂ + rᵀÑ₁ − ½rᵀĠ⁻¹r − rᵀχ_W‖η‖σ̂ − rᵀŴᵀσ̂′χ_V‖η‖ẋ_d
          + rᵀ(N_d + N_1Ba) + (ė₂ + α₂e₂)ᵀN_1Bb − (k₁ + 1)rᵀr − β₁rᵀsgn(e₂) − rᵀe₂
          + ηᵀW̃ᵀσ̂_f + ηᵀŴᵀσ̂′Ṽᵀẋ_df − ηᵀN_2B + ηᵀÑ₂ + ηᵀN_2B − k₂ηᵀη − β₂ηᵀsgn(η)
          − rᵀ(N_1Ba + N_d − β₁sgn(e₂)) − ė₂ᵀN_1Bb + β₃‖e₂‖² + β₄‖η‖² + β₂ηᵀsgn(η) + ½rᵀĠ⁻¹r
          − tr(W̃ᵀΓ₁⁻¹Ŵ̇) − tr(ṼᵀΓ₂⁻¹V̂̇).    (6-84)

Canceling the common terms in (6-84), substituting for N_1Bb from (6-65), and using the fact that aᵀb = tr(baᵀ), V̇_L(y, t) is expressed as

    V̇_L = e₁ᵀe₂ − α₁e₁ᵀe₁ − α₂e₂ᵀe₂ − k₂ηᵀη − rᵀχ_W‖η‖σ̂ − rᵀŴᵀσ̂′χ_V‖η‖ẋ_d + rᵀÑ₁
          − (k₁ + 1)rᵀr + β₃‖e₂‖² + β₄‖η‖² + ηᵀÑ₂
          + tr(α₂W̃ᵀσ̂′V̂ᵀẋ_d e₂ᵀ) + tr(W̃ᵀσ̂_f ηᵀ) + tr(α₂Ṽᵀẋ_d(σ̂′ᵀŴe₂)ᵀ) + tr(Ṽᵀẋ_df(σ̂′ᵀŴη)ᵀ)
          − tr(W̃ᵀΓ₁⁻¹Ŵ̇) − tr(ṼᵀΓ₂⁻¹V̂̇).    (6-85)
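A step used repeatedly in passing from (6-84) to (6-85) is the scalar/trace identity aᵀb = tr(abᵀ) = tr(baᵀ). Written out once for a representative cross term (using only quantities already defined above):

    \[
    \eta^{T}\tilde{W}^{T}\hat{\sigma}_{f}
    =\operatorname{tr}\!\bigl(\eta^{T}\tilde{W}^{T}\hat{\sigma}_{f}\bigr)
    =\operatorname{tr}\!\bigl(\tilde{W}^{T}\hat{\sigma}_{f}\,\eta^{T}\bigr),
    \]

since the quantity is a scalar and the trace is invariant under cyclic permutation. The same manipulation converts each remaining cross term into the trace form appearing in (6-85), which is what allows the update laws (6-51) and (6-52) to cancel them.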

Substituting the update laws from (6-51) and (6-52) in (6-85), canceling the similar terms, and using the fact that e₁ᵀe₂ ≤ ½(‖e₁‖² + ‖e₂‖²), the expression in (6-85) is upper bounded as

    V̇_L ≤ −(α₁ − ½)‖e₁‖² − (α₂ − ½ − β₃)‖e₂‖² − ‖r‖² + c₁‖η‖‖r‖ + ‖r‖‖Ñ₁‖ − k₁‖r‖² + ‖η‖‖Ñ₂‖ − (k₂ − β₄)‖η‖²

where (6-57) was used. Using the squares of the components of z(t), V̇_L(y, t) is upper bounded as

    V̇_L ≤ −λ₃‖z‖² − k₁‖r‖² + ‖r‖‖Ñ₁‖ + c₁‖η‖‖z‖ + ‖η‖‖Ñ₂‖ − (k₂ − β₄)‖η‖²    (6-86)

where λ₃ ≜ min{α₁ − ½, α₂ − ½ − β₃, 1}. Letting k₂ = k_2a + k_2b, where k_2a, k_2b ∈ ℝ are positive constants, and using the inequalities in (6-48) and (6-66), the expression in (6-86) is upper bounded as

    V̇_L ≤ −λ₃‖z‖² − (k_2b − β₄)‖η‖² − [k₁‖r‖² − ρ₁(‖z‖)‖r‖‖z‖] − [k_2a‖η‖² − (ρ₂(‖z‖) + c₁)‖η‖‖z‖].    (6-87)

Completing the squares for the terms inside the brackets in (6-87) yields

    V̇_L ≤ −λ₃‖z‖² − (k_2b − β₄)‖η‖² + ρ₁²(‖z‖)‖z‖²/(4k₁) + (ρ₂(‖z‖) + c₁)²‖z‖²/(4k_2a)
        ≤ −λ₃‖z‖² + ρ²(‖z‖)‖z‖²/(4k) − (k_2b − β₄)‖η‖²    (6-88)

where k ∈ ℝ is defined as

    k ≜ min{k₁, k_2a}    (6-89)

and ρ(·) ∈ ℝ is a positive, globally invertible, nondecreasing function defined as

    ρ²(‖z‖) ≜ ρ₁²(‖z‖) + (ρ₂(‖z‖) + c₁)².
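The bracketed terms in (6-87) are handled with the standard completing-the-squares bound. Written out for the first bracket (the second bracket is identical with k_2a and ρ₂(‖z‖) + c₁ in place of k₁ and ρ₁(‖z‖)):

    \[
    -\Bigl[k_{1}\|r\|^{2}-\rho_{1}(\|z\|)\|r\|\|z\|\Bigr]
    =-k_{1}\Bigl(\|r\|-\tfrac{\rho_{1}(\|z\|)\|z\|}{2k_{1}}\Bigr)^{2}
    +\frac{\rho_{1}^{2}(\|z\|)\|z\|^{2}}{4k_{1}}
    \;\le\;\frac{\rho_{1}^{2}(\|z\|)\|z\|^{2}}{4k_{1}},
    \]

which is exactly the first residual term retained in (6-88).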

The expression in (6-88) can be further upper bounded by a continuous, positive semi-definite function

    V̇_L(y, t) ≤ −U(y) = −c‖[zᵀ ηᵀ]ᵀ‖²   ∀ y ∈ D    (6-90)

for some positive constant c ∈ ℝ, where D ≜ {y(t) ∈ ℝ⁴ⁿ⁺³ | ‖y‖ ≤ ρ⁻¹(2√(λ₃k))}. Larger values of k will expand the size of the domain D. The inequalities in (6-81) and (6-90) can be used to show that V_L(y, t) ∈ L∞ in D; hence, e₁(t), e₂(t), r(t), and η(t) ∈ L∞ in D. Given that e₁(t), e₂(t), and r(t) ∈ L∞ in D, standard linear analysis methods can be used to prove that ė₁(t) and ė₂(t) ∈ L∞ in D from (6-4) and (6-5). Since ė₁(t), ė₂(t), and r(t) ∈ L∞ in D, Assumption 6-3 can be used along with (6-3)-(6-5) to conclude that x⁽ⁱ⁾(t) ∈ L∞ in D. Since x⁽ⁱ⁾(t) ∈ L∞ in D, Assumption 6-2 can be used to conclude that G⁻¹(·) and f(·) ∈ L∞ in D. Thus, from (6-1) we can show that u(t) ∈ L∞ in D. Therefore, u_f(t) ∈ L∞ in D, and hence, from (6-20), û_f(t) ∈ L∞ in D. Given that r(t) ∈ L∞ in D, (6-59) can be used to show that µ₁(t) ∈ L∞ in D, and since Ġ⁻¹(·) and f(·) ∈ L∞ in D, (6-58) can be used to show that ṙ(t) ∈ L∞ in D, and (6-44) can be used to show that η̇(t) ∈ L∞ in D. Since ė₁(t), ė₂(t), ṙ(t), and η̇(t) ∈ L∞ in D, the definitions for U(y) and z(t) can be used to prove that U(y) is uniformly continuous in D. Let S ⊂ D denote a set defined as

    S ≜ {y(t) ⊂ D | U₂(y(t)) < λ₁(ρ⁻¹(2√(λ₃k)))²}.    (6-91)

The region of attraction in (6-91) can be made arbitrarily large to include any initial conditions by increasing the control gain k (i.e., a semi-global stability result).
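The semi-global condition is constructive: given a bound on ‖y(0)‖, the gain k is increased until y(0) lies in S, i.e., until U₂(y(0)) < λ₁(ρ⁻¹(2√(λ₃k)))². The sketch below computes the resulting threshold on k for an assumed affine bounding function ρ(s) = a + b·s and the simplification that U₂(y(0)) ≤ λ₂‖y(0)‖² with λ₂ treated as a constant; the constants a, b, λ₁, λ₂, λ₃ are placeholders, and the true ρ(·) is whatever results from (6-48) and (6-66).

    import math

    # Placeholder constants (not from this chapter): rho(s) = a + b*s and bounds lam1, lam2, lam3.
    a, b = 1.0, 2.0
    lam1, lam2, lam3 = 0.5, 2.0, 0.25

    def required_k(y0_norm_bound):
        """Threshold that k must strictly exceed so that every y(0) with
        ||y(0)|| <= y0_norm_bound lies in S of (6-91), under the stated assumptions."""
        s = y0_norm_bound * math.sqrt(lam2 / lam1)      # bound on the argument of rho
        return (a + b * s) ** 2 / (4.0 * lam3)          # from 2*sqrt(lam3*k) > rho(s)

    print(required_k(1.0))   # k must strictly exceed this value for ||y(0)|| <= 1.0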

Theorem 8.4 of [63] can now be invoked to state that

    c‖[zᵀ ηᵀ]ᵀ‖² → 0  as  t → ∞   ∀ y(0) ∈ S.    (6-92)

Based on the definition of z(t), (6-92) can be used to show that

    ‖e₁(t)‖ → 0  as  t → ∞   ∀ y(0) ∈ S,
    ‖η(t)‖ → 0  as  t → ∞   ∀ y(0) ∈ S.

6.6 Experiment

As in Chapter 2, the testbed depicted in Figure 2-1 was used to implement the developed controller. No external friction is applied to the circular disk. The desired link trajectory is selected as follows (in degrees):

    q_d(t) = 6.0 sin(3.0t)(1 − exp(−0.1t³)).    (6-93)

For all experiments, the rotor velocity signal is obtained by applying a standard backwards difference algorithm to the position signal. The integral structure for the RISE term in (6-16) was computed on-line via a standard trapezoidal algorithm. The NN input vector x_d(t) ∈ ℝ⁴ is defined as x_d = [1  q_d  q̇_d  q̈_d]ᵀ. The initial values of Ŵ(0) were chosen to be a zero matrix; however, the initial values of V̂(0) were selected randomly between −1.0 and 1.0 to provide a basis [71]. A different transient response could be obtained if the NN weights are initialized differently. Ten hidden layer neurons were chosen based on trial and error. In addition, all the states were initialized to zero. The following control gains were used to implement the controller in (6-15) in conjunction with the composite adaptive update laws in (6-51) and (6-52), where the prediction error is generated from (6-20), (6-21), (6-37), and (6-38):

    k₁ = 3,  β₁ = 1,  α₁ = 1,  α₂ = 1,  Γ₁ = 5I₁₁,  Γ₂ = 0.5I₄,  k₂ = 3,  β₂ = 1,  ω = 2.    (6-94)
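The implementation choices listed above (a backwards-difference velocity estimate, on-line trapezoidal integration for the RISE integral term, and an NN input vector built from the desired trajectory) translate directly into a discrete-time loop. The sketch below is a minimal Python illustration of those three pieces only; the sample period dt, the signal names, and the finite-difference construction of q̇_d and q̈_d are assumptions made for illustration (the desired derivatives could equally be computed analytically), and the control law (6-15)-(6-16) and update laws (6-51)-(6-52) themselves are not reproduced here. The trajectory constants follow (6-93) as printed.

    import numpy as np

    dt = 0.001  # assumed sample period (s)

    def q_desired(t):
        # Desired trajectory (6-93), in degrees, with the constants as printed in the text.
        return 6.0 * np.sin(3.0 * t) * (1.0 - np.exp(-0.1 * t**3))

    def backward_difference(q_curr, q_prev):
        # Standard backwards-difference estimate used for the rotor velocity signal.
        return (q_curr - q_prev) / dt

    class TrapezoidalIntegrator:
        # On-line trapezoidal rule, as used for the integral structure of the RISE term in (6-16).
        def __init__(self):
            self.value = 0.0
            self.prev_sample = 0.0
        def update(self, sample):
            self.value += 0.5 * dt * (sample + self.prev_sample)
            self.prev_sample = sample
            return self.value

    def nn_input(t):
        # NN input vector x_d = [1, q_d, q_d_dot, q_d_ddot]^T, with the derivatives
        # approximated here by finite differences of (6-93) for illustration.
        qd = q_desired(t)
        qd_dot = (q_desired(t) - q_desired(t - dt)) / dt
        qd_ddot = (q_desired(t) - 2.0 * q_desired(t - dt) + q_desired(t - 2.0 * dt)) / dt**2
        return np.array([1.0, qd, qd_dot, qd_ddot])

    print(q_desired(1.0), nn_input(1.0))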

Figure 6-2. Tracking error for the proposed composite adaptive control law (RISE+CNN).

Discussion

Three different experiments were conducted to demonstrate the efficacy of the proposed controller. The control gains were chosen to obtain an arbitrary tracking error accuracy (not necessarily the best performance), and for each controller, the gains were not retuned (i.e., the common control gains remain the same for all controllers). First, no adaptation was used and the controller with only the RISE feedback was implemented. For the second experiment, the prediction error component of the update laws in (6-51) and (6-52) was removed, resulting in a standard gradient-based NN update law (hereinafter denoted as RISE+NN). For the third experiment, the proposed composite adaptive controller in (6-15) (hereinafter denoted as RISE+CNN) was implemented. The tracking error and the prediction error are shown in Figure 6-2 and Figure 6-3, respectively.
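Because each controller is evaluated through repeated runs and compared by averaged RMS tracking error and RMS torque (Figure 6-5 below), the post-processing is a small computation. The sketch below shows that computation; the five-trial organization and the array shapes are assumptions about how the logged data are arranged, not a description of the actual experimental files.

    import numpy as np

    def rms(signal):
        # Root-mean-square value of a sampled signal.
        return float(np.sqrt(np.mean(np.square(signal))))

    def average_rms_over_trials(trials):
        # trials: list of 1-D arrays (one per trial) of tracking-error or torque samples.
        return float(np.mean([rms(tr) for tr in trials]))

    # Placeholder logged data: five trials of a sampled tracking-error signal (degrees).
    rng = np.random.default_rng(0)
    error_trials = [0.1 * rng.standard_normal(5000) for _ in range(5)]
    print(average_rms_over_trials(error_trials))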

Figure 6-3. Prediction error for the proposed composite adaptive control law (RISE+CNN).

Figure 6-4. Control torque for the proposed composite adaptive control law (RISE+CNN).

The control torque is shown in Figure 6-4. Each experiment was performed five times, and the average RMS error and torque values are shown in Figure 6-5, which indicate that the proposed RISE+CNN controller yields the lowest RMS error with a similar control effort.

Figure 6-5. Average RMS errors (degrees) and torques (N-m). 1- RISE, 2- RISE+NN, 3- RISE+CNN (proposed).

6.7 Conclusion

A gradient-based composite adaptive NN controller is developed for uncertain nonlinear systems for the first time. An NN feedforward component is used in conjunction with the RISE feedback, where the NN weight estimates are generated using a composite update law driven by both the tracking error and the prediction error, with the motivation of using more information in the NN update law. The construction of an NN-based controller to approximate the unknown system dynamics inherently results in a residual function reconstruction error. The presence of this reconstruction error has been the technical obstacle that prevented the development of composite adaptation laws for NNs. To compensate for the effects of the reconstruction error, the typical prediction error formulation is modified to include a RISE-like structure in the design of the estimated filtered control input. Using a Lyapunov stability analysis, sufficient gain conditions are derived under which the proposed controller yields semi-global asymptotic stability.
