Online Adaptive Approximate Optimal Tracking Control with Simplified Dual Approximation Structure for Continuous-time Unknown Nonlinear Systems


1 4 IEEE/CAA JOURNAL OF AUTOMATICA SINICA, VOL., NO. 4, OCTOBER 04 Online Adaptive Approximate Optimal Tracking Control with Simplified Dual Approximation Structure for Continuous-time Unknown Nonlinear Systems Jing Na Guido Herrmann Abstract This paper proposes an online adaptive approximate solution for the infinite-horizon optimal tracking control problem of continuous-time nonlinear systems with unknown dynamics. The requirement of the complete knowledge of system dynamics is avoided by employing an adaptive identifier in conjunction with a novel adaptive law, such that the estimated identifier weights converge to a small neighborhood of their ideal values. An adaptive steady-state controller is developed to maintain the desired tracking performance at the steady-state, and an adaptive optimal controller is designed to stabilize the tracking error dynamics in an optimal manner. For this purpose, a critic neural network NN) is utilized to approximate the optimal value function of the Hamilton-Jacobi-Bellman HJB) equation, which is used in the construction of the optimal controller. The learning of two NNs, i.e., the identifier NN and the critic NN, is continuous and simultaneous by means of a novel adaptive law design methodology based on the parameter estimation error. Stability of the whole system consisting of the identifier NN, the critic NN and the optimal tracking control is guaranteed using Lyapunov theory; convergence to a near-optimal control law is proved. Simulation results exemplify the effectiveness of the proposed method. Index Terms Adaptive control, optimal control, approximate dynamic programming, system identification. I. INTRODUCTION AMONG various modern control methodologies, optimal control has been well-recognized and successfully verified in some real-world applications, which is concerned with finding a control policy that drives a dynamical system to a desired reference in an optimal way, i.e., a prescribed cost function is minimized. In general, the optimal control Manuscript received July 7, 03; accepted March 4, 04. This work was supported by National Natural Science Foundation of China ). Recommended by Associate Editor Zhongsheng Hou Citation: Jing Na, Guido Herrmann. Online adaptive approximate optimal tracking control with simplified dual approximation structure for continuoustime unknown nonlinear systems. IEEE/CAA Journal of Automatica Sinica, 04, 4): 4 4 Jing Na is with the Faculty of Mechanical and Electrical Engineering, Kunming University of Science and Technology, , China najing5@63.com). Guido Herrmann is with the Department of Mechanical Engineering, University of Bristol, BS8 TR, UK g.herrmann@bristol.ac.uk). can be derived by using Pontryagin s minimum principle, or by solving the Hamilton-Jacobi-Bellman HJB) equation. Although mathematically elegant, traditional optimal control designs are obtained offline and impose the assumption on the complete knowledge of system dynamics. To allow for uncertainties in system dynamics, adaptive control 3 4 has been developed, where the unknown system parameters are online updated/estimated by using the tracking error, such that the tracking error convergence and the boundedness of the parameter estimates can be guaranteed. However, classical adaptive control methods are generally far from optimal. With the wish to achieve adaptive optimal control, one may add optimality features to an adaptive controller, i.e., to drive the adaptation by an optimality criterion. 
An alternative solution is to incorporate adaptive features into an optimal control design, e.g., improve the optimal control policy by means of the updated system parameters. Recently, a bioinspired method, reinforcement learning RL) 5 7, that was developed in the computational intelligence and machine learning societies, has provided a means to design adaptive controllers in an optimal manner. Considering the similarities between optimal control and RL, Werbos 8 introduced an RL-based actor-critic framework, called approximate dynamic programming ADP), where neural networks NNs) are trained to approximately solve the optimal control problem based on the named value iteration VI) method. A survey of ADPbased feedback control designs can be found in 9. The discrete/iterative nature of the ADP formulation lends itself naturally to the design of discrete-time DT) optimal control 3 5. However, the extension of the RL-based controllers to continuous-time CT) systems entails challenges in proving stability and convergence for a model-free algorithm that can be solved online. Some of the existing ADP algorithms for CT nonlinear systems lacked a rigorous stability analysis 6, 6. By incorporating NNs into the actorcritic structure, an offline method was proposed in 7 to find approximate solutions of optimal control for CT nonlinear systems. In, 8, an online integral RL technique was developed to find the optimal control for CT systems

2 NA AND HERRMANN: ONLINE ADAPTIVE APPROXIMATE OPTIMAL TRACKING CONTROL WITH 43 without using the system drift dynamics, which led to a hybrid continuous-time/discrete-time sampled data controller based on policy iteration PI) with a two time-scale actor critic learning process. This learning procedure was based on sequential updates of the critic policy evaluation) NN and actor policy improvement) NN. Thus, while one NN was tuned, the other one remained constant. Vamvoudakis and Lewis 9 further extended this idea by designing an improved online ADP algorithm called synchronous PI, which involved simultaneous tuning of both actor and critic NNs by minimizing the Bellman error, i.e., both NNs were tuned at the same time by using the proposed adaptive laws to approximately solve the CT infinite horizon optimal control problem. To avoid the need for the complete knowledge of system dynamics in 9, a novel actor-critic-identifier architecture was proposed in 0, where an extra NN of the identifier was employed in conjunction with an actor-critic controller to identify the unknown system dynamics. Although the states of the identifier converged to their true values, the identifier NN weight convergence was not guaranteed. Moreover, the knowledge of the input dynamics was still required. On the other hand, most of the ADP based optimal control methods have been developed to address the stabilization or regulation problem, and only a few results have been reported for optimal tracking control 3. For these results, the key idea is to superimpose an optimal control that stabilizes the error dynamics at the transient stage in an optimal way under the assumption of a traditional steady-state tracking controller e.g., feedback linearization control, adaptive control). In 3, an observer was adopted to reconstruct unknown system states, while in an adaptive NN identifier was used to online estimate unknown system dynamics. Although the obtained control input was ensured to be close to the optimal control within a small bound, it was not guaranteed that the NN identifier weights stayed bounded in a compact neighborhood of their ideal values. In this paper, we will provide a solution where the convergence of the identifier weights is guaranteed and the convergence of the critic NN weights to a nearly optimal control solution is shown. To the best of our knowledge, ADP-based optimal tracking control has rarely been designed for CT systems with unknown nonlinear dynamics and guaranteed parameter estimation convergence. In this paper, we propose a new ADP algorithm for solving the optimal tracking control problem of nonlinear systems with unknown dynamics. Inspired by the work of 0, the requirement of the complete or at least partial knowledge of system dynamics in the existing ADP algorithms for CT systems is eliminated. This is achieved by constructing an adaptive NN for the identifier of system dynamics; a novel adaptive law based on the parameter estimation error 4 is utilized such that, even in the presence of an NN approximation error, the identifier NN weights are guaranteed to converge to a small region around their true values under a standard persistent excitation PE) condition or a slightly more relaxed singular value condition for a filtered regressor matrix. To achieve optimal tracking control, an adaptive steady-state control for maintaining desired tracking at the steady-state is augmented with an adaptive optimal control for stabilizing the tracking error dynamics in an optimal manner. 
To design such an optimal control, a critic NN is employed to online approximate the solution to the HJB equation. Thus, the optimal value function is obtained, which is then used to calculate the control action. The identifier parameters and critic NN weights are online updated continuously and simultaneously. In particular, a direct parameter estimation scheme is used to estimate NN weights; this is in contrast to the minimization of the Bellman error or the residual approximation error in the HJB equation by using least-squares 0 or the modified Levenberg- Marquardt algorithms 9. We will also show that the identifier weight estimation error affects the critic NN convergence; the conventional PE condition or again a relaxed condition on a filtered regressor matrix is sufficient to guarantee parameter estimation convergence. To this end, a novel adaptation scheme based on the parameter estimation error that was originally proposed in our previous work 4 is employed for updating both identifier weights and critic NN weights; this may lead to fast convergence and provides an easy online-check of the required convergence condition 4. Finally, the stability of the overall system and the uniform ultimate boundedness UUB) of the identifier and critic weights are proved by using Lyapunov theory, and the obtained control guarantees the tracking of a desired trajectory, while also asymptotically converging to a small bound around the optimal policy. The main contributions can be summarized as follows. ) The optimal tracking control problem of nonlinear CT systems is studied by proposing a new critic-identifier based ADP control configuration. The actor NN is not necessary to prove the overall stability. Thus, instead of the tripleapproximation structure, this introduces a simplified dualapproximation method. To achieve tracking control, a steadystate control is used in conjunction with an adaptive optimal control such that the overall control converges to the optimal solution within a small bound. ) A novel adaptation design methodology based on the parameter estimation error is proposed such that the weights of both the identifier NN and critic NN are online updated simultaneously. With this framework, all these weights are directly estimated with guaranteed convergence rather than updated to minimize the identifier error and Bellman error by using the gradient-based schemes e.g., least-squares in 0). It is shown that the convergence of the identifier weights to their true values in a bounded sense is achieved, which is also important for the convergence of the optimal control. The paper is organized as follows. Section II provides the formulation of the optimal control problem. Section III discusses the design of the identifier to accommodate unknown system dynamics. Section IV presents the adaptive tracking control design and the closed-loop stability analysis. Section V presents simulation examples that show the effectiveness of the proposed method, and Section VI gives some conclusions.

II. PROBLEM FORMULATION

Consider a continuous-time nonlinear system

ẋ = F(x, u),   (1)

where x ∈ Rⁿ and u ∈ Rᵐ are the state and input of the studied system, and F(x, u): Rⁿ × Rᵐ → Rⁿ is a Lipschitz continuous nonlinear function on a compact set Ω ⊂ Rⁿ × Rᵐ that contains the origin, such that the solution x of system (1) is unique for any finite initial condition x₀ and control u. This paper addresses the optimal tracking control problem for system (1), i.e., finding an adaptive controller u which ensures that the state x tracks a given trajectory x_d and minimizes (in a sub-optimal sense) the infinite-horizon cost function

V(e(t)) = min_{u(τ)∈Ψ(Ω)} ∫_t^∞ r(e(τ), u(τ)) dτ,   (2)

where Ψ(Ω) is the set of admissible control policies [17], e = x − x_d is the tracking error, and r(·,·): Rⁿ × Rᵐ → R with r(e(τ), u(τ)) ≥ 0 is the utility function, to be defined later. Throughout, the command reference x_d and its derivative ẋ_d are continuous and bounded. Note that the tracking error e rather than the system state x is used in the cost function (2), because the tracking control rather than the regulation problem is studied in this paper.

Remark 1. Many industrial processes can be modeled as system (1), such as missile systems [25], robotic manipulators [24] and biochemical processes [26]. Although several recent results (e.g., [19−20]) address the optimal regulation problem of the (partially unknown) system (1) by means of ADP, only a few results concern the tracking control of system (1), and there the plant dynamics are usually assumed to be precisely known.

To facilitate the control design, the following assumption is made about system (1).

Assumption 1 [17]. The function F(x, u) in (1) is continuous and satisfies a local Lipschitz condition such that (1) has a unique solution on the set Ω that contains the origin. The control action u enters in control-affine form, as in [19−20], with constant input gain B.

Since the dynamics of (1) are unknown, the optimal tracking control design presented in this paper is divided into two steps, as in [20]: 1) propose an adaptive identifier that reconstructs the unknown dynamics of (1) from input-output data; 2) design an adaptive optimal tracking controller based on the identified dynamics and ADP methods.

III. ADAPTIVE IDENTIFIER BASED ON PARAMETER ESTIMATION ERROR

In this section, an adaptive identifier is established to reconstruct the unknown system dynamics using available input-output measurements. From Assumption 1, system (1) can be rewritten in the form of a recursive neural network (RNN) [27−28]:

ẋ = Ax + Bu + Cᵀf(x) + ε,   (3)

where A ∈ Rⁿˣⁿ and B ∈ Rⁿˣᵐ are known matrices, C ∈ Rᵖˣⁿ is the unknown weight matrix, ε ∈ Rⁿ is a bounded approximation error of the RNN, and f(x) ∈ Rᵖ is a nonlinear regressor function vector, which is Lipschitz continuous such that ‖f(x) − f(y)‖ ≤ κ‖x − y‖ holds for some positive constant κ > 0.

To determine the unknown parameters C, we define the filtered variables x_f, u_f, f_f of x, u, f as

k ẋ_f + x_f = x,  x_f(0) = 0,
k u̇_f + u_f = u,  u_f(0) = 0,
k ḟ_f + f_f = f,  f_f(0) = 0,   (4)

where k > 0 is a scalar constant filter parameter. Then, for any positive scalar constant l > 0, we define the filtered and integrated regressor matrices P ∈ Rᵖˣᵖ and Q ∈ Rᵖˣⁿ as

Ṗ = −lP + f_f f_fᵀ,  P(0) = 0,
Q̇ = −lQ + f_f [(x − x_f)/k − Ax_f − Bu_f]ᵀ,  Q(0) = 0,   (5)

and another auxiliary matrix M ∈ Rᵖˣⁿ calculated from P and Q as

M = PĈ − Q,   (6)

where Ĉ is the estimate of C.
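To make the construction (4)-(6) concrete, the following sketch propagates the filters and regressor matrices with a simple forward-Euler loop (it also applies the estimation update introduced next in the text). It is only an illustration: the two-state plant, the regressor f(x), and all gains below are hypothetical placeholders rather than the paper's settings, and the signs follow the reconstruction used here.

```python
import numpy as np

# Hypothetical 2-state example used only to exercise the filters (4) and the
# regressor matrices (5)-(6); A, B, C_true, f and all gains are placeholders.
n, m, p = 2, 1, 3
A = np.array([[0.0, 1.0], [-1.0, -0.5]])                    # known matrix A in (3)
B = np.array([[0.0], [1.0]])                                # known input matrix B in (3)
C_true = np.array([[-0.2, 0.1], [0.1, -0.3], [0.05, -0.05]])  # unknown C (data generation only)
f = lambda x: np.array([x[0], np.sin(x[1]), x[0] * x[1]])   # regressor vector f(x)

k, l, Gamma1 = 0.01, 1.0, 50.0      # filter constant k, forgetting rate l, learning gain
dt, T = 1e-3, 10.0
x = np.array([0.1, -0.2])
xf, uf, ff = np.zeros(n), np.zeros(m), np.zeros(p)          # filtered x, u, f from (4)
P, Q = np.zeros((p, p)), np.zeros((p, n))                   # filtered regressors from (5)
C_hat = np.zeros((p, n))                                    # estimate of C

for step in range(int(T / dt)):
    u = np.array([np.sin(0.5 * step * dt)])                 # probing input for excitation
    x = x + dt * (A @ x + B @ u + C_true.T @ f(x))          # plant (3), noise-free here
    # first-order filters (4)
    xf += dt * (x - xf) / k
    uf += dt * (u - uf) / k
    ff += dt * (f(x) - ff) / k
    # filtered/integrated regressor matrices (5)
    P += dt * (-l * P + np.outer(ff, ff))
    Q += dt * (-l * Q + np.outer(ff, (x - xf) / k - A @ xf - B @ uf))
    M = P @ C_hat - Q                                       # auxiliary matrix (6)
    C_hat += dt * (-Gamma1 * M)                             # adaptive law introduced next in the text

# the convergence condition discussed below can be monitored online through lambda_min(P)
print("lambda_min(P) =", np.linalg.eigvalsh(P).min())
print("||C - C_hat|| =", np.linalg.norm(C_true - C_hat))
```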
Then the adaptive law for estimating Ĉ is given by

dĈ/dt = −Γ₁M,   (7)

with Γ₁ > 0 being a constant, positive definite learning gain matrix.

Lemma 1 [24]. Under the assumption that the variables x and u in (3) are bounded, the matrix M in (6) can be reformulated as M = −PC̃ + ψ for a bounded ψ(t) = −∫₀ᵗ e^(−l(t−r)) f_f(r) ε_fᵀ(r) dr, where C̃ = C − Ĉ is the estimation error.

Proof. The ordinary matrix differential equations (5) have the solutions

P(t) = ∫₀ᵗ e^(−l(t−r)) f_f(r) f_fᵀ(r) dr,
Q(t) = ∫₀ᵗ e^(−l(t−r)) f_f(r) [(x(r) − x_f(r))/k − Ax_f(r) − Bu_f(r)]ᵀ dr.   (8)

On the other hand, by applying the linear filter operation (4) to both sides of (3), it can be obtained that

ẋ_f = Ax_f + Bu_f + Cᵀf_f + ε_f,   (9)

where ε_f is the filtered version of the bounded error ε, defined by k ε̇_f + ε_f = ε (the vector ε_f is used only for analysis). Then, from the first equation of (4), it is found that

ẋ_f = (x − x_f)/k.   (10)

Consequently, we can obtain from (9) and (10) that

(x − x_f)/k = Ax_f + Bu_f + Cᵀf_f + ε_f.   (11)

By substituting (11) into (8), we have Q = PC − ψ with ψ(t) = −∫₀ᵗ e^(−l(t−r)) f_f(r) ε_fᵀ(r) dr being a bounded variable, i.e., ‖ψ‖ ≤ ε_ψ for a constant ε_ψ > 0, because the NN regressor function f(x) and the approximation error ε are bounded for bounded x and u. Then (6) can be rewritten as

M = PĈ − Q = −PC̃ + ψ,   (12)

where C̃ = C − Ĉ is the estimation error.

Moreover, to prove the parameter estimation convergence, we need to analyze the positive definiteness of P. Denote by λ_max(·) and λ_min(·) the maximum and minimum eigenvalues of the corresponding matrices; then we have the following lemma.

Lemma 2 [24]. If the regressor function vector f(x) defined in (3) is persistently excited [3], then the matrix P defined in (5) is positive definite, i.e., its minimum eigenvalue satisfies λ_min(P) > σ₁ > 0 with σ₁ being a positive constant.

We refer to [24] for the detailed proof of Lemma 2. Now, we have the following result.

Theorem 1. If x and u in (3) are bounded and the minimum eigenvalue of P satisfies λ_min(P) > σ₁ > 0 for system (3) with the parameter estimation (7), then:
1) For ε = 0 (i.e., no reconstruction error), the estimation error C̃ converges to zero exponentially;
2) For ε ≠ 0 (i.e., with bounded approximation error), the estimation error C̃ converges to a compact set around zero.

Proof. Consider the Lyapunov function candidate V₁ = (1/2)tr(C̃ᵀΓ₁⁻¹C̃); its derivative along (7) and (12) is

V̇₁ = tr(C̃ᵀΓ₁⁻¹ dC̃/dt) = −tr(C̃ᵀPC̃) + tr(C̃ᵀψ).   (13)

1) In case ε = 0, and thus ψ = 0, (13) reduces to

V̇₁ = −tr(C̃ᵀPC̃) ≤ −σ₁‖C̃‖² ≤ −µ₁V₁,   (14)

where µ₁ = 2σ₁/λ_max(Γ₁⁻¹) is a positive constant. Then, according to Lyapunov's theorem (see [4]), the parameter estimation error C̃ converges to zero exponentially, where the convergence rate depends on the excitation level σ₁ and the learning gain Γ₁.

2) In case there is a bounded approximation error ε ≠ 0, (13) can be further bounded as

V̇₁ = −tr(C̃ᵀPC̃) + tr(C̃ᵀψ) ≤ −‖C̃‖(σ̄₁√V₁ − ε_ψ)   (15)

for σ̄₁ = σ₁√(2/λ_max(Γ₁⁻¹)) being a positive constant. Then, according to the extended Lyapunov theorem (see [4]), the parameter estimation error C̃ is uniformly ultimately bounded and converges to the compact set Ω_C := {C̃ | √V₁ ≤ ε_ψ/σ̄₁}, whose size depends on the bound ε_ψ of the approximation error and the excitation level σ₁. This completes the proof.

Remark 2. For adaptive law (7), the variable M of (6), obtained from P and Q in (5), contains the information of the weight estimation error through the term −PC̃, as shown in (12), where the residual error ψ vanishes for vanishing NN approximation error ε → 0. It is well known that ε → 0 holds for a sufficiently large number of hidden-layer nodes in identifier (3), i.e., p → +∞. Thus M can be used to drive the parameter estimation (7). Consequently, the parameter estimate Ĉ is obtained directly, without using an observer/predictor error, in contrast to [20].

Remark 3. Lemma 2 shows that the required condition for parameter estimation convergence in this paper (i.e., λ_min(P) > σ₁ > 0) is fulfilled under a conventional PE condition [3]. In general, the direct online validation of the PE condition is difficult, in particular for a nonlinear system. To this end, Lemma 2 provides a numerically verifiable way to validate online the convergence condition of the novel adaptation law (7), i.e., by calculating the minimum eigenvalue of the matrix P and testing λ_min(P) > σ₁ > 0. This condition does not necessarily imply the PE condition of f(x).
It is also to be noticed that the PE condition on f(x) can be suitably weakened when a well-designed control is imposed [24, 29], e.g., transformed into an a priori verifiable sufficient richness (SR) requirement on the command reference.

IV. ADAPTIVE APPROXIMATE OPTIMAL TRACKING CONTROL

As shown in Section III, the unknown weight matrix C can be estimated online. Without loss of generality, we assume that there is an unavoidable approximation error ε in (3), so that the estimated weight matrix Ĉ converges to a compact set around its true value C. In this case, system (1) can be rewritten as

ẋ = Ax + Bu + Ĉᵀf(x) + ε + ε_N,   (16)

where ε_N = C̃ᵀf(x) can be taken as an adaptation error, whose boundedness will be shown in the later analysis, i.e., ‖ε_N‖ ≤ φ_N for a constant φ_N > 0 on the compact set Ω. Then the optimal control of (1) is transformed into the optimal control of (16). In this section, the optimal controller design for (16) is provided in detail.

To achieve optimal tracking control, the overall control u is composed of two parts as u = u_s + u_e, where u_s is the adaptive steady-state control used to keep the tracking error close to zero at the steady state, and u_e is the adaptive optimal control designed to stabilize the tracking error dynamics during the transient in an optimal manner [21, 25]. Consider the tracking error e = x − x_d, so that

ė = Ax + Bu + Ĉᵀf(x) − ẋ_d + ε + ε_N.   (17)

Since the adaptive steady-state control u_s is used to guarantee a zero steady state for the tracking error, it should be designed to retain the steady-state dynamics ė = ẋ − ẋ_d = 0 in (17), i.e., u = u_s needs to guarantee x = x_d when ε + ε_N = 0.

Thus, the steady-state control signal u_s can be selected as

u_s = B⁺(ẋ_d − Ax_d − Ĉᵀf(x_d)),   (18)

where B⁺ denotes the generalized inverse of B. Note that the input gain B in (3) is assumed to be known but may not be invertible (i.e., its rank may be lower than n). Clearly, u_s depends only on the available variables x_d, ẋ_d, A and Ĉ, and can thus be implemented based on identifier (3) with adaptive law (7).

Substituting (18) into (17), the tracking error dynamics can be rewritten as

ė = Ae + Ĉᵀ[f(x) − f(x_d)] + Bu_e + ε + ε_N + ε_φ,   (19)

where ε_φ = (BB⁺ − I)(ẋ_d − Ax_d − Ĉᵀf(x_d)) denotes the residual error due to the generalized inverse of B, which is bounded because x_d and ẋ_d are bounded and f is Lipschitz continuous, i.e., ‖ε_φ‖ ≤ φ_p for a constant φ_p > 0. Note that ε_φ vanishes under the so-called matching condition, e.g., (BB⁺ − I)A = 0 or (BB⁺ − I)Ĉᵀ = 0, which is a standard condition in nonlinear control for counteracting disturbances.

As shown above, with the adaptive steady-state control u_s of (18), the error dynamics take the form (19), which is not necessarily stable, in particular in the presence of the identifier error ε + ε_N. In this sense, the tracking problem for system (16) is reduced to the regulation problem for (19). Hence, the adaptive optimal control u_e will be designed to stabilize the tracking error dynamics (19) in an approximately optimal manner. In this case, the optimal value function (2) for system (1) can be reformulated using u_e from system (19), giving the value function

V(e(t)) = ∫_t^∞ r(e(τ), u_e(e(τ))) dτ,   (20)

where the utility function is chosen as r(e(τ), u_e(e(τ))) = eᵀQe + u_eᵀRu_e with Q ∈ Rⁿˣⁿ and R ∈ Rᵐˣᵐ being symmetric positive definite matrices. Thus, the tracking problem is optimized by the control u_e, which optimally stabilizes e. It will be shown below that u_e is a function of the tracking error e.

Definition 1 [17]. A control policy µ(e) is said to be admissible with respect to (20) on a compact set Ω, denoted by µ(e) ∈ Ψ(Ω), if µ(e) is continuous on Ω, µ(0) = 0, u_e(e) = µ(e) stabilizes (19) on Ω, and V(e) is finite for every e ∈ Ω.

The remaining problem can be formulated as: given the CT error system (19) with the admissible control set Ψ(Ω) and the infinite-horizon cost function (20), find an admissible control policy u_e(e) ∈ Ψ(Ω) such that the cost (20) associated with system (19) is minimized. For this purpose, we define the Hamiltonian of system (19) as

H(e, u_e, V_e) = V_eᵀ[Ae + Ĉᵀ(f(x) − f(x_d)) + Bu_e + ε + ε_N + ε_φ] + eᵀQe + u_eᵀRu_e,   (21)

where V_e := ∂V/∂e denotes the partial derivative of the value function V with respect to e. The optimal value function V*(e) is defined as

V*(e) = min_{u_e∈Ψ(Ω)} ∫_t^∞ r(e(τ), u_e(e(τ))) dτ,   (22)

which satisfies the HJB equation

0 = min_{u_e∈Ψ(Ω)} H(e, u_e, V_e*).   (23)

Then the optimal control u_e* can be obtained by solving ∂H(e, u_e, V_e*)/∂u_e = 0 as

u_e* = −(1/2)R⁻¹Bᵀ ∂V*(e)/∂e,   (24)

where V* is the solution to the HJB equation (23).

Remark 4. In order to find the optimal control (24), one needs to solve the HJB equation (23) for the value function V*(e) and then substitute the solution into (24) to obtain the optimal control u_e*. For linear systems with a quadratic cost functional, the equivalent of the HJB equation is the well-known Riccati equation. However, for nonlinear systems, the HJB equation (23) is a nonlinear partial differential equation (PDE), which is difficult to solve.
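To make the linear-quadratic statement in Remark 4 concrete: if the error dynamics were linear, ė = Ae + Bu_e, with the quadratic utility eᵀQe + u_eᵀRu_e, then V*(e) = eᵀPe with P solving the algebraic Riccati equation, and (24) reduces to the familiar LQR law. A minimal check with scipy follows; the matrices are arbitrary illustrative choices, not taken from the paper.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Linear-quadratic special case of (20)-(24): e_dot = A e + B u_e,
# cost integrand e^T Q e + u_e^T R u_e.  (Illustrative matrices only.)
A = np.array([[0.0, 1.0], [-1.0, -0.5]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.eye(1)

# For V*(e) = e^T P e, the HJB reduces to the Riccati equation
#   A^T P + P A - P B R^{-1} B^T P + Q = 0,
# and (24) gives u_e* = -R^{-1} B^T P e (dV*/de = 2 P e, so the 1/2 cancels).
P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)        # optimal state-feedback gain

e = np.array([1.0, -0.5])
print("Riccati solution P:\n", P)
print("optimal control at e =", e, ":", -K @ e)
```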
In the literature, there are a number of results concerning the optimal control for 9) in terms of critic-actor based ADP schemes, where two NNs, i.e., a critic NN and an actor NN, are employed to approximate the value function and its corresponding policy. However, some of them run in an offline manner 7 and/or require at least partial knowledge of system dynamics,8 0,. In the following, an online adaptive algorithm will be proposed to derive the optimal control solution for system 9) using the NN identifier introduced in the previous section and another critic NN for approximating the value function of the HJB equation 3). Instead of sequentially updating the critic and actor NNs, 8, both networks are updated simultaneously in real time, and thus lead to the synchronous online implementation. A. Value Function Approximation via NN Assuming the optimal value function is continuous and defined on compact sets, then a single-layer NN can be used to approximate it 9, such that the solution V e) and its derivative V e)/ e with respect to e can be uniformly approximated by and V e) = W T Φe) + ε, 5) V e) e = Φ T W + ε, 6) where W R l are the unknown ideal weights and Φe) = Φ,, Φ l T R l is the NN activation function vector, l is the number of neurons in the hidden layer, and ε is the NN approximation error. Φ := Φ/ e and ε := ε / e denote the partial derivative of Φe) and ε with regard to e, respectively.
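For a concrete picture of the approximation (25)-(26), the sketch below uses a quadratic polynomial basis Φ(e) = [e₁², e₁e₂, e₂²]ᵀ (the same form chosen later in the simulation section), forms its gradient, and evaluates the control suggested by (24) when ∂V*/∂e is replaced by ∇Φ(e)ᵀŴ. The weight vector Ŵ, input matrix B and weighting R are placeholders for illustration only.

```python
import numpy as np

# Quadratic critic basis Phi(e) = [e1^2, e1*e2, e2^2]^T and its Jacobian dPhi/de.
def phi(e):
    return np.array([e[0]**2, e[0]*e[1], e[1]**2])

def dphi(e):
    # rows are gradients of the individual basis functions
    return np.array([[2*e[0], 0.0],
                     [e[1],   e[0]],
                     [0.0,    2*e[1]]])

B = np.array([[0.0], [1.0]])        # input matrix of the error dynamics (placeholder)
R = np.eye(1)                       # control weighting in the utility function
W_hat = np.array([0.4, 0.1, 0.9])   # current critic weight estimate (placeholder values)

def V_hat(e):
    # approximated value function W_hat^T Phi(e), cf. the critic NN in the next subsection
    return W_hat @ phi(e)

def u_e_hat(e):
    # approximate optimal control from (24) with dV/de replaced by dPhi^T W_hat
    grad_V = dphi(e).T @ W_hat
    return -0.5 * np.linalg.solve(R, B.T @ grad_V)

e = np.array([0.3, -0.1])
print("V_hat(e) =", V_hat(e), " u_e_hat(e) =", u_e_hat(e))
```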

6 NA AND HERRMANN: ONLINE ADAPTIVE APPROXIMATE OPTIMAL TRACKING CONTROL WITH 47 Some standard NN assumptions that will be used throughout the remainder of this paper are summarized here. Assumption 7, 9. The ideal NN weights W are bounded by a positive constant W N, i.e., W W N ; the NN activation function Φ ) and its derivative Φ ) with respect to argument e are bounded, e.g., Φ φ M ; and the function approximation error ε and its derivative ε with respect to e are bounded, e.g., ε φ ε. In a practical application, the NN activation functions {Φ i e) : i =,, l} can be selected so that as l +, Φe) provides a complete independent basis for V e). Then using Assumption and the Weierstrass higher-order approximation theorem, both V e) and V e)/ e can be uniformly approximated by NNs in 5) and 6), i.e., as l +, the approximation errors ε 0, ε 0 as shown in 7, 9. Then the critic NN ˆV e) that approximates the optimal value function V e) is given by ˆV e) = Ŵ T Φe), 7) where Ŵ is the estimation of the unknown weights W in critic NN 5), which will be specified by adaptive law 38). In this case, one may obtain the approximated optimal control as û e = R B T ˆV e) = e R B T Φ T Ŵ, 8) such that the overall optimal tracking control for system 6) can be given as u = u s + û e 9) with u s being the steady-state control given in 8). Consider that the ideal optimal control u e can be determined based on 4) and 6) as u e = R B T V e) = e R B T Φ T W + ε ), 30) and then substitute the estimated optimal control û e of 8) into the error dynamics 9), we have ė =Ae + ĈT fx) fx d ) + Bu e + ε + ε N + ε ϕ = Ae + ĈT fx) fx d ) + B R B T Φ T Ŵ + R B T Φ T W + ε ) + Bu e + ε + ε N + ε ϕ = Ae + ĈT fx) fx d ) + BR B T Φ T W + Bu e + BR B T ε + ε + ε N + ε ϕ. 3) Remark 5. Since the overall control 9) is derived using the steady-state control 8) and the approximate optimal control 8) that depends on the estimated optimal value function Φ T Ŵ, the critic NN in 7) can be used to determine the control action without using another NN as the actor in 9. This can reduce the computational cost and improve the learning process. However, alternatively, a separate actor NN, e.g., Φ T Ŵ a, may be used in a similar way for producing the approximated optimal control action û e = R B T Φ T Ŵ a as that shown in 9. B. Adaptive Law for Critic NN The problem now is to update the critic NN weights Ŵ, such that Ŵ converge to a small bounded region around the ideal values W. To derive the adaptive law, we denote f d = fx d ), substitute 6) into Hamiltonian function ), and thus rewrite the HJB equation 3) as 0 = He, u e, V ) = W T ΦAe + ĈT f f d ) + Bu e + e T Qe + u T e Ru e + ε HJB, 3) where ε HJB = ε Ae + ĈT f f d ) + Bu e + ε + ε N + ε ϕ + W T Φε N + ε ϕ + ε) is the residual error due to the NN approximation errors, which can be made arbitrarily small by using a sufficiently large number of NN nodes 7, 9, i.e., ε 0 as p + and ε 0 as l +. Equally, Theorem implies that estimation error ε N = Cfx) converges to zero as p + for bounded control and states. In contrast, ε ϕ = 0 when B is of rank n. To facilitate the design of the adaptive law, we denote the known terms in 3) as Ξ = Φ Ae + ĈT f f d ) + Bu e and Θ = e T Qe + u T e Ru e, and then represent 3) as Θ = W T Ξ ε HJB. 33) In 33), the unknown critic NN weights W appear in a linearly parameterized form, and will be directly estimated in the following development by utilizing the parameter estimation error method proposed in Section III. Remark 6. 
It is shown in 3) that the residual HJB equation error ε HJB is due to the critic NN approximation error ε in 6), the identifier error ε + ε N in 6) and the matching condition error ε ϕ. As claimed in 7, 9, the critic NN approximation error ε converges uniformly to zero as the number of hidden layer nodes increases, i.e., ε 0 as long as l +. That is, µ > 0, Nµ) : sup ε µ. Moreover, in case that there is no NN approximation error in 3), i.e., ε = 0, the effect of the identifier error ε N in 6) will vanish i.e., ε N 0 as p + ) because C 0 holds for ε = 0 as proved in Theorem for bounded state x and control input u. Finally, the fact ε ϕ = 0 is also true under the matching condition in 9). Consequently, if there are no approximation errors ε in identifier 6) and ε in critic NN 6), and the matching condition holds, the residual error in 33) is null, i.e., ε HJB = 0. Remark 7. Some available ADP based optimal controls are designed to online update the critic NN weights Ŵ by minimizing the squared residual Bellman error in the approximated HJB equation 9, 3, where the Least-squares 0 or modified Levenberg-Marquardt algorithms 9 are employed. In the following, we will extend our previous results 4 to design the adaptive law to directly estimate unknown critic NN weights W based on 33) rather than to reduce the Bellman error.
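Ahead of the formal definitions in the next subsection, the following sketch illustrates the idea of Remark 7: since (33) is linear in W, the critic weights can be estimated directly by filtering the regressor Ξ and the known term Θ exactly as the identifier filters f_f in Section III, instead of minimizing the Bellman error. All signals below are synthetic placeholders, and the signs follow the filtered-regressor construction of Section III.

```python
import numpy as np

# Sketch of estimating the critic weights W directly from the linear-in-parameters
# relation Theta = -W^T Xi - eps_HJB (eq. (33)), by filtering Xi and Theta the same
# way the identifier filters f_f in Section III.
l2, Gamma2 = 1.0, 50.0
dt, T = 1e-3, 20.0
lcrit = 3                            # number of critic basis functions

W_true = np.array([0.5, 0.0, 1.0])   # "ideal" weights used only to generate data
W_hat = np.zeros(lcrit)
P2 = np.zeros((lcrit, lcrit))
Q2 = np.zeros(lcrit)

for step in range(int(T / dt)):
    t = step * dt
    # synthetic regressor Xi(t) standing in for grad(Phi) * (Ae + C_hat^T(f - f_d) + B u_e);
    # it only needs to be sufficiently exciting for this illustration
    Xi = np.array([np.sin(t), np.cos(2.0 * t), np.sin(0.7 * t + 1.0)])
    Theta = -W_true @ Xi                 # eq. (33) with eps_HJB = 0
    # filtered regressor matrix/vector, paralleling (5) in Section III
    P2 += dt * (-l2 * P2 + np.outer(Xi, Xi))
    Q2 += dt * (-l2 * Q2 + Xi * Theta)
    M2 = P2 @ W_hat + Q2                 # auxiliary vector built from P2, Q2
    W_hat += dt * (-Gamma2 * M2)         # direct estimation of W (no Bellman-error gradient)

print("W_hat =", W_hat, " error =", np.linalg.norm(W_true - W_hat))
```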

Similar to Section III, we define the auxiliary filtered regressor matrix P₂ ∈ Rˡˣˡ and vector Q₂ ∈ Rˡ as

Ṗ₂ = −lP₂ + ΞΞᵀ,  P₂(0) = 0,
Q̇₂ = −lQ₂ + ΞΘ,  Q₂(0) = 0,   (34)

where l > 0 is a design parameter. Then one obtains

P₂(t) = ∫₀ᵗ e^(−l(t−r)) Ξ(r)Ξᵀ(r) dr,
Q₂(t) = ∫₀ᵗ e^(−l(t−r)) Ξ(r)Θ(r) dr.   (35)

Define another auxiliary vector M₂ ∈ Rˡ based on P₂ and Q₂ as

M₂ = P₂Ŵ + Q₂,   (36)

where Ŵ is the estimate of W, updated by the adaptive law (38) below. By substituting (33) into (35), we have Q₂ = −P₂W − ψ₂ with ψ₂ = ∫₀ᵗ e^(−l(t−r)) ε_HJB(r)Ξ(r) dr being bounded for bounded state x and control u, i.e., ‖ψ₂‖ ≤ ε_ψ₂ for some ε_ψ₂ > 0. In this case, (36) can be rewritten as

M₂ = P₂Ŵ + Q₂ = −P₂W̃ − ψ₂,   (37)

where W̃ = W − Ŵ is the NN weight estimation error. Then the adaptive law for estimating Ŵ is given by

dŴ/dt = −Γ₂M₂,   (38)

with Γ₂ > 0 being a constant gain matrix.

Similar to Section III, the condition λ_min(P₂) > σ₂ > 0 is needed if one wishes to estimate the unknown critic NN weights W precisely, so that the approximated value function (27) converges to its true value (25). As shown in [19−20], a small probing noise can be added to the control input to retain the PE condition if the excitation is insufficient; this in turn implies λ_min(P₂) > σ₂ > 0, as stated in Lemma 3.

Lemma 3 [24]. If the regressor vector Ξ defined in (33) is persistently excited, then the matrix P₂ defined in (35) is positive definite, i.e., its minimum eigenvalue satisfies λ_min(P₂) > σ₂ > 0.

Then we have the following theorem.

Theorem 2. For the critic NN adaptive law (38) with regressor vector Ξ satisfying λ_min(P₂) > σ₂ > 0 and for bounded state x and control u, one has:
1) For ε_HJB = 0 (i.e., no NN approximation errors), the estimation error W̃ converges to zero exponentially;
2) For ε_HJB ≠ 0 (i.e., with bounded approximation errors), the estimation error W̃ converges to a bounded set around zero.

Proof. Consider the Lyapunov function candidate V₂ = (1/2)W̃ᵀΓ₂⁻¹W̃; its derivative along (38) and (37) is

V̇₂ = W̃ᵀΓ₂⁻¹ dW̃/dt = −W̃ᵀP₂W̃ − W̃ᵀψ₂.   (39)

1) In case ε_HJB = 0, and thus ψ₂ = 0, (39) reduces to

V̇₂ = −W̃ᵀP₂W̃ ≤ −σ₂‖W̃‖² ≤ −µ₂V₂,   (40)

where µ₂ = 2σ₂/λ_max(Γ₂⁻¹) is a positive constant. Then, according to Lyapunov's theorem (see [4]), the weight estimation error W̃ converges to zero exponentially, where the convergence rate depends on the excitation level σ₂ and the learning gain Γ₂.

2) In case there are bounded approximation errors, i.e., ε_HJB ≠ 0, (39) can be bounded as

V̇₂ = −W̃ᵀP₂W̃ − W̃ᵀψ₂ ≤ −‖W̃‖(σ̄₂√V₂ − ε_ψ₂)   (41)

for σ̄₂ = σ₂√(2/λ_max(Γ₂⁻¹)) being a positive constant. Then, according to the extended Lyapunov theorem (see [4]), the weight estimation error W̃ is uniformly ultimately bounded and converges to the compact set Ω_W := {W̃ | √V₂ ≤ ε_ψ₂/σ̄₂}, whose size depends on the bound ε_ψ₂ of the approximation error and the excitation level σ₂. This completes the proof.

C. Stability Analysis

Now, we summarize the main results of this paper as follows.

Theorem 3. For system (3) with controls (18) and (28) and adaptive laws (7) and (38), if the initial control action is chosen to be admissible and the regressor vectors f and Ξ satisfy λ_min(P) > σ₁ > 0 and λ_min(P₂) > σ₂ > 0, then the following semi-global results hold:
1) In the absence of approximation errors, the tracking error e and the parameter estimation errors C̃ and W̃ converge to zero, and the adaptive control û_e in (28) converges to its optimal solution u_e* in (24), i.e., û_e → u_e* if ε = 0;
2
) In the presence of approximation errors, the tracking error e and the parameter estimation errors C and W are uniformly ultimately bounded, and the adaptive control û e in 8) converges to a small bound around its optimal solution u e in 4), i.e., û e u e ε u for a small positive constant ε u. Please refer to Appendix for the detailed proof of Theorem 3. V. SIMULATIONS In this section, a numerical example is provided to demonstrate the effectiveness of the proposed approach. Consider the following nonlinear continuous-time system ẋ = x + x, ẋ = 0.5x 0.5x cosx ) + ) )+ cosx ) + u. 4) The results are to be compared with the exact results in 3. Then weight matrices Q and R of cost ) are chosen as identity matrices of appropriate dimensions. The control objective is to make system states x track the desired trajectory x d = sint) and x d = cost) + sint). It is assumed that system dynamics are partially unknown, and we first use identifier 3) to reconstruct system dynamics with A = 0.5 0, B= 0 being known

8 NA AND HERRMANN: ONLINE ADAPTIVE APPROXIMATE OPTIMAL TRACKING CONTROL WITH 49 T matrices, C= is the unknown identifier weights to be estimated, and the activation function is chosen as fx)= x, x cosx ) + ), cosx ) T. The parameters for simulation are set as k = 0.00, l =, Γ = 350. The initial weight parameter is set as Ĉ0)=0. Two different scenarios are investigated, the adaptive algorithm without and with injection of additional noise. The noise has a uniform distribution and a maximal amplitude of 0. induced at the measurements for x and x. It is removed after a duration of 4 seconds. Fig. shows the profile of the estimated identifier weights Ĉ with adaptive law 7), where one may find that the identifier weight estimation converges to their true value C after a 3.5 second transient without noise. It is evident that the algorithm with noise injection converges slightly faster. for the online estimation of the critic NN weights Ŵ is shown in Fig. ; this indicates that Ŵ converges in about.5 seconds. In particular, Ŵ and Ŵ3 converge close to its optimal value of and 0, while Ŵ for the noise induced case carries a larger error. Ŵ does not affect the closed loop behavior, but has influence on the value function estimate. This means that the designed adaptive optimal control 8) converges close to its optimal control action in 44). An error in the weights is to be expected as ε ϕ 0. The novel identifier and critic NN weight update laws 7) and 38), based on the information of the parameter estimation error, lead to faster convergence of weights compared to 9. Moreover, for the noise-free case, the system states for tracking the given external command are shown in Fig. 4, the tracking error profile is given in Fig. 5, and the associated control action is provided in Fig. 6. The noise induced case provides again very similar trajectories, which are not displayed here for space reasons. Fig.. Convergence of identifier parameters Ĉ. In the following, the control performance will be verified. For this purpose, the adaptive steady-state control 8) for system 4) to maintain the steady-state performance can be written as u s = 0, cost) Ĉ T cost) sint) sint) cost) + sint) cost) + sint) cost) sint))cos sint)) + ) cos sint)) ). 43) As input matrix B is of rank, it is evident that ε ϕ = BB I)ẋ d Ax d ĈT fx d )) is not zero. Thus, the computation of optimal control 8) using adaptive law 38) for the critic NNs may be subjected to a small error. To this end, following 9, 3, the optimal value function and the associated optimal control for system 4) are V e) = e + e and u e = R B T V e) e = e. 44) Similar to 9 0, we select the activation function for the critic NN as Φe) = e, e e, e T, then the optimal weights W = 0.5, 0, T can be derived. Note that only the last nonzero coefficient W 3 = affects the closed loop. The time trace Fig.. Convergence of critic NN weights Ŵ. Fig. 3. Excitation conditions λ minp ) and λ minp ). A critical issue in using the proposed adaptive laws 7) and 38) is to ensure sufficient excitation of regressor vectors fx) and Ξ. This condition can be fulfilled in the studied system as shown in Fig. 3, where the online evolutions of λ min P ) and λ min P ) are provided. The scalar λ min P ) remains positive at all time. The value of λ min P ) is sufficiently large till the

9 40 IEEE/CAA JOURNAL OF AUTOMATICA SINICA, VOL., NO. 4, OCTOBER 04 time instant of 4 second when the noise is removed for the noise induced case; λ min P ) for the noise-free case remains sufficiently large until the time instant of second, i.e., after the NN weight convergence is obtained See Figs. and ). Fig. 6. Control action profile u. Fig. 4. Evaluation of tracking performance. Fig. 5. Convergence of tracking error e = x x d. VI. CONCLUSIONS An adaptive optimal tracking control is proposed for a class of continuous-time nonlinear systems with unknown dynamics. To achieve the optimal tracking control, an adaptive steady-state control for maintaining the desired steadystate tracking performance is accomplished with an adaptive optimal control for stabilizing the tracking error dynamics at transient stage in an optimal manner. To eliminate the need for precisely known system dynamics, an adaptive identifier is used to estimate the unknown system dynamics. A critic NN is used to online learn the approximate solution of the HJB equation, which is then used to provide the approximately optimal control action. Novel adaptive laws based on the parameter estimation error are developed for updating the unknown weights in both identifier and critic NN, such that the online learning of identifier and the optimal policy is achieved simultaneously. The PE conditions or more relaxed filtered regressor matrix conditions are required to ensure the error convergence to a bounded region around the optimal control and stability of the closed-loop system. Simulation results demonstrate the improved performance of the proposed method. APPENDIX Proof of Theorem 3. Consider the Lyapunov function as V = V + V + V 3 + V 4 = tr C T Γ C) + W T Γ W + Γe T e + KV e) + Σψ T ψ, A) where V e) is the optimal value function 0), and K, Γ and Σ are positive constants. This Lyapunov function is investigated in a compact set Ω R p n R l R n R R n R n in tuple C, W, e, ψ, x d, ẋ d ), which contains the element 0, 0, 0, 0, 0, 0) in its interior, and C, W, e, ψ, x d, ẋ d ) Ω implies e + x d, u s x d, ẋ d ) + u e e)) Ω. Ω and Ω should be both chosen to be sufficiently large but of fixed size. In particular, any temporal initial value of C, W, e, ψ, x d, ẋ d ) is assumed to be within the interior Ω, while in particular x d and ẋ d are chosen to remain within Ω. Thus, for any initial trajectory, state x and control u remain bounded for at least finite time t 0, T, which again implies in particular ψ to be bounded in this time interval. Thus, consider inequality ab a η/ + b /η for η > 0, then derivative V along 7) is derived as V = tr C T Γ C) = tr C T P C) + tr CT ψ ) σ /η) C + ηε ψ/, A) and derivative V along 38) is derived as V = W T Γ W = W T P W + W T ψ σ /η) W + η ψ /. Moreover, one may deduce V 3 from 0) and 3) as V 3 = Γe T ė + K e T Qe u T e Ru e) = e T Γ Ae + ĈT f f d ) + BR B T Φ T W + Bu e + A) BR B T ε + ε + ε N + ε ϕ + K e T Qe u T e Ru e) Kλ minq) Γ4 + A + Ĉ κ ) e + Γ BR B T Φ T W Kλ minr) Γ B ) u e + Γ BR B T ε ) T ε + Γε T ε + Γε T N ε N + Γε T ϕε ϕ. A4)

10 NA AND HERRMANN: ONLINE ADAPTIVE APPROXIMATE OPTIMAL TRACKING CONTROL WITH 4 It is evident that ψ = lψ + Ξε HJB. Hence, similar to the parameter η > 0, parameter µ > 0 is introduced to compute an upper bound of the derivative of V 4 = Σψ T ψ as V 4 = Σψ T ψ = { Σψ T lψ + Ξ W T Φ ε N + ε ϕ + ε) + ε Ae + ĈT f f d ) + Bu e + ε + ε N + ε ϕ ) } Σl 5µ) ψ + µ Σ ΞW T Φ + ε ) ε ϕ + ε)) + µ Σ Ξ εĉt f f d )) + µ Σ Ξ ε BR B T Φ T Ŵ + µ Σ Ξ εae + µ Σ ΞW T Φ + ε ) ε N. Considering that ε N = Cfx), we have A5) V = V + V + V 3 + V 4 σ η Γ + µ Σ ΞW T Φ + ε ) ) f C σ η Γφ M BR B T µ Σ Ξ ε BR B T Φ T ) W Kλ minq) Γ 4 + A + Ĉ κ ) µ Σ Ξ ε Ĉ T κ + Ξ ε A ) e KλminR) Γ B ) u e Σl 5µ) η ψ + BR Γ B T ε) T ε + Γε T ε + Γε T ϕε ϕ + ηε ψ+ ΞW µ Σ T Φ + ε ) ε ϕ + ε)) + µ Σ Ξ ε BR B T Φ T W. A6) The design parameters η, µ, Γ, Σ and K are appropriately chosen such that Kλ min R) Γ B ) > 0 and the scalars a, a, a 3 and a 4 are positive and larger than certain positive constant a > 0, where a = σ η Γ + ΞW µ Σ T Φ + ε ) ) f, a = σ η Γφ M BR B T µ Σ Ξ ε BR B T Φ T, 4 + A + Ĉ κ ) a 3 = Kλ minq) Γ µ Σ Ξ ε Ĉ T κ + Ξ ε A ), a 4 = Σl 5µ) η. This can be achieved by selecting η > 0 and K > 0 large enough, while a > 0, µ > 0, Γ > 0 and Σ > 0 are chosen to be small enough to satisfy in particular minσ, σ ) > a > 0 and a 4 > a > 0. Note also that Lipschitz continuity of f ) and smoothness of Φ ) and V ) imply that f ), Ξ and Φ ) are bounded on Ω. Thus, A6) can be further presented as V a C a W a 3 e a 4 ψ + γ, A7) where γ = Γ BR B T ε ) T ε + Γε T ε + Γε T ϕε ϕ + ηε ψ + µ Σ ΞW T Φ + ε ) ε ϕ + ε)) + µ Σ Ξ ε BR B T Φ T W defines the effect of the identifier errors ε, ψ, the critic NN approximation error ε and the matching error ε ϕ. ) In case that there are no approximation errors in both identifier and critic NN, i.e., ε N = ε = ψ = ψ = ε ϕ = 0, then we have γ = 0, such that A7) can be deduced as V a C a W a 3 e a 4 ψ 0. A8) Thus, there is a compact set ˆΩ Ω, in C, W, e, ψ ) with 0, 0, 0, 0) in its interior, which is a set of attraction. Then within ˆΩ according to Lyapunov s theorem, V 0 holds as t + such that the estimation errors C, W and e all converge to zero. In this case, by assuming the critic NN approximation error ε = 0, we have û e u e = R B T Φ T Ŵ + R B T Φ T W = such that R B T Φ T W, A9) lim t + û e u e φ M R B T lim t + W = 0. A0) ) In case that there are bounded approximation errors in both identifier and critic NN, then we have γ 0. Consequently, according to A7), it can be shown that V is negative if C > γ/a, W > γ/a, e > γ/a 3, ψ > γ/a 4. A) Then again for some set ˆΩ Ω, the estimation errors C, W, ψ and e are all uniformly ultimately bounded according to Lyapunov s theorem within the set of attraction ˆΩ. Next we will prove û e u e ε u. Recalling the expressions of u e from 4) or 30) and û e from 8), we have û e u e = R B T Φ T Ŵ + R B T Φ T W + ε ) = R B T Φ T W + R B T ε. A) When t, the upper bound of A) is û e u e R B T Φ T W + R B T ε ε u. A3) Clearly, the upper bound ε u depends on the critic NN approximation error W and the NN estimation error ε.
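Relating the convergence statement above back to the simulation example of Section V: assuming the reconstruction of (44) as V*(e) = 0.5e₁² + e₂² and u_e* = −e₂, with the critic basis Φ(e) = [e₁², e₁e₂, e₂²]ᵀ, B = [0, 1]ᵀ and R = 1, the stated ideal weights W* = [0.5, 0, 1]ᵀ can be checked numerically: substituting W* into the approximate control law of (28) reproduces u_e* exactly.

```python
import numpy as np

# Numerical sanity check of the reconstructed optimal solution (44) for the
# simulation example: V*(e) = 0.5*e1^2 + e2^2, u_e* = -e2, W* = [0.5, 0, 1]^T.
B = np.array([[0.0], [1.0]])
R = np.eye(1)
W_star = np.array([0.5, 0.0, 1.0])

phi  = lambda e: np.array([e[0]**2, e[0]*e[1], e[1]**2])
dphi = lambda e: np.array([[2*e[0], 0.0], [e[1], e[0]], [0.0, 2*e[1]]])

rng = np.random.default_rng(0)
for e in rng.uniform(-2.0, 2.0, size=(5, 2)):
    V_from_weights = W_star @ phi(e)                                   # W*^T Phi(e)
    u_from_weights = -0.5 * np.linalg.solve(R, B.T @ (dphi(e).T @ W_star))
    assert np.isclose(V_from_weights, 0.5*e[0]**2 + e[1]**2)
    assert np.isclose(u_from_weights[0], -e[1])
print("W* = [0.5, 0, 1] reproduces V*(e) and u_e*(e) = -e2 at the sampled points.")
```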

11 4 IEEE/CAA JOURNAL OF AUTOMATICA SINICA, VOL., NO. 4, OCTOBER 04 REFERENCES Lewis F L, Vrabie D, Syrmos V L. Optimal Control. Wiley. com, 0. Vrabie D, Lewis F L. Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems. Neural Networks, 009, 3): Sastry S, Bodson M. Adaptive Control: Stability, Convergence, and Robustness. New Jersey: Prentice Hall, Ioannou P A, Sun J. Robust Adaptive Control. New Jersey: Prentice Hall, Sutton R S, Barto A G. Reinforcement Learning: An Introduction. Cambridge: Cambridge University Press, Doya K J. Reinforcement learning in continuous time and space. Neural computation, 000, ): Sutton R S, Barto A G, Williams R J. Reinforcement learning is direct adaptive optimal control. IEEE Control Systems Magazine, 99, ): 9 8 Werbos P J. A menu of designs for reinforcement learning over time. Neural Networks for Control. MA, USA: MIT Press Cambridge, Si J, Barto A G, Powell W B, Wunsch D C. Handbook of Learning and Approximate Dynamic Programming. Los Alamitos: IEEE Press, Wang F Y, Zhang H G, Liu D R. Adaptive dynamic programming: an introduction. IEEE Computational Intelligence Magazine, 009, 4): Lewis F L, Vrabie D. Reinforcement learning and adaptive dynamic programming for feedback control. IEEE Circuits and Systems Magazine, ): 3 50 Zhang H G, Zhang X, Luo Y H, Yang J. An overview of research on adaptive dynamic programming. Acata Automatica Sinica, 03, 394): Dierks T, Thumati B T, Jagannathan S. Optimal control of unknown affine nonlinear discrete-time systems using offline-trained neural networks with proof of convergence. Neural Networks, 009, 5): Al-Tamimi A, Lewis F L, Abu-Khalaf M. Discrete-time nonlinear HJB solution using approximate dynamic programming: convergence proof. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 008, 384): Wang D, Liu D R, Wei Q L, Zhao D B, Jin N. Optimal control of unknown nonaffine nonlinear discrete-time systems based on adaptive dynamic programming. Automatica, 0, 488): Hanselmann T, Noakes L, Zaknich A. Continuous-time adaptive critics. IEEE Transactions on Neural Networks, 007, 83): Abu-Khalaf M, Lewis F L. Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach. Automatica, 005, 45): Vrabie D, Pastravanu O, Abu-Khalaf M, Lewis F L. Adaptive optimal control for continuous-time linear systems based on policy iteration. Automatica, 009, 45): Vamvoudakis K G, Lewis F L. Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem. Automatica, 00, 465): Bhasin S, Kamalapurkar R, Johnson M, Vamvoudakis K G, Lewis F L, Dixon W E. A novel actor-critic-identifier architecture for approximate optimal control of uncertain nonlinear systems. Automatica, 03, 49): 8 9 Zhang H G, Cui L, Zhang X, Luo Y. Data-driven robust approximate optimal tracking control for unknown general nonlinear systems using adaptive dynamic programming method. IEEE Transactions on Neural Networks, 0, ): 6 36 Mannava A, Balakrishnan S N, Tang L, Landers R G. Optimal tracking control of motion systems. IEEE Transactions on Control Systems Technology, 0, 06): Nodland D, Zargarzadeh H, Jagannathan S. Neural network-based optimal adaptive output feedback control of a helicopter UAV. IEEE Transactions on Neural Networks and Learning Systems, 03, 47): Na J, Herrmann G, Ren X M, Mahyuddin M N, Barber P. Robust adaptive finite-time parameter estimation and control of nonlinear systems. 
In: Proceedings of IEEE International Symposium on Intelligent Control ISIC). Denver, CO: IEEE, Uang H J, Chen B S. Robust adaptive optimal tracking design for uncertain missile systems: a fuzzy approach. Fuzzy Sets and Systems, 00, 6): Krstic M, Kokotovic P V, Kanellakopoulos I. Nonlinear and Adaptive Control Design. New York: Wiley, Kosmatopoulos E B, Polycarpou M M, Christodoulou M A, Ioannou P A. High-order neural network structures for identification of dynamical systems. IEEE Transactions on Neural Networks, 995, 6): Abdollahi F, Talebi H A, Patel R V. A stable neural network-based observer with application to flexible-joint manipulators. IEEE Transactions on Neural Networks, 006, 7): Lin J S, Kanellakopoulos I. Nonlinearities enhance parameter convergence in strict feedback systems. IEEE Transactions on Automatic Control, 999, 44): Edwards C, Spurgeon S K. Sliding Mode Control: Theory and Applications. Boca Raton: CRC Press, Sira-Ramirez H. Differential geometric methods in variable-structure control. International Journal of Control, 988, 48 4): Nevistic V, Primbs J A. Constrained Nonlinear Optimal Control: A Converse HJB Approach, Technical Report CIT-CDS 96-0, California Institute of Technology, Pasadena, CA, 996. Jing Na Professor in Kunming University of Science and Technology. He received his Ph. D. degree from Beijing Institute of Technology in 00. From 0 to 0, he was a Postdoctoral Fellow with the ITER Organization. His research interest covers intelligent control, adaptive parameter estimation, neural networks, repetitive control, and nonlinear control & applications. Corresponding author of this paper. Guido Herrmann Received his Ph. D. degree from University of Leicester, UK, in 00. From 00 to 003, he was a Senior Research Fellow in the Data Storage Institute in Singapore. From 003 until 007, he was a research associate, fellow, and lecturer in University of Leicester. He joined University of Bristol, UK, as a lecturer in March 007. He was promoted to a Senior Lecturer in 009 and a Reader in Control and Dynamics in 0. He is a Senior Member of the IEEE. His research interest covers the development and application of novel, robust and nonlinear control systems.


More information

Stability of Feedback Solutions for Infinite Horizon Noncooperative Differential Games

Stability of Feedback Solutions for Infinite Horizon Noncooperative Differential Games Stability of Feedback Solutions for Infinite Horizon Noncooperative Differential Games Alberto Bressan ) and Khai T. Nguyen ) *) Department of Mathematics, Penn State University **) Department of Mathematics,

More information

Prediction-based adaptive control of a class of discrete-time nonlinear systems with nonlinear growth rate

Prediction-based adaptive control of a class of discrete-time nonlinear systems with nonlinear growth rate www.scichina.com info.scichina.com www.springerlin.com Prediction-based adaptive control of a class of discrete-time nonlinear systems with nonlinear growth rate WEI Chen & CHEN ZongJi School of Automation

More information

Optimal Control. McGill COMP 765 Oct 3 rd, 2017

Optimal Control. McGill COMP 765 Oct 3 rd, 2017 Optimal Control McGill COMP 765 Oct 3 rd, 2017 Classical Control Quiz Question 1: Can a PID controller be used to balance an inverted pendulum: A) That starts upright? B) That must be swung-up (perhaps

More information

Adaptive Dynamic Inversion Control of a Linear Scalar Plant with Constrained Control Inputs

Adaptive Dynamic Inversion Control of a Linear Scalar Plant with Constrained Control Inputs 5 American Control Conference June 8-, 5. Portland, OR, USA ThA. Adaptive Dynamic Inversion Control of a Linear Scalar Plant with Constrained Control Inputs Monish D. Tandale and John Valasek Abstract

More information

A Globally Stabilizing Receding Horizon Controller for Neutrally Stable Linear Systems with Input Constraints 1

A Globally Stabilizing Receding Horizon Controller for Neutrally Stable Linear Systems with Input Constraints 1 A Globally Stabilizing Receding Horizon Controller for Neutrally Stable Linear Systems with Input Constraints 1 Ali Jadbabaie, Claudio De Persis, and Tae-Woong Yoon 2 Department of Electrical Engineering

More information

A NONLINEAR TRANSFORMATION APPROACH TO GLOBAL ADAPTIVE OUTPUT FEEDBACK CONTROL OF 3RD-ORDER UNCERTAIN NONLINEAR SYSTEMS

A NONLINEAR TRANSFORMATION APPROACH TO GLOBAL ADAPTIVE OUTPUT FEEDBACK CONTROL OF 3RD-ORDER UNCERTAIN NONLINEAR SYSTEMS Copyright 00 IFAC 15th Triennial World Congress, Barcelona, Spain A NONLINEAR TRANSFORMATION APPROACH TO GLOBAL ADAPTIVE OUTPUT FEEDBACK CONTROL OF RD-ORDER UNCERTAIN NONLINEAR SYSTEMS Choon-Ki Ahn, Beom-Soo

More information

A Sliding Mode Control based on Nonlinear Disturbance Observer for the Mobile Manipulator

A Sliding Mode Control based on Nonlinear Disturbance Observer for the Mobile Manipulator International Core Journal of Engineering Vol.3 No.6 7 ISSN: 44-895 A Sliding Mode Control based on Nonlinear Disturbance Observer for the Mobile Manipulator Yanna Si Information Engineering College Henan

More information

Concurrent Learning for Convergence in Adaptive Control without Persistency of Excitation

Concurrent Learning for Convergence in Adaptive Control without Persistency of Excitation Concurrent Learning for Convergence in Adaptive Control without Persistency of Excitation Girish Chowdhary and Eric Johnson Abstract We show that for an adaptive controller that uses recorded and instantaneous

More information

Nonlinear Tracking Control of Underactuated Surface Vessel

Nonlinear Tracking Control of Underactuated Surface Vessel American Control Conference June -. Portland OR USA FrB. Nonlinear Tracking Control of Underactuated Surface Vessel Wenjie Dong and Yi Guo Abstract We consider in this paper the tracking control problem

More information

Target Localization and Circumnavigation Using Bearing Measurements in 2D

Target Localization and Circumnavigation Using Bearing Measurements in 2D Target Localization and Circumnavigation Using Bearing Measurements in D Mohammad Deghat, Iman Shames, Brian D. O. Anderson and Changbin Yu Abstract This paper considers the problem of localization and

More information

AFAULT diagnosis procedure is typically divided into three

AFAULT diagnosis procedure is typically divided into three 576 IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 47, NO. 4, APRIL 2002 A Robust Detection and Isolation Scheme for Abrupt and Incipient Faults in Nonlinear Systems Xiaodong Zhang, Marios M. Polycarpou,

More information

Several Extensions in Methods for Adaptive Output Feedback Control

Several Extensions in Methods for Adaptive Output Feedback Control Several Extensions in Methods for Adaptive Output Feedback Control Nakwan Kim Postdoctoral Fellow School of Aerospace Engineering Georgia Institute of Technology Atlanta, GA 333 5 Anthony J. Calise Professor

More information

Observer-based Adaptive Optimal Control for Unknown Singularly Perturbed Nonlinear Systems With Input Constraints

Observer-based Adaptive Optimal Control for Unknown Singularly Perturbed Nonlinear Systems With Input Constraints 48 IEEE/CAA JOURNAL OF AUTOMATICA SINICA, VOL. 4, NO. 1, JANUARY 17 Observer-based Adaptive Optimal Control for Unknown Singularly Perturbed Nonlinear Systems With Input Constraints Zhijun Fu, Wenfang

More information

Robust Observer for Uncertain T S model of a Synchronous Machine

Robust Observer for Uncertain T S model of a Synchronous Machine Recent Advances in Circuits Communications Signal Processing Robust Observer for Uncertain T S model of a Synchronous Machine OUAALINE Najat ELALAMI Noureddine Laboratory of Automation Computer Engineering

More information

FINITE HORIZON ROBUST MODEL PREDICTIVE CONTROL USING LINEAR MATRIX INEQUALITIES. Danlei Chu, Tongwen Chen, Horacio J. Marquez

FINITE HORIZON ROBUST MODEL PREDICTIVE CONTROL USING LINEAR MATRIX INEQUALITIES. Danlei Chu, Tongwen Chen, Horacio J. Marquez FINITE HORIZON ROBUST MODEL PREDICTIVE CONTROL USING LINEAR MATRIX INEQUALITIES Danlei Chu Tongwen Chen Horacio J Marquez Department of Electrical and Computer Engineering University of Alberta Edmonton

More information

CHATTERING-FREE SMC WITH UNIDIRECTIONAL AUXILIARY SURFACES FOR NONLINEAR SYSTEM WITH STATE CONSTRAINTS. Jian Fu, Qing-Xian Wu and Ze-Hui Mao

CHATTERING-FREE SMC WITH UNIDIRECTIONAL AUXILIARY SURFACES FOR NONLINEAR SYSTEM WITH STATE CONSTRAINTS. Jian Fu, Qing-Xian Wu and Ze-Hui Mao International Journal of Innovative Computing, Information and Control ICIC International c 2013 ISSN 1349-4198 Volume 9, Number 12, December 2013 pp. 4793 4809 CHATTERING-FREE SMC WITH UNIDIRECTIONAL

More information

MODEL-BASED REINFORCEMENT LEARNING FOR ONLINE APPROXIMATE OPTIMAL CONTROL

MODEL-BASED REINFORCEMENT LEARNING FOR ONLINE APPROXIMATE OPTIMAL CONTROL MODEL-BASED REINFORCEMENT LEARNING FOR ONLINE APPROXIMATE OPTIMAL CONTROL By RUSHIKESH LAMBODAR KAMALAPURKAR A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT

More information

90 IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 19, NO. 1, JANUARY /$ IEEE

90 IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 19, NO. 1, JANUARY /$ IEEE 90 IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 19, NO. 1, JANUARY 2008 Generalized Hamilton Jacobi Bellman Formulation -Based Neural Network Control of Affine Nonlinear Discrete-Time Systems Zheng Chen,

More information

The ϵ-capacity of a gain matrix and tolerable disturbances: Discrete-time perturbed linear systems

The ϵ-capacity of a gain matrix and tolerable disturbances: Discrete-time perturbed linear systems IOSR Journal of Mathematics (IOSR-JM) e-issn: 2278-5728, p-issn: 2319-765X. Volume 11, Issue 3 Ver. IV (May - Jun. 2015), PP 52-62 www.iosrjournals.org The ϵ-capacity of a gain matrix and tolerable disturbances:

More information

Robust Adaptive Attitude Control of a Spacecraft

Robust Adaptive Attitude Control of a Spacecraft Robust Adaptive Attitude Control of a Spacecraft AER1503 Spacecraft Dynamics and Controls II April 24, 2015 Christopher Au Agenda Introduction Model Formulation Controller Designs Simulation Results 2

More information

Set-based adaptive estimation for a class of nonlinear systems with time-varying parameters

Set-based adaptive estimation for a class of nonlinear systems with time-varying parameters Preprints of the 8th IFAC Symposium on Advanced Control of Chemical Processes The International Federation of Automatic Control Furama Riverfront, Singapore, July -3, Set-based adaptive estimation for

More information

Neural-network-observer-based optimal control for unknown nonlinear systems using adaptive dynamic programming

Neural-network-observer-based optimal control for unknown nonlinear systems using adaptive dynamic programming International Journal of Control, 013 Vol. 86, No. 9, 1554 1566, http://dx.doi.org/10.1080/0007179.013.79056 Neural-network-observer-based optimal control for unknown nonlinear systems using adaptive dynamic

More information

A Neuron-Network-Based Optimal Control of Ultra-Capacitors with System Uncertainties

A Neuron-Network-Based Optimal Control of Ultra-Capacitors with System Uncertainties THIS PAPER HAS BEEN ACCEPTED BY IEEE ISGT NA 219. 1 A Neuron-Network-Based Optimal Control of Ultra-Capacitors with System Uncertainties Jiajun Duan, Zhehan Yi, Di Shi, Hao Xu, and Zhiwei Wang GEIRI North

More information

Approximation-Free Prescribed Performance Control

Approximation-Free Prescribed Performance Control Preprints of the 8th IFAC World Congress Milano Italy August 28 - September 2 2 Approximation-Free Prescribed Performance Control Charalampos P. Bechlioulis and George A. Rovithakis Department of Electrical

More information

ONLINE LEARNING ALGORITHM FOR ZERO-SUM GAMES WITH INTEGRAL REINFORCEMENT LEARNING

ONLINE LEARNING ALGORITHM FOR ZERO-SUM GAMES WITH INTEGRAL REINFORCEMENT LEARNING JAISCR,, Vol., No.4, pp. 35 33 ONLINE LEARNING ALGORIHM FOR ZERO-SUM GAMES WIH INEGRAL REINFORCEMEN LEARNING Kyriakos G. Vamvoudakis, Draguna Vrabie, Frank L. Lewis Automation and Robotics Research Institute,

More information

MCE/EEC 647/747: Robot Dynamics and Control. Lecture 12: Multivariable Control of Robotic Manipulators Part II

MCE/EEC 647/747: Robot Dynamics and Control. Lecture 12: Multivariable Control of Robotic Manipulators Part II MCE/EEC 647/747: Robot Dynamics and Control Lecture 12: Multivariable Control of Robotic Manipulators Part II Reading: SHV Ch.8 Mechanical Engineering Hanz Richter, PhD MCE647 p.1/14 Robust vs. Adaptive

More information

Contraction Based Adaptive Control of a Class of Nonlinear Systems

Contraction Based Adaptive Control of a Class of Nonlinear Systems 9 American Control Conference Hyatt Regency Riverfront, St. Louis, MO, USA June -, 9 WeB4.5 Contraction Based Adaptive Control of a Class of Nonlinear Systems B. B. Sharma and I. N. Kar, Member IEEE Abstract

More information

Adaptive NN Control of Dynamic Systems with Unknown Dynamic Friction

Adaptive NN Control of Dynamic Systems with Unknown Dynamic Friction Adaptive NN Control of Dynamic Systems with Unknown Dynamic Friction S. S. Ge 1,T.H.LeeandJ.Wang Department of Electrical and Computer Engineering National University of Singapore Singapore 117576 Abstract

More information

Robust Adaptive MPC for Systems with Exogeneous Disturbances

Robust Adaptive MPC for Systems with Exogeneous Disturbances Robust Adaptive MPC for Systems with Exogeneous Disturbances V. Adetola M. Guay Department of Chemical Engineering, Queen s University, Kingston, Ontario, Canada (e-mail: martin.guay@chee.queensu.ca) Abstract:

More information

EE C128 / ME C134 Feedback Control Systems

EE C128 / ME C134 Feedback Control Systems EE C128 / ME C134 Feedback Control Systems Lecture Additional Material Introduction to Model Predictive Control Maximilian Balandat Department of Electrical Engineering & Computer Science University of

More information

Adaptive Robust Tracking Control of Robot Manipulators in the Task-space under Uncertainties

Adaptive Robust Tracking Control of Robot Manipulators in the Task-space under Uncertainties Australian Journal of Basic and Applied Sciences, 3(1): 308-322, 2009 ISSN 1991-8178 Adaptive Robust Tracking Control of Robot Manipulators in the Task-space under Uncertainties M.R.Soltanpour, M.M.Fateh

More information

Event-sampled direct adaptive neural network control of uncertain strict-feedback system with application to quadrotor unmanned aerial vehicle

Event-sampled direct adaptive neural network control of uncertain strict-feedback system with application to quadrotor unmanned aerial vehicle Scholars' Mine Masters Theses Student Research & Creative Works Fall 2016 Event-sampled direct adaptive neural network control of uncertain strict-feedback system with application to quadrotor unmanned

More information

An Approach of Robust Iterative Learning Control for Uncertain Systems

An Approach of Robust Iterative Learning Control for Uncertain Systems ,,, 323 E-mail: mxsun@zjut.edu.cn :, Lyapunov( ),,.,,,.,,. :,,, An Approach of Robust Iterative Learning Control for Uncertain Systems Mingxuan Sun, Chaonan Jiang, Yanwei Li College of Information Engineering,

More information

CHATTERING REDUCTION OF SLIDING MODE CONTROL BY LOW-PASS FILTERING THE CONTROL SIGNAL

CHATTERING REDUCTION OF SLIDING MODE CONTROL BY LOW-PASS FILTERING THE CONTROL SIGNAL Asian Journal of Control, Vol. 12, No. 3, pp. 392 398, May 2010 Published online 25 February 2010 in Wiley InterScience (www.interscience.wiley.com) DOI: 10.1002/asjc.195 CHATTERING REDUCTION OF SLIDING

More information

Event-Triggered Decentralized Dynamic Output Feedback Control for LTI Systems

Event-Triggered Decentralized Dynamic Output Feedback Control for LTI Systems Event-Triggered Decentralized Dynamic Output Feedback Control for LTI Systems Pavankumar Tallapragada Nikhil Chopra Department of Mechanical Engineering, University of Maryland, College Park, 2742 MD,

More information

ADAPTIVE FILTER THEORY

ADAPTIVE FILTER THEORY ADAPTIVE FILTER THEORY Fourth Edition Simon Haykin Communications Research Laboratory McMaster University Hamilton, Ontario, Canada Front ice Hall PRENTICE HALL Upper Saddle River, New Jersey 07458 Preface

More information

Robust Stabilization of Non-Minimum Phase Nonlinear Systems Using Extended High Gain Observers

Robust Stabilization of Non-Minimum Phase Nonlinear Systems Using Extended High Gain Observers 28 American Control Conference Westin Seattle Hotel, Seattle, Washington, USA June 11-13, 28 WeC15.1 Robust Stabilization of Non-Minimum Phase Nonlinear Systems Using Extended High Gain Observers Shahid

More information

RESEARCH ON TRACKING AND SYNCHRONIZATION OF UNCERTAIN CHAOTIC SYSTEMS

RESEARCH ON TRACKING AND SYNCHRONIZATION OF UNCERTAIN CHAOTIC SYSTEMS Computing and Informatics, Vol. 3, 13, 193 1311 RESEARCH ON TRACKING AND SYNCHRONIZATION OF UNCERTAIN CHAOTIC SYSTEMS Junwei Lei, Hongchao Zhao, Jinyong Yu Zuoe Fan, Heng Li, Kehua Li Naval Aeronautical

More information

Multi-Robotic Systems

Multi-Robotic Systems CHAPTER 9 Multi-Robotic Systems The topic of multi-robotic systems is quite popular now. It is believed that such systems can have the following benefits: Improved performance ( winning by numbers ) Distributed

More information

Variable Learning Rate LMS Based Linear Adaptive Inverse Control *

Variable Learning Rate LMS Based Linear Adaptive Inverse Control * ISSN 746-7659, England, UK Journal of Information and Computing Science Vol., No. 3, 6, pp. 39-48 Variable Learning Rate LMS Based Linear Adaptive Inverse Control * Shuying ie, Chengjin Zhang School of

More information

Lyapunov Stability of Linear Predictor Feedback for Distributed Input Delays

Lyapunov Stability of Linear Predictor Feedback for Distributed Input Delays IEEE TRANSACTIONS ON AUTOMATIC CONTROL VOL. 56 NO. 3 MARCH 2011 655 Lyapunov Stability of Linear Predictor Feedback for Distributed Input Delays Nikolaos Bekiaris-Liberis Miroslav Krstic In this case system

More information

Control of industrial robots. Centralized control

Control of industrial robots. Centralized control Control of industrial robots Centralized control Prof. Paolo Rocco (paolo.rocco@polimi.it) Politecnico di Milano ipartimento di Elettronica, Informazione e Bioingegneria Introduction Centralized control

More information

Stochastic and Adaptive Optimal Control

Stochastic and Adaptive Optimal Control Stochastic and Adaptive Optimal Control Robert Stengel Optimal Control and Estimation, MAE 546 Princeton University, 2018! Nonlinear systems with random inputs and perfect measurements! Stochastic neighboring-optimal

More information

Adaptive Robust Control for Servo Mechanisms With Partially Unknown States via Dynamic Surface Control Approach

Adaptive Robust Control for Servo Mechanisms With Partially Unknown States via Dynamic Surface Control Approach IEEE TRANSACTIONS ON CONTROL SYSTEMS TECHNOLOGY, VOL. 18, NO. 3, MAY 2010 723 Adaptive Robust Control for Servo Mechanisms With Partially Unknown States via Dynamic Surface Control Approach Guozhu Zhang,

More information

arxiv: v1 [math.oc] 30 May 2014

arxiv: v1 [math.oc] 30 May 2014 When is a Parameterized Controller Suitable for Adaptive Control? arxiv:1405.7921v1 [math.oc] 30 May 2014 Romeo Ortega and Elena Panteley Laboratoire des Signaux et Systèmes, CNRS SUPELEC, 91192 Gif sur

More information

Risk-Sensitive Control with HARA Utility

Risk-Sensitive Control with HARA Utility IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 46, NO. 4, APRIL 2001 563 Risk-Sensitive Control with HARA Utility Andrew E. B. Lim Xun Yu Zhou, Senior Member, IEEE Abstract In this paper, a control methodology

More information

Adaptive estimation in nonlinearly parameterized nonlinear dynamical systems

Adaptive estimation in nonlinearly parameterized nonlinear dynamical systems 2 American Control Conference on O'Farrell Street, San Francisco, CA, USA June 29 - July, 2 Adaptive estimation in nonlinearly parameterized nonlinear dynamical systems Veronica Adetola, Devon Lehrer and

More information

Learning Model Predictive Control for Iterative Tasks: A Computationally Efficient Approach for Linear System

Learning Model Predictive Control for Iterative Tasks: A Computationally Efficient Approach for Linear System Learning Model Predictive Control for Iterative Tasks: A Computationally Efficient Approach for Linear System Ugo Rosolia Francesco Borrelli University of California at Berkeley, Berkeley, CA 94701, USA

More information

ADAPTIVE EXTREMUM SEEKING CONTROL OF CONTINUOUS STIRRED TANK BIOREACTORS 1

ADAPTIVE EXTREMUM SEEKING CONTROL OF CONTINUOUS STIRRED TANK BIOREACTORS 1 ADAPTIVE EXTREMUM SEEKING CONTROL OF CONTINUOUS STIRRED TANK BIOREACTORS M. Guay, D. Dochain M. Perrier Department of Chemical Engineering, Queen s University, Kingston, Ontario, Canada K7L 3N6 CESAME,

More information

Nonlinear Model Predictive Control Tools (NMPC Tools)

Nonlinear Model Predictive Control Tools (NMPC Tools) Nonlinear Model Predictive Control Tools (NMPC Tools) Rishi Amrit, James B. Rawlings April 5, 2008 1 Formulation We consider a control system composed of three parts([2]). Estimator Target calculator Regulator

More information

Output Regulation of Uncertain Nonlinear Systems with Nonlinear Exosystems

Output Regulation of Uncertain Nonlinear Systems with Nonlinear Exosystems Output Regulation of Uncertain Nonlinear Systems with Nonlinear Exosystems Zhengtao Ding Manchester School of Engineering, University of Manchester Oxford Road, Manchester M3 9PL, United Kingdom zhengtaoding@manacuk

More information

A Recurrent Neural Network for Solving Sylvester Equation With Time-Varying Coefficients

A Recurrent Neural Network for Solving Sylvester Equation With Time-Varying Coefficients IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL 13, NO 5, SEPTEMBER 2002 1053 A Recurrent Neural Network for Solving Sylvester Equation With Time-Varying Coefficients Yunong Zhang, Danchi Jiang, Jun Wang, Senior

More information

State Regulator. Advanced Control. design of controllers using pole placement and LQ design rules

State Regulator. Advanced Control. design of controllers using pole placement and LQ design rules Advanced Control State Regulator Scope design of controllers using pole placement and LQ design rules Keywords pole placement, optimal control, LQ regulator, weighting matrixes Prerequisites Contact state

More information

ADAPTIVE FEEDBACK LINEARIZING CONTROL OF CHUA S CIRCUIT

ADAPTIVE FEEDBACK LINEARIZING CONTROL OF CHUA S CIRCUIT International Journal of Bifurcation and Chaos, Vol. 12, No. 7 (2002) 1599 1604 c World Scientific Publishing Company ADAPTIVE FEEDBACK LINEARIZING CONTROL OF CHUA S CIRCUIT KEVIN BARONE and SAHJENDRA

More information

Neural Network-Based Adaptive Control of Robotic Manipulator: Application to a Three Links Cylindrical Robot

Neural Network-Based Adaptive Control of Robotic Manipulator: Application to a Three Links Cylindrical Robot Vol.3 No., 27 مجلد 3 العدد 27 Neural Network-Based Adaptive Control of Robotic Manipulator: Application to a Three Links Cylindrical Robot Abdul-Basset A. AL-Hussein Electrical Engineering Department Basrah

More information

Output Feedback Stabilization with Prescribed Performance for Uncertain Nonlinear Systems in Canonical Form

Output Feedback Stabilization with Prescribed Performance for Uncertain Nonlinear Systems in Canonical Form Output Feedback Stabilization with Prescribed Performance for Uncertain Nonlinear Systems in Canonical Form Charalampos P. Bechlioulis, Achilles Theodorakopoulos 2 and George A. Rovithakis 2 Abstract The

More information

Adaptive State Feedback Nash Strategies for Linear Quadratic Discrete-Time Games

Adaptive State Feedback Nash Strategies for Linear Quadratic Discrete-Time Games Adaptive State Feedbac Nash Strategies for Linear Quadratic Discrete-Time Games Dan Shen and Jose B. Cruz, Jr. Intelligent Automation Inc., Rocville, MD 2858 USA (email: dshen@i-a-i.com). The Ohio State

More information

Locally optimal controllers and application to orbital transfer (long version)

Locally optimal controllers and application to orbital transfer (long version) 9th IFAC Symposium on Nonlinear Control Systems Toulouse, France, September 4-6, 13 FrA1.4 Locally optimal controllers and application to orbital transfer (long version) S. Benachour V. Andrieu Université

More information

1 The Observability Canonical Form

1 The Observability Canonical Form NONLINEAR OBSERVERS AND SEPARATION PRINCIPLE 1 The Observability Canonical Form In this Chapter we discuss the design of observers for nonlinear systems modelled by equations of the form ẋ = f(x, u) (1)

More information

IEOR 265 Lecture 14 (Robust) Linear Tube MPC

IEOR 265 Lecture 14 (Robust) Linear Tube MPC IEOR 265 Lecture 14 (Robust) Linear Tube MPC 1 LTI System with Uncertainty Suppose we have an LTI system in discrete time with disturbance: x n+1 = Ax n + Bu n + d n, where d n W for a bounded polytope

More information

Adaptive backstepping for trajectory tracking of nonlinearly parameterized class of nonlinear systems

Adaptive backstepping for trajectory tracking of nonlinearly parameterized class of nonlinear systems Adaptive backstepping for trajectory tracking of nonlinearly parameterized class of nonlinear systems Hakim Bouadi, Felix Antonio Claudio Mora-Camino To cite this version: Hakim Bouadi, Felix Antonio Claudio

More information

AS A POPULAR approach for compensating external

AS A POPULAR approach for compensating external IEEE TRANSACTIONS ON CONTROL SYSTEMS TECHNOLOGY, VOL. 16, NO. 1, JANUARY 2008 137 A Novel Robust Nonlinear Motion Controller With Disturbance Observer Zi-Jiang Yang, Hiroshi Tsubakihara, Shunshoku Kanae,

More information

Adaptive linear quadratic control using policy. iteration. Steven J. Bradtke. University of Massachusetts.

Adaptive linear quadratic control using policy. iteration. Steven J. Bradtke. University of Massachusetts. Adaptive linear quadratic control using policy iteration Steven J. Bradtke Computer Science Department University of Massachusetts Amherst, MA 01003 bradtke@cs.umass.edu B. Erik Ydstie Department of Chemical

More information

Introduction to Nonlinear Control Lecture # 3 Time-Varying and Perturbed Systems

Introduction to Nonlinear Control Lecture # 3 Time-Varying and Perturbed Systems p. 1/5 Introduction to Nonlinear Control Lecture # 3 Time-Varying and Perturbed Systems p. 2/5 Time-varying Systems ẋ = f(t, x) f(t, x) is piecewise continuous in t and locally Lipschitz in x for all t

More information

Prashant Mhaskar, Nael H. El-Farra & Panagiotis D. Christofides. Department of Chemical Engineering University of California, Los Angeles

Prashant Mhaskar, Nael H. El-Farra & Panagiotis D. Christofides. Department of Chemical Engineering University of California, Los Angeles HYBRID PREDICTIVE OUTPUT FEEDBACK STABILIZATION OF CONSTRAINED LINEAR SYSTEMS Prashant Mhaskar, Nael H. El-Farra & Panagiotis D. Christofides Department of Chemical Engineering University of California,

More information

A Novel Integral-Based Event Triggering Control for Linear Time-Invariant Systems

A Novel Integral-Based Event Triggering Control for Linear Time-Invariant Systems 53rd IEEE Conference on Decision and Control December 15-17, 2014. Los Angeles, California, USA A Novel Integral-Based Event Triggering Control for Linear Time-Invariant Systems Seyed Hossein Mousavi 1,

More information

Output-feedback Dynamic Surface Control for a Class of Nonlinear Non-minimum Phase Systems

Output-feedback Dynamic Surface Control for a Class of Nonlinear Non-minimum Phase Systems 96 IEEE/CAA JOURNAL OF AUTOMATICA SINICA, VOL. 3, NO., JANUARY 06 Output-feedback Dynamic Surface Control for a Class of Nonlinear Non-minimum Phase Systems Shanwei Su Abstract In this paper, an output-feedback

More information

An Adaptive LQG Combined With the MRAS Based LFFC for Motion Control Systems

An Adaptive LQG Combined With the MRAS Based LFFC for Motion Control Systems Journal of Automation Control Engineering Vol 3 No 2 April 2015 An Adaptive LQG Combined With the MRAS Based LFFC for Motion Control Systems Nguyen Duy Cuong Nguyen Van Lanh Gia Thi Dinh Electronics Faculty

More information

Author's Accepted Manuscript

Author's Accepted Manuscript Author's Accepted Manuscript Dual Heuristic Dynamic Programming for Nonlinear Discrete-Time Uncertain Systems with State Delay Bin Wang, Dongbin Zhao, Cesare Alippi, Derong Liu www.elsevier.com/locate/neucom

More information

SOLVING the Hamilton Jacobi Bellman (HJB) equation

SOLVING the Hamilton Jacobi Bellman (HJB) equation 15 IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, VOL. 9, NO. 6, JUNE 018 Approximate Dynamic Programming: Combining Regional and Local State Following Approximations Patryk Deptula,JoelA.Rosenfeld,

More information

Unifying Behavior-Based Control Design and Hybrid Stability Theory

Unifying Behavior-Based Control Design and Hybrid Stability Theory 9 American Control Conference Hyatt Regency Riverfront St. Louis MO USA June - 9 ThC.6 Unifying Behavior-Based Control Design and Hybrid Stability Theory Vladimir Djapic 3 Jay Farrell 3 and Wenjie Dong

More information

UNCERTAIN CHAOTIC SYSTEM CONTROL VIA ADAPTIVE NEURAL DESIGN

UNCERTAIN CHAOTIC SYSTEM CONTROL VIA ADAPTIVE NEURAL DESIGN International Journal of Bifurcation and Chaos, Vol., No. 5 (00) 097 09 c World Scientific Publishing Company UNCERTAIN CHAOTIC SYSTEM CONTROL VIA ADAPTIVE NEURAL DESIGN S. S. GE and C. WANG Department

More information

Optimal Control. Lecture 18. Hamilton-Jacobi-Bellman Equation, Cont. John T. Wen. March 29, Ref: Bryson & Ho Chapter 4.

Optimal Control. Lecture 18. Hamilton-Jacobi-Bellman Equation, Cont. John T. Wen. March 29, Ref: Bryson & Ho Chapter 4. Optimal Control Lecture 18 Hamilton-Jacobi-Bellman Equation, Cont. John T. Wen Ref: Bryson & Ho Chapter 4. March 29, 2004 Outline Hamilton-Jacobi-Bellman (HJB) Equation Iterative solution of HJB Equation

More information

On the stability of receding horizon control with a general terminal cost

On the stability of receding horizon control with a general terminal cost On the stability of receding horizon control with a general terminal cost Ali Jadbabaie and John Hauser Abstract We study the stability and region of attraction properties of a family of receding horizon

More information