Inverse optimal control for unmanned aerial helicopters with disturbances

Received: 4 February 8 Revised: September 8 Accepted: September 8
DOI:./oca.47

RESEARCH ARTICLE

Haoxiang Ma, Mou Chen, Qingxian Wu

College of Automation Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing, China

Correspondence: Mou Chen, College of Automation Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 6, China. Email: chenmou@nuaa.edu.com

Present Address: Nanjing University of Aeronautics and Astronautics, Sub-box 69 of Main Post Box 59, No. 69 Sheng Tai West Road, Jiang Ning District, Nanjing 6, Jiangsu Province, China.

Funding information: National Natural Science Foundation of China, Grant/Award Number: 65784; Jiangsu Natural Science Foundation of China, Grant/Award Number: BK747; Aeronautical Science Foundation of China, Grant/Award Number: 657549; Fundamental Research Funds for the Central Universities, Grant/Award Number: NE6

Summary
This paper proposes an optimal control method for an unmanned aerial helicopter (UAH) subject to unknown disturbances. Solving the Hamilton-Jacobi-Bellman (HJB) equation is the common approach to designing an optimal controller under a meaningful cost function for a nonlinear optimal control problem. However, the HJB equation is hard to solve even for a simple problem. The inverse optimal control method, which avoids the difficulty of solving the HJB equation, is therefore adopted. This approach requires a stabilizing optimal control law and a particular cost function, both obtained from a control Lyapunov function. An integrator backstepping method is used to design the optimal control law of the UAH. Furthermore, a disturbance-observer-based control (DOBC) approach is incorporated into the optimal control law to deal with the unknown disturbances of the UAH system.
Simulation results are given to verify the stability of the nonlinear UAH system and the validity of the developed control method.

KEYWORDS
attitude and altitude control, backstepping, disturbance observer based control, inverse optimal control, unmanned aerial helicopter

1 INTRODUCTION
With the improvement of unmanned aerial vehicle technology, unmanned aerial helicopters (UAHs) are widely used in civilian and military fields. The UAH is an underactuated nonlinear mechanical system whose dynamics are highly coupled among different channels, which makes it sensitive to external disturbances. Because of the complicated aerodynamics produced by the main rotor thrust and the tail thrust, unknown parameters and model uncertainty appear in the UAH model. This makes high-performance controller design more challenging. To design workable controllers for the UAH, linearization-based control of helicopter dynamics was used in early years. In the work of Shin et al,[1] an optimal control law for an unmanned helicopter with attitude and positioning loops was designed. In addition, a tracking control method based on neural dynamic programming was adopted in the work of Enns and Si.[2] Furthermore, there are a few research results on helicopter flight control based on nonlinear dynamic characteristics.[3,4] Recently, robust adaptive backstepping control was applied to a UAH with flapping dynamics in the work of Yan et al.[5] However, optimal control of the nonlinear UAH system has rarely been considered.

5 8 John Wiley & Sons, Ltd. wileyonlinelibrary.com/journal/oca Optim Control Appl Meth. 9;4:5 7.

MA ET AL.

The optimal control problem of a linear system can be converted into a minimization problem, which can be settled by solving the associated Riccati equation. In the work of Budiyono and Wibowo,[6] an optimal tracking controller was designed for a small-scale unmanned helicopter by using feedback linearization. To track the hover attitude trajectory, a linear quadratic optimal model-following control was developed for a helicopter in the work of Pieper et al.[7] When higher modeling precision is required, optimal control of nonlinear systems, which requires solving the nonlinear HJB equation, has been studied further. In the work of Enns and Si,[8] a neural network (NN) dynamic programming control method under optimal conditions was used for the UAH. To handle the trajectory tracking problem, an output-feedback optimal controller was designed for the UAH in the work of Devasia.[9] In the work of Shin et al, a nonlinear model predictive controller was developed by using an NN. However, the HJB equation generally has no closed-form solution and may admit no solution or multiple solutions. It is therefore hard to solve the HJB equation for the nonlinear UAH system.

In this paper, we adopt an inverse approach to design the optimal control laws for the attitude and altitude system of the UAH with parametric uncertainties and unknown external disturbances. This inverse approach to deriving the optimal feedback controller is illustrated in the work of Sepulchre et al,[10] where it is called the inverse optimal control approach. Its advantage is that it avoids solving the HJB equation, and the designed feedback controller is optimal with respect to a family of meaningful cost functions. The inverse optimal approach was proposed by Kalman and treated with nonlinear regulator theory in the work of Moylan and Anderson.[11]
In the work of Anderson and Moore,[12] the robust adaptive control method was adopted to solve the problem of linear quadratic regulators. The inverse optimal approach has been applied extensively to rigid spacecraft attitude control.[13-15] To apply the inverse optimal method, a control Lyapunov function (CLF) and a stabilizing control law in a particular form are required to minimize a particular cost function of the UAH. In a practical environment, the UAH is subjected to many unknown external disturbances during flight. Thus, how to guarantee the effectiveness of the inverse optimal controller under unknown external disturbances is a significant problem.

Considering the practical flight environment, the external disturbance should be taken into account when establishing the attitude and altitude system of the UAH. To compensate for the effect of the external disturbance, the concept of DOBC was proposed in the work of Chen et al.[16] To combine the DOBC method with nonlinear systems, related techniques were further studied in the work of Chen.[17] In the work of Chen and Chen,[18] a sliding mode control (SMC) method based on a disturbance observer was proposed for nonlinear systems. In addition, a control scheme combining DOBC with terminal sliding mode control was applied to a class of multiple-input multiple-output continuous nonlinear systems with disturbances in the work of Wei and Guo.[19] In the work of Guo and Cao,[20] recent advances in DOBC theory are introduced and, in particular, composite hierarchical antidisturbance control is addressed for the first time. A DOBC scheme combined with an NN and the backstepping method was developed to achieve a composite antidisturbance controller design in the work of Sun and Guo.[21]
Combining a disturbance observer with an NN, a robust flight control approach was studied for hypersonic vehicles with uncertainties and external disturbances in the work of Chen et al.[22] In the work of Xu,[23] DOBC was combined with the dynamic surface control method for a transport aircraft. For hypersonic vehicles, flight dynamics and control methods were surveyed in the work of Xu and Shi.[24] In the work of Xu et al,[25] disturbance-observer-based neural adaptive control of the longitudinal dynamics of a flexible hypersonic flight vehicle was studied in the presence of wind effects. Furthermore, considering input saturation, a robust constrained SMC scheme was designed for cascade nonlinear systems with unknown external disturbances in the work of Chen et al.[26]

The DOBC method has also been applied to quadrotor vehicles. For instance, an SMC approach based on disturbance observers was used to design a robust flight controller for a small quadrotor vehicle in the work of Besnard et al.[27] In another work of Besnard et al,[28] SMC driven by a sliding mode disturbance observer was used to design a robust flight controller for a small quadrotor vehicle. Recently, a robust tracking control scheme was proposed for an unmanned quadrotor with perturbed parameters and external disturbances in the work of Cheng et al.[29] Considering the parametric uncertainties and unknown external disturbances of UAHs, a robust optimal adaptive control strategy was used to deal with the tracking problem in the work of Fang et al.[30] In the work of Park,[31] considering external disturbances, the robust inverse optimal control approach was applied to the nonlinear attitude model of a rigid spacecraft.
For the attitude model of the UAH, since the control inputs in this paper are the control moments of the UAH, not the blade cyclic pitches and squanders of the UAH, the kinetic equations are similar to the dynamics of a quadrotor. For the altitude model, unlike the quadrotor, the control input is the thrust of the main rotor, not the resultant force of four rotors. A backstepping method has been adopted by

designing a virtual control law. However, the optimal control problem of the UAH with external disturbances still needs further study.

This paper is organized as follows. Section 2 introduces the concepts of the inverse optimal approach and the DOBC method, which provide a convenient form for controller design, and gives a model of the attitude and altitude system of the UAH with external disturbances. In Section 3, an inverse optimal control law based on a disturbance observer is designed. Section 4 gives numerical simulation results that demonstrate the stability and practicability of the designed controller.

2 PROBLEM STATEMENT AND MODELING
According to the kinematic equations of the UAH, the attitude and altitude motion of the UAH can be described as follows[32]:

$$
\begin{aligned}
\dot{\Omega} &= H\omega \\
\dot{\omega} &= -J^{-1}\omega \times J\omega + J^{-1}\bar{u} \\
\dot{H}_z &= v \\
m\dot{v} &= \cos\phi\cos\theta\, T_{mr} - mg,
\end{aligned}
\tag{1}
$$

where $H_z$ and $v$ are the altitude and velocity of the UAH along the z-axis, respectively, $m$ is the gross mass of the UAH, and $g$ denotes the gravitational acceleration. $\Omega = [\phi, \theta, \psi]^T$ is the vector of Euler angles: roll ($\phi$), pitch ($\theta$), and yaw ($\psi$). $\omega = [p, q, r]^T$ is the angular rate vector in the body-fixed frame. $\bar{u} = [\Sigma L, \Sigma M, \Sigma N]^T$ denotes the total torque control input vector of the UAH about the x-axis, y-axis, and z-axis, respectively. $T_{mr}$ is the thrust generated by the main rotor. The attitude kinematic matrix $H$ is defined as

$$
H = \begin{bmatrix}
1 & \sin\phi\tan\theta & \cos\phi\tan\theta \\
0 & \cos\phi & -\sin\phi \\
0 & \dfrac{\sin\phi}{\cos\theta} & \dfrac{\cos\phi}{\cos\theta}
\end{bmatrix}.
\tag{2}
$$

The inertia matrix $J$ is defined as

$$
J = \mathrm{diag}(J_{xx}, J_{yy}, J_{zz}),
\tag{3}
$$

where $J_{xx}$, $J_{yy}$, and $J_{zz}$ are the moments of inertia of the UAH about the x-axis, y-axis, and z-axis, respectively. To replace the cross product, the operator $S(\cdot)$ is introduced, which denotes a skew-symmetric matrix, that is,

$$
S(\omega) = \begin{bmatrix}
0 & -r & q \\
r & 0 & -p \\
-q & p & 0
\end{bmatrix}.
\tag{4}
$$

Namely, we have

$$
J^{-1}\omega \times J\omega = J^{-1}S(\omega)J\omega.
\tag{5}
$$

To describe the model of the UAH concisely, we define $M = [\Omega^T, H_z]^T$, $N = [\omega^T, v]^T$, $u = [\bar{u}^T, T_{mr}]^T$ (with $\bar{u}$ the torque input vector), and

$$
\Lambda = \begin{bmatrix} H & 0 \\ 0 & 1 \end{bmatrix}, \quad
F = \begin{bmatrix} -J^{-1}S(\omega)J\omega \\ 0 \end{bmatrix}, \quad
G = \begin{bmatrix} J^{-1} & 0 \\ 0 & \dfrac{\cos\phi\cos\theta}{m} \end{bmatrix}.
$$

Considering (1)-(5) and the external disturbance, we obtain the UAH model in the following form:

$$
\begin{aligned}
\dot{M} &= \begin{bmatrix} H & 0 \\ 0 & 1 \end{bmatrix} N \\
\dot{N} &= \begin{bmatrix} -J^{-1}S(\omega)J\omega \\ 0 \end{bmatrix}
+ \begin{bmatrix} J^{-1} & 0 \\ 0 & \dfrac{\cos\phi\cos\theta}{m} \end{bmatrix} u
+ \begin{bmatrix} D_\Omega \\ D_v - g \end{bmatrix},
\end{aligned}
\tag{6}
$$

where $D_\Omega = [D_p, D_q, D_r]^T$ and $D_v$ denote the disturbances acting on the angular rates $p$, $q$, $r$ and on the velocity of the UAH along the z-axis, respectively. Then, the attitude and altitude model of the UAH can be rewritten as

$$
\dot{M} = \Lambda N \tag{7a}
$$

$$
\dot{N} = F + Gu + D, \tag{7b}
$$

where $D = [D_p, D_q, D_r, D_v - g]^T \in \mathbb{R}^4$ denotes the unknown external disturbance.
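The building blocks of the model above can be sketched in a few lines of plain Python. This is only an illustrative sketch: the function names (`attitude_kinematic_matrix`, `skew`, `model_terms`) and the numerical inertia and mass values in the usage note are hypothetical and are not taken from the paper. The code evaluates the attitude kinematic matrix $H$, the skew-symmetric matrix $S(\omega)$, and the terms $F$ and $G$ of the compact model for a diagonal inertia matrix.

```python
import math


def attitude_kinematic_matrix(phi, theta):
    """Return H(phi, theta), mapping body rates [p, q, r] to Euler-angle rates."""
    s_phi, c_phi = math.sin(phi), math.cos(phi)
    t_theta, c_theta = math.tan(theta), math.cos(theta)
    return [
        [1.0, s_phi * t_theta, c_phi * t_theta],
        [0.0, c_phi, -s_phi],
        [0.0, s_phi / c_theta, c_phi / c_theta],
    ]


def skew(omega):
    """Return the skew-symmetric matrix S(omega), so that S(omega) x = omega x x."""
    p, q, r = omega
    return [[0.0, -r, q], [r, 0.0, -p], [-q, p, 0.0]]


def matvec(a, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def cross(a, b):
    """Cross product of two 3-vectors."""
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


def model_terms(omega, phi, theta, inertia, mass):
    """Return (F, diag of G) in N_dot = F + G u + D for a diagonal inertia matrix.

    `inertia` holds (J_xx, J_yy, J_zz); G is diagonal, so only its diagonal is returned.
    """
    j_omega = [j * w for j, w in zip(inertia, omega)]
    s_j_omega = matvec(skew(omega), j_omega)
    f = [-x / j for x, j in zip(s_j_omega, inertia)] + [0.0]
    g_diag = [1.0 / j for j in inertia] + [math.cos(phi) * math.cos(theta) / mass]
    return f, g_diag
```

For example, `model_terms([0.1, -0.2, 0.3], 0.05, 0.02, (1.0, 2.0, 3.0), 10.0)` returns a four-element `F` whose last entry is zero and the four diagonal entries of `G`. At zero roll and pitch, `attitude_kinematic_matrix(0.0, 0.0)` reduces to the identity matrix.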

With the model in (7), the control objective of this paper is to derive an inverse optimal controller based on the disturbance observer that stabilizes the system and compensates for the unknown disturbance $D$. To design this composite control law, the following lemmas and assumptions are given.

Lemma 1 (See the work of Krstic and Tsiotras[13]). Consider the nonlinear affine system

$$
\dot{x} = f(x) + g(x)u, \tag{8}
$$

where $x \in \mathbb{R}^n$ and $u \in \mathbb{R}^n$ are the state and input vectors, respectively. $f: \mathbb{R}^n \to \mathbb{R}^n$ and $g: \mathbb{R}^n \to \mathbb{R}^{n \times n}$ are smooth vector- and matrix-valued functions, respectively, with $f(0) = 0$. Then, the state-feedback control law

$$
u = -\beta R^{-1}(x)\left[L_g V(x)\right]^T \tag{9}
$$

is optimal with respect to the following cost function[13]:

$$
J = \int_0^\infty \left[ l(x) + u^T R(x) u \right] dt, \tag{10}
$$

where $\beta \geq 2$ is a design constant; $R(x)$ is a positive definite matrix to be designed, ie, $R(x) = R^T(x) > 0$ for all $x \in \mathbb{R}^n$; and $V(x)$ denotes a positive definite Lyapunov function of the system in (8). $l(x)$ is given by

$$
l(x) = -2\beta\left\{ L_f V(x) - L_g V(x) R^{-1}(x)\left[L_g V(x)\right]^T \right\} + \beta(\beta - 2) L_g V(x) R^{-1}(x)\left[L_g V(x)\right]^T. \tag{11}
$$

Lemma 2 (See the work of Chen et al[16]). Consider the nonlinear affine system (8) with an unknown disturbance $d \in \mathbb{R}^l$, which can be written as

$$
\dot{x} = f(x) + g(x)u + g_1(x)d, \tag{12}
$$

where $g_1: \mathbb{R}^n \to \mathbb{R}^{n \times l}$ is a smooth function of $x$. The disturbance observer can be designed as follows[16]:

$$
\begin{cases}
\hat{d} = \eta + P(x) \\
\dot{\eta} = -Q(x)g_1(x)\eta - Q(x)\left[ g_1(x)P(x) + f(x) + g(x)u \right],
\end{cases}
\tag{13}
$$

where $\hat{d}$ and $\eta$ are the estimate of the disturbance and the state vector of the disturbance observer, respectively. $P(x)$ is a nonlinear function to be designed. $Q(x)$ denotes a positive matrix that represents the nonlinear disturbance observer gain, which can be written as

$$
Q(x) = \frac{\partial P(x)}{\partial x}. \tag{14}
$$

Then, the disturbance observer (13) can estimate the unknown disturbance $d$.

Assumption 1. To guarantee that the attitude kinematic matrix $H$ and the diagonal matrix $G$ are nonsingular, the roll angle $\phi$ and the pitch angle $\theta$ always remain in the interval $\left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$, namely, $|\phi| < \frac{\pi}{2}$, $|\theta| < \frac{\pi}{2}$.

Assumption 2. The disturbance vector $D$ is slowly time varying, ie, $\dot{D} \approx 0$.

Remark 1. If $\dot{V} = L_f V(x) - L_g V(x) R^{-1}(x)[L_g V(x)]^T < 0$, $\forall x \neq 0$ can be ensured, then, according to (11), $l(x) > 0$, $\forall x \neq 0$. Thus, the cost function (10) represents a meaningful cost in the sense that it includes a positive penalty on the states and a positive penalty on the control for each state of the UAH system.

Remark 2. To apply Lemma 1, we consider the gravitational acceleration $g$ as a part of the external disturbance $D$. Then, we have $F(0) = 0$, which satisfies the condition of Lemma 1. The gravitational acceleration $g$ is a constant, which has no effect on the dynamic states of the UAH system. Thus, treating the gravitational acceleration $g$ as a part of the external disturbance $D_v$ does not affect the stability and feasibility of the UAH system (7).
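Lemma 1 can be made concrete on a scalar toy system. The sketch below is not part of the paper: it assumes the hypothetical plant $\dot{x} = x + u$ (so $f(x) = x$, $g(x) = 1$), the Lyapunov function $V = x^2/2$, and a constant weight $R^{-1} = 3$. It evaluates the control law (9) and the state penalty (11), and checks that the HJB residual $l + W'f - \frac{1}{4} W' g R^{-1} g W'$ vanishes for the candidate value function $W = 2\beta V$, which is why the control law is optimal for this cost without the HJB equation ever being solved.

```python
def inverse_optimal_scalar(x, r_inv=3.0, beta=2.0):
    """Scalar instance of the lemma for x_dot = x + u with V(x) = x**2 / 2.

    r_inv is the (hypothetical) constant value of R^{-1}; it must exceed 1 so that
    u = -r_inv * x stabilizes the plant.
    Returns the optimal control u* and the state penalty l(x).
    """
    lf_v = x * x  # L_f V = (dV/dx) * f(x), with f(x) = x
    lg_v = x      # L_g V = (dV/dx) * g(x), with g(x) = 1
    u_opt = -beta * r_inv * lg_v
    l = (-2.0 * beta * (lf_v - lg_v * r_inv * lg_v)
         + beta * (beta - 2.0) * lg_v * r_inv * lg_v)
    return u_opt, l


def hjb_residual(x, r_inv=3.0, beta=2.0):
    """HJB residual l + W' f - (1/4) W' g R^{-1} g W' for W(x) = 2 * beta * V(x)."""
    _, l = inverse_optimal_scalar(x, r_inv, beta)
    dw = 2.0 * beta * x  # W'(x)
    return l + dw * x - 0.25 * dw * r_inv * dw
```

With the default values, `inverse_optimal_scalar(1.0)` gives the control `-6.0` and the penalty `8.0`, and `hjb_residual` is zero at every state, in line with Remark 1: the penalty $l(x) = 8x^2$ is positive because $u = -R^{-1}L_gV$ already stabilizes the plant.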

FIGURE 1 Structure of nonlinear disturbance observer based inverse optimal control. UAH, unmanned aerial helicopter

3 INVERSE OPTIMAL CONTROL DESIGN BASED ON DISTURBANCE OBSERVER
In this section, a disturbance observer based inverse optimal control law is designed. The control block diagram is shown in Figure 1. As shown in this figure, the design of the disturbance observer and the design of the optimal control law are completely separated.

3.1 Design of disturbance observer
For system (7), following Lemma 2, the nonlinear disturbance observer used to estimate the unknown disturbance $D$ is given by

$$
\begin{cases}
\hat{D} = \xi + P(N) \\
\dot{\xi} = -Q(N)\xi - Q(N)\left[ P(N) + F + Gu \right],
\end{cases}
\tag{15}
$$

where $\xi$ is the internal state of the nonlinear observer, and $P(N) \in \mathbb{R}^4$ denotes a nonlinear function to be designed. $\hat{D} = [\hat{D}_\Omega^T, \hat{D}_v]^T$ denotes the disturbance estimation vector, and $\hat{D}_\Omega$ and $\hat{D}_v$ denote the disturbance estimates for the Euler angles and the velocity of the UAH along the z-axis, respectively. The nonlinear observer gain $Q(N)$ is defined as

$$
Q(N) = \frac{\partial P(N)}{\partial N}, \tag{16}
$$

where $Q(N)$ is a positive matrix to be designed. The disturbance estimation error is given by

$$
e = \hat{D} - D. \tag{17}
$$

According to Assumption 2, $\hat{D}$ tracks $D$ asymptotically if $P(N)$ is chosen such that the error dynamics

$$
\dot{e}(t) = -\frac{\partial P(N)}{\partial N} e(t) = -Q(N)e(t) \tag{18}
$$

are asymptotically stable. Then, this disturbance observer is globally stable for all $N \in \mathbb{R}^4$.

3.2 Design of inverse optimal control law
To apply the result of Lemma 1, a control Lyapunov function should be constructed for system (7). The backstepping method is adopted along with a particular form for the system in (7), which has been applied in the work of Krstic and Tsiotras.[13]

Step 1. Define a virtual control $N_d(M)$, which stabilizes the subsystem (7a):

$$
N_d = -k_1 \Lambda^T M, \tag{19}
$$

where $k_1 > 0$ is a parameter to be designed.

With this virtual control law, the closed-loop subsystem (7a) becomes

$$
\dot{M} = -k_1 \Lambda \Lambda^T M. \tag{20}
$$

To prove that the system in (20) is stable, we choose the following Lyapunov function:

$$
V_1(M) = \frac{1}{2} M^T M. \tag{21}
$$

Then, $\forall M \neq 0$, the derivative of $V_1$ along the trajectories of (19) is given by

$$
\dot{V}_1 = M^T \dot{M} = -k_1 M^T \Lambda \Lambda^T M = -k_1 \lVert \Lambda^T M \rVert^2 < 0. \tag{22}
$$

Step 2. Considering the virtual control law (19), define the error variable as

$$
Z = N - N_d = N + k_1 \Lambda^T M. \tag{23}
$$

The differential equation for the full attitude and altitude model of the UAH can be written as

$$
\dot{M} = \Lambda N = -k_1 \Lambda \Lambda^T M + \Lambda Z. \tag{24}
$$

As shown in Step 1, (24) is globally exponentially stable for $Z = 0$. To prove the stability of $Z$, we define

$$
L = \begin{bmatrix} -J^{-1}S(\omega)J & 0 \\ 0 & 0 \end{bmatrix} \tag{25}
$$

and design the composite control law $u$ as

$$
u = u^* + u_d, \tag{26}
$$

where $u^*$ and $u_d$ denote the inverse optimal control law and the disturbance compensation law based on the disturbance observer, respectively. Then, system (7) becomes

$$
\dot{M} = \Lambda N \tag{27a}
$$

$$
\dot{N} = LN + Gu + D. \tag{27b}
$$

Considering (23), (24), and (27b), the differential equation for $Z$ can be written as

$$
\begin{aligned}
\dot{Z} &= \dot{N} + k_1 \Lambda^T \dot{M} \\
&= LN - k_1^2 \Lambda^T \Lambda \Lambda^T M + k_1 \Lambda^T \Lambda Z + G(u^* + u_d) + D \\
&= LN - k_1^2 \Lambda^T \Lambda \Lambda^T M + k_1 \Lambda^T \Lambda Z + Gu^* + (Gu_d + D).
\end{aligned}
\tag{28}
$$

Define

$$
\Xi = Gu_d + D. \tag{29}
$$

Taking

$$
u_d = -G^{-1}\hat{D} \tag{30}
$$

to compensate for the external disturbances, we obtain

$$
\Xi = D - \hat{D} = -e. \tag{31}
$$

Then, invoking (23) and (31), (28) can be written as

$$
\begin{aligned}
\dot{Z} &= LN - k_1^2 \Lambda^T \Lambda \Lambda^T M + k_1 \Lambda^T \Lambda Z + Gu^* + \Xi \\
&= L(Z - k_1 \Lambda^T M) - k_1^2 \Lambda^T \Lambda \Lambda^T M + k_1 \Lambda^T \Lambda Z + Gu^* - e.
\end{aligned}
\tag{32}
$$
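The interplay between the disturbance observer and the compensation term $u_d = -G^{-1}\hat{D}$ can be illustrated on a scalar stand-in for (7b). The sketch below rests on several assumptions that are not from the paper: a scalar plant with hypothetical values for $F$, $G$, and $D$, the linear choice $P(N) = \alpha N$ (so that $Q = \alpha$), a placeholder stabilizing term in place of the inverse optimal law $u^*$, and forward-Euler integration. It shows the estimate $\hat{D}$ converging to the constant disturbance and the residual $\Xi = Gu_d + D$ equal to $D - \hat{D}$.

```python
def simulate_observer(alpha=5.0, dt=0.001, steps=5000):
    """Euler simulation of the scalar plant N_dot = F + G*u + D with the observer
    D_hat = xi + alpha*N,  xi_dot = -alpha*xi - alpha*(alpha*N + F + G*u).

    All numerical values are hypothetical and chosen only for illustration.
    Returns (final N, final D_hat, true D, residual Xi = G*u_d + D).
    """
    g = 0.5           # hypothetical scalar input gain G
    d_true = 2.0      # hypothetical constant disturbance D (D_dot = 0)
    n, xi = 1.0, 0.0  # plant state N and observer internal state xi
    for _ in range(steps):
        d_hat = xi + alpha * n
        f = -0.3 * n           # hypothetical drift term F
        u_star = -4.0 * n / g  # placeholder stabilizing term, not the paper's u*
        u_d = -d_hat / g       # compensation term u_d = -G^{-1} D_hat
        u = u_star + u_d
        xi_dot = -alpha * xi - alpha * (alpha * n + f + g * u)
        n_dot = f + g * u + d_true
        xi += dt * xi_dot
        n += dt * n_dot
    d_hat = xi + alpha * n
    residual = g * (-d_hat / g) + d_true  # Xi = G*u_d + D = D - D_hat
    return n, d_hat, d_true, residual
```

Calling `simulate_observer()` integrates 5 s of simulated time. With this linear choice of $P$, the estimation error obeys $\dot{e} = -\alpha e$, so a larger `alpha` gives faster convergence of the estimate, at the price of a higher observer gain.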

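To design the inverse optimal control law $u^* = u^*(M, Z, e)$, which stabilizes the system of (24) and (32), the candidate Lyapunov function $V$ is chosen as follows:

$$
V(M, Z, e) = \frac{k_1^2}{2} V_1(M) + \frac{1}{2} Z^T Z + \frac{1}{2} e^T e. \tag{33}
$$

Then, considering (22), the derivative of $V(M, Z, e)$ can be written as

$$
\dot{V}(M, Z, e) = -\frac{k_1^3}{2} \lVert \Lambda^T M \rVert^2 + Z^T \dot{Z} + e^T \dot{e}. \tag{34}
$$

Considering the design process of the disturbance observer in Section 3.1, we choose

$$
Q(N) = \alpha I, \tag{35}
$$

where $\alpha$ is a design parameter and $I \in \mathbb{R}^{4 \times 4}$ denotes the identity matrix. Then, (18) becomes

$$
\dot{e}(t) = -\alpha e(t). \tag{36}
$$

According to Lemma 1, a particular optimal control law of the form (9) needs to be designed. Considering (26), the general form of the derivative of $V$ can be written as

$$
\dot{V} = L_F V + L_G V u = L_F V + L_G V u^* + L_G V u_d. \tag{37}
$$

Considering (32) and (34), the coefficient matrix of the control input $u^*$ can be written as $Z^T G$. Namely, considering (33), we obtain

$$
L_G V = \frac{\partial V}{\partial Z} G = Z^T G. \tag{38}
$$

Then, according to (9), we design the optimal control law as

$$
u^* = -\beta R^{-1}(M, N)\left[L_G V\right]^T = -\beta R^{-1}(M, N) G Z, \tag{39}
$$

where $R(M, N) = R^T(M, N) > 0$ will be designed. Considering (32), (33), and (36), the derivative of $V$ can be rewritten as

$$
\begin{aligned}
\dot{V} &= -\frac{k_1^3}{2} \lVert \Lambda^T M \rVert^2 + Z^T L (Z - k_1 \Lambda^T M) - k_1^2 Z^T \Lambda^T \Lambda \Lambda^T M + k_1 Z^T \Lambda^T \Lambda Z + Z^T G u^* - Z^T e - \alpha e^T e \\
&= -\frac{k_1^3}{2} \lVert \Lambda^T M \rVert^2 - k_1 Z^T (L + k_1 \Lambda^T \Lambda) \Lambda^T M + Z^T (L + k_1 \Lambda^T \Lambda) Z + Z^T G u^* - Z^T e - \alpha e^T e.
\end{aligned}
\tag{40}
$$

Define

$$
\begin{cases}
\Delta_1 = -k_1 Z^T (L + k_1 \Lambda^T \Lambda) \Lambda^T M \\
\Delta_2 = -Z^T e - \alpha e^T e.
\end{cases}
\tag{41}
$$

Considering the following facts:

$$
\begin{aligned}
\Delta_1 &= -\frac{k_1}{2} Z^T (L + k_1 \Lambda^T \Lambda) \Lambda^T M - \frac{k_1}{2} (\Lambda^T M)^T (L + k_1 \Lambda^T \Lambda) Z \\
&= -\frac{k_1^3}{4} \left\lVert \Lambda^T M + \frac{2}{k_1^2} (L + k_1 \Lambda^T \Lambda) Z \right\rVert^2 + \frac{k_1^3}{4} \lVert \Lambda^T M \rVert^2 + \frac{1}{k_1} \lVert (k_1 \Lambda^T \Lambda + L) Z \rVert^2
\end{aligned}
\tag{42}
$$

$$
\begin{aligned}
\Delta_2 &= -\frac{1}{\alpha} Z^T (\alpha e) - \frac{1}{\alpha} (\alpha e)^T (\alpha e) \\
&= \frac{1}{4\alpha} Z^T Z - \frac{1}{4\alpha} Z^T Z - \frac{1}{2\alpha} Z^T (\alpha e) - \frac{1}{2\alpha} Z^T (\alpha e) - \frac{1}{\alpha} (\alpha e)^T (\alpha e) \\
&= \frac{1}{4\alpha} Z^T Z - \frac{1}{\alpha} \left\lVert \frac{1}{2} Z + \alpha e \right\rVert^2,
\end{aligned}
\tag{43}
$$

the form of (40) becomes

$$
\begin{aligned}
\dot{V} &= -\frac{k_1^3}{2} \lVert \Lambda^T M \rVert^2 + \Delta_1 + \Delta_2 + Z^T (L + k_1 \Lambda^T \Lambda) Z + Z^T G u^* \\
&= -\frac{k_1^3}{4} \lVert \Lambda^T M \rVert^2 - \frac{k_1^3}{4} \left\lVert \Lambda^T M + \frac{2}{k_1^2} (L + k_1 \Lambda^T \Lambda) Z \right\rVert^2 + \frac{1}{k_1} \lVert (k_1 \Lambda^T \Lambda + L) Z \rVert^2 \\
&\quad - \frac{1}{\alpha} \left\lVert \frac{1}{2} Z + \alpha e \right\rVert^2 + \frac{1}{4\alpha} Z^T Z + Z^T (L + k_1 \Lambda^T \Lambda) Z + Z^T G u^*.
\end{aligned}
\tag{44}
$$

Define

$$
\Delta_3 = \frac{1}{k_1} \lVert (k_1 \Lambda^T \Lambda + L) Z \rVert^2. \tag{45}
$$

Considering the following fact:

$$
\begin{aligned}
\Delta_3 &= \frac{1}{k_1} \lVert (k_1 \Lambda^T \Lambda + L) Z \rVert^2 + \frac{1}{k_1} \lVert (k_1 \Lambda^T \Lambda - L) Z \rVert^2 - \frac{1}{k_1} \lVert (k_1 \Lambda^T \Lambda - L) Z \rVert^2 \\
&= \frac{1}{k_1} Z^T \left[ (k_1 \Lambda^T \Lambda + L)^T (k_1 \Lambda^T \Lambda + L) + (k_1 \Lambda^T \Lambda - L)^T (k_1 \Lambda^T \Lambda - L) \right] Z - \frac{1}{k_1} \lVert (k_1 \Lambda^T \Lambda - L) Z \rVert^2 \\
&= \frac{1}{k_1} Z^T \left[ 2 (k_1^2 \Lambda^T \Lambda \Lambda^T \Lambda + L^T L) + k_1 (\Lambda^T \Lambda L + L^T \Lambda^T \Lambda) - k_1 (\Lambda^T \Lambda L + L^T \Lambda^T \Lambda) \right] Z - \frac{1}{k_1} \lVert (k_1 \Lambda^T \Lambda - L) Z \rVert^2 \\
&= \frac{2}{k_1} Z^T (k_1^2 \Lambda^T \Lambda \Lambda^T \Lambda + L^T L) Z - \frac{1}{k_1} \lVert (k_1 \Lambda^T \Lambda - L) Z \rVert^2,
\end{aligned}
\tag{46}
$$

the form of (44) becomes

$$
\begin{aligned}
\dot{V} &= -\frac{k_1^3}{4} \lVert \Lambda^T M \rVert^2 - \frac{k_1^3}{4} \left\lVert \Lambda^T M + \frac{2}{k_1^2} (L + k_1 \Lambda^T \Lambda) Z \right\rVert^2 + \Delta_3 - \frac{1}{\alpha} \left\lVert \frac{1}{2} Z + \alpha e \right\rVert^2 \\
&\quad + \frac{1}{4\alpha} Z^T Z + Z^T (L + k_1 \Lambda^T \Lambda) Z + Z^T G u^* \\
&= -\frac{k_1^3}{4} \lVert \Lambda^T M \rVert^2 - \frac{k_1^3}{4} \left\lVert \Lambda^T M + \frac{2}{k_1^2} (L + k_1 \Lambda^T \Lambda) Z \right\rVert^2 - \frac{1}{k_1} \lVert (k_1 \Lambda^T \Lambda - L) Z \rVert^2 - \frac{1}{\alpha} \left\lVert \frac{1}{2} Z + \alpha e \right\rVert^2 \\
&\quad + Z^T \left[ 2 k_1 \Lambda^T \Lambda \Lambda^T \Lambda + \frac{2}{k_1} L^T L + \frac{1}{4\alpha} I + L + k_1 \Lambda^T \Lambda \right] Z + Z^T G u^*.
\end{aligned}
\tag{47}
$$

The design and analysis of the inverse optimal control based on the nonlinear disturbance observer for the UAH can be summarized as the following theorem.

Theorem 1. Consider the attitude and altitude nonlinear system (7) of the UAH with unknown external disturbances, which satisfies Assumptions 1 and 2. The composite control law $u$ is designed according to (26). Choosing $\beta = 2$, the inverse optimal control law $u^*$ is designed as

$$
u^* = -G^{-1} \left[ 2 k_2 I + 4 k_1 \Lambda^T \Lambda \Lambda^T \Lambda + \frac{4}{k_1} L^T L + L + 2 k_1 \Lambda^T \Lambda \right] Z, \tag{48}
$$

which minimizes the cost function

$$
J = \int_0^\infty \left\{ l(M, N) + u^{*T} R(M, N) u^* \right\} dt, \tag{49}
$$

where $k_2$ is a design parameter related to $\alpha$, which can be written as

$$
k_2 = \frac{1}{4\alpha}, \tag{50}
$$

and $l(M, N)$ is given by

$$
\begin{aligned}
l(M, N) &= k_1^3 \lVert \Lambda^T M \rVert^2 + 4 k_2 \lVert N + k_1 \Lambda^T M \rVert^2 + k_1^3 \left\lVert \Lambda^T M + \frac{2}{k_1^2} (L + k_1 \Lambda^T \Lambda)(N + k_1 \Lambda^T M) \right\rVert^2 \\
&\quad + \frac{4}{k_1} \lVert (k_1 \Lambda^T \Lambda - L)(N + k_1 \Lambda^T M) \rVert^2.
\end{aligned}
\tag{51}
$$

To compensate for the external disturbance $D$, the compensation term $u_d$ is designed as in (30). Then, the following composite control law

$$
u = u^* + u_d = -G^{-1} \left[ 2 k_2 I + 4 k_1 \Lambda^T \Lambda \Lambda^T \Lambda + \frac{4}{k_1} L^T L + L + 2 k_1 \Lambda^T \Lambda \right] Z - G^{-1} \hat{D} \tag{52}
$$

can stabilize the full attitude and altitude system of the UAH in (7).

Proof. To compensate for the positive terms in (47), by choosing $\beta = 2$, we denote

$$
R^{-1}(M, N) = G^{-1} \left[ k_2 I + 2 k_1 \Lambda^T \Lambda \Lambda^T \Lambda + \frac{2}{k_1} L^T L + \frac{1}{2} L + k_1 \Lambda^T \Lambda \right] G^{-1}. \tag{53}
$$

The control law then takes the form (52), and (47) becomes

$$
\begin{aligned}
\dot{V} &= -\frac{k_1^3}{4} \lVert \Lambda^T M \rVert^2 - \frac{k_1^3}{4} \left\lVert \Lambda^T M + \frac{2}{k_1^2} (L + k_1 \Lambda^T \Lambda) Z \right\rVert^2 - \frac{1}{k_1} \lVert (k_1 \Lambda^T \Lambda - L) Z \rVert^2 - \frac{1}{\alpha} \left\lVert \frac{1}{2} Z + \alpha e \right\rVert^2 \\
&\quad + Z^T \left[ 2 k_1 \Lambda^T \Lambda \Lambda^T \Lambda + \frac{2}{k_1} L^T L + \frac{1}{4\alpha} I + L + k_1 \Lambda^T \Lambda \right] Z \\
&\quad - Z^T G G^{-1} \left[ 2 k_2 I + 4 k_1 \Lambda^T \Lambda \Lambda^T \Lambda + \frac{4}{k_1} L^T L + L + 2 k_1 \Lambda^T \Lambda \right] Z \\
&= -\frac{k_1^3}{4} \lVert \Lambda^T M \rVert^2 - \frac{k_1^3}{4} \left\lVert \Lambda^T M + \frac{2}{k_1^2} (L + k_1 \Lambda^T \Lambda) Z \right\rVert^2 - \frac{1}{k_1} \lVert (k_1 \Lambda^T \Lambda - L) Z \rVert^2 - \frac{1}{\alpha} \left\lVert \frac{1}{2} Z + \alpha e \right\rVert^2 \\
&\quad - Z^T \left[ k_2 I + 2 k_1 \Lambda^T \Lambda \Lambda^T \Lambda + \frac{2}{k_1} L^T L + k_1 \Lambda^T \Lambda \right] Z \\
&= -\frac{k_1^3}{4} \lVert \Lambda^T M \rVert^2 - \frac{k_1^3}{4} \left\lVert \Lambda^T M + \frac{2}{k_1^2} (L + k_1 \Lambda^T \Lambda) Z \right\rVert^2 - \frac{1}{k_1} \lVert (k_1 \Lambda^T \Lambda - L) Z \rVert^2 - \frac{1}{\alpha} \left\lVert \frac{1}{2} Z + \alpha e \right\rVert^2 \\
&\quad - k_2 \lVert Z \rVert^2 - 2 k_1 \lVert \Lambda^T \Lambda Z \rVert^2 - \frac{2}{k_1} \lVert L Z \rVert^2 - k_1 \lVert \Lambda Z \rVert^2,
\end{aligned}
\tag{54}
$$

where $\dot{V}(M, N) < 0$, $\forall M, N \neq 0$, and the equilibrium $M = N = 0$ is globally asymptotically stable.

4 SIMULATION EXAMPLE
To verify the validity of the designed optimal control law, numerical simulations are performed. Since the object of study is a medium-sized UAH, we assume that the gross mass of the UAH is 798.5 kg, with inertia matrix J = diag(58.4, 777.9, 6.4) kg·m². Considering physical reality, the initial altitude h and velocity v of the UAH are chosen as m and m/s, respectively. Let the initial Euler angles satisfy φ = 5, θ =, ψ =. Namely, Ω) = [5,, ]^T. The unknown external disturbances of the Euler angles and the velocity in the z-axis of the UAH are designed as .5 rad/s and 5 m/s, respectively. The target altitude is set to 8 m.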
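The completion of squares applied in the stability analysis to the cross term between $Z$ and the estimation error $e$, namely $-Z^T e - \alpha e^T e = \frac{1}{4\alpha} Z^T Z - \frac{1}{\alpha} \lVert \frac{1}{2} Z + \alpha e \rVert^2$, can be checked numerically. The sketch below is an illustrative check only; the function names and the random sampling ranges are assumptions and not part of the paper.

```python
import random


def delta2_direct(z, e, alpha):
    """Evaluate -Z^T e - alpha * e^T e directly."""
    zte = sum(zi * ei for zi, ei in zip(z, e))
    ete = sum(ei * ei for ei in e)
    return -zte - alpha * ete


def delta2_completed(z, e, alpha):
    """Evaluate Z^T Z / (4 alpha) - || Z / 2 + alpha * e ||^2 / alpha."""
    ztz = sum(zi * zi for zi in z)
    sq = sum((0.5 * zi + alpha * ei) ** 2 for zi, ei in zip(z, e))
    return ztz / (4.0 * alpha) - sq / alpha


def max_gap(trials=1000, seed=0):
    """Largest absolute difference between the two forms over random samples."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        z = [rng.uniform(-1.0, 1.0) for _ in range(4)]
        e = [rng.uniform(-1.0, 1.0) for _ in range(4)]
        alpha = rng.uniform(0.1, 5.0)  # observer gain, assumed positive
        gap = abs(delta2_direct(z, e, alpha) - delta2_completed(z, e, alpha))
        worst = max(worst, gap)
    return worst
```

The value returned by `max_gap()` stays at the level of floating-point rounding, which confirms that the only positive remainder of this step is $\frac{1}{4\alpha} Z^T Z$. This is the term that the choice $k_2 = \frac{1}{4\alpha}$ in the control law is meant to dominate.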

FIGURE 2 Inverse optimal control results of $\Omega$ (Euler angles in deg versus time in sec)

FIGURE 3 Inverse optimal control results of $H_z$ (altitude in m versus time in sec)

The responses of the Euler angles of the system with $k_1$ =, $k_2$ = 6 are shown in Figure 2, which demonstrates the stability of the attitude closed-loop system. Meanwhile, choosing various initial altitudes with $k_1$ =, $k_2$ = 6, Figure 3 shows the altitude responses of the UAH with the initial values of h = 5 m. The stability of the altitude closed-loop system is thereby demonstrated.

To distinguish the influence of the different values of $k_1$ and $k_2$, the responses of the Euler angles $\phi$, $\theta$, $\psi$ under the control law in (52) with $k_1$ = and different values of $k_2$ are shown in Figures 4, 8, and 12, respectively. In comparison, Figures 6, 10, and 14 present the responses with $k_2$ = 6 and different values of $k_1$. Figures 5, 7, 9, 11, 13, and 15 show the roll, pitch, and yaw moments of the system with different values of $k_1$ and $k_2$. Regardless of the values of $k_1$ and $k_2$, the system remains stable in every case. However, according to the responses of the Euler angles and the control inputs for different values of $k_1$ and $k_2$, the control action clearly varies greatly across Figures 4 to 15. For example, the damping of the system is negatively related to the gain value, and the overshoot and the speed of response also differ with the gain values. From the responses in Figures 4, 6, 8, 10, 12, and 14, the states can be stabilized within 5 seconds by choosing appropriate parameters. Meanwhile, from the values of the control inputs in Figures 5, 7, 9, 11, 13, and 15, the maximum values of the control inputs are below N m with appropriate parameters, ie, $k_1$ =, $k_2$ = 6, which conforms to reality. Thus, the practicality and rationality of this optimal control law are demonstrated.

FIGURE 4 The responses of roll angle $\phi$ with $k_1$ = (roll angle in deg versus time in sec)

FIGURE 5 The responses of roll moment $\Sigma L$ with $k_1$ = (moment in N m versus time in sec)

FIGURE 6 The responses of roll angle $\phi$ with $k_2$ = 6 (roll angle in deg versus time in sec)

FIGURE 7 The responses of roll moment $\Sigma L$ with $k_2$ = 6 (moment in N m versus time in sec)

FIGURE 8 The responses of pitch angle $\theta$ with $k_1$ = (pitch angle in deg versus time in sec)

FIGURE 9 The responses of pitch moment $\Sigma M$ with $k_1$ = (moment in N m versus time in sec)

FIGURE 10 The responses of pitch angle $\theta$ with $k_2$ = 6 (pitch angle in deg versus time in sec)

FIGURE 11 The responses of pitch moment $\Sigma M$ with $k_2$ = 6 (moment in N m versus time in sec)

FIGURE 12 The responses of yaw angle $\psi$ with $k_1$ = (yaw angle in deg versus time in sec)

FIGURE 13 The responses of yaw moment $\Sigma N$ with $k_1$ = (moment in N m versus time in sec)

FIGURE 14 The responses of yaw angle $\psi$ with $k_2$ = 6 (yaw angle in deg versus time in sec)

FIGURE 15 The responses of yaw moment $\Sigma N$ with $k_2$ = 6 (moment in N m versus time in sec)

FIGURE 16 Estimate value of $D_\Omega$ in disturbance observer based control (estimate in deg versus time in sec)

FIGURE 17 Estimate value of $D_v$ in DOBC (estimate in m/s versus time in sec)

To compensate for the external disturbances D = [.5, .5, .5, −4.8]^T, which means that the external disturbances of the Euler angular velocities are .5 rad/s and that of the velocity of the UAH in the z-axis is 5 m/s, the disturbance observer based control law is designed. With $k_1$ =, $k_2$ = 6, according to (50), we obtain $\alpha = \frac{1}{4k_2}$ = .5. With this parameter $\alpha$, the responses of $\hat{D}_\Omega$ and $\hat{D}_v$ are given in Figures 16 and 17. Meanwhile, Figures 18 and 19 show the convergence of the disturbance estimation errors $e$.

Finally, to illustrate the advantages of the inverse optimal control law (48), a disturbance observer based backstepping control method is designed for comparison with (52). Considering the attitude and altitude model of the UAH (7) with the unknown external disturbance $D$, the disturbance observer based backstepping control law can be designed as

$$
u_c = -G^{-1} \left[ F(N) + K_2 (N - N_d) + \dot{N}_d + \hat{D} \right] = -G^{-1} \left[ F(N) + K_2 (N - N_d) + \dot{N}_d \right] - G^{-1} \hat{D}, \tag{55}
$$

where $N_d = -\Lambda^{-1} K_1 M$ denotes the virtual control law of (7a), and $K_1, K_2 \in \mathbb{R}^{4 \times 4}$ are positive definite matrices to be designed. Since the same disturbance observer (15) is used to compensate for the unknown disturbance $D$, the differences between the inverse optimal control law $u$ and the backstepping based control law $u_c$ can be seen in the following simulation results. As shown in Figures 20, 21, and 22, the UAH system has similar dynamic performance under the same initial states by choosing $k_1$ =, $k_2$ = 6, $K_1$ = diag(,,, .55), $K_2$ = 6 I. As seen in Figures 23, 24, and 25, the values of $u$ are clearly lower than those of $u_c$, which reflects the advantages and necessity of the inverse optimal control method in this paper.

FIGURE 18 Estimate error of $D_\Omega$ in disturbance observer based control (error in deg versus time in sec)

FIGURE 19 Estimate error of $D_v$ in disturbance observer based control (error in m/s versus time in sec)

FIGURE 20 Comparison of roll angle $\phi$ between $u$ and $u_c$ (roll angle in deg versus time in sec)

FIGURE 21 Comparison of pitch angle $\theta$ between $u$ and $u_c$ (pitch angle in deg versus time in sec)

FIGURE 22 Comparison of yaw angle $\psi$ between $u$ and $u_c$ (yaw angle in deg versus time in sec)

FIGURE 23 Comparison of roll moment $\Sigma L$ between $u$ and $u_c$ (moment in N m versus time in sec)

FIGURE 24 Comparison of pitch moment $\Sigma M$ between $u$ and $u_c$ (moment in N m versus time in sec)

FIGURE 25 Comparison of yaw moment $\Sigma N$ between $u$ and $u_c$ (moment in N m versus time in sec)

5 CONCLUSION
For nonlinear control problems, the HJB equation generally has no closed-form solution and may admit no solution or multiple solutions. This paper therefore adopts the inverse optimal control method, whose advantage is that it finds a controller that is optimal with respect to a meaningful cost without solving the HJB equation. To compensate for external disturbances, the disturbance observer based control method is used. To combine these control methods with the attitude and altitude system of the UAH, backstepping is adopted. According to the simulation results in Section 4, the attitude and altitude system of the UAH is stable under the proposed control law. Moreover, the advantages of the disturbance observer based inverse optimal control law are shown through a comparison with conventional backstepping control in Section 4. Furthermore, with appropriate parameters of the control law, the requirements on response time and practicality can also be satisfied.

ACKNOWLEDGEMENTS
This work was supported in part by the National Natural Science Foundation of China under Grant 65784, in part by the Jiangsu Natural Science Foundation of China BK747, in part by the Aeronautical Science Foundation of China under Grant 657549, and in part by the Fundamental Research Funds for the Central Universities under Grant NE6.

ORCID
Haoxiang Ma http://orcid.org/--6-58

REFERENCES
1. Shin J, Nonami K, Fujiwara D, Hazawa K. Model-based optimal attitude and positioning control of small-scale unmanned helicopter. Robotica. 5;):5-6.
2. Enns R, Si J. Helicopter trimming and tracking control using direct neural dynamic programming. IEEE Trans Neural Netw. ;44):99-99.
3. Dalamagkidis K, Valavanis KP, Piegl LA. Nonlinear model predictive control with neural network optimization for autonomous autorotation of small unmanned helicopters. IEEE Trans Control Syst Technol. ;94):88-8.
4. Koo TJ, Ma Y, Sastry S. Nonlinear control of a helicopter based unmanned aerial vehicle model. IEEE Trans Control Syst Technol..
5. Yan K, Wu Q, Chen M. Robust adaptive backstepping control for unmanned autonomous helicopter with flapping dynamics. Paper presented at: th IEEE International Conference on Control & Automation (ICCA); 7; Ohrid, Macedonia.
6. Budiyono A, Wibowo SS. Optimal tracking controller design for a small scale helicopter. J Bionic Eng. 7;44):7-8.
7. Pieper JK, Baillie S, Goheen KR. Linear-quadratic optimal model-following control of a helicopter in hover. In: Proceedings of 994 American Control Conference; 994; Baltimore, MD.
8. Enns R, Si J. Helicopter trimming and tracking control using direct neural dynamic programming. IEEE Trans Neural Netw. ;44):99-99.
9. Devasia S. Output tracking with nonhyperbolic and near nonhyperbolic internal dynamics: helicopter hover control. In: Proceedings of the 997 American Control Conference; 997; Albuquerque, NM.
10. Sepulchre R, Janković M, Kokotović PV. Constructive Nonlinear Control. London, UK: Springer;.
11. Moylan PJ, Anderson BDO. Nonlinear regulator theory and an inverse optimal control problem. IEEE Trans Autom Control. 97;85):46-465.
12. Anderson BDO, Moore JB. Optimal Control: Linear Quadratic Methods. Mineola, NY: Dover Publications; 7.
13. Krstic M, Tsiotras P. Inverse optimal stabilization of a rigid spacecraft. IEEE Trans Autom Control. 999;445):4-49.
14. Park Y. Inverse optimal and robust nonlinear attitude control of rigid spacecraft. Aerosp Sci Technol. ;8):57-65.
15. Luo W, Chu YC, Ling KV. Inverse optimal adaptive control for attitude tracking of spacecraft. IEEE Trans Autom Control. 5;5):69-654.
16. Chen WH, Ballance DJ, Gawthrop PJ, O'Reilly J. A nonlinear disturbance observer for robotic manipulators. IEEE Trans Ind Electron. ;474):9-98.
17. Chen WH. Disturbance observer based control for nonlinear systems. IEEE/ASME Trans Mechatron. 4;94):76-7.
18. Chen M, Chen WH. Sliding mode control for a class of uncertain nonlinear system based on disturbance observer. Int J Adapt Control Signal Process. ;4):5-64.
19. Wei X, Guo L. Composite disturbance-observer-based control and terminal sliding mode control for non-linear systems with disturbances. Int J Control. 9;86):8-98.
20. Guo L, Cao S. Anti-disturbance control theory for systems with multiple disturbances: a survey. ISA Trans. 4;54):846-849.
21. Sun H, Guo L. Neural network-based DOBC for a class of nonlinear systems with unmatched disturbances. IEEE Trans Neural Netw Learn Syst. 7;8):48-489.
22. Chen M, Jiang C-S, Wu Q-X. Disturbance-observer-based robust flight control for hypersonic vehicles using neural networks. Adv Sci Lett. ;44-5):77-775.
23. Xu B. Disturbance observer-based dynamic surface control of transport aircraft with continuous heavy cargo airdrop. IEEE Trans Syst Man Cybern Syst. 7;47):6-7.
24. Xu B, Shi Z. An overview on flight dynamics and control approaches for hypersonic vehicles. Science China Inf Sci. 5;587):-9.
25. Xu B, Wang D, Zhang Y, Shi Z. DOB-based neural control of flexible hypersonic flight vehicle considering wind effects. IEEE Trans Ind Electron. 7;64):8676-8685.
26. Chen M, Shi P, Lim C-C. Robust constrained control for MIMO nonlinear systems based on disturbance observer. IEEE Trans Autom Control. 5;6):8-86.
27. Besnard L, Shtessel YB, Landrum B. Control of a quadrotor vehicle using sliding mode disturbance observer. Paper presented at: 7 American Control Conference; 7; New York, NY.
28. Besnard L, Shtessel YB, Landrum B. Quadrotor vehicle control via sliding mode controller driven by sliding mode disturbance observer. J Franklin Inst. ;49):658-684.
29. Cheng Y, Jiang L, Li T, Guo L. Robust tracking control for a quadrotor UAV via DOBC approach. Paper presented at: 8 Chinese Control And Decision Conference (CCDC); 8; Shenyang, China.
30. Fang X, Wu A, Shang Y, Dong N. Robust control of small-scale unmanned helicopter with matched and mismatched disturbances. J Franklin Inst. 6;58):48-48.
31. Park Y. Robust and optimal attitude stabilization of spacecraft with external disturbances. Aerosp Sci Technol. 5;9):5-59.
32. Marconi L, Naldi R. Robust full degree-of-freedom tracking control of a helicopter. Automatica. 7;4):99-9.

How to cite this article: Ma H, Chen M, Wu Q. Inverse optimal control for unmanned aerial helicopters with disturbances. Optim Control Appl Meth. 9;4:5 7. https://doi.org/./oca.47