Springer Uncertainty Research. Yuanguo Zhu. Uncertain Optimal Control



Springer Uncertainty Research

Series editor: Baoding Liu, Beijing, China

Springer Uncertainty Research is a book series that seeks to publish high-quality monographs, texts, and edited volumes on a wide range of topics in both fundamental and applied research of uncertainty. New publications are always solicited. This book series provides rapid publication with a worldwide distribution.

Editor-in-Chief: Baoding Liu, Department of Mathematical Sciences, Tsinghua University, Beijing, China, liu@tsinghua.edu.cn

Executive Editor-in-Chief: Kai Yao, School of Economics and Management, University of Chinese Academy of Sciences, Beijing, China, yaokai@ucas.ac.cn

More information about this series at

Yuanguo Zhu

Uncertain Optimal Control

Yuanguo Zhu
Department of Mathematics
Nanjing University of Science and Technology
Nanjing, China

ISSN ISSN (electronic)
Springer Uncertainty Research
ISBN ISBN (ebook)
Library of Congress Control Number:
Springer Nature Singapore Pte Ltd

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore, Singapore

Preface

If a dynamical system is disturbed by uncertain factors, it may be described by an uncertain differential equation. A problem of optimizing an index subject to an uncertain differential equation is called an uncertain optimal control problem. This is a novel topic in optimal control based on uncertainty theory. This book introduces the theory and applications of uncertain optimal control. Two types of models, expected value uncertain optimal control and optimistic value uncertain optimal control, are established. These models, in both continuous-time and discrete-time forms, are treated by dynamic programming. The theory of uncertain optimal control concerns establishing models based on the expected value and optimistic value criteria, the equation of optimality, bang-bang optimal control, optimal control for switched uncertain systems, optimal control for uncertain systems with time delay, and parametric optimal control. Applications of uncertain optimal control are shown in portfolio selection, engineering, and management. The book is suitable for researchers, engineers, and students in the fields of mathematics, cybernetics, operations research, industrial engineering, artificial intelligence, economics, and management science.

Acknowledgement: This work was partially supported by the National Natural Science Foundation of China (Grant Nos. , ).

Nanjing, China
June 2018
Yuanguo Zhu

Contents

1 Basics on Uncertainty Theory
   Uncertainty Space
   Uncertain Variable
   Independence
   Expected Value
   Distribution of Function of Uncertain Variable
   Expected Value of Function of Uncertain Variable
   Optimistic Value and Pessimistic Value
   Uncertain Simulation
   Uncertain Process
   Liu Process
   Liu Integral
   Uncertain Differential Equation
   References

2 Uncertain Expected Value Optimal Control
   Problem of Uncertain Optimal Control
   Principle of Optimality
   Equation of Optimality
   Equation of Optimality for Multidimension Case
   Uncertain Linear Quadratic Model
   Optimal Control Problem of the Singular Uncertain System
   References

3 Optimistic Value-Based Uncertain Optimal Control
   Optimistic Value Model
   Equation of Optimality
   Uncertain Optimal Control Model with Hurwicz Criterion
   Uncertain Linear Quadratic Model Under Optimistic Value Criterion
   Optimistic Value Optimal Control for Singular System
   Example
   References

4 Optimal Control for Multistage Uncertain Systems
   Recurrence Equation
   Linear Quadratic Model
   General Case
   Hybrid Intelligent Algorithm
   Finite Search Method
   Optimal Controls for Any Initial State
   Example
   Indefinite LQ Optimal Control with Equality Constraint
   Problem Setting
   An Equivalent Deterministic Optimal Control
   A Necessary Condition for State Feedback Control
   Well-Posedness of the Uncertain LQ Problem
   Example
   References

5 Bang-Bang Control for Uncertain Systems
   Bang-Bang Control for Continuous Uncertain Systems
   An Uncertain Bang-Bang Model
   Example
   Bang-Bang Control for Multistage Uncertain Systems
   Example
   Equation of Optimality for Saddle Point Problem
   Bang-Bang Control for Saddle Point Problem
   A Special Bang-Bang Control Model
   Example
   References

6 Optimal Control for Switched Uncertain Systems
   Switched Uncertain Model
   Expected Value Model
   Two-Stage Algorithm
      Stage (a)
      Stage (b)
   An Example
   LQ Switched Optimal Control Problem
   MACO Algorithm for Optimal Switching Instants
   Example
   Optimistic Value Model
   Two-Stage Approach
      Stage (a)
      Stage (b)
   Example
   Discrete-Time Switched Linear Uncertain System
   Analytical Solution
   Two-Step Pruning Scheme
      Local Pruning Scheme
      Global Pruning Scheme
   Examples
   References

7 Optimal Control for Time-Delay Uncertain Systems
   Optimal Control Model with Time-Delay
   Uncertain Linear Quadratic Model with Time-Delay
   Example
   Model with Multiple Time-Delays
   Example
   References

8 Parametric Optimal Control for Uncertain Systems
   Parametric Optimization Based on Expected Value
   Parametric Optimal Control Model
   Parametric Approximation Method
   Parametric Optimization Based on Optimistic Value
   Piecewise Optimization Method
   References

9 Applications
   Portfolio Selection Models
   Expected Value Model
   Optimistic Value Model
   Manufacturing Technology Diffusion Problem
   Mitigation Policies for Uncertain Carbon Dioxide Emissions
   Four-Wheel Steering Vehicle Problem
   References

Index

Chapter 1
Basics on Uncertainty Theory

There exist many ways to model indeterminacy. Roughly speaking, there are two representative theories: one is probability theory and the other is uncertainty theory [1]. Probability is interpreted as frequency, while uncertainty is interpreted as personal belief degree. When the sample size is large enough, probability theory is the appropriate tool, working on the basis of estimated probability distributions. In many cases, however, no samples are available to estimate a probability distribution, and we have to invite domain experts to evaluate the belief degree that each event will happen. As shown by the Nobel laureate Kahneman and his collaborator Tversky [2], humans tend to overweight unlikely events, so the belief degree has a much larger range than the true frequency. In this case probability theory does not work [3], and uncertainty theory was founded in 2007 [1] to rationally deal with this type of indeterminacy. Nowadays, uncertainty theory has become a branch of axiomatic mathematics for modeling belief degrees [4]. Theory and practice have shown that uncertainty theory is an efficient tool to deal with some nondeterministic information, such as expert data and subjective estimations, which appears in many practical problems. During the past years, there have been many achievements in uncertainty theory, such as uncertain programming, uncertain statistics, uncertain logic, uncertain inference, and uncertain process.

1.1 Uncertainty Space

To begin with, some basic concepts of uncertainty theory [1, 4] are listed. Let Γ be a nonempty set, and L a σ-algebra over Γ. Each element A ∈ L is called an event. A set function M defined on the σ-algebra L is called an uncertain measure if it satisfies
(i) (Normality) M(Γ) = 1;
(ii) (Duality) M(A) + M(A^c) = 1 for any event A;
(iii) (Subadditivity) M(∪_{i=1}^∞ A_i) ≤ Σ_{i=1}^∞ M(A_i) for every countable sequence of events A_i.
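The three axioms above are easy to check mechanically on a finite space. The following sketch (my own illustration, not from the book) assigns an uncertain measure to every subset of a three-point set Γ and verifies normality, duality, and subadditivity; the particular measure values are an illustrative assumption.

```python
# Numeric check of normality, duality, and (finite) subadditivity
# for a toy uncertain measure on Γ = {g1, g2, g3}.
from itertools import chain, combinations

GAMMA = frozenset({"g1", "g2", "g3"})

# an assumed uncertain measure on the power set of Γ
M = {
    frozenset(): 0.0,
    frozenset({"g1"}): 0.6,
    frozenset({"g2"}): 0.3,
    frozenset({"g3"}): 0.2,
    frozenset({"g1", "g2"}): 0.8,
    frozenset({"g1", "g3"}): 0.7,
    frozenset({"g2", "g3"}): 0.4,
    GAMMA: 1.0,
}

def events():
    s = list(GAMMA)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(s, r) for r in range(len(s) + 1))]

# (i) normality: M(Γ) = 1
assert M[GAMMA] == 1.0
# (ii) duality: M(A) + M(A^c) = 1 for every event A
for A in events():
    assert abs(M[A] + M[GAMMA - A] - 1.0) < 1e-12
# (iii) subadditivity: M(A ∪ B) <= M(A) + M(B)
for A in events():
    for B in events():
        assert M[A | B] <= M[A] + M[B] + 1e-12
print("axioms hold")
```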

Definition 1.1 ([1]) Let Γ be a nonempty set, let L be a σ-algebra over Γ, and let M be an uncertain measure. Then the triplet (Γ, L, M) is called an uncertainty space.

A product uncertain measure was defined by Liu [5] in 2009 to produce the uncertain measure of a compound event, thus producing the fourth axiom of uncertainty theory. Let (Γ_k, L_k, M_k) be uncertainty spaces for k = 1, 2, ... Write
Γ = Γ_1 × Γ_2 × ···   (1.1)
that is, the set of all ordered tuples of the form (γ_1, γ_2, ...), where γ_k ∈ Γ_k for k = 1, 2, ... A measurable rectangle in Γ is a set
Λ = Λ_1 × Λ_2 × ···   (1.2)
where Λ_k ∈ L_k for k = 1, 2, ... The smallest σ-algebra containing all measurable rectangles of Γ is called the product σ-algebra, denoted by
L = L_1 × L_2 × ···   (1.3)
Then the product uncertain measure M on the product σ-algebra L is defined by the following product axiom [5].

Axiom 4 (Product Axiom) Let (Γ_k, L_k, M_k) be uncertainty spaces for k = 1, 2, ... The product uncertain measure M is an uncertain measure satisfying
M{∏_{k=1}^∞ Λ_k} = ⋀_{k=1}^∞ M_k{Λ_k}   (1.4)
where Λ_k are arbitrarily chosen events from L_k for k = 1, 2, ..., respectively. For each event Λ ∈ L, we have
M{Λ} =
   sup_{Λ_1×Λ_2×··· ⊂ Λ} min_{1≤k<∞} M_k{Λ_k}, if sup_{Λ_1×Λ_2×··· ⊂ Λ} min_{1≤k<∞} M_k{Λ_k} > 0.5;
   1 − sup_{Λ_1×Λ_2×··· ⊂ Λ^c} min_{1≤k<∞} M_k{Λ_k}, if sup_{Λ_1×Λ_2×··· ⊂ Λ^c} min_{1≤k<∞} M_k{Λ_k} > 0.5;
   0.5, otherwise.   (1.5)

Definition 1.2 Assume that (Γ_k, L_k, M_k) are uncertainty spaces for k = 1, 2, ... Let Γ = Γ_1 × Γ_2 × ···, L = L_1 × L_2 × ···, and M = M_1 ∧ M_2 ∧ ··· Then the triplet (Γ, L, M) is called a product uncertainty space.

Uncertain Variable

Definition 1.3 ([1]) An uncertain variable is a function ξ from an uncertainty space (Γ, L, M) to the set of real numbers R such that {ξ ∈ B} is an event for any Borel set B.

Definition 1.4 ([1]) The uncertainty distribution Φ of an uncertain variable ξ is defined by
Φ(x) = M{ξ ≤ x}   (1.6)
for any real number x.

Theorem 1.1 ([6]) A function Φ(x): R → [0, 1] is an uncertainty distribution if and only if it is a monotone increasing function except Φ(x) ≡ 0 and Φ(x) ≡ 1.

Example 1.1 An uncertain variable ξ is called linear if it has a linear uncertainty distribution
Φ(x) = 0, if x ≤ a; (x − a)/(b − a), if a ≤ x ≤ b; 1, if x ≥ b   (1.7)
denoted by L(a, b), where a and b are real numbers with a < b.

Example 1.2 An uncertain variable ξ is called zigzag if it has a zigzag uncertainty distribution
Φ(x) = 0, if x ≤ a; (x − a)/(2(b − a)), if a ≤ x ≤ b; (x + c − 2b)/(2(c − b)), if b ≤ x ≤ c; 1, if x ≥ c   (1.8)
denoted by Z(a, b, c), where a, b, c are real numbers with a < b < c.

Example 1.3 An uncertain variable ξ is called normal if it has a normal uncertainty distribution
Φ(x) = (1 + exp(π(e − x)/(√3 σ)))^(−1), x ∈ R   (1.9)
denoted by N(e, σ), where e and σ are real numbers with σ > 0.

Example 1.4 An uncertain variable ξ is called empirical if it has an empirical uncertainty distribution
Φ(x) = 0, if x < x_1; α_i + (α_{i+1} − α_i)(x − x_i)/(x_{i+1} − x_i), if x_i ≤ x ≤ x_{i+1}, 1 ≤ i < n; 1, if x > x_n   (1.10)
where x_1 < x_2 < ··· < x_n and 0 ≤ α_1 ≤ α_2 ≤ ··· ≤ α_n ≤ 1.
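The distributions (1.7)–(1.9) are straightforward to implement. The following sketch (function names and parameter choices are mine, not the book's) encodes them and spot-checks a few values, including monotonicity of the normal distribution.

```python
import math

def linear_cdf(x, a, b):
    # linear uncertainty distribution L(a, b), Eq. (1.7)
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

def zigzag_cdf(x, a, b, c):
    # zigzag uncertainty distribution Z(a, b, c), Eq. (1.8)
    if x <= a:
        return 0.0
    if x <= b:
        return (x - a) / (2 * (b - a))
    if x <= c:
        return (x + c - 2 * b) / (2 * (c - b))
    return 1.0

def normal_cdf(x, e, sigma):
    # normal uncertainty distribution N(e, sigma), Eq. (1.9)
    return 1.0 / (1.0 + math.exp(math.pi * (e - x) / (math.sqrt(3) * sigma)))

# spot checks: Φ equals 1/2 at the centre, and is monotone increasing
assert linear_cdf(1.0, 0.0, 2.0) == 0.5
assert zigzag_cdf(1.0, 0.0, 1.0, 3.0) == 0.5
assert abs(normal_cdf(0.0, 0.0, 1.0) - 0.5) < 1e-12
xs = [i / 10 for i in range(-50, 51)]
vals = [normal_cdf(x, 0.0, 1.0) for x in xs]
assert all(u <= v for u, v in zip(vals, vals[1:]))
print("distribution checks pass")
```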

Independence

Definition 1.5 ([5]) The uncertain variables ξ_1, ξ_2, ..., ξ_n are said to be independent if
M{∩_{i=1}^n (ξ_i ∈ B_i)} = min_{1≤i≤n} M{ξ_i ∈ B_i}   (1.11)
for any Borel sets B_1, B_2, ..., B_n.

Theorem 1.2 ([5]) The uncertain variables ξ_1, ξ_2, ..., ξ_n are independent if and only if
M{∪_{i=1}^n (ξ_i ∈ B_i)} = max_{1≤i≤n} M{ξ_i ∈ B_i}   (1.12)
for any Borel sets B_1, B_2, ..., B_n.

Theorem 1.3 ([5]) Let ξ_1, ξ_2, ..., ξ_n be independent uncertain variables, and let f_1, f_2, ..., f_n be measurable functions. Then f_1(ξ_1), f_2(ξ_2), ..., f_n(ξ_n) are independent uncertain variables.

1.2 Expected Value

Expected value is the average value of an uncertain variable in the sense of uncertain measure and represents the size of the uncertain variable.

Definition 1.6 ([1]) Let ξ be an uncertain variable. Then the expected value of ξ is defined by
E[ξ] = ∫_0^{+∞} M{ξ ≥ x} dx − ∫_{−∞}^0 M{ξ ≤ x} dx   (1.13)
provided that at least one of the two integrals is finite.

Theorem 1.4 ([1]) Let ξ be an uncertain variable with uncertainty distribution Φ. Then
E[ξ] = ∫_0^{+∞} (1 − Φ(x)) dx − ∫_{−∞}^0 Φ(x) dx.   (1.14)

Definition 1.7 ([4]) An uncertainty distribution Φ(x) is said to be regular if it is a continuous and strictly increasing function with respect to x at which 0 < Φ(x) < 1, and lim_{x→−∞} Φ(x) = 0, lim_{x→+∞} Φ(x) = 1.
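As a numerical illustration of formula (1.14), the following sketch (my own code, not from the book) approximates the two integrals for a normal uncertain variable N(1.5, 2) on a truncated range and recovers the expected value 1.5; the truncation range and step count are arbitrary assumptions.

```python
import math

def normal_cdf(x, e, sigma):
    # normal uncertainty distribution, Eq. (1.9)
    return 1.0 / (1.0 + math.exp(math.pi * (e - x) / (math.sqrt(3) * sigma)))

def expected_value(cdf, lo=-60.0, hi=60.0, n=200_000):
    # E[ξ] = ∫_0^∞ (1 − Φ(x)) dx − ∫_{−∞}^0 Φ(x) dx, Eq. (1.14),
    # approximated by the midpoint rule on [lo, hi]
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        total += (1.0 - cdf(x)) * h if x >= 0 else -cdf(x) * h
    return total

# the expected value of N(e, σ) should be e; try e = 1.5, σ = 2
approx = expected_value(lambda x: normal_cdf(x, 1.5, 2.0))
assert abs(approx - 1.5) < 1e-3
print("E ≈", round(approx, 3))
```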

Theorem 1.5 ([4]) Let ξ be an uncertain variable with regular uncertainty distribution Φ. Then
E[ξ] = ∫_0^1 Φ^{−1}(α) dα.   (1.15)

Theorem 1.6 ([7]) Assume ξ_1, ξ_2, ..., ξ_n are independent uncertain variables with regular uncertainty distributions Φ_1, Φ_2, ..., Φ_n, respectively. If f(x_1, x_2, ..., x_n) is strictly increasing with respect to x_1, x_2, ..., x_m and strictly decreasing with respect to x_{m+1}, x_{m+2}, ..., x_n, then the uncertain variable ξ = f(ξ_1, ξ_2, ..., ξ_n) has an expected value
E[ξ] = ∫_0^1 f(Φ_1^{−1}(α), ..., Φ_m^{−1}(α), Φ_{m+1}^{−1}(1 − α), ..., Φ_n^{−1}(1 − α)) dα.   (1.16)

Theorem 1.7 ([4]) Let ξ and η be independent uncertain variables with finite expected values. Then for any real numbers a and b, we have
E[aξ + bη] = aE[ξ] + bE[η].   (1.17)

Definition 1.8 ([1]) Let ξ be an uncertain variable with finite expected value e. Then the variance of ξ is
V[ξ] = E[(ξ − e)²].   (1.18)

Let ξ be an uncertain variable with expected value e. If we only know its uncertainty distribution Φ, then the variance
V[ξ] = ∫_0^{+∞} M{(ξ − e)² ≥ x} dx
     = ∫_0^{+∞} M{(ξ ≥ e + √x) ∪ (ξ ≤ e − √x)} dx
     ≤ ∫_0^{+∞} (M{ξ ≥ e + √x} + M{ξ ≤ e − √x}) dx
     = ∫_0^{+∞} (1 − Φ(e + √x) + Φ(e − √x)) dx.
Thus the following stipulation is introduced.

Stipulation. Let ξ be an uncertain variable with uncertainty distribution Φ and finite expected value e. Then
V[ξ] = ∫_0^{+∞} (1 − Φ(e + √x) + Φ(e − √x)) dx.   (1.19)

Now let us give an estimation for the expected value of aξ + ξ² if ξ is a normal uncertain variable [8].

Theorem 1.8 ([8]) Let ξ be a normal uncertain variable with expected value 0 and variance σ² (σ > 0), whose uncertainty distribution is
Φ(x) = (1 + exp(−πx/(√3 σ)))^{−1}, x ∈ R.
Then for any real number a,
σ²/2 ≤ E[aξ + ξ²] ≤ σ².   (1.20)

Proof We only need to verify the conclusion in the case a > 0, because a similar method applies to the case a ≤ 0. Let
x_1 = (−a − √(a² + 4r))/2,  x_2 = (−a + √(a² + 4r))/2,
which are the solutions of the equation ax + x² = r for any real number r ≥ −a²/4 (denote y_0 = −a²/4). Then
E[aξ + ξ²] = ∫_0^{+∞} M{aξ + ξ² ≥ r} dr − ∫_{−∞}^0 M{aξ + ξ² ≤ r} dr
           = ∫_0^{+∞} M{(ξ ≤ x_1) ∪ (ξ ≥ x_2)} dr − ∫_{y_0}^0 M{(ξ ≥ x_1) ∩ (ξ ≤ x_2)} dr.   (1.21)
Since
M{ξ ≤ x_2} = M{((ξ ≥ x_1) ∩ (ξ ≤ x_2)) ∪ (ξ ≤ x_1)} ≤ M{(ξ ≥ x_1) ∩ (ξ ≤ x_2)} + M{ξ ≤ x_1},
we have
M{(ξ ≥ x_1) ∩ (ξ ≤ x_2)} ≥ M{ξ ≤ x_2} − M{ξ ≤ x_1} = Φ(x_2) − Φ(x_1).
Notice that
M{(ξ ≤ x_1) ∪ (ξ ≥ x_2)} ≤ M{ξ ≤ x_1} + M{ξ ≥ x_2} = Φ(x_1) + 1 − Φ(x_2).
Hence, it follows from (1.21) that

E[aξ + ξ²] ≤ ∫_0^{+∞} (Φ(x_1) + 1 − Φ(x_2)) dr − ∫_{y_0}^0 (Φ(x_2) − Φ(x_1)) dr.
Substituting r = ax + x² on each branch (so that dr = (a + 2x) dx, with x = x_1 on one branch and x = x_2 on the other) turns every term into an integral of the form ∫ (a + 2x)/(1 + exp(πx/(√3 σ))) dx over a half-line, and collecting the pieces gives
E[aξ + ξ²] ≤ 4 ∫_0^{+∞} x/(1 + exp(πx/(√3 σ))) dx = σ².   (1.22)
On the other hand, since
M{(ξ ≥ x_1) ∩ (ξ ≤ x_2)} ≤ M{ξ ≤ x_2} = Φ(x_2)
and
M{(ξ ≤ x_1) ∪ (ξ ≥ x_2)} ≥ M{ξ ≥ x_2} = 1 − Φ(x_2),
it follows from (1.21) that

E[aξ + ξ²] ≥ ∫_0^{+∞} (1 − Φ(x_2)) dr − ∫_{y_0}^0 Φ(x_2) dr.
With the same substitution r = ax_2 + x_2², both terms reduce to integrals of x/(1 + exp(πx/(√3 σ))) over half-lines, and a direct computation shows that the right-hand side is at least σ²/2, that is,
E[aξ + ξ²] ≥ σ²/2.   (1.23)
Combining (1.22) and (1.23) yields the conclusion. The theorem is proved.

Given an increasing function Φ(x) whose values are in [0, 1], Peng and Iwamura [6] introduced an uncertainty space (R, B, M) as follows. Let B be the Borel algebra over R. Let C be the collection of all intervals of the form (−∞, a], (b, +∞), ∅, and R. The uncertain measure M is provided in such a way: first,
M{(−∞, a]} = Φ(a),  M{(b, +∞)} = 1 − Φ(b),  M{∅} = 0,  M{R} = 1.
Second, for any B ∈ B, there exists a sequence {A_i} in C such that B ⊂ ∪_{i=1}^∞ A_i. Thus
M{B} =
   inf_{B ⊂ ∪A_i} Σ_{i=1}^∞ M{A_i}, if inf_{B ⊂ ∪A_i} Σ_{i=1}^∞ M{A_i} < 0.5;
   1 − inf_{B^c ⊂ ∪A_i} Σ_{i=1}^∞ M{A_i}, if inf_{B^c ⊂ ∪A_i} Σ_{i=1}^∞ M{A_i} < 0.5;
   0.5, otherwise.   (1.24)
The uncertain variable defined by ξ(γ) = γ from the uncertainty space (R, B, M) to R has the uncertainty distribution Φ.
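For a closed interval B = [l, u] and a continuous Φ, the infima in (1.24) can be computed from the cheapest covers: (−∞, u] or (b, +∞) with b → l for B itself, and (−∞, l] together with (u, +∞) for B^c. The sketch below (this reduction and all names are my own, not text from the book) evaluates the resulting measure for a normal distribution.

```python
# Sketch: the Peng–Iwamura measure (1.24) evaluated on a closed
# interval [l, u] for a continuous distribution Φ.
import math

def normal_cdf(x, e=0.0, sigma=1.0):
    return 1.0 / (1.0 + math.exp(math.pi * (e - x) / (math.sqrt(3) * sigma)))

def measure_interval(phi, l, u):
    direct = min(phi(u), 1.0 - phi(l))      # cheapest cover of [l, u]
    complement = phi(l) + (1.0 - phi(u))    # cheapest cover of its complement
    if direct < 0.5:
        return direct
    if complement < 0.5:
        return 1.0 - complement             # = Φ(u) − Φ(l)
    return 0.5

# sanity checks on a few intervals
for (l, u) in [(-0.2, 0.1), (-3.0, 3.0), (0.5, 2.0), (-2.0, -1.0)]:
    m = measure_interval(normal_cdf, l, u)
    assert 0.0 <= m <= 1.0
# a wide symmetric interval carries a large measure ...
assert measure_interval(normal_cdf, -3.0, 3.0) > 0.9
# ... while a tiny one is pinned at 0.5, since both covers cost at least 0.5
assert measure_interval(normal_cdf, -0.05, 0.05) == 0.5
print("interval measures consistent with (1.24)")
```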

Note that for a monotone increasing function Φ(x) other than Φ(x) ≡ 0 and Φ(x) ≡ 1, there may be multiple uncertain variables whose uncertainty distributions are just Φ(x). However, for any one ξ among them, the uncertain measure of the event {ξ ∈ B} for a Borel set B may not be analytically expressed by Φ(x), and for any two ξ and η among them, the uncertain measure of {ξ ∈ B} may differ from that of {η ∈ B}. These facts result in inconvenience in practice. Which one among them should we choose for reasonable and convenient use?

Let us consider the uncertain variable ξ defined by ξ(γ) = γ on the uncertainty space (R, B, M) with the uncertainty distribution Φ(x), where the uncertain measure M is defined by (1.24), and another uncertain variable ξ_1 on an uncertainty space (Γ_1, L_1, M_1). For each A ∈ C, we have M{ξ ∈ A} = M_1{ξ_1 ∈ A}. For any Borel set B ⊂ R, if B ⊂ ∪_{i=1}^∞ A_i with Σ_{i=1}^∞ M{A_i} < 0.5, then
M_1{ξ_1 ∈ B} ≤ M_1{∪_{i=1}^∞ (ξ_1 ∈ A_i)} ≤ Σ_{i=1}^∞ M_1{ξ_1 ∈ A_i} = Σ_{i=1}^∞ M{ξ ∈ A_i} < 0.5;
if B^c ⊂ ∪_{i=1}^∞ A_i with Σ_{i=1}^∞ M{A_i} < 0.5, then
M_1{ξ_1 ∈ B} = 1 − M_1{ξ_1 ∈ B^c} ≥ 1 − Σ_{i=1}^∞ M_1{ξ_1 ∈ A_i} = 1 − Σ_{i=1}^∞ M{ξ ∈ A_i} > 0.5.
Thus
M_1{ξ_1 ∈ B} ≤ M{ξ ∈ B} = inf_{B ⊂ ∪A_i} Σ_{i=1}^∞ M{A_i} < 0.5 if inf_{B ⊂ ∪A_i} Σ_{i=1}^∞ M{A_i} < 0.5,
and
M_1{ξ_1 ∈ B} ≥ M{ξ ∈ B} = 1 − inf_{B^c ⊂ ∪A_i} Σ_{i=1}^∞ M{A_i} > 0.5 if inf_{B^c ⊂ ∪A_i} Σ_{i=1}^∞ M{A_i} < 0.5.
In other cases, M{ξ ∈ B} = 0.5. Therefore, the uncertain measure of {ξ ∈ B} is always closer to 0.5 than that of {ξ_1 ∈ B}. Based on the maximum uncertainty principle [4],

we adopt the uncertain variable ξ defined on (R, B, M) for use in our discussion if only the uncertainty distribution is provided.

Definition 1.9 ([9]) An uncertain variable ξ with distribution Φ(x) is an ordinary uncertain variable if it is from the uncertainty space (R, B, M) to R defined by ξ(γ) = γ, where B is the Borel algebra over R and M is defined by (1.24).

Let Φ(x) be continuous. For the uncertain measure M defined by (1.24), we know that M{(−∞, a)} = Φ(a) and M{[b, +∞)} = 1 − Φ(b).

Definition 1.10 ([9]) An uncertain vector ξ = (ξ_1, ξ_2, ..., ξ_n) is ordinary if every uncertain variable ξ_i is ordinary for i = 1, 2, ..., n.

Distribution of Function of Uncertain Variable

Let us discuss the distribution of f(ξ) for an ordinary uncertain variable ξ or an ordinary uncertain vector. Assume C is the collection of all intervals of the form (−∞, a], (b, +∞), ∅, and R. Each element A_i appearing in the sequel is in C.

Theorem 1.9 ([9]) (i) Let ξ be an ordinary uncertain variable with continuous distribution Φ(x) and f(x) a Borel function. Then the distribution of the uncertain variable f(ξ) is
Ψ(x) = M{f(ξ) ≤ x} =
   inf_{{f(ξ)≤x} ⊂ ∪A_i} Σ_{i=1}^∞ M{A_i}, if inf_{{f(ξ)≤x} ⊂ ∪A_i} Σ_{i=1}^∞ M{A_i} < 0.5;
   1 − inf_{{f(ξ)>x} ⊂ ∪A_i} Σ_{i=1}^∞ M{A_i}, if inf_{{f(ξ)>x} ⊂ ∪A_i} Σ_{i=1}^∞ M{A_i} < 0.5;
   0.5, otherwise.   (1.25)
(ii) Let f: R^n → R be a Borel function, and let ξ = (ξ_1, ξ_2, ..., ξ_n) be an ordinary uncertain vector. Then the distribution of the uncertain variable f(ξ) is
Ψ(x) = M{f(ξ_1, ξ_2, ..., ξ_n) ≤ x} = M{(ξ_1, ξ_2, ..., ξ_n) ∈ f^{−1}(−∞, x]} =
   sup_{Λ_1×···×Λ_n ⊂ Λ} min_{1≤k≤n} M_k{Λ_k}, if sup_{Λ_1×···×Λ_n ⊂ Λ} min_{1≤k≤n} M_k{Λ_k} > 0.5;
   1 − sup_{Λ_1×···×Λ_n ⊂ Λ^c} min_{1≤k≤n} M_k{Λ_k}, if sup_{Λ_1×···×Λ_n ⊂ Λ^c} min_{1≤k≤n} M_k{Λ_k} > 0.5;
   0.5, otherwise   (1.26)
where Λ = f^{−1}(−∞, x], and each M_k{Λ_k} is derived from (1.24).

Proof The conclusions follow directly from (1.24) and (1.5), respectively.

Theorem 1.10 ([9]) Let ξ be an ordinary uncertain variable with continuous distribution Φ(x). For real numbers b and c, denote
x_1 = (−b − √(b² − 4(c − x)))/2,  x_2 = (−b + √(b² − 4(c − x)))/2
for x ≥ c − b²/4. Then the distribution of the uncertain variable ξ² + bξ + c is
Ψ(x) =
   0, if x < c − b²/4;
   Φ(x_2) ∧ (1 − Φ(x_1)), if Φ(x_2) ∧ (1 − Φ(x_1)) < 0.5;
   Φ(x_2) − Φ(x_1), if Φ(x_2) − Φ(x_1) > 0.5;
   0.5, otherwise.   (1.27)

Proof For x < c − b²/4, we have
Ψ(x) = M{ξ² + bξ + c ≤ x} = M{∅} = 0.
Let x ≥ c − b²/4 in the sequel. Then
Ψ(x) = M{ξ² + bξ + c ≤ x} = M{x_1 ≤ ξ ≤ x_2} = M{[x_1, x_2]}.
The conclusion will be proved by (1.24). Since [x_1, x_2] ⊂ (−∞, x_2] and [x_1, x_2] ⊂ [x_1, +∞), and M{(−∞, x_2]} = Φ(x_2) and M{[x_1, +∞)} = 1 − Φ(x_1), we have Ψ(x) = Φ(x_2) ∧ (1 − Φ(x_1)) if Φ(x_2) ∧ (1 − Φ(x_1)) < 0.5. Since [x_1, x_2]^c = (−∞, x_1) ∪ (x_2, +∞), we have Ψ(x) = 1 − (Φ(x_1) + 1 − Φ(x_2)) = Φ(x_2) − Φ(x_1) if M{(−∞, x_1)} + M{(x_2, +∞)} = Φ(x_1) + 1 − Φ(x_2) < 0.5, that is, Φ(x_2) − Φ(x_1) > 0.5. Otherwise Ψ(x) = 0.5. The proof of the theorem is completed.

Expected Value of Function of Uncertain Variable

If the expected value of an uncertain variable ξ with uncertainty distribution Φ(x) exists, then
E[ξ] = ∫_0^{+∞} (1 − Φ(x)) dx − ∫_{−∞}^0 Φ(x) dx;

or
E[ξ] = ∫_0^1 Φ^{−1}(α) dα
provided that Φ^{−1}(α) exists and is unique for each α ∈ (0, 1). Thus, if we obtain the uncertainty distribution Ψ(x) of f(ξ), the expected value of f(ξ) is easily derived from
E[f(ξ)] = ∫_0^{+∞} (1 − Ψ(x)) dx − ∫_{−∞}^0 Ψ(x) dx.   (1.28)
For a monotone function f(x), Theorem 1.6 gives a formula to compute the expected value of f(ξ) from the uncertainty distribution Φ(x) of ξ. For a nonmonotone function f(x), however, we generally cannot present such a formula, because the uncertainty distribution Ψ(x) of f(ξ) may not be analytically expressed by Φ(x). Now if we consider an ordinary uncertain variable ξ, the uncertainty distribution Ψ(x) of f(ξ) may be presented by (1.25), and then the expected value of f(ξ) can be obtained by (1.28). Next, we will give some examples to show how to compute the expected value of f(ξ) for an ordinary uncertain variable ξ no matter whether f(x) is monotone.

Example 1.5 Let ξ be an ordinary linear uncertain variable L(a, b) with the distribution (see Fig. 1.1)
Φ(x) = 0, if x ≤ a; (x − a)/(b − a), if a ≤ x ≤ b; 1, if x ≥ b.
The expected value of ξ is e = (a + b)/2. Now we consider the variance of ξ: V[ξ] = E[(ξ − e)²]. Let the uncertainty distribution of (ξ − e)² be Ψ(x). Let x ≥ 0, and x_1 = e − √x, x_2 = e + √x. If √x ≥ (b − a)/2, then x_2 ≥ b and x_1 ≤ a. Thus Ψ(x) = Φ(x_2) − Φ(x_1) = 1. If √x ≤ (b − a)/2, then e ≤ x_2 ≤ b and a ≤ x_1 ≤ e. Thus Φ(x_2) ∧ (1 − Φ(x_1)) ≥ 0.5. When Φ(x_2) − Φ(x_1) = 2√x/(b − a) > 0.5,

Fig. 1.1 Linear uncertainty distribution

Fig. 1.2 Uncertainty distribution of (ξ − e)²

that is, √x > (b − a)/4, we have Ψ(x) = Φ(x_2) − Φ(x_1) = 2√x/(b − a). Hence, the uncertainty distribution of (ξ − e)² (see Fig. 1.2) is
Ψ(x) =
   0, if x < 0;
   0.5, if 0 ≤ x ≤ (b − a)²/16;
   2√x/(b − a), if (b − a)²/16 ≤ x ≤ (b − a)²/4;
   1, if x ≥ (b − a)²/4
by (1.27). The variance of ξ is
V[ξ] = E[(ξ − e)²] = ∫_0^{+∞} (1 − Ψ(x)) dx
     = ∫_0^{(b−a)²/16} (1/2) dx + ∫_{(b−a)²/16}^{(b−a)²/4} (1 − 2√x/(b − a)) dx
     = 7(b − a)²/96.

Example 1.6 Let ξ be an ordinary linear uncertain variable L(−1, 1) with the distribution
Φ(x) = 0, if x ≤ −1; (x + 1)/2, if −1 ≤ x ≤ 1; 1, if x ≥ 1.
We will consider the expected value E[ξ² + bξ] for a real number b. Let the uncertainty distribution of the uncertain variable η = ξ² + bξ be Ψ(x). For x ≥ −b²/4, denote
x_1 = (−b − √(b² + 4x))/2,  x_2 = (−b + √(b² + 4x))/2.

(I) If b = 0, then E[ξ²] = 7/24 by Example 1.5.

(II) If b ≥ 2, then
Ψ(x) = 0, if x < 1 − b; Φ(x_2), if x ≥ 1 − b
by (1.27). Note that x = x_2² + bx_2. Thus, with the substitution y = x_2 (so that x = y² + by, dx = (2y + b) dy, and Φ(x_2) = (y + 1)/2),
E[ξ² + bξ] = ∫_0^{+∞} (1 − Ψ(x)) dx − ∫_{−∞}^0 Ψ(x) dx
           = ∫_0^{1+b} (1 − Φ(x_2)) dx − ∫_{1−b}^0 Φ(x_2) dx
           = ∫_0^1 ((1 − y)/2)(2y + b) dy − ∫_{−1}^0 ((y + 1)/2)(2y + b) dy
           = 1/3.

(III) If 1 ≤ b < 2, then
Ψ(x) = 0, if x < −b²/4; Φ(x_2), if x ≥ −b²/4.
Thus
E[ξ² + bξ] = ∫_0^{1+b} (1 − Φ(x_2)) dx − ∫_{−b²/4}^0 Φ(x_2) dx
           = (1/48)(b³ − 6b² + 12b + 8).

(IV) If 0 < b < 1, then
Ψ(x) =
   0, if x < −b²/4;
   Φ(x_2), if −b²/4 ≤ x < 0;
   0.5, if 0 ≤ x ≤ (1 − b²)/4;
   Φ(x_2) − Φ(x_1), if x > (1 − b²)/4.
Thus
E[ξ² + bξ] = ∫_0^{(1−b²)/4} (1/2) dx + ∫_{(1−b²)/4}^{1−b} (1 − Φ(x_2) + Φ(x_1)) dx + ∫_{1−b}^{1+b} (1 − Φ(x_2)) dx − ∫_{−b²/4}^0 Φ(x_2) dx
           = (1/48)(b³ + 12b² − 12b + 14).

(V) If b ≤ −2, then
Ψ(x) = 0, if x < 1 + b; 1 − Φ(x_1), if x ≥ 1 + b.
Also we have E[ξ² + bξ] = 1/3.

(VI) If −2 < b ≤ −1, then
Ψ(x) = 0, if x < −b²/4; 1 − Φ(x_1), if x ≥ −b²/4.
Thus E[ξ² + bξ] = (1/48)(−b³ − 6b² − 12b + 8).

(VII) If −1 < b < 0, then
Ψ(x) =
   0, if x < −b²/4;
   1 − Φ(x_1), if −b²/4 ≤ x < 0;
   0.5, if 0 ≤ x ≤ (1 − b²)/4;
   Φ(x_2) − Φ(x_1), if x > (1 − b²)/4.
Thus

E[ξ² + bξ] = ∫_0^{+∞} (1 − Ψ(x)) dx − ∫_{−∞}^0 Ψ(x) dx
           = ∫_0^{(1−b²)/4} (1/2) dx + ∫_{(1−b²)/4}^{1+b} (1 − Φ(x_2) + Φ(x_1)) dx + ∫_{1+b}^{1−b} Φ(x_1) dx − ∫_{−b²/4}^0 (1 − Φ(x_1)) dx
           = (1/48)(−b³ + 12b² + 12b + 14).

1.3 Optimistic Value and Pessimistic Value

Definition 1.11 ([1]) Let ξ be an uncertain variable, and α ∈ (0, 1]. Then
ξ_sup(α) = sup{r | M{ξ ≥ r} ≥ α}
is called the α-optimistic value to ξ; and
ξ_inf(α) = inf{r | M{ξ ≤ r} ≥ α}
is called the α-pessimistic value to ξ.

Example 1.7 Let ξ be a normal uncertain variable N(e, σ) (σ > 0). Then its α-optimistic value and α-pessimistic value are
ξ_sup(α) = e − (√3 σ/π) ln(α/(1 − α))  and  ξ_inf(α) = e + (√3 σ/π) ln(α/(1 − α)).

Theorem 1.11 ([1]) Assume that ξ is an uncertain variable. Then we have
(a) if λ ≥ 0, then (λξ)_sup(α) = λ ξ_sup(α) and (λξ)_inf(α) = λ ξ_inf(α);
(b) if λ < 0, then (λξ)_sup(α) = λ ξ_inf(α) and (λξ)_inf(α) = λ ξ_sup(α);
(c) (ξ + η)_sup(α) = ξ_sup(α) + η_sup(α) if ξ and η are independent.

Let us give an estimation for the α-optimistic value of aξ + bξ² if ξ is a normal uncertain variable (α ∈ (0, 1)).

Theorem 1.12 ([10]) Let ξ be a normal uncertain variable with expected value 0 and variance σ² (σ > 0), whose uncertainty distribution is
Φ(x) = (1 + exp(−πx/(√3 σ)))^{−1}, x ∈ R.
Then for any real number a and any small enough ε > 0,

[aξ + bξ²]_sup(α) ≥ (√3 |a| σ/π) ln((1 − α)/α) + (3bσ²/π²) (ln((1 − α)/α))²,   (1.29)
[aξ + bξ²]_sup(α) ≤ (√3 |a| σ/π) ln((1 − α + ε)/(α − ε)) + (3bσ²/π²) (ln((2 − ε)/ε))²   (1.30)
if b > 0; and
[aξ + bξ²]_sup(α) ≥ (√3 |a| σ/π) ln((1 − α − ε)/(α + ε)) + (3bσ²/π²) (ln((2 − ε)/ε))²,   (1.31)
[aξ + bξ²]_sup(α) ≤ (√3 |a| σ/π) ln((1 − α)/α) + (3bσ²/π²) (ln((1 − α)/α))²   (1.32)
if b < 0; and also
[aξ + bξ²]_sup(α) = (√3 |a| σ/π) ln((1 − α)/α)   (1.33)
if b = 0.

Proof (I) We first verify the conclusion in the case b > 0. Let
x_1 = (−a − √(a² + 4by))/(2b),  x_2 = (−a + √(a² + 4by))/(2b),
which are derived from the solutions of the equation ax + bx² = y for any real number y ≥ y_0 = −a²/(4b) (when y < y_0, M{aξ + bξ² ≥ y} = 1). If a ≥ 0, we have
M{aξ + bξ² ≥ y} = M{(ξ ≤ x_1) ∪ (ξ ≥ x_2)} ≥ M{ξ ≥ x_2} = 1 − Φ(x_2).
Letting 1 − Φ(x_2) = α, we get
y = ax_2 + bx_2² = aΦ^{−1}(1 − α) + b[Φ^{−1}(1 − α)]²
  = a (√3 σ/π) ln((1 − α)/α) + b (3σ²/π²) (ln((1 − α)/α))².
By the definition of the α-optimistic value, we have
[aξ + bξ²]_sup(α) ≥ a (√3 σ/π) ln((1 − α)/α) + b (3σ²/π²) (ln((1 − α)/α))².

If a < 0, we have
M{aξ + bξ² ≥ y} ≥ M{ξ ≤ x_1} = Φ(x_1).
Letting Φ(x_1) = α, we have
[aξ + bξ²]_sup(α) ≥ |a| (√3 σ/π) ln((1 − α)/α) + b (3σ²/π²) (ln((1 − α)/α))².
Hence, for any real number a, we obtain inequality (1.29).

On the other hand, for ε > 0 small enough, there exists a d = d_ε > 0 such that M{ξ ≥ d} = M{ξ ≤ −d} = ε/2. In fact, it follows from Φ(−d) = M{ξ ≤ −d} = ε/2 that d = (√3 σ/π) ln((2 − ε)/ε). Note that
{aξ + bξ² ≥ y} = {aξ + bξ² ≥ y, −d ≤ ξ ≤ d} ∪ {aξ + bξ² ≥ y, ξ < −d or ξ > d}.
For each γ ∈ {γ | aξ(γ) + bξ(γ)² ≥ y, −d ≤ ξ(γ) ≤ d}, we have
aξ(γ) + bξ(γ)² ≤ aξ(γ) + bd².
Then we get
{aξ + bξ² ≥ y} ⊂ {aξ + bd² ≥ y} ∪ {ξ ≤ −d} ∪ {ξ ≥ d}.
So we have M{aξ + bξ² ≥ y} ≤ M{aξ + bd² ≥ y} + ε. Letting M{aξ + bξ² ≥ y} ≥ α, we have
M{aξ + bd² ≥ y} + ε ≥ α, or M{aξ + bd² ≥ y} ≥ α − ε.   (1.34)
It follows from inequality (1.34) and the definition of optimistic value that
y ≤ (aξ + bd²)_sup(α − ε) = (aξ)_sup(α − ε) + bd².
If a ≥ 0, then
y ≤ a ξ_sup(α − ε) + bd² = a (√3 σ/π) ln((1 − α + ε)/(α − ε)) + b (3σ²/π²) (ln((2 − ε)/ε))².

If a < 0, then
y ≤ a ξ_inf(α − ε) + bd² = |a| (√3 σ/π) ln((1 − α + ε)/(α − ε)) + b (3σ²/π²) (ln((2 − ε)/ε))².
Therefore, inequality (1.30) holds.
(II) In the case b < 0, we can prove inequalities (1.31) and (1.32) by a method similar to the above process.
(III) When b = 0: if a ≥ 0, then [aξ + bξ²]_sup(α) = a ξ_sup(α) = a (√3 σ/π) ln((1 − α)/α); if a < 0, then [aξ + bξ²]_sup(α) = a ξ_inf(α) = |a| (√3 σ/π) ln((1 − α)/α). Thus, Eq. (1.33) is obtained. The theorem is proved.

Similarly, we can get an estimation for the α-pessimistic value of aξ + bξ² if ξ is a normal uncertain variable (α ∈ (0, 1)).

Theorem 1.13 ([4]) Let ξ be a normal uncertain variable with expected value 0 and variance σ² (σ > 0), whose uncertainty distribution is
Φ(x) = (1 + exp(−πx/(√3 σ)))^{−1}, x ∈ R.
Then for any real number a and any small enough ε > 0,
[aξ + bξ²]_inf(α) ≥ (√3 |a| σ/π) ln(α/(1 − α)) + (3bσ²/π²) (ln(α/(1 − α)))²,   (1.35)
[aξ + bξ²]_inf(α) ≤ (√3 |a| σ/π) ln((α + ε)/(1 − α − ε)) + (3bσ²/π²) (ln((2 − ε)/ε))²   (1.36)
if b > 0; and
[aξ + bξ²]_inf(α) ≥ (√3 |a| σ/π) ln((α − ε)/(1 − α + ε)) + (3bσ²/π²) (ln((2 − ε)/ε))²,   (1.37)
[aξ + bξ²]_inf(α) ≤ (√3 |a| σ/π) ln(α/(1 − α)) + (3bσ²/π²) (ln(α/(1 − α)))²   (1.38)
if b < 0; and also
[aξ + bξ²]_inf(α) = (√3 |a| σ/π) ln(α/(1 − α))   (1.39)
if b = 0.

Proof According to Theorem 1.11, we have
[aξ + bξ²]_inf(α) = −[−aξ − bξ²]_sup(α).
Then, applying Theorem 1.12, the conclusions are easily proved.

1.4 Uncertain Simulation

It follows from Theorem 1.10 and the examples in the above section that the uncertainty distribution Ψ(x) of f(ξ) may be analytically expressed by (1.27) for a quadratic function f(x). But Ψ(x) can hardly be analytically expressed for other kinds of functions. Now we introduce uncertain simulation approaches [9] for the uncertainty distribution Ψ(x), the optimistic value f_sup, and the expected value E[f(ξ)] of f(ξ), based on (1.25) and (1.26).

(a) Let ξ = (ξ_1, ξ_2, ..., ξ_n) be an ordinary uncertain vector where ξ_i is an ordinary uncertain variable with continuous uncertainty distribution Φ_i(x) for i = 1, 2, ..., n, and let f: R^n → R be a Borel function. We use Algorithm 1.1 to simulate the uncertain measure
L = M{f(ξ) ≤ 0}.

Algorithm 1.1 (Uncertain simulation for L)
Step 1. Set m_1(i) = 0 and m_2(i) = 0, i = 1, 2, ..., n.
Step 2. Randomly generate u_k = (γ_k^(1), γ_k^(2), ..., γ_k^(n)) with 0 < Φ_i(γ_k^(i)) < 1, i = 1, 2, ..., n, k = 1, 2, ..., N.
Step 3. Rank γ_k^(i) from small to large as γ_1^(i) ≤ γ_2^(i) ≤ ··· ≤ γ_N^(i), i = 1, 2, ..., n.
Step 4. From k = 1 to k = N: if f(u_k) ≤ 0, set m_1(i) = m_1(i) + 1 and denote x^(i)_{m_1(i)} = γ_k^(i); otherwise, set m_2(i) = m_2(i) + 1 and denote y^(i)_{m_2(i)} = γ_k^(i), i = 1, 2, ..., n.
Step 5. Set
a^(i) = Φ(x^(i)_{m_1(i)}) ∧ (1 − Φ(x^(i)_1)) ∧ (Φ(x^(i)_1) + 1 − Φ(x^(i)_2)) ∧ ··· ∧ (Φ(x^(i)_{m_1(i)−1}) + 1 − Φ(x^(i)_{m_1(i)}));
b^(i) = Φ(y^(i)_{m_2(i)}) ∧ (1 − Φ(y^(i)_1)) ∧ (Φ(y^(i)_1) + 1 − Φ(y^(i)_2)) ∧ ··· ∧ (Φ(y^(i)_{m_2(i)−1}) + 1 − Φ(y^(i)_{m_2(i)})),
i = 1, 2, ..., n.
Step 6. If a^(i) < 0.5, return L^(i)_1 = a^(i), L^(i)_2 = 1 − a^(i); if b^(i) < 0.5, return L^(i)_1 = 1 − b^(i), L^(i)_2 = b^(i); otherwise, return L^(i)_1 = 0.5, L^(i)_2 = 0.5, i = 1, 2, ..., n.
Step 7. If a = L^(1)_1 ∧ L^(2)_1 ∧ ··· ∧ L^(n)_1 > 0.5, then L = a; if b = L^(1)_2 ∧ L^(2)_2 ∧ ··· ∧ L^(n)_2 > 0.5, then L = 1 − b; otherwise, L = 0.5.

(b) Let ξ = (ξ_1, ξ_2, ..., ξ_n) be an ordinary uncertain vector where ξ_i is an ordinary uncertain variable with continuous uncertainty distribution Φ_i(x) for i = 1, 2, ..., n,

and let $f:\mathbb{R}^n\to\mathbb{R}$ be a Borel function. Algorithm 1.2 is used to simulate the optimistic value
$$f_{\sup}=\sup\{r\mid \mathcal{M}\{f(\xi)\ge r\}\ge\alpha\},$$
where $\alpha\in(0,1)$ is a predetermined confidence level.

Algorithm 1.2 (Uncertain simulation for $f_{\sup}$)
Step 1. Randomly generate $u_k=(\gamma_k^{(1)},\gamma_k^{(2)},\ldots,\gamma_k^{(n)})$ with $0<\Phi_i(\gamma_k^{(i)})<1$, $i=1,2,\ldots,n$, $k=1,2,\ldots,m$.
Step 2. Set $a=f(u_1)\wedge f(u_2)\wedge\cdots\wedge f(u_m)$, $b=f(u_1)\vee f(u_2)\vee\cdots\vee f(u_m)$.
Step 3. Set $r=(a+b)/2$.
Step 4. If $\mathcal{M}\{f(\xi)\ge r\}\ge\alpha$, then $a\leftarrow r$.
Step 5. If $\mathcal{M}\{f(\xi)\ge r\}<\alpha$, then $b\leftarrow r$.
Step 6. Repeat the third to fifth steps until $b-a<\epsilon$ for a sufficiently small number $\epsilon$.
Step 7. $f_{\sup}=(a+b)/2$.

(c) Let $\xi=(\xi_1,\xi_2,\ldots,\xi_n)$ be an ordinary uncertain vector where $\xi_i$ is an ordinary uncertain variable with continuous uncertainty distribution $\Phi_i(x)$ for $i=1,2,\ldots,n$, and let $f:\mathbb{R}^n\to\mathbb{R}$ be a Borel function. The expected value $E[f(\xi)]$ is approximated by Algorithm 1.3.

Algorithm 1.3 (Uncertain simulation for $E$)
Step 1. Set $E=0$.
Step 2. Randomly generate $u_k=(\gamma_k^{(1)},\gamma_k^{(2)},\ldots,\gamma_k^{(n)})$ with $0<\Phi_i(\gamma_k^{(i)})<1$, $i=1,2,\ldots,n$, $k=1,2,\ldots,m$.
Step 3. Set $a=f(u_1)\wedge f(u_2)\wedge\cdots\wedge f(u_m)$, $b=f(u_1)\vee f(u_2)\vee\cdots\vee f(u_m)$.
Step 4. Randomly generate $r$ from $[a,b]$.
Step 5. If $r\ge 0$, then $E\leftarrow E+\mathcal{M}\{f(\xi)\ge r\}$.
Step 6. If $r<0$, then $E\leftarrow E-\mathcal{M}\{f(\xi)\le r\}$.
Step 7. Repeat the fourth to sixth steps $N$ times.
Step 8. $E[f(\xi)]=a\vee 0+b\wedge 0+E\cdot(b-a)/N$.

1.5 Uncertain Process

The study of uncertain processes was started by Liu [11] in 2008 for modeling the evolution of uncertain phenomena.

Definition 1.12 ([11]) Let $(\Gamma,\mathcal{L},\mathcal{M})$ be an uncertainty space and let $T$ be a totally ordered set (e.g., time). An uncertain process is a function $X_t(\gamma)$ from $T\times(\Gamma,\mathcal{L},\mathcal{M})$

to the set of real numbers such that $\{X_t\in B\}$ is an event for any Borel set $B$ at each time $t$.

Remark 1.1 If $X_t$ is an uncertain process, then $X_t$ is an uncertain variable at each time $t$.

Example 1.8 Let $a$ and $b$ be real numbers with $a<b$. Assume $X_t$ is a linear uncertain variable, i.e.,
$$X_t\sim\mathcal{L}(at,bt) \quad (1.40)$$
at each time $t$. Then $X_t$ is an uncertain process.

Example 1.9 Let $a,b,c$ be real numbers with $a<b<c$. Assume $X_t$ is a zigzag uncertain variable, i.e.,
$$X_t\sim\mathcal{Z}(at,bt,ct) \quad (1.41)$$
at each time $t$. Then $X_t$ is an uncertain process.

Example 1.10 Let $e$ and $\sigma$ be real numbers with $\sigma>0$. Assume $X_t$ is a normal uncertain variable, i.e.,
$$X_t\sim\mathcal{N}(et,\sigma t) \quad (1.42)$$
at each time $t$. Then $X_t$ is an uncertain process.

Definition 1.13 ([11]) An uncertain process $X_t$ is said to have independent increments if
$$X_{t_0},\ X_{t_1}-X_{t_0},\ X_{t_2}-X_{t_1},\ \ldots,\ X_{t_k}-X_{t_{k-1}} \quad (1.43)$$
are independent uncertain variables, where $t_0$ is the initial time and $t_1,t_2,\ldots,t_k$ are any times with $t_0<t_1<\cdots<t_k$. An uncertain process $X_t$ is said to have stationary increments if its increments are identically distributed uncertain variables whenever the time intervals have the same length; i.e., for any given $t>0$, the increments $X_{s+t}-X_s$ are identically distributed uncertain variables for all $s>0$.

Definition 1.14 ([11]) An uncertain process is said to be a stationary independent increment process if it has not only stationary increments but also independent increments.

1.5.1 Liu Process

In 2009, Liu [5] investigated a type of stationary independent increment process whose increments are normal uncertain variables. Later, this process was named Liu process by the academic community due to its importance and usefulness.
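Processes defined through a distribution at each time, such as Example 1.10, have a closed-form inverse uncertainty distribution at each fixed $t$: for $X_t\sim\mathcal{N}(et,\sigma t)$ it is $\Psi_t^{-1}(\alpha)=et+\frac{\sqrt{3}\,\sigma t}{\pi}\ln\frac{\alpha}{1-\alpha}$. A small Python sketch (the helper name is illustrative, not from the text):

```python
import math

def inv_dist(t, e, sigma, alpha):
    # Inverse uncertainty distribution of X_t ~ N(e*t, sigma*t):
    # Psi_t^{-1}(alpha) = e*t + (sqrt(3)*sigma*t/pi) * ln(alpha/(1-alpha))
    return e * t + math.sqrt(3) * sigma * t / math.pi * math.log(alpha / (1 - alpha))

# At alpha = 0.5 the value is the expected value e*t, and at each fixed
# t > 0 the curve is strictly increasing in alpha, as a distribution requires.
```

This is the same inverse distribution used later by the $\alpha$-path machinery of Sect. 1.6.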

Definition 1.15 ([5]) An uncertain process $C_t$ is said to be a canonical Liu process if
(i) $C_0=0$ and almost all sample paths are Lipschitz continuous,
(ii) $C_t$ has stationary and independent increments,
(iii) every increment $C_{s+t}-C_s$ is a normal uncertain variable with expected value 0 and variance $t^2$.

1.5.2 Liu Integral

Definition 1.16 ([5]) Let $X_t$ be an uncertain process and let $C_t$ be a canonical Liu process. For any partition of a closed interval $[a,b]$ with $a=t_1<t_2<\cdots<t_{k+1}=b$, the mesh is written as
$$\Delta=\max_{1\le i\le k}|t_{i+1}-t_i|. \quad (1.44)$$
Then the Liu integral of $X_t$ with respect to $C_t$ is defined as
$$\int_a^b X_t\,\mathrm{d}C_t=\lim_{\Delta\to 0}\sum_{i=1}^{k}X_{t_i}(C_{t_{i+1}}-C_{t_i}) \quad (1.45)$$
provided that the limit exists almost surely and is finite. In this case, the uncertain process $X_t$ is said to be integrable.

Since $X_t$ and $C_t$ are uncertain variables at each time $t$, the limit in (1.45) is also an uncertain variable provided that the limit exists almost surely and is finite. Hence, an uncertain process $X_t$ is integrable with respect to $C_t$ if and only if the limit in (1.45) is an uncertain variable.

Theorem 1.14 ([5]) Let $h(t,c)$ be a continuously differentiable function. Then $Z_t=h(t,C_t)$ is a Liu process and has an uncertain differential
$$\mathrm{d}Z_t=\frac{\partial h}{\partial t}(t,C_t)\,\mathrm{d}t+\frac{\partial h}{\partial c}(t,C_t)\,\mathrm{d}C_t. \quad (1.46)$$

1.6 Uncertain Differential Equation

Definition 1.17 ([11]) Suppose $C_t$ is a canonical Liu process, and $f$ and $g$ are two functions. Then
$$\mathrm{d}X_t=f(t,X_t)\,\mathrm{d}t+g(t,X_t)\,\mathrm{d}C_t \quad (1.47)$$
is called an uncertain differential equation. A solution is a Liu process $X_t$ that satisfies (1.47) identically in $t$.
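The chain rule of Theorem 1.14 can be applied mechanically: for a concrete $h$, the two coefficients in (1.46) are just partial derivatives. A sympy sketch with the sample choice $h(t,c)=tc^2$ (an assumption for illustration; any continuously differentiable $h$ works):

```python
import sympy as sp

t, c = sp.symbols('t c')
h = t * c**2                 # sample h(t, c)
dt_coeff = sp.diff(h, t)     # coefficient of dt in (1.46): here c**2
dc_coeff = sp.diff(h, c)     # coefficient of dC_t in (1.46): here 2*t*c
# So Z_t = t*C_t**2 has the uncertain differential
#   dZ_t = C_t**2 dt + 2*t*C_t dC_t.
```

Note that, unlike the Ito formula of stochastic calculus, no second-order term in $\partial^2 h/\partial c^2$ appears.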

Remark 1.2 The uncertain differential equation (1.47) is equivalent to the uncertain integral equation
$$X_s=X_0+\int_0^s f(t,X_t)\,\mathrm{d}t+\int_0^s g(t,X_t)\,\mathrm{d}C_t. \quad (1.48)$$

Theorem 1.15 Let $u_t$ and $v_t$ be two integrable uncertain processes. Then the uncertain differential equation
$$\mathrm{d}X_t=u_t\,\mathrm{d}t+v_t\,\mathrm{d}C_t \quad (1.49)$$
has a solution
$$X_t=X_0+\int_0^t u_s\,\mathrm{d}s+\int_0^t v_s\,\mathrm{d}C_s. \quad (1.50)$$

Theorem 1.16 ([12], Existence and Uniqueness Theorem) The uncertain differential equation
$$\mathrm{d}X_t=f(t,X_t)\,\mathrm{d}t+g(t,X_t)\,\mathrm{d}C_t \quad (1.51)$$
has a unique solution if the coefficients $f(t,x)$ and $g(t,x)$ satisfy the linear growth condition
$$|f(t,x)|+|g(t,x)|\le L(1+|x|),\quad \forall x\in\mathbb{R},\ t\ge 0, \quad (1.52)$$
and the Lipschitz condition
$$|f(t,x)-f(t,y)|+|g(t,x)-g(t,y)|\le L|x-y|,\quad \forall x,y\in\mathbb{R},\ t\ge 0, \quad (1.53)$$
for some constant $L$. Moreover, the solution is sample-continuous.

Definition 1.18 ([13]) Let $\alpha$ be a number with $0<\alpha<1$. An uncertain differential equation $\mathrm{d}X_t=f(t,X_t)\,\mathrm{d}t+g(t,X_t)\,\mathrm{d}C_t$ is said to have an $\alpha$-path $X_t^\alpha$ if it solves the corresponding ordinary differential equation
$$\mathrm{d}X_t^\alpha=f(t,X_t^\alpha)\,\mathrm{d}t+|g(t,X_t^\alpha)|\,\Phi^{-1}(\alpha)\,\mathrm{d}t,$$
where $\Phi^{-1}(\alpha)$ is the inverse standard normal uncertainty distribution, i.e.,
$$\Phi^{-1}(\alpha)=\frac{\sqrt{3}}{\pi}\ln\frac{\alpha}{1-\alpha}.$$
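Definition 1.18 turns an uncertain differential equation into a family of ordinary differential equations, which any ODE scheme can integrate. A minimal Euler sketch in Python, tried on $\mathrm{d}X_t=\mu X_t\,\mathrm{d}t+\sigma X_t\,\mathrm{d}C_t$ (the coefficients $\mu$, $\sigma$ are hypothetical), whose $\alpha$-path for $X_t>0$ and $\sigma>0$ is $x_0\exp((\mu+\sigma\Phi^{-1}(\alpha))t)$:

```python
import math

def alpha_path(f, g, x0, alpha, T, n=10000):
    # Euler scheme for the alpha-path ODE of Definition 1.18:
    #   dX = f(t, X) dt + |g(t, X)| * Phi^{-1}(alpha) dt
    inv = math.sqrt(3) / math.pi * math.log(alpha / (1 - alpha))
    dt = T / n
    x, t = x0, 0.0
    for _ in range(n):
        x += (f(t, x) + abs(g(t, x)) * inv) * dt
        t += dt
    return x

# Hypothetical coefficients for the geometric-type equation:
mu, sigma, x0, alpha, T = 0.1, 0.2, 1.0, 0.8, 1.0
num = alpha_path(lambda t, x: mu * x, lambda t, x: sigma * x, x0, alpha, T)
inv = math.sqrt(3) / math.pi * math.log(alpha / (1 - alpha))
exact = x0 * math.exp((mu + sigma * inv) * T)   # closed-form alpha-path
```

The Euler result agrees with the closed form to the scheme's $O(\Delta t)$ accuracy, and by Theorem 1.17 below such $\alpha$-paths give the inverse uncertainty distribution of the solution.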

Theorem 1.17 ([13]) Let $X_t$ and $X_t^\alpha$ be the solution and $\alpha$-path of the uncertain differential equation $\mathrm{d}X_t=f(t,X_t)\,\mathrm{d}t+g(t,X_t)\,\mathrm{d}C_t$, respectively. Then the solution $X_t$ has the inverse uncertainty distribution
$$\Psi_t^{-1}(\alpha)=X_t^\alpha.$$

References

1. Liu B (2007) Uncertainty theory, 2nd edn. Springer, Berlin
2. Kahneman D, Tversky A (1979) Prospect theory: an analysis of decision under risk. Econometrica 47(4)
3. Liu B (2012) Why is there a need for uncertainty theory? J Uncertain Syst 6(1):3–10
4. Liu B (2010) Uncertainty theory: a branch of mathematics for modeling human uncertainty. Springer, Berlin
5. Liu B (2009) Some research problems in uncertainty theory. J Uncertain Syst 3(1):3–10
6. Peng Z, Iwamura K (2010) A sufficient and necessary condition of uncertainty distribution. J Interdiscip Math 13(3):277–285
7. Liu Y, Ha M (2010) Expected value of function of uncertain variables. J Uncertain Syst 4(3):181–186
8. Zhu Y (2010) Uncertain optimal control with application to a portfolio selection model. Cybern Syst 41(7):535–547
9. Zhu Y (2012) Functions of uncertain variables and uncertain programming. J Uncertain Syst 6(4)
10. Sheng L, Zhu Y (2013) Optimistic value model of uncertain optimal control. Int J Uncertain Fuzziness Knowl-Based Syst 21(1)
11. Liu B (2008) Fuzzy process, hybrid process and uncertain process. J Uncertain Syst 2(1):3–16
12. Chen X, Liu B (2010) Existence and uniqueness theorem for uncertain differential equations. Fuzzy Optim Decis Making 9(1):69–81
13. Yao K, Chen X (2013) A numerical method for solving uncertain differential equations. J Intell Fuzzy Syst 25(3):825–832

Chapter 2
Uncertain Expected Value Optimal Control

The uncertain optimal control problem is to choose the best decision such that an objective function related to an uncertain process driven by an uncertain differential equation is optimized. Because the objective function is an uncertain variable for any decision, we cannot optimize it as a real function. A basic question is how to rank two different uncertain variables. In fact, there are many methods to do so, but there is no single best one. These methods are established from criteria including, for example, expected value, optimistic value, pessimistic value, and uncertain measure [1]. In this chapter, we make use of the expected-value-based method to optimize the uncertain objective function. That is, we assume that one uncertain variable is larger than another if its expected value is larger.

2.1 Problem of Uncertain Optimal Control

Unless stated otherwise, we assume that $C_t$ is a canonical Liu process. We consider the following uncertain expected value optimal control problem:
$$J(0,x_0)\triangleq\sup_{u_t\in U}E\left[\int_0^T f(s,u_s,X_s)\,\mathrm{d}s+G(T,X_T)\right] \quad (2.1)$$
subject to
$$\mathrm{d}X_s=\nu(s,u_s,X_s)\,\mathrm{d}s+\sigma(s,u_s,X_s)\,\mathrm{d}C_s \quad\text{and}\quad X_0=x_0. \quad (2.2)$$
In the above problem, $X_s$ is the state variable, $u_s$ the decision variable (representing the function $u_s(s,X_s)$ of the time $s$ and state $X_s$) with values in $U$, $f$ the objective function, and $G$ the function of terminal reward. For a given $u_s$, $X_s$ is provided by

the uncertain differential equation (2.2), where $\nu$ and $\sigma$ are two functions of time $s$, $u_s$, and $X_s$. The function $J(0,x_0)$ is the expected optimal reward obtainable in $[0,T]$ with the initial condition that at time 0 we are in state $x_0$. For any $0<t<T$, $J(t,x)$ is the expected optimal reward obtainable in $[t,T]$ with the condition that at time $t$ we are in state $X_t=x$. That is, we have
$$J(t,x)\triangleq\sup_{u_t}E\left[\int_t^T f(s,u_s,X_s)\,\mathrm{d}s+G(T,X_T)\right]$$
subject to
$$\mathrm{d}X_s=\nu(s,u_s,X_s)\,\mathrm{d}s+\sigma(s,u_s,X_s)\,\mathrm{d}C_s \quad\text{and}\quad X_t=x. \quad (2.3)$$

2.2 Principle of Optimality

Now we present the following principle of optimality for uncertain optimal control.

Theorem 2.1 ([2]) For any $(t,x)\in[0,T)\times\mathbb{R}$, and $\Delta t>0$ with $t+\Delta t<T$, we have
$$J(t,x)=\sup_{u_t}E\left[f(t,u_t,X_t)\Delta t+J(t+\Delta t,x+\Delta X_t)+o(\Delta t)\right], \quad (2.4)$$
where $x+\Delta X_t=X_{t+\Delta t}$.

Proof We denote the right side of (2.4) by $\bar{J}(t,x)$. It follows from the definition of $J(t,x)$ that
$$J(t,x)\ge E\left[\int_t^{t+\Delta t}f(s,u_s|_{[t,t+\Delta t)},X_s)\,\mathrm{d}s+\int_{t+\Delta t}^T f(s,u_s|_{[t+\Delta t,T]},X_s)\,\mathrm{d}s+G(T,X_T)\right] \quad (2.5)$$
for any $u_t$, where $u_s|_{[t,t+\Delta t)}$ and $u_s|_{[t+\Delta t,T]}$ are the values of the decision variable $u_t$ restricted on $[t,t+\Delta t)$ and $[t+\Delta t,T]$, respectively. Thus,
$$J(t,x)\ge E\left[f(t,u_t,X_t)\Delta t+o(\Delta t)+E\left[\int_{t+\Delta t}^T f(s,u_s|_{[t+\Delta t,T]},X_s)\,\mathrm{d}s+G(T,X_T)\right]\right]. \quad (2.6)$$
Taking the supremum with respect to $u_s|_{[t+\Delta t,T]}$ first, and then $u_s|_{[t,t+\Delta t)}$, in (2.6), we get $J(t,x)\ge\bar{J}(t,x)$.

On the other hand, for all $u_t$, we have
$$E\left[\int_t^T f(s,u_s,X_s)\,\mathrm{d}s+G(T,X_T)\right]=E\left[\int_t^{t+\Delta t}f(s,u_s,X_s)\,\mathrm{d}s+E\left[\int_{t+\Delta t}^T f(s,u_s|_{[t+\Delta t,T]},X_s)\,\mathrm{d}s+G(T,X_T)\right]\right]$$
$$\le E\left[f(t,u_t,X_t)\Delta t+o(\Delta t)+J(t+\Delta t,x+\Delta X_t)\right]\le\bar{J}(t,x).$$
Hence, $J(t,x)\le\bar{J}(t,x)$, and then $J(t,x)=\bar{J}(t,x)$. The theorem is proved.

Remark 2.1 It is easy to see that the principle of optimality also holds for $x\in\mathbb{R}^n$ in the multidimensional case.

2.3 Equation of Optimality

Consider the uncertain optimal control problem (2.3). Now let us give a fundamental result called the equation of optimality in uncertain optimal control.

Theorem 2.2 (Equation of optimality, [2]) Let $J(t,x)$ be twice differentiable on $[0,T]\times\mathbb{R}$. Then we have
$$-J_t(t,x)=\sup_{u_t}\left\{f(t,u_t,x)+J_x(t,x)\,\nu(t,u_t,x)\right\}, \quad (2.7)$$
where $J_t(t,x)$ and $J_x(t,x)$ are the partial derivatives of the function $J(t,x)$ in $t$ and $x$, respectively.

Proof For any $\Delta t>0$, by using a Taylor series expansion, we get
$$J(t+\Delta t,x+\Delta X_t)=J(t,x)+J_t(t,x)\Delta t+J_x(t,x)\Delta X_t+\frac{1}{2}J_{tt}(t,x)\Delta t^2+\frac{1}{2}J_{xx}(t,x)\Delta X_t^2+J_{tx}(t,x)\Delta t\,\Delta X_t+o(\Delta t). \quad (2.8)$$
Substituting Eq. (2.8) into Eq. (2.4) yields
$$0=\sup_{u_t}\Big\{f(t,u_t,x)\Delta t+J_t(t,x)\Delta t+E\Big[J_x(t,x)\Delta X_t+\frac{1}{2}J_{tt}(t,x)\Delta t^2+\frac{1}{2}J_{xx}(t,x)\Delta X_t^2+J_{tx}(t,x)\Delta t\,\Delta X_t\Big]+o(\Delta t)\Big\}. \quad (2.9)$$
Let $\xi$ be an uncertain variable such that $\Delta X_t=\xi+\nu(t,u_t,x)\Delta t$. It follows from Eq. (2.9) that

$$0=\sup_{u_t}\Big\{f(t,u_t,x)\Delta t+J_t(t,x)\Delta t+J_x(t,x)\nu(t,u_t,x)\Delta t+E\Big[\big(J_x(t,x)+J_{xx}(t,x)\nu(t,u_t,x)\Delta t+J_{tx}(t,x)\Delta t\big)\xi+\frac{1}{2}J_{xx}(t,x)\xi^2\Big]+o(\Delta t)\Big\}$$
$$=\sup_{u_t}\Big\{f(t,u_t,x)\Delta t+J_t(t,x)\Delta t+J_x(t,x)\nu(t,u_t,x)\Delta t+E[a\xi+b\xi^2]+o(\Delta t)\Big\}, \quad (2.10)$$
where $a\triangleq J_x(t,x)+J_{xx}(t,x)\nu(t,u_t,x)\Delta t+J_{tx}(t,x)\Delta t$ and $b\triangleq\frac{1}{2}J_{xx}(t,x)$. It follows from the uncertain differential equation, the constraint in (2.3), that $\xi=\Delta X_t-\nu(t,u_t,x)\Delta t$ is a normally distributed uncertain variable with expected value 0 and variance $\sigma^2(t,u_t,x)\Delta t^2$. If $b=0$, then $E[a\xi+b\xi^2]=aE[\xi]=0$. Otherwise, Theorem 1.8 implies that
$$E[a\xi+b\xi^2]=bE\Big[\frac{a}{b}\xi+\xi^2\Big]=o(\Delta t). \quad (2.11)$$
Substituting Eq. (2.11) into Eq. (2.10) yields
$$-J_t(t,x)\Delta t=\sup_{u_t}\left\{f(t,u_t,x)\Delta t+J_x(t,x)\nu(t,u_t,x)\Delta t+o(\Delta t)\right\}. \quad (2.12)$$
Dividing Eq. (2.12) by $\Delta t$ and letting $\Delta t\to 0$, we obtain Eq. (2.7).

Remark 2.2 If the equation of optimality has solutions, then the optimal decision and the optimal expected value of the objective function are determined. If the function $f$ is convex in its arguments, then the equation will produce a minimum, and if $f$ is concave in its arguments, then it will produce a maximum. We note that the boundary condition for the equation is $J(T,X_T)=E[G(T,X_T)]$.

Remark 2.3 We note that in the equation of optimality (Hamilton–Jacobi–Bellman equation) for stochastic optimal control, there is an extra term $\frac{1}{2}J_{xx}(t,x)\sigma^2(t,u_t,x)$.

2.4 Equation of Optimality for the Multidimensional Case

We now consider the optimal control model for the multidimensional case:
$$J(t,x)\triangleq\sup_{u_t\in U}E\left[\int_t^T f(s,u_s,X_s)\,\mathrm{d}s+G(T,X_T)\right] \quad (2.13)$$
subject to
$$\mathrm{d}X_s=\nu(s,u_s,X_s)\,\mathrm{d}s+\sigma(s,u_s,X_s)\,\mathrm{d}C_s \quad\text{and}\quad X_t=x. \quad (2.14)$$

In the above model, $X_s$ is the state vector of dimension $n$ with the initial condition that at time $t$ we are in state $X_t=x$; $u_s$ is the decision vector of dimension $r$ (representing the function $u_s$ of time $s$ and state $X_s$) in a domain $U$; $f:[0,+\infty)\times\mathbb{R}^r\times\mathbb{R}^n\to\mathbb{R}$ is the objective function; and $G:[0,+\infty)\times\mathbb{R}^n\to\mathbb{R}$ is the function of terminal reward. In addition, $\nu:[0,+\infty)\times\mathbb{R}^r\times\mathbb{R}^n\to\mathbb{R}^n$ is a column-vector function, $\sigma:[0,+\infty)\times\mathbb{R}^r\times\mathbb{R}^n\to\mathbb{R}^{n\times k}$ is a matrix function, and $C_s=(C_{s1},C_{s2},\ldots,C_{sk})^\tau$, where $C_{s1},C_{s2},\ldots,C_{sk}$ are independent canonical Liu processes. Note that $y^\tau$ represents the transpose of the vector $y$, and the final time $T>0$ is fixed or free. We have the following equation of optimality.

Theorem 2.3 ([3]) Let $J(t,x)$ be twice differentiable on $[0,T]\times\mathbb{R}^n$. Then we have
$$-J_t(t,x)=\sup_{u_t\in U}\left\{f(t,u_t,x)+\nu(t,u_t,x)^\tau\nabla_x J(t,x)\right\}, \quad (2.15)$$
where $J_t(t,x)$ is the partial derivative of the function $J(t,x)$ in $t$, and $\nabla_x J(t,x)$ is the gradient of $J(t,x)$ in $x$.

Proof For $\Delta t$ with $t+\Delta t\in[0,T]$, denote $X_{t+\Delta t}=x+\Delta X_t$. By using a Taylor series expansion, we have
$$J(t+\Delta t,x+\Delta X_t)=J(t,x)+J_t(t,x)\Delta t+\nabla_x J(t,x)^\tau\Delta X_t+\frac{1}{2}J_{tt}(t,x)\Delta t^2+\frac{1}{2}\Delta X_t^\tau\nabla_{xx}J(t,x)\Delta X_t+\nabla_x J_t(t,x)^\tau\Delta X_t\,\Delta t+o(\Delta t), \quad (2.16)$$
where $\nabla_{xx}J(t,x)$ is the Hessian matrix of $J(t,x)$. Since $\Delta X_t=\nu(t,u_t,X_t)\Delta t+\sigma(t,u_t,X_t)\Delta C_t$, the expansion (2.16) may be rewritten as
$$J(t+\Delta t,x+\Delta X_t)=J(t,x)+J_t(t,x)\Delta t+\nabla_x J(t,x)^\tau\nu(t,u_t,X_t)\Delta t+\nabla_x J(t,x)^\tau\sigma(t,u_t,X_t)\Delta C_t+\frac{1}{2}J_{tt}(t,x)\Delta t^2$$
$$\quad+\frac{1}{2}\nu(t,u_t,X_t)^\tau\nabla_{xx}J(t,x)\nu(t,u_t,X_t)\Delta t^2+\nu(t,u_t,X_t)^\tau\nabla_{xx}J(t,x)\sigma(t,u_t,X_t)\Delta C_t\,\Delta t$$
$$\quad+\frac{1}{2}(\sigma(t,u_t,X_t)\Delta C_t)^\tau\nabla_{xx}J(t,x)\sigma(t,u_t,X_t)\Delta C_t+\nabla_x J_t(t,x)^\tau\nu(t,u_t,X_t)\Delta t^2+\nabla_x J_t(t,x)^\tau\sigma(t,u_t,X_t)\Delta C_t\,\Delta t+o(\Delta t)$$
$$=J(t,x)+J_t(t,x)\Delta t+\nabla_x J(t,x)^\tau\nu(t,u_t,X_t)\Delta t+\big\{\nabla_x J(t,x)^\tau\sigma(t,u_t,X_t)+\nabla_x J_t(t,x)^\tau\sigma(t,u_t,X_t)\Delta t$$
$$\quad+\nu(t,u_t,X_t)^\tau\nabla_{xx}J(t,x)\sigma(t,u_t,X_t)\Delta t\big\}\Delta C_t+\frac{1}{2}\Delta C_t^\tau\sigma(t,u_t,X_t)^\tau\nabla_{xx}J(t,x)\sigma(t,u_t,X_t)\Delta C_t+o(\Delta t). \quad (2.17)$$

Denote
$$a=\nabla_x J(t,x)^\tau\sigma(t,u_t,X_t)+\nabla_x J_t(t,x)^\tau\sigma(t,u_t,X_t)\Delta t+\nu(t,u_t,X_t)^\tau\nabla_{xx}J(t,x)\sigma(t,u_t,X_t)\Delta t,$$
$$B=\frac{1}{2}\sigma(t,u_t,X_t)^\tau\nabla_{xx}J(t,x)\sigma(t,u_t,X_t).$$
Hence, Eq. (2.17) may be simply expressed as
$$J(t+\Delta t,x+\Delta X_t)=J(t,x)+J_t(t,x)\Delta t+\nabla_x J(t,x)^\tau\nu(t,u_t,X_t)\Delta t+a\Delta C_t+\Delta C_t^\tau B\Delta C_t+o(\Delta t).$$
It follows from the principle of optimality that
$$J(t,x)=\sup_{u_t\in U}E\left[f(t,u_t,x)\Delta t+J(t+\Delta t,x+\Delta X_t)+o(\Delta t)\right].$$
Thus,
$$J(t,x)=\sup_{u_t\in U}\left\{f(t,u_t,x)\Delta t+J(t,x)+J_t(t,x)\Delta t+\nabla_x J(t,x)^\tau\nu(t,u_t,X_t)\Delta t+E[a\Delta C_t+\Delta C_t^\tau B\Delta C_t]\right\}+o(\Delta t). \quad (2.18)$$
Let $a=(a_1,a_2,\ldots,a_k)$ and $B=(b_{ij})_{k\times k}$. We have
$$a\Delta C_t+\Delta C_t^\tau B\Delta C_t=\sum_{i=1}^k a_i\Delta C_{ti}+\sum_{i=1}^k\sum_{j=1}^k b_{ij}\Delta C_{ti}\Delta C_{tj}.$$
Since $|b_{ij}\Delta C_{ti}\Delta C_{tj}|\le\frac{1}{2}|b_{ij}|(\Delta C_{ti}^2+\Delta C_{tj}^2)$, we have
$$\sum_{i=1}^k a_i\Delta C_{ti}-\sum_{i=1}^k\sum_{j=1}^k\frac{|b_{ij}|}{2}(\Delta C_{ti}^2+\Delta C_{tj}^2)\ \le\ a\Delta C_t+\Delta C_t^\tau B\Delta C_t\ \le\ \sum_{i=1}^k a_i\Delta C_{ti}+\sum_{i=1}^k\sum_{j=1}^k\frac{|b_{ij}|}{2}(\Delta C_{ti}^2+\Delta C_{tj}^2).$$

It follows from the independence of $C_{t1},C_{t2},\ldots,C_{tk}$ that
$$E\left[\sum_{i=1}^k a_i\Delta C_{ti}-\sum_{i=1}^k\sum_{j=1}^k\frac{|b_{ij}|}{2}(\Delta C_{ti}^2+\Delta C_{tj}^2)\right]\le E[a\Delta C_t+\Delta C_t^\tau B\Delta C_t]\le E\left[\sum_{i=1}^k a_i\Delta C_{ti}+\sum_{i=1}^k\sum_{j=1}^k\frac{|b_{ij}|}{2}(\Delta C_{ti}^2+\Delta C_{tj}^2)\right].$$
It follows from Theorem 1.8 that
$$E\left[\sum_{i=1}^k a_i\Delta C_{ti}-\sum_{i=1}^k\sum_{j=1}^k\frac{|b_{ij}|}{2}(\Delta C_{ti}^2+\Delta C_{tj}^2)\right]=o(\Delta t)$$
and
$$E\left[\sum_{i=1}^k a_i\Delta C_{ti}+\sum_{i=1}^k\sum_{j=1}^k\frac{|b_{ij}|}{2}(\Delta C_{ti}^2+\Delta C_{tj}^2)\right]=o(\Delta t).$$
Hence, $E[a\Delta C_t+\Delta C_t^\tau B\Delta C_t]=o(\Delta t)$. Therefore, Eq. (2.15) directly follows from Eq. (2.18). The theorem is proved.

2.5 Uncertain Linear Quadratic Model

We consider a special kind of optimal control model with a quadratic objective function subject to a linear uncertain differential equation:
$$J(0,x_0)\triangleq\min_{u_t}E\left\{\int_0^T\left[\alpha_1(t)X_t^2+\alpha_2(t)u_t^2+\alpha_3(t)X_tu_t+\alpha_4(t)X_t+\alpha_5(t)u_t+\alpha_6(t)\right]\mathrm{d}t+S_TX_T^2\right\}$$
subject to
$$\mathrm{d}X_t=[\beta_1(t)X_t+\beta_2(t)u_t+\beta_3(t)]\,\mathrm{d}t+[\Delta_1(t)X_t+\Delta_2(t)u_t+\Delta_3(t)]\,\mathrm{d}C_t,\quad X_0=x_0, \quad (2.19)$$
where $x_0$ denotes the initial state, and $\alpha_i(t)$ $(i=1,2,\ldots,6)$, $\beta_j(t)$, and $\Delta_j(t)$ $(j=1,2,3)$ are all functions of time $t$. The aim of discussing this model is to find an optimal control $u_t$, which is a function of time $t$ and state $X_t$. For any $0<t<T$, use $J(t,x)$ to denote the optimal value obtainable in $[t,T]$ with the condition that at time $t$ we are in state $X_t=x$.

Theorem 2.4 Assume that $J(t,x)$ is a twice differentiable function on $[0,T]\times\mathbb{R}$. Let $\alpha_i(t)$ $(i=1,2,\ldots,6)$, $\beta_j(t)$, $\Delta_j(t)$ $(j=1,2,3)$ and $\alpha_2^{-1}(t)$ be continuous

bounded functions of $t$, and $\alpha_1(t)\ge 0$, $\alpha_2(t)>0$. A necessary and sufficient condition that $u_t^*$ is an optimal control for (2.19) is that
$$u_t^*=-\frac{\alpha_3(t)x+\alpha_5(t)+\beta_2(t)[P(t)x+Q(t)]}{2\alpha_2(t)}, \quad (2.20)$$
where $x$ is the state of the state variable $X_t$ at time $t$ obtained by applying the optimal control $u_t^*$; the function $P(t)$ satisfies the following Riccati differential equation and boundary condition:
$$\frac{\mathrm{d}P(t)}{\mathrm{d}t}=\frac{[\beta_2(t)]^2}{2\alpha_2(t)}P^2(t)+\left\{\frac{\alpha_3(t)\beta_2(t)}{\alpha_2(t)}-2\beta_1(t)\right\}P(t)+\frac{\alpha_3^2(t)}{2\alpha_2(t)}-2\alpha_1(t),\qquad P(T)=2S_T; \quad (2.21)$$
and the function $Q(t)$ satisfies the following differential equation and boundary condition:
$$\frac{\mathrm{d}Q(t)}{\mathrm{d}t}=\left\{\frac{\alpha_3(t)\beta_2(t)}{2\alpha_2(t)}+\frac{[\beta_2(t)]^2}{2\alpha_2(t)}P(t)-\beta_1(t)\right\}Q(t)+\left\{\frac{\alpha_5(t)\beta_2(t)}{2\alpha_2(t)}-\beta_3(t)\right\}P(t)+\frac{\alpha_3(t)\alpha_5(t)}{2\alpha_2(t)}-\alpha_4(t),\qquad Q(T)=0. \quad (2.22)$$
The optimal value is
$$J(0,x_0)=\frac{1}{2}P(0)x_0^2+Q(0)x_0+R(0),$$
where
$$R(0)=\int_0^T\left\{-\frac{[\beta_2(s)]^2}{4\alpha_2(s)}Q^2(s)+\left[\beta_3(s)-\frac{\alpha_5(s)\beta_2(s)}{2\alpha_2(s)}\right]Q(s)-\frac{\alpha_5^2(s)}{4\alpha_2(s)}+\alpha_6(s)\right\}\mathrm{d}s. \quad (2.23)$$

Proof The necessity will be proved first. It follows from the equation of optimality (2.7) that
$$-\frac{\partial J}{\partial t}=\min_u\left\{\alpha_1(t)x^2+\alpha_2(t)u^2+\alpha_3(t)xu+\alpha_4(t)x+\alpha_5(t)u+\alpha_6(t)+[\beta_1(t)x+\beta_2(t)u+\beta_3(t)]J_x\right\}\triangleq\min_u L(u), \quad (2.24)$$

where $L(u)$ represents the term in the braces. The optimal $u$ satisfies
$$\frac{\partial L(u)}{\partial u}=2\alpha_2(t)u+\alpha_3(t)x+\alpha_5(t)+\beta_2(t)J_x=0.$$
Since
$$\frac{\partial^2 L(u)}{\partial u^2}=2\alpha_2(t)>0,$$
$$u_t^*=-\frac{\alpha_3(t)x+\alpha_5(t)+\beta_2(t)J_x}{2\alpha_2(t)} \quad (2.25)$$
is the minimum point of $L(u)$. By Eq. (2.24), we have
$$\frac{\partial J}{\partial t}+\alpha_1(t)x^2+\alpha_2(t)u_t^{*2}+\alpha_3(t)xu_t^*+\alpha_4(t)x+\alpha_5(t)u_t^*+\alpha_6(t)+[\beta_1(t)x+\beta_2(t)u_t^*+\beta_3(t)]J_x=0. \quad (2.26)$$
Taking the derivative of both sides of (2.26) with respect to $x$ yields
$$\frac{\partial^2 J}{\partial x\,\partial t}+2\alpha_1(t)x+2\alpha_2(t)u_t^*\frac{\partial u_t^*}{\partial x}+\alpha_3(t)u_t^*+\alpha_3(t)x\frac{\partial u_t^*}{\partial x}+\alpha_4(t)+\alpha_5(t)\frac{\partial u_t^*}{\partial x}+\left[\beta_1(t)+\beta_2(t)\frac{\partial u_t^*}{\partial x}\right]J_x+[\beta_1(t)x+\beta_2(t)u_t^*+\beta_3(t)]\frac{\partial^2 J}{\partial x^2}=0,$$
or
$$\frac{\partial^2 J}{\partial x\,\partial t}+2\alpha_1(t)x+\alpha_3(t)u_t^*+\alpha_4(t)+\beta_1(t)J_x+[\beta_1(t)x+\beta_2(t)u_t^*+\beta_3(t)]\frac{\partial^2 J}{\partial x^2}+\left\{2\alpha_2(t)u_t^*+\alpha_3(t)x+\alpha_5(t)+\beta_2(t)J_x\right\}\frac{\partial u_t^*}{\partial x}=0.$$
By (2.25), we get
$$\frac{\partial^2 J}{\partial x\,\partial t}+2\alpha_1(t)x+\alpha_3(t)u_t^*+\alpha_4(t)+\beta_1(t)J_x+[\beta_1(t)x+\beta_2(t)u_t^*+\beta_3(t)]\frac{\partial^2 J}{\partial x^2}=0.$$

Hence,
$$\frac{\partial^2 J}{\partial x\,\partial t}=-2\alpha_1(t)x-\alpha_3(t)u_t^*-\alpha_4(t)-\beta_1(t)J_x-[\beta_1(t)x+\beta_2(t)u_t^*+\beta_3(t)]\frac{\partial^2 J}{\partial x^2}. \quad (2.27)$$
Let
$$\lambda(t)=\frac{\partial J}{\partial x}. \quad (2.28)$$
Since $J(T,x)=S_Tx^2$, we conjecture that
$$\frac{\partial J}{\partial x}=\lambda(t)=P(t)x(t)+Q(t). \quad (2.29)$$
Taking the derivative of both sides of (2.29) with respect to $x$, we have
$$\frac{\partial^2 J}{\partial x^2}=P(t). \quad (2.30)$$
Substituting (2.29) into (2.25) yields
$$u_t^*=-\frac{\alpha_3(t)x+\alpha_5(t)+\beta_2(t)[P(t)x+Q(t)]}{2\alpha_2(t)}. \quad (2.31)$$
Taking the derivative of both sides of (2.28) with respect to $t$ yields
$$\frac{\mathrm{d}\lambda(t)}{\mathrm{d}t}=\frac{\partial^2 J}{\partial x\,\partial t}+\frac{\partial^2 J}{\partial x^2}\frac{\mathrm{d}x}{\mathrm{d}t}. \quad (2.32)$$
Substituting (2.27), (2.29), and (2.30) into (2.32) yields
$$\frac{\mathrm{d}\lambda(t)}{\mathrm{d}t}=-2\alpha_1(t)x-\alpha_3(t)u_t^*-\alpha_4(t)-\beta_1(t)[P(t)x+Q(t)]-[\beta_1(t)x+\beta_2(t)u_t^*+\beta_3(t)]P(t)+P(t)\frac{\mathrm{d}x}{\mathrm{d}t}. \quad (2.33)$$
Substituting (2.31) into (2.33), we have
$$\frac{\mathrm{d}\lambda(t)}{\mathrm{d}t}=\left\{\frac{[\beta_2(t)]^2}{2\alpha_2(t)}P^2(t)+\left[\frac{\alpha_3(t)\beta_2(t)}{\alpha_2(t)}-2\beta_1(t)\right]P(t)+\frac{\alpha_3^2(t)}{2\alpha_2(t)}-2\alpha_1(t)\right\}x$$
$$\quad+\left\{\frac{\alpha_3(t)\beta_2(t)}{2\alpha_2(t)}+\frac{[\beta_2(t)]^2}{2\alpha_2(t)}P(t)-\beta_1(t)\right\}Q(t)+\left\{\frac{\alpha_5(t)\beta_2(t)}{2\alpha_2(t)}-\beta_3(t)\right\}P(t)+\frac{\alpha_3(t)\alpha_5(t)}{2\alpha_2(t)}-\alpha_4(t)+P(t)\frac{\mathrm{d}x}{\mathrm{d}t}. \quad (2.34)$$

Taking the derivative of both sides of (2.29) with respect to $t$ yields
$$\frac{\mathrm{d}\lambda(t)}{\mathrm{d}t}=\frac{\mathrm{d}P(t)}{\mathrm{d}t}x+P(t)\frac{\mathrm{d}x}{\mathrm{d}t}+\frac{\mathrm{d}Q(t)}{\mathrm{d}t}. \quad (2.35)$$
By (2.34) and (2.35), we get
$$\frac{\mathrm{d}P(t)}{\mathrm{d}t}=\frac{[\beta_2(t)]^2}{2\alpha_2(t)}P^2(t)+\left\{\frac{\alpha_3(t)\beta_2(t)}{\alpha_2(t)}-2\beta_1(t)\right\}P(t)+\frac{\alpha_3^2(t)}{2\alpha_2(t)}-2\alpha_1(t)$$
and
$$\frac{\mathrm{d}Q(t)}{\mathrm{d}t}=\left\{\frac{\alpha_3(t)\beta_2(t)}{2\alpha_2(t)}+\frac{[\beta_2(t)]^2}{2\alpha_2(t)}P(t)-\beta_1(t)\right\}Q(t)+\left\{\frac{\alpha_5(t)\beta_2(t)}{2\alpha_2(t)}-\beta_3(t)\right\}P(t)+\frac{\alpha_3(t)\alpha_5(t)}{2\alpha_2(t)}-\alpha_4(t).$$
It follows from (2.28) and (2.29) that $\lambda(T)=2S_Tx(T)$ and $\lambda(T)=P(T)x(T)+Q(T)$. So we have $P(T)=2S_T$ and $Q(T)=0$. Hence, $P(t)$ satisfies the Riccati differential equation and boundary condition (2.21), and the function $Q(t)$ satisfies the differential equation and boundary condition (2.22). By solving the above equations, the expressions of $P(t)$ and $Q(t)$ can be obtained, respectively. In other words, the optimal control $u_t^*$ for the linear quadratic model (2.19) is provided by (2.20).

Next we verify the sufficiency part of the theorem. Suppose that $u_t^*$, $P(t)$, $Q(t)$ satisfy (2.20), (2.21), (2.22), respectively. Now we prove that $u_t^*$ is an optimal control for the linear quadratic model (2.19). By the equation of optimality (2.7), we have
$$-\frac{\partial J}{\partial t}=\min_u\left\{\alpha_1(t)x^2+\alpha_2(t)u^2+\alpha_3(t)xu+\alpha_4(t)x+\alpha_5(t)u+\alpha_6(t)+[\beta_1(t)x+\beta_2(t)u+\beta_3(t)]J_x\right\}.$$
So
$$\frac{\partial J}{\partial t}+\min_u\left\{\alpha_1(t)x^2+\alpha_2(t)u^2+\alpha_3(t)xu+\alpha_4(t)x+\alpha_5(t)u+\alpha_6(t)+[\beta_1(t)x+\beta_2(t)u+\beta_3(t)]J_x\right\}=0. \quad (2.36)$$

We conjecture that
$$J(t,x)=\frac{1}{2}P(t)x^2+Q(t)x+R(t),$$
where $R(t)$ is provided by
$$R(t)=\int_t^T\left\{-\frac{[\beta_2(s)]^2}{4\alpha_2(s)}Q^2(s)+\left[\beta_3(s)-\frac{\alpha_5(s)\beta_2(s)}{2\alpha_2(s)}\right]Q(s)-\frac{\alpha_5^2(s)}{4\alpha_2(s)}+\alpha_6(s)\right\}\mathrm{d}s. \quad (2.37)$$
Then
$$\frac{\partial J}{\partial t}+\alpha_1(t)x^2+\alpha_2(t)u_t^{*2}+\alpha_3(t)xu_t^*+\alpha_4(t)x+\alpha_5(t)u_t^*+\alpha_6(t)+[\beta_1(t)x+\beta_2(t)u_t^*+\beta_3(t)]J_x$$
$$=\frac{1}{2}\frac{\mathrm{d}P(t)}{\mathrm{d}t}x^2+\frac{\mathrm{d}Q(t)}{\mathrm{d}t}x+\frac{\mathrm{d}R(t)}{\mathrm{d}t}+\alpha_1(t)x^2+\alpha_4(t)x+\alpha_6(t)+\alpha_2(t)u_t^{*2}+[\alpha_3(t)x+\alpha_5(t)]u_t^*+[\beta_1(t)x+\beta_3(t)][P(t)x+Q(t)]+\beta_2(t)u_t^*[P(t)x+Q(t)]$$
$$=\frac{1}{2}\frac{\mathrm{d}P(t)}{\mathrm{d}t}x^2+\frac{\mathrm{d}Q(t)}{\mathrm{d}t}x+\frac{\mathrm{d}R(t)}{\mathrm{d}t}+\alpha_1(t)x^2+\alpha_4(t)x+\alpha_6(t)+[\beta_1(t)x+\beta_3(t)][P(t)x+Q(t)]-\frac{\left[\alpha_3(t)x+\alpha_5(t)+\beta_2(t)(P(t)x+Q(t))\right]^2}{4\alpha_2(t)}$$
$$=\frac{1}{2}\left\{\frac{\mathrm{d}P(t)}{\mathrm{d}t}+2\alpha_1(t)-\frac{\alpha_3^2(t)}{2\alpha_2(t)}-\frac{[\beta_2(t)]^2}{2\alpha_2(t)}P^2(t)-\left[\frac{\alpha_3(t)\beta_2(t)}{\alpha_2(t)}-2\beta_1(t)\right]P(t)\right\}x^2$$
$$\quad+\left\{\frac{\mathrm{d}Q(t)}{\mathrm{d}t}-\left[\frac{\alpha_3(t)\beta_2(t)}{2\alpha_2(t)}+\frac{[\beta_2(t)]^2}{2\alpha_2(t)}P(t)-\beta_1(t)\right]Q(t)-\left[\frac{\alpha_5(t)\beta_2(t)}{2\alpha_2(t)}-\beta_3(t)\right]P(t)-\frac{\alpha_3(t)\alpha_5(t)}{2\alpha_2(t)}+\alpha_4(t)\right\}x$$
$$\quad+\frac{\mathrm{d}R(t)}{\mathrm{d}t}-\frac{[\beta_2(t)]^2}{4\alpha_2(t)}Q^2(t)-\left[\frac{\alpha_5(t)\beta_2(t)}{2\alpha_2(t)}-\beta_3(t)\right]Q(t)-\frac{\alpha_5^2(t)}{4\alpha_2(t)}+\alpha_6(t)$$
$$=0.$$

Therefore, $u_t^*$ is a solution of Eq. (2.36). Because the objective function is convex, Eq. (2.36) produces a minimum; that is, $u_t^*$ is an optimal control. At the same time, we also get the optimal value
$$J(0,x_0)=\frac{1}{2}P(0)x_0^2+Q(0)x_0+R(0).$$
The theorem is proved.

2.6 Optimal Control Problem of the Singular Uncertain System

We consider the following continuous-time singular uncertain system:
$$F\,\mathrm{d}X_t=g(t)AX_t\,\mathrm{d}t+h(t)BX_t\,\mathrm{d}C_t,\quad t\ge 0,\qquad X_0=x_0, \quad (2.38)$$
where $X_t\in\mathbb{R}^n$ is the state vector of the system, $g(t),h(t):[0,+\infty)\to(0,+\infty)$ are both bounded functions, and $A\in\mathbb{R}^{n\times n}$, $B\in\mathbb{R}^{n\times n}$ are known coefficient matrices associated with $X_t$. The $F$ is a known (singular) matrix with $\mathrm{rank}(F)=q\le n$, and $\deg(\det(zF-A))=r$, where $z$ is a complex variable. Notice that $\det(zF-A)$ is the determinant of the matrix $zF-A$ and $\deg(\det(zF-A))$ is the degree of the polynomial $\det(zF-A)$. The $C_t$ is a canonical Liu process representing the noise of the system. For a matrix $A=[a_{ij}]_{n\times n}$ and a vector $X=(x_1,x_2,\ldots,x_n)^T$, we define
$$\|A\|=\sum_{i,j=1}^n|a_{ij}|,\qquad \|X\|=\sum_{i=1}^n|x_i|.$$
For the system (2.38), the matrices $F$ and $A$ play the main roles. Notice that $(F,A)$ is said to be regular if $\det(zF-A)$ is not identically zero, and $(F,A)$ is said to be impulse-free if $\deg(\det(zF-A))=\mathrm{rank}(F)$.

Lemma 2.1 ([4]) If $(F,A)$ is regular and impulse-free and $\mathrm{rank}[F,B]=\mathrm{rank}(F)=r$, there exist a pair of nonsingular matrices $P\in\mathbb{R}^{n\times n}$ and $Q\in\mathbb{R}^{n\times n}$ for the triplet $(F,A,B)$ such that the following conditions are satisfied:
$$PFQ=\begin{bmatrix}I_r&0\\0&0\end{bmatrix},\qquad PAQ=\begin{bmatrix}A_1&0\\0&I_{n-r}\end{bmatrix},\qquad PBQ=\begin{bmatrix}B_1&B_2\\0&0\end{bmatrix},$$
where $A_1\in\mathbb{R}^{r\times r}$, $B_1\in\mathbb{R}^{r\times r}$, $B_2\in\mathbb{R}^{r\times(n-r)}$.
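The regularity and impulse-freeness conditions just defined are straightforward to test symbolically for a concrete pair. A sympy sketch on a small hypothetical singular pair $(F,A)$ (not the matrices of this section):

```python
import sympy as sp

z = sp.symbols('z')
# Hypothetical 2x2 singular pair (F, A) for illustration only:
F = sp.Matrix([[1, 0], [0, 0]])   # rank(F) = 1, so F is singular
A = sp.Matrix([[2, 0], [0, 1]])
charpoly = sp.det(z * F - A)      # here: (z - 2)*(-1) = 2 - z

# (F, A) regular: det(zF - A) is not identically zero
regular = sp.simplify(charpoly) != 0
# (F, A) impulse-free: deg(det(zF - A)) = rank(F)
impulse_free = sp.Poly(charpoly, z).degree() == F.rank()
```

For this pair $\det(zF-A)=2-z$ has degree $1=\mathrm{rank}(F)$, so the pair is regular and impulse-free; the same two-line test applies to any candidate system of the form (2.38).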

Lemma 2.2 ([5]) System (2.38) has a unique solution if $(F,A)$ is regular and impulse-free and $\mathrm{rank}[F,B]=\mathrm{rank}(F)$. Moreover, the solution is sample-continuous.

Proof Let $\begin{bmatrix}X_{1,t}\\X_{2,t}\end{bmatrix}=Q^{-1}X_t$, where $X_{1,t}\in\mathbb{R}^r$ and $X_{2,t}\in\mathbb{R}^{n-r}$. Then system (2.38) is equivalent to
$$\begin{cases}\mathrm{d}X_{1,t}=g(t)A_1X_{1,t}\,\mathrm{d}t+h(t)[B_1X_{1,t}+B_2X_{2,t}]\,\mathrm{d}C_t,\\ 0=g(t)X_{2,t}\,\mathrm{d}t,\end{cases}$$
or
$$\begin{cases}\mathrm{d}X_{1,t}=g(t)A_1X_{1,t}\,\mathrm{d}t+h(t)B_1X_{1,t}\,\mathrm{d}C_t,\\ 0=X_{2,t},\end{cases} \quad (2.39)$$
for all $t\ge 0$. By [6], the equation $\mathrm{d}X_{1,t}=g(t)A_1X_{1,t}\,\mathrm{d}t+h(t)B_1X_{1,t}\,\mathrm{d}C_t$ has a unique solution $X_{1,t}$ on the interval $[0,+\infty)$. Obviously, $X_t=Q\begin{bmatrix}X_{1,t}\\X_{2,t}\end{bmatrix}$ for all $t\ge 0$, which is the unique solution to (2.38) on $[0,+\infty)$. Finally, for each $\gamma\in\Gamma$, according to the result in [6], we have
$$\|X_t(\gamma)-X_r(\gamma)\|=\left\|Q\int_r^t g(s)A_1X_{1,s}(\gamma)\,\mathrm{d}s+Q\int_r^t h(s)B_1X_{1,s}(\gamma)\,\mathrm{d}C_s(\gamma)\right\|\to 0$$
as $r\to t$. Thus, $X_t$ is sample-continuous, and this completes the proof.

Unless stated otherwise, it is always assumed that system (2.38) is regular and impulse-free. Under this assumption, we introduce the following optimal control problem for an uncertain singular system:
$$J(0,X_0)=\sup_{u(s)\in U}E\left[\int_0^T f(s,u(s),X_s)\,\mathrm{d}s+G(T,X_T)\right]$$
subject to
$$F\,\mathrm{d}X_s=g(s)[AX_s+Bu(s)]\,\mathrm{d}s+h(s)Du(s)\,\mathrm{d}C_s,\quad\text{and}\quad X_0=x_0.$$
In the above problem, $X_s\in\mathbb{R}^n$ is the state vector, $u(s)\in U\subset\mathbb{R}^m$ is the input vector, $f$ is the objective function, $G$ is the function of terminal reward, and $A\in\mathbb{R}^{n\times n}$, $B\in\mathbb{R}^{n\times m}$, $D\in\mathbb{R}^{n\times m}$. For a given $u(s)$, $X_s$ is defined by the uncertain differential equations, where $g(s),h(s):[0,+\infty)\to(0,+\infty)$ are both bounded functions. The function $J(0,X_0)$ is the expected optimal value obtainable in $[0,T]$ with the initial condition that at time 0 we are in state $X_0$. For any $0<t<T$, $J(t,x)$ is the expected optimal reward obtainable in $[t,T]$ with the condition that at time $t$ we are in state $X_t=x$. That is, we have

$$J(t,x)=\sup_{u(s)\in U}E\left[\int_t^T f(s,u(s),X_s)\,\mathrm{d}s+G(T,X_T)\right]$$
subject to
$$F\,\mathrm{d}X_s=g(s)[AX_s+Bu(s)]\,\mathrm{d}s+h(s)Du(s)\,\mathrm{d}C_s,\quad\text{and}\quad X_t=x. \quad (2.40)$$
Now let us give the following equation of optimality.

Theorem 2.5 (Equation of Optimality, [5]) The pair $(F,A)$ is assumed to be regular and impulse-free, and $P_2Du_t=0$. Let $J(t,x)$ be twice differentiable on $[0,T]\times\mathbb{R}^n$ and $u(s)$ be differentiable on $[0,T]$. Then we get
$$-J_t(t,x)=\sup_{u(t)\in U}\left\{f(t,u(t),x)+\nabla_x J(t,x)^Tp\right\}, \quad (2.41)$$
where
$$p=Q\begin{bmatrix}g(t)(A_1X_1+B_1u(t))\\-B_2\dot{u}(t)\end{bmatrix}\quad\text{and}\quad P=\begin{bmatrix}P_1\\P_2\end{bmatrix},\ P_1\in\mathbb{R}^{r\times n},\ P_2\in\mathbb{R}^{(n-r)\times n}.$$

Proof Because $(F,A)$ is regular and impulse-free, by Lemma 2.1 there exist invertible matrices $P$ and $Q$ such that
$$PFQ=\begin{bmatrix}I_r&0\\0&0\end{bmatrix},\qquad PAQ=\begin{bmatrix}A_1&0\\0&I_{n-r}\end{bmatrix},\qquad PB=\begin{bmatrix}B_1\\B_2\end{bmatrix},$$
and from $P_2Du_t=0$ we get
$$PDu_t=\begin{bmatrix}P_1\\P_2\end{bmatrix}Du_t=\begin{bmatrix}u_{t1}\\0\end{bmatrix},$$
where $u_{t1}=P_1Du_t$. Let $X_s=Q\begin{bmatrix}X_{1,s}\\X_{2,s}\end{bmatrix}$ for any $s\in[t,T]$, and especially at time $t$ denote $X=Q\begin{bmatrix}X_1\\X_2\end{bmatrix}$, so we easily obtain
$$\begin{cases}\mathrm{d}X_{1,s}=g(s)[A_1X_{1,s}+B_1u(s)]\,\mathrm{d}s+h(s)P_1Du(s)\,\mathrm{d}C_s,\\ 0=g(s)[X_{2,s}+B_2u(s)]\,\mathrm{d}s,\end{cases}$$
where $s\in[t,T]$. Thus at any time $s\in[t,T]$ we have $X_{2,s}=-B_2u(s)$. Letting $s=t$ and $s=t+\Delta t$, respectively, gives the following two equations:
$$X_{2,t}=-B_2u(t),\qquad X_{2,t+\Delta t}=-B_2u(t+\Delta t).$$

Subtracting the former equation from the latter, we obtain $\Delta X_{2,t}=-B_2\dot{u}(t)\Delta t+o(\Delta t)$, where $u(t+\Delta t)=u(t)+\dot{u}(t)\Delta t+o(\Delta t)$ because $u(s)$ is differentiable on $[t,T]$. Obviously we know
$$\Delta X_{1,t}=g(t)[A_1X_1+B_1u(t)]\Delta t+h(t)u_{t1}\Delta C_t,$$
where $\Delta C_t\sim\mathcal{N}(0,\Delta t)$, which means $\Delta C_t$ is a normal uncertain variable with expected value 0 and variance $\Delta t^2$. Because $X_s=Q\begin{bmatrix}X_{1,s}\\X_{2,s}\end{bmatrix}$, we obtain
$$\Delta X_t=Q\begin{bmatrix}g(t)[A_1X_1+B_1u(t)]\\-B_2\dot{u}(t)\end{bmatrix}\Delta t+h(t)Q_1u_{t1}\Delta C_t+o(\Delta t),$$
where $Q=[Q_1\ Q_2]$ and $Q_1\in\mathbb{R}^{n\times r}$, $Q_2\in\mathbb{R}^{n\times(n-r)}$. Now denote
$$p=Q\begin{bmatrix}g(t)[A_1X_1+B_1u(t)]\\-B_2\dot{u}(t)\end{bmatrix},\qquad q=h(t)Q_1u_{t1}.$$
Then we have $\Delta X_t=p\Delta t+q\Delta C_t+o(\Delta t)$. By employing a Taylor series expansion, we obtain
$$J(t+\Delta t,X+\Delta X_t)=J(t,X)+J_t(t,X)\Delta t+\nabla_X J(t,X)^T\Delta X_t+\frac{1}{2}J_{tt}(t,X)\Delta t^2+\nabla_X J_t(t,X)^T\Delta X_t\,\Delta t+\frac{1}{2}\Delta X_t^T\nabla_{XX}J(t,X)\Delta X_t+o(\Delta t). \quad (2.42)$$
Substituting Eq. (2.42) into Eq. (2.4) yields
$$0=\sup_{u(t)}\Big\{f(X,u(t),t)\Delta t+J_t(t,X)\Delta t+E\Big[\nabla_X J(t,X)^T\Delta X_t+\nabla_X J_t(t,X)^T\Delta X_t\,\Delta t+\frac{1}{2}\Delta X_t^T\nabla_{XX}J(t,X)\Delta X_t\Big]+o(\Delta t)\Big\}. \quad (2.43)$$
Applying Theorem 1.8, we know

$$E\Big[\nabla_X J(t,X)^T\Delta X_t+\nabla_X J_t(t,X)^T\Delta X_t\,\Delta t+\frac{1}{2}\Delta X_t^T\nabla_{XX}J(t,X)\Delta X_t\Big]$$
$$=E\Big[\nabla_X J(t,X)^T(p\Delta t+q\Delta C_t+o(\Delta t))+\nabla_X J_t(t,X)^T(p\Delta t+q\Delta C_t+o(\Delta t))\Delta t+\frac{1}{2}(p\Delta t+q\Delta C_t+o(\Delta t))^T\nabla_{XX}J(t,X)(p\Delta t+q\Delta C_t+o(\Delta t))\Big]$$
$$=\nabla_X J(t,X)^Tp\,\Delta t+E\Big[\big(\nabla_X J(t,X)^Tq+\nabla_X J_t(t,X)^Tq\,\Delta t+p^T\nabla_{XX}J(t,X)q\,\Delta t\big)\Delta C_t+\frac{1}{2}q^T\nabla_{XX}J(t,X)q\,\Delta C_t^2\Big]+o(\Delta t)$$
$$=\nabla_X J(t,X)^Tp\,\Delta t+E[a\Delta C_t+b\Delta C_t^2]+o(\Delta t)=\nabla_X J(t,X)^Tp\,\Delta t+bE\Big[\frac{a}{b}\Delta C_t+\Delta C_t^2\Big]+o(\Delta t)=\nabla_X J(t,X)^Tp\,\Delta t+o(\Delta t), \quad (2.44)$$
where $a=\nabla_X J(t,X)^Tq+\nabla_X J_t(t,X)^Tq\,\Delta t+p^T\nabla_{XX}J(t,X)q\,\Delta t$ and $b=\frac{1}{2}q^T\nabla_{XX}J(t,X)q$. Substituting Eq. (2.44) into (2.43), we obtain
$$-J_t(t,X)\Delta t=\sup_{u(t)}\left[f(X,u(t),t)\Delta t+\nabla_X J(t,X)^Tp\,\Delta t+o(\Delta t)\right]. \quad (2.45)$$
Dividing Eq. (2.45) by $\Delta t$ and letting $\Delta t\to 0$, we are able to get Eq. (2.41).

Remark 2.4 Note that when $F$ is invertible, the uncertain singular system becomes an uncertain normal system, and the optimal control problem of the uncertain normal system [2] has been tackled in recent years.

Remark 2.5 Solutions of the presented model (2.40) may be obtained by settling the equation of optimality (2.41). The vector $p=Q\begin{bmatrix}g(t)(A_1X_1+B_1u(t))\\-B_2\dot{u}(t)\end{bmatrix}$ depends on the function $u(t)$, which is totally different from the optimal control problem of the uncertain normal system, and this brings many difficulties in solving Eq. (2.41).

Example Consider the following problem:
$$J(t,x_t)=\sup_{u(t)\in U_{ad}}E\left[\int_t^T\alpha^\tau(s)X_su(s)\,\mathrm{d}s+\alpha^\tau(T)X_T\right]$$
subject to
$$F\,\mathrm{d}X_s=g(s)[AX_s+Bu(s)]\,\mathrm{d}s+h(s)Du(s)\,\mathrm{d}C_s,\quad\text{and}\quad X_t=x, \quad (2.46)$$
where $X_s\in\mathbb{R}^4$ is the state vector, $\alpha(s)\in\mathbb{R}^4$ is the coefficient of $X_s$, $U_{ad}=[-1,1]$, $\alpha^\tau(s)=[1,1,1,2]e^{-s}$, $g(s)=1$, $h(s)=s+1$, and

44 2 Uncertain Expected Value Optimal Control

    F = […], A = […], B = […], D = […].

Through calculating, we know det(zF − A) = z² + z. Obviously, det(zF − A) is not identically zero and deg(det(zF − A)) = rank(F); namely, the given system is regular and impulse-free. By using Lemma 2.1, we obtain two invertible matrices

    P = […], Q = […],

such that

    PFQ = […], PAQ = […], PB = […], PD = […].

Easily, we can see

    A₁ = […], B₁ = […], B₂ = […], P₂u_t = 0,

where P₂ = […]. Denote x = [x₁, x₂, x₃, x₄]^τ, and we assume that x₁ + x₃ = 0. Because Q⁻¹ = […],

2.6 Optimal Control Problem of the Singular Uncertain System 45

and [x₁, x₂]^τ = Q⁻¹x, we obtain X₁ = [x₁, x₃]^τ. Combining these results and Eq. (2.41),

    p = Q[g(t)(A₁X₁ + B₁u(t)); −B₂u̇(t)] = […].

We conjecture that J(t, x) = kα^τ(t)x − kα^τ(T)E[X_T] + α^τ(T)E[X_T]. Then

    J_t(t, x) = −kα^τ(t)x,    ∇x J(t, x) = kα(t),

and

    α^τ(t)xu(t) + ∇x J(t, x)^τ p = [(x₁ + x₂ + x₃ + 2x₄) − 2k]e^{−t}u(t).

Applying Eq. (2.41), we get

    k(x₁ + x₂ + x₃ + 2x₄)e^{−t} = sup_{u(t)∈[−1,1]} [(x₁ + x₂ + x₃ + 2x₄) − 2k]e^{−t}u(t)
        = e^{−t} |(x₁ + x₂ + x₃ + 2x₄) − 2k|.    (2.47)

Dividing Eq. (2.47) by e^{−t}, we obtain

    k(x₁ + x₂ + x₃ + 2x₄) = |(x₁ + x₂ + x₃ + 2x₄) − 2k|,    (2.48)

and then

    k²(x₁ + x₂ + x₃ + 2x₄)² = [(x₁ + x₂ + x₃ + 2x₄) − 2k]²,

namely

    (a² − 4)k² + 4ak − a² = 0,

46 2 Uncertain Expected Value Optimal Control

where a = x₁ + x₂ + x₃ + 2x₄. According to Eq. (2.48), the signs of k and a must coincide, so we know

    k = a/4,        if a = ±2,
        0,          if a = 0,
        a/(2 − a),  if a < −2 or 0 < a < 2,
        a/(a + 2),  if −2 < a < 0 or a > 2.

Thus the optimal control is u*(t) = sign(a − 2k).

References

1. Liu B (2009) Theory and practice of uncertain programming, 2nd edn. Springer, Berlin
2. Zhu Y (2010) Uncertain optimal control with application to a portfolio selection model. Cybern Syst 41(7)
3. Xu X, Zhu Y (2012) Uncertain bang-bang control for continuous time model. Cybern Syst Int J 43(6)
4. Dai L (1989) Singular control systems. Springer, Berlin
5. Shu Y, Zhu Y (2017) Stability and optimal control for uncertain continuous-time singular systems. Eur J Control 34
6. Ji X, Zhou J (2015) Multi-dimensional uncertain differential equation: existence and uniqueness of solution. Fuzzy Optim Decis Mak 14(4)
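The piecewise solution for k can be checked numerically: in each regime of a, the stated root must satisfy the quadratic (a² − 4)k² + 4ak − a² = 0 together with the sign-coincidence requirement k·a ≥ 0 imposed by Eq. (2.48). A minimal sketch (the sample values of a are assumptions for illustration only):

```python
def k_of(a):
    # Piecewise root of (a^2 - 4) k^2 + 4 a k - a^2 = 0 with sign(k) = sign(a)
    if a == 0:
        return 0.0
    if a in (2.0, -2.0):
        return a / 4.0
    if a < -2.0 or 0.0 < a < 2.0:
        return a / (2.0 - a)
    return a / (a + 2.0)          # -2 < a < 0 or a > 2

def residual(a, k):
    # Left-hand side of the quadratic; should vanish at the chosen root
    return (a * a - 4.0) * k * k + 4.0 * a * k - a * a

for a in (-3.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0, 3.0):
    k = k_of(a)
    assert abs(residual(a, k)) < 1e-12   # k solves the quadratic
    assert k * a >= 0                    # signs of k and a coincide
    u_star = 1.0 if a - 2.0 * k > 0 else (-1.0 if a - 2.0 * k < 0 else 0.0)
print("ok")
```

The last line of the loop evaluates the bang-bang rule u*(t) = sign(a − 2k) at each sample point.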

Chapter 3
Optimistic Value-Based Uncertain Optimal Control

Expected value is the weighted average of an uncertain variable in the sense of uncertain measure. However, in some cases, we need to take other characteristics of uncertain variables into account. For instance, if student test scores show a two-level differentiation phenomenon, and the gap between the higher and the lower scores is too large, then the average grade should not be the only consideration. In this case, a critical value (optimistic value or pessimistic value) of the test scores may be discussed. We may investigate, for example, the lowest point that 95% of the test scores reach. Different from the expected value optimal control problems, in this chapter we will introduce another kind of uncertain optimal control problem for uncertain differential systems, based on the optimistic value criterion.

3.1 Optimistic Value Model

Assume that C_t = (C_{t1}, C_{t2}, ..., C_{tk})^τ, where C_{t1}, C_{t2}, ..., C_{tk} are independent canonical Liu processes. For any 0 < t < T and confidence level α ∈ (0, 1), we introduce an uncertain optimistic value optimal control problem for the multidimensional case as follows [1]:

    J(t, x) ≡ sup_{u_t∈U} F_sup(α)
    subject to dX_s = μ(s, u_s, X_s)ds + σ(s, u_s, X_s)dC_s and X_t = x    (3.1)

where F = ∫_t^T f(s, u_s, X_s)ds + G(T, X_T), and

    F_sup(α) = sup{F̄ | M{F ≥ F̄} ≥ α}

denotes the α-optimistic value of F. The vector X_s is a state vector of dimension n, and u_s is a control vector of dimension r subject to a constraint set U.

© Springer Nature Singapore Pte Ltd.
Y. Zhu, Uncertain Optimal Control, Springer Uncertainty Research, 47

48 3 Optimistic Value-Based Uncertain Optimal Control

The function f : [0, T] × R^r × R^n → R is an objective function, and G : [0, T] × R^n → R is a function of terminal reward. In addition, μ : [0, T] × R^r × R^n → R^n is a vector-valued function, and σ : [0, T] × R^r × R^n → R^{n×k} is a matrix-valued function. All functions mentioned are continuous. We first present the following principle of optimality.

Theorem 3.1 ([1]) For any (t, x) ∈ [0, T) × R^n, and Δt > 0 with t + Δt < T, we have

    J(t, x) = sup_{u_t∈U} {f(t, u_t, x)Δt + J(t + Δt, x + ΔX_t) + o(Δt)},    (3.2)

where x + ΔX_t = X_{t+Δt}.

Proof We denote the right side of (3.2) by J̄(t, x). For arbitrary u_t ∈ U, it follows from the definition of J(t, x) that

    J(t, x) ≥ [∫_t^{t+Δt} f(s, u_s|[t,t+Δt), X_s)ds + ∫_{t+Δt}^T f(s, u_s|[t+Δt,T], X_s)ds + G(T, X_T)]_sup(α),

where u_s|[t,t+Δt) and u_s|[t+Δt,T] are the control vector u_s restricted to [t, t + Δt) and [t + Δt, T], respectively. Since for any Δt > 0,

    ∫_t^{t+Δt} f(s, u_s|[t,t+Δt), X_s)ds = f(t, u_t, x)Δt + o(Δt),

we have

    J(t, x) ≥ f(t, u_t, x)Δt + o(Δt) + [∫_{t+Δt}^T f(s, u_s|[t+Δt,T], X_s)ds + G(T, X_T)]_sup(α).    (3.3)

Taking the supremum with respect to u_s|[t+Δt,T] in (3.3), we get J(t, x) ≥ J̄(t, x). On the other hand, for all u_t, we have

    [∫_t^T f(s, u_s, X_s)ds + G(T, X_T)]_sup(α)
    = f(t, u_t, x)Δt + o(Δt) + [∫_{t+Δt}^T f(s, u_s|[t+Δt,T], X_s)ds + G(T, X_T)]_sup(α)
    ≤ f(t, u_t, x)Δt + o(Δt) + J(t + Δt, x + ΔX_t) ≤ J̄(t, x).

Hence, J(t, x) ≤ J̄(t, x), and then J(t, x) = J̄(t, x). Theorem 3.1 is proved.

3.2 Equation of Optimality 49

3.2 Equation of Optimality

Consider the uncertain optimal control problem (3.1). Now let us give an equation of optimality for the optimistic value model.

Theorem 3.2 ([1]) Let J(t, x) be twice differentiable on [0, T] × R^n. Then we have

    −J_t(t, x) = sup_{u∈U} { f(t, u_t, x) + ∇x J(t, x)^τ μ(t, u_t, x)
        + (√3/π) ln((1−α)/α) ‖∇x J(t, x)^τ σ(t, u_t, x)‖₁ }    (3.4)

where J_t(t, x) is the partial derivative of the function J(t, x) in t, ∇x J(t, x) is the gradient of J(t, x) in x, and ‖·‖₁ is the 1-norm for vectors, that is, ‖p‖₁ = Σ_{i=1}^n |p_i| for p = (p₁, p₂, ..., p_n).

Proof By Taylor expansion, we get

    J(t + Δt, x + ΔX_t) = J(t, x) + J_t(t, x)Δt + ∇x J(t, x)^τ ΔX_t + ½ J_tt(t, x)Δt²
        + ½ ΔX_t^τ ∇xx J(t, x) ΔX_t + ∇x J_t(t, x)^τ ΔX_t Δt + o(Δt)    (3.5)

where ∇xx J(t, x) is the Hessian matrix of J(t, x). Substituting Eq. (3.5) into Eq. (3.2) yields that

    0 = sup_{u∈U} { f(t, u_t, x)Δt + J_t(t, x)Δt + [∇x J(t, x)^τ ΔX_t
        + ½ ΔX_t^τ ∇xx J(t, x) ΔX_t + ∇x J_t(t, x)^τ ΔX_t Δt]_sup(α) + o(Δt) }.    (3.6)

Note that ΔX_t = μ(t, u_t, x)Δt + σ(t, u_t, x)ΔC_t. It follows from (3.6) that

    0 = sup_{u∈U} { f(t, u_t, x)Δt + J_t(t, x)Δt + ∇x J(t, x)^τ μ(t, u_t, x)Δt
        + [aΔC_t + ΔC_t^τ B ΔC_t]_sup(α) + o(Δt) },    (3.7)

where

    a = ∇x J(t, x)^τ σ(t, u_t, x) + ∇x J_t(t, x)^τ σ(t, u_t, x)Δt + μ(t, u_t, x)^τ ∇xx J(t, x) σ(t, u_t, x)Δt,
    B = ½ σ(t, u_t, x)^τ ∇xx J(t, x) σ(t, u_t, x).

50 3 Optimistic Value-Based Uncertain Optimal Control

Let a = (a₁, a₂, ..., a_k), B = (b_ij)_{k×k}. Then we have

    aΔC_t + ΔC_t^τ B ΔC_t = Σ_{i=1}^k a_i ΔC_{ti} + Σ_{i=1}^k Σ_{j=1}^k b_ij ΔC_{ti} ΔC_{tj}.

Since b_ij ΔC_{ti} ΔC_{tj} ≤ ½ |b_ij| (ΔC_{ti}² + ΔC_{tj}²), we have

    Σ_{i=1}^k { a_i ΔC_{ti} − (Σ_{j=1}^k |b_ij|) ΔC_{ti}² } ≤ aΔC_t + ΔC_t^τ B ΔC_t
        ≤ Σ_{i=1}^k { a_i ΔC_{ti} + (Σ_{j=1}^k |b_ij|) ΔC_{ti}² }.

Because of the independence of C_{t1}, C_{t2}, ..., C_{tk}, we have

    Σ_{i=1}^k [a_i ΔC_{ti} − (Σ_{j=1}^k |b_ij|) ΔC_{ti}²]_sup(α) ≤ [aΔC_t + ΔC_t^τ B ΔC_t]_sup(α)
        ≤ Σ_{i=1}^k [a_i ΔC_{ti} + (Σ_{j=1}^k |b_ij|) ΔC_{ti}²]_sup(α).

It follows from Theorem 1.12 that for any small enough ε > 0, we have

    [aΔC_t + ΔC_t^τ B ΔC_t]_sup(α) ≤ (√3/π) ln((1−α+ε)/(α−ε)) Δt Σ_{i=1}^k |a_i|
        + 3((√3/π) ln((2−ε)/ε))² Δt² Σ_{i=1}^k Σ_{j=1}^k |b_ij|,    (3.8)

and

    [aΔC_t + ΔC_t^τ B ΔC_t]_sup(α) ≥ (√3/π) ln((1−α−ε)/(α+ε)) Δt Σ_{i=1}^k |a_i|
        − 3((√3/π) ln((2−ε)/ε))² Δt² Σ_{i=1}^k Σ_{j=1}^k |b_ij|.    (3.9)

By Eq. (3.7) and inequality (3.8), for Δt > 0, there exists a control u_t ≡ u_{ε,Δt} such that

3.2 Equation of Optimality 51

    −εΔt ≤ { f(t, u_t, x)Δt + J_t(t, x)Δt + ∇x J(t, x)^τ μ(t, u_t, x)Δt + [aΔC_t + ΔC_t^τ B ΔC_t]_sup(α) + o(Δt) }
    ≤ f(t, u_t, x)Δt + J_t(t, x)Δt + ∇x J(t, x)^τ μ(t, u_t, x)Δt
        + (√3/π) ln((1−α+ε)/(α−ε)) Δt Σ_{i=1}^k |a_i| + 3((√3/π) ln((2−ε)/ε))² Δt² Σ_{i=1}^k Σ_{j=1}^k |b_ij| + o(Δt).

Dividing both sides of the above inequality by Δt, we get

    −ε ≤ f(t, u_t, x) + J_t(t, x) + ∇x J(t, x)^τ μ(t, u_t, x)
        + (√3/π) ln((1−α+ε)/(α−ε)) ‖∇x J(t, x)^τ σ(t, u_t, x)‖₁ + h₁(ε, Δt) + h₂(Δt)
    ≤ J_t(t, x) + sup_{u∈U} { f(t, u_t, x) + ∇x J(t, x)^τ μ(t, u_t, x)
        + (√3/π) ln((1−α+ε)/(α−ε)) ‖∇x J(t, x)^τ σ(t, u_t, x)‖₁ } + h₁(ε, Δt) + h₂(Δt)

since Σ_{i=1}^k |a_i| → ‖∇x J(t, x)^τ σ(t, u_t, x)‖₁ as Δt → 0, where h₁(ε, Δt) → 0 and h₂(Δt) → 0 as Δt → 0. Letting Δt → 0, and then ε → 0, results in

    0 ≤ J_t(t, x) + sup_{u∈U} { f(t, u_t, x) + ∇x J(t, x)^τ μ(t, u_t, x)
        + (√3/π) ln((1−α)/α) ‖∇x J(t, x)^τ σ(t, u_t, x)‖₁ }.    (3.10)

On the other hand, by Eq. (3.7) and inequality (3.9), applying the similar method, we can obtain

    0 ≥ J_t(t, x) + sup_{u∈U} { f(t, u_t, x) + ∇x J(t, x)^τ μ(t, u_t, x)
        + (√3/π) ln((1−α)/α) ‖∇x J(t, x)^τ σ(t, u_t, x)‖₁ }.    (3.11)

Combining (3.10) and (3.11), we obtain Eq. (3.4). The theorem is proved.

Remark 3.1 The solutions of the proposed model (3.1) may be derived by solving the equation of optimality (3.4).
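The coefficient (√3/π) ln((1−α)/α) in (3.4) comes from the α-optimistic value of a normal uncertain variable (Theorem 1.12). The following sketch, assuming the standard uncertainty-theory distribution Φ(x) = (1 + exp(π(e − x)/(√3σ)))⁻¹ of a normal uncertain variable N(e, σ), checks the closed form F_sup(α) = e + (√3σ/π) ln((1−α)/α) against a numerical inversion of Φ:

```python
import math

def normal_ucd(x, e=0.0, sigma=1.0):
    # Uncertainty distribution of a normal uncertain variable N(e, sigma)
    return 1.0 / (1.0 + math.exp(math.pi * (e - x) / (math.sqrt(3) * sigma)))

def optimistic_value(alpha, e=0.0, sigma=1.0):
    # Closed form: F_sup(alpha) = e + (sqrt(3)*sigma/pi) * ln((1-alpha)/alpha)
    return e + (math.sqrt(3) * sigma / math.pi) * math.log((1 - alpha) / alpha)

def optimistic_value_numeric(alpha, e=0.0, sigma=1.0, lo=-50.0, hi=50.0):
    # F_sup(alpha) solves Phi(x) = 1 - alpha; locate it by bisection
    for _ in range(200):
        mid = (lo + hi) / 2
        if normal_ucd(mid, e, sigma) < 1 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

alpha = 0.2
print(abs(optimistic_value(alpha) - optimistic_value_numeric(alpha)) < 1e-8)
```

With α = 0.2 and N(0, 1), both routines return (√3/π) ln 4, the factor that reappears in the singular-system example later in this chapter.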

52 3 Optimistic Value-Based Uncertain Optimal Control

Remark 3.2 Note that in the case of stochastic optimal control, we cannot obtain a conclusion similar to (3.4), owing to the difficulty of calculating the optimistic value of variables of the form aη + bη², where η is a normally distributed random variable, while the normal distribution function of a random variable has no analytic expression.

Remark 3.3 Particularly, for the one-dimensional case, the equation of optimality has a simple form:

    −J_t(t, x) = sup_{u_t∈U} { f(t, u_t, x) + J_x(t, x)μ(t, u_t, x)
        + (√3/π) ln((1−α)/α) |J_x(t, x)σ(t, u_t, x)| }.    (3.12)

3.3 Uncertain Optimal Control Model with Hurwicz Criterion

Grounded on uncertain measure, the optimistic value criterion and pessimistic value criterion of uncertain variables have been introduced for handling optimization problems in uncertain environments. Applying the optimistic value criterion to the objectives is essentially a maximax approach, which maximizes the best-case uncertain return; it suits a decision maker who is attracted by high payoffs and willing to take some risks. As opposed to the optimistic value criterion, using the pessimistic value criterion for an uncertain decision system is essentially a maximin approach, whose underlying philosophy is to select the alternative that provides the least bad uncertain return. It suits a cautious decision maker, for whom there is at least a known minimum payoff in the event of an unfavourable outcome. The Hurwicz criterion, also called the optimism coefficient method, was designed by economics professor Leonid Hurwicz [2] in 1951. It is a compound decision-making criterion that attempts to find the intermediate area between the extremes posed by the optimistic and pessimistic criteria. Instead of assuming total optimism or total pessimism, the Hurwicz criterion incorporates a measure of both by assigning a certain percentage weight to optimism and the balance to pessimism.
With the Hurwicz criterion, the decision maker first subjectively selects a coefficient ρ denoting the degree of optimism, where 0 ≤ ρ ≤ 1. Correspondingly, 1 − ρ represents a measure of the decision maker's pessimism. For every decision alternative, the maximum return is multiplied by the coefficient of optimism ρ and the minimum return by the coefficient 1 − ρ, and the results are summed. After computing each alternative's weighted average return, the alternative with the best return is selected as the chosen decision. By changing the coefficient ρ, the Hurwicz criterion reduces to various criteria: if ρ = 1, it reduces to the optimistic value criterion; if ρ = 0, it is the pessimistic value criterion.
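For a normal uncertain variable the Hurwicz value has a simple closed form that already exhibits the (2ρ − 1) factor appearing in the equation of optimality below. A minimal sketch, assuming the standard optimistic and pessimistic values of N(e, σ) from uncertainty theory (the numeric parameter values are illustrative assumptions):

```python
import math

SQ3_PI = math.sqrt(3) / math.pi

def f_sup(alpha, e, sigma):
    # alpha-optimistic value of a normal uncertain variable N(e, sigma)
    return e + sigma * SQ3_PI * math.log((1 - alpha) / alpha)

def f_inf(alpha, e, sigma):
    # alpha-pessimistic value of N(e, sigma)
    return e + sigma * SQ3_PI * math.log(alpha / (1 - alpha))

def hurwicz(rho, alpha, e, sigma):
    # H^rho_alpha = rho * F_sup(alpha) + (1 - rho) * F_inf(alpha)
    return rho * f_sup(alpha, e, sigma) + (1 - rho) * f_inf(alpha, e, sigma)

e, sigma, alpha, rho = 1.0, 2.0, 0.2, 0.7
# rho = 1 and rho = 0 recover the optimistic and pessimistic criteria
print(hurwicz(1.0, alpha, e, sigma) == f_sup(alpha, e, sigma))
# For N(e, sigma) the Hurwicz value collapses to
#   e + (2*rho - 1) * (sqrt(3)*sigma/pi) * ln((1-alpha)/alpha)
closed = e + (2 * rho - 1) * sigma * SQ3_PI * math.log((1 - alpha) / alpha)
print(abs(hurwicz(rho, alpha, e, sigma) - closed) < 1e-12)
```

The collapse happens because F_inf(α) = 2e − F_sup(α) for a normal uncertain variable, so the ρ-weighted average leaves only the (2ρ − 1) multiple of the spread term.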

3.3 Uncertain Optimal Control Model with Hurwicz Criterion 53

Assume that C_t = (C_{t1}, C_{t2}, ..., C_{tk})^τ, where C_{t1}, C_{t2}, ..., C_{tk} are independent canonical processes. Select a coefficient ρ ∈ (0, 1) denoting the degree of optimism, and a predetermined confidence level α ∈ (0, 1). For any 0 < t < T, we present an uncertain optimal control model with Hurwicz criterion for the multidimensional case as follows [3]:

    J(t, x) ≡ sup_{u_t∈U} H_α^ρ = sup_{u_t∈U} { ρ F_sup(α) + (1 − ρ) F_inf(α) }
    subject to dX_s = μ(s, u_s, X_s)ds + σ(s, u_s, X_s)dC_s and X_t = x    (3.13)

where F = ∫_t^T f(s, u_s, X_s)ds + G(T, X_T), F_sup(α) = sup{F̄ | M{F ≥ F̄} ≥ α} denotes the α-optimistic value of F, and F_inf(α) = inf{F̄ | M{F ≤ F̄} ≥ α} reflects the α-pessimistic value of F. The vector X_s is the state vector of dimension n, and u is a control vector of dimension r subject to a constraint set U. The function f : [0, T] × R^r × R^n → R is the objective function, and G : [0, T] × R^n → R is the function of terminal reward. In addition, μ : [0, T] × R^r × R^n → R^n is a vector-valued function, and σ : [0, T] × R^r × R^n → R^{n×k} is a matrix-valued function. For the purpose of solving the proposed model, we now present the following principle of optimality and equation of optimality.

Theorem 3.3 ([3]) For any (t, x) ∈ [0, T) × R^n, and Δt > 0 with t + Δt < T, we have

    J(t, x) = sup_{u_t∈U} { f(t, u_t, x)Δt + J(t + Δt, x + ΔX_t) + o(Δt) },    (3.14)

where x + ΔX_t = X_{t+Δt}.

Proof The proof is similar to that of Theorem 3.1.

Theorem 3.4 ([3]) Suppose J(t, x) ∈ C²([0, T] × R^n). Then we have

    −J_t(t, x) = sup_{u_t∈U} { f(t, u_t, x) + ∇x J(t, x)^τ μ(t, u_t, x)
        + (2ρ − 1)(√3/π) ln((1−α)/α) ‖∇x J(t, x)^τ σ(t, u_t, x)‖₁ }    (3.15)

where J_t(t, x) is the partial derivative of the function J(t, x) in t, ∇x J(t, x) is the gradient of J(t, x) in x, and ‖·‖₁ is the 1-norm for vectors, that is, ‖p‖₁ = Σ_{i=1}^n |p_i| for p = (p₁, p₂, ..., p_n).

54 3 Optimistic Value-Based Uncertain Optimal Control

Proof By using Taylor expansion, we get

    J(t + Δt, x + ΔX_t) = J(t, x) + J_t(t, x)Δt + ∇x J(t, x)^τ ΔX_t + ½ J_tt(t, x)Δt²
        + ½ ΔX_t^τ ∇xx J(t, x) ΔX_t + ∇x J_t(t, x)^τ ΔX_t Δt + o(Δt)    (3.16)

where ∇xx J(t, x) is the Hessian matrix of J(t, x) in x. Note that ΔX_t = μ(t, u_t, x)Δt + σ(t, u_t, x)ΔC_t. Substituting Eq. (3.16) into Eq. (3.14) and simplifying the resulting expression yields

    0 = sup_{u_t∈U} { f(t, u_t, x)Δt + J_t(t, x)Δt + ∇x J(t, x)^τ μ(t, u_t, x)Δt
        + H_α^ρ[aΔC_t + ΔC_t^τ B ΔC_t] + o(Δt) },    (3.17)

where

    a = ∇x J(t, x)^τ σ(t, u_t, x) + ∇x J_t(t, x)^τ σ(t, u_t, x)Δt + μ(t, u_t, x)^τ ∇xx J(t, x) σ(t, u_t, x)Δt,
    B = ½ σ(t, u_t, x)^τ ∇xx J(t, x) σ(t, u_t, x).

Let a = (a₁, a₂, ..., a_k), B = (b_ij)_{k×k}. We have

    aΔC_t + ΔC_t^τ B ΔC_t = Σ_{i=1}^k a_i ΔC_{ti} + Σ_{i=1}^k Σ_{j=1}^k b_ij ΔC_{ti} ΔC_{tj}.

Since b_ij ΔC_{ti} ΔC_{tj} ≤ ½ |b_ij| (ΔC_{ti}² + ΔC_{tj}²), we have

    Σ_{i=1}^k { a_i ΔC_{ti} − (Σ_{j=1}^k |b_ij|) ΔC_{ti}² } ≤ aΔC_t + ΔC_t^τ B ΔC_t
        ≤ Σ_{i=1}^k { a_i ΔC_{ti} + (Σ_{j=1}^k |b_ij|) ΔC_{ti}² }.

Because of the independence of C_{t1}, C_{t2}, ..., C_{tk}, we have

    Σ_{i=1}^k H_α^ρ[a_i ΔC_{ti} − (Σ_{j=1}^k |b_ij|) ΔC_{ti}²] ≤ H_α^ρ[aΔC_t + ΔC_t^τ B ΔC_t]
        ≤ Σ_{i=1}^k H_α^ρ[a_i ΔC_{ti} + (Σ_{j=1}^k |b_ij|) ΔC_{ti}²].

3.3 Uncertain Optimal Control Model with Hurwicz Criterion 55

By Eq. (3.17), for Δt > 0 and any small enough ε > 0, there exists a control u_t ≡ u_{ε,Δt} such that

    −εΔt ≤ { f(t, u_t, x)Δt + J_t(t, x)Δt + ∇x J(t, x)^τ μ(t, u_t, x)Δt + H_α^ρ[aΔC_t + ΔC_t^τ B ΔC_t] + o(Δt) }.

Applying Theorems 1.12 and 1.13, we have

    −εΔt ≤ f(t, u_t, x)Δt + J_t(t, x)Δt + ∇x J(t, x)^τ μ(t, u_t, x)Δt
        + (2ρ − 1)(√3/π) ln((1−α+ε)/(α−ε)) Δt Σ_{i=1}^k |a_i|
        + 3((√3/π) ln((2−ε)/ε))² Δt² Σ_{i=1}^k Σ_{j=1}^k |b_ij| + o(Δt).

Dividing both sides of the above inequality by Δt, and taking the supremum with respect to u_t, we get

    −ε ≤ J_t(t, x) + sup_{u∈U} { f(t, u_t, x) + ∇x J(t, x)^τ μ(t, u_t, x)
        + (2ρ − 1)(√3/π) ln((1−α+ε)/(α−ε)) ‖∇x J(t, x)^τ σ(t, u_t, x)‖₁ } + h₁(ε, Δt) + h₂(Δt)

since Σ_{i=1}^k |a_i| → ‖∇x J(t, x)^τ σ(t, u_t, x)‖₁ as Δt → 0, where h₁(ε, Δt) → 0 and h₂(Δt) → 0 as Δt → 0. Letting Δt → 0, and then ε → 0, results in

    0 ≤ J_t(t, x) + sup_{u_t∈U} { f(t, u_t, x) + ∇x J(t, x)^τ μ(t, u_t, x)
        + (2ρ − 1)(√3/π) ln((1−α)/α) ‖∇x J(t, x)^τ σ(t, u_t, x)‖₁ }.    (3.18)

On the other hand, by Theorems 1.12 and 1.13 again and applying the similar process, we can obtain

    0 ≥ J_t(t, x) + sup_{u∈U} { f(t, u_t, x) + ∇x J(t, x)^τ μ(t, u_t, x)
        + (2ρ − 1)(√3/π) ln((1−α)/α) ‖∇x J(t, x)^τ σ(t, u_t, x)‖₁ }.    (3.19)

Combining (3.18) and (3.19), we obtain Eq. (3.15). The theorem is proved.

56 3 Optimistic Value-Based Uncertain Optimal Control

Remark 3.4 If we consider a discounted infinite-horizon optimal control problem, we assume that the objective function f, drift μ, and diffusion σ are independent of time. Thus, we replace f(s, u_s, X_s), μ(s, u_s, X_s), and σ(s, u_s, X_s) by f(u_s, X_s), μ(u_s, X_s), and σ(u_s, X_s), respectively. The problem is stated as follows:

    J(x) ≡ sup_{u∈U} H_α^ρ [∫_t^∞ e^{−γs} f(u_s, X_s)ds]
    subject to dX_s = μ(u_s, X_s)ds + σ(u_s, X_s)dC_s and X_t = x.    (3.20)

At time 0, the present value of the objective is given by e^{−γt} J(x). Using the relations from Eq. (3.15), we obtain the present value by

    γ J(x) = sup_{u_t∈U} { f(x, u) + ∇x J(x)^τ μ(x, u)
        + (2ρ − 1)(√3/π) ln((1−α)/α) ‖∇x J(x)^τ σ(x, u)‖₁ }.    (3.21)

Example 3.1 Consider the following optimization problem, which comes from the Vidale–Wolfe advertising model [4] in uncertain environments:

    J(0, x₀) ≡ max_{u∈U} H_α^ρ [∫_0^∞ e^{−γt}(δX_t − u²)dt]
    subject to dX_t = [ru√(1 − X_t) − kX_t]dt + σ(X_t)dC_t,

where X_t ∈ [0, 1] is the fraction of market potential, u ≥ 0 denotes the rate of advertising effort, r > 0, k > 0, σ is a small diffusion coefficient, σ ≥ 0, and γ is a discount factor. In this case, we have F = ∫_0^∞ e^{−γt}(δX_t − u²)dt. Applying Eq. (3.15), we obtain

    γ J = max_u { (δx − u²) + (ru√(1 − x) − kx)J_x + (2ρ − 1)(√3/π) ln((1−α)/α) |J_x|σ } ≡ max_u L(u)    (3.22)

where L(u) denotes the term in the braces. Setting dL(u)/du = 0, we obtain the necessary condition for optimality

    u = r√(1 − x) J_x(t, x)/2.
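Since L(u) is concave in u (the only u² term enters with a minus sign), the stationary point is indeed the maximizer. A quick numerical sanity check of this first-order condition, with all parameter values assumed purely for illustration:

```python
import math

# Illustrative (assumed) parameters: r, k, delta, state x, gradient Jx, and
# the u-independent part of L collected into a constant
r, k, delta, x, Jx, const = 1.2, 0.4, 2.0, 0.36, 1.5, 0.7

def L(u):
    # The bracketed term of (3.22); concave in u
    return (delta * x - u * u) + (r * u * math.sqrt(1 - x) - k * x) * Jx + const

# Necessary condition: u* = r * sqrt(1 - x) * Jx / 2
u_star = r * math.sqrt(1 - x) * Jx / 2

# u* beats nearby controls, confirming it is the maximizer of L
assert all(L(u_star) >= L(u_star + d) for d in (-0.1, -0.01, 0.01, 0.1))
print(round(u_star, 6))
```

With these numbers u* = 1.2 · 0.8 · 1.5 / 2 = 0.72, and perturbing u in either direction only lowers L.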

3.3 Uncertain Optimal Control Model with Hurwicz Criterion 57

Substituting the equality into Eq. (3.22), we have

    γ J = δx + r²(1 − x)J_x²/4 − kxJ_x + (2ρ − 1)(√3/π) ln((1−α)/α) σ|J_x|.    (3.23)

We conjecture that J(t, x) = Px + Q (P > 0). This gives J_x = P. Using this expression in Eq. (3.23), we have the following condition for optimality:

    (4γP + r²P² − 4δ + 4kP)x + 4γQ − r²P² − 4P(2ρ − 1)(√3/π) ln((1−α)/α) σ = 0,

or

    4γP + r²P² − 4δ + 4kP = 0,  and  4γQ − r²P² − 4P(2ρ − 1)(√3/π) ln((1−α)/α) σ = 0.

The solution is given by

    P = [−2(γ + k) + 2√((γ + k)² + r²δ)]/r²  and  Q = [r²P² + 4P(2ρ − 1)(√3/π) ln((1−α)/α) σ]/(4γ).

The optimal decision is determined by u* = rP√(1 − x)/2.

3.4 Uncertain Linear Quadratic Model Under Optimistic Value Criterion

We discuss an optimal control problem of an uncertain linear quadratic model under the optimistic value criterion. The problem is of the form:

    J(0, x₀) = inf_{u_s} [∫_0^T (X_s^τ Q(s)X_s + u_s^τ R(s)u_s)ds + X_T^τ S_T X_T]_sup(α)
    subject to dX_s = (A(s)X_s + B(s)u_s)ds + M(s)X_s dC_s,  X₀ = x₀,    (3.24)

where X_s is a state vector of dimension n, u_s is a decision vector of dimension r, S_T is a symmetric matrix, and x_s ∈ [a, b]^n, where x_s represents the state of X_s at time s. The matrices Q(s), R(s), S_T, A(s), B(s), and M(s) are matrix functions of appropriate sizes, where Q(s) is a symmetric nonnegative definite matrix and R(s) is a symmetric positive definite matrix. For any 0 < t < T, we use x to denote the state of X_s at time t and J(t, x) to denote the optimal value obtainable in [t, T]. First, we shall make the following two assumptions: (i) the elements of Q(s), R(s), A(s), B(s), M(s), and R⁻¹(s) are

58 3 Optimistic Value-Based Uncertain Optimal Control

continuous and bounded functions on [0, T]; (ii) the optimal value J(t, x) is a twice differentiable function on [0, T] × [a, b]^n. Then, applying the equation of optimality (3.4), we obtain

    inf_{u_t} { x^τ Q(t)x + u_t^τ R(t)u_t + ∇x J(t, x)^τ (A(t)x + B(t)u_t)
        + (√3/π) ln((1−α)/α) |∇x J(t, x)^τ M(t)x| } + J_t(t, x) = 0.    (3.25)

Theorem 3.5 ([5]) A necessary and sufficient condition that u_t* be an optimal control for model (3.24) is that

    u_t* = −½ R⁻¹(t)B^τ(t)P(t)x,    (3.26)

where the function P(t) satisfies the following Riccati differential equation

    dP(t)/dt = −2Q(t) − A^τ(t)P(t) − P(t)A(t) − (√3/π) ln((1−α)/α)(P(t)M(t) + M^τ(t)P(t))
        + ½ P(t)B(t)R⁻¹(t)B^τ(t)P(t),  if (t, x) ∈ Ω₁,
    dP(t)/dt = −2Q(t) − A^τ(t)P(t) − P(t)A(t) + (√3/π) ln((1−α)/α)(P(t)M(t) + M^τ(t)P(t))
        + ½ P(t)B(t)R⁻¹(t)B^τ(t)P(t),  if (t, x) ∈ Ω₂,    (3.27)

with boundary condition P(T) = 2S_T, where

    Ω₁ = {(t, x) | x^τ P(t)M(t)x ≥ 0, (t, x) ∈ [0, T] × [a, b]^n},
    Ω₂ = {(t, x) | x^τ P(t)M(t)x < 0, (t, x) ∈ [0, T] × [a, b]^n}.

The optimal value of model (3.24) is

    J(0, x₀) = ½ x₀^τ P(0)x₀.    (3.28)

Proof Denote

    ψ(u_t) = x^τ Q(t)x + u_t^τ R(t)u_t + ∇x J(t, x)^τ (A(t)x + B(t)u_t)
        + (√3/π) ln((1−α)/α) |∇x J(t, x)^τ M(t)x| + J_t(t, x).    (3.29)

First, we verify the necessity. Since J(T, X_T) = x_T^τ S_T x_T, we conjecture that

    ∇x J(t, x) = P(t)x

3.4 Uncertain Linear Quadratic Model Under Optimistic Value Criterion 59

with the boundary condition P(T) = 2S_T. Setting ∂ψ(u_t)/∂u_t = 0, we have

    u_t = −½ R⁻¹(t)B^τ(t)P(t)x.    (3.30)

Because ∂²ψ(u_t)/∂u_t² = 2R(t) > 0, u_t is the optimal control of model (3.24), i.e.,

    u_t* = −½ R⁻¹(t)B^τ(t)P(t)x.    (3.31)

If (t, x) ∈ Ω₁, taking the derivative of ψ(u_t*) with respect to x, we have

    (2Q(t) + A^τ(t)P(t) + P(t)A(t) + (√3/π) ln((1−α)/α) P(t)M(t)
        + (√3/π) ln((1−α)/α) M^τ(t)P(t) − ½ P(t)B(t)R⁻¹(t)B^τ(t)P(t) + dP(t)/dt) x = 0.

That is,

    dP(t)/dt = −2Q(t) − A^τ(t)P(t) − P(t)A(t) − (√3/π) ln((1−α)/α) P(t)M(t)
        − (√3/π) ln((1−α)/α) M^τ(t)P(t) + ½ P(t)B(t)R⁻¹(t)B^τ(t)P(t).

If (t, x) ∈ Ω₂, by the same method, we obtain

    dP(t)/dt = −2Q(t) − A^τ(t)P(t) − P(t)A(t) + (√3/π) ln((1−α)/α) P(t)M(t)
        + (√3/π) ln((1−α)/α) M^τ(t)P(t) + ½ P(t)B(t)R⁻¹(t)B^τ(t)P(t).

Hence, the solution P(t) is a symmetric matrix. Because ∇x J(t, x) = P(t)x and J(T, X_T) = x_T^τ S_T x_T, we have J(t, x) = ½ x^τ P(t)x. Then, the optimal value J(0, x₀) is

    J(0, x₀) = ½ x₀^τ P(0)x₀.    (3.32)

Then, we prove the sufficiency. Because J(T, X_T) = x_T^τ S_T x_T, we assume that J(t, x) = ½ x^τ P(t)x, where P(t) satisfies the Riccati differential equation (3.27) with the boundary condition P(T) = 2S_T. Substituting Eqs. (3.26) and (3.27) into ψ(u_t), we have ψ(u_t) = 0. Because the objective function of model

60 3 Optimistic Value-Based Uncertain Optimal Control

(3.24) is convex, there must be an optimal control solution. Hence, u_t* is the optimal control and J(t, x) = ½ x^τ P(t)x. Furthermore, the optimal value J(0, x₀) is

    J(0, x₀) = ½ x₀^τ P(0)x₀.    (3.33)

The theorem is proved.

Remark 3.5 We know that there is as yet no simple and effective method to solve a Riccati differential equation containing an absolute value function. In order to obtain the solution P(t), we need to make a judgment about the sign of x^τ P(t)M(t)x. The procedure is as follows. First, we assume that x^τ P(t)M(t)x ≥ 0 or x^τ P(t)M(t)x < 0 and use the fourth-order Runge–Kutta method to compute a numerical solution of P(t). Then we check whether the result is consistent with the assumption. If they are consistent, the numerical solution of P(t) is serviceable and we can use Theorem 3.5 to obtain the optimal control. If both are inconsistent, then we cannot solve the optimal control problem in this case. Moreover, if we can verify the positive or negative definiteness of P(t)M(t), then Theorem 3.5 can be used immediately. Hence, here we only consider the reconcilable cases.

3.5 Optimistic Value Optimal Control for Singular System

Consider the following optimal control problem for a continuous-time singular uncertain system:

    J(0, X₀) = sup_{u_s∈U} [∫_0^T f(s, u_s, X_s)ds + G(T, X_T)]_sup(α)
    subject to FdX_s = [AX_s + Bu(s)]ds + Du(s)dC_s, and X₀ = x₀,

where X_s ∈ R^n is the state vector, u_s ∈ U ⊂ R^m is the input variable, f is the objective function, and G is the function of terminal reward. For a given u_s, X_s is defined by the uncertain differential equation. The function J(0, X₀) is the optimal value obtainable in [0, T] with the initial condition that at time 0 we are in state x₀. For any 0 < t < T, J(t, X) is the optimal reward obtainable in [t, T] under the condition that at time t we are in state X_t = x.
That is, we have

    J(t, X) = sup_{u_s∈U} [∫_t^T f(s, u_s, X_s)ds + G(X_T, T)]_sup(α)
    subject to FdX_s = [AX_s + Bu_s]ds + Du_s dC_s, and X_t = x.    (3.34)
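Before turning to the singular system, the sign-checking procedure of Remark 3.5 can be sketched on a scalar instance of the Riccati equation in Theorem 3.5: integrate backward from P(T) = 2S_T with classical fourth-order Runge–Kutta under an assumed sign of x^τ P(t)M(t)x, then verify the assumption on the computed trajectory. All parameter values below are assumptions for illustration:

```python
import math

# Scalar Riccati equation on the branch x' P(t) M x >= 0 (here: P*M >= 0):
#   dP/dt = -2Q - 2A*P - 2c*M*P + (1/2) * B^2 * P^2 / R,  P(T) = 2*S_T
Q, R, A, B, M, S_T, T = 1.0, 1.0, 0.0, 1.0, 0.5, 0.5, 1.0
c = (math.sqrt(3) / math.pi) * math.log((1 - 0.2) / 0.2)   # alpha = 0.2

def dP(P):
    return -2 * Q - 2 * A * P - 2 * c * M * P + 0.5 * B ** 2 * P ** 2 / R

# Classical fourth-order Runge-Kutta, integrating backward from t = T to 0
N = 1000
h = -T / N
P = 2 * S_T
trajectory = [P]
for _ in range(N):
    k1 = dP(P)
    k2 = dP(P + h * k1 / 2)
    k3 = dP(P + h * k2 / 2)
    k4 = dP(P + h * k3)
    P += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    trajectory.append(P)

# Consistency check of Remark 3.5: the assumed sign P(t)*M >= 0 must hold
assert all(p * M >= 0 for p in trajectory)
print(trajectory[-1])   # P(0)
```

For these numbers P(t) stays positive along the whole backward sweep, so the assumed branch is consistent and the computed P(0) is serviceable; if the check failed, Remark 3.5 says to retry under the opposite sign assumption.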

3.5 Optimistic Value Optimal Control for Singular System 61

If (F, A) is regular and impulse-free and rank(F) = r, by Lemma 2.1 there exist invertible matrices P and Q such that

    PFQ = [I_r 0; 0 0],  PAQ = [A₁ 0; 0 I_{n−r}],  PB = [B₁; B₂].

We have the following equation of optimality.

Theorem 3.6 ([6]) Assume (F, A) is regular and impulse-free, and P₂D = 0. Let J(t, x) be twice differentiable on [0, T] × R^n and u_s differentiable on [0, T]. Then we have

    −J_t(t, x) = sup_{u_t∈U} { f(t, u_t, x) + ∇x J(t, x)^τ p + (√3/π) ln((1−α)/α) |∇x J(t, x)^τ q| }    (3.35)

where p = Q[A₁x₁ + B₁u_t; −B₂u̇_t], q = Q₁D₁u_t, Q = [Q₁ Q₂], Q₁ ∈ R^{n×r}, Q₂ ∈ R^{n×(n−r)}, P = [P₁; P₂], D₁ = P₁D, P₁ ∈ R^{r×n}, P₂ ∈ R^{(n−r)×n}, x = Q[x₁; x₂], x₁ ∈ R^r, x₂ ∈ R^{n−r}.

Proof It follows from P₂D = 0 that

    PD = [P₁; P₂]D = [D₁; 0].

Let X_s = Q[X_{1,s}; X_{2,s}] for any s ∈ [t, T], and especially at time t, x = Q[x₁; x₂]. So we easily obtain

    dX_{1,s} = [A₁X_{1,s} + B₁u_s]ds + D₁u_s dC_s,
    0 = [X_{2,s} + B₂u_s]ds,

where s ∈ [t, T]. Since at any time s ∈ [t, T] we have X_{2,s} = −B₂u_s, let s = t and s = t + Δt, respectively. We get the following two equations:

    X_{2,t} = −B₂u_t,    X_{2,t+Δt} = −B₂u_{t+Δt}.

Subtracting the former equation from the latter one, we obtain

    ΔX_{2,t} = −B₂u̇(t)Δt + o(Δt),

62 3 Optimistic Value-Based Uncertain Optimal Control

where u_{t+Δt} = u_t + u̇_t Δt + o(Δt), because u_s is differentiable on [t, T]. Obviously, we know

    ΔX_{1,t} = [A₁X₁ + B₁u_t]Δt + D₁u_t ΔC_t,

where ΔC_t ∼ N(0, Δt), which means ΔC_t is a normal uncertain variable with expected value 0 and variance Δt². We have

    ΔX_t = Q[A₁X₁ + B₁u_t; −B₂u̇_t]Δt + Q₁D₁u_t ΔC_t + o(Δt).

Now denote

    p = Q[A₁X₁ + B₁u_t; −B₂u̇_t],    q = Q₁D₁u_t.

Then we have ΔX_t = pΔt + qΔC_t + o(Δt). By employing Taylor series expansion, we obtain

    J(t + Δt, x + ΔX_t) = J(t, x) + J_t(t, x)Δt + ∇x J(t, x)^τ ΔX_t + ½ J_tt(t, x)Δt²
        + ∇x J_t(t, x)^τ ΔX_t Δt + ½ ΔX_t^τ ∇xx J(t, x) ΔX_t + o(Δt).    (3.36)

Substituting Eq. (3.36) into Eq. (3.2) yields

    0 = sup_{u_t} { f(t, u_t, x)Δt + J_t(t, x)Δt + [∇x J(t, x)^τ ΔX_t + ∇x J_t(t, x)^τ ΔX_t Δt
        + ½ ΔX_t^τ ∇xx J(t, x) ΔX_t]_sup(α) + o(Δt) }.    (3.37)

Then, we know

    [∇x J(t, x)^τ ΔX_t + ∇x J_t(t, x)^τ ΔX_t Δt + ½ ΔX_t^τ ∇xx J(t, x) ΔX_t]_sup(α)
    = [∇x J(t, x)^τ (pΔt + qΔC_t + o(Δt)) + ∇x J_t(t, x)^τ (pΔt + qΔC_t + o(Δt))Δt
        + ½ (pΔt + qΔC_t + o(Δt))^τ ∇xx J(t, x)(pΔt + qΔC_t + o(Δt))]_sup(α)
    = ∇x J(t, x)^τ pΔt + [(∇x J(t, x)^τ q + ∇x J_t(t, x)^τ qΔt + p^τ ∇xx J(t, x)qΔt)ΔC_t
        + ½ q^τ ∇xx J(t, x)q ΔC_t²]_sup(α) + o(Δt)
    = ∇x J(t, x)^τ pΔt + [aΔC_t + bΔC_t²]_sup(α) + o(Δt),    (3.38)

3.5 Optimistic Value Optimal Control for Singular System 63

where a = ∇x J(t, x)^τ q + ∇x J_t(t, x)^τ qΔt + p^τ ∇xx J(t, x)qΔt, and b = ½ q^τ ∇xx J(t, x)q. Substituting Eq. (3.38) into (3.37) results in

    0 = sup_{u_t} { f(t, u_t, x)Δt + J_t(t, x)Δt + ∇x J(t, x)^τ pΔt + [aΔC_t + bΔC_t²]_sup(α) + o(Δt) }.    (3.39)

Obviously, we have

    aΔC_t − |b| ΔC_t² ≤ aΔC_t + bΔC_t² ≤ aΔC_t + |b| ΔC_t².    (3.40)

Applying Theorem 1.12, for any small enough ε > 0, we get

    [aΔC_t + |b| ΔC_t²]_sup(α) ≤ (√3/π) ln((1−α+ε)/(α−ε)) |a| Δt + 3((√3/π) ln((2−ε)/ε))² |b| Δt²,    (3.41)

    [aΔC_t − |b| ΔC_t²]_sup(α) ≥ (√3/π) ln((1−α−ε)/(α+ε)) |a| Δt − 3((√3/π) ln((2−ε)/ε))² |b| Δt².    (3.42)

Combining inequalities (3.40), (3.41), and (3.42), we obtain

    [aΔC_t + bΔC_t²]_sup(α) ≤ (√3/π) ln((1−α+ε)/(α−ε)) |a| Δt + 3((√3/π) ln((2−ε)/ε))² |b| Δt²,    (3.43)

    [aΔC_t + bΔC_t²]_sup(α) ≥ (√3/π) ln((1−α−ε)/(α+ε)) |a| Δt − 3((√3/π) ln((2−ε)/ε))² |b| Δt².    (3.44)

According to Eq. (3.39) and inequality (3.43), for Δt > 0, there exists a control u_t such that

64 3 Optimistic Value-Based Uncertain Optimal Control

    −εΔt ≤ { f(t, u_t, x)Δt + J_t(t, x)Δt + ∇x J(t, x)^τ pΔt + [aΔC_t + bΔC_t²]_sup(α) + o(Δt) }
    ≤ f(t, u_t, x)Δt + J_t(t, x)Δt + ∇x J(t, x)^τ pΔt + (√3/π) ln((1−α+ε)/(α−ε)) |a| Δt
        + 3((√3/π) ln((2−ε)/ε))² |b| Δt² + o(Δt).

Dividing both sides of this inequality by Δt, we have

    −ε ≤ f(t, u_t, x) + J_t(t, x) + ∇x J(t, x)^τ p + (√3/π) ln((1−α+ε)/(α−ε)) |a|
        + 3((√3/π) ln((2−ε)/ε))² |b| Δt + o(Δt)/Δt
    ≤ J_t(t, x) + sup_{u_t} { f(t, u_t, x) + ∇x J(t, x)^τ p + (√3/π) ln((1−α+ε)/(α−ε)) |a| }
        + 3((√3/π) ln((2−ε)/ε))² |b| Δt + o(Δt)/Δt.

Since |a| → |∇x J(t, x)^τ q| as Δt → 0, letting Δt → 0 and then ε → 0, it is easy to know

    0 ≤ J_t(t, x) + sup_{u_t} { f(t, u_t, x) + ∇x J(t, x)^τ p + (√3/π) ln((1−α)/α) |∇x J(t, x)^τ q| }.    (3.45)

On the other hand, according to Eq. (3.39) and inequality (3.44), using the similar approach, we are able to obtain

    0 ≥ J_t(t, x) + sup_{u_t} { f(t, u_t, x) + ∇x J(t, x)^τ p + (√3/π) ln((1−α)/α) |∇x J(t, x)^τ q| }.    (3.46)

By inequalities (3.45) and (3.46), we get Eq. (3.35). This completes the proof.

Remark 3.6 The solutions of the presented model (3.34) may be obtained by settling the equation of optimality (3.35). The vector p = Q[A₁x₁ + B₁u_t; −B₂u̇_t] depends on the derivative u̇(t), which is totally different from the optimal control problem of the uncertain normal system, and this brings many difficulties in solving Eq. (3.35). In some special cases, this equation of optimality may be settled to obtain an analytical solution, such as in the following example. Otherwise, we have to employ numerical methods to obtain the solution approximately.

3.5 Optimistic Value Optimal Control for Singular System 65

Example Consider the following problem:

    J(t, X) = sup_{u_t∈U_ad} [∫_t^{+∞} ρ^τ(s)X_s u_s ds]_sup(α)
    subject to FdX_s = [AX_s + Bu_s]ds + Du_s dC_s, and X_t = x,    (3.47)

where X_s ∈ R³ is the state vector, U_ad = [−1, 1], and

    F = […], A = […], B = […], D = […],

and ρ^τ(s) = [1, 0, −2]e^{−s}. Through calculating, we know det(zF − A) = (z − 1)². Obviously, det(zF − A) is not identically zero and deg(det(zF − A)) = rank(F); namely, (F, A) is regular and impulse-free. By using Lemma 2.1, through deduction we obtain two invertible matrices P and Q:

    P = […], Q = […],

such that

    PFQ = […], PAQ = […], PB = […], PD = […].

Easily, we can see

    A₁ = […], B₁ = […], B₂ = […], D₁ = […], P₂D = 0, Q₁ = […],

66 3 Optimistic Value-Based Uncertain Optimal Control

where P₂ = [0 0 1]. Denote x = [x₁, x₂, x₃]^τ and assume that x₁ + 2x₃ = 0. Because Q⁻¹ = […] and [x₁, x₂]^τ = Q⁻¹x, we obtain X₁ = [−¼x₃, x₁]^τ. Combining these results and Theorem 3.6, we know

    p = Q[A₁X₁ + B₁u_t; −B₂u̇_t] = […],    q = Q₁D₁u_t = […].

We conjecture that J(t, x) = kρ^τ(t)x, and let α = 0.2. Then

    J_t(t, x) = −kρ^τ(t)x,    ∇x J(t, x) = kρ(t),

and

    ρ^τ(t)xu_t + ∇x J(t, x)^τ p + (√3/π) ln((1−α)/α) |∇x J(t, x)^τ q|
        = (x₁ − 2x₃ − 2k)e^{−t}u_t + (√3/π) ln 4 · |ku_t| e^{−t}.

Applying Eq. (3.35), we get

    k(x₁ − 2x₃)e^{−t} = sup_{u_t∈[−1,1]} [(x₁ − 2x₃ − 2k)e^{−t}u_t + (√3/π) ln 4 · |ku_t| e^{−t}]
        = e^{−t} sup_{u_t∈[−1,1]} [(x₁ − 2x₃ − 2k)u_t + (√3/π) ln 4 · |ku_t|].    (3.48)

Dividing Eq. (3.48) by e^{−t}, we obtain

    k(x₁ − 2x₃) = sup_{u_t∈[−1,1]} [(x₁ − 2x₃ − 2k)u_t + (√3/π) ln 4 · |ku_t|].    (3.49)

3.5 Optimistic Value Optimal Control for Singular System 67

If ku_t ≥ 0, Eq. (3.49) turns out to be

    k(x₁ − 2x₃) = sup_{u_t∈[−1,1]} [(x₁ − 2x₃ − 2k)u_t + (√3/π) ln 4 · ku_t]
        = |x₁ − 2x₃ − (2 − (√3/π) ln 4)k|,    (3.50)

and then

    k²(x₁ − 2x₃)² = [x₁ − 2x₃ − (2 − (√3/π) ln 4)k]²,

namely

    (a² − b₁²)k² + 2ab₁k − a² = 0,

where a = x₁ − 2x₃, and b₁ = 2 − (√3/π) ln 4 > 0. Because ku_t ≥ 0, and by Eq. (3.50) the signs of k and a must coincide, we know

    k = a/(2b₁),     if a = ±b₁,
        0,           if a = 0,
        a/(b₁ − a),  if a < −b₁ or 0 < a < b₁,
        a/(a + b₁),  if −b₁ < a < 0 or a > b₁.

The optimal control is u_t* = sign(a − b₁k). If ku_t < 0, Eq. (3.49) turns out to be

    k(x₁ − 2x₃) = sup_{u_t∈[−1,1]} [(x₁ − 2x₃ − 2k)u_t − (√3/π) ln 4 · ku_t]
        = |x₁ − 2x₃ − (2 + (√3/π) ln 4)k|.

Using the similar method, we are able to obtain

    k = a/(2b₂),     if a = ±b₂,
        a/(a + b₂),  if −b₂ < a < 0,
        a/(b₂ − a),  if 0 < a < b₂,

68 3 Optimistic Value-Based Uncertain Optimal Control

where a = x₁ − 2x₃, and b₂ = 2 + (√3/π) ln 4 > 0. The optimal control is u_t* = sign(a − b₂k). When b₁ < a < b₂, we know

    b₁ − b₂ + 2a > 3b₁ − b₂ = 4(1 − (√3/π) ln 4) > 0,

and obviously

    a/(b₂ − a) − a/(a + b₁) = a(b₁ − b₂ + 2a)/((b₂ − a)(b₁ + a)) > 0.

When −b₂ < a < −b₁, similarly we get that

    a/(a + b₂) − a/(b₁ − a) = a(b₁ − b₂ − 2a)/((b₂ + a)(b₁ − a)) < 0.

Summarily, the optimal control of the problem (3.47) is

    u_t* = sign(a − b₁k),  if |a| = b₁, a = 0, or |a| > b₂,
           sign(a − b₂k),  if 0 < |a| < b₁, or b₁ < |a| ≤ b₂.

References

1. Sheng L, Zhu Y (2013) Optimistic value model of uncertain optimal control. Int J Uncertain Fuzziness Knowl-Based Syst 21(Suppl 1)
2. Hurwicz L (1951) Some specification problems and application to econometric models. Econometrica 19
3. Sheng L, Zhu Y, Hamalainen T (2013) An uncertain optimal control model with Hurwicz criterion. Appl Math Comput 224
4. Sethi S, Thompson G (2000) Optimal control theory: applications to management science and economics, 2nd edn. Springer
5. Li B, Zhu Y (2018) Parametric optimal control of uncertain systems under optimistic value criterion. Eng Optim 50(1)
6. Shu Y, Zhu Y (2017) Optimistic value based optimal control for uncertain linear singular systems and application to dynamic input-output model. ISA Trans 71(Part 2):235–251

Chapter 4
Optimal Control for Multistage Uncertain Systems

In this chapter, we will investigate the following expected value optimal control problem for a multistage uncertain system:
$$\begin{cases} \min\limits_{u(i)\in U_i,\ 0\le i\le N} E\Big[\sum_{j=0}^{N} f(x(j), u(j), j)\Big]\\ \text{subject to:}\\ x(j+1) = \phi(x(j), u(j), j) + \sigma(x(j), u(j), j)\,C_{j+1},\quad j = 0, 1, 2, \ldots, N-1,\\ x(0) = x_0, \end{cases} \qquad (4.1)$$
where $x(j)$ is the state of the system at stage $j$, $u(j)$ the control variable at stage $j$, $U_j$ the constraint domain of the control variable $u(j)$ for $j = 0, 1, 2, \ldots, N$, $f$ the objective function, $\phi$ and $\sigma$ two functions, and $x_0$ the initial state of the system. In addition, $C_1, C_2, \ldots, C_N$ are independent uncertain variables.

4.1 Recurrence Equation

For any $0 < k < N$, let $J(x_k, k)$ be the expected optimal reward obtainable in $[k, N]$ under the condition that at stage $k$ the system is in state $x(k) = x_k$. That is, we have
$$\begin{cases} J(x_k, k) \triangleq \min\limits_{u(i)\in U_i,\ k\le i\le N} E\Big[\sum_{j=k}^{N} f(x(j), u(j), j)\Big]\\ \text{subject to:}\\ x(j+1) = \phi(x(j), u(j), j) + \sigma(x(j), u(j), j)\,C_{j+1},\quad j = k, k+1, \ldots, N-1,\\ x(k) = x_k. \end{cases}$$

© Springer Nature Singapore Pte Ltd — Y. Zhu, Uncertain Optimal Control, Springer Uncertainty Research (p. 69)

Theorem 4.1 We have the following recurrence equations:
$$J(x_N, N) = \min_{u(N)\in U_N} f(x_N, u(N), N), \qquad (4.2)$$
$$J(x_k, k) = \min_{u(k)\in U_k} E[f(x_k, u(k), k) + J(x(k+1), k+1)] \qquad (4.3)$$
for $k = N-1, N-2, \ldots, 1, 0$.

Proof It is obvious that $J(x_N, N) = \min_{u(N)\in U_N} f(x_N, u(N), N)$. For any $k = N-1, N-2, \ldots, 1, 0$, we have
$$\begin{aligned} J(x_k, k) &= \min_{u(i)\in U_i,\ k\le i\le N} E\Big[\sum_{j=k}^{N} f(x(j), u(j), j)\Big]\\ &= \min_{u(i)\in U_i,\ k\le i\le N} \Big\{E[f(x(k), u(k), k)] + E\Big[\sum_{j=k+1}^{N} f(x(j), u(j), j)\Big]\Big\}\\ &\ge \min_{u(i)\in U_i,\ k\le i\le N} \Big\{E[f(x_k, u(k), k)] + \min_{u(i)\in U_i,\ k+1\le i\le N} E\Big[\sum_{j=k+1}^{N} f(x(j), u(j), j)\Big]\Big\}\\ &= \min_{u(k)\in U_k} E[f(x_k, u(k), k) + J(x(k+1), k+1)]. \end{aligned}$$
In addition, for any $u(i)$, $k \le i \le N$, we have
$$J(x_k, k) \le E\Big[\sum_{j=k}^{N} f(x(j), u(j), j)\Big] = E[f(x_k, u(k), k)] + E\Big[\sum_{j=k+1}^{N} f(x(j), u(j), j)\Big].$$
Since $J(x_k, k)$ is independent of $u(i)$ for $k+1 \le i \le N$, we have
$$J(x_k, k) \le E[f(x_k, u(k), k)] + \min_{u(i)\in U_i,\ k+1\le i\le N} E\Big[\sum_{j=k+1}^{N} f(x(j), u(j), j)\Big] = E[f(x_k, u(k), k) + J(x(k+1), k+1)].$$
Taking the minimum over $u(k)$ in the above inequality yields
$$J(x_k, k) \le \min_{u(k)\in U_k} E[f(x_k, u(k), k) + J(x(k+1), k+1)].$$
The recurrence equation (4.3) is proved.
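To make the recurrence concrete, the following is a minimal numerical sketch of (4.2)–(4.3) for a one-dimensional system. The control set is discretized, and each expected value is approximated by averaging the integrand over equally spaced quantiles of the inverse distribution of $C_{j+1} \sim \mathcal{L}(-1,1)$ — a crude stand-in for the uncertain simulation used later in this chapter. All problem data (`f`, `phi`, `sigma`, the grids) are illustrative assumptions, not the book's:

```python
# Backward use of the recurrence (4.2)-(4.3) for a 1-D multistage system.
# All problem data below are illustrative stand-ins, not from the book.

N = 2                                       # number of stages
U = [-1.0, -0.5, 0.0, 0.5, 1.0]             # discretized control set U_k
ALPHAS = [(i + 0.5) / 9 for i in range(9)]  # quantile levels used for E[.]

def f(x, u, j):        # stage cost (illustrative)
    return x * x + 0.1 * u * u

def phi(x, u, j):      # drift term (illustrative)
    return 0.8 * x + 0.2 * u

def sigma(x, u, j):    # noise coefficient (illustrative)
    return 0.05

def inv_linear(alpha):                      # inverse distribution of C ~ L(-1, 1)
    return 2.0 * alpha - 1.0

def J(x, k):
    """Expected optimal reward J(x, k): terminal rule (4.2), recursion (4.3)."""
    if k == N:
        return min(f(x, u, N) for u in U)
    def expected(u):                        # approximate E[f + J(x(k+1), k+1)]
        total = 0.0
        for a in ALPHAS:
            x_next = phi(x, u, k) + sigma(x, u, k) * inv_linear(a)
            total += f(x, u, k) + J(x_next, k + 1)
        return total / len(ALPHAS)
    return min(expected(u) for u in U)
```

Calling `J(x0, 0)` returns the approximate optimal expected cost from initial state `x0`; the minimizing `u` at each call is the corresponding approximate optimal control.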

Note that the recurrence equations (4.2) and (4.3) may be reformulated as
$$J(x_N, N) = \min_{u(N)\in U_N} f(x_N, u(N), N), \qquad (4.4)$$
$$J(x_k, k) = \min_{u(k)\in U_k} E[f(x_k, u(k), k) + J(\phi(x_k, u(k), k) + \sigma(x_k, u(k), k)\,C_{k+1},\, k+1)] \qquad (4.5)$$
for $k = N-1, N-2, \ldots, 1, 0$. Theorem 4.1 tells us that the solution of problem (4.1) can be derived from the solutions of the simpler problems (4.2) and (4.3) step by step from the last stage back to the initial stage, that is, in reverse order.

4.2 Linear Quadratic Model

By using the recurrence equations (4.2) and (4.3), we will obtain the exact solution of the following uncertain optimal control problem with a quadratic objective function subject to an uncertain linear system:
$$\begin{cases} \min\limits_{u(i),\ 0\le i\le N} E\Big[\sum_{j=0}^{N} A_j x^2(j) + B_j u^2(j)\Big]\\ \text{subject to:}\\ x(j+1) = a_j x(j) + b_j u(j) + \sigma_{j+1} C_{j+1},\quad j = 0, 1, 2, \ldots, N-1,\\ x(0) = x_0, \end{cases} \qquad (4.6)$$
where $A_j \ge 0$, $B_j > 0$, and $a_j, b_j, \sigma_j \ne 0$ are constants for all $j$. Generally, $|a_j x(j) + b_j u(j)| \ge |\sigma_{j+1}|$ for any $j$. In addition, $C_1, C_2, \ldots, C_N$ are ordinary linear uncertain variables $\mathcal{L}(-1, 1)$ with the same distribution
$$\Phi(x) = \begin{cases} 0, & \text{if } x \le -1,\\ (x+1)/2, & \text{if } -1 \le x \le 1,\\ 1, & \text{if } x \ge 1. \end{cases}$$
Denote the optimal control of the above problem by $u^*(0), u^*(1), \ldots, u^*(N)$. By the recurrence equation (4.4), we have
$$J(x_N, N) = \min_{u(N)} \{A_N x_N^2 + B_N u^2(N)\} = A_N x_N^2,$$
where $u^*(N) = 0$. For $k = N-1$, we have

$$\begin{aligned} J(x_{N-1}, N-1) &= \min_{u(N-1)} E[A_{N-1}x_{N-1}^2 + B_{N-1}u^2(N-1) + J(x(N), N)]\\ &= \min_{u(N-1)} \{A_{N-1}x_{N-1}^2 + B_{N-1}u^2(N-1) + A_N E[x^2(N)]\}\\ &= \min_{u(N-1)} \{A_{N-1}x_{N-1}^2 + B_{N-1}u^2(N-1) + A_N E[(a_{N-1}x_{N-1} + b_{N-1}u(N-1) + \sigma_N C_N)^2]\}\\ &= \min_{u(N-1)} \{A_{N-1}x_{N-1}^2 + B_{N-1}u^2(N-1) + A_N (a_{N-1}x_{N-1} + b_{N-1}u(N-1))^2\\ &\qquad + A_N E[2\sigma_N(a_{N-1}x_{N-1} + b_{N-1}u(N-1))C_N + \sigma_N^2 C_N^2]\}. \end{aligned} \qquad (4.7)$$
Denote $d = 2\sigma_N(a_{N-1}x_{N-1} + b_{N-1}u(N-1))$ and $b = d/\sigma_N^2$, whose absolute value is not less than 2. It follows from Example 1.6 that
$$E[2\sigma_N(a_{N-1}x_{N-1} + b_{N-1}u(N-1))C_N + \sigma_N^2 C_N^2] = \sigma_N^2 E[bC_N + C_N^2] = \frac{1}{3}\sigma_N^2. \qquad (4.8)$$
Substituting (4.8) into (4.7) yields
$$J(x_{N-1}, N-1) = \min_{u(N-1)} \Big\{A_{N-1}x_{N-1}^2 + B_{N-1}u^2(N-1) + A_N(a_{N-1}x_{N-1} + b_{N-1}u(N-1))^2 + \frac{1}{3}\sigma_N^2 A_N\Big\}.$$
Let
$$H = A_{N-1}x_{N-1}^2 + B_{N-1}u^2(N-1) + A_N(a_{N-1}x_{N-1} + b_{N-1}u(N-1))^2 + \frac{1}{3}\sigma_N^2 A_N.$$
It follows from
$$\frac{\partial H}{\partial u(N-1)} = 2B_{N-1}u(N-1) + 2A_N b_{N-1}[a_{N-1}x_{N-1} + b_{N-1}u(N-1)] = 0$$
that the optimal control is
$$u^*(N-1) = -\frac{a_{N-1}b_{N-1}A_N}{B_{N-1} + b_{N-1}^2 A_N}\, x_{N-1},$$

which is the minimum point of the function $H$ because
$$\frac{\partial^2 H}{\partial u^2(N-1)} = 2B_{N-1} + 2A_N b_{N-1}^2 > 0.$$
Hence,
$$\begin{aligned} J(x_{N-1}, N-1) &= A_{N-1}x_{N-1}^2 + \frac{a_{N-1}^2 b_{N-1}^2 A_N^2 B_{N-1}}{(B_{N-1} + b_{N-1}^2 A_N)^2}\, x_{N-1}^2 + A_N\Big(a_{N-1} - \frac{a_{N-1}b_{N-1}^2 A_N}{B_{N-1} + b_{N-1}^2 A_N}\Big)^2 x_{N-1}^2 + \frac{1}{3}\sigma_N^2 A_N\\ &= \Big(A_{N-1} + \frac{a_{N-1}^2 b_{N-1}^2 B_{N-1} A_N^2}{(B_{N-1} + b_{N-1}^2 A_N)^2} + \frac{a_{N-1}^2 B_{N-1}^2 A_N}{(B_{N-1} + b_{N-1}^2 A_N)^2}\Big) x_{N-1}^2 + \frac{1}{3}\sigma_N^2 A_N. \end{aligned}$$
Let
$$Q_{N-1} = A_{N-1} + \frac{a_{N-1}^2 b_{N-1}^2 B_{N-1} A_N^2}{(B_{N-1} + b_{N-1}^2 A_N)^2} + \frac{a_{N-1}^2 B_{N-1}^2 A_N}{(B_{N-1} + b_{N-1}^2 A_N)^2}.$$
We have
$$J(x_{N-1}, N-1) = Q_{N-1}x_{N-1}^2 + \frac{1}{3}\sigma_N^2 A_N. \qquad (4.9)$$
For $k = N-2$, we have
$$\begin{aligned} J(x_{N-2}, N-2) &= \min_{u(N-2)} E[A_{N-2}x_{N-2}^2 + B_{N-2}u^2(N-2) + J(x(N-1), N-1)]\\ &= \min_{u(N-2)} \Big\{A_{N-2}x_{N-2}^2 + B_{N-2}u^2(N-2) + E[Q_{N-1}x^2(N-1)] + \frac{1}{3}\sigma_N^2 A_N\Big\}\\ &= \min_{u(N-2)} \Big\{A_{N-2}x_{N-2}^2 + B_{N-2}u^2(N-2) + Q_{N-1}E[(a_{N-2}x_{N-2} + b_{N-2}u(N-2) + \sigma_{N-1}C_{N-1})^2] + \frac{1}{3}\sigma_N^2 A_N\Big\}\\ &= \min_{u(N-2)} \Big\{A_{N-2}x_{N-2}^2 + B_{N-2}u^2(N-2) + Q_{N-1}(a_{N-2}x_{N-2} + b_{N-2}u(N-2))^2\\ &\qquad + Q_{N-1}E[2\sigma_{N-1}(a_{N-2}x_{N-2} + b_{N-2}u(N-2))C_{N-1} + \sigma_{N-1}^2 C_{N-1}^2] + \frac{1}{3}\sigma_N^2 A_N\Big\}. \end{aligned}$$

It follows from a computation similar to (4.8) that
$$E[2\sigma_{N-1}(a_{N-2}x_{N-2} + b_{N-2}u(N-2))C_{N-1} + \sigma_{N-1}^2 C_{N-1}^2] = \frac{1}{3}\sigma_{N-1}^2.$$
By a computation similar to the case $k = N-1$, we get
$$\begin{aligned} J(x_{N-2}, N-2) &= \min_{u(N-2)} \Big\{A_{N-2}x_{N-2}^2 + B_{N-2}u^2(N-2) + Q_{N-1}(a_{N-2}x_{N-2} + b_{N-2}u(N-2))^2 + \frac{1}{3}(\sigma_{N-1}^2 Q_{N-1} + \sigma_N^2 A_N)\Big\}\\ &= \Big(A_{N-2} + \frac{a_{N-2}^2 b_{N-2}^2 B_{N-2} Q_{N-1}^2}{(B_{N-2} + b_{N-2}^2 Q_{N-1})^2} + \frac{a_{N-2}^2 B_{N-2}^2 Q_{N-1}}{(B_{N-2} + b_{N-2}^2 Q_{N-1})^2}\Big) x_{N-2}^2 + \frac{1}{3}(\sigma_{N-1}^2 Q_{N-1} + \sigma_N^2 A_N) \end{aligned}$$
with the optimal control
$$u^*(N-2) = -\frac{a_{N-2}b_{N-2}Q_{N-1}}{B_{N-2} + b_{N-2}^2 Q_{N-1}}\, x_{N-2}.$$
Let
$$Q_{N-2} = A_{N-2} + \frac{a_{N-2}^2 b_{N-2}^2 B_{N-2} Q_{N-1}^2}{(B_{N-2} + b_{N-2}^2 Q_{N-1})^2} + \frac{a_{N-2}^2 B_{N-2}^2 Q_{N-1}}{(B_{N-2} + b_{N-2}^2 Q_{N-1})^2}.$$
We have
$$J(x_{N-2}, N-2) = Q_{N-2}x_{N-2}^2 + \frac{1}{3}(\sigma_{N-1}^2 Q_{N-1} + \sigma_N^2 A_N). \qquad (4.10)$$
By induction, we can obtain the optimal control of problem (4.6) as follows:
$$u^*(N) = 0, \qquad u^*(k) = -\frac{a_k b_k Q_{k+1}}{B_k + b_k^2 Q_{k+1}}\, x_k,$$
where $Q_N = A_N$,
$$Q_k = A_k + \frac{a_k^2 b_k^2 B_k Q_{k+1}^2}{(B_k + b_k^2 Q_{k+1})^2} + \frac{a_k^2 B_k^2 Q_{k+1}}{(B_k + b_k^2 Q_{k+1})^2},$$

and the optimal values are
$$J(x_N, N) = A_N x_N^2, \qquad J(x_k, k) = Q_k x_k^2 + \frac{1}{3}\sum_{j=k+1}^{N} \sigma_j^2 Q_j$$
for $k = N-1, N-2, \ldots, 1, 0$.

4.3 General Case

In the previous section, we studied an optimal control problem with a quadratic objective function subject to an uncertain linear system. For that problem, we can obtain the exact feedback optimal controls of the state at all stages. If the system is nonlinear, or the objective function is not quadratic, or the uncertain variables $C_j$ are not linear, the optimal controls may not be expressible exactly in terms of the state of the system at all stages. In such cases, we have to consider numerical solutions of the problem.

For the uncertain optimal control problem (4.1), assume that the state $x(k)$ of the system lies in $[l_k^-, l_k^+]$, and the control variable $u(k)$ is constrained by the set $U_k$ for $k = 0, 1, \ldots, N$. For each $k$, divide the interval $[l_k^-, l_k^+]$ into $n_k$ subintervals:
$$l_k^- = x(k)_0 < x(k)_1 < \cdots < x(k)_{n_k} = l_k^+.$$
We will numerically compute the optimal controls in an offline way for all states $x(k)_i$ ($i = 0, 1, \ldots, n_k$, $k = 0, 1, \ldots, N$). Based on these data, we can obtain the optimal controls in an online way for any initial state $x_0$ by an interpolation method.

In practice, for simplicity, it is reasonable to assume that the range of each state variable $x(k)$ is a finite interval, even if it may only be a subset of a finite interval. These intervals are set according to the background of the problem. To balance the accuracy of the approximation by interpolation against the computational cost, the number of grid states in the range $[l_k^-, l_k^+]$ should be chosen properly.

Next, we will establish two methods to produce the optimal controls for all states $x(k)_i$ ($i = 0, 1, \ldots, n_k$, $k = 0, 1, \ldots, N$): the hybrid intelligent algorithm and the finite search method.

4.3.1 Hybrid Intelligent Algorithm

By the recurrence equations (4.4) and (4.5), we first approximate the value $J(x(N), N)$.
For each x(n) i (i = 0, 1,...,n N ), solve the following optimization J(x(N) i, N) = min f (x(n) i, u(n), N) u(n) U N

by the genetic algorithm to get the optimal control $u^*(N)_i$ and the optimal objective value $J(x(N)_i, N)$. Then for each $x(N-1)_i$ ($i = 0, 1, \ldots, n_{N-1}$), solve the following optimization
$$J(x(N-1)_i, N-1) = \min_{u(N-1)\in U_{N-1}} E[f(x(N-1)_i, u(N-1), N-1) + J(x(N), N)],$$
where $x(N) = \phi(x(N-1)_i, u(N-1), N-1) + \sigma(x(N-1)_i, u(N-1), N-1)\,C_N$, by the hybrid intelligent algorithm (integrating uncertain simulation, a neural network, and the genetic algorithm) to get the optimal control $u^*(N-1)_i$ and the optimal objective value $J(x(N-1)_i, N-1)$. Note that the optimal control $u^*(N-1)_i$ is selected in $U_{N-1}$ and in the set of $u(N-1)$ such that $x(N) = \phi(x(N-1)_i, u(N-1), N-1) + \sigma(x(N-1)_i, u(N-1), N-1)\,C_N$ lies in $[l_N^-, l_N^+]$. The value of $J(x(N), N)$ may be calculated by interpolation based on the values $J(x(N)_i, N)$ ($i = 0, 1, \ldots, n_N$). In addition, the expected value $E[f(x(N-1)_i, u(N-1), N-1) + J(x(N), N)]$ may be approximated by the uncertain simulation established earlier.

By induction, we can solve the following optimization
$$J(x(k)_i, k) = \min_{u(k)\in U_k} E[f(x(k)_i, u(k), k) + J(x(k+1), k+1)]$$
by the hybrid intelligent algorithm to get the optimal control $u^*(k)_i$ and the optimal objective value $J(x(k)_i, k)$ for $k = N-2, N-3, \ldots, 1, 0$.

The method of producing a list of data on the optimal controls and optimal objective values for all states $x(k)_i$ ($i = 0, 1, \ldots, n_k$, $k = 0, 1, \ldots, N$) by the hybrid intelligent algorithm is summarized as Algorithm 4.1.

4.3.2 Finite Search Method

At every stage $k$, the constraint domain $U_k$ of the control variable $u(k)$ is assumed to be an interval $[q_k^-, q_k^+]$. Evenly divide the interval $[q_k^-, q_k^+]$ into $m_k$ subintervals:
$$q_k^- = u(k)_0 < u(k)_1 < \cdots < u(k)_{m_k-1} < u(k)_{m_k} = q_k^+.$$
The approximate optimal control $u^*(k)_i$ is searched for in the finite set $\{u(k)_j \mid 0 \le j \le m_k\}$. That is,
$$E[f(x(k)_i, u^*(k)_i, k) + J(x(k+1), k+1)] = \min_{0\le j\le m_k} E[f(x(k)_i, u(k)_j, k) + J(x(k+1), k+1)], \qquad (4.11)$$
where $x(k+1) = \phi(x(k)_i, u(k)_j, k) + \sigma(x(k)_i, u(k)_j, k)\,C_{k+1}$.
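The grid-based tabulation just described can be sketched as follows. This is a simplified stand-in: the expected value in (4.11) is approximated by a quantile average over $C_{k+1} \sim \mathcal{L}(-1,1)$ rather than by uncertain simulation, $J(x(k+1), k+1)$ is obtained by linear interpolation on the state grid, and all numeric problem data (`f`, `phi`, `SIG`, grid sizes) are illustrative assumptions:

```python
# Sketch of the finite search tabulation: J_table[k][i] = J(x(k)_i, k) and
# u_table[k][i] = u*(k)_i, computed backward from stage N with interpolation
# of J at x(k+1). All numeric problem data are illustrative stand-ins.

def make_grid(lo, hi, n):
    return [lo + (hi - lo) * i / n for i in range(n + 1)]

XS = make_grid(-0.5, 0.5, 20)      # state grid x(k)_0 < ... < x(k)_{n_k}
US = make_grid(-1.0, 1.0, 50)      # control grid u(k)_0 < ... < u(k)_{m_k}
ALPHAS = [(i + 0.5) / 20 for i in range(20)]
N = 3

def f(x, u, k):   return 2 * x ** 4 + 0.01 * u * u   # stage cost (illustrative)
def phi(x, u, k): return 0.8 * x + 0.09 * u          # drift (illustrative)
SIG = 0.03                                           # noise coefficient (assumed)

def inv_linear(a):                 # inverse distribution of C ~ L(-1, 1)
    return 2 * a - 1

def interp(xs, ys, x):
    """Piecewise-linear interpolation, clamped at the grid ends."""
    if x <= xs[0]:  return ys[0]
    if x >= xs[-1]: return ys[-1]
    for i in range(len(xs) - 1):
        if x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + t * (ys[i + 1] - ys[i])

J_table = [[0.0] * len(XS) for _ in range(N + 1)]
u_table = [[0.0] * len(XS) for _ in range(N + 1)]
for i, x in enumerate(XS):                           # stage N: no dynamics
    J_table[N][i], u_table[N][i] = min((f(x, u, N), u) for u in US)
for k in range(N - 1, -1, -1):                       # backward pass
    for i, x in enumerate(XS):
        best = None
        for u in US:
            ev = sum(f(x, u, k) + interp(XS, J_table[k + 1],
                                         phi(x, u, k) + SIG * inv_linear(a))
                     for a in ALPHAS) / len(ALPHAS)  # approximate E[.] in (4.11)
            if best is None or ev < best[0]:
                best = (ev, u)
        J_table[k][i], u_table[k][i] = best
```

The resulting tables play the role of the data lists produced by Algorithms 4.1 and 4.2 below; the online pass then only interpolates in them.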

Algorithm 4.1 (Data production by the hybrid intelligent algorithm)
Step 1. Evenly divide $[l_k^-, l_k^+]$ to generate states $x(k)_i$ as $l_k^- = x(k)_0 < x(k)_1 < \cdots < x(k)_{n_k} = l_k^+$ for $k = 0, 1, \ldots, N$.
Step 2. Solve
$$J(x(N)_i, N) = \min_{u(N)\in U_N} f(x(N)_i, u(N), N)$$
by the genetic algorithm to produce $u^*(N)_i$ and $J(x(N)_i, N)$ for $i = 0, 1, \ldots, n_N$.
Step 3. For $k = N-1$ down to 0, perform the next two steps.
Step 4. Approximate the function
$$u(k) \mapsto E[f(x(k)_i, u(k), k) + J(x(k+1), k+1)],$$
where $x(k+1) = \phi(x(k)_i, u(k), k) + \sigma(x(k)_i, u(k), k)\,C_{k+1}$.
Step 5. Solve
$$J(x(k)_i, k) = \min_{u(k)\in U_k} E[f(x(k)_i, u(k), k) + J(x(k+1), k+1)]$$
by the hybrid intelligent algorithm to produce $u^*(k)_i$ and $J(x(k)_i, k)$ for $i = 0, 1, \ldots, n_k$.

The method of producing a list of data on the optimal controls and optimal objective values for all states $x(k)_i$ ($i = 0, 1, \ldots, n_k$, $k = 0, 1, \ldots, N$) by the finite search method is summarized as the following Algorithm 4.2.

Remark 4.1 Generally speaking, the optimal controls $u^*(k)_i$ obtained by Algorithm 4.2 are not finer than those obtained by Algorithm 4.1. But the execution time of Algorithm 4.2 is much less than that of Algorithm 4.1, as will be seen in the next numerical example.

4.3.3 Optimal Controls for Any Initial State

Now, if an initial state $x(0)$ is given, we may perform the following Algorithm 4.3 online to get a state trajectory of the system, the optimal controls, and the optimal objective value based on the data produced by Algorithm 4.1 or 4.2.

Remark 4.2 The data on the optimal controls and optimal objective values at all given states are produced from the recurrence equations step by step from the last stage back to the initial stage, in reverse order, whereas the optimal controls and optimal objective value for any initial state are obtained, based on those data, step by step from the initial stage forward to the last stage.

Algorithm 4.2 (Data production by the finite search method)
Step 1. Evenly divide $[q_k^-, q_k^+]$ to generate controls $u(k)_j$ as $q_k^- = u(k)_0 < u(k)_1 < \cdots < u(k)_{m_k-1} < u(k)_{m_k} = q_k^+$ for $k = 0, 1, \ldots, N$.
Step 2. Find $u^*(N)_i \in \{u(N)_j \mid 0 \le j \le m_N\}$ such that
$$J(x(N)_i, N) = f(x(N)_i, u^*(N)_i, N) = \min_{0\le j\le m_N} f(x(N)_i, u(N)_j, N)$$
for $i = 0, 1, \ldots, n_N$.
Step 3. For $k = N-1$ down to 0, perform the next two steps.
Step 4. Approximate the value
$$E[f(x(k)_i, u(k)_j, k) + J(x(k+1), k+1)],$$
where $x(k+1) = \phi(x(k)_i, u(k)_j, k) + \sigma(x(k)_i, u(k)_j, k)\,C_{k+1}$.
Step 5. Find $u^*(k)_i \in \{u(k)_j \mid 0 \le j \le m_k\}$ such that (4.11) holds, and set $J(x(k)_i, k) = E[f(x(k)_i, u^*(k)_i, k) + J(x(k+1), k+1)]$ for $i = 0, 1, \ldots, n_k$.

Algorithm 4.3 (Online optimal control)
Step 1. For the initial state $x(0)$, if $x(0)_i \le x(0) \le x(0)_{i+1}$, compute $u^*(0)$ and $J(x(0), 0)$ by interpolation:
$$u^*(0) = u^*(0)_i + \frac{u^*(0)_{i+1} - u^*(0)_i}{x(0)_{i+1} - x(0)_i}\,(x(0) - x(0)_i),$$
$$J(x(0), 0) = J(x(0)_i, 0) + \frac{J(x(0)_{i+1}, 0) - J(x(0)_i, 0)}{x(0)_{i+1} - x(0)_i}\,(x(0) - x(0)_i).$$
Step 2. For $k = 1$ to $N$, perform the next two steps.
Step 3. Randomly generate a number $r \in [0, 1]$ and produce a number $c(k)$ according to the distribution function $\Phi_k(x)$ of the uncertain variable $C_k$ such that $\Phi_k(c(k)) = r$. Set
$$x(k) = \phi(x(k-1), u^*(k-1), k-1) + \sigma(x(k-1), u^*(k-1), k-1)\,c(k).$$
Step 4. If $x(k)_i \le x(k) \le x(k)_{i+1}$, compute $u^*(k)$ by interpolation:
$$u^*(k) = u^*(k)_i + \frac{u^*(k)_{i+1} - u^*(k)_i}{x(k)_{i+1} - x(k)_i}\,(x(k) - x(k)_i).$$
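The online pass of Algorithm 4.3 can be sketched as follows. The grid `XS` and table `u_table` are assumed to have been produced offline (e.g. by Algorithm 4.1 or 4.2), and the dynamics `phi`, noise scale `sig`, and the normal uncertain distribution (4.13) used for `inv_normal` are illustrative assumptions:

```python
# Sketch of Algorithm 4.3: interpolate u*(k) from offline tables, then sample
# c(k) = Phi_k^{-1}(r) and propagate the state. Problem data are illustrative.
import math
import random

def interp(xs, ys, x):                     # linear interpolation (Steps 1 and 4)
    if x <= xs[0]:  return ys[0]
    if x >= xs[-1]: return ys[-1]
    i = max(j for j in range(len(xs) - 1) if xs[j] <= x)
    t = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + t * (ys[i + 1] - ys[i])

def inv_normal(r):
    """Inverse of the normal uncertain distribution Phi(x) = (1+exp(-pi x/sqrt 3))^-1."""
    return math.sqrt(3) / math.pi * math.log(r / (1 - r))

def run_online(x0, XS, u_table, phi, sig, N, rng=random.Random(0)):
    xs_path, us_path = [x0], []
    x = x0
    for k in range(N):
        u = interp(XS, u_table[k], x)               # Step 1 / Step 4
        r = min(max(rng.random(), 1e-9), 1 - 1e-9)  # Step 3: r in (0, 1)
        x = phi(x, u, k) + sig * inv_normal(r)      # realized next state
        xs_path.append(x)
        us_path.append(u)
    return xs_path, us_path
```

Note that the offline tables are built backward in stage, while this pass runs forward from the given initial state, exactly as Remark 4.2 describes.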

4.4 Example

Consider the following example:
$$\begin{cases} \min\limits_{u(i),\ 0\le i\le 10} E\Big[\sum_{j=0}^{10} A x^4(j) + B u^2(j)\Big]\\ \text{subject to:}\\ x(j+1) = a x(j) + b u(j) + \sigma C_{j+1},\quad j = 0, 1, 2, \ldots, 9,\\ x(0) = x_0, \end{cases} \qquad (4.12)$$
where $A = 2$, $B = 0.01$, $a = 0.8$, $b = 0.09$, $\sigma \ne 0$ is a constant, and $-0.5 \le x(j) \le 0.5$, $-1 \le u(j) \le 1$ for $0 \le j \le 10$. In addition, the uncertain variables $C_1, C_2, \ldots, C_{10}$ are independent and normally distributed with expected value 0 and variance 1, whose distribution function is

Table 4.1 Data produced by the hybrid intelligent algorithm (Algorithm 4.1): optimal objective values $J(\cdot, k)$ and optimal controls $u^*(k)$ for stages $k = 10, 9, \ldots, 0$ (numerical entries not reproduced here)

$$\Phi(x) = \Big(1 + \exp\Big(-\frac{\pi x}{\sqrt{3}}\Big)\Big)^{-1}, \quad x \in \mathbb{R}. \qquad (4.13)$$
The interval $[-0.5, 0.5]$ of the state $x(k)$ is evenly divided into 21 states $x(k)_i = -0.5 + 0.05i$ ($i = 0, 1, \ldots, 20$) for $k = 0, 1, \ldots, 10$. Algorithm 4.1 (with 4000 cycles in the uncertain simulation, 2000 training data in the neural network, and 600 generations in the genetic algorithm) is employed to produce a list of data as shown in Tables 4.1, 4.2, and 4.3. The interval $[-1, 1]$ of the control $u(k)$ is evenly divided into 1001 controls $u(k)_j = -1 + 0.002j$ ($j = 0, 1, \ldots, 1000$) for $k = 0, 1, \ldots, 10$. Algorithm 4.2 is employed to produce a list of data as shown in Tables 4.4, 4.5, and 4.6.

In Tables 4.1, 4.2, 4.3, 4.5, and 4.6, the data in the first rows are the 21 states in the range $[-0.5, 0.5]$. In each following row are reported the optimal objective values (topmost number) and the optimal controls with respect to the corresponding states at the stage indicated by the leftmost number. Note that the optimal controls $u^*(10)$ at stage 10 are all zero because each of them is the minimal solution of a problem such as

Table 4.2 Data produced by the hybrid intelligent algorithm (continued; numerical entries not reproduced here)

Table 4.3 Data produced by the hybrid intelligent algorithm (continued; numerical entries not reproduced here)

$$\min_{u(10)\in U_{10}} f(x(10)_i, u(10), 10) = \min_{-1\le u(10)\le 1} \{2 x^4(10)_i + 0.01 u^2(10)\}.$$
If we take six initial states $x_0 = 0.45$, $0.65$, $0.126$, $0.09$, $0.275$, and $0.488$, performing Algorithm 4.3 for every initial state yields the optimal objective value and optimal controls, which are listed in Tables 4.7 and 4.8. The data in the second rows of these two tables are the optimal objective values of the problem for the initial states given in the first rows. In the third rows are the optimal controls at the initial stage. In each following row are reported the optimal controls (topmost number) and the realized states at the corresponding stage.

All computations are processed with C programming on a PC (Intel(R) Core(TM) 2 Duo CPU P8600 @ 2.40 GHz). Note that performing Algorithm 4.3 is very quick (less than one second), but performing Algorithm 4.1 or 4.2 is time-consuming. Performing Step 5 in Algorithm 4.1 each time needs about 175 seconds, and completing the data in Tables 4.1, 4.2, and 4.3 by Algorithm 4.1 therefore takes far longer in total. However, performing Step 5 in Algorithm 4.2 each time needs about 75 seconds, and completing the data in Tables 4.4, 4.5, and 4.6 by Algorithm 4.2

needs a much smaller total time. Therefore, the execution time of Algorithm 4.2 is much less than that of Algorithm 4.1.

Table 4.4 Data produced by the finite search method (Algorithm 4.2; numerical entries not reproduced here)

Generally, if the length of the state variable range $[l_k^-, l_k^+]$ is larger, or the precision of the approximation by interpolation (Algorithm 4.3) is required to improve, the number of grid states in the range $[l_k^-, l_k^+]$ must be increased, and this increases the execution time. In this example, each additional grid state adds a fixed amount of execution time for Algorithm 4.1, and a much smaller amount for Algorithm 4.2.

It follows from Tables 4.7 and 4.8 that, for problem (4.12), the optimal solutions obtained from the data produced by the hybrid intelligent algorithm are close to those obtained from the data produced by the finite search method. The differences between the optimal objective values obtained by the two methods, listed in the second rows of Tables 4.7 and 4.8, may be seen in Table 4.9. Each absolute difference is small (the first not larger than 0.001, and the others smaller still). Hence, the two proposed methods for solving the problem presented here are comparable in accuracy.
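For contrast with the quartic-cost example above, which requires tabulation, a quadratic cost admits the closed-form gains of Sect. 4.2. The following sketch runs that recursion with illustrative coefficients borrowed loosely from the example; the value of $\sigma$ is an assumption, since it is not fixed here:

```python
# Sketch of the closed-form recursion of Sect. 4.2:
#   Q_N = A_N,
#   Q_k = A_k + (a_k^2 b_k^2 B_k Q_{k+1}^2 + a_k^2 B_k^2 Q_{k+1}) / (B_k + b_k^2 Q_{k+1})^2,
#   u*(k) = -a_k b_k Q_{k+1} / (B_k + b_k^2 Q_{k+1}) * x_k.
# Coefficients are illustrative stand-ins; sigma is assumed.
N = 10
A = [2.0] * (N + 1)
Bc = [0.01] * (N + 1)
a = [0.8] * N
b = [0.09] * N
sig = [0.03] * (N + 1)                    # assumed noise coefficients

Q = [0.0] * (N + 1)
gain = [0.0] * N
Q[N] = A[N]
for k in range(N - 1, -1, -1):
    d = Bc[k] + b[k] ** 2 * Q[k + 1]
    Q[k] = A[k] + (a[k] ** 2 * b[k] ** 2 * Bc[k] * Q[k + 1] ** 2
                   + a[k] ** 2 * Bc[k] ** 2 * Q[k + 1]) / d ** 2
    gain[k] = -a[k] * b[k] * Q[k + 1] / d  # feedback gain: u*(k) = gain[k] * x_k

# Optimal value J(x0, 0) = Q_0 x0^2 + (1/3) * sum_j sigma_j^2 Q_j:
x0 = 0.1
V = Q[0] * x0 ** 2 + sum(sig[j] ** 2 * Q[j] for j in range(1, N + 1)) / 3
```

Each gain is negative here because all coefficients are positive, so the feedback always pushes the state back toward zero.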

Table 4.5 Data produced by the finite search method (continued; numerical entries not reproduced here)

4.5 Indefinite LQ Optimal Control with Equality Constraint

4.5.1 Problem Setting

Consider the indefinite LQ optimal control problem with an equality constraint for discrete-time uncertain systems as follows:
$$\begin{cases} \inf\limits_{u_k,\ 0\le k\le N-1} J(x_0, u) = \sum_{k=0}^{N-1} E[x_k^\tau Q_k x_k + u_k^\tau R_k u_k] + E[x_N^\tau Q_N x_N]\\ \text{subject to}\\ x_{k+1} = A_k x_k + B_k u_k + \lambda_k (A_k x_k + B_k u_k)\xi_k,\quad k = 0, 1, \ldots, N-1,\\ F x_N = \eta, \end{cases} \qquad (4.14)$$

Table 4.6 Data produced by the finite search method (continued; numerical entries not reproduced here)

where $\lambda_k \in \mathbb{R}$ and $0 \le \lambda_k \le 1$. The vector $x_k$ is an uncertain state with the initial state $x_0 \in \mathbb{R}^n$, and $u_k$ is a control vector subject to a constraint set $U_k \subset \mathbb{R}^m$. Denote $u = (u_0, u_1, \ldots, u_{N-1})$. Moreover, $Q_0, Q_1, \ldots, Q_N$ and $R_0, R_1, \ldots, R_{N-1}$ are real symmetric matrices with appropriate dimensions. In addition, the coefficients $A_0, A_1, \ldots, A_{N-1}$ and $B_0, B_1, \ldots, B_{N-1}$ are assumed to be crisp matrices with appropriate dimensions. Let $F \in \mathbb{R}^{r\times n}$ and $\eta = (\eta_1, \eta_2, \ldots, \eta_r)^\tau$, where $\eta_i$ ($i = 1, 2, \ldots, r$) are uncertain variables. Besides, the noises $\xi_0, \xi_1, \ldots, \xi_{N-1}$ are independent ordinary linear uncertain variables $\mathcal{L}(-1, 1)$ with the distribution
$$\Phi(x) = \begin{cases} 0, & \text{if } x \le -1,\\ (x+1)/2, & \text{if } -1 \le x \le 1,\\ 1, & \text{if } x \ge 1. \end{cases}$$
Note that we allow the cost matrices to be singular or indefinite. We need the following definitions.

Table 4.7 Optimal controls for some initial states based on the data of Tables 4.1, 4.2, and 4.3 (numerical entries not reproduced here)

Definition 4.1 The uncertain LQ problem (4.14) is called well posed if
$$V(x_0) = \inf_{u_k,\ 0\le k\le N-1} J(x_0, u) > -\infty, \quad \forall x_0 \in \mathbb{R}^n.$$

Definition 4.2 A well-posed problem is called solvable if, for $x_0 \in \mathbb{R}^n$, there is a control sequence $(u_0^*, u_1^*, \ldots, u_{N-1}^*)$ that achieves $V(x_0)$. In this case, the control sequence $(u_0^*, u_1^*, \ldots, u_{N-1}^*)$ is called an optimal control sequence.

4.5.2 An Equivalent Deterministic Optimal Control

We transform the uncertain LQ problem (4.14) into an equivalent deterministic optimal control problem. Let $X_k = E[x_k x_k^\tau]$. Since the state $x_k \in \mathbb{R}^n$, we know that $x_k x_k^\tau$ is an $n \times n$ matrix whose elements are uncertain variables, and $X_k$ is a symmetric crisp matrix

Table 4.8 Optimal controls for some initial states based on the data of Tables 4.4, 4.5, and 4.6 (numerical entries not reproduced here)

Table 4.9 Absolute differences of the optimal values obtained by the two methods (numerical entries not reproduced here)

($k = 0, 1, \ldots, N$). Denote $\mathbf{K} = (K_0, K_1, \ldots, K_{N-1})$, where the $K_i$ are matrices for $i = 0, 1, \ldots, N-1$.

Theorem 4.2 ([1]) If the uncertain LQ problem (4.14) is solvable by a feedback control sequence
$$u_k = K_k x_k \quad \text{for } k = 0, 1, \ldots, N-1,$$

where $K_0, K_1, \ldots, K_{N-1}$ are constant crisp matrices, then the uncertain LQ problem (4.14) is equivalent to the following deterministic optimal control problem:
$$\begin{cases} \min\limits_{K_k,\ 0\le k\le N-1} J(X_0, \mathbf{K}) = \sum_{k=0}^{N-1} \mathrm{tr}\big[(Q_k + K_k^\tau R_k K_k)X_k\big] + \mathrm{tr}[Q_N X_N]\\ \text{subject to}\\ X_{k+1} = \big(1 + \tfrac{1}{3}\lambda_k^2\big)(A_k X_k A_k^\tau + A_k X_k K_k^\tau B_k^\tau + B_k K_k X_k A_k^\tau + B_k K_k X_k K_k^\tau B_k^\tau),\quad k = 0, 1, \ldots, N-1,\\ X_0 = x_0 x_0^\tau,\qquad F X_N F^\tau = G,\quad G = E[\eta\eta^\tau]. \end{cases} \qquad (4.15)$$

Proof Assume that the uncertain LQ problem (4.14) is solvable by a feedback control sequence $u_k = K_k x_k$ for $k = 0, 1, \ldots, N-1$. Considering the dynamical equation of the uncertain LQ problem (4.14), we have
$$\begin{aligned} X_{k+1} &= E[x_{k+1}x_{k+1}^\tau]\\ &= E\big[(A_k + B_k K_k + \lambda_k (A_k + B_k K_k)\xi_k)\,x_k x_k^\tau\,(A_k^\tau + K_k^\tau B_k^\tau + \lambda_k (A_k^\tau + K_k^\tau B_k^\tau)\xi_k)\big]\\ &= A_k X_k A_k^\tau + A_k X_k K_k^\tau B_k^\tau + B_k K_k X_k A_k^\tau + B_k K_k X_k K_k^\tau B_k^\tau + E[S_k \xi_k + V_k \xi_k^2], \end{aligned} \qquad (4.16)$$
where
$$S_k = 2\lambda_k (A_k X_k A_k^\tau + A_k X_k K_k^\tau B_k^\tau + B_k K_k X_k A_k^\tau + B_k K_k X_k K_k^\tau B_k^\tau),$$
$$V_k = \lambda_k^2 (A_k X_k A_k^\tau + A_k X_k K_k^\tau B_k^\tau + B_k K_k X_k A_k^\tau + B_k K_k X_k K_k^\tau B_k^\tau).$$
It is easily found that $\lambda_k S_k = 2V_k$. Now, we compute $E[S_k \xi_k + V_k \xi_k^2]$ as follows.
(i) If $V_k = 0$, we obtain $E[S_k \xi_k + V_k \xi_k^2] = E[S_k \xi_k] = S_k E[\xi_k] = 0$.
(ii) If $V_k \ne 0$, we know that $\lambda_k \ne 0$ and $2/\lambda_k \ge 2$. According to Example 1.6, we have
$$E[S_k \xi_k + V_k \xi_k^2] = E\Big[\frac{2}{\lambda_k}V_k \xi_k + V_k \xi_k^2\Big] = V_k\, E\Big[\frac{2}{\lambda_k}\xi_k + \xi_k^2\Big] = \frac{1}{3}V_k.$$

Based on the above analysis, we conclude that
$$E[S_k \xi_k + V_k \xi_k^2] = \frac{1}{3}V_k. \qquad (4.17)$$
Substituting (4.17) into (4.16), we see that (4.16) can be written as
$$X_{k+1} = \big(1 + \tfrac{1}{3}\lambda_k^2\big)(A_k X_k A_k^\tau + A_k X_k K_k^\tau B_k^\tau + B_k K_k X_k A_k^\tau + B_k K_k X_k K_k^\tau B_k^\tau). \qquad (4.18)$$
Moreover, the associated cost function is expressed equivalently as
$$\min_{K_k,\ 0\le k\le N-1} J(X_0, \mathbf{K}) = \min_{K_k,\ 0\le k\le N-1} \sum_{k=0}^{N-1} \mathrm{tr}\big[(Q_k + K_k^\tau R_k K_k)X_k\big] + \mathrm{tr}[Q_N X_N].$$
Note that
$$F x_N x_N^\tau F^\tau = \eta\eta^\tau. \qquad (4.19)$$
Taking expected values in (4.19), we have $F X_N F^\tau = G$, $G = E[\eta\eta^\tau]$. Therefore, the uncertain LQ problem (4.14) is equivalent to the deterministic optimal control problem (4.15).

Remark 4.3 Obviously, if the uncertain LQ problem (4.14) has a linear feedback optimal control solution $u_k = K_k x_k$ ($k = 0, 1, \ldots, N-1$), then $\mathbf{K} = (K_0, K_1, \ldots, K_{N-1})$ is an optimal solution of the deterministic LQ problem (4.15).

4.5.3 A Necessary Condition for State Feedback Control

We apply the deterministic matrix minimum principle [2] to obtain a necessary condition for the optimal linear state feedback control with deterministic gains of the uncertain LQ optimal control problem (4.14).

Theorem 4.3 ([1]) If the uncertain LQ problem (4.14) is solvable by a feedback control
$$u_k = K_k x_k \qquad (4.20)$$
for $k = 0, 1, \ldots, N-1$, where $K_0, K_1, \ldots, K_{N-1}$ are constant crisp matrices, then there exist symmetric matrices $H_k$ and a matrix $\rho \in \mathbb{R}^{r\times r}$ solving the following constrained difference equation:

$$\begin{cases} H_k = Q_k + \big(1 + \tfrac{1}{3}\lambda_k^2\big)A_k^\tau H_{k+1} A_k - M_k^\tau L_k^+ M_k,\\ L_k L_k^+ M_k - M_k = 0, \quad L_k \succeq 0,\\ L_k = R_k + \big(1 + \tfrac{1}{3}\lambda_k^2\big)B_k^\tau H_{k+1} B_k,\\ M_k = \big(1 + \tfrac{1}{3}\lambda_k^2\big)B_k^\tau H_{k+1} A_k,\\ H_N = Q_N + F^\tau \rho F \end{cases} \qquad (4.21)$$
for $k = 0, 1, \ldots, N-1$. Moreover,
$$K_k = -L_k^+ M_k + Y_k - L_k^+ L_k Y_k \qquad (4.22)$$
with $Y_k \in \mathbb{R}^{m\times n}$, $k = 0, 1, \ldots, N-1$, being any given crisp matrices.

Proof Assume the uncertain LQ problem (4.14) is solvable by $u_k = K_k x_k$ for $k = 0, 1, \ldots, N-1$, where the matrices $K_0, \ldots, K_{N-1}$ are viewed as the controls to be determined. It is obvious that problem (4.15) is a matrix dynamical optimization problem. Next, we deal with this class of problems by the minimum principle. Introduce the Lagrangian function associated with problem (4.15):
$$\mathcal{L} = J(X_0, \mathbf{K}) + \sum_{k=0}^{N-1} \mathrm{tr}[H_{k+1}\, g_{k+1}(X_k, K_k)] + \mathrm{tr}[\rho\, g(X_N)],$$
where
$$J(X_0, \mathbf{K}) = \sum_{k=0}^{N-1} \mathrm{tr}\big[(Q_k + K_k^\tau R_k K_k)X_k\big] + \mathrm{tr}[Q_N X_N],$$
$$g_{k+1}(X_k, K_k) = \big(1 + \tfrac{1}{3}\lambda_k^2\big)(A_k X_k A_k^\tau + A_k X_k K_k^\tau B_k^\tau + B_k K_k X_k A_k^\tau + B_k K_k X_k K_k^\tau B_k^\tau) - X_{k+1},$$
$$g(X_N) = F X_N F^\tau - G,$$
and the matrices $H_1, \ldots, H_N$ as well as $\rho \in \mathbb{R}^{r\times r}$ are the Lagrangian multipliers. By the matrix minimum principle [2], the optimal feedback gains and Lagrangian multipliers satisfy the following first-order necessary conditions:

$$\frac{\partial \mathcal{L}}{\partial K_k} = 0 \quad (k = 0, 1, \ldots, N-1), \qquad (4.23)$$
$$H_k = \frac{\partial \mathcal{L}}{\partial X_k} \quad (k = 0, 1, \ldots, N). \qquad (4.24)$$
Based on the partial differentiation rules for gradient matrices, (4.23) can be transformed into
$$\Big[R_k + \big(1 + \tfrac{1}{3}\lambda_k^2\big)B_k^\tau H_{k+1} B_k\Big]K_k + \big(1 + \tfrac{1}{3}\lambda_k^2\big)B_k^\tau H_{k+1} A_k = 0. \qquad (4.25)$$
Let
$$L_k = R_k + \big(1 + \tfrac{1}{3}\lambda_k^2\big)B_k^\tau H_{k+1} B_k, \qquad M_k = \big(1 + \tfrac{1}{3}\lambda_k^2\big)B_k^\tau H_{k+1} A_k. \qquad (4.26)$$
Then (4.25) can be rewritten as $L_k K_k + M_k = 0$. The solution of (4.25) is given by
$$K_k = -L_k^+ M_k + Y_k - L_k^+ L_k Y_k, \quad Y_k \in \mathbb{R}^{m\times n}, \qquad (4.27)$$
if and only if $L_k L_k^+ M_k = M_k$, where $L_k^+$ is the Moore–Penrose inverse of the matrix $L_k$. By (4.24), first we have
$$H_N = \frac{\partial \mathcal{L}}{\partial X_N}, \qquad (4.28)$$
that is,
$$H_N = Q_N + F^\tau \rho F.$$
Second, we have $H_k = \partial\mathcal{L}/\partial X_k$ ($k = 0, 1, \ldots, N-1$), which is
$$\begin{aligned} H_k &= Q_k + \big(1 + \tfrac{1}{3}\lambda_k^2\big)A_k^\tau H_{k+1} A_k + K_k^\tau\Big[R_k + \big(1 + \tfrac{1}{3}\lambda_k^2\big)B_k^\tau H_{k+1} B_k\Big]K_k\\ &\quad + \big(1 + \tfrac{1}{3}\lambda_k^2\big)A_k^\tau H_{k+1} B_k K_k + \big(1 + \tfrac{1}{3}\lambda_k^2\big)K_k^\tau B_k^\tau H_{k+1} A_k. \end{aligned} \qquad (4.29)$$
Substituting (4.27) into (4.29) gives
$$H_k = Q_k + \big(1 + \tfrac{1}{3}\lambda_k^2\big)A_k^\tau H_{k+1} A_k - M_k^\tau L_k^+ M_k. \qquad (4.30)$$

The objective function is
$$\begin{aligned} J(x_0, u) &= \sum_{k=0}^{N-1} E[x_k^\tau Q_k x_k + u_k^\tau R_k u_k] + E[x_N^\tau Q_N x_N]\\ &= \sum_{k=0}^{N-1} \big\{E[x_k^\tau Q_k x_k + u_k^\tau R_k u_k] + E[x_{k+1}^\tau H_{k+1} x_{k+1}] - E[x_k^\tau H_k x_k]\big\} + E[x_N^\tau Q_N x_N] - E[x_N^\tau H_N x_N] + x_0^\tau H_0 x_0\\ &= \sum_{k=0}^{N-1} \big\{\mathrm{tr}\big[(Q_k + K_k^\tau R_k K_k)X_k\big] + \mathrm{tr}[H_{k+1}X_{k+1}] - \mathrm{tr}[H_k X_k]\big\} + \mathrm{tr}[(Q_N - H_N)X_N] + x_0^\tau H_0 x_0. \end{aligned} \qquad (4.31)$$
Substituting (4.18) into (4.31), we can rewrite the cost function as follows:
$$J(X_0, \mathbf{K}) = \sum_{k=0}^{N-1} \mathrm{tr}\Big\{\Big[Q_k + \big(1 + \tfrac{1}{3}\lambda_k^2\big)A_k^\tau H_{k+1}A_k - H_k + 2\big(1 + \tfrac{1}{3}\lambda_k^2\big)K_k^\tau B_k^\tau H_{k+1}A_k + K_k^\tau\Big(R_k + \big(1 + \tfrac{1}{3}\lambda_k^2\big)B_k^\tau H_{k+1}B_k\Big)K_k\Big]X_k\Big\} + \mathrm{tr}[(Q_N - H_N)X_N] + x_0^\tau H_0 x_0. \qquad (4.32)$$
Substituting (4.26) and (4.30) into (4.32), a completion of squares implies
$$J(X_0, \mathbf{K}) = \sum_{k=0}^{N-1} \mathrm{tr}\big[(K_k + L_k^+ M_k)^\tau L_k (K_k + L_k^+ M_k)X_k\big] + \mathrm{tr}[(Q_N - H_N)X_N] + x_0^\tau H_0 x_0. \qquad (4.33)$$
Next, we will prove that $L_k$ ($k = 0, 1, \ldots, N-1$) satisfies
$$L_k = R_k + \big(1 + \tfrac{1}{3}\lambda_k^2\big)B_k^\tau H_{k+1} B_k \succeq 0. \qquad (4.34)$$

If not, there is an $L_p$ for some $p \in \{0, 1, \ldots, N-1\}$ with a negative eigenvalue $\bar\lambda$. Denote the unit eigenvector of $L_p$ with respect to $\bar\lambda$ by $v_{\bar\lambda}$ (i.e., $v_{\bar\lambda}^\tau v_{\bar\lambda} = 1$ and $L_p v_{\bar\lambda} = \bar\lambda v_{\bar\lambda}$). Let $\delta \ne 0$ be an arbitrary scalar. We construct a control sequence $\tilde u = (\tilde u_0, \tilde u_1, \ldots, \tilde u_{N-1})$ as follows:
$$\tilde u_k = \begin{cases} -L_k^+ M_k x_k, & k \ne p,\\ \delta|\bar\lambda|^{-1/2} v_{\bar\lambda} - L_k^+ M_k x_k, & k = p. \end{cases} \qquad (4.35)$$
By (4.33), the associated cost function becomes
$$\begin{aligned} J(x_0, \tilde u) &= \sum_{k=0}^{N-1} E\big[(\tilde u_k + L_k^+ M_k x_k)^\tau L_k (\tilde u_k + L_k^+ M_k x_k)\big] + \mathrm{tr}[(Q_N - H_N)X_N] + x_0^\tau H_0 x_0\\ &= \big(\delta|\bar\lambda|^{-1/2} v_{\bar\lambda}\big)^\tau L_p \big(\delta|\bar\lambda|^{-1/2} v_{\bar\lambda}\big) + \mathrm{tr}[(Q_N - H_N)X_N] + x_0^\tau H_0 x_0\\ &= -\delta^2 + \mathrm{tr}[(Q_N - H_N)X_N] + x_0^\tau H_0 x_0. \end{aligned}$$
Letting $\delta \to \infty$ yields $J(x_0, \tilde u) \to -\infty$, which contradicts the solvability of the uncertain LQ problem (4.14).

4.5.4 Well Posedness of the Uncertain LQ Problem

Next, we will show that the solvability of Eq. (4.21) is sufficient for the well posedness of the uncertain LQ problem (4.14). Moreover, any optimal control can be obtained via the solution of Eq. (4.21).

Theorem 4.4 ([1]) The uncertain LQ problem (4.14) is well posed if there exist symmetric matrices $H_k$ solving the constrained difference equation (4.21). Moreover, the uncertain LQ problem (4.14) is solvable by
$$u_k^* = -\Big[R_k + \big(1 + \tfrac{1}{3}\lambda_k^2\big)B_k^\tau H_{k+1} B_k\Big]^+ \Big[\big(1 + \tfrac{1}{3}\lambda_k^2\big)B_k^\tau H_{k+1} A_k\Big] x_k \qquad (4.36)$$
for $k = 0, 1, \ldots, N-1$. Furthermore, the optimal cost of the uncertain LQ problem (4.14) is
$$V(x_0) = x_0^\tau H_0 x_0 - \mathrm{tr}(\rho G).$$

Proof Let $H_k$ solve Eq. (4.21). Then we have
$$\begin{aligned} J(x_0, u) &= \sum_{k=0}^{N-1} E[x_k^\tau Q_k x_k + u_k^\tau R_k u_k] + E[x_N^\tau Q_N x_N]\\ &= \sum_{k=0}^{N-1} \big\{E[x_k^\tau Q_k x_k + u_k^\tau R_k u_k] + E[x_{k+1}^\tau H_{k+1} x_{k+1}] - E[x_k^\tau H_k x_k]\big\} + E[x_N^\tau Q_N x_N] - E[x_N^\tau H_N x_N] + x_0^\tau H_0 x_0\\ &= \sum_{k=0}^{N-1} \big\{\mathrm{tr}\big[(Q_k + K_k^\tau R_k K_k)X_k\big] + \mathrm{tr}[H_{k+1}X_{k+1}] - \mathrm{tr}[H_k X_k]\big\} + \mathrm{tr}[(Q_N - H_N)X_N] + x_0^\tau H_0 x_0\\ &= \sum_{k=0}^{N-1} \mathrm{tr}\Big\{\Big[Q_k + \big(1 + \tfrac{1}{3}\lambda_k^2\big)A_k^\tau H_{k+1}A_k - H_k + 2\big(1 + \tfrac{1}{3}\lambda_k^2\big)K_k^\tau B_k^\tau H_{k+1}A_k + K_k^\tau\Big(R_k + \big(1 + \tfrac{1}{3}\lambda_k^2\big)B_k^\tau H_{k+1}B_k\Big)K_k\Big]X_k\Big\} + \mathrm{tr}[(Q_N - H_N)X_N] + x_0^\tau H_0 x_0\\ &= \sum_{k=0}^{N-1} \mathrm{tr}\big[(M_k^\tau L_k^+ M_k + 2K_k^\tau M_k + K_k^\tau L_k K_k)X_k\big] + \mathrm{tr}[(Q_N - H_N)X_N] + x_0^\tau H_0 x_0. \end{aligned}$$
A completion of squares implies
$$J(X_0, \mathbf{K}) = \sum_{k=0}^{N-1} \mathrm{tr}\big[(K_k + L_k^+ M_k)^\tau L_k (K_k + L_k^+ M_k)X_k\big] + \mathrm{tr}[(Q_N - H_N)X_N] + x_0^\tau H_0 x_0. \qquad (4.37)$$
Because $L_k \succeq 0$, the cost function of problem (4.14) is bounded from below:
$$V(x_0) \ge \mathrm{tr}[(Q_N - H_N)X_N] + x_0^\tau H_0 x_0 > -\infty, \quad \forall x_0 \in \mathbb{R}^n.$$
Hence, the uncertain LQ problem (4.14) is well posed. It is clear that it is solvable by the feedback control
$$u_k = K_k x_k = -L_k^+ M_k x_k, \quad k = 0, 1, \ldots, N-1.$$
Furthermore, (4.37) indicates that the optimal value equals
$$V(x_0) = \mathrm{tr}[(Q_N - H_N)X_N] + x_0^\tau H_0 x_0.$$

Since
$$H_N = Q_N + F^\tau \rho F, \qquad F X_N F^\tau = G, \qquad X_N = E[x_N x_N^\tau],$$
we obtain
$$V(x_0) = x_0^\tau H_0 x_0 - \mathrm{tr}(\rho G).$$

Remark 4.4 We have shown that the solvability of the constrained difference equation (4.21) is sufficient for the existence of an optimal linear state feedback control.

As a special case, we consider the following indefinite LQ optimal control problem without constraint for discrete-time uncertain systems:
$$\begin{cases} \inf\limits_{u_k,\ 0\le k\le N-1} J(x_0, u) = \sum_{k=0}^{N-1} E[x_k^\tau Q_k x_k + u_k^\tau R_k u_k] + E[x_N^\tau Q_N x_N]\\ \text{subject to}\\ x_{k+1} = A_k x_k + B_k u_k + \lambda_k (A_k x_k + B_k u_k)\xi_k,\quad k = 0, 1, \ldots, N-1. \end{cases} \qquad (4.38)$$

Corollary 4.1 If the uncertain LQ problem (4.38) is solvable by a feedback control
$$u_k = K_k x_k, \quad k = 0, 1, \ldots, N-1, \qquad (4.39)$$
where $K_0, K_1, \ldots, K_{N-1}$ are constant crisp matrices, then there exist symmetric matrices $H_k$ that solve the following constrained difference equation:
$$\begin{cases} H_k = Q_k + \big(1 + \tfrac{1}{3}\lambda_k^2\big)A_k^\tau H_{k+1} A_k - M_k^\tau L_k^+ M_k,\\ L_k L_k^+ M_k - M_k = 0, \quad L_k \succeq 0,\\ L_k = R_k + \big(1 + \tfrac{1}{3}\lambda_k^2\big)B_k^\tau H_{k+1} B_k,\\ M_k = \big(1 + \tfrac{1}{3}\lambda_k^2\big)B_k^\tau H_{k+1} A_k,\\ H_N = Q_N \end{cases} \qquad (4.40)$$
for $k = 0, 1, \ldots, N-1$. Moreover,
$$K_k = -L_k^+ M_k + Y_k - L_k^+ L_k Y_k \qquad (4.41)$$

with $Y_k\in R^{m\times n}$, $k=0,1,\ldots,N-1$, being any given crisp matrices. Furthermore, the uncertain LQ problem (4.38) is solvable by
$$u_k=-\Big[R_k+\Big(1+\tfrac{1}{3}\lambda_k^2\Big)B_k^\tau H_{k+1}B_k\Big]^+\Big[\Big(1+\tfrac{1}{3}\lambda_k^2\Big)B_k^\tau H_{k+1}A_k\Big]x_k,\quad k=0,1,\ldots,N-1,$$
and the optimal cost of the uncertain LQ problem (4.38) is given by $V(x_0)=x_0^\tau H_0x_0$.

Proof Let $F=0$ and $\eta=0$ in the constrained uncertain LQ problem (4.14). Then the constrained problem (4.14) becomes the unconstrained problem (4.38), and the conclusions in the corollary follow directly by the same approach as in the theorems above.

Example We present a two-dimensional indefinite LQ optimal control problem with equality constraint for a discrete-time uncertain system to illustrate the effectiveness of our result. In the constrained discrete-time uncertain LQ control problem (4.14) we specify the parameters as follows: the horizon is $N=2$; the initial state is $x_0\in R^2$; the coefficient matrices are $A_0,A_1\in R^{2\times2}$, $B_0,B_1\in R^{2\times1}$ and $F\in R^{1\times2}$; the noise intensities are $\lambda_0=0.2$, $\lambda_1=0.1$; and $\eta$ is a linear uncertain variable with $E[\eta^2]=25/4$. The state weights are $Q_0$, $Q_1$, $Q_2$ and the control weights are $R_0=-1$, $R_1=4$. Note that in this example, the state weight $Q_0$ is negative definite, $Q_1$ is negative semidefinite, $Q_2$ is positive semidefinite, and the control weight $R_0$ is negative definite. The constraint is given as follows:
$$FX_2F^\tau=FE[x_2x_2^\tau]F^\tau=G=E[\eta^2]=\frac{25}{4}.$$
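Before carrying out the computation for this example, note that the backward sweep of Corollary 4.1 is easy to implement numerically. The sketch below is our illustration rather than code from the book: it runs the recursion (4.40) with the Moore-Penrose inverse and returns the feedback gains $K_k=-L_k^+M_k$ (the function name and argument layout are our own assumptions).

```python
import numpy as np

def lq_backward(A, B, Q, R, lam, QN):
    """Backward sweep of the difference equation (4.40) for the
    unconstrained indefinite LQ problem of Corollary 4.1 (sketch).

    A, B, Q, R, lam are lists of stage matrices/scalars; QN is the
    terminal weight.  Returns the sequence H_k and the feedback gains
    K_k = -L_k^+ M_k, so that u_k = K_k x_k."""
    N = len(A)
    H = [None] * (N + 1)
    H[N] = QN
    K = [None] * N
    for k in range(N - 1, -1, -1):
        c = 1.0 + lam[k] ** 2 / 3.0          # the factor (1 + lambda_k^2 / 3)
        L = R[k] + c * B[k].T @ H[k + 1] @ B[k]
        M = c * B[k].T @ H[k + 1] @ A[k]
        Lp = np.linalg.pinv(L)               # Moore-Penrose inverse L_k^+
        H[k] = Q[k] + c * A[k].T @ H[k + 1] @ A[k] - M.T @ Lp @ M
        K[k] = -Lp @ M                       # feedback gain for u_k = K_k x_k
    return H, K
```

In the constrained case one would additionally iterate on the scalar $\rho$ in $H_N=Q_N+F^\tau\rho F$ until the equality constraint on $FX_NF^\tau$ is met.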

First, it follows from
$$\begin{cases}H_k=Q_k+\big(1+\tfrac{1}{3}\lambda_k^2\big)A_k^\tau H_{k+1}A_k-M_k^\tau L_k^+M_k\\ L_kL_k^+M_k-M_k=0\\ L_k=R_k+\big(1+\tfrac{1}{3}\lambda_k^2\big)B_k^\tau H_{k+1}B_k\succeq0\\ M_k=\big(1+\tfrac{1}{3}\lambda_k^2\big)B_k^\tau H_{k+1}A_k,\quad k=0,1,\\ H_2=Q_2+F^\tau\rho F\end{cases}$$
$$X_{k+1}=\Big(1+\tfrac{1}{3}\lambda_k^2\Big)\big(A_kX_kA_k^\tau+A_kX_kK_k^\tau B_k^\tau+B_kK_kX_kA_k^\tau+B_kK_kX_kK_k^\tau B_k^\tau\big),\quad k=0,1,$$
$$X_0=x_0x_0^\tau,\qquad FX_2F^\tau=FE[x_2x_2^\tau]F^\tau=G=\frac{25}{4},$$
that $\rho=8$. Then we have $H_2=Q_2+8F^\tau F$.

Second, applying the theorem above, we obtain the optimal controls and optimal cost value as follows. For $k=1$, we obtain
$$L_1=R_1+\Big(1+\tfrac{1}{3}\lambda_1^2\Big)B_1^\tau H_2B_1,\qquad M_1=\Big(1+\tfrac{1}{3}\lambda_1^2\Big)B_1^\tau H_2A_1=(8.1067,\,0),$$
$$H_1=Q_1+\Big(1+\tfrac{1}{3}\lambda_1^2\Big)A_1^\tau H_2A_1-M_1^\tau L_1^+M_1.$$
The optimal feedback control is $u_1=K_1x_1$, where $K_1=-L_1^+M_1$.

For $k=0$, we obtain
$$L_0=R_0+\Big(1+\tfrac{1}{3}\lambda_0^2\Big)B_0^\tau H_1B_0,\qquad M_0=\Big(1+\tfrac{1}{3}\lambda_0^2\Big)B_0^\tau H_1A_0=(1.6840,\,0),$$
$$H_0=Q_0+\Big(1+\tfrac{1}{3}\lambda_0^2\Big)A_0^\tau H_1A_0-M_0^\tau L_0^+M_0.$$

The optimal feedback control is $u_0=K_0x_0$, where $K_0=-L_0^+M_0$. Finally, the optimal cost value is
$$V(x_0)=x_0^\tau H_0x_0-\mathrm{tr}(\rho G).$$

References
1. Chen Y, Zhu Y (2016) Indefinite LQ optimal control with equality constraint for discrete-time uncertain systems. Jpn J Ind Appl Math 33(2)
2. Athans M (1968) The matrix minimum principle. Inf Control 11

Chapter 6? No. Chapter 5
Bang Bang Control for Uncertain Systems

If the optimal control of a problem takes the maximum value or the minimum value of its admissible field, the problem is called a bang bang control problem.

5.1 Bang Bang Control for Continuous Uncertain Systems

Now, we consider the following problem:
$$\begin{cases}J(0,x_0)\equiv\displaystyle\max_{u_s\in[-1,1]^r}E\Big[\int_0^T f(X_s,s)\,ds+h(X_T,T)\Big]\\ \text{subject to}\\ dX_s=(\alpha(X_s,s)+\beta(X_s,s)u_s)\,ds+\sigma(X_s,u_s,s)\,dC_s\\ X_0=x_0,\quad u_s\in[-1,1]^r,\end{cases}\qquad(5.1)$$
where $X_s$ is the state vector of dimension $n$ with the initial condition that at time 0 we are in state $X_0=x_0$, $u_s$ is the decision vector of dimension $r$ in the domain $[-1,1]^r$, $f:R^n\times[0,+\infty)\to R$ is the objective function, and $h:R^n\times[0,+\infty)\to R$ is the function of terminal reward. In addition, $\alpha:R^n\times[0,+\infty)\to R^n$ is a column-vector function, $\beta:R^n\times[0,+\infty)\to R^{n\times r}$ and $\sigma:R^n\times R^r\times[0,+\infty)\to R^{n\times k}$ are matrix functions, and $C_s=(C_{s1},C_{s2},\ldots,C_{sk})^\tau$, where $C_{s1},C_{s2},\ldots,C_{sk}$ are independent canonical Liu processes. The final time $T>0$ is fixed or free.

The model (5.1) is suitable for fuel and time problems in which the system $dX_s=(\alpha(X_s,s)+\beta(X_s,s)u_s)\,ds$ is disturbed by an uncertain factor and therefore takes the form of the uncertain differential equation $dX_s=(\alpha(X_s,s)+\beta(X_s,s)u_s)\,ds+\sigma(X_s,u_s,s)\,dC_s$. For any $0<t<T$, $J(t,x)$ is the expected optimal reward obtainable in $[t,T]$ with the condition that at time $t$ we are in state $X_t=x$.

Theorem 5.1 ([1]) Assume that $J(t,x)$ is a twice differentiable function on $[0,T]\times R^n$. Then the optimal control of (5.1) is a bang bang control.

Proof It follows from the equation of optimality (2.15) that
$$-J_t(t,x)=\max_{u_t\in[-1,1]^r}\{f(x,t)+(\alpha(x,t)+\beta(x,t)u_t)^\tau\nabla_xJ(t,x)\}.\qquad(5.2)$$
On the right side of (5.2), let $u_t^*$ attain the maximum:
$$\max_{u_t\in[-1,1]^r}\{f(x,t)+(\alpha(x,t)+\beta(x,t)u_t)^\tau\nabla_xJ(t,x)\}=f(x,t)+(\alpha(x,t)+\beta(x,t)u_t^*)^\tau\nabla_xJ(t,x),$$
that is,
$$\max_{u_t\in[-1,1]^r}\{\nabla_xJ(t,x)^\tau\beta(x,t)u_t\}=\nabla_xJ(t,x)^\tau\beta(x,t)u_t^*.\qquad(5.3)$$
Denote $u_t^*=(u_1^*(t),u_2^*(t),\ldots,u_r^*(t))^\tau$ and
$$\nabla_xJ(t,x)^\tau\beta(x,t)=(g_1(t,x),g_2(t,x),\ldots,g_r(t,x)),\qquad(5.4)$$
which is called a switching vector. Then,
$$u_i^*(t)=\begin{cases}1,&\text{if }g_i(t,x)>0\\ -1,&\text{if }g_i(t,x)<0\\ \text{undetermined},&\text{if }g_i(t,x)=0\end{cases}\qquad(5.5)$$
for $i=1,2,\ldots,r$, which is a bang bang control as shown in Fig. 5.1.

Fig. 5.1 Bang bang control
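In code, the rule (5.5) is simply a componentwise sign applied to the switching vector of (5.4); the minimal sketch below is our illustration (the function name is ours, and a vanishing component is reported as undetermined):

```python
def bang_bang(g, tol=1e-12):
    """Componentwise bang-bang rule (5.5): u_i = 1 if g_i > 0,
    u_i = -1 if g_i < 0; a (numerically) zero component leaves u_i
    undetermined and is returned as None.  The input g is the
    switching vector grad_x J^T beta of (5.4)."""
    u = []
    for gi in g:
        if gi > tol:
            u.append(1.0)
        elif gi < -tol:
            u.append(-1.0)
        else:
            u.append(None)   # singular/undetermined component
    return u
```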

5.1.1 An Uncertain Bang Bang Model

Consider a special case of the model (5.1) as follows:
$$\begin{cases}J(0,x_0)\equiv\displaystyle\max_{u_s\in[-1,1]^n}E\Big[\int_0^T f(s)^\tau X_s\,ds+S_T^\tau X_T\Big]\\ \text{subject to}\\ dX_s=(A(s)X_s+B(s)u_s)\,ds+\sigma(X_s,u_s,s)\,dC_s\\ X_0=x_0,\quad u_s\in[-1,1]^n,\end{cases}\qquad(5.6)$$
where $f:[0,+\infty)\to R^n$ and $A,B:[0,+\infty)\to R^{n\times n}$ are some twice continuously differentiable functions, and $S_T\in R^n$. Denote $B(s)=(b_{ij}(s))_{n\times n}$. We have the following conclusion.

Theorem 5.2 ([1]) The optimal control $u_t^*=(u_1^*(t),u_2^*(t),\ldots,u_n^*(t))^\tau$ of (5.6) is a bang bang control
$$u_j^*(t)=\mathrm{sgn}\{(b_{1j}(t),b_{2j}(t),\ldots,b_{nj}(t))p(t)\}$$
for $j=1,2,\ldots,n$, where $p(t)\in R^n$ satisfies
$$\frac{dp(t)}{dt}=-f(t)-A(t)^\tau p(t),\qquad p(T)=S_T.\qquad(5.7)$$
The optimal value of (5.6) is
$$J(0,x_0)=p(0)^\tau x_0+\int_0^T p(s)^\tau B(s)u_s^*\,ds.\qquad(5.8)$$

Proof It follows from the equation of optimality (2.15) that
$$-J_t(t,x)=\max_{u_t\in[-1,1]^n}\{f(t)^\tau x+(A(t)x+B(t)u_t)^\tau\nabla_xJ(t,x)\}.\qquad(5.9)$$
Since $J(T,x_T)=S_T^\tau x_T$, we guess $J(t,x)=p(t)^\tau x+c(t)$ with $p(T)=S_T$, $c(T)=0$. So
$$\nabla_xJ(t,x)=p(t),\qquad J_t(t,x)=\frac{dp(t)^\tau}{dt}x+\frac{dc(t)}{dt}.\qquad(5.10)$$
Substituting (5.10) into (5.9) gives

$$-\frac{dp(t)^\tau}{dt}x-\frac{dc(t)}{dt}=f(t)^\tau x+(A(t)x+B(t)u_t^*)^\tau p(t).$$
Therefore,
$$\frac{dp(t)^\tau}{dt}=-f(t)^\tau-p(t)^\tau A(t),\qquad p(T)=S_T,$$
and
$$\frac{dc(t)}{dt}=-p(t)^\tau B(t)u_t^*.$$
Thus, it follows from (5.5) that
$$u_j^*(t)=\mathrm{sgn}\{g_j(t,x)\}=\mathrm{sgn}\{p(t)^\tau(b_{1j}(t),b_{2j}(t),\ldots,b_{nj}(t))^\tau\}.$$
Furthermore,
$$J(t,x)=p(t)^\tau x+c(t)=p(t)^\tau x+\int_t^T p(s)^\tau B(s)u_s^*\,ds.$$
The theorem is proved.

Example Consider the following example of an uncertain optimal control model:
$$\begin{cases}J(0,x_0)\equiv\displaystyle\max_{u}E[2X_1(1)-X_2(1)]\\ \text{subject to}\\ dX_1(s)=X_2(s)\,ds\\ dX_2(s)=u(s)\,ds+\sigma\,dC_s,\quad\sigma\in R\\ X(0)=(X_1(0),X_2(0))=x_0\\ |u(s)|\le1,\quad 0\le s\le1.\end{cases}\qquad(5.11)$$
We have
$$A(s)=\begin{pmatrix}0&1\\0&0\end{pmatrix},\qquad B(s)=\begin{pmatrix}0&0\\0&1\end{pmatrix},\qquad S_1=\begin{pmatrix}2\\-1\end{pmatrix}.$$
It follows from (5.7) that
$$\frac{dp(t)}{dt}=\begin{pmatrix}0&0\\-1&0\end{pmatrix}p(t),\qquad p(1)=\begin{pmatrix}2\\-1\end{pmatrix},$$
which has the solution

$$p(t)=\begin{pmatrix}2\\-2t+1\end{pmatrix}.$$
The switching vector is
$$\nabla_xJ(t,x)^\tau\beta(x,t)=p(t)^\tau B(t)=(0,\,-2t+1).$$
So we have the switching function $g(t,x)=-2t+1$. Hence,
$$u^*(t)=\mathrm{sgn}\{-2t+1\}=\begin{cases}1,&\text{if }0\le t<\tfrac12\\ -1,&\text{if }\tfrac12<t\le1.\end{cases}$$
We can find the switching time at 0.5, as shown in Fig. 5.2.

Next, we will find the optimal trajectory $(X_1(t),X_2(t))^\tau$. Denote $x_0=(x_{01},x_{02})^\tau$. It follows from $dX_2(s)=u^*(s)\,ds+\sigma\,dC_s$ that
$$X_2(t)=\begin{cases}x_{02}+t+\sigma C_t,&0\le t<\tfrac12\\ x_{02}+(1-t)+\sigma C_t,&\tfrac12<t\le1.\end{cases}$$
It follows from $dX_1(s)=X_2(s)\,ds$ that
$$X_1(t)=x_{01}+\int_0^tX_2(s)\,ds=\begin{cases}x_{01}+x_{02}t+\tfrac12t^2+\sigma tC_t-\sigma\displaystyle\int_0^ts\,dC_s,&0\le t<\tfrac12\\ x_{01}+x_{02}t-\tfrac12t^2+t-\tfrac14+\sigma tC_t-\sigma\displaystyle\int_0^ts\,dC_s,&\tfrac12<t\le1\end{cases}$$
$$=\begin{cases}x_{01}+x_{02}t+\tfrac12t^2+\sigma\xi(t),&0\le t<\tfrac12\\ x_{01}+x_{02}t-\tfrac12t^2+t-\tfrac14+\sigma\xi(t),&\tfrac12<t\le1,\end{cases}$$

Fig. 5.2 Optimal control

where $\xi$ is an uncertain variable such that
$$\xi(t)=tC_t-\int_0^ts\,dC_s\sim tN(0,t)+N\Big(0,\int_0^ts\,ds\Big)=N\Big(0,\tfrac32t^2\Big).$$
We can see that $X_1(1)=x_{01}+x_{02}+\tfrac14+\sigma\xi(1)$ and $X_2(1)=x_{02}+\sigma C_1$. Thus
$$E[2X_1(1)-X_2(1)]=2x_{01}+x_{02}+\tfrac12+\sigma E[2\xi(1)-C_1]=2x_{01}+x_{02}+\tfrac12,$$
which coincides with the optimal value provided by (5.8). The switching point is
$$(X_1(0.5),X_2(0.5))=\big(x_{01}+\tfrac12x_{02}+\tfrac18+\sigma\xi(0.5),\ x_{02}+\tfrac12+\sigma C_{0.5}\big).$$
The trajectory of the system is an uncertain vector $(X_1(t),X_2(t))^\tau$. In practice, a realization of the system is a sample trajectory formed by sample points. We provide sample points and a sample trajectory as follows. Since the distribution function of $\xi(t)$ is
$$\Phi(x)=\Big(1+\exp\Big(-\frac{2\pi x}{3\sqrt3\,t^2}\Big)\Big)^{-1},\quad x\in R,$$
we may get a sample point $\bar\xi(t)$ of $\xi(t)$ from $\bar\xi(t)=\Phi^{-1}(\mathrm{rand}(0,1))$, that is,
$$\bar\xi(t)=-\frac{3\sqrt3\,t^2}{2\pi}\ln\Big(\frac{1}{\mathrm{rand}(0,1)}-1\Big).$$
Similarly, we may get a sample point $\bar c_t$ of $C_t$ by
$$\bar c_t=-\frac{\sqrt3\,t}{\pi}\ln\Big(\frac{1}{\mathrm{rand}(0,1)}-1\Big).$$
A sample trajectory of $(X_1(t),X_2(t))^\tau$ may be given by
$$X_1(t)=\begin{cases}x_{01}+x_{02}t+\tfrac12t^2+\sigma\bar\xi(t),&0\le t<\tfrac12\\ x_{01}+x_{02}t-\tfrac12t^2+t-\tfrac14+\sigma\bar\xi(t),&\tfrac12<t\le1,\end{cases}$$
$$X_2(t)=\begin{cases}x_{02}+t+\sigma\bar c_t,&0\le t<\tfrac12\\ x_{02}+(1-t)+\sigma\bar c_t,&\tfrac12<t\le1.\end{cases}$$
A simulated sample trajectory of $(X_1(t),X_2(t))^\tau$ is shown in Fig. 5.3 with $\sigma=0.01$ and $x_0=(0,0)$. In this sample, the switching point is approximately $(0.125,0.499)$. Since the trajectory is an uncertain vector of dimension two, its realization depends on the uncertain variables $\xi(t)$ and $C_t$, whose sample points are produced by their distributions. That is to say, the trajectory is disturbed by an uncertain vector, so each sample trajectory is not smooth. In practice, the system may be realized many times. One realization (including sample trajectory and switching point) may differ from another, but each realization has only one switching point.
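The inverse-distribution sampling above can be turned into a small simulator. The following sketch is our illustration (function names and the grid size are our own choices): it draws sample points of $C_t$ and of $\xi(t)\sim N(0,\tfrac32 t^2)$ by inverting Liu's normal uncertainty distribution, and builds one sample trajectory of the example under the bang-bang control $u^*$.

```python
import math
import random

def inv_normal(u, sigma):
    """Inverse of Liu's normal uncertainty distribution N(0, sigma),
    Phi(x) = (1 + exp(-pi*x/(sqrt(3)*sigma)))**(-1), at u in (0, 1)."""
    return (math.sqrt(3.0) * sigma / math.pi) * math.log(u / (1.0 - u))

def sample_path(sigma=0.01, x0=(0.0, 0.0), n=100):
    """One sample trajectory of (X1, X2) for example (5.11) under the
    bang-bang control u*(t) = 1 on [0, 1/2) and -1 on (1/2, 1].  Sample
    points of C_t ~ N(0, t) and xi(t) ~ N(0, 3 t^2 / 2) come from
    inverse-distribution sampling as in the text (illustrative only)."""
    x01, x02 = x0
    path = []
    for j in range(1, n + 1):
        t = j / n
        ct = inv_normal(random.random(), t)            # sample of C_t
        xi = inv_normal(random.random(), 1.5 * t * t)  # sample of xi(t)
        if t < 0.5:
            x1 = x01 + x02 * t + 0.5 * t * t + sigma * xi
            x2 = x02 + t + sigma * ct
        else:
            x1 = x01 + x02 * t - 0.5 * t * t + t - 0.25 + sigma * xi
            x2 = x02 + (1.0 - t) + sigma * ct
        path.append((t, x1, x2))
    return path
```

Setting $\sigma=0$ recovers the deterministic optimal trajectory, which is a quick sanity check on the piecewise formulas.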

5.2 Bang Bang Control for Multistage Uncertain Systems

Consider the following uncertain optimal control problem with a linear objective function subject to an uncertain linear system:
$$\begin{cases}J(0,x_0)=\displaystyle\max_{|u(i)|\le1,\,0\le i\le N}E\Big[\sum_{j=0}^NA_jx(j)\Big]\\ \text{subject to}\\ x(j+1)=a_jx(j)+b_ju(j)+\sigma_{j+1}C_{j+1},\quad j=0,1,2,\ldots,N-1,\\ x(0)=x_0,\end{cases}\qquad(5.12)$$
where $A_j>0$, $a_j>0$ and $b_j,\sigma_j$ are constants for all $j$. In addition, $C_1,C_2,\ldots,C_N$ are independent uncertain variables with expected values $e_1,e_2,\ldots,e_N$, respectively.

Theorem 5.3 ([2]) The optimal controls $u^*(k)$ of (5.12) are provided by: $u^*(N)$ is arbitrary with $|u^*(N)|\le1$, and for $k=0,1,\ldots,N-1$,
$$u^*(k)=\begin{cases}\mathrm{sgn}\{b_k\},&\text{if }b_k\ne0\\ \text{undetermined},&\text{otherwise},\end{cases}$$
and the optimal values are
$$J(N,x_N)=P_Nx_N+Q_N,\qquad J(k,x_k)=P_kx_k+\sum_{i=k}^NQ_i+\sum_{i=k+1}^NP_i\sigma_ie_i,$$
where
$$P_N=A_N,\quad P_k=A_k+P_{k+1}a_k,\qquad Q_N=0,\quad Q_k=|P_{k+1}b_k|,$$
for $k=N-1,N-2,\ldots,1,0$.
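Theorem 5.3 is a one-pass backward recursion, so it can be checked with a few lines of code. The sketch below is our illustration, not the book's program; it assumes the coefficient lists are indexed as in (5.12), with `sigma[0]` and `e[0]` as unused placeholders so that `sigma[i]`, `e[i]` match $\sigma_i$, $e_i$.

```python
def multistage_bang_bang(A, a, b, sigma, e, x0):
    """Backward recursion of Theorem 5.3 (sketch): P_N = A_N,
    P_k = A_k + P_{k+1} a_k, Q_N = 0, Q_k = |P_{k+1} b_k| (valid since
    A_j > 0 and a_j > 0 make every P_k positive), u*(k) = sgn(b_k),
    and J(0, x0) = P_0 x0 + sum_i Q_i + sum_{i>=1} P_i sigma_i e_i."""
    N = len(A) - 1
    P = [0.0] * (N + 1)
    Q = [0.0] * (N + 1)          # Q[N] = 0 by construction
    P[N] = A[N]
    for k in range(N - 1, -1, -1):
        P[k] = A[k] + P[k + 1] * a[k]
        Q[k] = abs(P[k + 1] * b[k])   # = P_{k+1} b_k u*(k) with u*(k) = sgn(b_k)
    u = [1.0 if b[k] > 0 else (-1.0 if b[k] < 0 else 0.0) for k in range(N)]
    J0 = P[0] * x0 + sum(Q) + sum(P[i] * sigma[i] * e[i] for i in range(1, N + 1))
    return u, J0
```

A zero entry of `u` stands for the "undetermined" case $b_k=0$.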

Proof Denote the optimal control for the above problem by $u^*(0),u^*(1),\ldots,u^*(N)$. By using the recurrence Eq. (4.2), we have
$$J(N,x_N)=\max_{|u(N)|\le1}\{A_Nx_N\}=A_Nx_N,$$
where $u^*(N)$ is arbitrary with $|u^*(N)|\le1$. Let $P_N=A_N$, $Q_N=0$. Then $J(N,x_N)=P_Nx_N+Q_N$. For $k=N-1$, by using the recurrence Eq. (4.3), we have
$$J(N-1,x_{N-1})=\max_{|u(N-1)|\le1}E[A_{N-1}x_{N-1}+J(x(N),N)]$$
$$=\max_{|u(N-1)|\le1}\{A_{N-1}x_{N-1}+P_NE[x(N)]+Q_N\}$$
$$=\max_{|u(N-1)|\le1}\{A_{N-1}x_{N-1}+P_NE[a_{N-1}x_{N-1}+b_{N-1}u(N-1)+\sigma_NC_N]+Q_N\}$$
$$=\max_{|u(N-1)|\le1}\{(A_{N-1}+P_Na_{N-1})x_{N-1}+P_Nb_{N-1}u(N-1)+P_N\sigma_Ne_N+Q_N\}.$$
Hence,
$$P_Nb_{N-1}u^*(N-1)=\max_{|u(N-1)|\le1}P_Nb_{N-1}u(N-1).$$
Therefore, we have
$$u^*(N-1)=\begin{cases}1,&\text{if }b_{N-1}>0\\ -1,&\text{if }b_{N-1}<0\\ \text{undetermined},&\text{if }b_{N-1}=0\end{cases}=\begin{cases}\mathrm{sgn}\{b_{N-1}\},&\text{if }b_{N-1}\ne0\\ \text{undetermined},&\text{otherwise}.\end{cases}$$
Hence, we have
$$J(N-1,x_{N-1})=(A_{N-1}+P_Na_{N-1})x_{N-1}+P_Nb_{N-1}u^*(N-1)+P_N\sigma_Ne_N+Q_N.$$
When $b_{N-1}=0$, we have
$$J(N-1,x_{N-1})=(A_{N-1}+P_Na_{N-1})x_{N-1}+P_N\sigma_Ne_N+Q_N.$$
Denote $P_{N-1}=A_{N-1}+P_Na_{N-1}$ and $Q_{N-1}=0$. Then
$$J(N-1,x_{N-1})=P_{N-1}x_{N-1}+Q_{N-1}+Q_N+P_N\sigma_Ne_N.$$
When $b_{N-1}>0$, we have $u^*(N-1)=1$, and then

$$J(N-1,x_{N-1})=(A_{N-1}+P_Na_{N-1})x_{N-1}+P_Nb_{N-1}+Q_N+P_N\sigma_Ne_N.$$
Denote $P_{N-1}=A_{N-1}+P_Na_{N-1}$ and $Q_{N-1}=P_Nb_{N-1}$. Then
$$J(N-1,x_{N-1})=P_{N-1}x_{N-1}+Q_{N-1}+Q_N+P_N\sigma_Ne_N.$$
When $b_{N-1}<0$, we have $u^*(N-1)=-1$, and then
$$J(N-1,x_{N-1})=(A_{N-1}+P_Na_{N-1})x_{N-1}-P_Nb_{N-1}+Q_N+P_N\sigma_Ne_N.$$
Denote $P_{N-1}=A_{N-1}+P_Na_{N-1}$ and $Q_{N-1}=-P_Nb_{N-1}$. Then
$$J(N-1,x_{N-1})=P_{N-1}x_{N-1}+Q_{N-1}+Q_N+P_N\sigma_Ne_N.$$
By induction, we can obtain the conclusion of the theorem. The theorem is proved.

By Theorem 5.3, we can get the exact bang bang optimal controls and the optimal objective values, together with the state of the system at all stages, for a linear objective function subject to an uncertain linear system. If the system is nonlinear in the control variable, we consider the following problem:
$$\begin{cases}J(0,x_0)=\displaystyle\max_{|u(i)|\le1,\,0\le i\le N}E\Big[\sum_{j=0}^NA_jx(j)\Big]\\ \text{subject to}\\ x(j+1)=a_jx(j)+b_ju(j)+d_ju^2(j)+\sigma_{j+1}C_{j+1},\quad j=0,1,2,\ldots,N-1,\\ x(0)=x_0,\end{cases}\qquad(5.13)$$
where $d_j<0$ for $0\le j\le N$, and the other parameters have the same meaning as in (5.12).

Theorem 5.4 ([2]) The optimal controls $u^*(k)$ of (5.13) are provided by: $u^*(N)$ is arbitrary with $|u^*(N)|\le1$, and for $k=0,1,\ldots,N-1$,
$$u^*(k)=\begin{cases}-\dfrac{b_k}{2d_k},&\text{if }2d_k\le b_k\le-2d_k\\[2mm] \mathrm{sgn}\Big\{-\dfrac{b_k}{2d_k}\Big\},&\text{otherwise},\end{cases}$$

and the optimal values are
$$J(N,x_N)=P_Nx_N+Q_N,\qquad J(k,x_k)=P_kx_k+\sum_{i=k}^NQ_i+\sum_{i=k+1}^NP_i\sigma_ie_i,$$
where $P_N=A_N$, $P_k=A_k+P_{k+1}a_k$, and
$$Q_N=0,\qquad Q_k=\begin{cases}P_{k+1}(d_k+b_k),&\text{if }u^*(k)=1\\ P_{k+1}(d_k-b_k),&\text{if }u^*(k)=-1\\ -\dfrac{P_{k+1}b_k^2}{4d_k},&\text{if }u^*(k)=-\dfrac{b_k}{2d_k},\end{cases}$$
for $k=N-1,N-2,\ldots,1,0$.

Proof Denote the optimal control for the above problem by $u^*(0),u^*(1),\ldots,u^*(N)$. By using the recurrence Eq. (4.2), we have
$$J(N,x_N)=\max_{|u(N)|\le1}\{A_Nx_N\}=A_Nx_N,$$
where $|u^*(N)|\le1$. Let $P_N=A_N$, $Q_N=0$. Then $J(N,x_N)=P_Nx_N+Q_N$. For $k=N-1$, by using the recurrence Eq. (4.3), we have
$$J(N-1,x_{N-1})=\max_{|u(N-1)|\le1}E[A_{N-1}x_{N-1}+J(x(N),N)]$$
$$=\max_{|u(N-1)|\le1}\{A_{N-1}x_{N-1}+P_NE[a_{N-1}x_{N-1}+b_{N-1}u(N-1)+d_{N-1}u^2(N-1)+\sigma_NC_N]+Q_N\}$$
$$=(A_{N-1}+P_Na_{N-1})x_{N-1}+P_N\sigma_Ne_N+Q_N+\max_{|u(N-1)|\le1}\{P_Nb_{N-1}u(N-1)+P_Nd_{N-1}u^2(N-1)\}.\qquad(5.14)$$
Let $H(u(N-1))=P_Nb_{N-1}u(N-1)+P_Nd_{N-1}u^2(N-1)$. It follows from
$$\frac{dH(u(N-1))}{du(N-1)}=P_Nb_{N-1}+2P_Nd_{N-1}u(N-1)=0$$
that $u(N-1)=-b_{N-1}/(2d_{N-1})$. If $|b_{N-1}/(2d_{N-1})|\le1$, then $u^*(N-1)=-b_{N-1}/(2d_{N-1})$ is the maximum point of $H(u(N-1))$ (its trace is as H1 in Fig. 5.4) because
$$\frac{d^2H(u(N-1))}{du(N-1)^2}=2P_Nd_{N-1}<0.$$

Fig. 5.4 Three types of functions H(u)

That is, if $2d_{N-1}\le b_{N-1}\le-2d_{N-1}$, then the optimal control at the $(N-1)$th stage is $u^*(N-1)=-b_{N-1}/(2d_{N-1})$. Otherwise, since $H(u(N-1))$ (its trace is as H2 in Fig. 5.4) is increasing on $u(N-1)\in[-1,1]$ if $-b_{N-1}/(2d_{N-1})>1$, and $H(u(N-1))$ (its trace is as H3 in Fig. 5.4) is decreasing on $u(N-1)\in[-1,1]$ if $-b_{N-1}/(2d_{N-1})<-1$, we know that the optimal control at the $(N-1)$th stage is 1 if $-b_{N-1}/(2d_{N-1})>1$ and $-1$ if $-b_{N-1}/(2d_{N-1})<-1$. Hence
$$\max_{|u(N-1)|\le1}H(u(N-1))=\begin{cases}-\dfrac{P_Nb_{N-1}^2}{4d_{N-1}},&\text{if }2d_{N-1}\le b_{N-1}\le-2d_{N-1}\\ P_N(d_{N-1}+b_{N-1}),&\text{if }b_{N-1}>-2d_{N-1}\\ P_N(d_{N-1}-b_{N-1}),&\text{if }b_{N-1}<2d_{N-1}.\end{cases}$$
Substituting it into (5.14) gives the result for $J(N-1,x_{N-1})$. By induction, we can obtain the conclusion of the theorem. The theorem is proved.

Example Consider the following example:
$$\begin{cases}J(0,x_0)=\displaystyle\max_{|u(i)|\le1,\,0\le i\le10}E\Big[\sum_{j=0}^{10}A_jx(j)\Big]\\ \text{subject to}\\ x(j+1)=a_jx(j)+b_ju(j)+\sigma_{j+1}C_{j+1},\quad j=0,1,2,\ldots,9,\\ x(0)=x_0,\end{cases}\qquad(5.15)$$
where the coefficients are listed in Table 5.1. In addition, $C_1,C_2,\ldots,C_{10}$ are independent zigzag uncertain variables $(-1,0,1)$, and then $E[C_j]=0$ for $j=1,2,\ldots,10$. The optimal controls and optimal values are obtained by Theorem 5.3 and listed in Table 5.2. The data in the fourth column of Table 5.2 are the corresponding states, which are derived from $x(k+1)=a_kx(k)+b_ku(k)+\sigma_{k+1}\bar c_{k+1}$ for the initial state $x(0)=1$,

Table 5.1 Coefficients of the example (columns: $j$, $A_j$, $\sigma_j$, $a_j$, $b_j$)

Table 5.2 The optimal results (columns: stage $k$, $r_k$, $\bar c_k$, $x(k)$, $u^*(k)$, $J(k,x_k)$)

where $\bar c_{k+1}$ is the realization of the uncertain variable $C_{k+1}$, and may be generated by $\bar c_{k+1}=2r_{k+1}-1$ for a random number $r_{k+1}\in[0,1]$ ($k=0,1,2,\ldots,9$).

5.3 Equation of Optimality for Saddle Point Problem

A saddle point problem concerns a situation in which one control vector aims at minimizing some given objective function while the other control vector tries to maximize it. This problem often arises in the military and security fields. When launching a missile to pursue a target, we hope to minimize the distance between the missile and the target; meanwhile, the target tries to increase the distance so that it can evade. The police do their best to catch terrorists and reduce losses, while the terrorists do the opposite. This is why we need to study the saddle point problem. The research is based on an uncertain dynamic system as follows:
$$dX_s=f(s,u_1,u_2,X_s)\,ds+g(s,u_1,u_2,X_s)\,dC_s\quad\text{and}\quad X_0=x_0.$$

In the above equation, $X_s$ is the state variable of dimension $n$ with the initial state $X_0=x_0$, $u_1\in D_1\subset R^p$ is a control vector which maximizes some given objective function, and $u_2\in D_2\subset R^q$ is to minimize the objective function. $C_t=(C_{t1},C_{t2},\ldots,C_{tk})^\tau$, where $C_{t1},C_{t2},\ldots,C_{tk}$ are independent canonical Liu processes. In addition, $f:[0,T]\times R^p\times R^q\times R^n\to R^n$ is a vector-valued function, and $g:[0,T]\times R^p\times R^q\times R^n\to R^{n\times k}$ is a matrix-valued function. For any $0<t<T$ and some given confidence level $\alpha\in(0,1)$, we choose the objective function as follows:
$$V(u_1,u_2)=H_{\sup}(\alpha),$$
where $H_{\sup}(\alpha)=\sup\{r\mid M\{H\ge r\}\ge\alpha\}$ and
$$H=\int_t^Th(s,u_1,u_2,X_s)\,ds+G(X_T,T).$$
Besides, $h:[0,T]\times R^p\times R^q\times R^n\to R$ is an integrand function of state and control, and $G$ is a function of terminal reward. All the functions mentioned above are continuous. Then we consider the following saddle point problem. Find $(u_1^*,u_2^*)$ such that
$$\begin{cases}V(u_1,u_2^*)\le V(u_1^*,u_2^*)\le V(u_1^*,u_2)\\ \text{subject to:}\\ dX_s=f(s,u_1,u_2,X_s)\,ds+g(s,u_1,u_2,X_s)\,dC_s,\quad t\le s\le T\\ X_t=x.\end{cases}\qquad(5.16)$$
In fact, the optimal value will change as the initial time $t$ and the initial state $x$ change. Thus we denote the optimal value $V(u_1^*,u_2^*)$ by $J(t,x)$. Now we present the equation of optimality for the saddle point problem in an uncertain environment.

Theorem 5.5 ([3]) Let $J(t,x)$ be twice differentiable on $[0,T]\times R^n$. Then we have
$$-J_t(t,x)=\max_{u_1}\min_{u_2}\Big\{\nabla_xJ(t,x)^\tau f(t,u_1,u_2,x)+h(t,u_1,u_2,x)+\frac{\sqrt3}{\pi}\ln\frac{1-\alpha}{\alpha}\,\|\nabla_xJ(t,x)^\tau g(t,u_1,u_2,x)\|_1\Big\}\qquad(5.17)$$
$$=\min_{u_2}\max_{u_1}\Big\{\nabla_xJ(t,x)^\tau f(t,u_1,u_2,x)+h(t,u_1,u_2,x)+\frac{\sqrt3}{\pi}\ln\frac{1-\alpha}{\alpha}\,\|\nabla_xJ(t,x)^\tau g(t,u_1,u_2,x)\|_1\Big\}\qquad(5.18)$$
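The max-min structure on the right side of (5.17) can be explored numerically once the control sets are discretized. The sketch below is purely illustrative: the integrand is a made-up concave-in-$u_1$, convex-in-$u_2$ function of our own choosing, not one from the book, and the code checks the max-min versus min-max comparison that the proof of Theorem 5.5 relies on.

```python
import numpy as np

def maxmin_minmax(sigma_vals):
    """Given sigma_vals[i, j] = sigma(u1_i, u2_j) on control grids,
    return (max-min, min-max).  The inequality max_i min_j <= min_j max_i
    always holds; equality signals a saddle point of the discretized
    game (illustrative check only)."""
    mm1 = sigma_vals.min(axis=1).max()   # max over u1 of min over u2
    mm2 = sigma_vals.max(axis=0).min()   # min over u2 of max over u1
    return mm1, mm2

u1 = np.linspace(-1.0, 1.0, 41)
u2 = np.linspace(-1.0, 1.0, 41)
# made-up concave/convex integrand: sigma(u1, u2) = 2*u1 - u1^2 + u2^2 - u2
S = 2.0 * u1[:, None] - u1[:, None] ** 2 + u2[None, :] ** 2 - u2[None, :]
lo, hi = maxmin_minmax(S)
```

For this concave/convex choice the two values coincide (a saddle point at $u_1=1$, $u_2=0.5$), matching the interchangeability condition of Remark 5.1.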

Proof Assume that $(u_1^*,u_2^*)$ is the optimal control function pair for the saddle point problem (5.16). We know that $u_1^*$ and $u_2^*$ are the solutions to the following problems:

(P1)
$$\begin{cases}J(t,x)\equiv\max_{u_1\in D_1}V(u_1,u_2^*)\\ \text{subject to: }dX_s=f(s,u_1,u_2^*,X_s)\,ds+g(s,u_1,u_2^*,X_s)\,dC_s\ \text{and}\ X_t=x,\end{cases}$$
and (P2)
$$\begin{cases}J(t,x)\equiv\min_{u_2\in D_2}V(u_1^*,u_2)\\ \text{subject to: }dX_s=f(s,u_1^*,u_2,X_s)\,ds+g(s,u_1^*,u_2,X_s)\,dC_s\ \text{and}\ X_t=x.\end{cases}$$
Applying Theorem 3.2 to (P1) and (P2), we have
$$-J_t(t,x)=\max_{u_1}\Big\{\nabla_xJ(t,x)^\tau f(t,u_1,u_2^*,x)+h(t,u_1,u_2^*,x)+\frac{\sqrt3}{\pi}\ln\frac{1-\alpha}{\alpha}\,\|\nabla_xJ(t,x)^\tau g(t,u_1,u_2^*,x)\|_1\Big\}\qquad(5.19)$$
$$=\min_{u_2}\Big\{\nabla_xJ(t,x)^\tau f(t,u_1^*,u_2,x)+h(t,u_1^*,u_2,x)+\frac{\sqrt3}{\pi}\ln\frac{1-\alpha}{\alpha}\,\|\nabla_xJ(t,x)^\tau g(t,u_1^*,u_2,x)\|_1\Big\}.\qquad(5.20)$$
From (5.19), we know that
$$-J_t(t,x)\le\max_{u_1}\min_{u_2}\Big\{\nabla_xJ(t,x)^\tau f(t,u_1,u_2,x)+h(t,u_1,u_2,x)+\frac{\sqrt3}{\pi}\ln\frac{1-\alpha}{\alpha}\,\|\nabla_xJ(t,x)^\tau g(t,u_1,u_2,x)\|_1\Big\}.\qquad(5.21)$$
Similarly, from (5.20), we can also get
$$-J_t(t,x)\ge\min_{u_2}\max_{u_1}\Big\{\nabla_xJ(t,x)^\tau f(t,u_1,u_2,x)+h(t,u_1,u_2,x)+\frac{\sqrt3}{\pi}\ln\frac{1-\alpha}{\alpha}\,\|\nabla_xJ(t,x)^\tau g(t,u_1,u_2,x)\|_1\Big\}.\qquad(5.22)$$
Let

$$\sigma(u_1,u_2)=\nabla_xJ(t,x)^\tau f(t,u_1,u_2,x)+h(t,u_1,u_2,x)+\frac{\sqrt3}{\pi}\ln\frac{1-\alpha}{\alpha}\,\|\nabla_xJ(t,x)^\tau g(t,u_1,u_2,x)\|_1.$$
We note that
$$\min_{u_2}\sigma(u_1,u_2)\le\min_{u_2}\max_{u_1}\sigma(u_1,u_2),\quad\forall u_1.$$
Thus
$$\max_{u_1}\min_{u_2}\sigma(u_1,u_2)\le\min_{u_2}\max_{u_1}\sigma(u_1,u_2).\qquad(5.23)$$
Together with (5.21) and (5.22), we prove the theorem.

Remark 5.1 The equation of optimality (5.17) for the saddle point problem gives a sufficient condition. If it has solutions, the saddle point is determined. Specially, if the max and min operators are interchangeable and $\sigma(u_1,u_2)$ is concave in $u_1$ (respectively, convex in $u_2$), then the system can reach a saddle point equilibrium.

Remark 5.2 The conclusion we obtained is different from that in the case of the stochastic saddle point problem, which has the extra term
$$\frac12\mathrm{tr}\{g^\tau(t,u_1,u_2,x)\nabla_{xx}J(t,x)g(t,u_1,u_2,x)\}$$
on the right side of the equation compared with the deterministic case. Here, we have the additional term
$$\frac{\sqrt3}{\pi}\ln\frac{1-\alpha}{\alpha}\,\|\nabla_xJ(t,x)^\tau g(t,u_1,u_2,x)\|_1$$
on the right side of the equation compared with the deterministic case. Note that the first-order derivative here may make the calculation easier than in the stochastic case.

5.4 Bang Bang Control for Saddle Point Problem

For a given confidence level $\alpha\in(0,1)$, consider the following model. Find $(u_1^*,u_2^*)$ such that
$$\begin{cases}V(u_1,u_2^*)\le V(u_1^*,u_2^*)\le V(u_1^*,u_2)\\ \text{subject to:}\\ dX_s=[a(X_s,s)+b(X_s,s)u_1+c(X_s,s)u_2]\,ds+\sigma(X_s,s)\,dC_s\ \text{and}\ X_0=x_0\\ u_1\in[-1,1]^p,\ u_2\in[-1,1]^q\end{cases}\qquad(5.24)$$
where
$$V(u_1,u_2)=\Big[\int_0^Tf(X_s,s)\,ds+G(X_T,T)\Big]_{\sup}(\alpha).$$

In the above model, $a:R^n\times[0,T]\to R^n$ is a column-vector function, and $b:R^n\times[0,T]\to R^{n\times p}$, $c:R^n\times[0,T]\to R^{n\times q}$ and $\sigma:R^n\times[0,T]\to R^{n\times k}$ are matrix functions. We still use $J(t,x)$ to denote the optimal reward obtainable in $[t,T]$ with the condition that we are in state $X_t=x$ at time $t$.

Theorem 5.6 ([3]) Let $J(t,x)$ be a twice differentiable function on $[0,T]\times R^n$. Then the optimal control pair of problem (5.24) are bang bang controls.

Proof It follows from the equation of optimality (5.17) that
$$-J_t(t,x)=\max_{u_1}\min_{u_2}\Big\{\nabla_xJ(t,x)^\tau(a(x,t)+b(x,t)u_1+c(x,t)u_2)+f(x,t)+\frac{\sqrt3}{\pi}\ln\frac{1-\alpha}{\alpha}\,\|\nabla_xJ(t,x)^\tau\sigma(x,t)\|_1\Big\}$$
$$=\max_{u_1}\{\nabla_xJ(t,x)^\tau b(x,t)u_1\}+\min_{u_2}\{\nabla_xJ(t,x)^\tau c(x,t)u_2\}+\nabla_xJ(t,x)^\tau a(x,t)+f(x,t)+\frac{\sqrt3}{\pi}\ln\frac{1-\alpha}{\alpha}\,\|\nabla_xJ(t,x)^\tau\sigma(x,t)\|_1.$$
Assume that $u_1^*$ and $u_2^*$ are the optimal controls. We have
$$\max_{u_1}\nabla_xJ(t,x)^\tau b(x,t)u_1=\nabla_xJ(t,x)^\tau b(x,t)u_1^*,\qquad\min_{u_2}\nabla_xJ(t,x)^\tau c(x,t)u_2=\nabla_xJ(t,x)^\tau c(x,t)u_2^*.$$
Let
$$\nabla_xJ(t,x)^\tau b(x,t)=(g_1(t,x),g_2(t,x),\ldots,g_p(t,x)),\qquad(5.25)$$
$$\nabla_xJ(t,x)^\tau c(x,t)=(h_1(t,x),h_2(t,x),\ldots,h_q(t,x)),\qquad(5.26)$$
and $u_1^*=(u_{11}^*(t),u_{12}^*(t),\ldots,u_{1p}^*(t))^\tau$, $u_2^*=(u_{21}^*(t),u_{22}^*(t),\ldots,u_{2q}^*(t))^\tau$. Then, we can easily obtain that
$$u_{1i}^*(t)=\begin{cases}1,&\text{if }g_i(t,x)>0\\ -1,&\text{if }g_i(t,x)<0\\ \text{undetermined},&\text{if }g_i(t,x)=0\end{cases}\qquad(5.27)$$
for $i=1,2,\ldots,p$, and
$$u_{2j}^*(t)=\begin{cases}-1,&\text{if }h_j(t,x)>0\\ 1,&\text{if }h_j(t,x)<0\\ \text{undetermined},&\text{if }h_j(t,x)=0\end{cases}\qquad(5.28)$$

for $j=1,2,\ldots,q$. They are bang bang controls, and Eqs. (5.25) and (5.26) are called switching vectors.

5.4.1 A Special Bang Bang Control Model

Consider the following special bang bang control model. Find $(u_1^*,u_2^*)$ such that
$$\begin{cases}V(u_1,u_2^*)\le V(u_1^*,u_2^*)\le V(u_1^*,u_2)\\ \text{subject to:}\\ dX_s=[a(s)X_s+b(s)u_1+c(s)u_2]\,ds+\sigma(s)\,dC_s\ \text{and}\ X_0=x_0\\ u_1\in[-1,1]^p,\ u_2\in[-1,1]^q\end{cases}\qquad(5.29)$$
where
$$V(u_1,u_2)=\Big[\int_0^Tf^\tau(s)X_s\,ds+g_T^\tau X_T\Big]_{\sup}(\alpha).$$
In addition, $a:[0,T]\to R^{n\times n}$, $b:[0,T]\to R^{n\times p}$, $c:[0,T]\to R^{n\times q}$ and $\sigma:[0,T]\to R^{n\times k}$ are all matrix functions. Besides, $f:[0,T]\to R^n$ is a continuously differentiable function and $g_T\in R^n$. We denote $b(s)=(b_{li}(s))_{n\times p}$ and $c(s)=(c_{lj}(s))_{n\times q}$. Then, we have the conclusion below.

Theorem 5.7 ([3]) Let $J(t,x)$ be a twice differentiable function on $[0,T]\times R^n$. Then the optimal control pair of problem (5.29) are:
$$u_{1i}^*(t)=\mathrm{sgn}\{(b_{1i}(t),b_{2i}(t),\ldots,b_{ni}(t))p(t)\}\quad\text{for }i=1,2,\ldots,p,\qquad(5.30)$$
$$u_{2j}^*(t)=-\mathrm{sgn}\{(c_{1j}(t),c_{2j}(t),\ldots,c_{nj}(t))p(t)\}\quad\text{for }j=1,2,\ldots,q,\qquad(5.31)$$
where $p(t)\in R^n$ satisfies the following equation:
$$\dot p(t)=-f(t)-a^\tau(t)p(t),\qquad p(T)=g_T.\qquad(5.32)$$
And the optimal value is
$$J(0,x_0)=p(0)^\tau x_0+\int_0^Tp^\tau(s)(b(s)u_1^*+c(s)u_2^*)\,ds+\frac{\sqrt3}{\pi}\ln\frac{1-\alpha}{\alpha}\int_0^T\|p^\tau(s)\sigma(s)\|_1\,ds.$$

Proof Applying the equation of optimality (5.17), we have

$$-J_t(t,x)=\max_{u_1}\min_{u_2}\Big\{\nabla_xJ(t,x)^\tau(a(t)x+b(t)u_1+c(t)u_2)+f^\tau(t)x+\frac{\sqrt3}{\pi}\ln\frac{1-\alpha}{\alpha}\,\|\nabla_xJ(t,x)^\tau\sigma(t)\|_1\Big\}$$
$$=\max_{u_1}\{\nabla_xJ(t,x)^\tau b(t)u_1\}+\min_{u_2}\{\nabla_xJ(t,x)^\tau c(t)u_2\}+f^\tau(t)x+\nabla_xJ(t,x)^\tau a(t)x+\frac{\sqrt3}{\pi}\ln\frac{1-\alpha}{\alpha}\,\|\nabla_xJ(t,x)^\tau\sigma(t)\|_1.\qquad(5.33)$$
Since $J(T,X_T)=g_T^\tau X_T$, we conjecture that $J(t,x)=p^\tau(t)x+q(t)$ with $p(T)=g_T$, $q(T)=0$. Then $J_t(t,x)=\dot p^\tau(t)x+\dot q(t)$ and $\nabla_xJ(t,x)=p(t)$. Substituting them into (5.33) yields
$$-\dot p^\tau(t)x-\dot q(t)=p^\tau(t)b(t)u_1^*+p^\tau(t)c(t)u_2^*+f^\tau(t)x+p^\tau(t)a(t)x+\frac{\sqrt3}{\pi}\ln\frac{1-\alpha}{\alpha}\,\|p^\tau(t)\sigma(t)\|_1.$$
Thus, we have
$$\dot p(t)=-f(t)-a^\tau(t)p(t)\qquad(5.34)$$
$$\dot q(t)=-p^\tau(t)(b(t)u_1^*+c(t)u_2^*)-\frac{\sqrt3}{\pi}\ln\frac{1-\alpha}{\alpha}\,\|p^\tau(t)\sigma(t)\|_1.\qquad(5.35)$$
According to Theorem 5.6, we can obtain the bang bang controls:
$$u_{1i}^*(t)=\mathrm{sgn}\{p(t)^\tau(b_{1i}(t),b_{2i}(t),\ldots,b_{ni}(t))^\tau\}\quad\text{for }i=1,2,\ldots,p,$$
and
$$u_{2j}^*(t)=-\mathrm{sgn}\{p(t)^\tau(c_{1j}(t),c_{2j}(t),\ldots,c_{nj}(t))^\tau\}\quad\text{for }j=1,2,\ldots,q.$$
Integrating (5.35) from $t$ to $T$, we have
$$q(t)=\int_t^Tp^\tau(s)(b(s)u_1^*+c(s)u_2^*)\,ds+\frac{\sqrt3}{\pi}\ln\frac{1-\alpha}{\alpha}\int_t^T\|p^\tau(s)\sigma(s)\|_1\,ds.$$
The conclusions are proved.

5.4.2 Example

Consider the following example of the bang bang control model for the saddle point problem. We have the system equations as follows:

$$\begin{cases}dX_1(s)=(X_1(s)+X_2(s)+u_1(s))\,ds+\sigma\,dC_s\\ dX_2(s)=2u_2(s)\,ds\end{cases}$$
where $\sigma\in R$, $X(0)=(X_1(0),X_2(0))=x_0$ and $u_1(s),u_2(s)\in[-1,1]$, $0\le s\le1$. The performance index is
$$V(u_1,u_2)=\Big[\int_0^1(X_1(s)+X_2(s))\,ds+X_1(1)-X_2(1)\Big]_{\sup}(\alpha),$$
in which $u_1$ aims at maximizing the performance index while $u_2$ does the minimizing job. Here
$$a(s)=\begin{pmatrix}1&1\\0&0\end{pmatrix},\quad b(s)=\begin{pmatrix}1\\0\end{pmatrix},\quad c(s)=\begin{pmatrix}0\\2\end{pmatrix},\quad f(s)=\begin{pmatrix}1\\1\end{pmatrix},\quad g_1=\begin{pmatrix}1\\-1\end{pmatrix}.$$
It follows from (5.32) that
$$\dot p=-\begin{pmatrix}1\\1\end{pmatrix}-\begin{pmatrix}1&0\\1&0\end{pmatrix}p,\qquad p(1)=\begin{pmatrix}1\\-1\end{pmatrix}.$$
We can obtain the solution
$$p(t)=\begin{pmatrix}2e^{1-t}-1\\2e^{1-t}-3\end{pmatrix}.$$
Thus, according to Theorem 5.7, we find the bang bang controls:
$$u_1^*=1,\qquad u_2^*=-\mathrm{sgn}\{2(2e^{1-t}-3)\},$$
which are shown in Fig. 5.5. Denote $x_0=(x_{01},x_{02})^\tau$. We can obtain the system states as follows:
$$X_1(t)=\begin{cases}e^t(x_{01}+x_{02}-1)+\sigma e^t\xi(t)+2t-x_{02}+1,&0\le t<1+\ln2-\ln3,\\ e^t(x_{01}+x_{02}-1)+\sigma e^t\xi(t)+6e^{t-1}-2t+4\ln2-4\ln3+1-x_{02},&1+\ln2-\ln3<t\le1,\end{cases}$$
$$X_2(t)=\begin{cases}x_{02}-2t,&0\le t<1+\ln2-\ln3,\\ x_{02}+2t-(4+4\ln2-4\ln3),&1+\ln2-\ln3<t\le1,\end{cases}$$
where $\xi(t)=\int_0^te^{-s}\,dC_s$ is an uncertain process which is subject to the normal distribution $N(0,1-e^{-t})$ when $t$ is fixed. The optimal value $J(0,x_0)$ is
$$(2e-1)x_{01}+(2e-3)x_{02}+11-2e+12\ln2-12\ln3+\frac{\sqrt3}{\pi}\ln\frac{1-\alpha}{\alpha}(2e-3)\sigma.$$
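A quick numeric check of the closed-form costate is possible. The sketch below is our illustration, not code from the book: it verifies the terminal condition of (5.32), the ODE components $\dot p_1=-1-p_1$ and $\dot p_2=-1-p_1$ by finite differences, and the switching time $t=1+\ln2-\ln3\approx0.5945$ at which $p_2$ vanishes and $u_2^*$ changes sign.

```python
import math

def p(t):
    """Closed-form costate of the example: p(t) = (2e^(1-t) - 1,
    2e^(1-t) - 3), solving p' = -f - a^T p with p(1) = (1, -1)."""
    return (2.0 * math.exp(1.0 - t) - 1.0,
            2.0 * math.exp(1.0 - t) - 3.0)

# u2* = -sgn(2*p2(t)) switches where p2(t) = 0:
t_switch = 1.0 + math.log(2.0) - math.log(3.0)
```

Since $p_1(t)>0$ on $[0,1]$, the maximizer's control $u_1^*$ never switches, which matches $u_1^*=1$ above.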

Fig. 5.5 Bang bang control for two variables

Fig. 5.6 A sample trajectory

It can be seen that the trajectory of the system is an uncertain vector. If we set $\sigma=0.1$, $x_0=(0,1)$ and $\alpha=0.85$, we can simulate a sample trajectory of the uncertain vector. From Fig. 5.6, we can observe the roughness of the curve, which is caused by the uncertain vector in the system. In this sample, the switching point is $(1.225,-0.178)$. But this does not mean the control always switches at this point: because of the uncertain vector, the switching point changes at every simulation. The optimal value for $\alpha=0.85$ then follows from the expression above.

References
1. Xu X, Zhu Y (2012) Uncertain bang-bang control for continuous time model. Cybern Syst Int J 43(6)
2. Kang Y, Zhu Y (2012) Bang-bang optimal control for multi-stage uncertain systems. Inf Int Interdiscip J 15(8)
3. Sun Y, Zhu Y (2017) Bang-bang property for an uncertain saddle point problem. J Intell Manuf 28(3):605-613

Chapter 6
Optimal Control for Switched Uncertain Systems

Many practical systems operate by switching between different subsystems or modes; they are called switched systems. The optimal control problems of switched systems arise naturally when the control systems under consideration have multiple operating modes. A powertrain system [1] can be viewed as a switched system which needs to switch between different gears to achieve objectives such as fast and smooth acceleration in response to the driver's commands, low fuel consumption, and low levels of pollutant emissions. For switched systems, the aim of optimal control is to seek both the optimal switching law and the optimal continuous input to optimize a certain performance criterion.

Many successful algorithms have already been developed to seek the optimal control of switched systems. It is worth mentioning that Xu and Antsaklis [2] considered the optimal control of continuous-time switched systems. A two-stage optimization strategy was proposed in [2]: Stage (a) is a conventional optimal control problem under a given switching law, and Stage (b) is a constrained nonlinear optimization problem that finds the locally optimal switching instants. A general continuous-time switching problem was investigated in [3] based on the maximum principle and an embedding method. Furthermore, Teo et al. proposed a control parameterization technique [4] and the time scaling transform method [5] to find the approximate optimal control inputs and switching instants, which have been used extensively.

For continuous-time switched systems with subsystems perturbed by uncertainty, our aim is to seek both the switching instants and the optimal continuous input to optimize a certain performance criterion. In this chapter, we will study such problems based on different criteria and provide suitable solution methods.

6.1 Switched Uncertain Model

Consider a switched uncertain system consisting of the following subsystems:
$$\begin{cases}dX_s=(A_i(s)X_s+B_i(s)u_s)\,ds+\sigma(s,u_s,X_s)\,dC_s,\quad s\in[0,T]\\ i\in I=\{1,2,\ldots,M\}\\ X_0=x_0\end{cases}\qquad(6.1)$$
where $X_s\in R^n$ is the state vector and $u_s\in R^r$ is the decision vector in a domain $U$, $A_i:[0,T]\to R^{n\times n}$ and $B_i:[0,T]\to R^{n\times r}$ are some twice continuously differentiable functions for $i\in I$, and $C_s=(C_{s1},C_{s2},\ldots,C_{sk})^\tau$, where $C_{s1},C_{s2},\ldots,C_{sk}$ are independent canonical Liu processes.

An optimal control problem of such a system involves finding an optimal control $u_t^*$ and an optimal switching law such that a given cost function is minimized. A switching law in $[0,T]$ for system (6.1) is defined as
$$\Lambda=((t_0,i_0),(t_1,i_1),\ldots,(t_K,i_K)),$$
where $t_k$ ($k=0,1,\ldots,K$) satisfying $0=t_0\le t_1\le\cdots\le t_K\le t_{K+1}=T$ are the switching instants and $i_k\in I$ for $k=0,1,\ldots,K$. Here $(t_k,i_k)$ indicates that at instant $t_k$, the system switches from subsystem $i_{k-1}$ to $i_k$; during the time interval $[t_k,t_{k+1})$ ($[t_K,T]$ if $k=K$), subsystem $i_k$ is active. Since many practical problems only involve optimizations in which a prespecified order of active subsystems is given, for convenience we assume subsystem $i$ is active in $[t_{i-1},t_i)$.

6.2 Expected Value Model

Consider the following uncertain expected value optimal control model of a switched uncertain system:
$$\begin{cases}\displaystyle\min_{t_i}\ \min_{u_s\in[-1,1]^r}E\Big[\int_0^Tf(s)^\tau X_s\,ds+S_T^\tau X_T\Big]\\ \text{subject to}\\ dX_s=(A_i(s)X_s+B_i(s)u_s)\,ds+\sigma(s,u_s,X_s)\,dC_s\\ s\in[t_{i-1},t_i),\ i=1,2,\ldots,K+1\\ X_0=x_0.\end{cases}\qquad(6.2)$$
In the above model, $f$ is the objective function of dimension $n$ and $S_T\in R^n$. For given $t_1,t_2,\ldots,t_K$, use $J(t,x)$ to denote the optimal value obtained in $[t,T]$ with the condition that at time $t$ we are in state $X_t=x$. That is,

$$\begin{cases}J(t,x)=\displaystyle\min_{u_s\in[-1,1]^r}E\Big[\int_t^Tf(s)^\tau X_s\,ds+S_T^\tau X_T\Big]\\ \text{subject to}\\ dX_s=(A_i(s)X_s+B_i(s)u_s)\,ds+\sigma(s,u_s,X_s)\,dC_s\\ s\in[t_{i-1},t_i),\ i=1,2,\ldots,K+1\\ X_t=x.\end{cases}\qquad(6.3)$$
By applying the equation of optimality (2.15) to the model (6.2), the following conclusion can be obtained.

Theorem 6.1 Let $J(t,x)$ be twice differentiable on $[t_{i-1},t_i)\times R^n$. Then we have
$$-J_t(t,x)=\min_{u_t\in[-1,1]^r}\{f(t)^\tau x+(A_i(t)x+B_i(t)u_t)^\tau\nabla_xJ(t,x)\},\qquad(6.4)$$
where $J_t(t,x)$ is the partial derivative of the function $J(t,x)$ in $t$, and $\nabla_xJ(t,x)$ is the gradient of $J(t,x)$ in $x$.

An optimal control problem of switched uncertain systems given by (6.2) is to choose the best switching instants and the optimal inputs such that an expected value is optimized subject to a switched uncertain system.

6.2.1 Two-Stage Algorithm

In order to solve the problem (6.2), we decompose it into two stages. Stage (a) is an uncertain optimal control problem which seeks the optimal value under a given switching sequence. Stage (b) is an optimization problem in the switching instants.

Stage (a)

In this stage, we need to solve the following model and find the optimal value:
$$\begin{cases}J(0,x_0,t_1,\ldots,t_K)=\displaystyle\min_{u_s\in[-1,1]^r}E\Big[\int_0^Tf(s)^\tau X_s\,ds+S_T^\tau X_T\Big]\\ \text{subject to}\\ dX_s=(A_i(s)X_s+B_i(s)u_s)\,ds+\sigma(s,u_s,X_s)\,dC_s\\ s\in[t_{i-1},t_i),\ i=1,2,\ldots,K+1\\ X_0=x_0\end{cases}\qquad(6.5)$$
where $t_1,t_2,\ldots,t_K$ are fixed and $t_0=0$, $t_{K+1}=T$. Denote $B_i(s)=(b_{lj}^{(i)}(s))_{n\times r}$. We have the following conclusion.

Theorem 6.2 ([6]) Let $J(t,x)$ be twice differentiable on $[t_{i-1},t_i)\times\mathbb R^n$ ($i=1,2,\ldots,K+1$). The optimal control $u^{(i)}_t=(u^{(i)}_1(t),u^{(i)}_2(t),\ldots,u^{(i)}_r(t))^\tau$ of (6.5) is a bang-bang control
$$u^{(i)}_j(t)=\operatorname{sgn}\{-(b^{(i)}_{1j}(t),b^{(i)}_{2j}(t),\ldots,b^{(i)}_{nj}(t))\,p_i(t)\}\tag{6.6}$$
for $i=1,2,\ldots,K+1$, $j=1,2,\ldots,r$, where $p_i(t)\in\mathbb R^n$, $t\in[t_{i-1},t_i)$, satisfies
$$\frac{dp_i(t)}{dt}=-f(t)-A_i(t)^\tau p_i(t),\qquad p_{K+1}(T)=S_T\ \text{and}\ p_i(t_i)=p_{i+1}(t_i)\ \text{for}\ i\le K.\tag{6.7}$$
The optimal value of model (6.5) is
$$J(0,x_0,t_1,\ldots,t_K)=p_1(0)^\tau x_0+\sum_{i=1}^{K+1}\int_{t_{i-1}}^{t_i}p_i(t)^\tau B_i(t)u^{(i)}_t\,dt.\tag{6.8}$$

Proof First we prove that the optimal control of model (6.5) is a bang-bang control. It follows from the equation of optimality (6.4) that
$$-J_t(t,x)=\min_{u_t\in[-1,1]^r}\{f(t)^\tau x+(A_i(t)x+B_i(t)u_t)^\tau\nabla_xJ(t,x)\}.\tag{6.9}$$
On the right side of (6.9), let $u^{(i)}_t$ make it the minimum. We have
$$\min_{u_t\in[-1,1]^r}\{f(t)^\tau x+(A_i(t)x+B_i(t)u_t)^\tau\nabla_xJ(t,x)\}=f(t)^\tau x+(A_i(t)x+B_i(t)u^{(i)}_t)^\tau\nabla_xJ(t,x).$$
That is,
$$\min_{u_t\in[-1,1]^r}\{\nabla_xJ(t,x)^\tau B_i(t)u_t\}=\nabla_xJ(t,x)^\tau B_i(t)u^{(i)}_t.$$
Denote
$$u^{(i)}_t=(u^{(i)}_1(t),u^{(i)}_2(t),\ldots,u^{(i)}_r(t))^\tau\tag{6.10}$$
and
$$\nabla_xJ(t,x)^\tau B_i(t)=(g^{(i)}_1(t,x),g^{(i)}_2(t,x),\ldots,g^{(i)}_r(t,x)).\tag{6.11}$$
Then
$$u^{(i)}_j(t)=\begin{cases}1,&\text{if }g^{(i)}_j(t,x)<0\\ -1,&\text{if }g^{(i)}_j(t,x)>0\\ \text{undetermined},&\text{if }g^{(i)}_j(t,x)=0\end{cases}\tag{6.12}$$
for $i=1,2,\ldots,K+1$, $j=1,2,\ldots,r$, which is a bang-bang control. The functions $g^{(i)}_j(t,x)$ are called switching functions. If at least one switching function equals

zero in some interval, we call it a singular control. But here we only consider switching functions that equal zero at most at some discrete points.

According to (6.9), when $t\in[t_K,T]$, we have
$$-J_t(t,x)=\min_{u_t\in[-1,1]^r}\{f(t)^\tau x+(A_{K+1}(t)x+B_{K+1}(t)u_t)^\tau\nabla_xJ(t,x)\}.\tag{6.13}$$
Since $J(T,x_T)=S_T^\tau x_T$, we guess
$$J(t,x)=p_{K+1}(t)^\tau x+q_{K+1}(t)$$
with $p_{K+1}(T)=S_T$, $q_{K+1}(T)=0$. So
$$\nabla_xJ(t,x)=p_{K+1}(t),\qquad J_t(t,x)=\frac{dp_{K+1}(t)^\tau}{dt}x+\frac{dq_{K+1}(t)}{dt}.\tag{6.14}$$
Thus, it follows from (6.12) that
$$u^{(K+1)}_j(t)=\operatorname{sgn}\{-(b^{(K+1)}_{1j}(t),b^{(K+1)}_{2j}(t),\ldots,b^{(K+1)}_{nj}(t))\,p_{K+1}(t)\}.\tag{6.15}$$
Substituting (6.14) into (6.13) gives
$$-\frac{dp_{K+1}(t)^\tau}{dt}x-\frac{dq_{K+1}(t)}{dt}=f(t)^\tau x+(A_{K+1}(t)x+B_{K+1}(t)u^{(K+1)}_t)^\tau p_{K+1}(t).$$
Therefore,
$$-\frac{dp_{K+1}(t)^\tau}{dt}=f(t)^\tau+p_{K+1}(t)^\tau A_{K+1}(t),\qquad p_{K+1}(T)=S_T,\tag{6.16}$$
and
$$-\frac{dq_{K+1}(t)}{dt}=p_{K+1}(t)^\tau B_{K+1}(t)u^{(K+1)}_t,\qquad q_{K+1}(T)=0.\tag{6.17}$$
From (6.17), we have
$$q_{K+1}(t)=\int_t^T p_{K+1}(s)^\tau B_{K+1}(s)u^{(K+1)}_s\,ds.\tag{6.18}$$
Furthermore,
$$J(t,x)=p_{K+1}(t)^\tau x+q_{K+1}(t)=p_{K+1}(t)^\tau x+\int_t^T p_{K+1}(s)^\tau B_{K+1}(s)u^{(K+1)}_s\,ds,\quad t\in[t_K,T],\tag{6.19}$$
where $p_{K+1}(t)$ satisfies the differential equation and boundary condition (6.16). When $t\in[t_{i-1},t_i)$ for $i\le K$, assume

$$J(t,x)=p_i(t)^\tau x+q_i(t),$$
with $p_i(t_i)=p_{i+1}(t_i)$, $q_i(t_i)=q_{i+1}(t_i)$. By the same method as the above procedure, we can get
$$-\frac{dp_i(t)^\tau}{dt}=f(t)^\tau+p_i(t)^\tau A_i(t),\quad p_i(t_i)=p_{i+1}(t_i);\qquad -\frac{dq_i(t)}{dt}=p_i(t)^\tau B_i(t)u^{(i)}_t,\quad q_i(t_i)=q_{i+1}(t_i).\tag{6.20}$$
Hence,
$$J(t,x)=p_i(t)^\tau x+q_i(t)=p_i(t)^\tau x+\int_t^{t_i}p_i(s)^\tau B_i(s)u^{(i)}_s\,ds+q_{i+1}(t_i),\quad t\in[t_{i-1},t_i).$$
Hence, the optimal value of the model (6.5) is
$$J(0,x_0,t_1,\ldots,t_K)=p_1(0)^\tau x_0+\sum_{i=1}^{K+1}\int_{t_{i-1}}^{t_i}p_i(t)^\tau B_i(t)u^{(i)}_t\,dt.$$
The theorem is proved.

If there are only two subsystems, the model is as follows:
$$J(0,x_0,t_1)=\min_{u_s\in[-1,1]^2}E\Big[\int_0^T f(s)^\tau X_s\,ds+S_T^\tau X_T\Big]$$
subject to
$$\begin{cases}dX_s=(A_1(s)X_s+B_1(s)u_s)\,ds+\sigma(s,u_s,X_s)\,dC_s,& s\in[t_0,t_1)\\ dX_s=(A_2(s)X_s+B_2(s)u_s)\,ds+\sigma(s,u_s,X_s)\,dC_s,& s\in[t_1,T]\\ X_0=x_0.\end{cases}\tag{6.21}$$
According to Theorem 6.2, two Riccati differential equations have to be solved in order to solve model (6.21). Then the optimal cost $J(0,x_0,t_1)$ can be obtained as follows:
$$J(0,x_0,t_1)=p_1(0)^\tau x_0+\int_0^{t_1}p_1(t)^\tau B_1(t)u^{(1)}_t\,dt+\int_{t_1}^T p_2(t)^\tau B_2(t)u^{(2)}_t\,dt.$$
Denote $J(t_1)=J(0,x_0,t_1)$. The next stage is to solve the optimization problem
$$\min_{0\le t_1\le T}J(t_1).\tag{6.22}$$
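For a fixed $t_1$, Theorem 6.2 turns Stage (a) into a single backward sweep: integrate the costate equation (6.7) from $T$ to $0$ (switching the matrices at $t_1$) while accumulating the cost (6.8) under the bang-bang rule (6.6). The following sketch does this numerically for the two-subsystem data of the example in the next subsection ($f=0$, $x_0=0$); the Euler discretization and the function name are illustrative assumptions, not the book's algorithm.

```python
# Hedged sketch: evaluate J(0, x0, t1) of (6.21) by a backward Euler pass.
# Assumed data (from the worked example): f = 0, x0 = 0,
# A1 = [[0,0],[1,0]], A2 = [[0,2],[0,0]], B1 = I, B2 = [[-1,0],[1,1]],
# terminal condition p(T) = S_T = (1/2, -2/3).

def solve_stage_a(t1, T=1.0, n=20000):
    h = T / n
    A = [[[0.0, 0.0], [1.0, 0.0]], [[0.0, 2.0], [0.0, 0.0]]]
    B = [[[1.0, 0.0], [0.0, 1.0]], [[-1.0, 0.0], [1.0, 1.0]]]
    p = [0.5, -2.0 / 3.0]              # p(T) = S_T
    J = 0.0                            # p1(0)^T x0 vanishes since x0 = 0
    for k in range(n, 0, -1):          # integrate backward from T to 0
        t = k * h
        Ai, Bi = (A[0], B[0]) if t <= t1 else (A[1], B[1])
        # switching functions g = B_i^T p; bang-bang rule (6.6): u_j = -sgn(g_j)
        g = [Bi[0][j] * p[0] + Bi[1][j] * p[1] for j in range(2)]
        u = [-1.0 if gj > 0 else 1.0 for gj in g]
        J += h * sum(g[j] * u[j] for j in range(2))   # running term of (6.8)
        # costate equation (6.7) with f = 0: dp/dt = -A_i^T p
        p = [p[l] + h * (Ai[0][l] * p[0] + Ai[1][l] * p[1]) for l in range(2)]
    return J
```

With the example's optimal instant $t_1^*\approx0.596$ this reproduces the cost reported there to within the discretization error, which is the kind of check Stage (b) relies on when it probes many candidate values of $t_1$.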

Stage (b)

For model (6.21), we cannot obtain analytical expressions of the solutions according to Theorem 6.2, which leads to the unavailability of explicit forms of the first-order and second-order derivatives of the cost function in $t_1$. Because the cost functions of optimal control problems are usually not multimodal in practice, the modified golden section method [7], which does not require derivatives of cost functions, can be carried out to solve the optimization problem (6.22). This method is usually used to solve one-dimensional optimization problems. Its basic idea for minimizing a function over an interval is to iteratively reduce the length of the interval by comparing the function values at the observation points. When the length of the interval is reduced to an acceptable degree, the points of the interval can be regarded as approximations of a minimizer. We can use the following algorithm to solve the optimization problem.

Algorithm 6.1 (Modified golden section method for solving (6.22))
Step 1. Give the iteration precision $\varepsilon>0$. Set $a_1=0$, $b_1=T$, $\lambda_1=a_1+0.382(b_1-a_1)=0.382T$, $\mu_1=a_1+0.618(b_1-a_1)=0.618T$. Calculate $J(a_1)$, $J(b_1)$, $J(\lambda_1)$, $J(\mu_1)$. Put $k=1$.
Step 2. If $b_k-a_k<\varepsilon$, end. The optimal solution $t_1^*\in[a_k,b_k]$; let $t_1^*=\frac12(a_k+b_k)$.
Step 3. Let $\bar J=\min\{J(a_k),J(b_k),J(\lambda_k),J(\mu_k)\}$. If $\bar J=J(a_k)$ or $\bar J=J(\lambda_k)$, go to Step 4; otherwise, go to Step 5.
Step 4. Let $a_{k+1}:=a_k$, $\mu_{k+1}:=\lambda_k$, $b_{k+1}:=\mu_k$, $J(a_{k+1}):=J(a_k)$, $J(\mu_{k+1}):=J(\lambda_k)$, $J(b_{k+1}):=J(\mu_k)$, $\lambda_{k+1}=a_{k+1}+0.382(b_{k+1}-a_{k+1})$. Calculate $J(\lambda_{k+1})$. Turn to Step 6.
Step 5. Let $a_{k+1}:=\lambda_k$, $\lambda_{k+1}:=\mu_k$, $b_{k+1}:=b_k$, $J(a_{k+1}):=J(\lambda_k)$, $J(\lambda_{k+1}):=J(\mu_k)$, $J(b_{k+1}):=J(b_k)$, $\mu_{k+1}=a_{k+1}+0.618(b_{k+1}-a_{k+1})$. Calculate $J(\mu_{k+1})$.
Step 6. Let $k:=k+1$. Turn to Step 2.

From Algorithm 6.1, we can see that after the $n$th iteration the length of the interval is $(0.618)^nT$.
Therefore, the convergence rate of this method is linear.
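The interval-reduction loop above can be sketched in a few lines. This is the plain golden-section reduction; condensing the four-point bookkeeping of the modified method into the standard two-interior-point update is an assumption of this illustration, and `J` stands for any unimodal cost such as Stage (a)'s $t_1\mapsto J(0,x_0,t_1)$.

```python
# Minimal golden-section sketch of Algorithm 6.1 (illustrative names).
def golden_section(J, a, b, eps=1e-4):
    r = 0.618                      # golden ratio conjugate; 1 - r = 0.382
    lam = a + (1 - r) * (b - a)    # lambda_1 = a + 0.382 (b - a)
    mu = a + r * (b - a)           # mu_1     = a + 0.618 (b - a)
    f_lam, f_mu = J(lam), J(mu)
    while b - a >= eps:
        if f_lam < f_mu:           # minimum lies in [a, mu]: reuse lam as new mu
            b, mu, f_mu = mu, lam, f_lam
            lam = a + (1 - r) * (b - a)
            f_lam = J(lam)
        else:                      # minimum lies in [lam, b]: reuse mu as new lam
            a, lam, f_lam = lam, mu, f_mu
            mu = a + r * (b - a)
            f_mu = J(mu)
    return 0.5 * (a + b)           # Step 2: midpoint of the final interval
```

Each pass costs only one new evaluation of $J$, which matters here because every evaluation of $J(t_1)$ requires solving the Stage (a) differential equations.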

An Example

Consider the following example of an optimal control model for switched uncertain systems:
$$J(0,x_0)=\min_{t_1}\ \min_{u_s}E\Big[\frac12X_1(1)-\frac23X_2(1)\Big]$$
subject to
$$\text{subsystem 1: }\begin{cases}dX_1(s)=u_1(s)\,ds\\ dX_2(s)=(X_1(s)+u_2(s))\,ds+\sigma\,dC_s,\end{cases}\quad s\in[0,t_1)$$
$$\text{subsystem 2: }\begin{cases}dX_1(s)=(2X_2(s)-u_1(s))\,ds\\ dX_2(s)=(u_1(s)+u_2(s))\,ds+\sigma\,dC_s,\end{cases}\quad s\in[t_1,1]$$
$$X(0)=(X_1(0),X_2(0))^\tau=(0,0)^\tau,\qquad |u_i(s)|\le1,\ 0\le s\le1,\ i=1,2.\tag{6.23}$$
We have
$$A_1(s)=\begin{pmatrix}0&0\\1&0\end{pmatrix},\quad A_2(s)=\begin{pmatrix}0&2\\0&0\end{pmatrix},\quad B_1(s)=\begin{pmatrix}1&0\\0&1\end{pmatrix},\quad B_2(s)=\begin{pmatrix}-1&0\\1&1\end{pmatrix},\quad S_T=\begin{pmatrix}\frac12\\[2pt]-\frac23\end{pmatrix}.$$
It follows from (6.7) that
$$\frac{dp_2(t)}{dt}=-\begin{pmatrix}0&0\\2&0\end{pmatrix}p_2(t),\qquad p_2(1)=\begin{pmatrix}\frac12\\[2pt]-\frac23\end{pmatrix},$$
which has the solution
$$p_2(t)=\begin{pmatrix}\frac12\\[2pt]\frac13-t\end{pmatrix}.$$
Hence,
$$u^{(2)}_t=\begin{pmatrix}u^{(2)}_1(t)\\u^{(2)}_2(t)\end{pmatrix}=\begin{pmatrix}\operatorname{sgn}\big(\frac16+t\big)\\[2pt]\operatorname{sgn}\big(t-\frac13\big)\end{pmatrix}.$$
It also follows from (6.7) that
$$\frac{dp_1(t)}{dt}=-\begin{pmatrix}0&1\\0&0\end{pmatrix}p_1(t),\qquad p_1(t_1)=p_2(t_1)=\begin{pmatrix}\frac12\\[2pt]\frac13-t_1\end{pmatrix},$$
and the solution is

$$p_1(t)=\begin{pmatrix}\big(t_1-\frac13\big)t+\frac12+\frac{t_1}{3}-t_1^2\\[2pt]\frac13-t_1\end{pmatrix}.$$
Hence,
$$u^{(1)}_t=\begin{pmatrix}u^{(1)}_1(t)\\u^{(1)}_2(t)\end{pmatrix}=\begin{pmatrix}\operatorname{sgn}\big[\big(\frac13-t_1\big)t-\big(\frac12+\frac{t_1}{3}-t_1^2\big)\big]\\[2pt]\operatorname{sgn}\big(t_1-\frac13\big)\end{pmatrix}.$$
Choose $\varepsilon=0.01$. By applying Algorithm 6.1, after 10 iterations we find the optimal switching instant $t_1^*\in[0.592,0.600]$, and $t_1^*=0.596$. The corresponding optimal cost is $-0.985$. The optimal control law and $J(t,x)$ are
$$u^{(1)}_t=\begin{pmatrix}u^{(1)}_1(t)\\u^{(1)}_2(t)\end{pmatrix}=\begin{pmatrix}-1\\1\end{pmatrix},\ t\in[0,0.596),\qquad u^{(2)}_t=\begin{pmatrix}u^{(2)}_1(t)\\u^{(2)}_2(t)\end{pmatrix}=\begin{pmatrix}1\\1\end{pmatrix},\ t\in[0.596,1],$$
$$J(t,x)=\begin{cases}(0.26t+0.34)x_1-0.26x_2+0.13t^2+0.61t-0.985,& t\in[0,0.596)\\[4pt]\frac12x_1+\big(\frac13-t\big)x_2+t^2-\frac{t}{6}-\frac56,& t\in[0.596,1].\end{cases}$$

6.3 LQ Switched Optimal Control Problem

We consider a special kind of model of switched uncertain systems with a quadratic objective function subject to linear uncertain differential equations. The following uncertain expected value LQ model of switched uncertain systems is considered:
$$J(0,x_0)=\min_{t_i}\ \min_{u_s}E\Big[\int_0^T\Big(\frac12X_s^\tau Q(s)X_s+X_s^\tau V(s)u_s+\frac12u_s^\tau R(s)u_s+M(s)X_s+N(s)u_s+W(s)\Big)ds+\frac12X_T^\tau Q_TX_T+M_TX_T+L_T\Big]$$
subject to
$$dX_s=(A_i(s)X_s+B_i(s)u_s)\,ds+\sigma(s,u_s,X_s)\,dC_s,\quad s\in[t_{i-1},t_i),\ i=1,2,\ldots,K+1,\qquad X_0=x_0,\tag{6.24}$$
where $T$, $x_0$ are given, $Q(t)\in\mathbb R^{n\times n}$, $V(t)\in\mathbb R^{n\times r}$, $R(t)\in\mathbb R^{r\times r}$, $M(t)\in\mathbb R^{1\times n}$, $N(t)\in\mathbb R^{1\times r}$, $W(t)\in\mathbb R$ are functions of time $t$, and $Q_T\ge0$, $Q(t)\ge0$, $R(t)>0$. The aim in discussing this model is to find not only an optimal control $u_t^*$ but also an optimal switching law. To begin with, we consider the following problem.

$$J(t,x)=\min_{u_t}E\Big[\int_t^T\Big(\frac12X_s^\tau Q(s)X_s+X_s^\tau V(s)u_s+\frac12u_s^\tau R(s)u_s+M(s)X_s+N(s)u_s+W(s)\Big)ds+\frac12X_T^\tau Q_TX_T+M_TX_T+L_T\Big]$$
subject to
$$dX_s=(A_i(s)X_s+B_i(s)u_s)\,ds+\sigma(s,u_s,X_s)\,dC_s,\quad s\in[t_{i-1},t_i),\ i=1,2,\ldots,K+1,\qquad X_t=x.\tag{6.25}$$
Applying the equation of optimality (2.15) to model (6.25), the following conclusion can be obtained.

Theorem 6.3 Assume that $J(t,x)$ is twice differentiable on $[t_{i-1},t_i)\times\mathbb R^n$. Then we have
$$-J_t(t,x)=\min_{u_t}\Big[\frac12x^\tau Q(t)x+x^\tau V(t)u_t+\frac12u_t^\tau R(t)u_t+M(t)x+N(t)u_t+W(t)+(A_i(t)x+B_i(t)u_t)^\tau\nabla_xJ(t,x)\Big],\tag{6.26}$$
where $J_t(t,x)$ is the partial derivative of the function $J(t,x)$ in $t$, and $\nabla_xJ(t,x)$ is the gradient of $J(t,x)$ in $x$.

Theorem 6.4 ([8]) Assume that $J(t,x)$ is twice differentiable on $[t_{i-1},t_i)\times\mathbb R^n$ ($i=1,2,\ldots,K+1$). Let $Q(t)$, $V(t)$, $R(t)$, $M(t)$, $N(t)$, $W(t)$, $A_i(t)$, $B_i(t)$, $R(t)^{-1}$ be continuous bounded functions of $t$, and $Q(t)\ge0$, $Q_T\ge0$, $R(t)>0$. The optimal control of model (6.25) when $t\in[t_{i-1},t_i)$ is
$$u^{(i)}_t=-R(t)^{-1}(B_i(t)^\tau P_i(t)+V(t)^\tau)x-R(t)^{-1}(B_i(t)^\tau S_i^\tau(t)+N(t)^\tau)\tag{6.27}$$
for $i=1,2,\ldots,K+1$, where $P_i(t)=P_i^\tau(t)$ and $S_i(t)$ satisfy, respectively,
$$\begin{cases}\dot P_i(t)=-Q(t)-P_i(t)A_i(t)-A_i(t)^\tau P_i(t)+(P_i(t)B_i(t)+V(t))R(t)^{-1}(B_i(t)^\tau P_i(t)+V(t)^\tau)\\ P_{K+1}(T)=Q_T\ \text{and}\ P_i(t_i)=P_{i+1}(t_i)\ \text{for}\ i\le K,\end{cases}\tag{6.28}$$
and
$$\begin{cases}\dot S_i(t)=-M(t)-S_i(t)A_i(t)+(N(t)+S_i(t)B_i(t))R(t)^{-1}(B_i(t)^\tau P_i(t)+V(t)^\tau)\\ S_{K+1}(T)=M_T\ \text{and}\ S_i(t_i)=S_{i+1}(t_i)\ \text{for}\ i\le K.\end{cases}\tag{6.29}$$
The optimal value of model (6.25) is
$$J(0,x_0)=\frac12x_0^\tau P_1(0)x_0+S_1(0)x_0+L_1(0),\tag{6.30}$$

where $L_i(t)$, $t\in[t_{i-1},t_i)$, satisfies
$$\begin{cases}\dot L_i(t)=-W(t)+\frac12(S_i(t)B_i(t)+N(t))R(t)^{-1}(B_i(t)^\tau S_i^\tau(t)+N(t)^\tau)\\ L_{K+1}(T)=L_T\ \text{and}\ L_i(t_i)=L_{i+1}(t_i)\ \text{for}\ i\le K.\end{cases}\tag{6.31}$$

Proof It follows from the equation of optimality (6.26) that
$$-J_t(t,x)=\min_{u_t}\Big[\frac12x^\tau Q(t)x+x^\tau V(t)u_t+\frac12u_t^\tau R(t)u_t+M(t)x+N(t)u_t+W(t)+(A_i(t)x+B_i(t)u_t)^\tau\nabla_xJ(t,x)\Big].\tag{6.32}$$
Let
$$L(u^{(i)}_t)=\frac12x^\tau Q(t)x+x^\tau V(t)u^{(i)}_t+\frac12u^{(i)\tau}_tR(t)u^{(i)}_t+M(t)x+N(t)u^{(i)}_t+W(t)+(A_i(t)x+B_i(t)u^{(i)}_t)^\tau\nabla_xJ(t,x).\tag{6.33}$$
The optimal control $u^{(i)}_t$ satisfies
$$\frac{\partial L(u^{(i)}_t)}{\partial u^{(i)}_t}=V(t)^\tau x+R(t)u^{(i)}_t+N(t)^\tau+B_i(t)^\tau\nabla_xJ(t,x)=0.\tag{6.34}$$
Since
$$\frac{\partial^2L(u^{(i)}_t)}{\partial (u^{(i)}_t)^2}=R(t)>0,\tag{6.35}$$
we have
$$u^{(i)}_t=-R(t)^{-1}\big(V(t)^\tau x+N(t)^\tau+B_i(t)^\tau\nabla_xJ(t,x)\big),\quad t\in[t_{i-1},t_i).\tag{6.36}$$
Since $J(T,x_T)=\frac12X_T^\tau Q_TX_T+M_TX_T+L_T$, we guess
$$J(t,x)=\frac12x^\tau P_{K+1}(t)x+S_{K+1}(t)x+L_{K+1}(t),\quad t\in[t_K,T],\tag{6.37}$$
with $P_{K+1}(t)=P_{K+1}(t)^\tau$, $P_{K+1}(T)=Q_T$, $S_{K+1}(T)=M_T$, $L_{K+1}(T)=L_T$. So
$$J_t(t,x)=\frac12x^\tau\dot P_{K+1}(t)x+\dot S_{K+1}(t)x+\dot L_{K+1}(t)\tag{6.38}$$
and
$$\nabla_xJ(t,x)=P_{K+1}(t)x+S_{K+1}^\tau(t).\tag{6.39}$$

Thus, it follows from (6.36) that
$$u^{(K+1)}_t=-R(t)^{-1}(B_{K+1}(t)^\tau P_{K+1}(t)+V(t)^\tau)x-R(t)^{-1}(B_{K+1}(t)^\tau S_{K+1}^\tau(t)+N(t)^\tau).\tag{6.40}$$
Substituting (6.38), (6.39), and (6.40) into (6.32) gives
$$\begin{aligned}-\frac12x^\tau\dot P_{K+1}(t)x-\dot S_{K+1}(t)x-\dot L_{K+1}(t)=\ &\frac12x^\tau\big[Q(t)+P_{K+1}(t)A_{K+1}(t)+A_{K+1}(t)^\tau P_{K+1}(t)\\ &\quad-(P_{K+1}(t)B_{K+1}(t)+V(t))R(t)^{-1}(B_{K+1}(t)^\tau P_{K+1}(t)+V(t)^\tau)\big]x\\ &+\big[S_{K+1}(t)A_{K+1}(t)-(N(t)+S_{K+1}(t)B_{K+1}(t))R(t)^{-1}(B_{K+1}(t)^\tau P_{K+1}(t)+V(t)^\tau)+M(t)\big]x\\ &+\Big[W(t)-\frac12(S_{K+1}(t)B_{K+1}(t)+N(t))R(t)^{-1}(B_{K+1}(t)^\tau S_{K+1}^\tau(t)+N(t)^\tau)\Big].\end{aligned}$$
Therefore, we have
$$\begin{cases}\dot P_{K+1}(t)=-Q(t)-P_{K+1}(t)A_{K+1}(t)-A_{K+1}(t)^\tau P_{K+1}(t)+(P_{K+1}(t)B_{K+1}(t)+V(t))R(t)^{-1}(B_{K+1}(t)^\tau P_{K+1}(t)+V(t)^\tau)\\ P_{K+1}(T)=Q_T,\end{cases}\tag{6.41}$$
$$\begin{cases}\dot S_{K+1}(t)=-M(t)-S_{K+1}(t)A_{K+1}(t)+(N(t)+S_{K+1}(t)B_{K+1}(t))R(t)^{-1}(B_{K+1}(t)^\tau P_{K+1}(t)+V(t)^\tau)\\ S_{K+1}(T)=M_T,\end{cases}\tag{6.42}$$
and
$$\begin{cases}\dot L_{K+1}(t)=-W(t)+\frac12(S_{K+1}(t)B_{K+1}(t)+N(t))R(t)^{-1}(B_{K+1}(t)^\tau S_{K+1}^\tau(t)+N(t)^\tau)\\ L_{K+1}(T)=L_T.\end{cases}\tag{6.43}$$
Hence, $P_{K+1}(t)$, $S_{K+1}(t)$, and $L_{K+1}(t)$ satisfy the Riccati differential equations and boundary conditions (6.41), (6.42), and (6.43), respectively. When $t\in[t_{i-1},t_i)$ for $i\le K$, assume
$$J(t,x)=\frac12x^\tau P_i(t)x+S_i(t)x+L_i(t).\tag{6.44}$$
By the same method as the above procedure, we can get
$$u^{(i)}_t=-R(t)^{-1}(B_i(t)^\tau P_i(t)+V(t)^\tau)x-R(t)^{-1}(B_i(t)^\tau S_i^\tau(t)+N(t)^\tau)\tag{6.45}$$

and
$$J(0,x_0)=\frac12x_0^\tau P_1(0)x_0+S_1(0)x_0+L_1(0),\tag{6.46}$$
where $P_i(t)=P_i^\tau(t)$, $S_i(t)$, $L_i(t)$ satisfy, respectively,
$$\begin{cases}\dot P_i(t)=-Q(t)-P_i(t)A_i(t)-A_i(t)^\tau P_i(t)+(P_i(t)B_i(t)+V(t))R(t)^{-1}(B_i(t)^\tau P_i(t)+V(t)^\tau)\\ P_i(t_i)=P_{i+1}(t_i),\end{cases}$$
$$\begin{cases}\dot S_i(t)=-M(t)-S_i(t)A_i(t)+(N(t)+S_i(t)B_i(t))R(t)^{-1}(B_i(t)^\tau P_i(t)+V(t)^\tau)\\ S_i(t_i)=S_{i+1}(t_i),\end{cases}$$
and
$$\begin{cases}\dot L_i(t)=-W(t)+\frac12(S_i(t)B_i(t)+N(t))R(t)^{-1}(B_i(t)^\tau S_i^\tau(t)+N(t)^\tau)\\ L_i(t_i)=L_{i+1}(t_i).\end{cases}$$
The theorem is proved.

According to Theorem 6.4, there are $2(K+1)$ matrix Riccati differential equations to be solved in order to solve model (6.24). Then the optimal cost $J(0,x_0,t_1,\ldots,t_K)$ can be obtained by (6.46). Denote $J(t_1,\ldots,t_K)=J(0,x_0)$. The next stage is to solve the optimization problem
$$\min_{0\le t_1\le t_2\le\cdots\le t_K\le T}J(t_1,\ldots,t_K).\tag{6.47}$$

6.4 MACO Algorithm for Optimal Switching Instants

For model (6.24), we may not be able to obtain analytical expressions of the solutions according to Theorem 6.4, yet most optimization algorithms need an explicit form of the first-order derivative of the objective function. Faced with such difficulties, evolutionary metaheuristic algorithms may be a good choice for solving Stage (b). An intelligent algorithm combining a mutation ant colony optimization algorithm and a simulated annealing method (MACO) was designed by Zhu [9] to solve continuous optimization models. We will use this algorithm to solve the following optimization problem:
$$\begin{aligned}&\min\ J(t_1,\ldots,t_K)\\ &\text{subject to}\\ &\quad 0\le t_1\le t_2\le\cdots\le t_K\le T,\qquad t_i\in\mathbb R,\ i=1,2,\ldots,K.\end{aligned}\tag{6.48}$$
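Each evaluation of $J(t_1,\ldots,t_K)$ in (6.47) amounts to one backward integration of the $P_i$, $S_i$, $L_i$ equations through the switching instants. A scalar sketch is shown below, using the data of the example later in this section ($Q=0$, $R=1$, $V=0$, $M=N=-1$, $W=1$, $Q_T=2$, $M_T=L_T=0$, $B_i=1$, $A_i=-\alpha_i$); the Euler discretization and names are illustrative assumptions.

```python
# Hedged sketch of Stage (a) for a scalar LQ switched model: integrate the
# Riccati system (6.28)-(6.29) plus the L-equation (6.31) backward from T to 0,
# switching coefficients at the given instants.

def lq_stage_a(alphas, t_switch, x0=1.0, T=1.0, n=100000):
    h = T / n
    P, S, L = 2.0, 0.0, 0.0               # terminal conditions Q_T, M_T, L_T
    bounds = list(t_switch) + [T]         # subsystem i active on [t_{i-1}, t_i)
    for k in range(n, 0, -1):             # backward Euler from T to 0
        t = k * h
        i = sum(b < t for b in bounds)    # index of the active subsystem
        a = -alphas[min(i, len(alphas) - 1)]   # A_i = -alpha_i
        dP = -2.0 * a * P + P * P              # Pdot = -2 A_i P + P^2
        dS = 1.0 + (S - 1.0) * P - a * S       # Sdot with M = N = -1
        dL = -1.0 + 0.5 * (S - 1.0) ** 2       # Ldot with W = 1
        P, S, L = P - h * dP, S - h * dS, L - h * dL
    return P, S, L, 0.5 * P * x0 ** 2 + S * x0 + L   # cost via (6.46)
```

Stage (b) then treats `lq_stage_a(...)[-1]` as a black-box objective of the switching instants, which is exactly the role the MACO algorithm below plays.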

The vector $t=(t_1,\ldots,t_K)$ is a decision vector in the feasible set of constraints $\Omega=\{t=(t_1,\ldots,t_K)\mid 0\le t_1\le t_2\le\cdots\le t_K\le T\}$. Assume that $t_i=\overline{a_1a_2\cdots a_l.a_{l+1}a_{l+2}\cdots a_m}$ for $i=1,2,\ldots,K$, where $l$ and $m$ ($m\ge l$) are some positive integers and $a_k$ is a natural number that is no less than zero and no more than nine for $k=1,2,\ldots,m$. That is,
$$t_i=\sum_{k=1}^m a_k10^{l-k},\quad i=1,2,\ldots,K,\tag{6.49}$$
where $a_k\in\{0,1,2,\ldots,9\}$ for $k=1,2,\ldots,m$. The parameters $l$ and $m$ are selected according to the required precision of solutions of problem (6.48).

Let artificial ants walk step by step. Call the numbers $k=0,1,\ldots,9$ the nodes of each step. For every $t_i$, each artificial ant is first put on 0 and moves to a node of the 1st step, then to a node of the 2nd step, and so on until a node of the $m$th step. In this movement, an artificial ant walks from a node to the next node according to the strength of the pheromone trails on the latter nodes. If the node of the $k$th step that an artificial ant selects is $j$, then set $a_k=j$. Once all artificial ants have completed their walk, the pheromone trails are updated. Denote by $\tau_{i;k,j}(s)$ the pheromone trail associated to node $j$ of the $k$th step for the variable $t_i$ at iteration $s$. The procedures are described as follows.

(1) Initialization Process: Randomly generate a feasible solution $t$ as the best solution $\hat t$. Set $\tau_{i;k,j}(0)=\tau_0$, $i=1,2,\ldots,K$, $k=1,2,\ldots,m$, $j=0,1,\ldots,9$, where $\tau_0$ is a parameter.

(2) Ant Movement: At step $k$, after building the sequence $a_1,a_2,\ldots,a_k$, select the next node $j$ of the $(k+1)$th step with probability
$$p_{k,k+1}=\frac{\tau_{i;k+1,j}(s)}{\sum_{q=0}^9\tau_{i;k+1,q}(s)}.\tag{6.50}$$
After obtaining the sequence $a_1,a_2,\ldots,a_m$, form $t_i$ according to Eq. (6.49). The feasible set $\Omega$ may be used to check the feasibility of the vector $t=(t_1,\ldots,t_K)$.
In order to avoid premature convergence to the best solution $\hat t$ found so far, we modify it based on the idea of mutation and the Metropolis acceptance law. Construct a feasible vector $t'$ in the neighborhood of $\hat t$ as follows: randomly select $h_i\in(-1,1)$ and $l_i\in[0,L)$ for some positive number $L$, and let
$$t'=\hat t+(l_1h_1,l_2h_2,\ldots,l_Kh_K).$$
The feasibility of $t'$ may be guaranteed by choosing $l_i$ small enough or $l_i=0$. If $\Delta f=f(t')-f(\hat t)\le0$, then $\hat t\leftarrow t'$. Otherwise, if the Metropolis acceptance law holds, that is, $\exp(-\Delta f/T_s)>\mathrm{random}(0,1)$, where $T_s\to0$ as the iteration $s\to\infty$, then also set $\hat t\leftarrow t'$.

(3) Pheromone Update: At each iteration $s$, let $\hat t$ be the best solution found so far, and let $t^s$ be the best solution in the current algorithm iteration $s$. If $J(t^s)<J(\hat t)$, then $\hat t\leftarrow t^s$. Reinforce the pheromone trails on the arcs of $\hat t$ and $t'$ (if any) and evaporate the pheromone trails on the others:
$$\tau_{i;k,j}(s)=\begin{cases}(1-\rho)\tau_{i;k,j}(s-1)+\rho g(\hat t),&\text{if }(k,j)\in\hat t\\ (1-\rho)\tau_{i;k,j}(s-1)+\rho^2g(\hat t),&\text{if }(k,j)\in t'\\ (1-\rho)\tau_{i;k,j}(s-1),&\text{otherwise},\end{cases}\tag{6.51}$$
where $\rho$, $0<\rho<1$, is the evaporation rate, and $g(x)$ is a function such that $g(x)\ge g(y)$ if $J(x)<J(y)$.

The algorithm can be summarized as follows.

Algorithm 6.2 (MACO algorithm for solving (6.48))
Step 1. Initialize all pheromone trails with the same amount of pheromone and randomly generate a feasible solution.
Step 2. Move an ant according to the pheromone trails to produce the value of a decision variable.
Step 3. Repeat Step 2 to produce $t_1,t_2,\ldots,t_K$ and check them with the feasible set $\Omega$.
Step 4. Repeat Steps 2 to 3 for a given number of artificial ants.
Step 5. Update the pheromone according to the best feasible solution found so far.
Step 6. Repeat Steps 2 to 5 for a given number of cycles.
Step 7. Report the best solution as the optimal solution.

Example

Consider the following example of LQ models for switched uncertain systems:
$$J(0,x_0)=\min_{t_1,t_2}\ \min_{u(s)}E\Big[\int_0^1\Big(-X(s)-u(s)+\frac12u^2(s)+1\Big)ds+X^2(1)\Big]$$
subject to
$$\begin{aligned}&\text{subsystem 1: }dX(s)=[u(s)-\alpha_1X(s)]\,ds+\sigma X(s)\,dC_s,\quad s\in[0,t_1)\\ &\text{subsystem 2: }dX(s)=[u(s)-\alpha_2X(s)]\,ds+\sigma X(s)\,dC_s,\quad s\in[t_1,t_2)\\ &\text{subsystem 3: }dX(s)=[u(s)-\alpha_3X(s)]\,ds+\sigma X(s)\,dC_s,\quad s\in[t_2,1]\\ &X(0)=1.\end{aligned}\tag{6.52}$$
Comparing this example with model (6.24), we have $Q(t)=0$, $R(t)=1$, $V(t)=0$, $M(t)=-1$, $N(t)=-1$, $W(t)=1$, $T=1$, $Q_T=2$, $M_T=0$, $L_T=0$, $A_i(t)=-\alpha_i$, $B_i(t)=1$ ($i=1,2,3$).

Stage (a): Fix $t_1,t_2$ and formulate $J(t_1,t_2)$ according to Theorem 6.4. It follows from (6.28) and (6.29) that
$$\begin{cases}\dot P_i(t)=P_i^2(t)+2\alpha_iP_i(t)\\ P_3(1)=2,\quad P_3(t_2)=P_2(t_2),\quad P_2(t_1)=P_1(t_1),\end{cases}\tag{6.53}$$
and
$$\begin{cases}\dot S_i(t)=(P_i(t)+\alpha_i)S_i(t)-P_i(t)+1\\ S_3(1)=0,\quad S_3(t_2)=S_2(t_2),\quad S_2(t_1)=S_1(t_1).\end{cases}\tag{6.54}$$
The solutions of Eqs. (6.53) and (6.54) are
$$P_3(t)=\frac{m_3S_{t_3}e^{m_3t}}{n_3-S_{t_3}e^{m_3t}},\qquad S_3(t)=\frac{c_3m_3e^{\frac12m_3t}-2n_3-2S_{t_3}(m_3+1)e^{m_3t}}{m_3\big(n_3-S_{t_3}e^{m_3t}\big)}$$
for $i=3$, where $m_3=2\alpha_3$, $S_{t_3}=1$, $n_3=(S_{t_3}+\alpha_3)e^{m_3}$, $c_3=\big(\frac{4S_{t_3}}{m_3}+2S_{t_3}+1\big)e^{\frac12m_3}$. In addition, we have
$$P_2(t)=\frac{m_2S_{t_2}e^{m_2t}}{n_2-S_{t_2}e^{m_2t}},\qquad S_2(t)=\frac{c_2m_2e^{\frac12m_2t}-2n_2-2S_{t_2}(m_2+1)e^{m_2t}}{m_2\big(n_2-S_{t_2}e^{m_2t}\big)}$$
for $i=2$, where $m_2=2\alpha_2$, $S_{t_2}=\frac12P_3(t_2)$, $n_2=(S_{t_2}+\alpha_2)e^{m_2t_2}$, $S^{t_2}=S_3(t_2)$,
$$c_2=\Big(\frac{m_2}{2}S^{t_2}+2S_{t_2}+\frac{4S_{t_2}}{m_2}+1\Big)e^{\frac12m_2t_2},$$
and
$$P_1(t)=\frac{m_1S_{t_1}e^{m_1t}}{n_1-S_{t_1}e^{m_1t}},\qquad S_1(t)=\frac{c_1m_1e^{\frac12m_1t}-2n_1-2S_{t_1}(m_1+1)e^{m_1t}}{m_1\big(n_1-S_{t_1}e^{m_1t}\big)}$$
for $i=1$, where $m_1=2\alpha_1$, $S_{t_1}=\frac12P_2(t_1)$, $n_1=(S_{t_1}+\alpha_1)e^{m_1t_1}$, $S^{t_1}=S_2(t_1)$,
$$c_1=\Big(\frac{m_1}{2}S^{t_1}+2S_{t_1}+\frac{4S_{t_1}}{m_1}+1\Big)e^{\frac12m_1t_1}.$$
According to Theorem 6.4, the optimal value is
$$J(t_1,t_2)=\frac12P_1(0)+S_1(0)+L_1(0),$$

where
$$L_1(0)=\int_0^{t_1}\Big[-\frac12S_1^2(t)+S_1(t)+\frac12\Big]dt+\int_{t_1}^{t_2}\Big[-\frac12S_2^2(t)+S_2(t)+\frac12\Big]dt+\int_{t_2}^1\Big[-\frac12S_3^2(t)+S_3(t)+\frac12\Big]dt.$$

Stage (b): Find the optimal switching instants $t_1^*,t_2^*$ according to Algorithm 6.2. Choose $\alpha_1=\frac13$, $\alpha_2=\frac14$, $\alpha_3=\frac12$. By applying Algorithm 6.2, we find the optimal switching instants $t_1^*=0.30$ and $t_2^*=0.462$. By (6.27), the optimal control is
$$u_t^*=1-S_i(t)-P_i(t)x(t),\qquad t\in[t_{i-1},t_i),$$
with $P_i$, $S_i$ as given above, evaluated at $t_1^*=0.30$, $t_2^*=0.462$ (so that $m_1=0.667$, $m_2=0.5$, $m_3=1$), on the intervals $[0,0.30)$, $[0.30,0.462)$, and $[0.462,1]$, respectively.

6.5 Optimistic Value Model

Consider an optimistic value model of switched uncertain systems for the multidimensional case as follows:
$$J(0,x_0)=\min_{t_i}\ \max_{u_s\in[-1,1]^r}F_{\sup}(\alpha)$$
subject to
$$dX_s=(A_i(s)X_s+B_i(s)u_s)\,ds+Q_i\,dC_s,\quad s\in[t_{i-1},t_i),\ i=1,2,\ldots,K+1,\qquad X_0=x_0,\tag{6.55}$$
where $t_{K+1}=T$, $F=\int_0^Tf(s)^\tau X_s\,ds+S_T^\tau X_T$, and $F_{\sup}(\alpha)=\sup\{\bar F\mid M\{F\ge\bar F\}\ge\alpha\}$ denotes the $\alpha$-optimistic value of $F$. The function $f:[0,T]\to\mathbb R^n$ is the objective function of dimension $n$, and $S_T\in\mathbb R^n$. We will use $J(t,x)$ to denote the optimal value $\max_{u_s}F_{\sup}(\alpha)$ obtained in $[t,T]$ under the condition that at time $t$ we are in state $X_t=x$. Applying the equation of optimality (3.4) to model (6.55), the following conclusion can be obtained.

Theorem 6.5 Let $J(t,x)$ be twice differentiable on $[t_{i-1},t_i)\times\mathbb R^n$ for $i=1,2,\ldots,K+1$. Then we have
$$-J_t(t,x)=\max_{u_t\in[-1,1]^r}\Big\{f(t)^\tau x+(A_i(t)x+B_i(t)u_t)^\tau\nabla_xJ(t,x)+\frac{\sqrt3}{\pi}\ln\frac{1-\alpha}{\alpha}\,\|\nabla_xJ(t,x)^\tau Q_i\|_1\Big\},\tag{6.56}$$
where $J_t(t,x)$ is the partial derivative of the function $J(t,x)$ in $t$, $\nabla_xJ(t,x)$ is the gradient of $J(t,x)$ in $x$, and $\|\cdot\|_1$ is the 1-norm for vectors, that is, $\|v\|_1=\sum_{i=1}^m|v_i|$ for $v=(v_1,v_2,\ldots,v_m)$.

Two-Stage Approach

In order to solve problem (6.55), we decompose it into two stages. Stage (a) deals with a conventional uncertain optimal control problem which seeks the optimal value of $J$ for given switching instants. Stage (b) solves an optimization problem in the switching instants.

Stage (a)

Now we fix the switching instants $t_1,t_2,\ldots,t_K$ and handle the following model to find the optimal value:
$$J(0,x_0,t_1,\ldots,t_K)=\max_{u_s\in[-1,1]^r}\Big[\int_0^Tf(s)^\tau X_s\,ds+S_T^\tau X_T\Big]_{\sup}(\alpha)$$
subject to
$$dX_s=(A_i(s)X_s+B_i(s)u_s)\,ds+Q_i\,dC_s,\quad s\in[t_{i-1},t_i),\ i=1,2,\ldots,K+1,\qquad X_0=x_0.\tag{6.57}$$
Applying Eq. (6.56) to model (6.57), we have the following conclusion.

Theorem 6.6 ([10]) Let $J(t,x)$ be twice differentiable on $[t_{i-1},t_i)\times\mathbb R^n$ ($i=1,2,\ldots,K+1$). The optimal control $u^{(i)}_t=(u^{(i)}_1(t),u^{(i)}_2(t),\ldots,u^{(i)}_r(t))^\tau$ of (6.57) is a bang-bang control
$$u^{(i)}_j(t)=\operatorname{sgn}\{(b^{(i)}_{1j}(t),b^{(i)}_{2j}(t),\ldots,b^{(i)}_{nj}(t))\,p_i(t)\}\tag{6.58}$$

for $i=1,2,\ldots,K+1$, $j=1,2,\ldots,r$, where $B_i(t)=(b^{(i)}_{lj}(t))_{n\times r}$ and $p_i(t)\in\mathbb R^n$, $t\in[t_{i-1},t_i)$, satisfies
$$\frac{dp_i(t)}{dt}=-f(t)-A_i(t)^\tau p_i(t),\qquad p_{K+1}(T)=S_T\ \text{and}\ p_i(t_i)=p_{i+1}(t_i)\ \text{for}\ i\le K.\tag{6.59}$$
The optimal value of model (6.57) is
$$J(0,x_0,t_1,\ldots,t_K)=p_1(0)^\tau x_0+\sum_{i=1}^{K+1}\int_{t_{i-1}}^{t_i}\Big[\|p_i(t)^\tau B_i(t)\|_1+\frac{\sqrt3}{\pi}\ln\frac{1-\alpha}{\alpha}\,\|p_i(t)^\tau Q_i\|_1\Big]dt.\tag{6.60}$$

Proof First we prove that the optimal control of model (6.57) is a bang-bang control. It follows from the equation of optimality (6.56) that
$$-J_t(t,x)=\max_{u_t\in[-1,1]^r}\Big\{f(t)^\tau x+(A_i(t)x+B_i(t)u_t)^\tau\nabla_xJ(t,x)+\frac{\sqrt3}{\pi}\ln\frac{1-\alpha}{\alpha}\,\|\nabla_xJ(t,x)^\tau Q_i\|_1\Big\}.\tag{6.61}$$
On the right-hand side of (6.61), let $u^{(i)}_t$ make it the maximum. We have
$$\max_{u_t\in[-1,1]^r}\Big\{f(t)^\tau x+(A_i(t)x+B_i(t)u_t)^\tau\nabla_xJ(t,x)+\frac{\sqrt3}{\pi}\ln\frac{1-\alpha}{\alpha}\,\|\nabla_xJ(t,x)^\tau Q_i\|_1\Big\}=f(t)^\tau x+(A_i(t)x+B_i(t)u^{(i)}_t)^\tau\nabla_xJ(t,x)+\frac{\sqrt3}{\pi}\ln\frac{1-\alpha}{\alpha}\,\|\nabla_xJ(t,x)^\tau Q_i\|_1.$$
That is,
$$\max_{u_t\in[-1,1]^r}\{\nabla_xJ(t,x)^\tau B_i(t)u_t\}=\nabla_xJ(t,x)^\tau B_i(t)u^{(i)}_t.\tag{6.62}$$
Denote
$$u^{(i)}_t=(u^{(i)}_1(t),u^{(i)}_2(t),\ldots,u^{(i)}_r(t))^\tau\quad\text{and}\quad\nabla_xJ(t,x)^\tau B_i(t)=(g^{(i)}_1(t,x),g^{(i)}_2(t,x),\ldots,g^{(i)}_r(t,x)).$$
Then
$$u^{(i)}_j(t)=\begin{cases}1,&\text{if }g^{(i)}_j(t,x)>0\\ -1,&\text{if }g^{(i)}_j(t,x)<0\\ \text{undetermined},&\text{if }g^{(i)}_j(t,x)=0\end{cases}\tag{6.63}$$

for $i=1,2,\ldots,K+1$, $j=1,2,\ldots,r$, which is a bang-bang control. The functions $g^{(i)}_j(t,x)$ are called switching functions. If at least one switching function equals zero in some interval, we call it a singular control. But here we only consider switching functions that equal zero at most at some discrete points.

According to (6.61), when $t\in[t_K,T]$, we have
$$-J_t(t,x)=\max_{u_t\in[-1,1]^r}\Big\{f(t)^\tau x+(A_{K+1}(t)x+B_{K+1}(t)u_t)^\tau\nabla_xJ(t,x)+\frac{\sqrt3}{\pi}\ln\frac{1-\alpha}{\alpha}\,\|\nabla_xJ(t,x)^\tau Q_{K+1}\|_1\Big\}.\tag{6.64}$$
Since $J(T,x_T)=S_T^\tau x_T$, we assume
$$J(t,x)=p_{K+1}(t)^\tau x+q_{K+1}(t)$$
with $p_{K+1}(T)=S_T$, $q_{K+1}(T)=0$. So
$$\nabla_xJ(t,x)=p_{K+1}(t),\qquad J_t(t,x)=\frac{dp_{K+1}(t)^\tau}{dt}x+\frac{dq_{K+1}(t)}{dt}.\tag{6.65}$$
Thus, it follows from (6.63) that
$$u^{(K+1)}_j(t)=\operatorname{sgn}\{(b^{(K+1)}_{1j}(t),b^{(K+1)}_{2j}(t),\ldots,b^{(K+1)}_{nj}(t))\,p_{K+1}(t)\}\tag{6.66}$$
for $j=1,2,\ldots,r$. Substituting (6.65) into (6.64) yields
$$-\frac{dp_{K+1}(t)^\tau}{dt}x-\frac{dq_{K+1}(t)}{dt}=f(t)^\tau x+(A_{K+1}(t)x+B_{K+1}(t)u^{(K+1)}_t)^\tau p_{K+1}(t)+\frac{\sqrt3}{\pi}\ln\frac{1-\alpha}{\alpha}\,\|p_{K+1}(t)^\tau Q_{K+1}\|_1.$$
Therefore, we have
$$-\frac{dp_{K+1}(t)^\tau}{dt}=f(t)^\tau+p_{K+1}(t)^\tau A_{K+1}(t),\qquad p_{K+1}(T)=S_T,\tag{6.67}$$
and
$$-\frac{dq_{K+1}(t)}{dt}=p_{K+1}(t)^\tau B_{K+1}(t)u^{(K+1)}_t+\frac{\sqrt3}{\pi}\ln\frac{1-\alpha}{\alpha}\,\|p_{K+1}(t)^\tau Q_{K+1}\|_1,\qquad q_{K+1}(T)=0.\tag{6.68}$$
Substituting (6.66) into (6.68), we can get
$$q_{K+1}(t)=\int_t^T\Big[\|p_{K+1}(s)^\tau B_{K+1}(s)\|_1+\frac{\sqrt3}{\pi}\ln\frac{1-\alpha}{\alpha}\,\|p_{K+1}(s)^\tau Q_{K+1}\|_1\Big]ds.$$

So when $t\in[t_K,T]$, we have
$$J(t,x)=p_{K+1}(t)^\tau x+q_{K+1}(t)=p_{K+1}(t)^\tau x+\int_t^T\Big[\|p_{K+1}(s)^\tau B_{K+1}(s)\|_1+\frac{\sqrt3}{\pi}\ln\frac{1-\alpha}{\alpha}\,\|p_{K+1}(s)^\tau Q_{K+1}\|_1\Big]ds,$$
where $p_{K+1}(t)$ satisfies the differential equation and boundary condition (6.67). When $t\in[t_{i-1},t_i)$ for $i\le K$, assume
$$J(t,x)=p_i(t)^\tau x+q_i(t),$$
with $p_i(t_i)=p_{i+1}(t_i)$, $q_i(t_i)=q_{i+1}(t_i)$. By the same method as the above procedure, we can get
$$u^{(i)}_j(t)=\operatorname{sgn}\{(b^{(i)}_{1j}(t),b^{(i)}_{2j}(t),\ldots,b^{(i)}_{nj}(t))\,p_i(t)\}$$
for $j=1,2,\ldots,r$, where
$$-\frac{dp_i(t)^\tau}{dt}=f(t)^\tau+p_i(t)^\tau A_i(t),\quad p_i(t_i)=p_{i+1}(t_i);\qquad -\frac{dq_i(t)}{dt}=\|p_i(t)^\tau B_i(t)\|_1+\frac{\sqrt3}{\pi}\ln\frac{1-\alpha}{\alpha}\,\|p_i(t)^\tau Q_i\|_1,\quad q_i(t_i)=q_{i+1}(t_i),$$
and
$$J(t,x)=p_i(t)^\tau x+q_i(t)=p_i(t)^\tau x+\int_t^{t_i}\Big[\|p_i(s)^\tau B_i(s)\|_1+\frac{\sqrt3}{\pi}\ln\frac{1-\alpha}{\alpha}\,\|p_i(s)^\tau Q_i\|_1\Big]ds+q_{i+1}(t_i).$$
Summarily, the optimal value of model (6.57) is
$$J(0,x_0,t_1,\ldots,t_K)=p_1(0)^\tau x_0+\sum_{i=1}^{K+1}\int_{t_{i-1}}^{t_i}\Big[\|p_i(t)^\tau B_i(t)\|_1+\frac{\sqrt3}{\pi}\ln\frac{1-\alpha}{\alpha}\,\|p_i(t)^\tau Q_i\|_1\Big]dt.$$
The theorem is proved.
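The closed form (6.60) can be evaluated numerically in the same backward sweep used for the expected value model, now accumulating the 1-norm terms. The sketch below assumes the data of the example later in this section ($f=0$, $x_0=0$, $Q_1=Q_2=\sigma I$, $S_T=(\frac12,-\frac23)^\tau$); the Euler discretization and the function name are illustrative.

```python
# Hedged sketch: evaluate the optimistic-value reward (6.60) for the
# two-subsystem example of Sect. 6.5; the costate p solves (6.59) backward.
import math

def optimistic_J(t1, sigma, alpha, T=1.0, n=20000):
    h = T / n
    A = [[[0.0, 0.0], [1.0, 0.0]], [[0.0, 2.0], [0.0, 0.0]]]
    B = [[[1.0, 0.0], [0.0, 1.0]], [[-1.0, 0.0], [1.0, 1.0]]]
    c = math.sqrt(3.0) / math.pi * math.log((1.0 - alpha) / alpha)
    p = [0.5, -2.0 / 3.0]                  # p(T) = S_T
    J = 0.0                                # p1(0)^T x0 = 0 since x0 = 0
    for k in range(n, 0, -1):
        t = k * h
        Ai, Bi = (A[0], B[0]) if t <= t1 else (A[1], B[1])
        g = [Bi[0][j] * p[0] + Bi[1][j] * p[1] for j in range(2)]
        # integrand of (6.60): ||p^T B_i||_1 + (sqrt(3)/pi) ln((1-a)/a) ||p^T Q_i||_1
        J += h * (abs(g[0]) + abs(g[1]) + c * sigma * (abs(p[0]) + abs(p[1])))
        p = [p[l] + h * (Ai[0][l] * p[0] + Ai[1][l] * p[1]) for l in range(2)]
    return J
```

For $\alpha>0.5$ the coefficient $\frac{\sqrt3}{\pi}\ln\frac{1-\alpha}{\alpha}$ is negative, so increasing the diffusion level $\sigma$ lowers the attainable $\alpha$-optimistic reward, which the sketch makes easy to observe.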

Stage (b)

According to Theorem 6.6, there are $K+1$ Riccati differential equations to be solved in order to solve model (6.57). Then the optimal cost $J(0,x_0,t_1,\ldots,t_K)$ can be obtained by (6.60). Denote $J(t_1,\ldots,t_K)=J(0,x_0,t_1,\ldots,t_K)$. The next stage is to solve the optimization problem
$$\begin{aligned}&\max\ J(t_1,\ldots,t_K)\\ &\text{subject to}\\ &\quad 0\le t_1\le t_2\le\cdots\le t_K\le T,\qquad t_i\in\mathbb R,\ i=1,2,\ldots,K.\end{aligned}\tag{6.69}$$
For model (6.57), we may not be able to obtain analytical expressions and derivatives of the optimal reward according to Theorem 6.6, while gradient algorithms need an explicit form of the first-order derivative of the optimal reward. Faced with such difficulties, evolutionary metaheuristic algorithms such as GA and the PSO algorithm, which offer a high degree of flexibility and robustness in dynamic environments, are good choices for solving Stage (b).

Example

Consider the following optimal control problem with two uncertain subsystems:
$$J(0,x_0)=\min_{t_1}\ \max_{u_s}\Big[\frac12X_1(1)-\frac23X_2(1)\Big]_{\sup}(\alpha)$$
subject to
$$\text{subsystem 1: }\begin{cases}dX_1(s)=u_1(s)\,ds+\sigma\,dC_{s1}\\ dX_2(s)=(X_1(s)+u_2(s))\,ds+\sigma\,dC_{s2},\end{cases}\quad s\in[0,t_1)$$
$$\text{subsystem 2: }\begin{cases}dX_1(s)=(2X_2(s)-u_1(s))\,ds+\sigma\,dC_{s1}\\ dX_2(s)=(u_1(s)+u_2(s))\,ds+\sigma\,dC_{s2},\end{cases}\quad s\in[t_1,1]$$
$$X_1(0)=X_2(0)=0,\qquad |u_i(s)|\le1,\ 0\le s\le1,\ i=1,2.$$
Comparing this example with model (6.55), we have
$$A_1(s)=\begin{pmatrix}0&0\\1&0\end{pmatrix},\quad A_2(s)=\begin{pmatrix}0&2\\0&0\end{pmatrix},\quad B_1(s)=\begin{pmatrix}1&0\\0&1\end{pmatrix},\quad B_2(s)=\begin{pmatrix}-1&0\\1&1\end{pmatrix},\quad f(s)=0,\quad Q_1=Q_2=\begin{pmatrix}\sigma&0\\0&\sigma\end{pmatrix},\quad S_T=\begin{pmatrix}\frac12\\[2pt]-\frac23\end{pmatrix}.$$

Stage (a): Fix $t_1$ and formulate $J(t_1)$ according to Theorem 6.6. It follows from (6.59) that
$$\frac{dp_2(t)}{dt}=-\begin{pmatrix}0&0\\2&0\end{pmatrix}p_2(t),\qquad p_2(1)=\begin{pmatrix}\frac12\\[2pt]-\frac23\end{pmatrix},$$
which has the solution
$$p_2(t)=\begin{pmatrix}\frac12\\[2pt]\frac13-t\end{pmatrix}.$$
Hence,
$$u^{(2)}_t=\begin{pmatrix}u^{(2)}_1(t)\\u^{(2)}_2(t)\end{pmatrix}=\begin{pmatrix}\operatorname{sgn}\big(-\frac16-t\big)\\[2pt]\operatorname{sgn}\big(\frac13-t\big)\end{pmatrix}.$$
It also follows from (6.59) that
$$\frac{dp_1(t)}{dt}=-\begin{pmatrix}0&1\\0&0\end{pmatrix}p_1(t),\qquad p_1(t_1)=p_2(t_1)=\begin{pmatrix}\frac12\\[2pt]\frac13-t_1\end{pmatrix},$$
which has the solution
$$p_1(t)=\begin{pmatrix}\big(t_1-\frac13\big)t+\frac12+\frac{t_1}{3}-t_1^2\\[2pt]\frac13-t_1\end{pmatrix}.$$
Hence,
$$u^{(1)}_t=\begin{pmatrix}u^{(1)}_1(t)\\u^{(1)}_2(t)\end{pmatrix}=\begin{pmatrix}\operatorname{sgn}\big[\big(t_1-\frac13\big)t+\frac12+\frac{t_1}{3}-t_1^2\big]\\[2pt]\operatorname{sgn}\big(\frac13-t_1\big)\end{pmatrix},$$
and, by (6.60),
$$J(t_1)=\Big(1+\frac{\sqrt3}{\pi}\ln\frac{1-\alpha}{\alpha}\,\sigma\Big)\int_0^{t_1}\Big[\Big|\big(t_1-\tfrac13\big)t+\tfrac12+\tfrac{t_1}{3}-t_1^2\Big|+\Big|\tfrac13-t_1\Big|\Big]dt+\int_{t_1}^1\Big[\tfrac16+t+\Big|\tfrac13-t\Big|+\frac{\sqrt3}{\pi}\ln\frac{1-\alpha}{\alpha}\,\sigma\Big(\tfrac12+\Big|\tfrac13-t\Big|\Big)\Big]dt.$$

Stage (b): Find the optimal switching instant $t_1^*$. For GA, we keep the following parameters: population size 40, maximal number of generations 200, crossover probability 0.9, and mutation probability 0.1. For the PSO algorithm, the parameters are taken as: swarm size 20, maximal number of iterations 300, the first strength-of-attraction constant 1.49, and the second strength-of-attraction constant 1.49.
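The PSO parameters just quoted can be wired into a minimal sketch. The inertia weight 0.7 and the reflecting clamp to the feasible interval are extra assumptions of this illustration; only the swarm size, iteration count, and attraction constants come from the text.

```python
# Hedged one-dimensional PSO sketch for Stage (b): maximize J over t1 in [lo, hi].
import random

def pso_maximize(J, lo, hi, swarm=20, iters=300, c1=1.49, c2=1.49, w=0.7, seed=1):
    rng = random.Random(seed)                 # seeded for reproducibility
    x = [rng.uniform(lo, hi) for _ in range(swarm)]   # particle positions
    v = [0.0] * swarm                                  # particle velocities
    pbest = x[:]                                       # personal bests
    pval = [J(t) for t in x]
    g = max(range(swarm), key=lambda i: pval[i])
    gbest, gval = pbest[g], pval[g]                    # global best
    for _ in range(iters):
        for i in range(swarm):
            v[i] = (w * v[i] + c1 * rng.random() * (pbest[i] - x[i])
                    + c2 * rng.random() * (gbest - x[i]))
            x[i] = min(hi, max(lo, x[i] + v[i]))       # keep t1 feasible
            val = J(x[i])
            if val > pval[i]:
                pbest[i], pval[i] = x[i], val
                if val > gval:
                    gbest, gval = x[i], val
    return gbest, gval
```

In contrast with GA there are no crossover or mutation operators to tune, which is the implementation advantage noted in the comparison that follows.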

Let $\sigma=0.1$ and choose $\alpha=0.95$. Table 6.1 (Results of optimization) lists $t_1^*$ and $J(t_1^*)$ for the GA-based approach and the PSO-based approach. From this table, we can see that nearly the same results are obtained by the GA and PSO approaches. However, compared with GA, the PSO algorithm is easier to implement because it has no evolution operators such as crossover and mutation. Therefore, under the condition of about the same result, we are more inclined to use the PSO algorithm for solving the problem. The optimal control law by PSO is
$$u^{(1)}_t=\begin{pmatrix}u^{(1)}_1(t)\\u^{(1)}_2(t)\end{pmatrix}=\begin{pmatrix}1\\-1\end{pmatrix},\ t\in[0,0.576),\qquad u^{(2)}_t=\begin{pmatrix}u^{(2)}_1(t)\\u^{(2)}_2(t)\end{pmatrix}=\begin{pmatrix}-1\\-1\end{pmatrix},\ t\in[0.576,1].$$

6.6 Discrete-Time Switched Linear Uncertain System

Consider the following class of discrete-time switched linear uncertain systems consisting of $m$ subsystems:
$$x(k+1)=A_{y(k)}x(k)+B_{y(k)}u(k)+\sigma_{k+1}\xi_{k+1},\quad k=0,1,\ldots,N-1,\tag{6.70}$$
where (i) for each $k\in K\triangleq\{0,1,\ldots,N-1\}$, $x(k)\in\mathbb R^n$ is the state vector with $x(0)$ given, $u(k)\in\mathbb R^r$ is the control vector, and $y(k)\in M\triangleq\{1,\ldots,m\}$ is the switching control that indicates the active subsystem at stage $k$; (ii) for each $i\in M$, $A_i$, $B_i$ are constant matrices of appropriate dimensions; (iii) for each $k\in K$, $\sigma_{k+1}\in\mathbb R^n$ with $\sigma_{k+1}\ne0$; $\xi_k$ is the disturbance, and $\xi_1,\xi_2,\ldots,\xi_N$ are independent ordinary linear uncertain variables denoted by $\mathcal L(-1,1)$.

The performance of the sequences $\{u(k)\}_{k=0}^{N-1}$ and $\{y(k)\}_{k=0}^{N-1}$ can be measured by the following expected value:
$$E\Big[\|x(N)\|^2_{Q_f}+\sum_{k=0}^{N-1}\big(\|x(k)\|^2_{Q_{y(k)}}+\|u(k)\|^2_{R_{y(k)}}\big)\Big],\tag{6.71}$$

where, for any $i\in M$, $Q_i\ge0$, $R_i>0$, $(Q_i,R_i)$ constitutes the cost-matrix pair of the $i$th subsystem, and $Q_f>0$ is the terminal penalty matrix. The goal is to solve the following problem.

Problem 6.1 Find $\{u^*(k)\}_{k=0}^{N-1}$ and $\{y^*(k)\}_{k=0}^{N-1}$ to minimize (6.71) subject to the dynamical system (6.70) with initial state $x(0)=x_0$.

By using the dynamic programming approach, we will derive the analytical solution of Problem 6.1. However, we should introduce the recurrence formula first. For any $0\le k\le N-1$, let $J(k,x_k)$ be the optimal reward obtainable on $[k,N]$ with the condition that at stage $k$ we are in state $x(k)=x_k$. Then we have
$$J(k,x_k)=\min_{u(i),y(i),\,k\le i\le N-1}E\Big[\|x(N)\|^2_{Q_f}+\sum_{j=k}^{N-1}\big(\|x(j)\|^2_{Q_{y(j)}}+\|u(j)\|^2_{R_{y(j)}}\big)\Big]$$
subject to
$$x(j+1)=A_{y(j)}x(j)+B_{y(j)}u(j)+\sigma_{j+1}\xi_{j+1},\quad j=k,\ldots,N-1,\qquad x(k)=x_k.\tag{6.72}$$

Theorem 6.7 For model (6.72), we have the following recurrence equation:
$$J(N,x_N)=\|x_N\|^2_{Q_f},\qquad J(k,x_k)=\min_{u(k),y(k)}E\big[\|x_k\|^2_{Q_{y(k)}}+\|u(k)\|^2_{R_{y(k)}}+J(k+1,x_{k+1})\big].$$

Proof The proof is similar to that of the recurrence theorem given earlier.

Analytical Solution

By using the recurrence equation, the analytical solution of Problem 6.1 can be derived. As in [11], define the following Riccati operator $\rho_i(P):S^n_+\to S^n_+$ for given $i\in M$ and $P\in S^n_+$:
$$\rho_i(P)\triangleq Q_i+A_i^\tau PA_i-A_i^\tau PB_i(B_i^\tau PB_i+R_i)^{-1}B_i^\tau PA_i.$$
Let $\{H_i\}_{i=0}^N$ denote the sets of ordered pairs of matrices defined recursively by
$$H_0=\{(Q_f,0)\},\qquad H_{k+1}=\bigcup_{(P,r)\in H_k}\Gamma_k(P,r),$$

with

Γ_k(P, r) = ∪_{i∈M} { (ρ_i(P), r + (1/3)||σ_{N-k}||^2_P) }, (P, r) ∈ H_k,

for k = 0, 1, ..., N - 1. Suppose that for each i ∈ M, k = 0, 1, ..., N - 1 and P ≥ 0, the following condition holds:

|(A_i x(k) + B_i u(k))^τ P σ_{k+1}| ≤ ||σ_{k+1}||^2_P, (6.73)

which means that at each stage k the disturbance acting on each subsystem is comparatively small. Next, we derive the analytical solution of Problem 6.1. First, we have

J(N, x_N) = ||x_N||^2_{Q_f} = min_{(P,r)∈H_0} ( ||x_N||^2_P + r ).

For k = N - 1, the following equation holds by Theorem 6.7:

J(N - 1, x_{N-1})
= min_{u(N-1), y(N-1)} E[ ||x_{N-1}||^2_{Q_{y(N-1)}} + ||u(N-1)||^2_{R_{y(N-1)}} + J(N, x_N) ]
= min_{u(N-1), y(N-1)} { ||x_{N-1}||^2_{Q_{y(N-1)}} + ||u(N-1)||^2_{R_{y(N-1)}}
  + E[ (A_{y(N-1)} x_{N-1} + B_{y(N-1)} u(N-1) + σ_N ξ_N)^τ Q_f (A_{y(N-1)} x_{N-1} + B_{y(N-1)} u(N-1) + σ_N ξ_N) ] }
= min_{u(N-1), y(N-1)} { ||x_{N-1}||^2_{Q_{y(N-1)} + A^τ_{y(N-1)} Q_f A_{y(N-1)}} + ||u(N-1)||^2_{R_{y(N-1)} + B^τ_{y(N-1)} Q_f B_{y(N-1)}}
  + 2u^τ(N-1) B^τ_{y(N-1)} Q_f A_{y(N-1)} x_{N-1}
  + E[ 2(A_{y(N-1)} x_{N-1} + B_{y(N-1)} u(N-1))^τ Q_f σ_N ξ_N + ||σ_N||^2_{Q_f} ξ_N^2 ] }. (6.74)

Denote a = 2(A_{y(N-1)} x_{N-1} + B_{y(N-1)} u(N-1))^τ Q_f σ_N, b = ||σ_N||^2_{Q_f}, and s = a/b. Under condition (6.73), we can derive |s| ≤ 2. Moreover, ξ_N is an ordinary linear uncertain variable with ξ_N ~ L(-1, 1). According to Example 1.6, the following equalities hold:

E[aξ_N + bξ_N^2] = b E[ξ_N^2 + sξ_N] = (1/3) b = (1/3) ||σ_N||^2_{Q_f}. (6.75)

Substituting (6.75) into (6.74) yields

J(N - 1, x_{N-1}) = min_{u(N-1), y(N-1)} { ||x_{N-1}||^2_{Q_{y(N-1)} + A^τ_{y(N-1)} Q_f A_{y(N-1)}} + ||u(N-1)||^2_{R_{y(N-1)} + B^τ_{y(N-1)} Q_f B_{y(N-1)}}
  + 2u^τ(N-1) B^τ_{y(N-1)} Q_f A_{y(N-1)} x_{N-1} + (1/3)||σ_N||^2_{Q_f} }
≜ min_{u(N-1), y(N-1)} f(u(N-1), y(N-1)). (6.76)

The optimal control u*(N-1) satisfies

∂f/∂u(N-1) = 2(R_{y*(N-1)} + B^τ_{y*(N-1)} Q_f B_{y*(N-1)}) u*(N-1) + 2B^τ_{y*(N-1)} Q_f A_{y*(N-1)} x_{N-1} = 0.

Since ∂^2 f/∂u^2(N-1) = 2(R_{y*(N-1)} + B^τ_{y*(N-1)} Q_f B_{y*(N-1)}) > 0, we have

u*(N-1) = -(R_{y*(N-1)} + B^τ_{y*(N-1)} Q_f B_{y*(N-1)})^{-1} B^τ_{y*(N-1)} Q_f A_{y*(N-1)} x_{N-1}. (6.77)

Substituting (6.77) into (6.76) yields

J(N - 1, x_{N-1}) = min_{y(N-1)} { x^τ_{N-1} [ Q_{y(N-1)} + A^τ_{y(N-1)} Q_f A_{y(N-1)} - A^τ_{y(N-1)} Q_f B_{y(N-1)} (R_{y(N-1)} + B^τ_{y(N-1)} Q_f B_{y(N-1)})^{-1} B^τ_{y(N-1)} Q_f A_{y(N-1)} ] x_{N-1} + (1/3)||σ_N||^2_{Q_f} }. (6.78)

According to the definitions of ρ_i(P) and H_k, Eq. (6.78) can be written as

J(N - 1, x_{N-1}) = min_{y(N-1)∈M} ( ||x_{N-1}||^2_{ρ_{y(N-1)}(Q_f)} + (1/3)||σ_N||^2_{Q_f} ) = min_{(P,r)∈H_1} ( ||x_{N-1}||^2_P + r ). (6.79)

Moreover, according to Eq. (6.79), we have

y*(N-1) = arg min_{y(N-1)∈M, (P,r)∈H_0} { ||x_{N-1}||^2_{ρ_{y(N-1)}(P)} + (1/3)||σ_N||^2_P + r }. (6.80)
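The stage-(N-1) step (6.76)-(6.80) can be sketched numerically: for each subsystem, form ρ_i(Q_f), the feedback (6.77), and the value (6.79), then pick the minimizing switch. The subsystems and data below are illustrative assumptions, written for 2x2 single-input systems.

```python
def matmul(A, B):
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(c) for c in zip(*A)]

def quad(P, x):
    """The quadratic form ||x||^2_P = x' P x (2-dimensional case)."""
    return sum(x[i] * P[i][j] * x[j] for i in range(2) for j in range(2))

def stage_N_minus_1(subsystems, Qf, sigma, x):
    """Return (J(N-1, x), y*(N-1), u*(N-1)) per (6.77) and (6.79)."""
    best = None
    for name, (Q, A, B, R) in subsystems.items():
        Bm = [[b] for b in B]
        QfA, QfB = matmul(Qf, A), matmul(Qf, Bm)
        s = matmul(transpose(Bm), QfB)[0][0] + R        # B'Q_f B + R
        g = matmul(transpose(Bm), QfA)[0]               # B'Q_f A (row)
        AtQfA = matmul(transpose(A), QfA)
        rho = [[Q[i][j] + AtQfA[i][j] - g[i] * g[j] / s
                for j in range(2)] for i in range(2)]
        u = -sum(gi * xi for gi, xi in zip(g, x)) / s    # feedback (6.77)
        val = quad(rho, x) + quad(Qf, sigma) / 3.0       # value (6.79)
        if best is None or val < best[0]:
            best = (val, name, u)
    return best

subsystems = {
    1: ([[1, 0], [0, 1]], [[2, 1], [0, 1]], [1, 1], 1),  # (Q_1, A_1, B_1, R_1)
    2: ([[1, 0], [0, 1]], [[1, 1], [1, 2]], [1, 2], 1),  # (Q_2, A_2, B_2, R_2)
}
Qf = [[2, 0], [0, 2]]
J, y_star, u_star = stage_N_minus_1(subsystems, Qf, [0.1, 0.1], [3, 1])
```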

For k = N - 2, we have

J(N - 2, x_{N-2})
= min_{u(N-2), y(N-2)} E[ ||x_{N-2}||^2_{Q_{y(N-2)}} + ||u(N-2)||^2_{R_{y(N-2)}} + J(N - 1, x_{N-1}) ]
= min_{u(N-2), y(N-2), (P,r)∈H_1} { ||x_{N-2}||^2_{Q_{y(N-2)}} + ||u(N-2)||^2_{R_{y(N-2)}}
  + E[ (A_{y(N-2)} x_{N-2} + B_{y(N-2)} u(N-2) + σ_{N-1} ξ_{N-1})^τ P (A_{y(N-2)} x_{N-2} + B_{y(N-2)} u(N-2) + σ_{N-1} ξ_{N-1}) ] + r }
= min_{u(N-2), y(N-2), (P,r)∈H_1} { ||x_{N-2}||^2_{Q_{y(N-2)} + A^τ_{y(N-2)} P A_{y(N-2)}} + ||u(N-2)||^2_{R_{y(N-2)} + B^τ_{y(N-2)} P B_{y(N-2)}}
  + 2u(N-2)^τ B^τ_{y(N-2)} P A_{y(N-2)} x_{N-2}
  + E[ 2(A_{y(N-2)} x_{N-2} + B_{y(N-2)} u(N-2))^τ P σ_{N-1} ξ_{N-1} + ||σ_{N-1}||^2_P ξ^2_{N-1} ] + r }. (6.81)

It follows from a computation similar to (6.75) that

E[ 2(A_{y(N-2)} x_{N-2} + B_{y(N-2)} u(N-2))^τ P σ_{N-1} ξ_{N-1} + ||σ_{N-1}||^2_P ξ^2_{N-1} ] = (1/3)||σ_{N-1}||^2_P.

By the same method as above, we can obtain

u*(N-2) = -(R_{y*(N-2)} + B^τ_{y*(N-2)} P* B_{y*(N-2)})^{-1} B^τ_{y*(N-2)} P* A_{y*(N-2)} x(N-2),

J(N - 2, x_{N-2}) = min_{y(N-2)∈M, (P,r)∈H_1} ( ||x_{N-2}||^2_{ρ_{y(N-2)}(P)} + (1/3)||σ_{N-1}||^2_P + r ) = min_{(P,r)∈H_2} ( ||x_{N-2}||^2_P + r ),

and

y*(N-2) = arg min_{y(N-2)∈M, (P,r)∈H_1} { ||x_{N-2}||^2_{ρ_{y(N-2)}(P)} + (1/3)||σ_{N-1}||^2_P + r }.

By induction, we obtain the following theorem.

Theorem 6.8 ([12]) Under condition (6.73), at stage k, for given x_k, the optimal switching control and optimal continuous control are

y*(k) = arg min_{y(k)∈M, (P,r)∈H_{N-k-1}} { ||x_k||^2_{ρ_{y(k)}(P)} + (1/3)||σ_{k+1}||^2_P + r }

and

u*(k) = -(R_{y*(k)} + B^τ_{y*(k)} P* B_{y*(k)})^{-1} B^τ_{y*(k)} P* A_{y*(k)} x(k),

respectively, where

(y*(k), P*, r*) = arg min_{y(k)∈M, (P,r)∈H_{N-k-1}} { ||x_k||^2_{ρ_{y(k)}(P)} + (1/3)||σ_{k+1}||^2_P + r }.

The optimal value of Problem 6.1 is

J(0, x_0) = min_{(P,r)∈H_N} ( ||x_0||^2_P + r ). (6.82)

Remark 6.1 Theorem 6.8 reveals that at iteration k, the optimal value and the optimal control law at all future iterations depend only on the current set H_k. The theorem thus properly transforms the enumeration over the m^N switching sequences into an enumeration over the pairs of matrices in H_k. It will be shown in the next section that the expression given by (6.82) is more convenient for the analysis and the efficient computation of Problem 6.1.

Two-Step Pruning Scheme

According to Theorem 6.8, at iteration k, the optimal value and the optimal control law at all future iterations depend only on the current set H_k. However, as k increases, the size of H_k grows exponentially, and it becomes infeasible to compute H_k when k grows large. A natural way of simplifying the computation is to discard some redundant pairs in H_k. In order to improve computational efficiency, this section presents a two-step pruning scheme that removes such redundant pairs: the first step is a local pruning and the second step is a global pruning. To formalize this idea, the following definitions are introduced.

Definition 6.1 A pair of matrices (P̂, r̂) is called redundant with respect to H if

min_{(P,r)∈H\{(P̂,r̂)}} { ||x||^2_P + r } = min_{(P,r)∈H} { ||x||^2_P + r }, ∀x ∈ R^n.

Definition 6.2 The set Ĥ is called equivalent to H, denoted by Ĥ ~ H, if

min_{(P,r)∈Ĥ} { ||x||^2_P + r } = min_{(P,r)∈H} { ||x||^2_P + r }, ∀x ∈ R^n.

Therefore, any subset equivalent to H_k defines the same J(k, x_k). To ease the computation, we shall prune away as many redundant pairs as possible from H_k and obtain an equivalent subset of H_k whose size is as small as possible. In order to remove as

many redundant pairs of matrices from H_k as possible, a two-step pruning scheme is applied here. The first step is a local pruning, which removes some redundant pairs from Γ_k(P, r) for each (P, r); the second step is a global pruning, which removes redundant pairs from H_{k+1} after the first step.

Local Pruning Scheme

The goal of the local pruning algorithm is to remove as many redundant pairs of matrices as possible from Γ_k(P, r). However, testing whether a pair is redundant is a challenging problem. A sufficient condition for redundancy is given in the following lemma.

Lemma 6.1 ([12]) A pair (P̂, r̂) is redundant in Γ_k(P, r) if there exist nonnegative constants α_1, α_2, ..., α_{s-1} such that Σ_{i=1}^{s-1} α_i = 1 and

P̂ ≥ Σ_{i=1}^{s-1} α_i P^{(i)}, (6.83)

where s = |Γ_k(P, r)| and {(P^{(i)}, r^{(i)})}_{i=1}^{s-1} is an enumeration of Γ_k(P, r)\{(P̂, r̂)}.

Proof First, from the definition Γ_k(P, r) = ∪_{i∈M} {(ρ_i(P), r + (1/3)||σ_{N-k}||^2_P)}, for any pair (P^{(i)}, r^{(i)}) in Γ_k(P, r), the second component r^{(i)} is equal to r + (1/3)||σ_{N-k}||^2_P. Additionally, by condition (6.83) we know

α_1(P̂ - P^{(1)}) + ··· + α_{s-1}(P̂ - P^{(s-1)}) ≥ 0.

For any x ≠ 0, we have

α_1 ||x||^2_{P̂ - P^{(1)}} + ··· + α_{s-1} ||x||^2_{P̂ - P^{(s-1)}} ≥ 0.

So there exists at least one i such that the following formula holds:

||x||^2_{P̂ - P^{(i)}} ≥ 0.

Since r^{(i)} = r̂, we obtain

||x||^2_{P̂} + r̂ ≥ ||x||^2_{P^{(i)}} + r^{(i)},

which indicates that (P̂, r̂) is redundant in Γ_k(P, r). The proof is completed.

Checking condition (6.83) in Lemma 6.1 is an LMI feasibility problem, which can be solved with the MATLAB LMI toolbox. However, Lemma 6.1 cannot remove all the redundant pairs. If the condition in Lemma 6.1 is met, then the pairs under

consideration will be discarded; otherwise, the pairs are kept and enter H_{k+1}. As we know, the size of H_{k+1} is crucial throughout the computational process. So, after this step, we apply a global pruning to H_{k+1}.

Global Pruning Scheme

Whether a pair in H_k is redundant can be checked by the following lemma.

Lemma 6.2 ([12]) A pair (P̄, r̄) is redundant in H_k if there exist nonnegative constants α_1, α_2, ..., α_{l-1} such that Σ_{i=1}^{l-1} α_i = 1 and

( P̄ 0 ; 0 r̄ ) ≥ Σ_{i=1}^{l-1} α_i ( P^{(i)} 0 ; 0 r^{(i)} ), (6.84)

where l = |H_k| and {(P^{(i)}, r^{(i)})}_{i=1}^{l-1} is an enumeration of H_k\{(P̄, r̄)}.

The proof of Lemma 6.2 is similar to that of Lemma 6.1. A detailed description of the two-step pruning process is given in Algorithm 6.3.

Remark 6.2 Here, after the local pruning, the set H_k is denoted by H̄_k; then H̄_k is denoted by Ĥ_k after the global pruning. This two-step pruning scheme differs from the approach proposed in [13], which only prunes redundant pairs in H_{k+1}. Because the size of H_{k+1} is much larger than that of Γ_k, checking whether a pair in H_{k+1} is redundant is more costly than doing so in Γ_k, whose size is only m. The two-step pruning scheme thus decreases the computational complexity of each round of checking.

Remark 6.3 To make the two-step pruning scheme clearer, we offer a metaphor. Imagine that we have to select the best basketball players from a university with thousands of students. How should we select them efficiently? Obviously, one-on-one or one-against-several competition among all the students of the university is not an efficient method; the purely global pruning scheme [13] is like this. The two-step pruning scheme instead first chooses some better players from each college or department of the university, which can be viewed as a local pruning. Second, the best players are selected through competition among these better players, which can be viewed as a global pruning. With the two-step pruning scheme, we can select the best basketball players of a university efficiently. Similar pruning schemes are widely used in influential sports competitions, such as the regular season and playoffs of the NBA, or the group phase and knockout round of the Football World Cup.

Remark 6.4 The discrete-time problem is a multistage decision-making process. It differs markedly from the continuous-time case [6], not only in the form of the solution but also in the method of solving.
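The redundancy tests of Lemmas 6.1 and 6.2 can be imitated numerically. The sketch below works with 2x2 matrices and scans pairwise simplex mixtures on a grid, checking positive semidefiniteness via principal minors; this is a crude sufficient stand-in for the LMI feasibility problem (a real implementation would use an LMI/SDP solver), and all data are illustrative.

```python
def is_psd(M, tol=1e-9):
    """PSD test for a symmetric 2x2 matrix via principal minors."""
    return (M[0][0] >= -tol and M[1][1] >= -tol and
            M[0][0] * M[1][1] - M[0][1] * M[1][0] >= -tol)

def dominated(Ph, rh, others, steps=200):
    """Sufficient redundancy test for (Ph, rh) against the pairs in
    'others': look for a pairwise mixture alpha*Pa + (1-alpha)*Pb that
    Ph dominates in the PSD order, with mixture r-value <= rh."""
    cand = [(P, r) for (P, r) in others if r <= rh + 1e-12]
    for a_idx in range(len(cand)):
        for b_idx in range(a_idx, len(cand)):
            (Pa, _), (Pb, _) = cand[a_idx], cand[b_idx]
            for j in range(steps + 1):
                a = j / steps
                D = [[Ph[i][k] - (a * Pa[i][k] + (1 - a) * Pb[i][k])
                      for k in range(2)] for i in range(2)]
                if is_psd(D):
                    return True
    return False

def prune(pairs):
    """Drop every pair flagged redundant by the test above."""
    kept = []
    for i, (P, r) in enumerate(pairs):
        others = pairs[:i] + pairs[i + 1:]
        if not others or not dominated(P, r, others):
            kept.append((P, r))
    return kept

pairs = [([[1.0, 0.0], [0.0, 3.0]], 0.5),
         ([[3.0, 0.0], [0.0, 1.0]], 0.5),
         ([[2.5, 0.0], [0.0, 2.5]], 0.5)]   # third dominates no direction
kept = prune(pairs)
```

Here the third pair is dominated by the equal-weight mixture of the first two, so it is removed, while the first two pairs survive because each is strictly best in some direction.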

Algorithm 6.3 (Two-step pruning scheme)
1: Set Ĥ_0 = {(Q_f, 0)};
2: for k = 0 to N - 1 do
3:   for all (P, r) ∈ Ĥ_k do
4:     Γ_k(P, r) = ∅;
5:     for i = 1 to m do
6:       P^{(i)} = ρ_i(P),
7:       r^{(i)} = r + (1/3)||σ_{N-k}||^2_P,
8:       Γ_k(P, r) = Γ_k(P, r) ∪ {(P^{(i)}, r^{(i)})};
9:     end for
10:    for i = 1 to m do
11:      if (P^{(i)}, r^{(i)}) satisfies the condition in Lemma 6.1 then
12:        Γ_k(P, r) = Γ_k(P, r)\{(P^{(i)}, r^{(i)})};
13:      end if
14:    end for
15:  end for
16:  Ĥ_{k+1} = ∪_{(P,r)∈Ĥ_k} Γ_k(P, r);
17:  H̄_{k+1} = Ĥ_{k+1};
18:  for i = 1 to |H̄_{k+1}| do
19:    if (P̂^{(i)}, r̂^{(i)}) satisfies the condition in Lemma 6.2 then
20:      Ĥ_{k+1} = Ĥ_{k+1}\{(P̂^{(i)}, r̂^{(i)})};
21:    end if
22:  end for
23: end for
24: J(0, x_0) = min_{(P,r)∈Ĥ_N} ( ||x_0||^2_P + r ).

The sets {Ĥ_k}_{k=0}^N generated by Algorithm 6.3 typically contain far fewer pairs of matrices than {H_k}_{k=0}^N and are thus much easier to deal with.

Examples

Example 6.1 Consider the uncertain discrete-time optimal control Problem 6.1 with N = 10, m = 3, and

A_1 = (2 1; 0 1), B_1 = (1, 1)^τ, A_2 = (1 1; 1 2), B_2 = (1, 2)^τ, A_3 = (2 1; 1 2), B_3 = (2, 1)^τ,
Q_1 = Q_2 = Q_3 = (1 0; 0 1), Q_f = (4 1; 1 2), R_1 = R_2 = R_3 = 1, σ_k = ( , )^τ for k = 1, 2, ..., N.

Algorithm 6.3 is applied to solve this problem. The numbers of elements in H̄_k and Ĥ_k at each step are listed in Table 6.2. It turns out that Ĥ_k is

Table 6.2 Size of H̄_k and Ĥ_k for Example 6.1 (columns: k, |H̄_k|, |Ĥ_k|)

Table 6.3 Optimal results of Example 6.1 (columns: k, y*(k), r_k, x(k), u*(k), J(k, x_k))

very small compared with |H_k|, which grows exponentially as k increases. Choosing x_0 = (3, 1)^τ, the optimal controls and the optimal values are obtained by Theorem 6.8 and listed in Table 6.3. The data in the fourth column of Table 6.3 are the corresponding states, which are derived from

x(k + 1) = A_{y*(k)} x(k) + B_{y*(k)} u*(k) + σ_{k+1} r_{k+1},

where r_{k+1} is a realization of the uncertain variable ξ_{k+1} ~ L(-1, 1) and may be generated by r_{k+1} = Φ^{-1}_{ξ_{k+1}}(random(0, 1)) (k = 0, 1, 2, ..., 9). The size of H̄_k indicates the effect of the local pruning. In order to test the effect of the local pruning further, we increase the number of subsystems and consider the following problem.

Example 6.2 Consider a more complex example with 6 subsystems (m = 6). The first three subsystems are the same as in Example 6.1, and the other three are chosen as

A_4 = (1 2; 0 1), B_4 = (0, 1)^τ, A_5 = (1 2; 1 1), B_5 = (1, 3)^τ, A_6 = (5 1; 1 5), B_6 = (2, 1)^τ,
Q_4 = Q_5 = Q_6 = (1 0; 0 1), R_4 = R_5 = R_6 = 1.

The numbers of elements in H̄_k and Ĥ_k at each step are listed in Table 6.4. It can be seen that the sizes of H̄_k and Ĥ_k do not necessarily increase with the number of

Table 6.4 Size of H̄_k and Ĥ_k for Example 6.2 (columns: k, |H̄_k|, |Ĥ_k|)

Table 6.5 Optimal results of Example 6.2 (columns: k, y*(k), r_k, x(k), u*(k), J(k, x_k))

subsystems. Additionally, with more subsystems, the effectiveness of the local pruning becomes more apparent. Choosing x_0 = (3, 1)^τ, the optimal controls and the optimal values are listed in Table 6.5.

References

1. Wang L, Beydoun A, Sun J, Kolmanovsky I (1997) Optimal hybrid control with application to automotive powertrain systems. Lecture Notes in Control and Information Sciences, vol 222
2. Xu X, Antsaklis P (2004) Optimal control of switched systems based on parameterization of the switching instants. IEEE Trans Autom Control 49(1):2-16
3. Bengea S, DeCarlo R (2005) Optimal control of switching systems. Automatica 41(1)
4. Teo KL, Goh C, Wong K (1991) A unified computational approach to optimal control problems. Longman Scientific and Technical, New York
5. Lee H, Teo K, Rehbock V, Jennings L (1999) Control parametrization enhancing technique for optimal discrete-valued control problems. Automatica 35(8)
6. Yan H, Zhu Y (2015) Bang-bang control model for uncertain switched systems. Appl Math Model 39(10-11)
7. Hopfinger E, Luenberger D (1976) On the solution of the unidimensional local minimization problem. J Optim Theory Appl 18(3)
8. Yan H, Sheng L, Zhu Y (2016) Linear quadratic optimization models of uncertain switched systems. ICIC Expr Lett 10(10)

9. Zhu Y (2013) An intelligent algorithm: MACO for continuous optimization models. J Intell Fuzzy Syst 24(1)
10. Yan H, Zhu Y (2017) Bang-bang control model with optimistic value criterion for uncertain switched systems. J Intell Manuf 28(3)
11. Zhang W, Hu J, Abate A (2009) On the value function of the discrete-time switched LQR problem. IEEE Trans Autom Control 54(11)
12. Yan H, Sun Y, Zhu Y (2017) A linear-quadratic control problem of uncertain discrete-time switched systems. J Ind Manag Optim 13(1)
13. Zhang W, Hu J, Lian J (2010) Quadratic optimal control of switched linear stochastic systems. Syst Control Lett 59(11):736-744

Chapter 7
Optimal Control for Time-Delay Uncertain Systems

Assume that an uncertain process X_t (t ≥ -d) takes values in a closed set A ⊂ R^n and describes the state of a system at time t that started at time -d < 0. Here, d describes a constant delay inherent to the system. Let C_A[-d, 0] denote the space of all continuous functions on [-d, 0] taking values in A. For t ∈ [-d, 0], the process X_t coincides with a function φ_0 ∈ C_A[-d, 0]. For t ≥ 0, X_{t+s} (s ∈ [-d, 0]) describes the associated segment process of X_t, denoted by φ_t(s) = X_{t+s}, s ∈ [-d, 0]. We consider a system whose dynamics may depend not only on the current state but also on the segment process through the processes

Y_t = ∫_{-d}^0 e^{λs} f(X_{t+s}) ds, ζ_t = f(X_{t-d}), t ≥ 0,

where f: R^n → R^k is a differentiable function and λ ∈ R is a constant. The system can be controlled by u = {u_t, t ≥ 0} taking values in a closed subset U of R^m. At every time t ≥ 0, an immediate reward F(t, X_t, Y_t, u_t) is accrued, and the terminal state of the system earns a reward h(X_T, Y_T). We then look for a control process u that maximizes the overall expected reward over the horizon [0, T]. That is, we consider the following uncertain optimal control problem with time-delay:

J(0, φ_0) = sup_{u∈U} E[ ∫_0^T F(s, X_s, Y_s, u_s) ds + h(X_T, Y_T) ]
subject to
dX_s = μ_1(s, X_s, Y_s, u_s) ds + μ_2(X_s, Y_s) ζ_s ds + σ(s, X_s, Y_s, u_s) dC_s, s ∈ [0, T],
X_s = φ_0(s), -d ≤ s ≤ 0. (7.1)

In the above model, X_s is the state vector of dimension n, u_s takes values in a closed subset U of R^m, F: [0, +∞) × R^n × R^k × U → R is the objective function, and h: R^n × R^k → R is the terminal reward function. In addition, μ_1: [0, +∞) × R^n × R^k × U → R^n is a column-vector function, μ_2: R^n × R^k → R^{n×k} a matrix function, σ: [0, +∞) × R^n × R^k × U → R^{n×l} a matrix function, and C_s = (C_{s1}, C_{s2}, ..., C_{sl})^τ, where C_{s1}, C_{s2}, ..., C_{sl} are independent Liu canonical processes. The function J(0, φ_0) is the expected optimal reward obtainable on [0, T] with the initial condition that at time 0 we have the state φ_0(s) between -d and 0, where φ_0 ∈ C_A[-d, 0] is a given function. The final time T > 0 is fixed or free. A feasible control process is one that takes values in the set U.

7.1 Optimal Control Model with Time-Delay

For any 0 < t < T, J(t, φ_t) is the expected optimal reward obtainable on [t, T] under the condition that we have the state φ_t(s) between t - d and t. That is, consider the following problem (P):

(P) J(t, φ_t) = sup_{u∈U} E[ ∫_t^T F(s, X_s, Y_s, u_s) ds + h(X_T, Y_T) ]
subject to
dX_s = μ_1(s, X_s, Y_s, u_s) ds + μ_2(X_s, Y_s) ζ_s ds + σ(s, X_s, Y_s, u_s) dC_s, s ∈ [t, T],
X_s = φ_t(s), s ∈ [-d, 0]. (7.2)

Note that the value function J is defined on the infinite-dimensional space [0, T] × C_A[-d, 0], so the equation of optimality (2.15) is not directly applicable. We will formulate an uncertain control problem (P̃) with finite-dimensional state space such that an optimal control process for (P) can be constructed from an optimal solution of the problem (P̃). In order to transform the uncertain control problem (P), we introduce the following assumption.
Assumption 7.1 There exists an operator Z: R^n × R^k → R^n such that

e^{λd} D_x Z(x, y) μ_2(x, y) - D_y Z(x, y) = 0, ∀(x, y) ∈ R^n × R^k, (7.3)
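Condition (7.3) can be checked numerically for a candidate operator. The sketch below does this in the scalar case for the operator used later in the LQ section, Z(x, y) = e^{-λd} x + A_3 y with μ_2(x, y) = A_3, under the sign conventions reconstructed here; the parameter values are illustrative.

```python
import math

lam, d, A3 = 5.0, 0.2, 1.0

def Z(x, y):
    """Candidate operator: Z(x, y) = exp(-lam*d)*x + A3*y."""
    return math.exp(-lam * d) * x + A3 * y

def partial(f, x, y, which, h=1e-6):
    """Central finite difference for D_x or D_y."""
    if which == 'x':
        return (f(x + h, y) - f(x - h, y)) / (2 * h)
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

mu2 = A3
# Residual of condition (7.3): exp(lam*d) * D_x Z * mu2 - D_y Z.
residual = (math.exp(lam * d) * partial(Z, 1.3, -0.7, 'x') * mu2
            - partial(Z, 1.3, -0.7, 'y'))
```

Since Z is linear, the finite differences are essentially exact and the residual vanishes: e^{λd} · e^{-λd} · A_3 - A_3 = 0.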

where D_x Z(x, y) and D_y Z(x, y) denote the Jacobian matrices of Z in x and in y, respectively.

This transformation yields a new state process Z_t = Z(X_t, Y_t). Let S = A × y(C_A[-d, 0]). For ψ ∈ C_A[-d, 0], we denote x(ψ) = ψ(0), y(ψ) = ∫_{-d}^0 e^{λs} f(ψ(s)) ds, and ζ(ψ) = f(ψ(-d)). Then Z_t takes values in Z(S). In order to derive the dynamics of the transformed process Z, we need the following lemma.

Lemma 7.1 ([1]) Let G(t, x, y): [0, +∞) × R^n × R^k → R^n be a continuously differentiable function, and consider a feasible control process u_t ∈ U. Then the uncertain process G(t, X_t, Y_t) satisfies

dG(t, X_t, Y_t) = {G_t(t, X_t, Y_t) + D_x G(t, X_t, Y_t)(μ_1(t, X_t, Y_t, u_t) + μ_2(X_t, Y_t)ζ_t)} dt
+ D_x G(t, X_t, Y_t) σ(t, X_t, Y_t, u_t) dC_t + D_y G(t, X_t, Y_t)(f(X_t) - e^{-λd} ζ_t - λY_t) dt. (7.4)

Proof For a given feasible control process u_t with state process X_t, define a process F_t by F_t = ∫_0^t f(X_s) ds. Then the process Y_t has the representation

Y_t = ∫_{-d}^0 e^{λs} f(X_{t+s}) ds = ∫_{-d}^0 e^{λs} dF_{t+s}
= [e^{λs} F_{t+s}]_{s=-d}^{s=0} - ∫_{-d}^0 F_{t+s} de^{λs}
= F_t - e^{-λd} F_{t-d} - λ ∫_{-d}^0 e^{λs} F_{t+s} ds
= ∫_0^t f(X_s) ds - e^{-λd} ∫_0^{t-d} f(X_s) ds - λ ∫_{-d}^0 e^{λs} ∫_0^{t+s} f(X_r) dr ds.

Thus,

dY_t = ( f(X_t) - e^{-λd} f(X_{t-d}) - λY_t ) dt = ( f(X_t) - e^{-λd} ζ_t - λY_t ) dt.

Applying Theorem 1.14 to G(t, X_t, Y_t), Eq. (7.4) follows.

Now we are able to present the dynamics of Z_t = Z(X_t, Y_t) by using (7.3) and (7.4). It can be seen that

dZ_t = dZ(X_t, Y_t)
= D_x Z(X_t, Y_t)(μ_1(t, X_t, Y_t, u_t) + μ_2(X_t, Y_t)ζ_t) dt + D_x Z(X_t, Y_t) σ(t, X_t, Y_t, u_t) dC_t + D_y Z(X_t, Y_t)(f(X_t) - e^{-λd} ζ_t - λY_t) dt
= D_x Z(X_t, Y_t) μ_1(t, X_t, Y_t, u_t) dt + D_y Z(X_t, Y_t)(f(X_t) - λY_t) dt + D_x Z(X_t, Y_t) σ(t, X_t, Y_t, u_t) dC_t.

Define μ̄: [0, +∞) × R^n × R^k × U → R^n by

μ̄(t, x, y, u) = D_x Z(x, y)μ_1(t, x, y, u) + D_y Z(x, y)(f(x) - λy),

and σ̄: [0, +∞) × R^n × R^k × U → R^{n×l} by

σ̄(t, x, y, u) = D_x Z(x, y)σ(t, x, y, u).

If the functions μ̄ and σ̄, as well as h, depended on (x, y) through Z(x, y) only, then the problem (P) could be reduced to a finite-dimensional problem.

Assumption 7.2 There are functions μ̃: [0, +∞) × R^n × U → R^n, σ̃: [0, +∞) × R^n × U → R^{n×l}, F̃: [0, +∞) × R^n × U → R, and h̃: R^n → R such that for all t ∈ [0, T], u ∈ U, (x, y) ∈ R^n × R^k, we have

μ̃(t, Z(x, y), u) = μ̄(t, x, y, u), σ̃(t, Z(x, y), u) = σ̄(t, x, y, u),
F̃(t, Z(x, y), u) = F(t, x, y, u), h̃(Z(x, y)) = h(x, y).

We introduce a finite-dimensional control problem (P̃) associated with (P) via the transformation. For φ_t ∈ C_A[-d, 0], define z = Z(x(φ_t), y(φ_t)) ∈ Z(S). Then, for t ∈ [0, T], the problem (P) can be transformed into the problem (P̃):

(P̃) J̃(t, z) = sup_{u∈U} E[ ∫_t^T F̃(s, Z_s, u_s) ds + h̃(Z_T) ]
subject to
dZ_s = μ̃(s, Z_s, u_s) ds + σ̃(s, Z_s, u_s) dC_s, s ∈ [t, T],
Z_t = z, u_s ∈ U, s ∈ [t, T]. (7.5)

The value function J̃ of the uncertain optimal control problem (P̃) has a finite-dimensional state space, so we can directly use the equation of optimality (2.15) for (P̃) and obtain the main result of this chapter.

Theorem 7.1 ([1]) Suppose that Assumptions 7.1 and 7.2 hold and that J̃(t, z) is twice differentiable on [0, T] × R^n. Then we have

-J̃_t(t, z) = sup_{u_t∈U} { F̃(t, z, u_t) + ∇_z J̃(t, z)^τ μ̃(t, z, u_t) }, J̃(T, Z_T) = h̃(Z_T), (7.6)

and J̃(t, z) = J(t, φ_t), where J̃_t(t, z) is the partial derivative of the function J̃(t, z) in t, and ∇_z J̃(t, z) is the gradient of J̃(t, z) in z.

Proof Equation (7.6) follows directly from the equation of optimality (2.15). In addition, for any u ∈ U, we have

J̃(t, z) ≥ E[ ∫_t^T F̃(s, Z_s, u_s) ds + h̃(Z_T) ] = E[ ∫_t^T F(s, X_s, Y_s, u_s) ds + h(X_T, Y_T) ].

Thus,

J̃(t, z) ≥ sup_{u∈U} E[ ∫_t^T F(s, X_s, Y_s, u_s) ds + h(X_T, Y_T) ] = J(t, φ_t).

Similarly, we can get J(t, φ_t) ≥ J̃(t, z). Therefore, the theorem is proved.

Remark 7.1 The optimal decision and the optimal expected value of problem (P) are determined whenever Eq. (7.6) has solutions.

7.2 Uncertain Linear Quadratic Model with Time-Delay

In this section, we apply the result obtained in the previous section to study an uncertain LQ problem with time-delay. Let A_1(t), A_2(t), A_4(t), A_5(t), A_6(t), A_7(t), B(t), H(t), I(t), L(t), M(t), N(t), R(t) be continuously differentiable functions of t. Moreover, let A_3 ≠ 0 and a be constants, and let I(t) ≤ 0 and R(t) < 0. For ψ ∈ C_R[-d, 0], denote x(ψ) = ψ(0), y(ψ) = ∫_{-d}^0 e^{λs} ψ(s) ds, ζ(ψ) = ψ(-d). Then an uncertain LQ problem with time-delay is stated as

(LQ) J(t, φ_t) = sup_{u∈U} E[ ∫_t^T { I(s)(e^{-λd}X_s + A_3 Y_s)^2 + R(s)u_s^2 + H(s)(e^{-λd}X_s + A_3 Y_s)u_s + L(s)(e^{-λd}X_s + A_3 Y_s) + M(s)u_s + N(s) } ds + a(e^{-λd}X_T + A_3 Y_T)^2 ]
subject to
dX_s = {A_1(s)X_s + A_2(s)Y_s + A_3 ζ_s + B(s)u_s + A_4(s)} ds + {A_5(s)X_s + A_6(s)Y_s + A_7(s)} dC_s, s ∈ [t, T],
Y_s = ∫_{-d}^0 e^{λr} X_{s+r} dr, ζ_s = X_{s-d}, s ∈ [t, T],
X_s = φ_t(s), -d ≤ s ≤ 0,
u_s ∈ U, s ∈ [t, T],

where φ_0 ∈ C_R[-d, 0] is a given initial function, φ_t ∈ C_R[-d, 0] is the segment of X_t for t > 0, and U is the set of feasible controls. In addition, we are in state X_t = x at time t.

Theorem 7.2 ([1]) If A_2(t) = e^{λd}A_3(A_1(t) + e^{λd}A_3 + λ) and A_6(t) = e^{λd}A_3 A_5(t) hold in the (LQ) model, then the optimal control u_t* of (LQ) is

u_t* = -[(H(t) + e^{-λd}B(t)P(t))z + e^{-λd}B(t)Q(t) + M(t)] / (2R(t)), (7.7)

where P(t) satisfies

dP(t)/dt = e^{-2λd}B(t)^2 P(t)^2/(2R(t)) + H(t)^2/(2R(t)) + ( e^{-λd}H(t)B(t)/R(t) - 2A_1(t) - 2A_3 e^{λd} ) P(t) - 2I(t),
P(T) = 2a, (7.8)

and Q(t) is a solution of the following differential equation:

dQ(t)/dt = ( (e^{-λd}H(t)B(t) + e^{-2λd}B(t)^2 P(t))/(2R(t)) - A_1(t) - A_3 e^{λd} ) Q(t)
- e^{-λd}P(t)A_4(t) + (e^{-λd}M(t)B(t)P(t) + H(t)M(t))/(2R(t)) - L(t),
Q(T) = 0. (7.9)

The optimal value of (LQ) is

J(t, φ_t) = (1/2)P(t)z^2 + Q(t)z + K(t), (7.10)

where z = e^{-λd}x + A_3 ∫_{-d}^0 e^{λs} X_{t+s} ds, and

K(t) = -∫_t^T { M(s)^2/(4R(s)) + e^{-2λd}B(s)^2 Q(s)^2/(4R(s)) + e^{-λd}B(s)M(s)Q(s)/(2R(s)) - N(s) - e^{-λd}Q(s)A_4(s) } ds. (7.11)

Proof The problem (LQ) is a special case of (P). In order to solve (LQ) by employing Theorem 7.1, we need to check Assumptions 7.1 and 7.2 for the (LQ) model. Note that

μ_1(t, x, y, u) = A_1(t)x + A_2(t)y + B(t)u + A_4(t), μ_2(x, y) = A_3,
F(t, x, y, u) = I(t)(e^{-λd}x + A_3 y)^2 + R(t)u^2 + H(t)(e^{-λd}x + A_3 y)u + L(t)(e^{-λd}x + A_3 y) + M(t)u + N(t),
h(x, y) = a(e^{-λd}x + A_3 y)^2, σ(t, x, y, u) = A_5(t)x + A_6(t)y + A_7(t).

We set Z(x, y) = e^{-λd}x + A_3 y, so that Assumption 7.1 holds for this (LQ) problem. Furthermore, we have

μ̄(t, x, y, u) = Z_x(x, y)μ_1(t, x, y, u) + Z_y(x, y)(f(x) - λy)
= e^{-λd}(A_1(t)x + A_2(t)y + B(t)u + A_4(t)) + A_3(x - λy)
= (A_1(t) + e^{λd}A_3)Z(x, y) + (e^{-λd}A_2(t) - A_3 A_1(t) - e^{λd}A_3^2 - λA_3)y + e^{-λd}(B(t)u + A_4(t)),
F̄(t, x, y, u) = I(t)Z(x, y)^2 + R(t)u^2 + H(t)Z(x, y)u + L(t)Z(x, y) + M(t)u + N(t),
h̄(x, y) = aZ(x, y)^2,
σ̄(t, x, y, u) = Z_x(x, y)σ(t, x, y, u) = e^{-λd}(A_5(t)x + A_6(t)y + A_7(t)) = A_5(t)Z(x, y) - (A_3 A_5(t) - A_6(t)e^{-λd})y + e^{-λd}A_7(t).

Therefore, Assumption 7.2 holds if and only if

A_2(t) = e^{λd}A_3(A_1(t) + e^{λd}A_3 + λ), A_6(t) = e^{λd}A_3 A_5(t).

The reduced finite-dimensional uncertain control problem becomes

(L̃Q̃) J̃(t, z) = sup_{u∈U} E[ ∫_t^T { I(s)Z_s^2 + R(s)u_s^2 + H(s)Z_s u_s + L(s)Z_s + M(s)u_s + N(s) } ds + aZ_T^2 ]
subject to
dZ_s = { (A_1(s) + e^{λd}A_3)Z_s + e^{-λd}(B(s)u_s + A_4(s)) } ds + { A_5(s)Z_s + e^{-λd}A_7(s) } dC_s, s ∈ [t, T],
Z_t = z, u_s ∈ U, s ∈ [t, T], (7.12)

where z = Z(x(φ_t), y(φ_t)). By using Theorem 7.1, we know that J̃(t, z) satisfies

-J̃_t(t, z) = sup_{u_t∈U} { F̄(t, z, u_t) + J̃_z(t, z)μ̄(t, z, u_t) },

that is,

-J̃_t(t, z) = sup_{u∈U} { I(t)z^2 + R(t)u_t^2 + H(t)zu_t + L(t)z + M(t)u_t + N(t) + [(A_1(t) + e^{λd}A_3)z + e^{-λd}(B(t)u_t + A_4(t))]J̃_z }. (7.13)

Let

g(u_t) = I(t)z^2 + R(t)u_t^2 + H(t)zu_t + L(t)z + M(t)u_t + N(t) + [(A_1(t) + e^{λd}A_3)z + e^{-λd}(B(t)u_t + A_4(t))]J̃_z.

Setting ∂g(u_t)/∂u_t = 0 yields

2R(t)u_t + H(t)z + M(t) + e^{-λd}B(t)J̃_z = 0.

Hence,

u_t = -(H(t)z + M(t) + e^{-λd}B(t)J̃_z) / (2R(t)). (7.14)

By Eq. (7.13), we have

-J̃_t(t, z) = I(t)z^2 + R(t)u_t^2 + H(t)zu_t + L(t)z + M(t)u_t + N(t) + [(A_1(t) + e^{λd}A_3)z + e^{-λd}(B(t)u_t + A_4(t))]J̃_z. (7.15)

Since J̃(T, Z_T) = aZ_T^2, we guess

J̃(t, z) = (1/2)P(t)z^2 + Q(t)z + K(t). (7.16)

Thus,

J̃_t(t, z) = (1/2)(dP(t)/dt)z^2 + (dQ(t)/dt)z + dK(t)/dt (7.17)

and

J̃_z(t, z) = P(t)z + Q(t). (7.18)

Substituting (7.14) and (7.18) into (7.15) yields

J̃_t(t, z) = [ H(t)^2/(4R(t)) + e^{-λd}H(t)B(t)P(t)/(2R(t)) + e^{-2λd}B(t)^2 P(t)^2/(4R(t)) - P(t)A_1(t) - P(t)A_3 e^{λd} - I(t) ] z^2
+ [ (e^{-λd}H(t)B(t) + e^{-2λd}B(t)^2 P(t))/(2R(t)) Q(t) - A_3 e^{λd} Q(t) - A_1(t)Q(t) - L(t) + (e^{-λd}M(t)B(t)P(t) + H(t)M(t))/(2R(t)) - e^{-λd}P(t)A_4(t) ] z
+ M(t)^2/(4R(t)) + e^{-2λd}B(t)^2 Q(t)^2/(4R(t)) + e^{-λd}B(t)M(t)Q(t)/(2R(t)) - N(t) - e^{-λd}Q(t)A_4(t). (7.19)

By Eqs. (7.17) and (7.19), we get

dP(t)/dt = -2I(t) + H(t)^2/(2R(t)) + e^{-λd}H(t)B(t)P(t)/R(t) + e^{-2λd}B(t)^2 P(t)^2/(2R(t)) - 2P(t)(A_1(t) + A_3 e^{λd}), (7.20)

dQ(t)/dt = ( (e^{-λd}H(t)B(t) + e^{-2λd}B(t)^2 P(t))/(2R(t)) - A_1(t) - A_3 e^{λd} ) Q(t) - e^{-λd}P(t)A_4(t) + (e^{-λd}M(t)B(t)P(t) + H(t)M(t))/(2R(t)) - L(t), (7.21)

and

dK(t)/dt = M(t)^2/(4R(t)) + e^{-2λd}B(t)^2 Q(t)^2/(4R(t)) + e^{-λd}B(t)M(t)Q(t)/(2R(t)) - N(t) - e^{-λd}Q(t)A_4(t). (7.22)

Since J̃(T, z) = (1/2)P(T)z^2 + Q(T)z + K(T) = az^2, we have P(T) = 2a, Q(T) = 0, and K(T) = 0. By Eqs. (7.20) and (7.21), we obtain (7.8) and (7.9). By Eq. (7.22), Eq. (7.11) holds. Therefore,

J(t, φ_t) = J̃(t, z) = (1/2)P(t)z^2 + Q(t)z + K(t)

is the optimal value of (LQ), and

u_t* = -[(H(t) + e^{-λd}B(t)P(t))z + e^{-λd}B(t)Q(t) + M(t)] / (2R(t))

is the optimal control, where

z = e^{-λd}x(φ_t) + A_3 y(φ_t) = e^{-λd}φ_t(0) + A_3 ∫_{-d}^0 e^{λs}φ_t(s) ds = e^{-λd}X_t + A_3 ∫_{-d}^0 e^{λs}X_{t+s} ds = e^{-λd}x + A_3 ∫_{-d}^0 e^{λs}X_{t+s} ds.

Example

We consider the following example of an uncertain optimal control model with time-delay:

J(0, φ_0) = sup_{u∈U} E[ ∫_0^2 { -(e^{-1}X_s + Y_s)^2 - u_s^2 } ds - (e^{-1}X_T + Y_T)^2 ]
subject to
dX_t = { -(e + 5)X_t + X_{t-0.2} + u_t } dt + dC_t, t ∈ [0, 2],
X_t = φ_0(t) = cos πt, -0.2 ≤ t ≤ 0,
Y_t = ∫_{-0.2}^0 e^{5s} X_{t+s} ds, t ∈ [0, 2],
u_t ∈ R, t ∈ [0, 2]. (7.23)

We have A_1(s) = -(e + 5), A_2(s) = 0, A_3 = 1, A_4(s) = 0, B(s) = 1, A_5(s) = A_6(s) = 0, A_7(s) = 1, I(s) = -1, R(s) = -1, H(s) = L(s) = M(s) = N(s) = 0, a = -1, λ = 5, d = 0.2. Hence, A_2(t) = eA_3(A_1(t) + eA_3 + 5) and A_6(t) = eA_3 A_5(t) hold in this model. By Theorem 7.2, the function Q(t) satisfies

dQ(t)/dt = ( 5 - P(t)/(2e^2) ) Q(t), t ∈ [0, 2], Q(2) = 0.

Thus Q(t) = 0 for t ∈ [0, 2], and then K(t) = 0 for t ∈ [0, 2]. Therefore, the optimal control u_t* is

u_t* = (1/2)e^{-1}P(t)z_t, where z_t = e^{-1}x_t + y_t,

and the optimal value is

J(0, φ_0) = (1/2)P(0)z_0^2, where z_0 = e^{-1}x_0 + y_0,

and P(t) satisfies

dP(t)/dt = -(1/(2e^2))P(t)^2 + 10P(t) + 2, P(2) = -2, (7.24)

and x_0 = X_0 = 1, y_t = Y_t = ∫_{-0.2}^0 e^{5s}X_{t+s} ds, and

y_0 = Y_0 = ∫_{-0.2}^0 e^{5s}X_s ds = ∫_{-0.2}^0 e^{5s} cos πs ds = (π sin(0.2π) - 5 cos(0.2π) + 5e) / (e(π^2 + 25)).

Since the value of y_t is derived from the values of X_s between t - 0.2 and t, an analytical expression of y_t cannot be obtained, and neither can that of u_t*. We now consider numerical solutions of the model. Let Δ_1 = {s_0, s_1, ..., s_20} be a uniform partition of [-0.2, 0] (i.e., -0.2 = s_0 < s_1 < ··· < s_20 = 0), with Δs = 0.01. Thus y_t = Y_t ≈ Σ_{i=0}^{20} e^{5s_i} X_{t+s_i} Δs. Let Δ_2 = {t_0, t_1, ..., t_200} be a uniform partition of [0, 2] (i.e., 0 = t_0 < t_1 < ··· < t_200 = 2), with Δt = 0.01. Thus,

ΔX_t = ( -(e + 5)X_t + X_{t-0.2} + u_t ) Δt + ΔC_t.

Since ΔC_t is a normal uncertain variable with expected value 0 and variance Δt^2, the distribution function of ΔC_t is Φ(x) = ( 1 + exp(-πx/(√3 Δt)) )^{-1}, x ∈ R. We may get a sample point c_t of ΔC_t from c_t = Φ^{-1}(rand(0, 1)), that is,

c_t = -(√3 Δt/π) ln( 1/rand(0, 1) - 1 ).

Thus x_t, y_t, and u_t may be given by the following iterative equations:

y_{t_j} = Σ_{i=0}^{20} e^{5s_i} x_{t_j + s_i} Δs,
u_{t_j} = (1/2) e^{-1} P(t_j)(e^{-1} x_{t_j} + y_{t_j}),
x_{t_{j+1}} = x_{t_j} + ( -(e + 5)x_{t_j} + x_{t_j - 0.2} + u_{t_j} ) Δt - (√3 Δt/π) ln( 1/rand(0, 1) - 1 )

for j = 0, 1, 2, ..., 200, and x_{s_i} = cos πs_i for i = 0, 1, ..., 20, where the numerical solution P(t_j) of (7.24) is provided by

P(t_{j-1}) = P(t_j) - ( -(1/(2e^2))P(t_j)^2 + 10P(t_j) + 2 ) Δt

for j = 200, 199, ..., 2, 1, with P(t_200) = -2. Therefore, the optimal value of the example is J(0, φ_0) = (1/2)P(0)z_0^2, and the optimal controls and corresponding states are obtained in Table 7.1 for part of the data.

Table 7.1 Numerical solutions (three column groups, each with columns: t, x, y, u)
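The backward Euler recursion for P(t_j) described above can be written out directly; the sketch below integrates (7.24) from t = 2 down to t = 0 on the 200-step grid of the example.

```python
import math

def solve_P(n_steps=200, T=2.0, P_T=-2.0):
    """Backward Euler for dP/dt = -(1/(2e^2))P^2 + 10P + 2, P(T) = P_T."""
    dt = T / n_steps
    P = [0.0] * (n_steps + 1)
    P[n_steps] = P_T
    for j in range(n_steps, 0, -1):            # j = 200, 199, ..., 1
        dP = -P[j] ** 2 / (2 * math.e ** 2) + 10 * P[j] + 2
        P[j - 1] = P[j] - dP * dt
    return P

P = solve_P()
```

Stepped backward in time, P(t) relaxes toward the stable root of -(1/(2e^2))P^2 + 10P + 2 = 0, which lies close to -0.2, so P(0) is approximately -0.2.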

7.3 Model with Multiple Time-Delays

Consider an uncertain linear system with multiple time-delays in the control input,

dX_s = ( a_0(s) + a_1(s)X_s + Σ_{i=1}^p B_i(s)u(s - h_i) ) ds + b(s) dC_s (7.25)

with the initial condition X(t_0) = X_0, where t_0 is the initial time. Here X_s is the state vector of dimension n, u_s is the control vector of dimension m, h_i > 0 (i = 1, ..., p) are positive time-delays, h = max{h_1, ..., h_p} is the maximum delay shift, and C_s = (C_{s1}, C_{s2}, ..., C_{sp}), where C_{s1}, C_{s2}, ..., C_{sp} are independent canonical Liu processes. Moreover, a_0(s), a_1(s), b(s), and B_i(s) (i = 1, 2, ..., p) are piecewise continuous matrix functions of appropriate dimensions. The quadratic cost function to be maximized is defined as follows:

J(t, x) = sup_{u_t∈U} E( (1/2) ∫_t^T ( u_s^τ R(s)u_s + X_s^τ L(s)X_s ) ds + X_T^τ Ψ_T X_T ), (7.26)

where X_t = x, R(s) is positive, Ψ_T and L(s) are nonnegative definite symmetric matrices, and T > 0.

Theorem 7.3 ([2]) Let μ_{1t} be an n × n integrable uncertain process, and let μ_{2t} and v_{2t} be two n-dimensional integrable uncertain processes. Then the n-dimensional linear uncertain differential equation

dX_t = (μ_{1t}X_t + μ_{2t}) dt + v_{2t} dC_t (7.27)

has a solution

X_t = U_t ( X_0 + ∫_0^t U_s^{-1} μ_{2s} ds + ∫_0^t U_s^{-1} v_{2s} dC_s ), (7.28)

where

U_t = exp( ∫_0^t μ_{1s} ds ).

Proof At first, we define two uncertain processes U_t and V_t via the uncertain differential equations

dU_t = μ_{1t}U_t dt, dV_t = U_t^{-1} μ_{2t} dt + U_t^{-1} v_{2t} dC_t.

It follows from integration by parts that

d(U_t V_t) = U_t dV_t + dU_t V_t = (μ_{1t}U_t V_t + μ_{2t}) dt + v_{2t} dC_t.

That is, the uncertain process X_t = U_t V_t is a solution of the uncertain differential equation (7.27). The uncertain process U_t can also be written as

U_t = Σ_{n=0}^∞ (1/n!) (∫_0^t μ_{1s} ds)^n U_0.

Taking differentiation on both sides, we have

dU_t = μ_{1t} Σ_{n=1}^∞ (1/(n−1)!) (∫_0^t μ_{1s} ds)^{n−1} U_0 dt = μ_{1t} exp(∫_0^t μ_{1s} ds) U_0 dt.

Thus,

U_t = exp(∫_0^t μ_{1s} ds) U_0,
V_t = V_0 + ∫_0^t U_s^{−1} μ_{2s} ds + ∫_0^t U_s^{−1} v_{2s} dC_s.

Taking U_0 = I and V_0 = X_0, we get the solution (7.28). The theorem is proved.

Theorem 7.4 ([2]) For the uncertain linear system with input delays (7.25) and the quadratic criterion (7.26), the optimal control law for t ≥ t_0 is given by

u^*(t) = −R^{−1}(t) Σ_{i=1}^p M_i^τ(t) B_i^τ(t) (P(t)x + Q(t)),

where P(t) satisfies

Ṗ(t) = (1/2) Σ_{i=1}^p M_i^τ(t) B_i^τ(t) P^τ(t) R^{−1}(t) P(t) Σ_{i=1}^p B_i(t) M_i(t) + L(t) + a_1(t) P(t),  P(T) = 2Ψ_T,   (7.29)

and Q(t) is a solution of the following differential equation

Q̇(t) = Σ_{i=1}^p M_i^τ(t) B_i^τ(t) P(t) R^{−1}(t) Q(t) Σ_{i=1}^p B_i(t) M_i(t) + a_0(t) P(t) + a_1(t) Q(t),  Q(T) = 0,   (7.30)

where M_i(t) = exp(∫_{t−h_i}^t a_1(s) ds). The optimal value for t ≥ t_0 is given by

J(t, x) = (1/2) x^τ P(t) x + Q(t) x + K(t),

where

K(t) = ∫_t^T { (1/2) Σ_{i=1}^p M_i^τ(s) B_i^τ(s) Q(s)^τ R^{−1}(s) Q(s) Σ_{i=1}^p B_i(s) M_i(s) + a_0(s) Q(s) } ds.   (7.31)

Proof. For the optimal control problem (7.25)–(7.26), using the equation of optimality (2.15), we get

J_t(t, x) = sup_{u ∈ U} { (1/2)(u_t^τ R(t) u_t + x^τ L(t) x) + J_x^τ a_0(t) + J_x^τ a_1(t) x + J_x^τ Σ_{i=1}^p B_i(t) u_{t−h_i} }.   (7.32)

Let

g(u_t) = (1/2)(u_t^τ R(t) u_t + x^τ L(t) x) + J_x^τ a_0(t) + J_x^τ a_1(t) x + J_x^τ Σ_{i=1}^p B_i(t) u_{t−h_i}.

Setting ∂g(u_t)/∂u_t = 0 yields

R(t) u_t + Σ_{i=1}^p M_i^τ(t) B_i^τ(t) J_x = 0,

where M_i(t) = ∂u_{t−h_i}/∂u_t. Hence,

u_t = −R^{−1}(t) Σ_{i=1}^p M_i^τ(t) B_i^τ(t) J_x.   (7.33)

By Eq. (7.32), we have

J_t = (1/2)(u_t^τ R(t) u_t + x^τ L(t) x) + J_x^τ (a_0(t) + a_1(t) x + Σ_{i=1}^p B_i(t) u_{t−h_i}).   (7.34)

Since J(T, X_T) = X_T^τ Ψ_T X_T, we guess

J(t, x) = (1/2) x^τ P(t) x + Q(t) x + K(t).

Then

J_t = (1/2) x^τ Ṗ(t) x + Q̇(t) x + K̇(t),   (7.35)

and

J_x = P(t) x + Q(t).   (7.36)

Substituting Eqs. (7.33) and (7.36) into Eq. (7.34) yields

J_t(t, x) = x^τ { (1/2) Σ_{i=1}^p M_i^τ(t) B_i^τ(t) P^τ(t) R^{−1}(t) P(t) Σ_{i=1}^p B_i(t) M_i(t) + L(t) + a_1(t) P(t) } x
  + { Σ_{i=1}^p M_i^τ(t) B_i^τ(t) P(t) R^{−1}(t) Q(t) Σ_{i=1}^p B_i(t) M_i(t) + a_0(t) P(t) + a_1(t) Q(t) } x
  − { (1/2) Σ_{i=1}^p M_i^τ(t) B_i^τ(t) Q(t)^τ R^{−1}(t) Q(t) Σ_{i=1}^p B_i(t) M_i(t) + a_0(t) Q(t) }.   (7.37)

By Eqs. (7.35) and (7.37), we get

Ṗ(t) = (1/2) Σ_{i=1}^p M_i^τ(t) B_i^τ(t) P^τ(t) R^{−1}(t) P(t) Σ_{i=1}^p B_i(t) M_i(t) + L(t) + a_1(t) P(t),   (7.38)

Q̇(t) = Σ_{i=1}^p M_i^τ(t) B_i^τ(t) P(t) R^{−1}(t) Q(t) Σ_{i=1}^p B_i(t) M_i(t) + a_0(t) P(t) + a_1(t) Q(t),   (7.39)

and

K̇(t) = −(1/2) Σ_{i=1}^p M_i^τ(t) B_i^τ(t) Q(t)^τ R^{−1}(t) Q(t) Σ_{i=1}^p B_i(t) M_i(t) − a_0(t) Q(t).   (7.40)

Since J(T, x) = (1/2) x^τ P(T) x + Q(T) x + K(T) = x^τ Ψ_T x, we have P(T) = 2Ψ_T, Q(T) = 0, and K(T) = 0. Eqs. (7.29), (7.30) and (7.31) follow directly from Eqs. (7.38), (7.39), and (7.40), respectively. Therefore,

J(t, x) = (1/2) x^τ P(t) x + Q(t) x + K(t)

is the optimal value of the uncertain linear system with input delays (7.25) under the quadratic criterion (7.26), and

u_t^* = −R^{−1}(t) Σ_{i=1}^p M_i^τ(t) B_i^τ(t) (P(t)x + Q(t)).   (7.41)

Let us find the value of the matrices M_i(t) for this problem. Substituting the optimal control law (7.41) into Eq. (7.25) gives

dX_s = { −Σ_{i=1}^p B_i(s) R^{−1}(s − h_i) Σ_{i=1}^p M_i^τ(s − h_i) B_i^τ(s − h_i) (P(s − h_i) X_{s−h_i} + Q(s − h_i)) + a_0(s) + a_1(s) X_s } ds + b(s) dC_s.   (7.42)

By Theorem 7.3, the multidimensional uncertain differential equation (7.42) has the solution

X_t = U(r, t) { ∫_r^t U(t, s)^{−1} [ −Σ_{i=1}^p B_i(s) R^{−1}(s − h_i) Σ_{i=1}^p M_i^τ(s − h_i) B_i^τ(s − h_i) (P(s − h_i) X_{s−h_i} + Q(s − h_i)) + a_0(s) ] ds + ∫_r^t U(t, s)^{−1} b(s) dC_s + X_r },   (7.43)

where t, r ≥ t_0 and

U(r, t) = exp(∫_r^t a_1(s) ds),

and we know

U(t − h_i, t) = exp(∫_{t−h_i}^t a_1(s) ds).

Since the integral terms on the right-hand side of Eq. (7.43) do not explicitly depend on u_t, we have ∂X_t/∂u_t = U(r, t) ∂X_r/∂u_t. It can be converted to ∂u_t/∂X_t = (∂u_t/∂X_r) U(t, r). Hence, the equality

S u_t = K_1 U(r, t) K_2 X_r

holds, where S ∈ R^{n×m} and K_1, K_2 ∈ R^{n×n} can be selected the same for any t, r ≥ t_0. Writing the last equality for t + h_i, h_i > 0, we have

S u_{t+h_i} = K_1 U(r, t + h_i) K_2 X_r.

Thus,

∂(S u_t)/∂(S u_{t+h_i}) = U(r, t)(U(r, t + h_i))^{−1} = U(t + h_i, t),

which leads to ∂(S u_t)/∂u_{t+h_i} = U(t + h_i, t) S. For any S, using t − h_i instead of t yields

S ∂u_{t−h_i}/∂u_t = S M_i(t) = U(t, t − h_i) S = exp(∫_{t−h_i}^t a_1(s) ds) S

for t ≥ t_0 + h_i. So

M_i(t) = exp(∫_{t−h_i}^t a_1(s) ds).

The theorem is proved.

Example. Consider the following example of an uncertain linear system with multiple time-delays in the control input:

J(0, X_0) = sup_{u ∈ U} E[ (1/2) ∫_0^2 (u_s^2 + X_s^2) ds + X_T^2 ]

subject to

dX_t = (X_t + u_t + u_{t−0.1}) dt + dC_t,  t ∈ [0, 2],
u_t = 0,  t ∈ [−0.1, 0],
X_0 = 1.   (7.44)

We have a_0(t) = 0, a_1(t) = 1, B(t) = 1, b_0(t) = 0, b_1(t) = 1, R(t) = 1, L(t) = 1, Ψ_T = 1, with delays h_1 = 0.1 and h_2 = 0. So we get M_1(t) = exp(0.1) and M_2(t) = 1. By Theorem 7.4, the function Q(t) satisfies

dQ(t)/dt = (1 + exp(0.1))^2 P(t) Q(t) + Q(t),  Q(2) = 0.   (7.45)

Thus Q(t) = 0 for t ∈ [0, 2], and then K(t) = 0 for t ∈ [0, 2]. So the optimal control u_t^* is

u_t^* = −(1 + exp(0.1)) P(t) x,

and the optimal value is

J(0, X_0) = (1/2) P(0) X_0^2,

where P(t) satisfies

dP(t)/dt = (1 + exp(0.1))^2 P(t)^2 − 2P(t) − 2,  P(2) = 2.   (7.46)

Now we consider the numerical solution of this model. Let S = {t_0, t_1, ..., t_200} be an average partition of [0, 2] (i.e., 0 = t_0 < t_1 < ··· < t_200 = 2), and Δt = 0.01. Thus,

ΔX_t = (X_t + u_t + u_{t−0.1})Δt + ΔC_t.

Since ΔC_t is a normal uncertain variable with expected value 0 and variance Δt^2, its distribution function is Φ(x) = (1 + exp(−πx/(√3Δt)))^{−1}, x ∈ R. So we may get a sample point c_t of ΔC_t from c_t = Φ^{−1}(rand(0,1)), that is, c_t = (√3Δt/π) ln(rand(0,1)/(1 − rand(0,1))).

Thus, x_t and u_t may be given by the following iterative equations

u_{t_j} = −(1 + exp(0.1)) P(t_j) x_{t_j},
x_{t_{j+1}} = x_{t_j} + (x_{t_j} + u_{t_j} + u_{t_j−0.1})Δt + (√3Δt/π) ln(rand(0,1)/(1 − rand(0,1)))

for j = 0, 1, 2, ..., 200, with u_{t_j−0.1} = 0 when t_j ∈ [0, 0.1]. The numerical solution P(t_j) of (7.46) is provided by

P(t_{j−1}) = P(t_j) − ((1 + exp(0.1))^2 P(t_j)^2 − 2P(t_j) − 2)Δt

for j = 200, 199, ..., 2, 1, with P(t_200) = 2.

Table 7.2 Numerical solutions (columns: t, x_t, P(t), u_t)
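The backward sweep for P(t_j) can be coded generically. In this Python sketch, pdot stands for the right-hand side of the terminal-value Riccati equation; the particular coefficients used below are only an assumed illustration in the shape of (7.46), not a definitive transcription of it:

```python
import math

def backward_euler(pdot, p_terminal, grid):
    # Solve P'(t) = pdot(t, P) backward from P(t_N) = p_terminal on the
    # grid t_0 < t_1 < ... < t_N, exactly as in the iterative scheme:
    # P(t_{j-1}) = P(t_j) - pdot(t_j, P(t_j)) * (t_j - t_{j-1}).
    n = len(grid) - 1
    P = [0.0] * (n + 1)
    P[n] = p_terminal
    for j in range(n, 0, -1):
        P[j - 1] = P[j] - pdot(grid[j], P[j]) * (grid[j] - grid[j - 1])
    return P

# Illustrative use with an assumed right-hand side shaped like (7.46).
S = 1.0 + math.exp(0.1)
grid = [0.01 * j for j in range(201)]   # 0 = t_0 < ... < t_200 = 2
P = backward_euler(lambda t, p: S**2 * p**2 - 2.0 * p - 2.0, 2.0, grid)
```

With u_{t_j} = −(1 + exp(0.1)) P(t_j) x_{t_j} and the sampled increments, a forward pass over the same grid then produces the state and control trajectories of the table.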

Therefore, the optimal value J(0, X_0) of the example is obtained from (1/2) P(0) X_0^2, and the optimal controls and corresponding states are listed in Table 7.2.

References

1. Chen R, Zhu Y (2013) An optimal control model for uncertain systems with time-delay. J Oper Res Soc Jpn 54(4)
2. Jiang Y, Yan Y, Zhu Y (2016) Optimal control problem for uncertain linear systems with multiple input delays. J Uncertain Anal Appl 4(5), 10 pages

Chapter 8 Parametric Optimal Control for Uncertain Systems

As is well known, the optimal control of a linear quadratic model is given in feedback form, determined by the solution of a Riccati differential equation. However, the corresponding Riccati differential equation cannot be solved analytically in many cases. Even if an analytic solution can be obtained, it might be a complex time-oriented function. The optimal control is then often difficult to implement and costly in industrial production. Hence, a practical control in a simplified form should be chosen to overcome these issues while keeping an admissible accuracy of the controller. This chapter aims at formulating an approximate model with parameters to simplify the form of the optimal control for the uncertain linear quadratic model, and at presenting a parametric optimization approach for solving it.

8.1 Parametric Optimization Based on Expected Value

To begin with, we consider the following multidimensional uncertain linear quadratic model without control parameter:

J(0, x_0) = min_{u_s} E[ ∫_0^T (X_s^τ Q(s) X_s + u_s^τ R(s) u_s) ds + x_T^τ S_T x_T ]
subject to
dX_s = (A(s) X_s + B(s) u_s) ds + (M(s) X_s + N(s) u_s) dC_s,  X_0 = x_0,   (8.1)

where the state X_s is an uncertain vector process of dimension n. The matrix functions Q(s), R(s), A(s), B(s), M(s), N(s) are of appropriate sizes, where Q(s) is symmetric nonnegative definite, R(s) is symmetric positive definite, and S_T is symmetric. For any 0 < t < T, we use x to denote the state of X_s at time t and J(t, x) to denote

the optimal value obtainable in [t, T]. Assume that the following two conditions are satisfied.

Assumption 8.1 The elements of Q(s), R(s), A(s), B(s), M(s), N(s), and R^{−1}(s) are continuous and bounded functions on [0, T].

Assumption 8.2 The optimal value J(t, x) is a twice differentiable function on [0, T] × [a, b]^n.

Theorem 8.1 ([1]) A necessary and sufficient condition that u_t^* be an optimal control for model (8.1) is that

u_t^* = −(1/2) R^{−1}(t) B^τ(t) P(t) x,   (8.2)

where the function P(t) satisfies the following Riccati differential equation and boundary condition:

dP(t)/dt = −2Q(t) − A^τ(t) P(t) − P(t) A(t) + (1/2) P(t) B(t) R^{−1}(t) B^τ(t) P(t),  P(T) = 2S_T.   (8.3)

The optimal value of model (8.1) is

J(0, x_0) = (1/2) x_0^τ P(0) x_0.   (8.4)

Proof. Applying Theorem 2.3, we have

min_{u_t} { x^τ Q(t) x + u_t^τ R(t) u_t + (A(t) x + B(t) u_t)^τ ∇_x J(t, x) + J_t(t, x) } = 0.

Denote

ψ(u_t) = x^τ Q(t) x + u_t^τ R(t) u_t + (A(t) x + B(t) u_t)^τ ∇_x J(t, x) + J_t(t, x).

First, we verify the necessity. Since J(T, X_T) = x_T^τ S_T x_T, we conjecture that ∇_x J(t, x) = P(t) x, with boundary condition P(T) = 2S_T. Setting ∂ψ(u_t)/∂u_t = 0, we have u_t = −(1/2) R^{−1}(t) B^τ(t) P(t) x. Because ∂^2 ψ(u_t)/∂u_t^2 = 2R(t) > 0, this u_t is the optimal control of model (8.1), i.e.,

u_t^* = −(1/2) R^{−1}(t) B^τ(t) P(t) x.

Taking the gradient of ψ(u_t^*) with respect to x, we have

( 2Q(t) + A^τ(t) P(t) + P(t) A(t) − (1/2) P(t) B(t) R^{−1}(t) B^τ(t) P(t) + dP(t)/dt ) x = 0.

Thus

dP(t)/dt = −2Q(t) − A^τ(t) P(t) − P(t) A(t) + (1/2) P(t) B(t) R^{−1}(t) B^τ(t) P(t)   (8.5)

with P(T) = 2S_T. According to the existence and uniqueness theorem for differential equations and Assumption 8.1, we can infer that the solution P(t) exists and is unique. In addition, we have

(dP(t)/dt)^τ = −2Q(t) − [A^τ(t) P(t) + P(t) A(t)]^τ + (1/2) [P(t) B(t) R^{−1}(t) B^τ(t) P(t)]^τ,  P^τ(T) = 2S_T^τ.

That is,

dP^τ(t)/dt = −2Q(t) − P^τ(t) A(t) − A^τ(t) P^τ(t) + (1/2) P^τ(t) B(t) R^{−1}(t) B^τ(t) P^τ(t),  P^τ(T) = 2S_T.   (8.6)

It follows from Eqs. (8.5) and (8.6) that P(t) and P^τ(t) are solutions of the same Riccati differential equation with the same boundary condition. So P(t) is symmetric. Further, we have J(t, x) = (1/2) x^τ P(t) x. Then, the optimal value J(0, x_0) is

J(0, x_0) = (1/2) x_0^τ P(0) x_0.   (8.7)

Next, we prove the sufficiency. Assume that J(t, x) = (1/2) x^τ P(t) x. Substituting Eqs. (8.2) and (8.3) into ψ(u_t), we have ψ(u_t^*) = 0. Because the objective function of model (8.1) is convex, there must be an optimal control solution. Hence, u_t^* is the optimal control. The optimal value J(0, x_0) is

J(0, x_0) = (1/2) x_0^τ P(0) x_0.   (8.8)

The theorem is proved.

Parametric Optimal Control Model

The parametric optimal control problem we will address here is of the form:

V(0, x_0) = min_{u_s ∈ U} E[ ∫_0^T (X_s^τ Q(s) X_s + u_s^τ R(s) u_s) ds + x_T^τ S_T x_T ]
subject to
dX_s = (A(s) X_s + B(s) u_s) ds + (M(s) X_s + N(s) u_s) dC_s,  X_0 = x_0,   (8.9)

where X_s is a state vector of dimension n with initial condition X_0 = x_0 and u_s is a decision vector of dimension r. Here U = {K x_s | K = (k_ij)_{r×n} ∈ R^{r×n}}, where x_s represents the state of X_s at time s with x_s ∈ [a, b]^n. The matrix functions Q(s), R(s), S_T, A(s), B(s), M(s), and N(s) are defined as in model (8.1) and satisfy Assumption 8.1. For any 0 < t < T, we use x to denote the state of X_s at time t and V(t, x) to denote the optimal value obtainable in [t, T]. Solving for an optimal control vector u_t^* of model (8.9) is essentially equivalent to solving for an optimal parameter matrix K^*. From now on, we assume that V(t, x) is a twice differentiable function on [0, T] × [a, b]^n. Applying Eq. (2.15), we obtain

min_K { x^τ Q(t) x + (Kx)^τ R(t)(Kx) + (A(t) x + B(t) K x)^τ ∇_x V(t, x) + V_t(t, x) } = 0.   (8.10)

Note that the u_t^* in Eq. (8.2) achieves the global minimum for model (8.1), while an optimal control u_t^* of model (8.9) can be seen as a local optimal control solution for model (8.1). Therefore, the optimality of the parameter matrix K^* means that V(0, x_0) should be as close to J(0, x_0) as possible. In order to solve for an optimal parameter matrix K^*, we use J(t, x) as a substitute for V(t, x), where J(t, x) is defined in model (8.1). Hence,

ϒ(K) = x^τ Q(t) x + (Kx)^τ R(t)(Kx) + (A(t) x + B(t) K x)^τ ∇_x J(t, x) + J_t(t, x).

Remark 8.1 Because K ∈ R^{r×n}, we could not obtain the optimal parameter matrix K^* of model (8.9) by taking the gradient of ϒ(K) with respect to K.

Parametric Approximation Method

Note that L([0, T] × [a, b]^n) represents the space of absolutely integrable functions on the domain [0, T] × [a, b]^n, where T > 0 and a, b ∈ R. For the sake of discussion, we define a norm as

‖f(t, x)‖ = ∫_a^b ··· ∫_a^b ( ∫_0^T |f(t, x)| dt ) dx_1 ··· dx_n,   (8.11)

where f(t, x) ∈ L([0, T] × [a, b]^n) and x = (x_1, x_2, ..., x_n)^τ. The optimal parameter matrix K^* needs to make the difference between ϒ(K) and 0 minimal in the sense of the norm defined above, i.e.,

K^* = arg min_{K ∈ R^{r×n}} ‖ x^τ Q(t) x + (Kx)^τ R(t)(Kx) + (A(t) x + B(t) K x)^τ ∇_x J(t, x) + J_t(t, x) ‖.   (8.12)

We know that J(t, x) = (1/2) x^τ P(t) x, where the function P(t) satisfies the following matrix Riccati differential equation and boundary condition:

dP(t)/dt = −2Q(t) − A^τ(t) P(t) − P(t) A(t) + (1/2) P(t) B(t) R^{−1}(t) B^τ(t) P(t),  P(T) = 2S_T.   (8.13)

Remark 8.2 A variety of numerical algorithms have been developed by many researchers for solving the Riccati equation (see Balasubramaniam et al. [2], Caines and Mayne [3], Khan et al. [4]).

Assume that P(t) = (p_ij(t))_{n×n}. In solving the matrix Riccati differential equation (8.13), the following system of nonlinear differential equations occurs:

ṗ_ij(t) = f_ij(t, p_11(t), ..., p_1n(t), p_21(t), ..., p_2n(t), ..., p_n1(t), ..., p_nn(t))   (8.14)

for i, j = 1, 2, ..., n. Apparently, the matrix Riccati differential equation (8.13) contains n^2 first-order ordinary differential equations in n^2 variables. The Runge–Kutta method is considered one of the best tools for the numerical integration of ordinary differential equations. For convenience, the fourth-order Runge–Kutta method is explained for a system of two first-order ordinary differential equations in two variables:

p_11(s + 1) = p_11(s) + (h/6)(k_1 + 2k_2 + 2k_3 + k_4),
p_12(s + 1) = p_12(s) + (h/6)(l_1 + 2l_2 + 2l_3 + l_4),

where

k_1 = f_11(t, k_11, k_12),  l_1 = f_12(t, k_11, k_12),
k_2 = f_11(t + h/2, k_11 + h k_1/2, k_12 + h l_1/2),  l_2 = f_12(t + h/2, k_11 + h k_1/2, k_12 + h l_1/2),
k_3 = f_11(t + h/2, k_11 + h k_2/2, k_12 + h l_2/2),  l_3 = f_12(t + h/2, k_11 + h k_2/2, k_12 + h l_2/2),
k_4 = f_11(t + h, k_11 + h k_3, k_12 + h l_3),  l_4 = f_12(t + h, k_11 + h k_3, k_12 + h l_3).

In a similar way, the original system (8.13) can be solved as n^2 first-order ordinary differential equations.

Setting

L_1 = (l_ij^(1))_{r×r} = ∫_0^T R(t) dt,  L_2 = (l_ij^(2))_{r×n} = ∫_0^T B^τ(t) P(t) dt,

we have the following theorem to ensure the solvability of the optimal control parameter matrix K^*.

Theorem 8.2 ([1]) Denote L(K) = (L_ij(K))_{n×n} = K^τ L_1 K + K^τ L_2. Then we have

K^* = arg min_{K ∈ R^{r×n}} [ (1/3)(b^2 + ba + a^2)(b − a)^n Σ_{i=1}^n L_ii(K) + (1/4)(b + a)^2 (b − a)^n Σ_{i=1}^n Σ_{j=1, j≠i}^n L_ij(K) ].   (8.15)

Proof. Applying Eq. (8.12), we have

K^* = arg min_{K ∈ R^{r×n}} ‖ x^τ Q(t) x + (Kx)^τ R(t)(Kx) + (A(t) x + B(t) K x)^τ ∇_x J(t, x) + J_t(t, x) ‖.

Because |ϒ(K)| ≥ 0, we have

‖ϒ(K)‖ = ∫_a^b ··· ∫_a^b ( ∫_0^T | x^τ Q(t) x + (Kx)^τ R(t)(Kx) + (A(t) x + B(t) K x)^τ ∇_x J(t, x) + J_t(t, x) | dt ) dx_1 ··· dx_n.

Denote L(K) = (L_ij(K))_{n×n} = K^τ L_1 K + K^τ L_2. It holds that

K^* = arg min_{K ∈ R^{r×n}} ∫_a^b ··· ∫_a^b x^τ L(K) x dx_1 ··· dx_n
    = arg min_{K ∈ R^{r×n}} [ (1/3)(b^2 + ba + a^2)(b − a)^n Σ_{i=1}^n L_ii(K) + (1/4)(b + a)^2 (b − a)^n Σ_{i=1}^n Σ_{j=1, j≠i}^n L_ij(K) ].

The theorem is proved.

Therefore, the optimal control of model (8.9) is

u_t^* = K^* x.   (8.16)

Assume that V(t, x) = (1/2) x^τ G(t) x. From Eq. (8.10), we obtain

Q(t) + K^{*τ} R(t) K^* + A^τ(t) G(t) + G(t) A(t) + G(t) B(t) K^* + K^{*τ} B^τ(t) G(t) + (1/2) dG(t)/dt = 0.

Using the fourth-order Runge–Kutta method described above, we can obtain the solution G(t), where the function G(t) satisfies the following matrix Riccati differential equation and boundary condition:

dG(t)/dt = −2Q(t) − 2K^{*τ} R(t) K^* − A^τ(t) G(t) − G(t) A(t) − G(t) B(t) K^* − K^{*τ} B^τ(t) G(t),  G(T) = 2S_T.   (8.17)

Hence, the optimal value of model (8.9) is

V(0, x_0) = (1/2) x_0^τ G(0) x_0.   (8.18)

8.2 Parametric Optimization Based on Optimistic Value

We will study the following multidimensional uncertain linear quadratic model under the optimistic value criterion with control parameter, as an approximation of the model (3.24):

V(0, x_0) = inf_{u_s ∈ U} { ∫_0^T (X_s^τ Q(s) X_s + u_s^τ R(s) u_s) ds + X_T^τ S_T X_T }_sup(α)
subject to
dX_s = (A(s) X_s + B(s) u_s) ds + M(s) X_s dC_s,  X_0 = x_0,   (8.19)

where u_s is a decision vector of dimension r and U = {K_l x_s | K_l = (k_ij^(l))_{r×n} ∈ R^{r×n}, s ∈ [t_{l−1}, t_l), l = 1, 2, ..., m} with 0 = t_0 < t_1 < ··· < t_{m−1} < t_m = T, where K_l is a control parameter matrix. Here, we stipulate that the last subinterval [t_{m−1}, t_m) represents the closed interval [t_{m−1}, t_m]. For any 0 < t < T, we use x to denote the state of X_s at time t and V(t, x) to denote the optimal value obtainable in [t, T]. Assume that V(t, x) is a twice differentiable function on [0, T] × [a, b]^n. According to Theorem 3.2, we have

V_t(t, x) = −inf_{u_t ∈ U} { x^τ Q(t) x + u_t^τ R(t) u_t + ∇_x V(t, x)^τ (A(t) x + B(t) u_t) + (√3/π) ln((1 − α)/α) |∇_x V(t, x)^τ M(t) x| }.   (8.20)
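The α-dependent coefficient appearing in (8.20) is elementary to compute; a small Python helper (the naming is ours):

```python
import math

def optimistic_coeff(alpha):
    # (sqrt(3)/pi) * ln((1 - alpha)/alpha): positive for alpha < 0.5,
    # zero at alpha = 0.5, and negative for alpha > 0.5.
    return math.sqrt(3) / math.pi * math.log((1.0 - alpha) / alpha)
```

The sign change at α = 0.5 is what forces the case split on the sign of x^τ G(t) M(t) x later in this section.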

It is noticeable that the optimal control u_t^* of model (8.19) can be seen as a suboptimal control solution for model (3.24). Hence, the optimality of the parameter matrix K_l^* means that the error between V(0, x_0) and J(0, x_0) should be as small as possible. Therefore, we replace V(t, x) with J(t, x) in Eq. (8.20). For convenience, we denote

Γ(K) = x^τ Q(t) x + (Kx)^τ R(t)(Kx) + ∇_x J(t, x)^τ (A(t) x + B(t) K x) + (√3/π) ln((1 − α)/α) |∇_x J(t, x)^τ M(t) x| + J_t(t, x).

Remark 8.3 The optimal parameter matrix K^* cannot be obtained by taking the gradient of Γ(K), because K ∈ R^{r×n} is a numerical matrix.

Piecewise Optimization Method

On each subinterval [t_{l−1}, t_l), l = 1, 2, ..., m, the optimal control parameter matrix K_l^* needs to make the difference between Γ(K_l) and 0 minimal in the sense of the norm defined by (8.11), i.e.,

K_l^* = arg min_{K_l ∈ R^{r×n}} ‖ x^τ Q(t) x + (K_l x)^τ R(t)(K_l x) + ∇_x J(t, x)^τ (A(t) x + B(t) K_l x) + (√3/π) ln((1 − α)/α) |∇_x J(t, x)^τ M(t) x| + J_t(t, x) ‖.   (8.21)

Assume that J(t, x) = (1/2) x^τ P(t) x, where the function P(t) satisfies the Riccati differential equation (3.27) and boundary condition P(T) = 2S_T.

Theorem 8.3 ([5]) Denote

W = ∫_{t_{l−1}}^{t_l} R(t) dt,  Y = ∫_{t_{l−1}}^{t_l} P(t) B(t) dt.

Then we have

K_l^* = arg min_{K_l ∈ R^{r×n}} ∫_{x ∈ [a,b]^n} x^τ (K_l^τ W K_l + Y K_l) x dx
      = arg min_{K_l ∈ R^{r×n}} [ (1/3)(b^2 + ba + a^2)(b − a)^n Σ_{i=1}^n Z_ii(K_l) + (1/4)(b + a)^2 (b − a)^n Σ_{i=1}^n Σ_{j=1, j≠i}^n Z_ij(K_l) ],   (8.22)

where Z(K_l) = (Z_ij(K_l))_{n×n} = K_l^τ W K_l + Y K_l.

Proof. It follows from Eq. (8.21) that

K_l^* = arg min_{K_l ∈ R^{r×n}} ‖ x^τ Q(t) x + (K_l x)^τ R(t)(K_l x) + ∇_x J(t, x)^τ (A(t) x + B(t) K_l x) + (√3/π) ln((1 − α)/α) |∇_x J(t, x)^τ M(t) x| + J_t(t, x) ‖
    = arg min_{K_l ∈ R^{r×n}} ‖ (K_l x)^τ R(t)(K_l x) + ∇_x J(t, x)^τ B(t) K_l x ‖
    = arg min_{K_l ∈ R^{r×n}} ∫_{x ∈ [a,b]^n} ∫_{t_{l−1}}^{t_l} [ (K_l x)^τ R(t)(K_l x) + ∇_x J(t, x)^τ B(t) K_l x ] dt dx
    = arg min_{K_l ∈ R^{r×n}} ∫_{x ∈ [a,b]^n} ∫_{t_{l−1}}^{t_l} [ x^τ K_l^τ R(t) K_l x + x^τ P(t) B(t) K_l x ] dt dx.

Denote W = ∫_{t_{l−1}}^{t_l} R(t) dt and Y = ∫_{t_{l−1}}^{t_l} P(t) B(t) dt. Then

K_l^* = arg min_{K_l ∈ R^{r×n}} ∫_{x ∈ [a,b]^n} x^τ (K_l^τ W K_l + Y K_l) x dx
      = arg min_{K_l ∈ R^{r×n}} [ (1/3)(b^2 + ba + a^2)(b − a)^n Σ_{i=1}^n Z_ii(K_l) + (1/4)(b + a)^2 (b − a)^n Σ_{i=1}^n Σ_{j=1, j≠i}^n Z_ij(K_l) ].

The theorem is proved.

Here, we use the fourth-order Runge–Kutta method to calculate the numerical value of P(t) backward on each subinterval. In the first step, we calculate P(t) on the interval [t_{m−1}, t_m) with the boundary value P(t_m) = P(T). Then, in the ith (i = 2, ..., m) step, we calculate P(t) on the interval [t_{m−i}, t_{m−i+1}), where the boundary value P(t_{m−i+1}) was obtained in the (i − 1)th step. At last, we calculate the integral of P(t)B(t) on each subinterval [t_{l−1}, t_l), l = 1, 2, ..., m. It follows from Eq. (8.22) that the optimal parameter matrix K_l^* can be obtained by the method of derivation. Hence, the optimal control of model (8.19) is

u_t^* = K_l^* x,  l = 1, 2, ..., m,  t_{l−1} ≤ t < t_l.   (8.23)

Assume that V(t, x) = (1/2) x^τ G(t) x. Let

Ω_3 = {(t, x) | x^τ G(t) M(t) x ≥ 0, (t, x) ∈ [t_{l−1}, t_l) × [a, b]^n, l = 1, 2, ..., m},
Ω_4 = {(t, x) | x^τ G(t) M(t) x < 0, (t, x) ∈ [t_{l−1}, t_l) × [a, b]^n, l = 1, 2, ..., m}.

Substituting the piecewise continuous control u_t^* into Eq. (8.20), we have

x^τ ( Q(t) + K_l^{*τ} R(t) K_l^* + A^τ(t) G(t) + G(t) A(t) + G(t) B(t) K_l^* + K_l^{*τ} B^τ(t) G(t) + (1/2) dG(t)/dt ) x + (√3/π) ln((1 − α)/α) |x^τ G(t) M(t) x| = 0.

Then, the function G(t) satisfies the following matrix Riccati differential equation

dG(t)/dt = −2Q(t) − 2K_l^{*τ} R(t) K_l^* − A^τ(t) G(t) − G(t) A(t) − G(t) B(t) K_l^* − K_l^{*τ} B^τ(t) G(t) − (√3/π) ln((1 − α)/α) (G(t) M(t) + M^τ(t) G(t)),  if (t, x) ∈ Ω_3,

dG(t)/dt = −2Q(t) − 2K_l^{*τ} R(t) K_l^* − A^τ(t) G(t) − G(t) A(t) − G(t) B(t) K_l^* − K_l^{*τ} B^τ(t) G(t) + (√3/π) ln((1 − α)/α) (G(t) M(t) + M^τ(t) G(t)),  if (t, x) ∈ Ω_4,   (8.24)

with boundary condition G(T) = 2S_T. Similar to the solving procedure for P(t), we can also calculate the numerical value of G(t) at each point t_{l−1}, l = 1, 2, ..., m, with G(T) = 2S_T. Thus, the optimal value of model (8.19) is

V(0, x_0) = (1/2) x_0^τ G(0) x_0.   (8.25)

References

1. Li B, Zhu Y (2017) Parametric optimal control for uncertain linear quadratic models. Appl Soft Comput 56
2. Balasubramaniam P, Samath J, Kumaresan N, Kumar A (2006) Solution of matrix Riccati differential equation for the linear quadratic singular system using neural networks. Appl Math Comput 182(2)
3. Caines P, Mayne D (1970) On the discrete time matrix Riccati equation of optimal control. Int J Control 12(5)
4. Khan N, Ara A, Jamil M (2011) An efficient approach for solving the Riccati equation with fractional orders. Comput Math Appl 61(9)
5. Li B, Zhu Y, Chen Y (2017) The piecewise optimisation method for approximating uncertain optimal control problems under optimistic value criterion. Int J Syst Sci 48(8)

Chapter 9 Applications

9.1 Portfolio Selection Models

Expected Value Model

The portfolio selection problem is a classical problem in financial economics: allocating personal wealth between investment in a risk-free security and investment in a single risky asset. Under the assumption that the risky asset earns a random return, Merton [1] studied a portfolio selection model by stochastic optimal control, and Kao [2] considered a generalized Merton model. If we assume that the risky asset earns an uncertain return, this generalized Merton model may be solved by uncertain optimal control.

Let X_t be the wealth of an investor at time t. The investor allocates a fraction w of the wealth to a sure asset and the remainder to a risky asset. The sure asset produces a rate of return b. The risky asset is assumed to earn an uncertain return and yields a mean rate of return μ (μ > b) along with a variance of σ^2 per unit time. That is to say, the risky asset earns a return dr_t in the time interval (t, t + dt), where dr_t = μ dt + σ dC_t, and C_t is a canonical Liu process. Thus

X_{t+dt} = X_t + bwX_t dt + dr_t (1 − w) X_t
        = X_t + bwX_t dt + (μ dt + σ dC_t)(1 − w) X_t
        = X_t + [bw + μ(1 − w)] X_t dt + σ(1 − w) X_t dC_t.   (9.1)

Assume that the investor is interested in maximizing the expected utility over an infinite time horizon. Then, a portfolio selection model [3] is provided by

J(t, x) ≡ max_w E[ ∫_0^∞ e^{−βt} (wX_t)^λ / λ dt ]
subject to
dX_t = [bwX_t + μ(1 − w)X_t] dt + σ(1 − w) X_t dC_t,

where β > 0 and 0 < λ < 1. By the equation of optimality (2.7), we have

−J_t = max_w { e^{−βt} (wx)^λ / λ + (b − μ)wxJ_x + μxJ_x } ≡ max_w L(w),

where L(w) represents the term in the braces. The optimal w satisfies

∂L(w)/∂w = e^{−βt} (wx)^{λ−1} x + (b − μ)xJ_x = 0,

or

w = (1/x) [ (μ − b) J_x e^{βt} ]^{1/(λ−1)}.

Hence

−J_t = (1/λ) e^{−βt} [ (μ − b) J_x e^{βt} ]^{λ/(λ−1)} + (b − μ) [ (μ − b) J_x e^{βt} ]^{1/(λ−1)} J_x + μxJ_x,

or

−J_t e^{βt} = (1/λ − 1) [ (μ − b) J_x e^{βt} ]^{λ/(λ−1)} + μxJ_x e^{βt}.   (9.2)

We conjecture that J(t, x) = k x^λ e^{−βt}. Then

J_t = −kβ x^λ e^{−βt},  J_x = kλ x^{λ−1} e^{−βt}.

Substituting them into Eq. (9.2) yields

kβ x^λ = (1/λ − 1) (μ − b)^{λ/(λ−1)} (kλ)^{λ/(λ−1)} x^λ + μkλ x^λ,

or

(kλ)^{1/(λ−1)} = (β − μλ) / ((1 − λ)(μ − b)^{λ/(λ−1)}).

So we get

kλ = ( (β − μλ) / (1 − λ) )^{λ−1} (μ − b)^{−λ}.

Therefore, the optimal fraction of investment in the sure asset is determined by

w^* = (μ − b)^{1/(λ−1)} (kλ)^{1/(λ−1)} = (β − μλ) / ((1 − λ)(μ − b)).
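The closed-form fraction is easy to evaluate numerically; in the Python sketch below the parameter values are illustrative choices of ours, not from the book:

```python
def optimal_sure_fraction(beta, lam, mu, b):
    # w* = (beta - mu*lam) / ((1 - lam) * (mu - b)),
    # valid for mu > b and 0 < lam < 1; note that the result
    # does not depend on the wealth x.
    return (beta - mu * lam) / ((1.0 - lam) * (mu - b))

# e.g. beta = 0.06, lam = 0.5, mu = 0.10, b = 0.05
w = optimal_sure_fraction(0.06, 0.5, 0.10, 0.05)   # approximately 0.4
```

Since the wealth x does not enter the formula, doubling the investor's wealth leaves the optimal split between the two assets unchanged.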

Remark 9.1 Note that the optimal fraction of investment in the sure asset or the risky asset is independent of total wealth. This conclusion is similar to that in the case of randomness [2].

Optimistic Value Model

Consider the following optimistic value model [4]:

J(t, x) ≡ max_ω { ∫_0^∞ e^{−βt} (ωX_t)^λ / λ dt }_sup(α)
subject to
dX_t = [bω + μ(1 − ω)] X_t dt + σ(1 − ω) X_t dC_t,   (9.3)

where α ∈ (0, 1) is a specified confidence level, β > 0 and 0 < λ < 1. Conjecture that J_x(t, x) ≥ 0. Then by the equation of optimality (3.12), we have

−J_t = max_ω { e^{−βt} (ωx)^λ / λ + J_x x ( μ + σ(√3/π) ln((1 − α)/α) ) + J_x ωx ( b − μ − σ(√3/π) ln((1 − α)/α) ) } ≡ max_ω L(ω),

where L(ω) represents the term enclosed by the braces. The optimal ω satisfies

∂L(ω)/∂ω = e^{−βt} (xω)^{λ−1} x + J_x ( b − μ − σ(√3/π) ln((1 − α)/α) ) x = 0,

or

ω = (1/x) [ ( μ + σ(√3/π) ln((1 − α)/α) − b ) J_x e^{βt} ]^{1/(λ−1)}.

Substituting the preceding result into max_ω L(ω), we obtain

−J_t = (1/λ) e^{−βt} [ ( μ + σ(√3/π) ln((1 − α)/α) − b ) J_x e^{βt} ]^{λ/(λ−1)}
  + J_x ( b − μ − σ(√3/π) ln((1 − α)/α) ) [ ( μ + σ(√3/π) ln((1 − α)/α) − b ) J_x e^{βt} ]^{1/(λ−1)}
  + ( μ + σ(√3/π) ln((1 − α)/α) ) x J_x,

which may be rewritten as

−J_t e^{βt} = (1/λ − 1) [ ( μ + σ(√3/π) ln((1 − α)/α) − b ) J_x e^{βt} ]^{λ/(λ−1)} + ( μ + σ(√3/π) ln((1 − α)/α) ) x J_x e^{βt}.   (9.4)

We conjecture that J(t, x) = k x^λ e^{−βt}. Then

J_t = −kβ x^λ e^{−βt},  J_x = kλ x^{λ−1} e^{−βt}.

Substituting them into (9.4) yields

kβ = (1/λ − 1) [ ( μ + σ(√3/π) ln((1 − α)/α) − b ) kλ ]^{λ/(λ−1)} + ( μ + σ(√3/π) ln((1 − α)/α) ) kλ.

So we get

kλ = ( ( β − ( μ + σ(√3/π) ln((1 − α)/α) ) λ ) / (1 − λ) )^{λ−1} ( μ + σ(√3/π) ln((1 − α)/α) − b )^{−λ}.

Therefore, the optimal ω is

ω^* = ( β − ( μ + σ(√3/π) ln((1 − α)/α) ) λ ) / ( (1 − λ) ( μ + σ(√3/π) ln((1 − α)/α) − b ) ).   (9.5)

Remark 9.2 The conclusions obtained here are different from those in the case of the expected value model of uncertain optimal control studied in Sect. 2.6. Here, the optimal fraction and the optimal reward depend on all the parameters β, λ, b, μ, and σ, while the conclusions in Sect. 2.6 depend only on the parameters β, λ, b, and μ. However, there are still some similar conclusions. First, in both cases the optimal fraction of investment in the risk-free asset or the risky asset is independent of total wealth. Second, the optimal reward J(t, x) in both cases can be expressed as the product of a power function of x and an exponential function of t.

9.2 Manufacturing Technology Diffusion Problem

There are three phases in the life cycle of any new technology: research and development, transfer and commercialization, and operation and regeneration [5]. Investigations of technology diffusion originated in research on marketing diffusion, such as Bass [6] and Horsky and Simon [7]. Technology diffusion refers to the

transition of a technology's economic value during the transfer and operation phases of a technology life cycle. Modeling of technology diffusion must address two aspects: regularity, due to the mean depletion rate of the technology's economic value, and uncertainty, owing to the disturbances occurring in technological evolution and innovation. Liu [8] studied a flexible manufacturing technology diffusion problem in a stochastic environment. If we employ uncertain differential equations as a framework to model technology diffusion problems, the flexible manufacturing technology diffusion problem in [8] may be solved by an uncertain optimal control model with the Hurwicz criterion.

Let X_t be the potential market share at time t (state variable) and u the proportional production level (control variable). An annual production rate can be determined as uX_t. The selling price has been fairly stable at p per unit. The unit production cost is a function of the annual production rate and can be calculated as cuX_t, where c is a cost conversion coefficient. With the constant β as a fixed learning percentage, the learning effect can be expressed as βX_t. Thus, the typical drift is

b(t, X_t, u) = −uX_t / (1 + βX_t).

The diffusion is σ(t, X_t, u) = aX_t, where a > 0 is a scaling factor. Since the unit profit is (p − cuX_t) and the production rate is uX_t/(1 + βX_t), the profit function f is expressed as

f(t, u, X_t) = (p − cuX_t) uX_t / (1 + βX_t).

Let k > 0 be the discount rate and e^{−kT} h_0 (k − μ^{−X_T}) be the salvage value at the end time, with μ > 1 and k ≥ 1. Then, the manufacturing technology diffusion problem can be defined as choosing an appropriate control û so that the Hurwicz weighted average total profit is maximized. The model [9] is provided by

J(0, x_0) ≡ max_u H_ρ^α { ∫_0^T e^{−kt} (p − cuX_t) uX_t / (1 + βX_t) dt + e^{−kT} h_0 (k − μ^{−X_T}) }
subject to
dX_t = −[uX_t / (1 + βX_t)] dt + aX_t dC_t.

Conjecture that J_x(t, x) ≥ 0.
Then, applying the equation of optimality (3.12), we have

−J_t = max_u { e^{−kt} (p − cux) ux / (1 + βx) − J_x ux / (1 + βx) + J_x ax (2ρ − 1)(√3/π) ln((1 − α)/α) } ≡ max_u L(u),

where L(u) represents the term enclosed by the braces. The optimal u satisfies

∂L(u)/∂u = −e^{−kt} cx · ux / (1 + βx) + e^{−kt} (p − cux) x / (1 + βx) − J_x x / (1 + βx) = 0,

or

u^* = (1/(2cx)) (p − e^{kt} J_x).

Substituting the above result into max_u L(u), we obtain

−J_t = (e^{−kt} p − J_x)(p − e^{kt} J_x) / (4c(1 + βx)) + J_x ax (2ρ − 1)(√3/π) ln((1 − α)/α).

Conjecture that J(t, x) = e^{−kt} y(x); this gives J_t = −k e^{−kt} y(x) and J_x = e^{−kt} y′(x). Using the last expression and denoting (2ρ − 1)(√3/π) ln((1 − α)/α) by the parameter q, we find

k y(x) = (p − y′(x))^2 / (4c(1 + βx)) + q a x y′(x).

Letting λ(x) = y′(x), we then have

y = ( λ^2 + 2[2cqax(1 + βx) − p]λ + p^2 ) / (4kc(1 + βx)),  y′ = λ.

The derivative of the right-hand side of the first expression with respect to x must equal the right-hand side of the second. Carrying out the differentiation and rearranging yields a differential equation that is an Abel equation of the second kind in λ(x), of the form

[λ + g(x)] dλ/dx = f_2(x) λ^2 + f_1(x) λ + f_0(x),   (9.6)

where the coefficient functions g, f_2, f_1, f_0 are determined by the parameters c, q, a, k, β, and p. Then, solving the ordinary differential equation (9.6) with the terminal condition

J_{X_T} = ∂/∂X_T [ e^{−kT} h_0 (k − μ^{−X_T}) ] = e^{−kT} h_0 (ln μ) μ^{−X_T} = e^{−kT} y′(X_T),

we get

y′ = λ(x) = −g(x) + L(x) [ I_0 + 2∫ (f_1 + g′ − 2f_2 g) L dx + 2∫ (f_0 − f_1 g + f_2 g^2) L^2 dx ]^{1/2},

where L(x) = exp(−∫ f_2(x) dx) and the constant I_0 satisfies the equation λ(X_T) = h_0 (ln μ) μ^{−X_T}. For the present coefficients this produces an explicit, though lengthy, expression for λ(x). The optimal proportional production level is determined by

u^* = (1/(2cx)) (p − λ(x)),

and J_x = e^{−kt} λ(x) gives the rate of change of the current value function.

9.3 Mitigation Policies for Uncertain Carbon Dioxide Emissions

Climate change is accelerating and has become one of the most troublesome pollution issues for the whole society. Over the past 20 years, much effort has been devoted to evaluating policies to control the accumulation of greenhouse gases (GHG) in the earth's atmosphere, which leads to global warming and ocean acidification. The major subject studied has been how to stabilize the greenhouse gas concentration level, chiefly that of carbon dioxide (CO2). Besides the emissions from natural systems, further emissions that increase atmospheric carbon dioxide are generated by human activities. Deterministic mathematical models describing a climate-economy dynamic system are presented in De Lara and Doyen [10], Doyen et al. [11], and Nordhaus [12]. Inspired by this work, we plug uncertain variables into the dynamic system and deal with the management of the interaction between economic growth and greenhouse gas emissions.

In order to formulate the mathematical models, we use the following notations:

M(t): the atmospheric CO2 concentration level, measured in mg/kg, at time t (state variable);

$Q(t)$: the aggregated economic production level, such as gross world product (GWP), at time $t$, measured in trillion US dollars (state variable);
$u(t)$: the abatement rate of CO2 emissions, $0 \le u(t) \le 1$ (control variable);
$M_-$: the preindustrial equilibrium atmospheric concentration;
$\delta$: the natural rate of removal of atmospheric CO2 to unspecified sinks;
$E(Q(t))$: the CO2 emissions released by the economic production $Q(t)$;
$\xi^e_{t+1}$: the uncertain rate of growth of the production level (uncertain variable);
$\xi^p_{t+1}$: the conversion factor from emissions to concentration; it sums up highly complex physical mechanisms and is modeled as an uncertain variable. Then $\xi^p_{t+1}E(Q(t))$ stands for the CO2 retention in the atmosphere (uncertain variable);
$T$: a positive integer denoting the number of managing time stages.

We present the dynamics of the carbon cycle and of global economic production, described by the uncertain difference equations
$$M(t+1) = M(t) - \delta\big(M(t) - M_-\big) + \xi^p_{t+1}E(Q(t))\big(1 - u(t)\big), \tag{9.7}$$
$$Q(t+1) = \big(1 + \xi^e_{t+1}\big)Q(t), \tag{9.8}$$
where time $t$ varies in $\{0, 1, \ldots, T-1\}$, and $M_0$ and $Q_0$ denote the initial CO2 concentration level and the initial production level, respectively. The carbon cycle dynamics (9.7) can be rewritten as
$$M(t+1) - M_- = (1-\delta)\big(M(t) - M_-\big) + \xi^p_{t+1}E(Q(t))\big(1 - u(t)\big),$$
which represents the anthropogenic perturbation of a natural system away from the preindustrial equilibrium atmospheric concentration. Dynamics (9.8) indicates that abatement policies or costs do not directly influence the economy; this is assuredly a restrictive assumption, but it is commonly used in modeling GHG reduction policies. In addition, we suppose that the uncertain variables are stage-by-stage independent. We consider a physical or environmental requirement as a constraint, through the limitation of CO2 concentrations below a tolerable threshold at the specified final horizon $T$.
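Once realizations of $\xi^p$ and $\xi^e$ are fixed, a sample path of the dynamics (9.7)-(9.8) is a straightforward recursion. The sketch below is illustrative only: the emission function $E(Q) = 0.5\,Q$, the constant abatement policy, the initial levels, and the random stand-ins for the uncertain factors are all assumptions, not the book's data.

```python
import numpy as np

# One sample path of the carbon-cycle/economy dynamics (9.7)-(9.8),
# with realized values of xi^p and xi^e plugged in.
T = 40
delta, M_pre = 0.017, 274.0          # natural removal rate, preindustrial level
M = np.empty(T + 1); Q = np.empty(T + 1)
M[0], Q[0] = 400.0, 80.0             # assumed initial levels (ppm, trillion US$)

E = lambda Q_t: 0.5 * Q_t            # hypothetical emission function E(Q)
u = np.full(T, 0.3)                  # a constant abatement policy, for the demo

rng = np.random.default_rng(0)
xi_p = 0.64 + 0.05 * rng.standard_normal(T)   # stand-ins for the uncertain xi^p
xi_e = rng.uniform(0.0, 0.04, T)              # stand-ins for the uncertain xi^e

for t in range(T):
    # carbon cycle (9.7): removal toward M_pre plus retained emissions
    M[t + 1] = M[t] - delta * (M[t] - M_pre) + xi_p[t] * E(Q[t]) * (1 - u[t])
    # economy (9.8): multiplicative growth
    Q[t + 1] = (1 + xi_e[t]) * Q[t]

print(round(M[-1], 1), round(Q[-1], 1))
```

Repeating this recursion for many realizations produces the bundle of feasible concentration trajectories shown later in Fig. 9.1.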
This concentration target is pursued to avoid danger:
$$M(T) \le M_{\lim}. \tag{9.9}$$
We now introduce $C(Q(t), u(t))$ to specify the abatement cost function, and the parameter $\rho \in (0,1)$ denotes the discount factor. If the total cost is to be minimized, the controller has to balance the desire to minimize the cost of the current decision against the desire to avoid future situations in which high cost is inevitable. We study the following pessimistic value model of an uncertain optimal control problem:

$$J(M_0, Q_0, 0) = \min_{u(0),\ldots,u(T-1)}\left[\sum_{t=0}^{T-1}\rho^t\,C\big(Q(t), u(t)\big)\right]_{\inf}(\alpha)\quad\text{subject to (9.7), (9.8) and (9.9)},\tag{9.10}$$
where the parameter $\alpha \in (0,1]$ denotes the predetermined confidence level. Similarly to the literature [11], the abatement cost function $C$ is assumed to have the multiplicative form
$$C\big(Q(t), u(t)\big) = \left(\frac{Q(t)}{Q_0}\right)^{\mu} E\big(Q(t)\big)\,L\big(u(t)\big),$$
and in this work we set $C(Q(t), u(t))$ to be linear or quadratic with respect to the abatement rate $u(t)$ by designing $L(u(t)) = \eta u(t)$ or $L(u(t)) = \eta u^2(t)/2$, respectively, where the coefficient $\mu$ is related to the rate of technical progress and $\eta$ depends on the price of the backstop technology.

The problem is solved at 1-year intervals, and $T = 40$. The initial CO2 concentration level $M_0$ (in ppm) and the initial production level $Q_0$ (in trillion US$) are set according to data from the Web site co2now.org. The concentration target is fixed at $M_{\lim} = 450$ ppm, while the preindustrial level is $M_- = 274$ ppm. We take the confidence level $\alpha = 0.90$, the natural removal rate $\delta = 0.017$, the abatement cost parameters $\eta = 100$ and $\mu = 1.0$, and the discount factor $\rho = 1/1.08$. The indeterminate factors $\xi^p_1, \ldots, \xi^p_T$ are specified as independent normal uncertain variables whose uncertainty distribution is
$$\Phi_p(x) = \left(1 + \exp\left(\frac{\pi(e - x)}{\sqrt{3}\,\sigma}\right)\right)^{-1},$$
with $e = 0.64$. Additionally, $\xi^e_1, \ldots, \xi^e_T$ are independent uncertain variables following a linear distribution, denoted by $\mathcal{L}(a, b)$. That is,
$$\Phi_e(x) = \begin{cases} 0, & \text{if } x \le a,\\ (x-a)/(b-a), & \text{if } a \le x \le b,\\ 1, & \text{if } x \ge b, \end{cases}$$
with $a = 0.00$. The feasible solutions are illustrated by Fig. 9.1, drawn with the support of the uncertain simulation algorithm described earlier in this chapter. It shows several CO2 concentration trajectories; the concentrations $M(t)$ are sometimes larger than the terminal target $M_{\lim}$. This indicates that even though the final state is restricted to a target set, the boundary may be exceeded along the way.
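Both uncertainty distributions above invert in closed form, which is exactly what the simulation needs. A minimal sketch of the inverses follows; the values of $\sigma$ and $b$ are assumptions (they are not recoverable from this transcription), so treat them as placeholders.

```python
import numpy as np

# Inverse uncertainty distributions: solve Phi(c) = r for a given r in (0, 1).
e, sigma = 0.64, 0.16        # normal distribution parameters (sigma is assumed)
a, b = 0.00, 0.04            # linear distribution L(a, b) endpoints (b is assumed)

def inv_normal(r):
    # Inverse of Phi_p(x) = (1 + exp(pi (e - x) / (sqrt(3) sigma)))^(-1)
    return e + sigma * np.sqrt(3.0) / np.pi * np.log(r / (1.0 - r))

def inv_linear(r):
    # Inverse of the linear distribution L(a, b) on [a, b]
    return a + r * (b - a)

rng = np.random.default_rng(1)
r = rng.uniform(1e-6, 1 - 1e-6, size=5)
print(np.round(inv_normal(r), 3), np.round(inv_linear(r), 4))
```

Feeding arbitrarily generated numbers $r$ into these inverses yields the realizations $c = \Phi^{-1}(r)$ used to draw sample trajectories.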
Fig. 9.1 Feasible CO2 concentration trajectories (CO2 concentration versus time)

It should be pointed out that a realization $c_{t+1}$ of an uncertain variable $\xi_{t+1}$ can be obtained from $\Phi(c_{t+1}) = r_{t+1}$, where $r_{t+1}$ is a number arbitrarily generated from the interval (0, 1). Using the recursion equation [1] for the pessimistic value model, we give the numerical results and simulations. Referring to Tables 9.1 and 9.2, two different cost functions are used to obtain the minimal pessimistic discounted intertemporal costs $J(M_0, Q_0, 0)$. The CO2 concentrations $M(t)$ and the productions $Q(t)$ are both uncertain processes; that is, for each fixed time $t$, $M(t)$ and $Q(t)$ are uncertain variables, so we can only realize a typical sample path. The state path and its associated optimal control sequence are shown in the two tables, respectively. In both cases, the optimal abatement rates $u^*(t)$ always increase over time. Additionally, a jump in $u^*(t)$ appears in the linear cost case, while it vanishes in the quadratic case, replaced by a gentle change of slope. Obviously, the minimal total abatement cost is an uncertain variable, which is why we measure it with the pessimistic criterion. As displayed in Fig. 9.2, we compare realization points of the uncertain cost with its 0.9-pessimistic value and its expected value. It can be observed that the minimal total abatement cost is larger under the pessimistic criterion than under the expected-value criterion; although the minimal expected cost is optimal on average, the realizations can be far from it, while the pessimistic one may not be hard to reach. Minimizing the pessimistic cost is cautious to some extent, and it actually provides the least bad cost with belief degree 0.90. In this problem, however, the target is to find mitigation policies that stabilize the CO2 concentration. From this perspective, the pessimistic criterion may not strongly contain the costs of mitigation, but it prevents the damages to come.
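The comparison in Fig. 9.2 can be reproduced in spirit with synthetic samples: in simulation, the $\alpha$-pessimistic value of the total discounted cost is approximated by the empirical $\alpha$-quantile of the sampled costs. The lognormal stand-in data below are an assumption for illustration, not the book's results.

```python
import numpy as np

# Empirical expected value versus 0.9-pessimistic value of simulated
# total discounted abatement costs (synthetic stand-in samples).
rng = np.random.default_rng(2)
costs = rng.lognormal(mean=3.0, sigma=0.4, size=1000)   # stand-in realizations

mean_cost = costs.mean()                    # expected-value criterion
pessimistic_090 = np.quantile(costs, 0.90)  # empirical 0.9-pessimistic value

print(round(mean_cost, 2), round(pessimistic_090, 2))
```

For a right-skewed cost distribution the 0.9-quantile sits well above the mean, which mirrors the observation that the pessimistic criterion is the more cautious of the two.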
From the numerical results presented in Table 9.1, we minimized the pessimistic cost subject to the 0.90 belief degree and the 450 ppm concentration limit, obtaining the corresponding optimal objective value in dollars per ton of carbon. Now we let $\alpha$ vary from the 0.50 level to the 0.95 level, let $M_{\lim}$ vary from 425 to 475 ppm, and plot the minimal total abatement cost $J$ as a function of $\alpha$ and $M_{\lim}$.

Table 9.1 Numerical results for the linear cost function (columns: stage $t$, $M(t)$, $Q(t)$, $u^*(t)$, $J(M_t, Q_t, t)$)

Table 9.2 Numerical results for the quadratic cost function (columns: stage $t$, $M(t)$, $Q(t)$, $u^*(t)$, $J(M_t, Q_t, t)$)

Fig. 9.2 Intertemporal discounted cost realizations, compared with the expected value and the 0.9-pessimistic value

Fig. 9.3 Optimal objective value with respect to the confidence level and the concentration tolerance threshold

As shown in Fig. 9.3, the deeper the color, the larger the value of $J$. It turns out that when the concentration limit is fixed, the minimal total abatement cost increases with the belief degree: a higher belief degree means a lower risk we can bear, so a greater cost may be needed to satisfy the given target constraint. At the same time, the minimal total abatement cost decreases with respect to the concentration limit; that is, if we relax the target constraint, the corresponding abatement cost can be cut under the same belief degree. This clearly displays the trade-off between sustainability thresholds and risk.


More information

Tritium: Fuel of Fusion Reactors

Tritium: Fuel of Fusion Reactors Tritium: Fuel of Fusion Reactors Tetsuo Tanabe Editor Tritium: Fuel of Fusion Reactors 123 Editor Tetsuo Tanabe Interdisciplinary Graduate School of Engineering Sciences Kyushu University Fukuoka Japan

More information

Stochastic Optimization Methods

Stochastic Optimization Methods Stochastic Optimization Methods Kurt Marti Stochastic Optimization Methods With 14 Figures 4y Springer Univ. Professor Dr. sc. math. Kurt Marti Federal Armed Forces University Munich Aero-Space Engineering

More information

Uncertain Satisfiability and Uncertain Entailment

Uncertain Satisfiability and Uncertain Entailment Uncertain Satisfiability and Uncertain Entailment Zhuo Wang, Xiang Li Department of Mathematical Sciences, Tsinghua University, Beijing, 100084, China zwang0518@sohu.com, xiang-li04@mail.tsinghua.edu.cn

More information

SpringerBriefs in Agriculture

SpringerBriefs in Agriculture SpringerBriefs in Agriculture More information about this series at http://www.springer.com/series/10183 Marina Dermastia Assunta Bertaccini Fiona Constable Nataša Mehle Grapevine Yellows Diseases and

More information

Hongda Wang Guohui Li Editors. Membrane Biophysics. New Insights and Methods

Hongda Wang Guohui Li Editors. Membrane Biophysics. New Insights and Methods Membrane Biophysics Hongda Wang Guohui Li Editors Membrane Biophysics New Insights and Methods 123 Editors Hongda Wang Changchun Institute of Applied Chemistry Chinese Academy of Sciences Changchun Guohui

More information

Advanced Structured Materials

Advanced Structured Materials Advanced Structured Materials Volume 26 Series editors Andreas Öchsner, Southport Queensland, Australia Lucas F.M. da Silva, Porto, Portugal Holm Altenbach, Magdeburg, Germany More information about this

More information

Some Properties of NSFDEs

Some Properties of NSFDEs Chenggui Yuan (Swansea University) Some Properties of NSFDEs 1 / 41 Some Properties of NSFDEs Chenggui Yuan Swansea University Chenggui Yuan (Swansea University) Some Properties of NSFDEs 2 / 41 Outline

More information

Accepted Manuscript. Uncertain Random Assignment Problem. Sibo Ding, Xiao-Jun Zeng

Accepted Manuscript. Uncertain Random Assignment Problem. Sibo Ding, Xiao-Jun Zeng Accepted Manuscript Uncertain Random Assignment Problem Sibo Ding, Xiao-Jun Zeng PII: S0307-904X(17)30717-5 DOI: 10.1016/j.apm.2017.11.026 Reference: APM 12068 To appear in: Applied Mathematical Modelling

More information

Lecture Notes in Mathematics 2138

Lecture Notes in Mathematics 2138 Lecture Notes in Mathematics 2138 Editors-in-Chief: J.-M. Morel, Cachan B. Teissier, Paris Advisory Board: Camillo De Lellis, Zurich Mario di Bernardo, Bristol Alessio Figalli, Austin Davar Khoshnevisan,

More information

Uncertain risk aversion

Uncertain risk aversion J Intell Manuf (7) 8:65 64 DOI.7/s845-4-3-5 Uncertain risk aversion Jian Zhou Yuanyuan Liu Xiaoxia Zhang Xin Gu Di Wang Received: 5 August 4 / Accepted: 8 November 4 / Published online: 7 December 4 Springer

More information

Springer INdAM Series

Springer INdAM Series Springer INdAM Series Volume 21 Editor-in-Chief G. Patrizio Series Editors C. Canuto G. Coletti G. Gentili A. Malchiodi P. Marcellini E. Mezzetti G. Moscariello T. Ruggeri More information about this series

More information

Structural Reliability Analysis using Uncertainty Theory

Structural Reliability Analysis using Uncertainty Theory Structural Reliability Analysis using Uncertainty Theory Zhuo Wang Uncertainty Theory Laboratory, Department of Mathematical Sciences Tsinghua University, Beijing 00084, China zwang058@sohu.com Abstract:

More information

Generalized Locally Toeplitz Sequences: Theory and Applications

Generalized Locally Toeplitz Sequences: Theory and Applications Generalized Locally Toeplitz Sequences: Theory and Applications Carlo Garoni Stefano Serra-Capizzano Generalized Locally Toeplitz Sequences: Theory and Applications Volume I 123 Carlo Garoni Department

More information

Statistics and Measurement Concepts with OpenStat

Statistics and Measurement Concepts with OpenStat Statistics and Measurement Concepts with OpenStat William Miller Statistics and Measurement Concepts with OpenStat William Miller Urbandale, Iowa USA ISBN 978-1-4614-5742-8 ISBN 978-1-4614-5743-5 (ebook)

More information

New independence definition of fuzzy random variable and random fuzzy variable

New independence definition of fuzzy random variable and random fuzzy variable ISSN 1 746-7233, England, UK World Journal of Modelling and Simulation Vol. 2 (2006) No. 5, pp. 338-342 New independence definition of fuzzy random variable and random fuzzy variable Xiang Li, Baoding

More information

Probability Theory, Random Processes and Mathematical Statistics

Probability Theory, Random Processes and Mathematical Statistics Probability Theory, Random Processes and Mathematical Statistics Mathematics and Its Applications Managing Editor: M.HAZEWINKEL Centre for Mathematics and Computer Science, Amsterdam, The Netherlands Volume

More information

Undergraduate Lecture Notes in Physics

Undergraduate Lecture Notes in Physics Undergraduate Lecture Notes in Physics Undergraduate Lecture Notes in Physics (ULNP) publishes authoritative texts covering topics throughout pure and applied physics. Each title in the series is suitable

More information

UNCORRECTED PROOF. Importance Index of Components in Uncertain Reliability Systems. RESEARCH Open Access 1

UNCORRECTED PROOF. Importance Index of Components in Uncertain Reliability Systems. RESEARCH Open Access 1 Gao and Yao Journal of Uncertainty Analysis and Applications _#####################_ DOI 10.1186/s40467-016-0047-y Journal of Uncertainty Analysis and Applications Q1 Q2 RESEARCH Open Access 1 Importance

More information

Progress in Mathematics 313. Jaume Llibre Rafael Ramírez. Inverse Problems in Ordinary Differential Equations and Applications

Progress in Mathematics 313. Jaume Llibre Rafael Ramírez. Inverse Problems in Ordinary Differential Equations and Applications Progress in Mathematics 313 Jaume Llibre Rafael Ramírez Inverse Problems in Ordinary Differential Equations and Applications Progress in Mathematics Volume 313 Series Editors Hyman Bass, University of

More information

Uncertain Quadratic Minimum Spanning Tree Problem

Uncertain Quadratic Minimum Spanning Tree Problem Uncertain Quadratic Minimum Spanning Tree Problem Jian Zhou Xing He Ke Wang School of Management Shanghai University Shanghai 200444 China Email: zhou_jian hexing ke@shu.edu.cn Abstract The quadratic minimum

More information