Critical Cones for Regular Controls with Inequality Constraints

International Journal of Mathematical Analysis, Vol. 12, 2018, no. 10, HIKARI Ltd

Jorge A. Becerril, Karla L. Cortez and Javier F. Rosenblueth
IIMAS, Universidad Nacional Autónoma de México, Apartado Postal , CDMX 01000, México

Copyright © 2018 Jorge A. Becerril, Karla L. Cortez and Javier F. Rosenblueth. This article is distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Based on the notions of normality and regularity for constrained problems in optimization, a conjecture on second order necessary conditions related to different critical cones is posed for a wide range of optimal control problems involving equality and inequality constraints. Several properties of the sets and functions delimiting the problem are derived and, for a specific case where the surmise has been proved correct, some fundamental questions derived from this result are completely solved.

Mathematics Subject Classification: 49K15

Keywords: Nonlinear programming, optimal control, second order necessary conditions, normality, regularity

1 Introduction: three main questions

When dealing with the theory of second order necessary conditions for constrained problems in optimization, one usually faces three fundamental questions which can be summarized in these words: who (that is, which function satisfies the conditions), when (under what assumptions), and where (on which set do they hold). For constrained problems in the finite dimensional case, they have been answered in a satisfactory way in terms of Lagrange multipliers but,

even for such a well-known theory, the functions, the assumptions, and the sets where the conditions hold may vary from text to text. In this paper we shall be concerned with these same questions in the context of optimal control. Though second order conditions for such problems can be found in the literature, some fundamental questions still remain open, and our aim is to give answers to some of them.

For the optimal control problem we shall deal with, the control functions are restricted to satisfy equality and inequality constraints for all points t in a fixed compact time interval. As pointed out in [12], these constraints make the problem much more complex than the mathematical programming problem, or even the isoperimetric problem (see [6, 7]), in part because one deals with infinitely many constraints, one for each t. In some cases, as in [21, 25], the main results on second order conditions are not proved and, quoting [21], "the derivation of the conditions is very special and difficult."

There is an extensive literature explaining the importance of deriving second order conditions for optimal control problems, both from a theoretical and an applied point of view. The research developed in [1-4] deserves particular attention: one of its main features is that the necessary conditions obtained make sense, and are derived, without a priori assumptions of normality, and second order conditions are expressed in terms of the maximum of a quadratic form over certain multipliers, for all the elements of the critical cone of the original set of constraints. Similarly, in [9-11, 24, 26], all essential references on the subject, second order necessary conditions in Pontryagin form are derived for problems with pure state and mixed (control-state) constraints, varying the set of multipliers (those satisfying Pontryagin's principle, or the larger set of Lagrange multipliers) over which the maximum of the quadratic forms on the set of critical variations is taken. For a more complete study of the subject, see also [16, 18, 20, 22, 27-30] and references therein.

Our approach, based on the notions of normal and regular controls, is rather different from the one found in those references. To begin with, one finds in those references results from an abstract optimization theory on Banach spaces applied to the control problem posed over L∞-controls, a technique which does not work in our setting, where the fixed endpoint Lagrange control problem studied is posed over piecewise C¹ trajectories and piecewise continuous controls (the same setting can be found, for example, in [18, 22, 24]). Secondly, we shall be able to establish, for certain cases, the nonnegativity of a certain quadratic form under weaker assumptions and on larger sets than those appearing in those texts (see [22, 28], where a similar set of critical directions for our problem is proposed). Finally, the nature of our necessary conditions differs from the previous ones in that the nonnegativity of the second variation holds for the same multipliers, without the need to take a maximum over different sets of multipliers.

This paper is organized as follows. To clearly situate the contributions of the article, we provide in Section 2 a summary of the main features, in terms of the notions of regularity and normality, that occur in the finite dimensional case. This is crucial in our exposition, not only because some of the results given are used later, but also due to the way it is presented and for comparison with the infinite dimensional case. In Section 3 we pose the optimal control problem we shall deal with and follow a way of reasoning similar to that of the previous section, leading to a natural conjecture on second order conditions, as well as to a new fundamental result related to a characterization of a certain set of critical directions. A particular case, covering a wide range of problems and recently derived in [27], is studied in Section 4, where we solve some important surmises and questions related to the theory of second order necessary conditions.

2 The finite dimensional case

In this section we shall deal with constrained minimum problems in finite dimensional spaces. We shall state, in a succinct and clear way, first and second order necessary conditions for optimality in terms of Lagrange multiplier rules. The approach we follow is based on the notions of regularity and normality. For a full account of these ideas we refer to [18, 19].

Suppose we are given a set S ⊂ R^n and a function f mapping R^n to R, and consider the problem, which we label P(S), of minimizing f on S. The notion of tangent cone given below is due to Hestenes [18] and, as shown in [17], it is equivalent to the one introduced by Bouligand (1932), also known as the contingent cone to S at x_0. Other authors such as Bazaraa, Goode, Nashed, Kurcyusz, Rockafellar, Saks, Rogak and many more have given various equivalent definitions of such a cone (see [5, 17]).

Definition 2.1 We shall say that a sequence {x_q} ⊂ R^n converges to x_0 in the direction h if h is a unit vector, x_q ≠ x_0, and

lim_{q→∞} |x_q − x_0| = 0,   lim_{q→∞} (x_q − x_0)/|x_q − x_0| = h.

The tangent cone of S at x_0, denoted by T_S(x_0), is the (closed) cone determined by the unit vectors h for which there exists a sequence {x_q} in S converging to x_0 in the direction h. Equivalently (see [19]), T_S(x_0) is the set of all h ∈ R^n for which there exist a sequence {x_q} in S and a sequence {t_q} of positive numbers such that

lim_{q→∞} t_q = 0,   lim_{q→∞} (x_q − x_0)/t_q = h.
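To make Definition 2.1 concrete, the following minimal sketch (not taken from the paper; the set S and the sequence are illustrative choices) generates a sequence in S = {(x, y) : y ≥ x²} converging to x_0 = (0, 0) and checks that the normalized differences approach the unit vector h = (1, 0), so that h belongs to T_S(x_0).

```python
import numpy as np

# Illustrative check of Definition 2.1 (assumed example, not from the paper):
# S = {(x, y) : y >= x^2}, x0 = (0, 0).  The points x_q = (1/q, 1/q^2) lie in S,
# converge to x0, and (x_q - x0)/|x_q - x0| tends to h = (1, 0),
# so h belongs to the tangent cone T_S(x0).
x0 = np.array([0.0, 0.0])
for q in [10, 100, 1000, 10000]:
    xq = np.array([1.0 / q, 1.0 / q ** 2])
    d = xq - x0
    print(q, d / np.linalg.norm(d))   # approaches [1. 0.]
```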

Clearly, if {x_q} converges to x_0 in the direction h and f has a differential at x_0, then

lim_{q→∞} (f(x_q) − f(x_0))/|x_q − x_0| = f'(x_0; h).

If f has a second differential at x_0, then

lim_{q→∞} (f(x_q) − f(x_0) − f'(x_0; x_q − x_0))/|x_q − x_0|² = (1/2) f''(x_0; h).

From these facts, necessary conditions for P(S) follow straightforwardly.

Theorem 2.2 Suppose x_0 solves P(S) locally. If f has a differential at x_0, then f'(x_0; h) ≥ 0 for all h ∈ T_S(x_0). If f has a second differential at x_0 and f'(x_0) = 0, then f''(x_0; h) ≥ 0 for all h ∈ T_S(x_0).

Consider now problem P(S) with S defined by inequality and equality constraints. We are given functions f, g_i: R^n → R (i ∈ A ∪ B) and

S = {x ∈ R^n : g_α(x) ≤ 0 (α ∈ A), g_β(x) = 0 (β ∈ B)}

where A = {1,..., p}, B = {p + 1,..., m}. The cases p = 0 and p = m are to be given the obvious interpretations. Assume f and g_i are continuous on a neighborhood of x_0 ∈ S and possess second differentials at x_0.

Definition 2.3 Define the set of active indices at x_0 by I(x_0) := {α ∈ A : g_α(x_0) = 0} and the set of vectors satisfying the tangential constraints at x_0 by

R_S(x_0) := {h ∈ R^n : g'_α(x_0; h) ≤ 0 (α ∈ I(x_0)), g'_β(x_0; h) = 0 (β ∈ B)}.

Note that T_S(x_0) ⊂ R_S(x_0). We shall say that x_0 is a regular point of S if T_S(x_0) = R_S(x_0).

The first order Lagrange multiplier rule is a consequence of Theorem 2.2 and some basic results on linear functionals. Let us define the set E of extremals as the set of all (x_0, λ) ∈ S × R^m such that

i. λ_α ≥ 0 and λ_α g_α(x_0) = 0 (α ∈ A).
ii. If F(x) := f(x) + ⟨λ, g(x)⟩ then F'(x_0) = 0.

Theorem 2.4 Suppose x_0 solves P(S) locally. If x_0 is a regular point of S, then there exists λ ∈ R^m such that (x_0, λ) ∈ E.

The second order Lagrange multiplier rule follows by a simple application of Theorem 2.2.
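Before turning to second order conditions, here is a hedged numerical sketch of Theorem 2.4 (the problem below is an illustrative choice, not one from the paper): at a known regular minimizer the multiplier λ can be recovered from the stationarity equation, and the sign and complementarity conditions can then be checked.

```python
import numpy as np

# Toy instance of P(S) (assumed for illustration): minimize f(x) = x1^2 + x2^2
# subject to g1(x) = 1 - x1 - x2 <= 0.  The minimizer x0 = (1/2, 1/2) is a
# regular point with g1 active, so Theorem 2.4 yields lambda >= 0 with
# F'(x0) = f'(x0) + lambda * g1'(x0) = 0.
x0 = np.array([0.5, 0.5])
grad_f = 2.0 * x0                       # f'(x0)
grad_g = np.array([-1.0, -1.0])         # g1'(x0)

# Solve grad_f + lam * grad_g = 0 for lam (least squares, one unknown).
lam, *_ = np.linalg.lstsq(grad_g.reshape(-1, 1), -grad_f, rcond=None)
print("lambda =", lam[0])                                   # 1.0
print("stationarity:", grad_f + lam[0] * grad_g)            # ~[0. 0.]
print("complementarity:", lam[0] * (1.0 - x0.sum()))        # 0.0
```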

Theorem 2.5 Suppose (x_0, λ) ∈ E, F(x) = f(x) + ⟨λ, g(x)⟩, and S_1 = {x ∈ S : F(x) = f(x)}. If x_0 solves P(S) locally, then F''(x_0; h) ≥ 0 for all h ∈ T_{S_1}(x_0).

Note that

S_1 = {x ∈ S : g_α(x) = 0 (α ∈ Γ)},

where Γ = {α ∈ A : λ_α > 0}, and

R_{S_1}(x_0) = {h ∈ R_S(x_0) : g'_α(x_0; h) = 0 (α ∈ Γ)} = {h ∈ R_S(x_0) : f'(x_0; h) = 0}.

In general, it may be difficult to test for regularity and one usually requires some criterion that implies that condition. A simple criterion is that of normality, defined in [19] as follows.

Definition 2.6 We shall say that x_0 ∈ S is normal relative to S if λ = 0 is the only solution to

i. λ_α ≥ 0 and λ_α g_α(x_0) = 0 (α ∈ A).
ii. Σ_{i=1}^m λ_i g'_i(x_0) = 0.

To be precise, we should say in this definition that x_0 ∈ S is normal relative to the functions defining the set S and not relative to the set S defined by those functions. In other words, if λ = 0 is the only solution to the above system, then x_0 ∈ S is normal relative to the function g and the sets A and B of indices for inequality and equality constraints respectively. However, following the definition introduced in [18, 19], we shall use the above definition, and no confusion should arise if S is replaced with a different set defined also in terms of inequality and equality constraints.

In [19] it is proved that x_0 is normal relative to S if and only if the set {g'_β(x_0) : β ∈ B} is linearly independent and, if p > 0, there exists h such that

g'_α(x_0; h) < 0 (α ∈ I(x_0)),   g'_β(x_0; h) = 0 (β ∈ B).

This characterization is used in [19] to prove the following crucial result.

Theorem 2.7 If x_0 is a normal point of S then x_0 is a regular point of S.

The following extended rule, the Fritz John necessary optimality condition, yields in a natural way the above definition of normality (see [23] for a simple proof based on the theory of augmentability).

Theorem 2.8 Suppose x_0 solves P(S) locally. Then there exist λ_0 ≥ 0 and λ ∈ R^m, not both zero, such that

i. λ_α ≥ 0 and λ_α g_α(x_0) = 0 (α ∈ A).
ii. If F_0(x) := λ_0 f(x) + ⟨λ, g(x)⟩ then F'_0(x_0) = 0.

Clearly, in this theorem, if x_0 is also a normal point of S then λ_0 > 0 and the multipliers can be chosen so that λ_0 = 1.

The real numbers λ_1,..., λ_m satisfying the first order Lagrange multiplier rule (or the Karush-Kuhn-Tucker conditions) are called the Kuhn-Tucker or Lagrange multipliers, and the function F is the standard Lagrangian. In general, if x_0 affords a local minimum to f on S, those conditions may not hold at x_0, and some additional assumptions should be imposed to guarantee the existence of Lagrange multipliers. Assumptions of this nature are usually referred to as constraint qualifications (see [17]) since they involve only the constraints and are independent of the geometric structure of the feasible set S (a broader definition, in terms of critical directions, is given in [8]). Equivalently, they correspond to conditions which assure the positiveness of the cost multiplier λ_0 in Theorem 2.8. Both normality and regularity are constraint qualifications, as is the characterization of normality mentioned above, known as the Mangasarian-Fromovitz constraint qualification.

A combination of Theorems 2.5 and 2.7 yields the following basic result on second order necessary conditions.

Theorem 2.9 Suppose (x_0, λ) ∈ E, F(x) = f(x) + ⟨λ, g(x)⟩, and S_1 = {x ∈ S : F(x) = f(x)}. If x_0 solves P(S) locally and is a normal point of S_1, then F''(x_0; h) ≥ 0 for all h ∈ R_{S_1}(x_0).

Let us turn now to a different notion of normality given in [18].

Definition 2.10 x_0 ∈ S is s-normal relative to S if the linear functionals h ↦ g'_α(x_0; h) (α ∈ I(x_0) ∪ B) are linearly independent.

By an application of the implicit function theorem, and making use of the set of curvilinear tangent vectors of S at x_0 given by

C_S(x_0) := {h ∈ R^n : there exist δ > 0 and x: [0, δ) → S such that x(0) = x_0, ẋ(0) = h},

one can easily show that s-normality implies that R_S(x_0) ⊂ C_S(x_0). Since C_S(x_0) ⊂ T_S(x_0) ⊂ R_S(x_0), this implies the following well-known result.

Theorem 2.11 If x_0 is s-normal relative to S then x_0 is a regular point of S.

This statement is precisely Lemma 10.1, Chapter 1 of [18] and it has implications, in particular, for second order conditions. In Theorem 2.5, if x_0 is a regular point of S_1, then F''(x_0; h) ≥ 0 for all h ∈ R_{S_1}(x_0). By Theorem 2.7, x_0 is a regular point of S_1 if it is a normal point of S_1, and thus Theorem 2.9 holds. In [18], on the other hand, the criterion for this condition to hold is for x_0 to be s-normal relative to S since then it would be s-normal relative to

S_1, and hence a regular point of S_1. As one readily verifies, x_0 ∈ S is s-normal relative to S if it is normal relative to the set

S_0 := {x ∈ R^n : g_i(x) = 0 (i ∈ I(x_0) ∪ B)},

and thus the criterion used in [18] for regularity relative to S_1 is in general stronger than that of normality relative to S_1 used in [19]. Moreover, in most textbooks (see a list of well-known references in [8]) not only is the assumption of normality relative to S_1 replaced by (the stronger assumption of) normality relative to S_0, but the set of tangential constraints R_{S_1}(x_0) is replaced by (the, in general, smaller set of tangential constraints)

R_{S_0}(x_0) = {h ∈ R^n : g'_i(x_0; h) = 0 (i ∈ I(x_0) ∪ B)}.

This rather "well-worn" result (as Ben-Tal puts it [8]) is the following.

Theorem 2.12 Suppose (x_0, λ) ∈ E and F(x) = f(x) + ⟨λ, g(x)⟩. If x_0 solves P(S) locally and is a normal point of S_0, then F''(x_0; h) ≥ 0 for all h ∈ R_{S_0}(x_0).

As pointed out in [8], "the source of this weaker result can be attributed to the traditional way of treating the active inequality constraints as equality constraints."

Let us end this section with an example which illustrates some of the main features of second order necessary conditions explained above. It should be noted that, for this example, the solution x_0 to the problem is a regular point of S so that, by Theorem 2.4, there exists λ such that (x_0, λ) is an extremal. However, in contrast with the conclusion of Theorem 2.9, we can exhibit h ∈ R_{S_1}(x_0) with F''(x_0; h) < 0.

Example 2.13 Consider the problem of minimizing f(x, y) = y + x² on the set S = {(x, y) : g_α(x, y) ≤ 0 (α = 1, 2, 3)} where

g_1(x, y) = −y,   g_2(x, y) = cos x − y,   g_3(x, y) = cos 2x − y.

As one readily verifies, (0, 1) solves the problem. The set of active indices at this point is I(0, 1) = {2, 3}, and it is a regular point of S since

T_S(0, 1) = {(h, k) : k ≥ 0},   R_S(0, 1) = {(h, k) : k ≥ 0}.

By Theorem 2.4, there exists λ = (λ_1, λ_2, λ_3) with λ_1 = 0 and λ_2, λ_3 ≥ 0 such that, if

F(x, y) = f(x, y) + ⟨λ, g(x, y)⟩ = y + x² − (λ_1 + λ_2 + λ_3)y + λ_2 cos x + λ_3 cos 2x

then F'(0, 1) = 0. Since

F'(x, y) = (2x − λ_2 sin x − 2λ_3 sin 2x, 1 − λ_1 − λ_2 − λ_3),

we have, with λ_1 = 0, that F'(0, 1) = (0, 1 − λ_2 − λ_3) and so λ_2 + λ_3 = 1. Now, note that

F''(x, y) = diag(2 − λ_2 cos x − 4λ_3 cos 2x, 0)

and therefore F''(0, 1; h, k) = (2 − λ_2 − 4λ_3)h². Let λ_2 = 0 and λ_3 = 1. Then R_{S_1}(0, 1) = {(h, k) : k ≥ 0, k = 0} and so, for any h ≠ 0, we have (h, 0) ∈ R_{S_1}(0, 1) but

F''(0, 1; h, 0) = −2h² < 0.

3 Optimal control

In this section we shall deal with a fixed endpoint Lagrange optimal control problem involving inequality and equality constraints in the control functions. We are interested in deriving, for this optimal control problem, a theory parallel to the one given in the previous section. We shall follow a similar line of thought and, in particular, try to see if a result analogous to Theorem 2.9, that is, second order conditions in terms of normality relative to the corresponding set S_1, can be established.

Suppose we are given an interval T := [t_0, t_1] in R, two points ξ_0, ξ_1 in R^n, functions L and f mapping T × R^n × R^m to R and R^n respectively, and ϕ = (ϕ_1,..., ϕ_q) mapping R^m to R^q (q ≤ m). Denote by X the space of piecewise C¹ functions mapping T to R^n, by U_k the space of piecewise continuous functions mapping T to R^k (k ∈ N), and set Z := X × U_m,

D := {(x, u) ∈ Z : ẋ(t) = f(t, x(t), u(t)) (t ∈ T), x(t_0) = ξ_0, x(t_1) = ξ_1},

S := {(x, u) ∈ D : ϕ_α(u(t)) ≤ 0, ϕ_β(u(t)) = 0 (α ∈ R, β ∈ Q, t ∈ T)}

where R = {1,..., r}, Q = {r+1,..., q}, and consider the functional I: Z → R given by

I(x, u) := ∫_{t_0}^{t_1} L(t, x(t), u(t)) dt   ((x, u) ∈ Z).

The problem we shall deal with, which (again, but no confusion should arise) we label P(S), is that of minimizing I over S. Implicit in the statement of the problem is a relatively open subset O of T × R^n for which the domain of L and f is O × R^m. Elements of Z will be called processes, and a process (x, u) is admissible if it belongs to S and

(t, x(t)) ∈ O (t ∈ T). A process (x, u) solves P(S) (locally) if it is admissible and (upon shrinking O if necessary) we have I(x, u) ≤ I(y, v) for all admissible processes (y, v).

Given (x, u) ∈ Z we shall use the notation x̃(t) to represent (t, x(t), u(t)), and * denotes transpose. With respect to the smoothness of the functions delimiting the problem, we assume that L, f and ϕ are C² and that the q × (m+r)-dimensional matrix

( ∂ϕ_i/∂u_k   δ_{iα} ϕ_α )   (i = 1,..., q; α = 1,..., r; k = 1,..., m)

has rank q on U (here δ_{αα} = 1, δ_{αβ} = 0 (α ≠ β)), where

U := {u ∈ R^m : ϕ_α(u) ≤ 0 (α ∈ R), ϕ_β(u) = 0 (β ∈ Q)}.

This condition is equivalent to the condition that, at each point u in U, the matrix

( ∂ϕ_i/∂u_k )   (i = i_1,..., i_p; k = 1,..., m)

has rank p, where i_1,..., i_p are the indices i ∈ {1,..., q} such that ϕ_i(u) = 0 (see [15] for details).

For all (t, x, u, p, μ, λ) in T × R^n × R^m × R^n × R^q × R let

H(t, x, u, p, μ, λ) := ⟨p, f(t, x, u)⟩ − λL(t, x, u) − ⟨μ, ϕ(u)⟩.

First order necessary conditions are well established (see, for example, [15, 18, 22]) and one version can be stated as follows.

Theorem 3.1 If (x_0, u_0) solves P(S), then there exist λ_0 ≥ 0, p ∈ X, and μ ∈ U_q, not vanishing simultaneously on T, such that

a. μ_α(t) ≥ 0 and μ_α(t)ϕ_α(u_0(t)) = 0 (α ∈ R, t ∈ T);
b. ṗ(t) = −H_x*(x̃_0(t), p(t), μ(t), λ_0) and H_u(x̃_0(t), p(t), μ(t), λ_0) = 0 on every interval of continuity of u_0.

From this result we introduce the notion of normality as in Section 2, that is, in such a way that the non-vanishing of the cost multiplier can be assured. This is accomplished by having zero as the unique solution to the adjoint equation whenever λ_0 = 0.

Definition 3.2 (x_0, u_0) ∈ S is a normal process of S if, given (p, μ) ∈ X × U_q satisfying

i. μ_α(t) ≥ 0 and μ_α(t)ϕ_α(u_0(t)) = 0 (α ∈ R, t ∈ T);
ii. ṗ(t) = −f_x*(x̃_0(t))p(t) [ = −H_x*(x̃_0(t), p(t), μ(t), 0) ] (t ∈ T);
iii. 0 = f_u*(x̃_0(t))p(t) − ϕ*(u_0(t))μ(t) [ = H_u*(x̃_0(t), p(t), μ(t), 0) ] (t ∈ T),

then p ≡ 0. Note that, in this event, also μ ≡ 0.
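The rank assumption stated earlier in this section can be checked pointwise. The following hedged sketch (the constraints are an assumed illustration, not those of the paper) forms the q × (m+r) matrix ( ∂ϕ_i/∂u_k  δ_{iα}ϕ_α ) and verifies that it has rank q on U.

```python
import numpy as np

# Hedged sketch (illustrative constraints, not from the paper) of the rank
# condition: for phi_1(u) = u1 - u2 <= 0 (inequality) and
# phi_2(u) = u1^2 + u2^2 - 1 = 0 (equality), so q = 2, r = 1, m = 2,
# the q x (m + r) matrix ( dphi_i/du_k | delta_{i,alpha} phi_alpha )
# must have rank q at every u in U.
def rank_matrix(u):
    u1, u2 = u
    dphi = np.array([[1.0, -1.0],          # phi_1'
                     [2 * u1, 2 * u2]])    # phi_2'
    slack = np.array([[u1 - u2],           # delta_{1,1} * phi_1(u)
                      [0.0]])              # delta_{2,1} = 0
    return np.hstack([dphi, slack])

for u in [(1 / 2 ** 0.5, 1 / 2 ** 0.5), (0.0, 1.0)]:   # points of U
    M = rank_matrix(u)
    print(u, "rank =", np.linalg.matrix_rank(M))        # 2 in both cases
```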

We define the set of extremals as the set of all (x, u, p, μ) for which the conditions of Theorem 3.1 hold with λ_0 = 1.

Definition 3.3 Denote by E the set of all (x, u, p, μ) ∈ Z × X × U_q such that

a. μ_α(t) ≥ 0 and μ_α(t)ϕ_α(u(t)) = 0 (α ∈ R, t ∈ T);
b. ṗ(t) = −f_x*(x̃(t))p(t) + L_x*(x̃(t)) (t ∈ T);
c. f_u*(x̃(t))p(t) = L_u*(x̃(t)) + ϕ*(u(t))μ(t) (t ∈ T).

Note 3.4 From the above definitions it follows that, in Theorem 3.1, if (x_0, u_0) is also a normal process of S, then there exists (p, μ) ∈ X × U_q such that (x_0, u_0, p, μ) ∈ E.

Let us now introduce the second variation with respect to H, as well as the set of curvilinear tangent processes. For any (x, u, p, μ) ∈ Z × X × U_q let

J((x, u, p, μ); (y, v)) := ∫_{t_0}^{t_1} 2Ω(t, y(t), v(t)) dt   ((y, v) ∈ Z)

where, for all (t, y, v) ∈ T × R^n × R^m,

2Ω(t, y, v) := −[⟨y, H_xx(t)y⟩ + 2⟨y, H_xu(t)v⟩ + ⟨v, H_uu(t)v⟩]

and H(t) denotes H(x̃(t), p(t), μ(t), 1).

For any (x_0, u_0) ∈ S, denote by C_S(x_0, u_0) the set of all processes (y, v) for which there exist δ > 0 and (x(·, ε), u(·, ε)) ∈ S (0 ≤ ε < δ) such that

a. (x(t, 0), u(t, 0)) = (x_0(t), u_0(t)) (t ∈ T);
b. (x_ε(t, 0), u_ε(t, 0)) = (y(t), v(t)) (t ∈ T);
c. x(t, ε) is continuous and has continuous first and second derivatives with respect to ε; u(t, ε) and its first and second derivatives with respect to ε are piecewise continuous with respect to t.

Based on these definitions, we can establish the following fundamental result.

Theorem 3.5 Let (x_0, u_0) ∈ S and suppose (p, μ) ∈ X × U_q is such that (x_0, u_0, p, μ) ∈ E. Let

S_1 [= S_1(μ)] := {(x, u) ∈ S : ϕ_α(u(t)) = 0 (α ∈ R with μ_α(t) > 0, t ∈ T)}.

If (x_0, u_0) solves P(S) then J((x_0, u_0, p, μ); (y, v)) ≥ 0 for all (y, v) ∈ C_{S_1}(x_0, u_0).

Proof: Define

K(x, u) := ⟨p(t_1), ξ_1⟩ − ⟨p(t_0), ξ_0⟩ + ∫_{t_0}^{t_1} F(t, x(t), u(t)) dt   ((x, u) ∈ Z)

where, for all (t, x, u) ∈ T × R^n × R^m,

F(t, x, u) := L(t, x, u) − ⟨p(t), f(t, x, u)⟩ + ⟨μ(t), ϕ(u)⟩ − ⟨ṗ(t), x⟩.

Observe that F(t, x, u) = −H(t, x, u, p(t), μ(t), 1) − ⟨ṗ(t), x⟩ and, if (x, u) ∈ S, then

K(x, u) = I(x, u) + ∫_{t_0}^{t_1} ⟨μ(t), ϕ(u(t))⟩ dt.

Note that

S_1 = {(x, u) ∈ S : μ_α(t)ϕ_α(u(t)) = 0 (α ∈ R, t ∈ T)} = {(x, u) ∈ S : K(x, u) = I(x, u)}.

Since (x_0, u_0) ∈ S_1, (x_0, u_0) minimizes K on S_1. Now, let (y, v) belong to C_{S_1}(x_0, u_0) and let δ > 0 and (x(·, ε), u(·, ε)) ∈ S_1 (0 ≤ ε < δ) be such that (x(t, 0), u(t, 0)) = (x_0(t), u_0(t)) and (x_ε(t, 0), u_ε(t, 0)) = (y(t), v(t)). Then g(ε) := K(x(·, ε), u(·, ε)) (0 ≤ ε < δ) satisfies

g(ε) = I(x(·, ε), u(·, ε)) ≥ I(x_0, u_0) = K(x_0, u_0) = g(0)   (0 ≤ ε < δ).

Note that

F_x(x̃_0(t)) = −H_x(x̃_0(t), p(t), μ(t), 1) − ṗ*(t) = 0,   F_u(x̃_0(t)) = −H_u(x̃_0(t), p(t), μ(t), 1) = 0,

and therefore g'(0) = 0. Consequently 0 ≤ g''(0) = J((x_0, u_0, p, μ); (y, v)).

As in the finite dimensional case, we are interested in characterizing the set C_S(x_0, u_0) in terms of a set of tangential constraints whose membership, in contrast with C_S(x_0, u_0), may be easily verifiable.

Definition 3.6 For all (x_0, u_0) ∈ S let

L(x_0, u_0) := {(y, v) ∈ Z : ẏ(t) = A(t)y(t) + B(t)v(t) (t ∈ T), y(t_0) = y(t_1) = 0}

where A(t) = f_x(x̃_0(t)), B(t) = f_u(x̃_0(t)). Denote the set of active indices at u ∈ R^m by I_a(u) := {α ∈ R : ϕ_α(u) = 0} and define the set of processes satisfying the tangential constraints at (x_0, u_0) with respect to S by

R_S(x_0, u_0) := {(y, v) ∈ L(x_0, u_0) : ϕ'_α(u_0(t))v(t) ≤ 0 (α ∈ I_a(u_0(t)), t ∈ T), ϕ'_β(u_0(t))v(t) = 0 (β ∈ Q, t ∈ T)}.

As we prove next, curvilinear tangent processes satisfy the tangential constraints.

Proposition 3.7 For all (x_0, u_0) ∈ S, C_S(x_0, u_0) ⊂ R_S(x_0, u_0).

Proof: Let (y, v) ∈ C_S(x_0, u_0) and let δ > 0 and (x(·, ε), u(·, ε)) ∈ S (0 ≤ ε < δ) be such that (x(t, 0), u(t, 0)) = (x_0(t), u_0(t)) and (x_ε(t, 0), u_ε(t, 0)) = (y(t), v(t)). Then, for all 0 ≤ ε < δ,

ẋ(t, ε) = f(t, x(t, ε), u(t, ε)) (t ∈ T),   x(t_0, ε) = ξ_0,   x(t_1, ε) = ξ_1,

and so (y, v) solves L(x_0, u_0). Also, for all (t, ε) ∈ T × [0, δ),

ϕ_α(u(t, ε)) ≤ 0 (α ∈ R),   ϕ_β(u(t, ε)) = 0 (β ∈ Q).

Fix i ∈ R ∪ Q and t ∈ T, and set γ(ε) := ϕ_i(u(t, ε)) so that γ'(0) = ϕ'_i(u_0(t))v(t). If i ∈ I_a(u_0(t)) then γ'(0) ≤ 0. If i ∈ Q then γ'(0) = 0.

Definition 3.8 We say (x_0, u_0) is a c-regular point of S if C_S(x_0, u_0) = R_S(x_0, u_0).

Now, let us exhibit three sets of constraints and their corresponding sets of tangential constraints. For any u ∈ R^m and μ ∈ R^q, let

τ_0(u) := {h ∈ R^m : ϕ'_i(u)h = 0 (i ∈ I_a(u) ∪ Q)},

τ_1(u, μ) := {h ∈ R^m : ϕ'_α(u)h ≤ 0 (α ∈ I_a(u) with μ_α = 0), ϕ'_β(u)h = 0 (β ∈ R with μ_β > 0, or β ∈ Q)},

τ_2(u) := {h ∈ R^m : ϕ'_α(u)h ≤ 0 (α ∈ I_a(u)), ϕ'_β(u)h = 0 (β ∈ Q)}.
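A hedged sketch of how membership in τ_0, τ_1 and τ_2 can be tested pointwise is given below; the Jacobian rows and the multiplier values are illustrative assumptions (they match the data of Example 3.13 below with k = 2), not part of the paper's development.

```python
import numpy as np

# Hedged sketch (illustrative data): membership tests for the cones
# tau_0, tau_1, tau_2 at a fixed t, given the Jacobian rows phi_i'(u0(t)),
# the active inequality indices, the equality indices, and mu(t).
def in_tau(h, dphi, active, eqs, mu=None, which=2, tol=1e-10):
    for i in eqs:                                    # equality constraints
        if abs(dphi[i] @ h) > tol:
            return False
    for a in active:                                 # active inequalities
        val = dphi[a] @ h
        if which == 0 and abs(val) > tol:            # tau_0: treated as equalities
            return False
        if which == 1:                               # tau_1: equality where mu_a > 0
            if mu[a] > 0 and abs(val) > tol:
                return False
            if mu[a] == 0 and val > tol:
                return False
        if which == 2 and val > tol:                 # tau_2: plain tangential constraints
            return False
    return True

# Illustrative data: phi_1'(u0) = (1, -1, 0), phi_2'(u0) = (-2, 1, 0), mu = (0, 1).
dphi = np.array([[1.0, -1.0, 0.0], [-2.0, 1.0, 0.0]])
mu = np.array([0.0, 1.0])
h = np.array([1.0, 2.0, 0.0])        # h1 - h2 = -1 <= 0, h2 - 2*h1 = 0
print(in_tau(h, dphi, active=[0, 1], eqs=[], mu=mu, which=0))  # False
print(in_tau(h, dphi, active=[0, 1], eqs=[], mu=mu, which=1))  # True
print(in_tau(h, dphi, active=[0, 1], eqs=[], mu=mu, which=2))  # True
```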

Note that

R_S(x_0, u_0) = {(y, v) ∈ L(x_0, u_0) : v(t) ∈ τ_2(u_0(t)) (t ∈ T)}.

Also, we have

S_1(μ) = {(x, u) ∈ D : ϕ_α(u(t)) ≤ 0 (α ∈ R with μ_α(t) = 0, t ∈ T), ϕ_β(u(t)) = 0 (β ∈ R with μ_β(t) > 0, or β ∈ Q, t ∈ T)}

and therefore

R_{S_1}(x_0, u_0) = {(y, v) ∈ L(x_0, u_0) : v(t) ∈ τ_1(u_0(t), μ(t)) (t ∈ T)}.

Similarly, if

S_0 := {(x, u) ∈ D : ϕ_i(u(t)) = 0 (i ∈ I_a(u_0(t)) ∪ Q, t ∈ T)},

then

R_{S_0}(x_0, u_0) = {(y, v) ∈ L(x_0, u_0) : v(t) ∈ τ_0(u_0(t)) (t ∈ T)}.

Let us now apply the definition of normality, given in Definition 3.2, to the sets S_0 and S_1.

Note 3.9 (x_0, u_0) is a normal process of S_0 if, given (p, μ) ∈ X × U_q satisfying

i. μ_α(t)ϕ_α(u_0(t)) = 0 (α ∈ R, t ∈ T);
ii. ṗ(t) = −f_x*(x̃_0(t))p(t) and f_u*(x̃_0(t))p(t) = ϕ*(u_0(t))μ(t) (t ∈ T),

then p ≡ 0.

Note 3.10 (x_0, u_0) is a normal process of S_1(μ) if, given (p, ν) ∈ X × U_q satisfying

i. ν_α(t) ≥ 0 and ν_α(t)ϕ_α(u_0(t)) = 0 (α ∈ R with μ_α(t) = 0, t ∈ T);
ii. ṗ(t) = −f_x*(x̃_0(t))p(t) and f_u*(x̃_0(t))p(t) = ϕ*(u_0(t))ν(t) (t ∈ T),

then p ≡ 0.

Normality relative to S_0, S_1 and S can be characterized in terms of the sets τ_0, τ_1 and τ_2 defined above. We refer to [28] for a detailed proof of these characterizations.

Proposition 3.11 Let (x_0, u_0) ∈ S and suppose μ ∈ U_q is such that μ_α(t) ≥ 0 and μ_α(t)ϕ_α(u_0(t)) = 0 (α ∈ R, t ∈ T). Then (x_0, u_0) is a normal process of S_0 (respectively S_1, S) if and only if z ≡ 0 is the only solution of the system

ż(t) = −A*(t)z(t),   z*(t)B(t)h ≤ 0 for all h ∈ τ_0(u_0(t)) (t ∈ T)

(respectively h ∈ τ_1(u_0(t), μ(t)), h ∈ τ_2(u_0(t))).

As one readily verifies (see also [28] for details), if (x_0, u_0) is normal relative to S_0 (or strongly normal) then it is normal relative to S_1 (with μ as above), which in turn implies normality relative to S (or weak normality).

It is important to mention, at this stage, that second order conditions in terms of R_{S_0}(x_0, u_0) assuming normality relative to S_0 are well established in the literature (see, for example, [15]). The result is similar to Theorem 2.12 and it can be stated as follows.

Theorem 3.12 Let (x_0, u_0) ∈ S and suppose (p, μ) ∈ X × U_q is such that (x_0, u_0, p, μ) ∈ E. If (x_0, u_0) solves P(S) and is a normal process of S_0, then J((x_0, u_0, p, μ); (y, v)) ≥ 0 for all (y, v) ∈ R_{S_0}(x_0, u_0).

As we show next, we are able to provide two examples which illustrate a crucial aspect of the theory, namely, that if we modify the assumptions or the space where the nonnegativity of the quadratic form holds, the conclusion of Theorem 3.12 may no longer hold. Let us begin by slightly changing the assumptions. In this first example, we exhibit a solution (x_0, u_0) to a problem P(S) which is normal relative to S, together with a pair (p, μ) such that (x_0, u_0, p, μ) ∈ E, but J((x_0, u_0, p, μ); (y, v)) < 0 for some (y, v) ∈ R_{S_0}(x_0, u_0).

Example 3.13 Let a > 0, k > 1 and consider the problem of minimizing

I(x, u) = ∫_0^1 {u_2(t) − u_1(t)} dt

subject to (x, u) ∈ Z, x(0) = x(1) = 0, and

ẋ(t) = 2u_2(t) − (k + 1)u_1(t) + a u_3²(t) (t ∈ [0, 1]),
u_2(t) ≥ u_1(t),   k u_1(t) ≥ u_2(t) (t ∈ [0, 1]).

In this case T = [0, 1], n = 1, m = 3, r = q = 2, ξ_0 = ξ_1 = 0 and, for all t ∈ T, x ∈ R, and u = (u_1, u_2, u_3),

L(t, x, u) = u_2 − u_1,   f(t, x, u) = 2u_2 − (k + 1)u_1 + a u_3²,
ϕ_1(u) = u_1 − u_2,   ϕ_2(u) = u_2 − k u_1.

We have

H(t, x, u, p, μ) = p(2u_2 − (k + 1)u_1 + a u_3²) + u_1 − u_2 − μ_1(u_1 − u_2) − μ_2(u_2 − k u_1)

and so

H_u(t, x, u, p, μ) = (−p(k + 1) − μ_1 + k μ_2 + 1, 2p + μ_1 − μ_2 − 1, 2apu_3),

H_uu(t, x, u, p, μ) = diag(0, 0, 2ap),

so that, for all (x, u, p, μ) ∈ Z × X × U_2 and (y, v) ∈ Z,

J((x, u, p, μ); (y, v)) = ∫_0^1 −2ap(t)v_3²(t) dt.

Clearly (x_0, u_0) ≡ (0, 0) is a solution to the problem. Note that (x_0, u_0, p, μ) is an extremal if μ_α(t) ≥ 0 (α = 1, 2), ṗ(t) = 0 and f_u*(x̃_0(t))p(t) = L_u*(x̃_0(t)) + ϕ*(u_0(t))μ(t) (t ∈ T). This last relation corresponds to

(−(k + 1), 2, 0)* p(t) = (−1 + μ_1(t) − k μ_2(t), 1 − μ_1(t) + μ_2(t), 0)*

and thus, if μ = (μ_1, μ_2) ≡ (0, 1) and p ≡ 1, then (x_0, u_0, p, μ) ∈ E. Now, since ϕ'_1(u) = (1, −1, 0) and ϕ'_2(u) = (−k, 1, 0), we have

τ_0(u_0(t)) = {h ∈ R³ : h_1 − h_2 = 0, h_2 − k h_1 = 0},
τ_1(u_0(t), μ(t)) = {h ∈ R³ : h_1 − h_2 ≤ 0, h_2 − k h_1 = 0},
τ_2(u_0(t)) = {h ∈ R³ : h_1 − h_2 ≤ 0, h_2 − k h_1 ≤ 0}.

Since f_x(x̃_0(t)) = 0 and f_u(x̃_0(t)) = (−(k + 1), 2, 0), the system

ż(t) = −A*(t)z(t) = 0,   z*(t)B(t)h = z(t)(−(k + 1)h_1 + 2h_2) = 0 for all (h_1, h_2, h_3) ∈ τ_0(u_0(t)) (t ∈ T)

has nontrivial solutions and so (x_0, u_0) is not normal relative to S_0. On the other hand, z ≡ 0 is the only solution to the system

ż(t) = 0,   z(t)(−(k + 1)h_1 + 2h_2) ≤ 0 for all (h_1, h_2, h_3) ∈ τ_2(u_0(t)) (t ∈ T),

since both (1, 1, 0) and (1, k, 0) belong to τ_2(u_0(t)), implying that z(t) ≥ 0 and z(t) ≤ 0 (t ∈ T) respectively, so that (x_0, u_0) is normal relative to S. Finally, the system

ż(t) = 0,   z(t)(−(k + 1)h_1 + 2h_2) ≤ 0 for all (h_1, h_2, h_3) ∈ τ_1(u_0(t), μ(t)) (t ∈ T)

corresponds to ż(t) = 0, z(t)(h_2 − h_1) ≤ 0 with h_2 − h_1 ≥ 0, which has nontrivial solutions, and so (x_0, u_0, μ) is not a normal point of S_1.

Let v = (v_1, v_2, v_3) ≡ (0, 0, 1) and y ≡ 0. Then (y, v) solves

ẏ(t) = −(k + 1)v_1(t) + 2v_2(t) (t ∈ T),   y(0) = y(1) = 0,   v(t) ∈ τ_0(u_0(t)),

and J((x_0, u_0, p, μ); (y, v)) = −2a < 0.
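A hedged numerical check of Example 3.13 is sketched below (a = 1 and k = 2 are illustrative choices): with p ≡ 1, μ ≡ (0, 1), y ≡ 0 and v ≡ (0, 0, 1), the pair (y, v) satisfies the linearized dynamics with zero endpoints and v(t) ∈ τ_0(u_0(t)), while the second variation equals −2a.

```python
import numpy as np

# Numerical check of Example 3.13 (a sketch; a = 1, k = 2 chosen for illustration).
a, k = 1.0, 2.0
t = np.linspace(0.0, 1.0, 1001)
p = np.ones_like(t)                          # extremal costate p ≡ 1
v = np.zeros((3, t.size)); v[2] = 1.0        # v ≡ (0, 0, 1), so v(t) lies in tau_0(u0(t))
ydot = -(k + 1) * v[0] + 2 * v[1]            # linearized dynamics y' = -(k+1)v1 + 2v2
y = np.concatenate([[0.0], np.cumsum(0.5 * (ydot[1:] + ydot[:-1]) * np.diff(t))])
def trap(f, s):                              # simple trapezoidal rule
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(s)))
J = trap(-2.0 * a * p * v[2] ** 2, t)        # J = ∫ -2 a p v3^2 dt
print("y(1) =", y[-1])                       # 0.0: endpoint condition holds
print("J =", J)                              # -2.0 = -2a < 0
```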

Our second example deals with a change, in Theorem 3.12, of the critical cone where the second order conditions hold. It shows that a solution to the problem which is a normal process of S_0 may yield a negative second variation on R_S(x_0, u_0).

Example 3.14 Consider the problem of minimizing

I(x, u) = ∫_0^1 −exp(−u_2(t)) dt

subject to (x, u) ∈ Z,

ẋ(t) = u_1(t) + u_2²(t) (t ∈ [0, 1]),   x(0) = x(1) = 0,   u_2(t) ≥ 0 (t ∈ [0, 1]).

In this case T = [0, 1], n = 1, m = 2, r = q = 1, ξ_0 = ξ_1 = 0 and, for all t ∈ T, x ∈ R, and u = (u_1, u_2),

L(t, x, u) = −e^{−u_2},   f(t, x, u) = u_1 + u_2²,   ϕ_1(u) = −u_2.

We have

H(t, x, u, p, μ) = p(u_1 + u_2²) + μ_1 u_2 + e^{−u_2}

and so

H_u(t, x, u, p, μ) = (p, 2pu_2 + μ_1 − e^{−u_2}),   H_uu(t, x, u, p, μ) = diag(0, 2p + e^{−u_2}),

so that, for all (x, u, p, μ) ∈ Z × X × U_1 and (y, v) ∈ Z,

J((x, u, p, μ); (y, v)) = ∫_0^1 −(2p(t) + e^{−u_2(t)})v_2²(t) dt.

Clearly (x_0, u_0) ≡ (0, 0) solves the problem. Also, since f_u(t, x, u) = (1, 2u_2), L_u(t, x, u) = (0, e^{−u_2}) and ϕ'(u) = (0, −1), (x_0, u_0, p, μ) is an extremal if ṗ(t) = 0 and

(1, 0)* p(t) = (0, 1)* + (0, −1)* μ(t).

Thus, if (p, μ) ≡ (0, 1), then (x_0, u_0, p, μ) ∈ E. Observe now that

τ_0(u_0(t)) = τ_1(u_0(t), μ(t)) = {h ∈ R² : h_2 = 0},   τ_2(u_0(t)) = {h ∈ R² : h_2 ≥ 0}.

Since z ≡ 0 is the only solution to the system ż(t) = 0, z(t)h_1 ≤ 0 for all (h_1, h_2) ∈ τ_0(u_0(t)) (t ∈ T), (x_0, u_0) is normal relative to S_0. Now, if v ≡ (0, 1) and y ≡ 0 then v(t) ∈ τ_2(u_0(t)) and ẏ(t) = v_1(t) (t ∈ T), y(0) = y(1) = 0, but

J((x_0, u_0, p, μ); (y, v)) = −1 < 0.
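Similarly, a brief hedged check of Example 3.14 (a sketch under the reading of the data given above): with (p, μ) ≡ (0, 1) the stationarity condition H_u = 0 holds at (x_0, u_0) ≡ (0, 0), and the admissible variation v ≡ (0, 1), y ≡ 0 gives J = −1.

```python
import numpy as np

# Hedged numerical check of Example 3.14 (a sketch): at (x0, u0) ≡ (0, 0)
# with (p, mu) ≡ (0, 1) the stationarity condition H_u = 0 holds, and for
# v ≡ (0, 1), y ≡ 0 (so y' = v1 = 0, y(0) = y(1) = 0, and v(t) lies in tau_2)
# the second variation J = ∫_0^1 -(2p + e^{-u2}) v2^2 dt equals -1 < 0.
p, mu, u2 = 0.0, 1.0, 0.0
H_u = np.array([p, 2 * p * u2 + mu - np.exp(-u2)])
J = -(2 * p + np.exp(-u2)) * 1.0            # constant integrand over [0, 1]
print("H_u =", H_u)                         # [0. 0.]
print("J =", J)                             # -1.0
```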

In spite of these examples, Theorem 3.12 clearly looks too weak in comparison with the main result that holds in the finite dimensional case, namely, Theorem 2.9. In the latter, one assumes normality relative to S_1 and one can assure that the second variation is nonnegative on the set of tangential constraints with respect to S_1. In the former, normality is assumed relative to S_0 (in general a stronger assumption than normality relative to S_1) and the condition on the second variation holds on the tangential constraints with respect to S_0 (in general a smaller set than that with respect to S_1). We pose the question of whether a result analogous to Theorem 2.9 holds for our optimal control problem, that is, whether the following statement is valid or not.

Conjecture 3.15 Let (x_0, u_0) ∈ S and suppose (p, μ) ∈ X × U_q is such that (x_0, u_0, p, μ) ∈ E. Let

S_1 := {(x, u) ∈ S : ϕ_α(u(t)) = 0 (α ∈ R with μ_α(t) > 0, t ∈ T)}.

If (x_0, u_0) solves P(S) and is normal relative to S_1, then J((x_0, u_0, p, μ); (y, v)) ≥ 0 for all (y, v) ∈ R_{S_1}(x_0, u_0).

Clearly, the role played by R_{S_1}(x_0, u_0) above is crucial. We shall end this section by proving a new result, mentioned in the introduction, which entirely characterizes this set of critical directions. In particular, we shall conclude that, though the set S_1(μ) of constraints depends on the multiplier μ, this dependence is no longer present in the set R_{S_1}(x_0, u_0) of tangential constraints relative to S_1(μ). In the characterization given below, we shall make use of the first variation of I given by

I'(x_0, u_0; y, v) = ∫_{t_0}^{t_1} {L_x(x̃_0(t))y(t) + L_u(x̃_0(t))v(t)} dt.

For all (x_0, u_0) ∈ S, denote by M(x_0, u_0) the set of all (p, μ) ∈ X × U_q such that (x_0, u_0, p, μ) ∈ E, that is,

a. μ_α(t) ≥ 0 and μ_α(t)ϕ_α(u_0(t)) = 0 (α ∈ R, t ∈ T);
b. ṗ(t) = −A*(t)p(t) + L_x*(x̃_0(t)) (t ∈ T);
c. B*(t)p(t) = L_u*(x̃_0(t)) + ϕ*(u_0(t))μ(t) (t ∈ T),

where A(t) = f_x(x̃_0(t)), B(t) = f_u(x̃_0(t)). Recall that

R_S(x_0, u_0) = {(y, v) ∈ L(x_0, u_0) : ϕ'_α(u_0(t))v(t) ≤ 0 (α ∈ I_a(u_0(t)), t ∈ T), ϕ'_β(u_0(t))v(t) = 0 (β ∈ Q, t ∈ T)},

where

L(x_0, u_0) := {(y, v) ∈ Z : ẏ(t) = A(t)y(t) + B(t)v(t) (t ∈ T), y(t_0) = y(t_1) = 0}

and therefore, given (p, μ) ∈ M(x_0, u_0), since

S_1 [= S_1(μ)] := {(x, u) ∈ S : ϕ_α(u(t)) = 0 (α ∈ R with μ_α(t) > 0, t ∈ T)},

it follows that

R_{S_1}(x_0, u_0) = {(y, v) ∈ R_S(x_0, u_0) : ϕ'_α(u_0(t))v(t) = 0 (α ∈ R with μ_α(t) > 0, t ∈ T)}.

Theorem 3.16 Suppose (x_0, u_0) ∈ S and (p, μ) ∈ M(x_0, u_0). Then

R_{S_1(μ)}(x_0, u_0) = {(y, v) ∈ R_S(x_0, u_0) : I'(x_0, u_0; y, v) = 0}.

In particular, R_{S_1(μ)}(x_0, u_0) = R_{S_1(ν)}(x_0, u_0) for any (q, ν) ∈ M(x_0, u_0).

Proof: We proceed first as in the proof of Theorem 3.5. Define

K(x, u) := ⟨p(t_1), ξ_1⟩ − ⟨p(t_0), ξ_0⟩ + ∫_{t_0}^{t_1} F(t, x(t), u(t)) dt   ((x, u) ∈ Z)

where, for all (t, x, u) ∈ T × R^n × R^m,

F(t, x, u) := L(t, x, u) − ⟨p(t), f(t, x, u)⟩ + ⟨μ(t), ϕ(u)⟩ − ⟨ṗ(t), x⟩.

Since F(t, x, u) = −H(t, x, u, p(t), μ(t), 1) − ⟨ṗ(t), x⟩, we have

F_x(x̃_0(t)) = −H_x(x̃_0(t), p(t), μ(t), 1) − ṗ*(t) = 0,   F_u(x̃_0(t)) = −H_u(x̃_0(t), p(t), μ(t), 1) = 0

and, consequently, K'(x_0, u_0; y, v) = 0 for all (y, v) ∈ Z. From this equality we obtain that

0 = ∫_{t_0}^{t_1} {[L_x(x̃_0(t)) − p*(t)A(t) − ṗ*(t)]y(t) + [L_u(x̃_0(t)) − p*(t)B(t) + μ*(t)ϕ'(u_0(t))]v(t)} dt
  = I'(x_0, u_0; y, v) − ∫_{t_0}^{t_1} {p*(t)(A(t)y(t) + B(t)v(t)) + ṗ*(t)y(t)} dt + ∫_{t_0}^{t_1} μ*(t)ϕ'(u_0(t))v(t) dt.

If (y, v) ∈ L(x_0, u_0), this implies that

I'(x_0, u_0; y, v) + ∫_{t_0}^{t_1} μ*(t)ϕ'(u_0(t))v(t) dt = 0.

If also (y, v) ∈ R_S(x_0, u_0), we have

I'(x_0, u_0; y, v) + ∫_{t_0}^{t_1} Σ_{α∈R} μ_α(t)ϕ'_α(u_0(t))v(t) dt = 0.

Since μ_α(t)ϕ'_α(u_0(t))v(t) ≤ 0 (α ∈ R, t ∈ T), we conclude that (y, v) belongs to the set {(y, v) ∈ R_S(x_0, u_0) : I'(x_0, u_0; y, v) = 0} if and only if

ϕ'_α(u_0(t))v(t) = 0 (α ∈ R with μ_α(t) > 0, t ∈ T),

that is, if and only if (y, v) ∈ R_{S_1(μ)}(x_0, u_0).

4 Lagrangian independent of the state

The particular case where the Lagrangian does not depend on the state variable has been studied in a recent paper [27]. It provides an excellent illustration of how the theory of first and second order necessary conditions for the finite dimensional case can be applied to optimal control problems. Moreover, it implies a second order condition stronger than the one given in Conjecture 3.15, since the assumption of normality of a solution (x_0, u_0) relative to S_1 implies the nonnegativity of the second variation on a set which in general strictly contains R_{S_1}(x_0, u_0). In this section we derive some consequences of this result (see also [13, 14], where the constraints may depend not only on the control functions but also on the time variable).

Let us then consider the case where L(t, x, u) = L(t, u), i.e., L does not depend on x. Our problem P(S) is that of minimizing the functional I(u) = ∫_{t_0}^{t_1} L(t, u(t)) dt on S, that is, subject to (x, u) ∈ Z and

ẋ(t) = f(t, x(t), u(t)) (t ∈ T);   x(t_0) = ξ_0,   x(t_1) = ξ_1;
ϕ_α(u(t)) ≤ 0,   ϕ_β(u(t)) = 0 (α ∈ R, β ∈ Q, t ∈ T).

Our assumptions include that L, f and ϕ are C² and that the q × (m+r)-dimensional matrix

( ∂ϕ_i/∂u_k   δ_{iα} ϕ_α )   (i = 1,..., q; α = 1,..., r; k = 1,..., m)

has rank q on the set of points u ∈ R^m satisfying ϕ_α(u) ≤ 0 (α ∈ R), ϕ_β(u) = 0 (β ∈ Q). Let

C := {u ∈ U_m : (t, u(t)) ∈ A (t ∈ T)},

where A = {(t, u) ∈ T × R^m : ϕ_α(u) ≤ 0 (α ∈ R), ϕ_β(u) = 0 (β ∈ Q)}.

Denote by (C) the problem of minimizing I(u) = ∫_{t_0}^{t_1} L(t, u(t)) dt over C. Note that, if (x_0, u_0) is admissible for P(S) and u_0 solves (C), that is, I(u_0) ≤ I(u) for all u ∈ C, then (x_0, u_0) solves P(S), that is, I(u_0) ≤ I(u) for all u ∈ C such that, if ẋ(t) = f(t, x(t), u(t)) (t ∈ T) and x(t_0) = ξ_0, then x(t_1) = ξ_1. Clearly, the converse may not hold. Two simple examples illustrate this fact.

Example 4.1 Consider the problem of minimizing I(u) = ∫_0^1 u(t) dt subject to

ẋ(t) = u(t),   x(0) = x(1) = 0,   u(t) ≤ 0.

Then (x_0, u_0) ≡ (0, 0) solves P(S), being the only admissible process, but u_0 ≡ 0 does not solve (C), that is, it does not minimize I over the set C = {u ∈ U_1 : u(t) ≤ 0 (t ∈ T)}. Actually, in this example, u_0 maximizes I on C. Note also that (x_0, u_0) is not normal with respect to S since

τ_2(u_0) = {h ∈ R : h ≤ 0}

and so the system

ż(t) = 0,   z(t)h ≤ 0 for all h ∈ τ_2(u_0(t)) (t ∈ T)

has nontrivial solutions.

Example 4.2 Consider the problem of minimizing I(u) = ∫_0^2 (t − 1)u(t) dt subject to

ẋ(t) = u²(t),   x(0) = x(2) = 0,   u²(t) ≤ 1.

Then (x_0, u_0) ≡ (0, 0) solves P(S) but u_0 ≡ 0 does not solve (C) since it does not minimize I over the set C = {u ∈ U_1 : −1 ≤ u(t) ≤ 1 (t ∈ T)}. In contrast with the previous example, here u_0 neither minimizes nor maximizes I on C. As one readily verifies, the solution to (C) is given by the function which equals 1 on [0, 1] and −1 on [1, 2] (for the maximum the signs are reversed). As before, clearly (x_0, u_0) is not normal with respect to S since f_u(t, x_0(t), u_0(t)) = 0.

Let us now state the result on second order conditions for problem P(S) proved in [27].
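Before stating it, note that since L does not depend on x, problem (C) decouples pointwise in t. The following hedged sketch illustrates this for Example 4.2 (the time grid is an assumed discretization): minimizing (t − 1)u over |u| ≤ 1 for each t yields the bang-bang control described above, whose cost is strictly below I(u_0) = 0.

```python
import numpy as np

# Pointwise minimization for problem (C) of Example 4.2 (a sketch):
# minimize ∫_0^2 (t - 1) u(t) dt over |u(t)| <= 1.
t = np.linspace(0.0, 2.0, 2001)
u_star = np.where(t < 1.0, 1.0, -1.0)       # minimizes (t - 1)u pointwise
def trap(f, s):                             # simple trapezoidal rule
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(s)))
print("I(u*) =", trap((t - 1.0) * u_star, t))            # -1.0
print("I(u0) =", trap((t - 1.0) * np.zeros_like(t), t))  # 0.0 for u0 ≡ 0
```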

Theorem 4.3 Suppose (x_0, u_0) solves P(S) and (p, μ) ∈ X × U_q satisfies

a. μ_α(t) ≥ 0 and μ_α(t)ϕ_α(u_0(t)) = 0 (α ∈ R, t ∈ T);
b. ṗ(t) = −A*(t)p(t) (t ∈ T);
c. p*(t)B(t) = L_u(t, u_0(t)) + μ*(t)ϕ'(u_0(t)) (t ∈ T).

Suppose also that u_0 solves (C), and consider the following statements:

i. (x_0, u_0) is normal relative to S_1(μ).
ii. p ≡ 0.
iii. J((x_0, u_0, p, μ); (y, v)) ≥ 0 for all (y, v) ∈ Z with v(t) ∈ τ_1(u_0(t), μ(t)) (t ∈ T).

Then (i) ⇒ (ii) ⇒ (iii).

It should be noted that the assumptions in this theorem imply that the second variation is nonnegative on the set of pairs (y, v) in Z satisfying v(t) ∈ τ_1(u_0(t), μ(t)) (t ∈ T). This holds in contrast with the conclusion of Conjecture 3.15, where the nonnegativity of that function requires (y, v) to also satisfy the linear equation ẏ(t) = A(t)y(t) + B(t)v(t) (t ∈ T) together with the endpoint conditions y(t_0) = y(t_1) = 0.

Observe also that, as Example 4.1 illustrates, we may have a solution (x_0, u_0) to our problem P(S) for which u_0 does not solve problem (C) and, moreover, u_0 maximizes I on C. Also, in that example, u_0 ≡ 0 maximizes (t, u) ↦ u on A and, in Example 4.2, u_0 ≡ 0 minimizes (t, u) ↦ u² on A. Apart from this, recall that, in Theorem 4.3, we are assuming normality relative to S_1, which in turn implies normality relative to S. These facts motivate the following result, assuming that f(t, x, u) = f(t, u), i.e., f is independent of x.

Proposition 4.4 Let (x_0, u_0) be an admissible process and suppose that, for some i ∈ {1,..., n}, f_i(t, u_0(t)) ≤ f_i(t, u) (or f_i(t, u_0(t)) ≥ f_i(t, u)) whenever (t, u) ∈ A. Then (x_0, u_0) is not normal relative to S.

Proof: Suppose, without loss of generality, that f_i(t, u_0(t)) ≤ f_i(t, u) whenever (t, u) ∈ A. Then there exists a unique ν ∈ U_q such that

f_u^i(t, u_0(t)) + ν*(t)ϕ'(u_0(t)) = 0 (t ∈ T).

Moreover, ν_α(t) ≥ 0 and ν_α(t)ϕ_α(u_0(t)) = 0 (α ∈ R, t ∈ T). By definition,

τ_2(u) := {h ∈ R^m : ϕ'_i(u)h ≤ 0 (i ∈ I_a(u)), ϕ'_j(u)h = 0 (j ∈ Q)}

and (x_0, u_0) is a normal point of S if z ≡ 0 is the only solution to the system

ż(t) = 0,   z*(t)f_u(t, u_0(t))h ≤ 0 for all h ∈ τ_2(u_0(t)) (t ∈ T).

Let z_i(t) = −1 and z_j(t) = 0 (j ≠ i). Then

z*(t)f_u(t, u_0(t))h = −f_u^i(t, u_0(t))h = ν*(t)ϕ'(u_0(t))h ≤ 0 for all h ∈ τ_2(u_0(t)),

and therefore the above system has nontrivial solutions. This proves the claim.

The following example illustrates an application of Theorem 4.3, namely, that normality can be ruled out by simply exhibiting an extremal (x_0, u_0, p, μ) for which p ≢ 0.

Example 4.5 Consider the problem of minimizing I(x, u) = ∫_0^1 u_2(t) dt subject to (x, u) ∈ Z and

ẋ(t) = u_1²(t) + u_2(t) − u_3(t) (t ∈ [0, 1]);   x(0) = x(1) = 0;
u_2(t) ≥ 0,   u_3(t) ≥ 0 (t ∈ [0, 1]).

In this case T = [0, 1], n = 1, m = 3, r = q = 2, ξ_0 = ξ_1 = 0 and, for all t ∈ T, x ∈ R, and u = (u_1, u_2, u_3),

L(t, x, u) = u_2,   f(t, x, u) = u_1² + u_2 − u_3,   ϕ_1(u) = −u_2,   ϕ_2(u) = −u_3.

We have

H(t, x, u, p, μ) = p(u_1² + u_2 − u_3) − u_2 + μ_1 u_2 + μ_2 u_3

and so

H_u(t, x, u, p, μ) = (2pu_1, p − 1 + μ_1, −p + μ_2).

Clearly (x_0, u_0) ≡ (0, 0) is a solution to both problems P(S) and (C). Let μ = (μ_1, μ_2) ≡ (0, 1) and p ≡ 1. Then, as one readily verifies, (x_0, u_0, p, μ) ∈ E, that is, the first conditions of Theorem 4.3 hold. Since p ≢ 0, we conclude that the solution (x_0, u_0) is not normal relative to S_1. Actually, for this example, the pair (x_0, u_0) is normal relative to S and, if v = (v_1, v_2, v_3) ≡ (1, 0, 0) and y ≡ 0, then (y, v) solves

ẏ(t) = v_2(t) − v_3(t) (t ∈ T),   y(0) = y(1) = 0,   v(t) ∈ τ_0(u_0(t)),

and J((x_0, u_0, p, μ); (y, v)) = −2 < 0.

Theorem 4.3 gives rise to several surmises and questions which we will solve completely by means of some illustrative examples. Some of these issues were first posed, and left open, in [27]. A natural question is whether, in Theorem 4.3, the assumption that u_0 solves (C) becomes redundant once normality relative to S_1 is imposed; in other words, whether, if (x_0, u_0) solves P(S), (x_0, u_0, p, μ) is an extremal, and (i) holds, then necessarily u_0 solves (C). Our first example answers this question. We provide a solution (x_0, u_0) to P(S) together with an extremal which not only satisfies (i) but (x_0, u_0) is strongly normal, and u_0 is not a solution to (C).
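Returning briefly to Example 4.5, the following hedged sketch verifies numerically that (x_0, u_0, p, μ) with p ≡ 1 and μ ≡ (0, 1) satisfies the stationarity condition, so that normality relative to S_1(μ) is ruled out, and that the indicated variation gives J = −2.

```python
import numpy as np

# Hedged numerical check of Example 4.5 (a sketch): with p ≡ 1 and mu ≡ (0, 1)
# the extremal conditions hold at (x0, u0) ≡ (0, 0), yet p ≢ 0, so by
# Theorem 4.3 (x0, u0) cannot be normal relative to S1(mu).  For
# v ≡ (1, 0, 0), y ≡ 0 the second variation J = ∫_0^1 -2 p v1^2 dt = -2 < 0.
p, mu = 1.0, np.array([0.0, 1.0])
u0 = np.zeros(3)
H_u = np.array([2 * p * u0[0],              # dH/du1
                p - 1.0 + mu[0],            # dH/du2
                -p + mu[1]])                # dH/du3
v = np.array([1.0, 0.0, 0.0])
ydot = v[1] - v[2]                          # linearized dynamics: y' = v2 - v3
J = -2.0 * p * v[0] ** 2 * 1.0              # constant integrand over [0, 1]
print("H_u =", H_u)                         # [0. 0. 0.]
print("y' =", ydot, " J =", J)              # 0.0  -2.0
```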

Example 4.6 (x_0, u_0) solves P(S) and is strongly normal, but u_0 does not solve (C). Consider the problem P(S) of minimizing I(x, u) = ∫_0^1 u(t) dt subject to

ẋ(t) = u(t)x(t) (t ∈ [0, 1]),   x(0) = 1,   x(1) = e,   u(t) ≥ 0 (t ∈ [0, 1]).

For this problem T = [0, 1], ξ_0 = 1, ξ_1 = e, L(t, u) = u, f(t, x, u) = ux, ϕ(u) = −u. Also S = {(x, u) ∈ D : u(t) ≥ 0 (t ∈ T)} where

D = {(x, u) ∈ Z : ẋ(t) = u(t)x(t), x(0) = 1, x(1) = e}.

To begin with, note that, if (x, u) is admissible, then x(t) = exp(∫_0^t u(s) ds) and x(1) = e = exp(∫_0^1 u(t) dt), and so

∫_0^1 u(t) dt = I(x, u) = 1.

Therefore (x_0(t), u_0(t)) := (e^t, 1) (t ∈ T) is a solution to the problem P(S) and, clearly, u_0 ≡ 1 is not a solution to (C). Now, f_x(x̃_0(t)) = 1, f_u(x̃_0(t)) = e^t, and ϕ(u_0(t)) = −1 < 0. Since, for this solution, there are no active constraints, S_0 = D. Clearly (x_0, u_0) is a normal process of S_0 since, given (p, μ) satisfying

μ(t)(−1) = 0,   ṗ(t) = −p(t),   e^t p(t) = −μ(t) = 0 (t ∈ [0, 1]),

then p ≡ 0.

This example illustrates some more features of the theorem. In particular, we can exhibit a pair (p, μ) such that (x_0, u_0, p, μ) is an extremal for which (i) is satisfied, but (ii) and (iii) do not hold.

Example 4.7 (x_0, u_0) solves P(S), and (p, μ) ∈ M(x_0, u_0) is such that, in Theorem 4.3, (i) is satisfied but (ii) and (iii) do not hold. Moreover, J((x_0, u_0, p, μ); (y, v)) = 0 for all (y, v) ∈ L(x_0, u_0).

Consider the previous example with (x_0, u_0) defined as above. If p(t) = e^{−t} and μ ≡ 0 then (x_0, u_0, p, μ) ∈ E since ṗ(t) = −p(t) and e^t p(t) = 1 (t ∈ T). Also, as seen before, (i) holds but not (ii). Now, we have H = pux − u + μu and so

H_u = px − 1 + μ,   H_x = pu,   H_uu = H_xx = 0,   H_ux = H_xu = p,

all evaluated at (t, x, u, p, μ). Thus 2Ω(t, y, v) = −2p(t)yv and so

J((x_0, u_0, p, μ); (y, v)) = −2 ∫_0^1 e^{−t} y(t)v(t) dt.

Since there are no active constraints, τ_0(u_0(t)) = τ_1(u_0(t), μ(t)) = τ_2(u_0(t)) = R. Thus, if y ≡ v ≡ 1 then clearly (iii) fails to hold. Let us now show that J((x_0, u_0, p, μ); (y, v)) = 0 for all (y, v) ∈ L(x_0, u_0). Let (y, v) ∈ L(x_0, u_0) and note that, by definition, ẏ(t) = y(t) + e^t v(t) (t ∈ T) and y(0) = y(1) = 0. Let g(t) = ∫_0^t v(s) ds. Then y(t) = e^t g(t) and so g(0) = g(1) = 0. Integrating by parts,

∫_0^1 g(t)v(t) dt = g²(t)|_0^1 − ∫_0^1 g(t)v(t) dt,

and therefore

J((x_0, u_0, p, μ), (y, v)) = −2 ∫_0^1 g(t)v(t) dt = 0.

Now, this particular solution gives rise to a different conjecture. It seems to be crucial, for this example, that there are no active constraints. However, we can provide below a different solution for which the previous conclusions also hold but the constraint is active on a subinterval of T.

Example 4.8 The conclusions of Examples 4.6 and 4.7 hold in the presence of active constraints. Consider again problem P(S) of Example 4.7. Let a = 1/2 and define the process (x_1, u_1) by

x_1(t) = e^{2t} if t ∈ [0, a], e if t ∈ [a, 1];   u_1(t) = 2 if t ∈ [0, a], 0 if t ∈ (a, 1].

Clearly (x_1, u_1) is a solution to P(S) but not to (C). To check for normality, suppose (p, μ) ∈ X × U_1 satisfies

i. μ(t)ϕ(u_1(t)) = 0 (t ∈ T);
ii. ṗ(t) = −f_x(x̃_1(t))p(t) and f_u(x̃_1(t))p(t) = ϕ'(u_1(t))μ(t) (t ∈ T).

By (i), μ(t) = 0 on [0, a]. By (ii), since f_x(x̃_1(t)) = u_1(t), f_u(x̃_1(t)) = x_1(t), and ϕ'(u_1(t)) = −1, we have

ṗ(t) = −u_1(t)p(t),   x_1(t)p(t) = −μ(t) (t ∈ T),

which implies, in particular, that e^{2t} p(t) = 0 on [0, a] and ṗ(t) = 0 on (a, 1]. Thus p(t) = 0 on [0, a] and p(t) is constant on (a, 1]. By continuity, p ≡ 0 and so (x_1, u_1) is strongly normal. Now, for this solution, an extremal (x_1, u_1, p, μ) must satisfy

a. μ(t) ≥ 0 and μ(t)ϕ(u_1(t)) = 0 (t ∈ T);
b. ṗ(t) = −u_1(t)p(t) and p(t)x_1(t) = 1 − μ(t) (t ∈ T).

These conditions are clearly satisfied by (p, μ) with μ ≡ 0 and

p(t) = e^{−2t} if t ∈ [0, a],   e^{−1} if t ∈ (a, 1].

Observe now that, since ϕ(u_1(t)) = −u_1(t) (t ∈ T), the constraint is inactive on [0, a] but active on (a, 1]. Thus we have

τ_0(u_1(t)) = R if t ∈ [0, a],   {h : h = 0} if t ∈ (a, 1];
τ_2(u_1(t)) = R if t ∈ [0, a],   {h : h ≥ 0} if t ∈ (a, 1];

and τ_1(u_1(t), μ(t)) = τ_2(u_1(t)) (t ∈ T). As before we have 2Ω(t, y, v) = −2p(t)yv and so, if y ≡ v ≡ 1, then J((x_1, u_1, p, μ); (y, v)) < 0. Now, (y, v) ∈ Z belongs to L(x_1, u_1) if and only if y(0) = y(1) = 0 and

ẏ(t) = 2y(t) + e^{2t} v(t) if t ∈ [0, a],   ẏ(t) = e v(t) if t ∈ (a, 1],

and so, if g(t) = ∫_0^t v(s) ds, then

y(t) = e^{2t} g(t) if t ∈ [0, a],   y(t) = e g(t) if t ∈ (a, 1],

with g(0) = g(1) = 0. We reach the same conclusion as before since

J((x_1, u_1, p, μ), (y, v)) = −2 ∫_0^1 g(t)v(t) dt = 0

for any (y, v) ∈ L(x_1, u_1).

This last solution can be generalized by considering subintervals of length a_n = 1/n and setting

x_n(t) = e^{nt} if t ∈ [0, a_n], e if t ∈ [a_n, 1];   u_n(t) = n if t ∈ [0, a_n], 0 if t ∈ (a_n, 1].

Then (x_n, u_n) is a solution to the problem P(S) and essentially the same conclusions as before will follow.

We have thus shown, in particular, that we may have a strongly normal solution (x_0, u_0) to P(S) such that u_0 does not solve (C). In Example 4.6, however, note that f depends on x and, again, this seems to be fundamental in the analysis. In the following example we reach the same conclusion, but f does not depend on x; a numerical illustration of the identity J = 0 on L(x_0, u_0) obtained above is sketched first.
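The vanishing of J on L(x_0, u_0) in Examples 4.7 and 4.8 can be verified numerically; the sketch below uses the assumed, illustrative admissible variation v(t) = cos 2πt, for which g(t) = ∫_0^t v = sin(2πt)/2π satisfies g(0) = g(1) = 0.

```python
import numpy as np

# Hedged numerical check of the claim in Examples 4.7/4.8 (a sketch): for any
# (y, v) in L(x0, u0) one has y(t) = e^t g(t) with g = ∫ v and g(0) = g(1) = 0,
# and then J = -2 ∫_0^1 e^{-t} y(t) v(t) dt = -2 ∫_0^1 g(t) g'(t) dt = 0.
# Illustrative choice (an assumption; any admissible v works): v(t) = cos(2*pi*t).
t = np.linspace(0.0, 1.0, 20001)
v = np.cos(2 * np.pi * t)
g = np.sin(2 * np.pi * t) / (2 * np.pi)       # g = ∫_0^t v, with g(0) = g(1) = 0
y = np.exp(t) * g                             # solves y' = y + e^t v, y(0) = y(1) = 0
integrand = -2.0 * np.exp(-t) * y * v
J = float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(t)))
print("J =", J)                               # ~0 up to quadrature error
```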

Example 4.9 The previous conclusions hold with the function f independent of x. Consider the problem P(S) of minimizing

I(x, u) = ∫_0^1 {u_1(t) + u_2(t) + u_3(t)} dt

subject to (x, u) ∈ Z and

ẋ(t) = u_3(t) (t ∈ [0, 1]);   x(0) = 0,   x(1) = 1;
u_1(t) − 2u_2(t) ≤ 0,   u_2(t) − u_1(t) ≤ 0 (t ∈ [0, 1]).

For this problem T = [0, 1], ξ_0 = 0, ξ_1 = 1, L(t, u) = u_1 + u_2 + u_3, f(t, x, u) = u_3, ϕ_1(u) = u_1 − 2u_2, ϕ_2(u) = u_2 − u_1. Clearly, (C) has no solution. Let x_0(t) := t and u_0(t) := (0, 0, 1) (t ∈ T). Then (x_0, u_0) is a solution to problem P(S) since, for any admissible (x, u), u_1(t) + u_2(t) ≥ 0 (t ∈ T) and therefore

I(x, u) ≥ ∫_0^1 u_3(t) dt = 1 = I(x_0, u_0).

It is also normal relative to S_0 since

τ_0(u_0(t)) = {h ∈ R³ : h_1 − 2h_2 = 0, −h_1 + h_2 = 0} = {h : h_1 = h_2 = 0}

and so z ≡ 0 is the only solution to the system ż(t) = 0 and z(t)h_3 ≤ 0 for all h_3 ∈ R. Let us now show that one can find an extremal for which (i) of Theorem 4.3 holds but not (ii). This clearly follows if p ≡ 1 and (μ_1, μ_2) ≡ (2, 3), for then (x_0, u_0, p, μ) ∈ E since ṗ(t) = 0 and

L_u(x̃_0(t)) + μ*(t)ϕ'(u_0(t)) = (1, 1, 1) + (2, 3) [ 1 −2 0 ; −1 1 0 ] = (1, 1, 1) + (−1, −1, 0) = (0, 0, 1) = p(t)f_u(x̃_0(t)).

We end this paper with another important property derived from Theorem 4.3. Our final example provides a solution (x_0, u_0) to a problem P(S) which is normal relative to S and such that u_0 solves (C), but (x_0, u_0) is not normal relative to S_1 (nor, in consequence, to S_0) for any pair (p, μ) with (x_0, u_0, p, μ) an extremal.

Example 4.10 (x_0, u_0) solves P(S), it is normal relative to S, and u_0 solves (C), but (x_0, u_0) is not normal relative to S_1(μ) for any (p, μ) ∈ M(x_0, u_0).

Consider the problem P(S) of minimizing I(x, u) = ∫_0^2 (t − 1)u(t) dt subject to

ẋ(t) = u(t) (t ∈ [0, 2]),   x(0) = x(2) = 0,   u²(t) ≤ 1 (t ∈ [0, 2]).

Define the process (x_0, u_0) by

x_0(t) = t if t ∈ [0, 1], 2 − t if t ∈ [1, 2];   u_0(t) = 1 if t ∈ [0, 1], −1 if t ∈ (1, 2].

Clearly (x_0, u_0) is a solution to problem P(S) and u_0 solves (C). It is not normal relative to S_0 since, for example, p ≡ 2 and μ ≡ u_0 show that the system

ṗ(t) = 0,   p(t) = 2u_0(t)μ(t) (t ∈ T)

has nonnull solutions. The conclusion also follows straightforwardly by noting that τ_0(u_0(t)) = {0}, and so the system ż(t) = 0 and z(t)h ≤ 0 for all h ∈ τ_0(u_0(t)) has nonnull solutions. Now, note that

τ_2(u_0(t)) = {h : h ≤ 0} if t ∈ [0, 1],   {h : h ≥ 0} if t ∈ (1, 2],

and so (x_0, u_0) is normal relative to S since z ≡ 0 is the only solution of the system ż(t) = 0 and z(t)h ≤ 0 for all h ∈ τ_2(u_0(t)). Finally, if (x_0, u_0, p, μ) is an extremal, then μ(t) ≥ 0, ṗ(t) = 0 and

p(t) = t − 1 + 2μ(t)u_0(t) (t ∈ T).

This implies, in particular, that p is constant, p ≥ t − 1 on [0, 1] and p ≤ t − 1 on (1, 2]. Hence p ≡ 0 and

μ(t) = (1 − t)/2 if t ∈ [0, 1],   (t − 1)/2 if t ∈ (1, 2].

In this event we have

τ_1(u_0(t), μ(t)) = {0} if t ∈ [0, 1) ∪ (1, 2],   {h : h ≤ 0} if t = 1,

and therefore (x_0, u_0) is not normal relative to S_1(μ).

References

[1] A.V. Arutyunov, Smooth abnormal problems in extremum theory and analysis, Russian Mathematical Surveys, 67 (2012),


DUALIZATION OF SUBGRADIENT CONDITIONS FOR OPTIMALITY DUALIZATION OF SUBGRADIENT CONDITIONS FOR OPTIMALITY R. T. Rockafellar* Abstract. A basic relationship is derived between generalized subgradients of a given function, possibly nonsmooth and nonconvex,

More information

Second Order Optimality Conditions for Constrained Nonlinear Programming

Second Order Optimality Conditions for Constrained Nonlinear Programming Second Order Optimality Conditions for Constrained Nonlinear Programming Lecture 10, Continuous Optimisation Oxford University Computing Laboratory, HT 2006 Notes by Dr Raphael Hauser (hauser@comlab.ox.ac.uk)

More information

Introduction to Optimization Techniques. Nonlinear Optimization in Function Spaces

Introduction to Optimization Techniques. Nonlinear Optimization in Function Spaces Introduction to Optimization Techniques Nonlinear Optimization in Function Spaces X : T : Gateaux and Fréchet Differentials Gateaux and Fréchet Differentials a vector space, Y : a normed space transformation

More information

8. Constrained Optimization

8. Constrained Optimization 8. Constrained Optimization Daisuke Oyama Mathematics II May 11, 2018 Unconstrained Maximization Problem Let X R N be a nonempty set. Definition 8.1 For a function f : X R, x X is a (strict) local maximizer

More information

Existence Of Solution For Third-Order m-point Boundary Value Problem

Existence Of Solution For Third-Order m-point Boundary Value Problem Applied Mathematics E-Notes, 1(21), 268-274 c ISSN 167-251 Available free at mirror sites of http://www.math.nthu.edu.tw/ amen/ Existence Of Solution For Third-Order m-point Boundary Value Problem Jian-Ping

More information

On Weak Pareto Optimality for Pseudoconvex Nonsmooth Multiobjective Optimization Problems

On Weak Pareto Optimality for Pseudoconvex Nonsmooth Multiobjective Optimization Problems Int. Journal of Math. Analysis, Vol. 7, 2013, no. 60, 2995-3003 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ijma.2013.311276 On Weak Pareto Optimality for Pseudoconvex Nonsmooth Multiobjective

More information

Diagonalizing Hermitian Matrices of Continuous Functions

Diagonalizing Hermitian Matrices of Continuous Functions Int. J. Contemp. Math. Sciences, Vol. 8, 2013, no. 5, 227-234 HIKARI Ltd, www.m-hikari.com Diagonalizing Hermitian Matrices of Continuous Functions Justin Cyr 1, Jason Ekstrand, Nathan Meyers 2, Crystal

More information

Ordinary Differential Equation Theory

Ordinary Differential Equation Theory Part I Ordinary Differential Equation Theory 1 Introductory Theory An n th order ODE for y = y(t) has the form Usually it can be written F (t, y, y,.., y (n) ) = y (n) = f(t, y, y,.., y (n 1) ) (Implicit

More information

OPTIMAL CONTROL CHAPTER INTRODUCTION

OPTIMAL CONTROL CHAPTER INTRODUCTION CHAPTER 3 OPTIMAL CONTROL What is now proved was once only imagined. William Blake. 3.1 INTRODUCTION After more than three hundred years of evolution, optimal control theory has been formulated as an extension

More information

LECTURES ON OPTIMAL CONTROL THEORY

LECTURES ON OPTIMAL CONTROL THEORY LECTURES ON OPTIMAL CONTROL THEORY Terje Sund May 24, 2016 CONTENTS 1. INTRODUCTION 2. FUNCTIONS OF SEVERAL VARIABLES 3. CALCULUS OF VARIATIONS 4. OPTIMAL CONTROL THEORY 1 INTRODUCTION In the theory of

More information

Constrained Optimization and Lagrangian Duality

Constrained Optimization and Lagrangian Duality CIS 520: Machine Learning Oct 02, 2017 Constrained Optimization and Lagrangian Duality Lecturer: Shivani Agarwal Disclaimer: These notes are designed to be a supplement to the lecture. They may or may

More information

Chapter 1. Preliminaries. The purpose of this chapter is to provide some basic background information. Linear Space. Hilbert Space.

Chapter 1. Preliminaries. The purpose of this chapter is to provide some basic background information. Linear Space. Hilbert Space. Chapter 1 Preliminaries The purpose of this chapter is to provide some basic background information. Linear Space Hilbert Space Basic Principles 1 2 Preliminaries Linear Space The notion of linear space

More information

On Optimality Conditions for Pseudoconvex Programming in Terms of Dini Subdifferentials

On Optimality Conditions for Pseudoconvex Programming in Terms of Dini Subdifferentials Int. Journal of Math. Analysis, Vol. 7, 2013, no. 18, 891-898 HIKARI Ltd, www.m-hikari.com On Optimality Conditions for Pseudoconvex Programming in Terms of Dini Subdifferentials Jaddar Abdessamad Mohamed

More information

Math 5311 Constrained Optimization Notes

Math 5311 Constrained Optimization Notes ath 5311 Constrained Optimization otes February 5, 2009 1 Equality-constrained optimization Real-world optimization problems frequently have constraints on their variables. Constraints may be equality

More information

Lu u. µ (, ) the equation. has the non-zero solution

Lu u. µ (, ) the equation. has the non-zero solution MODULE 18 Topics: Eigenvalues and eigenvectors for differential operators In analogy to the matrix eigenvalue problem Ax = λx we shall consider the eigenvalue problem Lu = µu where µ is a real or complex

More information

FIRST- AND SECOND-ORDER OPTIMALITY CONDITIONS FOR MATHEMATICAL PROGRAMS WITH VANISHING CONSTRAINTS 1. Tim Hoheisel and Christian Kanzow

FIRST- AND SECOND-ORDER OPTIMALITY CONDITIONS FOR MATHEMATICAL PROGRAMS WITH VANISHING CONSTRAINTS 1. Tim Hoheisel and Christian Kanzow FIRST- AND SECOND-ORDER OPTIMALITY CONDITIONS FOR MATHEMATICAL PROGRAMS WITH VANISHING CONSTRAINTS 1 Tim Hoheisel and Christian Kanzow Dedicated to Jiří Outrata on the occasion of his 60th birthday Preprint

More information

Existence of Solutions for a Class of p(x)-biharmonic Problems without (A-R) Type Conditions

Existence of Solutions for a Class of p(x)-biharmonic Problems without (A-R) Type Conditions International Journal of Mathematical Analysis Vol. 2, 208, no., 505-55 HIKARI Ltd, www.m-hikari.com https://doi.org/0.2988/ijma.208.886 Existence of Solutions for a Class of p(x)-biharmonic Problems without

More information

UC Berkeley Department of Electrical Engineering and Computer Science. EECS 227A Nonlinear and Convex Optimization. Solutions 5 Fall 2009

UC Berkeley Department of Electrical Engineering and Computer Science. EECS 227A Nonlinear and Convex Optimization. Solutions 5 Fall 2009 UC Berkeley Department of Electrical Engineering and Computer Science EECS 227A Nonlinear and Convex Optimization Solutions 5 Fall 2009 Reading: Boyd and Vandenberghe, Chapter 5 Solution 5.1 Note that

More information

NOTES ON CALCULUS OF VARIATIONS. September 13, 2012

NOTES ON CALCULUS OF VARIATIONS. September 13, 2012 NOTES ON CALCULUS OF VARIATIONS JON JOHNSEN September 13, 212 1. The basic problem In Calculus of Variations one is given a fixed C 2 -function F (t, x, u), where F is defined for t [, t 1 ] and x, u R,

More information

First-order optimality conditions for mathematical programs with second-order cone complementarity constraints

First-order optimality conditions for mathematical programs with second-order cone complementarity constraints First-order optimality conditions for mathematical programs with second-order cone complementarity constraints Jane J. Ye Jinchuan Zhou Abstract In this paper we consider a mathematical program with second-order

More information

PUTNAM PROBLEMS DIFFERENTIAL EQUATIONS. First Order Equations. p(x)dx)) = q(x) exp(

PUTNAM PROBLEMS DIFFERENTIAL EQUATIONS. First Order Equations. p(x)dx)) = q(x) exp( PUTNAM PROBLEMS DIFFERENTIAL EQUATIONS First Order Equations 1. Linear y + p(x)y = q(x) Muliply through by the integrating factor exp( p(x)) to obtain (y exp( p(x))) = q(x) exp( p(x)). 2. Separation of

More information

z x = f x (x, y, a, b), z y = f y (x, y, a, b). F(x, y, z, z x, z y ) = 0. This is a PDE for the unknown function of two independent variables.

z x = f x (x, y, a, b), z y = f y (x, y, a, b). F(x, y, z, z x, z y ) = 0. This is a PDE for the unknown function of two independent variables. Chapter 2 First order PDE 2.1 How and Why First order PDE appear? 2.1.1 Physical origins Conservation laws form one of the two fundamental parts of any mathematical model of Continuum Mechanics. These

More information

Characterizations of Solution Sets of Fréchet Differentiable Problems with Quasiconvex Objective Function

Characterizations of Solution Sets of Fréchet Differentiable Problems with Quasiconvex Objective Function Characterizations of Solution Sets of Fréchet Differentiable Problems with Quasiconvex Objective Function arxiv:1805.03847v1 [math.oc] 10 May 2018 Vsevolod I. Ivanov Department of Mathematics, Technical

More information

5. Duality. Lagrangian

5. Duality. Lagrangian 5. Duality Convex Optimization Boyd & Vandenberghe Lagrange dual problem weak and strong duality geometric interpretation optimality conditions perturbation and sensitivity analysis examples generalized

More information

Existence and uniqueness: Picard s theorem

Existence and uniqueness: Picard s theorem Existence and uniqueness: Picard s theorem First-order equations Consider the equation y = f(x, y) (not necessarily linear). The equation dictates a value of y at each point (x, y), so one would expect

More information

On the Coercive Functions and Minimizers

On the Coercive Functions and Minimizers Advanced Studies in Theoretical Physics Vol. 11, 17, no. 1, 79-715 HIKARI Ltd, www.m-hikari.com https://doi.org/1.1988/astp.17.71154 On the Coercive Functions and Minimizers Carlos Alberto Abello Muñoz

More information

Observer design for a general class of triangular systems

Observer design for a general class of triangular systems 1st International Symposium on Mathematical Theory of Networks and Systems July 7-11, 014. Observer design for a general class of triangular systems Dimitris Boskos 1 John Tsinias Abstract The paper deals

More information

A Note of the Strong Convergence of the Mann Iteration for Demicontractive Mappings

A Note of the Strong Convergence of the Mann Iteration for Demicontractive Mappings Applied Mathematical Sciences, Vol. 10, 2016, no. 6, 255-261 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ams.2016.511700 A Note of the Strong Convergence of the Mann Iteration for Demicontractive

More information

MATHEMATICAL ECONOMICS: OPTIMIZATION. Contents

MATHEMATICAL ECONOMICS: OPTIMIZATION. Contents MATHEMATICAL ECONOMICS: OPTIMIZATION JOÃO LOPES DIAS Contents 1. Introduction 2 1.1. Preliminaries 2 1.2. Optimal points and values 2 1.3. The optimization problems 3 1.4. Existence of optimal points 4

More information

Constrained Optimization

Constrained Optimization 1 / 22 Constrained Optimization ME598/494 Lecture Max Yi Ren Department of Mechanical Engineering, Arizona State University March 30, 2015 2 / 22 1. Equality constraints only 1.1 Reduced gradient 1.2 Lagrange

More information

Stieltjes Transformation as the Iterated Laplace Transformation

Stieltjes Transformation as the Iterated Laplace Transformation International Journal of Mathematical Analysis Vol. 11, 2017, no. 17, 833-838 HIKARI Ltd, www.m-hikari.com https://doi.org/10.12988/ijma.2017.7796 Stieltjes Transformation as the Iterated Laplace Transformation

More information

Summary Notes on Maximization

Summary Notes on Maximization Division of the Humanities and Social Sciences Summary Notes on Maximization KC Border Fall 2005 1 Classical Lagrange Multiplier Theorem 1 Definition A point x is a constrained local maximizer of f subject

More information

Lecture Note 5: Semidefinite Programming for Stability Analysis

Lecture Note 5: Semidefinite Programming for Stability Analysis ECE7850: Hybrid Systems:Theory and Applications Lecture Note 5: Semidefinite Programming for Stability Analysis Wei Zhang Assistant Professor Department of Electrical and Computer Engineering Ohio State

More information

Necessary optimality conditions for optimal control problems with nonsmooth mixed state and control constraints

Necessary optimality conditions for optimal control problems with nonsmooth mixed state and control constraints Necessary optimality conditions for optimal control problems with nonsmooth mixed state and control constraints An Li and Jane J. Ye Abstract. In this paper we study an optimal control problem with nonsmooth

More information

Approximation by weighted polynomials

Approximation by weighted polynomials Approximation by weighted polynomials David Benko Abstract It is proven that if xq (x) is increasing on (, + ) and w(x) = exp( Q) is the corresponding weight on [, + ), then every continuous function that

More information

Geometric Optimal Control with Applications

Geometric Optimal Control with Applications Geometric Optimal Control with Applications Accelerated Graduate Course Institute of Mathematics for Industry, Kyushu University, Bernard Bonnard Inria Sophia Antipolis et Institut de Mathématiques de

More information

Nonlinear Programming and the Kuhn-Tucker Conditions

Nonlinear Programming and the Kuhn-Tucker Conditions Nonlinear Programming and the Kuhn-Tucker Conditions The Kuhn-Tucker (KT) conditions are first-order conditions for constrained optimization problems, a generalization of the first-order conditions we

More information

Legendre Transforms, Calculus of Varations, and Mechanics Principles

Legendre Transforms, Calculus of Varations, and Mechanics Principles page 437 Appendix C Legendre Transforms, Calculus of Varations, and Mechanics Principles C.1 Legendre Transforms Legendre transforms map functions in a vector space to functions in the dual space. From

More information

Inequality Constraints

Inequality Constraints Chapter 2 Inequality Constraints 2.1 Optimality Conditions Early in multivariate calculus we learn the significance of differentiability in finding minimizers. In this section we begin our study of the

More information

Minimality Concepts Using a New Parameterized Binary Relation in Vector Optimization 1

Minimality Concepts Using a New Parameterized Binary Relation in Vector Optimization 1 Applied Mathematical Sciences, Vol. 7, 2013, no. 58, 2841-2861 HIKARI Ltd, www.m-hikari.com Minimality Concepts Using a New Parameterized Binary Relation in Vector Optimization 1 Christian Sommer Department

More information

5 Handling Constraints

5 Handling Constraints 5 Handling Constraints Engineering design optimization problems are very rarely unconstrained. Moreover, the constraints that appear in these problems are typically nonlinear. This motivates our interest

More information

Convex Optimization Boyd & Vandenberghe. 5. Duality

Convex Optimization Boyd & Vandenberghe. 5. Duality 5. Duality Convex Optimization Boyd & Vandenberghe Lagrange dual problem weak and strong duality geometric interpretation optimality conditions perturbation and sensitivity analysis examples generalized

More information

Sharpening the Karush-John optimality conditions

Sharpening the Karush-John optimality conditions Sharpening the Karush-John optimality conditions Arnold Neumaier and Hermann Schichl Institut für Mathematik, Universität Wien Strudlhofgasse 4, A-1090 Wien, Austria email: Arnold.Neumaier@univie.ac.at,

More information

Algorithms for constrained local optimization

Algorithms for constrained local optimization Algorithms for constrained local optimization Fabio Schoen 2008 http://gol.dsi.unifi.it/users/schoen Algorithms for constrained local optimization p. Feasible direction methods Algorithms for constrained

More information

Note on the Expected Value of a Function of a Fuzzy Variable

Note on the Expected Value of a Function of a Fuzzy Variable International Journal of Mathematical Analysis Vol. 9, 15, no. 55, 71-76 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/1.1988/ijma.15.5145 Note on the Expected Value of a Function of a Fuzzy Variable

More information

Theory of Ordinary Differential Equations

Theory of Ordinary Differential Equations Theory of Ordinary Differential Equations Existence, Uniqueness and Stability Jishan Hu and Wei-Ping Li Department of Mathematics The Hong Kong University of Science and Technology ii Copyright c 24 by

More information

Optimal Control. Macroeconomics II SMU. Ömer Özak (SMU) Economic Growth Macroeconomics II 1 / 112

Optimal Control. Macroeconomics II SMU. Ömer Özak (SMU) Economic Growth Macroeconomics II 1 / 112 Optimal Control Ömer Özak SMU Macroeconomics II Ömer Özak (SMU) Economic Growth Macroeconomics II 1 / 112 Review of the Theory of Optimal Control Section 1 Review of the Theory of Optimal Control Ömer

More information

L p Spaces and Convexity

L p Spaces and Convexity L p Spaces and Convexity These notes largely follow the treatments in Royden, Real Analysis, and Rudin, Real & Complex Analysis. 1. Convex functions Let I R be an interval. For I open, we say a function

More information

Some Properties of the Augmented Lagrangian in Cone Constrained Optimization

Some Properties of the Augmented Lagrangian in Cone Constrained Optimization MATHEMATICS OF OPERATIONS RESEARCH Vol. 29, No. 3, August 2004, pp. 479 491 issn 0364-765X eissn 1526-5471 04 2903 0479 informs doi 10.1287/moor.1040.0103 2004 INFORMS Some Properties of the Augmented

More information

Nonlinear Systems and Control Lecture # 12 Converse Lyapunov Functions & Time Varying Systems. p. 1/1

Nonlinear Systems and Control Lecture # 12 Converse Lyapunov Functions & Time Varying Systems. p. 1/1 Nonlinear Systems and Control Lecture # 12 Converse Lyapunov Functions & Time Varying Systems p. 1/1 p. 2/1 Converse Lyapunov Theorem Exponential Stability Let x = 0 be an exponentially stable equilibrium

More information

Finite Dimensional Optimization Part I: The KKT Theorem 1

Finite Dimensional Optimization Part I: The KKT Theorem 1 John Nachbar Washington University March 26, 2018 1 Introduction Finite Dimensional Optimization Part I: The KKT Theorem 1 These notes characterize maxima and minima in terms of first derivatives. I focus

More information

2tdt 1 y = t2 + C y = which implies C = 1 and the solution is y = 1

2tdt 1 y = t2 + C y = which implies C = 1 and the solution is y = 1 Lectures - Week 11 General First Order ODEs & Numerical Methods for IVPs In general, nonlinear problems are much more difficult to solve than linear ones. Unfortunately many phenomena exhibit nonlinear

More information

Diophantine Equations. Elementary Methods

Diophantine Equations. Elementary Methods International Mathematical Forum, Vol. 12, 2017, no. 9, 429-438 HIKARI Ltd, www.m-hikari.com https://doi.org/10.12988/imf.2017.7223 Diophantine Equations. Elementary Methods Rafael Jakimczuk División Matemática,

More information

Convexity in R n. The following lemma will be needed in a while. Lemma 1 Let x E, u R n. If τ I(x, u), τ 0, define. f(x + τu) f(x). τ.

Convexity in R n. The following lemma will be needed in a while. Lemma 1 Let x E, u R n. If τ I(x, u), τ 0, define. f(x + τu) f(x). τ. Convexity in R n Let E be a convex subset of R n. A function f : E (, ] is convex iff f(tx + (1 t)y) (1 t)f(x) + tf(y) x, y E, t [0, 1]. A similar definition holds in any vector space. A topology is needed

More information

LMI Methods in Optimal and Robust Control

LMI Methods in Optimal and Robust Control LMI Methods in Optimal and Robust Control Matthew M. Peet Arizona State University Lecture 15: Nonlinear Systems and Lyapunov Functions Overview Our next goal is to extend LMI s and optimization to nonlinear

More information

An Application of Pontryagin s Maximum Principle in a Linear Quadratic Differential Game

An Application of Pontryagin s Maximum Principle in a Linear Quadratic Differential Game An Application of Pontryagin s Maximum Principle in a Linear Quadratic Differential Game Marzieh Khakestari (Corresponding author) Institute For Mathematical Research, Universiti Putra Malaysia, 43400

More information

A convergence result for an Outer Approximation Scheme

A convergence result for an Outer Approximation Scheme A convergence result for an Outer Approximation Scheme R. S. Burachik Engenharia de Sistemas e Computação, COPPE-UFRJ, CP 68511, Rio de Janeiro, RJ, CEP 21941-972, Brazil regi@cos.ufrj.br J. O. Lopes Departamento

More information

1 Lagrange Multiplier Method

1 Lagrange Multiplier Method 1 Lagrange Multiplier Method Near a maximum the decrements on both sides are in the beginning only imperceptible. J. Kepler When a quantity is greatest or least, at that moment its flow neither increases

More information

OPTIMALITY AND STABILITY OF SYMMETRIC EVOLUTIONARY GAMES WITH APPLICATIONS IN GENETIC SELECTION. (Communicated by Yang Kuang)

OPTIMALITY AND STABILITY OF SYMMETRIC EVOLUTIONARY GAMES WITH APPLICATIONS IN GENETIC SELECTION. (Communicated by Yang Kuang) MATHEMATICAL BIOSCIENCES doi:10.3934/mbe.2015.12.503 AND ENGINEERING Volume 12, Number 3, June 2015 pp. 503 523 OPTIMALITY AND STABILITY OF SYMMETRIC EVOLUTIONARY GAMES WITH APPLICATIONS IN GENETIC SELECTION

More information

Tangent spaces, normals and extrema

Tangent spaces, normals and extrema Chapter 3 Tangent spaces, normals and extrema If S is a surface in 3-space, with a point a S where S looks smooth, i.e., without any fold or cusp or self-crossing, we can intuitively define the tangent

More information

The Euler Method for Linear Control Systems Revisited

The Euler Method for Linear Control Systems Revisited The Euler Method for Linear Control Systems Revisited Josef L. Haunschmied, Alain Pietrus, and Vladimir M. Veliov Research Report 2013-02 March 2013 Operations Research and Control Systems Institute of

More information

Linear Programming Methods

Linear Programming Methods Chapter 11 Linear Programming Methods 1 In this chapter we consider the linear programming approach to dynamic programming. First, Bellman s equation can be reformulated as a linear program whose solution

More information

Quadratic Optimization over a Polyhedral Set

Quadratic Optimization over a Polyhedral Set International Mathematical Forum, Vol. 9, 2014, no. 13, 621-629 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/imf.2014.4234 Quadratic Optimization over a Polyhedral Set T. Bayartugs, Ch. Battuvshin

More information

New Nonlinear Conditions for Approximate Sequences and New Best Proximity Point Theorems

New Nonlinear Conditions for Approximate Sequences and New Best Proximity Point Theorems Applied Mathematical Sciences, Vol., 207, no. 49, 2447-2457 HIKARI Ltd, www.m-hikari.com https://doi.org/0.2988/ams.207.7928 New Nonlinear Conditions for Approximate Sequences and New Best Proximity Point

More information

CONSTRAINT QUALIFICATIONS, LAGRANGIAN DUALITY & SADDLE POINT OPTIMALITY CONDITIONS

CONSTRAINT QUALIFICATIONS, LAGRANGIAN DUALITY & SADDLE POINT OPTIMALITY CONDITIONS CONSTRAINT QUALIFICATIONS, LAGRANGIAN DUALITY & SADDLE POINT OPTIMALITY CONDITIONS A Dissertation Submitted For The Award of the Degree of Master of Philosophy in Mathematics Neelam Patel School of Mathematics

More information

David Hilbert was old and partly deaf in the nineteen thirties. Yet being a diligent

David Hilbert was old and partly deaf in the nineteen thirties. Yet being a diligent Chapter 5 ddddd dddddd dddddddd ddddddd dddddddd ddddddd Hilbert Space The Euclidean norm is special among all norms defined in R n for being induced by the Euclidean inner product (the dot product). A

More information

Exercises for Multivariable Differential Calculus XM521

Exercises for Multivariable Differential Calculus XM521 This document lists all the exercises for XM521. The Type I (True/False) exercises will be given, and should be answered, online immediately following each lecture. The Type III exercises are to be done

More information

The Split Hierarchical Monotone Variational Inclusions Problems and Fixed Point Problems for Nonexpansive Semigroup

The Split Hierarchical Monotone Variational Inclusions Problems and Fixed Point Problems for Nonexpansive Semigroup International Mathematical Forum, Vol. 11, 2016, no. 8, 395-408 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/imf.2016.6220 The Split Hierarchical Monotone Variational Inclusions Problems and

More information

2.1 Dynamical systems, phase flows, and differential equations

2.1 Dynamical systems, phase flows, and differential equations Chapter 2 Fundamental theorems 2.1 Dynamical systems, phase flows, and differential equations A dynamical system is a mathematical formalization of an evolutionary deterministic process. An evolutionary

More information

Research Article On the Blow-Up Set for Non-Newtonian Equation with a Nonlinear Boundary Condition

Research Article On the Blow-Up Set for Non-Newtonian Equation with a Nonlinear Boundary Condition Hindawi Publishing Corporation Abstract and Applied Analysis Volume 21, Article ID 68572, 12 pages doi:1.1155/21/68572 Research Article On the Blow-Up Set for Non-Newtonian Equation with a Nonlinear Boundary

More information

APPM GRADUATE PRELIMINARY EXAMINATION PARTIAL DIFFERENTIAL EQUATIONS SOLUTIONS

APPM GRADUATE PRELIMINARY EXAMINATION PARTIAL DIFFERENTIAL EQUATIONS SOLUTIONS Thursday August 24, 217, 1AM 1PM There are five problems. Solve any four of the five problems. Each problem is worth 25 points. On the front of your bluebook please write: (1) your name and (2) a grading

More information

1 Relative degree and local normal forms

1 Relative degree and local normal forms THE ZERO DYNAMICS OF A NONLINEAR SYSTEM 1 Relative degree and local normal orms The purpose o this Section is to show how single-input single-output nonlinear systems can be locally given, by means o a

More information

Constrained optimization

Constrained optimization Constrained optimization In general, the formulation of constrained optimization is as follows minj(w), subject to H i (w) = 0, i = 1,..., k. where J is the cost function and H i are the constraints. Lagrange

More information

Stabilization and Passivity-Based Control

Stabilization and Passivity-Based Control DISC Systems and Control Theory of Nonlinear Systems, 2010 1 Stabilization and Passivity-Based Control Lecture 8 Nonlinear Dynamical Control Systems, Chapter 10, plus handout from R. Sepulchre, Constructive

More information