A BEHAVIORAL APPROACH TO THE LQ OPTIMAL CONTROL PROBLEM

GIANFRANCO PARLANGELI AND MARIA ELENA VALCHER

Gianfranco Parlangeli: Dipartimento di Ingegneria dell'Innovazione, Università di Lecce, strada per Monteroni, 73100 Lecce, Italy - phone: +39-832-297-22, fax: +39-832-297-279 - e-mail: gianfranco.parlangeli@unile.it
Maria Elena Valcher: Dipartimento di Ingegneria dell'Informazione, Università di Padova, via Gradenigo 6B, 35131 Padova, Italy - phone: +39-49-827-7795, fax: +39-49-827-7614 - e-mail: meme@dei.unipd.it

Abstract. This paper aims at providing new insights into the linear quadratic optimization problem within the behavioral setting. A new problem statement is introduced and the corresponding solution, which allows one to easily compute the optimal trajectories from the given data, is derived and discussed in some detail. A comparison with the standard LQ problem in the state-space approach is provided. Some examples allow us to further explore the relevance of the present approach for the class of state-space models.

Keywords. Continuous-time behaviors, kernel/image representations, LQ problem, asymptotic stability, autonomous/controllable decomposition, state maps, quadratic differential forms.

1. Introduction.

The behavioral approach represents a modern and powerful framework where system and control problems can be stated and solved in a very general form [7]. Within this setting, a dynamical system is completely described by its behavior, namely the set of all possible evolutions of its system variables, and hence any optimization problem is naturally stated as the problem of selecting, within this set of trajectories, the best ones, namely those which either minimize a given cost function or maximize a certain gain function.

The linear quadratic optimal control problem has received several different statements (and, consequently, solutions) within the behavioral approach. Indeed, in [5], two different optimization problems in the behavioral setting are first stated, and then proved to be two special instances of a more general concave problem. The explicit solution of the LQ problem, however, makes use of the standard state-space theory. In [14] and [16], Jan Willems addresses the LQ problem for controllable behaviors, by assuming that the cost function is expressed by means of a quadratic differential form. Under these assumptions, the optimal behavior is defined as the set of all behavior trajectories whose cost cannot be decreased by means of finite support perturbations (namely, by adding any finite support behavior trajectory). This approach leads to a set of optimal solutions which represents a behavior. However, no mention of initial conditions or local constraints is made in the problem statement. Weiland and Stoorvogel [11] address and solve the LQ control problem in a purely behavioral framework, without referring to an explicit cost function but deeply resorting to the idea of control as interconnection. The final solution resorts, again, to a state-space realization. Finally, Ferrante and Zampieri [2] adopt a problem statement which involves a quadratic cost function and a local constraint on the behavior trajectories.

The set of optimal solutions, which, in general, is not a shift-invariant set of trajectories, and hence is not a behavior, is obtained by means of state-space techniques.

In this paper, we aim at providing a different and simple statement for the LQ problem in the behavioral framework, by considering linear, time-invariant and complete continuous-time systems, described as the kernels of polynomial matrix operators, and by assuming as cost function the integral (over $\mathbb{R}_+$) of a suitable quadratic differential form. Initial conditions are introduced by assigning the value, at $t=0$, of a minimal state map for the system behavior. The present statement is quite general and provides a natural extension, within the behavioral setting, of the classical LQ problem for state-space models. The problem solution is here derived in a clean and simple way, by making use of the typical tools developed within the behavioral setting, in particular state maps and quadratic differential forms, and without resorting to state-space realizations.

2. Preliminaries on behaviors, quadratic differential forms and state maps.

In this paper we denote by $\mathbb{R}[\xi]$ the ring of polynomials in the indeterminate $\xi$, with coefficients in the real field $\mathbb{R}$. Given two trajectories $w_1$ and $w_2$ in $(\mathbb{R}^q)^{\mathbb{R}}$, we define their concatenation at time $t=0$ as
$$(w_1\wedge w_2)(t):=\begin{cases} w_1(t) & \text{if } t<0;\\ w_2(t) & \text{if } t\ge 0.\end{cases}$$
In a similar way one can define the concatenation (at time $t=0$) of two pairs of trajectories $(x_1,w_1),(x_2,w_2)\in(\mathbb{R}^n\times\mathbb{R}^q)^{\mathbb{R}}$.

Consider a continuous-time dynamical system $\Sigma=(\mathbb{R},\mathbb{R}^q,\mathfrak{B})$ whose set of trajectories (the behavior) $\mathfrak{B}\subseteq(\mathbb{R}^q)^{\mathbb{R}}$ is the solution set of a system of linear, constant-coefficient differential equations
$$(2.1)\qquad R_0 w(t)+R_1\frac{dw(t)}{dt}+\dots+R_L\frac{d^L w(t)}{dt^L}=0,$$
with constant matrices $R_i\in\mathbb{R}^{p\times q}$, for some positive integer $p$. Equation (2.1) is a kernel representation of the behavior and is denoted for short as
$$(2.2)\qquad R\!\left(\frac{d}{dt}\right)w=0,$$
where $R(\xi):=\sum_{i=0}^{L}R_i\xi^i\in\mathbb{R}^{p\times q}[\xi]$. As a consequence, we adopt the shorthand notation $\mathfrak{B}=\ker R\!\left(\frac{d}{dt}\right)$. It entails no loss of generality to assume that $R$ is of full row rank [7], and this will be a standing assumption within this paper.

The modelling process adopted for the system behavior does not always lead to a trajectory description as in (2.2). Indeed, some auxiliary latent variables may need to be introduced.

This results in the following description
$$(2.3)\qquad R\!\left(\frac{d}{dt}\right)w=M\!\left(\frac{d}{dt}\right)\ell,$$
where $M(\xi)\in\mathbb{R}^{p\times l}[\xi]$ and $\ell\in(\mathbb{R}^l)^{\mathbb{R}}$ represents the latent variable vector. Equation (2.3) provides a latent variable representation of the latent variable system $\Sigma_f=(\mathbb{R},\mathbb{R}^l,\mathbb{R}^q,\mathfrak{B}_f)$, where the full behavior $\mathfrak{B}_f$ consists of the trajectories $(w,\ell)$ satisfying (2.3), and it induces an external behavior $\mathfrak{B}$, obtained by (closed) projection of $\mathfrak{B}_f$ on the external variables.

In this paper, as in [1, 9, 14, 15], we assume that the sets of solutions of (2.2) and (2.3) are to be intended in $L_1^{\rm loc}$ and the equalities in the sense of distributions. For instance, the behavior described by (2.2) is the set of all trajectories $w$ such that
$$\left\langle w,\,R^{\mathsf T}\!\left(-\frac{d}{dt}\right)f\right\rangle=\int w^{\mathsf T}(t)\left[R^{\mathsf T}\!\left(-\frac{d}{dt}\right)f\right](t)\,dt=0,$$
for all testing functions $f$, where $f$ is a $C^\infty$ vector-valued function with compact support. Elimination of latent variables from (2.3) may be performed provided that we consider as external behavior of (2.3) not simply the projection of $\mathfrak{B}_f$ on the external variables (which may not be endowed with a kernel representation), i.e., the set $\{w:\exists\,\ell \text{ s.t. (2.3) holds}\}$, but its closure in the $L_1^{\rm loc}$ topology. An accurate discussion of these problems and of their solutions can be found in [1, 7, 9] and, specifically, in the latent variable elimination theorem (Theorem 2.1 in [9]).

We introduce, now, the two fundamental properties a behavior can be endowed with.

Definition 2.1 [7] A behavior $\mathfrak{B}$ is said to be autonomous if for any $w_1,w_2\in\mathfrak{B}$ and any $\tau\in\mathbb{R}$
$$w_1(t)=w_2(t) \text{ for } t\le\tau \ \Longrightarrow\ w_1(t)=w_2(t) \text{ for every } t\in\mathbb{R};$$
and controllable if for all $w_1,w_2\in\mathfrak{B}$ there exist $\tau\in\mathbb{R}_+$ and $w\in\mathfrak{B}$ such that
$$w(t)=\begin{cases} w_1(t), & \text{for } t<0;\\ w_2(t-\tau), & \text{for } t\ge\tau.\end{cases}$$

As is well known, a behavior $\mathfrak{B}=\ker R\!\left(\frac{d}{dt}\right)$, with $R(\xi)\in\mathbb{R}^{p\times q}[\xi]$, is autonomous if and only if $R$ is of full column rank $q$. If so, $\mathfrak{B}$ can also be described as the kernel of a nonsingular square matrix, which is uniquely determined up to a left unimodular factor. So, if we assume that $R$ is square, $\det R$ (which is, of course, independent of the specific square representation, except for a multiplicative nonzero constant) is known as the characteristic polynomial of $\mathfrak{B}$ [7]. An autonomous behavior $\mathfrak{B}$ is said to be asymptotically stable if for every $w\in\mathfrak{B}$ we have $\lim_{t\to+\infty}w(t)=0$. This happens if and only if the characteristic polynomial of $\mathfrak{B}$ is Hurwitz, namely all its zeros belong to the open half-plane $\{\xi\in\mathbb{C}:\mathrm{Re}(\xi)<0\}$.

Controllability represents the possibility of steering any past behavior trajectory into any future behavior trajectory, provided that one leaves enough time for adjustments.

A controllable behavior $\mathfrak{B}$ admits an image description, i.e., it can be described as the image of a polynomial matrix operator [7]. This amounts to saying that there exist $m\in\mathbb{N}$ and a polynomial matrix $M\in\mathbb{R}^{q\times m}[\xi]$ such that $w\in\mathfrak{B}$ if and only if $w=M\!\left(\frac{d}{dt}\right)u$, for some $u\in(\mathbb{R}^m)^{\mathbb{R}}$. The set of continuous-time trajectories thus obtained is denoted by $\mathrm{im}\,M\!\left(\frac{d}{dt}\right)$.

As a final result, which will be useful in the following, we quote the decomposition theorem (see [7, Theorem 5.2.14]), stating that every AR behavior $\mathfrak{B}$ can always be expressed as the direct sum of its controllable part $\mathfrak{B}_c$ (the largest controllable behavior included in $\mathfrak{B}$) and of some (not unique) autonomous behavior $\mathfrak{B}_a$:
$$(2.4)\qquad \mathfrak{B}=\mathfrak{B}_a\oplus\mathfrak{B}_c.$$
As a consequence, every behavior trajectory $w$ can always be described as the sum $w=w_a+w_c$ of two trajectories $w_a\in\mathfrak{B}_a$ and $w_c\in\mathfrak{B}_c$. Both trajectories, however, depend on the specific choice of $\mathfrak{B}_a$ and are uniquely determined only once the autonomous behavior $\mathfrak{B}_a$ involved in the behavior decomposition is specified.

A polynomial matrix description of the controllable part of a behavior is quite easy to obtain. Indeed, if $R(\xi)\in\mathbb{R}^{p\times q}[\xi]$ is a full row rank matrix involved in the kernel description of $\mathfrak{B}$, it can always be factorized as
$$(2.5)\qquad R(\xi)=\Delta(\xi)\,\bar R(\xi),$$
for suitable polynomial matrices $\Delta(\xi)\in\mathbb{R}^{p\times p}[\xi]$ nonsingular and $\bar R(\xi)\in\mathbb{R}^{p\times q}[\xi]$ left prime. Then $\mathfrak{B}_c=\ker\bar R\!\left(\frac{d}{dt}\right)$. On the other hand, if $U(\xi)=\begin{bmatrix}P(\xi) & M(\xi)\end{bmatrix}\in\mathbb{R}^{q\times q}[\xi]$ is a unimodular matrix such that
$$(2.6)\qquad R(\xi)\,U(\xi)=\begin{bmatrix}\Delta(\xi) & 0\end{bmatrix},$$
with $\Delta(\xi)\in\mathbb{R}^{p\times p}[\xi]$ nonsingular square, then $\mathfrak{B}_c=\mathrm{im}\,M\!\left(\frac{d}{dt}\right)$. Any such $M(\xi)$ is said to be a minimal right annihilator (MRA) of $R(\xi)$.

In the sequel we will also make use of the two subsets $\mathfrak{B}_c^-$ and $\mathfrak{B}_c^+$ of $\mathfrak{B}_c$, which represent the sets of controllable trajectories of $\mathfrak{B}$ having support in $(-\infty,0)$ and $[0,+\infty)$, respectively. If $w$ and $0\wedge w$ both belong to $\mathfrak{B}$, then $0\wedge w$ belongs to $\mathfrak{B}_c^+$, as the only left compact support trajectories in $\mathfrak{B}$ belong to its controllable part. Moreover, we have the following result, which relates concatenability of two trajectories at $t=0$ with the trajectory compositions.

Lemma 2.1 Given a behavior $\mathfrak{B}=\ker R\!\left(\frac{d}{dt}\right)$, with $R(\xi)\in\mathbb{R}^{p\times q}[\xi]$ of full row rank, let $\mathfrak{B}_c$ denote the set of controllable trajectories of $\mathfrak{B}$ and let $\mathfrak{B}_a$ be an autonomous behavior such that $\mathfrak{B}=\mathfrak{B}_a\oplus\mathfrak{B}_c$. Let $U(\xi)=\begin{bmatrix}P(\xi) & M(\xi)\end{bmatrix}\in\mathbb{R}^{q\times q}[\xi]$ be a unimodular matrix such that (2.6) holds, with $\Delta(\xi)\in\mathbb{R}^{p\times p}[\xi]$ nonsingular square. Given two trajectories $w_1$ and $w_2$ in $\mathfrak{B}$, the following facts are equivalent:

i) $w_1\wedge w_2\in\mathfrak{B}$;
ii) $w_2-w_1\in\mathfrak{B}_c^-\oplus\mathfrak{B}_c^+$;
iii) $\exists\,w_a\in\mathfrak{B}_a$ and $w_{c,i}\in\mathfrak{B}_c$, $i=1,2$, with $w_{c,2}-w_{c,1}\in\mathfrak{B}_c^-\oplus\mathfrak{B}_c^+$, such that $w_i=w_a+w_{c,i}$;
iv) $\exists\,v_a\in\ker\Delta\!\left(\frac{d}{dt}\right)$ and $v_{c,i}$, $i=1,2$, with $M\!\left(\frac{d}{dt}\right)(v_{c,2}-v_{c,1})\in\mathfrak{B}_c^-\oplus\mathfrak{B}_c^+$, such that
$$w_i=P\!\left(\frac{d}{dt}\right)v_a+M\!\left(\frac{d}{dt}\right)v_{c,i}.$$

Proof. i) ⟺ ii) Of course, by the linearity of $\mathfrak{B}$, $w_1\wedge w_2\in\mathfrak{B}$ if and only if $0\wedge(w_2-w_1)\in\mathfrak{B}$. Since left compact support trajectories in $\mathfrak{B}$ necessarily belong to the controllable part of the behavior, this latter condition ensures that $w_2-w_1=w^-+w^+$, where $w^-$ has support in $\mathbb{R}^-$ while $w^+:=0\wedge(w_2-w_1)$ belongs to $\mathfrak{B}_c^+\subseteq\mathfrak{B}_c\subseteq\mathfrak{B}$. Again, by the linearity of $\mathfrak{B}$, $w^-\in\mathfrak{B}$, and having right compact support it necessarily belongs to $\mathfrak{B}_c^-$. Consequently, ii) holds true.
ii) ⟹ iii) Clearly, $w_i\in\mathfrak{B}$, $i=1,2$, ensures $w_i=w_{a,i}+w_{c,i}$, with $w_{a,i}\in\mathfrak{B}_a$ and $w_{c,i}\in\mathfrak{B}_c$, $i=1,2$. So, we only remain to prove that if ii) holds then $w_{a,1}=w_{a,2}$ and $w_{c,2}-w_{c,1}\in\mathfrak{B}_c^-\oplus\mathfrak{B}_c^+$. Indeed, this immediately follows from $w_2-w_1=(w_{a,2}-w_{a,1})+(w_{c,2}-w_{c,1})\in\mathfrak{B}_c^-\oplus\mathfrak{B}_c^+$.
iii) ⟹ iv) Obvious.
iii) ⟹ i) Set $w_c:=w_{c,1}\wedge w_{c,2}$. Notice, first, that $w_c\in\mathfrak{B}_c$. The trajectory $w:=w_a+w_c$ clearly belongs to $\mathfrak{B}$ and it just coincides with $w_1\wedge w_2$.

Remarks. 1) Two behavior trajectories $w_1$ and $w_2$ that satisfy any of the equivalent conditions of Lemma 2.1 share a common past, as indeed $w_1|_{(-\infty,0)}$ represents a common past of $w_1$ and $w_2$, and a common future, namely $w_2|_{[0,+\infty)}$. Consequently [12], they share all pasts and all futures and hence are past and future equivalent.
2) In the light of the previous lemma, we can say that two trajectories can be concatenated at $t=0$ if and only if they share the same autonomous part and their controllable parts differ by some trajectory which belongs to $\mathfrak{B}_c^-\oplus\mathfrak{B}_c^+$. As we will see, a state map will allow us to verify whether or not two trajectories may be concatenated.

As a final concept, we introduce that of stabilizability of a behavior. A behavior $\mathfrak{B}$ is said to be stabilizable if for every trajectory $w\in\mathfrak{B}$ and every $\tau\in\mathbb{R}$ there exists some trajectory $w'\in\mathfrak{B}$ such that $w'(t)=w(t)$, $t\le\tau$, and $\lim_{t\to+\infty}w'(t)=0$. By referring to the previous assumptions and notation, we get [7] that $\mathfrak{B}=\ker R\!\left(\frac{d}{dt}\right)$ is stabilizable if and only if $\mathrm{rank}\,R(\lambda)=p$ for every $\lambda\in\mathbb{C}$ with $\mathrm{Re}(\lambda)\ge 0$ or, equivalently, if and only if $\det\Delta$ is Hurwitz, or if and only if in every autonomous/controllable decomposition (2.4) the autonomous behavior $\mathfrak{B}_a$ is asymptotically stable.
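Both tests just recalled (asymptotic stability of an autonomous behavior and stabilizability) boil down to locating the zeros of a polynomial in the complex plane. The following Python sketch, added here as an illustration and not part of the original paper, computes the characteristic polynomial of a square (hence autonomous) kernel representation and checks whether it is Hurwitz; the specific matrix $R$ is a made-up example.

```python
import numpy as np
import sympy as sp

xi = sp.symbols('xi')

def is_hurwitz(poly, var):
    """True if all zeros of the polynomial lie in the open left half-plane."""
    coeffs = [float(c) for c in sp.Poly(poly, var).all_coeffs()]
    return all(np.real(r) < 0 for r in np.roots(coeffs))

# A square (hence autonomous) toy kernel representation: B = ker R(d/dt).
R = sp.Matrix([[xi + 1, 1],
               [0, xi + 2]])
char_poly = sp.expand(R.det())               # characteristic polynomial of B
print(char_poly, is_hurwitz(char_poly, xi))  # xi**2 + 3*xi + 2, True
```

The same helper applied to $\det\Delta$ from a factorization (2.5) gives the stabilizability test mentioned above.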

2.1. Bilinear and quadratic differential forms.

Let $\mathbb{R}^{q_1\times q_2}[\zeta,\eta]$ denote the set of $q_1\times q_2$ polynomial matrices in the indeterminates $\zeta$ and $\eta$ with real coefficients. Each element in $\mathbb{R}^{q_1\times q_2}[\zeta,\eta]$ can be represented as
$$\Phi(\zeta,\eta)=\sum_{h,k}\Phi_{h,k}\,\zeta^h\eta^k,\qquad \Phi_{h,k}\in\mathbb{R}^{q_1\times q_2},$$
and to any such matrix we can associate a bilinear form $B_\Phi$ which maps any pair of (sufficiently smooth) trajectories $(w_1,w_2)\in(\mathbb{R}^{q_1}\times\mathbb{R}^{q_2})^{\mathbb{R}}$ into the function
$$B_\Phi(w_1,w_2):=\sum_{h,k}\left(\frac{d^h w_1}{dt^h}\right)^{\!\mathsf T}\Phi_{h,k}\,\frac{d^k w_2}{dt^k}.$$
$B_\Phi$ maps $C^\infty(\mathbb{R},\mathbb{R}^{q_1})\times C^\infty(\mathbb{R},\mathbb{R}^{q_2})$ into $C^\infty(\mathbb{R},\mathbb{R})$. We will use $B_\Phi$ as the standard notation for the bilinear form induced by $\Phi\in\mathbb{R}^{q_1\times q_2}[\zeta,\eta]$. The dual of $\Phi$ is defined as $\Phi^\star(\zeta,\eta):=\sum_{h,k}\Phi_{h,k}^{\mathsf T}\zeta^k\eta^h=\Phi^{\mathsf T}(\eta,\zeta)$. If $q_1=q_2=q$, we say that $\Phi$ is symmetric if $\Phi=\Phi^\star$, namely $\Phi_{h,k}^{\mathsf T}=\Phi_{k,h}$ for all indices $h,k$. The set of symmetric matrices in $\mathbb{R}^{q\times q}[\zeta,\eta]$ will be denoted by $\mathbb{R}_s^{q\times q}[\zeta,\eta]$.

With every $\Phi\in\mathbb{R}_s^{q\times q}[\zeta,\eta]$ we associate a homogeneous quadratic differential form, defined for every sufficiently smooth $w\in(\mathbb{R}^q)^{\mathbb{R}}$ as
$$Q_\Phi(w):=\sum_{h,k}\left(\frac{d^h w}{dt^h}\right)^{\!\mathsf T}\Phi_{h,k}\,\frac{d^k w}{dt^k}.$$
$Q_\Phi$ maps $C^\infty(\mathbb{R},\mathbb{R}^q)$ into $C^\infty(\mathbb{R},\mathbb{R})$. We will use $Q_\Phi$ as the standard notation for the quadratic form induced by $\Phi\in\mathbb{R}_s^{q\times q}[\zeta,\eta]$. $Q_\Phi$ is nonnegative (denoted $Q_\Phi\ge 0$) [17, 18] if, for every (sufficiently smooth) trajectory $w\in(\mathbb{R}^q)^{\mathbb{R}}$, $Q_\Phi(w)\ge 0$, and positive ($Q_\Phi>0$) if it is nonnegative and $Q_\Phi(w)=0$ implies $w=0$. If $\mathfrak{B}\subseteq(\mathbb{R}^q)^{\mathbb{R}}$ is a set of trajectories (in particular, a behavior), $Q_\Phi$ is $\mathfrak{B}$-nonnegative, $Q_\Phi\ge_{\mathfrak{B}}0$ ($\mathfrak{B}$-positive, $Q_\Phi>_{\mathfrak{B}}0$), if $Q_\Phi(w)\ge 0$ for every trajectory in $\mathfrak{B}$ (and, moreover, $Q_\Phi(w)=0$ implies $w=0$). A quadratic differential form $Q_\Phi$ is nonnegative if and only if [17] there exists a polynomial matrix $D$ such that $\Phi(\zeta,\eta)=D^{\mathsf T}(\zeta)D(\eta)$, and it is positive if and only if, in addition, $\mathrm{rank}\,D(\lambda)=q$ for every $\lambda\in\mathbb{C}$ (namely, $D$ is right prime). The characterization of $\mathfrak{B}$-nonnegative and $\mathfrak{B}$-positive quadratic forms can be found in [1, 17]. Such characterizations become particularly easy when dealing with a controllable behavior, as we may resort to an image description for it [1]. Indeed, if $\mathfrak{B}=\mathrm{im}\,M\!\left(\frac{d}{dt}\right)$, then $Q_\Phi$ is $\mathfrak{B}$-nonnegative ($\mathfrak{B}$-positive) if and only if the quadratic differential form $Q_{\Phi'}$, associated with the polynomial matrix $\Phi'(\zeta,\eta):=M^{\mathsf T}(\zeta)\Phi(\zeta,\eta)M(\eta)$, is nonnegative (positive).
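To make the definition of $Q_\Phi$ concrete, the following sympy sketch, a small illustration added here and not part of the original paper, evaluates $Q_\Phi(w)(t)$ from the coefficient family $\{\Phi_{h,k}\}$; the particular $\Phi$ below is a simple symmetric choice for which $Q_\Phi(w)=(w_1-\dot w_2)^2$, and the trajectory $w$ is an arbitrary smooth test signal.

```python
import sympy as sp

t = sp.symbols('t')

def qdf(Phi, w, t):
    """Q_Phi(w)(t) = sum_{h,k} (d^h w/dt^h)^T Phi_{h,k} (d^k w/dt^k)."""
    d = lambda f, n: sp.diff(f, t, n) if n else f
    return sp.expand(sum((d(w, h).T * P * d(w, k))[0, 0] for (h, k), P in Phi.items()))

# Phi(zeta, eta) = [[1, -eta], [-zeta, zeta*eta]], i.e. Q_Phi(w) = (w1 - dw2/dt)^2.
Phi = {(0, 0): sp.Matrix([[1, 0], [0, 0]]),
       (0, 1): sp.Matrix([[0, -1], [0, 0]]),
       (1, 0): sp.Matrix([[0, 0], [-1, 0]]),
       (1, 1): sp.Matrix([[0, 0], [0, 1]])}

w = sp.Matrix([sp.exp(-t), sp.sin(t)])   # an arbitrary smooth trajectory
print(qdf(Phi, w, t))                    # equals expand((exp(-t) - cos(t))**2)
```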

2.2. State-space maps.

The notion of state is well settled in traditional system theory and its intuitive meaning is that of system memory. In the behavioral approach, the state of a system is a latent variable endowed with the special property that if two behavior trajectories exhibit the same state value at time $t$ (and satisfy some continuity requirements) they can be concatenated at $t$, thus resulting in a new behavior trajectory.

Definition 2.2 [9] Let $\Sigma_s=(\mathbb{R},\mathbb{R}^n,\mathbb{R}^q,\mathfrak{B}_s)$ be a linear time-invariant system with latent variables. $\Sigma_s$ is a state-space system if, given two trajectories $(x_1,w_1),(x_2,w_2)\in\mathfrak{B}_s$,
$$(2.7)\qquad x_1(0)=x_2(0) \text{ and } x_1,x_2 \text{ continuous at } 0 \ \Longrightarrow\ (x_1,w_1)\wedge(x_2,w_2)\in\mathfrak{B}_s.$$
A state-space system is state trim if the projection of $\mathfrak{B}_s$ over the first component $x$, of dimension say $n$, at the instant $t=0$ coincides with $\mathbb{R}^n$. A state-space system induces, by projection of $\mathfrak{B}_s$ on the external variable $w$, an external behavior $\mathfrak{B}=\pi_w\mathfrak{B}_s$. The state-space system will be called a state-space representation of $\mathfrak{B}$. If the dimension $n$ of a state-space representation is minimal for a given behavior $\mathfrak{B}$, the state-space representation is said to be minimal.

Remarks. 1) As pointed out in [9], the continuity requirement is inspired by the fact that we are dealing with solutions in $L_1^{\rm loc}$, in which case the simple requirement $x_1(0)=x_2(0)$ is of little consequence.
2) By the system linearity, (2.7) can be equivalently restated in the simpler form: given $(x,w)\in\mathfrak{B}_s$, the following condition holds:
$$(2.8)\qquad x(0)=0 \text{ and } x \text{ continuous at } 0 \ \Longrightarrow\ (0,0)\wedge(x,w)\in\mathfrak{B}_s.$$

Definition 2.3 [1] Given a behavior $\mathfrak{B}\subseteq(\mathbb{R}^q)^{\mathbb{R}}$ and a polynomial matrix $S(\xi)$ with $n$ rows, we say that $S$ induces a state map for $\mathfrak{B}$ if the system $\Sigma_s=(\mathbb{R},\mathbb{R}^n,\mathbb{R}^q,\mathfrak{B}_s)$ with behavior
$$\mathfrak{B}_s:=\left\{(x,w)\,:\,w\in\mathfrak{B},\ x=S\!\left(\frac{d}{dt}\right)w\right\}$$
provides a state-space representation of $\mathfrak{B}$. $\mathfrak{B}_s$ represents the extended state behavior associated with $S$. $S$ is a minimal state map if $\Sigma_s$ is a minimal state-space representation.

Remarks. 1) If $\Sigma_s$ is a state-space representation of $\mathfrak{B}$ induced by a state map $S$, then it is surely observable (by this meaning that the state variable can be exactly observed from the external variable); as a consequence, minimality and trimness are equivalent properties for $\Sigma_s$ [1, 9, 13].
2) If we refer to a state map $S$, condition (2.8) becomes
$$(2.9)\qquad \left[S\!\left(\frac{d}{dt}\right)w\right](0)=0 \text{ and } S\!\left(\frac{d}{dt}\right)w \text{ continuous at } 0 \ \Longrightarrow\ 0\wedge w\in\mathfrak{B}.$$
It is worthwhile noticing that $S\!\left(\frac{d}{dt}\right)w$ may be continuous (at $t=0$) even though the trajectory $w$ is not. Notice, also, that, by Lemma 2.1, $w$ then splits into the sum of a left compact support trajectory belonging to $\mathfrak{B}_c^+$ and of a right compact support trajectory belonging to $\mathfrak{B}_c^-$, and hence, if (2.9) holds, then $w\in\mathfrak{B}_c^-\oplus\mathfrak{B}_c^+$.

In order to address minimality of state maps, consider the shift-and-cut operator $\sigma_+$, defined as follows [9]:
$$\sigma_+:\mathbb{R}[\xi]\to\mathbb{R}[\xi]\ :\ r(\xi):=r_0+r_1\xi+\dots+r_L\xi^L\ \mapsto\ r_1+r_2\xi+\dots+r_L\xi^{L-1}.$$
In other words, $\sigma_+(r(\xi)):=\xi^{-1}\left(r(\xi)-r_0\right)$. The definition of $\sigma_+$ can be extended to the matrix case in a componentwise manner. Corresponding to any polynomial matrix $R(\xi)\in\mathbb{R}^{p\times q}[\xi]$, described as $R(\xi):=R_0+R_1\xi+\dots+R_L\xi^L$, $R_i\in\mathbb{R}^{p\times q}$, we associate the $\Xi$-matrix $R_\Xi\in\mathbb{R}^{Lp\times q}[\xi]$ defined as
$$(2.10)\qquad R_\Xi(\xi):=\begin{bmatrix} R^{(1)}(\xi)\\ R^{(2)}(\xi)\\ \vdots\\ R^{(L)}(\xi)\end{bmatrix},$$
where $R^{(i)}(\xi):=\sigma_+\!\left(R^{(i-1)}(\xi)\right)$, $i\ge 1$, and $R^{(0)}(\xi):=R(\xi)$. In [9] it has been shown that, if $R(\xi)$ is a row reduced matrix, the polynomial matrix obtained by removing all zero rows in $R_\Xi(\xi)$ provides a minimal state map for $\mathfrak{B}=\ker R\!\left(\frac{d}{dt}\right)$. Even more (see Proposition 6.1 in [9]), if $w\in\mathfrak{B}$, then
$$0\wedge w\in\mathfrak{B}\ \iff\ \left[R_\Xi\!\left(\frac{d}{dt}\right)w\right](0)=0$$
(notice that the continuity of $R_\Xi\!\left(\frac{d}{dt}\right)w$ is always ensured [9]). This property has also been referred to as faithfulness of a state map [8] and it holds true for every minimal state map. By putting together this fact with the result of Lemma 2.1, we get the following proposition.

Proposition 2.4 [6] Given a behavior $\mathfrak{B}=\ker R\!\left(\frac{d}{dt}\right)$, with $R(\xi)\in\mathbb{R}^{p\times q}[\xi]$ of full row rank, let $M\in\mathbb{R}^{q\times(q-p)}[\xi]$ be a minimal right annihilator of $R$ and let $S(\xi)$ be a minimal state map for $\mathfrak{B}$. If $\mathfrak{B}_c$ represents the set of controllable trajectories of $\mathfrak{B}$, the following facts are equivalent:
i) $0\wedge w\in\mathfrak{B}$;
ii) $w\in\mathfrak{B}_c^-\oplus\mathfrak{B}_c^+$;
iii) $\left[S\!\left(\frac{d}{dt}\right)w\right](0)=0$.
Moreover,
$$(2.11)\qquad \mathfrak{B}_c^+=\left\{w\in L_1^{\rm loc}\ :\ w=M\!\left(\frac{d}{dt}\right)v,\ \exists\,v \text{ such that } v(t)=0\ \forall\,t<0 \text{ and } \left[S\!\left(\frac{d}{dt}\right)M\!\left(\frac{d}{dt}\right)v\right](0)=0\right\}.$$

It is worthwhile noticing that the condition $\left[S\!\left(\frac{d}{dt}\right)w\right](0)=0$ does not, in the general case, mean that any of the trajectory derivatives is zero at $t=0$. For instance, if we assume $R(\xi)=\begin{bmatrix}\xi+1 & \xi\end{bmatrix}$, which is left prime and (trivially) row reduced, then $S(\xi)=R_\Xi(\xi)=\begin{bmatrix}1 & 1\end{bmatrix}$ and hence
$$\left[S\!\left(\frac{d}{dt}\right)w\right](0)=w_1(0)+w_2(0).$$
The same remark applies to the case when we are dealing with trajectories belonging to the controllable part of the behavior. Indeed, if $M(\xi)$ represents a minimal right annihilator of $R(\xi)$, and hence $w\in\mathfrak{B}_c$ if and only if $w=M\!\left(\frac{d}{dt}\right)v$ for some trajectory $v$, the condition
$$(2.12)\qquad \left[S\!\left(\frac{d}{dt}\right)M\!\left(\frac{d}{dt}\right)v\right](0)=0$$
does not generally mean that any of $v(0),\frac{dv}{dt}(0),\dots$ is zero. For instance, if we assume $R(\xi)=\begin{bmatrix}1+\xi & \xi & 1\end{bmatrix}$, a minimal right annihilator of $R(\xi)$ is
$$M(\xi)=\begin{bmatrix}1 & 0\\ 0 & 1\\ -1-\xi & -\xi\end{bmatrix}=\begin{bmatrix}1 & 0\\ 0 & 1\\ -1 & 0\end{bmatrix}+\begin{bmatrix}0 & 0\\ 0 & 0\\ -1 & -1\end{bmatrix}\xi,$$
while a minimal state map $S(\xi)=R_\Xi(\xi)$ is obtained as $S(\xi)=\begin{bmatrix}1 & 1 & 0\end{bmatrix}$. So, in this case,
$$0=\left[S\!\left(\frac{d}{dt}\right)M\!\left(\frac{d}{dt}\right)v\right](0)=v_1(0)+v_2(0),$$
which means $v_2(0)=-v_1(0)$, with $v_1(0)$ a free parameter.

Since in the following we will deal with condition (2.12), by assuming that $S(\xi)$ is the minimal state map $R_\Xi(\xi)$ and $M(\xi)$ is a minimal right annihilator of the system matrix $R(\xi)$, it is worthwhile devoting some attention to its solution. If we assume $R(\xi)=\sum_{i=0}^{L}R_i\xi^i$ and $M(\xi)=\sum_{i=0}^{m}M_i\xi^i$, then
$$S(\xi)M(\xi)=R_\Xi(\xi)M(\xi)=\begin{bmatrix} R^{(1)}(\xi)\\ R^{(2)}(\xi)\\ \vdots\\ R^{(L)}(\xi)\end{bmatrix}M(\xi)
=\begin{bmatrix}\left[R(\xi)-R_0\right]\xi^{-1}\\ \left[R(\xi)-R_0-R_1\xi\right]\xi^{-2}\\ \vdots\\ \left[R(\xi)-R_0-R_1\xi-\dots-R_{L-1}\xi^{L-1}\right]\xi^{-L}\end{bmatrix}M(\xi)
=-\begin{bmatrix} R_0\xi^{-1}M(\xi)\\ \left[R_0\xi^{-2}+R_1\xi^{-1}\right]M(\xi)\\ \vdots\\ \left[R_0\xi^{-L}+R_1\xi^{-L+1}+\dots+R_{L-1}\xi^{-1}\right]M(\xi)\end{bmatrix},$$
where the last equality follows from $R(\xi)M(\xi)=0$.

As a first remark, we observe that condition (2.12) involves only $v(0)$ and its first $m-1$ derivatives $\frac{dv}{dt}(0),\dots,\frac{d^{m-1}v}{dt^{m-1}}(0)$, since the degree of $S(\xi)M(\xi)$, seen as a polynomial with matrix coefficients, is at most $m-1$. On the other hand, the previous condition may easily be rewritten as an algebraic equation. Indeed, once we set
$$\mathcal{M}:=\begin{bmatrix} M_1 & M_2 & M_3 & \dots & M_m\\ M_2 & M_3 & \dots & M_m & 0\\ M_3 & \dots & M_m & 0 & 0\\ \vdots & & & & \vdots\\ M_{m-1} & M_m & 0 & \dots & 0\\ M_m & 0 & 0 & \dots & 0\end{bmatrix}\in\mathbb{R}^{qm\times(q-p)m}
\qquad\text{and}\qquad
\mathcal{R}:=\begin{bmatrix} R_0 & 0 & \dots & 0\\ R_1 & R_0 & \dots & 0\\ \vdots & & \ddots & \\ R_{L-2} & \dots & R_0 & 0\\ R_{L-1} & \dots & R_1 & R_0\end{bmatrix}\in\mathbb{R}^{pL\times qL},$$
we can obtain the equivalent algebraic formulation
$$(2.13)\qquad \begin{bmatrix}\mathcal{R} & 0_{pL\times q(m-L)}\end{bmatrix}\,\mathcal{M}\begin{bmatrix} v(0)\\ \frac{dv}{dt}(0)\\ \vdots\\ \frac{d^{m-1}v}{dt^{m-1}}(0)\end{bmatrix}=0,$$
if $m\ge L$, or
$$(2.14)\qquad \mathcal{R}\begin{bmatrix}\mathcal{M}\\ 0_{q(L-m)\times(q-p)m}\end{bmatrix}\begin{bmatrix} v(0)\\ \frac{dv}{dt}(0)\\ \vdots\\ \frac{d^{m-1}v}{dt^{m-1}}(0)\end{bmatrix}=0,$$
if $m<L$. Notice that $\mathcal{R}$ is of full row rank if and only if $R_0$ is, while $\mathcal{M}$ is of full column rank if and only if $M_m$ is. Either when dealing with equation (2.13) or with equation (2.14), we can describe the vector of initial conditions on $v$ at $t=0$ as belonging to some vector subspace of $\mathbb{R}^{(q-p)m}$. Therefore, some full column rank matrix $V$, with $(q-p)m$ rows, can be found such that
$$\begin{bmatrix} v(0)\\ \frac{dv}{dt}(0)\\ \vdots\\ \frac{d^{m-1}v}{dt^{m-1}}(0)\end{bmatrix}=Vc,$$
for some real vector $c$. This analysis will be useful in the following sections.
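The shift-and-cut construction of Section 2.2 is easy to mechanize. The following sympy sketch, an added illustration and not part of the paper, builds $R_\Xi$ by repeatedly applying $\sigma_+$ and reproduces the minimal state map $S(\xi)=[1\ \ 1]$ obtained above for $R(\xi)=[\xi+1\ \ \xi]$.

```python
import sympy as sp

xi = sp.symbols('xi')

def shift_and_cut(P, xi):
    """sigma_+ applied entrywise: p(xi) -> (p(xi) - p(0)) / xi."""
    return P.applyfunc(lambda p: sp.expand(sp.cancel((p - p.subs(xi, 0)) / xi)))

def xi_matrix(R, xi):
    """Stack sigma_+(R), sigma_+^2(R), ..., sigma_+^L(R) (L = deg R), as in (2.10).
    Zero rows should be discarded before using the result as a minimal state map."""
    blocks, P = [], shift_and_cut(R, xi)
    while not P.is_zero_matrix:
        blocks.append(P)
        P = shift_and_cut(P, xi)
    return sp.Matrix.vstack(*blocks)

R = sp.Matrix([[xi + 1, xi]])   # row reduced, so R_Xi is a minimal state map
print(xi_matrix(R, xi))         # Matrix([[1, 1]])
```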

3. Problem statement.

As discussed in the Introduction, the linear quadratic optimal control problem has received several different formulations and solutions within the behavioral framework [2, 5, 11, 14, 16]. In this paper we choose a problem statement which puts together the ingredients of the problem statements given in [2] and [14]. Indeed, the cost function is expressed as a quadratic differential form, as in [14]. However, differently from this last reference and similarly to [2], we constrain the past evolution of the trajectory on $(-\infty,0)$ by assigning the initial conditions at $t=0$, which, in the present problem statement, are expressed by means of a minimal state map for the system behavior. This formulation appears as a natural extension, within the behavioral setting, of the traditional LQ problem for classic state-space models, as will be shown in detail, but it also encompasses more general situations.

Problem Statement: Given a behavior $\mathfrak{B}:=\ker R\!\left(\frac{d}{dt}\right)$, with $R(\xi)\in\mathbb{R}^{p\times q}[\xi]$ of full row rank, a minimal state map $S(\cdot)$ for $\mathfrak{B}$, with $S(\xi)\in\mathbb{R}^{n\times q}[\xi]$, a positive (1) quadratic differential functional $Q_\Phi$ (associated with $\Phi(\zeta,\eta)\in\mathbb{R}_s^{q\times q}[\zeta,\eta]$) and any vector $b\in\mathbb{R}^n$, find the set of trajectories $w$ which asymptotically converge to zero and minimize the cost function
$$(3.1)\qquad J=\int_0^{+\infty}Q_\Phi(w)(t)\,dt,$$
subject to the following constraints:
$$(3.2)\qquad R\!\left(\frac{d}{dt}\right)w=0,$$
$$(3.3)\qquad \left[S\!\left(\frac{d}{dt}\right)w\right](0)=b.$$

Remarks. 1) It is worthwhile noticing that we have a priori selected, among all possible solutions, only those which go to zero as $t$ goes to $+\infty$. This requirement may seem unnecessary at first sight. However, a diverging optimal solution, even if minimizing the cost function, would create some serious problems and hence would be highly undesirable. So, we have chosen to impose asymptotic stability on the set of optimal behavior trajectories. This immediately ensures that the optimal trajectories belong, for $t\ge 0$, to an (asymptotically stable) autonomous behavior. Since autonomous behaviors consist of $C^\infty$ trajectories, in the following we will confine our attention to those behavior trajectories which are $L_1^{\rm loc}$ all over $\mathbb{R}$ and, in particular, as a result of the optimization problem, are $C^\infty$ in $(0,+\infty)$, by this meaning that all solutions are continuous with their derivatives in $(0,+\infty)$, that $\lim_{t\to 0^+}\frac{d^i w(t)}{dt^i}$ exists for every nonnegative index $i$, and that $\frac{d^i w}{dt^i}(0^+):=\lim_{t\to 0^+}\frac{d^i w(t)}{dt^i}$.

(1) As a matter of fact, for our purposes, $Q_\Phi$ need not be a positive quadratic differential functional. It is sufficient that it is $\mathfrak{B}_c$-positive, where $\mathfrak{B}_c$ denotes, as usual, the controllable part of $\mathfrak{B}$. Indeed, since the initial conditions will uniquely determine the autonomous part of the optimal trajectories, we only need to distinguish between trajectories that differ just in the controllable component, and this is possible by simply weighting all controllable trajectories.

2) Let $w^*(b)$ be any optimal solution corresponding to the initial condition $b$ and set $\mathfrak{B}^*:=\{w^*(b),\ b\in\mathbb{R}^n\}$. Due to the nature of our problem and the choice of our constraint, the real optimization problem just concerns the future part of the behavior trajectory, namely the portion of $w^*(b)$ corresponding to nonnegative time instants. Indeed, the past time evolution (corresponding to negative time instants) is supposed to be known (or, at least, the influence it has on the future evolution is known) and it cannot be changed. From this point of view, it is reasonable to consider just the restriction of the optimal behavior trajectories to $\mathbb{R}_+$:
$$\mathfrak{B}^{*+}:=\mathfrak{B}^*|_{[0,+\infty)}.$$
According to the previous remark, such trajectories will belong to $C^\infty((0,+\infty),\mathbb{R}^q)$.

4. Problem solution.

We aim at deriving a description of the set $\mathfrak{B}^{*+}$ of all solutions (restricted to $\mathbb{R}_+$) of the optimization problem, as the initial condition varies over the set of all possible $n$-tuples of real numbers. A trajectory $w^*\in\mathfrak{B}$ is optimal corresponding to some initial condition $b$ if and only if it converges to zero and, for every $w\in\mathfrak{B}$ satisfying (3.3) (for the same value of $b$) and converging to zero, one has
$$(4.1)\qquad \int_0^{+\infty}Q_\Phi(w)(t)\,dt\ \ge\ \int_0^{+\infty}Q_\Phi(w^*)(t)\,dt.$$
Since $\left[S\!\left(\frac{d}{dt}\right)w\right](0)=b=\left[S\!\left(\frac{d}{dt}\right)w^*\right](0)$ is equivalent, by linearity and Proposition 2.4, to the condition $w-w^*\in\mathfrak{B}_c^-\oplus\mathfrak{B}_c^+$, and since the cost function just pertains to the future portion of the trajectories, it follows that (4.1) can be equivalently rewritten as follows:
$$(4.2)\qquad \int_0^{+\infty}Q_\Phi(w^*+w_c^+)(t)\,dt\ \ge\ \int_0^{+\infty}Q_\Phi(w^*)(t)\,dt,$$
where $w_c^+$ belongs to $\mathfrak{B}_c^+$, the set of controllable trajectories of $\mathfrak{B}$ having support in $\mathbb{R}_+$, and converges to zero, being the difference of two trajectories both converging to zero. Moreover,
$$(4.3)\qquad \left[S\!\left(\frac{d}{dt}\right)w_c^+\right](0)=0.$$
By the linearity of the operators involved, equation (4.2) holds true if and only if (see [14, 15])
$$\int_0^{+\infty}B_\Phi(w_c^+,w^*)(t)\,dt=0\quad \forall\,w_c^+\in\mathfrak{B}_c^+, \qquad\text{and}\qquad \int_0^{+\infty}Q_\Phi(w_c^+)(t)\,dt\ge 0,$$
where $B_\Phi$ is the bilinear differential functional associated with $\Phi$.

The second condition is obviously satisfied for every choice of $w_c^+\in\mathfrak{B}_c^+$, as $Q_\Phi$ is a positive ($\mathfrak{B}_c$-positive) quadratic differential form. So, the above conditions reduce to the following one:
$$(4.4)\qquad \int_0^{+\infty}B_\Phi\!\left(M\!\left(\tfrac{d}{dt}\right)v,\,w^*\right)(t)\,dt=0\qquad \forall\,v \text{ s.t. } M\!\left(\tfrac{d}{dt}\right)v\in\mathfrak{B}_c^+.$$
By Proposition 2.4, $M\!\left(\frac{d}{dt}\right)v\in\mathfrak{B}_c^+$ if and only if $v(t)=0$, $t<0$, and
$$0=\left[S\!\left(\tfrac{d}{dt}\right)M\!\left(\tfrac{d}{dt}\right)v\right](0).$$
Moreover, keeping in mind that both $w^*$ and $v$ go to zero as $t$ goes to $+\infty$, condition (4.4) is satisfied if and only if (see Lemma A.1 in the Appendix)
$$(4.5)\qquad M^{\mathsf T}\!\left(-\tfrac{d}{dt}\right)\Phi\!\left(-\tfrac{d}{dt},\tfrac{d}{dt}\right)w^*=0,\qquad t>0,$$
$$(4.6)\qquad \sum_{h,k}\sum_i\sum_{j=0}^{i+h-1}(-1)^j\left(\frac{d^{i+h-1-j}v}{dt^{i+h-1-j}}(0)\right)^{\!\mathsf T}M_i^{\mathsf T}\Phi_{h,k}\,\frac{d^{k+j}w^*}{dt^{k+j}}(0)=0.$$
We first analyze equation (4.5). Such an equation can be equivalently rewritten as
$$w^*\in\ker^+\!\left(M^{\mathsf T}\!\left(-\tfrac{d}{dt}\right)\Phi\!\left(-\tfrac{d}{dt},\tfrac{d}{dt}\right)\right),\qquad\text{where}\qquad \ker^+\!\left(A\!\left(\tfrac{d}{dt}\right)\right):=\left\{w\,:\,A\!\left(\tfrac{d}{dt}\right)w(t)=0,\ t\ge 0\right\}.$$
So, putting together the initial kernel description of the behavior and this additional differential equation, we obtain that any optimal trajectory $w^*$ satisfies
$$w^*\in\ker^+ R^*\!\left(\tfrac{d}{dt}\right),\qquad\text{where}\qquad R^*(\xi):=\begin{bmatrix} R(\xi)\\ M^{\mathsf T}(-\xi)\Phi(-\xi,\xi)\end{bmatrix}\in\mathbb{R}^{q\times q}[\xi].$$
Under the positivity (or $\mathfrak{B}_c$-positivity) assumption on the quadratic differential form $Q_\Phi$, $R^*(\xi)$ proves to be nonsingular, and hence $\ker^+ R^*\!\left(\frac{d}{dt}\right)$ represents an autonomous behavior. Finally, since we are just interested in asymptotically stable solutions, we can further specify the optimal behavior description by factorizing $R^*(\xi)$ as follows:
$$R^*(\xi)=R^*_{ns}(\xi)\,R^*_s(\xi),$$
where $R^*_{ns}$ and $R^*_s$ are nonsingular square matrices, $\det R^*_s$ is Hurwitz, and all zeros of $\det R^*_{ns}$ have nonnegative real part. So, the asymptotically stable autonomous behavior $\ker^+ R^*_s\!\left(\frac{d}{dt}\right)$ is an autonomous behavior including all the optimal trajectories.

As far as condition (4.6) is concerned, we have to look at it as a transversality condition (see [4, 10]) which has to be put together with (4.3) or, equivalently, with the constraint on $v$
$$(4.7)\qquad 0=\left[S\!\left(\tfrac{d}{dt}\right)M\!\left(\tfrac{d}{dt}\right)v\right](0).$$

Indeed, as we have seen at the end of Section 2, the previous equation introduces some constraints on $v$ and its derivatives at $0$. Once such constraints are introduced in (4.6), we end up with the family of conditions the optimal trajectories have to satisfy at $0$. More specifically, denote by $V$ a full column rank matrix, with real coefficients, such that (see Section 2) condition (4.7) holds if and only if
$$(4.8)\qquad \begin{bmatrix} v(0)\\ \frac{dv}{dt}(0)\\ \vdots\\ \frac{d^{m-1}v}{dt^{m-1}}(0)\end{bmatrix}=Vc,$$
for some real vector $c$ ($m$ denoting the degree of $M(\xi)$). If $H:=\deg_\zeta\Phi=\deg_\eta\Phi$, the transversality condition may be rewritten as
$$(4.9)\qquad 0=\begin{bmatrix} v^{\mathsf T}(0) & \frac{dv^{\mathsf T}}{dt}(0) & \dots & \frac{d^{m+H-1}v^{\mathsf T}}{dt^{m+H-1}}(0)\end{bmatrix}\,\Phi_M\begin{bmatrix} w^*(0)\\ \frac{dw^*}{dt}(0)\\ \vdots\\ \frac{d^{m+2H-1}w^*}{dt^{m+2H-1}}(0)\end{bmatrix},$$
for some suitable matrix $\Phi_M$ whose coefficients are obtained by suitably combining the matrix coefficients of $\Phi$ and $M$. Keeping in mind (4.8), (4.9) can be rewritten as follows:
$$0=\begin{bmatrix} c^{\mathsf T} & \frac{d^{m}v^{\mathsf T}}{dt^{m}}(0) & \dots & \frac{d^{m+H-1}v^{\mathsf T}}{dt^{m+H-1}}(0)\end{bmatrix}\begin{bmatrix} V^{\mathsf T} & 0\\ 0 & I_{H(q-p)}\end{bmatrix}\Phi_M\begin{bmatrix} w^*(0)\\ \frac{dw^*}{dt}(0)\\ \vdots\\ \frac{d^{m+2H-1}w^*}{dt^{m+2H-1}}(0)\end{bmatrix},$$
which, by the arbitrariness of the row vector on the left-hand side, means
$$(4.10)\qquad \begin{bmatrix} V^{\mathsf T} & 0\\ 0 & I_{H(q-p)}\end{bmatrix}\Phi_M\begin{bmatrix} w^*(0)\\ \frac{dw^*}{dt}(0)\\ \vdots\\ \frac{d^{m+2H-1}w^*}{dt^{m+2H-1}}(0)\end{bmatrix}=0.$$
These represent the initial constraints on the behavior trajectories and hence the set of optimal trajectories is just the set of trajectories in $\ker^+ R^*_s\!\left(\frac{d}{dt}\right)$ that satisfy the set of initial conditions (4.10).

Notice that this result is quite different from the analogous ones in [14, 15], as we are considering only the restriction of the trajectories to the half-line $\mathbb{R}_+$. This requires taking into account also the initial conditions at $t=0$, a problem that did not arise in [14, 15]. Moreover, in order to get this result, it has been fundamental to assume that the behavior trajectories we are considering converge to zero together with their derivatives of any order. If not, there would also have been a contribution at infinity in the integral expansion.

Notice that the matrix $S$ does not play any role in the expression of the optimal behavior, provided that it represents a minimal state map for the behavior $\mathfrak{B}$. Indeed, $\mathfrak{B}^*$ only depends on the differential operator $\Phi$ involved in the cost index and on the polynomial matrices $R$ and $M$ providing a kernel and an image description, respectively, of $\mathfrak{B}$ and of its controllable part $\mathfrak{B}_c$. However, $S$ has a role in determining the constraints the trajectory $w^*$ and its derivatives have to satisfy at $t=0$.

5. Comparisons with the standard LQ control.

One may wonder in which sense the previous problem statement provides a generalization of the standard LQ problem for state-space models, i.e.,
$$\min_{x,u}\ \int_0^{+\infty}\left[x^{\mathsf T}(t)Qx(t)+u^{\mathsf T}(t)Ru(t)\right]dt\qquad \text{s.t. } \dot x(t)=Fx(t)+Gu(t),\ t\in\mathbb{R}_+,\quad x(0)=b,$$
where $x$ represents the state, $u$ the input, and $Q$ and $R$ are symmetric positive semidefinite matrices ($R$, in particular, is assumed to be positive definite). Indeed, if we define as system variable $w:=\begin{bmatrix}x\\u\end{bmatrix}$, then the quadratic index $J$ may be rewritten as
$$J=\int_0^{+\infty}\left[x^{\mathsf T}(t)Qx(t)+u^{\mathsf T}(t)Ru(t)\right]dt=\int_0^{+\infty}w^{\mathsf T}(t)\begin{bmatrix}Q & 0\\ 0 & R\end{bmatrix}w(t)\,dt.$$
Thus $J$ can be represented as in (3.1) upon setting $Q_\Phi(w):=w^{\mathsf T}\begin{bmatrix}Q & 0\\ 0 & R\end{bmatrix}w$ (which corresponds to $\Phi(\zeta,\eta)=\begin{bmatrix}Q & 0\\ 0 & R\end{bmatrix}$). On the other hand, the state-space equation can also be formalized as follows:
$$\begin{bmatrix}\frac{d}{dt}I_n-F & -G\end{bmatrix}\begin{bmatrix}x\\u\end{bmatrix}=0,$$
and hence (3.2) holds for $R(\xi):=\begin{bmatrix}\xi I_n-F & -G\end{bmatrix}$. So, we only remain to show that, once we assume $S(\xi)=\begin{bmatrix}I_n & 0\end{bmatrix}$, we get a minimal state map for the behavior $\mathfrak{B}=\ker R\!\left(\frac{d}{dt}\right)$. This immediately follows from the fact that $R(\xi)$ is row reduced and $S(\xi)=R_\Xi(\xi)$.
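As a small illustration of this reformulation (added here; the numerical data are made up and not taken from the paper), the sketch below packs a state-space LQ problem into the behavioral data $R(\xi)=[\xi I_n-F\ \ -G]$ and $\Phi=\mathrm{diag}(Q,R)$, and confirms that the leading coefficient of $R(\xi)$, i.e. $\sigma_+(R)=[I_n\ \ 0]$, is the state map recovering $x$ from $w=[x;u]$.

```python
import numpy as np
from scipy.linalg import block_diag

# Hypothetical state-space LQ data (2 states, 1 input) -- illustration only.
F = np.array([[0.0, 1.0],
              [-2.0, -3.0]])
G = np.array([[0.0],
              [1.0]])
Q = np.eye(2)
Rw = np.array([[1.0]])
n, m = F.shape[0], G.shape[1]

# Behavioral data: w = [x; u], R(xi) = R0 + R1*xi with R0 = [-F, -G], R1 = [I, 0],
# and a constant QDF Phi(zeta, eta) = blockdiag(Q, Rw).
R0 = np.hstack([-F, -G])
R1 = np.hstack([np.eye(n), np.zeros((n, m))])
Phi = block_diag(Q, Rw)

# R(xi) has degree 1 and its leading coefficient R1 has full row rank (row reduced),
# so sigma_+(R) = R1 = [I_n, 0] and S(d/dt) w = x is a minimal state map.
print(R1)
```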

Notice that the asymptotic stability requirement on the optimal trajectories, and hence the autonomy of the resulting optimal system, is perfectly analogous to the results obtained within the state-space setting, where the optimal behavior takes the following form:
$$\begin{bmatrix}\frac{d}{dt}I_n-(F+GK) & 0\\ -K & I_m\end{bmatrix}\begin{bmatrix}x\\u\end{bmatrix}=0,$$
and the (stabilizing) gain matrix $K$ (obtained starting from the appropriate Algebraic Riccati Equation, or ARE) makes $F+GK$ asymptotically stable. Thus the set of optimal state and input trajectories represents an asymptotically stable autonomous behavior. Even more, for $t\ge 0$, both $x^*(t)$ and $u^*(t)$ take an exponential form and hence are $C^\infty((0,+\infty),\mathbb{R}^{n+m})$.

In order to compare the problem solution provided in the previous section with the standard solution available for state-space models, it seems convenient to start by analyzing an example.

Example 1. Consider the scalar (controllable) state-space model
$$\dot x(t)=Fx(t)+Gu(t)=-x(t)+u(t),$$
with initial condition $x(0)=b$ and quadratic cost function
$$\int_0^{+\infty}\left[x^2(t)+u^2(t)\right]dt$$
(i.e., $Q=R=1$). In behavioral terms, the problem is restated as in (3.1), (3.2), (3.3), with
$$w=\begin{bmatrix}w_1\\w_2\end{bmatrix}=\begin{bmatrix}x\\u\end{bmatrix},\qquad \Phi(\zeta,\eta)=I_2,\qquad R(\xi)=\begin{bmatrix}1+\xi & -1\end{bmatrix},\qquad S(\xi)=\begin{bmatrix}1 & 0\end{bmatrix}.$$
Notice that $\ker R\!\left(\frac{d}{dt}\right)$ is controllable also in behavioral terms. Correspondingly, a minimal right annihilator (MRA) of $R(\xi)$ is
$$M(\xi)=\begin{bmatrix}1\\ 1+\xi\end{bmatrix}.$$
By applying the previous procedure, we obtain the polynomial matrix operator
$$R^*(\xi)=\begin{bmatrix}R(\xi)\\ M^{\mathsf T}(-\xi)\Phi(-\xi,\xi)\end{bmatrix}=\begin{bmatrix}R(\xi)\\ M^{\mathsf T}(-\xi)\,I_2\end{bmatrix}=\begin{bmatrix}1+\xi & -1\\ 1 & 1-\xi\end{bmatrix}.$$
The set of asymptotically stable trajectories in $\ker^+ R^*\!\left(\frac{d}{dt}\right)$ is
$$\ker^+\begin{bmatrix}\frac{d}{dt}+\sqrt2 & 0\\ 1+\frac{d}{dt} & -1\end{bmatrix}=\ker^+\begin{bmatrix}\frac{d}{dt}+\sqrt2 & 0\\ \sqrt2-1 & 1\end{bmatrix},$$
meanwhile the (only) transversality condition is
$$v(0)\,u^*(0)=0.$$

Keeping in mind the initial constraint (4.7), which, for this example, amounts to
$$0=\begin{bmatrix}1 & 0\end{bmatrix}\begin{bmatrix}v\\ v+\frac{dv}{dt}\end{bmatrix}(0)=v(0),$$
it follows that the above transversality condition is ineffective and hence the optimal behavior $\mathfrak{B}^{*+}$ is
$$\mathfrak{B}^{*+}:=\ker^+\begin{bmatrix}\frac{d}{dt}+\sqrt2 & 0\\ \sqrt2-1 & 1\end{bmatrix},$$
namely the set of solutions of the system of differential equations
$$(5.1)\qquad \dot x(t)=-\sqrt2\,x(t),$$
$$(5.2)\qquad u(t)=-(\sqrt2-1)\,x(t).$$
Corresponding to the initial condition $b$, we get the asymptotically convergent trajectories
$$(5.3)\qquad x^*(t)=b\,e^{-\sqrt2\,t},\qquad u^*(t)=-(\sqrt2-1)\,b\,e^{-\sqrt2\,t},\qquad t\in\mathbb{R}_+,$$
which lead to the optimum value of the cost index $J$:
$$J=\int_0^{+\infty}\left[b^2e^{-2\sqrt2\,t}+(\sqrt2-1)^2b^2e^{-2\sqrt2\,t}\right]dt=b^2(\sqrt2-1).$$
A comparison with the classical state-space approach is almost immediate. In that case, we would have considered the ARE in the unknown (scalar) matrix $P$:
$$0=Q+F^{\mathsf T}P+PF-PGR^{-1}G^{\mathsf T}P=1-2P-P^2.$$
Such an equation admits the two solutions $P=-1\pm\sqrt2$. Corresponding to the unique positive definite (and stabilizing) solution $P=\sqrt2-1$, we get the feedback gain matrix
$$K=-R^{-1}G^{\mathsf T}P=-P=-(\sqrt2-1).$$
The optimizing feedback control law $u(t)=Kx(t)$ leads to the same optimal autonomous state-space model described in (5.1)-(5.2):
$$\dot x(t)=(F+GK)x(t)=-\sqrt2\,x(t).$$
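Both routes of Example 1 are easy to reproduce numerically. The sketch below, added here as a cross-check and not part of the original paper, rebuilds $R^*(\xi)$ symbolically (its determinant $2-\xi^2$ has the stable zero $-\sqrt2$) and solves the same scalar ARE with scipy, recovering $P=\sqrt2-1$, $K=-(\sqrt2-1)$ and the closed-loop pole $-\sqrt2$.

```python
import numpy as np
import sympy as sp
from scipy.linalg import solve_continuous_are

xi = sp.symbols('xi')

# Behavioral route: R(xi) = [1+xi, -1], MRA M(xi) = [1; 1+xi], Phi = I_2.
R = sp.Matrix([[1 + xi, -1]])
M = sp.Matrix([[1], [1 + xi]])
R_star = sp.Matrix.vstack(R, M.subs(xi, -xi).T)    # second row: M(-xi)^T * I_2
print(sp.expand(R_star.det()))                     # 2 - xi**2: stable zero at -sqrt(2)

# Classical route: ARE for xdot = -x + u with Q = R = 1.
F, G = np.array([[-1.0]]), np.array([[1.0]])
Q, Rw = np.array([[1.0]]), np.array([[1.0]])
P = solve_continuous_are(F, G, Q, Rw)
K = -np.linalg.solve(Rw, G.T @ P)
print(P, K, F + G @ K)    # P = sqrt(2)-1, K = -(sqrt(2)-1), F+GK = -sqrt(2)
```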

Even though the rationale behind the two approaches is the same (in a sentence: variational calculus), and hence the final solution, in terms of optimal trajectories and cost function, is again the same, the details are quite different. The standard approach is based on a nonlinear algebraic equation: by resorting to the optimal (stabilizing) solution of the ARE (if any), a feedback gain matrix $K$ is obtained such that the resulting closed-loop system is asymptotically stable and autonomous, and generates (corresponding to any assigned initial condition) the optimal state and input trajectories. The behavioral approach described here, when applied to a state-space model and a constant quadratic cost function, leads to a family of differential equations which define an autonomous behavior the optimal system trajectories must belong to. Indeed, for state-space models with quadratic cost function $\Phi(\zeta,\eta)=\begin{bmatrix}Q & 0\\ 0 & R\end{bmatrix}$, such an autonomous behavior is
$$\ker^+ R^*\!\left(\frac{d}{dt}\right)=\ker^+\begin{bmatrix}\frac{d}{dt}I_n-F & -G\\ M_x^{\mathsf T}\!\left(-\frac{d}{dt}\right)Q & M_u^{\mathsf T}\!\left(-\frac{d}{dt}\right)R\end{bmatrix},$$
where $\begin{bmatrix}M_x(\xi)\\ M_u(\xi)\end{bmatrix}$ is an MRA of the system matrix $R(\xi)=\begin{bmatrix}\xi I_n-F & -G\end{bmatrix}$ (equivalently, $M_x(\xi)M_u^{-1}(\xi)=(\xi I_n-F)^{-1}G$) and $\det M_u$ equals $\det(\xi I_n-F)$ up to a nonzero constant [3], thus ensuring, in particular, that $M_u^{\mathsf T}(-\xi)R$ is nonsingular square. If we assume, without loss of generality, that $G$ is of full column rank (this is always possible, provided that we get rid of the largest family of inputs which are linearly dependent on the remaining ones), then $\begin{bmatrix}-G\\ M_u^{\mathsf T}(-\xi)R\end{bmatrix}$ is a right prime matrix and hence there exists some unimodular matrix $U(\xi)$ such that
$$U(\xi)R^*(\xi)=\begin{bmatrix}D(\xi) & 0\\ N(\xi) & I_m\end{bmatrix},$$
for suitable polynomial matrices $N(\xi)$ and $D(\xi)$, with $D(\xi)$ nonsingular square. This ensures that the optimal state and input trajectories satisfy
$$D\!\left(\frac{d}{dt}\right)x^*(t)=0,\qquad N\!\left(\frac{d}{dt}\right)x^*(t)=-u^*(t).$$
Factorize, now, $D(\xi)$ as $D(\xi)=D_{ns}(\xi)D_s(\xi)$, where $D_{ns}$ and $D_s$ are nonsingular square matrices, $\det D_s$ is Hurwitz, and all zeros of $\det D_{ns}$ have nonnegative real part. Since it can be shown [6] that, when dealing with state-space models and a constant quadratic cost function, the transversality conditions (4.10) play no role, as the vector $v$ and all its derivatives are zero at $t=0$, the optimal behavior just coincides with the set of asymptotically stable trajectories satisfying the previous equations, namely with the autonomous behavior described by
$$D_s\!\left(\frac{d}{dt}\right)x^*(t)=0,\qquad N\!\left(\frac{d}{dt}\right)x^*(t)=-u^*(t).$$
Of course, in the general case, $N(\xi)$ is not a constant matrix. So, apparently, the optimal input trajectory is not explicitly obtained as a (static) function of the corresponding state evolution. However, we have to keep in mind that $x^*(t)\in\ker^+ D_s\!\left(\frac{d}{dt}\right)$ can be expressed as a vector coefficient combination of exponential modes $t^k e^{\lambda t}$, for suitable $\lambda$ (the characteristic zeros of $\det D_s$). When applying the polynomial matrix operator $N(d/dt)$ to $x^*$, we obtain a vector coefficient combination of the same exponential modes, which therefore allows one to express the optimal input as $u^*(t)=Kx^*(t)$. So, to conclude, for the classical LQ problem, the behavioral approach presented in the paper leads to the same closed-loop solution obtained through the standard technique, passing through polynomial matrix calculations instead of the solution of an ARE.

Even confining our attention to state-space models, however, it is when dealing with arbitrary positive quadratic differential functionals $Q_\Phi$ that the present approach proves most effective. Indeed, in this situation the transversality conditions play an active role and, consequently, the set of optimal trajectories may be properly included in the (asymptotically stable autonomous) behavior obtained by selecting the asymptotically stable trajectories in $\ker^+ R^*\!\left(\frac{d}{dt}\right)$. This situation is only marginally touched upon in this contribution, by means of the following example. More detailed results will be provided in [6].

Example 2. Consider the scalar (controllable) state-space model
$$\dot x(t)=Fx(t)+Gu(t)=-x(t)+u(t),$$
with initial condition $x(0)=b$ and cost function
$$\int_0^{+\infty}\left(\frac{du(t)}{dt}-x(t)\right)^{\!2}dt.$$
In behavioral terms, the problem is restated as in (3.1), (3.2), (3.3), with
$$w=\begin{bmatrix}w_1\\w_2\end{bmatrix}=\begin{bmatrix}x\\u\end{bmatrix},\qquad
\Phi(\zeta,\eta)=\begin{bmatrix}1 & -\eta\\ -\zeta & \zeta\eta\end{bmatrix}=\begin{bmatrix}1\\ -\zeta\end{bmatrix}\begin{bmatrix}1 & -\eta\end{bmatrix},\qquad
R(\xi)=\begin{bmatrix}1+\xi & -1\end{bmatrix},\qquad S(\xi)=\begin{bmatrix}1 & 0\end{bmatrix}.$$
By applying the same reasoning as in Example 1, we obtain the polynomial matrix operator
$$R^*(\xi)=\begin{bmatrix}R(\xi)\\ M^{\mathsf T}(-\xi)\Phi(-\xi,\xi)\end{bmatrix}=\begin{bmatrix}1+\xi & -1\\ 1+\xi-\xi^2 & -\xi(1+\xi-\xi^2)\end{bmatrix}.$$
By applying suitable elementary row operations on $R^*$, we get the following polynomial matrix:
$$R^{*\prime}(\xi)=\begin{bmatrix}(1+\xi-\xi^2)(1-\xi-\xi^2) & 0\\ 1+\xi & -1\end{bmatrix}=\begin{bmatrix}(\xi^2-\sqrt5\,\xi+1)(\xi^2+\sqrt5\,\xi+1) & 0\\ 1+\xi & -1\end{bmatrix}.$$
So the set of (asymptotically stable) optimal trajectories is included in
$$(5.4)\qquad \ker^+\begin{bmatrix}\left(\frac{d}{dt}\right)^{2}+\sqrt5\,\frac{d}{dt}+1 & 0\\ 1+\frac{d}{dt} & -1\end{bmatrix}.$$
The transversality conditions are
$$v(0)\left[x^*(0)-\frac{dx^*}{dt}(0)-\frac{du^*}{dt}(0)+\frac{d^2u^*}{dt^2}(0)\right]=0,\qquad
\frac{dv}{dt}(0)\left[x^*(0)-\frac{du^*}{dt}(0)\right]=0.$$
Keeping in mind that the initial constraint ensures $0=v(0)$, it follows that the first transversality condition is ineffective. On the other hand, as the state constraint does not constrain $\frac{dv}{dt}(0)$, which is completely free, it follows that the optimal trajectories must satisfy the additional condition
$$(5.5)\qquad x^*(0)-\frac{du^*}{dt}(0)=0.$$
So, the optimal behavior $\mathfrak{B}^{*+}$ is the set of trajectories (in $\mathbb{R}_+$) belonging to (5.4) which satisfy the additional constraint (5.5). It is not hard to verify that condition (5.5) implies that the optimal state trajectories must belong to the autonomous behavior $\ker^+\!\left(\frac{d}{dt}+\frac{1+\sqrt5}{2}\right)$ and hence the set of optimal trajectories is given by
$$\mathfrak{B}^{*+}=\ker^+\begin{bmatrix}\frac{d}{dt}+\frac{1+\sqrt5}{2} & 0\\ 1+\frac{d}{dt} & -1\end{bmatrix}.$$
As a consequence, for every choice of $\begin{bmatrix}x^*\\u^*\end{bmatrix}$ in $\mathfrak{B}^{*+}$ one gets not only (5.5), but also
$$x^*(t)-\frac{du^*(t)}{dt}=0,\qquad t\in\mathbb{R}_+.$$
This ensures that, for every initial condition $x(0)=b$, $J=0$. This is not surprising, however, as the cost function can also be expressed as
$$J=\int_0^{+\infty}\left|R_2\!\left(\frac{d}{dt}\right)\begin{bmatrix}x\\u\end{bmatrix}\right|^2 dt,\qquad \text{with } R_2(\xi)=\begin{bmatrix}1 & -\xi\end{bmatrix},$$
and it is easily seen that the set of asymptotically stable trajectories in
$$\ker^+ R\!\left(\frac{d}{dt}\right)\cap\ker^+ R_2\!\left(\frac{d}{dt}\right)=\ker^+\begin{bmatrix}1+\frac{d}{dt} & -1\\ 1 & -\frac{d}{dt}\end{bmatrix}$$
coincides with $\mathfrak{B}^{*+}$.
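The polynomial computations of Example 2 can be double-checked symbolically. The sketch below, an added verification and not part of the paper, rebuilds $R^*(\xi)$ from $R$, $M$ and $\Phi$, confirms that its determinant equals $(\xi^2-\sqrt5\,\xi+1)(\xi^2+\sqrt5\,\xi+1)$, and verifies that along $x^*(t)=e^{-\frac{1+\sqrt5}{2}t}$, $u^*=x^*+\dot x^*$ one indeed has $x^*-\dot u^*=0$, so that the cost integrand (and hence $J$) vanishes.

```python
import sympy as sp

xi, t = sp.symbols('xi t')

# Data of Example 2: R(xi) = [1+xi, -1], MRA M(xi) = [1; 1+xi],
# Phi(zeta, eta) = [1; -zeta][1, -eta]  (cost (x - du/dt)^2).
R = sp.Matrix([[1 + xi, -1]])
M = sp.Matrix([[1], [1 + xi]])
Phi_row = M.subs(xi, -xi).T * (sp.Matrix([[1], [xi]]) * sp.Matrix([[1, -xi]]))  # M(-xi)^T Phi(-xi, xi)
R_star = sp.Matrix.vstack(R, Phi_row)

det = sp.expand(R_star.det())
target = sp.expand((xi**2 - sp.sqrt(5) * xi + 1) * (xi**2 + sp.sqrt(5) * xi + 1))
print(det, sp.simplify(det - target) == 0)      # xi**4 - 3*xi**2 + 1, True

# Optimal trajectory: x* in ker(d/dt + (1+sqrt(5))/2), u* = x* + dx*/dt.
lam = -(1 + sp.sqrt(5)) / 2
x = sp.exp(lam * t)
u = x + sp.diff(x, t)
print(sp.simplify(x - sp.diff(u, t)))           # 0, hence the cost integrand vanishes
```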

Acknowledgment. We are deeply indebted to Jan Willems for fruitful discussions and for the precious copy of an old draft [15] about the LQ problem in the behavioral setting that he started writing but, unfortunately for the Control community, never completed.

Appendix A. A technical result about optimization.

Lemma A.1 Let $\Phi(\zeta,\eta)=\sum_{h,k}\Phi_{h,k}\zeta^h\eta^k$ be in $\mathbb{R}_s^{q\times q}[\zeta,\eta]$ and let $B_\Phi$ denote the corresponding bilinear form. Let $M=\sum_i M_i\xi^i\in\mathbb{R}^{q\times(q-p)}[\xi]$ be a polynomial matrix. Given two trajectories $w^*$ and $v$, both converging to zero as $t$ goes to $+\infty$, the condition
$$(\mathrm{A.1})\qquad \int_0^{+\infty}B_\Phi\!\left(M\!\left(\tfrac{d}{dt}\right)v,\,w^*\right)(t)\,dt=0,\qquad \forall\,v \text{ s.t. } M\!\left(\tfrac{d}{dt}\right)v\in\mathfrak{B}_c^+,$$
holds if and only if
$$(\mathrm{A.2})\qquad M^{\mathsf T}\!\left(-\tfrac{d}{dt}\right)\Phi\!\left(-\tfrac{d}{dt},\tfrac{d}{dt}\right)w^*=0,\qquad t>0,$$
$$(\mathrm{A.3})\qquad \sum_{h,k}\sum_i\sum_{j=0}^{i+h-1}(-1)^j\left(\frac{d^{i+h-1-j}v}{dt^{i+h-1-j}}(0)\right)^{\!\mathsf T}M_i^{\mathsf T}\Phi_{h,k}\,\frac{d^{k+j}w^*}{dt^{k+j}}(0)=0.$$

Proof. Notice, first, that
$$B_\Phi\!\left(M\!\left(\tfrac{d}{dt}\right)v,\,w^*\right)(t)=\sum_{h,k}\sum_i\left(\frac{d^{i+h}v}{dt^{i+h}}\right)^{\!\mathsf T}M_i^{\mathsf T}\Phi_{h,k}\,\frac{d^{k}w^*}{dt^{k}}.$$
Moreover, for any $h,k$ and $i$, by exploiting the integration by parts formula and the asymptotic convergence assumption, one gets
$$\int_0^{+\infty}\left(\frac{d^{i+h}v}{dt^{i+h}}\right)^{\!\mathsf T}M_i^{\mathsf T}\Phi_{h,k}\,\frac{d^{k}w^*}{dt^{k}}\,dt
=(-1)^{i+h}\int_0^{+\infty}v^{\mathsf T}M_i^{\mathsf T}\Phi_{h,k}\,\frac{d^{h+k+i}w^*}{dt^{h+k+i}}\,dt
-\sum_{j=0}^{i+h-1}(-1)^j\left(\frac{d^{i+h-1-j}v}{dt^{i+h-1-j}}(0)\right)^{\!\mathsf T}M_i^{\mathsf T}\Phi_{h,k}\,\frac{d^{k+j}w^*}{dt^{k+j}}(0).$$
So, putting all terms together, one gets
$$\int_0^{+\infty}B_\Phi\!\left(M\!\left(\tfrac{d}{dt}\right)v,\,w^*\right)(t)\,dt
=\int_0^{+\infty}\sum_{h,k}\sum_i(-1)^{i+h}\,v^{\mathsf T}M_i^{\mathsf T}\Phi_{h,k}\,\frac{d^{h+k+i}w^*}{dt^{h+k+i}}\,dt
-\sum_{h,k}\sum_i\sum_{j=0}^{i+h-1}(-1)^j\left(\frac{d^{i+h-1-j}v}{dt^{i+h-1-j}}(0)\right)^{\!\mathsf T}M_i^{\mathsf T}\Phi_{h,k}\,\frac{d^{k+j}w^*}{dt^{k+j}}(0).$$
The right-hand side of the previous equation consists of two parts. The first one can be rewritten as follows:
$$\int_0^{+\infty}\sum_{h,k}\sum_i(-1)^{i+h}\,v^{\mathsf T}M_i^{\mathsf T}\Phi_{h,k}\,\frac{d^{h+k+i}w^*}{dt^{h+k+i}}\,dt
=\int_0^{+\infty}v^{\mathsf T}\sum_{h,k}(-1)^{h}M^{\mathsf T}\!\left(-\tfrac{d}{dt}\right)\Phi_{h,k}\,\frac{d^{h+k}w^*}{dt^{h+k}}\,dt
=\int_0^{+\infty}v^{\mathsf T}M^{\mathsf T}\!\left(-\tfrac{d}{dt}\right)\Phi\!\left(-\tfrac{d}{dt},\tfrac{d}{dt}\right)w^*\,dt.$$
So, we get
$$\int_0^{+\infty}B_\Phi\!\left(M\!\left(\tfrac{d}{dt}\right)v,\,w^*\right)(t)\,dt
=\int_0^{+\infty}v^{\mathsf T}M^{\mathsf T}\!\left(-\tfrac{d}{dt}\right)\Phi\!\left(-\tfrac{d}{dt},\tfrac{d}{dt}\right)w^*\,dt
-\sum_{h,k}\sum_i\sum_{j=0}^{i+h-1}(-1)^j\left(\frac{d^{i+h-1-j}v}{dt^{i+h-1-j}}(0)\right)^{\!\mathsf T}M_i^{\mathsf T}\Phi_{h,k}\,\frac{d^{k+j}w^*}{dt^{k+j}}(0).$$
Of course, by the arbitrariness of $v$, the above quantity can always be zero if and only if (A.2) and (A.3) hold true.
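The repeated integration-by-parts identity used in the proof can be spot-checked symbolically. The following sketch, an added illustration with arbitrary decaying test signals, verifies the scalar version $\int_0^{+\infty}v^{(n)}w\,dt=(-1)^n\int_0^{+\infty}v\,w^{(n)}\,dt-\sum_{j=0}^{n-1}(-1)^j v^{(n-1-j)}(0)\,w^{(j)}(0)$ for $n=3$.

```python
import sympy as sp

t = sp.symbols('t', nonnegative=True)
n = 3
v = sp.exp(-t) * sp.sin(t)        # arbitrary test signals vanishing (with all
w = sp.exp(-2 * t) * (1 + t)      # derivatives) as t -> +infinity

lhs = sp.integrate(sp.diff(v, t, n) * w, (t, 0, sp.oo))
rhs = (-1) ** n * sp.integrate(v * sp.diff(w, t, n), (t, 0, sp.oo)) \
      - sum((-1) ** j * sp.diff(v, t, n - 1 - j).subs(t, 0) * sp.diff(w, t, j).subs(t, 0)
            for j in range(n))
print(sp.simplify(lhs - rhs))     # 0
```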

REFERENCES

[1] M. Belur. Control in a behavioral context. PhD thesis, University of Groningen, 2003.
[2] A. Ferrante and S. Zampieri. Linear quadratic optimization for systems in the behavioral approach. SIAM Journal on Control and Optimization, 39(1):159-178, 2000.
[3] G.D. Forney. Minimal bases of rational vector spaces, with applications to multivariable linear systems. SIAM Journal on Control, 13:493-521, 1975.
[4] I.M. Gelfand and S.V. Fomin. Calculus of Variations. Prentice-Hall, Englewood Cliffs, 1963.
[5] J.W. Nieuwenhuis. Another look at linear-quadratic optimization problems over time. Systems & Control Letters, 25:89-97, 1995.
[6] G. Parlangeli and M.E. Valcher. The linear quadratic optimal control problem in the behavioral setting. In preparation, 2004.
[7] J.W. Polderman and J.C. Willems. Introduction to Mathematical Systems Theory: A Behavioral Approach. Springer-Verlag, 1998.
[8] C. Praagman. Trim state representations for linear discrete-time behaviors. Submitted, 1998.
[9] P. Rapisarda and J.C. Willems. State maps for linear systems. SIAM Journal on Control and Optimization, 35(3):1053-1091, 1997.
[10] J. Tou. Modern Control Theory. McGraw-Hill, 1964.
[11] S. Weiland and A.A. Stoorvogel. A behavioral approach to the linear quadratic optimal control problem. In Proceedings of the MTNS 2000 (file SI199.pdf), Perpignan, France, 2000.
[12] J.C. Willems. From time series to linear system, Part I: Finite dimensional linear time invariant systems. Automatica, 22:561-580, 1986.
[13] J.C. Willems. Paradigms and puzzles in the theory of dynamical systems. IEEE Transactions on Automatic Control, AC-36:259-294, 1991.
[14] J.C. Willems. LQ-control: a behavioral approach. In Proceedings of the 32nd Conference on Decision and Control, pages 3664-3668, San Antonio, Texas, 1993.
[15] J.C. Willems. LQ-control: a behavioral approach. Draft, pages 1-16, 1993.
[16] J.C. Willems. Control as interconnection. In B. Francis and A. Tannenbaum, editors, Feedback Control, Nonlinear Systems and Complexity. Springer-Verlag Lecture Notes in Control and Information Sciences, 1996.
[17] J.C. Willems. Path integrals and stability. In J. Baillieul and J.C. Willems, editors, Mathematical Control Theory - Festschrift at the occasion of the 60th birthday of R.W. Brockett, pages 1-32. Springer-Verlag, 1998.
[18] J.C. Willems and H.L. Trentelman. On quadratic differential forms. SIAM Journal on Control and Optimization, 36:1703-1749, 1998.