Iterative Learning Control Analysis and Design I
Electronics and Computer Science
University of Southampton
Southampton, SO17 1BJ, UK
etar@ecs.soton.ac.uk
http://www.ecs.soton.ac.uk/
Contents: Basics; Representations for Design; Stability and Convergence Analysis; Robustness. Acknowledgement: the material in this section largely follows that in the following paper. D. A. Bristow, M. Tharayil and A. G. Alleyne, A Survey of Iterative Learning Control, IEEE Control Systems Magazine, 26(3):96-114, 2006.
Basics. ILC has many aspects and design/analysis tools (and open research questions). To start with, it is assumed that the plant to be controlled is adequately modeled by either a linear continuous- or discrete-time model in state-space or transfer-function terms. Continuous-time: plant state-space model in ILC notation

ẋ_k(t) = Ax_k(t) + Bu_k(t)
y_k(t) = Cx_k(t) (1)

Control task: the output y_k(t) is required to track the supplied reference signal over the fixed finite interval 0 ≤ t ≤ T.
Standard Assumptions The trial duration T has the same value for all trials. The initial condition is the same on all trials. The system dynamics are time-invariant. The dynamics are deterministic (noise-free). Many of these assumptions can be relaxed. Many ILC designs are for single-input single-output (SISO) systems. In this section the SISO case is considered; the multi-input multi-output (MIMO) case is noted where relevant.
Original Arimoto Algorithm

u_{k+1}(t) = u_k(t) + Γė_k(t) (2)

where Γ is the learning gain. This ILC law will ensure that y_k → y_d, or

e_k = y_d - y_k → 0, as k → ∞ (3)

if

‖I - CBΓ‖ < 1 (4)

where ‖·‖ is an appropriately chosen norm (what happens if CB = 0?). Note that this convergence condition places no constraints on the state matrix A!!
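The contraction condition (4) is easy to check numerically. A minimal sketch, in which the C, B and Γ values are illustrative assumptions (not taken from the text):

```python
import numpy as np

# Hypothetical 2x2 example: C, B and the learning gain Gamma are
# illustrative choices, not values from the text.
C = np.array([[1.0, 0.0], [0.0, 1.0]])
B = np.array([[0.5, 0.0], [0.1, 0.8]])
CB = C @ B

# Choose Gamma as a scaled inverse of CB so that I - CB*Gamma contracts.
Gamma = 0.9 * np.linalg.inv(CB)

# Convergence test (4): ||I - CB*Gamma|| < 1 in the induced 2-norm.
contraction = np.linalg.norm(np.eye(2) - CB @ Gamma, ord=2)
print(contraction)    # approximately 0.1 here, so the Arimoto law converges
assert contraction < 1.0

# If CB = 0 (relative degree > 1), no Gamma can satisfy (4), since the
# norm is then ||I|| = 1.
```

Note the role of CB: with this construction I - CBΓ = 0.1 I, so the condition holds with large margin regardless of A.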
More ILC algorithms

A PID-like ILC algorithm is

u_{k+1}(t) = u_k(t) + Φe_k(t) + Ψ ∫ e_k(τ) dτ + Γė_k(t) (5)

A higher order ILC (HOILC) version of the PID ILC is

u_{k+1}(t) = (1 - Λ) Σ_{i=1}^{N} P_i u_{k+1-i}(t) + Λu_0(t) + Σ_{i=1}^{N} ( Φ_i e_{k+1-i}(t) + Ψ_i ∫ e_{k+1-i}(τ) dτ + Γ_i ė_{k+1-i}(t) ) (6)
More ILC algorithms

If Σ_{i=1}^{N} P_i = 1, then proper choice of the learning gains ensures that e_k converges asymptotically to zero in k (trial-to-trial error convergence). A time-varying P-type (no derivative or integral action) version of the ILC law (5) is

u_{k+1}(t) = u_k(t) + Γ_k(t)e_k(t) (7)

where Γ_k(t) is the proportional learning gain, which is trial- and time-varying (the special case of a time-invariant gain is heavily used in applications).
More ILC algorithms

In this simple structure ILC law, the critical feature is the use of information from the most recent trial to update the current trial input. Other time-varying HOILC laws include

u_{k+1}(t) = u_k(t) + Σ_{i=k-l}^{k} Γ_i(t)e_i(t) (8)

or

u_{k+1}(t) = Σ_{i=k-l}^{k} Γ_i u_i(t) + Σ_{i=k-l}^{k} Γ_i(t)e_i(t) (9)

If required, such laws make use of the information from all available previous trials.
Discrete-time ILC algorithms

Consider linear discrete-time (SISO) systems with state-space model in the ILC setting

x_k(p + 1) = Ax_k(p) + Bu_k(p), 0 ≤ p ≤ T
y_k(p) = Cx_k(p) + Du_k(p), x_k(0) = x_0 (10)

or (operator representation)

y_k(p) = G(q)u_k(p) + d(p) (11)

where q is the forward shift-time operator, qx(p) = x(p + 1), and d(p) is an exogenous signal that repeats on each trial. Equivalently: the initial conditions are the same on each trial and (for simplicity) there are no external disturbances.
Discrete-time ILC algorithms

To derive (11) from (10), write (with D = 0 for simplicity)

y_k(p) = C(qI - A)^{-1}Bu_k(p) + CA^p x_0 (12)

so that

G(q) = C(qI - A)^{-1}B, d(p) = CA^p x_0 (13)

A widely used ILC algorithm is

u_{k+1}(p) = Q(q)[u_k(p) + L(q)e_k(p + 1)] (14)

where Q(q) is termed the Q-filter and L(q) the learning function, respectively.
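As a quick numerical check of (12)-(13), the following sketch (the A, B, C, x_0 values are illustrative assumptions) compares a one-trial simulation against the Markov-parameter convolution plus the repeated term d(p):

```python
import numpy as np

# Illustrative 2-state SISO model; the matrices are assumed example values.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[1.0], [0.5]])
C = np.array([[1.0, 0.0]])
x0 = np.array([[0.2], [0.1]])

def markov(j):
    # Markov parameters p_j = C A^(j-1) B (impulse response, D = 0)
    return float(C @ np.linalg.matrix_power(A, j - 1) @ B)

def d(p):
    # Trial-repeating exogenous term d(p) = C A^p x0
    return float(C @ np.linalg.matrix_power(A, p) @ x0)

# Simulate one trial and check y(p) = sum_{j=1}^{p} p_j u(p-j) + d(p).
T = 10
u = np.random.default_rng(0).standard_normal(T)
x, y = x0.copy(), [float(C @ x0)]
for p in range(T):
    x = A @ x + B * u[p]
    y.append(float(C @ x))          # y(p+1)
for p in range(1, T + 1):
    y_conv = sum(markov(j) * u[p - j] for j in range(1, p + 1)) + d(p)
    assert abs(y[p] - y_conv) < 1e-9
```

For this example CB = 1, so the temporary assumption of a non-zero first Markov parameter holds.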
Discrete-time ILC algorithms

There are many variations of (14); these include time-varying, nonlinear and trial-varying functions. Also the order can be increased to use information from more than the previous trial (HOILC, as above). Current trial feedback is a method of incorporating feedback with ILC, and in this case (14) is extended to

u_{k+1}(p) = Q(q)[u_k(p) + L(q)e_k(p + 1)] + C(q)e_{k+1}(p) (15)

The term C(q)e_{k+1}(p) is feedback action on the current trial.
Discrete-time ILC algorithms

Write (15) as

u_{k+1}(p) = w_{k+1}(p) + C(q)e_{k+1}(p) (16)

where

w_{k+1}(p) = Q(q)[w_k(p) + (L(q) + q^{-1}C(q))e_k(p + 1)] (17)

Hence the feedforward part of current trial ILC is identical in form to (14) with learning function L(q) + q^{-1}C(q). The ILC law (14) with learning function L(q) + q^{-1}C(q), combined with a feedback controller in the parallel architecture, is equivalent to the complete current trial ILC, see Figure 1.
Current trial feedback

Figure 1: ILC with current trial feedback.
Assumptions/Implications of the model used The plant G(q) in (11) is a proper rational function of q and in general has a delay, or equivalently, a relative degree, of m. Assumption: G(q) is stable (asymptotic stability); if not, it can be stabilized by a feedback controller, with ILC then applied to the resulting controlled system. The trial duration is finite, but in some analysis (transfer-function/frequency domain) an infinite duration is assumed (a technical point). Discrete time is a natural domain for ILC because this design method requires storage of past trial data, which is typically sampled. Temporary assumption: non-zero first Markov parameter (CB ≠ 0).
Assumptions/Implications of the model used The model (11) is sufficiently general to capture IIR and FIR plants. Repeating disturbances, repeated non-zero initial conditions and systems augmented with feedback and feedforward control can be included in the term d(p). Figure 2 illustrates the 2D systems nature of ILC: information propagation from trial-to-trial (k) and along a trial (p). 2D control systems analysis is well developed in theory and ILC provides an application area, with advantages of this setting for design (more later).
2D systems structure of ILC Figure 2: Illustrating the 2D systems structure of ILC.
Representations for Design
Representations for Design

The lifted description is heavily used in discrete-time ILC analysis and design. First expand G(q) from the model (11) as an infinite power series

G(q) = p_1 q^{-1} + p_2 q^{-2} + p_3 q^{-3} + ... (18)

where the p_i are the Markov parameters. The p_i form the impulse response and, since CB ≠ 0 is assumed, p_1 ≠ 0. In the state-space description, p_j = CA^{j-1}B (still with D = 0 for simplicity). A plant G(q) with relative degree greater than unity is a critical issue in ILC (more later).
Lifted Model

Introduce the vectors

Y_k = [y_k(1) y_k(2) ... y_k(T)]^T, U_k = [u_k(0) u_k(1) ... u_k(T - 1)]^T, d = [d(1) d(2) ... d(T)]^T (19)

Then the system dynamics can be written as

Y_k = GU_k + d (20)

with

G =
[ p_1      0        ...  0   ]
[ p_2      p_1      ...  0   ]
[ ...      ...      ...  ... ]
[ p_T      p_{T-1}  ...  p_1 ] (21)
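A sketch of the lifted construction (19)-(21), using assumed example state-space matrices, verifying that Y_k = G U_k + d reproduces a time-domain simulation:

```python
import numpy as np

# Assumed example model (illustrative values, not from the text).
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[1.0], [0.5]])
C = np.array([[1.0, 0.0]])
x0 = np.array([[0.2], [0.1]])
T = 20

# Markov parameters p_1, ..., p_T and the lower-triangular Toeplitz G of (21)
p = [float(C @ np.linalg.matrix_power(A, j) @ B) for j in range(T)]
G = np.zeros((T, T))
for i in range(T):
    for j in range(i + 1):
        G[i, j] = p[i - j]          # G[i, j] = p_{i-j+1}

# d = [d(1), ..., d(T)]^T with d(p) = C A^p x0
d = np.array([float(C @ np.linalg.matrix_power(A, m) @ x0)
              for m in range(1, T + 1)])

# One simulated trial: Y collects y_k(1), ..., y_k(T) (one-step shift)
U = np.random.default_rng(1).standard_normal(T)
x, Y = x0.copy(), []
for m in range(T):
    x = A @ x + B * U[m]
    Y.append(float(C @ x))

assert np.allclose(Y, G @ U + d)    # lifted model matches the simulation
```

The one-step shift in Y_k and d is what puts p_1 on the diagonal of G, making G invertible when p_1 ≠ 0.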
Lifted Model, cont'd

The entries in Y_k and d are shifted by one time step (relative degree unity) to account for the one-step delay in the plant. This ensures that G is invertible. If there is an m > 1 step delay, the above construction generalizes in a natural manner. In the lifted form the time and trial domain dynamics are replaced by an algebraic updating in the trial index only. This means that the along-the-trial dynamics are hidden; the 2D systems approach is a way of avoiding this and enabling simultaneous design for trial-to-trial error convergence and control of the along-the-trial dynamics (more on this later).
Lifted Model, cont'd

The learning law (14) can also be written in lifted form. The Q-filter and learning function L can be non-causal functions, with series expansions

Q(q) = ... + q_{-2}q^2 + q_{-1}q + q_0 + q_1 q^{-1} + q_2 q^{-2} + ...
L(q) = ... + l_{-2}q^2 + l_{-1}q + l_0 + l_1 q^{-1} + l_2 q^{-2} + ... (22)

In lifted form

U_{k+1} = Q(U_k + LE_k) (23)
E_k = Y_d - Y_k (24)
Y_d = [y_d(1) y_d(2) ... y_d(T)]^T (25)
Lifted Model, cont'd

Q =
[ q_0      q_{-1}   ...  q_{-(T-1)} ]
[ q_1      q_0      ...  q_{-(T-2)} ]
[ ...      ...      ...  ...        ]
[ q_{T-1}  q_{T-2}  ...  q_0        ] (26)

L =
[ l_0      l_{-1}   ...  l_{-(T-1)} ]
[ l_1      l_0      ...  l_{-(T-2)} ]
[ ...      ...      ...  ...        ]
[ l_{T-1}  l_{T-2}  ...  l_0        ] (27)
Lifted Model, cont'd

When Q(q) and L(q) are causal functions

q_{-1} = q_{-2} = ... = 0, l_{-1} = l_{-2} = ... = 0 (28)

and the matrices Q and L are lower triangular. The matrices G, Q and L are also Toeplitz, i.e., all entries along each diagonal are equal. This setting also extends to linear time-varying systems, but the corresponding matrices do not then have the Toeplitz structure. Next we introduce the z-transform description.
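The structure of (26)-(28) is easy to see numerically. A small sketch (the filter coefficients are arbitrary illustrative values):

```python
import numpy as np

# Assemble a lifted matrix of the form (26)-(27) from filter coefficients.
# coeff[i] holds the tap q_i; negative i are the noncausal taps.
def lifted(coeff, T):
    M = np.zeros((T, T))
    for r in range(T):
        for c in range(T):
            M[r, c] = coeff.get(r - c, 0.0)   # entry (r, c) is q_{r-c}
    return M

T = 5
# A causal two-tap Q-filter (illustrative coefficients).
Q = lifted({0: 0.6, 1: 0.4}, T)
assert np.allclose(Q, np.tril(Q))             # causal -> lower triangular
# Toeplitz: equal entries along each diagonal.
assert all(len({Q[r, r - k] for r in range(k, T)}) == 1 for k in range(T))

# Adding a noncausal tap q_{-1} fills the first superdiagonal.
Qn = lifted({-1: 0.2, 0: 0.6, 1: 0.2}, T)
assert Qn[0, 1] == 0.2 and Qn[1, 0] == 0.2
```

The same helper builds L from the l_i coefficients; in the time-varying case each diagonal would instead carry distinct entries.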
Lifted Model, cont'd

The one-sided z-transform of a signal x(j), j = 0, 1, ..., is

X(z) = Σ_{j=0}^{∞} x(j)z^{-j} (29)

and is obtained from the shift-operator description by replacing q by z. The frequency response is obtained by setting z = e^{jθ}, θ ∈ [-π, π]. To use the z-transform we need to assume T = ∞. This is not an issue!!
Lifted Model, cont'd

In z-transform terms the plant and controller dynamics are

Y_k(z) = G(z)U_k(z) + D(z) (30)
U_{k+1}(z) = Q(z)[U_k(z) + zL(z)E_k(z)] (31)
E_k(z) = Y_d(z) - Y_k(z) (32)

The z term in (31) emphasizes the forward time shift.
Causality

Question: What is causal data for ILC?

Definition (Bristow et al.): The ILC law (14) is causal if u_{k+1}(p) depends only on u_k(τ) and the error e_k(τ) for τ ≤ p. It is noncausal if u_{k+1}(p) is also a function of u_k(τ) or e_k(τ) for some τ > p.

Critical Fact: Unlike the standard concept of causality, a noncausal ILC law is implementable in practice because the entire time sequence of data is available from all previous trials.

Consider the noncausal ILC law

u_{k+1}(p) = u_k(p) + k_p e_k(p + 1) (33)

and the causal ILC law

u_{k+1}(p) = u_k(p) + k_p e_k(p) (34)
Causality, cont'd

A disturbance d(p) enters the error as

e_k(p) = y_d(p) - G(q)u_k(p) - d(p) (35)

Hence the noncausal ILC law anticipates the disturbance d(p + 1) and compensates with the control action u_{k+1}(p). The causal ILC law has no anticipation, since u_{k+1}(p) compensates for the disturbance d(p) with the same time index p. Causality also has consequences for feedback equivalence, where the final, or converged, control, denoted u_∞, can instead be obtained by a feedback controller. It can be shown that there is a feedback equivalence for causal ILC laws and the equivalent controller can be obtained directly from the ILC law.
Causality, cont'd

The assertion now is: causal ILC laws are of limited (or no!!) use since the same control action can be obtained by applying the equivalent feedback controller without the learning process. There are, however, critical limitations to this equivalence. The first limitation is the noise-free requirement. Another limitation is that as the ILC performance increases, the equivalent feedback controller has increasing gain. In the presence of noise, use of high gain can lead to performance degradation and equipment damage. Hence causal ILC algorithms are still of interest and, in fact, this equivalence was already known in the repetitive process/2D systems literature.
Causality, cont'd

Critical Fact: The equivalent feedback controller may not be stable!! There is no equivalence for non-causal ILC, as a feedback controller can only react to errors.

P. B. Goldsmith, On the equivalence of causal LTI iterative learning control and feedback control, Automatica, 38(4):703-708, 2002.
D. H. Owens and E. Rogers, Comments on 'On the equivalence of causal LTI iterative learning control and feedback control', Automatica, 40(5):895-898, 2004.
Stability and Convergence Analysis
Stability and Convergence Analysis

We consider the system formed by applying an ILC law of the form (14) to a system described by (11). Note again the stability assumption on the plant dynamics.

Definition: The system formed by applying an ILC law of the form (14) to a plant described by (11) is asymptotically stable (AS) if there exists û > 0 such that

|u_k(p)| ≤ û, p = 0, 1, ..., T - 1, ∀k ≥ 0 (36)

and lim_{k→∞} u_k(p) exists. The symbol ∀ denotes 'for all'.
Stability and Convergence Analysis

The limit u_∞ is termed the learned control. In lifted form the controlled dynamics are described by

U_{k+1} = Q(I - LG)U_k + QL(Y_d - d) (37)

Maths/notation: Let H be an h × h matrix with eigenvalues h_i, 1 ≤ i ≤ h. Then r(H) = max_i |h_i| is termed its spectral radius, and I denotes the identity matrix of compatible dimensions.

Theorem: The system formed by applying an ILC law of the form (14) to a plant described by (11) is AS if and only if

r(Q(I - LG)) < 1 (38)
Stability and Convergence Analysis

If Q and L are causal, Q(I - LG) is lower triangular and Toeplitz with repeated eigenvalues

λ = q_0(1 - l_0 p_1)

Hence stability holds provided

|q_0(1 - l_0 p_1)| < 1

Note: if p_1 = 0 and q_0 = 1, this condition cannot hold. In the z transfer-function domain

U_{k+1}(z) = Q(z)[1 - zL(z)G(z)]U_k(z) + zQ(z)L(z)[Y_d(z) - D(z)] (39)
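A numerical check of the causal eigenvalue formula (a sketch; the plant impulse response and the gains q_0, l_0 are illustrative assumptions):

```python
import numpy as np

# Assumed example plant with Markov parameters p_j = j * 0.9^(j-1),
# and causal Q(q) = q0, L(q) = l0.
T = 30
pmk = np.array([(j + 1) * 0.9**j for j in range(T)])     # p_1, p_2, ...
G = np.zeros((T, T))
for i in range(T):
    G[i, : i + 1] = pmk[i::-1]      # lower-triangular Toeplitz lifted plant

q0, l0 = 1.0, 0.5
Q = q0 * np.eye(T)
L = l0 * np.eye(T)
M = Q @ (np.eye(T) - L @ G)

# M is lower triangular, so its eigenvalues are its diagonal entries and
# the spectral radius is |q0 (1 - l0 p1)|.
rho = max(abs(np.diag(M)))
assert abs(rho - abs(q0 * (1 - l0 * pmk[0]))) < 1e-12
assert rho < 1      # AS by Theorem (38)
```

Reading the eigenvalues off the diagonal (rather than calling a general eigensolver) is deliberate: Q(I - LG) is highly non-normal, so numerical eigenvalue routines can scatter its repeated eigenvalue.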
Stability and Convergence Analysis

A sufficient condition for stability of the ILC scheme described by (39) can be obtained by requiring that Q(z)[1 - zL(z)G(z)] be a contraction mapping. For a given T(z) define

‖T(z)‖_∞ = sup_{θ∈[-π,π]} |T(e^{jθ})|

where sup denotes the least upper bound (the maximum in many cases).

Theorem: The system formed by applying an ILC law of the form (14) to a plant described by (11) is AS with T = ∞ if

‖Q(z)[1 - zL(z)G(z)]‖_∞ < 1 (40)
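The infinity-norm in (40) can be estimated on a frequency grid. A sketch for an assumed example plant G(z) = z/(z - 0.9)^2 with Q(z) = 1 and L(z) = 0.5:

```python
import numpy as np

# Evaluate |Q(z)[1 - z L(z) G(z)]| on the unit circle and take the maximum.
theta = np.linspace(-np.pi, np.pi, 4001)
z = np.exp(1j * theta)
Gz = z / (z - 0.9) ** 2          # example plant (an assumption)
Qz = np.ones_like(z)             # Q(z) = 1
Lz = 0.5 * np.ones_like(z)       # L(z) = 0.5

gain = np.abs(Qz * (1 - z * Lz * Gz))
print(gain.max())                # roughly 49 for this example, at theta = 0
```

Here the maximum far exceeds 1, so the sufficient condition (40) fails for this plant/gain pair even though the finite-T eigenvalue test certifies AS, illustrating that (40) asks for much more than stability.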
Stability and Convergence Analysis

When Q(z) and L(z) are causal, this last condition also implies AS for finite-duration ILC. The condition (40) is sufficient but not necessary and in general can be much more conservative than the necessary and sufficient condition. The 2D systems setting (see later) will bring Linear Matrix Inequalities into the analysis. Next we examine performance, where there are two issues: trial-to-trial error convergence (in k) and along-the-trial performance (in p). One of many questions: what are the consequences of monotonic trial-to-trial error convergence?
Performance

If the controlled system is AS, the error as k → ∞ (asymptotic error) is

e_∞(p) = lim_{k→∞} e_k(p) = lim_{k→∞} (y_d(p) - G(q)u_k(p) - d(p))
       = y_d(p) - G(q)u_∞(p) - d(p) (41)

One method of assessing performance is to compare e_∞(p) with e_0(p), either qualitatively or quantitatively by, for example, the Root Mean Square (RMS) error. If the controlled system is AS then for the lifted system

E_∞ = [I - G[I - Q(I - LG)]^{-1}QL](Y_d - d) (42)
Performance

In z transfer-function terms

E_∞(z) = (1 - Q(z)) / (1 - Q(z)[1 - zL(z)G(z)]) · [Y_d(z) - D(z)] (43)

Essentially, these results are obtained by replacing k with ∞ and then solving for e_∞ and E_∞(z). Is it possible to design for e_∞ = 0?

Theorem: Suppose that G and L are not identically zero. Then the system formed by applying an ILC law of the form (14) to a plant described by (11) achieves e_∞(p) = 0 for all p, y_d and d if and only if AS holds and Q(q) = 1.
Performance

Many ILC laws set Q(q) = 1 and hence do not include Q-filtering. The last theorem shows this choice is required for trial-to-trial error convergence to zero (perfect tracking). Q-filtering can, however, improve transient learning and robustness. To explore further, consider selecting Q as an ideal low-pass filter with unity magnitude at low frequencies θ ∈ [0, θ_0] and zero magnitude for θ ∈ (θ_0, π]. For this ideal low-pass filter, using (43), E_∞(e^{jθ}) = 0 for θ ∈ [0, θ_0] and equal to Y_d(e^{jθ}) - D(e^{jθ}) for θ ∈ (θ_0, π]. For those frequencies where Q(e^{jθ}) = 1 perfect tracking results, and for those where Q(e^{jθ}) = 0 the ILC is effectively switched off. Hence the Q-filter can be used to determine which frequencies are emphasized in the design.
Transient Learning

Here we are concerned with trial-to-trial error convergence. The following example is from Bristow et al. Plant dynamics

G(q) = q / (q - 0.9)^2

Control law

u_{k+1}(p) = u_k(p) + 0.5e_k(p + 1) (44)

In this case p_1 = 1, q_0 = 1 and l_0 = 0.5. Q and L are causal and all eigenvalues of the lifted system are 0.5. Hence the controlled system is AS.
Transient Learning

In this case Q = 1 and hence e_∞ = 0. Take the trial duration as T = 50. Running a simulation shows that over the first 12 trials the trial-to-trial error, measured by the Euclidean or 2-norm, grows by over nine orders of magnitude. This example shows the large trial-to-trial error growth that can arise in this form of ILC. This large growth is problematic since neither the rate nor the magnitude is closely related to the stability condition: the lifted system eigenvalue is well within the stability region.
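The example can be reproduced with a short simulation (a sketch; the reference signal is an arbitrary illustrative choice):

```python
import numpy as np

# Plant G(q) = q/(q - 0.9)^2 has impulse response p_j = j * 0.9^(j-1),
# so p_1 = 1.  Build the lifted G for T = 50 and iterate the ILC law
# u_{k+1}(p) = u_k(p) + 0.5 e_k(p+1)  (Q = I, L = 0.5 I in lifted form).
T = 50
pmk = np.array([(j + 1) * 0.9**j for j in range(T)])
G = np.zeros((T, T))
for i in range(T):
    G[i, : i + 1] = pmk[i::-1]

yd = np.ones(T)         # illustrative reference (shifted samples y_d(1..T))
u = np.zeros(T)
norms = []
for k in range(13):
    e = yd - G @ u      # lifted error e_k
    norms.append(np.linalg.norm(e))
    u = u + 0.5 * e

growth = max(norms) / norms[0]
print(growth)           # enormous transient growth despite AS
```

Every eigenvalue of the error iteration matrix I - 0.5G is 0.5, yet the 2-norm of the error explodes over the early trials before it can decay, which is exactly the transient behaviour described above.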
Transient Learning/Monotonic Convergence

It is also difficult to distinguish this transient error growth from instability, due to the very large initial growth rate and magnitude. Later we will see that 2D systems based design can prevent this problem, but at the possible cost of a conservative design. To avoid these problems, monotonic convergence is desirable. For any given norm, the system considered is monotonically convergent if

‖e_∞ - e_{k+1}‖ ≤ η‖e_∞ - e_k‖, k = 1, 2, ... (45)

where 0 ≤ η < 1 is the convergence rate.
Monotonic Error Convergence

Write

E_∞ - E_{k+1} = GQ(I - LG)G^{-1}(E_∞ - E_k) (46)

When G(q), Q(q) and L(q) are causal, the matrices G, Q and L commute and (46) becomes

E_∞ - E_{k+1} = Q(I - LG)(E_∞ - E_k) (47)

In the z-domain

E_∞(z) - E_{k+1}(z) = Q(z)[1 - zL(z)G(z)](E_∞(z) - E_k(z)) (48)
Monotonic Error Convergence

The (non-zero) singular values of a matrix H are the positive square roots of the non-zero eigenvalues of HH^T (or H^T H). Let σ̄(·) denote the maximum singular value of a matrix. Then from (46) and (47) we have the following result.

Theorem: If the following condition holds for the system formed by applying an ILC law of the form (14) to a plant described by (11)

γ_1 = σ̄(GQ(I - LG)G^{-1}) < 1 (49)

then

‖e_∞ - e_{k+1}‖_2 < γ_1‖e_∞ - e_k‖_2 (50)

for all k = 1, 2, ..., where ‖·‖_2 denotes the Euclidean norm.
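For the transient-growth example above (Q = I, L = 0.5 I), γ_1 can be computed directly; a sketch:

```python
import numpy as np

# gamma_1 = sigma_max(G Q (I - L G) G^{-1}) from (49), for the plant
# G(q) = q/(q - 0.9)^2 with Q = I and L = 0.5 I (the earlier example).
T = 50
pmk = np.array([(j + 1) * 0.9**j for j in range(T)])
G = np.zeros((T, T))
for i in range(T):
    G[i, : i + 1] = pmk[i::-1]

I = np.eye(T)
M = G @ (I - 0.5 * G) @ np.linalg.inv(G)    # reduces to I - 0.5 G here,
                                            # since causal Toeplitz matrices commute
gamma1 = np.linalg.svd(M, compute_uv=False)[0]
print(gamma1)
# gamma1 > 1: condition (49) fails, so monotonic 2-norm convergence is not
# guaranteed, even though AS holds (spectral radius 0.5 < 1).
```

This makes concrete the gap between the spectral-radius stability test and the singular-value monotonicity test for a non-normal iteration matrix.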
Monotonic Error Convergence

Theorem: If the following condition holds for the system formed by applying an ILC law of the form (14) to a plant described by (11) with T = ∞

γ_2 = ‖Q(z)[1 - zL(z)G(z)]‖_∞ < 1 (51)

then for all k = 1, 2, ...

‖E_∞(z) - E_{k+1}(z)‖ < γ_2‖E_∞(z) - E_k(z)‖ (52)

If Q(z) and L(z) are causal then (51) also implies

‖e_∞ - e_{k+1}‖_2 < γ_2‖e_∞ - e_k‖_2 (53)

for all k = 1, 2, ... and finite T.
Monotonic Error Convergence

The z-domain monotonic convergence condition (51) is equivalent to the stability condition (40). Hence when Q(z) and L(z) are causal, this stability condition also guarantees monotonic trial-to-trial error convergence independent of T. The lifted system monotonic convergence condition is stricter than the stability condition and both are specific to T. Under AS the worst-case learning transient can be bounded above by a decaying geometric function

‖e_∞ - e_k‖_2 ≤ κγ^k‖e_∞ - e_0‖_2 (54)

with γ < 1. This is a well-known result in discrete-time linear systems theory; the constant κ is a function of T.
Robustness

Model uncertainties are a fact of life in ILC as in all other areas. Robust ILC is a large problem area and we will revisit it later after the initial discussion given next. Question: does a given AS ILC scheme remain AS under plant perturbations? Consider the case of Q(q) = 1, giving e_∞ = 0, and causal L(q). The stability condition in this case is

|1 - l_0 p_1| < 1

Hence if l_0 and p_1 are nonzero, the ILC scheme is AS if and only if

sgn(p_1) = sgn(l_0) and l_0 p_1 < 2
Robustness

As a consequence, ILC can achieve e_∞ = 0 using only knowledge of the sign of p_1 and an upper bound on |p_1|. Perturbations in the higher-order Markov parameters do not destabilize!! Also, a large upper bound for |p_1| can be tolerated by selecting l_0 suitably small. Hence ILC is robust to all perturbations that do not alter the sign of p_1. Fact: robust stability does not imply acceptable learning transients.
Robustness

Consider the uncertain plant description (multiplicative uncertainty)

G(q) = Ĝ(q)[1 + W(q)Δ(q)] (55)

where Ĝ(q) is the nominal model, W(q) is known and stable, and Δ(q) is unknown but stable with ‖Δ(z)‖_∞ < 1.

Theorem: If

|W(e^{jθ})| ≤ (γ - |Q(e^{jθ})| |1 - e^{jθ}L(e^{jθ})Ĝ(e^{jθ})|) / (|Q(e^{jθ})| |e^{jθ}L(e^{jθ})Ĝ(e^{jθ})|)

for all θ ∈ [-π, π], then the ILC system formed by applying an ILC law of the form (14) to plants described by (11) and (55) with T = ∞ is monotonically convergent with convergence rate γ.
Robustness

Unlike robust stability, the monotonic robustness condition also depends on the dynamics of G(q), Q(q) and L(q). The most direct means of increasing the allowed uncertainty bound |W(e^{jθ})| at any given θ is to decrease the Q-filter gain. There is a trade-off between performance and robustness! Other robustness issues, e.g., noise, will be covered later.