AN IDENTIFICATION ALGORITHM FOR ARMAX SYSTEMS

AN IDENTIFICATION ALGORITHM FOR ARMAX SYSTEMS First the X, then the AR, finally the MA Jan C. Willems, K.U. Leuven Workshop on Observation and Estimation Ben Gurion University, July 3, 2004 p./2

Joint work with Ivan Markovsky (K.U. Leuven and Paolo Rapisarda (Un. Maastricht p.2/2

General introduction p.3/2

SYSTEM ID Observed data System model p.4/2

SYSTEM ID Observed data System model Case on interest: Data = a time-series record: Required: an algorithm to obtain a dynamical system that explains this time-series. p.4/2

SYSTEM ID Observed data System model Case on interest: Data = a time-series record: and (bi-infinite data records In the theory, the case! play an important role. p.4/2

SYSTEM ID Observed data System model Difficulties to cope with: blackbox data unmeasured inputs latency any element of the model class fits the data only approximately misfit measurement errors danger of overfitting p.4/2

ARMAX SYSTEM ID Usual approach: Data = an input/output record 74 2 54 32 +, '0 $&% " # +,-/ // - '. " $&% # +,- '( " $&% # '0 $&* '. $&* '( $&* p.5/2

:? 2? ARMAX SYSTEM ID Usual approach: Data = an input/output record 74 2 54 32 +, '0 $&% " # +,-/ // - '. " $&% # +,- '( " $&% # '0 $&* '. $&* '( $&* System model = an ARMAX model = : <; shift, noise? = suitably sized polynomial matrices. D BC A@ @4. white, independent of stationary ergodic gaussian,? p.5/2

? @4 2? ARMAX SYSTEM ID System model = an ARMAX model =? noise? D BC A@ stationary ergodic gaussian, suitably sized polynomial matrices. white, independent of.? : lack of fit, prevents predictability of from, etc. Note subtle non-uniqueness of the ARMAX representation. p.5/2

? = = E I = G = E F E ARMAX SYSTEM ID System model = an ARMAX model =? noise Well-known: ARMAX systems are those that allow finite dimensional state representations J? H? p.5/2

? ARMAX SYSTEM ID System model = an ARMAX model =? noise ID Algorithm: K L L L (or another repr. of the ARMAX model Quality of the ID algorithm: Assume that the data has been generated by an element of the model class; then require asymptotic convergence to the true system, for (consistency, efficiency, etc. p.5/2

CENTRAL PARADIGM Test the proposed algorithm assuming that the data has been generated by a model from a given (ARMAX model class. The algorithm should perform well in this test case. In other words, ID algorithms should perform well with simulated data There is no need to refer to the real or true system. As a(n approximate description of reality, the stochastic assumptions about? are indeed rather tenuous! p./2

M CENTRAL PARADIGM Test the proposed algorithm assuming that the data has been generated by a model from a given (ARMAX model class. The algorithm should perform well in this test case. In other words, ID algorithms should perform well with simulated data Same sort of justification for Kalman filtering, LQ-, LQG-, -control, adaptive control, etc.: We want that our algorithm works well under certain ideal circumstances. Stochasticity can thus in good conscience be interpreted as relative frequency. p./2

Is the CENTRAL PARADIGM reasonable? ID algorithms should perform well with simulated data What does perform well mean? What simulated data should one test the algorithm for? p.7/2

Is the CENTRAL PARADIGM reasonable? ID algorithms should perform well with simulated data What does perform well mean? What simulated data should one test the algorithm for? The ARMAX model class, with stochastic inputs and disturbances is a very broad model class, but it puts stochasticity very central. p.7/2

Is the CENTRAL PARADIGM reasonable? ID algorithms should perform well with simulated data What does perform well mean? What simulated data should one test the algorithm for? Approximation deserves a much more central place in system ID. It (data produced by high order, nonlinear, time-varying system seems much more the core problem in system ID than protection against unmeasured stochastic inputs or measurement errors. p.7/2

The behavior of ARMAX systems p./2

47 N The behavior of an ARMAX system When does the stochastic process 45PO ; belong to the behavior of the ARMAX system =?? p./2

47 N ;? The behavior of an ARMAX system When does the stochastic process 45PO ; belong to the behavior of the ARMAX system =?? is zero mean, stationary, gaussian, and there exist a stationary gaussian white noise process that =?, independent of, a.s. such Cfr. the work of Picci, Lindquist, (and co-workers, Deistler, Ljung, e.a. p./2

47 N The behavior of an ARMAX system Deterministic language When does the time-series 45PO ; belong to the behavior of the ARMAX system =?? p./2

4 N S R Q The behavior of an ARMAX system There is an underlying shift-invariant Hilbert space time-series : ; are assumed to belong. to which the components of all the signals Q S R : : of p./2

4 N S R Q [ W V V M _ 0 The behavior of an ARMAX system There is an underlying shift-invariant Hilbert space time-series : ; are assumed to belong. to which the components of all the signals Q S R : : of Examples: jointly stationary ergodic gaussian stochastic processes. T TZ XY TTU 0 0 dfeg ` ` bc a \^]. T TZ XY TTU. T T TTU h. almost periodic sequences p./2

47 N 2 4i N? ; 2 d N 2? d j N 2? The behavior of an ARMAX system 45PO ; belongs to the behavior of the ARMAX system, i.e. ;? =. its components 2. there exists with (a components (b the s are orthonormal, (c the s are to the s, d such that =? p./2

p u t m s o o m m v m o The notion of an input rq k no lk nm lk Call o free if., such that Maximally free no further free components in. Maximally free = input. Then = output. is input square and z p y w x p.0/2

47 N ;! ~ ; 2 4i N? ; 2 N 2 ƒ The behavior of an ARMA system An ARMA system is an ARMAX system that has only outputs. Analog of autonomous system. behavior of the ARMA system, i.e. belongs to the } {?. its components 2. there exists with (a components (b the s are orthonormal, d such that? To avoid difficulties yet to be dealt with in the proofs, assume throughout that ƒ is left prime and that ' ˆ has no roots on the unit circle. p./2

! ~ 7 2 N The behavior of an ARMA system An ARMA system is an ARMAX system that has only outputs. Analog of autonomous system. } {? The behavior of an ARMA system consists of the have the same autocorrelation function Š ; * * that with A 7 47 S d Q ; Š * * etc., etc. p./2

47 N ; ; 2 4i N? ; 2 N 2 The behavior of an MA system An MA system is a special ARMA system. to the behavior of the MA system, i.e. belongs?. its components 2. there exists with (a components (b the s are orthonormal, d such that? p.2/2

7 2 N The behavior of an MA system An MA system is a special ARMA system.? The behavior of an MA system consists of the the same compact support autocorrelation function Š ; * with A 7 47 that have S d Q <; Š * etc., etc. p.2/2

u s Œ Ž Œ Œ Ž Œ Ž The model class What subsets of are representable as a ARMAX, ARMA, MA system? Denote the family of these subsets as. Œ p.3/2

u s Œ Ž Œ Œ Ž Œ Ž Œ Œ Ž Ž Œ Ž The model class What subsets of are representable as a ARMAX, ARMA, MA system? Denote the family of these subsets as. Œ If representations Œ Œ what are all its an equivalence relation on tuples of polynomial matrices? p.3/2

m o m Œ Ž Œ The model class Identifiability Given on the o, are there simple conditions (say such that there is only one element in that contains this? o persistency of excitation p.3/2

m o The model class Identification problem Give an algorithm p estimate, finite data records, consistency, etc. p.3/2

The structure of ARMAX systems p.4/2

}! 2 K B D ( 54 Ÿ N p.5/2 The module of orthogonalisers Consider. ~ { square, l Define the following set of polynomial vectors d j D * % ž Cg BC œ 7 2 š< ; 2 and for all 2 for all

ž D ( 54 Ÿ N 2 D ( D ( BC4 The module of orthogonalisers Define the following set of polynomial vectors d j D * % B Cg œ 7 BC 2 š< ; for all and for all 2 Easy: C g submodule of C g BC œ 7 45. Hence finitely generated. viewed as a module over FAQ: What is in terms of the ARMAX matrices? p.5/2

B ( g ž d d? The module of orthogonalisers Ÿ d j D * % D ( C g BC œ 7 45 2 š ; l s j s and? =. D 2 B the transposes of the rows of But, these do not always form a set of generators. p.5/2

B ( g ž D ( d d? B D BF D B D B F D B The module of orthogonalisers Ÿ d j D * % C g BC œ 7 45 2 š ; l =? and the transposes of the rows of But, these do not always form a set of generators. Proposition: s j D 2 s. Let with square and non-singular, and The transposes of the rows of generators of the module. left prime. form a (minimal set of p.5/2

B ( g ž D ( D BF D B D B F D B D B ( The module of orthogonalisers Ÿ d j D * % C g BC œ 7 45 2 š ; l Proposition: Let with square and non-singular, and The transposes of the rows of generators of the module. left prime. form a (minimal set of In behavioral language, g ARMAX system. defines a controllable kernel; is the transfer function of the deterministic part of our p.5/2

B ( g ž D ( D BF D B D B F D B B D B D B K The module of orthogonalisers Ÿ d j D * % C g BC œ 7 45 2 š ; l Proposition: Let with square and non-singular, and The transposes of the rows of generators of the module. left prime. form a (minimal set of unique up to D unimodular This submodule the part of the ARMAX system p.5/2

}! 2 K D 74 The AR module Consider the ARMA system. ~ { square, Define the following set of polynomial vectors Ÿ! Q 2 d j ž BC N 2 š ; p./2

! ~ 2 K Ÿ! Q ž 74 The AR module Consider the ARMA system square, } {. Define the following set of polynomial vectors N 2 d D BC j 2 š ; Easy: submodule of D BC 47. Hence finitely generated. FAQ: What is in terms of the ARMAX matrices? p./2

! ~ 2 K Ÿ! Q ž 74 ž The AR module Consider the ARMA system square, } {. Define the following set of polynomial vectors N 2 d D BC j 2 š ; is the module generated by the transp. of the rows of, with ( ž C g C ( C g C a unit circle spectral factorization. p./2

D B D BF D B 74 š B N B ª : The AR module Now, given an ARMAX system with orthogonalisers, it can be shown that, with behavior, and generators of the module of : ; D D * 2 % such that Ÿ = is an ARMA system. p.7/2

74 š B N B ª : F The AR module : ; D «D * 2 % such that Ÿ = is an ARMA system. Described by :.? The associated AR submodule the AR part of the ARMAX system. Assume generated by the polynomial matrix. p.7/2

F The AR module Described by :.? The associated AR submodule the AR part of the ARMAX system. Assume generated by the polynomial matrix. It follows finally that is an MA system,.? The associated MA matrix the MA part of the ARMAX system. p.7/2

u o v m o p p p Recapitulation Time-series Behavior, an ARMAX behavior. input, output. v s m. The module of orthogonalisers.. 2. nm k no lk The AR-module. is an ARMA system. 3. k is an MA-system 4. the ARMAX representation q k no lk k m k lk p./2

An identification algorithm p./2

d ' d d ' An identification algorithm For notational simplicity, we only treat the case that an infinite time series +,- $&* " # +,-/ // - $ % '. " $&* '. # $ % +,- '( " $&* '( # $ % +,- '± " $&* '± # $ % - +,- d ' " $ * g ' # $&% g - is given, components in. p./2

d ' d d ' Algorithm Data: +,- $&* " # +,-/ // - $ % '. " $&* '. # $ % +,- '( " $&* '( # $ % +,- '± " $&* '± # $ % - +,- d ' " $ * g ' # $&% g -, We assume that the data has been produced by an ARMA system, and we are looking for an algorithm that returns. D B equivalently, p.20/2

o ' ' ' PERSISTENCY of EXCITATION A key ingredient is persistency of excitation. The vector time-series is said to be persistently exciting of order if the Hankel matrix +»³»»µ»»»µ»»³», (œ g '0 $ % ' $ % '. $ % '( $ %.œ g '0 $ % '¹ $ % ' $ % '. $ % œ g '0 $ % 'º $ % '¹ $ % ' $ %............... " ²³² ²µ² ² ²µ² ²³² # '0 $ %.œ (œ $&% $&% $&% is of full row rank. Persistency of excitation no linear relations of order L. p.2/2

g g p.22/2 Estimating the X part Determine the orthogonalisers, for instance by computing the MPUM type module generated by the left kernel of the Hankel matrix of the mixed correlation matrix +»»µ»»µ»»³»³»³»»µ»»³»³», '0 '. ¼3½ ½ ¼ ½ ½ '0 '. ¼f¾ ½ ¼ ¾ ½ (œ '0 ' ¼ˆ½ ½ ¼ˆ½ ½ ¼3½ ½ '( ¼3½ ½ '( ¼f¾ ½ '. ¼ˆ½ ½ '. ¼3¾ ½ ' ¼3½ ½ " ² ²µ² ²µ² ²³²³²³² ²µ² ²³²³² # (œ '0 ' ¼3¾ ½ ¼ ¾ ½............... ( œ '0 (œ ' ¼ ½ ½ ( œ '0 ¼ ¾ ½ (œ ' ¼ˆ¾ ½ ' ¼ˆ¾ ½ S d Q Š * % S d Q Š % % with

AR part p.23/2 Compute the ARMA signal L = L :

g AR part Compute the ARMA signal L = L : Compute its AR module for instance by computing the MPUM type module generated by the left kernel of the Hankel matrix +»µ»»³»³»³», '0 '. ¼À À ¼À À (œ '0 ' ¼À À ¼À À............... '( ¼À À '. ¼À À " ²µ² ²³²³²³² # ( œ '0 ¼À À (œ ' ¼À À ' ¼À À S : d: Q ŠÁ Á with p.23/2

g L AR part Compute the ARMA signal L = L : Compute its AR module for instance by computing the MPUM type module generated by the left kernel of the Hankel matrix +»µ»»³»³»³», '0 '. ¼À À ¼À À (œ '0 ' ¼À À ¼À À............... '( ¼À À '. ¼À À " ²µ² ²³²³²³² # ( œ '0 ¼À À (œ ' ¼À À ' ¼À À S : d: Q ŠÁ Á with p.23/2 Result:

L MA part Compute the signal : p.24/2

L L p.24/2 MA part Compute the signal : Compute its MA representation for instance by computing the and factoring as d S Q Š<Â Â correlations ( ž C g C L d C Š Â Â Ã Ã dfeg

L L MA part Compute the signal : Compute its MA representation for instance by computing the correlations Š<Â Â Q d S and factoring as ( ž C g L C L d C Š Â Â Ã Ã dfeg Result: p.24/2

Ä MA part Real computations, finite time series, noise: all sorts of approximations. How can we guarantee that is Schur, how do we guarantee that the autocorrelation of has the necessary positivity and compact support properties? We take a look at the second problem, in the scalar case. p.25/2

Š Š Ã ž Å Ì ( Ê 0! Æ Š Í Ã MA part S. d Q L Š Â Â Compute estimates by solving the LMI: with a legal MA L Š We can approximate subject to. L Š Ã d eg minimize ( ÇÉÈ... d C Ã dfeg such that ª ÇË...Ç Ë ÇÈ p.25/2

Š Š Ã ž Å Ì ( Ê 0! Æ Š Í Ã ž ž = = ( Ô Ð e MA part S. d Q L Š Â Â Compute estimates by solving the LMI: with a legal MA L Š We can approximate subject to. L Š Ã d eg minimize ( ÇÉÈ... d C Ã dfeg such that ª ÇË Ë ÇÈ...Ç Îthen ÏÎ ˆ( The dyadic expansion of, an LS MA approximation. Can be ÊÖÕ +», " ² # ' Ç yields, with...õ...ñò ÑÓ reduced, via storage functions, and another LMI, to a scalar. p.25/2

= Ø Û Û Ø ÚÙ ÚÙ ( e ' Ç Çä Çã Ç. e ' Ç Ç º Ç œ ( e ' Ç ß Ç (º¹ã åç àç e ' Ç á / A simulation The system is? 'ÝÜ Þ 'ÝÜ, and are selected as follows:, where the polynomials, Ç º - œ±/ å - ± à g / œ±/ ( ( g / Þâá à - œ±/ éçä.æ º ± / è g œ±/ ä( œ(/ ç º ( / Ç g ä( æ ± / ( g p.2/2

??!!! ê? A simulation The inputs and! obtained from and are zero mean, gaussian, white, with variances, respectively. The initial condition, under which and is a random vector. is The time horizon for the simulation is simulated time series is used for estimation. and the The experiment is repeated realizations of and in each run. times with different p.2/2

ì L A simulation 0. 0. 0.4 0.2 0 PSfrag replacements 0.2 0.4 0. 0. 0.5 0 0.5.5 Roots of,,, and L 'ë, for 'ë. p.2/2

ì L A simulation 0. 0. 0.4 0.2 0 0.2 0.4 0. 0. 0.5 0 0.5.5 Roots of and, for 'ë. p.2/2

L í A simulation 0 50 40 30 20 0 0 0 20 30 40 0 0.5.5 2 2.5 3 Bode plots of (solid line and L '( í '( (dashed line. p.2/2

L í A simulation 40 30 20 0 0 0 20 30 40 50 0 0 0.5.5 2 2.5 3 Bode plots of (solid line and L '( í '( (dashed line. p.2/2

L A simulation 50 40 30 20 PSfrag replacements 0 0 0 0 5 0 5 20 25 Autocorrelation of (solid line and (dashed line. p.2/2

Remarks Note dramatic simplification to orthogonalisers/ma if deterministic part in controllable. The orthogonality suffices for a finite number of shifts. p.27/2

Thank you Thank you Thank you Thank you Thank you Thank you Thank you Thank you p.2/2