A structured low-rank approximation approach to system identification. Ivan Markovsky

1 / 35: A structured low-rank approximation approach to system identification
Ivan Markovsky

2 / 35: Main message: system identification is structured low-rank approximation (SLRA)

(SLRA)  minimize over B̂  dist(A, B̂)  subject to  rank(B̂) ≤ r and B̂ structured

Correspondence between the SLRA problem and system identification:
- matrix A ↔ observed data
- dist(·, ·) ↔ noise properties
- rank bound r ↔ model complexity
- structure ↔ model class
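As a point of reference (not part of the talk): dropping the structure constraint makes the problem solvable exactly by the truncated singular value decomposition (Eckart-Young). A minimal numpy sketch of this unstructured relaxation, assuming the Frobenius norm as dist:

```python
import numpy as np

def lra(A, r):
    """Best unstructured rank-r approximation of A in the Frobenius norm
    (Eckart-Young): truncate the SVD after r singular values."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :r] * s[:r] @ Vt[:r, :]
```

Enforcing structure on B̂ (e.g., Hankel, as the slides below show identification requires) is what makes SLRA a hard nonconvex problem.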

3 / 35: Plan of the presentation
- System identification = low-rank approximation
- Missing data estimation
- Nonlinear system identification

4 / 35: Plan of the presentation
- System identification = low-rank approximation
- Missing data estimation
- Nonlinear system identification

5 / 35: Identification: finding models from data

data D → identification → model B̂ ∈ M

aim: an "accurate" and "simple" model
- "accurate": smallest approximation error
- "simple": Occam's razor principle: among equally accurate models, choose the simplest

6 / 35: Data D: a set of vector-valued time series

the data D is a set {w¹, ..., w^N} of vector-valued time series w^k = (w^k_1, ..., w^k_q), with
w^k_i = ( w^k_i(1), ..., w^k_i(T_k) )

- N: number of repeated experiments
- q: number of variables
- T_k: number of time samples in the k-th experiment
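In code, a natural container for such data is a list of arrays, one per experiment, each of shape T_k × q. A minimal sketch with illustrative names and random placeholder data:

```python
import numpy as np

q = 2                     # number of variables
lengths = [100, 80, 120]  # T_k for each of the N = 3 experiments
rng = np.random.default_rng(0)

# D = {w^1, ..., w^N}: one T_k-by-q array per experiment;
# D[k][t, i] holds w^k_i(t+1) (0-based time index).
D = [rng.standard_normal((T_k, q)) for T_k in lengths]
```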

7 / 35: Model B: a subset of the data space

behavioral definition of a model: B = { w | g(w) = 0 holds }
g(w) = 0 is a representation of B

model class M: a set of models
L: the set of linear models

8 / 35: dist(D, B) := min_{ŵ¹, ..., ŵ^N ∈ B} Σ_k ‖w^k − ŵ^k‖

(figure: a data point w and its best approximation ŵ in the model B)

errors-in-variables model: data = true value + noise
other error measures: output error, ARMAX, ...

9 / 35: Model complexity = (# inputs, # states)

simple = small (B₁ ⊆ B₂ ⇒ B₁ is simpler than B₂)
- a linear model is a subspace, so size of B ~ dimension of B
- linear time-invariant (LTI) dynamic model: dimension of B ~ (# inputs, # states) = (m, ℓ)

L_{m,ℓ}: LTI systems of bounded complexity (m, ℓ)

10 / 35: Identification: error-complexity trade-off

data D → identification → model B̂ ∈ M

minimize over B̂ ∈ M the bi-objective criterion [ dist(D, B̂) ; complexity(B̂) ]

11 / 35: Scalarization of the bi-objective problem

1. minimize  dist(D, B̂) + λ · complexity(B̂)
2. minimize  complexity(B̂)  subject to  dist(D, B̂) ≤ µ
3. minimize  dist(D, B̂)  subject to  complexity(B̂) ≤ (m, ℓ)

The three scalarizations describe the same set of Pareto-optimal solutions. With m given, finding ℓ is an order selection problem.

12 / 35: LTI identification problem

minimize over B̂  dist(D, B̂)  subject to  B̂ ∈ L_{m,ℓ}

with distance measure
dist(D, B̂) = min_{D̂ ⊂ B̂} Σ_k ‖w^k − ŵ^k‖₂² = min_{D̂ ⊂ B̂} ‖D − D̂‖²

so the problem is: minimize over B̂ and D̂  ‖D − D̂‖  subject to  D̂ ⊂ B̂ ∈ L_{m,ℓ}

13 / 35: w exact ⇔ rank-deficient Hankel matrix

exact trajectory: w ∈ B ∈ L_{m,ℓ} ⇔ R₀ w(t) + R₁ w(t+1) + ... + R_ℓ w(t+ℓ) = 0 for all t

equivalently, with R := [R₀ R₁ ... R_ℓ],

R · H_{ℓ+1}(w) = 0,   H_{ℓ+1}(w) := [ w(1)    w(2)    ...  w(T−ℓ)
                                      w(2)    w(3)    ...  w(T−ℓ+1)
                                      ...
                                      w(ℓ+1)  w(ℓ+2)  ...  w(T)    ]

so rank( H_{ℓ+1}(w) ) ≤ q(ℓ+1) − rank(R) = qℓ + m

14 / 35: D exact ⇔ rank( H_{ℓ+1}(D) ) ≤ qℓ + m

exact data: D ⊂ B ∈ L_{m,ℓ}
⇔ w^k ∈ B ∈ L_{m,ℓ} for all k = 1, ..., N
⇔ rank( [ H_{ℓ+1}(w¹) ... H_{ℓ+1}(w^N) ] ) ≤ qℓ + m

the matrix H_{ℓ+1}(D) := [ H_{ℓ+1}(w¹) ... H_{ℓ+1}(w^N) ] is a mosaic-Hankel matrix
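A minimal numpy sketch of the two constructions above (block-Hankel for one trajectory, mosaic-Hankel by horizontal concatenation), with the rank test applied to exact data from a first-order autonomous system; the function names are illustrative:

```python
import numpy as np

def block_hankel(w, L):
    """H_L(w) for a trajectory w of shape (T, q): block-Hankel matrix
    with L block rows, i.e. shape (q*L, T - L + 1)."""
    T, q = w.shape
    cols = [w[t:t + L].reshape(-1) for t in range(T - L + 1)]
    return np.column_stack(cols)

def mosaic_hankel(D, L):
    """H_L(D) for a set of trajectories: horizontal concatenation of the
    per-experiment block-Hankel matrices."""
    return np.hstack([block_hankel(w, L) for w in D])

# exact trajectories of an autonomous first-order system (q = 1, l = 1, m = 0):
# the rank criterion predicts rank(H_{l+1}(D)) <= q*l + m = 1
w1 = (0.9 ** np.arange(20)).reshape(-1, 1)
w2 = (0.9 ** np.arange(15)).reshape(-1, 1)
print(np.linalg.matrix_rank(block_hankel(w1, 2)))        # 1
print(np.linalg.matrix_rank(mosaic_hankel([w1, w2], 2))) # still 1
```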

15 / 35: LTI identification is mosaic-Hankel SLRA

minimize over B̂ and D̂  ‖D − D̂‖  subject to  D̂ ⊂ B̂ ∈ L_{m,ℓ}

⇔ minimize over D̂  ‖D − D̂‖  subject to  rank( H_{ℓ+1}(D̂) ) ≤ qℓ + m

16 / 35: Summary: identification = SLRA
- LTI model class ↔ Hankel structure
- bounded complexity (m, ℓ) ↔ rank constraint qℓ + m

17 / 35: Plan of the presentation
- System identification = low-rank approximation
- Missing data estimation
- Nonlinear system identification

18 / 35: Motivation goes beyond data corruption
- sensor failures: measurements are accidentally corrupted
- compressive sensing: measurements are intentionally skipped
- data-driven estimation and control: missing data is what we aim to find

examples: state estimation, control, and realization

19 / 35: Examples: state estimation, control, and realization

1. state estimation
   given: system B, input u, and output y
   missing: initial conditions w_ini, such that
   minimize over w_ini, ŷ  ‖y − ŷ‖ (estimation error)  subject to  w_ini ∧ (u, ŷ) ∈ B
   (w_ini ∧ (u, ŷ) is the past w_ini concatenated with the future (u, ŷ))

2. output tracking control
   given: B, w_ini, and reference output y_ref
   missing: control input u, such that
   minimize over û, ŷ  ‖y_ref − ŷ‖ (tracking error)  subject to  w_ini ∧ (û, ŷ) ∈ B

3. realization
   given: impulse response h(1), ..., h(T)
   missing: extension h(T+1), h(T+2), ...

20 / 35: Exact, noisy, and missing data
- exact data: kept fixed
- inexact / "noisy" data: approximated (minimize ‖error‖₂)
- missing data: interpolated from ŵ ∈ B̂

the initial conditions w_ini are the "past" of w
(figure: trajectory w over time t, split into the past segment w_ini and the future)

21 / 35: Data patterns in the three examples

1. state estimation: past input ?, past output ?; future input u, future output y
2. output tracking control: past input u_ini, past output y_ini; future input ?, future output y_ref
3. (noisy) realization: past input 0, past output 0; future input δ, future output (h, ?)

(color code on the slide: black = exact, blue = inexact/noisy, red = missing)

22 / 35: SLRA with element-wise weighted 2-norm

minimize over D̂  dist(D, D̂)  subject to  rank( H_{ℓ+1}(D̂) ) ≤ qℓ + m

weighted 2-norm approximation:
dist(D, D̂) := Σ_{k,i,t} v^k_i(t) ( w^k_i(t) − ŵ^k_i(t) )²

with element-wise weights v^k_i(t):
- v^k_i(t) ∈ (0, ∞) if w^k_i(t) is noisy → approximate w^k_i(t)
- v^k_i(t) = 0 if w^k_i(t) is missing → interpolate w^k_i(t)
- v^k_i(t) = ∞ if w^k_i(t) is exact → ŵ^k_i(t) = w^k_i(t)
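A small numpy sketch of this weighted distance; it is illustrative, not the SLRA package API, and it assumes exact samples carry weight np.inf and missing samples weight 0 (their stored values, possibly NaN, are never read):

```python
import numpy as np

def weighted_dist(w, w_hat, v):
    """Element-wise weighted squared 2-norm between data w and approximation w_hat.

    v[j] in (0, inf): noisy sample   -> adds v[j] * (w[j] - w_hat[j])**2
    v[j] == 0       : missing sample -> ignored (interpolated freely)
    v[j] == inf     : exact sample   -> w_hat[j] must equal w[j]
    """
    exact = np.isinf(v)
    if not np.array_equal(w_hat[exact], w[exact]):
        return np.inf  # an infinite-weight (exact) sample was modified
    noisy = np.isfinite(v) & (v > 0)
    return float(np.sum(v[noisy] * (w[noisy] - w_hat[noisy]) ** 2))
```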

23 / 35: Summary: SLRA solves control problems
- the given data is exact or noisy
- what we want to compute is missing data
- exact/noisy/missing data is handled by ∞/finite/0 weights

24 / 35: Plan of the presentation
- System identification = low-rank approximation
- Missing data estimation
- Nonlinear system identification

25 / 35: Conic section fitting

the points (x₁, y₁), ..., (x_N, y_N) lie on a conic section

⇔ there are A = Aᵀ, b, c, at least one of them nonzero, such that, with w_i := (x_i, y_i),
w_iᵀ A w_i + w_iᵀ b + c = 0, for i = 1, ..., N

⇔ there is θ = [ a₁₁ a₁₂ a₂₂ b₁ b₂ c ] ≠ 0 such that

θ · [ x₁²   ...  x_N²
      x₁y₁  ...  x_N y_N
      y₁²   ...  y_N²
      x₁    ...  x_N
      y₁    ...  y_N
      1     ...  1     ] = 0

26 / 35: Conic section fitting = rank deficiency

the points (x₁, y₁), ..., (x_N, y_N) lie on a conic section
B(θ) = { w | wᵀ A w + wᵀ b + c = 0 }

⇔ rank [ x₁²   ...  x_N²
         x₁y₁  ...  x_N y_N
         y₁²   ...  y_N²
         x₁    ...  x_N
         y₁    ...  y_N
         1     ...  1     ] ≤ 5
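A numpy sketch of this rank test and of recovering θ as a left null vector of the 6 × N matrix via the SVD; this handles the exact-fit case only (noisy points need the structured approximation discussed next), and the function names are illustrative:

```python
import numpy as np

def conic_matrix(x, y):
    """6-by-N matrix whose rank is <= 5 iff the points lie on a conic."""
    return np.vstack([x**2, x * y, y**2, x, y, np.ones_like(x)])

def fit_conic(x, y):
    """theta = [a11, a12, a22, b1, b2, c]: left singular vector for the
    smallest singular value, so theta @ conic_matrix ~= 0 for exact data."""
    U, s, Vt = np.linalg.svd(conic_matrix(x, y))
    return U[:, -1]

# example: points on the unit circle x^2 + y^2 - 1 = 0
t = np.linspace(0, 2 * np.pi, 10, endpoint=False)
theta = fit_conic(np.cos(t), np.sin(t))
print(np.round(theta / theta[0], 3))  # ~ [1, 0, 1, 0, 0, -1]
```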

27 / 35: Examples
- rank < 5 ⇒ nonunique exact fit
- rank = 5 ⇒ unique exact fit
- rank = 6 ⇒ no exact fit by a conic section

28 / 35: Nonlinear system identification

discrete-time nonlinear system: B := { w | R( w(t), w(t−1), ..., w(t−ℓ) ) = 0 }

special case: input/output NARX system
B = { w = [u; y] | y(t) = f( u(t), w(t−1), ..., w(t−ℓ) ) }

linear parameterization B_θ, with model structure φ:
R(x) = Σ_i θ_i φ_i(x) = θ φ(x), θ a parameter vector, x(t) := ( w(t), w(t−1), ..., w(t−ℓ) )

29 / 35: Link to SLRA

parameter estimation problem:
minimize over θ and ŵ  ‖w − ŵ‖  subject to  ŵ ∈ B_θ    (NL SYSID)

ŵ ∈ B_θ ⇔ rank( [ φ(x̂(1)) ... φ(x̂(T−ℓ)) ] ) ≤ r,
where Φ(ŵ) := [ φ(x̂(1)) ... φ(x̂(T−ℓ)) ] is a polynomially structured matrix

(NL SYSID) ⇔ polynomially structured LRA
(NL SYSID) is nonconvex and yields a biased estimator

30 / 35: Bias correction

ignoring the structure of Φ(ŵ) leads to kernel PCA: easy to compute, but biased

in the EIV model: w = w̄ + w̃, where w̄ ∈ B and w̃ ~ N(0, σ² I)

define Ψ := Φ(w) Φᵀ(w) and Ψ̄ := Φ(w̄) Φᵀ(w̄)

goal: construct a corrected matrix Ψ_c such that E(Ψ_c) = Ψ̄

31 / 35: Derivation of the correction

Hermite polynomials h_k(x) have the property
E( h_k(x̄ + x̃) ) = x̄^k, where x̃ ~ N(0, σ²)    (*)

with w = (x, y), the (i, j)-th element of Ψ = Φ Φᵀ is
(x̄ + x̃)^{n_{x,i} + n_{x,j}} (ȳ + ỹ)^{n_{y,i} + n_{y,j}}

then, by (*), ψ_{c,ij} := h_{n_{x,i} + n_{x,j}}(x) h_{n_{y,i} + n_{y,j}}(y) has the desired property
E(ψ_{c,ij}) = x̄^{n_{x,i} + n_{x,j}} ȳ^{n_{y,i} + n_{y,j}} =: ψ̄_{ij}
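A quick Monte-Carlo check of property (*), using the probabilists' Hermite polynomials He_k from numpy; the scaling h_k(x) = σ^k He_k(x/σ) is my reconstruction of the polynomials the slide refers to (for σ = 1 it reduces to the standard identity E He_k(x̄ + x̃) = x̄^k):

```python
import numpy as np
from numpy.polynomial.hermite_e import hermeval  # probabilists' Hermite He_k

def h(k, x, sigma):
    """h_k(x) = sigma**k * He_k(x / sigma), so that
    E[ h_k(xbar + noise) ] = xbar**k for noise ~ N(0, sigma**2)."""
    c = np.zeros(k + 1)
    c[k] = 1.0
    return sigma**k * hermeval(x / sigma, c)

rng = np.random.default_rng(0)
xbar, sigma, k = 1.3, 0.5, 4
x = xbar + sigma * rng.standard_normal(1_000_000)
print(h(k, x, sigma).mean(), xbar**k)  # both close to 1.3**4 = 2.8561
```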

32 / 35: Unbiased estimator

the corrected Ψ_c is an even polynomial in σ:
Ψ_c(σ²) = Ψ_{c,0} + σ² Ψ_{c,1} + ... + σ^{2 n_ψ} Ψ_{c,n_ψ}

estimate: Ψ_c(σ²) θ = 0; computing σ and θ simultaneously is a polynomial eigenvalue problem

examples of static nonlinear model fitting (line styles in the figures):
KPCA dotted, PLRA dash-dotted, bias-corrected dashed

33 / 35: Example: x³ + y³ − 3xy = 0
(figure: data points and the fitted curves in the (x, y)-plane)

34 / 35: Conclusion: system identification = SLRA
- LTI model class ↔ mosaic-Hankel structure
- solving control problems as missing data estimation
- bias correction procedure for polynomial SLRA

35 / 35: Conclusion: system identification = SLRA
- LTI model class ↔ mosaic-Hankel structure
- solving control problems as missing data estimation
- bias correction procedure for polynomial SLRA

papers, course materials, and code at:
