Modeling and Analysis of Dynamic Systems
Dr. Guillaume Ducard, © Fall 2016
Institute for Dynamic Systems and Control, ETH Zurich, Switzerland
Outline
Lecture 9: Model Parametrization
Introduction
You have come up with a mathematical model of a system, which contains some parameters (e.g. mass, elasticity, specific heat, ...). Now you need to run experiments to identify these model parameters. How should you proceed?
Introduction
Least Squares Methods:
Classical LS methods for static and linear systems: closed-form solutions available.
Nonlinear LS methods for dynamic and nonlinear systems: only numerical (optimization) solutions available.
Remark: there are closed-form approaches for linear dynamic systems as well; see master-level courses (e.g. Introduction to Recursive Filtering and Estimation).
Planning experiments, so as to identify the system parameters efficiently, is about knowing:
What excitation of the system: choice of correct input signals
What to measure in the system (choice of sensors, their location, etc.)
Measurements for linear or nonlinear model identification
Frequency content of the excitation signals
Noise level at input and output of the system
Safety issues
Choose the signals such that all the relevant dynamics and static effects inside the plant are excited with the correct amount of input energy. See pp. 75-76 of the script.
The data obtained experimentally may be used for two purposes:
1. To identify unknown system structures and system parameters, using a first set of data: u_1, y_r,1.
(Block diagram: input u_1 drives both the real plant, output y_r,1, and the modeled system, output y_m.)
2. To validate the results of the system modeling and parameter identification, using a second set of data: u_2, y_r,2.
(Block diagram: input u_2 drives both the real plant, output y_r,2, and the modeled system, output y_m.)
A word of caution: it is of fundamental importance not to use the same data set for both purposes. The real quality of a parameterized model may only be assessed by comparing the prediction of that model with measurement data that have not been used in the model parametrization.
(Block diagram: input u_2 drives both the real plant, output y_r,2, and the modeled system, output y_m.)
Remark: the model and its identification are validated if, for the same input signal u_2, the output signals y_r,2 and y_m are sufficiently similar.
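A minimal sketch of this two-data-set practice (NumPy; the plant, regressors, and noise level are illustrative assumptions, not part of the script):

import numpy as np

rng = np.random.default_rng(0)

def collect_data(n):
    # Hypothetical experiment: excite the "real plant" and record input/output.
    u = rng.uniform(-1.0, 1.0, n)
    y = 2.0 * u - 0.5 * u**2 + 0.05 * rng.standard_normal(n)
    return u, y

# First data set (u_1, y_r,1): used only for identification.
u1, y1 = collect_data(100)
H1 = np.column_stack([u1, u1**2])              # regressors of the assumed model
pi_hat, *_ = np.linalg.lstsq(H1, y1, rcond=None)

# Second, independent data set (u_2, y_r,2): used only for validation.
u2, y2 = collect_data(100)
y_m = np.column_stack([u2, u2**2]) @ pi_hat    # model output y_m for input u_2
print("validation RMS error:", np.sqrt(np.mean((y2 - y_m) ** 2)))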
Introduction
Least Squares estimation is used to fit the parameters of a linear and static model (a model that mathematically describes the inputs/outputs of the system). The model is never exact: for the same input signals there will be a difference between the outputs of the model and the true system outputs: the modeling errors.
Remark: these errors may be considered as deterministic or as stochastic variables. Both formulations are equivalent, as long as these errors are completely unpredictable and not correlated with the inputs.
LS Formulation
(Figure: elementary least-squares model structure; input u and error e enter the system's model, which produces output y.)
It is assumed that the output of the real system may be approximated by the output of the system's model with some model error e, according to the linear equation
y(k) = h^T(u(k)) π + e(k)
with k in [1, ..., r] the discrete-time instant.
y(k) = h^T(u(k)) π + e(k)
k: index of discrete time (discrete-time instant k)
u(k) ∈ R^m: input vector
y(k) ∈ R: output signal (measurement, scalar)
π ∈ R^q: vector of the q unknown parameters (those we want to estimate)
h(.) ∈ R^q: regressor, depends on u in a nonlinear but algebraic way
e(k): error (scalar)
Typically, there are more measurements than unknown parameters (r > q).
LS Objective: estimate π ∈ R^q, the vector of unknown parameters, such that the model error e is minimized. In order to do that, let us formulate the problem in matrix form (derived on the blackboard during the class), followed by an example.
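As a compact sketch of that matrix form (using the notation defined above): stacking the r measurement equations gives
ỹ = [y(1), ..., y(r)]^T ∈ R^r,  H = [h^T(u(1)); ...; h^T(u(r))] ∈ R^(r×q),  ẽ = [e(1), ..., e(r)]^T,
so that ỹ = H π + ẽ, where H is the regression matrix used in the solution below.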
Least-squares solution and comments
π_LS = [H^T W H]^{-1} H^T W ỹ
The regression matrix H must have full column rank, i.e., all q parameters (π_1, π_2, ..., π_q) are required to explain the data.
Moore-Penrose pseudo-inverse: M^+ = (M^T M)^{-1} M^T, with M ∈ R^(r×q), r > q, rank{M} = q.
Least-squares solution and comments
If the error e is an uncorrelated white-noise signal with mean value 0 and variance σ^2, then:
1. the expected value of the parameter estimate π_LS is equal to its true value, E(π_LS) = π_true (of course, only if the model perfectly describes the true system);
2. the covariance matrix of the estimate is Σ = σ^2 (H^T W H)^{-1}.
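A minimal numerical sketch of the batch solution and the covariance formula above (NumPy; the quadratic regressor, noise level, and "true" parameters are illustrative assumptions):

import numpy as np

rng = np.random.default_rng(1)
pi_true = np.array([2.0, -0.5])                    # "true" parameters (assumed)
u = rng.uniform(-1.0, 1.0, 50)
H = np.column_stack([u, u**2])                     # regression matrix H (r x q)
sigma = 0.05
y = H @ pi_true + sigma * rng.standard_normal(u.size)

W = np.eye(u.size)                                 # weighting matrix (identity here)
pi_ls = np.linalg.solve(H.T @ W @ H, H.T @ W @ y)  # pi_LS = [H^T W H]^-1 H^T W y
Sigma = sigma**2 * np.linalg.inv(H.T @ W @ H)      # covariance of the estimate
print(pi_ls, np.sqrt(np.diag(Sigma)))              # estimate and its standard deviations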
Least-squares solution: geometric interpretation
Particular case: q = 2, r = 3.
The result of the LS identification can be interpreted geometrically: the columns of H define the directions (projection vectors) that span a plane (defined in this case by the two vectors h_1 and h_2), and therefore ẽ_LS is perpendicular to that plane.
ỹ = H π_LS + ẽ_LS = [h_1 h_2] [π_LS,1; π_LS,2] + ẽ_LS = π_LS,1 h_1 + π_LS,2 h_2 + ẽ_LS
(Figure: ỹ projected onto the plane spanned by h_1 and h_2; the projection is π_LS,1 h_1 + π_LS,2 h_2 and the residual ẽ_LS is perpendicular to that plane.)
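The orthogonality of the residual to the columns of H can be checked numerically; a small sketch for the particular case q = 2, r = 3 (the numbers are arbitrary):

import numpy as np

H = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])                  # columns h_1, h_2 (r = 3, q = 2)
y = np.array([0.1, 1.1, 1.9])
pi_ls, *_ = np.linalg.lstsq(H, y, rcond=None)
e_ls = y - H @ pi_ls                        # residual e_LS
print(H.T @ e_ls)                           # ~[0, 0]: e_LS is orthogonal to span{h_1, h_2}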
Iterative Least Squares
Up to now, a batch-like approach (1) has been assumed:
π_LS = [H^T W H]^{-1} H^T W ỹ
Problems:
1. The matrix inversion is the most time-consuming step of the computation.
2. Assuming that r measurements have been taken and a solution has been computed, it is numerically very inefficient to repeat the full matrix-inversion procedure when additional measurement data become available.
(1) Batch-like approach: 1. all measurements are made; 2. the data are organized in the LS problem formulation; 3. the LS solution is computed once.
Instead, an iterative solution of the form
π_LS(r+1) = f(π_LS(r), y(r+1)), initialized by π_LS(0) = E{π},
would be much more efficient. How do we build up a recursive least-squares algorithm?
Recursive LS Formulation
1. Start: π_LS = [H^T W H]^{-1} H^T W ỹ
2. Simplification: consider the weighting matrix simply as W = I (the extension with W is easily possible):
π_LS = [H^T H]^{-1} H^T ỹ
3. Formulate the matrix products as sums:
π_LS(r) = [Σ_{k=1}^{r} h(k) h^T(k)]^{-1} Σ_{k=1}^{r} h(k) y(k)
4. Use the matrix inversion lemma.
Matrix Inversion Lemma
Suppose M ∈ R^(n×n) is a regular matrix (det(M) ≠ 0), and v ∈ R^n is a column vector which satisfies the condition 1 + v^T M^{-1} v ≠ 0. In this case:
[M + v v^T]^{-1} = M^{-1} - (1 / (1 + v^T M^{-1} v)) M^{-1} v v^T M^{-1}
Remarks:
Proof by inspection: multiply from the left with M + v v^T.
Main advantage of this lemma: no matrix inversion other than M^{-1} is needed, so the inversion of the new matrix M + v v^T may be carried out very efficiently.
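A quick numerical check of the lemma (a random symmetric positive-definite M and a random v; purely illustrative):

import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))
M = A @ A.T + 4.0 * np.eye(4)               # a regular (invertible) matrix
v = rng.standard_normal(4)

M_inv = np.linalg.inv(M)
lhs = np.linalg.inv(M + np.outer(v, v))
rhs = M_inv - (M_inv @ np.outer(v, v) @ M_inv) / (1.0 + v @ M_inv @ v)
print(np.allclose(lhs, rhs))                # True: only M^-1 is needed on the right-hand side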
π_LS(r) = [Σ_{k=1}^{r} h(k) h^T(k)]^{-1} Σ_{k=1}^{r} h(k) y(k)
To simplify the notation, a matrix Ω is defined as:
Ω(r) = [Σ_{k=1}^{r} h(k) h^T(k)]^{-1}
Then compute Ω(r+1):
Ω(r+1) = [Σ_{k=1}^{r+1} h(k) h^T(k)]^{-1} = [Σ_{k=1}^{r} h(k) h^T(k) + h(r+1) h^T(r+1)]^{-1}
Ω(r+1) = [Σ_{k=1}^{r} h(k) h^T(k) + h(r+1) h^T(r+1)]^{-1}
We use the inversion lemma [M + v v^T]^{-1} = M^{-1} - (1 / (1 + v^T M^{-1} v)) M^{-1} v v^T M^{-1}, with M^{-1} = Ω(r) and v = h(r+1).
Recursive formulation of the matrix inverse:
Ω(r+1) = Ω(r) - (1 / (1 + c(r+1))) Ω(r) h(r+1) h^T(r+1) Ω(r)
where c(r+1) = h^T(r+1) Ω(r) h(r+1) (scalar).
π_LS(r) = Ω(r) Σ_{k=1}^{r} h(k) y(k)
How can the estimate be computed recursively?
π_LS(r+1) = Ω(r+1) Σ_{k=1}^{r+1} h(k) y(k)
= [Ω(r) - (1 / (1 + c(r+1))) Ω(r) h(r+1) h^T(r+1) Ω(r)] (Σ_{k=1}^{r} h(k) y(k) + h(r+1) y(r+1))
Expanding the product and collecting terms:
π_LS(r+1) = Ω(r) Σ_{k=1}^{r} h(k) y(k) + Ω(r) h(r+1) y(r+1)
- (1 / (1 + c(r+1))) Ω(r) h(r+1) h^T(r+1) Ω(r) Σ_{k=1}^{r} h(k) y(k)
- (1 / (1 + c(r+1))) Ω(r) h(r+1) h^T(r+1) Ω(r) h(r+1) y(r+1)
where Ω(r) Σ_{k=1}^{r} h(k) y(k) = π_LS(r) and h^T(r+1) Ω(r) h(r+1) = c(r+1). Since 1 - c(r+1)/(1 + c(r+1)) = 1/(1 + c(r+1)), the two terms in y(r+1) combine, and
π_LS(r+1) = π_LS(r) - (1 / (1 + c(r+1))) Ω(r) h(r+1) h^T(r+1) π_LS(r) + (1 / (1 + c(r+1))) Ω(r) h(r+1) y(r+1)
Recursive computation of the parameter vector π_LS(r):
π_LS(r+1) = π_LS(r) + (1 / (1 + c(r+1))) Ω(r) h(r+1) (y(r+1) - h^T(r+1) π_LS(r))
with the recursive update of the gain matrix Ω:
Ω(r+1) = Ω(r) - (1 / (1 + c(r+1))) Ω(r) h(r+1) h^T(r+1) Ω(r)
where c(r+1) = h^T(r+1) Ω(r) h(r+1) (scalar),
and the initialization π_LS(0), Ω(0).
The recursive computation of the parameter vector
π_LS(r+1) = π_LS(r) + (1 / (1 + c(r+1))) Ω(r) h(r+1) (y(r+1) - h^T(r+1) π_LS(r))
can be rewritten as:
π_LS(r+1) = π_LS(r) + δ(r+1) (y(r+1) - h^T(r+1) π_LS(r))
Comments on the recursive formulation:
The term δ(r+1) = (1 / (1 + c(r+1))) Ω(r) h(r+1) is a vector indicating the correction direction, which is applied to the innovation term (or prediction error).
It is interesting to note that the correction direction does not depend on the magnitude of the prediction error.
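A minimal sketch of these recursive update equations (NumPy; the regressor, noise level, and initialization values are illustrative assumptions):

import numpy as np

rng = np.random.default_rng(3)
pi_true = np.array([2.0, -0.5])             # parameters to be identified (assumed)

pi_hat = np.zeros(2)                        # pi_LS(0): prior guess
Omega = 1e3 * np.eye(2)                     # Omega(0): large, i.e. low confidence in the prior

for _ in range(200):
    u = rng.uniform(-1.0, 1.0)
    h = np.array([u, u**2])                 # regressor h(r+1)
    y = pi_true @ h + 0.05 * rng.standard_normal()
    c = h @ Omega @ h                                # c(r+1) = h^T Omega h (scalar)
    delta = (Omega @ h) / (1.0 + c)                  # correction direction delta(r+1)
    pi_hat = pi_hat + delta * (y - h @ pi_hat)       # parameter update
    Omega = Omega - np.outer(Omega @ h, h @ Omega) / (1.0 + c)   # gain-matrix update

print(pi_hat)                               # approaches pi_true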
Exponential Forgetting
New error weighting for the recursive case:
ε(r) = Σ_{k=1}^{r} λ^{r-k} [y(k) - h^T(k) π_LS(k)]^2, λ < 1
This introduces an exponential forgetting process: older errors have a smaller influence on the result of the parameter estimation. It can cope with slowly varying parameters.
Update equations:
π_LS(r+1) = π_LS(r) + (1 / (λ + c(r+1))) Ω(r) h(r+1) (y(r+1) - h^T(r+1) π_LS(r))
Ω(r+1) = (1/λ) Ω(r) [I - (1 / (λ + c(r+1))) h(r+1) h^T(r+1) Ω(r)]
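With exponential forgetting, only the two update lines of the previous sketch change; a sketch with an illustrative λ (the slowly drifting plant and the regressor are again assumptions):

import numpy as np

rng = np.random.default_rng(4)
lam = 0.98                                  # forgetting factor lambda < 1 (illustrative)
pi_hat = np.zeros(2)
Omega = 1e3 * np.eye(2)

for k in range(500):
    u = rng.uniform(-1.0, 1.0)
    h = np.array([u, u**2])
    pi_true = np.array([2.0 + 1e-3 * k, -0.5])       # slowly varying parameter
    y = pi_true @ h + 0.05 * rng.standard_normal()
    c = h @ Omega @ h
    pi_hat = pi_hat + (Omega @ h) * (y - h @ pi_hat) / (lam + c)
    Omega = (Omega - np.outer(Omega @ h, h @ Omega) / (lam + c)) / lam

print(pi_hat)                               # tracks the slowly varying pi_true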
Kaczmarz's projection algorithm
Each new prediction error
e(r+1) = y(r+1) - h^T(r+1) π(r)
contains new information on the parameters π only in the direction of h(r+1). Therefore, the π(r+1) is sought that requires the smallest possible change π(r+1) - π(r) to explain the new observation.
Cost function to minimize:
J(π) = (1/2) [π(r+1) - π(r)]^T (π(r+1) - π(r)) + µ [y(r+1) - h^T(r+1) π(r+1)]
Necessary conditions for the minimum:
∂J/∂π(r+1) = 0, ∂J/∂µ = 0
Solving these linear equations for π(r+1) and µ yields
π(r+1) = π(r) + (h(r+1) / (h^T(r+1) h(r+1))) [y(r+1) - h^T(r+1) π(r)]
Usually this solution is modified as
π(r+1) = π(r) + (γ h(r+1) / (λ + h^T(r+1) h(r+1))) [y(r+1) - h^T(r+1) π(r)], 0 < γ < 2, 0 < λ < 1
to achieve the desired convergence and forgetting.
Discussion:
Kaczmarz's projection algorithm requires less computational effort than regular LS.
It converges much more slowly than the regular LS algorithm.
The choice of algorithm depends on the resources at hand and the convergence-speed requirements.
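A sketch of the modified Kaczmarz update above (γ, λ, the plant, and the regressor are illustrative choices):

import numpy as np

rng = np.random.default_rng(5)
pi_true = np.array([2.0, -0.5])
pi_hat = np.zeros(2)
gamma, lam = 1.0, 0.1                       # 0 < gamma < 2, 0 < lambda < 1 (illustrative)

for _ in range(2000):
    u = rng.uniform(-1.0, 1.0)
    h = np.array([u, u**2])
    y = pi_true @ h + 0.05 * rng.standard_normal()
    e = y - h @ pi_hat                              # prediction error
    pi_hat = pi_hat + gamma * h * e / (lam + h @ h) # projection step along h(r+1)

print(pi_hat)                               # converges, but more slowly than recursive LS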
Next lecture + upcoming exercise
Next lecture: Stability Analysis, Properties of Linear Systems
Next exercises: Least squares, Parameter identification