EL1820 Modeling of Dynamical Systems


EL1820 Modeling of Dynamical Systems
Lecture 9 - Parameter estimation in linear models

Outline:
- Model structures
- Parameter estimation via prediction error minimization
- Properties of the estimate: bias and variance

Today's goal

You should be able to:
- distinguish between common model structures used in identification
- estimate model parameters using the prediction-error method
- calculate the optimal parameters for ARX models using least squares
- estimate bias and variance of estimates from model and input signal properties

System identification

Basic idea: estimate the system from measurements of u(t) and y(t).

[Block diagram: input u(t) and disturbance w(t) enter the System; the output y(t) is measured with noise e(t); u(kh) and y(kh) are the sampled signals.]

Many issues:
- choice of sampling frequency and input signal (experimental conditions)
- what class of models; how to model disturbances?
- estimating model parameters from sampled, finite and noisy data

System identification via parameter estimation

[Block diagram: input u[k] and disturbance w[k] enter a Linear system producing y[k].]

Need to fix the model structure before trying to estimate parameters:
- system model and disturbance model
- model order (degrees of the transfer function polynomials)

Model structures

Commonly used model structures (BJ includes the others as special cases):
- ARX (autoregressive with exogenous input): $A(q)\,y[k] = B(q)\,u[k] + e[k]$
- ARMAX (autoregressive moving average with exogenous input): $A(q)\,y[k] = B(q)\,u[k] + C(q)\,e[k]$
- OE (output error): $y[k] = \frac{B(q)}{F(q)}\,u[k] + e[k]$
- BJ (Box-Jenkins): $y[k] = \frac{B(q)}{F(q)}\,u[k] + \frac{C(q)}{D(q)}\,e[k]$

Transfer function parameterizations

The transfer functions G(q) and H(q) in the linear model
$$y[k] = G(q;\theta)\,u[k] + H(q;\theta)\,e[k]$$
will be parameterized as
$$G(q;\theta) = q^{-n_k}\,\frac{b_0 + b_1 q^{-1} + \dots + b_{n_b} q^{-n_b}}{1 + f_1 q^{-1} + \dots + f_{n_f} q^{-n_f}}, \qquad H(q;\theta) = \frac{1 + c_1 q^{-1} + \dots + c_{n_c} q^{-n_c}}{1 + d_1 q^{-1} + \dots + d_{n_d} q^{-n_d}}$$
where the parameter vector θ contains $\{b_k\}$, $\{f_k\}$, $\{c_k\}$, $\{d_k\}$.

Note: $n_k$ determines the dead-time; $n_b$, $n_f$, $n_c$, $n_d$ are the orders of the polynomials.
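As a concrete illustration, the following sketch (assuming NumPy and SciPy; all numerical values are arbitrary choices, not from the lecture) simulates data from this parameterization with first-order polynomials:

```python
import numpy as np
from scipy.signal import lfilter

# Simulate y[k] = G(q; theta) u[k] + H(q; theta) e[k].
# Coefficients are listed in ascending powers of q^{-1}, as on the slide.
rng = np.random.default_rng(0)
N = 1000
u = rng.standard_normal(N)          # white-noise input
e = 0.1 * rng.standard_normal(N)    # white noise driving the disturbance

B = [0.5, 0.2]    # b0 + b1 q^-1   (n_b = 1)
F = [1.0, -0.8]   # 1 + f1 q^-1    (n_f = 1)
C = [1.0, 0.3]    # 1 + c1 q^-1    (n_c = 1)
D = [1.0, -0.5]   # 1 + d1 q^-1    (n_d = 1)
nk = 1            # dead-time in samples

u_del = np.concatenate([np.zeros(nk), u[:-nk]])   # q^{-nk} u[k]
y = lfilter(B, F, u_del) + lfilter(C, D, e)       # G u + H e
```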

Model order selection from physical insight

Physical insight can often help us determine the right model order. If the system is sampled using a zero-order hold (input piecewise constant):
- $n_f$ equals the number of poles of the continuous-time system
- if the system has no delay and no direct term, then $n_b = n_f - 1$ and $n_k = 1$
- if the system has no delay but a direct term, then $n_b = n_f$ and $n_k = 0$
- if the continuous system has a time delay τ, then $n_k = \tau/h + 1$

Note: $n_b$ does not depend on the number of continuous-time zeros!
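A quick numerical check of these rules (a sketch assuming SciPy; the system is the second-order example used later in this lecture): two continuous-time poles, no delay and no direct term should give $n_f = 2$, $n_b = n_f - 1 = 1$ and $n_k = 1$.

```python
from scipy.signal import cont2discrete

# G(s) = 10 / (s^2 + 2s + 10), sampled with zero-order hold, h = 0.05
num, den = [10.0], [1.0, 2.0, 10.0]
numd, dend, dt = cont2discrete((num, den), dt=0.05, method='zoh')

print(numd)  # [[~0, b0, b1]]: leading coefficient ~0, i.e. one-sample delay (n_k = 1)
print(dend)  # [1, f1, f2]: two discrete poles (n_f = 2)
```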


Basic principle of parameter estimation

[Block diagram: the input u[k] drives both the System (with disturbance w[k], output y[k]) and the Model (output ŷ[k]).]

For given parameters θ, the model predicts that the system output will be ŷ[k; θ]. Determine θ so that ŷ[k; θ] matches the observed output y[k] as closely as possible.

To solve the parameter estimation problem, we note that:
1. The value of ŷ[k; θ] depends on the disturbance model.
2. The concept "as closely as possible" must be given a mathematical formulation.

Prediction error minimization (PEM)

1. Compute $\hat{y}[k; \theta \mid k-1]$, the model's prediction of the system output, given information at time k−1.
2. Form the prediction error $\varepsilon[k] = y[k] - \hat{y}[k; \theta \mid k-1]$.
3. Construct the loss function
$$V_N(\theta) = \frac{1}{N}\sum_{k=1}^{N} \varepsilon^2[k]$$
4. The optimal θ is the one minimizing the loss function:
$$\hat{\theta} = \arg\min_{\theta} V_N(\theta)$$

Prediction using linear models

Consider the linear model
$$y[k] = G(q)\,u[k] + H(q)\,e[k].$$
Multiply by $H^{-1}(q)$ (to make the noise term white) and rewrite as
$$y[k] = (1 - H^{-1}(q))\,y[k] + H^{-1}(q)G(q)\,u[k] + e[k].$$
Since {e[k]} is a white-noise sequence, our best prediction is
$$\hat{y}[k] = (1 - H^{-1}(q))\,y[k] + H^{-1}(q)G(q)\,u[k].$$
If $n_c \geq n_d$, the prediction uses only old outputs (measured up to k−1).
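The predictor can be implemented directly by filtering, as in this sketch (assuming SciPy; the helper one_step_predictor is our own illustration). With $G = q^{-n_k}B/F$ and $H = C/D$, we have $1 - H^{-1} = (C-D)/C$ and $H^{-1}G = q^{-n_k}(DB)/(CF)$:

```python
import numpy as np
from scipy.signal import lfilter

def one_step_predictor(y, u, B, F, C, D, nk):
    # yhat[k] = (1 - H^{-1}(q)) y[k] + H^{-1}(q) G(q) u[k]
    n = max(len(C), len(D))
    Cp = np.pad(np.asarray(C, float), (0, n - len(C)))
    Dp = np.pad(np.asarray(D, float), (0, n - len(D)))
    # (C - D)/C has zero leading coefficient: only past outputs are used
    term_y = lfilter(Cp - Dp, Cp, y)
    u_del = np.concatenate([np.zeros(nk), u[:-nk]]) if nk > 0 else np.asarray(u, float)
    term_u = lfilter(np.convolve(Dp, B), np.convolve(Cp, F), u_del)
    return term_y + term_u
```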

Prediction using ARX models

For ARX models, $H = 1/A$ and $G = q^{-n_k}B/A$, so
$$(1 - H^{-1}(q))\,y[k] = (1 - A(q))\,y[k] = (-a_1 q^{-1} - \dots - a_{n_a} q^{-n_a})\,y[k]$$
$$H^{-1}(q)G(q)\,u[k] = q^{-n_k}B(q)\,u[k] = (b_0 + b_1 q^{-1} + \dots + b_{n_b} q^{-n_b})\,q^{-n_k}u[k]$$
Thus, the predictor is linear in the parameters:
$$\hat{y}[k; \theta \mid k-1] = \varphi^T[k]\,\theta$$
where
$$\theta = \begin{pmatrix} a_1 \\ \vdots \\ a_{n_a} \\ b_0 \\ \vdots \\ b_{n_b} \end{pmatrix}, \qquad \varphi[k] = \begin{pmatrix} -y[k-1] \\ \vdots \\ -y[k-n_a] \\ u[k-n_k] \\ \vdots \\ u[k-n_k-n_b] \end{pmatrix}$$

Linear regression

Linear model, linear predictor ({e[k]}: white noise):
$$y[k] = \varphi^T[k]\,\theta_0 + e[k], \qquad \hat{y}[k] = \varphi^T[k]\,\theta$$
It is convenient to express the residuals $\varepsilon[k] = y[k] - \hat{y}[k]$ in vector form,
$$\varepsilon_N = \begin{pmatrix} \varepsilon[1] \\ \vdots \\ \varepsilon[N] \end{pmatrix} = \begin{pmatrix} y[1] \\ \vdots \\ y[N] \end{pmatrix} - \begin{pmatrix} \varphi^T[1] \\ \vdots \\ \varphi^T[N] \end{pmatrix}\theta = y_N - \Phi_N\,\theta$$
Then the loss function can be written as
$$V(\theta) = \frac{1}{N}\sum_{k=1}^{N}\varepsilon^2[k] = \frac{1}{N}\varepsilon_N^T\varepsilon_N = \frac{1}{N}(y_N - \Phi_N\theta)^T(y_N - \Phi_N\theta)$$
and the optimal estimate is found by solving $\partial V/\partial\theta = 0$:
$$\hat{\theta} = (\Phi_N^T\Phi_N)^{-1}\Phi_N^T y_N$$
(provided the inverse exists; see the end of the slides for a proof).
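In code, the least-squares estimate amounts to stacking the regressors and calling a linear solver. A minimal sketch for the ARX predictor above (assuming NumPy; the function name and interface are our own illustration):

```python
import numpy as np

def arx_ls(y, u, na, nb, nk):
    # Returns theta = [a_1,...,a_na, b_0,...,b_nb] for
    # A(q) y = q^{-nk} B(q) u + e, with A = 1 + a1 q^-1 + ...
    y, u = np.asarray(y, float), np.asarray(u, float)
    k0 = max(na, nk + nb)               # first index with a complete regressor
    rows = []
    for k in range(k0, len(y)):
        past_y = [-y[k - i] for i in range(1, na + 1)]   # -y[k-1],...,-y[k-na]
        past_u = [u[k - nk - i] for i in range(nb + 1)]  # u[k-nk],...,u[k-nk-nb]
        rows.append(past_y + past_u)
    Phi = np.array(rows)
    theta, *_ = np.linalg.lstsq(Phi, y[k0:], rcond=None)  # solves Phi theta ~ y
    return theta
```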

Example: Estimation in ARX models

Estimate the model parameters a and b in the ARX model
$$y[k] = a\,y[k-1] + b\,u[k-1] + e[k]$$
from input and output sequences {y[k]}, {u[k]} for k = 0, ..., N.

Using $\theta = (a\ \ b)^T$ and $\varphi[k] = (y[k-1]\ \ u[k-1])^T$, we find
$$\Phi_N^T\Phi_N = \begin{pmatrix} \sum_{k=1}^{N} y^2[k-1] & \sum_{k=1}^{N} y[k-1]u[k-1] \\ \sum_{k=1}^{N} u[k-1]y[k-1] & \sum_{k=1}^{N} u^2[k-1] \end{pmatrix}$$
so the optimal estimate is given by
$$\hat{\theta} = \begin{pmatrix} \sum_{k=1}^{N} y^2[k-1] & \sum_{k=1}^{N} y[k-1]u[k-1] \\ \sum_{k=1}^{N} u[k-1]y[k-1] & \sum_{k=1}^{N} u^2[k-1] \end{pmatrix}^{-1} \begin{pmatrix} \sum_{k=1}^{N} y[k-1]y[k] \\ \sum_{k=1}^{N} u[k-1]y[k] \end{pmatrix}$$

Note: the estimate is computed using (sample) covariances of u[k] and y[k].

Estimation in general model structures

Estimation is more difficult when the predictor is not linear in the parameters. In general, we need to minimize $V_N(\theta)$ using iterative numerical methods, e.g.,
$$\theta^{(i+1)} = \theta^{(i)} - \mu^{(i)} M^{(i)} V_N'(\theta^{(i)})$$

Example: Newton's method uses $M^{(i)} = (V_N''(\theta^{(i)}))^{-1}$, while Gauss-Newton approximates $M^{(i)}$ using first-order derivatives.

Problem: the result is locally optimal, but not necessarily globally optimal.
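A generic Gauss-Newton iteration can be sketched as follows (assuming NumPy; predictor is a hypothetical user-supplied function returning ŷ for the whole data record, and the Jacobian is approximated by central differences):

```python
import numpy as np

def pem_gauss_newton(y, predictor, theta0, iters=20, mu=1.0, h=1e-6):
    theta = np.asarray(theta0, float)
    for _ in range(iters):
        eps = y - predictor(theta)                    # prediction errors
        J = np.empty((len(y), len(theta)))            # J = d eps / d theta
        for i in range(len(theta)):
            d = np.zeros_like(theta); d[i] = h
            J[:, i] = -(predictor(theta + d) - predictor(theta - d)) / (2 * h)
        step, *_ = np.linalg.lstsq(J, eps, rcond=None)  # Gauss-Newton direction
        theta = theta - mu * step                       # may converge to a local minimum
    return theta
```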

Example

$G(s) = 10/(s^2 + 2s + 10)$ sampled with h = 0.05, var{v} = 0.1².

[Bode plot (magnitude and phase vs. frequency in rad/s) comparing the true system with estimated ARX and OE models.]

Model structure matters!


Properties of PEM estimates

What can we say about models estimated using prediction-error minimization? Model errors have two components:
1. Bias errors: arise if the model is unable to capture the true system
2. Variance errors: due to the influence of stochastic disturbances

We will study two properties of general prediction-error methods:
1. Convergence: what happens with $\hat{\theta}_N$ as N grows?
2. Accuracy: what can we say about the size of $\hat{\theta}_N - \theta_0$ as N increases?

Convergence

If the disturbances acting on the system are stochastic, then so is the prediction error ε[k]. Under quite general conditions (even if the ε[k] are not independent),
$$\lim_{N\to\infty} \frac{1}{N}\sum_{k=1}^{N} \varepsilon^2[k;\theta] = E\{\varepsilon^2[k;\theta]\}$$
and
$$\hat{\theta}_N \to \theta^\star = \arg\min_{\theta} E\{\varepsilon^2[k;\theta]\} \quad \text{as } N \to \infty$$
Even if the model cannot reflect reality, the estimate will minimize the prediction mean squared error!

Example

Assume you try to estimate the parameter b in the model
$$\hat{y}[k] = b\,u[k-1]$$
while the true system is
$$y[k] = u[k-1] + u[k-2] + e[k]$$
where {u[k]} and {e[k]} are white-noise signals, independent of each other. What will the PEM estimate converge to?

PEM will find the parameters that minimize the mean squared error:
$$E\{\varepsilon^2[k]\} = E\{(y[k] - \hat{y}[k])^2\} = E\{(u[k-1] + u[k-2] + e[k] - b\,u[k-1])^2\}$$
$$= E\{((1-b)\,u[k-1] + u[k-2])^2\} + \sigma_e^2 = (1-b)^2\sigma_u^2 + \sigma_u^2 + \sigma_e^2$$
This expression is minimized by b = 1 (the asymptotic estimate).
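This asymptotic result is easy to verify numerically; a sketch (assuming NumPy) fitting ŷ[k] = b u[k−1] by least squares to simulated data:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 100_000
u = rng.standard_normal(N)                # white input
e = rng.standard_normal(N)                # white noise
y = np.empty(N); y[:2] = 0.0
y[2:] = u[1:-1] + u[:-2] + e[2:]          # y[k] = u[k-1] + u[k-2] + e[k]

phi = u[1:-1]                             # regressor u[k-1] for k = 2,...,N-1
bhat = phi @ y[2:] / (phi @ phi)          # scalar least squares
print(bhat)                               # close to 1 for large N
```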

Consistency

Assume that there is some $\theta_0$ such that $\{\varepsilon[k;\theta_0]\}$ is white noise; then $E\{\varepsilon^2[k;\theta]\}$ is minimized by this value (see the end of the slides for a proof). If, moreover,
$$\hat{y}[k;\theta_0] = \hat{y}[k;\theta] \implies \theta = \theta_0$$
then one can conclude that
$$\hat{\theta}_N \to \theta_0 \quad \text{as } N \to \infty$$

θ*: frequency-domain characterization

Assume that the true system is described by
$$y[k] = G_0(q)\,u[k] + w[k]$$
and that we try to estimate a model of the form ($H_\star(q)$ independent of θ)
$$y[k] = G(q;\theta)\,u[k] + H_\star(q)\,e[k]$$
If {u[k]} and {w[k]} are independent,
$$\theta^\star = \lim_{N\to\infty}\hat{\theta}_N = \arg\min_{\theta} \int_{-\pi}^{\pi} \left|G_0(e^{i\omega}) - G(e^{i\omega};\theta)\right|^2 \frac{\Phi_u(\omega)}{|H_\star(e^{i\omega})|^2}\,d\omega$$
θ* minimizes a least-squares criterion weighted by $\Phi_u(\omega)/|H_\star(e^{i\omega})|^2$: good fit where $\Phi_u(\omega)$ has much energy, or $H_\star(e^{i\omega})$ has little energy. We can thus focus model accuracy on the important frequency range by choosing {u[k]}, as the sketch below illustrates.
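The weighting by $\Phi_u(\omega)$ can be seen by rerunning the previous numerical example with a colored input (a sketch assuming NumPy/SciPy). With a low-pass AR(1) input, the limit of the same one-parameter fit becomes $b^\star = 1 + r_u(1)/r_u(0) \approx 1.9$ instead of 1: the input spectrum changes the limit model.

```python
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(2)
N = 200_000
u = lfilter([1.0], [1.0, -0.9], rng.standard_normal(N))   # low-frequency input
e = rng.standard_normal(N)
y = np.empty(N); y[:2] = 0.0
y[2:] = u[1:-1] + u[:-2] + e[2:]          # same true system as before

phi = u[1:-1]
print(phi @ y[2:] / (phi @ phi))          # approx 1.9: Phi_u reweights the fit
```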

Example

Output-error method using low- and high-frequency input signals.

[Two Bode magnitude plots (frequency in rad/s), each comparing the true system with the estimated OE model, one per input signal.]

Estimation error variance

If {e[k]} is white noise with variance λ, then
$$E\{(\hat{\theta} - \theta_0)(\hat{\theta} - \theta_0)^T\} \approx \frac{1}{N}\,\lambda R^{-1}$$
where
$$R = E\{\psi[k;\theta_0]\,\psi^T[k;\theta_0]\}, \qquad \psi[k;\theta_0] = \frac{d}{d\theta}\hat{y}[k;\theta]\Big|_{\theta=\theta_0}$$
The error variance decreases with:
- the sensitivity of the prediction error (w.r.t. the parameters)
- the number of measurements

Estimation error variance, cont'd

We can estimate the estimation error variance via
$$\hat{P}_N = \frac{1}{N}\,\hat{\lambda}\,\hat{R}_N^{-1}$$
where
$$\hat{\lambda} = \frac{1}{N}\sum_{k=1}^{N}\varepsilon^2[k;\hat{\theta}_N], \qquad \hat{R}_N = \frac{1}{N}\sum_{k=1}^{N}\psi[k;\hat{\theta}_N]\,\psi^T[k;\hat{\theta}_N]$$
Moreover, one can show that
$$\sqrt{N}\,(\hat{\theta}_N - \theta_0) \xrightarrow{d} \mathcal{N}(0, \lambda R^{-1})$$
This can be used to compute confidence regions for the parameter estimates, as in the sketch below.
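A sketch of how these formulas translate into confidence intervals (assuming NumPy; predictor is the same hypothetical helper as in the Gauss-Newton sketch, and ψ is approximated by central differences):

```python
import numpy as np

def confidence_intervals(y, predictor, theta_hat, h=1e-6):
    theta_hat = np.asarray(theta_hat, float)
    N = len(y)
    eps = y - predictor(theta_hat)
    lam = eps @ eps / N                                  # lambda-hat
    Psi = np.empty((N, len(theta_hat)))                  # psi[k] = d yhat[k]/d theta
    for i in range(len(theta_hat)):
        d = np.zeros_like(theta_hat); d[i] = h
        Psi[:, i] = (predictor(theta_hat + d) - predictor(theta_hat - d)) / (2 * h)
    R = Psi.T @ Psi / N                                  # R-hat_N
    P = lam * np.linalg.inv(R) / N                       # P-hat_N = lam R^-1 / N
    std = np.sqrt(np.diag(P))
    return theta_hat - 1.96 * std, theta_hat + 1.96 * std  # approx 95% intervals
```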

Error variance in the frequency domain

For the variance of the frequency response of the estimate, we have
$$\operatorname{var}\{G(e^{i\omega};\hat{\theta})\} \approx \frac{n}{N}\,\frac{\Phi_w(\omega)}{\Phi_u(\omega)}, \qquad n, N \gg 1$$
The variance:
- increases with the number of model parameters n
- decreases with the number of observations and with the signal-to-noise ratio
- again, the frequency content of the input influences the accuracy of the model

This is similar to the spectral analysis error bounds. $|G(e^{i\omega};\hat{\theta})|$ typically decreases as $\omega \to \pi/h$, while the variance is constant (or increases!) ⇒ high relative error at high frequencies.

Example

Confidence intervals for frequency responses for two different input spectra.

[Four plots (frequency in rad/sec): the two input spectra (top) and the corresponding frequency-response estimates with confidence intervals (bottom).]

Next lecture

Experimental conditions and model validation.

Bonus: calculation of ∂V/∂θ

$$V(\theta) = \frac{1}{N}(y_N - \Phi_N\theta)^T(y_N - \Phi_N\theta) = \frac{1}{N}\Big(y_N^T y_N - 2\underbrace{\theta^T\Phi_N^T y_N}_{\sum_i \theta_i(\Phi_N^T y_N)_i} + \underbrace{\theta^T\Phi_N^T\Phi_N\theta}_{\sum_{i,j}\theta_i\theta_j(\Phi_N^T\Phi_N)_{ij}}\Big)$$
Therefore,
$$\frac{\partial V}{\partial\theta_k} = -\frac{2}{N}(\Phi_N^T y_N)_k + \frac{2}{N}\sum_i(\Phi_N^T\Phi_N)_{ki}\,\theta_i$$
or
$$\frac{\partial V}{\partial\theta} = -\frac{2}{N}\Phi_N^T y_N + \frac{2}{N}(\Phi_N^T\Phi_N)\,\theta \overset{!}{=} 0 \quad \text{for } \theta = \hat{\theta}$$
Hence
$$(\Phi_N^T\Phi_N)\,\hat{\theta} = \Phi_N^T y_N \implies \hat{\theta} = (\Phi_N^T\Phi_N)^{-1}\Phi_N^T y_N$$

Bonus: proof that $E\{\varepsilon^2[k;\theta]\}$ is minimized at $\theta_0$ when $\{\varepsilon[k;\theta_0]\}$ is white noise

$$E\{\varepsilon^2[k;\theta]\} = E\{(\underbrace{y[k] - \hat{y}[k;\theta_0]}_{=\varepsilon[k;\theta_0]} + \hat{y}[k;\theta_0] - \hat{y}[k;\theta])^2\}$$
$$= E\{\varepsilon^2[k;\theta_0]\} + E\{(\hat{y}[k;\theta_0] - \hat{y}[k;\theta])^2\} + 2E\{\varepsilon[k;\theta_0](\hat{y}[k;\theta_0] - \hat{y}[k;\theta])\} \geq E\{\varepsilon^2[k;\theta_0]\}$$
if $E\{\varepsilon[k;\theta_0](\hat{y}[k;\theta_0] - \hat{y}[k;\theta])\} = 0$.

Now, $y[k] = \varepsilon[k;\theta_0] + \hat{y}[k;\theta_0]$ is a function of ε[k; θ₀], ε[k−1; θ₀], ..., u[k], u[k−1], ..., because ŷ[k; θ₀] is a function of y[k−1], y[k−2], ... and u[k], u[k−1], ..., where y[k−1], ... depend on previous values ε[k−1; θ₀], ..., and so on.

Then, since {ε[k; θ₀]} is white noise, ε[k; θ₀] is uncorrelated with y[k−1], ... and u[k], ..., hence it is uncorrelated with both ŷ[k; θ₀] and ŷ[k; θ], i.e.,
$$E\{\varepsilon[k;\theta_0](\hat{y}[k;\theta_0] - \hat{y}[k;\theta])\} = 0$$
This shows that $E\{\varepsilon^2[k;\theta]\} \geq E\{\varepsilon^2[k;\theta_0]\}$ for all θ.