System Identification and Optimization Methods Based on Derivatives. Chapters 5 & 6 from Jang

Jessie Bryant
6 years ago
1 System Identification and Optimization Methods Based on Derivatives Chapters 5 & 6 from Jang

2 Neuro-Fuzzy and Soft Computing
- Model space: adaptive networks, neural networks, fuzzy inference systems
- Approach space: derivative-free optimization, derivative-based optimization

3 Outline
- System identification
- Least-squares optimization
- Optimization based on differentiation
  - First-order differentiation: steepest descent
  - Second-order differentiation

4 System Identification: Introduction
Goal: determine a mathematical model for an unknown system (the target system) by observing its input-output data pairs.

5 System Identification: Introduction
Purpose:
- To predict a system's behavior, as in time series prediction and weather forecasting
- To explain the interactions and relationships between the inputs and outputs of a system
- To design a controller based on a model of the system, as in aircraft or ship control
- To simulate the system under control once the model is known

6 System Identification: Introduction
Two main steps are involved:
- Structure identification
- Parameter identification

7 System Identification: Introduction
Structure identification: apply a priori knowledge about the target system to determine a class of models within which the search for the most suitable model is conducted. This class of models is denoted by a function $y = f(u, \theta)$, where:
- $y$ is the model output
- $u$ is the input vector
- $\theta$ is the parameter vector
The form of $f$ depends on the problem at hand, on the designer's experience, and on the laws of nature governing the target system.

8 System Identification: Introduction
Parameter identification: the structure of the model is known, but optimization techniques are still needed to determine the parameter vector $\theta = \hat{\theta}$ such that the resulting model $\hat{y} = f(u, \hat{\theta})$ describes the system appropriately, that is, $y_i - \hat{y}_i \to 0$ with $y_i$ assigned to $u_i$.

10 System Identification: Introduction
The data set composed of $m$ desired input-output pairs $(u_i; y_i)$, $i = 1, \dots, m$, is called the training data. System identification performs both structure and parameter identification repeatedly until a satisfactory model is found, as follows:
1) Specify and parameterize a class of mathematical models representing the system to be identified
2) Perform parameter identification to choose the parameters that best fit the training data set
3) Conduct validation tests to see whether the identified model responds correctly to an unseen data set
4) Terminate the procedure once the results of the validation tests are satisfactory. Otherwise, select another class of models and repeat steps 2 to 4
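The four steps can be sketched as a loop over candidate model classes. This is only an illustration under stated assumptions: the model classes are taken to be polynomials of increasing degree, and the names `identify`, `target`, and `tol` are hypothetical, not from the slides.

```python
import numpy as np

def identify(u_train, y_train, u_val, y_val, max_degree=5, tol=1e-2):
    """Try polynomial model classes of increasing degree until validation passes."""
    for degree in range(1, max_degree + 1):
        # Step 1: choose a model class (assumed here: polynomial of this degree).
        A = np.vander(u_train, degree + 1, increasing=True)
        # Step 2: parameter identification on the training data.
        theta, *_ = np.linalg.lstsq(A, y_train, rcond=None)
        # Step 3: validation on data not used for fitting.
        A_val = np.vander(u_val, degree + 1, increasing=True)
        val_mse = float(np.mean((y_val - A_val @ theta) ** 2))
        # Step 4: stop if satisfactory, otherwise move to another model class.
        if val_mse < tol:
            return degree, theta, val_mse
    raise RuntimeError("no model class met the validation tolerance")

# Hypothetical target system: y = 1 + 2u - u^2, observed without noise.
target = lambda u: 1.0 + 2.0 * u - u ** 2
u_train = np.linspace(-2.0, 2.0, 9)
u_val = np.linspace(-1.75, 1.75, 8)
degree, theta, val_mse = identify(u_train, target(u_train), u_val, target(u_val))
print(degree, theta, val_mse)
```

In this toy setting the linear class fails validation and the quadratic class is accepted, which mirrors the "select another class and repeat" branch of step 4.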

11 Least-Square Estimators
General form:
$$y = \theta_1 f_1(u) + \theta_2 f_2(u) + \cdots + \theta_n f_n(u) \quad (*)$$
where:
- $u = (u_1, \dots, u_p)^T$ is the model input vector
- $f_1, \dots, f_n$ are known functions of $u$
- $\theta_1, \dots, \theta_n$ are unknown parameters to be estimated

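12 Least-Square Estimators
The task of fitting data using a linear model is referred to as linear regression. We collect a training data set $\{(u_i; y_i),\ i = 1, \dots, m\}$. Equation (*) becomes:
$$
\begin{aligned}
f_1(u_1)\theta_1 + f_2(u_1)\theta_2 + \cdots + f_n(u_1)\theta_n &= y_1 \\
f_1(u_2)\theta_1 + f_2(u_2)\theta_2 + \cdots + f_n(u_2)\theta_n &= y_2 \\
&\ \,\vdots \\
f_1(u_m)\theta_1 + f_2(u_m)\theta_2 + \cdots + f_n(u_m)\theta_n &= y_m
\end{aligned}
$$
which is equivalent to $A\theta = y$.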

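13 Least-Square Estimators
where $A$ is an $m \times n$ matrix:
$$A = \begin{bmatrix} f_1(u_1) & \cdots & f_n(u_1) \\ \vdots & & \vdots \\ f_1(u_m) & \cdots & f_n(u_m) \end{bmatrix}$$
$\theta$ is an $n \times 1$ unknown parameter vector and $y$ is an $m \times 1$ output vector:
$$\theta = \begin{bmatrix} \theta_1 \\ \vdots \\ \theta_n \end{bmatrix}, \qquad y = \begin{bmatrix} y_1 \\ \vdots \\ y_m \end{bmatrix}$$
The $i$-th row of $A$ is $a_i^T = [f_1(u_i), \dots, f_n(u_i)]$.
If $A$ is square ($m = n$) and nonsingular, then $A\theta = y$ has the solution $\theta = A^{-1} y$.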
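As a small illustration of how $A$ is assembled row by row from the known functions, the sketch below builds it for a hypothetical basis $f_1(u) = 1$, $f_2(u) = u$, $f_3(u) = u^2$ with scalar inputs. The helper name `design_matrix` and the sample inputs are assumptions made for the example.

```python
import numpy as np

def design_matrix(basis, inputs):
    """Return the m x n matrix A with A[i, j] = f_j(u_i)."""
    return np.array([[f(u) for f in basis] for u in inputs], dtype=float)

# Hypothetical basis functions f_1, f_2, f_3 (n = 3).
basis = [lambda u: 1.0, lambda u: u, lambda u: u ** 2]
# Hypothetical scalar inputs u_1, ..., u_4 (m = 4).
inputs = [0.0, 1.0, 2.0, 3.0]

A = design_matrix(basis, inputs)
print(A.shape)
print(A)
```

Each row of the printed matrix is one $a_i^T$, so the number of rows equals the number of training pairs and the number of columns equals the number of parameters.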

14 Least-Square Estimators
We have $m$ outputs and $n$ fitting parameters to find (that is, $m$ equations in $n$ unknowns). Usually $m$ is greater than $n$. Because the model is only an approximation of the target system and the observed data might be corrupted, an exact solution is not always possible. To overcome this, an error vector $e$ is added to compensate:
$$A\theta + e = y$$

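15 Least-Square Estimators
The goal now is to find $\theta = \hat{\theta}$ that reduces the errors between $y_i$ and $\hat{y}_i = a_i^T \theta$:
$$\min_{\theta} \sum_{i=1}^{m} \left| y_i - a_i^T \theta \right| \quad \text{or} \quad \min_{\theta} \sum_{i=1}^{m} \left( y_i - a_i^T \theta \right)^2$$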

16 Least-Square Estimators
If $e = y - A\theta$, then:
$$E(\theta) = \sum_{i=1}^{m} (y_i - a_i^T \theta)^2 = e^T e = (y - A\theta)^T (y - A\theta)$$
We need to compute:
$$\min_{\theta}\ (y - A\theta)^T (y - A\theta)$$
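As a short worked step, the minimizer can be found by expanding $E(\theta)$ and setting its gradient with respect to $\theta$ to zero. This is the standard derivation, written with the same symbols $A$, $y$, and $\theta$ used above:

$$
\begin{aligned}
E(\theta) &= (y - A\theta)^T (y - A\theta) = y^T y - 2\,\theta^T A^T y + \theta^T A^T A\,\theta \\
\nabla_{\theta} E(\theta) &= -2 A^T y + 2 A^T A\,\theta = 0 \\
\Rightarrow\quad A^T A\,\theta &= A^T y
\end{aligned}
$$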

17 Least-Square Estimators
Theorem [least-squares estimator]: the squared error is minimized when $\theta = \hat{\theta}$ (called the least-squares estimator, LSE), which satisfies the normal equation
$$A^T A\,\hat{\theta} = A^T y$$
If $A^T A$ is nonsingular, $\hat{\theta}$ is unique and is given by
$$\hat{\theta} = (A^T A)^{-1} A^T y$$
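A minimal NumPy sketch of the theorem, assuming $A^T A$ is nonsingular. The function name `lse` and the small data set are hypothetical. The result is cross-checked against `np.linalg.lstsq`, which solves the same minimization by a different numerical route.

```python
import numpy as np

def lse(A, y):
    """Least-squares estimator from the normal equation A^T A theta = A^T y.

    Assumes A^T A is nonsingular (A has full column rank).
    """
    A = np.asarray(A, dtype=float)
    y = np.asarray(y, dtype=float)
    # Solve the linear system instead of forming the inverse explicitly.
    return np.linalg.solve(A.T @ A, A.T @ y)

# Hypothetical overdetermined system: m = 4 equations, n = 2 unknowns.
A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.1, 2.9, 5.2, 6.8])
theta_hat = lse(A, y)

# Cross-check against NumPy's built-in least-squares solver.
theta_ref, *_ = np.linalg.lstsq(A, y, rcond=None)
print(theta_hat, np.allclose(theta_hat, theta_ref))
```

Solving the system with `np.linalg.solve` rather than computing $(A^T A)^{-1}$ is a common numerical choice. For ill-conditioned $A$, a QR- or SVD-based solver such as `lstsq` is usually preferable.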

18 Least-Square Estimators: Example
The relationship between the spring length $L$ and the applied force $f$ is modeled as $L = k_1 f + k_0$ (linear model).
Goal: find $\hat{\theta} = (k_0, k_1)^T$ that best fits the data, so that for a given force $f_0$ we can determine the corresponding spring length $L_0$.
- Solution 1: provide 2 pairs $(L_0, f_0)$ and $(L_1, f_1)$ and solve a linear system of 2 equations in the 2 variables $k_1$ and $k_0$. However, because of noisy data, this solution is not reliable.
- Solution 2: use a larger training set $(L_i, f_i)$ to estimate $\hat{k}_1$ and $\hat{k}_0$.
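The difference between the two solutions can be sketched numerically. The spring constants, forces, and noise values below are invented for illustration and are not the data from the slides. `fit_line` is a hypothetical helper.

```python
import numpy as np

def fit_line(f, L):
    """Return (k0, k1) minimizing sum_i (L_i - k0 - k1 * f_i)^2."""
    A = np.column_stack([np.ones_like(f), f])
    theta, *_ = np.linalg.lstsq(A, L, rcond=None)
    return theta

# Hypothetical spring: k0 = 5.0, k1 = 0.5, with a fixed "measurement noise".
k_true = np.array([5.0, 0.5])
f = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])
noise = np.array([0.1, -0.1, 0.05, -0.05, 0.1, -0.1, 0.0])
L = k_true[0] + k_true[1] * f + noise

k_two = fit_line(f[:2], L[:2])   # Solution 1: exact line through 2 noisy points
k_all = fit_line(f, L)           # Solution 2: LSE over all 7 points

print("two points:", k_two, "error:", np.abs(k_two - k_true).max())
print("all points:", k_all, "error:", np.abs(k_all - k_true).max())
```

With only two points the noise passes straight into the estimate. With the larger set the noise partly averages out, so the estimate lands closer to the assumed true parameters.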

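20 Least-Square Estimators: Example
Since $y = e + A\theta$, we can write:
$$\underbrace{\begin{bmatrix} L_1 \\ \vdots \\ L_7 \end{bmatrix}}_{y} = \underbrace{\begin{bmatrix} 1 & f_1 \\ \vdots & \vdots \\ 1 & f_7 \end{bmatrix}}_{A} \underbrace{\begin{bmatrix} k_0 \\ k_1 \end{bmatrix}}_{\theta} + \underbrace{\begin{bmatrix} e_1 \\ \vdots \\ e_7 \end{bmatrix}}_{e}$$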

21 Least-Square Estimators: Example
Therefore, the LSE of $[k_0, k_1]^T$, which minimizes $e^T e = \sum_{i=1}^{7} e_i^2$, is equal to:
$$\begin{bmatrix} \hat{k}_0 \\ \hat{k}_1 \end{bmatrix} = (A^T A)^{-1} A^T y$$

22 Least-Square Estimators: Example
We rely on this estimate because it is based on more data. If we are not satisfied with the LSE, we can increase the model's degrees of freedom:
$$L = k_0 + k_1 f + k_2 f^2 + \cdots + k_n f^n \quad \text{(least-squares polynomial)}$$
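A least-squares polynomial uses the same machinery with columns $1, f, f^2, \dots, f^n$ in $A$. The sketch below uses invented data, and the helper names are hypothetical. It fits degrees 1, 2, and 6 and reports the squared error $e^T e$ for each.

```python
import numpy as np

def fit_polynomial(f, L, degree):
    """Least-squares coefficients [k_0, ..., k_n] of L = sum_j k_j * f**j."""
    A = np.vander(f, degree + 1, increasing=True)   # columns 1, f, f^2, ...
    k, *_ = np.linalg.lstsq(A, L, rcond=None)
    return k

def squared_error(f, L, k):
    """Return e^T e for the polynomial with coefficients k."""
    A = np.vander(f, len(k), increasing=True)
    e = L - A @ k
    return float(e @ e)

# Hypothetical measurements (not the data from the slides).
f = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])
L = np.array([5.6, 5.9, 6.55, 6.95, 7.6, 7.9, 8.5])

degrees = (1, 2, 6)
errors = [squared_error(f, L, fit_polynomial(f, L, n)) for n in degrees]
for n, err in zip(degrees, errors):
    print(f"degree {n}: squared error = {err:.3e}")
```

The training error can only go down as the degree grows, and a degree-6 polynomial passes through all 7 points. A low training error by itself therefore says little about whether the model reflects the underlying system.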

23 Least-Square Estimators: Example
Higher-order models fit the data better, but they do not always reflect the inner law that governs the system. For example, when f is increasing toward 0N, the length is decreasing.

25 Derivative-Based Optimization
Based on first derivatives:
- Steepest descent
- Conjugate gradient method
- Gauss-Newton method
- Levenberg-Marquardt method
- And many others
Based on second derivatives:
- Newton method
- And many others

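26 Steepest Descent
Consider an $n$-dimensional parameter space with $\theta = [\theta_1, \theta_2, \dots, \theta_n]^T$. We want to find $\theta = \theta^*$ such that
$$E(\theta^*) = \min_{\theta} E(\theta)$$
Investigate the search space via the iteration
$$\theta_{k+1} = \theta_k + \eta_k d_k, \qquad k = 1, 2, 3, \dots$$
where $\eta_k$ is the step size and $d_k$ is the search direction, chosen such that
$$E(\theta_{k+1}) = E(\theta_k + \eta_k d_k) < E(\theta_k)$$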

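27 Steepest Descent
If the direction $d$ is determined from the gradient of $E(\theta)$, i.e.,
$$g(\theta) = \nabla E(\theta) \overset{\text{def}}{=} \left[ \frac{\partial E(\theta)}{\partial \theta_1}, \frac{\partial E(\theta)}{\partial \theta_2}, \dots, \frac{\partial E(\theta)}{\partial \theta_n} \right]^T$$
such that
$$\theta_{k+1} = \theta_k - \eta\, g$$
then we talk about the method of steepest descent.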
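The update rule can be tried on the least-squares objective $E(\theta) = (y - A\theta)^T (y - A\theta)$, whose gradient is $g(\theta) = -2A^T(y - A\theta)$. This is a minimal sketch with a fixed step size. The data, the value `eta=0.02`, and the stopping rule are assumptions chosen so that this small example converges, not recommendations from the slides.

```python
import numpy as np

def steepest_descent(grad, theta0, eta=0.02, tol=1e-10, max_iter=5000):
    """Iterate theta <- theta - eta * g(theta) until the gradient norm is below tol."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(max_iter):
        g = grad(theta)
        if np.linalg.norm(g) < tol:
            break
        theta = theta - eta * g   # step against the gradient
    return theta

# Hypothetical least-squares problem (m = 4, n = 2).
A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.1, 2.9, 5.2, 6.8])

def grad_E(theta):
    """Gradient of E(theta) = (y - A theta)^T (y - A theta)."""
    return -2.0 * A.T @ (y - A @ theta)

theta_sd = steepest_descent(grad_E, theta0=[0.0, 0.0])
theta_lse = np.linalg.solve(A.T @ A, A.T @ y)   # closed-form LSE for comparison
print(theta_sd, theta_lse)
```

For this quadratic objective a fixed step converges only when $\eta$ is small relative to the largest eigenvalue of $A^T A$. Choosing $\eta_k$ adaptively (step size determination) is a separate topic.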
