Comparison of Prediction-Error-Modelling Criteria


Downloaded from orbit.dtu.dk on: Aug 25, 2018

Comparison of Prediction-Error-Modelling Criteria
Jørgensen, John Bagterp; Jørgensen, Sten Bay
Published in: American Control Conference, 2007. ACC '07
DOI: 10.1109/ACC.2007.4283020
Publication date: 2007
Document Version: Publisher's PDF, also known as Version of record

Citation (APA): Jørgensen, J. B., & Jørgensen, S. B. (2007). Comparison of Prediction-Error-Modelling Criteria. In American Control Conference, 2007. ACC '07. IEEE. DOI: 10.1109/ACC.2007.4283020

Proceedings of the 2007 American Control Conference
Marriott Marquis Hotel at Times Square, New York City, USA, July 11-13, 2007
WeA04.6

Comparison of Prediction-Error-Modelling Criteria

John Bagterp Jørgensen*, Informatics and Mathematical Modelling, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark, jbj@imm.dtu.dk
Sten Bay Jørgensen, Department of Chemical Engineering, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark, sbj@kt.dtu.dk
*Corresponding author

Abstract
Single and multi-step prediction-error methods based on the maximum likelihood and least squares criteria are compared. The prediction-error methods studied are based on predictions using the Kalman filter and Kalman predictors for a linear discrete-time stochastic state space model, which is a realization of a continuous-discrete multivariate stochastic transfer function model. The proposed prediction-error methods are demonstrated for a SISO system parameterized by the transfer functions with time delays of a continuous-discrete-time linear stochastic system. The simulations for this case suggest using the one-step-ahead prediction-error maximum-likelihood (or maximum a posteriori) estimator. It gives consistent estimates of all parameters, and the parameter estimates are almost identical to the estimates obtained for long prediction horizons, while consuming significantly fewer computational resources. The identification method is suitable for predictive control.

I. INTRODUCTION

In this paper, we address construction of stochastic linear models using the prediction-error method [1]-[3]. We parameterize the stochastic linear models as continuous-discrete-time transfer functions with delay and realize these models as discrete-time stochastic linear state space models. The Kalman filter and Kalman predictor for this system are used to generate the prediction errors and covariances needed by the prediction-error identification criteria. We investigate multi-step prediction-error identification and compare it to single-step prediction-error identification. Shah and coworkers [4]-[6] apply a similar multi-step approach based on impulse response models and a least-squares criterion. The approach presented in this paper distinguishes itself by being general for linear systems, by applying least-squares as well as maximum likelihood criteria for the prediction errors in the estimator, and in particular by being directly applicable to state space model based predictive control in its modern implementation. We propose a method to address the request for better identification methods tailored for predictive control, in order to potentially improve the closed-loop performance of such control systems [7]-[11].

II. PREDICTION-ERROR-METHODS

A. Standard Regression Problem

The essence of regression is to select some parameters, θ, such that the predicted outputs, ŷ_k(θ), match the measured outputs, y_k, as well as possible for all measurements k = 0, 1, ..., N. The estimation problem is often stated as the stochastic relation

  y_k = ŷ_k(θ) + e_k,   e_k ∼ N(0, R_k),   k = 0, 1, ..., N    (1)

The predictor or estimator, ŷ_k(θ), is a function of the parameters, θ ∈ Θ ⊆ R^{n_θ}. For the measured realization, {y_k}_{k=0}^{N}, of the outputs, the parameters, θ, are computed such that some measure, e.g. the least squares measure, of the residuals, {e_k(θ) = y_k - ŷ_k(θ)}_{k=0}^{N}, is minimized. This is the standard nonlinear regression problem [12], [13], which can be stated as the optimization problem

  θ̂ = arg min_{θ ∈ Θ} V(θ)    (2)

with the objective function V(θ) = V_LS(θ) being

  V_LS(θ) = (1/2) ∑_{k=0}^{N} ‖e_k(θ)‖₂²    (3)

in the least squares case. The maximum-likelihood estimate corresponds to using the negative log-likelihood function in (2), i.e. V(θ) = V_ML(θ), with V_ML(θ) defined as

  V_ML(θ) = ((N+1) n_y / 2) ln(2π) + (1/2) ∑_{k=0}^{N} [ ln(det R_k(θ)) + e_k(θ)' R_k(θ)^{-1} e_k(θ) ]    (4)

The maximum a posteriori estimate assumes that a priori the parameters stem from the distribution θ ∼ N(θ₀, P_{θ₀}), in which θ₀ ∈ Θ ⊆ R^{n_θ}. Then, using Bayes' rule, the negative log-likelihood a posteriori function is

  V_MAP(θ) = V_ML(θ) + (n_θ / 2) ln(2π) + (1/2) ln(det P_{θ₀}) + (1/2) (θ - θ₀)' P_{θ₀}^{-1} (θ - θ₀)    (5)

Hence, the maximum a posteriori estimate is obtained by applying V(θ) = V_MAP(θ) in (2).
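The criteria (3)-(5) act only on the prediction errors e_k(θ) and their covariances R_k(θ), however those are produced. As a minimal illustration, not code from the paper and with function and variable names that are our own, the three criteria could be evaluated in Python/NumPy along these lines, assuming the residuals and covariances have already been generated by a predictor such as the Kalman filter of Section II-B:

```python
import numpy as np

def V_LS(errors):
    """Least-squares criterion (3): half the sum of squared prediction errors."""
    return 0.5 * sum(float(e @ e) for e in errors)

def V_ML(errors, covariances):
    """(Quasi) maximum-likelihood criterion (4): Gaussian negative log-likelihood
    of the prediction errors e_k with covariances R_k."""
    n_y = errors[0].shape[0]
    V = 0.5 * len(errors) * n_y * np.log(2.0 * np.pi)
    for e, R in zip(errors, covariances):
        _, logdet = np.linalg.slogdet(R)
        V += 0.5 * (logdet + float(e @ np.linalg.solve(R, e)))
    return V

def V_MAP(errors, covariances, theta, theta0, P_theta0):
    """Maximum a posteriori criterion (5): V_ML plus the Gaussian prior penalty."""
    d = theta - theta0
    _, logdet = np.linalg.slogdet(P_theta0)
    return (V_ML(errors, covariances)
            + 0.5 * theta.size * np.log(2.0 * np.pi)
            + 0.5 * logdet
            + 0.5 * float(d @ np.linalg.solve(P_theta0, d)))
```

In practice these functions would be wrapped in a numerical optimizer over θ, since the predictor, and hence e_k(θ) and R_k(θ), must be re-evaluated for every candidate parameter vector.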

B. Parametrization, Realization and Prediction

One way to represent multivariate stochastic distributed processes is through the input-output representation in the Laplace domain

  Z(s) = G(s;θ) U(s) + H(s;θ) E(s)    (6a)
  y(t_k) = z(t_k) + v(t_k)    (6b)

in which U(s) is the process input vector, E(s) is a vector with white noise components, Z(s) is the process output vector, v(t_k) ∼ N(0, R_vv(θ)) is the measurement noise vector, and y(t_k) is the measured process output vector at time t_k. The elements, {g_ij(s)} and {h_ij(s)}, of the transfer function matrices, G(s) and H(s), are rational transfer functions with time delays

  g_ij(s) = (b_ij(s;θ) / a_ij(s;θ)) exp(-τ_ij(θ) s)    (7a)
  h_ij(s) = (d_ij(s;θ) / c_ij(s;θ)) exp(-λ_ij(θ) s)    (7b)

Assuming that u(t) is a zero-order-hold input, (6) may be realized as a discrete-time stochastic linear state space model

  x_{k+1} = A(θ) x_k + B(θ) u_k + w_k    (8a)
  y_k = C(θ) x_k + v_k    (8b)

in which

  [w_k; v_k] ∼ N_iid( 0, [ R_ww(θ)  R_wv(θ); R_wv(θ)'  R_vv(θ) ] )    (8c)
  x_0 ∼ N(x̂_0(θ), P_0(θ))    (8d)

The Kalman filter and Kalman predictor are optimal estimators for (8) and therefore also optimal estimators for (6) [14], [15]. The Kalman filter equations

  e_k = y_k - C x̂_{k|k-1}    (9a)
  R_{e,k} = C P_{k|k-1} C' + R_vv    (9b)
  K_{fx,k} = P_{k|k-1} C' R_{e,k}^{-1}    (9c)
  K_{fw,k} = R_wv R_{e,k}^{-1}    (9d)
  x̂_{k|k} = x̂_{k|k-1} + K_{fx,k} e_k    (9e)
  ŵ_{k|k} = K_{fw,k} e_k    (9f)
  P_{k|k} = P_{k|k-1} - K_{fx,k} R_{e,k} K_{fx,k}'    (9g)
  Q_{k|k} = R_ww - K_{fw,k} R_{e,k} K_{fw,k}'    (9h)

provide the mean and covariance for the conditional distributions x_k|I_k ∼ N(x̂_{k|k}, P_{k|k}) and w_k|I_k ∼ N(ŵ_{k|k}, Q_{k|k}), with I_k = {(y_j, u_j)}_{j=0}^{k}. The Kalman one-step state predictor equations

  x̂_{k+1|k} = A x̂_{k|k} + B û_{k|k} + ŵ_{k|k}    (10a)
  P_{k+1|k} = A P_{k|k} A' + Q_{k|k} - A K_{fx,k} R_wv' - R_wv K_{fx,k}' A'    (10b)

and the j-step (j > 1) state predictor equations

  x̂_{k+j|k} = A x̂_{k+j-1|k} + B û_{k+j-1|k}    (11a)
  P_{k+j|k} = A P_{k+j-1|k} A' + R_ww    (11b)

provide the mean and covariance for the conditional distribution x_{k+j}|I_k ∼ N(x̂_{k+j|k}, P_{k+j|k}) for j ≥ 1. The output predictions

  ŷ_{k+j|k} = C x̂_{k+j|k}    (12a)
  R_{k+j|k} = C P_{k+j|k} C' + R_vv    (12b)

provide the conditional mean and covariance for y_{k+j}|I_k ∼ N(ŷ_{k+j|k}, R_{k+j|k}) for j ≥ 1. The Kalman filter and Kalman predictor are implemented in a numerically robust manner using the array algorithm, propagating the square roots of the covariances rather than the covariances themselves [14].
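To make the recursions concrete, the following sketch shows one possible NumPy implementation of the filter update (9) followed by the predictions (10)-(12). The function names and calling convention are ours, and for brevity the square-root (array) form mentioned above is not used.

```python
import numpy as np

def kalman_filter_update(xhat, P, y, C, Rww, Rwv, Rvv):
    """One measurement update of the Kalman filter (9) for the model (8).
    xhat, P are the one-step prediction mean/covariance x_{k|k-1}, P_{k|k-1};
    vectors are 1-D NumPy arrays, matrices are 2-D."""
    e = y - C @ xhat                          # innovation (9a)
    Re = C @ P @ C.T + Rvv                    # innovation covariance (9b)
    Re_inv = np.linalg.inv(Re)
    Kfx = P @ C.T @ Re_inv                    # state filter gain (9c)
    Kfw = Rwv @ Re_inv                        # process-noise filter gain (9d)
    x_filt = xhat + Kfx @ e                   # filtered state (9e)
    w_filt = Kfw @ e                          # filtered process noise (9f)
    P_filt = P - Kfx @ Re @ Kfx.T             # filtered state covariance (9g)
    Q_filt = Rww - Kfw @ Re @ Kfw.T           # filtered noise covariance (9h)
    return e, Re, Kfx, Kfw, x_filt, w_filt, P_filt, Q_filt

def predict_outputs(x_filt, w_filt, P_filt, Q_filt, Kfx, u_future, A, B, C, Rww, Rwv, Rvv):
    """Output predictions yhat_{k+j|k}, R_{k+j|k}, j = 1..len(u_future), via (10)-(12).
    u_future = [u_k, u_{k+1}, ..., u_{k+Np-1}] are the known future inputs."""
    # one-step state prediction (10)
    x = A @ x_filt + B @ u_future[0] + w_filt
    P = A @ P_filt @ A.T + Q_filt - A @ Kfx @ Rwv.T - Rwv @ Kfx.T @ A.T
    yhat = [C @ x]
    Ryy = [C @ P @ C.T + Rvv]                 # output prediction (12) for j = 1
    # j-step state and output predictions for j > 1, (11)-(12)
    for u in u_future[1:]:
        x = A @ x + B @ u
        P = A @ P @ A.T + Rww
        yhat.append(C @ x)
        Ryy.append(C @ P @ C.T + Rvv)
    return yhat, Ryy
```

Looping kalman_filter_update over the data and calling predict_outputs at each k yields the prediction errors ε_{k+j|k} and covariances R_{k+j|k} used by the criteria of Section II-A.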
C. The PE Method as a Regression Problem

The family of prediction-error methods can be considered as solving a general regression problem similar to (1). The prediction-error estimates are obtained by solving an optimization problem like (2) for some criterion (LS, ML, MAP) and some predictors. For the case considered in this paper, the predictors in the prediction-error method are the Kalman predictors, ŷ_{k+j|k}(θ). The prediction errors, ε_{k+j|k} = y_{k+j} - ŷ_{k+j|k}(θ), correspond to the residuals in the standard regression problem. Therefore, the prediction-error method is a standard regression problem with a predictor generated by the Kalman filter and predictor. In the following, the statistical properties of the predictors and the prediction errors are discussed, and various criteria for estimating the parameters in the prediction-error framework are presented.

If it were possible to know the true structure of the system, S, and the model identified, M(θ), were equal to the true system, M(θ) = S, then this model would be optimal in a statistical sense, no matter for what purpose it is to be used and what consistent estimator (criterion) is used for determining the parameters. In any realistic situation, it is almost impossible to know the true model structure due to changing process conditions, changing disturbance properties and nonlinearities. Therefore, in practice the model should be suited to, and be identified for, the purpose it is going to be used for. In predictive control this corresponds to minimization of multi-step predictions compatible with the regulator objective function.

D. Single-Step j-Step-Ahead Prediction Error

Let {(u_k, y_k)}_{k=0}^{N} denote the IO-data for identification and let N_p denote the prediction horizon. Let the time indices be k = -1, 0, 1, ..., N - j and the prediction index be 1 ≤ j ≤ N_p. This implies that 0 ≤ k + j ≤ N. The conditional outputs, y_{k+j}|I_k, have the distribution

  y_{k+j}|I_k ∼ N(ŷ_{k+j|k}, R_{k+j|k})    (13)

and their correlation may be computed by [14]

  R_k^{(i,j)} = ⟨ (y_{k+i}|I_k) - ŷ_{k+i|k}, (y_{k+j}|I_k) - ŷ_{k+j|k} ⟩
              = C A^{i-j-1} N_{k+j|k},           i > j
              = C P_{k+i|k} C' + R_vv,           i = j
              = N_{k+i|k}' (A^{j-i-1})' C',      i < j    (14)

in which 1 ≤ i ≤ N_p, 1 ≤ j ≤ N_p, and

  N_{k+i|k} = A P_{k+i|k} C' + R_wv    (15)
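As an illustration of (14)-(15), a sketch under our own naming and not code from the paper, the cross-covariance blocks can be assembled directly from the predictor covariances P_{k+m|k}:

```python
import numpy as np

def cross_covariance(i, j, A, C, P_pred, Rvv, Rwv):
    """Cross-covariance R_k^{(i,j)} of the i- and j-step prediction errors, (14)-(15).
    P_pred[m] holds P_{k+m|k} for m = 1..Np, from the predictor recursions (10)-(11)."""
    if i == j:
        return C @ P_pred[i] @ C.T + Rvv
    if i > j:
        N = A @ P_pred[j] @ C.T + Rwv                      # N_{k+j|k}, eq. (15)
        return C @ np.linalg.matrix_power(A, i - j - 1) @ N
    N = A @ P_pred[i] @ C.T + Rwv                          # N_{k+i|k}, eq. (15)
    return (C @ np.linalg.matrix_power(A, j - i - 1) @ N).T
```

Stacking these blocks for i, j = 1, ..., N_p gives the full covariance R_k of the multi-step prediction error used in Section II-E.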

Hence, the single-step j-step-ahead prediction error problem may be stated as

  y_{k+j}|I_k = ŷ_{k+j|k}(θ) + ε_{k+j,k}|I_k,   k = -1, 0, ..., N - j    (16)

with ε_{k+j,k}|I_k ∼ N(0, R_{k+j|k}) in the ideal case when the system and the model on which the predictor is computed are identical. ε_{k+j,k} denotes the residual of the single-step j-step-ahead predictor at time k. This corresponds to a standard regression problem in which some measure of the j-step prediction error

  ε_{k+j|k} = y_{k+j} - ŷ_{k+j|k}(θ),   k = -1, 0, ..., N - j    (17)

is minimized. ε_{k+j|k} can be regarded as the realization of ε_{k+j,k}|I_k ∼ N(0, R_{k+j|k}). When the structures of the model and the system are different, ε_{k+j,k}|I_k may have a non-zero mean and a covariance different from R_{k+j|k}. Even in such cases it seems reasonable to minimize some measure of the prediction error, ε_{k+j|k}. However, as the distribution of ε_{k+j,k}|I_k is unknown, maximum likelihood based procedures can only be considered as approximations, i.e. quasi maximum likelihood. As ŷ_{k+j|k}(θ) is not a simple function of θ, the analytical derivatives of ε_{k+j|k} = ε_{k+j|k}(θ) with respect to θ are generally not available. Hence, the optimization algorithms for solving the parameter estimation problem must compute the derivatives of the objective functions numerically, i.e. by finite differences.

The one-step prediction-error estimates may be regarded as special versions of the j-step prediction-error estimates. However, in that case no extra effort is needed for computing ε_{k+1|k} and R_{k+1|k}, as they are already computed as part of the Kalman filter updates. In the j-step prediction case with j > 1, ε_{k+j|k} and R_{k+j|k} must be computed explicitly if needed in the parameter estimation objective function.

E. Multi-Step Maximum Likelihood Predictors

To deduce true multi-step prediction-error (quasi) maximum likelihood and maximum a posteriori estimators, the correlation between ε_{k+i,k}|I_k and ε_{k+j,k}|I_k for i ≠ j must be taken into account. This correlation is

  ⟨ ε_{k+i,k}|I_k, ε_{k+j,k}|I_k ⟩ = R_k^{(i,j)}    (18)

Define

  Y_k = [y_{k+1}; ...; y_{k+N_p}],   k = -1, 0, ..., N - N_p
  Y_k = [y_{k+1}; ...; y_N],         k = N - N_p + 1, ..., N - 2    (19)

and the corresponding multi-step predictions

  Ŷ_k(θ) = [ŷ_{k+1|k}; ...; ŷ_{k+N_p|k}],   k = -1, 0, ..., N - N_p
  Ŷ_k(θ) = [ŷ_{k+1|k}; ...; ŷ_{N|k}],       k = N - N_p + 1, ..., N - 2

Furthermore, define the conditional multi-step prediction error vector as

  ǫ_k|I_k = [ε_{k+1,k}|I_k; ε_{k+2,k}|I_k; ...; ε_{k+N_p,k}|I_k]
  ǫ_k|I_k = [ε_{k+1,k}|I_k; ε_{k+2,k}|I_k; ...; ε_{N,k}|I_k]    (20)

for k = -1, 0, ..., N - N_p (the first vector) and k = N - N_p + 1, ..., N - 2 (the second vector), respectively.

The multi-step prediction error problem can then be expressed as the stochastic model

  Y_k|I_k = Ŷ_k(θ) + ǫ_k|I_k,   k = -1, 0, ..., N - 2    (21)

with ǫ_k|I_k ∼ N(0, R_k) and

  R_k = ⟨ ǫ_k|I_k, ǫ_k|I_k ⟩ = [ R_k^{(1,1)}    R_k^{(1,2)}    ...  R_k^{(1,N_p)};
                                 R_k^{(2,1)}    R_k^{(2,2)}    ...  R_k^{(2,N_p)};
                                 ...
                                 R_k^{(N_p,1)}  R_k^{(N_p,2)}  ...  R_k^{(N_p,N_p)} ]    (22)

The realization of the multi-step prediction-error vector for k = -1, 0, ..., N - N_p is

  ǫ_{k|k} = Y_k - Ŷ_k(θ) = [y_{k+1} - ŷ_{k+1|k}; y_{k+2} - ŷ_{k+2|k}; ...; y_{k+N_p} - ŷ_{k+N_p|k}] = [ε_{k+1|k}; ε_{k+2|k}; ...; ε_{k+N_p|k}]    (23)

The negative log likelihood function for the multi-step prediction is

  V_{1:N_p,ML}(θ) = (n_y f / 2) ln(2π) + (1/2) ∑_{k=-1}^{N-2} [ ln(det R_k) + ǫ_{k|k}' R_k^{-1} ǫ_{k|k} ]    (24)

in which f = N_p [ N - (1/2)(N_p - 1) ]. In computation of the multi-step prediction-error maximum likelihood estimate, ln(det R_k) and ǫ_{k|k}' R_k^{-1} ǫ_{k|k} must be computed. ǫ_{k|k} is obtained by computing the j-step prediction errors. This is accomplished using (10)-(12) for j = 1, 2, ..., N_p given I_k and û_{k+j|k} = u_{k+j}. By construction the covariance matrix, R_k, has the special structure that arises from a state space model. This implies that [14]

  R_k = L_k R_ǫ,k L_k'    (25)

in which the factorization, L_k and R_ǫ,k, is computed using the Kalman filter recursions (9)-(12). Using the one-step predictive Kalman gain

  K_{p,k} = A K_{fx,k} + K_{fw,k}    (26)

the block lower triangular matrix, L_k, may be computed as

  L_k = [ I                       0                        ...  0;
          C K_{p,k+1}             I                        ...  0;
          C A K_{p,k+1}           C K_{p,k+2}              ...  0;
          ...                     ...                           ...;
          C A^{N_p-2} K_{p,k+1}   C A^{N_p-3} K_{p,k+2}    ...  I ]    (27)

and the block diagonal matrix, R_ǫ,k, is

  R_ǫ,k = blkdiag( R_{e,k+1}, R_{e,k+2}, ..., R_{e,k+N_p} )    (28)

Hence, the determinant of R_k may be computed as

  det R_k = det R_ǫ,k = ∏_{j=1}^{N_p} det R_{e,k+j}    (29)

which implies

  ln(det R_k) = ∑_{j=1}^{N_p} ln(det R_{e,k+j})    (30)

Consequently, the term ∑_{k=-1}^{N-2} ln(det R_k) in (24) may be evaluated as

  ∑_{k=-1}^{N-2} ln(det R_k) = ∑_{k=0}^{N_p-2} (k+1) ln(det R_{e,k}) + N_p ∑_{k=N_p-1}^{N} ln(det R_{e,k})    (31)

The term ǫ_{k|k}' R_k^{-1} ǫ_{k|k} can be evaluated as

  ǫ_{k|k}' R_k^{-1} ǫ_{k|k} = ǫ_{k|k}' (L_k R_ǫ,k L_k')^{-1} ǫ_{k|k} = (L_k^{-1} ǫ_{k|k})' R_ǫ,k^{-1} (L_k^{-1} ǫ_{k|k}) = ∑_{j=1}^{N_p} ē_{k+j|k}' R_{e,k+j}^{-1} ē_{k+j|k}    (32)

in which {ē_{k+j|k}}_{j=1}^{N_p}, i.e. [ē_{k+1|k}; ē_{k+2|k}; ...; ē_{k+N_p|k}] = L_k^{-1} ǫ_{k|k}, is efficiently computed using the Kalman filter recursions, for j = 1, 2, ..., N_p,

  ē_{k+j|k} = ε_{k+j|k} - C x̃_{k+j|k}    (33a)
  x̃_f = x̃_{k+j|k} + K_{fx,k+j} ē_{k+j|k}    (33b)
  w̃_f = K_{fw,k+j} ē_{k+j|k}    (33c)
  x̃_{k+j+1|k} = A x̃_f + w̃_f    (33d)

with x̃_{k+1|k} = 0. Note that (33b)-(33d) may be expressed as

  x̃_{k+j+1|k} = A x̃_{k+j|k} + K_{p,k+j} ē_{k+j|k}    (34)

which implies that (33) can be expressed as

  x̃_{k+j+1|k} = (A - K_{p,k+j} C) x̃_{k+j|k} + K_{p,k+j} ε_{k+j|k}    (35a)
  ē_{k+j|k} = -C x̃_{k+j|k} + ε_{k+j|k}    (35b)

Consequently, the term ∑_{k=-1}^{N-2} ǫ_{k|k}' R_k^{-1} ǫ_{k|k} in (24) may be efficiently evaluated using

  ∑_{k=-1}^{N-2} ǫ_{k|k}' R_k^{-1} ǫ_{k|k} = ∑_{k=0}^{N_p-2} ∑_{j=1}^{k+1} ē_{k|k-j}' R_{e,k}^{-1} ē_{k|k-j} + ∑_{k=N_p-1}^{N} ∑_{j=1}^{N_p} ē_{k|k-j}' R_{e,k}^{-1} ē_{k|k-j}    (36)

and a bank of Kalman filter recursions (33) for computing ē_{k|k-j} and x̃_{k+1|k-j} for j = 1, 2, ..., N_p. Hence, at each time instant k the multi-step prediction error ǫ_{k|k} is computed using the Kalman predictions. This vector is stored in memory for N_p iterations such that ε_{k|k-j} can be used in the computation of ē_{k|k-j} and subsequent evaluation of the terms in (36). The advantage of this method compared to a naive implementation is that the gains and covariances in the Kalman recursions need to be evaluated only once at each time step.
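The factorization (25)-(28) is what makes the multi-step likelihood cheap to evaluate: ln(det R_k) reduces to a sum of one-step terms, and the quadratic form is obtained from the whitening recursion (35). A small sketch of the per-window computation follows; the naming is our own, and for brevity it uses a constant, steady-state predictive gain K_p and innovation covariance R_e, whereas the derivation above propagates the time-varying quantities.

```python
import numpy as np

def window_ml_terms(eps, A, C, Kp, Re):
    """Log-determinant and quadratic-form contributions of one window to (24),
    using the factorization R_k = L_k R_eps,k L_k' of (25)-(28).

    eps : list of j-step prediction errors [eps_{k+1|k}, ..., eps_{k+Np|k}]
    Kp  : one-step predictive Kalman gain (26), here taken as constant
    Re  : innovation covariance, here taken as constant (steady state)
    """
    x_tilde = np.zeros(A.shape[0])                        # x~_{k+1|k} = 0
    _, logdet_Re = np.linalg.slogdet(Re)
    logdet, quad = 0.0, 0.0
    for e in eps:
        e_bar = e - C @ x_tilde                           # whitened error, (35b)
        logdet += logdet_Re                               # (30): sum of one-step log-dets
        quad += float(e_bar @ np.linalg.solve(Re, e_bar)) # (32)
        x_tilde = (A - Kp @ C) @ x_tilde + Kp @ e         # (35a)
    return logdet, quad
```

Summing logdet and quad over all windows k and adding the constant (n_y f / 2) ln(2π) gives V_{1:N_p,ML}(θ) in (24).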

III. SISO EXAMPLE

To illustrate the identification criteria discussed in this paper, we consider the SISO system, S = {g(s), h(s)}, defined as

  Z(s) = g(s) U(s) + h(s) E(s)    (37a)
  y(t_k) = z(t_k) + v(t_k)    (37b)

in which E(s) is standard white noise and v(t_k) ∼ N_iid(0, r²). The transfer function, g(s), from the process input, U(s), to the process output, Y(s), and the disturbance transfer function, h(s), are

  g(s) = K / ((α₁ s + 1)(α₂ s + 1)) · e^{-τ s}    (38a)
  h(s) = σ / (γ s + 1)    (38b)

The parameters defining the system S and used for generating the data are: K = 1.0, α₁ = 1.0, α₂ = 3.0, τ = 5.2, σ = 0.2, γ = 1.0 and r = 0.2. The system is sampled with a sampling time of T_s = 0.25. The deterministic input, U(s), is assumed to be implemented using a zero-order-hold circuit. The IO-data used for estimation of this system are illustrated in Figure 1.

[Figure 1: time plots of the input u and the output y over 0-500 time units.]
Fig. 1. IO-data for the SISO system, S, defined by (37)-(38). The inputs, {u(t)}, are PRBS with bandwidth [0, 0.02] and levels [-1, 1].
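For readers who want to reproduce a data set of this type, the following sketch generates IO-data from a zero-order-hold discretization of (37)-(38). It is not the paper's code: the parameter values are the ones reconstructed above and should be treated as illustrative, the time delay is rounded to an integer number of samples, and the continuous-time noise filter is replaced by a crude discrete approximation, whereas the paper realizes the continuous-discrete model exactly.

```python
import numpy as np
from scipy.signal import cont2discrete, tf2ss

# Illustrative parameter values for system (37)-(38)
K, alpha1, alpha2, tau = 1.0, 1.0, 3.0, 5.2
sigma, gamma, r, Ts = 0.2, 1.0, 0.2, 0.25

# g(s) = K / ((alpha1*s + 1)(alpha2*s + 1)); the delay exp(-tau*s) is handled below
den_g = np.polymul([alpha1, 1.0], [alpha2, 1.0])
Ad, Bd, Cd, _, _ = cont2discrete(tf2ss([K], den_g), Ts, method='zoh')

# h(s) = sigma / (gamma*s + 1), also ZOH-discretized (a crude approximation of the
# sampled continuous-time noise filter)
Ah, Bh, Ch, _, _ = cont2discrete(tf2ss([sigma], [gamma, 1.0]), Ts, method='zoh')

d = int(round(tau / Ts))         # delay approximated by an integer number of samples

rng = np.random.default_rng(0)
N = 2000                         # 500 time units at Ts = 0.25, as in Figure 1
u = np.repeat(rng.choice([-1.0, 1.0], size=N // 50), 50)   # crude PRBS-like input
x = np.zeros(Ad.shape[0])        # deterministic states
xh = np.zeros(Ah.shape[0])       # disturbance states
y = np.zeros(N)
for k in range(N):
    y[k] = (Cd @ x + Ch @ xh).item() + r * rng.standard_normal()  # y(t_k) = z(t_k) + v(t_k)
    u_del = u[k - d] if k >= d else 0.0
    x = Ad @ x + (Bd * u_del).ravel()
    xh = Ah @ xh + (Bh * rng.standard_normal()).ravel()
```

The resulting {(u_k, y_k)} can then be fed to the estimators of Section II with either model structure (39) or (40).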

A. Identical Model and System Structure

Consider the situation in which the model and the system have the same structure. In this case the structure of the model, ĝ(s), and the disturbance model, ĥ(s), are

  ĝ(s) = K̂ / ((α̂₁ s + 1)(α̂₂ s + 1)) · e^{-τ̂ s}    (39a)
  ĥ(s) = σ̂ / (γ̂ s + 1)    (39b)

Let M = {ĝ(s), ĥ(s)}. This implies that the true system, S, is within the class of models, M, estimated, i.e. S ∈ M. The estimates for the single-step and multi-step least squares criteria and various prediction horizons are shown in Tables I-II. From these results, it is apparent that the LS method cannot be used to uniquely estimate σ and r. However, their ratio seems to be constant for different starting guesses and decreases with increasing horizon. This implies that the identified model approaches an output error model for long prediction horizons.

The estimates for the single-step and multi-step maximum likelihood criteria and various prediction horizons are shown in Tables III-IV. σ and r are estimated consistently for various initial guesses. For long-range single-step maximum likelihood estimation, the estimated model is essentially an output error model. The step responses for the true model and the models estimated by the multi-step maximum-likelihood criterion with prediction horizons N_p = 1 and N_p = 200 are shown in Figure 2. There is not much difference between the two estimated models, but a small steady-state difference compared to the true model. However, as can be read off from Table IV, the main difference between the estimated models for prediction horizon N_p = 1 and prediction horizon N_p = 200 is not the deterministic transfer function, ĝ(s), but the disturbance model, ĥ(s), and the covariance of the measurement noise, r̂².

[Figure 2: step responses of the estimated and true deterministic models.]
Fig. 2. Step response for the deterministic part of the SISO model. Estimated model (39a) using the multi-step maximum likelihood criterion with N_p = 1 (solid line) and N_p = 200 (dotted line). Dashed line: true model (38a).

[Figure 3: step responses of the estimated and true deterministic models.]
Fig. 3. Step response for the deterministic part of the SISO model estimated using a simplified model with an output integrator. Solid line: estimated model (40a) using the multi-step maximum likelihood criterion with N_p = 1 and N_p = 200. Dashed line: true model (38a).

B. Simplified Model with Output Integrator

In this subsection we illustrate the methodology when the model structure, M, is different from the system model, S, used to generate the data, i.e. S ∉ M. To do this, consider the model

  ĝ(s) = K̂ / (α̂ s + 1) · e^{-τ̂ s}    (40a)
  ĥ(s) = σ̂ / s    (40b)

In the process industries most stable models can be approximated quite well by delayed first-order transfer functions, ĝ(s). The disturbance model, ĥ(s), is chosen as an integrator to ensure offset-free control for step-type disturbances and model-plant mismatch in the resulting predictive control system in which the estimated model is applied. For internal model control (IMC), which can be considered as a restricted class of predictive control, this modelling approach is commonplace [16].

All computations are conducted using a 3.20 GHz Pentium IV processor; the CPU times reported in the tables indicate the order of magnitude of computing time needed to calculate the various estimates.

In contrast to the least-squares prediction-error methods, the maximum-likelihood prediction-error methods yield unique estimates for the covariance matrices. Hence, only the maximum-likelihood prediction-error estimates for the system (40) will be reported here. The single-step maximum-likelihood estimates for various prediction horizons are shown in Table V. As the prediction horizon increases, σ̂ decreases and
the estimated model becomes essentially an output error model. For the case considered, the estimated process noise vanishes already at a prediction horizon of j = 8. The measurement noise is increased slightly as the prediction horizon increases, to accommodate the output noise that is not caught by the process noise. The multi-step maximum-likelihood estimates are shown in Table VI. Compared to the single-step maximum likelihood estimates, the multi-step maximum-likelihood estimates are much less sensitive to the chosen prediction horizon.

TABLE I. SINGLE-STEP LS ESTIMATION IN MODEL (39)
j | K | α₁ | α₂ | τ | σ | γ | r | σ/r | V | CPU [s]
1 | 09797 | 0564 | 3426 | 537 | 08705 | 3757 | 0898 | 068 | 099 | 3
4 | 09792 | 05798 | 34053 | 539 | 04802 | 548 | 03756 | 2785 | 69 | 54
8 | 09790 | 07239 | 33496 | 52037 | 0830 | 885 | 05636 | 4730 | 200 | 227
20 | 09832 | 07086 | 3385 | 5209 | 9202 | 9777 | 20543 | 09347 | 92 | 35
40 | 09786 | 08639 | 3287 | 506 | 0824 | 09056 | 0776 | 0268 | 225 | 325
80 | 0979 | 0762 | 33374 | 5578 | 0800 | 09000 | 0800 | 0000 | 295 | 394
100 | 0087 | 09820 | 3347 | 48656 | 0800 | 09000 | 0800 | 0000 | 2003 | 376
200 | 09428 | 445 | 2880 | 50634 | 0800 | 09000 | 0800 | 0000 | 304 | 532

TABLE II. MULTI-STEP LS ESTIMATION IN MODEL (39)
N_p | K | α₁ | α₂ | τ | σ | γ | r | σ/r | V | CPU [s]
1 | 09797 | 05632 | 3429 | 5379 | 03377 | 3754 | 0380 | 0620 | 00 | 87
4 | 09796 | 05657 | 3494 | 5370 | 00080 | 386 | 00075 | 0603 | 449 | 42
8 | 09794 | 0636 | 3398 | 52827 | 03805 | 364 | 03595 | 0585 | 9242 | 25
20 | 09792 | 076 | 33563 | 5207 | 03053 | 430 | 02897 | 0539 | 2370 | 393
40 | 09823 | 07394 | 3364 | 5836 | 07805 | 80826 | 0838 | 09383 | 4763 | 644
80 | 09804 | 07597 | 33357 | 5767 | 04734 | 7230 | 04975 | 0954 | 948 | 0
100 | 09796 | 07586 | 33305 | 5782 | 06825 | 65664 | 0727 | 09386 | 804 | 426
200 | 09760 | 07739 | 32966 | 5802 | 05728 | 6969 | 065 | 0934 | 23023 | 2382

TABLE III. SINGLE-STEP ML ESTIMATION IN MODEL (39)
j | K | α₁ | α₂ | τ | σ | γ | r | σ/r | V | CPU [s]
1 | 09797 | 0565 | 342 | 5364 | 02204 | 3762 | 02077 | 063 | -6327 | 5
4 | 09793 | 05798 | 34048 | 5320 | 02432 | 32 | 0868 | 3022 | -987 | 37
8 | 09789 | 0784 | 33500 | 52088 | 02853 | 088 | 0526 | 8694 | 2357 | 725
20 | 09832 | 0708 | 33829 | 52007 | 02243 | 90000 | 0239 | 09384 | 762 | 2038
40 | 09786 | 08639 | 3287 | 5060 | 00002 | 0380 | 02475 | 00007 | 4537 | 3297
80 | 0979 | 07608 | 33376 | 5580 | 00002 | 0985 | 02545 | 00007 | 007 | 6082
100 | 0088 | 0987 | 33474 | 48656 | 00002 | 02748 | 0365 | 00006 | 5369 | 6696
200 | 09426 | 232 | 2893 | 5078 | 00002 | 0029 | 02553 | 00007 | 074 | 3227

TABLE IV. MULTI-STEP ML ESTIMATION IN MODEL (39)
j | K | α₁ | α₂ | τ | σ | γ | r | σ/r | V | CPU [s]
1 | 09797 | 0565 | 342 | 5364 | 02204 | 3762 | 02077 | 063 | -6327 |
4 | 09798 | 05607 | 3425 | 5386 | 0282 | 4745 | 0202 | 0383 | -2487 | 60
8 | 09798 | 05307 | 34369 | 53425 | 0205 | 5453 | 0240 | 0958 | -4685 | 237
20 | 0980 | 05262 | 34700 | 53392 | 0860 | 5200 | 0289 | 08498 | -8290 | 490
40 | 09835 | 05302 | 35242 | 5324 | 0807 | 364 | 0289 | 08254 | -204 | 878
80 | 09872 | 05294 | 35495 | 53043 | 0840 | 3603 | 0290 | 08403 | -977 | 66
100 | 09879 | 05295 | 35535 | 53027 | 0843 | 3535 | 0289 | 0842 | -2397 | 2508
200 | 09898 | 05298 | 35640 | 52987 | 0840 | 330 | 0285 | 08420 | -475 | 6730

In fact, there is not much difference between the estimated parameters for the one-step-ahead maximum likelihood estimate and the multi-step maximum likelihood estimate with a very long prediction horizon, i.e. N_p = 200. The step responses for the estimated multi-step maximum likelihood estimate with a prediction horizon of N_p = 1, i.e. the one-step maximum likelihood estimate, and a prediction horizon of N_p = 200 are shown in Figure 3. They can hardly be distinguished. Hence, for all practical purposes they can be considered identical. This suggests that the one-step-ahead prediction maximum-likelihood estimate should be applied in practice, as its computing time is considerably lower than the computing time for the multi-step maximum-likelihood prediction with a long prediction horizon (N_p = 200). Figure 3 also depicts the step response of the true system. It is evident that the step responses of the estimated models approximate the true step response quite well.

TABLE V. SINGLE-STEP ML ESTIMATION IN MODEL (40)
j | K | α | τ | σ | r | σ/r | V | CPU [s]
1 | 0043 | 38386 | 57243 | 00658 | 02226 | 02959 | -978 | 23
4 | 099 | 36390 | 57547 | 0024 | 02424 | 005 | 3024 | 399
8 | 098 | 35585 | 57792 | 00006 | 02490 | 00025 | 3402 | 857
20 | 0982 | 35568 | 57802 | 00004 | 02455 | 0008 | 2920 | 2245
40 | 09822 | 35750 | 57697 | 00002 | 02479 | 00009 | 4826 | 492
80 | 09747 | 35664 | 5768 | 00002 | 02547 | 00008 | 028 | 7448
100 | 007 | 36458 | 56487 | 00002 | 0369 | 00006 | 5395 | 822
200 | 09465 | 3373 | 5833 | 00006 | 02556 | 00023 | 05 | 7885

TABLE VI. MULTI-STEP ML ESTIMATION IN MODEL (40)
j | K | α | τ | σ | r | σ/r | V | CPU [s]
1 | 0043 | 38386 | 57243 | 00658 | 02226 | 02959 | -978 | 20
4 | 0043 | 38387 | 57244 | 00659 | 02424 | 02962 | -7972 | 60
8 | 0043 | 38386 | 57243 | 00658 | 02490 | 02956 | -64 | 257
20 | 0043 | 38389 | 57244 | 00660 | 02455 | 02968 | -3984 | 382
40 | 0044 | 38394 | 57245 | 00666 | 02479 | 02995 | -7804 | 722
80 | 0039 | 3839 | 57256 | 00669 | 02547 | 030 | -54 | 550
100 | 0033 | 38277 | 5726 | 00670 | 0369 | 0308 | -954 | 2082
200 | 0024 | 38209 | 57269 | 00672 | 02556 | 03027 | -4234 | 4268

IV. CONCLUSION

A constructive method for estimation of parameters in continuous-discrete-time stochastic systems described by transfer functions with time delays has been described and demonstrated. The method applies prediction-error criteria, and the predictions are generated using the Kalman filter and predictor for a stochastic linear discrete-time state space model equivalent to the continuous-discrete-time stochastic transfer function model with time delays. In particular, an efficient computing scheme for the multi-step maximum likelihood prediction-error estimator is developed. The multi-step prediction-error criteria may be selected such that they are compatible with the optimization criterion applied by the predictive controller that uses the identified model.

Compared to the single-step least-squares and the single-step maximum likelihood estimators, the multi-step maximum likelihood estimator produces parameter estimates that are less sensitive to the prediction horizon applied. In contrast to the single-step and multi-step least squares estimators, the multi-step maximum likelihood estimator computes unique parameters for the process and measurement noise. Hence, the multi-step maximum likelihood estimators are recommended for predictive control. Depending on the prediction horizon, the multi-step maximum likelihood estimator requires much more computer resources than the single-step one-step-ahead least-squares predictor.

Consequently, based on the SISO simulation example, we recommend the maximum likelihood (or maximum a posteriori) estimator based on the one-step-ahead prediction error. The models obtained using the multi-step maximum-likelihood prediction-error method with a prediction horizon of one and a very long prediction horizon are essentially identical. However, the long prediction horizon demands much more computational resources than the criterion based on the one-step-ahead prediction.

REFERENCES

[1] K. J. Åström, "Maximum likelihood and prediction error methods," Automatica, vol. 16, pp. 551-574, 1980.
[2] L. Ljung, System Identification: Theory for the User, 2nd ed. Upper Saddle River, NJ: Prentice Hall, 1999.
[3] N. R. Kristensen, H. Madsen, and S. B. Jørgensen, "Parameter estimation in stochastic grey-box models," Automatica, vol. 40, pp. 225-237, 2004.
[4] D. S. Shook, C. Mohtadi, and S. L. Shah, "A control-relevant identification strategy for GPC," IEEE Transactions on Automatic Control, vol. 37, pp. 975-980, 1992.
[5] R. B. Gopaluni, R. S. Patwardhan, and S. L. Shah, "The nature of data pre-filters in MPC relevant identification - open- and closed-loop issues," Automatica, vol. 39, pp. 1617-1626, 2003.
[6] R. B. Gopaluni, R. S. Patwardhan, and S. L. Shah, "MPC relevant identification - tuning the noise model," Journal of Process Control, vol. 14, pp. 699-714, 2004.
[7] M. Morari and J. H. Lee, "Model predictive control: Past, present and future," Computers and Chemical Engineering, vol. 23, pp. 667-682, 1999.
[8] S. B. Jørgensen and J. H. Lee, "Recent advances and challenges in process identification," in Chemical Process Control - 6: Assessment and New Directions for Research, J. B. Rawlings and B. A. Ogunnaike, Eds., January 2001.
[9] H. Hjalmarsson, "From experiments to closed loop control," in 13th IFAC Symposium on System Identification, 27-29 August 2003, Rotterdam, The Netherlands. IFAC, 2003.
[10] M. Gevers, "A personal view on the development of system identification," in 13th IFAC Symposium on System Identification, 27-29 August 2003, Rotterdam, The Netherlands. IFAC, 2003.
[11] M. Gevers, "Identification for control: Achievements and open problems," in DYCOPS 7, Cambridge, Massachusetts, USA, July 5-7. IFAC, 2004.
[12] G. A. F. Seber and C. J. Wild, Nonlinear Regression. New York: Wiley, 1989.
[13] J. D. Hamilton, Time Series Analysis. Princeton, NJ: Princeton University Press, 1994.
[14] T. Kailath, A. H. Sayed, and B. Hassibi, Linear Estimation. Prentice Hall, 2000.
[15] A. H. Jazwinski, Stochastic Processes and Filtering Theory. Academic Press, 1970.
[16] M. Morari and E. Zafiriou, Robust Process Control. Englewood Cliffs, NJ: Prentice-Hall, 1989.