An On-line Method for Estimation of Piecewise Constant Parameters in Linear Regression Models


Preprints of the 18th IFAC World Congress, Milano (Italy), August 28 - September 2, 2011

Soheil Salehpour, Thomas Gustafsson, Andreas Johansson
E-mail: {soheil, tgu, andreas.johansson}@ltu.se, Control Engineering Group, Luleå University of Technology, SE-971 87 Luleå, Sweden.

Abstract: We present an on-line method for detecting changes and estimating parameters in AR(X) models. The method is based on the assumption of piecewise constant parameters, which results in a sparse structure of their derivative. To illustrate the algorithm and its performance, we apply it to changes in the model parameters of some ARX models. The examples show that the new method performs well for an AR(X) model with abrupt changes in the parameters.

Keywords: Parameter estimation, ARX model, LASSO, l1-norm, sparsity.

1. INTRODUCTION

The area of change detection is a quite active field, both in research and in applications. Faults occur in almost all systems, and one aim of change detection is to locate the fault occurrence in time and raise an alarm. Another application is the estimation of perturbations (Salehpour and Johansson, 2008).

In (Gustafsson, 2000) and (Kay, 1998), surveys are given of on-line and off-line formulations of single and multiple change point estimation. In the on-line method, multiple filters are used in parallel, where each one is matched to a certain assumption on the abrupt changes. Two off-line strategies are also proposed: one is based on Markov Chain Monte Carlo techniques, and the other on a recursive local search scheme.

In (Salehpour, Johansson and Gustafsson, 2009) an off-line method based on MILP (Mixed Integer Linear Programming) and the sparsity of the derivative of the parameters is presented. It is an efficient method for fault detection and model quality estimation, but its disadvantage is the computational complexity of the MILP optimization.

An off-line LASSO (Least Absolute Shrinkage and Selection Operator) estimator is also a good choice to maximize the sparsity of Δθ(t) = θ(t+1) − θ(t) and estimate θ(t). It is used in (Salehpour and Johansson, 2008) for estimation of perturbations, in (Ozay, Sznaier, Lagoa and Camps, 2008) for set membership identification and image segmentation, and in modified form for segmentation in (Ohlsson, Ljung and Boyd, 2010).

The goal of change point estimation is to find a sequence k^n = [k_1, k_2, ..., k_n] of time indices, where both the number n and the locations k_i are unknown, such that the signal or the model of the signal can be described as piecewise constant, i.e. the time-varying parameters are mostly constant with abrupt changes. For this purpose, we assume that the signal can be described by the linear regression model

    y(t) = \varphi(t)^T \theta(t) + e(t)    (1)

where θ(t) is a piecewise constant vector between the time indices k^n, and e(t) is some noise signal. For an ARX(n_a, n_b, n_k) model,

    \varphi(t)^T = [-y_{t-1}, \ldots, -y_{t-n_a}, u_{t-n_k}, \ldots, u_{t-n_k-n_b+1}]
    \theta(t)^T = [a_t^1, a_t^2, \ldots, a_t^{n_a}, b_t^1, b_t^2, \ldots, b_t^{n_b}]
    \Theta(N) = [\theta(1), \theta(2), \ldots, \theta(N)]
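To make the notation concrete, the following is a minimal sketch (not taken from the paper) of how the regressor φ(t) in (1) can be assembled and used to simulate data from a fixed θ; the function name arx_regressor, the illustrative coefficient values and the choice n_k = 1 are assumptions made only for this example.

```python
import numpy as np

def arx_regressor(y, u, t, na, nb, nk=1):
    """Build phi(t) = [-y(t-1), ..., -y(t-na), u(t-nk), ..., u(t-nk-nb+1)]
    for the linear regression model y(t) = phi(t)^T theta(t) + e(t)."""
    past_y = [-y[t - i] for i in range(1, na + 1)]
    past_u = [u[t - nk - j] for j in range(nb)]
    return np.array(past_y + past_u)

# Example: second-order ARX (na = 2, nb = 1, nk = 1) with constant parameters.
rng = np.random.default_rng(0)
u = np.sign(rng.standard_normal(100))        # crude +/-1 excitation
y = np.zeros(100)
theta_true = np.array([0.5, 0.8, 1.0])       # [a1, a2, b1], illustrative values
for t in range(2, 100):
    phi = arx_regressor(y, u, t, na=2, nb=1)
    y[t] = phi @ theta_true + 0.1 * rng.standard_normal()
```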
In Section 2, a LASSO estimator is described. An on-line method based on the LASSO method is presented in Section 3. Simulation results are given in Section 4, followed by some concluding remarks and directions for future work in Section 5.

2. PRELIMINARIES

2.1 Estimation of Time-varying Parameters

The RLS algorithm is traditionally used as an on-line method to estimate the parameters in (1), where we get

    \hat{\theta}(N) = \arg\min_{\theta} \sum_{t=1}^{N} \beta(N,t)\,\big(y(t) - \varphi(t)^T \theta\big)^2    (2)

where the forgetting factor β(N,t) describes one of the following data windowing choices:

a1 Infinite window with β(N,t) = 1, for time-invariant signals with proper initialization.
a2 Exponentially decaying window with β(N,t) = β^{N-t} and 0 < β < 1. RLS then gives less weight to old samples and can track time-varying signals.
a3 Finite window with β(N,t) = 1 if N - t < M and β(N,t) = 0 otherwise, where the most recent M samples are used to estimate θ(t) and the rest are discarded.
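As a baseline for the comparisons made later in the paper, the classical RLS recursion with exponential forgetting (window a2) can be sketched as follows; this is a textbook formulation of (2), not code from the paper, and the initialization constant delta is an assumption.

```python
import numpy as np

def rls_exponential(y, Phi, beta=0.9, delta=100.0):
    """Recursive least squares with forgetting factor beta (window a2).
    y: (N,) outputs; Phi: (N, p) regressors phi(t)^T stacked row-wise."""
    N, p = Phi.shape
    theta = np.zeros(p)
    P = delta * np.eye(p)                      # large initial covariance
    Theta = np.zeros((N, p))
    for t in range(N):
        phi = Phi[t]
        k = P @ phi / (beta + phi @ P @ phi)   # gain vector
        theta = theta + k * (y[t] - phi @ theta)
        P = (P - np.outer(k, phi @ P)) / beta
        Theta[t] = theta
    return Theta
```

Setting beta = 1 recovers the infinite window a1, while beta < 1 corresponds to the exponentially decaying window a2.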

A sparse matrix is defined as a matrix populated primarily with zeros. The concept of sparsity is useful in complex systems and in many application areas such as network theory, and huge sparse matrices often appear in science and engineering when solving partial differential equations. One common approach to seeking a sparse description of Δθ(t) is based on l1-norm regularization (Boyd and Vandenberghe, 2004), where most parameters are shrunk to zero. The regularized method is

    \min_{\Theta(N)} \; J(N, \Theta(N))    (3)

where

    J(N, \Theta(N)) = \sum_{t=1}^{N} \beta(N,t)\,\big(y(t) - \varphi(t)^T \theta(t)\big)^2 + \lambda_1 \sum_{t=2}^{N} \|\theta(t) - \theta(t-1)\|_1

and λ_1 is a positive parameter.

An iterative re-weighting is used in (Fazel, 2002) to obtain fewer parameter changes and a better estimate of them. The regularization term in Eq. (3) is then replaced, giving

    J(N, \Theta(N)) = \sum_{t=1}^{N} \beta(N,t)\,\big(y(t) - \varphi(t)^T \theta(t)\big)^2 + \lambda_1 \sum_{t=2}^{N} \omega(t)\,\|\theta(t) - \theta(t-1)\|_1    (4)

where ω(t) > 0. The weights ω(t) allow for successively better estimation of the nonzero coefficient locations. The algorithm is as follows:

1) Set the iteration count l to zero and ω^{(0)}(t) = 1 for t = 1, ..., N.
2) Solve the weighted l1-minimization problem and compute Θ^{(l)}(N).
3) Update the weights

    \omega_i^{(l+1)}(t) = \frac{1}{\epsilon + |\Delta\theta_i^{(l)}(t)|}    (5)

The largest Δθ_i(t) are the most likely to be identified as nonzero. Once these locations are identified, their influence is down-weighted in order to increase the sensitivity for identifying the remaining small but nonzero Δθ_i(t).

The optimization problem with cost function (4) is solved off-line. In order to solve it on-line, a regularized RLS is considered,

    \theta(N) = \arg\min_{\theta} J(N, \theta)    (6)

If the constant terms in J(N, θ) of (4) are neglected, the cost function can be rewritten as

    J(N, \theta) = \theta^T R_N \theta - 2\,\theta^T r_N + \lambda_1 \|\theta - \hat{\theta}(N-1)\|_1    (7)

where

    R_N = \sum_{t=1}^{N} \beta(N,t)\,\varphi(t)\varphi^T(t), \qquad r_N = \sum_{t=1}^{N} \beta(N,t)\,y(t)\varphi(t)    (8)

R_N and r_N can be updated recursively for the different data windows (a1-a3) as

a1: R_N = R_{N-1} + \varphi(N)\varphi^T(N),  r_N = r_{N-1} + y(N)\varphi(N)
a2: R_N = \beta R_{N-1} + \varphi(N)\varphi^T(N),  r_N = \beta r_{N-1} + y(N)\varphi(N)
a3: R_N = R_{N-1} + \varphi(N)\varphi^T(N) - \varphi(N-M)\varphi^T(N-M),  r_N = r_{N-1} + y(N)\varphi(N) - y(N-M)\varphi(N-M)

and λ_1(t) = ω(t)λ_1 is chosen so as to satisfy the oracle properties, which are discussed in Section 2.2.
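A minimal sketch, under our own naming, of how R_N and r_N in (8) can be maintained recursively for the three data windows a1-a3; the buffer-based handling of the finite window and the default values are assumptions made here. The re-weighting (5) is then simply omega = 1 / (eps + |dtheta|), applied elementwise.

```python
import numpy as np
from collections import deque

class Accumulator:
    """Recursively updates R_N and r_N of (8) for windows a1 (infinite),
    a2 (exponential forgetting beta) and a3 (finite window of M samples)."""
    def __init__(self, p, window="a2", beta=0.9, M=100):
        self.R = np.zeros((p, p))
        self.r = np.zeros(p)
        self.window, self.beta, self.M = window, beta, M
        self.buf = deque()                        # stores (y, phi) pairs for a3

    def update(self, y_t, phi_t):
        if self.window == "a2":                   # exponential decay of old data
            self.R *= self.beta
            self.r *= self.beta
        elif self.window == "a3":                 # drop the sample leaving the window
            self.buf.append((y_t, phi_t.copy()))
            if len(self.buf) > self.M:
                y_old, phi_old = self.buf.popleft()
                self.R -= np.outer(phi_old, phi_old)
                self.r -= y_old * phi_old
        self.R += np.outer(phi_t, phi_t)          # rank-one update common to a1-a3
        self.r += y_t * phi_t
        return self.R, self.r
```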

2.2 Adaptive LASSO and Oracle Conditions

For the asymptotic analysis of (2), two conditions are assumed in (Knight and Fu, 2000):

- e(t) are independent identically distributed random variables with mean 0 and variance σ².
- (1/N) \sum_{t=1}^{N} \varphi(t)\varphi(t)^T \to C, where C is a positive definite matrix.

We now define the adaptive LASSO. Suppose that α̂ is a consistent estimator of α, and define the weight vector μ̂ = 1/|α̂|. The adaptive LASSO estimate α̂^{(N)} is given by

    \hat{\alpha}^{(N)} = \arg\min_{\alpha} \sum_{t=1}^{N} \beta(N,t)\,\big(y(t) - \varphi(t)^T \alpha(t)\big)^2 + \lambda_1 \sum_{t} \mu(t)\,|\alpha(t)|

Let A = {j : α_j ≠ 0} be the set of indices of the nonzero parameters, with |A| = p_0. Denoting the estimated parameters by α̂_j(δ), we call δ an oracle procedure (Fan and Li, 2001) if α̂_j(δ) has the following oracle properties:

- It identifies the right subset model, {j : α̂_j(δ) ≠ 0} = A.
- It has the optimal estimation rate and convergence in distribution (→_d), that is, \sqrt{N}\,(\hat{\alpha}_j(\delta) - \alpha_j) \to_d N(0, \Sigma^*), where Σ* is the covariance matrix knowing the true subset model.

It is shown in (Zou, 2006) that with a proper choice of λ_1, the adaptive LASSO enjoys the oracle properties.

Theorem 1 (Oracle properties): Suppose that λ_1/\sqrt{N} → 0 and λ_1 → ∞. Then the adaptive LASSO satisfies the following:

(1) Consistency in variable selection: lim_{N→∞} P(A_N = A) = 1
(2) Asymptotic normality: \sqrt{N}\,(\hat{\alpha}_A - \alpha_A) \to_d N(0, C_{11}^{-1})

where C_{11} is the p_0 × p_0 submatrix of

    C = \begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix}

with C as in the second condition of the asymptotic analysis. Let B = {j : Δθ_j ≠ 0} and B_N = {j : Δθ̂_j(δ) ≠ 0}, where θ̂_j(δ) is a δ oracle procedure for θ_j. Then the oracle properties can also be shown for (4):

(1) Consistency in variable selection: lim_{N→∞} P(B_N = B) = 1
(2) Asymptotic normality: \sqrt{N}\,(\Delta\hat{\theta}_B - \Delta\theta_B) \to_d N(0, C_{11}^{-1})

The proof is a straightforward modification of the proof in (Zou, 2006).

3. ONLINE METHOD BASED ON ONLINE (CYCLIC) COORDINATE DESCENT

A gradient-based minimization of (7) is impossible because the l1-norm is non-differentiable. A possible approach is offered by on-line coordinate descent iterative minimizers (Angelosante, Bazerque and Giannakis, 2010). The algorithm is modified here to develop an on-line solver of (7) that computes a closed-form solution per iteration.

In cyclic coordinate descent (CCD), iterative minimization of J(N, θ) in (7) is performed with respect to one coordinate per iteration cycle. If the solution at time N and iteration i is denoted θ^{(i)}(N), the pth variable at the ith iteration is updated as

    \theta_p^{(i)}(N) = \arg\min_{\theta} J\big(N, [\theta_1^{(i)}(N), \ldots, \theta_{p-1}^{(i)}(N), \theta, \theta_{p+1}^{(i-1)}(N), \ldots, \theta_{n_a+n_b}^{(i-1)}(N)]\big)    (9)

for p = 1, ..., n_a + n_b. In every ith cycle, each coordinate p is optimized while the pre-coordinates (1, ..., p-1) are kept fixed to their values at the ith cycle and the post-coordinates (p+1, ..., n_a + n_b) are kept fixed to their values at the (i-1)th cycle. The algorithm is solvable in closed form with an effective initialization (the all-zero vector), and recent comparative studies show that the complexity of the method is similar to that of state-of-the-art batch LASSO solvers (Wu and Lange, 2008).

An adaptive equivalent of CCD LASSO is introduced in (Angelosante, Bazerque and Giannakis, 2010) as the online coordinate descent (OCD) algorithm, which iteratively solves (9) with the iteration index (i) replaced by the time index ν; the restriction of OCD is that only one variable is updated per time instant. Let ν = k(n_a + n_b) + p denote the time index, where p ∈ {1, ..., n_a + n_b} is the only entry of θ to be updated at time ν (only θ_p is updated, and θ_q(ν) = θ_q(ν-1) for q ≠ p is kept unchanged), and k = ⌈ν/(n_a + n_b)⌉ - 1 counts the cycles, i.e. how many times the pth coordinate has already been updated. Let θ(ν-1) denote the solution of the OCD algorithm at time ν-1. Setting θ_q(ν) = θ_q(ν-1) for q ≠ p keeps all but the pth coordinate at time ν equal to those at time ν-1, and the pth one is selected by minimizing J(ν, θ):

    \theta_p(\nu) = \arg\min_{\theta} J\big(\nu, [\theta_1(\nu-1), \ldots, \theta_{p-1}(\nu-1), \theta, \theta_{p+1}(\nu-1), \ldots, \theta_{n_a+n_b}(\nu-1)]\big)    (10)

After isolating θ_q(ν-1) for q ≠ p in the cyclic update (10), J(ν, θ) depends only on the pth coordinate and can be rewritten as

    \theta_p(\nu) = \arg\min_{\theta_p} \Big[ \tfrac{1}{2} R_\nu(p,p)\,\theta_p^2 - r_{\nu,p}\,\theta_p + \lambda_1 |\theta_p - x| \Big], \qquad r_{\nu,p} = r_\nu(p) - \sum_{q \neq p} R_\nu(p,q)\,\theta_q(\nu-1)    (11)

where x = θ_p(ν-1). This is a scalar optimization problem and has the closed-form solution (Friedman, Hastie, Höfling and Tibshirani, 2007)

    \theta_p(\nu) = \mathrm{sgn}\big(r_{\nu,p} - R_\nu(p,p)x\big)\left[\frac{|r_{\nu,p} - R_\nu(p,p)x| - \lambda_1}{R_\nu(p,p)}\right]_+ + x    (12)

where [γ]_+ := max(γ, 0). The soft-thresholding operation sets inactive entries to their previous value and gives a sparse solution. The OCD algorithm is shown in Table 1.

Algorithm 1: OCD
Initialize with θ(0) = 0
for k = 0, 1, ...
  for p = 1, ..., n_a + n_b
    1. Get data y(ν) and φ(ν), ν = k(n_a + n_b) + p.
    2. Compute r_ν and R_ν as in a1-a3.
    3. Set θ_q(ν) = θ_q(ν-1) for all q ≠ p.
    4. Compute r_{ν,p} as in (11).
    5. Update θ_p(ν) as in (12).
Table 1. OCD Algorithm

The OCD solver has low complexity but exhibits slow convergence, because each variable is updated only once every n_a + n_b observations.
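The closed-form update (12) is a soft-thresholding step centred at the previous value of the coordinate. A hedged sketch of one such coordinate update follows; the variable names are ours, and the guard for an unexcited coordinate is an added assumption.

```python
import numpy as np

def ocd_coordinate_update(R, r, theta_prev, p, lam):
    """One OCD step: update coordinate p via the soft threshold (11)-(12).
    R, r: current R_nu and r_nu; theta_prev: the estimate theta(nu-1)."""
    theta = theta_prev.copy()
    if R[p, p] <= 0:
        return theta                      # coordinate p not excited yet (added guard)
    x = theta_prev[p]                     # previous value of the p-th coordinate
    # r_{nu,p} = r(p) - sum_{q != p} R(p,q) * theta_q(nu-1)
    r_p = r[p] - R[p] @ theta_prev + R[p, p] * x
    z = r_p - R[p, p] * x
    theta[p] = x + np.sign(z) * max(abs(z) - lam, 0.0) / R[p, p]
    return theta
```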
We implement the OCD cyclically, updating all coordinates once or several times per observation. Once θ_1(ν) has been solved, the pth coordinate is computed by minimization in the following steps,

    \theta_p^{(i)}(\nu) = \arg\min_{\theta} J\big(\nu, [\theta_1^{(i)}(\nu), \ldots, \theta_{p-1}^{(i)}(\nu), \theta, \theta_{p+1}^{(i-1)}(\nu), \ldots, \theta_{n_a+n_b}^{(i-1)}(\nu)]\big)    (13)

θ_p^{(i)}(ν) is solved as in (12) with x = θ_p^{(i-1)}(ν),

    \theta_p^{(i)}(\nu) = \mathrm{sgn}\big(r_{\nu,p} - R_\nu(p,p)x\big)\left[\frac{|r_{\nu,p} - R_\nu(p,p)x| - \lambda_1}{R_\nu(p,p)}\right]_+ + x    (14)

where

    r_{\nu,p} = r_\nu(p) - \sum_{q < p} R_\nu(p,q)\,\theta_q^{(i)}(\nu) - \sum_{q > p} R_\nu(p,q)\,\theta_q^{(i-1)}(\nu)    (15)

The resulting online cyclic coordinate descent (OCCD) algorithm is shown in Table 2.

Algorithm 2: OCCD
Initialize with θ(0) = 0
for ν = 1, 2, ...
  1. Get data y(ν) and φ(ν).
  2. Compute r_ν and R_ν as in a1-a3.
  for l = 1, 2, ... (number of updates of the weights ω(t) in (5))
    for i = 1, 2, ... (number of times OCCD updates all coordinates)
      for p = 1, ..., n_a + n_b
        3. Compute r_{ν,p} as in (15).
        4. Update θ_p(ν) as in (14).
Table 2. OCCD Algorithm
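Combining the pieces, one OCCD time step in the spirit of Table 2 might look as follows; the per-coordinate weights lam_weights stand in for λ_1 ω(t) from (5), the outer re-weighting loop over l is collapsed into a single pass, and the soft threshold is centred at the value from the previous sweep as in (14). This is a sketch under those assumptions, not the authors' implementation.

```python
import numpy as np

def occd_step(R, r, theta_prev, lam_weights, sweeps=5):
    """One OCCD observation: cyclically apply the soft-threshold update
    (14)-(15) to all coordinates, 'sweeps' times (cf. Table 2).
    lam_weights[p] plays the role of lambda_1 * omega for coordinate p."""
    theta = theta_prev.copy()
    for _ in range(sweeps):
        for p in range(theta.size):
            if R[p, p] <= 0:
                continue                               # coordinate not excited yet
            x = theta[p]                               # value from the previous sweep, cf. (14)
            # (15): pre-coordinates from this sweep, post-coordinates from the last one
            r_p = r[p] - R[p] @ theta + R[p, p] * x
            z = r_p - R[p, p] * x
            theta[p] = x + np.sign(z) * max(abs(z) - lam_weights[p], 0.0) / R[p, p]
    return theta
```

In a full run, the accumulator sketched after (8) would supply R_ν and r_ν at every sample before occd_step is called.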

4. SIMULATION RESULTS

In order to give some idea of the performance of the method, we apply it to a number of AR(X) models. We take λ_{1,max} = σ (2 log(n_a + n_b) Σ_{n=1}^{N} β(N,n))^{1/2}, with the sum taken over the data window a1-a3 in use, and set λ_1 = 0.1 λ_{1,max} and ε = 0.1 in (5). The input is a ±1 PRBS (Pseudo-Random Binary Sequence) signal in Examples 1 and 2.

4.1 Example 1

The method is applied to an ARX change model with n_a = 2, n_k = n_b = 1 and σ = 0.1, where the true parameters are shown in Fig. 1(a)-(c). The input and output are depicted in Fig. 1(d)-(e). Let β = 0.9 for window a2 and a finite window of size M for a3 in the OCD and OCCD methods. We use 5 coordinate-update sweeps per observation for the OCCD method. The parameter estimates for one data set (y_t, u_t, e_t) are also shown in Fig. 1(a)-(c).

In Fig. 2, the unbiased variance of b_t is shown in order to compare with the RLS method, using a Monte Carlo analysis over a number of data sets, where a^1_t and a^2_t are kept constant at 0.5 and 0.8, respectively, and b_t changes abruptly after 500 samples. A smaller unbiased variance is obtained with the OCCD algorithms than with the RLS method. The RLS algorithm with a finite window of size M and the OCCD algorithm are compared in Fig. 2, which shows a smaller unbiased variance for the OCCD algorithms than for RLS, both for a2 with β = 0.9 and for a3 with the finite window.

We also test the OCD and OCCD algorithms when b_t changes as a ramp function (Fig. 3(c)). Fig. 3(a)-(c) shows good tracking of the parameters, despite the parameter not being piecewise constant. Fig. 3(d)-(e) depicts the output and the PRBS input.

4.2 Example 2: Changing time delay

Consider the system

    y(t) = 0.9\,y(t-1) + u(t - n_k) + e(t)

At time t = 500 the time delay n_k changes from 1 to 2. An ARX model

    y(t) = a\,y(t-1) + b_1 u(t-1) + b_2 u(t-2) + e(t)

is used to estimate a, b_1 and b_2. The OCCD method estimates the parameters with β = 0.9 for a2 and a finite window of size M for a3, as shown in Fig. 4(a)-(c) for one data set (y_t, u_t, e_t). The result shows a good estimate of b_1 and b_2, which jump with magnitude 1 at sample 500.

4.3 Example 3

The algorithm is also applied to data from a human EEG signal (Fig. 5(a)). A second-order AR model is used to describe the time-varying EEG signal. An estimated and a smoothed piecewise constant parameter estimate are obtained using a bank of 8 filters from (Gustafsson, 2000). With an AR(2) model, the change point 430 is computed (Fig. 5(b)). Our algorithm is implemented with β = 0.99. Fig. 5(b) shows that the change in the EEG is detected after a short delay and that the parameter estimate converges to the estimate of the filter bank.
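For readers who want to reproduce the flavour of Example 1, the following sketch generates data from an ARX change model driven by a ±1 binary excitation; the exact parameter values, change magnitude and sequence length used in the paper are not all legible in this copy, so the numbers below are purely illustrative and the simple held-level sequence stands in for a true PRBS.

```python
import numpy as np

def prbs(n, period=8, seed=0):
    """Crude +/-1 binary excitation held constant over 'period' samples
    (a stand-in for the PRBS input used in the paper)."""
    rng = np.random.default_rng(seed)
    levels = rng.choice([-1.0, 1.0], size=n // period + 1)
    return np.repeat(levels, period)[:n]

# ARX change model in the spirit of Example 1: a1, a2 constant, b jumps mid-run.
N, sigma = 1000, 0.1
u = prbs(N)
a1, a2 = 0.5, 0.8                              # illustrative AR coefficients
b = np.where(np.arange(N) < 500, 1.0, 2.0)     # abrupt change of b at sample 500 (size illustrative)
y = np.zeros(N)
rng = np.random.default_rng(1)
for t in range(2, N):
    y[t] = -a1 * y[t-1] - a2 * y[t-2] + b[t] * u[t-1] + sigma * rng.standard_normal()
```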
5. CONCLUSIONS AND FUTURE WORK

An on-line LASSO algorithm is presented to estimate piecewise constant parameters in linear regression models. It is based on the assumption of piecewise constant parameters, resulting in a sparse structure of their derivative, and on a cyclic coordinate descent iterative minimization of the LASSO problem. In particular, the parameters of an AR(X) model are considered. The method is tested on linear AR(X) change models, and the results show good performance. For future research, faster convergence of the OCCD algorithm should be pursued.

ACKNOWLEDGEMENTS

The authors wish to thank the Hjalmar Lundbohm Research Center (HLRC), funded by LKAB, for financing this research.

REFERENCES

Angelosante D., Bazerque J. A. and Giannakis G. B. Online adaptive estimation of sparse signals: where RLS meets the l1-norm. IEEE Transactions on Signal Processing, Vol. 58, No. 7, July 2010.
Boyd S. and Vandenberghe L. Convex Optimization. Cambridge University Press, 2004.
Fan J. and Li R. Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, Vol. 96, No. 456, 2001.
Fazel M. Matrix Rank Minimization with Applications. PhD thesis, Electrical Engineering Department, Stanford University, March 2002.
Friedman J., Hastie T., Höfling H. and Tibshirani R. Pathwise coordinate optimization. Annals of Applied Statistics, Vol. 1, No. 2, 302-332, 2007.
Gustafsson F. Adaptive Filtering and Change Detection. John Wiley and Sons, Ltd, 2000.
Kay S. M. Fundamentals of Statistical Signal Processing: Detection Theory. Prentice-Hall, 1998.
Knight K. and Fu W. J. Asymptotics for LASSO-type estimators. The Annals of Statistics, Vol. 28, No. 5, 1356-1378, 2000.

Preprints of the 8th IFAC World Congress Milano (Italy) August 8 - September,.5 ) ) ) Output Input.5 4 6 8 4 6 8.9.8.7.6.5.4.3.. 4 6 8 4 6 8 3.5.5.5 4 6 8 4 6 8 5 5 5 5 (c) 5 4 6 8 4 6 8.5.5 (d) Un biased Variance of Signal 3 3 4 5 6 7 8 9 Fig..Unbiasedvarianceofb t,rls(solid),rlswithwindows size M = (dash-dotted), OCCD algorithm with 5 times coordinate updating and β =.9 (dotted), and OCCD algorithm with the size of windows M = (dashed) Ohlsson H., Ljung L. and Boyd S. Segmentation of ARXmodels using sum-of-norms regularization, Automatica, (46), 6, 7-,. Ozay., Sznaier M., Lagoa C. and Camps, O. A sparsification approach to set membership identification of a class of affine hybrid systems. In Proceedings of the 47th IEEE conference on decision and control, 3-3, Dec. 8. Salehpour S., Johansson A. and Gustafsson T. Parameter estimation and change detection in linear regression models using mixed integer linear programming Proceedings of the 5th IFAC Symposium on System Identification (SYSID), Saint-Malo, France, July 9. Salehpour S. and Johansson A. Two Algorithms for Model Quality Estimation in State-Space Systems with Time- Varying Parameter Uncertainty, In Proceedings of the American Control Conference, 489-484, Seattle, USA, June 8. Wu T. T. and Lange K. Coordinate descent algorithms for LASSO penalized regression. Annals of Applied Statistics, Volume, umber, 4-44, 8. Zou H. The Adaptive LASSO and its Oracle Properties. Journal of the American Statistical Association, Vol., o. 476, 6. 4 6 8 4 6 8 (e) Fig.. The ARX change model with a white Gaussian noise (σ =.). The true parameters (solid), OCD algorithm (dashed) with β =.9, OCCD algorithm with 5 times coordinates updating and β =.9 (dash-dotted), and OCCD algorithm with the size of windows M = (dotted) in a t, a t and (c) b t (d) The output (e) Input 375

Fig. 3. The ARX change model with white Gaussian noise (σ = 0.1). The true parameters (solid), the OCD algorithm (dashed) with β = 0.9, the OCCD algorithm with 5 coordinate-update sweeps and β = 0.9 (dash-dotted), and the OCCD algorithm with a finite window of size M (dotted), for (a) a^1_t, (b) a^2_t and (c) the parameter b_t as a ramp function; (d) the output; (e) the input.

Fig. 4. The delay model with white Gaussian noise (σ = 0.1). The true parameters (solid), the OCD algorithm (dashed) with β = 0.9, the OCCD algorithm with 5 coordinate-update sweeps and β = 0.9 (dash-dotted), and the OCCD algorithm with a finite window of size M (dotted), for (a) a_t, (b) b^1_t and (c) b^2_t.

Fig. 5. (a) The human EEG signal. (b) The estimated parameters of an AR(2) model: the estimate (dotted) and the smoothed estimate (dashed) using the bank of filters, and the OCCD algorithm with 5 coordinate-update sweeps and β = 0.99 (solid).