This is a repository copy of An iterative orthogonal forward regression algorithm.


White Rose Research Online URL for this paper: http://eprints.whiterose.ac.uk/0735/

Version: Accepted Version

Article: Guo, Y., Guo, L. Z., Billings, S. A. et al. (1 more author) (2015) An iterative orthogonal forward regression algorithm. International Journal of Systems Science, 46 (5). pp. 776-789. ISSN 0020-7721 http://doi.org/10.1080/00207721.2014.981237

Reuse: Unless indicated otherwise, fulltext items are protected by copyright with all rights reserved. The copyright exception in section 29 of the Copyright, Designs and Patents Act 1988 allows the making of a single copy solely for the purpose of non-commercial research or private study within the limits of fair dealing. The publisher or other rights-holder may allow further reproduction and re-use of this version - refer to the White Rose Research Online record for this item. Where records identify the publisher as the copyright holder, users can verify any specific terms of use on the publisher's website.

Takedown: If you consider content in White Rose Research Online to be in breach of UK law, please notify us by emailing eprints@whiterose.ac.uk including the URL of the record and the reason for the withdrawal request.

eprints@whiterose.ac.uk http://eprints.whiterose.ac.uk/

An Iterative Orthogonal Forward Regression Algorithm

Revised on 02/0/2014. Revised on 23/07/2014.

Yuzhu Guo, L.Z. Guo, S. A. Billings, and Hua-Liang Wei

Department of Automatic Control and Systems Engineering, The University of Sheffield, Mappin Street, Sheffield, S1 3JD, UK

Abstract -- A novel iterative learning algorithm is proposed to improve the classic orthogonal forward regression (OFR) algorithm in an attempt to produce an optimal solution under a purely OFR framework without using any other auxiliary algorithms. The new algorithm searches for the optimal solution on a global solution space while maintaining the advantage of simplicity and computational efficiency. Both a theoretical analysis and simulations demonstrate the validity of the new algorithm.

Index Terms: Iterative orthogonal forward regression, model structure detection, nonlinear system identification, orthogonal least squares.

1. Introduction

The NARMAX (Nonlinear AutoRegressive Moving Average with eXogenous inputs) model and the associated Orthogonal Forward Regression (OFR) algorithm have been widely applied in nonlinear system identification including in the modelling of many engineering, chemical, biological, medical, geographical, and economic systems (Billings 2013). Variations of these algorithms have been developed for lumped and distributed parameter systems, time-invariant and rapidly time-varying systems, in the time, frequency and spatio-temporal domains. The OFR algorithm, which is also known as the OLS (Orthogonal Least Squares) or the FOLSR (Forward Orthogonal Least Squares Regression) algorithm, determines the model structure of nonlinear systems based on the ERR (Error Reduction Ratio) criterion without any a priori knowledge except for the specification of an initial model set. However, under some extreme circumstances (for example non-persistently exciting inputs) the classic OFR algorithm can sometimes select some incorrect model terms (Billings and Wei 2007; Mao and Billings 1997; Piroddi and Spinelli 2003; Sherstinsky and Picard 1996). Solutions are available which solve this problem (Billings 2013; Billings and Wei 2007; Li et al. 2006; Mao and Billings 1997; Wei and Billings 2008) but most of these methods involve combining the OFR algorithm with other routines.

Contrary to the earlier approaches, this paper presents a new proposal to enhance the classic OFR without making any major conceptual changes. To clarify the difference with a standard OFR algorithm the new algorithm will be referred to as the iterative Orthogonal Forward Regression (iOFR) algorithm. The core ideas of the new algorithm are presented and it is shown that the new iOFR method can produce an optimal model under a revised but purely OFR-ERR framework. Another advantage of the new iOFR algorithm is that the new two-step iOFR does not require the initial model obtained at the first step to be an accurate model, which is different from most coarse-to-fine algorithms. This means the iOFR algorithm can start from an incomplete model and can still produce a complete optimal model.

The remainder of the paper is organised as follows. Section 2 briefly reviews the classic OFR algorithm. Section 3 introduces the new iterative OFR algorithm. A simple example is initially introduced to motivate the introduction of an iterative process to search for the optimal solution on a global solution space rather than a local space. Three illustrative examples are discussed in Section 4. The first example shows that the iOFR can successfully eliminate any redundant terms and obtain a parsimonious model. The second example shows how the new iOFR algorithm can find the correct terms which may have been missed in earlier algorithms. The third example is used to illustrate the identification of the NARMAX model with noise terms using the new iOFR algorithm. Conclusions are finally drawn in Section 5.

2. NARMAX model and orthogonal forward regression

2.1 NARMAX model

A NARMAX model is essentially an expansion of the output with past inputs, outputs and noise terms. A wide class of nonlinear systems can be represented by a NARMAX model (Billings 2013; Leontaritis and Billings 1985) which can be defined as

$$y(k) = F\big[\,y(k-1), y(k-2), \ldots, y(k-n_y),\ u(k-d), u(k-d-1), \ldots, u(k-d-n_u),\ e(k-1), e(k-2), \ldots, e(k-n_e)\,\big] + e(k) \qquad (1)$$

where $y(k)$, $u(k)$ and $e(k)$ are the system output, input, and noise sequences respectively; $n_y$, $n_u$, and $n_e$ are the maximum lags for the system output, input, and noise; $F(\cdot)$ is some nonlinear function; $d$ is a time delay which is often set as $d = 1$. Although both Volterra series and NARMAX models represent input-output relations, the Volterra series gives an explicit representation while the NARMAX model gives an implicit representation, which is often of a much more compact form. A large class of systems can be described using the NARMAX model by selecting different forms of the function $F$, for example the nonlinear DARX model (Shouche et al. 1998).

2.2 OFR algorithm

System identification based on the NARMAX model involves selecting the significant model terms from a full candidate term dictionary and then estimating the associated parameters in order to build a parsimonious model. The search for model subsets with minimum mean square error (MSE) can be approached in a straightforward manner by computing all possible regressions but the amount of computation required can be formidable, because the number of possible subsets increases exponentially. OFR offers an efficient procedure for finding the best subset (Billings et al. 1989; Billings et al. 1988). The OFR algorithm involves a stepwise orthogonalisation of the regressors and a forward selection of the model terms based on the error reduction ratio (ERR) criterion (Billings 2013).

Specify an initial full model set $D = \{\varphi_1, \varphi_2, \ldots, \varphi_M\}$, which is composed of a total number of $M$ candidate terms. Terms $\varphi_i$ are linear or nonlinear functions of the inputs, outputs and noise. When the measurements of inputs, outputs, and noise are available, these functions can be evaluated and represented as the regression matrix

$$\Phi = [\varphi_1\ \ \varphi_2\ \ \cdots\ \ \varphi_M] \qquad (2)$$

where the column vectors $\varphi_i$ are defined as $\varphi_i = (\varphi_i(1), \ldots, \varphi_i(N))^T$. By slightly abusing the notation, we sometimes use the column vectors $\varphi_i$ to represent terms and the regression matrix $\Phi$, which includes all the columns, to represent the term dictionary $D$ in later discussions.

Because normally there is a lack of knowledge regarding the structure of function $F$ in (1), the term dictionary is selected to be redundant and it is assumed that $F$ can be expressed as a linear combination of a subset of $D$, that is, basis functions $D_{M_0} = \{\varphi_{i_1}, \varphi_{i_2}, \ldots, \varphi_{i_{M_0}}\} \subset D$, where $i_m \in \{1, 2, \ldots, M\}$, so that the model of system (1) can be represented by

$$y(t) = \sum_{m=1}^{M_0} \theta_m \varphi_{i_m}(t) + e(t) \qquad (3)$$

where $\theta_m$ are the coefficients. Hence system identification based on the measurements involves the determination of the model structure and the estimation of the parameters. However the determination of the structure and the estimation of the parameters are coupled with each other. The significance of a term in a model depends on the estimated parameters while the estimation of the coefficients depends on the model structure. Using a traditional forward regression algorithm, all the coefficients in a model need to be re-estimated when a new term is added. Hence the evaluation of the contribution of a newly added term to the model is computationally intensive because of the matrix inversion involved in the coefficient re-estimation. However, the structure detection and the parameter estimation can be successfully decoupled when all the terms are orthogonal to each other.
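As an illustration of how a redundant term dictionary $D$ might be assembled in practice, the following Python sketch builds polynomial candidate terms from lagged inputs and outputs. The polynomial form of the terms, the helper name `build_dictionary`, the delay $d = 1$ and the omission of noise terms are all assumptions made for this illustration; the text only requires the terms to be linear or nonlinear functions of the inputs, outputs and noise.

```python
from itertools import combinations_with_replacement


def build_dictionary(y, u, ny, nu, degree):
    """Hypothetical helper: polynomial candidate terms up to `degree`
    built from y(k-1..ny) and u(k-1..nu), assuming d = 1, no noise terms
    and input/output records of equal length."""
    start = max(ny, nu)              # first sample with all lags available
    n_rows = len(y) - start
    lagged = {}
    for lag in range(1, ny + 1):
        lagged[f"y(k-{lag})"] = [y[k - lag] for k in range(start, len(y))]
    for lag in range(1, nu + 1):
        lagged[f"u(k-{lag})"] = [u[k - lag] for k in range(start, len(y))]

    names, columns = ["1"], [[1.0] * n_rows]      # constant term
    for deg in range(1, degree + 1):
        for combo in combinations_with_replacement(list(lagged), deg):
            col = [1.0] * n_rows
            for name in combo:                    # elementwise product
                col = [c * v for c, v in zip(col, lagged[name])]
            names.append("*".join(combo))
            columns.append(col)
    return names, columns, y[start:]              # target is y(k)


# Toy data, chosen only to show the shape of the result.
names, columns, target = build_dictionary(
    [0.0, 1.0, 2.0, 3.0], [1.0, 1.0, 0.0, 1.0], ny=1, nu=1, degree=2
)
for name, col in zip(names, columns):
    print(name, col)
```

Each entry of `columns` plays the role of one column $\varphi_i$ of the regression matrix in (2), and `target` is the corresponding output vector.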

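Data collected from $t = 1$ to $N$ yield the matrix form of equation (3)

$$y = \Phi\theta + e, \qquad (4)$$

where $\Phi$ here denotes the $N \times M_0$ matrix of the model regressors and $\theta = (\theta_1, \ldots, \theta_{M_0})^T$. An orthogonal decomposition of $\Phi$ is given as

$$\Phi = WA \qquad (5)$$

Here $A$ is a unit upper triangular matrix and

$$W = [w_1\ \ w_2\ \ \cdots\ \ w_{M_0}] \qquad (6)$$

is an $N \times M_0$ matrix with orthogonal columns which satisfy

$$\langle w_i, w_j \rangle = \begin{cases} d_i \neq 0, & i = j \\ 0, & i \neq j \end{cases} \qquad (7)$$

where $\langle \cdot, \cdot \rangle$ denotes the inner product defined on the space $\mathbb{R}^N$, that is $\langle w_i, w_j \rangle = w_i^T w_j = \sum_{k=1}^{N} w_i(k) w_j(k)$. Equation (4) can then be written as

$$y = Wg + e = \sum_{i=1}^{M_0} w_i g_i + e \qquad (8)$$

The coefficient $g_i$ of each term can be calculated individually as

$$g_i = \frac{\langle w_i, y \rangle}{\langle w_i, w_i \rangle}. \qquad (9)$$

In the OFR algorithm a criterion called the error reduction ratio (ERR) has been introduced to measure the significance of the model terms in the description of system (1) and to determine the model structure by selecting all the significant terms. The error reduction ratio due to term $w_i$ is defined as

$$\mathrm{ERR}_i = \frac{g_i^2 \langle w_i, w_i \rangle}{\langle y, y \rangle} = \frac{\langle w_i, y \rangle^2}{\langle w_i, w_i \rangle \langle y, y \rangle}. \qquad (10)$$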
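The quantities in (5) to (10) can be made concrete with a small sketch. The Python fragment below orthogonalises the regressors in a fixed order by classical Gram-Schmidt and evaluates $g_i$ and $\mathrm{ERR}_i$; the function names and the toy data are assumptions for illustration, and the regressors are assumed to be linearly independent so that no $\langle w_i, w_i \rangle$ vanishes.

```python
def dot(a, b):
    """Inner product <a, b> of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def err_along_order(columns, y):
    """Orthogonalise `columns` in the given order and return (g, err):
    the coefficients of eq. (9) and the ratios of eq. (10)."""
    basis, g, err = [], [], []
    yy = dot(y, y)
    for phi in columns:
        w = list(phi)
        for q in basis:
            a = dot(q, phi) / dot(q, q)      # entry of the unit upper triangular A
            w = [wi - a * qi for wi, qi in zip(w, q)]
        ww, wy = dot(w, w), dot(w, y)
        basis.append(w)
        g.append(wy / ww)                    # eq. (9)
        err.append(wy * wy / (ww * yy))      # eq. (10)
    return g, err


# Toy data: y lies in the span of the two regressors, so the ERR values sum to 1.
g, err = err_along_order([[1.0, 0.0, 1.0], [1.0, 1.0, 0.0]], [2.0, 1.0, 1.0])
print(g, err, sum(err))
```

Because each $g_i$ depends only on $w_i$ and $y$, adding a further term never changes the coefficients already computed, which is the decoupling the orthogonal decomposition provides.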

It is worth noting that the ERR criterion evaluates the contribution of a term considering both the form of the term and also the associated coefficient, which is essentially different from the orthogonal projection or inner product criterion used by the Projection Pursuit and Matching Pursuit algorithms (Huber 1985; Mallat and Zhang 1993; Pati et al. 1993), where the effects of the coefficients have not been considered. When all the terms are orthogonal with each other the values of ERR of the terms in a model satisfy

$$\sum_{i=1}^{M_0} \mathrm{ERR}_i = 1 - \frac{\langle e, e \rangle}{\langle y, y \rangle}, \qquad (11)$$

where the last term on the right hand side of the equation represents the noise-to-signal ratio. The error reduction ratio offers a simple, effective, and intuitive means of selecting a subset of significant terms from a large number of candidate terms in a forward regression manner. By applying the OFR algorithm and the ERR criterion, the contribution of a term can be evaluated without re-estimating all the coefficients. At each step, the term which produces the largest value of ERR among the candidate terms is selected, and the selection procedure is terminated at step $M_s$ when

$$1 - \sum_{i=1}^{M_s} \mathrm{ERR}_i \le \rho \qquad (12)$$

where $\rho$ is a desired tolerance, and this leads to a subset model of $M_s$ terms. In the application of the OFR-ERR algorithm, various criteria, such as AIC (Akaike Information Criterion), BIC (Bayesian Information Criterion), and other statistical tests, can be used to aid the termination of the term selection (Billings and Chen 1989).

To summarise, the standard orthogonal forward regression algorithm consists of the following steps:

(i) Sufficiently excite the system and measure the inputs and outputs of the system;

(ii) Specify an initial full model set of $M$ candidate terms and the value of $\rho$;

(iii) Compute the values of the ERR for each of the candidate terms and select the term which gives the largest value of ERR into the model as the first term;

(iv) At the $k$-th ($k \ge 2$) stage compute the values of the error reduction ratio for each of the $(M - k + 1)$ remaining candidate terms by assuming that each is the $k$-th term in the selected model and perform the corresponding orthogonalisation. The term that gives the largest value of the error reduction ratio is then selected into the model as the $k$-th term. If condition (12) is satisfied, finish the process and go to (v). Otherwise set $k = k + 1$ and repeat step (iv);

(v) The final model contains $M_s$ terms and the parameter estimates can be calculated using a least squares formula.

A geometric interpretation of the above procedure has been given by Chen, Billings & Luo (1989). Consider $y$ as a vector in the $N$ dimensional Euclidean space $\mathbb{R}^N$ where the $\varphi_i$ are linearly independent vectors in this space. Each of the vectors $\varphi_i$ spans a one dimensional subspace of $\mathbb{R}^N$. Denote the subspace which is spanned by $\varphi_i$ as $S_{\varphi_i}$. At the first step, the ERR for each $\varphi_i$ measures the orthogonal projection of $y$ onto each of the subspaces. The subspace $S_{\varphi_{i_1}}$ which gives the maximal projection is determined and the corresponding term $\varphi_{i_1}$ is selected as the first term, which is denoted as $w_1$.

At the second step, consider the orthogonal projection of $y$ onto a two dimensional space $S_{\varphi_{i_1}, \varphi_i}$, where $i \in \{1, 2, \ldots, M\} \setminus \{i_1\}$, which is spanned by $\varphi_{i_1}$ and each of the remaining $(M - 1)$ vectors. Since at each step $\varphi_i$ has been orthogonalised into $w_2^{(i)}$, the orthogonal projection of $y$ onto $S_{\varphi_{i_1}, \varphi_i}$ can be determined by evaluating the orthogonal projection of $y$ onto $w_2^{(i)}$. The term $\varphi_{i_2}$ which spans the subspace $S_{\varphi_{i_1}, \varphi_{i_2}}$ on which the orthogonal projection of $y$ reaches the maximum is selected as the second term. The orthogonalised vectors $w_1$ and $w_2$ comprise an orthogonal basis of the subspace $S_{\varphi_{i_1}, \varphi_{i_2}}$.

At the $k$-th step, the orthogonal projections of $y$ onto $k$-dimensional subspaces are considered. The selected term and the previous $k-1$ terms span the subspace $S_k = S_{\varphi_{i_1}, \varphi_{i_2}, \ldots, \varphi_{i_k}}$, on which the projection of $y$ is maximal.
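Steps (iii) to (v) of the standard algorithm can be sketched as a greedy loop. This is a minimal Python illustration under stated assumptions, not the authors' implementation: the name `ofr`, the near-dependence threshold `1e-12` and the toy dictionary are hypothetical, and the stopping rule is the tolerance test of condition (12).

```python
def dot(a, b):
    """Inner product <a, b>."""
    return sum(x * y for x, y in zip(a, b))


def ofr(dictionary, y, rho):
    """Forward selection by largest ERR until 1 - SERR <= rho.
    Returns the selected (index, ERR) pairs in order of selection."""
    yy = dot(y, y)
    selected, basis, serr = [], [], 0.0
    remaining = list(range(len(dictionary)))
    while remaining and 1.0 - serr > rho:
        best = None
        for i in remaining:
            # Orthogonalise candidate i against the terms already selected.
            w = list(dictionary[i])
            for q in basis:
                a = dot(q, dictionary[i]) / dot(q, q)
                w = [wi - a * qi for wi, qi in zip(w, q)]
            ww = dot(w, w)
            if ww <= 1e-12:          # assumed guard: skip near-dependent candidates
                continue
            e = dot(w, y) ** 2 / (ww * yy)
            if best is None or e > best[0]:
                best = (e, i, w)
        if best is None:
            break
        e, i, w = best
        selected.append((i, e))
        basis.append(w)
        remaining.remove(i)
        serr += e
    return selected


# Toy dictionary: y = 2*phi_0 + 1*phi_1, and phi_2 is redundant.
toy = [[1.0, 0.0, 1.0], [1.0, 1.0, 0.0], [0.0, 1.0, 1.0]]
result = ofr(toy, [3.0, 1.0, 2.0], rho=0.01)
print(result)
```

In this toy case the loop stops after two terms because their ERR values already sum to one, so the redundant third candidate is never admitted.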

Compared to traditional forward regression methods, the OFR algorithm is computationally efficient because it successfully avoids the re-estimation of the parameters and evaluates the contribution of each term individually. The OFR is also extremely efficient in the term selection. At the $k$-th step the regression analyses are performed on the orthogonal complement of the subspace spanned by the previous $k-1$ terms. This successfully eliminates the information redundancy in the model and produces a parsimonious model. Accordingly, the OFR can in most cases obtain the optimal solution with only forward selection rather than stepwise regression.

However, the classic OFR algorithm may occasionally give a suboptimal model because of the information overlap among the nonorthogonal terms. For example, a wrong term can be selected at the first step because the term carries the information from more than one correct term. This often happens at the first step because the terms have not been orthogonalised. In this paper, a new iterative OFR algorithm will be introduced to solve this problem, to improve the performance of the classic OFR algorithm, and to provide a relatively simple and easy to use algorithm for term selection in complex dynamic models.

3. Iterative orthogonal forward regression

Following the discussion in the previous section, the OFR algorithm selects at each step the best term, which forms an optimal subspace with the existing terms. However optimal choices at every step cannot always guarantee a global optimum. Although the classic OFR algorithm is always very efficient, OFR can sometimes produce a suboptimal solution rather than an optimal one (Billings et al. 1989). This happens because the candidate terms in the initial term dictionary are not orthogonal with each other and the information which is represented by these terms overlaps. The value of the ERR may therefore depend on the order in which the corresponding terms enter the model. In this section, a very simple example is first studied in detail to explain why the basic OFR algorithm sometimes converges to a local optimum.

Consider the problem of the regression of a vector $y$ using three linearly independent vectors $\varphi_1$, $\varphi_2$, and $\varphi_3$ in a three-dimensional space. Define the regression matrix as

$$\Phi = [\varphi_1\ \ \varphi_2\ \ \varphi_3] = \begin{bmatrix} 1 & 3 & 2.1 \\ 2 & 2 & 1.8 \\ 3 & 1 & 2.1 \end{bmatrix}, \qquad y = \begin{bmatrix} 2.3 \\ 2.2 \\ 2.1 \end{bmatrix}. \qquad (13)$$

It is easy to show that vector $y$ in this example is actually a linear combination of vectors $\varphi_1$ and $\varphi_2$, satisfying

$$y = 0.5\varphi_1 + 0.6\varphi_2. \qquad (14)$$

Equation (14) gives the model which represents the accurate relationship between $y$ and the independent vectors. Figure 1 shows the geometric representation of the four vectors in a three-dimensional Euclidean space. Observe that vector $y$ is closer to $\varphi_3$ than to the other vectors although $y$ actually stays in the plane spanned by $\varphi_1$ and $\varphi_2$. This is because vector $\varphi_3$ can be decomposed as

$$\varphi_3 = 0.5\varphi_1 + 0.5\varphi_2 + \varepsilon_3 \qquad (15)$$

where $\varepsilon_3 = [0.1\ \ {-0.2}\ \ 0.1]^T$. Vector $\varphi_3$ is actually composed of two components: the component which lies in the plane spanned by $\varphi_1$ and $\varphi_2$, and a small component $\varepsilon_3$ which is perpendicular to the plane and makes no contribution to the explanation of the dependent vector $y$. Hence vector $\varphi_3$ possesses both the information of $\varphi_1$ and the information of $\varphi_2$.

Fig 1 The geometric relation of the vectors in (13)

The standard OFR process is to find a sequence of nesting subspaces $S_1 \subset S_2 \subset \cdots \subset S_{M_s}$ step by step. Each subspace $S_k$, which is spanned by the $(k-1)$ vectors from $S_{k-1}$ and a newly selected vector from the regression matrix, is optimal at each step. Here optimal means the orthogonal projection of $y$ on $S_k$ is maximal. In order to decouple the contribution of each term to the total projection, the $k$-th term is orthogonalised to the $(k-1)$ terms selected in the earlier steps so that the projection can be calculated stepwise. The $k$ orthogonal terms form a $k$-dimensional orthogonal basis of the space $S_k$. Denote the sum of the ERR values at the $k$-th step as $\mathrm{SERR}_k = \sum_{i=1}^{k} \mathrm{ERR}_i$. The sum of the ERR values represents a normalised measurement of the projection.

Mao and Billings (1997) argued that when using the orthogonal algorithm to detect the model structure, previously selected terms can influence the selection of later terms. Therefore the detection of a minimal model structure can be considered as a search for the optimal orthogonalisation path, which is defined as the order in which candidate terms are orthogonalised into the regression equation. In order to analyse the effects of orthogonalisation paths, all the possible orthogonalisation paths are listed in Table 1. For this example, there are a total number of 6 different orthogonalisation paths in which three terms can be orthogonalised into a model.

Table 1 ERR (%) along the six different orthogonalisation paths for eq. (14)

No. | Path 1: Term, ERR | Path 2: Term, ERR | Path 3: Term, ERR | Path 4: Term, ERR | Path 5: Term, ERR | Path 6: Term, ERR
1 | $\varphi_1$, 83.02 | $\varphi_1$, 83.02 | $\varphi_2$, 88.21 | $\varphi_2$, 88.21 | $\varphi_3$, 99.37 | $\varphi_3$, 99.37
2 | $\varphi_2$, 16.98 | $\varphi_3$, 16.40 | $\varphi_1$, 11.79 | $\varphi_3$, 11.39 | $\varphi_1$, 0.06 | $\varphi_2$, 0.23
3 | $\varphi_3$, 0 | $\varphi_2$, 0.58 | $\varphi_3$, 0 | $\varphi_1$, 0.40 | $\varphi_2$, 0.57 | $\varphi_1$, 0.40
SERR | 100% | 100% | 100% | 100% | 100% | 100%

Along the six different paths, the orthogonalised terms form six different orthogonal bases which are shown in Fig 2. Projecting $y$ on each of the orthogonal bases, the ERR values are given in Table 1.

Fig 2 Six different orthogonal bases for eq. (14)
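The path dependence of the ERR values can be checked numerically. The Python sketch below evaluates the ERR along all six orthogonalisation paths of the three-vector example, taking $\varphi_1 = (1, 2, 3)^T$, $\varphi_2 = (3, 2, 1)^T$, $\varphi_3 = (2.1, 1.8, 2.1)^T$ and $y = (2.3, 2.2, 2.1)^T$. These numerical values are an assumption of this sketch: they are the ones consistent with $y = 0.5\varphi_1 + 0.6\varphi_2$, with $\varphi_3 = 0.5\varphi_1 + 0.5\varphi_2 + [0.1\ {-0.2}\ 0.1]^T$ and with the first-step ERR values quoted in the table.

```python
from itertools import permutations

# Assumed example vectors (see the lead-in above).
PHI = {1: [1.0, 2.0, 3.0], 2: [3.0, 2.0, 1.0], 3: [2.1, 1.8, 2.1]}
Y = [2.3, 2.2, 2.1]


def dot(a, b):
    """Inner product <a, b>."""
    return sum(x * y for x, y in zip(a, b))


def err_path(order):
    """ERR (in percent) of each term when orthogonalised in `order`."""
    basis, out = [], []
    yy = dot(Y, Y)
    for idx in order:
        w = list(PHI[idx])
        for q in basis:
            a = dot(q, PHI[idx]) / dot(q, q)
            w = [wi - a * qi for wi, qi in zip(w, q)]
        out.append(100.0 * dot(w, Y) ** 2 / (dot(w, w) * yy))
        basis.append(w)
    return out


table = {order: err_path(order) for order in permutations((1, 2, 3))}
for order, errs in table.items():
    print(order, [round(e, 2) for e in errs], round(sum(errs), 2))
```

Every path sums to 100% because $y$ lies in the span of the three vectors, while the share credited to each individual term changes with the order of orthogonalisation.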

An optimal model means the smallest model which includes all the correct terms. In the language of the ERR framework, an optimal model includes the smallest number of terms but produces the largest sum of ERR values. In this example, the optimal model is composed of the two correct terms $\varphi_1$ and $\varphi_2$. The classic OFR algorithm searches for a solution along a path where the sum of the ERR increases at the fastest speed. In this example, although vector $y$ is on the plane spanned by $\varphi_1$ and $\varphi_2$, the first term that will be selected by the OFR algorithm is $\varphi_3$ because $\varphi_3$ is the term which is closest to $y$ and gives the largest projection, see Fig 1. Therefore, the OFR will orthogonalise the regressors along path 6 in Table 1, $\varphi_3 \to \varphi_2 \to \varphi_1$. However following path 6 the obtained model is not optimal. When a specific tolerance is taken, for example $\rho = 0.50\%$, a correct term will be missed along paths 4 and 6 while a redundant term will be selected along paths 2 and 5. However along both paths $\varphi_1 \to \varphi_2 \to \varphi_3$ and $\varphi_2 \to \varphi_1 \to \varphi_3$, the search process produces an optimal model which consists of the two correct terms for any tolerance less than 10%. This means that along a correct orthogonalisation path the algorithm will be much more robust and will yield optimal results, and the cutoff is much more obvious and easier to select.

In a forward regression process, since the terms are selected one by one into a model, all the orthogonal paths comprise a solution tree. At the first step, there are $M$ options, which represent the direct child nodes of the root node. These direct child nodes divide the whole tree into $M$ branches and each of the branches is also of a tree structure. There are $(M-1)$ options at the second step and $(M-2)$ options at the third step, and so on. There are a total number of $M!$ different orthogonalisation paths. A forward regression method starts from the common root node and selects terms one by one from an upper layer to a lower layer. The ERR values assign a weight to each branch of the tree. Each path from the root node to a leaf node represents a complete orthogonalisation path along which a corresponding model can be obtained. For the example under consideration, the solution tree is shown in Fig 3.
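The effect of the tolerance along each path can be reproduced with a short sketch. The Python fragment below follows each of the six fixed paths of the three-vector example and stops as soon as $1 - \mathrm{SERR} \le \rho$ with $\rho = 0.5\%$. The numerical vectors ($\varphi_1 = (1, 2, 3)^T$, $\varphi_2 = (3, 2, 1)^T$, $\varphi_3 = (2.1, 1.8, 2.1)^T$, $y = (2.3, 2.2, 2.1)^T$) and the helper name `model_along_path` are assumptions of this illustration.

```python
from itertools import permutations

# Assumed example vectors, consistent with y = 0.5*phi1 + 0.6*phi2.
PHI = {1: [1.0, 2.0, 3.0], 2: [3.0, 2.0, 1.0], 3: [2.1, 1.8, 2.1]}
Y = [2.3, 2.2, 2.1]


def dot(a, b):
    """Inner product <a, b>."""
    return sum(x * y for x, y in zip(a, b))


def model_along_path(order, rho):
    """Follow a fixed orthogonalisation path; stop once 1 - SERR <= rho."""
    basis, kept, serr = [], [], 0.0
    yy = dot(Y, Y)
    for idx in order:
        w = list(PHI[idx])
        for q in basis:
            a = dot(q, PHI[idx]) / dot(q, q)
            w = [wi - a * qi for wi, qi in zip(w, q)]
        serr += dot(w, Y) ** 2 / (dot(w, w) * yy)
        basis.append(w)
        kept.append(idx)
        if 1.0 - serr <= rho:
            break
    return kept


models = {order: model_along_path(order, 0.005) for order in permutations((1, 2, 3))}
for order, kept in models.items():
    print("path", order, "-> model terms", kept)
```

Only the two paths that begin with the pair $\varphi_1$, $\varphi_2$ return exactly the two correct terms; the paths through $\varphi_3$ either stop before $\varphi_1$ is admitted or carry $\varphi_3$ as a redundant third term.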

Fig 3 Solution tree of a forward regression algorithm for eq. (14)

For a globally optimal solution, we would need to search on the whole tree and evaluate all the solutions on the different branches. Observe that there are many opportunities to find a correct solution by exhausting all the paths on the solution tree. For a correct model which consists of $m$ terms there are a total number of $m!$ different permutations, where $(\cdot)!$ represents the factorial operation, and each represents a correct solution on a different branch of the solution tree. There are therefore a total number of $m!$ optimal solutions on the tree. These optimal solutions can only happen on the paths which start from a correct term. Hence only the sub-trees which start from a correct term need to be considered when searching for an optimal solution. In the above example, there are two optimal orthogonalisation paths and only the branches starting from $\varphi_1$ and $\varphi_2$ may include the optimal solution.

A classic OFR algorithm can occasionally search along a wrong orthogonalisation path by picking a wrong term in the first step. As a result, the search process will be along a wrong path and produce a sub-optimal model. From the root node, a searching process can be forced onto a certain sub-tree by fixing the first term before the searching process proceeds. Therefore an intuitive idea is to use the OFR algorithm to search on each of the sub-trees. A sub-optimal model is obtained on each sub-tree. Compare these obtained models and choose the best one as the final result. The optimal model will be among these sub-optimal models. Remember that there are a total number of $m!$ opportunities to find an optimal model. However searching on all the sub-trees is not necessary because any search path starting from a wrong term will never give an optimal model. Therefore, only the sub-trees starting from a correct term need to be considered. However, it is unknown which term is a correct term. Nevertheless a suboptimal model can always be a good starting point for the search for an optimal solution. It is reasonable to assume that a suboptimal model consists of a majority of correct terms and a few incorrect terms. Therefore an iterative learning algorithm can be proposed to find an efficient and intuitive solution to this problem.

The new iterative OFR algorithm consists of the following steps.

i). Preset a tolerance $\rho$ and apply the standard OFR algorithm on the whole term dictionary to produce a suboptimal term set $\Phi_s = \{\varphi_{s_1}, \varphi_{s_2}, \ldots, \varphi_{s_{M_s}}\}$;

ii). Select a small number $\delta$ as an amendment to the tolerance $\rho$ in the first step (see Remark 2 for the choice of $\delta$);

iii). Select one of the terms $\varphi_{s_j}$ ($j = 1, 2, \ldots, M_s$) in $\Phi_s$ as a preselected first term and search the other terms on the term set $D \setminus \{\varphi_{s_j}\}$ to construct a suboptimal solution satisfying $1 - \sum_i \mathrm{ERR}_i \le \rho + \delta$;

iv). Repeat iii) for all the $\varphi_{s_j}$ in $\Phi_s$ and obtain $M_s$ suboptimal models;

v). Compare the obtained suboptimal models and choose the best one as the final model $\Phi_{op}$.

NARMAX models, which typically include highly correlated unknown noise terms, can be identified by following the iterative procedure below. Because the noise terms are not known a priori, the model prediction errors (residuals) will be used to approximate the noise terms.

i). Assume the noise terms are zero and identify the model which does not include the noise terms using the iOFR algorithm.

ii). Produce the prediction errors using the best model obtained in step i).

iii). Use the residuals as the noise terms and construct the new term dictionary in which the delayed noise terms are included.

iv). Identify the full NARMAX model from the dictionary obtained in step iii) using the iOFR algorithm.

v). Repeat steps ii) - iv) until a satisfactory model is obtained. Each time the residuals are calculated based on the newly identified model. Validate the model.

Remarks:

1) The searching process can be further iterated by selecting $\Phi_{op}$ as the suboptimal solution in step i) and repeating steps ii) ~ v) for a better result. This time only terms in the set $\{\varphi_{op} : \varphi_{op} \in \Phi_{op} \text{ and } \varphi_{op} \notin \Phi_s\}$ need to be considered. However further iterations are often not needed. An optimal solution may be found in the first iteration because the OFR algorithm itself is powerful in searching for an optimal solution, but more steps may be needed in some cases.

2) The amendment of the tolerance $\delta$ often takes a small non-negative number. A tolerance $\rho$ means the sum of ERR of the selected terms in a model is no less than $(1 - \rho)$. Hence the larger the tolerance, the fewer terms will be selected in the obtained model. A positive $\delta$ means $(\rho + \delta)$ is a stricter tolerance and the iterative process will eliminate some of the redundant terms to produce a smaller model. On the contrary, a negative $\delta$ means $(\rho + \delta)$ is a looser tolerance and the iterative process will select more terms into the model and produce a more accurate description. An increase of the tolerance by $\delta$ can significantly tighten the search process to produce a smaller model. At the same time a small $\delta$ will not expel the correct terms. For example, a good choice of the absolute value of $\delta$ can be the ERR value of the least significant term in $\Phi_s$, that is, $|\delta| = \min_i \mathrm{ERR}_i$. This follows because after all the correct terms have been selected, the remaining terms are considered as redundant terms and the corresponding contributions will be much smaller than the one with the smallest contribution in the suboptimal model. Since the new iOFR searches for solutions on several sub-trees in parallel and chooses the best model as the result, it can be expected that the model which is obtained in the iterative process will be no worse than the suboptimal model in the first step when the tolerance is kept unchanged. Hence another often used choice is $\delta = 0$. When correct terms instead of the wrong terms are selected into the model, the sum of ERR will reach $(1 - \rho)$ more quickly and a more parsimonious model is obtained.

3) In the classic OFR algorithm, the selection of the tolerance is crucial for the identification of the model. Additionally, the selection of the tolerance is often problem-dependent. For example, the tolerance may depend on the noise level in the measurements of the inputs and outputs. A tight tolerance may expel some correct terms while a loose tolerance may cause overfitting of the data. In the classic OFR algorithm, a wrong term is selected into a model because the wrong term may include the information of more than one correct term, and this becomes more significant at the first step of the forward selection process. Selecting the wrong term into the model at an early stage will make the correct terms much less significant, and the correct terms will be selected into the model later to compensate for the information which has been missed. When the remaining unexplained information is small and comparable to the effects of noise, a slight change in the tolerance may lead to a different model, that is, the model becomes very sensitive to the tolerance. Hence accurate determination of the tolerance under which no correct term will be missed can be difficult. This can be avoided by using the new iOFR algorithm. When any of these correct terms has been forced to be the first term, the contribution of the wrong term becomes much less significant because part of the information has been explained by the pre-determined correct term which has been selected in the previous step. As a result, there is a much smaller possibility that the wrong term will be selected at the following steps. Along a correct orthogonalisation path all the correct terms will be significant and the OFR algorithm will be more robust to the value of the tolerance. This has been observed in the previous example where any tolerance which is not greater than 10% will lead to the correct model. Hence, the setting of the tolerance can be relatively flexible in the new iOFR algorithm. This feature of the iOFR can be very useful in the identification of real systems.

4) Unlike a coarse-to-fine algorithm which starts from a sub-optimal model and purely eliminates redundant terms, the iOFR algorithm searches for terms on the whole dictionary rather than on the suboptimal set, except for
the pre-determined term. This enables the iOFR algorithm to find the correct terms which have been missed by the sub-optimal model obtained in the first stage and to obtain a better solution. In other words, the new iOFR algorithm does not need the suboptimal model obtained at the first step to be a sufficient model. This will be observed in the example in Section 4.2.

The new iOFR algorithm may occasionally give a suboptimal solution since the algorithm only tries different routes at the first step and the remaining term selection could still follow a suboptimal trace. However, this will happen with a very low probability.

Firstly, the new iOFR searches for the optimal model in parallel along several different paths on the whole solution tree. According to the previous discussion, there are a total number of m! opportunities to find the optimal model and hence the probability that the iOFR can find the optimal solution will increase significantly. The improvement in the possibility of finding the optimal solution will be discussed below.

Secondly, along the different search paths the corresponding orthogonal basis will be quite different and the ERR assigned to each term will change accordingly. The significant terms will then be selected into the model in a different order. This has been observed in the example given in Fig 2. To some extent, this process works like the heating process in a simulated annealing algorithm, where the metal atoms in the material will be rearranged to build a better crystal structure in the cooling process.

Finally, according to the previous discussion, the significance of a wrong term which may be selected in the first step because it contains information from the correct terms will be greatly reduced when any of the correct terms has been selected first. In the selection of the remaining terms, the wrong term will be less likely to be selected although the search is still along the fastest increasing path of the sum of ERR values.
Based on the above discussion, these are probably the best solutions available because the alternative full optimal search (Mao and Billings 1997) involves a huge computational overload that is just not feasible when studying real data sets, where it is often necessary to try lags over the range 1-30 in the initial search. Noise model terms and MIMO (multi-input-multi-output) systems just further aggravate this problem.
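
The two-stage search discussed in the remarks above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `ofr` and `iofr`, the use of modified Gram-Schmidt orthogonalisation, the stopping rule "sum of ERR reaches one minus the tolerance", and the rule for picking the final model (fewest terms first, then largest sum of ERR) are all choices made for this sketch.

```python
import numpy as np

def ofr(P, y, tol, first=None):
    """Forward selection by error reduction ratio (ERR).

    P: (N, M) candidate regressors, y: (N,) output. Selection stops once
    the sum of ERR reaches 1 - tol. `first` (hypothetical argument) forces
    the first selected column. Returns (selected indices, ERR values).
    """
    M = P.shape[1]
    yy = float(y @ y)
    selected, errs, basis = [], [], []
    while len(selected) < M and sum(errs) < 1.0 - tol:
        forced = first is not None and not selected
        candidates = [first] if forced else range(M)
        best, best_err, best_w = None, -1.0, None
        for j in candidates:
            if j in selected:
                continue
            w = P[:, j].astype(float)
            for q in basis:  # modified Gram-Schmidt against chosen terms
                w = w - (q @ w) / (q @ q) * q
            ww = float(w @ w)
            if ww <= 1e-12:  # numerically dependent on chosen terms
                continue
            err = float(w @ y) ** 2 / (ww * yy)
            if err > best_err:
                best, best_err, best_w = j, err, w
        if best is None:
            break
        selected.append(best)
        errs.append(best_err)
        basis.append(best_w)
    return selected, errs

def iofr(P, y, tol, amendment=0.0):
    """Run OFR once, then restart it from each term of the first model."""
    sub, _ = ofr(P, y, tol)  # first stage: suboptimal model
    runs = [ofr(P, y, tol + amendment, first=j) for j in sub]
    # Assumed selection rule: fewest terms, then largest sum of ERR.
    best = min(runs, key=lambda r: (len(r[0]), -sum(r[1])))
    return sub, sorted(best[0])

# Toy usage: only columns 0 and 3 drive the output.
rng = np.random.default_rng(0)
P = rng.standard_normal((500, 6))
y = 2.0 * P[:, 0] - P[:, 3] + 0.01 * rng.standard_normal(500)
sub, final = iofr(P, y, tol=0.01)
```

Each restart searches the whole dictionary after its forced first term, which is what lets a restart recover a term that the first-stage model missed.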

We should ask what the probability is that the new iOFR algorithm will produce an optimal solution. Assume that some of the terms of the suboptimal model obtained at the first stage are correct, and that the OFR algorithm can find an optimal solution with a probability of p along each path starting with a correct term. The iOFR algorithm will search for the optimal solution along parallel paths, one for each term of the suboptimal model. The probability that the iOFR will find the optimal solution on at least one path is equal to one minus the probability that it fails to find the optimal solution on all the paths which start with a correct term, seeing that the searches along the paths which start with a wrong term make no contribution to the probability. This probability will be much higher than the probability of the single path search when the number of correct terms is large enough.

For example, consider a suboptimal model which consists of 20 terms in which 10 terms are correct. The OFR algorithm searches for the optimal solution along 10 parallel paths starting with a correct term, with a probability of 50% on each path. It is easy to calculate that the probability that the iOFR algorithm will find the optimal solution is $1 - (1 - 0.5)^{10} \approx 99.9\%$, which is much higher than the 50% success probability of the single path search algorithm. In fact, the probability for the classic OFR to successfully find the optimal solution is much better than 50%. Even the classical OFR can produce an optimal solution in most cases, except in some special situations.

The new iOFR algorithm is also computationally efficient. As discussed in the paper (Mao and Billings 1997), the total number of orthogonalisation paths for a dictionary is the factorial of the number of terms in that dictionary. Searching for an optimal solution by exhausting all these paths is computationally just not practical. Applying the genetic-algorithm-assisted MMSD (Minimal Model Structure Detection) algorithm, the search space can be reduced to a much more practical number, which can still be large.
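
The probability argument above is easy to check numerically. The helper below is only an illustrative sketch; the function and argument names are hypothetical, and it assumes, as the text does, that the searches along different paths succeed independently with the same probability p.

```python
def prob_optimal_found(n_correct_paths, p):
    """Probability that at least one of the paths starting with a correct
    term finds the optimal model, assuming independent paths that each
    succeed with probability p."""
    return 1.0 - (1.0 - p) ** n_correct_paths

# The worked example from the text: 10 correct starting terms, p = 0.5.
example = prob_optimal_found(10, 0.5)
```

With a single path the function returns p itself, so the gain comes entirely from the number of parallel paths that start with a correct term.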
Comparatively, in the new iOFR algorithm, the number of searching paths depends on the number of terms in the sub-optimal model obtained at the first stage, which is much less than the size of the full dictionary, not to mention the number of all the combinations. Consider the example in the Mao and Billings paper where 20 terms were included in the dictionary. An exhaustive search needs to evaluate $2.433 \times 10^{18}$ paths; the
MMSD algorithm searches along $20^3$ paths. Using the new iOFR algorithm, no more than 20 best searching paths, which start from the terms in a suboptimal model, need to be evaluated. Therefore it can be concluded that the new iOFR algorithm is very efficient. A discussion of the computational complexity of the classic OFR algorithm has been given in the references.

The performance of the new iOFR algorithm can be improved by appropriately increasing the number of parallel searches. For example, a smaller tolerance in the first stage will lead to a sub-optimal model with more terms, and the global search at the second stage will be carried out on more parallel sub-trees. The iOFR algorithm can also be improved by increasing the number of pre-selected terms, where a subset instead of a single term is pre-determined. The more correct terms are pre-determined, the less likely it is that wrong terms will be selected into the final model. However, both improvements lead to an increase in computational complexity.

An alternative OFR algorithm has been proposed by Piroddi and Spinelli (2003) based on minimising the error of the model predicted or simulated output rather than that of the one-step-ahead predictions. This is very similar to the algorithm by Billings and Mao (1998). The simulated output based algorithms have been shown to be effective where the data is grossly oversampled and where the input is badly designed and not persistently exciting. However, these solutions are hugely computationally expensive, so they cannot be realistically applied to complex models where searches over many lags, MIMO models, and noise are involved, all of which are typical when dealing with real data sets rather than very simple simulated examples. The new iOFR offers a much simpler solution.

4. Test examples

Several examples are used to illustrate the new iOFR algorithm. While the literature is full of examples where the classical OFR algorithm works extremely well, each example below has been deliberately chosen from the small number of past results where the standard OFR has been shown to give non-ideal results. In other words, worst case examples are used below because on typical
examples where the data is sampled correctly and the input is persistently exciting the algorithm works perfectly every time.

4.1. A linear example

This linear example will be used to show that the OFR algorithm can sometimes give a suboptimal model which includes redundant terms. By applying an iterative process, however, the OFR algorithm is greatly improved and is able to produce an optimal solution. This example was taken from (Wei and Billings 2008). Consider the system

$$y(k) = -1.7\,y(k-1) - 0.8\,y(k-2) + u(k-1) + 0.8\,u(k-2) + e(k) \qquad (6)$$

where y(k), u(k), and e(k) represent the output, input and noise of the system. The input is uniformly distributed white noise, u(k) ~ U(-1,1). The noise is normally distributed white noise, e(k) ~ N(0, 0.1^2). A total number of 1000 input and output data are measured for the system identification. Define a candidate term dictionary which is composed of the delayed input and output terms {y(k-1), y(k-2), y(k-3), y(k-4), y(k-5), u(k-1), u(k-2), u(k-3), u(k-4), u(k-5)}. Applying the classic OFR algorithm, the identified model is shown in Table 2. A total number of seven terms have been selected into the model, which include all the correct terms. However, three redundant terms have also been selected.

Table 2 Results given by the standard OFR algorithm for system (6)

No.   Term     ERR       Coefficient   Standard Deviation
1     y(k-1)   67.828    -1.6982       0.02894
2     u(k-1)   26.6089   0.9994        0.00557
3     y(k-4)   2.8635    0.007         0.0235
4     u(k-4)   0.5968    -0.002        0.0396
5     u(k-3)   0.4825    0.0036        0.0325
6     u(k-2)   0.42      0.792         0.02944
7     y(k-2)   0.3908    -0.7992       0.03679
SERR  --       99.8      --            --

In this example the classic OFR algorithm gives an incorrect model because of under-sampling. System (6) can be considered as a discretisation of a second order system with a very low sampling
frequency. The frequency response function and impulse response of system (6) (see Figure 4) show that the natural frequency of the system is around 0.45 Hz. The sampling frequency in this example is 1 Hz, which is only about 2.2 times the natural frequency. The severe under-damping in the system exacerbates this problem. The impulse response shows that the system needs a long time to settle down and oscillates with a period of about 2.2 s. This means the response of the system will repeat every 2.2 s (about 2 sampling intervals). Since the output is a convolution of the input with the impulse response function, y(k) may have a similar pattern to y(k-2). This explains why y(k-4) may appear in the final model: the term looks like y(k-2) for this sampling and data case. The effects of sampling time on nonlinear system identification have been studied by Billings and Aguirre (1995), and Billings (2013).

Figure 4 Frequency response function and impulse response of system (6)

Taking each term in Table 2 as the first term and applying the iOFR algorithm with an amendment of the tolerance of 0.3908%, which is the least value of the ERR in Table 2, yields the results of the iOFR process given in Table 3. Seven different models were obtained. Models 6 and 7 have the simplest structure, where only four terms are used to produce the best SERR. In fact both models consist of the four correct terms. Models 1, 2, 3, 4, and 5 missed the correct term y(k-2) under the amended tolerance and produce a relatively smaller SERR. Therefore model 6 and model 7 are selected as the final result of the iOFR process. Notice that both models include the same terms with the same associated coefficients. All the
redundant terms in Table 2 have been successfully eliminated in the iterative process. The iOFR algorithm, which starts from the result of a standard OFR process, gives the optimal model.

Table 3 Results produced by the iOFR algorithm for system (6)

Model 1
No.   Term      ERR       Coeff
1     y(k-1)*   67.828    -.08
2     u(k-1)    26.6089   0.997
3     y(k-4)    2.8635    -0.2546
4     u(k-4)    0.5968    0.2622
5     u(k-3)    0.4825    -0.2354
6     u(k-2)    0.42      0.748
SERR  --        98.79     --

Model 2
No.   Term      ERR       Coeff
1     u(k-1)*   23.8334   0.997
2     y(k-1)    70.5972   -.08
3     y(k-4)    2.8635    -0.2546
4     u(k-4)    0.5968    0.2622
5     u(k-3)    0.4825    -0.2354
6     u(k-2)    0.42      0.748
SERR  --        98.79     --

Model 3
No.   Term      ERR       Coeff
1     y(k-4)*   3.509     -0.2546
2     y(k-1)    66.7799   -.08
3     u(k-1)    27.0053   0.997
4     u(k-4)    0.5968    0.2622
5     u(k-3)    0.4825    -0.2354
6     u(k-2)    0.42      0.748
SERR  --        98.79     --

Model 4
No.   Term      ERR       Coeff
1     u(k-4)*   5.64      0.2622
2     y(k-1)    62.266    -.08
3     u(k-1)    26.822    0.997
4     y(k-4)    3.2483    -0.2546
5     u(k-3)    0.4825    -0.2354
6     u(k-2)    0.42      0.748
SERR  --        98.79     --

Model 5
No.   Term      ERR       Coeff
1     u(k-3)*   8.9466    -0.3878
2     y(k-1)    59.2524   -1.2489
3     u(k-1)    26.2424   0.9987
4     y(k-3)    3.3286    0.3865
5     u(k-2)    .242      0.348
SERR  --        99.0      --

Model 6
No.   Term      ERR       Coeff
1     u(k-2)*   8.5449    0.788
2     y(k-1)    49.385    -.695
3     u(k-1)    26.6444   0.9992
4     y(k-2)    4.609     -0.794
SERR  --        99.8      --

Model 7
No.   Term      ERR       Coeff
1     y(k-2)*   36.7635   -0.794
2     y(k-1)    32.6657   -.695
3     u(k-1)    26.8377   0.9992
4     u(k-2)    2.9095    0.788
SERR  --        99.8      --

* means the term determined first.

4.2. A nonlinear example

This example is taken from (Mao and Billings 1997). System (7) has been widely used as a benchmark example for the study of variations of OFR algorithms and for comparison of OFR with other algorithms (Baldacchino et al.). In this example, it will be shown that the iOFR can produce an optimum solution even when some correct terms are not selected in the first OFR step. Consider the nonlinear system

$$y(k) = 0.2\,y^3(k-1) + 0.7\,y(k-1)u(k-1) + 0.6\,u^2(k-2) - 0.5\,y(k-2) - 0.7\,y(k-2)u^2(k-2) + e(k) \qquad (7)$$

The system is excited with a uniformly distributed white noise u(k) ~ U(-1,1) and the output y(k) is disturbed by a normally distributed white noise e(k) ~ N(0, 0.1^2). A total number of 1000 input and output data were used for the system identification.
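
The data set described above could be generated along the following lines. This is a hedged sketch rather than the authors' simulation code: it assumes system (7) has the commonly quoted benchmark form written in the comment, a noise standard deviation of 0.1, zero initial conditions, and an arbitrary random seed. The function name `simulate_system7` is hypothetical.

```python
import numpy as np

def simulate_system7(n=1000, noise_std=0.1, seed=0):
    """Simulate the nonlinear benchmark, assumed here to be
    y(k) = 0.2 y^3(k-1) + 0.7 y(k-1) u(k-1) + 0.6 u^2(k-2)
           - 0.5 y(k-2) - 0.7 y(k-2) u^2(k-2) + e(k)."""
    rng = np.random.default_rng(seed)
    u = rng.uniform(-1.0, 1.0, n)       # input: uniform white noise
    e = rng.normal(0.0, noise_std, n)   # disturbance: Gaussian white noise
    y = np.zeros(n)                     # zero initial conditions (assumed)
    for k in range(2, n):
        y[k] = (0.2 * y[k - 1] ** 3
                + 0.7 * y[k - 1] * u[k - 1]
                + 0.6 * u[k - 2] ** 2
                - 0.5 * y[k - 2]
                - 0.7 * y[k - 2] * u[k - 2] ** 2
                + e[k])
    return u, y

u, y = simulate_system7()
```

The first two output samples are left at zero, so in practice a short initial segment might be discarded before identification.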
Polynomials of up to third order in the delayed inputs and outputs {y(k-1), y(k-2), y(k-3), y(k-4), u(k-1), u(k-2), u(k-3)} were used to model the nonlinear system. A total number of 120 terms were included in the term dictionary. Applying the OFR algorithm yields a five term model, which is shown in Table 4. Notice that the model in Table 4 includes a redundant term $y(k-4)u^2(k-2)$ but misses a correct term $y(k-2)u^2(k-2)$.
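
The size of such a polynomial dictionary can be checked by enumerating the monomials directly. The sketch below is illustrative only; it assumes the dictionary contains a constant term together with every monomial of degree one to three in the seven lagged variables, which gives 1 + 7 + 28 + 84 = 120 terms. The helper name `monomials` is hypothetical.

```python
from itertools import combinations_with_replacement

variables = ["y(k-1)", "y(k-2)", "y(k-3)", "y(k-4)",
             "u(k-1)", "u(k-2)", "u(k-3)"]

def monomials(names, max_degree):
    """List every monomial up to max_degree as a tuple of variable names.
    The empty tuple stands for the constant term (assumed to be included)."""
    terms = [()]
    for degree in range(1, max_degree + 1):
        terms.extend(combinations_with_replacement(names, degree))
    return terms

dictionary = monomials(variables, 3)
n_terms = len(dictionary)
```

Each tuple lists the factors of one candidate term, so ("y(k-2)", "u(k-2)", "u(k-2)") stands for the term y(k-2)u^2(k-2) discussed in the text.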

Table 4 Results produced by the standard OFR algorithm for system (7)

No.   Term             ERR       Coefficient   Standard Deviation
1     y(k-4)u^2(k-2)   36.2732   0.2922        0.02602
2     y(k-1)u(k-1)     3.747     0.6544        0.0528
3     u^2(k-2)         .3488     0.534         0.00933
4     y(k-2)           26.856    -0.6743       0.065
5     y^3(k-1)         3.3248    0.949         0.009847
SERR  --               91.53     --            --

The nonlinear cross correlation model validation tests (Billings and Voon 1986; Billings and Zhu 1994) clearly show that the model is unacceptable as a sufficient model of system (7). Figure 5 shows that the cross correlation tests fail, with two of the five cross correlations significantly outside the 95% confidence interval. Notice that this model has been obtained by deliberately abusing the classic OFR algorithm to test the robustness of the new iOFR algorithm. In applications of the OFR algorithm, model validation is always conducted to aid the determination of the model size. An obtained model should be sufficient to pass all the model validation tests. For this example, an acceptable model can be obtained by increasing the number of terms until the model validation tests are satisfied.

Fig 5 Cross correlation model validation for example 4.2

Take the incorrect model in Table 4 as the starting point and apply the iOFR algorithm. Use each term in the previous model as the first term and employ the OFR algorithm to select the remaining terms until SERR satisfies the tolerance, where the amendment of the tolerance is 0. The new iOFR produced five different models. The results are shown in Table 5. All five models have the same number of terms. However, models 3 and 4 give a better SERR than the other three models. Compared with system (7), both model 3 and model 4 are composed of the correct terms and give an accurate representation of the original system. It is worth emphasising that the term $y(k-2)u^2(k-2)$, which is missing from Table 4, has now been correctly selected into the final model by the iOFR algorithm.

Table 5 Results produced by the iOFR algorithm for system (7)

Model 1
No.   Term              ERR       Coeff
1     y(k-4)u^2(k-2)*   36.2732   0.2922
2     y(k-1)u(k-1)      3.747     0.6544
3     u^2(k-2)          .3488     0.534
4     y(k-2)            26.856    -0.6743
5     y^3(k-1)          3.3248    0.949
SERR  --                91.53     --

Model 2
No.   Term              ERR       Coeff
1     y(k-1)u(k-1)*     3.75      0.6544
2     y(k-4)u^2(k-2)    36.2367   0.2922
3     u^2(k-2)          .3488     0.534
4     y(k-2)            26.856    -0.6743
5     y^3(k-1)          3.3248    0.949
SERR  --                91.53     --

Model 3
No.   Term              ERR       Coeff
1     u^2(k-2)*         24.7602   0.6004
2     y(k-2)            48.586    -0.524
3     y(k-1)u(k-1)      3.985     0.6828
4     y(k-2)u^2(k-2)    3.2488    -0.6683
5     y^3(k-1)          3.4452    0.983
SERR  --                94.0208   --

Model 4
No.   Term              ERR       Coeff
1     y(k-2)*           29.8926   -0.524
2     u^2(k-2)          43.449    0.6004
3     y(k-1)u(k-1)      3.985     0.6828
4     y(k-2)u^2(k-2)    3.2488    -0.6683
5     y^3(k-1)          3.4452    0.983
SERR  --                94.0208   --

Model 5
No.   Term              ERR       Coeff
1     y^3(k-1)*         0.9922    0.949
2     y(k-4)u^2(k-2)    37.423    0.2922
3     y(k-1)u(k-1)      5.460     0.6544
4     y(k-2)            .9572     -0.6743
5     u^2(k-2)          25.6822   0.534
SERR  --                91.53     --

* represents the term determined first.

In both examples, the new iOFR algorithm produced optimal models which include all the correct terms and are of the simplest structure, in a very efficient computational process. The iOFR algorithm worked well even when the first OFR step did not give a correct model, as in the second example. Moreover, in both examples, the iOFR algorithm found optimal solutions on more than one searching path.
This indicates that the new algorithm is significantly robust because the iOFR can still produce an optimal solution even when the algorithm fails on one of the parallel search paths.

4.3. A nonlinear example with noise modelling

This example is taken from the Piroddi and Spinelli paper and uses the same parameter settings (Piroddi and Spinelli 2003). This example will be used to show that the iOFR algorithm can correctly identify an optimal model even when the system is not persistently excited.

This example also illustrates the application of iOFR in the identification of NARMAX models including delayed noise terms. The system is given as follows

$$w(k) = u(k-1) + 0.5\,u(k-2) + 0.25\,u(k-1)u(k-2) - 0.3\,u^3(k-1)$$
$$y(k) = w(k) + \frac{1}{1 - 0.8 z^{-1}}\, e(k) \qquad (8)$$

where u represents the input signal and y represents the observation of the output w. Both the input u(k) and the noise e(k) are Gaussian distributed white noise. It can be shown that the classic OFR algorithm can correctly select all the terms and produce an accurate model when the system is persistently excited. However, Piroddi and Spinelli argued that the classic OFR algorithm may incorrectly select autoregressive terms when the input signal is not rich enough in frequency components. Piroddi and Spinelli recommended an input which is generated by an AR process with two real poles between 0.75 and 0.9. We repeated Piroddi and Spinelli's simulation using an input signal which was generated by the following AR process

$$u(k) = \frac{0.3}{1 - 1.6 z^{-1} + 0.64 z^{-2}}\, v(k) \qquad (9)$$

where v(k) is Gaussian noise, v(k) ~ N(0,1). The AR process has a repeated pole at 0.8 and the coefficient 0.3 is chosen to keep the input signal at a reasonable level. Here the noise signal e(k) is Gaussian distributed noise with a variance of 0.02, that is, e(k) ~ N(0,0.02). The results produced by the classic OFR algorithm are given in Table 6.

Table 6 Results produced by the classic OFR algorithm for example 3

No.   Term           ERR       Coefficient   Standard Deviation
1     y(k-1)         87.0633   0.4260        0.0436
2     y(k-2)         6.9723    0.03          0.004886
3     u^3(k-1)       .786      -0.305        0.0074
4     u^3(k-2)       3.6867    0.346         0.004265
5     u(k-1)         0.97      .097          0.094
6     u^2(k-1)       0.7733    0.409         0.003438
7     u(k-2)         0.0050    -0.263        0.03052
8     y(k-1)u(k-1)   0.0023    0.0034        0.003889
9     y(k-2)u(k-1)   0.0053    0.040         0.00935
10    y(k-1)u(k-2)   0.006     -0.069        0.004702
SERR  --             99.88     --            --

Observe that several incorrect autoregressive terms have been selected, overwhelming the correct terms, while a correct term u(k-1)u(k-2) was missed. The new iOFR algorithm was employed to solve the problem. Each term in the model in Table 6 was selected as the pre-determined term and the remaining terms were selected into a model using a classic OFR algorithm. In this example, a total
number of 10 models were obtained. Three of the ten models give the same model, which is the best model obtained under the given tolerance of 0.2%.

Table 7 Model identified using the iOFR algorithm for example 3

No.   Term           ERR       Coefficient   Standard Deviation
1     u^3(k-1)       79.80     -0.3002       0.000578
2     u(k-1)         4.4655    0.9686        0.0938
3     u(k-1)u(k-2)   5.3722    0.2485        0.00905
4     u(k-2)         0.596     0.543         0.0864
5     constant       0.0050    0.0477        0.009254
SERR  --             99.8      --            --

Next we generated the residuals, defined as the difference $y(k) - \hat{y}(k)$, where $\hat{y}(k)$ is the one-step-ahead prediction of the model in Table 7. We then used the residuals to replace the noise terms. The new dictionary is composed of all the monomials of up to third order in the variables u(k-1), u(k-2), y(k-1), y(k-2), and the residual delayed by one, two and three samples. The new iOFR was then used to identify the full NARMAX model from the constructed dictionary under the tolerance level of 0.2%. This time three of seven search paths gave the optimal solution, which is shown in Table 8. All the terms in system (8) were successfully detected and the associated coefficients are close to the real values in the first iteration for the noise model.

Table 8 Full model identified using the iOFR algorithm for example 3

No.   Term            ERR       Coefficient   Standard Deviation
1     u^3(k-1)        79.829    -0.2995       0.000349
2     u(k-1)          4.4635    0.9647        0.02
3     u(k-1)u(k-2)    5.379     0.2543        0.000933
4     u(k-2)          0.59      0.5395        0.056
5     residual(k-1)   0.74      -0.7922       0.0968
SERR  --              99.92     --            --

5. Conclusions

Several algorithms have been proposed to enhance the OFR algorithm by introducing modified or add-on algorithms, but the new iterative orthogonal forward regression algorithm improves OFR under a purely OFR framework. Very little extra programming is needed to implement the new iOFR
algorithm, which is also highly computationally efficient. The new iOFR improves the classic OFR in two ways: it eliminates the redundant terms in a suboptimal model to produce a more parsimonious model, and it selects the correct terms to obtain an accurate system description. Because the new iOFR searches for the solution over the whole solution tree, it is capable of producing an optimal solution using simple search procedures and can be applied to estimate highly complex system models within a very efficient and intuitive framework.

Acknowledgement

The authors gratefully acknowledge support from the UK Engineering and Physical Sciences Research Council (EPSRC) and the European Research Council (ERC).

References

Baldacchino, T., Anderson, S. R., and Kadirkamanathan, V. "Computational system identification for Bayesian NARMAX modelling." Automatica, 49(9), 2641-2651.
Billings, S. A. (2013). Nonlinear system identification: NARMAX methods in the time, frequency, and spatio-temporal domains, John Wiley & Sons Ltd, Hoboken, New Jersey.
Billings, S. A., and Aguirre, L. A. (1995). "Effects of the sampling time on the dynamics and identification of nonlinear models." International Journal of Bifurcation and Chaos, 5(6), 1541-1556.
Billings, S. A., and Chen, S. (1989). "Extended model set, global data and threshold-model identification of severely non-linear systems." International Journal of Control, 50(5), 1897-1923.
Billings, S. A., Chen, S., and Korenberg, M. J. (1989). "Identification of MIMO non-linear systems using a forward-regression orthogonal estimator." International Journal of Control, 49(6), 2157-2189.
Billings, S. A., Korenberg, M. J., and Chen, S. (1988). "Identification of non-linear output-affine systems using an orthogonal least-squares algorithm." International Journal of Systems Science, 19(8), 1559-1568.
Billings, S. A., and Mao, K. Z. (1998). "Model identification and assessment based on model predicted output."
Billings, S. A., and Voon, W. S. F. (1986). "Correlation based model validity tests for nonlinear models." International Journal of Control, 44(1), 235-244.
Billings, S. A., and Wei, H.-L. (2007). "Sparse Model Identification Using a Forward Orthogonal Regression Algorithm Aided by Mutual Information." Neural Networks, IEEE Transactions on, 18(1), 306-310.
Billings, S. A., and Zhu, Q. M. (1994). "Nonlinear model validation using correlation tests." International Journal of Control, 60(6), 1107-1120.
Chen, S., Billings, S. A., and Luo, W. (1989). "Orthogonal least squares methods and their application to non-linear system identification." International Journal of Control, 50(5), 1873-1896.
Huber, P. J. (1985). "Projection pursuit." The Annals of Statistics, 13(2), 435-475.
Leontaritis, I. J., and Billings, S. A. (1985). "Input-output parametric models for non-linear systems Part I: deterministic non-linear systems." International Journal of Control, 41(2), 303-328.
Li, K., Peng, J.-X., and Bai, E.-W. (2006). "A two-stage algorithm for identification of nonlinear dynamic systems." Automatica, 42(7), 1189-1197.
Mallat, S. G., and Zhang, Z. (1993). "Matching pursuits with time-frequency dictionaries." Signal Processing, IEEE Transactions on, 41(12), 3397-3415.
Mao, K. Z., and Billings, S. A. (1997). "Algorithms for minimal model structure detection in nonlinear dynamic system identification." International Journal of Control, 68(2), 311-330.
Pati, Y. C., Rezaiifar, R., and Krishnaprasad, P. S. "Orthogonal matching pursuit: recursive function approximation with applications to wavelet decomposition." Signals, Systems and Computers, 1993. 1993 Conference Record of The Twenty-Seventh Asilomar Conference on, 40-44 vol.1.
Piroddi, L., and Spinelli, W. (2003). "An identification algorithm for polynomial NARX models based on simulation error minimization." International Journal of Control, 76(17), 1767-1781.
Sherstinsky, A., and Picard, R. W. (1996). "On the efficiency of the orthogonal least squares training method for radial basis function networks." Trans. Neur. Netw., 7(1), 195-200.
Shouche, M., Genceli, H., Premkiran, V., and Nikolaou, M. (1998). "Simultaneous Constrained Model Predictive Control and Identification of DARX Processes." Automatica, 34(12), 1521-1530.
Wei, H.-L., and Billings, S. A. (2008). "Model structure selection using an integrated forward orthogonal search algorithm assisted by squared correlation and mutual information." International Journal of Modelling, Identification and Control, 3(4), 341-356.