TECHNICAL RESEARCH REPORT

Reconstruction of Nonlinear Systems Using Delay Lines and Feedforward Networks

by D.L. Elliott

T.R. 95-17

ISR INSTITUTE FOR SYSTEMS RESEARCH

Sponsored by the National Science Foundation Engineering Research Center Program, the University of Maryland, Harvard University, and Industry

ISR Technical Research Report T.R. 95-17

RECONSTRUCTION OF NONLINEAR SYSTEMS USING DELAY LINES AND FEEDFORWARD NETWORKS

David L. Elliott
NeuroDyne, Inc. and Institute for Systems Research, University of Maryland, College Park, MD 20742
delliott@src.umd.edu

(Portions of this research were supported by the National Science Foundation under Grant ECS 9216530. To appear, Proc. 1995 American Control Conference.)

Abstract

Nonlinear system theory ideas have led to a method for approximating the dynamics of a nonlinear system in a bounded region of its state space, by training a feedforward neural network which is then reconfigured in recursive mode to provide a stand-alone simulator of the system. The input layer of the neural network contains time-delayed samples of one or more system outputs and control inputs. Autonomous systems can be simulated in this way by providing impulse inputs.

Introduction

In the past, methods for identifying nonlinear systems have typically used static polynomial (or sinusoidal) nonlinearities with linear dynamical blocks (Hammerstein or Wiener operators); Volterra functional series (for short times and small inputs); and other polynomial-based methods such as Larimore's CVA [2]. Feedforward neural networks using past samples of system outputs and inputs seem to be a reasonable method for predicting future outputs of time series (a common task in the neural network literature) and for system identification. Work in the control community [3] has used neural network simulators in control applications with some success. The hope is that future inexpensive neural network hardware will provide fast parallel computation. It is of course necessary to store past information about system inputs and outputs in the network to successfully train the weights and reconstruct dynamics: the network must contain delay elements. In the architecture presented here, all these elements are incorporated in delay-lines at the front end of the network; others have used delays between the nodes of the network.

Linear Case

Linear adaptive ARMA filters trained by the Widrow-Hoff LMS algorithm can be used to identify linear systems [1] by finding an empirical transfer function. A sufficiently exciting input is presented to a linear plant of interest; the past values of the input and output are sampled at clock interval $\tau$, and their values $u_k$ and $y_k$ are used as inputs to two delay-lines of length $L$. The delay-line values are related by the plant transfer function $\hat y = H(z)\hat u$, with $H(z) = (B_1 z^{-1} + \cdots + B_L z^{-L})/(1 + A_1 z^{-1} + \cdots + A_L z^{-L})$. The two delay-lines are used to construct an adaptive transversal filter as follows. The delay-line taps are provided with adjustable weights $A_k, B_k$ and the tap outputs are summed to give

$Y_t = \sum_{k=1}^{L} (A_k\, y_{t-k\tau} + B_k\, u_{t-k\tau}).$

The most recent sample $y(t)$ of the plant output is compared with the value $Y_t$ predicted by the filter, and the squared error is minimized by adjusting the $A_k$ and $B_k$ online by the LMS rules, for $1 \le k \le L$:

$\Delta A_k = \alpha\,(y_t - Y_t)\, y_{t-k\tau}, \qquad \Delta B_k = \alpha\,(y_t - Y_t)\, u_{t-k\tau} \qquad (1)$

or by similar rules (normalized LMS) in which the right-hand sides of Eq. (1) are normalized by the variances of $y$ and $u$ respectively. Note that these difference equations (1) are linear in the weights. Finding the best $L$ and $\alpha$ to achieve rapid convergence is difficult, and those problems are of course no easier in the nonlinear generalization discussed below.
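As a concrete illustration (not part of the original report), the LMS recursion of Eq. (1) takes only a few lines of NumPy; the second-order plant, the delay length L, and the step size alpha below are arbitrary choices for the example.

```python
import numpy as np

# Minimal sketch of the LMS-trained adaptive transversal filter of Eq. (1).
# The plant, L, and alpha are illustrative assumptions, not report values.
rng = np.random.default_rng(0)
L, alpha, steps = 8, 0.005, 20000

u = rng.standard_normal(steps)            # sufficiently exciting input
y = np.zeros(steps)
for t in range(2, steps):                 # example stable linear plant
    y[t] = 1.2 * y[t-1] - 0.5 * y[t-2] + 0.3 * u[t-1] + 0.1 * u[t-2]

A, B = np.zeros(L), np.zeros(L)           # adjustable tap weights
for t in range(L, steps):
    y_taps = y[t-L:t][::-1]               # most recent first: y[t-1..t-L]
    u_taps = u[t-L:t][::-1]
    Y_t = A @ y_taps + B @ u_taps         # filter prediction
    err = y[t] - Y_t                      # compare with newest plant sample
    A += alpha * err * y_taps             # LMS updates, Eq. (1)
    B += alpha * err * u_taps

print("final prediction error:", err)
```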

Neural Networks for Nonlinear Systems

The advent of neural network technology has raised the prospect of highly parallel computational approaches to identification, based on theorems showing that several of the function systems used in neural networks can uniformly approximate sufficiently smooth multivariable functions. The transition maps of nonlinear systems appear to be an appropriate application. The task may be attacked through the use of recurrent neural networks [3]; in the above case of the linear adaptive filter this means $\Delta A_k = \alpha\,(y_t - Y_t)\, Y_{t-k\tau}$, which is quadratic in the weights, so that the dynamics in weight space may become chaotic. Recurrent networks, like adaptive control systems, rely on slow variation of the trained parameters (weights) to ensure stability. However, the training can be performed by a feedforward network [4, 6, 10]; in this case recursion is used in the trained network but not during training. That is the approach we adopt here. Our work is based on a state-space approach, related to the control-theoretic notions of state, observability and controllability; the approach of [4] involves input-output relations.

Learning the Dynamics

Even some autonomous systems $\dot x = f(x)$ can be treated as input-driven: we initialize at an equilibrium point (we could assume $f(0) = 0$, for instance) and provide a control $b\,u(t)$ in such a way that for the controlled system $\dot x = f(x) + b\,u(t)$ the region of interest in the state space can be reached from 0. The input $u(t)$ consists of one or more short pulses whose amplitudes vary randomly. As an example, consider a van der Pol oscillator equipped with an input which makes it completely reachable with one pulse:

$\dot\xi_1 = \xi_2,$
$\dot\xi_2 = -\xi_1 - a\,\xi_2(\xi_1^2 - 1) + u(t), \qquad (2)$
$y = \xi_1.$

The system is simulated digitally by any method which approximates the system dynamics well. We know the output only through randomly chosen finite sequences of clocked samples $Y = \{y(t) : t = k\tau,\ k = 0, 1, 2, \ldots\}$ with $0 \le t \le T$. The choice of $\tau$ and of the training length $T$ is problem-dependent. First consider the autonomous (no-input) case. Suppose initial states $x(0)$ for identification experiments can be randomly chosen, uniformly distributed in a box $B$ of side $\beta$. The random output sequences $Y$ are now viewed through a sliding window of length $N \ge n$. The $N$-vectors of sample values $Y(t) = [y(t - N\tau), \ldots, y(t - \tau)]$ (defined for $y$ a system output) will determine the future of $y$, and are called lag coordinates for the system, if the map $x(t) \mapsto Y(t)$ is a regular embedding.
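To make the data-generation step concrete, here is a sketch that integrates Eq. (2) with a fourth-order Runge-Kutta step, samples $y = \xi_1$ at the clock interval $\tau$, and stacks the samples into lag-coordinate windows. The values of a, the pulse, tau, and N are assumptions made for illustration; the report does not fix them at this point.

```python
import numpy as np

# Simulate the driven van der Pol system of Eq. (2) and form
# lag-coordinate windows Y(t). Parameter values are illustrative.
a, tau, N, T_end = 1.0, 0.15, 4, 16.0

def f(xi, t):
    u = 3.0 if t < 0.3 else 0.0        # one short pulse; in training its
    return np.array([xi[1],            # amplitude would be drawn at random
                     -xi[0] - a * xi[1] * (xi[0]**2 - 1.0) + u])

def rk4_step(xi, t, h):
    k1 = f(xi, t)
    k2 = f(xi + 0.5 * h * k1, t + 0.5 * h)
    k3 = f(xi + 0.5 * h * k2, t + 0.5 * h)
    k4 = f(xi + h * k3, t + h)
    return xi + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

sub = 10                                # integration steps per clock sample
h = tau / sub
xi, samples = np.zeros(2), []           # start at the equilibrium
for i in range(int(T_end / h)):
    if i % sub == 0:
        samples.append(xi[0])           # clocked sample y(k*tau) = xi_1
    xi = rk4_step(xi, i * h, h)

y = np.array(samples)
# sliding window: Y(t) = [y(t - N*tau), ..., y(t - tau)]
windows = np.array([y[k:k + N] for k in range(len(y) - N)])
print(windows.shape)
```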
Being a regular embedding is a stronger requirement than the local rank conditions used in local-observability theory. For observable linear systems $N = n$ suffices for almost all $\tau$, from elementary considerations. From the work of Floris Takens [7] and, independently, Dirk Aeyels [8], it is known that for generic smooth choices of nonlinear $\{f, h\}$ and almost all $\tau$, $N = 2n + 1$ is enough to guarantee a regular embedding. For particular systems the value of $N$ must be determined by experimentation or careful analysis. (Since in this study the inputs are used only to initialize the plant, the same considerations apply to examples like Eq. (2), provided that the input is known.)
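One standard way to carry out that experimentation, a heuristic from the embedding literature rather than a method given in this report, is a false-nearest-neighbors test: increase N until points that are close in the N-dimensional lag space stop separating when one more lag is appended. A sketch, with an assumed threshold:

```python
import numpy as np

# False-nearest-neighbors heuristic for choosing the window length N.
# Standard diagnostic from the embedding literature; thresh is an assumption.
def fnn_fraction(y, N, thresh=10.0):
    m = len(y) - N                            # number of (N+1)-windows
    Z = np.array([y[i:i + N + 1] for i in range(m)])
    X, nxt = Z[:, :N], Z[:, N]                # N-window and the next lag
    n_false = 0
    for i in range(m):
        d = np.linalg.norm(X - X[i], axis=1)
        d[i] = np.inf
        j = int(np.argmin(d))                 # nearest neighbor in R^N
        if abs(nxt[i] - nxt[j]) > thresh * d[j]:
            n_false += 1                      # neighbor was spurious
    return n_false / m

# pick the smallest N at which the false-neighbor fraction levels off:
# fractions = [fnn_fraction(y, N) for N in range(1, 8)]
```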

A generic smooth nonlinear system is thus observable using a sliding window of $N = 2n + 1$ samples. That is, the state space can be mapped one-to-one into (not onto) Euclidean space of dimension $N$ by the lag coordinates $Z(t) = [y(t), y(t - \tau), \ldots, y(t - (N-1)\tau)]$. The Nonlinear Science research community uses this technique in studying the dimension of strange attractors. Recently Sauer [5] used low-pass filtered delay coordinates, and local low-dimensional linear models (as in the work of Farmer and Sidorowich), for prediction. In the Systems community, Moraal and Grizzle [9] have constructed nonlinear state observers using $n$ lag coordinates under the hypothesis that the map $x(t) \mapsto Z(t)$ has full rank.

Neural Network Design

In this study $\sigma(s) = \tanh(s)$, or $\sigma(s) = s/(1 + |s|)$. Both have good differential equations that permit fast backpropagation: $\sigma' = 1 - \sigma^2$ for the first, $\sigma' = (1 - |\sigma|)^2$ for the second, as I have shown elsewhere.

The first layer of our neural net implements the sliding window by two delay-lines or FIFOs. The first FIFO, at times $0 \le t \le T$, holds a sliding array of samples of $y(\cdot)$ denoted by $Y = [y(t - (N+1)\tau), \ldots, y(t - \tau)]$; this FIFO is left-shifted at each clock time $k\tau$. The second FIFO holds a similar array of input samples, $U = [u(t - N\tau), \ldots, u(t)]$.

[Figure 1: Training the network. The nonlinear plant, driven by a control c, produces the output y(t); delay lines holding past samples of y and c feed a sigmoidal layer (weights w0), which projects to the state layer x (weights w1); independent output subnetworks (weights w2, w3) produce the predictions Y1(t), Y2(t), Y3(t) at the output layer.]

The layers of the net are built of linear combinations and sigmoidal functions. $Y_t$ and $U_t$ are used as the first layer of our network; they are stacked in a single vector $r = [Y, U]$. Now we want to project $r$ down to a vector $x$ whose dimension $P$ is that of our plant ($P = 2$ for van der Pol's oscillator). This projection may be done either directly by a matrix, or via a sigmoidal hidden layer $z$ of size $M$ (to be chosen in the application) as shown in Figure 1. In that case we have, with the usual bias weights $a_{i0}$,

$z_i = \sigma\big(\textstyle\sum_h (a_{ih} Y_h + b_{ih} U_h) + a_{i0}\big), \qquad 1 \le i \le M.$

The notation $a_i$ and $b_i$ is chosen in analogy with the linear adaptive filter notation above; in Figure 1 these weights are stacked as $w0 = [a, b]$. For $j = 1, \ldots, P$,

$x_j = \sum_{i=0}^{M} w1_{ji}\, z_i,$

where $z_0 = 1$ for the bias term; likewise $x_0 = 1$. In some applications it is possible and convenient to train several network outputs $Y_1, \ldots, Y_R$ corresponding to plant observables (in Figures 1 and 2, $R = 4$). The outputs are now computed by $R$ independent subnetworks, or segments, each with a sigmoidal layer and a linear layer: for $1 \le k \le R$,

$n_{kq} = \sigma\big(\textstyle\sum_{p=0}^{P} w2_{kqp}\, x_p\big), \qquad Y_k = \sum_{q=1}^{Q} w3_{kq}\, n_{kq}.$

One of the outputs, denoted $Y_1(t)$ in Figure 1, predicts $y(t)$. The others predict other plant observables, which (unlike the measured plant output) need not be stored in delay-lines. To allow successful prediction one must train the network by providing sufficiently many inputs $u$. In Figure 1, $c$ denotes the class of admissible controls $u$, which are treated as random functions on $[0, T]$. Training consists of changing the weights by backpropagation (the multi-layer version of LMS). The region of the plant's state space explored by these inputs is the interior of the reachable set, allowing for bounds on the inputs.
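The layer equations above translate directly into code. Here is a sketch of one forward pass, with illustrative sizes and untrained random weights; the report gives no implementation, so everything below the layer equations themselves is an assumption.

```python
import numpy as np

# One forward pass through the network of Figure 1: delay-line vector
# r = [Y, U] -> sigmoidal layer z -> bottleneck state layer x ->
# R output subnetworks. Sizes and random weights are illustrative.
rng = np.random.default_rng(1)
N, M, P, Q, R = 4, 12, 2, 36, 1
sigma = np.tanh                                 # sigma' = 1 - sigma^2

w0 = 0.1 * rng.standard_normal((M, 2*N + 2))    # [a, b] incl. bias column
w1 = 0.1 * rng.standard_normal((P, M + 1))
w2 = 0.1 * rng.standard_normal((R, Q, P + 1))
w3 = 0.1 * rng.standard_normal((R, Q))

def forward(Y, U):
    r = np.concatenate([Y, U, [1.0]])           # stacked input, bias appended
    z = sigma(w0 @ r)                           # z_i = sigma(a.Y + b.U + a_i0)
    x = w1 @ np.append(z, 1.0)                  # bottleneck x_j, with z_0 = 1
    xb = np.append(x, 1.0)                      # x_0 = 1
    n = sigma(np.einsum('kqp,p->kq', w2, xb))   # n_kq, subnet sigmoid layers
    return np.einsum('kq,kq->k', w3, n), x      # Y_k and the state x

Yhat, x = forward(rng.standard_normal(N), rng.standard_normal(N + 1))
print(Yhat, x)
```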
We want to minimize the expected sum of the errors over the ensemble of controls, so the goal (if not always the accomplishment) of backpropagation is to find

$w^* = \arg\min_w\, E_c \sum_{k=1}^{N} \big(Y(t - k\tau) - y(t - k\tau)\big)^2.$

This will be our operational definition of identification, and it must be understood that ordinarily it cannot be accomplished. The best we can do, as usual with nonlinear least-squares problems, is to find a good local minimum and use that for $w^*$. Now, as in Figure 2, we use $w^*$ in a different configuration of the network, in which $y$ from the plant is replaced by the output $Y_1$ of the network, giving a recursive network which, as will be seen in some example figures, can mimic the original plant.

[Figure 2: Trained, recurrent network. The output delay line now holds the network's own past predictions Y1(t-4), ..., Y1(t-1) in place of plant samples; otherwise the structure (weights w0, w1, w2, w3 and the output subnets) is that of Figure 1.]
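A sketch of this recurrent reconfiguration, reusing the hypothetical forward() and sizes from the previous sketch; after training, the learned weights $w^*$ would replace the random ones.

```python
import numpy as np

# Recurrent configuration of Figure 2: the delay line is seeded with a
# few plant samples, then refilled with the network's own prediction Y1,
# so the network runs as a stand-alone simulator.
def simulate(y_seed, u_seq, N=4):
    Ybuf = list(y_seed[-N:])                # FIFO of past outputs
    Ubuf = [0.0] * (N + 1)                  # FIFO of past inputs
    out = []
    for u in u_seq:
        Ubuf = Ubuf[1:] + [u]               # left-shift input FIFO
        Yhat, _ = forward(np.array(Ybuf), np.array(Ubuf))
        Y1 = Yhat[0]                        # prediction of y(t)
        out.append(Y1)
        Ybuf = Ybuf[1:] + [Y1]              # feed the prediction back
    return np.array(out)

# e.g., excite with a single pulse after seeding with four plant samples:
# trajectory = simulate(y[:4], [3.0] + [0.0] * 99)
```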

The bottleneck layer $x$ of dimension $n$ may be treated as a state vector in some problems and assigned initial values. The parameters (numbers of nodes, learning rates) are a matter of art.

A comparison of the impulse response of the controlled van der Pol oscillator of Eq. (2) with our neural network reconstruction is shown in Figure 3; these are lag-coordinate plots of $(y(t), y(t - 0.15))$ and $(Y(t), Y(t - 0.15))$. Figure 4 shows the corresponding time plots. Here I used the single plant output of Eq. (2), $N = 4$, projection to the bottleneck layer given by $x_i = \sigma\big(\sum_{h=1}^{4} (a_{ih} Y_h + b_{ih} U_h) + a_{i0}\big)$ with $P = 2$, one output, and $Q = 36$, to obtain the accuracy shown for large transients and a good limit-cycle fit.

Conclusion

It has been the intent of this study to show that a simple, canonical architecture with delay-lines and layered feedforward networks is adequate to perform the identification of some interesting and useful nonlinear systems. More complex schemes may well have advantages, but may also require more than off-the-shelf circuits to implement. Single-Instruction Multiple-Data parallel circuits can implement the methods presented here.

References

[1] Peter M. Clarkson, Optimal and Adaptive Signal Processing. Boca Raton, FL: CRC Press, 1993.
[2] W. E. Larimore, "Identification and Filtering of Nonlinear Systems Using Canonical Variate Analysis," in Nonlinear Modeling and Forecasting, M. Casdagli and S. Eubank, eds. New York: Addison-Wesley, 1992.
[3] P. Werbos, T. McAvoy, and H. T. Su, "Neural Networks, System Identification, and Control in the Chemical Process Industries," in Handbook of Intelligent Control, D. Sofge and D. White, eds. New York: Van Nostrand Reinhold, 1992.
[4] S. Chen and S. A. Billings, "Neural networks for nonlinear dynamic system modelling and identification," Int. J. Control 56, no. 2, pp. 319-346, 1992.
[5] Tim Sauer, "Time Series Prediction Using Delay Coordinate Embedding," in Predicting the Future and Understanding the Past: A Comparison of Approaches, A. Weigend and N. Gershenfeld, eds. Reading, MA: Addison-Wesley, 1994.
[6] A. U. Levin, Neural Networks in Dynamical Systems: a System Theoretic Approach. Ph.D. thesis, Yale University, November 1992.
[7] F. Takens, "Detecting strange attractors in turbulence," in Dynamical Systems and Turbulence, Warwick 1980, D. Rand and L.-S. Young, eds. Berlin: Springer Lecture Notes in Mathematics 898, pp. 366-381, 1981.
[8] D. Aeyels, "Generic observability of differentiable systems," SIAM J. Control and Optimization 19, pp. 595-603, 1981.
[9] P. E. Moraal and J. W. Grizzle, "Observer design for nonlinear systems with discrete-time measurements," to appear, IEEE Trans. Automatic Control, March 1995.
[10] A. U. Levin and K. S. Narendra, "Identification of nonlinear dynamical systems using neural networks," to appear in Neural Networks for Control, D. L. Elliott, ed. Norwood, NJ: Ablex Publishing, 1995.

[Figure 3: Pulse input excites the van der Pol oscillator (V) and the Net simulation (N); phase plot.]

[Figure 4: Time plot for the van der Pol oscillator: control (C), oscillator (V), and Net simulation (N).]