Neural Computation, 1994

Relating Real-Time Backpropagation and Backpropagation-Through-Time: An Application of Flow Graph Interreciprocity

Francoise Beaufays and Eric A. Wan

(The authors are with the Department of Electrical Engineering, Stanford University, Stanford, CA. This work was sponsored by EPRI under contract RP.)

Abstract

We show that signal flow graph theory provides a simple way to relate two popular algorithms used for adapting dynamic neural networks, real-time backpropagation and backpropagation-through-time. Starting with the flow graph for real-time backpropagation, we use a simple transposition to produce a second graph. The new graph is shown to be interreciprocal with the original and to correspond to the backpropagation-through-time algorithm. Interreciprocity provides a theoretical argument to verify that both flow graphs implement the same overall weight update.

Introduction

Two adaptive algorithms, real-time backpropagation (RTBP) and backpropagation-through-time (BPTT), are currently used to train multilayer neural networks with output feedback connections. RTBP was first introduced for single layer fully recurrent networks by Williams and Zipser (1989). The algorithm has since been extended to include feedforward networks with output feedback (see, e.g., Narendra 1990). The algorithm is sometimes referred to as real-time recurrent learning, on-line backpropagation, or dynamic backpropagation (Williams and Zipser, 1989; Narendra et al., 1990; and Hertz et al., 1991). The name recurrent backpropagation is also occasionally used, although this should not be confused with recurrent backpropagation as developed by Pineda (1987) for learning fixed points in feedback networks. RTBP is well suited for on-line adaptation of dynamic networks where a desired response is specified at each time step. BPTT (Rumelhart et al., 1986; Nguyen and Widrow, 1990; and Werbos, 1990), on the other hand, involves unfolding the network in time and applying standard backpropagation through the unraveled system. It does not allow for on-line adaptation as in RTBP, but has been shown to be computationally less expensive. Both algorithms attempt to minimize the same performance criterion, and are equivalent in terms of what they compute (assuming all weight changes are made off-line). However, they are generally derived independently and take on very different mathematical formulations.

In this paper, we use flow graph theory as a common support for relating the two algorithms. We begin by deriving a general flow graph diagram for the weight updates associated with RTBP. A second flow graph is obtained by transposing the original one, i.e., by reversing the arrows that link the graph nodes, and by interchanging the source and sink nodes. Flow graph theory shows that transposed flow graphs are interreciprocal, and for single input single output (SISO) systems, have identical transfer functions. This basic property was first presented in the context of electrical circuit analysis (Penfield et al., 1970).

It finds applications in a wide variety of engineering disciplines, such as the reciprocity of emitting and receiving antennas in electromagnetism (Ramo et al., 1984), the relationship between controller and observer canonical forms in control theory (Kailath, 1980), and the duality between decimation-in-time and decimation-in-frequency formulations of the FFT algorithm in signal processing (Oppenheim and Schafer, 1989). The transposed flow graph is shown to correspond directly to the BPTT algorithm. The interreciprocity of the two flow graphs allows us to verify that RTBP and BPTT perform the same overall computations. These principles are then extended to a more elaborate control feedback structure.

Network Equations

A neural network with output recurrence is shown in Figure 1. Let r(k-1) denote the vector of external reference inputs to the network and x(k-1) the recurrent inputs. The output vector x(k) is a function of the recurrent and external inputs, and of the adaptive weights w of the network:

    x(k) = N(x(k-1), r(k-1), w).    (1)

[Figure 1: Recurrent neural network (q represents a unit delay operator).]

The neural network N is most generally a feedforward multilayer architecture (Rumelhart et al., 1986). If N has only a single layer of neurons, the structure of Figure 1 represents a completely recurrent network (Williams and Zipser, 1989; Pineda, 1987). Any connectionist architecture with feedback units can, in fact, be represented in this standard format (Piche, 1993).

Adapting the neural network amounts to finding the set of weights w that minimizes the cost function

    J = (1/2) E[ Σ_{k=1}^{K} e(k)^T e(k) ] = (1/2) Σ_{k=1}^{K} E[ e(k)^T e(k) ],    (2)

where the expectation E[.] is taken over the external reference inputs r(k) and over the initial values of the recurrent inputs x(0). The error e(k) is defined at each time step as the difference between the desired state d(k) and the recurrent state x(k) whenever the desired vector d(k) is defined, and is otherwise set to zero:

    e(k) = d(k) - x(k) if d(k) is defined, and e(k) = 0 otherwise.    (3)

For such problems as terminal control (Bryson and Ho, 1969; Nguyen and Widrow, 1990), a desired response may be given only at the final time k = K.
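To make equations 1-3 concrete, the sketch below simulates a small network of this form and accumulates the cost of equation 2 for a single sequence (so the expectation is dropped). Taking N to be a single layer of tanh units with a bias input is our own stand-in, as are the dimensions and function names; the same toy network is reused in the later sketches.

```python
import numpy as np

rng = np.random.default_rng(0)
N_STATE, N_IN, K = 3, 2, 10                  # recurrent states, external inputs, horizon

# Toy stand-in for N in equation (1): a single layer of tanh units with a bias input.
W = 0.1 * rng.standard_normal((N_STATE, N_STATE + N_IN + 1))

def N_net(x_prev, r_prev, W):
    """One step of equation (1): x(k) = N(x(k-1), r(k-1), w)."""
    z = np.concatenate([x_prev, r_prev, [1.0]])
    return np.tanh(W @ z)

def run(W, r, x0, d):
    """Simulate k = 1..K and accumulate the sample cost of equation (2)."""
    x, cost = x0, 0.0
    for k in range(K):
        x = N_net(x, r[k], W)
        e = (d[k] - x) if d[k] is not None else np.zeros(N_STATE)   # equation (3)
        cost += 0.5 * e @ e
    return cost

r  = rng.standard_normal((K, N_IN))          # external reference inputs r(0), ..., r(K-1)
x0 = np.zeros(N_STATE)                       # initial recurrent state x(0)
d  = [None] * (K - 1) + [np.ones(N_STATE)]   # desired response only at k = K (terminal control)
print(run(W, r, x0, d))
```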

For other problems, such as system identification (Ljung, 1987; Narendra, 1990), it is more common to have a desired response vector for all k. In addition, only some of the recurrent states may represent actual outputs while others may be used solely for computational purposes.

In both RTBP and BPTT, a gradient descent approach is used to adapt the weights of the network. At each time step, the contribution to the weight update is given by

    Δw(k)^T = -(μ/2) d[e(k)^T e(k)]/dw = μ e(k)^T dx(k)/dw,    (4)

where μ is the learning rate. Here the derivative is used to represent the change in error due to a weight change over all time [1]. The accumulation of weight updates over k = 1, ..., K is given by Δw = Σ_{k=1}^{K} Δw(k). Typically, RTBP uses on-line adaptation in which the weights are updated at each time k, whereas BPTT performs an update based on the aggregate Δw. The differences due to on-line versus off-line adaptation will not be considered in this paper. For consistency, we assume that in both algorithms the weights are held constant during all gradient calculations.

Flow Graph Representation of the Adaptive Algorithms

RTBP was originally derived for fully recurrent single layer networks [2]. A more general algorithm is obtained by using equation 1 to directly evaluate the state gradient dx(k)/dw in the above weight update formula. Applying the chain rule, we get:

    dx(k)/dw = [∂x(k)/∂x(k-1)] dx(k-1)/dw + [∂x(k)/∂r(k-1)] dr(k-1)/dw + [∂x(k)/∂w] dw/dw,    (5)

in which dr(k-1)/dw = 0 since the external inputs do not depend on the network weights, and dw/dw = I, where I is the identity matrix. With these simplifications, equation 5 reduces to:

    dx(k)/dw = [∂x(k)/∂x(k-1)] dx(k-1)/dw + ∂x(k)/∂w.    (6)

Equation 6 is then applied recursively, from k = 1 to k = K, with initial condition dx(0)/dw = 0. For the sake of clarity, let

    γ_rec(k) = dx(k)/dw,    (7)
    Γ(k) = ∂x(k)/∂w,    (8)
    J(k) = ∂x(k)/∂x(k-1).    (9)

With this new notation, equation 6 can be rewritten as:

    γ_rec(k) = J(k) γ_rec(k-1) + Γ(k)   for all k = 1, ..., K,    (10)

with initial condition γ_rec(0) = 0. The weight update at each time step is given by:

    Δw^T(k) = μ e(k)^T γ_rec(k).    (11)

[Footnote 1: We define the derivative of a vector a ∈ R^n with respect to another vector b ∈ R^m as the matrix da/db ∈ R^{n×m} whose (i,j)th element is da_i/db_j. Similarly, the partial derivative of a vector a ∈ R^n with respect to another vector b ∈ R^m is the matrix ∂a/∂b ∈ R^{n×m} whose (i,j)th element is ∂a_i/∂b_j. For m = n = 1, this notation reduces to the scalar derivative and partial derivative as traditionally defined in calculus. It is easy to verify that most scalar operations in calculus, such as the chain rule, also hold in the vectorial case.]

[Footnote 2: The linear equivalent of the RTBP algorithm was first introduced in the context of Infinite Impulse Response (IIR) filter adaptation (White, 1975).]
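A minimal sketch of the RTBP recursion (equations 10 and 11) for the toy tanh network used above; the flattening of the weight matrix into w, the learning rate value, and all names are illustrative choices of ours. The accumulated update is checked against a finite-difference gradient of the cost, since for a single sequence Δw = -μ dJ/dw.

```python
import numpy as np

rng = np.random.default_rng(0)
N_STATE, N_IN, K, MU = 3, 2, 10, 0.1
W = 0.1 * rng.standard_normal((N_STATE, N_STATE + N_IN + 1))
r = rng.standard_normal((K, N_IN))
d = [None] * (K - 1) + [np.ones(N_STATE)]        # desired response at k = K only

def step(x_prev, r_prev, W):
    """x(k) = tanh(W [x(k-1); r(k-1); 1]) plus the local partials of equations (8)-(9)."""
    z = np.concatenate([x_prev, r_prev, [1.0]])
    x = np.tanh(W @ z)
    D = np.diag(1.0 - x**2)                      # tanh derivative at the new state
    J_k     = D @ W[:, :N_STATE]                 # J(k)     = partial x(k) / partial x(k-1)
    Gamma_k = D @ np.kron(np.eye(N_STATE), z)    # Gamma(k) = partial x(k) / partial w  (w = W.ravel())
    return x, J_k, Gamma_k

def rtbp_update(W):
    """Equations (10)-(11), accumulated over k = 1..K with the weights held fixed."""
    x, gamma = np.zeros(N_STATE), np.zeros((N_STATE, W.size))   # gamma_rec(0) = 0
    dw = np.zeros(W.size)
    for k in range(K):
        x, J_k, Gamma_k = step(x, r[k], W)
        gamma = J_k @ gamma + Gamma_k            # gamma_rec(k) = J(k) gamma_rec(k-1) + Gamma(k)
        e = (d[k] - x) if d[k] is not None else np.zeros(N_STATE)
        dw += MU * gamma.T @ e                   # Delta w(k)^T = mu e(k)^T gamma_rec(k)
    return dw

def cost(W):
    x, total = np.zeros(N_STATE), 0.0
    for k in range(K):
        x = np.tanh(W @ np.concatenate([x, r[k], [1.0]]))
        if d[k] is not None:
            total += 0.5 * (d[k] - x) @ (d[k] - x)
    return total

# Check: the aggregate RTBP update equals -mu * dJ/dw (central finite differences).
num = np.zeros(W.size)
for i in range(W.size):
    Wp, Wm = W.ravel().copy(), W.ravel().copy()
    Wp[i] += 1e-6; Wm[i] -= 1e-6
    num[i] = (cost(Wp.reshape(W.shape)) - cost(Wm.reshape(W.shape))) / 2e-6
print(np.allclose(rtbp_update(W), -MU * num, atol=1e-6))   # expected: True
```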

[Figure 2: Flow graph associated with the real-time backpropagation algorithm.]

Equations 10 and 11 can be illustrated by a flow graph (see Figure 2). The input to the flow graph, or source node variable, is set to 1.0 and propagated along the lower horizontal branch of the graph. The center horizontal branch computes the state derivatives γ_rec(k), and the upper horizontal branch accumulates the weight changes Δw(k)^T. The total weight change, Δw^T, is readily available at the output (sink node). RTBP is completely defined with this flow graph.

Let us now build a new flow graph by transposing the flow graph of Figure 2. Transposing the original flow graph is accomplished by reversing the branch directions, transposing the branch gains, replacing summing junctions by branching points and vice versa, and interchanging source and sink nodes. The new flow graph is represented in Figure 3. From the work by Tellegen (1952) and Bordewijk (1956) (see Appendix A), we know that transposed flow graphs are a particular case of interreciprocal graphs. This means, in a SISO case, that the sink value obtained in one graph, when exciting the source with a given input, is the same as the sink value of the transposed graph, when exciting its source by the same input. Thus, if an input of 1.0 is distributed along the upper horizontal branch of the transposed graph, the output, which is now accumulated on the lower horizontal branch, will be equal to Δw. This Δw is identical to the output of our original flow graph [3]. With the notation introduced before, and calling δ_bp(k) the signal transmitted along the center horizontal branch, we can directly write down the equations describing the new flow graph:

    δ_bp(k) = J^T(k+1) δ_bp(k+1) + e(k)   for all k = K, ..., 1,    (12)

with initial condition δ_bp(K+1) = 0. The weight update at each time step is given by:

    Δw(k) = μ Γ^T(k) δ_bp(k).    (13)

Equations 12 and 13, obtained from the new flow graph, are nothing other than the description of BPTT: δ_bp(k) is the error gradient -(1/2) d[Σ_{j=1}^{K} e(j)^T e(j)]/dx(k) backpropagated from k = K, ..., 1. This provides a simple theoretical derivation of BPTT.

[Footnote 3: The flow graphs introduced here are in fact single input multi output (SIMO). The arguments of interreciprocity may be applied by considering the SIMO graph to be a stack of SISO graphs, each of which can be independently transposed.]
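The interreciprocity argument can also be checked numerically: running both recursions on the same toy network, with the weights held fixed, yields the same aggregate weight update. The sketch below is ours (toy network, names, dimensions); only the recursions themselves come from equations 10-13.

```python
import numpy as np

rng = np.random.default_rng(0)
N_STATE, N_IN, K, MU = 3, 2, 10, 0.1
W = 0.1 * rng.standard_normal((N_STATE, N_STATE + N_IN + 1))
r = rng.standard_normal((K, N_IN))
d = [None] * (K - 1) + [np.ones(N_STATE)]

def forward(W):
    """Forward pass of the toy tanh network, caching J(k), Gamma(k), and e(k) for k = 1..K."""
    x, Js, Gammas, errs = np.zeros(N_STATE), [], [], []
    for k in range(K):
        z = np.concatenate([x, r[k], [1.0]])
        x = np.tanh(W @ z)
        D = np.diag(1.0 - x**2)
        Js.append(D @ W[:, :N_STATE])
        Gammas.append(D @ np.kron(np.eye(N_STATE), z))
        errs.append((d[k] - x) if d[k] is not None else np.zeros(N_STATE))
    return Js, Gammas, errs

def rtbp(W):
    """Equations (10)-(11): forward recursion in gamma_rec(k)."""
    Js, Gammas, errs = forward(W)
    gamma, dw = np.zeros((N_STATE, W.size)), np.zeros(W.size)
    for J_k, Gamma_k, e in zip(Js, Gammas, errs):
        gamma = J_k @ gamma + Gamma_k
        dw += MU * gamma.T @ e
    return dw

def bptt(W):
    """Equations (12)-(13): backward recursion in delta_bp(k)."""
    Js, Gammas, errs = forward(W)
    delta, dw = np.zeros(N_STATE), np.zeros(W.size)            # delta_bp(K+1) = 0
    for k in reversed(range(K)):
        J_next = Js[k + 1] if k + 1 < K else np.zeros((N_STATE, N_STATE))
        delta = J_next.T @ delta + errs[k]                     # delta_bp(k) = J^T(k+1) delta_bp(k+1) + e(k)
        dw += MU * Gammas[k].T @ delta                         # Delta w(k) = mu Gamma^T(k) delta_bp(k)
    return dw

print(np.allclose(rtbp(W), bptt(W)))   # expected: True -- same overall weight update
```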

[Figure 3: Transposed flow graph: a representation of the backpropagation-through-time algorithm.]

Alternative derivations of BPTT include the use of ordered derivatives (Werbos, 1964), heuristically unfolding the network in time (Rumelhart et al., 1986; Nguyen and Widrow, 1990), and solving a set of Euler-Lagrange equations (Le Cun, 1988; Plumer, 1993). Clearly one could have also taken a reverse path by starting with a derivation for BPTT, constructing the corresponding flow graph, transposing it, and then reading out the equations for RTBP. The two approaches lead to equivalent results.

Another nice feature of flow graph representations is that the computational and complexity differences between RTBP and BPTT can be directly observed from their respective flow graphs. By observing the dimension of the terms flowing in the graphs and the necessary matrix calculations and multiplications, it can be verified that RTBP requires O(N^2 W) operations while BPTT requires only O(W) (N is the number of recurrent states and W is the number of weights) [4].

Extension to Controller-Plant Structures

Flow graph theory can also be applied to more complicated network arrangements, such as the dynamic controller-plant structure illustrated in Figure 4. A discrete-time dynamic plant P described by its state-space equations is controlled by a neural network controller C. Let x(k-1) be the state of the plant, r(k-1) the external reference inputs to the controller, and u(k-1) the control signal used to drive the plant. Figure 4 can be described formally by the following equations:

    x(k) = P(x(k-1), u(k-1)),    (14)

    u(k-1) = C(x(k-1), r(k-1), w).    (15)

As before, the error vector e(k) is defined as the difference between the desired state d(k) and the actual state x(k) when there exists a desired state, and zero otherwise. Using RTBP to adapt the weights of the controller requires the evaluation of the derivatives of the state with respect to the weights.

[Footnote 4: For fully recurrent networks, W = N^2 and RTBP is O(N^4).]

[Figure 4: Controller-plant structure.]

Applying the chain rule to equations 14 and 15, we get:

    dx(k)/dw = [∂x(k)/∂x(k-1)] dx(k-1)/dw + [∂x(k)/∂u(k-1)] du(k-1)/dw,    (16)

    du(k-1)/dw = [∂u(k-1)/∂x(k-1)] dx(k-1)/dw + ∂u(k-1)/∂w.    (17)

Equations 16 and 17 can then be represented by a flow graph (see Figure 5a). Transposing this flow graph, we get a new graph, which corresponds to the BPTT algorithm for the controller-plant structure (see Figure 5b). Again, the argument of interreciprocity immediately shows the equivalence of the weight updates performed by the two algorithms. In addition, it can be verified that BPTT applied to this structure still requires a factor of O(N^2) fewer computations than RTBP.
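A minimal sketch of the coupled recursions of equations 16 and 17, assuming a linear plant x(k) = A x(k-1) + B u(k-1) (so J_Px(k) = A and J_Pu(k) = B) and a single-layer tanh controller; the plant, controller, dimensions, and names are our own illustrative choices. The resulting RTBP update for the controller weights is checked against finite differences.

```python
import numpy as np

rng = np.random.default_rng(1)
N_X, N_U, N_R, K, MU = 3, 2, 2, 10, 0.05
A = 0.3 * rng.standard_normal((N_X, N_X))               # toy linear plant: x(k) = A x(k-1) + B u(k-1)
B = rng.standard_normal((N_X, N_U))
Wc = 0.1 * rng.standard_normal((N_U, N_X + N_R + 1))    # controller weights w (flattened below)
r = rng.standard_normal((K, N_R))
d = [None] * (K - 1) + [np.zeros(N_X)]                  # drive the terminal state toward the origin

def controller(x, rk, Wc):
    """u = tanh(Wc [x; r; 1]) plus J_C = du/dx and Gamma_u = partial u / partial w."""
    z = np.concatenate([x, rk, [1.0]])
    u = np.tanh(Wc @ z)
    D = np.diag(1.0 - u**2)
    return u, D @ Wc[:, :N_X], D @ np.kron(np.eye(N_U), z)

def rtbp_controller(Wc):
    """Coupled forward recursions of equations (16)-(17) for the controller-plant loop."""
    x, gamma_x, dw = np.zeros(N_X), np.zeros((N_X, Wc.size)), np.zeros(Wc.size)
    for k in range(K):
        u, J_C, Gamma_u = controller(x, r[k], Wc)
        gamma_u = J_C @ gamma_x + Gamma_u                # eq. (17): du(k-1)/dw
        x = A @ x + B @ u                                # eq. (14)
        gamma_x = A @ gamma_x + B @ gamma_u              # eq. (16): dx(k)/dw
        if d[k] is not None:
            dw += MU * gamma_x.T @ (d[k] - x)            # Delta w(k)^T = mu e(k)^T dx(k)/dw
    return dw

def cost(Wc):
    x, total = np.zeros(N_X), 0.0
    for k in range(K):
        x = A @ x + B @ controller(x, r[k], Wc)[0]
        if d[k] is not None:
            total += 0.5 * (d[k] - x) @ (d[k] - x)
    return total

num = np.zeros(Wc.size)
for i in range(Wc.size):
    Wp, Wm = Wc.ravel().copy(), Wc.ravel().copy()
    Wp[i] += 1e-6; Wm[i] -= 1e-6
    num[i] = (cost(Wp.reshape(Wc.shape)) - cost(Wm.reshape(Wc.shape))) / 2e-6
print(np.allclose(rtbp_controller(Wc), -MU * num, atol=1e-6))   # expected: True
```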

[Figure 5: (a) Flow graph corresponding to RTBP. (b) Transposed flow graph: a representation of BPTT. Notation: J_C(k) = ∂u(k)/∂x(k); J_Px(k) = ∂x(k)/∂x(k-1); J_Pu(k) = ∂x(k)/∂u(k-1); Γ_u(k) = ∂u(k)/∂w; γ_rec,x(k) = dx(k)/dw; γ_rec,u(k) = du(k)/dw; δ_bp,x(k) = -(1/2) d[Σ_{j=1}^{K} e(j)^T e(j)]/dx(k).]

Conclusion

We have shown that real-time backpropagation and backpropagation-through-time are easily related when represented by signal flow graphs. In particular, the flow graphs corresponding to the two algorithms are the exact transpose of one another. As a consequence, flow graph theory could be applied to verify that the gradient calculations performed by the algorithms are equivalent. These principles were then extended to a controller-plant structure to illustrate how flow graph techniques can be applied to a variety of adaptive dynamic systems.

Appendix A: Flow Graph Interreciprocity

In this appendix we provide the formal definition of interreciprocity. We then prove that transposed flow graphs are interreciprocal, and that the transfer functions of single input single output interreciprocal flow graphs are identical.

[Figure 6: Example of nodes and branches in a signal flow graph.]

Let F be a flow graph. In F, we define: Y_k, the value associated with node k; T_{j,k}, the transmittance of the branch (j,k); and V_{j,k} = T_{j,k} Y_j, the output of branch (j,k). Let us further assume that each node k of the graph has associated to it a source node, i.e., a node connected to it by a branch of unity transmittance. Let X_k be the value of this source node (if node k has no associated source node, X_k is simply set to zero). It results from the above definitions that Y_k = Σ_j V_{j,k} + X_k = Σ_j T_{j,k} Y_j + X_k (see Figure 6).

Let us now consider a second flow graph, ~F, having the same topology as F (i.e., ~F has the same set of nodes and branches as F, but the branch transmittances of both graphs may differ). ~F is described with the variables ~Y_k, ~T_{j,k}, ~V_{j,k}, and ~X_k.

Definition 1: Two flow graphs, F and ~F, are said to be the transpose of each other iff their transmittance matrices are transposed, i.e.,

    ~T_{j,k} = T_{k,j}   for all j, k.    (18)

Definition 2 (Bordewijk, 1956): Two flow graphs, F and ~F, are said to be interreciprocal iff

    Σ_k ( ~Y_k X_k - Y_k ~X_k ) = 0.    (19)

We can now state the following theorem:

Theorem 1: Transposed flow graphs are interreciprocal.
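Theorem 1 and the SISO consequence proved next (equation 21) are easy to check numerically. In the sketch below, a flow graph is encoded by its transmittance matrix T (a random acyclic graph of our choosing), the node equation Y_k = Σ_j T_{j,k} Y_j + X_k is solved directly, and the transposed graph simply uses T^T.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6
# T[j, k] is the transmittance of branch (j, k).  Strictly upper-triangular entries give an
# acyclic graph, like the flow graphs of Figures 2 and 3.
T = np.triu(0.5 * rng.standard_normal((n, n)), k=1)

def node_values(T, X):
    """Solve Y_k = sum_j T[j, k] Y_j + X_k for all nodes, i.e. Y = (I - T^T)^{-1} X."""
    return np.linalg.solve(np.eye(len(X)) - T.T, X)

a, b = 0, n - 1                                   # source node and sink node of F

# SISO transfer (equation 21): excite F at node a and read node b, or excite the transposed
# graph (~T[j,k] = T[k,j]) at node b and read node a.
X, Xt = np.zeros(n), np.zeros(n)
X[a], Xt[b] = 1.0, 1.0
Y, Yt = node_values(T, X), node_values(T.T, Xt)
print(np.isclose(Y[b], Yt[a]))                    # expected: True

# Interreciprocity (definition 2) holds for arbitrary excitations of the two graphs.
X2, Xt2 = rng.standard_normal(n), rng.standard_normal(n)
Y2, Yt2 = node_values(T, X2), node_values(T.T, Xt2)
print(np.isclose(np.dot(Yt2, X2), np.dot(Y2, Xt2)))   # expected: True
```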

Proof: Let F be a flow graph, and let ~F be the transpose of F. We start from the identity Σ_k ~Y_k Y_k = Σ_k Y_k ~Y_k, and replace Y_k by Σ_j T_{j,k} Y_j + X_k in the first member, and ~Y_k by Σ_j ~T_{j,k} ~Y_j + ~X_k in the second member (Oppenheim and Schafer, 1989). Rearranging the terms, we get:

    Σ_{j,k} ( ~Y_k V_{j,k} - Y_k ~V_{j,k} ) + Σ_k ( ~Y_k X_k - Y_k ~X_k ) = 0.    (20)

Equation 20 is usually referred to as "the two-network form of Tellegen's theorem" (Tellegen, 1952; Penfield, 1970). Since ~F is the transpose of F, the first term of equation 20 can be rewritten as

    Σ_{j,k} ( ~Y_k V_{j,k} - Y_k ~V_{j,k} ) = Σ_{j,k} ( ~Y_k T_{j,k} Y_j - Y_k ~T_{j,k} ~Y_j ) = Σ_{j,k} ( ~Y_k T_{j,k} Y_j - Y_k T_{k,j} ~Y_j ) = 0.

Since the first term of equation 20 is zero, the second term Σ_k ( ~Y_k X_k - Y_k ~X_k ) is also zero. The flow graphs ~F and F are thus interreciprocal. QED.

The last step consists in showing that SISO interreciprocal flow graphs have the same transfer functions. Let node a be the unique source of F and node b its unique sink. From the definition of transposition, node a is the sink of ~F, and node b is its source. We thus have X_k = 0 for all k ≠ a, and ~X_k = 0 for all k ≠ b. Therefore, equation 19 reduces to:

    X_a ~Y_a = ~X_b Y_b.    (21)

This last equality can be interpreted as follows (Penfield, 1970; Oppenheim and Schafer, 1989): the output Y_b, obtained when exciting graph F with an input signal X_a, is identical to the output ~Y_a of the transposed graph ~F when exciting it at node b with an input ~X_b = X_a. The transfer functions of the SISO systems represented by the two flow graphs are thus identical, which is the desired conclusion.

References

Bordewijk, J. L. 1956. Inter-reciprocity applied to electrical networks. Appl. Sci. Res. 6B.

Bryson, A. E. Jr., and Ho, Y. 1969. Applied Optimal Control, chapter 2. Blaisdell Publishing Co., New York.

Hertz, J. A., Krogh, A., and Palmer, R. G. 1991. Introduction to the Theory of Neural Computation. Addison-Wesley, Reading.

Kailath, T. 1980. Linear Systems. Prentice-Hall, Englewood Cliffs, NJ.

Le Cun, Y. 1988. A Theoretical Framework for Back-Propagation. In Proceedings of the 1988 Connectionist Models Summer School, Touretzky, D., Hinton, G., and Sejnowski, T. (eds.), Morgan Kaufmann, San Mateo, CA.

Ljung, L. 1987. System Identification: Theory for the User. Prentice-Hall, Englewood Cliffs, NJ.

Pineda, F. J. 1987. Generalization of Back-Propagation to Recurrent Neural Networks. IEEE Trans. on Neural Networks, special issue on recurrent networks.

Plumer, E. S. 1993. Time-Optimal Terminal Control Using Neural Networks. In Proceedings of the IEEE International Conference on Neural Networks, San Francisco, CA.

Ramo, S., Whinnery, J. R., and Van Duzer, T. 1984. Fields and Waves in Communication Electronics, second edition. John Wiley & Sons.

Rumelhart, D. E., and McClelland, J. L. 1986. Parallel Distributed Processing. The MIT Press, Cambridge, MA.

Tellegen, B. D. H. 1952. A general network theorem, with applications. Philips Res. Rep.

Werbos, P. 1990. Backpropagation Through Time: What It Does and How to Do It. Proc. IEEE, special issue on neural networks.

White, S. A. 1975. An Adaptive Recursive Digital Filter. Proc. 9th Asilomar Conf. Circuits Syst. Comput.

Williams, R. J., and Zipser, D. 1989. A learning algorithm for continually running fully recurrent neural networks. Neural Computation 1(2).
