Temporal Backpropagation for FIR Neural Networks


Eric A. Wan
Stanford University, Department of Electrical Engineering, Stanford, CA

Abstract

The traditional feedforward neural network is a static structure which simply maps input to output. To better reflect the dynamics in the biological system, a network structure is proposed which models each synapse by a Finite Impulse Response (FIR) linear filter. An efficient gradient descent algorithm is derived which will be shown to be a temporal generalization of the familiar backpropagation algorithm.

1 Introduction

A standard neural network models a synapse by a single variable weight parameter. In a feedforward structure this results in a static network which maps input to output. Real neural networks are of course dynamic in nature, which is reflected in the temporal properties of the synapse along with such processes as impulse transmission and membrane excitation. While many accurate models of such processes do exist, from an engineering standpoint most are unrealistic to work with. The model we propose to use represents a synapse not by just a single weight parameter, but by an adaptive filter [1]. Further, we restrict the filter to be discrete time, characterized by a Finite Impulse Response (FIR) (footnote 1). While biologically motivated, we make no claims that the structure is necessarily biologically plausible. With this we proceed to derive algorithms for adapting the synaptic transfer functions so as to train the network as a whole.

2 Network Structure

Each synapse in the network is modeled by a Finite Impulse Response (FIR) linear filter. The coefficients for a filter can be represented by a weight vector $W = [w(0), w(1), \ldots, w(T)]^t$ ($t$ denotes transpose). The output of the filter simply corresponds to the weighted sum of delayed samples of the input $x(k)$, i.e. $y(k) = \sum_{i=0}^{T} w(i)\, x(k-i)$, where $k$ is the discrete time index.
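As a minimal sketch of a single synaptic filter, the weighted sum of delayed input samples can be written in a few lines of Python. The function name `fir_synapse_output` and the choice to treat samples before $k = 0$ as zero are assumptions made for illustration; the paper does not specify initial conditions.

```python
def fir_synapse_output(weights, inputs):
    """Return y(k) = sum_i w(i) * x(k - i) for every k in the input sequence.

    Assumption: samples before k = 0 are taken to be zero.
    """
    outputs = []
    for k in range(len(inputs)):
        y_k = 0.0
        for i, w_i in enumerate(weights):
            if k - i >= 0:  # skip taps that reach before the start of the data
                y_k += w_i * inputs[k - i]
        outputs.append(y_k)
    return outputs


w = [1.0, 0.5]        # two coefficients, i.e. T = 1
x = [1.0, 2.0, 3.0]   # x(0), x(1), x(2)
y = fir_synapse_output(w, x)   # [1.0, 2.5, 4.0]
```

With a single coefficient (T = 0) the filter collapses to an ordinary scalar weight, which is the static case the paper generalizes.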
As usual, the total input to a neuron corresponds to the sum of all synaptic filter outputs which connect to that neuron (footnote 2). The output of the neuron is usually taken to be a non-linear sigmoidal function of its input. Subscripts are used to indicate the specific location of a synapse or neuron within the network. Thus $W_{ij}^l$ specifies the synaptic filter connecting the output of neuron $i$ in layer $l$ to the input of neuron $j$ in the next layer. For a fully connected feedforward structure with $L$ layers and $N_l$ neurons in each layer, the network can be completely specified as follows:

$$X_i^l(k) = [x_i^l(k),\, x_i^l(k-1),\, \ldots,\, x_i^l(k-T^l)]$$

$$W_{ij}^l = [w_{ij}^l(0),\, w_{ij}^l(1),\, \ldots,\, w_{ij}^l(T^l)]$$

$$y_j^{l+1}(k) = \sum_{i=1}^{N_l} W_{ij}^l \cdot X_i^l(k)$$

$$x_j^{l+1}(k) = f\!\left(y_j^{l+1}(k)\right) = f\!\left(\sum_{i=1}^{N_l} W_{ij}^l \cdot X_i^l(k)\right)$$

where $1 \le i \le N_l$, $1 \le j \le N_{l+1}$, and $1 \le l \le L$. Note, if we replace the vectors $W$ and $X$ by simple scalars, then the above equations reduce to the definition of the familiar static feedforward network [2].

Footnote 1: The Infinite Impulse Response (IIR) case has also been studied and will be presented in a future paper.

Footnote 2: Without loss of generality in the analysis, possible bias terms have been neglected for notational simplicity.
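The layer computation described above can be sketched as follows. This is an illustration only: the names `tap_vector` and `fir_layer_forward`, the nested-list weight layout, the use of `math.tanh` as the sigmoidal function, and zero values before $k = 0$ are all assumptions, not details taken from the paper.

```python
import math


def tap_vector(sequence, k, num_taps):
    """X(k) = [x(k), x(k-1), ...], with zeros before k = 0 (assumed)."""
    return [sequence[k - n] if k - n >= 0 else 0.0 for n in range(num_taps)]


def fir_layer_forward(weights, inputs, k, activation=math.tanh):
    """Outputs of one layer at time k.

    weights[i][j] is the coefficient list of the filter from neuron i
    (previous layer) to neuron j (this layer).
    inputs[i] is the output history of neuron i in the previous layer.
    """
    num_out = len(weights[0])
    outputs = []
    for j in range(num_out):
        y_j = 0.0
        for i, history in enumerate(inputs):
            taps = tap_vector(history, k, len(weights[i][j]))
            # Inner product of the filter coefficients with the delayed samples.
            y_j += sum(w * x for w, x in zip(weights[i][j], taps))
        outputs.append(activation(y_j))
    return outputs


# Two neurons feeding one neuron, two coefficients per synapse (T = 1).
weights = [[[0.5, -0.25]], [[1.0, 0.0]]]
inputs = [[1.0, 2.0], [0.0, 1.0]]
out = fir_layer_forward(weights, inputs, k=1)
```

Passing one-element coefficient lists turns each inner product into a scalar multiplication, so the same function then computes a static feedforward layer.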

Figure 1: FIR Neural Network Structure

To complete the definitions for the network structure, we must specify the input/output relationships for the network as a whole:

input: $x_i^0(k)$, $1 \le i \le N_0$
output: $x_i^L(k)$, $1 \le i \le N_L$ (6)

For notational purposes we have used $x_i^0(k)$ as the external input to the network. This should not be confused with the output of a neuron. The structure of the FIR network is illustrated in Figure 1.

3 Adaptation

Given a desired response vector for the network's output at each index of time, we have

desired response: $d_i(k)$
inst. error: $e_i(k) = d_i(k) - x_i^L(k)$
total inst. squared error: $e^2(k) = \sum_{i=1}^{N_L} e_i^2(k)$
total squared error: $e^2 = \sum_{k} e^2(k)$ (summed over all time, starting at $k = 0$)

Learning will be based on traditional gradient descent in which we attempt to minimize the total squared error over all time. The error gradient with respect to each weight vector is normally expanded as follows:

$$\frac{\partial e^2}{\partial W_{ij}^l} = \sum_{k} \frac{\partial e^2(k)}{\partial W_{ij}^l} \qquad (7)$$

By taking each term in the expansion as an unbiased instantaneous estimate of the gradient, we may form the on-line training algorithm:

$$W_{ij}^l(k+1) = W_{ij}^l(k) - \mu \frac{\partial e^2(k)}{\partial W_{ij}^l(k)} \qquad (9)$$

in which the weight vectors are updated at each increment of time ($\mu$ is defined as the learning rate). As we will show, this obvious expansion of $\partial e^2 / \partial W_{ij}^l$ into the terms $\partial e^2(k) / \partial W_{ij}^l$ does not lead to a desirable learning algorithm for this structure. A less intuitive expansion, in fact, yields a more attractive algorithm which exploits the structure's FIR characteristics. For now we will complete the derivation of the first algorithm by proceeding to calculate the terms $\partial e^2(k) / \partial W_{ij}^l$. Starting with the last layer of synapses in the network:

$$\frac{\partial e^2(k)}{\partial W_{ij}^{L-1}} = \frac{\partial e^2(k)}{\partial y_j^L(k)} \cdot \frac{\partial y_j^L(k)}{\partial W_{ij}^{L-1}} = -2\, e_j(k)\, f'\!\left(y_j^L(k)\right) X_i^{L-1}(k)$$

By defining $\delta_j^L(k) = -2\, e_j(k)\, f'(y_j^L(k))$, for the last layer of synapses we have:

$$W_{ij}^{L-1}(k+1) = W_{ij}^{L-1}(k) - \mu\, \delta_j^L(k) \cdot X_i^{L-1}(k) \qquad (13)$$

This is, of course, simply the LMS algorithm for a bank of FIR filters with a non-linear output. For the previous layer of synapses, however, the derivation is not as simple. Using the proper chain rule expansion we get:

$$\frac{\partial e^2(k)}{\partial W_{ij}^{L-2}} = \sum_{n=0}^{T^{L-1}} \sum_{m=1}^{N_L} \delta_m^L(k)\, w_{jm}^{L-1}(n)\, f'\!\left(y_j^{L-1}(k-n)\right) X_i^{L-2}(k-n) \qquad (19)$$

This last equation has many interpretations. The term $\delta_m^L(k)\, w_{jm}^{L-1}(n)$ is similar to the recurrent formula used in backpropagation to accumulate the gradients. However, there are $T^{L-1}$ such terms corresponding to the number of tap delays in the final layer of the network. This equation could, in fact, have been written down by inspection if we were to interpret each tap delay as a "virtual" neuron whose input is delayed the appropriate number of time steps. This can be thought of as equivalent to the common technique of viewing the structure unfolded in time.

The problem with these equations, however, is that there is a loss of a sense of symmetry between the forward propagation of the network and the backward propagation of the terms necessary to calculate the gradients. The desired distributed nature of the gradient computations disappears. In addition, the gradient equations do not generalize for arbitrary layers. There is no nice recurrent formula for the gradients. The gradient calculation for one more layer back results in a triple sum. In fact, the total number of operations actually grows geometrically with the number of layers.

These drawbacks can, however, be overcome if we consider a completely different formulation of the gradient descent algorithm. Our original expansion of the total error gradient is not unique. Consider the following:

$$\frac{\partial e^2}{\partial W_{ij}^l} = \sum_{k} \frac{\partial e^2}{\partial y_j^{l+1}(k)} \cdot \frac{\partial y_j^{l+1}(k)}{\partial W_{ij}^l} \qquad (20)$$

This now yields an on-line version of the form:

$$W_{ij}^l(k+1) = W_{ij}^l(k) - \mu \frac{\partial e^2}{\partial y_j^{l+1}(k)} \cdot \frac{\partial y_j^{l+1}(k)}{\partial W_{ij}^l} \qquad (21)$$

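Note the time index runs over $y(k)$ and not $e^2(k)$. We may interpret $\partial e^2 / \partial y_j^{l+1}(k)$ as the change in the total squared error over all time due to a change in the input to a neuron at a single instant of time. Furthermore,

$$\frac{\partial e^2}{\partial y_j^{l+1}(k)} \cdot \frac{\partial y_j^{l+1}(k)}{\partial W_{ij}^l} \ne \frac{\partial e^2(k)}{\partial W_{ij}^l}$$

Only under the assumption that the error and all neuron outputs are stationary processes can we regard each term as an unbiased instantaneous estimate of the true gradient (footnote 3).

Now for any layer, $\partial y_j^{l+1}(k) / \partial W_{ij}^l = X_i^l(k)$. Furthermore, we may simply define $\partial e^2 / \partial y_j^l(k) \equiv \delta_j^l(k)$. This allows us to rewrite Equation 21 in the more familiar notational form:

$$W_{ij}^l(k+1) = W_{ij}^l(k) - \mu\, \delta_j^{l+1}(k) \cdot X_i^l(k) \qquad (22)$$

which now holds for any layer in the network. To complete the derivation, an explicit formula for $\delta_j^l(k)$ must be found. For the output layer we have simply:

$$\delta_j^L(k) = \frac{\partial e^2}{\partial y_j^L(k)} = \frac{\partial e^2(k)}{\partial y_j^L(k)} = -2\, e_j(k)\, f'\!\left(y_j^L(k)\right)$$

which is the same as before. For any hidden layer we have:

$$\delta_j^l(k) = \frac{\partial e^2}{\partial y_j^l(k)} = \sum_{m=1}^{N_{l+1}} \sum_{t} \frac{\partial e^2}{\partial y_m^{l+1}(t)} \cdot \frac{\partial y_m^{l+1}(t)}{\partial y_j^l(k)} = f'\!\left(y_j^l(k)\right) \sum_{m=1}^{N_{l+1}} \sum_{t} \delta_m^{l+1}(t)\, \frac{\partial y_m^{l+1}(t)}{\partial x_j^l(k)}$$

But we recall:

$$y_m^{l+1}(t) = \sum_{j=1}^{N_l} \sum_{n=0}^{T^l} w_{jm}^l(n)\, x_j^l(t-n)$$

Thus

$$\frac{\partial y_m^{l+1}(t)}{\partial x_j^l(k)} = \begin{cases} w_{jm}^l(t-k) & \text{for } 0 \le t-k \le T^l \\ 0 & \text{otherwise} \end{cases} \qquad (29)$$

which now yields

$$\delta_j^l(k) = f'\!\left(y_j^l(k)\right) \sum_{m=1}^{N_{l+1}} \sum_{n=0}^{T^l} \delta_m^{l+1}(k+n)\, w_{jm}^l(n) = f'\!\left(y_j^l(k)\right) \sum_{m=1}^{N_{l+1}} \Delta_m^{l+1}(k) \cdot W_{jm}^l$$

where we have defined

$$\Delta_m^l(k) = [\delta_m^l(k),\, \delta_m^l(k+1),\, \ldots,\, \delta_m^l(k+T^{l-1})] \qquad (34)$$

Footnote 3: This is never a valid assumption for finite time. However, the expansion in Equation 20 is always valid.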
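The recursion derived above can be checked numerically on a small example. The sketch below is not the on-line algorithm of Equation 22: it holds the weights fixed over a finite record, sums the terms $\delta_j^{l+1}(k)\, X_i^l(k)$ over $k$, and compares the result with a finite-difference estimate of the gradient of the total squared error. The network shape (one input, `H` hidden neurons, one output), the use of `tanh`, zero signals before $k = 0$, and every function name are assumptions chosen for illustration.

```python
import math
import random


def taps(seq, k, n_taps):
    """[s(k), s(k-1), ...] with zeros before the start of the record (assumed)."""
    return [seq[k - n] if k - n >= 0 else 0.0 for n in range(n_taps)]


def forward(W1, W2, u):
    """1 input -> H hidden -> 1 output; W1[j], W2[j] are coefficient lists."""
    K, H = len(u), len(W1)
    y1 = [[sum(w * x for w, x in zip(W1[j], taps(u, k, len(W1[j]))))
           for k in range(K)] for j in range(H)]
    x1 = [[math.tanh(v) for v in row] for row in y1]
    y2 = [sum(sum(w * x for w, x in zip(W2[j], taps(x1[j], k, len(W2[j]))))
              for j in range(H)) for k in range(K)]
    return y1, x1, y2


def total_error(W1, W2, u, d):
    _, _, y2 = forward(W1, W2, u)
    return sum((d_k - math.tanh(y_k)) ** 2 for d_k, y_k in zip(d, y2))


def dtanh(y):
    return 1.0 - math.tanh(y) ** 2


def temporal_backprop_gradients(W1, W2, u, d):
    """Sum over k of delta(k) * X(k) for every filter coefficient."""
    K, H = len(u), len(W1)
    y1, x1, y2 = forward(W1, W2, u)
    # Output layer: delta(k) = -2 e(k) f'(y(k)).
    d2 = [-2.0 * (d[k] - math.tanh(y2[k])) * dtanh(y2[k]) for k in range(K)]
    # Hidden layer: filter the output deltas backwards through W2.
    # The error is summed only over the record, so deltas past its end
    # contribute nothing.
    d1 = [[dtanh(y1[j][k]) * sum(W2[j][n] * d2[k + n]
                                 for n in range(len(W2[j])) if k + n < K)
           for k in range(K)] for j in range(H)]
    g2 = [[sum(d2[k] * taps(x1[j], k, len(W2[j]))[n] for k in range(K))
           for n in range(len(W2[j]))] for j in range(H)]
    g1 = [[sum(d1[j][k] * taps(u, k, len(W1[j]))[n] for k in range(K))
           for n in range(len(W1[j]))] for j in range(H)]
    return g1, g2


def numeric_gradient(W1, W2, u, d, layer, j, n, eps=1e-6):
    """Central finite difference of the total squared error."""
    W = (W1, W2)[layer]
    old = W[j][n]
    W[j][n] = old + eps
    up = total_error(W1, W2, u, d)
    W[j][n] = old - eps
    down = total_error(W1, W2, u, d)
    W[j][n] = old
    return (up - down) / (2 * eps)


random.seed(0)
H, T, K = 2, 2, 12
W1 = [[random.uniform(-0.5, 0.5) for _ in range(T + 1)] for _ in range(H)]
W2 = [[random.uniform(-0.5, 0.5) for _ in range(T + 1)] for _ in range(H)]
u = [random.uniform(-1, 1) for _ in range(K)]
d = [random.uniform(-1, 1) for _ in range(K)]
g1, g2 = temporal_backprop_gradients(W1, W2, u, d)
max_gap = max(abs(g[j][n] - numeric_gradient(W1, W2, u, d, layer, j, n))
              for layer, g in enumerate((g1, g2))
              for j in range(H) for n in range(T + 1))
```

Here `max_gap` is the largest disagreement between the backward-filtered gradient and the finite-difference estimate; it should be small, limited only by finite-difference error. The hidden deltas at time `k` read output deltas at `k + n`, which is possible here only because the whole record is stored.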

Figure 2: Backward filter propagation of gradient terms

Summarizing, the complete adaptation algorithm can be expressed as follows:

$$W_{ij}^l(k+1) = W_{ij}^l(k) - \mu\, \delta_j^{l+1}(k) \cdot X_i^l(k)$$

$$\delta_j^l(k) = \begin{cases} -2\, e_j(k)\, f'\!\left(y_j^L(k)\right) & l = L \\ f'\!\left(y_j^l(k)\right) \displaystyle\sum_{m=1}^{N_{l+1}} \Delta_m^{l+1}(k) \cdot W_{jm}^l & \text{for hidden layers} \end{cases} \qquad (35)$$

We now have a recursive formula for the error gradients. To calculate $\delta_j^l(k)$ of a given neuron we simply propagate the $\delta$'s from the next layer backwards through the synaptic filters which the given neuron feeds. In this sense, these equations can be thought of as the temporal version of backpropagation where the $\delta$'s are formed not by simply taking weighted sums but by backward filtering. For each new input and desired response vector we increment the forward filters one time step and the backward filters one time step. Thus by manipulating the terms used to accumulate the error gradients we have preserved the symmetry between the forward propagation of states and the backward propagation of the $\delta$ terms. This is illustrated in Figure 2.

Again, if we replace the vectors $X$, $W$, and now $\Delta$ by scalars, the above equations reduce to the familiar backpropagation algorithm for static networks. In this sense, these equations may also be thought of as a "vector" generalization of backpropagation.

This algorithm also has added computational advantages. Each neuron requires on the order of $2NT$ multiplications while the first algorithm (Equation 19) takes on the order of $NT^2$ multiplications (footnote 4). This savings comes from grouping terms into products of sums instead of sums of products. In fact, with this algorithm the total number of operations continues to grow linearly with the number of layers, versus geometrically as in the first case.

The careful reader may have observed what appears to be a flaw in this algorithm. The calculations for the $\delta_j^{l-1}(k)$'s are in fact non-causal. The source of this non-causal filtering can be seen by considering the terms $\partial e^2 / \partial y_j(k)$.
Since it takes time for the output of any internal neuron to completely propagate through the network, the change in the total error due to a change in an internal state is a function of future values within the network. Since the network is FIR, we can easily remedy the algorithm by adding a finite number of simple delay operators into the network.

4 Experimentation

To compare the two algorithms derived we experimentally verify their equivalence. Figure 3 shows the averaged learning curves for a two layer network modeling an unknown non-linear system (footnote 5). Training differed only in terms of the algorithm used. As can be seen, the performance of both appears to be roughly equivalent. Differences are hidden in terms of the computational efficiency of the second algorithm as discussed earlier. Initial experimentation also seems to indicate the added benefit of less misadjustment for the second algorithm. Minor discrepancies arise due to differences in the timing at which weights are adjusted relative to the calculation of the error gradients (mathematically the algorithms become equivalent as $\mu \to 0$).

Figure 3: Experimental Learning Curves. (a) Alg. 1 - instantaneous error gradient; (b) Alg. 2 - temporal backpropagation. Horizontal axis in both panels: time increment, k.

Footnote 4: N = nodes per layer, T = number of tap delays. $NT^2$ is valid for the neurons in the first hidden layer back from the output. An explicit formula was not derived for all layers in the case of the first algorithm.

Footnote 5: 1 input, 1 output, and 5 hidden units with Ctap FIR filters for each synapse. $\mu = 0.05$.

5 Relationship to other work

The structure of the FIR networks presented here is similar to the Time-Delay Neural Networks of Waibel et al. used in speech recognition [3]. Their training, however, is based on instantaneous error gradients as derived in the first algorithm and does not fully exploit the FIR structure of the network. The temporal backpropagation algorithm presented here can, of course, be simply substituted to train their structure.

6 Conclusion

This paper has introduced an efficient gradient descent algorithm for FIR neural networks which can be considered a temporal generalization of the backpropagation algorithm. By modeling each synapse as a linear filter, the neural network as a whole may be thought of as an adaptive system with its own internal dynamics. Equivalently we may think of the network as a complex nonlinear filter. Applications should thus include areas of pattern recognition where there is an inherent temporal quality to the data, such as in speech recognition. Also the networks should find a natural use in areas of nonlinear control, and other adaptive signal processing and filtering applications such as noise cancellation or equalization (footnote 6). The purpose of this paper, however, was to introduce the learning algorithm itself, rather than demonstrate the full potential of the actual network structure. This will be the work of future research.

Footnote 6: Static neural networks are already used for many of these applications. With the static networks, time relations are often created by taking the data and rippling it in time across the input of the network, or by using the output of the network to form a feedback loop. In both cases the dynamics are really external to the actual network itself.

References

[1] B. Widrow and S. D. Stearns, Adaptive Signal Processing, Englewood Cliffs, NJ: Prentice Hall.
[2] D. E. Rumelhart, J. L. McClelland, and the PDP Research Group, Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1, The MIT Press, Cambridge, MA.
[3] A. Waibel, T. Hanazawa, G. Hinton, K. Shikano, and K. Lang, Phoneme Recognition Using Time-Delay Neural Networks, IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. 37, No. 3, March.


More information

Neural Dynamic Optimization for Control Systems Part II: Theory

Neural Dynamic Optimization for Control Systems Part II: Theory 490 IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS PART B: CYBERNETICS, VOL. 31, NO. 4, AUGUST 2001 Neural Dynamic Optimization for Control Systems Part II: Theory Chang-Yun Seong, Member, IEEE, and

More information

Artificial Neural Networks

Artificial Neural Networks Introduction ANN in Action Final Observations Application: Poverty Detection Artificial Neural Networks Alvaro J. Riascos Villegas University of los Andes and Quantil July 6 2018 Artificial Neural Networks

More information

Artifical Neural Networks

Artifical Neural Networks Neural Networks Artifical Neural Networks Neural Networks Biological Neural Networks.................................. Artificial Neural Networks................................... 3 ANN Structure...........................................

More information

Introduction Biologically Motivated Crude Model Backpropagation

Introduction Biologically Motivated Crude Model Backpropagation Introduction Biologically Motivated Crude Model Backpropagation 1 McCulloch-Pitts Neurons In 1943 Warren S. McCulloch, a neuroscientist, and Walter Pitts, a logician, published A logical calculus of the

More information

The Neural Impulse Response Filter

The Neural Impulse Response Filter The Neural Impulse Response Filter Volker Tresp a, Ira Leuthäusser a, Martin Schlang a, Ralph Neuneier b, Klaus Abraham- Fuchs c and Wolfgang Härer c a Siemens AG, Central Research and Development, Otto-Hahn-Ring

More information

POWER SYSTEM DYNAMIC SECURITY ASSESSMENT CLASSICAL TO MODERN APPROACH

POWER SYSTEM DYNAMIC SECURITY ASSESSMENT CLASSICAL TO MODERN APPROACH Abstract POWER SYSTEM DYNAMIC SECURITY ASSESSMENT CLASSICAL TO MODERN APPROACH A.H.M.A.Rahim S.K.Chakravarthy Department of Electrical Engineering K.F. University of Petroleum and Minerals Dhahran. Dynamic

More information

Supervised learning in single-stage feedforward networks

Supervised learning in single-stage feedforward networks Supervised learning in single-stage feedforward networks Bruno A Olshausen September, 204 Abstract This handout describes supervised learning in single-stage feedforward networks composed of McCulloch-Pitts

More information

Neural Networks. Nicholas Ruozzi University of Texas at Dallas

Neural Networks. Nicholas Ruozzi University of Texas at Dallas Neural Networks Nicholas Ruozzi University of Texas at Dallas Handwritten Digit Recognition Given a collection of handwritten digits and their corresponding labels, we d like to be able to correctly classify

More information

Neural networks. Chapter 20, Section 5 1

Neural networks. Chapter 20, Section 5 1 Neural networks Chapter 20, Section 5 Chapter 20, Section 5 Outline Brains Neural networks Perceptrons Multilayer perceptrons Applications of neural networks Chapter 20, Section 5 2 Brains 0 neurons of

More information

Recurrent neural networks with trainable amplitude of activation functions

Recurrent neural networks with trainable amplitude of activation functions Neural Networks 16 (2003) 1095 1100 www.elsevier.com/locate/neunet Neural Networks letter Recurrent neural networks with trainable amplitude of activation functions Su Lee Goh*, Danilo P. Mandic Imperial

More information

Course 395: Machine Learning - Lectures

Course 395: Machine Learning - Lectures Course 395: Machine Learning - Lectures Lecture 1-2: Concept Learning (M. Pantic) Lecture 3-4: Decision Trees & CBC Intro (M. Pantic & S. Petridis) Lecture 5-6: Evaluating Hypotheses (S. Petridis) Lecture

More information

Neural Networks. Chapter 18, Section 7. TB Artificial Intelligence. Slides from AIMA 1/ 21

Neural Networks. Chapter 18, Section 7. TB Artificial Intelligence. Slides from AIMA   1/ 21 Neural Networks Chapter 8, Section 7 TB Artificial Intelligence Slides from AIMA http://aima.cs.berkeley.edu / 2 Outline Brains Neural networks Perceptrons Multilayer perceptrons Applications of neural

More information

EE-559 Deep learning Recurrent Neural Networks

EE-559 Deep learning Recurrent Neural Networks EE-559 Deep learning 11.1. Recurrent Neural Networks François Fleuret https://fleuret.org/ee559/ Sun Feb 24 20:33:31 UTC 2019 Inference from sequences François Fleuret EE-559 Deep learning / 11.1. Recurrent

More information

A METHOD OF ADAPTATION BETWEEN STEEPEST- DESCENT AND NEWTON S ALGORITHM FOR MULTI- CHANNEL ACTIVE CONTROL OF TONAL NOISE AND VIBRATION

A METHOD OF ADAPTATION BETWEEN STEEPEST- DESCENT AND NEWTON S ALGORITHM FOR MULTI- CHANNEL ACTIVE CONTROL OF TONAL NOISE AND VIBRATION A METHOD OF ADAPTATION BETWEEN STEEPEST- DESCENT AND NEWTON S ALGORITHM FOR MULTI- CHANNEL ACTIVE CONTROL OF TONAL NOISE AND VIBRATION Jordan Cheer and Stephen Daley Institute of Sound and Vibration Research,

More information

Data Mining Part 5. Prediction

Data Mining Part 5. Prediction Data Mining Part 5. Prediction 5.5. Spring 2010 Instructor: Dr. Masoud Yaghini Outline How the Brain Works Artificial Neural Networks Simple Computing Elements Feed-Forward Networks Perceptrons (Single-layer,

More information

Last update: October 26, Neural networks. CMSC 421: Section Dana Nau

Last update: October 26, Neural networks. CMSC 421: Section Dana Nau Last update: October 26, 207 Neural networks CMSC 42: Section 8.7 Dana Nau Outline Applications of neural networks Brains Neural network units Perceptrons Multilayer perceptrons 2 Example Applications

More information

Machine Learning. Neural Networks. Le Song. CSE6740/CS7641/ISYE6740, Fall Lecture 7, September 11, 2012 Based on slides from Eric Xing, CMU

Machine Learning. Neural Networks. Le Song. CSE6740/CS7641/ISYE6740, Fall Lecture 7, September 11, 2012 Based on slides from Eric Xing, CMU Machine Learning CSE6740/CS7641/ISYE6740, Fall 2012 Neural Networks Le Song Lecture 7, September 11, 2012 Based on slides from Eric Xing, CMU Reading: Chap. 5 CB Learning highly non-linear functions f:

More information

Feed-forward Networks Network Training Error Backpropagation Applications. Neural Networks. Oliver Schulte - CMPT 726. Bishop PRML Ch.

Feed-forward Networks Network Training Error Backpropagation Applications. Neural Networks. Oliver Schulte - CMPT 726. Bishop PRML Ch. Neural Networks Oliver Schulte - CMPT 726 Bishop PRML Ch. 5 Neural Networks Neural networks arise from attempts to model human/animal brains Many models, many claims of biological plausibility We will

More information

Feedforward Neural Nets and Backpropagation

Feedforward Neural Nets and Backpropagation Feedforward Neural Nets and Backpropagation Julie Nutini University of British Columbia MLRG September 28 th, 2016 1 / 23 Supervised Learning Roadmap Supervised Learning: Assume that we are given the features

More information

Lecture 7 Artificial neural networks: Supervised learning

Lecture 7 Artificial neural networks: Supervised learning Lecture 7 Artificial neural networks: Supervised learning Introduction, or how the brain works The neuron as a simple computing element The perceptron Multilayer neural networks Accelerated learning in

More information

arxiv: v1 [cs.sd] 28 Feb 2017

arxiv: v1 [cs.sd] 28 Feb 2017 Nonlinear Volterra Model of a Loudspeaker Behavior Based on Laser Doppler Vibrometry Alessandro Loriga, Parvin Moyassari, and Daniele Bernardini Intranet Standard GmbH, Ottostrasse 3, 80333 Munich, Germany

More information

ECLT 5810 Classification Neural Networks. Reference: Data Mining: Concepts and Techniques By J. Hand, M. Kamber, and J. Pei, Morgan Kaufmann

ECLT 5810 Classification Neural Networks. Reference: Data Mining: Concepts and Techniques By J. Hand, M. Kamber, and J. Pei, Morgan Kaufmann ECLT 5810 Classification Neural Networks Reference: Data Mining: Concepts and Techniques By J. Hand, M. Kamber, and J. Pei, Morgan Kaufmann Neural Networks A neural network is a set of connected input/output

More information

CMSC 421: Neural Computation. Applications of Neural Networks

CMSC 421: Neural Computation. Applications of Neural Networks CMSC 42: Neural Computation definition synonyms neural networks artificial neural networks neural modeling connectionist models parallel distributed processing AI perspective Applications of Neural Networks

More information

Learning Neural Networks

Learning Neural Networks Learning Neural Networks Neural Networks can represent complex decision boundaries Variable size. Any boolean function can be represented. Hidden units can be interpreted as new features Deterministic

More information

Blind Equalization Formulated as a Self-organized Learning Process

Blind Equalization Formulated as a Self-organized Learning Process Blind Equalization Formulated as a Self-organized Learning Process Simon Haykin Communications Research Laboratory McMaster University 1280 Main Street West Hamilton, Ontario, Canada L8S 4K1 Abstract In

More information

<Special Topics in VLSI> Learning for Deep Neural Networks (Back-propagation)

<Special Topics in VLSI> Learning for Deep Neural Networks (Back-propagation) Learning for Deep Neural Networks (Back-propagation) Outline Summary of Previous Standford Lecture Universal Approximation Theorem Inference vs Training Gradient Descent Back-Propagation

More information

1 What a Neural Network Computes

1 What a Neural Network Computes Neural Networks 1 What a Neural Network Computes To begin with, we will discuss fully connected feed-forward neural networks, also known as multilayer perceptrons. A feedforward neural network consists

More information

Multilayer Perceptron Tutorial

Multilayer Perceptron Tutorial Multilayer Perceptron Tutorial Leonardo Noriega School of Computing Staffordshire University Beaconside Staffordshire ST18 0DG email: l.a.noriega@staffs.ac.uk November 17, 2005 1 Introduction to Neural

More information

NONLINEAR CLASSIFICATION AND REGRESSION. J. Elder CSE 4404/5327 Introduction to Machine Learning and Pattern Recognition

NONLINEAR CLASSIFICATION AND REGRESSION. J. Elder CSE 4404/5327 Introduction to Machine Learning and Pattern Recognition NONLINEAR CLASSIFICATION AND REGRESSION Nonlinear Classification and Regression: Outline 2 Multi-Layer Perceptrons The Back-Propagation Learning Algorithm Generalized Linear Models Radial Basis Function

More information

Mark Gales October y (x) x 1. x 2 y (x) Inputs. Outputs. x d. y (x) Second Output layer layer. layer.

Mark Gales October y (x) x 1. x 2 y (x) Inputs. Outputs. x d. y (x) Second Output layer layer. layer. University of Cambridge Engineering Part IIB & EIST Part II Paper I0: Advanced Pattern Processing Handouts 4 & 5: Multi-Layer Perceptron: Introduction and Training x y (x) Inputs x 2 y (x) 2 Outputs x

More information

ARTIFICIAL INTELLIGENCE. Artificial Neural Networks

ARTIFICIAL INTELLIGENCE. Artificial Neural Networks INFOB2KI 2017-2018 Utrecht University The Netherlands ARTIFICIAL INTELLIGENCE Artificial Neural Networks Lecturer: Silja Renooij These slides are part of the INFOB2KI Course Notes available from www.cs.uu.nl/docs/vakken/b2ki/schema.html

More information

Introduction to feedforward neural networks

Introduction to feedforward neural networks . Problem statement and historical context A. Learning framework Figure below illustrates the basic framework that we will see in artificial neural network learning. We assume that we want to learn a classification

More information

Part 8: Neural Networks

Part 8: Neural Networks METU Informatics Institute Min720 Pattern Classification ith Bio-Medical Applications Part 8: Neural Netors - INTRODUCTION: BIOLOGICAL VS. ARTIFICIAL Biological Neural Netors A Neuron: - A nerve cell as

More information

Forecasting of Rain Fall in Mirzapur District, Uttar Pradesh, India Using Feed-Forward Artificial Neural Network

Forecasting of Rain Fall in Mirzapur District, Uttar Pradesh, India Using Feed-Forward Artificial Neural Network International Journal of Engineering Science Invention ISSN (Online): 2319 6734, ISSN (Print): 2319 6726 Volume 2 Issue 8ǁ August. 2013 ǁ PP.87-93 Forecasting of Rain Fall in Mirzapur District, Uttar Pradesh,

More information

C4 Phenomenological Modeling - Regression & Neural Networks : Computational Modeling and Simulation Instructor: Linwei Wang

C4 Phenomenological Modeling - Regression & Neural Networks : Computational Modeling and Simulation Instructor: Linwei Wang C4 Phenomenological Modeling - Regression & Neural Networks 4040-849-03: Computational Modeling and Simulation Instructor: Linwei Wang Recall.. The simple, multiple linear regression function ŷ(x) = a

More information

Computational Graphs, and Backpropagation

Computational Graphs, and Backpropagation Chapter 1 Computational Graphs, and Backpropagation (Course notes for NLP by Michael Collins, Columbia University) 1.1 Introduction We now describe the backpropagation algorithm for calculation of derivatives

More information

Machine Learning and Data Mining. Multi-layer Perceptrons & Neural Networks: Basics. Prof. Alexander Ihler

Machine Learning and Data Mining. Multi-layer Perceptrons & Neural Networks: Basics. Prof. Alexander Ihler + Machine Learning and Data Mining Multi-layer Perceptrons & Neural Networks: Basics Prof. Alexander Ihler Linear Classifiers (Perceptrons) Linear Classifiers a linear classifier is a mapping which partitions

More information

Neural Networks. Bishop PRML Ch. 5. Alireza Ghane. Feed-forward Networks Network Training Error Backpropagation Applications

Neural Networks. Bishop PRML Ch. 5. Alireza Ghane. Feed-forward Networks Network Training Error Backpropagation Applications Neural Networks Bishop PRML Ch. 5 Alireza Ghane Neural Networks Alireza Ghane / Greg Mori 1 Neural Networks Neural networks arise from attempts to model human/animal brains Many models, many claims of

More information

Machine Learning. Neural Networks

Machine Learning. Neural Networks Machine Learning Neural Networks Bryan Pardo, Northwestern University, Machine Learning EECS 349 Fall 2007 Biological Analogy Bryan Pardo, Northwestern University, Machine Learning EECS 349 Fall 2007 THE

More information

ECE521 Lecture 7/8. Logistic Regression

ECE521 Lecture 7/8. Logistic Regression ECE521 Lecture 7/8 Logistic Regression Outline Logistic regression (Continue) A single neuron Learning neural networks Multi-class classification 2 Logistic regression The output of a logistic regression

More information

Neural Networks, Computation Graphs. CMSC 470 Marine Carpuat

Neural Networks, Computation Graphs. CMSC 470 Marine Carpuat Neural Networks, Computation Graphs CMSC 470 Marine Carpuat Binary Classification with a Multi-layer Perceptron φ A = 1 φ site = 1 φ located = 1 φ Maizuru = 1 φ, = 2 φ in = 1 φ Kyoto = 1 φ priest = 0 φ

More information