Chapter 15. Dynamically Driven Recurrent Networks


1 Chapter 15. Dynamically Driven Recurrent Networks Neural Networks and Learning Machines (Haykin) Lecture Notes on Self-learning Neural Algorithms Byoung-Tak Zhang School of Computer Science and Engineering Seoul National University Version

2 Contents
15.1 Introduction
15.2 Recurrent Network Architectures
15.3 Universal Approximation Theorem
15.5 Computational Power of Recurrent Networks
15.6 Learning Algorithms
15.7 Back Propagation Through Time
15.8 Real-Time Recurrent Learning
15.9 Vanishing Gradients in Recurrent Networks
15.10 Supervised Training Framework for Recurrent Networks
15.11 Computer Experiment: Mackey-Glass Attractor
15.12 Adaptivity Considerations
15.13 Case Study: Model Reference Applied to Neurocontrol
Summary and Discussion
(c) 2017 Biointelligence Lab, SNU 2

3 15.1 Introduction
Global feedback is a facilitator of computational intelligence. In previous chapters we studied how the use of global feedback in a recurrent network makes it possible to achieve some useful tasks:
o Content-addressable memory
o Autoassociation
o Dynamic reconstruction of a chaotic process
In this chapter, we study other important applications of recurrent networks:
o Input-output mapping, the study of which naturally benefits from Chapter 14 on sequential state estimation
o Applying the feedback from the output layer to the input of the hidden layer
o Combining all possible feedback loops in a single recurrent network structure
o Other configurations as the building blocks for the construction of recurrent networks
Recurrent networks have a very rich repertoire of architectural layouts, which makes them all the more powerful in computational terms. A recurrent network responds temporally to an externally applied input signal. We may therefore speak of the recurrent networks considered in this chapter as dynamically driven recurrent networks.
(c) 2017 Biointelligence Lab, SNU 3

4 15.2 Recurrent Network Architectures (1/8)
Four specific network architectures:
1) Input-Output Recurrent Model
2) State-Space Model
3) Recurrent Multilayer Perceptrons
4) Second-Order Network
They all incorporate a static multilayer perceptron or parts thereof. They all exploit the nonlinear mapping capability of the multilayer perceptron.
(c) 2017 Biointelligence Lab, SNU 4

5 15.2 Recurrent Network Architectures (2/8)
1) Input-Output Recurrent Model
1. The model has a single input that is applied to a tapped-delay-line memory of q units.
2. It has a single output that is fed back to the input via another tapped-delay-line memory, also of q units.
3. The present value of the model input is denoted by u_n, and the corresponding value of the model output is denoted by y_{n+1}.
The dynamic behavior of the nonlinear autoregressive with exogenous inputs (NARX) model is described by
y_{n+1} = F(y_n, ..., y_{n-q+1}; u_n, ..., u_{n-q+1})
where F is a nonlinear function of its arguments.
Figure 15.1 Nonlinear autoregressive with exogenous inputs (NARX) model; the feedback part of the network is shown in blue. 5
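To make the NARX input-output relation concrete, the following minimal sketch (an assumption for illustration, not code from the book) realizes F with a small one-hidden-layer perceptron and runs the model in closed loop; the class name, hidden size, and weight initialization are all illustrative choices.

import numpy as np

class NARXModel:
    """Minimal NARX sketch: y_{n+1} = F(y_n..y_{n-q+1}; u_n..u_{n-q+1}),
    with F approximated by a one-hidden-layer MLP (illustrative choice)."""
    def __init__(self, q, hidden=8, rng=np.random.default_rng(0)):
        self.q = q
        self.W1 = rng.standard_normal((hidden, 2 * q)) * 0.1   # input-to-hidden weights
        self.b1 = np.zeros(hidden)
        self.W2 = rng.standard_normal(hidden) * 0.1            # hidden-to-output weights
        self.b2 = 0.0

    def F(self, y_tap, u_tap):
        # Concatenate the two tapped-delay lines and pass them through the MLP.
        z = np.tanh(self.W1 @ np.concatenate([y_tap, u_tap]) + self.b1)
        return float(self.W2 @ z + self.b2)

    def run(self, u_seq):
        # Closed-loop operation: the predicted output is fed back into its own delay line.
        y_tap = np.zeros(self.q)          # y_n, ..., y_{n-q+1}
        u_tap = np.zeros(self.q)          # u_n, ..., u_{n-q+1}
        outputs = []
        for u_n in u_seq:
            u_tap = np.r_[u_n, u_tap[:-1]]
            y_next = self.F(y_tap, u_tap)
            y_tap = np.r_[y_next, y_tap[:-1]]
            outputs.append(y_next)
        return np.array(outputs)

narx = NARXModel(q=3)
print(narx.run(np.sin(0.1 * np.arange(20))))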

6 15.2 Recurrent Network Architectures (3/8) 2) State-Space Model Figure 15.2 State-space model; the feedback part of the model is shown in blue. Figure 15.3 Simple recurrent network (SRN); the feedback part of the network is shown in blue. 6

7 15.2 Recurrent Network Architectures (4/8)
2) State-Space Model
1. A state-space model, the basic idea of which was discussed in Chapter 14.
2. The output is fed back to the input layer via a bank of unit-time delays.
3. The input layer consists of a concatenation of feedback nodes and source nodes.
x_{n+1} = a(x_n, u_n)
y_n = B x_n
4. Elman's network (Fig. 15.3) contains recurrent connections from the hidden neurons to a layer of context units consisting of unit-time delays. These context units store the outputs of the hidden neurons for one time step and then feed them back to the input layer.
(c) 2017 Biointelligence Lab, SNU 7

8 15.2 Recurrent Network Architectures (5/8) 3) Recurrent Multilayer Perceptrons Figure 15.4 Recurrent multilayer perceptron; feedback paths in the network are printed in blue. (c) 2017 Biointelligence Lab, SNU 8

9 15.2 Recurrent Network Architectures (6/8)
3) Recurrent Multilayer Perceptrons
1. A recurrent multilayer perceptron (RMLP) has one or more hidden layers, basically for the same reasons that static multilayer perceptrons are often more effective and parsimonious than those using a single hidden layer.
2. Each computation layer of an RMLP has feedback around it, as illustrated in Fig. 15.4 for the case of an RMLP with two hidden layers:
x_{I,n+1} = φ_I(x_{I,n}, u_n)
x_{II,n+1} = φ_II(x_{II,n}, x_{I,n+1})
...
x_{o,n+1} = φ_o(x_{o,n}, x_{K,n+1})
where x_{I,n}, x_{II,n}, and x_{o,n} denote the outputs of the first hidden layer, the second hidden layer, and the output layer, respectively.
(c) 2017 Biointelligence Lab, SNU 9
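A minimal sketch of one RMLP time step under the equations above; the layer sizes, the tanh activations, and the helper name rmlp_step are illustrative assumptions rather than the book's notation.

import numpy as np

def rmlp_step(x_I, x_II, x_o, u, params, phi=np.tanh):
    """One time step of a two-hidden-layer RMLP:
    each computation layer feeds back its own previous output."""
    W_I, V_I, W_II, V_II, W_o, V_o = params
    x_I_next = phi(W_I @ x_I + V_I @ u)              # first hidden layer
    x_II_next = phi(W_II @ x_II + V_II @ x_I_next)   # second hidden layer
    x_o_next = phi(W_o @ x_o + V_o @ x_II_next)      # output layer
    return x_I_next, x_II_next, x_o_next

rng = np.random.default_rng(0)
nI, nII, no, m = 4, 3, 1, 2
params = (rng.standard_normal((nI, nI)) * 0.1, rng.standard_normal((nI, m)) * 0.1,
          rng.standard_normal((nII, nII)) * 0.1, rng.standard_normal((nII, nI)) * 0.1,
          rng.standard_normal((no, no)) * 0.1, rng.standard_normal((no, nII)) * 0.1)
x_I, x_II, x_o = np.zeros(nI), np.zeros(nII), np.zeros(no)
for n in range(5):
    x_I, x_II, x_o = rmlp_step(x_I, x_II, x_o, rng.standard_normal(m), params)
print(x_o)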

10 15.2 Recurrent Network Architectures (7/8)
4) Second-Order Network
Figure 15.5 Second-order recurrent network; bias connections to the neurons are omitted to simplify the presentation. The network has 2 inputs and 3 state neurons, hence the need for 3 × 2 = 6 multipliers. The feedback links in the figure are printed in blue to emphasize their global role. 10

11 15.2 Recurrent Network Architectures (8/8)
4) Second-Order Network
First-order neuron:
v_k = Σ_j w_{a,kj} x_j + Σ_i w_{b,ki} u_i
Second-order neuron:
v_k = Σ_i Σ_j w_{kij} x_i u_j
Second-order recurrent network:
v_{k,n} = b_k + Σ_i Σ_j w_{kij} x_{i,n} u_{j,n}
x_{k,n+1} = φ(v_{k,n}) = 1 / (1 + exp(-v_{k,n}))
State transition: δ(x_i, u_j) = x_k (present state x_i and input symbol u_j map to next state x_k)
11
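A minimal sketch of the second-order state update written out above; the dimensions (two inputs, three state neurons, matching Fig. 15.5) and the random weight values are illustrative assumptions.

import numpy as np

def second_order_step(x, u, W, b):
    """x_{k,n+1} = sigmoid(b_k + sum_{i,j} w_{kij} x_{i,n} u_{j,n})."""
    v = b + np.einsum('kij,i,j->k', W, x, u)   # one multiplier per (i, j) pair
    return 1.0 / (1.0 + np.exp(-v))

rng = np.random.default_rng(0)
K, I, J = 3, 3, 2                  # 3 state neurons, 3 x 2 = 6 multipliers per neuron
W = rng.standard_normal((K, I, J)) * 0.5
b = np.zeros(K)
x = np.full(I, 0.1)
for u in ([1.0, 0.0], [0.0, 1.0], [1.0, 1.0]):   # a short symbolic input sequence
    x = second_order_step(x, np.array(u), W, b)
print(x)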

12 15.3 Universal Approximation Theorem (1/2)
Any nonlinear dynamic system may be approximated by a recurrent neural network to any desired degree of accuracy and with no restrictions imposed on the compactness of the state space, provided that the network is equipped with an adequate number of hidden neurons.
x_{n+1} = φ(W_a x_n + W_b u_n)
y_n = W_c x_n
where φ applies the activation function componentwise:
φ: [x_1, x_2, ..., x_q]^T ↦ [φ(x_1), φ(x_2), ..., φ(x_q)]^T
Typical choices of activation function are the hyperbolic tangent
φ(x) = tanh(x) = (1 - e^{-2x}) / (1 + e^{-2x})
and the logistic function
φ(x) = 1 / (1 + e^{-x})
12

13 15.3 Universal Approximation Theorem (2/2)
Example 1: Fully Connected Recurrent Network
m = 2, q = 3, and p = 1
W_a = [ w_11 w_12 w_13
        w_21 w_22 w_23
        w_31 w_32 w_33 ]
W_b = [ b_1 w_14 w_15
        b_2 w_24 w_25
        b_3 w_34 w_35 ]
W_c = [ 1 0 0 ]
Figure 15.6 Fully connected recurrent network with two inputs, two hidden neurons, and one output neuron. The feedback connections are shown in blue to emphasize their global role. 13
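A minimal sketch instantiating the state-space equations x_{n+1} = φ(W_a x_n + W_b u_n), y_n = W_c x_n for the dimensions of Example 1 (q = 3 state neurons, m = 2 inputs plus a bias, p = 1 output); the random weight values and the driving input are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
q, m, p = 3, 2, 1
W_a = rng.standard_normal((q, q)) * 0.5        # state-to-state weights
W_b = rng.standard_normal((q, m + 1)) * 0.5    # first column plays the role of the biases
W_c = np.array([[1.0, 0.0, 0.0]])              # read the output from the first state neuron

def step(x, u):
    """x_{n+1} = tanh(W_a x_n + W_b [1; u_n]),  y_n = W_c x_n."""
    x_next = np.tanh(W_a @ x + W_b @ np.r_[1.0, u])
    return x_next, W_c @ x

x = np.zeros(q)
for n in range(5):
    u = np.array([np.sin(0.2 * n), np.cos(0.2 * n)])
    x, y = step(x, u)
    print(n, y)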

14 15.5 Computational Power of Recurrent Networks (1/3)
Every finite-state machine is equivalent to, and can be simulated by, some neural net. That is, given any finite-state machine M, we can build a certain neural net N_M which, regarded as a black-box machine, will behave precisely like M!
Theorem I (Siegelmann and Sontag, 1991)
All Turing machines may be simulated by fully connected recurrent networks built on neurons with sigmoidal activation functions.
Three functional blocks of a Turing machine:
o a control unit, which can assume any one of a finite number of possible states
o a linear tape, assumed to be infinitely long in both directions, which is marked off into discrete squares, where each square is available to store a single symbol taken from a finite set of symbols
o a read-write head, which moves along the tape and transmits information to and from the control unit
Figure 15.7 Turing machine. 14

15 15.5 Computational Power of Recurrent Networks (2/3) Figure 15.8 Illustration of Theorems I and II, and the corollary to them. 15

16 15.5 Computational Power of Recurrent Networks (3/3)
Theorem II (Siegelmann et al., 1997)
NARX networks with one layer of hidden neurons with bounded, one-sided saturated (BOSS) activation functions and a linear output neuron can simulate fully connected recurrent networks with bounded, one-sided saturated activation functions, except for a linear slowdown.
Three conditions on the activation function φ(·):
1. The function φ(·) has a bounded range; that is, a ≤ φ(x) ≤ b, a ≠ b, for all x ∈ ℝ.
2. The function φ(·) is saturated on the left side; that is, there exist values s and S such that φ(x) = S for all x ≤ s.
3. The function φ(·) is nonconstant; that is, φ(x_1) ≠ φ(x_2) for some x_1 and x_2.
An example of a BOSS function:
φ(x) = 1 / (1 + exp(-x))  for x > s
φ(x) = 0                  for x ≤ s
16
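A minimal sketch of the BOSS function written out above (a logistic sigmoid cut off at a threshold s); the particular choice s = -5 is an illustrative assumption.

import numpy as np

def boss(x, s=-5.0):
    """Bounded, one-sided saturated (BOSS) activation:
    logistic for x > s, exactly 0 (left-saturated) for x <= s."""
    x = np.asarray(x, dtype=float)
    return np.where(x > s, 1.0 / (1.0 + np.exp(-x)), 0.0)

print(boss([-10.0, -5.0, 0.0, 5.0]))   # [0., 0., 0.5, 0.9933...]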

17 15.6 Learning Algorithms
Two modes of training a recurrent network:
1. Epochwise training. For a given epoch, the recurrent network uses a temporal sequence of input-target response pairs and starts running from some initial state until it reaches a new state, at which point the training is stopped and the network is reset to an initial state for the next epoch.
2. Continuous training. This second method of training is suitable for situations where there are no reset states available or on-line learning is required. The distinguishing feature of continuous training is that the network learns while performing signal processing. Simply put, the learning process never stops.
Two different learning algorithms:
1. Back-propagation through time (BPTT), Section 15.7: epochwise, continuous, or combined.
2. Real-time recurrent learning (RTRL), Section 15.8: derived from the state-space model.
17

18 15.7 Back Propagation Through Time (1/3)
The back-propagation-through-time (BPTT) algorithm for training a recurrent network is an extension of the standard back-propagation algorithm. It may be derived by unfolding the temporal operation of the network into a layered feedforward network, the topology of which grows by one layer at every time step.
Figure 15.9 (a) Architectural graph of a two-neuron recurrent network N. (b) Signal-flow graph of the network N unfolded in time.
(c) 2017 Biointelligence Lab, SNU 18

19 15.7 Back Propagation Through Time (2/3)
Epochwise Back Propagation Through Time
E_total = (1/2) Σ_{n=n_0}^{n_1} Σ_{j∈A} e²_{j,n}
δ_{j,n} = -∂E_total/∂v_{j,n}
δ_{j,n} = φ'(v_{j,n}) e_{j,n}                                   for n = n_1
δ_{j,n} = φ'(v_{j,n}) [ e_{j,n} + Σ_{k∈A} w_{jk} δ_{k,n+1} ]    for n_0 < n < n_1
Δw_{ji} = -η ∂E_total/∂w_{ji} = η Σ_{n=n_0+1}^{n_1} δ_{j,n} x_{i,n-1}
(c) 2017 Biointelligence Lab, SNU 19
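The recursions above translate into a short forward pass followed by a backward pass through time. The following is a minimal numpy sketch of epochwise BPTT for the state-space network x_n = tanh(W_a x_{n-1} + W_b u_n), y_n = W_c x_n, a concrete choice made here for illustration; function and variable names are assumptions, not notation from the book.

import numpy as np

def bptt_epoch(W_a, W_b, W_c, u_seq, d_seq, eta=0.01):
    """One epoch of epochwise BPTT: full forward pass over the epoch,
    then the errors are propagated backward through the unfolded network."""
    q = W_a.shape[0]
    xs = [np.zeros(q)]                        # x_0: initial (reset) state
    es = []
    for u, d in zip(u_seq, d_seq):            # forward pass
        xs.append(np.tanh(W_a @ xs[-1] + W_b @ u))
        es.append(d - W_c @ xs[-1])           # e_n = d_n - y_n
    gW_a, gW_b, gW_c = np.zeros_like(W_a), np.zeros_like(W_b), np.zeros_like(W_c)
    carry = np.zeros(q)                       # W_a^T delta_{n+1}, flowing back from the future
    for n in range(len(u_seq), 0, -1):        # backward pass through time
        g = W_c.T @ es[n - 1] + carry         # -dE/dx_n
        delta = g * (1.0 - xs[n] ** 2)        # delta_{j,n} = -dE/dv_{j,n}, tanh derivative
        gW_a += np.outer(delta, xs[n - 1])
        gW_b += np.outer(delta, u_seq[n - 1])
        gW_c += np.outer(es[n - 1], xs[n])
        carry = W_a.T @ delta
    # Delta w_{ji} = eta * sum_n delta_{j,n} x_{i,n-1}
    return W_a + eta * gW_a, W_b + eta * gW_b, W_c + eta * gW_c

rng = np.random.default_rng(0)
W_a = 0.3 * rng.standard_normal((3, 3))
W_b = 0.3 * rng.standard_normal((3, 1))
W_c = 0.3 * rng.standard_normal((1, 3))
u_seq = [np.array([np.sin(0.3 * n)]) for n in range(50)]
d_seq = [np.array([np.sin(0.3 * (n + 1))]) for n in range(50)]   # one-step-ahead prediction
for epoch in range(200):
    W_a, W_b, W_c = bptt_epoch(W_a, W_b, W_c, u_seq, d_seq)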

20 15.7 Back Propagation Through Time (3/3)
Truncated Back Propagation Through Time
E_l = (1/2) Σ_{j∈A} e²_{j,l}
δ_{j,l} = -∂E_l/∂v_{j,l}   for all j ∈ A and n - h < l ≤ n
δ_{j,l} = φ'(v_{j,l}) e_{j,l}                           for l = n
δ_{j,l} = φ'(v_{j,l}) Σ_{k∈A} w_{jk,l} δ_{k,l+1}        for n - h < l < n
Δw_{ji,n} = η Σ_{l=n-h+1}^{n} δ_{j,l} x_{i,l-1}
The Ordered Derivative Approach
If a = φ(b, c), then F⁻_b = (∂φ/∂b) F⁻_a and F⁻_c = (∂φ/∂c) F⁻_a.
(c) 2017 Biointelligence Lab, SNU 20

21 15.8 Real-Time Recurrent Learning (1/5)
Real-time recurrent learning (RTRL): adjustments are made to the synaptic weights of a fully connected recurrent network in real time, that is, while the network continues to perform its signal-processing function.
x_{n+1} = [ φ(w_1^T ξ_n), ..., φ(w_j^T ξ_n), ..., φ(w_q^T ξ_n) ]^T
ξ_n = [ x_n ; u_n ]
w_j = [ w_{a,j} ; w_{b,j} ],  j = 1, 2, ..., q
Figure 15.10 Fully connected recurrent network for formulation of the RTRL algorithm; the feedback connections are all shown in blue.
(c) 2017 Biointelligence Lab, SNU 21

22 15.8 Real-Time Recurrent Learning (2/5)
Λ_{j,n} = ∂x_n/∂w_j,  j = 1, 2, ..., q
U_{j,n} = the matrix that has ξ_n^T in its j-th row and zeros in all other rows,  j = 1, 2, ..., q
Φ_n = diag( φ'(w_1^T ξ_n), ..., φ'(w_j^T ξ_n), ..., φ'(w_q^T ξ_n) )
Λ_{j,n+1} = Φ_n ( W_{a,n} Λ_{j,n} + U_{j,n} ),  j = 1, 2, ..., q
(c) 2017 Biointelligence Lab, SNU 22

23 15.8 Real-Time Recurrent Learning (3/5)
e_n = d_n - y_n = d_n - W_c x_n
E_n = (1/2) e_n^T e_n
∂E_n/∂w_j = (∂e_n/∂w_j)^T e_n = -(W_c ∂x_n/∂w_j)^T e_n = -(W_c Λ_{j,n})^T e_n,  j = 1, 2, ..., q
Δw_{j,n} = -η ∂E_n/∂w_j = η (W_c Λ_{j,n})^T e_n,  j = 1, 2, ..., q
Λ_{j,0} = 0 for all j
(c) 2017 Biointelligence Lab, SNU 23
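A minimal numpy sketch of the RTRL recursion above for the network x_{n+1} = φ(W ξ_n) with a fixed linear readout W_c; the sizes, the tanh choice for φ, the one-step-ahead prediction task, and all variable names are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
q, m, p = 3, 1, 1
W = 0.3 * rng.standard_normal((q, q + m))      # row j is w_j^T = [w_{a,j}^T, w_{b,j}^T]
W_c = 0.3 * rng.standard_normal((p, q))        # readout held fixed; RTRL adapts the w_j only
Lam = np.zeros((q, q, q + m))                  # Lambda_{j,0} = 0 for all j
x = np.zeros(q)
eta = 0.05

for n in range(200):
    u = np.array([np.sin(0.3 * n)])
    d = np.array([np.sin(0.3 * (n + 1))])      # target: one-step-ahead prediction
    xi = np.r_[x, u]                           # xi_n = [x_n; u_n]
    x_next = np.tanh(W @ xi)
    e = d - W_c @ x                            # e_n = d_n - W_c x_n
    Phi = np.diag(1.0 - x_next ** 2)           # phi'(v_n) for phi = tanh
    W_a = W[:, :q].copy()                      # W_{a,n}, before this step's weight update
    for j in range(q):
        # weight update: Delta w_{j,n} = eta * (W_c Lambda_{j,n})^T e_n
        W[j] += eta * (W_c @ Lam[j]).T @ e
        # sensitivity update: Lambda_{j,n+1} = Phi_n (W_{a,n} Lambda_{j,n} + U_{j,n})
        U_j = np.zeros((q, q + m)); U_j[j] = xi
        Lam[j] = Phi @ (W_a @ Lam[j] + U_j)
    x = x_next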

24 15.8 Real-Time Recurrent Learning (4/5) (c) 2017 Biointelligence Lab, SNU 24

25 15.8 Real-Time Recurrent Learning (5/5)
Teacher forcing, or the equation-error method, involves replacing the actual output of a neuron, during training of the network, with the corresponding desired response (i.e., target signal) in subsequent computation of the dynamic behavior of the network, whenever that desired response is available. Benefits: faster training and a corrective mechanism.
For φ(v_{j,n}) = tanh(v_{j,n}):
x_{j,n+1} = φ(v_{j,n}) = tanh(v_{j,n})
φ'(v_{j,n}) = ∂φ(v_{j,n})/∂v_{j,n} = sech²(v_{j,n}) = 1 - x²_{j,n+1}
Figure 15.11 Sensitivity graph of the fully recurrent network of Fig. 15.10. Note: the three nodes labeled ξ_{l,n} are all to be viewed as a single input. 25
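A minimal sketch of teacher forcing (a hypothetical helper, not from the book): whenever a desired response is available for a state component, that component of the fed-back state is overwritten with the target before the next state update.

import numpy as np

def forced_state(x, d, visible):
    """Teacher forcing: overwrite the visible (output) components of the state x
    with the desired response d before the state is fed back."""
    x = x.copy()
    x[visible] = d
    return x

# usage inside a training loop (sketch): x_fed_back = forced_state(x, d_n, visible=[0])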

26 15.9 Vanishing Gradients in Recurrent Networks (1/2)
The vanishing-gradients problem arises in the training of the network to produce a desired response at the current time that depends on input data in the distant past.
Robust information latching in a recurrent network is accomplished if the states of the network are contained in the reduced attracting set of a hyperbolic attractor.
Figure 15.12 Illustration of the vanishing-gradient problem: (a) State x_n resides in the basin of attraction, β, but outside the reduced attracting set γ. (b) State x_n resides inside the reduced attracting set γ.
(c) 2017 Biointelligence Lab, SNU 26

27 15.9 Vanishing Gradients in Recurrent Networks (2/2)
Long-Term Dependencies
E_total = (1/2) Σ_i (d_{i,n} - y_{i,n})²
Δw_n = -η ∂E_total/∂w
Δw_n = η Σ_i (d_{i,n} - y_{i,n}) ∂y_{i,n}/∂w = η Σ_i (d_{i,n} - y_{i,n}) (∂y_{i,n}/∂x_{i,n}) (∂x_{i,n}/∂w)
With the state evolving as x_{i,n+1} = φ_i(x_{i,n}, u_n), the term ∂x_{i,n}/∂w expands over time as a sum of Jacobian products:
Δw_n = η Σ_i (d_{i,n} - y_{i,n}) (∂y_{i,n}/∂x_{i,n}) Σ_{k=1}^{n} (∂x_{i,n}/∂x_{i,k}) (∂x_{i,k}/∂w_k)
where J_{x,n,k} = ∂x_{i,n}/∂x_{i,k}. For robust information latching, det(J_{x,n,k}) → 0 as n - k → ∞, for all n; as a result, either
o the network is not robust to the presence of noise in the input signal, or else
o the network is unable to discover long-term dependencies (i.e., relationships between target outputs and inputs that occur in the distant past).
(c) 2017 Biointelligence Lab, SNU 27
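A small numerical sketch of why the gradient vanishes: for the state-space network used earlier, the Jacobian ∂x_n/∂x_k is a product of per-step Jacobians, and its norm typically decays roughly geometrically in n - k when the recurrent weights are small; the weight scale 0.2 and the tanh nonlinearity are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
q = 5
W_a = 0.2 * rng.standard_normal((q, q))
W_b = 0.2 * rng.standard_normal((q, 1))

x = np.zeros(q)
J = np.eye(q)                        # running product: dx_n / dx_0
for n in range(1, 31):
    x = np.tanh(W_a @ x + W_b @ np.array([np.sin(0.3 * n)]))
    # per-step Jacobian dx_n/dx_{n-1} = diag(1 - x_n^2) W_a, chained into the product
    J = np.diag(1.0 - x ** 2) @ W_a @ J
    if n % 5 == 0:
        print(n, np.linalg.norm(J))  # the norm shrinks as n grows: the distant past is forgotten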

28 15.10 Supervised Training Framework for Recurrent Networks Using Nonlinear Sequential State Estimators (1/4)
Figure 15.13 Nonlinear state-space model depicting the underlying dynamics of a recurrent network undergoing supervised training.
System (state) model:  w_{n+1} = w_n + ω_n
Measurement model:  d_n = b(w_n, v_n, u_n) + ν_n
where w_n is the weight (state) vector, ω_n is the process noise, v_n denotes the recurrent node activities, u_n is the input, b(·,·,·) is the measurement function of the recurrent network, and ν_n is the measurement noise.
(c) 2017 Biointelligence Lab, SNU 28

29 15.10 Supervised Training Framework for Recurrent Networks Using Nonlinear Sequential State Estimators (2/4) Description of the Supervised-Training Framework using the Extended Kalman Filter The recurrent neural network, undergoing training, performs the role of the predictor; and the extended Kalman filter, providing the supervision, performs the role of the corrector. (c) 2017 Biointelligence Lab, SNU 29

30 15.10 Supervised Training Framework for Recurrent Networks Using Nonlinear Sequential State Estimators (3/4)
Description of the Supervised-Training Framework Using the Extended Kalman Filter
Innovation:  α_n = d_n - b(ŵ_{n|n-1}, v_n, u_n)
Weight update:  ŵ_{n|n} = ŵ_{n|n-1} + G_n α_n
Equivalently, with y_n denoting the network output computed at the predicted weights:
ŵ_{n|n} = ŵ_{n|n-1} + G_n (d_n - y_n)
where G_n is the Kalman gain.
(c) 2017 Biointelligence Lab, SNU 30
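A minimal sketch of one EKF weight-update step under the model above, assuming the linearized measurement matrix H_n (the Jacobian of the network output with respect to the weights, obtainable via BPTT or RTRL) is supplied by the caller; all names and the noise covariances are illustrative assumptions.

import numpy as np

def ekf_weight_update(w, P, d, y, H, R, Q):
    """One EKF training step.
    w : predicted weight estimate w_{n|n-1} (size W)
    P : predicted weight covariance P_{n|n-1} (W x W)
    d, y : desired response and network output (size p)
    H : linearized measurement matrix dy/dw at w (p x W), e.g. from BPTT or RTRL
    R, Q : measurement-noise and process-noise covariances."""
    S = H @ P @ H.T + R                 # innovation covariance
    G = P @ H.T @ np.linalg.inv(S)      # Kalman gain G_n
    alpha = d - y                       # innovation alpha_n = d_n - y_n
    w_new = w + G @ alpha               # w_{n|n} = w_{n|n-1} + G_n alpha_n
    P_new = P - G @ H @ P + Q           # measurement update, then add process noise
    return w_new, P_new

# usage (shapes only): w, P = ekf_weight_update(w, P, d, y, H, R=0.01*np.eye(p), Q=1e-4*np.eye(W))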

31 15.10 Supervised Training Framework for Recurrent Networks Using Nonlinear Sequential State Estimators (4/4)
Decoupled Extended Kalman Filter
Figure Block-diagonal representation of the filtering-error covariance matrix pertaining to the decoupled extended Kalman filter (DEKF). The shaded parts of the square represent nonzero values of the block submatrices P_{i,n|n}, where i = 1, 2, 3, 4 for the example illustrated in the figure.
As we make the number of disjoint weight groups, g, larger, more zeros are created in the covariance matrix P_{n|n}; in other words, the matrix P_{n|n} becomes more sparse. The computational burden is therefore reduced, but the numerical accuracy of the state estimation becomes degraded.
(c) 2017 Biointelligence Lab, SNU 31

32 15.11 Computer Experiment: Dynamic Reconstruction of Mackey-Glass Attractor
Figure Ensemble-averaged cumulative absolute error curves during the autonomous prediction phase of dynamic reconstruction of the Mackey-Glass attractor.
The Mackey-Glass attractor is generated by the delay-differential equation
dx_t/dt = -b x_t + a x_{t-Δτ} / (1 + x_{t-Δτ}^10)
Filters compared:
o Extended Kalman filter (EKF)
o Central-difference Kalman filter (CDKF)
o Cubature Kalman filter (CKF)
(c) 2017 Biointelligence Lab, SNU 32
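For reference, a minimal sketch that generates a Mackey-Glass series by Euler integration of the delay-differential equation above; the parameter values a = 0.2, b = 0.1, Δτ = 17 and the unit step size are common choices, assumed here rather than taken from the experiment in the book.

import numpy as np

def mackey_glass(n_samples=1000, a=0.2, b=0.1, tau=17, dt=1.0, x0=1.2):
    """Euler integration of dx/dt = -b*x(t) + a*x(t - tau) / (1 + x(t - tau)**10)."""
    delay = int(round(tau / dt))
    x = np.full(n_samples + delay, x0)       # constant initial history on [-tau, 0]
    for t in range(delay, n_samples + delay - 1):
        x_del = x[t - delay]
        x[t + 1] = x[t] + dt * (-b * x[t] + a * x_del / (1.0 + x_del ** 10))
    return x[delay:]

series = mackey_glass()
print(series[:5], series.min(), series.max())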

33 15.12 Adaptivity Considerations
Adaptive Critic
Figure Block diagram illustrating the use of an adaptive critic for the control of recurrent node activities v_n in a recurrent neural network (assumed to have a single output); the part of the figure involving the critic is shown in blue.
Consider a recurrent neural network embedded in a stochastic environment with relatively small variability in its statistical behavior. Provided that the underlying probability distribution of the environment is fully represented in the supervised-training sample supplied to the network, it is possible for the network to adapt to the relatively small statistical variations in the environment without any further on-line adjustments being made to the synaptic weights of the network.
(c) 2017 Biointelligence Lab, SNU 33

34 15.13 Case Study: Model Reference Applied to Neurocontrol
Figure Model-reference adaptive control system; the feedback loop of the system is printed in blue.
Cost function:
J(w, θ_k) = (1/T) Σ_{n=1}^{T} Σ_i ( y_{i,r}(n) - y_i(n, w, θ_k) )²
(c) 2017 Biointelligence Lab, SNU 34

35 Summary and Discussion (1/2)
Four main recurrent network models with global feedback:
o Nonlinear autoregressive networks with exogenous inputs (NARX networks), which use feedback from the output layer to the input layer
o Fully connected recurrent networks, which use feedback from the hidden layer to the input layer
o Recurrent multilayer perceptrons with more than one hidden layer, which use feedback from the output of each computation layer to its own input
o Second-order recurrent networks, which use second-order neurons
Properties of recurrent neural networks:
o They are universal approximators of nonlinear dynamic systems, provided that they are equipped with an adequate number of hidden neurons.
o They are locally controllable and observable, provided that their linearized versions satisfy certain conditions around the equilibrium point.
o Given any finite-state machine, we can build a recurrent neural network which, regarded as a black-box machine, will behave like that finite-state machine.
o Recurrent neural networks exhibit a meta-learning (i.e., learning to learn) capability.
(c) 2017 Biointelligence Lab, SNU 35

36 Summary and Discussion (2/2)
Gradient-based learning algorithms:
o Back propagation through time (BPTT): off-line learning
o Real-time recurrent learning (RTRL): on-line learning
Supervised-learning algorithms based on nonlinear sequential state estimation:
o Extended Kalman filter (EKF), with the linearization of the measurement model pertaining to the recurrent neural network obtained by using the BPTT or RTRL algorithm.
o Derivative-free nonlinear sequential state estimators (CKF / CDKF). In so doing, not only is the applicability of this approach to supervised learning broadened, but numerical accuracy is also improved (at the price of increased computational requirements).
(c) 2017 Biointelligence Lab, SNU 36
