Chapter 15. Dynamically Driven Recurrent Networks
Neural Networks and Learning Machines (Haykin)
Lecture Notes on Self-learning Neural Algorithms
Byoung-Tak Zhang
School of Computer Science and Engineering
Seoul National University
Version

Contents
15.1 Introduction
15.2 Recurrent Network Architectures
15.3 Universal Approximation Theorem
15.5 Computational Power of Recurrent Networks
15.6 Learning Algorithms
15.7 Back Propagation Through Time
15.8 Real-Time Recurrent Learning
15.9 Vanishing Gradients in Recurrent Networks
15.10 Supervised Training Framework for Recurrent Networks
15.11 Computer Experiment: Mackey-Glass Attractor
15.12 Adaptivity Considerations
15.13 Case Study: Model Reference Applied to Neurocontrol
Summary and Discussion
(c) 2017 Biointelligence Lab, SNU

15.1 Introduction
Global feedback is a facilitator of computational intelligence.
In previous chapters we studied how the use of global feedback in a recurrent network makes it possible to achieve some useful tasks:
- Content-addressable memory
- Autoassociation
- Dynamic reconstruction of a chaotic process
In this chapter, we study the other important applications of recurrent networks:
- Input-output mapping, the study of which naturally benefits from Chapter 14 on sequential state estimation
- Applying the feedback from the output layer to the input of the hidden layer
- Combining all possible feedback loops in a single recurrent network structure
- Other configurations as the building blocks for the construction of recurrent networks
Recurrent networks have a very rich repertoire of architectural layouts, which makes them all the more powerful in computational terms.
A recurrent network responds temporally to an externally applied input signal. We may therefore speak of the recurrent networks considered in this chapter as dynamically driven recurrent networks.

15.2 Recurrent Network Architectures (1/8)
Four specific network architectures:
1) Input-Output Recurrent Model
2) State-Space Model
3) Recurrent Multilayer Perceptrons
4) Second-Order Network
They all incorporate a static multilayer perceptron or parts thereof.
They all exploit the nonlinear mapping capability of the multilayer perceptron.

15.2 Recurrent Network Architectures (2/8)
1) Input-Output Recurrent Model
1. The model has a single input that is applied to a tapped-delay-line memory of q units.
2. It has a single output that is fed back to the input via another tapped-delay-line memory, also of q units.
3. The present value of the model input is denoted by $u_n$, and the corresponding value of the model output is denoted by $y_n$.
The dynamic behavior of the nonlinear autoregressive with exogenous inputs (NARX) model is described by
$$y_{n+1} = F(y_n, \ldots, y_{n-q+1};\; u_n, \ldots, u_{n-q+1})$$
where F is a nonlinear function of its arguments.
Figure 15.1 Nonlinear autoregressive with exogenous inputs (NARX) model; the feedback part of the network is shown in blue.

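The NARX recursion above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function names `narx_step` and `run_narx`, the zero initial conditions, and the tiny random MLP standing in for F are all hypothetical and do not come from the slides.

```python
import numpy as np

def narx_step(F, y_hist, u_hist):
    """One NARX update: y_{n+1} = F(y_n..y_{n-q+1}; u_n..u_{n-q+1})."""
    return F(np.concatenate([y_hist, u_hist]))

def run_narx(F, u, q, y0=0.0):
    """Drive the model with the input sequence u; both delay lines hold q samples."""
    y_hist = np.full(q, y0)   # [y_n, y_{n-1}, ..., y_{n-q+1}]
    u_hist = np.zeros(q)      # [u_n, u_{n-1}, ..., u_{n-q+1}]
    outputs = []
    for u_n in u:
        u_hist = np.concatenate([[u_n], u_hist[:-1]])     # shift in the new input
        y_next = narx_step(F, y_hist, u_hist)
        y_hist = np.concatenate([[y_next], y_hist[:-1]])  # feed the output back
        outputs.append(y_next)
    return np.array(outputs)

# Hypothetical F: a small static MLP with fixed random weights (assumption).
rng = np.random.default_rng(0)
q = 3
W1 = rng.normal(scale=0.5, size=(4, 2 * q))
w2 = rng.normal(scale=0.5, size=4)
F = lambda z: float(w2 @ np.tanh(W1 @ z))
y = run_narx(F, np.sin(0.3 * np.arange(20)), q)
```

Any callable can play the role of F here; in the slides it is the static multilayer perceptron at the core of the model.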
15.2 Recurrent Network Architectures (3/8)
2) State-Space Model
Figure 15.2 State-space model; the feedback part of the model is shown in blue.
Figure 15.3 Simple recurrent network (SRN); the feedback part of the network is shown in blue.

15.2 Recurrent Network Architectures (4/8)
2) State-Space Model
1. This is a state-space model, the basic idea of which was discussed in Chapter 14.
2. The output is fed back to the input layer via a bank of unit-time delays.
3. The input layer consists of a concatenation of feedback nodes and source nodes.
$$x_{n+1} = a(x_n, u_n)$$
$$y_n = B x_n$$
4. Elman's network (Fig. 15.3) contains recurrent connections from the hidden neurons to a layer of context units consisting of unit-time delays. These context units store the outputs of the hidden neurons for one time-step and then feed them back to the input layer.

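As a rough sketch of the state-space equations, the loop below simulates a simple recurrent network in NumPy. The choice of tanh for the map a, the split of its weights into `Wa` and `Wb`, the function name `srn_run`, and the random weight values are assumptions made for illustration.

```python
import numpy as np

def srn_run(Wa, Wb, B, u_seq, x0):
    """Simulate x_{n+1} = tanh(Wa x_n + Wb u_n), y_n = B x_n."""
    x = np.asarray(x0, dtype=float)
    outputs = []
    for u in u_seq:
        outputs.append(B @ x)               # linear readout of the current state
        context = x                         # context units hold x_n for one time-step
        x = np.tanh(Wa @ context + Wb @ u)  # hidden layer sees context plus source nodes
    return np.array(outputs), x

rng = np.random.default_rng(0)
q, m, p = 4, 2, 1                           # state, input, and output sizes (arbitrary)
Wa = rng.normal(scale=0.5, size=(q, q))
Wb = rng.normal(scale=0.5, size=(q, m))
B = rng.normal(size=(p, q))
ys, x_final = srn_run(Wa, Wb, B, rng.normal(size=(10, m)), np.zeros(q))
```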
15.2 Recurrent Network Architectures (5/8)
3) Recurrent Multilayer Perceptrons
Figure 15.4 Recurrent multilayer perceptron; feedback paths in the network are printed in blue.

15.2 Recurrent Network Architectures (6/8)
3) Recurrent Multilayer Perceptrons
1. A recurrent multilayer perceptron (RMLP) has one or more hidden layers, basically for the same reasons that static multilayer perceptrons with several hidden layers are often more effective and parsimonious than those using a single hidden layer.
2. Each computation layer of an RMLP has feedback around it, as illustrated in Fig. 15.4 for the case of an RMLP with two hidden layers.
$$x_{I,n+1} = \boldsymbol{\phi}_I(x_{I,n}, u_n)$$
$$x_{II,n+1} = \boldsymbol{\phi}_{II}(x_{II,n}, x_{I,n+1})$$
$$\vdots$$
$$x_{o,n+1} = \boldsymbol{\phi}_o(x_{o,n}, x_{K,n+1})$$

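The layer-by-layer recursion can be illustrated with a short NumPy sketch. It assumes each map is a tanh of a weighted sum with separate feedback and feedforward matrices; the name `rmlp_step` and the `(W_rec, W_in)` pairing are hypothetical.

```python
import numpy as np

def rmlp_step(layers, states, u):
    """Advance every computation layer of an RMLP by one time-step.

    layers: list of (W_rec, W_in) pairs, ordered from the input side to the output.
    states: list of current layer states x_{I,n}, x_{II,n}, ..., x_{o,n}.
    """
    new_states = []
    layer_input = u
    for (W_rec, W_in), x in zip(layers, states):
        x_new = np.tanh(W_rec @ x + W_in @ layer_input)  # local feedback plus input from below
        new_states.append(x_new)
        layer_input = x_new   # the next layer receives this layer's state at time n+1
    return new_states

# Two layers with zero feedback and identity feedforward weights (toy values).
layers = [(np.zeros((2, 2)), np.eye(2)), (np.zeros((2, 2)), np.eye(2))]
states = [np.zeros(2), np.zeros(2)]
u = np.array([0.5, -1.0])
states = rmlp_step(layers, states, u)
```

Note that the second layer is driven by the first layer's updated state, matching the $x_{I,n+1}$ argument in the second equation.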
15.2 Recurrent Network Architectures (7/8)
4) Second-Order Network
Figure 15.5 Second-order recurrent network; bias connections to the neurons are omitted to simplify the presentation. The network has 2 inputs and 3 state neurons, hence the need for 3 × 2 = 6 multipliers. The feedback links in the figure are printed in blue to emphasize their global role.

15.2 Recurrent Network Architectures (8/8)
4) Second-Order Network
First-order neuron:
$$v_k = \sum_j w_{a,kj}\, x_j + \sum_i w_{b,ki}\, u_i$$
Second-order neuron:
$$v_k = \sum_i \sum_j w_{kij}\, x_i u_j$$
Second-order recurrent networks:
$$v_{k,n} = b_k + \sum_i \sum_j w_{kij}\, x_{i,n} u_{j,n}$$
$$x_{k,n+1} = \varphi(v_{k,n}) = \frac{1}{1 + \exp(-v_{k,n})}$$
State transition:
$$\delta(x_i, u_j) = x_k$$

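A minimal sketch of the second-order update, assuming the weights are stored as a three-index array `W[k, i, j]`; the function name and the random values are illustrative only. The sizes match Figure 15.5 (3 state neurons, 2 inputs).

```python
import numpy as np

def second_order_step(W, b, x, u):
    """x_{k,n+1} = logistic(b_k + sum_i sum_j W[k,i,j] * x_i * u_j)."""
    v = b + np.einsum('kij,i,j->k', W, x, u)   # each product x_i * u_j is one multiplier
    return 1.0 / (1.0 + np.exp(-v))

rng = np.random.default_rng(0)
n_state, n_input = 3, 2
W = rng.normal(size=(n_state, n_state, n_input))
b = rng.normal(size=n_state)
x = rng.uniform(size=n_state)
u = np.array([1.0, 0.0])                       # e.g. a one-hot encoded input symbol
x_next = second_order_step(W, b, x, u)
```

With one-hot inputs, each input symbol effectively selects its own state-to-state weight matrix, which is why this form suits the state-transition rule above.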
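15.3 Universal Approximation Theorem (1/2)
Any nonlinear dynamic system may be approximated by a recurrent neural network to any desired degree of accuracy and with no restrictions imposed on the compactness of the state space, provided that the network is equipped with an adequate number of hidden neurons.
$$x_{n+1} = \boldsymbol{\phi}(W_a x_n + W_b u_n)$$
$$y_n = W_c x_n$$
The map $\boldsymbol{\phi}$ applies the activation function $\varphi$ to each element of its argument:
$$\boldsymbol{\phi}: \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_q \end{bmatrix} \mapsto \begin{bmatrix} \varphi(x_1) \\ \varphi(x_2) \\ \vdots \\ \varphi(x_q) \end{bmatrix}$$
Two choices of activation function:
$$\varphi(x) = \tanh(x) = \frac{1 - e^{-2x}}{1 + e^{-2x}}$$
$$\varphi(x) = \frac{1}{1 + e^{-x}}$$
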
15.3 Universal Approximation Theorem (2/2)
Example 1: Fully Connected Recurrent Network
m = 2, q = 3, and p = 1
$$W_a = \begin{bmatrix} w_{11} & w_{12} & w_{13} \\ w_{21} & w_{22} & w_{23} \\ w_{31} & w_{32} & w_{33} \end{bmatrix}$$
$$W_b = \begin{bmatrix} b_1 & w_{14} & w_{15} \\ b_2 & w_{24} & w_{25} \\ b_3 & w_{34} & w_{35} \end{bmatrix}$$
$$W_c = \begin{bmatrix} 1 & 0 & 0 \end{bmatrix}$$
Figure 15.6 Fully connected recurrent network with two inputs, two hidden neurons, and one output neuron. The feedback connections are shown in blue to emphasize their global role.

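To make the roles of the three matrices concrete, here is a small numerical sketch of Example 1. The weight values are arbitrary random numbers, tanh is chosen as the activation, and the fixed +1 input that multiplies the bias column of $W_b$ is an assumption about how the biases enter; none of these specifics come from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
Wa = rng.normal(scale=0.5, size=(3, 3))   # q x q feedback weights w_11 .. w_33
Wb = rng.normal(scale=0.5, size=(3, 3))   # q x (m+1): bias column b_k, then w_k4, w_k5
Wc = np.array([[1.0, 0.0, 0.0]])          # p x q: the output reads the first state only

def example_step(x, u):
    """Return (x_{n+1}, y_n) for the m = 2, q = 3, p = 1 network."""
    u_aug = np.concatenate([[1.0], u])    # assumed fixed +1 input for the bias column
    y = Wc @ x
    x_next = np.tanh(Wa @ x + Wb @ u_aug)
    return x_next, y

x = np.array([0.3, -0.2, 0.1])
x_next, y = example_step(x, np.array([0.5, -1.0]))
```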
15.5 Computational Power of Recurrent Networks (1/3)
Every finite-state machine is equivalent to, and can be simulated by, some neural net. That is, given any finite-state machine M, we can build a certain neural net $N^M$ which, regarded as a black-box machine, will behave precisely like M!
Theorem I (Siegelmann and Sontag, 1991): All Turing machines may be simulated by fully connected recurrent networks built on neurons with sigmoidal activation functions.
Three functional blocks of a Turing machine:
- a control unit, which can assume any one of a finite number of possible states
- a linear tape, assumed to be infinitely long in both directions, which is marked off into discrete squares, where each square is available to store a single symbol taken from a finite set of symbols
- a read-write head, which moves along the tape and transmits information to and from the control unit
Figure 15.7 Turing machine.

15.5 Computational Power of Recurrent Networks (2/3)
Figure 15.8 Illustration of Theorems I and II, and corollary to them.

15.5 Computational Power of Recurrent Networks (3/3)
Theorem II (Siegelmann et al., 1997): NARX networks with one layer of hidden neurons with bounded, one-sided saturated (BOSS) activation functions and a linear output neuron can simulate fully connected recurrent networks with bounded, one-sided saturated activation functions, except for a linear slowdown.
Three conditions on the activation function:
1. The function $\varphi(\cdot)$ has a bounded range; that is, $a \le \varphi(x) \le b$, $a \ne b$, for all $x \in \mathbb{R}$.
2. The function $\varphi(\cdot)$ is saturated on the left side; that is, there exist values s and S such that $\varphi(x) = S$ for all $x \le s$.
3. The function $\varphi(\cdot)$ is nonconstant; that is, $\varphi(x_1) \ne \varphi(x_2)$ for some $x_1$ and $x_2$.
BOSS function:
$$\varphi(x) = \begin{cases} \dfrac{1}{1 + \exp(-x)} & \text{for } x > s \\ 0 & \text{for } x \le s \end{cases}$$

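The piecewise BOSS function above can be written directly in NumPy. The threshold `s` is left as a parameter because the slide gives no value; the default of 0 is an arbitrary choice for the demonstration.

```python
import numpy as np

def boss(x, s=0.0):
    """Logistic for x > s, exactly 0 (left-saturated, with S = 0) for x <= s."""
    x = np.asarray(x, dtype=float)
    return np.where(x > s, 1.0 / (1.0 + np.exp(-x)), 0.0)

xs = np.linspace(-5.0, 5.0, 11)
values = boss(xs, s=0.0)
```

The three conditions can be read off from `values`: the range stays within [0, 1), every entry with x <= s equals 0, and the entries are not all equal.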
15.6 Learning Algorithms
Two modes of training a recurrent network:
1. Epochwise training. For a given epoch, the recurrent network uses a temporal sequence of input-target response pairs and starts running from some initial state until it reaches a new state, at which point the training is stopped and the network is reset to an initial state for the next epoch.
2. Continuous training. This second method of training is suitable for situations where there are no reset states available or on-line learning is required. The distinguishing feature of continuous training is that the network learns while performing signal processing. Simply put, the learning process never stops.
Two different learning algorithms:
1. Back-propagation through time (BPTT), Section 15.7: epochwise, continuous, or combined
2. Real-time recurrent learning (RTRL), Section 15.8: derived from the state-space model

15.7 Back Propagation Through Time (1/3)
The back-propagation-through-time (BPTT) algorithm for training a recurrent network is an extension of the standard back-propagation algorithm. It may be derived by unfolding the temporal operation of the network into a layered feedforward network, the topology of which grows by one layer at every time-step.
Figure 15.9 (a) Architectural graph of a two-neuron recurrent network N. (b) Signal-flow graph of the network N unfolded in time.

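15.7 Back Propagation Through Time (2/3)
Epochwise Back Propagation Through Time
$$E_{\text{total}} = \frac{1}{2} \sum_{n=n_0}^{n_1} \sum_{j \in A} e_{j,n}^2$$
$$\delta_{j,n} = -\frac{\partial E_{\text{total}}}{\partial v_{j,n}}$$
$$\delta_{j,n} = \begin{cases} \varphi'(v_{j,n})\, e_{j,n} & \text{for } n = n_1 \\ \varphi'(v_{j,n}) \left[ e_{j,n} + \sum_{k \in A} w_{jk}\, \delta_{k,n+1} \right] & \text{for } n_0 < n < n_1 \end{cases}$$
$$\Delta w_{ji} = -\eta \frac{\partial E_{\text{total}}}{\partial w_{ji}} = \eta \sum_{n=n_0+1}^{n_1} \delta_{j,n}\, x_{i,n-1}$$
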
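The backward recursion for the local gradients can be checked numerically on a toy network. The sketch below assumes a fully recurrent tanh network with no external input, $x_n = \tanh(W x_{n-1})$, in which every neuron has a desired response; these simplifications and the function names are illustrative. In the code `W[k, j]` is the weight from neuron j to neuron k, so the sum over k appears as `W.T @ delta_next`.

```python
import numpy as np

def forward(W, x0, n_steps):
    """Return [x_0, x_1, ..., x_N] with x_n = tanh(W x_{n-1})."""
    xs = [x0]
    for _ in range(n_steps):
        xs.append(np.tanh(W @ xs[-1]))
    return xs

def total_error(W, x0, targets):
    """E_total = 0.5 * sum_n sum_j e_{j,n}^2 with e_n = d_n - x_n."""
    xs = forward(W, x0, len(targets))
    return 0.5 * sum(np.sum((d - x) ** 2) for d, x in zip(targets, xs[1:]))

def bptt_gradient(W, x0, targets):
    """Gradient dE_total/dW computed by epochwise BPTT."""
    N = len(targets)
    xs = forward(W, x0, N)
    grad = np.zeros_like(W)
    delta_next = np.zeros(len(x0))              # no contribution from beyond n_1
    for n in range(N, 0, -1):                   # sweep backward through the epoch
        e = targets[n - 1] - xs[n]
        dphi = 1.0 - xs[n] ** 2                 # tanh'(v_n) expressed through x_n
        delta = dphi * (e + W.T @ delta_next)   # local gradient delta_n
        grad -= np.outer(delta, xs[n - 1])      # dE/dw_ji = -sum_n delta_{j,n} x_{i,n-1}
        delta_next = delta
    return grad

rng = np.random.default_rng(1)
W = rng.normal(scale=0.5, size=(3, 3))
x0 = rng.normal(size=3)
targets = [rng.uniform(-0.5, 0.5, size=3) for _ in range(5)]
eta = 0.1
delta_W = -eta * bptt_gradient(W, x0, targets)  # weight adjustment for the epoch
```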
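15.7 Back Propagation Through Time (3/3)
Truncated Back Propagation Through Time
$$E_l = \frac{1}{2} \sum_{j \in A} e_{j,l}^2$$
$$\delta_{j,l} = -\frac{\partial E_l}{\partial v_{j,l}} \quad \text{for all } j \in A \text{ and } n - h < l \le n$$
$$\delta_{j,l} = \begin{cases} \varphi'(v_{j,l})\, e_{j,l} & \text{for } l = n \\ \varphi'(v_{j,l}) \sum_{k \in A} w_{jk,l}\, \delta_{k,l+1} & \text{for } n - h < l < n \end{cases}$$
$$\Delta w_{ji,n} = \eta \sum_{l=n-h+1}^{n} \delta_{j,l}\, x_{i,l-1}$$
The Ordered Derivative Approach
Let $F^{l}_{-x}$ denote the ordered derivative of the network output at node l with respect to x. If $a = \varphi(b, c)$, then
$$F^{l}_{-b} = \frac{\partial \varphi}{\partial b} F^{l}_{-a} \quad \text{and} \quad F^{l}_{-c} = \frac{\partial \varphi}{\partial c} F^{l}_{-a}$$
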
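A hedged sketch of the truncated update follows, again for a toy tanh network $x_l = \tanh(W x_{l-1})$ without external input. Only the error at the current step n is injected, and it is propagated back through the h most recent stored states. Using the current weights for every past step (rather than time-indexed weights $w_{jk,l}$) is a simplification, and the function name is hypothetical.

```python
import numpy as np

def truncated_bptt_delta_w(W, xs, d_n, eta):
    """Weight adjustment from the h+1 most recent states xs = [x_{n-h}, ..., x_n]."""
    h = len(xs) - 1
    delta = (1.0 - xs[h] ** 2) * (d_n - xs[h])        # l = n: phi'(v) * e
    dW = eta * np.outer(delta, xs[h - 1])
    for l in range(h - 1, 0, -1):                     # n-h < l < n
        delta = (1.0 - xs[l] ** 2) * (W.T @ delta)    # no error term at earlier steps
        dW += eta * np.outer(delta, xs[l - 1])
    return dW

rng = np.random.default_rng(2)
W = rng.normal(scale=0.5, size=(3, 3))
xs = [rng.normal(size=3)]
for _ in range(4):                                    # store h = 4 past steps
    xs.append(np.tanh(W @ xs[-1]))
d_n = rng.uniform(-0.5, 0.5, size=3)
dW = truncated_bptt_delta_w(W, xs, d_n, eta=1.0)
```

The truncation depth h trades memory and computation against how far back in time credit is assigned.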
15.8 Real-Time Recurrent Learning (1/5)
Real-time recurrent learning (RTRL): adjustments are made to the synaptic weights of a fully connected recurrent network in real time, that is, while the network continues to perform its signal-processing function.
$$x_{n+1} = \begin{bmatrix} \varphi(w_1^T \xi_n) \\ \vdots \\ \varphi(w_j^T \xi_n) \\ \vdots \\ \varphi(w_q^T \xi_n) \end{bmatrix}, \qquad \xi_n = \begin{bmatrix} x_n \\ u_n \end{bmatrix}$$
$$w_j = \begin{bmatrix} w_{a,j} \\ w_{b,j} \end{bmatrix}, \quad j = 1, 2, \ldots, q$$
Figure: Fully connected recurrent network for formulation of the RTRL algorithm; the feedback connections are all shown in blue.

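15.8 Real-Time Recurrent Learning (2/5)
$$\Lambda_{j,n} = \frac{\partial x_n}{\partial w_j}, \quad j = 1, 2, \ldots, q$$
$$U_{j,n} = \begin{bmatrix} 0 \\ \xi_n^T \\ 0 \end{bmatrix} \leftarrow j\text{th row}, \quad j = 1, 2, \ldots, q$$
$$\Phi_n = \mathrm{diag}\left( \varphi'(w_1^T \xi_n), \ldots, \varphi'(w_j^T \xi_n), \ldots, \varphi'(w_q^T \xi_n) \right)$$
$$\Lambda_{j,n+1} = \Phi_n \left( W_{a,n} \Lambda_{j,n} + U_{j,n} \right), \quad j = 1, 2, \ldots, q$$
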
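15.8 Real-Time Recurrent Learning (3/5)
$$e_n = d_n - y_n = d_n - W_c x_n$$
$$E_n = \frac{1}{2} e_n^T e_n$$
$$\frac{\partial E_n}{\partial w_j} = \left( \frac{\partial e_n}{\partial w_j} \right) e_n = -W_c \left( \frac{\partial x_n}{\partial w_j} \right) e_n = -W_c \Lambda_{j,n} e_n, \quad j = 1, 2, \ldots, q$$
$$\Delta w_{j,n} = -\eta \frac{\partial E_n}{\partial w_j} = \eta\, W_c \Lambda_{j,n} e_n, \quad j = 1, 2, \ldots, q$$
$$\Lambda_{j,0} = 0 \quad \text{for all } j$$
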
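The RTRL recursion and weight update can be sketched as a single time-step function. Assumptions: tanh activation, the weight matrix `W` stores the vectors $w_j^T$ as rows (so `W[:, :q]` plays the role of $W_a$), and the sensitivities are kept in an array `Lam` with `Lam[j]` equal to $\Lambda_j$. The update is written in the transposed form $(W_c \Lambda_{j,n})^T e_n$ so that the array shapes conform; the names are hypothetical.

```python
import numpy as np

def rtrl_step(W, Wc, Lam, x, u, d, eta):
    """One RTRL time-step; returns updated weights, next state, next sensitivities, error."""
    q, m = x.size, u.size
    xi = np.concatenate([x, u])                        # xi_n = [x_n; u_n]
    e = d - Wc @ x                                     # e_n = d_n - W_c x_n
    dW = np.stack([eta * (Wc @ Lam[j]).T @ e for j in range(q)])  # row j is Delta w_j
    x_next = np.tanh(W @ xi)                           # x_{n+1}
    Phi = np.diag(1.0 - x_next ** 2)                   # tanh'(w_j^T xi_n) on the diagonal
    Wa = W[:, :q]
    Lam_next = np.empty_like(Lam)
    for j in range(q):
        U = np.zeros((q, q + m))
        U[j] = xi                                      # xi_n^T in the j-th row
        Lam_next[j] = Phi @ (Wa @ Lam[j] + U)
    return W + dW, x_next, Lam_next, e

rng = np.random.default_rng(0)
q, m = 3, 2
W0 = rng.normal(scale=0.5, size=(q, q + m))
Wc = np.array([[1.0, 0.0, 0.0]])
us = rng.normal(size=(4, m))
W, x, Lam = W0.copy(), np.zeros(q), np.zeros((q, q, q + m))   # Lambda_{j,0} = 0
for u in us:
    W, x, Lam, e = rtrl_step(W, Wc, Lam, x, u, d=np.array([0.5]), eta=0.05)
```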
15.8 Real-Time Recurrent Learning (4/5)

15.8 Real-Time Recurrent Learning (5/5)
Teacher forcing, or the equation-error method, involves replacing the actual output of a neuron, during training of the network, with the corresponding desired response (i.e., target signal) in subsequent computation of the dynamic behavior of the network, whenever that desired response is available.
Benefits: faster training and a corrective mechanism.
$$x_{j,n+1} = \varphi(v_{j,n}) = \tanh(v_{j,n})$$
and
$$\varphi'(v_{j,n}) = \frac{\partial \varphi(v_{j,n})}{\partial v_{j,n}} = \mathrm{sech}^2(v_{j,n}) = 1 - x_{j,n+1}^2$$
Figure: Sensitivity graph of the fully recurrent network. Note: The three nodes, labeled $\xi_{l,n}$, are all to be viewed as a single input.

15.9 Vanishing Gradients in Recurrent Networks (1/2)
The vanishing-gradients problem arises in the training of the network to produce a desired response at the current time that depends on input data in the distant past.
Robust information latching in a recurrent network is accomplished if the states of the network are contained in the reduced attracting set of a hyperbolic attractor.
Figure: Illustration of the vanishing-gradient problem: (a) State $x_n$ resides in the basin of attraction, β, but outside the reduced attracting set γ. (b) State $x_n$ resides inside the reduced attracting set γ.

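15.9 Vanishing Gradients in Recurrent Networks (2/2)
Long-Term Dependencies
$$\Delta w_n = -\eta \frac{\partial E_{\text{total}}}{\partial w} = \eta \sum_i (d_{i,n} - y_{i,n}) \frac{\partial y_{i,n}}{\partial w} = \eta \sum_i (d_{i,n} - y_{i,n}) \left( \frac{\partial y_{i,n}}{\partial x_{i,n}} \frac{\partial x_{i,n}}{\partial w} \right)$$
$$E_{\text{total}} = \frac{1}{2} \sum_i \left\| d_{i,n} - y_{i,n} \right\|^2$$
$$x_{i,n} = \boldsymbol{\phi}(x_{i,k}, u_n), \qquad J_{x,n,k} = \frac{\partial x_{i,n}}{\partial x_{i,k}}$$
$$\Delta w_n = \eta \sum_i (d_{i,n} - y_{i,n}) \left( \frac{\partial y_{i,n}}{\partial x_{i,n}} \left( \sum_{k=1}^{n} \frac{\partial x_{i,n}}{\partial x_{i,k}} \frac{\partial x_{i,k}}{\partial w} \right) \right)$$
$$\det(J_{x,n,k}) \to 0 \quad \text{as } k \to \infty \text{ for all } n$$
Either the network is not robust to the presence of noise in the input signal, or else it is unable to discover long-term dependencies (i.e., relationships between target outputs and inputs that occur in the distant past).
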
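The shrinking Jacobian can be observed numerically. The sketch below assumes an autonomous tanh network $x_{n+1} = \tanh(W x_n)$ whose weight matrix is rescaled to spectral norm 0.9 (a contractive setting chosen purely for illustration) and accumulates the product of one-step Jacobians over 50 steps.

```python
import numpy as np

rng = np.random.default_rng(0)
q = 8
W = rng.normal(size=(q, q))
W *= 0.9 / np.linalg.norm(W, 2)           # rescale so the largest singular value is 0.9

x = rng.normal(size=q)
J = np.eye(q)                             # accumulated Jacobian d x_n / d x_0
norms = []
for k in range(50):
    x = np.tanh(W @ x)
    J = np.diag(1.0 - x ** 2) @ W @ J     # chain rule: one-step Jacobian times the past
    norms.append(np.linalg.norm(J, 2))
det_J = np.linalg.det(J)
```

Because the derivative of tanh never exceeds 1, each factor has norm at most 0.9, so the norm of the product is bounded by 0.9 raised to the number of steps and the determinant collapses toward zero. Gradient contributions from distant inputs are scaled by exactly this product.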
15.10 Supervised Training Framework for Recurrent Networks Using Nonlinear Sequential State Estimators (1/4)
Figure: Nonlinear state-space model depicting the underlying dynamics of a recurrent network undergoing supervised training.
$$w_{n+1} = w_n + \omega_n$$
$$d_n = b(w_n, v_n, u_n) + \nu_n$$

15.10 Supervised Training Framework for Recurrent Networks Using Nonlinear Sequential State Estimators (2/4)
Description of the Supervised-Training Framework using the Extended Kalman Filter
The recurrent neural network, undergoing training, performs the role of the predictor; and the extended Kalman filter, providing the supervision, performs the role of the corrector.

15.10 Supervised Training Framework for Recurrent Networks Using Nonlinear Sequential State Estimators (3/4)
Description of the Supervised-Training Framework using the Extended Kalman Filter
$$\alpha_n = d_n - b(\hat{w}_{n|n-1}, v_n, u_n)$$
$$\hat{w}_{n|n} = \hat{w}_{n|n-1} + G_n \alpha_n$$
$$\hat{w}_{n|n} = \hat{w}_{n|n-1} + G_n (d_n - y_n)$$

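The corrector step can be illustrated with a deliberately small example. To keep the sketch short, the measurement model b is a single tanh neuron with no recurrent activities $v_n$, which is a simplification and not the recurrent network of the slides. The gain and covariance expressions are the standard extended Kalman filter ones; the names `ekf_weight_update`, `R`, and `Q`, and all numerical values, are assumptions.

```python
import numpy as np

def ekf_weight_update(w, P, u, d, R, Q):
    """One EKF step for the weights of y = tanh(w . u)."""
    y = np.tanh(w @ u)                    # predicted output b(w, u)
    B = ((1.0 - y ** 2) * u)[None, :]     # 1 x k Jacobian of b with respect to w
    S = B @ P @ B.T + R                   # innovation variance (1 x 1)
    G = P @ B.T / S                       # Kalman gain G_n (k x 1)
    alpha = d - y                         # innovation alpha_n
    w_new = w + (G * alpha).ravel()       # corrected weight estimate
    P_new = P - G @ B @ P + Q             # covariance update plus process noise
    return w_new, P_new

rng = np.random.default_rng(0)
w_true = np.array([0.5, -0.3])            # weights generating the desired response
w, P = np.zeros(2), np.eye(2)
for _ in range(200):
    u = rng.uniform(-1.0, 1.0, size=2)
    d = np.tanh(w_true @ u)
    w, P = ekf_weight_update(w, P, u, d, R=0.01, Q=1e-6 * np.eye(2))
```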
15.10 Supervised Training Framework for Recurrent Networks Using Nonlinear Sequential State Estimators (4/4)
Decoupled Extended Kalman Filter
Figure: Block-diagonal representation of the filtering-error covariance matrix pertaining to the decoupled extended Kalman filter (DEKF). The shaded parts of the square represent the nonzero diagonal blocks, indexed by i = 1, 2, 3, 4 for the example illustrated in the figure.
As we make the number of disjoint weight groups, g, larger, more zeros are created in the covariance matrix $P_{n|n}$; in other words, the matrix $P_{n|n}$ becomes more sparse. The computational burden is therefore reduced, but the numerical accuracy of the state estimation becomes degraded.

15.11 Computer Experiment: Dynamic Reconstruction of Mackey-Glass Attractor
Figure: Ensemble-averaged cumulative absolute error curves during the autonomous prediction phase of dynamic reconstruction of the Mackey-Glass attractor.
$$\frac{d x_t}{dt} = -b x_t + \frac{a x_{t - \Delta t}}{1 + x_{t - \Delta t}^{10}}$$
Filters compared:
- Extended Kalman filter (EKF)
- Central-difference Kalman filter (CDKF)
- Cubature Kalman filter (CKF)

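For readers who want to generate such a time series, the delay differential equation can be integrated with a simple forward-Euler scheme. The parameter values a = 0.2, b = 0.1 and a delay of 30 are commonly used settings and are assumptions here, as are the unit step size, the constant initial history, and the function name.

```python
import numpy as np

def mackey_glass(n_samples, a=0.2, b=0.1, delay=30, dt=1.0, x0=1.2):
    """Forward-Euler integration of dx/dt = -b x(t) + a x(t - delay) / (1 + x(t - delay)^10)."""
    steps = int(round(delay / dt))            # delay expressed in integration steps
    x = np.full(n_samples + steps, x0)        # constant history before t = 0
    for t in range(steps, n_samples + steps - 1):
        x_tau = x[t - steps]                  # delayed value x(t - delay)
        x[t + 1] = x[t] + dt * (-b * x[t] + a * x_tau / (1.0 + x_tau ** 10))
    return x[steps:]

series = mackey_glass(500)
```

A smaller step size or a higher-order integrator would give a more accurate trajectory; forward Euler is used only to keep the example short.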
15.12 Adaptivity Considerations
Adaptive Critic
Figure: Block diagram illustrating the use of an adaptive critic for the control of recurrent node activities $v_n$ in a recurrent neural network (assumed to have a single output); the part of the figure involving the critic is shown in blue.
Consider a recurrent neural network embedded in a stochastic environment with relatively small variability in its statistical behavior. Provided that the underlying probability distribution of the environment is fully represented in the supervised-training sample supplied to the network, it is possible for the network to adapt to the relatively small statistical variations in the environment without any further on-line adjustments being made to the synaptic weights of the network.

15.13 Case Study: Model Reference Applied to Neurocontrol
Figure: Model-reference adaptive control system; the feedback loop of the system is printed in blue.
$$J(w, \theta_k) = \frac{1}{T} \sum_{n=1}^{T} \sum_i \left\| y_{i,r}(n) - y(n, w, \theta_k) \right\|^2$$

Summary and Discussion (1/2)
Four main recurrent network models with global feedback:
- Nonlinear autoregressive networks with exogenous inputs (NARX networks), which use feedback from the output layer to the input layer
- Fully connected recurrent networks, which use feedback from the hidden layer to the input layer
- Recurrent multilayer perceptrons with more than one hidden layer, which use feedback from the output of each computation layer to its own input
- Second-order recurrent networks, which use second-order neurons
Properties of Recurrent Neural Networks:
- They are universal approximators of nonlinear dynamic systems, provided that they are equipped with an adequate number of hidden neurons.
- They are locally controllable and observable, provided that their linearized versions satisfy certain conditions around the equilibrium point.
- Given any finite-state machine, we can build a recurrent neural network which, regarded as a black-box machine, will behave like that finite-state machine.
- Recurrent neural networks exhibit a meta-learning (i.e., learning to learn) capability.

Summary and Discussion (2/2)
Gradient-based Learning Algorithms:
- Back propagation through time (BPTT): off-line learning
- Real-time recurrent learning (RTRL): on-line learning
Supervised-learning Algorithms Based on Nonlinear Sequential State Estimation:
- Extended Kalman filter (EKF), with the linearization of the measurement model pertaining to the recurrent neural network carried out by using the BPTT or RTRL algorithm.
- Derivative-free nonlinear sequential state estimator (CKF / CDKF). In so doing, not only is the applicability of this approach to supervised learning broadened, but numerical accuracy is also improved (at the cost of increased computational requirements).
University of Cambridge Engineering Part IIB & EIST Part II Paper I0: Advanced Pattern Processing Handouts 4 & 5: Multi-Layer Perceptron: Introduction and Training x y (x) Inputs x 2 y (x) 2 Outputs x
More informationNARX neural networks for sequence processing tasks
Master in Artificial Intelligence (UPC-URV-UB) Master of Science Thesis NARX neural networks for sequence processing tasks eng. Eugen Hristev Advisor: prof. dr. René Alquézar Mancho June 2012 Table of
More informationNeed for Deep Networks Perceptron. Can only model linear functions. Kernel Machines. Non-linearity provided by kernels
Need for Deep Networks Perceptron Can only model linear functions Kernel Machines Non-linearity provided by kernels Need to design appropriate kernels (possibly selecting from a set, i.e. kernel learning)
More informationAnalysis of Multilayer Neural Network Modeling and Long Short-Term Memory
Analysis of Multilayer Neural Network Modeling and Long Short-Term Memory Danilo López, Nelson Vera, Luis Pedraza International Science Index, Mathematical and Computational Sciences waset.org/publication/10006216
More informationApplication of Artificial Neural Networks in Evaluation and Identification of Electrical Loss in Transformers According to the Energy Consumption
Application of Artificial Neural Networks in Evaluation and Identification of Electrical Loss in Transformers According to the Energy Consumption ANDRÉ NUNES DE SOUZA, JOSÉ ALFREDO C. ULSON, IVAN NUNES
More informationAutonomous learning algorithm for fully connected recurrent networks
Autonomous learning algorithm for fully connected recurrent networks Edouard Leclercq, Fabrice Druaux, Dimitri Lefebvre Groupe de Recherche en Electrotechnique et Automatique du Havre Université du Havre,
More informationUnit III. A Survey of Neural Network Model
Unit III A Survey of Neural Network Model 1 Single Layer Perceptron Perceptron the first adaptive network architecture was invented by Frank Rosenblatt in 1957. It can be used for the classification of
More informationNeed for Deep Networks Perceptron. Can only model linear functions. Kernel Machines. Non-linearity provided by kernels
Need for Deep Networks Perceptron Can only model linear functions Kernel Machines Non-linearity provided by kernels Need to design appropriate kernels (possibly selecting from a set, i.e. kernel learning)
More informationStatistical Machine Learning (BE4M33SSU) Lecture 5: Artificial Neural Networks
Statistical Machine Learning (BE4M33SSU) Lecture 5: Artificial Neural Networks Jan Drchal Czech Technical University in Prague Faculty of Electrical Engineering Department of Computer Science Topics covered
More informationA Logarithmic Neural Network Architecture for Unbounded Non-Linear Function Approximation
1 Introduction A Logarithmic Neural Network Architecture for Unbounded Non-Linear Function Approximation J Wesley Hines Nuclear Engineering Department The University of Tennessee Knoxville, Tennessee,
More informationAI Programming CS F-20 Neural Networks
AI Programming CS662-2008F-20 Neural Networks David Galles Department of Computer Science University of San Francisco 20-0: Symbolic AI Most of this class has been focused on Symbolic AI Focus or symbols
More informationNeural Networks and the Back-propagation Algorithm
Neural Networks and the Back-propagation Algorithm Francisco S. Melo In these notes, we provide a brief overview of the main concepts concerning neural networks and the back-propagation algorithm. We closely
More informationUsing a Hopfield Network: A Nuts and Bolts Approach
Using a Hopfield Network: A Nuts and Bolts Approach November 4, 2013 Gershon Wolfe, Ph.D. Hopfield Model as Applied to Classification Hopfield network Training the network Updating nodes Sequencing of
More informationNeural Networks. Mark van Rossum. January 15, School of Informatics, University of Edinburgh 1 / 28
1 / 28 Neural Networks Mark van Rossum School of Informatics, University of Edinburgh January 15, 2018 2 / 28 Goals: Understand how (recurrent) networks behave Find a way to teach networks to do a certain
More informationModeling Economic Time Series Using a Focused Time Lagged FeedForward Neural Network
Proceedings of Student Research Day, CSIS, Pace University, May 9th, 23 Modeling Economic Time Series Using a Focused Time Lagged FeedForward Neural Network N. Moseley ABSTRACT, - Artificial neural networks
More informationSerious limitations of (single-layer) perceptrons: Cannot learn non-linearly separable tasks. Cannot approximate (learn) non-linear functions
BACK-PROPAGATION NETWORKS Serious limitations of (single-layer) perceptrons: Cannot learn non-linearly separable tasks Cannot approximate (learn) non-linear functions Difficult (if not impossible) to design
More informationCascade Neural Networks with Node-Decoupled Extended Kalman Filtering
Cascade Neural Networks with Node-Decoupled Extended Kalman Filtering Michael C. Nechyba and Yangsheng Xu The Robotics Institute Carnegie Mellon University Pittsburgh, PA 15213 Abstract Most neural networks
More informationLecture 4: Perceptrons and Multilayer Perceptrons
Lecture 4: Perceptrons and Multilayer Perceptrons Cognitive Systems II - Machine Learning SS 2005 Part I: Basic Approaches of Concept Learning Perceptrons, Artificial Neuronal Networks Lecture 4: Perceptrons
More informationBranch Prediction using Advanced Neural Methods
Branch Prediction using Advanced Neural Methods Sunghoon Kim Department of Mechanical Engineering University of California, Berkeley shkim@newton.berkeley.edu Abstract Among the hardware techniques, two-level
More informationArtificial Neural Networks D B M G. Data Base and Data Mining Group of Politecnico di Torino. Elena Baralis. Politecnico di Torino
Artificial Neural Networks Data Base and Data Mining Group of Politecnico di Torino Elena Baralis Politecnico di Torino Artificial Neural Networks Inspired to the structure of the human brain Neurons as
More informationShort Term Memory and Pattern Matching with Simple Echo State Networks
Short Term Memory and Pattern Matching with Simple Echo State Networks Georg Fette (fette@in.tum.de), Julian Eggert (julian.eggert@honda-ri.de) Technische Universität München; Boltzmannstr. 3, 85748 Garching/München,
More informationArtificial Intelligence Hopfield Networks
Artificial Intelligence Hopfield Networks Andrea Torsello Network Topologies Single Layer Recurrent Network Bidirectional Symmetric Connection Binary / Continuous Units Associative Memory Optimization
More informationLong-Term Prediction, Chaos and Artificial Neural Networks. Where is the Meeting Point?
Engineering Letters, 5:, EL_5 Long-Term Prediction, Chaos and Artificial Neural Networks. Where is the Meeting Point? Pilar Gómez-Gil Abstract This paper presents the advances of a research using a combination
More informationNeural Turing Machine. Author: Alex Graves, Greg Wayne, Ivo Danihelka Presented By: Tinghui Wang (Steve)
Neural Turing Machine Author: Alex Graves, Greg Wayne, Ivo Danihelka Presented By: Tinghui Wang (Steve) Introduction Neural Turning Machine: Couple a Neural Network with external memory resources The combined
More informationCh.8 Neural Networks
Ch.8 Neural Networks Hantao Zhang http://www.cs.uiowa.edu/ hzhang/c145 The University of Iowa Department of Computer Science Artificial Intelligence p.1/?? Brains as Computational Devices Motivation: Algorithms
More informationMultilayer Neural Networks. (sometimes called Multilayer Perceptrons or MLPs)
Multilayer Neural Networks (sometimes called Multilayer Perceptrons or MLPs) Linear separability Hyperplane In 2D: w 1 x 1 + w 2 x 2 + w 0 = 0 Feature 1 x 2 = w 1 w 2 x 1 w 0 w 2 Feature 2 A perceptron
More informationLong-Short Term Memory
Long-Short Term Memory Sepp Hochreiter, Jürgen Schmidhuber Presented by Derek Jones Table of Contents 1. Introduction 2. Previous Work 3. Issues in Learning Long-Term Dependencies 4. Constant Error Flow
More informationNeural Networks with Applications to Vision and Language. Feedforward Networks. Marco Kuhlmann
Neural Networks with Applications to Vision and Language Feedforward Networks Marco Kuhlmann Feedforward networks Linear separability x 2 x 2 0 1 0 1 0 0 x 1 1 0 x 1 linearly separable not linearly separable
More informationCSCI 315: Artificial Intelligence through Deep Learning
CSCI 315: Artificial Intelligence through Deep Learning W&L Winter Term 2017 Prof. Levy Recurrent Neural Networks (Chapter 7) Recall our first-week discussion... How do we know stuff? (MIT Press 1996)
More informationArtificial Neural Networks Examination, June 2004
Artificial Neural Networks Examination, June 2004 Instructions There are SIXTY questions (worth up to 60 marks). The exam mark (maximum 60) will be added to the mark obtained in the laborations (maximum
More informationNeural Network Training
Neural Network Training Sargur Srihari Topics in Network Training 0. Neural network parameters Probabilistic problem formulation Specifying the activation and error functions for Regression Binary classification
More informationArtificial Neural Networks Examination, March 2004
Artificial Neural Networks Examination, March 2004 Instructions There are SIXTY questions (worth up to 60 marks). The exam mark (maximum 60) will be added to the mark obtained in the laborations (maximum
More informationSPSS, University of Texas at Arlington. Topics in Machine Learning-EE 5359 Neural Networks
Topics in Machine Learning-EE 5359 Neural Networks 1 The Perceptron Output: A perceptron is a function that maps D-dimensional vectors to real numbers. For notational convenience, we add a zero-th dimension
More informationCSC 411 Lecture 10: Neural Networks
CSC 411 Lecture 10: Neural Networks Roger Grosse, Amir-massoud Farahmand, and Juan Carrasquilla University of Toronto UofT CSC 411: 10-Neural Networks 1 / 35 Inspiration: The Brain Our brain has 10 11
More informationArtificial Neural Networks. Q550: Models in Cognitive Science Lecture 5
Artificial Neural Networks Q550: Models in Cognitive Science Lecture 5 "Intelligence is 10 million rules." --Doug Lenat The human brain has about 100 billion neurons. With an estimated average of one thousand
More informationARTIFICIAL NEURAL NETWORKS گروه مطالعاتي 17 بهار 92
ARTIFICIAL NEURAL NETWORKS گروه مطالعاتي 17 بهار 92 BIOLOGICAL INSPIRATIONS Some numbers The human brain contains about 10 billion nerve cells (neurons) Each neuron is connected to the others through 10000
More informationAdvanced Methods for Recurrent Neural Networks Design
Universidad Autónoma de Madrid Escuela Politécnica Superior Departamento de Ingeniería Informática Advanced Methods for Recurrent Neural Networks Design Master s thesis presented to apply for the Master
More informationHarnessing Nonlinearity: Predicting Chaotic Systems and Saving
Harnessing Nonlinearity: Predicting Chaotic Systems and Saving Energy in Wireless Communication Publishde in Science Magazine, 2004 Siamak Saliminejad Overview Eco State Networks How to build ESNs Chaotic
More informationEEE 241: Linear Systems
EEE 4: Linear Systems Summary # 3: Introduction to artificial neural networks DISTRIBUTED REPRESENTATION An ANN consists of simple processing units communicating with each other. The basic elements of
More informationComputational Intelligence Lecture 6: Associative Memory
Computational Intelligence Lecture 6: Associative Memory Farzaneh Abdollahi Department of Electrical Engineering Amirkabir University of Technology Fall 2011 Farzaneh Abdollahi Computational Intelligence
More informationWays to make neural networks generalize better
Ways to make neural networks generalize better Seminar in Deep Learning University of Tartu 04 / 10 / 2014 Pihel Saatmann Topics Overview of ways to improve generalization Limiting the size of the weights
More informationArtificial Neural Networks
Artificial Neural Networks 鮑興國 Ph.D. National Taiwan University of Science and Technology Outline Perceptrons Gradient descent Multi-layer networks Backpropagation Hidden layer representations Examples
More informationy(x n, w) t n 2. (1)
Network training: Training a neural network involves determining the weight parameter vector w that minimizes a cost function. Given a training set comprising a set of input vector {x n }, n = 1,...N,
More informationArtificial Neuron (Perceptron)
9/6/208 Gradient Descent (GD) Hantao Zhang Deep Learning with Python Reading: https://en.wikipedia.org/wiki/gradient_descent Artificial Neuron (Perceptron) = w T = w 0 0 + + w 2 2 + + w d d where
More informationEE04 804(B) Soft Computing Ver. 1.2 Class 2. Neural Networks - I Feb 23, Sasidharan Sreedharan
EE04 804(B) Soft Computing Ver. 1.2 Class 2. Neural Networks - I Feb 23, 2012 Sasidharan Sreedharan www.sasidharan.webs.com 3/1/2012 1 Syllabus Artificial Intelligence Systems- Neural Networks, fuzzy logic,
More information