Simulating Neural Networks. Lawrence Ward P465A


1. Neural network (or Parallel Distributed Processing, PDP) models are used for a wide variety of roles, including recognizing patterns, implementing logic, and making decisions.
1.1 They rescued artificial intelligence research from the dead end of symbolic AI and are now one part of the foundation of the new AI (the other is agent-based processing).
1.2 They learn, making them one of the most powerful tools for psychological modeling.
1.3 They have limitations, however, one of which is that representations of psychologically meaningful concepts are difficult to see in the model.
1.4 They are inherently dynamic, but the dynamics (e.g. the learning process) is not emphasized, only the outcome (the responses after learning).
1.5 They are to be contrasted with spiking neural networks (or pulse-coupled neural networks), where dynamics is the focus.
1.6 They implement only some of the many functional characteristics of neurons.

2. The basic model element: the neuron
[Figure: a model neuron. Incoming activations A_1, ..., A_i, ..., A_n arrive over synaptic weights w_1, ..., w_i, ..., w_n, are combined by an adder function, and pass through a threshold activation function to produce the outgoing activation.]
Adder function:

x = \sum_{i=1}^{n} A_i w_i

Threshold (sigmoid) activation function:

f(x) = \frac{1}{1 + e^{-(x-a)/b}}
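As a concrete illustration, here is a minimal sketch of this neuron model in Python (NumPy); the sigmoid parameters a and b and the example input values are illustrative choices, not values prescribed by the notes.

```python
import numpy as np

def neuron_output(A, w, a=0.0, b=1.0):
    """One model neuron: a weighted sum of inputs passed through a sigmoid.

    A    : array of incoming activations A_1..A_n
    w    : array of synaptic weights w_1..w_n
    a, b : location and slope parameters of the sigmoid (illustrative defaults)
    """
    x = np.dot(A, w)                             # adder function: x = sum_i A_i * w_i
    return 1.0 / (1.0 + np.exp(-(x - a) / b))    # sigmoid activation

# example with three inputs
A = np.array([1.0, 0.0, 1.0])
w = np.array([0.5, -0.3, 0.8])
print(neuron_output(A, w))
```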

3. Synapse layer model
[Figure: a two-layer network, an input layer fully connected to an output layer.]
All of the network's knowledge is stored in the weights (w_ij) of the connections (representing the synapses in a real neural network). We will store the connection weights in a matrix, call it W. For example, for a two-layer network with 3 elements per layer:

W = \begin{pmatrix} w_{11} & w_{12} & w_{13} \\ w_{21} & w_{22} & w_{23} \\ w_{31} & w_{32} & w_{33} \end{pmatrix}
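In a simulation the weight matrix can simply be a 2-D array. A small NumPy sketch (the numbers are arbitrary, and taking rows to index the input-layer neuron i and columns the output-layer neuron j is an assumed convention):

```python
import numpy as np

# rows index the input-layer neuron i, columns the output-layer neuron j (assumed convention)
W = np.array([[ 0.2, -0.5,  0.1],
              [ 0.7,  0.3, -0.9],
              [-0.4,  0.6,  0.8]])

A_in  = np.array([1.0, 0.0, 1.0])   # input-layer activations
x_out = A_in @ W                    # summed input to each output-layer neuron
```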

4. Learning law: a scheme to adjust the connection weights so that they reflect the desired input-output relation. For example, a network is trained to identify various input patterns by giving a particular output whenever a particular pattern is input.
4.1 If the weight adjustments are based on some abstract rule it is called unsupervised learning, but if the weight adjustments are based on an error signal calculated from a desired target output, it is called supervised learning. Supervised learning is the most efficient, most reliable, and most widely used (but unsupervised learning is how humans learn much of the time).
4.2 Error correction law:

\Delta w_{ij} = \alpha \, a_i (c_j - b_j)

where α is the learning rate, a_i is the activation of the ith input-layer neuron, b_j is the activation of the jth output-layer neuron, and c_j is the desired activation of that neuron.
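A minimal sketch of one application of this error-correction law in Python (NumPy). The learning rate, the example patterns, and the use of a plain weighted sum for the output activations b_j are illustrative assumptions:

```python
import numpy as np

def error_correction_step(W, a, c, alpha=0.1):
    """Apply dw_ij = alpha * a_i * (c_j - b_j) once to the weight matrix W."""
    b = a @ W                               # actual output activations b_j (plain weighted sum assumed)
    return W + alpha * np.outer(a, c - b)   # error-correction weight change

W = np.random.uniform(-1, 1, (3, 3))   # 3x3 weight matrix, as in section 3
a = np.array([1.0, 0.0, 1.0])          # input-layer activations a_i
c = np.array([0.0, 1.0, 0.0])          # desired (target) output activations c_j
W = error_correction_step(W, a, c)
```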

5. Multilayer networks
5.1 Two-layer networks are limited: they cannot learn nonlinear mappings. For example, they cannot learn the XOR problem, whose truth table is:

Input 1  Input 2  Output
   0        0       0
   0        1       1
   1        0       1
   1        1       0

[Figure: the XOR problem plotted in three dimensions, with a third coordinate D3 = 1 iff D1 = 1 and D2 = 1; the figure shows the separating plane at D3 = 1.]

The third dimension is provided by a hidden layer that does not directly interact with the outside world. Below is a network that emulates the XOR function [Figure: inputs x_1 and x_2 connect to hidden units y_1 and y_2 with weights +1, -1 and -1, +1 respectively, and both hidden units connect to the output unit with weight +1; see the worked example that follows]. Here the activation function for the hidden and output neurons is the hard threshold

f(x) = \begin{cases} 1 & \text{if } x > 0 \\ 0 & \text{if } x \le 0 \end{cases}
\quad \text{where} \quad x = \sum_{i=1}^{n} A_i w_i

As an example, consider the inputs x_1 = 1 and x_2 = 0. Then for hidden element y_1 the argument of the activation function is (1 × 1) + (-1 × 0) = 1, so its activation is f(1) = 1. For y_2 it is (-1 × 1) + (1 × 0) = -1, so f(-1) = 0. Thus the inputs to the output neuron are 1 and 0, making its summed input (1 × 1) + (1 × 0) = 1, and since f(1) = 1 its output is 1, as required. Figure out the other possibilities as an exercise; a code check is sketched below.
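The exercise can be checked mechanically. Here is a small Python (NumPy) sketch that runs all four input patterns through the hard-threshold XOR network described above, using the weights from the worked example:

```python
import numpy as np

def hard_threshold(x):
    return (x > 0).astype(float)      # f(x) = 1 if x > 0, else 0

W1 = np.array([[ 1.0, -1.0],          # weights from (x1, x2) to the hidden units (y1, y2)
               [-1.0,  1.0]])
W2 = np.array([1.0, 1.0])             # weights from (y1, y2) to the output unit

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    x = np.array([x1, x2], dtype=float)
    y = hard_threshold(x @ W1)        # hidden-layer activations
    out = hard_threshold(y @ W2)      # output activation
    print(x1, x2, "->", int(out))     # reproduces the XOR truth table
```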

6. Backpropagation saves neural network models
6.1 Minsky & Papert (Perceptrons, 1969) criticized multilayer networks because they could see no way to correct the weights of the connections between the input layer and the hidden layer. This, together with the limitations of two-layer perceptrons, contributed to a long hiatus in neural network research.
6.2 The solution to the problem is to backpropagate the error from the output layer in order to calculate an error for each hidden-layer neuron: a weighted sum of the output-layer errors multiplied by a function of the hidden-layer neuron's activation (see the formula in 7.4). This penalizes the hidden-layer neuron in proportion to how much it contributed to the output-layer error. These hidden-layer errors are then used to calculate changes to the weights of the input-to-hidden connections, just as is done directly for the hidden-to-output connections. The output-layer error can be backpropagated in this way over any number of hidden layers.

7. The basic backpropagation model

7.1 Basic parameters: the weights (two layers, W1 and W2), the learning rate 0 < α < 1, the momentum m (how much a previous weight change affects the current weight change), and the tolerance t (the criterion for accepting an output as "good").
7.2 Setup: assign random numbers between -1 and +1 to the two sets of weights. Assign values for α, m, and t.
7.3 Learning, forward pass:
(1) Compute the activation of each hidden-layer neuron for the selected input pattern. For each hidden neuron the summed input is

y_j = \sum_{i=1}^{n} A_i W1_{ij} + hbias_j

where A_i is the input from the ith input neuron, and the activation is

f(y_j) = \frac{1}{1 + e^{-(y_j - a)/b}} = A_j

(2) Compute the activation of each output-layer neuron. The summed input is

z_k = \sum_{j=1}^{p} A_j W2_{jk} + obias_k

where A_j = f(y_j) is the input from the jth hidden-layer neuron, and the output activation is

f(z_k) = \frac{1}{1 + e^{-(z_k - a)/b}} = A_k

This vector of output-layer neuron activations is the output you check for goodness.
7.4 Backpropagation of error:
(1) Compute the output-layer error

d_k = A_k (1 - A_k)(T_k - A_k)

where A_k is the activation of the kth output-layer neuron and T_k is its target (desired) activation.
(2) Compute the hidden-layer error

e_j = A_j (1 - A_j) \sum_{k=1}^{m} W2_{jk} d_k

(3) Adjust the weights and bias for the hidden-to-output synapses:

W2_{jk,t} = W2_{jk,t-1} + \Delta W2_{jk,t}

where

\Delta W2_{jk,t} = \alpha A_j d_k + m \Delta W2_{jk,t-1}
obias_k = obias_k + \alpha d_k

(4) Adjust the weights and bias for the input-to-hidden synapses:

W1_{ij,t} = W1_{ij,t-1} + \Delta W1_{ij,t}

where

\Delta W1_{ij,t} = \alpha A_i e_j + m \Delta W1_{ij,t-1}
hbias_j = hbias_j + \alpha e_j

(5) Repeat until |T_k - A_k| < t for all output neurons and all input patterns. (A code sketch of one such training step follows.)
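A minimal sketch of one forward pass and one backpropagated weight update for a single input pattern, following the equations in 7.3 and 7.4 (Python/NumPy; the sigmoid here assumes a = 0 and b = 1, and the default alpha, m, and the example 2-2-1 network sizes are illustrative):

```python
import numpy as np

def sigmoid(x, a=0.0, b=1.0):
    """Sigmoid activation f(x) = 1 / (1 + exp(-(x - a)/b)); a = 0, b = 1 assumed."""
    return 1.0 / (1.0 + np.exp(-(x - a) / b))

def train_step(A, T, W1, W2, hbias, obias, dW1_old, dW2_old, alpha=0.5, m=0.9):
    """One forward pass plus one backpropagated update for input pattern A with target T."""
    # forward pass (7.3)
    Ah = sigmoid(A @ W1 + hbias)           # hidden-layer activations A_j
    Ao = sigmoid(Ah @ W2 + obias)          # output-layer activations A_k

    # backpropagation of error (7.4)
    d = Ao * (1 - Ao) * (T - Ao)           # output-layer error d_k
    e = Ah * (1 - Ah) * (W2 @ d)           # hidden-layer error e_j

    # weight and bias updates with momentum
    dW2 = alpha * np.outer(Ah, d) + m * dW2_old
    dW1 = alpha * np.outer(A, e) + m * dW1_old
    return W1 + dW1, W2 + dW2, hbias + alpha * e, obias + alpha * d, dW1, dW2

# example usage with a 2-2-1 network and one XOR pattern
rng = np.random.default_rng(0)
W1, W2 = rng.uniform(-1, 1, (2, 2)), rng.uniform(-1, 1, (2, 1))
hbias, obias = rng.uniform(-1, 1, 2), rng.uniform(-1, 1, 1)
W1, W2, hbias, obias, dW1, dW2 = train_step(
    np.array([1.0, 0.0]), np.array([1.0]),
    W1, W2, hbias, obias, np.zeros_like(W1), np.zeros_like(W2))
```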

7.5 Recall: repeat the steps of the forward pass. The computed output is the recalled pattern.
7.6 General approach: train on clear input patterns, test on the same input patterns, then test on variants of the input patterns.

The assignment
- Build and train a 2-hidden-unit XOR network.
- Build and train a 4-hidden-unit XOR network.
- Discuss:
  - the effect of the number of hidden units on learning rate;
  - the differences and similarities of the two networks in terms of the patterns of synaptic weights between the input and hidden layers, and between the hidden and output layers.

The program
Define the variables (arrays are easiest):
- alpha, tol, M
- input patterns x, stored as patterns(pindex(i), j), with pattern index pindex = 0 1 2 3
- target outputs: 0 1 1 0
- W1 (input -> hidden) and hbias
- W2 (hidden -> output)
- hidden activation (y, by neuron)
- output activation (z, by pattern)
- d (by pattern)
- e (by hidden neuron)
- delw2, delw2old
- delw1, delw1old
[Figure: example array layouts with illustrative values for the weights and patterns.]

Program flow:
1. Initialize the weights.
2. Scramble pindex.
3. Compute the hidden activation (hard threshold).
4. Compute the output activation.
5. Compute the output error (d).
6. Compute the hidden error (e).
7. Compute the new hidden -> output weights.
8. Compute the new input -> hidden weights.
9. All 4 patterns done? If no, go to the next pattern (step 3); if yes, continue.
10. Converged? If no, go back to step 2; if yes, done.

A sketch of a complete program along these lines is given below.
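Putting the pieces together, here is a minimal end-to-end sketch of the program in Python (NumPy) for the 2-hidden-unit XOR assignment. It uses the sigmoid activation from section 7 throughout (with a = 0, b = 1 assumed), and the values of alpha, m, tol, the epoch cap, and the random seed are illustrative choices rather than values prescribed by the notes:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))            # a = 0, b = 1 assumed

patterns = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
targets  = np.array([[0], [1], [1], [0]], dtype=float)   # XOR target outputs

n_in, n_hid, n_out = 2, 2, 1                   # 2-hidden-unit XOR network
alpha, m, tol = 0.5, 0.9, 0.1                  # learning rate, momentum, tolerance

# setup: random weights and biases between -1 and +1
W1 = rng.uniform(-1, 1, (n_in, n_hid));  hbias = rng.uniform(-1, 1, n_hid)
W2 = rng.uniform(-1, 1, (n_hid, n_out)); obias = rng.uniform(-1, 1, n_out)
dW1_old, dW2_old = np.zeros_like(W1), np.zeros_like(W2)

for epoch in range(20000):                     # cap on training epochs
    for p in rng.permutation(len(patterns)):   # scramble pindex
        A, T = patterns[p], targets[p]
        Ah = sigmoid(A @ W1 + hbias)           # hidden activations
        Ao = sigmoid(Ah @ W2 + obias)          # output activations
        d = Ao * (1 - Ao) * (T - Ao)           # output-layer error
        e = Ah * (1 - Ah) * (W2 @ d)           # hidden-layer error
        dW2 = alpha * np.outer(Ah, d) + m * dW2_old
        dW1 = alpha * np.outer(A, e) + m * dW1_old
        W2 += dW2; obias += alpha * d
        W1 += dW1; hbias += alpha * e
        dW2_old, dW1_old = dW2, dW1
    # convergence check: every output within tolerance of its target, for every pattern
    outs = sigmoid(sigmoid(patterns @ W1 + hbias) @ W2 + obias)
    if np.all(np.abs(targets - outs) < tol):
        break

print(epoch, np.round(outs, 2))                # outputs should approximate 0 1 1 0
```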