EPL442: Computational Learning Systems
Lab 2
Vassilis Vassiliades
Department of Computer Science, University of Cyprus
Outline
- Artificial Neuron
- Feedforward Neural Network
- Back-propagation Algorithm
- Notes
- Assignment
- Questions?
- Demos: Cortex Pro, ALVINN
Artificial Neuron
Inputs X1, X2, X3 are multiplied by weights W1, W2, W3 (incl. bias) and combined at the summing junction; the sum is passed through an activation function f (e.g., the sigmoid) to produce the output Y:
a = W1*X1 + W2*X2 + W3*X3
Y = f(a) = 1 / (1 + e^(-a))
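A minimal sketch of this neuron in Python; the input, weight, and bias values below are made up purely for illustration:

```python
import math

def sigmoid(a):
    # Logistic activation: f(a) = 1 / (1 + e^(-a))
    return 1.0 / (1.0 + math.exp(-a))

def neuron_output(inputs, weights, bias):
    # Summing junction: weighted sum of the inputs plus the bias,
    # passed through the activation function.
    a = bias + sum(w * x for w, x in zip(weights, inputs))
    return sigmoid(a)

# Example with three inputs and three weights (arbitrary values).
y = neuron_output([1.0, 0.5, -0.5], [0.2, -0.4, 0.1], bias=0.3)
```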
Feedforward Neural Network
Input Layer (inactive, meaning no computation): IN1, IN2
Hidden Layer 1: H1_1, H1_2
Hidden Layer 2: H2_1, H2_2
Output Layer: OUT1, OUT2
Back-propagation Algorithm (Online Updating) Step by Step
Pattern p is presented to the network.
Forward Pass
First hidden layer (neurons 1 and 2):
y1 = f(w(bias)1 + w(x1)1 * x1 + w(x2)1 * x2)
y2 = f(w(bias)2 + w(x1)2 * x1 + w(x2)2 * x2)
Second hidden layer (neurons 3 and 4):
y3 = f(w(bias)3 + w13 * y1 + w23 * y2)
y4 = f(w(bias)4 + w14 * y1 + w24 * y2)
Output neuron (5):
y5 = f(w(bias)5 + w35 * y3 + w45 * y4)
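The forward pass above can be sketched directly in Python. The weight values here are hypothetical, chosen only so the example runs; w["13"] stands for the weight from neuron 1 to neuron 3, w["b3"] for the bias weight of neuron 3, and so on:

```python
import math

def f(a):
    # Sigmoid activation with slope α = 1.
    return 1.0 / (1.0 + math.exp(-a))

# Hypothetical weight values for illustration only.
w = {"b1": 0.1, "x1_1": 0.2, "x2_1": -0.3,
     "b2": -0.1, "x1_2": 0.4, "x2_2": 0.2,
     "b3": 0.05, "13": 0.3, "23": -0.2,
     "b4": -0.05, "14": 0.1, "24": 0.25,
     "b5": 0.2, "35": -0.4, "45": 0.15}

def forward(x1, x2, w):
    # First hidden layer (neurons 1 and 2) reads the inputs.
    y1 = f(w["b1"] + w["x1_1"] * x1 + w["x2_1"] * x2)
    y2 = f(w["b2"] + w["x1_2"] * x1 + w["x2_2"] * x2)
    # Second hidden layer (neurons 3 and 4) reads y1 and y2.
    y3 = f(w["b3"] + w["13"] * y1 + w["23"] * y2)
    y4 = f(w["b4"] + w["14"] * y1 + w["24"] * y2)
    # Output neuron 5 reads y3 and y4.
    y5 = f(w["b5"] + w["35"] * y3 + w["45"] * y4)
    return y1, y2, y3, y4, y5
```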
Backward Pass - Stage 1
δ5 = y5 (1 - y5) (y5 - t)
Here t is the target output (in our case we have only one output) and y5 is the actual output. The factor (y5 - t) is the derivative of the error with respect to the output y5, and y5 (1 - y5) is the derivative of the output node's activation function (i.e. the sigmoid). This assumes we are minimising the sum-of-squares error function and that each neuron has a sigmoid activation function with slope α = 1.
Backward Pass - Stage 1 (continued)
δ3 = y3 (1 - y3) (w35 * δ5)
δ4 = y4 (1 - y4) (w45 * δ5)
δ1 = y1 (1 - y1) (w13 * δ3 + w14 * δ4)
δ2 = y2 (1 - y2) (w23 * δ3 + w24 * δ4)
Backward Pass - Stage 2
Neuron 1:
w'(bias)1 = w(bias)1 - η δ1
w'(x1)1 = w(x1)1 - η δ1 x1
w'(x2)1 = w(x2)1 - η δ1 x2
Neuron 2:
w'(bias)2 = w(bias)2 - η δ2
w'(x1)2 = w(x1)2 - η δ2 x1
w'(x2)2 = w(x2)2 - η δ2 x2
Neuron 3:
w'(bias)3 = w(bias)3 - η δ3
w'13 = w13 - η δ3 y1
w'23 = w23 - η δ3 y2
Neuron 4:
w'(bias)4 = w(bias)4 - η δ4
w'14 = w14 - η δ4 y1
w'24 = w24 - η δ4 y2
Neuron 5:
w'(bias)5 = w(bias)5 - η δ5
w'35 = w35 - η δ5 y3
w'45 = w45 - η δ5 y4
(η is the learning rate)
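Both stages of the backward pass can be sketched as follows. The snippet is self-contained, repeating a small forward-pass helper; the weight values, the input pattern, the target t, and the learning rate η = 0.5 are all illustrative assumptions:

```python
import math

def f(a):
    return 1.0 / (1.0 + math.exp(-a))

def forward(x1, x2, w):
    y1 = f(w["b1"] + w["x1_1"] * x1 + w["x2_1"] * x2)
    y2 = f(w["b2"] + w["x1_2"] * x1 + w["x2_2"] * x2)
    y3 = f(w["b3"] + w["13"] * y1 + w["23"] * y2)
    y4 = f(w["b4"] + w["14"] * y1 + w["24"] * y2)
    y5 = f(w["b5"] + w["35"] * y3 + w["45"] * y4)
    return y1, y2, y3, y4, y5

def backward(x1, x2, ys, t, w, eta=0.5):
    y1, y2, y3, y4, y5 = ys
    # Stage 1: propagate the error backwards to get the deltas.
    d5 = y5 * (1 - y5) * (y5 - t)
    d3 = y3 * (1 - y3) * (w["35"] * d5)
    d4 = y4 * (1 - y4) * (w["45"] * d5)
    d1 = y1 * (1 - y1) * (w["13"] * d3 + w["14"] * d4)
    d2 = y2 * (1 - y2) * (w["23"] * d3 + w["24"] * d4)
    # Stage 2: adjust the weights; the minus sign matches the
    # (y5 - t) convention used for d5 above.
    updates = [("b1", d1, 1.0), ("x1_1", d1, x1), ("x2_1", d1, x2),
               ("b2", d2, 1.0), ("x1_2", d2, x1), ("x2_2", d2, x2),
               ("b3", d3, 1.0), ("13", d3, y1), ("23", d3, y2),
               ("b4", d4, 1.0), ("14", d4, y1), ("24", d4, y2),
               ("b5", d5, 1.0), ("35", d5, y3), ("45", d5, y4)]
    return {k: w[k] - eta * d * inp for k, d, inp in updates}

# One online update on a single pattern (all values hypothetical).
w = {"b1": 0.1, "x1_1": 0.2, "x2_1": -0.3,
     "b2": -0.1, "x1_2": 0.4, "x2_2": 0.2,
     "b3": 0.05, "13": 0.3, "23": -0.2,
     "b4": -0.05, "14": 0.1, "24": 0.25,
     "b5": 0.2, "35": -0.4, "45": 0.15}
x1, x2, t = 1.0, 0.0, 1.0
ys = forward(x1, x2, w)
w2 = backward(x1, x2, ys, t, w)
```

A quick sanity check for such a step is that the sum-of-squares error on the same pattern is smaller after the update than before it.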
Pattern p+1 is presented; the procedure is the same as before.
Notes
In the backward pass you could update the weights immediately after calculating the deltas. Here the backward pass was done in two distinct stages: (a) propagation of the errors backwards in order to evaluate the derivatives, and (b) weight adjustment using the calculated derivatives.
In the class notes the derivative of the error is (target - output). If you prefer it like this, you need to put a plus sign (+) in front of the learning rate in the weight update equations (as in the class notes).
A large learning rate means big changes to the weights, and thus large jumps in weight space; this is not always desirable.
Notes
To minimise the occurrences of local minima:
- Change the learning rate (either start with a large value and progressively decrease it, or adapt it intelligently)
- Add more hidden nodes (but be careful of the overfitting problem)
- Add momentum to the weight update equations
- Add noise
Overfitting occurs when the neural network:
- is trained for too long (to avoid this problem, stop training early)
- has a lot of hidden nodes (to avoid this problem, do model selection)
In the batch update version the weight changes are accumulated and applied after each epoch instead of after each pattern.
Weights should be initialised to small random values in the range [-1, 1], and input data should be normalised to the range [0, 1].
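A few of these notes can be sketched in Python. The parameter values (η = 0.1, momentum coefficient α = 0.9) are illustrative assumptions, not values prescribed by the lab:

```python
import random

def init_weights(n, lo=-1.0, hi=1.0):
    # Small random initial weights in [-1, 1].
    return [random.uniform(lo, hi) for _ in range(n)]

def normalise(column):
    # Min-max normalisation of one input feature into [0, 1].
    lo, hi = min(column), max(column)
    return [(x - lo) / (hi - lo) for x in column]

def momentum_step(w, grad, prev_dw, eta=0.1, alpha=0.9):
    # Weight change = -eta * gradient + alpha * previous change;
    # returns the new weights and the change (for the next step).
    dw = [-eta * g + alpha * p for g, p in zip(grad, prev_dw)]
    return [wi + d for wi, d in zip(w, dw)], dw
```

The momentum term reuses a fraction of the previous weight change, which smooths the trajectory through weight space and can help the search roll past shallow local minima.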
Assignment
Assignment document: www.cs.ucy.ac.cy/~vvassi0/epl442
Submission deadline: 9 October 2009
Deliverables:
- Report in PDF format (max. 3 pages)
- Source code with comments
- Other files (training.dat, test.dat, parameters.dat, etc.)
All in a zip file to: v.vassiliades@cs.ucy.ac.cy
Assignment - Grading
- Completeness of the deliverables (everything written in the assignment description)
- Source code correctness (automated tests, interactive tests)
- Source code quality (comments, design)
- Quality of the report
- Time of submission
Questions?
Demos
Cortex Pro: www.tech.plym.ac.uk/soc/staff/guidbugm/software.htm (download it and experiment with it)
ALVINN (Autonomous Land Vehicle In a Neural Network): see video