CSC Neural Networks. Perceptron Learning Rule

CSC 302 1.5 Neural Networks Perceptron Learning Rule

Objectives
Determining the weight matrix and bias for perceptron networks with many inputs.
Explaining what a learning rule is.
Developing the perceptron learning rule.
Discussing the advantages and limitations of the single-layer perceptron.

Development
A neuron model was introduced by Warren McCulloch and Walter Pitts [1943].
Main feature: the weighted sum of the input signals is compared to a threshold to determine the output: 0 if weighted_sum < 0, 1 if weighted_sum >= 0.
Networks of these neurons can, in principle, compute any arithmetic or logical function.
No training method was available.

Development
The perceptron was developed by Frank Rosenblatt [late 1950s].
Its neurons were similar to those of McCulloch and Pitts.
Key feature: it introduced a learning rule.
Rosenblatt proved that the learning rule always converges to correct weights, provided that weights solving the problem exist.
The rule is simple and automatic, and places no restriction on the initial weights, which may be random.

Learning Rules
A learning rule is a procedure for modifying the weights and biases of a network to perform a specific task.
Supervised learning: the network is provided with a set of examples of proper network behaviour (inputs/targets).
Reinforcement learning: the network is only provided with a grade, or score, which indicates network performance.
Unsupervised learning: only the network inputs are available to the learning algorithm. The network learns to categorize (cluster) the inputs.

Perceptron Architecture

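Perceptron Architecture
Output of the $i$th neuron: $a_i = \mathrm{hardlim}(n_i) = \mathrm{hardlim}({}_i\mathbf{w}^T\mathbf{p} + b_i)$, where ${}_i\mathbf{w}^T$ is the $i$th row of the weight matrix.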

Perceptron Architecture
Therefore, if the inner product of the $i$th row of the weight matrix with the input vector is greater than or equal to $-b_i$, the output will be 1; otherwise the output will be 0.
Each neuron in the network divides the input space into two regions.
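The rule above can be written out as a minimal Python sketch. The function names `hardlim` and `perceptron_output`, and the two-neuron weights and biases, are assumptions chosen for illustration; they do not come from the slides.

```python
def hardlim(n):
    """Hard-limit transfer function: 1 if n >= 0, otherwise 0."""
    return 1 if n >= 0 else 0


def perceptron_output(W, b, p):
    """Return the outputs a_i = hardlim(iw^T p + b_i) for every neuron.

    W is a list of weight rows (one per neuron), b a list of biases,
    and p the input vector.
    """
    outputs = []
    for row, bias in zip(W, b):
        n = sum(w * x for w, x in zip(row, p)) + bias  # net input of this neuron
        outputs.append(hardlim(n))
    return outputs


# Hypothetical two-neuron, two-input layer (illustrative values only).
W = [[1.0, 1.0], [1.0, -1.0]]
b = [-1.0, 0.0]

a = perceptron_output(W, b, [2.0, 0.5])
print(a)
```

Each neuron outputs 1 exactly when its row's inner product with the input reaches $-b_i$, which is the condition stated on the slide.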

Single-Neuron Perceptron
Decision boundary: $n = {}_1\mathbf{w}^T\mathbf{p} + b = w_{1,1}p_1 + w_{1,2}p_2 + b = 0$

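Decision Boundary
Decision boundary: ${}_1\mathbf{w}^T\mathbf{p} + b = 0$, or ${}_1\mathbf{w}^T\mathbf{p} = -b$.
All points on the decision boundary have the same inner product with the weight vector.
For any two points $\mathbf{p}_1$ and $\mathbf{p}_2$ on the decision boundary, ${}_1\mathbf{w}^T\mathbf{p}_1 = {}_1\mathbf{w}^T\mathbf{p}_2 = -b$, so ${}_1\mathbf{w}^T(\mathbf{p}_1 - \mathbf{p}_2) = 0$.
The weight vector is orthogonal to $(\mathbf{p}_1 - \mathbf{p}_2)$, and therefore the decision boundary is orthogonal to the weight vector.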
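The orthogonality argument can be checked numerically. In this sketch the weight vector, bias, and the two boundary points are hypothetical values picked so that both points satisfy ${}_1\mathbf{w}^T\mathbf{p} + b = 0$.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


w = [2.0, 1.0]   # hypothetical weight vector
b = -2.0         # hypothetical bias

# Two points chosen to lie on the boundary w^T p + b = 0.
p1 = [1.0, 0.0]  # 2*1 + 1*0 - 2 = 0
p2 = [0.0, 2.0]  # 2*0 + 1*2 - 2 = 0

on_boundary = [dot(w, p) + b for p in (p1, p2)]

# The difference of two boundary points lies along the boundary,
# so its inner product with w should vanish.
difference = [x - y for x, y in zip(p1, p2)]
inner = dot(w, difference)

print(on_boundary, inner)
```

Both boundary points give the inner product $-b$ with the weight vector, and their difference gives zero, which is the orthogonality statement in numeric form.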

Direction of the Weight Vector
Any vector in the shaded region will have an inner product with the weight vector greater than $-b$, and vectors in the unshaded region will have an inner product less than $-b$.
Therefore the weight vector ${}_1\mathbf{w}$ will always point toward the region where the neuron output is 1.

Graphical Method
Design of a perceptron to implement the AND gate.
Input space: each input vector is labeled according to its target.
Dark circle: output is 1.
Light circle: output is 0.

Graphical Method
First select a decision boundary that separates the dark circles from the light circles.
Next choose a weight vector that is orthogonal to the decision boundary.
The weight vector can be any length, so there are infinitely many possibilities.
One choice is

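Graphical Method
Finally, we need to find the bias, $b$. Pick a point on the decision boundary (say $\mathbf{p} = [1.5 \;\; 0]^T$) and require that it satisfies ${}_1\mathbf{w}^T\mathbf{p} + b = 0$.
Testing: the resulting network is then checked against the input/target pairs.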
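The graphical design can be followed through in a short sketch. The weight vector `[2.0, 2.0]` is an assumed choice for illustration: any positive multiple of `[1, 1]` is orthogonal to a boundary that separates `[1, 1]` from the other three AND inputs. Only the boundary point `[1.5, 0]` is taken from the slide.

```python
def hardlim(n):
    """Hard-limit transfer function: 1 if n >= 0, otherwise 0."""
    return 1 if n >= 0 else 0


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


w = [2.0, 2.0]               # assumed weight vector, orthogonal to the boundary
boundary_point = [1.5, 0.0]  # point on the decision boundary (from the slide)

# The boundary satisfies w^T p + b = 0, so b = -w^T p at any boundary point.
b = -dot(w, boundary_point)

# Test the design against the AND truth table.
and_inputs = [[0, 0], [0, 1], [1, 0], [1, 1]]
outputs = [hardlim(dot(w, p) + b) for p in and_inputs]

print(b, outputs)
```

With this assumed weight vector the bias comes out as -3, and only the input `[1, 1]` produces an output of 1.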

Multiple-Neuron Perceptron
Each neuron will have its own decision boundary: ${}_i\mathbf{w}^T\mathbf{p} + b_i = 0$.
A single neuron can classify input vectors into two categories.
A multi-neuron perceptron with $S$ neurons can classify input vectors into up to $2^S$ categories, one for each possible output vector.
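A small sketch of the $2^S$ count, assuming a hypothetical two-neuron layer whose decision boundaries are the two coordinate axes. The weights, biases, and sample points are illustrative and not taken from the slides.

```python
def hardlim(n):
    """Hard-limit transfer function: 1 if n >= 0, otherwise 0."""
    return 1 if n >= 0 else 0


def layer_output(W, b, p):
    """Output code of a multi-neuron perceptron: one 0/1 entry per neuron."""
    return tuple(
        hardlim(sum(w * x for w, x in zip(row, p)) + bias)
        for row, bias in zip(W, b)
    )


# Hypothetical S = 2 layer: neuron 1 splits on p_1 = 0, neuron 2 on p_2 = 0.
W = [[1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0]

# One sample point per quadrant of the input plane.
points = [[1.0, 1.0], [-1.0, 1.0], [-1.0, -1.0], [1.0, -1.0]]
codes = [layer_output(W, b, p) for p in points]

num_categories = len(set(codes))
print(codes, num_categories)
```

The four points receive four distinct output codes, matching $2^S = 4$ for $S = 2$. Whether all $2^S$ codes are reachable depends on how the boundaries are arranged, so the count is an upper bound.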

Perceptron Learning Rule
Supervised training: the learning rule is provided with a set of examples of proper network behaviour, $\{\mathbf{p}_1, t_1\}, \{\mathbf{p}_2, t_2\}, \dots, \{\mathbf{p}_Q, t_Q\}$, where $\mathbf{p}_q$ is an input to the network and $t_q$ is the corresponding target output.
As each input is supplied to the network, the network output is compared to the target.
The learning rule then adjusts the weights and biases of the network in order to move the network output closer to the target.

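Test Problem
Input/target pairs: $\{\mathbf{p}_1 = [1 \;\; 2]^T,\ t_1 = 1\}$, $\{\mathbf{p}_2 = [-1 \;\; 2]^T,\ t_2 = 0\}$, $\{\mathbf{p}_3 = [0 \;\; -1]^T,\ t_3 = 0\}$.
The bias is removed for simplicity, so the decision boundary must pass through the origin.
(Figure panels: decision boundaries; weight vectors.)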

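Starting Point
Random initial weight: ${}_1\mathbf{w}^T = [1.0 \;\; -0.8]$.
Present $\mathbf{p}_1$ to the network: $a = \mathrm{hardlim}({}_1\mathbf{w}^T\mathbf{p}_1) = \mathrm{hardlim}\left([1.0 \;\; -0.8]\begin{bmatrix}1\\2\end{bmatrix}\right) = \mathrm{hardlim}(-0.6) = 0$.
Incorrect classification, since the target for $\mathbf{p}_1$ is 1.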
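The arithmetic of this first presentation can be reproduced in a few lines. The sketch assumes the initial weight is `[1.0, -0.8]` and the input is `[1, 2]`, the reading under which the slide's result of an output of 0 comes out.

```python
def hardlim(n):
    """Hard-limit transfer function: 1 if n >= 0, otherwise 0."""
    return 1 if n >= 0 else 0


w = [1.0, -0.8]  # assumed initial weight vector (no bias in this problem)
p1 = [1.0, 2.0]  # first input vector
t1 = 1           # its target

n = sum(wi * pi for wi, pi in zip(w, p1))  # 1.0*1 + (-0.8)*2, about -0.6
a = hardlim(n)
misclassified = (a != t1)

print(n, a, misclassified)
```

The net input is negative, so the output is 0 while the target is 1: the initial weights misclassify the first input.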

Tentative Learning Rule
We need to alter the weight vector so that it points more toward $\mathbf{p}_1$, so that in the future it has a better chance of classifying $\mathbf{p}_1$ correctly.

Tentative Learning Rule
One approach would be to set ${}_1\mathbf{w}$ equal to $\mathbf{p}_1$.
This rule cannot always find a solution. If we apply the rule ${}_1\mathbf{w} = \mathbf{p}$ every time one of these vectors is misclassified, the network weights will simply oscillate back and forth.
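The oscillation can be reproduced with a small sketch. The two class-1 vectors below are hypothetical, chosen to be more than 90 degrees apart so that pointing the weight vector at either one misclassifies the other; they are not the vectors from the slide's figure.

```python
def hardlim(n):
    """Hard-limit transfer function: 1 if n >= 0, otherwise 0."""
    return 1 if n >= 0 else 0


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


# Hypothetical class-1 inputs (target 1) separated by an obtuse angle.
class_one = [[1.0, 0.2], [-1.0, 0.2]]

w = list(class_one[0])  # start with w pointing at the first vector
history = [tuple(w)]

for _ in range(3):  # three passes over the data
    for p in class_one:
        if hardlim(dot(w, p)) == 0:  # target is 1 but the output is 0
            w = list(p)              # naive rule: set w equal to p
            history.append(tuple(w))

# A solution does exist: this weight vector classifies both inputs as 1.
w_good = [0.0, 1.0]
solvable = all(hardlim(dot(w_good, p)) == 1 for p in class_one)

print(history, solvable)
```

The recorded weights alternate between the two input vectors and never settle, even though a valid weight vector exists for this data.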

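Tentative Learning Rule
Another possibility would be to add $\mathbf{p}_1$ to ${}_1\mathbf{w}$.
This rule can be stated as: if $t = 1$ and $a = 0$, then ${}_1\mathbf{w}^{new} = {}_1\mathbf{w}^{old} + \mathbf{p}$.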
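Applying this additive step once gives the following, as a sketch. It assumes the initial weight `[1.0, -0.8]` and the input `[1, 2]` with target 1; the variable names are illustrative.

```python
def hardlim(n):
    """Hard-limit transfer function: 1 if n >= 0, otherwise 0."""
    return 1 if n >= 0 else 0


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


w_old = [1.0, -0.8]  # assumed initial weight vector
p1 = [1.0, 2.0]      # misclassified input
t1 = 1               # its target

a_before = hardlim(dot(w_old, p1))

# Tentative rule: if t = 1 and a = 0, add the input to the weight vector.
if t1 == 1 and a_before == 0:
    w_new = [wi + pi for wi, pi in zip(w_old, p1)]
else:
    w_new = list(w_old)

a_after = hardlim(dot(w_new, p1))

print(w_new, a_before, a_after)
```

Adding the input rotates the weight vector toward $\mathbf{p}_1$, and the same input is then classified correctly.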

Second Input Vector

Third Input Vector

Unified Learning Rule

Unified Learning Rule
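The perceptron rule is commonly written with an error term $e = t - a$, giving ${}_1\mathbf{w}^{new} = {}_1\mathbf{w}^{old} + e\,\mathbf{p}$ and $b^{new} = b^{old} + e$. The sketch below assumes that form. The function name `train_perceptron`, its parameters, and the three input/target pairs (read as $[1, 2] \to 1$, $[-1, 2] \to 0$, $[0, -1] \to 0$, with no bias) are assumptions for illustration.

```python
def hardlim(n):
    """Hard-limit transfer function: 1 if n >= 0, otherwise 0."""
    return 1 if n >= 0 else 0


def train_perceptron(samples, w, b=0.0, use_bias=True, max_epochs=100):
    """Train a single-neuron perceptron with the error-based rule.

    For each sample (p, t): e = t - a, then w <- w + e*p and b <- b + e.
    Stops after the first full pass with no errors, or after max_epochs.
    """
    w = list(w)
    for _ in range(max_epochs):
        errors = 0
        for p, t in samples:
            a = hardlim(sum(wi * pi for wi, pi in zip(w, p)) + b)
            e = t - a  # +1, 0, or -1
            if e != 0:
                w = [wi + e * pi for wi, pi in zip(w, p)]
                if use_bias:
                    b += e
                errors += 1
        if errors == 0:
            break
    return w, b


# Assumed input/target pairs for the test problem (bias removed).
samples = [([1.0, 2.0], 1), ([-1.0, 2.0], 0), ([0.0, -1.0], 0)]

w_final, b_final = train_perceptron(samples, [1.0, -0.8], use_bias=False)
predictions = [
    hardlim(sum(wi * pi for wi, pi in zip(w_final, p)) + b_final)
    for p, _ in samples
]

print(w_final, predictions)
```

Under these assumptions one pass makes three corrections and a second pass makes none, leaving a weight vector of approximately `[3.0, 0.2]` that classifies all three inputs correctly.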

Multiple-Neuron Perceptron

Apple/Banana Example

Second Iteration

Check

Perceptron Rule Capability

Perceptron Limitations