CE213 Artificial Intelligence
Lecture 13: Neural Networks

What is a Neural Network?
Why Neural Networks? (New Models and Algorithms for Problem Solving)
McCulloch-Pitts Neural Nets
Learning Using the Delta Rule
Back-Propagation Networks
McCulloch-Pitts Neural Nets

The Basic Unit (artificial neuron, mapping input to output)

It is called an MP neuron or MP unit (figure on the left), which simulates a typical biological neuron (figure on the right).

[Figure: an MP unit with inputs x_1, ..., x_n, weights w_1, ..., w_n, threshold θ and output y]

where x_i is the i-th input
      w_i is the weight of the i-th input
      θ   is the unit's threshold
      y   is the unit's output

[Figure: a typical biological neuron (approximately 100 billion neurons in the human brain)]
McCulloch-Pitts Neural Nets (2)

The output (binary) is computed from the input as follows:

    y = 1 if w_1 x_1 + w_2 x_2 + ... + w_n x_n ≥ θ
    y = 0 if w_1 x_1 + w_2 x_2 + ... + w_n x_n < θ

Basic information processing: a weighted sum followed by thresholding, which can be quite powerful (e.g., fault diagnosis).

Note that this is much more like a logic gate than a biological neuron.
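The weighted-sum-and-threshold rule above can be sketched in a few lines of Python. This is an illustrative sketch only; the function name `mp_unit` and the variable names are assumptions, not from the slides.

```python
def mp_unit(inputs, weights, theta):
    """MP unit: output 1 if the weighted sum of the inputs meets the
    threshold theta, otherwise output 0."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum >= theta else 0

# A unit with weights (1, 1) and threshold 1.5 behaves like AND
# (see the following slides):
print(mp_unit([1, 1], [1, 1], 1.5))  # 1
print(mp_unit([1, 0], [1, 1], 1.5))  # 0
```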
What Can an MP Unit Compute?

Assume inputs are binary (i.e., 1 or 0).

AND

[Figure: MP unit with inputs A and B, both weights 1, threshold 1.5, output Y]

Y = 1 if A + B ≥ 1.5; Y = 0 otherwise.

Of course, there are an infinite number of other correct designs; e.g., changing the threshold to 1.1 or 1.2 would also work.

    A  B  Y
    0  0  0
    0  1  0
    1  0  0
    1  1  1

(A and B correspond to x_1 and x_2 in the MP neuron model presented in the previous slides. The weighted sum is w_A·A + w_B·B, where w_A = w_1 = 1 and w_B = w_2 = 1.)
What Can an MP Unit Compute? (2)

OR

[Figure: MP unit with inputs A and B, both weights 1, threshold 0.5, output Y]

Y = 1 if A + B ≥ 0.5; Y = 0 otherwise.

    A  B  Y
    0  0  0
    0  1  1
    1  0  1
    1  1  1
What Can an MP Unit Compute? (3)

NOT

[Figure: MP unit with input A, weight −1, threshold −0.5, output Y]

Y = 0 if A > 0.5; Y = 1 otherwise.
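The three gate designs on these slides can each be realised by a single MP unit with the weights and thresholds given above. A minimal sketch (the gate function names are assumptions for illustration):

```python
def mp_unit(inputs, weights, theta):
    """Output 1 if the weighted sum of the inputs meets the threshold."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= theta else 0

def AND(a, b):
    return mp_unit([a, b], [1, 1], 1.5)   # weights 1, 1; threshold 1.5

def OR(a, b):
    return mp_unit([a, b], [1, 1], 0.5)   # weights 1, 1; threshold 0.5

def NOT(a):
    return mp_unit([a], [-1], -0.5)       # weight -1; threshold -0.5

for a in (0, 1):
    for b in (0, 1):
        print(a, b, AND(a, b), OR(a, b))
print(NOT(0), NOT(1))  # 1 0
```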
What Can an MP Unit Compute? (4)

Are there any Boolean functions that an MP unit cannot compute?

How about XOR (exclusive OR)?

    A  B  A xor B
    1  1  0
    1  0  1
    0  1  1
    0  0  0
What Can an MP Unit Compute? (5)

Suppose the weights are w_A and w_B, so the input to the MP unit is w_A·A + w_B·B. Then computing XOR requires (check the equations on slide 3):

    w_A + w_B < θ   (1) (for A=1, B=1, Y=0)
    w_A ≥ θ         (2) (for A=1, B=0, Y=1)
    w_B ≥ θ         (3) (for A=0, B=1, Y=1)
    0 < θ           (4) (for A=0, B=0, Y=0)

But clearly if both (2) and (3) are true, then (1) must be false: (4) implies that θ, w_A and w_B are positive, so adding (2) and (3) gives w_A + w_B ≥ 2θ > θ, contradicting (1). Thus there is no combination of weights and threshold that will create an MP unit that computes the XOR function.

An MP unit cannot compute XOR!
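The impossibility argument above can be checked empirically (this is an illustration, not a proof): search a grid of weights and thresholds and confirm that no single threshold unit reproduces the XOR truth table. The names `xor_table` and `grid` are assumptions for this sketch.

```python
def mp_unit(inputs, weights, theta):
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= theta else 0

xor_table = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
grid = [i / 10 for i in range(-30, 31)]   # candidate values -3.0 .. 3.0

found = False
for wa in grid:
    for wb in grid:
        for theta in grid:
            if all(mp_unit([a, b], [wa, wb], theta) == y
                   for (a, b), y in xor_table.items()):
                found = True

print(found)  # False: no (w_A, w_B, theta) on the grid computes XOR
```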
What Can an MP Unit Compute? (6)

A Graphical Interpretation

Consider the OR unit: w_A = w_B = 1; θ = 0.5.

What is the range of values for which Y = 1? All points for which w_A·A + w_B·B > θ, i.e., whenever A + B > 0.5.

[Figure: the (B, A) plane with the line A + B = 0.5; Y = 1 above the line, Y = 0 below]

Thus:
- All points above the line A + B = 0.5 lead to Y = 1.
- All points below the line lead to Y = 0.
What Can an MP Unit Compute? (7)

Similarly for the AND unit: w_A = w_B = 1; θ = 1.5.

[Figure: the (B, A) plane with the line A + B = 1.5; Y = 1 above the line, Y = 0 below]

- All points above the line A + B = 1.5 lead to Y = 1.
- All points below the line lead to Y = 0.
What Can an MP Unit Compute? (8)

The General Case for 2 Inputs

Y = 1 if and only if w_A·A + w_B·B ≥ θ.

w_A·A + w_B·B = θ is a straight line that intercepts the axes at θ/w_A and θ/w_B.

[Figure: the decision line w_A·A + w_B·B = θ with axis intercepts θ/w_A and θ/w_B; Y = 1 on one side, Y = 0 on the other]

θ, w_A and w_B can be determined by learning from examples.
What Can an MP Unit Compute? (9)

So what about XOR?

[Figure: the four XOR input points in the (B, A) plane; (0, 1) and (1, 0) in red (Y = 1), (0, 0) and (1, 1) in blue (Y = 0)]

There is no straight line that separates the blue points (Y = 0) from the red points (Y = 1).
What Can an MP Unit Compute? (10)

More Than 2 Inputs (one unit only)

This result generalises to any number of inputs. The decision boundary

    w_1 x_1 + w_2 x_2 + ... + w_n x_n = θ

is the equation of a hyperplane in n-dimensional space. So an MP unit divides the space into two regions separated by a hyperplane. The values of the weights and threshold determine the position and orientation of this hyperplane.

This may solve complicated classification problems in high-dimensional space! ("First golden age" of neural network research in the 1960s)
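As a small illustration of the n-input case (this example is an assumption, not from the slides): a single 3-input unit with weights (1, 1, 1) and threshold 1.5 has the hyperplane x_1 + x_2 + x_3 = 1.5 as its decision boundary, which separates inputs with a majority of 1s from the rest, so the unit computes the 3-input majority function.

```python
from itertools import product

def mp_unit(inputs, weights, theta):
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= theta else 0

# Enumerate all 8 binary input triples; y = 1 exactly when at least
# two of the three inputs are 1 (majority vote).
for bits in product((0, 1), repeat=3):
    y = mp_unit(list(bits), [1, 1, 1], 1.5)
    print(bits, y)
```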
Linear Separability

Thus the only functions that an n-input MP unit can compute are those defined by a linear surface:
- a straight line when n = 2
- a plane when n = 3
- a hyperplane when n > 3

Such functions are said to be linearly separable. For this reason MP neurons are sometimes called Linear Separation Units (LSUs). (They are not linear units, but linear separation units.)
More Than One Unit

We could obviously use the output of one MP unit as an input to another MP unit. In such a way we could build an artificial neural network.

Could such a neural network (more than one neuron) compute functions that a single unit could not? Yes.
More Than One Unit (2)

XOR with 4 MP units

[Figure: a network of four MP units. One unit computes A AND B (threshold 1.5); a second unit inverts it to give NOT(A AND B) (weight −1, threshold −0.5); a third unit computes A OR B (threshold 0.5); a final unit (threshold 1.5) ANDs the outputs of the NOT and OR units to give A xor B]

Is it possible to implement the AND-followed-by-NOT part by one MP unit only?
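The 4-unit network described above can be sketched by composing single MP units, following XOR(A, B) = (A OR B) AND NOT(A AND B). The function name `xor_net` is an assumption for this sketch.

```python
def mp_unit(inputs, weights, theta):
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= theta else 0

def xor_net(a, b):
    and_out = mp_unit([a, b], [1, 1], 1.5)    # unit 1: A AND B
    not_out = mp_unit([and_out], [-1], -0.5)  # unit 2: NOT (A AND B)
    or_out = mp_unit([a, b], [1, 1], 0.5)     # unit 3: A OR B
    # unit 4: (A OR B) AND NOT(A AND B) = A xor B
    return mp_unit([or_out, not_out], [1, 1], 1.5)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_net(a, b))  # matches the XOR truth table
```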
More Than One Unit (3)

An optional exercise: devise a 3-unit MP network that computes XOR.
Hint: consider devising an MP unit that computes NOT(A AND B).

Designing an MP neural network by hand, as we have done so far, seems difficult, especially for high-dimensional input problems.

Solution: learning from data/experience (topic for the next lecture).
Summary

- McCulloch-Pitts Neural Nets
- MP Unit (Neuron)
- What Can an MP Unit Compute?
- Linear Separability
- More Than One Unit (Neuron)