# Simple Neural Nets For Pattern Classification

Save this PDF as:
Size: px
Start display at page:

## Transcription

1 CHAPTER 2 Simple Neural Nets For Pattern Classification Neural Networks

2 General Discussion One of the simplest tasks that neural nets can be trained to perform is pattern classification. In pattern classification problems, each input vector (pattern) belongs, or does not belong, to a particular class or category. For a neural net approach, we assume we have a set of training patterns for which the correct classification is known. The output unit represents membership in the class with a response of 1; a response of - 1 (or 0 if binary representation is used) indicates that the pattern is not a member of the class. 2

3 General Discussion In 1963, neural networks was used to detect heart abnormalities with EKG types of data as input (46 measurements) and classified them into "normal" or "abnormal. When patterns may or may not belong to several classes, there is an output unit for each class. In this chapter, we shall discuss three methods of training a simple single layer neural net for pattern classification: the Hebb rule, the perceptron learning rule, and the delta rule. 3

4 Architecture The basic architecture of the simplest possible neural networks that perform pattern classification consists of a layer of input units (as many units as the patterns to be classified have components) and a single output unit. 4

5 Architecture 5

6 Biases and Thresholds A bias acts exactly as a weight on a connection from a unit whose activation is always 1. Increasing the bias increases the net input to the unit. If a bias is included, the activation function is typically taken to be: 6

7 Biases and Thresholds Some authors do not use a bias weight, but instead use a fixed threshold 0 for the activation function. In that case : 7

8 The role of a bias and threshold we consider the separation of the input space into regions where the response of the net is positive and regions where the response is negative. 8

9 The role of a bias and threshold The boundary between the values of x1 and x2: With bias: With threshold: 9

10 The role of a bias and threshold During training, values of w, and w2 are determined so that the net will have the correct response for the training data. including neither a bias nor a threshold is equivalent to requiring the separating line (or plane or hyperplane for inputs with more components) to pass through the origin. 10

11 Linear Separability For a particular output unit, the desired response is a "yes" if the input pattern is a member of its class and a "no" if it is not. A "yes" response is represented by an output signal of 1, a "no" by an output signal of - 1 (for bipolar signals). Since the net input to the output unit is: the boundary between the region is: 11

12 Linear Separability If there are weights (and a bias) so that all of the training input vectors for which the correct response is +1 lie on one side of the decision boundary and all of the training input vectors for which the correct response is -1 lie on the other side of the decision boundary, we say that the problem is "linearly separable." 12

13 Linear Separability Minsky and Papert [I988] showed that a single-layer net can learn only linearly separable problems. Furthermore, it is easy to extend this result to show that multilayer nets with linear activation functions are no more powerful than single-layer nets (since the composition of linear functions is linear). 13

14 AND function 14

15 AND function 15

16 Linear Separability OR function is also similar to AND function. The equations of the decision boundaries are not unique. Note that if a bias weight were not included in these examples, the decision boundary would be forced to go through the origin. Not all simple two-input, single-output mappings can be solved by a single layer net (even with a bias included), as is illustrated in Example

17 XOR function 17

18 Data Representation Binary or Bipolar Binary representation is also not as good as bipolar if we want the net to generalize (i.e., respond to input data similar, but not identical to, training data). Using bipolar input, missing data can be distinguished from mistaken data. Missing values can be represented by "0" and mistakes by reversing the input value from + 1 to - 1, or vice versa. In general, bipolar representation is preferable 18

19 HEBB NET The earliest and simplest learning rule for a neural net is generally known as the Hebb rule. Hebb proposed that learning occurs by modification of the synapse strengths (weights) in a manner such that if two interconnected neurons are both "on" at the same time, then the weight between those neurons should be increased. If data are represented in bipolar form, it is easy to express the desired weight update as: 19

20 HEBB NET If the data are binary, this formula does not distinguish between a training pair in which an input unit is "on" and the target value is "off" and a training pair in which both the input unit and the target value are "off." 20

21 Algorithm 21

22 Example 2.5 AND function, binary input, binary targets 22

23 Example

24 Example

25 Example 2.7 AND function, bipolar input, bipolar targets 25

26 Example 2.7 First input: 26

27 Example 2.7 First input: 27

28 Example 2.7 Second input: 28

29 Example 2.7 Second input: 29

30 Example 2.7 Third input: 30

31 Example 2.7 Third input: 31

32 Example 2.7 Fourth input: 32

33 Example 2.7 Fourth input : 33

34 Example 2.8 Character recognition 34

35 Example 2.8 The correct response for the first pattern is "on," or + 1, so the weights after presenting the first pattern are simply the input pattern. The bias weight after presenting this is + 1. The correct response for the second pattern is "off," or - 1, so the weight change when the second pattern is presented is , , , ,

36 Example 2.8 In addition, the weight change for the bias weight is - 1. Adding the weight change to the weights representing the first pattern gives the final weights: , , , , , The bias weight is 0 36

37 Example 2.8 The net input (for any input pattern) is the dot product of the input pattern with the weight vector. For the first training vector, the net input is 42, so the response is positive, as desired. For the second training pattern, the net input is -42, so the response is clearly negative, also as desired. 37

38 Example 2.8 The first type of change is usually referred to as "mistakes in the data. changing a 1 to a - 1, or vice versa. The second type of change is called "missing data. the value 0, rather than 1 or - 1. In general, a net can handle more missing components than wrong components; in other words, with input data, "It's better not to guess." 38

39 Example 2.10 Limitation of Hebb rule training for bipolar patterns. 39

40 Example 2.10 Limitation of Hebb rule training for bipolar patterns. 40

41 Example 2.10 Again, it is clear that the weights do not give the correct output for the first input pattern. Figure 2.13 shows that the input points are linearly separable; one possible plane, XI + x2 + x3 + (-2) = 0, to perform the separation is shown. This plane corresponds to a weight vector of (1 1 1) and a bias of

42 Example

43 PERCEPTRON The perceptron learning rule is a more powerful learning rule than the Hebb rule. Under suitable assumptions, its iterative learning procedure can be proved to converge to the correct weights, i.e., the weights that allow the net to produce the correct output value for each of the training input patterns. 43

44 PERCEPTRON The activation function for each associator unit was the binary step function with an arbitrary, but fixed, threshold. 44

45 PERCEPTRON If an error occurred for a particular training input pattern, the weights would be changed according to the formula where the target value t is + 1 or - 1 and a is the learning rate. If an error did not occur, the weights would not be changed. Training would continue until no error occurred. 45

46 Architecture 46

47 Algorithm 47

48 Algorithm 48

49 Algorithm Two separating lines: 49

50 Example 2.11 AND function: binary inputs, bipolar targets Theta=0.2 First training sample: 50

51 Example

52 Example 2.11 Second training sample: 52

53 Example

54 Example 2.11 Third training sample: Fourth training sample: 54

55 Example 2.11 The response of the all inputs are negative which is not true for the (1,1). So we need more iteration. After ninth epoch: After tenth epoch: 55

56 Example 2.11 Separating lines are: 56

57 Example 2.11 Separating lines are: 57

58 Example 2.12 A Perceptron for the AND function: bipolar inputs and targets. The training process Alpha=1 and threshold and initial weights = 0. 58

59 Example 2.12 First epoch: 59

60 Example 2.12 Second epoch: the system was fully trained after the first epoch. Bipolar representation reduces number of epochs. 60

61 Application Procedure 61

62 Training for several output neurons 62

63 Training for several output neurons 63

64 Example 2.15 We want to classify the following 21 characters written by 3 fonts into 7 classes. 64

65 Example

66 Example

67 Example

68 Example

69 Example

70 Example

71 Adaline The ADALINE = ADAptive Linear Neuron. an ADALINE can be trained using the delta rule, also known as the least mean squares (LMS) or Widrow-Hoff rule. During training, the activation of the unit is its net input, i.e., the activation function is the identity function. The learning rule minimizes the mean squared error between the activation and the target value. 71

72 Architecture 72

73 Algorithm 73

74 Learning rate If Alpha is too large, the learning process will not converge; if too small a value is chosen, learning will be extremely slow For a single neuron, a practical range for the learning rate is: N is the number of inputs. 74

75 Applications 75

76 Example 2.16 AND function: binary inputs, bipolar targets 76

77 Example 2.16 We want to minimize: Weights that minimize this error are: Thus, the separating line is: 77

78 Example 2.16 The total squared error for the four training patterns with these weights is 1. In Example 2.11, the boundary line of perceptron is: The total squared error for the minimizing weights found by the perceptron is 10/9. 78

79 Example 2.17 AND function: bipolar inputs, bipolar targets. The weights that minimize the total error for the bipolar form of the AND function are: Thus, the separating line is: as found by the perceptron in Example

80 Derivation The delta rule changes the weights of the neural connections so as to minimize the difference between the net input to the output unit, y-in, and the target value t. The aim is to minimize the error over all training patterns. The delta rule for adjusting the I-th weight (for each pattern) is: 80

81 Derivation The squared error for a particular training pattern is: The gradient of E is the vector consisting of the partial derivatives of E with respect to each of the weights. The gradient gives the direction of most rapid increase in E; the opposite direction gives the most rapid decrease in the error. The error can be reduced by adjusting the weight wi in the direction of 81

82 Derivation 82

83 Derivation for more than one output unit: 83

84 Derivation 84

87 Algorithm In the MRI algorithm (the original form of MADALINE training), only the weights for the hidden ADALINES are adjusted; the weights for the output unit are fixed. The MRII algorithm provides a method for adjusting all weights in the net. 87

88 MRI the weights v1 and v2 and the bias b3 that feed into the output unit Y are determined so that the response of unit Y is 1 if the signal it receives from either Z1 or Z2 (or both) is 1 and is - 1 if both Z1 and Z2 send a signal of - 1. In other words, the unit Y performs the logic function OR on the signals it receives from Z1 and Z2. The weights into Y are: 88

89 MRI The weights on the first hidden ADALINE ( ) and the weights on the second hidden ADALINE ( ) are adjusted according to the algorithm. The activation function for units ZI, Z2, and Y is: 89

90 MRI 90

91 MRI 91

92 MRI 92

93 MRI Step 7 is motivated by the desire to (1) update the weights only if an error occurred and (2) update the weights in such a way that it is more likely for the net to produce the desired response. If t = 1 and error has occurred, it means that all Z units had value - 1 and at least one Z unit needs to have a value of + 1. Therefore, we change the Z unit whose net input is closest to 0 and adjust its weights. 93

94 MRI If t = - 1 and error has occurred, it means that at least one Z unit had value + 1 and all Z units must have value - 1. Therefore, we adjust the weights on all of the Z units with positive net input. 94

95 Example 2-20 Training a MADALINE for the XOR function. We use MRI algorithm and only the first weight update is shown. Training patterns are: 95

96 Example 2-20 Weight of Z1, Z2 are small random values. Weight of Y is those found in Example 2.19 and Alpha is

97 Example

98 Example

99 Example

100 Example 2-20 After fourth epoch, the final weights are: 100

101 Example 2-21 Geometric interpretation of MADALINE weights. The positive response region for the Madaline trained in the previous example is the union of the regions where each of the hidden units have a positive response. For hidden unit Z1, the boundary line is: 101

102 Example

103 Example 2-21 For hidden unit Z2, the boundary line is: 103

104 Example

105 Example 2-21 The response of MADALINE is: 105

### Single layer NN. Neuron Model

Single layer NN We consider the simple architecture consisting of just one neuron. Generalization to a single layer with more neurons as illustrated below is easy because: M M The output units are independent

### The perceptron learning algorithm is one of the first procedures proposed for learning in neural network models and is mostly credited to Rosenblatt.

1 The perceptron learning algorithm is one of the first procedures proposed for learning in neural network models and is mostly credited to Rosenblatt. The algorithm applies only to single layer models

### Neural Networks and Fuzzy Logic Rajendra Dept.of CSE ASCET

Unit-. Definition Neural network is a massively parallel distributed processing system, made of highly inter-connected neural computing elements that have the ability to learn and thereby acquire knowledge

### Chapter 2 Single Layer Feedforward Networks

Chapter 2 Single Layer Feedforward Networks By Rosenblatt (1962) Perceptrons For modeling visual perception (retina) A feedforward network of three layers of units: Sensory, Association, and Response Learning

### 2- AUTOASSOCIATIVE NET - The feedforward autoassociative net considered in this section is a special case of the heteroassociative net.

2- AUTOASSOCIATIVE NET - The feedforward autoassociative net considered in this section is a special case of the heteroassociative net. - For an autoassociative net, the training input and target output

### Unit III. A Survey of Neural Network Model

Unit III A Survey of Neural Network Model 1 Single Layer Perceptron Perceptron the first adaptive network architecture was invented by Frank Rosenblatt in 1957. It can be used for the classification of

### ARTIFICIAL NEURAL NETWORK PART I HANIEH BORHANAZAD

ARTIFICIAL NEURAL NETWORK PART I HANIEH BORHANAZAD WHAT IS A NEURAL NETWORK? The simplest definition of a neural network, more properly referred to as an 'artificial' neural network (ANN), is provided

### Artificial Neural Networks The Introduction

Artificial Neural Networks The Introduction 01001110 01100101 01110101 01110010 01101111 01101110 01101111 01110110 01100001 00100000 01110011 01101011 01110101 01110000 01101001 01101110 01100001 00100000

### Introduction To Artificial Neural Networks

Introduction To Artificial Neural Networks Machine Learning Supervised circle square circle square Unsupervised group these into two categories Supervised Machine Learning Supervised Machine Learning Supervised

### CHAPTER 3. Pattern Association. Neural Networks

CHAPTER 3 Pattern Association Neural Networks Pattern Association learning is the process of forming associations between related patterns. The patterns we associate together may be of the same type or

### Backpropagation Neural Net

Backpropagation Neural Net As is the case with most neural networks, the aim of Backpropagation is to train the net to achieve a balance between the ability to respond correctly to the input patterns that

POLYTECHNIC UNIVERSITY Department of Computer and Information Science ADALINE for Pattern Classification K. Ming Leung Abstract: A supervised learning algorithm known as the Widrow-Hoff rule, or the Delta

### Artificial Neural Networks. Part 2

Artificial Neural Netorks Part Artificial Neuron Model Folloing simplified model of real neurons is also knon as a Threshold Logic Unit x McCullouch-Pitts neuron (943) x x n n Body of neuron f out Biological

### Introduction to Artificial Neural Networks

Facultés Universitaires Notre-Dame de la Paix 27 March 2007 Outline 1 Introduction 2 Fundamentals Biological neuron Artificial neuron Artificial Neural Network Outline 3 Single-layer ANN Perceptron Adaline

### In the Name of God. Lecture 11: Single Layer Perceptrons

1 In the Name of God Lecture 11: Single Layer Perceptrons Perceptron: architecture We consider the architecture: feed-forward NN with one layer It is sufficient to study single layer perceptrons with just

### Introduction to Neural Networks

Introduction to Neural Networks What are (Artificial) Neural Networks? Models of the brain and nervous system Highly parallel Process information much more like the brain than a serial computer Learning

### Course 395: Machine Learning - Lectures

Course 395: Machine Learning - Lectures Lecture 1-2: Concept Learning (M. Pantic) Lecture 3-4: Decision Trees & CBC Intro (M. Pantic & S. Petridis) Lecture 5-6: Evaluating Hypotheses (S. Petridis) Lecture

### Last update: October 26, Neural networks. CMSC 421: Section Dana Nau

Last update: October 26, 207 Neural networks CMSC 42: Section 8.7 Dana Nau Outline Applications of neural networks Brains Neural network units Perceptrons Multilayer perceptrons 2 Example Applications

### Artificial Neural Networks. Historical description

Artificial Neural Networks Historical description Victor G. Lopez 1 / 23 Artificial Neural Networks (ANN) An artificial neural network is a computational model that attempts to emulate the functions of

### Input layer. Weight matrix [ ] Output layer

MASSACHUSETTS INSTITUTE OF TECHNOLOGY Department of Electrical Engineering and Computer Science 6.034 Artificial Intelligence, Fall 2003 Recitation 10, November 4 th & 5 th 2003 Learning by perceptrons

### Machine Learning. Neural Networks

Machine Learning Neural Networks Bryan Pardo, Northwestern University, Machine Learning EECS 349 Fall 2007 Biological Analogy Bryan Pardo, Northwestern University, Machine Learning EECS 349 Fall 2007 THE

### Neural networks. Chapter 20, Section 5 1

Neural networks Chapter 20, Section 5 Chapter 20, Section 5 Outline Brains Neural networks Perceptrons Multilayer perceptrons Applications of neural networks Chapter 20, Section 5 2 Brains 0 neurons of

### Artifical Neural Networks

Neural Networks Artifical Neural Networks Neural Networks Biological Neural Networks.................................. Artificial Neural Networks................................... 3 ANN Structure...........................................

### Neural Networks. Chapter 18, Section 7. TB Artificial Intelligence. Slides from AIMA 1/ 21

Neural Networks Chapter 8, Section 7 TB Artificial Intelligence Slides from AIMA http://aima.cs.berkeley.edu / 2 Outline Brains Neural networks Perceptrons Multilayer perceptrons Applications of neural

### Neural Networks. Fundamentals of Neural Networks : Architectures, Algorithms and Applications. L, Fausett, 1994

Neural Networks Neural Networks Fundamentals of Neural Networks : Architectures, Algorithms and Applications. L, Fausett, 1994 An Introduction to Neural Networks (nd Ed). Morton, IM, 1995 Neural Networks

### Perceptron. (c) Marcin Sydow. Summary. Perceptron

Topics covered by this lecture: Neuron and its properties Mathematical model of neuron: as a classier ' Learning Rule (Delta Rule) Neuron Human neural system has been a natural source of inspiration for

### CE213 Artificial Intelligence Lecture 14

CE213 Artificial Intelligence Lecture 14 Neural Networks: Part 2 Learning Rules -Hebb Rule - Perceptron Rule -Delta Rule Neural Networks Using Linear Units [ Difficulty warning: equations! ] 1 Learning

### Lecture 4: Perceptrons and Multilayer Perceptrons

Lecture 4: Perceptrons and Multilayer Perceptrons Cognitive Systems II - Machine Learning SS 2005 Part I: Basic Approaches of Concept Learning Perceptrons, Artificial Neuronal Networks Lecture 4: Perceptrons

### Neural Networks and the Back-propagation Algorithm

Neural Networks and the Back-propagation Algorithm Francisco S. Melo In these notes, we provide a brief overview of the main concepts concerning neural networks and the back-propagation algorithm. We closely

### Neural Networks (Part 1) Goals for the lecture

Neural Networks (Part ) Mark Craven and David Page Computer Sciences 760 Spring 208 www.biostat.wisc.edu/~craven/cs760/ Some of the slides in these lectures have been adapted/borrowed from materials developed

### Artificial Neural Networks Examination, June 2005

Artificial Neural Networks Examination, June 2005 Instructions There are SIXTY questions. (The pass mark is 30 out of 60). For each question, please select a maximum of ONE of the given answers (either

### Part 8: Neural Networks

METU Informatics Institute Min720 Pattern Classification ith Bio-Medical Applications Part 8: Neural Netors - INTRODUCTION: BIOLOGICAL VS. ARTIFICIAL Biological Neural Netors A Neuron: - A nerve cell as

### Chapter ML:VI. VI. Neural Networks. Perceptron Learning Gradient Descent Multilayer Perceptron Radial Basis Functions

Chapter ML:VI VI. Neural Networks Perceptron Learning Gradient Descent Multilayer Perceptron Radial asis Functions ML:VI-1 Neural Networks STEIN 2005-2018 The iological Model Simplified model of a neuron:

### COMP-4360 Machine Learning Neural Networks

COMP-4360 Machine Learning Neural Networks Jacky Baltes Autonomous Agents Lab University of Manitoba Winnipeg, Canada R3T 2N2 Email: jacky@cs.umanitoba.ca WWW: http://www.cs.umanitoba.ca/~jacky http://aalab.cs.umanitoba.ca

### Lecture 7 Artificial neural networks: Supervised learning

Lecture 7 Artificial neural networks: Supervised learning Introduction, or how the brain works The neuron as a simple computing element The perceptron Multilayer neural networks Accelerated learning in

### Artificial Neural Networks

Artificial Neural Networks 鮑興國 Ph.D. National Taiwan University of Science and Technology Outline Perceptrons Gradient descent Multi-layer networks Backpropagation Hidden layer representations Examples

### EEE 241: Linear Systems

EEE 4: Linear Systems Summary # 3: Introduction to artificial neural networks DISTRIBUTED REPRESENTATION An ANN consists of simple processing units communicating with each other. The basic elements of

### Neural Networks biological neuron artificial neuron 1

Neural Networks biological neuron artificial neuron 1 A two-layer neural network Output layer (activation represents classification) Weighted connections Hidden layer ( internal representation ) Input

### 2015 Todd Neller. A.I.M.A. text figures 1995 Prentice Hall. Used by permission. Neural Networks. Todd W. Neller

2015 Todd Neller. A.I.M.A. text figures 1995 Prentice Hall. Used by permission. Neural Networks Todd W. Neller Machine Learning Learning is such an important part of what we consider "intelligence" that

### Linear & nonlinear classifiers

Linear & nonlinear classifiers Machine Learning Hamid Beigy Sharif University of Technology Fall 1394 Hamid Beigy (Sharif University of Technology) Linear & nonlinear classifiers Fall 1394 1 / 34 Table

### Neural networks. Chapter 19, Sections 1 5 1

Neural networks Chapter 19, Sections 1 5 Chapter 19, Sections 1 5 1 Outline Brains Neural networks Perceptrons Multilayer perceptrons Applications of neural networks Chapter 19, Sections 1 5 2 Brains 10

### Feedforward Neural Nets and Backpropagation

Feedforward Neural Nets and Backpropagation Julie Nutini University of British Columbia MLRG September 28 th, 2016 1 / 23 Supervised Learning Roadmap Supervised Learning: Assume that we are given the features

### PMR5406 Redes Neurais e Lógica Fuzzy Aula 3 Single Layer Percetron

PMR5406 Redes Neurais e Aula 3 Single Layer Percetron Baseado em: Neural Networks, Simon Haykin, Prentice-Hall, 2 nd edition Slides do curso por Elena Marchiori, Vrije Unviersity Architecture We consider

### Revision: Neural Network

Revision: Neural Network Exercise 1 Tell whether each of the following statements is true or false by checking the appropriate box. Statement True False a) A perceptron is guaranteed to perfectly learn

### Neural Networks Lecture 2:Single Layer Classifiers

Neural Networks Lecture 2:Single Layer Classifiers H.A Talebi Farzaneh Abdollahi Department of Electrical Engineering Amirkabir University of Technology Winter 2011. A. Talebi, Farzaneh Abdollahi Neural

### CSC321 Lecture 5: Multilayer Perceptrons

CSC321 Lecture 5: Multilayer Perceptrons Roger Grosse Roger Grosse CSC321 Lecture 5: Multilayer Perceptrons 1 / 21 Overview Recall the simple neuron-like unit: y output output bias i'th weight w 1 w2 w3

### Neural Networks. Fundamentals Framework for distributed processing Network topologies Training of ANN s Notation Perceptron Back Propagation

Neural Networks Fundamentals Framework for distributed processing Network topologies Training of ANN s Notation Perceptron Back Propagation Neural Networks Historical Perspective A first wave of interest

### Linear discriminant functions

Andrea Passerini passerini@disi.unitn.it Machine Learning Discriminative learning Discriminative vs generative Generative learning assumes knowledge of the distribution governing the data Discriminative

### Machine Learning. Neural Networks. (slides from Domingos, Pardo, others)

Machine Learning Neural Networks (slides from Domingos, Pardo, others) For this week, Reading Chapter 4: Neural Networks (Mitchell, 1997) See Canvas For subsequent weeks: Scaling Learning Algorithms toward

### Artificial Neural Networks Examination, March 2004

Artificial Neural Networks Examination, March 2004 Instructions There are SIXTY questions (worth up to 60 marks). The exam mark (maximum 60) will be added to the mark obtained in the laborations (maximum

### 3.3 Discrete Hopfield Net An iterative autoassociative net similar to the nets described in the previous sections has been developed by Hopfield

3.3 Discrete Hopfield Net An iterative autoassociative net similar to the nets described in the previous sections has been developed by Hopfield (1982, 1984). - The net is a fully interconnected neural

### Pattern Recognition Prof. P. S. Sastry Department of Electronics and Communication Engineering Indian Institute of Science, Bangalore

Pattern Recognition Prof. P. S. Sastry Department of Electronics and Communication Engineering Indian Institute of Science, Bangalore Lecture - 27 Multilayer Feedforward Neural networks with Sigmoidal

### Multilayer Neural Networks

Multilayer Neural Networks Introduction Goal: Classify objects by learning nonlinearity There are many problems for which linear discriminants are insufficient for minimum error In previous methods, the

### Multilayer Neural Networks

Multilayer Neural Networks Multilayer Neural Networks Discriminant function flexibility NON-Linear But with sets of linear parameters at each layer Provably general function approximators for sufficient

### Neural Networks Based on Competition

Neural Networks Based on Competition In some examples of pattern classification we encountered a situation in which the net was trained to classify the input signal into one of the output categories, while

### Neural Networks for Machine Learning. Lecture 2a An overview of the main types of neural network architecture

Neural Networks for Machine Learning Lecture 2a An overview of the main types of neural network architecture Geoffrey Hinton with Nitish Srivastava Kevin Swersky Feed-forward neural networks These are

### Artificial Neural Networks Examination, June 2004

Artificial Neural Networks Examination, June 2004 Instructions There are SIXTY questions (worth up to 60 marks). The exam mark (maximum 60) will be added to the mark obtained in the laborations (maximum

### Pattern Classification

Pattern Classification All materials in these slides were taen from Pattern Classification (2nd ed) by R. O. Duda,, P. E. Hart and D. G. Stor, John Wiley & Sons, 2000 with the permission of the authors

### AN INTRODUCTION TO NEURAL NETWORKS. Scott Kuindersma November 12, 2009

AN INTRODUCTION TO NEURAL NETWORKS Scott Kuindersma November 12, 2009 SUPERVISED LEARNING We are given some training data: We must learn a function If y is discrete, we call it classification If it is

### Multilayer Perceptron

Outline Hong Chang Institute of Computing Technology, Chinese Academy of Sciences Machine Learning Methods (Fall 2012) Outline Outline I 1 Introduction 2 Single Perceptron 3 Boolean Function Learning 4

### Multilayer Feedforward Networks. Berlin Chen, 2002

Multilayer Feedforard Netors Berlin Chen, 00 Introduction The single-layer perceptron classifiers discussed previously can only deal ith linearly separable sets of patterns The multilayer netors to be

### Reification of Boolean Logic

526 U1180 neural networks 1 Chapter 1 Reification of Boolean Logic The modern era of neural networks began with the pioneer work of McCulloch and Pitts (1943). McCulloch was a psychiatrist and neuroanatomist;

### Neural networks. Chapter 20. Chapter 20 1

Neural networks Chapter 20 Chapter 20 1 Outline Brains Neural networks Perceptrons Multilayer networks Applications of neural networks Chapter 20 2 Brains 10 11 neurons of > 20 types, 10 14 synapses, 1ms

### Machine Learning. Neural Networks. (slides from Domingos, Pardo, others)

Machine Learning Neural Networks (slides from Domingos, Pardo, others) Human Brain Neurons Input-Output Transformation Input Spikes Output Spike Spike (= a brief pulse) (Excitatory Post-Synaptic Potential)

### Lab 5: 16 th April Exercises on Neural Networks

Lab 5: 16 th April 01 Exercises on Neural Networks 1. What are the values of weights w 0, w 1, and w for the perceptron whose decision surface is illustrated in the figure? Assume the surface crosses the

### COMP9444 Neural Networks and Deep Learning 2. Perceptrons. COMP9444 c Alan Blair, 2017

COMP9444 Neural Networks and Deep Learning 2. Perceptrons COMP9444 17s2 Perceptrons 1 Outline Neurons Biological and Artificial Perceptron Learning Linear Separability Multi-Layer Networks COMP9444 17s2

### Linear & nonlinear classifiers

Linear & nonlinear classifiers Machine Learning Hamid Beigy Sharif University of Technology Fall 1396 Hamid Beigy (Sharif University of Technology) Linear & nonlinear classifiers Fall 1396 1 / 44 Table

### ARTIFICIAL INTELLIGENCE. Artificial Neural Networks

INFOB2KI 2017-2018 Utrecht University The Netherlands ARTIFICIAL INTELLIGENCE Artificial Neural Networks Lecturer: Silja Renooij These slides are part of the INFOB2KI Course Notes available from www.cs.uu.nl/docs/vakken/b2ki/schema.html

### Νεςπο-Ασαυήρ Υπολογιστική Neuro-Fuzzy Computing

Νεςπο-Ασαυήρ Υπολογιστική Neuro-Fuzzy Computing ΗΥ418 Διδάσκων Δημήτριος Κατσαρός @ Τμ. ΗΜΜΥ Πανεπιστήμιο Θεσσαλίαρ Διάλεξη 4η 1 Perceptron s convergence 2 Proof of convergence Suppose that we have n training

### Vote. Vote on timing for night section: Option 1 (what we have now) Option 2. Lecture, 6:10-7:50 25 minute dinner break Tutorial, 8:15-9

Vote Vote on timing for night section: Option 1 (what we have now) Lecture, 6:10-7:50 25 minute dinner break Tutorial, 8:15-9 Option 2 Lecture, 6:10-7 10 minute break Lecture, 7:10-8 10 minute break Tutorial,

### 3.4 Linear Least-Squares Filter

X(n) = [x(1), x(2),..., x(n)] T 1 3.4 Linear Least-Squares Filter Two characteristics of linear least-squares filter: 1. The filter is built around a single linear neuron. 2. The cost function is the sum

### ) (d o f. For the previous layer in a neural network (just the rightmost layer if a single neuron), the required update equation is: 2.

1 Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.034 Artificial Intelligence, Fall 2011 Recitation 8, November 3 Corrected Version & (most) solutions

Classification with Perceptrons Reading: Chapters 1-3 of Michael Nielsen's online book on neural networks covers the basics of perceptrons and multilayer neural networks We will cover material in Chapters

### COGS Q250 Fall Homework 7: Learning in Neural Networks Due: 9:00am, Friday 2nd November.

COGS Q250 Fall 2012 Homework 7: Learning in Neural Networks Due: 9:00am, Friday 2nd November. For the first two questions of the homework you will need to understand the learning algorithm using the delta

### The Perceptron Algorithm

The Perceptron Algorithm Machine Learning Spring 2018 The slides are mainly from Vivek Srikumar 1 Outline The Perceptron Algorithm Perceptron Mistake Bound Variants of Perceptron 2 Where are we? The Perceptron

### Artificial Neural Networks. Edward Gatt

Artificial Neural Networks Edward Gatt What are Neural Networks? Models of the brain and nervous system Highly parallel Process information much more like the brain than a serial computer Learning Very

### Unit 8: Introduction to neural networks. Perceptrons

Unit 8: Introduction to neural networks. Perceptrons D. Balbontín Noval F. J. Martín Mateos J. L. Ruiz Reina A. Riscos Núñez Departamento de Ciencias de la Computación e Inteligencia Artificial Universidad

### Machine Learning. Neural Networks. (slides from Domingos, Pardo, others)

Machine Learning Neural Networks (slides from Domingos, Pardo, others) For this week, Reading Chapter 4: Neural Networks (Mitchell, 1997) See Canvas For subsequent weeks: Scaling Learning Algorithms toward

### Intelligent Systems Discriminative Learning, Neural Networks

Intelligent Systems Discriminative Learning, Neural Networks Carsten Rother, Dmitrij Schlesinger WS2014/2015, Outline 1. Discriminative learning 2. Neurons and linear classifiers: 1) Perceptron-Algorithm

### Serious limitations of (single-layer) perceptrons: Cannot learn non-linearly separable tasks. Cannot approximate (learn) non-linear functions

BACK-PROPAGATION NETWORKS Serious limitations of (single-layer) perceptrons: Cannot learn non-linearly separable tasks Cannot approximate (learn) non-linear functions Difficult (if not impossible) to design

### Multilayer Perceptron = FeedForward Neural Network

Multilayer Perceptron = FeedForward Neural Networ History Definition Classification = feedforward operation Learning = bacpropagation = local optimization in the space of weights Pattern Classification

### Artificial neural networks

Artificial neural networks Chapter 8, Section 7 Artificial Intelligence, spring 203, Peter Ljunglöf; based on AIMA Slides c Stuart Russel and Peter Norvig, 2004 Chapter 8, Section 7 Outline Brains Neural

### Chapter ML:VI (continued)

Chapter ML:VI (continued) VI Neural Networks Perceptron Learning Gradient Descent Multilayer Perceptron Radial asis Functions ML:VI-64 Neural Networks STEIN 2005-2018 Definition 1 (Linear Separability)

### Neural Networks: Introduction

Neural Networks: Introduction Machine Learning Fall 2017 Based on slides and material from Geoffrey Hinton, Richard Socher, Dan Roth, Yoav Goldberg, Shai Shalev-Shwartz and Shai Ben-David, and others 1

### Computational Intelligence Lecture 3: Simple Neural Networks for Pattern Classification

Computational Intelligence Lecture 3: Simple Neural Networks for Pattern Classification Farzaneh Abdollahi Department of Electrical Engineering Amirkabir University of Technology Fall 2011 arzaneh Abdollahi

### Back-Propagation Algorithm. Perceptron Gradient Descent Multilayered neural network Back-Propagation More on Back-Propagation Examples

Back-Propagation Algorithm Perceptron Gradient Descent Multilayered neural network Back-Propagation More on Back-Propagation Examples 1 Inner-product net =< w, x >= w x cos(θ) net = n i=1 w i x i A measure

### Hopfield Neural Network

Lecture 4 Hopfield Neural Network Hopfield Neural Network A Hopfield net is a form of recurrent artificial neural network invented by John Hopfield. Hopfield nets serve as content-addressable memory systems

### Artificial Neural Network : Training

Artificial Neural Networ : Training Debasis Samanta IIT Kharagpur debasis.samanta.iitgp@gmail.com 06.04.2018 Debasis Samanta (IIT Kharagpur) Soft Computing Applications 06.04.2018 1 / 49 Learning of neural

### NONLINEAR CLASSIFICATION AND REGRESSION. J. Elder CSE 4404/5327 Introduction to Machine Learning and Pattern Recognition

NONLINEAR CLASSIFICATION AND REGRESSION Nonlinear Classification and Regression: Outline 2 Multi-Layer Perceptrons The Back-Propagation Learning Algorithm Generalized Linear Models Radial Basis Function

### Administration. Registration Hw3 is out. Lecture Captioning (Extra-Credit) Scribing lectures. Questions. Due on Thursday 10/6

Administration Registration Hw3 is out Due on Thursday 10/6 Questions Lecture Captioning (Extra-Credit) Look at Piazza for details Scribing lectures With pay; come talk to me/send email. 1 Projects Projects

### Plan. Perceptron Linear discriminant. Associative memories Hopfield networks Chaotic networks. Multilayer perceptron Backpropagation

Neural Networks Plan Perceptron Linear discriminant Associative memories Hopfield networks Chaotic networks Multilayer perceptron Backpropagation Perceptron Historically, the first neural net Inspired

### Artificial Neural Networks

Introduction ANN in Action Final Observations Application: Poverty Detection Artificial Neural Networks Alvaro J. Riascos Villegas University of los Andes and Quantil July 6 2018 Artificial Neural Networks

### Multi-layer Perceptron Networks

Multi-layer Perceptron Networks Our task is to tackle a class of problems that the simple perceptron cannot solve. Complex perceptron networks not only carry some important information for neuroscience,

### The Perceptron. Volker Tresp Summer 2014

The Perceptron Volker Tresp Summer 2014 1 Introduction One of the first serious learning machines Most important elements in learning tasks Collection and preprocessing of training data Definition of a

### Chapter ML:VI (continued)

Chapter ML:VI (continued) VI. Neural Networks Perceptron Learning Gradient Descent Multilayer Perceptron Radial asis Functions ML:VI-56 Neural Networks STEIN 2005-2013 Definition 1 (Linear Separability)

### Fundamentals of Neural Networks

Fundamentals of Neural Networks : Soft Computing Course Lecture 7 14, notes, slides www.myreaders.info/, RC Chakraborty, e-mail rcchak@gmail.com, Aug. 10, 2010 http://www.myreaders.info/html/soft_computing.html

Radial-Basis Function etworks A function is radial basis () if its output depends on (is a non-increasing function of) the distance of the input from a given stored vector. s represent local receptors,