Perceptron. (c) Marcin Sydow

Topics covered by this lecture: the neuron and its properties; the mathematical model of a neuron: the perceptron; the perceptron as a classifier; the perceptron's learning rule (delta rule).

Neuron. The human neural system has been a natural source of inspiration for artificial intelligence researchers, hence the interest in the neuron, the fundamental unit of the neural system.

Behaviour of a Neuron. A neuron:
- transmits the neural signal from the inputs (dendrites: the endings of a neural cell that contact other cells) to the output (the neurite, often a very long ending that transmits the signal to further neurons)
- processes the signal non-linearly: the output state is not a simple sum of the input signals
- dynamically modifies its connections to other neurons via synapses (connecting elements), which makes it possible to strengthen or weaken the signal received from other neurons according to the current task

Perceptron: an Artificial Neuron. A perceptron is a simple mathematical model of a neuron. Historically, the goal of work on neural networks was to attain the ability to generalise (approximate) and to learn that is specific to the human brain (according to one of the definitions of artificial intelligence). Currently, artificial neural networks focus on less ambitious but more realistic tasks. A single perceptron can serve as a classifier or a regressor, and it is a building block of more complex artificial neural network structures that can solve practical problems: supervised or unsupervised learning, or controlling complex mechanical devices (e.g. in robotics).

Perceptron: a simple model of a natural neuron. A perceptron consists of:
- n inputs x_1, ..., x_n, corresponding to dendrites
- n weights w_1, ..., w_n, corresponding to synapses (each weight w_i is attached to the i-th input x_i)
- a threshold Θ
- one output y
(All the variables are real numbers.) The output y is computed as follows:

y = 1 if ∑_{i=1}^n w_i x_i = W^T X ≥ Θ (perceptron activated), and y = 0 otherwise (not activated),

where W, X ∈ R^n denote the vectors of weights and inputs, respectively.

The perceptron is activated (y = 1) only when the dot product W^T X (sometimes called the net value) reaches the specified threshold Θ. We call the perceptron discrete if y ∈ {0, 1} (or {-1, 1}), and continuous if y ∈ [0, 1] (or [-1, 1]).
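As an illustration, here is a minimal sketch in Python of how a discrete perceptron computes its output (the weights, input and threshold below are arbitrary example values, not taken from the lecture):

```python
import numpy as np

def perceptron_output(w, x, theta):
    """Discrete perceptron: returns 1 iff the net value W^T X reaches the threshold Theta."""
    net = np.dot(w, x)            # the dot product W^T X ("net")
    return 1 if net >= theta else 0

# Arbitrary example values (for illustration only)
w = np.array([0.5, 0.5])          # weight vector W
x = np.array([1.0, 1.0])          # input vector X
print(perceptron_output(w, x, theta=0.75))   # net = 1.0 >= 0.75, so the perceptron is activated: 1
```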

Perceptron: Geometrical Interpretation. The computation of the perceptron's output has a simple geometrical interpretation. Consider the n-dimensional input space (each point here is a potential input vector X ∈ R^n). The weight vector W ∈ R^n is the normal vector of the decision hyperplane. The perceptron is activated (outputs 1) only if the input vector X lies on the same side of the decision hyperplane as the weight vector W. Moreover, the net value W^T X is maximal when X points in a direction close to W, equals 0 when they are orthogonal, and is minimal (negative) when they point in opposite directions.

Geometrical Interpretation of the Perceptron, cont. The required behaviour of the perceptron can be obtained by adjusting the weights and the threshold appropriately. The weight vector W determines the direction of the decision hyperplane. The threshold Θ determines how far the decision hyperplane is moved away from the origin (the 0 point).
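For instance, in a made-up 2-dimensional example with W = (1, 0): the input X = (1, 0) (aligned with W) gives net W^T X = 1, X = (0, 1) (orthogonal to W) gives net 0, and X = (-1, 0) (opposite to W) gives net -1; with a threshold Θ = 0.5, only the first of these inputs activates the perceptron.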

Example: Perceptrons can simulate logical circuits. A single perceptron with appropriately set weights and threshold can easily simulate basic logical gates such as AND and OR.
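A minimal sketch of this idea (the weights and thresholds below are one possible hand-picked choice, not necessarily the ones used in the lecture):

```python
def fires(net, theta):
    """Discrete perceptron activation: 1 iff net >= theta."""
    return int(net >= theta)

# With weights w = (1, 1): threshold 1.5 yields AND, threshold 0.5 yields OR.
for x1 in (0, 1):
    for x2 in (0, 1):
        net = 1 * x1 + 1 * x2     # W^T X with w = (1, 1)
        print(f"x = ({x1}, {x2})  AND: {fires(net, 1.5)}  OR: {fires(net, 0.5)}")
```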

Limitations of a single perceptron. A single perceptron can distinguish (by the value of its output) only sets of inputs which are linearly separable in the input space (i.e. there exists an (n-1)-dimensional hyperplane separating the positive and negative cases). One of the simplest examples of linearly non-separable sets is the logical function XOR (exclusive or).

Limitations of a single perceptron, cont. The functions AND and OR correspond to linearly separable sets, so each of them can be modelled by a single perceptron (as shown in the previous section), while XOR cannot be modelled by any single perceptron.

Network of Perceptrons. The output of one perceptron can be connected to the input of other perceptron(s) (as in the neural system). This makes it possible to extend the computational possibilities of a single perceptron. For example, XOR can be simulated by joining 2 perceptrons and appropriately setting their weights and thresholds. Remark: the terms perceptron or multilayer perceptron can also refer to a whole network of such connected units (neurons).
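As a sketch of one such construction (the weights and thresholds below are hand-picked and hypothetical; other choices work as well): a hidden perceptron computes AND of the two inputs, and the output perceptron receives the two inputs together with the hidden output.

```python
def step(net, theta):
    """Discrete perceptron activation: 1 iff net >= theta."""
    return 1 if net >= theta else 0

def xor(x1, x2):
    h = step(x1 + x2, 1.5)             # hidden perceptron: AND(x1, x2)
    return step(x1 + x2 - 2 * h, 0.5)  # output perceptron: suppressed only when both inputs are 1

for x1 in (0, 1):
    for x2 in (0, 1):
        print((x1, x2), "->", xor(x1, x2))   # prints 0, 1, 1, 0
```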

Example: Iris classification. A single perceptron can distinguish Iris setosa from the 2 other sub-species. However, it cannot exactly recognise either of the other 2 sub-species, since they are not linearly separable from the rest.
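The claim can be checked with a short script (a sketch assuming scikit-learn is available; it reports training accuracy only, not a rigorous evaluation):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import Perceptron

X, y = load_iris(return_X_y=True)

# Iris setosa (class 0) vs. the rest: linearly separable, so a perceptron can fit it perfectly.
setosa = (y == 0).astype(int)
print(Perceptron().fit(X, setosa).score(X, setosa))              # expected: 1.0

# Iris versicolor (class 1) vs. the rest: not linearly separable, so the fit stays imperfect.
versicolor = (y == 1).astype(int)
print(Perceptron().fit(X, versicolor).score(X, versicolor))      # expected: below 1.0
```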

Perceptron: Overcoming the Limitations. The discovery of the above limitations (1969) blocked further development of neural networks for years. An obvious way to cope with this problem is to join perceptrons together (e.g. two perceptrons are enough to model XOR) to form artificial neural networks. However, it is mathematically far more difficult to adjust such complex networks to our needs than in the case of a single perceptron. Fortunately, the development of efficient techniques for learning in perceptron networks (in the 1980s), i.e. for automatically tuning their weights on the basis of positive and negative examples, brought about the renaissance of artificial neural networks.

Perceptron as a Classifier that can Learn. A single perceptron can be used as a tool in supervised machine learning, namely as a binary classifier (output equal to 0 or 1). To achieve this, we present the perceptron with a training set of pairs: (input vector, correct answer (0 or 1)). The perceptron can learn the correct answers by appropriately setting its weight vector.

Perceptron's Learning: the Delta Rule. We apply the training examples one by one. If the current output is the same as the desired one, we pass to the next example. If it is incorrect, we apply the following perceptron learning rule to the vector of weights:

W = W + (d - y)αX

where d is the desired (correct) output, y is the actual output, and 0 < α < 1 is a parameter (learning rate), tuned experimentally.
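A minimal sketch of a single delta-rule update (in Python with numpy; the function name and default α are illustrative choices, not from the lecture):

```python
import numpy as np

def delta_update(w, x, d, y, alpha=0.1):
    """One perceptron learning step: W = W + (d - y) * alpha * X.
    Nothing changes when the actual output y already equals the desired answer d."""
    return w + (d - y) * alpha * np.asarray(x)
```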

Interpretation of the Learning Rule. To force the perceptron to give the desired outputs, its weight vector should be maximally close to the positive (y = 1) cases. Hence the formula W = W + (d - y)αX:
- moves W towards a positive X if the perceptron outputs 0 instead of 1 (too weak activation)
- moves W away from a negative X if the perceptron outputs 1 instead of 0 (too strong activation)
Usually, the whole training set has to be passed several times to obtain the desired weights of the perceptron.

The Role of the Threshold. If the activation threshold Θ is set to 0, the perceptron can distinguish only classes which are separable by a decision hyperplane containing the origin of the input space (the 0 vector). To move the decision hyperplane away from the origin, the threshold has to be set to some non-zero value.

Incorporating the Threshold into Learning. Observe that W^T X ≥ Θ ⟺ W^T X - Θ ≥ 0. Hence, X can be extended by appending -1 (a fake (n+1)-th input) and W can be extended by appending Θ. Denote by W' and X' the extended (n+1)-dimensional vectors. Now we have W'^T X' ≥ 0, which has the same form as the activation condition without the threshold. The learning rule can therefore be applied directly to the extended vectors W' and X'.
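Putting the pieces together, a sketch of delta-rule training on the extended vectors (the AND-gate training set, the learning rate and the epoch count below are arbitrary illustrative choices):

```python
import numpy as np

def train_perceptron(examples, alpha=0.5, epochs=20):
    """Perceptron (delta-rule) training with the threshold folded into the weight vector.
    Each input X is extended with a fake -1 input, so the last weight plays the role of Theta."""
    n = len(examples[0][0])
    w = np.zeros(n + 1)                      # extended weight vector W' = (w_1, ..., w_n, Theta)
    for _ in range(epochs):                  # pass the training set several times
        for x, d in examples:
            x_ext = np.append(x, -1.0)       # extended input X' = (x_1, ..., x_n, -1)
            y = 1 if w @ x_ext >= 0 else 0   # activation test W'^T X' >= 0
            w = w + (d - y) * alpha * x_ext  # delta rule on the extended vectors
    return w

# Example (hypothetical): learning the AND gate
and_examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w = train_perceptron(and_examples)
print(w)                                     # the last component is the learned threshold Theta
for x, d in and_examples:
    print(x, "->", 1 if w @ np.append(x, -1.0) >= 0 else 0)
```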

Questions/Problems:
- desired properties of an artificial neuron
- mathematical formulation of the perceptron
- how the perceptron computes its output
- geometric interpretation of the perceptron's computation
- mathematical limitations of the perceptron
- the learning rule for the perceptron
- geometric interpretation of the perceptron's learning rule

Thank you for your attention.