COMP9444 Neural Networks and Deep Learning 2. Perceptrons. COMP9444 © Alan Blair, 2017

COMP9444 Neural Networks and Deep Learning 2. Perceptrons

COMP9444 17s2 Perceptrons 1 Outline: Neurons (Biological and Artificial), Perceptron Learning, Linear Separability, Multi-Layer Networks

COMP9444 17s2 Perceptrons 2 Structure of a Typical Neuron

COMP9444 17s2 Perceptrons 3 Biological Neurons The brain is made up of neurons (nerve cells), which have a cell body (soma), dendrites (inputs), an axon (outputs), and synapses (connections between cells). Synapses can be excitatory or inhibitory and may change over time. When the inputs reach some threshold, an action potential (electrical pulse) is sent along the axon to the outputs.

COMP9444 17s2 Perceptrons 4 Artificial Neural Networks (Artificial) Neural Networks are made up of nodes which have: input edges, each with some weight; output edges (with weights); and an activation level (a function of the inputs). Weights can be positive or negative and may change over time (learning). The input function is the weighted sum of the activation levels of the input nodes. The activation level is a non-linear transfer function g of this input: activation_i = g(s_i) = g(Σ_j w_ij x_j). Some nodes are inputs (sensing), some are outputs (action).
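This node-level computation is easy to express in code. Below is a minimal NumPy sketch; the function name, weight matrix, and transfer function are illustrative assumptions, not taken from the slides:

```python
import numpy as np

def node_activations(W, x, g):
    """activation_i = g(s_i) = g(sum_j W[i, j] * x[j]) for every node i."""
    return g(W @ x)

# Two nodes receiving three inputs, with a step transfer function.
W = np.array([[0.5, -1.0,  0.2],
              [1.5,  0.3, -0.7]])
x = np.array([1.0, 0.0, 1.0])
step = lambda s: (s >= 0).astype(float)
print(node_activations(W, x, step))   # [1. 1.]
```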

COMP9444 17s2 Perceptrons 5 McCulloch & Pitts Model of a Single Neuron [Figure: inputs x1, x2 with weights w1, w2 feed a summation Σ producing s, which passes through the transfer function g to give the output g(s).] Here x1, x2 are inputs; w1, w2 are synaptic weights; th is a threshold; w0 = -th is a bias weight; and g is the transfer function. The neuron computes s = w1 x1 + w2 x2 - th = w1 x1 + w2 x2 + w0.

COMP9444 17s2 Perceptrons 6 Transfer Function Originally, a (discontinuous) step function was used for the transfer function: g(s) = 1 if s >= 0, and g(s) = 0 if s < 0. (Later, other transfer functions were introduced which are continuous and smooth.)
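Putting the McCulloch & Pitts neuron of the previous slide together with this step transfer function gives a very small program. This is only a sketch; the function names are mine:

```python
def step(s):
    # step transfer function: g(s) = 1 if s >= 0, else 0
    return 1 if s >= 0 else 0

def mp_neuron(x1, x2, w1, w2, w0):
    # s = w1*x1 + w2*x2 + w0, where w0 = -th is the bias weight
    s = w1 * x1 + w2 * x2 + w0
    return step(s)

# With w1 = w2 = 1.0 and w0 = -1.5 this neuron computes AND (see the next slide).
print(mp_neuron(1, 1, 1.0, 1.0, -1.5))   # 1
print(mp_neuron(1, 0, 1.0, 1.0, -1.5))   # 0
```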

COMP9444 17s2 Perceptrons 7 Linear Separability Question: what kind of functions can a perceptron compute? Answer: linearly separable functions. [Figure: points in the (x1, x2) plane separated by a straight line.]

COMP9444 17s2 Perceptrons 8 Linear Separability Examples of linearly separable functions: AND (w1 = w2 = 1.0, w0 = -1.5), OR (w1 = w2 = 1.0, w0 = -0.5), NOR (w1 = w2 = -1.0, w0 = +0.5). Q: How can we train it to learn a new function?
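As a check on the weights above (with the signs as reconstructed), the following sketch runs each gate over all four input combinations; step and perceptron are hypothetical helper names:

```python
def step(s):
    return 1 if s >= 0 else 0

def perceptron(x1, x2, w1, w2, w0):
    return step(w1 * x1 + w2 * x2 + w0)

gates = {
    "AND": ( 1.0,  1.0, -1.5),
    "OR":  ( 1.0,  1.0, -0.5),
    "NOR": (-1.0, -1.0,  0.5),
}
for name, (w1, w2, w0) in gates.items():
    # inputs in the order (0,0), (0,1), (1,0), (1,1)
    outputs = [perceptron(x1, x2, w1, w2, w0) for x1 in (0, 1) for x2 in (0, 1)]
    print(name, outputs)   # AND -> [0,0,0,1], OR -> [0,1,1,1], NOR -> [1,0,0,0]
```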

COMP9444 17s2 Perceptrons 9 Rosenblatt Perceptron

COMP9444 17s2 Perceptrons 10 Perceptron Learning Rule Adjust the weights as each input is presented. Recall: s = w1 x1 + w2 x2 + w0. If g(s) = 0 but should be 1: w_k ← w_k + η x_k and w_0 ← w_0 + η, so s ← s + η (1 + Σ_k x_k x_k). If g(s) = 1 but should be 0: w_k ← w_k - η x_k and w_0 ← w_0 - η, so s ← s - η (1 + Σ_k x_k x_k). Otherwise, the weights are unchanged. (η > 0 is called the learning rate.) Theorem: this will eventually learn to classify the data correctly, as long as they are linearly separable.
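The rule above translates directly into a short training loop. The sketch below assumes two inputs and the step function g from slide 6; the function name, epoch cap, and early-stopping check are additions of mine, not part of the slide:

```python
def train_perceptron(data, w1, w2, w0, eta=0.1, epochs=100):
    """Perceptron learning rule sketch.

    data is a list of ((x1, x2), target) pairs with targets 0 or 1.
    """
    for _ in range(epochs):
        changed = False
        for (x1, x2), target in data:
            out = 1 if (w1 * x1 + w2 * x2 + w0) >= 0 else 0
            if out == 0 and target == 1:      # output too low: add eta * input
                w1, w2, w0 = w1 + eta * x1, w2 + eta * x2, w0 + eta
                changed = True
            elif out == 1 and target == 0:    # output too high: subtract eta * input
                w1, w2, w0 = w1 - eta * x1, w2 - eta * x2, w0 - eta
                changed = True
        if not changed:                       # every point classified correctly
            break
    return w1, w2, w0
```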

COMP9444 17s2 Perceptrons 11 Perceptron Learning Example [Figure: a perceptron with inputs x1, x2, weights w1, w2 and bias w0, outputting + or - according to whether w1 x1 + w2 x2 + w0 > 0.] Learning rate η = 0.1. Begin with random weights: w1 = 0.2, w2 = 0.0, w0 = -0.1.

COMP9444 17s2 Perceptrons 12 Training Step 1 Current boundary: 0.2 x1 + 0.0 x2 - 0.1 > 0. The 1st point (1,1) is wrongly classified as positive, so subtract: w1 ← w1 - η x1 = 0.1, w2 ← w2 - η x2 = -0.1, w0 ← w0 - η = -0.2.

COMP9444 17s2 Perceptrons 13 Training Step 2 Current boundary: 0.1 x1 - 0.1 x2 - 0.2 > 0. The 2nd point (2,1) is wrongly classified as negative, so add: w1 ← w1 + η x1 = 0.3, w2 ← w2 + η x2 = 0.0, w0 ← w0 + η = -0.1.

COMP9444 17s2 Perceptrons 14 Training Step 3 Current boundary: 0.3 x1 + 0.0 x2 - 0.1 > 0. The 3rd point (1.5, 0.5) is correctly classified, so no change. The 4th point (2,2) is wrongly classified as positive, so subtract: w1 ← w1 - η x1 = 0.1, w2 ← w2 - η x2 = -0.2, w0 ← w0 - η = -0.2, giving the new boundary 0.1 x1 - 0.2 x2 - 0.2 > 0.
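The three training steps above can be replayed in a few lines. The targets of the four points are not stated explicitly on the slides, so the labels used below (0 for (1,1) and (2,2), 1 for (2,1) and (1.5,0.5)) are inferred from the direction of each weight update:

```python
# Replaying the presentations above with eta = 0.1 and the slides' initial weights.
w1, w2, w0, eta = 0.2, 0.0, -0.1, 0.1
for (x1, x2), target in [((1, 1), 0), ((2, 1), 1), ((1.5, 0.5), 1), ((2, 2), 0)]:
    out = 1 if (w1 * x1 + w2 * x2 + w0) >= 0 else 0
    if out != target:
        sign = 1 if target == 1 else -1   # add if output too low, subtract if too high
        w1, w2, w0 = w1 + sign * eta * x1, w2 + sign * eta * x2, w0 + sign * eta
    print((x1, x2), target, (round(w1, 2), round(w2, 2), round(w0, 2)))
# Expected weight trajectory: (0.1, -0.1, -0.2), (0.3, 0.0, -0.1),
# unchanged at (0.3, 0.0, -0.1), then (0.1, -0.2, -0.2), matching the slides.
```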

COMP9444 17s2 Perceptrons 15 Final Outcome Eventually, all the data will be correctly classified (provided they are linearly separable). [Figure: the final decision boundary in the (x1, x2) plane.]

COMP9444 17s2 Perceptrons 16 Limitations of Perceptrons Problem: many useful functions are not linearly separable (e.g. XOR). [Figure: three panels plotting I2 against I1 for (a) I1 AND I2, (b) I1 OR I2, (c) I1 XOR I2; no straight line can separate the 1s from the 0s in the XOR panel.] Possible solution: x1 XOR x2 can be written as (x1 AND x2) NOR (x1 NOR x2). Recall that AND, OR and NOR can be implemented by perceptrons.

COMP9444 17s2 Perceptrons 17 Multi-Layer Neural Networks [Figure: a two-layer network computing XOR: a hidden AND node with weights +1, +1 and bias -1.5, a hidden NOR node with weights -1, -1 and bias +0.5, and an output NOR node combining them with weights -1, -1 and bias +0.5.] Problem: How can we train it to learn a new function? (credit assignment)
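A quick way to verify this construction is to code the two-layer network directly, using the AND and NOR weights from slide 8 (signs as reconstructed above); xor_net is a hypothetical name:

```python
def step(s):
    return 1 if s >= 0 else 0

def xor_net(x1, x2):
    # Hidden layer: an AND node and a NOR node.
    h_and = step( 1.0 * x1 + 1.0 * x2 - 1.5)
    h_nor = step(-1.0 * x1 - 1.0 * x2 + 0.5)
    # Output layer: NOR of the two hidden units gives XOR.
    return step(-1.0 * h_and - 1.0 * h_nor + 0.5)

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, xor_net(x1, x2))   # 0 0 0 / 0 1 1 / 1 0 1 / 1 1 0
```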

COMP9444 17s2 Perceptrons 18 Historical Context In 1969, Minsky and Papert published a book highlighting the limitations of Perceptrons, and lobbied various funding agencies to redirect funding away from neural network research, preferring instead logic-based methods such as expert systems. It was known as far back as the 1960s that any given logical function could be implemented in a 2-layer neural network with step function activations. But the question of how to learn the weights of a multi-layer neural network based on training examples remained an open problem. The solution, which we describe in the next section, was found in 1976 by Paul Werbos, but did not become widely known until it was rediscovered in 1986 by Rumelhart, Hinton and Williams.