COMP304 Introduction to Neural Networks, based on slides by Christian Borgelt (http://www.borgelt.net/)



Motivation: Why (Artificial) Neural Networks?

(Neuro-)Biology / (Neuro-)Physiology / Psychology:
- Exploit the similarity to real (biological) neural networks.
- Build models to understand nerve and brain operation by simulation.

Computer Science / Engineering / Economics:
- Mimic certain cognitive capabilities of human beings.
- Solve learning/adaptation, prediction, and optimization problems.

Physics / Chemistry:
- Use neural network models to describe physical phenomena.
- Special case: spin glasses (alloys of magnetic and non-magnetic metals).

Motivation: Why Neural Networks in AI?

Physical-Symbol System Hypothesis [Newell and Simon 1976]: A physical-symbol system has the necessary and sufficient means for general intelligent action.

Neural networks process simple signals, not symbols. So why study neural networks in Artificial Intelligence?

- Symbol-based representations work well for inference tasks, but are fairly bad for perception tasks.
- Symbol-based expert systems tend to get slower with growing knowledge, while human experts tend to get faster.
- Neural networks allow for highly parallel information processing.
- There are several successful applications in industry and finance.

Biological Background

Structure of a prototypical biological neuron (schematic figure), with the following parts labeled: terminal button, synapse, dendrites, cell core, cell body (soma), axon, myelin sheath.

Biological Background

(Very) simplified description of neural information processing:

- The axon terminal releases chemicals, called neurotransmitters.
- These act on the membrane of the receptor dendrite to change its polarization. (The inside is usually about 70 mV more negative than the outside.)
- Decrease in potential difference: excitatory synapse. Increase in potential difference: inhibitory synapse.
- If there is enough net excitatory input, the axon is depolarized.
- The resulting action potential travels along the axon. (Its speed depends on the degree to which the axon is covered with myelin.)
- When the action potential reaches the terminal buttons, it triggers the release of neurotransmitters.

Threshold Logic Units

Threshold Logic Units

A Threshold Logic Unit (TLU) is a processing unit for numbers with n inputs x_1, ..., x_n and one output y. The unit has a threshold θ, and each input x_i is associated with a weight w_i. A threshold logic unit computes the function

y = 1, if Σ_{i=1}^{n} w_i x_i ≥ θ,
y = 0, otherwise.

(Figure: the inputs x_1, ..., x_n enter the unit with weights w_1, ..., w_n; the unit labeled with the threshold θ produces the output y.)
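This computation is short enough to state as code. The following is a minimal sketch, assuming binary inputs; the class and method names are illustrative, not taken from the slides.

# Minimal sketch of a threshold logic unit (TLU).
class ThresholdLogicUnit:
    def __init__(self, weights, theta):
        self.weights = list(weights)   # one weight w_i per input x_i
        self.theta = theta             # threshold

    def fire(self, inputs):
        # y = 1 if the weighted sum of the inputs reaches the threshold, else y = 0.
        weighted_sum = sum(w * x for w, x in zip(self.weights, inputs))
        return 1 if weighted_sum >= self.theta else 0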

Threshold Logic Units: Examples

Threshold logic unit for the conjunction x_1 ∧ x_2: weights w_1 = 3, w_2 = 2, threshold θ = 4.

x_1  x_2  3x_1 + 2x_2  y
 0    0        0       0
 1    0        3       0
 0    1        2       0
 1    1        5       1

Threshold logic unit for the implication x_2 → x_1: weights w_1 = 2, w_2 = −2, threshold θ = −1.

x_1  x_2  2x_1 − 2x_2  y
 0    0        0       1
 1    0        2       1
 0    1       −2       0
 1    1        0       1
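Such an example can be checked by simply enumerating all inputs. The small sketch below verifies the conjunction unit above (weights 3 and 2, threshold 4):

# Verify that the TLU with weights (3, 2) and threshold 4 computes x1 AND x2.
for x1 in (0, 1):
    for x2 in (0, 1):
        y = 1 if 3 * x1 + 2 * x2 >= 4 else 0
        assert y == (x1 and x2)
        print(x1, x2, "->", y)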

Threshold Logic Units: Examples

Threshold logic unit for (x_1 ∧ ¬x_2) ∨ (x_1 ∧ x_3) ∨ (¬x_2 ∧ x_3): weights w_1 = 2, w_2 = −2, w_3 = 2, threshold θ = 1.

x_1  x_2  x_3  Σ_i w_i x_i  y
 0    0    0        0       0
 1    0    0        2       1
 0    1    0       −2       0
 1    1    0        0       0
 0    0    1        2       1
 1    0    1        4       1
 0    1    1        0       0
 1    1    1        2       1

Threshold Logic Units: Geometric Interpretation

Review of line representations. Straight lines are usually represented in one of the following forms:

Explicit form:         g ≡ x_2 = b x_1 + c
Implicit form:         g ≡ a_1 x_1 + a_2 x_2 + d = 0
Point-direction form:  g ≡ x = p + k r
Normal form:           g ≡ (x − p) · n = 0

with the parameters:

b: gradient (slope) of the line
c: section of the x_2 axis (intercept)
p: position vector of a point of the line (base vector)
r: direction vector of the line
n: normal vector of the line

Threshold Logic Units: Geometric Interpretation

Figure: a straight line g and its defining parameters: the base vector p, the direction vector r, the normal vector n = (a_1, a_2), the angle ϕ, and the distance d = p · n of the line from the origin O, marked by the point q = (d / |n|²) n on the line.

Threshold Logic Units: Geometric Interpretation

Figure: how to determine the side of the line g on which a point x lies. Project x onto the normal direction, z = (x · n) / |n|, and compare z with the distance q = d / |n| of the line from the origin O: if z > q, the point lies on the side to which the normal vector points; if z < q, it lies on the other side; if z = q, it lies on the line.
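In code this side test is a single signed expression. A minimal sketch (the function name is chosen here for illustration), assuming the line is given in the implicit/normal form n · x = d:

import math

def side_of_line(point, normal, d):
    # Line given by normal vector n and offset d:  n . x = d.
    # Returns the signed distance of the point from the line: positive on the
    # side the normal vector points to, negative on the other side, zero on the line.
    n_len = math.hypot(normal[0], normal[1])
    return (normal[0] * point[0] + normal[1] * point[1] - d) / n_len

# Example: the separating line 3*x1 + 2*x2 = 4 of the conjunction unit above.
print(side_of_line((1, 1), (3, 2), 4))   # positive: on the output-1 side
print(side_of_line((0, 0), (3, 2), 4))   # negative: on the output-0 side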

Threshold Logic Units: Geometric Interpretation

Figures: the threshold logic unit for the conjunction x_1 ∧ x_2 (weights 3 and 2, threshold 4) and the threshold logic unit for the implication x_2 → x_1 (weights 2 and −2, threshold −1), each shown together with the separating line it defines in the input space.

Threshold Logic Units: Geometric Interpretation

Visualization of 3-dimensional Boolean functions: the input space is the unit cube with corners (0, 0, 0) and (1, 1, 1).

Figure: the threshold logic unit for (x_1 ∧ ¬x_2) ∨ (x_1 ∧ x_3) ∨ (¬x_2 ∧ x_3) and the separating plane it defines in this cube.

Threshold Logic Units: Limitations

The biimplication problem x_1 ↔ x_2: there is no separating line.

x_1  x_2  y
 0    0   1
 1    0   0
 0    1   0
 1    1   1

Formal proof by reductio ad absurdum: assume a threshold logic unit with weights w_1, w_2 and threshold θ computes the biimplication. Then

since (0, 0) ↦ 1:   0 ≥ θ,            (1)
since (1, 0) ↦ 0:   w_1 < θ,          (2)
since (0, 1) ↦ 0:   w_2 < θ,          (3)
since (1, 1) ↦ 1:   w_1 + w_2 ≥ θ.    (4)

(2) and (3): w_1 + w_2 < 2θ. With (4): 2θ > θ, or θ > 0. Contradiction to (1).

Threshold Logic Units: Limitations

Total number and number of linearly separable Boolean functions ([Widner 1960] as cited in [Zell 1994]):

inputs  Boolean functions  linearly separable functions
  1              4                        4
  2             16                       14
  3            256                      104
  4          65536                     1774
  5      ≈ 4.3 · 10^9                 94572
  6      ≈ 1.8 · 10^19          ≈ 5.0 · 10^6

For many inputs a threshold logic unit can compute almost no functions. Networks of threshold logic units are needed to overcome the limitations.
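The row for two inputs can be checked by brute force: enumerate all 16 Boolean functions of two variables and test which of them some threshold logic unit realizes. Restricting the search to a small grid of integer weights and half-integer thresholds is an assumption made here, but such a grid suffices for two inputs:

from itertools import product

inputs = list(product((0, 1), repeat=2))        # (0,0), (0,1), (1,0), (1,1)
weights = range(-2, 3)                          # integer weights -2 .. 2
thresholds = [t / 2 for t in range(-6, 7)]      # half-integer thresholds -3 .. 3

separable = set()
for w1, w2, theta in product(weights, weights, thresholds):
    # Truth table realized by the TLU with weights (w1, w2) and threshold theta.
    table = tuple(1 if w1 * x1 + w2 * x2 >= theta else 0 for x1, x2 in inputs)
    separable.add(table)

print("Boolean functions of 2 inputs:", 2 ** len(inputs))   # 16
print("linearly separable functions: ", len(separable))     # 14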

Networks of Threshold Logic Units

Solving the biimplication problem with a network.

Idea: logical decomposition   x_1 ↔ x_2  ≡  (x_1 → x_2) ∧ (x_2 → x_1)

Figure: a two-layer network in which the first hidden unit computes y_1 = x_1 → x_2, the second hidden unit computes y_2 = x_2 → x_1, and the output unit computes y = y_1 ∧ y_2 = x_1 ↔ x_2.
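The weight values in the original figure did not survive the transcription; one concrete choice that implements the decomposition (assumed here for illustration) gives each hidden unit weights ±2 with threshold −1, so that it computes an implication, and gives the output unit weights 2 and 2 with threshold 3, so that it computes the conjunction of the hidden outputs. The sketch below checks this choice:

def tlu(weights, theta, inputs):
    # Output of a threshold logic unit: 1 if the weighted sum reaches theta.
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= theta else 0

def biimplication_net(x1, x2):
    y1 = tlu((-2, 2), -1, (x1, x2))   # hidden unit 1: x1 -> x2
    y2 = tlu((2, -2), -1, (x1, x2))   # hidden unit 2: x2 -> x1
    return tlu((2, 2), 3, (y1, y2))   # output unit:   y1 AND y2

for x1 in (0, 1):
    for x2 in (0, 1):
        assert biimplication_net(x1, x2) == (1 if x1 == x2 else 0)
print("the network computes x1 <-> x2")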

Networks of Threshold Logic Units

Solving the biimplication problem: geometric interpretation.

Figure: the two separating lines g_1 and g_2 of the hidden units in the input space (x_1, x_2), and the images of the four input points in the space (y_1, y_2) of the hidden-unit outputs, where a single line g_3 separates them.

The first layer computes new Boolean coordinates for the points. After the coordinate transformation the problem is linearly separable.

Representing Arbitrary Boolean Functions

Let y = f(x_1, ..., x_n) be a Boolean function of n variables.

(i) Represent f(x_1, ..., x_n) in disjunctive normal form. That is, determine D_f = K_1 ∨ ... ∨ K_m, where all K_j are conjunctions of n literals, i.e., K_j = l_{j1} ∧ ... ∧ l_{jn} with l_{ji} = x_i (positive literal) or l_{ji} = ¬x_i (negative literal).

(ii) Create a neuron for each conjunction K_j of the disjunctive normal form (having n inputs, one input for each variable), where

w_{ji} = +2, if l_{ji} = x_i,
w_{ji} = −2, if l_{ji} = ¬x_i,
and θ_j = n − 1 + (1/2) Σ_{i=1}^{n} w_{ji}.

(iii) Create an output neuron (having m inputs, one input for each neuron that was created in step (ii)), where

w_{(n+1)k} = 2, k = 1, ..., m, and θ_{n+1} = 1.
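A direct translation of this construction can make it concrete. In the sketch below (function names chosen here for illustration) a conjunction K_j is represented as a tuple of signs, +1 for the positive literal x_i and −1 for the negative literal ¬x_i:

def conjunction_unit(signs):
    # Step (ii): weights +2 / -2 for positive / negative literals,
    # threshold theta_j = n - 1 + (1/2) * sum of the weights.
    weights = [2 * s for s in signs]
    theta = len(signs) - 1 + 0.5 * sum(weights)
    return weights, theta

def dnf_network(conjunctions, x):
    # Evaluate the two-layer network of steps (ii) and (iii) for the input vector x.
    hidden = []
    for signs in conjunctions:
        weights, theta = conjunction_unit(signs)
        s = sum(w * xi for w, xi in zip(weights, x))
        hidden.append(1 if s >= theta else 0)
    # Step (iii): output neuron with all weights 2 and threshold 1 (an OR).
    return 1 if sum(2 * h for h in hidden) >= 1 else 0

# Example: the biimplication x1 <-> x2 has the DNF (x1 AND x2) OR (NOT x1 AND NOT x2).
dnf = [(+1, +1), (-1, -1)]
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", dnf_network(dnf, (x1, x2)))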

Training Threshold Logic Units

Training Threshold Logic Units

The geometric interpretation provides a way to construct threshold logic units with 2 and 3 inputs, but:
- it is not an automatic method (human visualization is needed), and
- it is not feasible for more than 3 inputs.

General idea of automatic training:
- Start with random values for the weights and the threshold.
- Determine the error of the output for a set of training patterns. The error is a function of the weights and the threshold: e = e(w_1, ..., w_n, θ).
- Adapt the weights and the threshold so that the error gets smaller.
- Iterate the adaptation until the error vanishes.

Training Threshold Logic Units

A single-input threshold logic unit for the negation ¬x: input x with weight w, threshold θ, output y; the desired behaviour is x = 0 ↦ y = 1 and x = 1 ↦ y = 0.

Figure: the output error as a function of the weight w and the threshold θ, shown separately for the pattern x = 0, for the pattern x = 1, and as the sum of both errors.

Training Threshold Logic Units

The error function cannot be used directly, because it consists of plateaus.

Solution: if the computed output is wrong, take into account how far the weighted sum is from the threshold.

Figure: the modified output error as a function of the weight w and the threshold θ, again shown for the pattern x = 0, for the pattern x = 1, and as the sum of both errors.

Training Threshold Logic Units

Example training procedure: online and batch training.

Figure: the paths taken in the (w, θ)-plane by online training and by batch training for the negation task, and the batch-training path shown on the modified error surface; the resulting unit computes the negation ¬x.

Training Threshold Logic Units: Delta Rule

Formal training rule: Let x = (x_1, ..., x_n) be an input vector of a threshold logic unit, o the desired output for this input vector, and y the actual output of the threshold logic unit. If y ≠ o, then the threshold θ and the weight vector w = (w_1, ..., w_n) are adapted as follows in order to reduce the error:

θ^(new) = θ^(old) + Δθ   with   Δθ = −η (o − y),
∀i ∈ {1, ..., n}:   w_i^(new) = w_i^(old) + Δw_i   with   Δw_i = η (o − y) x_i,

where η is a parameter that is called the learning rate. It determines the severity of the weight changes.

This procedure is called the Delta Rule or Widrow-Hoff Procedure [Widrow and Hoff 1960].

Online training: adapt the parameters after each training pattern.
Batch training: adapt the parameters only at the end of each epoch, i.e., after a traversal of all training patterns.

Training Threshold Logic Units: Delta Rule

Turning the threshold value into a weight: add an extra input x_0 that is fixed to the value 1 and give it the weight w_0 = −θ. Then

Σ_{i=1}^{n} w_i x_i ≥ θ   ⟺   Σ_{i=1}^{n} w_i x_i − θ ≥ 0,

so a unit with inputs x_1, ..., x_n, weights w_1, ..., w_n and threshold θ is equivalent to a unit with the additional input x_0 = 1, weights w_0 = −θ, w_1, ..., w_n and threshold 0.

Training Threshold Logic Units: Delta Rule

procedure online_training (var w, var θ, L, η);
var y, e;                            (* output, sum of errors *)
begin
  repeat
    e := 0;                          (* initialize the error sum *)
    for all (x, o) ∈ L do begin      (* traverse the patterns *)
      if (w · x ≥ θ) then y := 1;    (* compute the output *)
                     else y := 0;    (* of the threshold logic unit *)
      if (y ≠ o) then begin          (* if the output is wrong *)
        θ := θ − η(o − y);           (* adapt the threshold *)
        w := w + η(o − y)x;          (* and the weights *)
        e := e + |o − y|;            (* sum the errors *)
      end;
    end;
  until (e ≤ 0);                     (* repeat the computations *)
end;                                 (* until the error vanishes *)
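A runnable Python counterpart of this pseudocode is given below as a minimal sketch; the function name, the use of plain lists, and the starting values in the example (w = 2, θ = 1.5, η = 1) are choices made here for illustration:

def online_training(w, theta, patterns, eta):
    # Delta-rule online training of a threshold logic unit.
    # patterns: list of (x, o) pairs, x a tuple of inputs, o in {0, 1}.
    # Caution: the loop terminates only for linearly separable training sets.
    while True:
        error_sum = 0
        for x, o in patterns:
            y = 1 if sum(wi * xi for wi, xi in zip(w, x)) >= theta else 0
            if y != o:
                theta -= eta * (o - y)                                  # adapt the threshold
                w = [wi + eta * (o - y) * xi for wi, xi in zip(w, x)]   # and the weights
                error_sum += abs(o - y)
        if error_sum == 0:
            return w, theta

# Example: the negation task (x = 0 -> 1, x = 1 -> 0) discussed on the next slide.
patterns = [((0,), 1), ((1,), 0)]
w, theta = online_training([2.0], 1.5, patterns, 1.0)
print("w =", w, "theta =", theta)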

Training Threshold Logic Units: Negation

Table: epoch-by-epoch trace of online training for the negation task over six epochs, with one row per training pattern and columns for the epoch, the input x, the desired output o, the weighted sum x·w, the actual output y, the error e, the parameter changes Δθ and Δw, and the resulting values of θ and w. After a few epochs the error vanishes, and training stops with a threshold and weight that compute the negation ¬x.

Training Threshold Logic Units: Conjunction

A threshold logic unit with two inputs x_1, x_2 (weights w_1, w_2, threshold θ) is to be trained to compute the conjunction y = x_1 ∧ x_2.

Figure: the unit, the truth table of the conjunction, and the geometric interpretation of the trained unit (the separating line in the input space).

Training Threshold Logic Units: Conjunction

Table: epoch-by-epoch trace of online training for the conjunction task, with one row per training pattern and columns for the epoch, the inputs x_1, x_2, the desired output o, the weighted sum x·w, the actual output y, the error e, the parameter changes Δθ, Δw_1, Δw_2, and the resulting values of θ, w_1, w_2. After a few epochs the error vanishes, and training stops with parameters that compute the conjunction.

Training Threshold Logic Units: Biimplication

Table: epoch-by-epoch trace of online training for the biimplication task, with the same columns as before. Because the biimplication is not linearly separable, the error never vanishes: the parameters keep changing from epoch to epoch and training does not terminate.

Training Threshold Logic Units: Convergence

Convergence Theorem: Let L = {(x_1, o_1), ..., (x_m, o_m)} be a set of training patterns, each consisting of an input vector x_i ∈ IR^n and a desired output o_i ∈ {0, 1}. Furthermore, let L_0 = {(x, o) ∈ L | o = 0} and L_1 = {(x, o) ∈ L | o = 1}. If L_0 and L_1 are linearly separable, i.e., if w ∈ IR^n and θ ∈ IR exist such that

∀(x, 0) ∈ L_0:  w · x < θ   and   ∀(x, 1) ∈ L_1:  w · x ≥ θ,

then online as well as batch training terminate.

The algorithms terminate only when the error vanishes. Therefore the resulting threshold and weights must solve the problem. For problems that are not linearly separable the algorithms do not terminate.

Training Networks of Threshold Logic Units

Single threshold logic units have strong limitations: they can only compute linearly separable functions. Networks of threshold logic units can compute arbitrary Boolean functions.

Training single threshold logic units with the delta rule is fast and guaranteed to find a solution if one exists. Networks of threshold logic units cannot be trained this way, because
- there are no desired values for the neurons of the first layer, and
- the problem can usually be solved with several different functions computed by the neurons of the first layer.

When this situation became clear, neural networks were seen as a research dead end.