In the Name of God. Lecture 9: ANN Architectures


Biological Neuron

Organization of Levels in the Brain (from largest to smallest):
Central nervous system
Interregional circuits: pathways, columns, topographic maps; map into the cerebral cortex and involve multiple regions
Local circuits: about 1 mm in size; localized regions of the brain containing neurons of similar and different properties
Neurons: about 100 µm in size; a neuron contains several dendritic trees
Dendritic trees
Neural microcircuits
Synapses
Molecules

Biological Analogy
The brain's neuron corresponds to the artificial neuron (processing element): inputs are combined through weights $w_1, w_2, \ldots, w_n$ and passed through $f(net)$. A neural network is a set of processing elements (PEs) and connections (weights) with adjustable strengths, arranged into an input layer, a hidden layer, and an output layer.

ANN: History
Pavlov's conditioning experiments: a conditioned response, salivation in response to an auditory stimulus.
Much activity concerning automata, communication, computation, and the understanding of the nervous system during the 1930s and 1940s.
McCulloch and Pitts, 1943.
von Neumann: EDVAC (Electronic Discrete Variable Automatic Computer).
Hebb: The Organization of Behavior, 1949.
Minsky: Theory of Neural-Analog Reinforcement Systems and Its Application to the Brain-Model Problem (reinforcement learning), 1954.
The problem of designing an optimum linear filter: Kolmogorov 1942, Wiener 1949, Zadeh 1953, Gabor 1954.
Uttley: leaky integrate-and-fire neuron, 1956.
Rosenblatt: the perceptron, 1958.

ANN: History
Long-term potentiation (LTP): Bliss and Lømo 1973; AMPA receptor. Long-term depression (LTD): NMDA receptor.
The nearest neighbor rule by Fix and Hodges, 1951.
Least mean square algorithm by Widrow and Hoff in 1960.
The use of stochastic gradient in adaptive pattern classification by Amari in 1967.
The idea of competitive learning: von der Malsburg 1973, the self-organization of orientation-sensitive nerve cells in the striate cortex.
Self-organizing maps by Grossberg in the 70s.
Associative memory networks by Anderson and Kohonen in 1982.
Recurrent neural networks by Hopfield in 1982.
Error backpropagation learning algorithm by Rumelhart, Hinton and Williams in 1986.
Spike-timing-dependent plasticity by Markram in 1997.
Echo state networks by Jaeger in 2002.
Many applications...

(Artificial) Neural Network?
A computational model inspired by the neurological model of the brain. The human brain computes in a different way from a digital computer:
it is highly complex, nonlinear, and parallel;
it is many times faster than a computer at pattern recognition, perception, and motor control;
it has great structure and the ability to build up its own rules through experience;
it develops dramatically within 2 years after birth and continues to develop afterward (it is a language learning device before age 13);
plasticity: the ability to adapt to its environment.

Neural Network Definitions
A machine designed to model the way in which the brain performs tasks, implemented by electronic devices and/or software (simulation). Learning is the major emphasis of NNs.
A massively parallel distributed processor: a massive interconnection of simple processing units; the simple processing units store experience and make it available for use; knowledge is acquired from the environment through a learning process.
A learning machine: it modifies its synaptic weights to obtain the design objective; it can also modify its own topology (neurons die and new ones can grow).
Connectionist network - connectionism.

Benefits of Neural Networks (I)
Power comes from the massively parallel distributed structure and the ability to learn and generalize. Generalization: the ability to produce reasonable outputs for inputs not encountered during training.
A NN cannot provide the solution by working individually: a complex problem is decomposed into simple tasks, and each task is assigned to a NN. There is a long way to go to build a computer that mimics the human brain.
Non-linearity: an interconnection of non-linear neurons is itself non-linear; a desirable property if the underlying physical mechanism is non-linear.

Benefits of Neural Networks (II)
Input-output mapping: the input-output mapping is built by learning from examples, reducing the difference between the desired response and the actual response; this is non-parametric statistical inference, able to estimate arbitrary decision boundaries in the input signal space.
Adaptivity: synaptic weights adapt to changes in the environment; a NN can be retrained to deal with minor changes in the operating environment, or change its synaptic weights in real time, giving more robust, reliable behavior in a non-stationary environment. Adaptive pattern recognition, adaptive signal processing, adaptive control; the stability-plasticity dilemma.

Benefits of Neural Networks (III)
Evidential response: not only the selected class label but also a confidence; confidences can be used to reject inputs, trading recognition accuracy against reliability (do only what you can do).
Contextual information processing: (contextual) knowledge is represented in the structure itself; every neuron is affected by the others.
Fault tolerance: performance degrades gracefully under adverse conditions, in contrast to the catastrophic failure of a digital computer.
VLSI implementability: the massively parallel nature makes NNs well suited for VLSI implementation.

Benefits of Neural Networks (IV)
Uniformity of analysis and design: the neuron is common to all NNs, so theories and learning algorithms are shared, and modular networks can be built through seamless integration.
Neurobiological analogy: the brain is living proof of fault-tolerant, fast, powerful processing. Neuroscientists see NNs as a research tool for neurobiological phenomena; engineers look to neuroscience for new ideas.

ANN: Architectures
Perceptron: inputs pass through weights into processing elements (PEs) to produce outputs; training data is presented as exemplars grouped into epochs.
Multiple-layer feedforward: weights and PEs arranged in one or more hidden layers followed by an output layer.
Recurrent/feedback: outputs are fed back to the inputs.
Time-lag feedforward: inputs pass through memory structures before the feedforward network, so the output at time t depends on past inputs as well.

ANN: What Makes Them Unique
Neural networks are nonlinear models. Many other nonlinear models exist, but the mathematics required is usually involved or nonexistent; NNs are simplified nonlinear systems built from combinations of simple nonlinear functions.
Neural networks are trained from the data: no expert knowledge is required beforehand, and they can learn and adapt to changing conditions online.
They are universal approximators: they can learn any model given enough data and processing elements.
They make very few formal assumptions about the data (e.g. no Gaussian requirements, etc.).

ANN: How do neural nets work?
TRAIN THE NETWORK: 1. Introduce data. 2. The network computes an output. 3. The output is compared to the desired output. 4. The weights are modified to reduce the error.
USE THE NETWORK: 1. Introduce new data to the network. 2. The network computes an output based on its training.
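A minimal sketch of this train-then-use cycle, using a single linear neuron with a delta-rule-style update; the function names and the toy task (learning the sum of two inputs) are illustrative, not from the lecture:

```python
import numpy as np

def train_step(w, b, x, target, lr=0.1):
    """One pass of the training cycle for a single linear neuron."""
    output = np.dot(w, x) + b       # 2. network computes an output
    error = target - output         # 3. output compared to desired output
    w = w + lr * error * x          # 4. weights modified to reduce error
    b = b + lr * error
    return w, b

# TRAIN THE NETWORK: repeatedly introduce (input, desired output) pairs.
rng = np.random.default_rng(0)
w, b = rng.normal(size=2), 0.0
data = [([0, 0], 0.0), ([0, 1], 1.0), ([1, 0], 1.0), ([1, 1], 2.0)]
for _ in range(100):
    for x, t in data:
        w, b = train_step(w, b, np.array(x, dtype=float), t)

# USE THE NETWORK: introduce new data; it computes an output from its training.
print(np.dot(w, [0.5, 1.0]) + b)    # close to 1.5, the learned sum
```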

ANN: Generalization
Neural networks are very powerful, often too powerful. One can overtrain a neural network: it will perform very well on the data it was trained with, but poorly on test data. Never judge a network based upon training-data results only!

ANN: Multiple Datasets
The most common solution to the generalization problem is to divide your data into three sets:
Training data: used to train the network.
Cross-validation data: used to actively test the network during training; used to stop training.
Testing data: used to test the network after training.
(Production data: the desired output is not known; this is the implementation stage, not part of the split.)
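A minimal sketch of such a split, assuming the labeled data fits in memory; the helper name and the 60/20/20 proportions are illustrative choices:

```python
import numpy as np

def three_way_split(X, y, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle, then split labeled data into training / validation / test sets."""
    idx = np.random.default_rng(seed).permutation(len(X))
    n_tr = int(train_frac * len(X))
    n_va = int(val_frac * len(X))
    tr, va, te = idx[:n_tr], idx[n_tr:n_tr + n_va], idx[n_tr + n_va:]
    return (X[tr], y[tr]), (X[va], y[va]), (X[te], y[te])

X, y = np.arange(20).reshape(10, 2), np.arange(10)
train, val, test = three_way_split(X, y)
print(len(train[0]), len(val[0]), len(test[0]))   # 6 2 2
```

Production data never enters this split, since its desired outputs are unknown.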

Models of a Neuron
A neuron is an information processing unit with:
a set of synapses, or connecting links, each characterized by a weight or strength;
an adder summing the input signals weighted by the synapses: a linear combiner;
an activation function, also called a squashing function, which squashes (limits) the output to some finite range.

Nonlinear model of a neuron (I)
Input signals $x_1, \ldots, x_m$ pass through synaptic weights $w_{k1}, \ldots, w_{km}$ into a summing junction with bias $b_k$; an activation function $\varphi(\cdot)$ produces the output $y_k$:
$v_k = \sum_{j=1}^{m} w_{kj} x_j + b_k, \qquad y_k = \varphi(v_k)$

Nonlinear model of a neuron (II)
The bias can be absorbed as a synaptic weight by adding a fixed input $x_0 = +1$ with weight $w_{k0} = b_k$:
$v_k = \sum_{j=0}^{m} w_{kj} x_j, \qquad y_k = \varphi(v_k)$
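Both forms can be written in a few lines; an illustrative sketch (the choice of tanh as $\varphi$ is an assumption):

```python
import numpy as np

def neuron_output(x, w, b, phi=np.tanh):
    """Model (I): v_k = sum_j w_kj x_j + b_k, then y_k = phi(v_k)."""
    v = np.dot(w, x) + b
    return phi(v)

def neuron_output_bias_as_weight(x, w, b, phi=np.tanh):
    """Model (II): absorb the bias as weight w_k0 = b_k on a fixed input x_0 = +1."""
    x_aug = np.concatenate(([1.0], x))
    w_aug = np.concatenate(([b], w))
    return phi(np.dot(w_aug, x_aug))

x, w, b = np.array([0.5, -1.0, 2.0]), np.array([0.1, 0.4, -0.3]), 0.2
print(neuron_output(x, w, b), neuron_output_bias_as_weight(x, w, b))  # identical
```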

Types of Activation Function
Threshold function, piecewise-linear function, and sigmoid function (the sigmoid is differentiable):
$\varphi(v) = \dfrac{1}{1 + \exp(-av)}$, where $a$ is the slope parameter.

Activation functions with value range $[-1, +1]$: the hyperbolic tangent function, $\varphi(v) = \tanh(v)$, and the signum function.
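A sketch of these activation functions, following the formulas above; the piecewise-linear breakpoints at $\pm 0.5$ follow the usual textbook convention and are an assumption:

```python
import numpy as np

def threshold(v):
    """Hard limiter: 1 if v >= 0, else 0."""
    return np.where(v >= 0, 1.0, 0.0)

def piecewise_linear(v):
    """1 for v >= 0.5, linear (v + 0.5) in between, 0 for v <= -0.5."""
    return np.clip(v + 0.5, 0.0, 1.0)

def sigmoid(v, a=1.0):
    """phi(v) = 1 / (1 + exp(-a v)); a is the slope parameter."""
    return 1.0 / (1.0 + np.exp(-a * v))

def signum(v):
    """Range [-1, +1]; np.tanh(v) is its smooth counterpart."""
    return np.sign(v)

v = np.linspace(-2, 2, 5)
print(threshold(v), piecewise_linear(v), sigmoid(v, a=2.0), np.tanh(v))
```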

The McCulloch-Pitts Model McCulloch and Pitts (1943) produced the first neural network, which was based on their artificial neuron. The activation of a neuron is binary. The neuron either fires (activation of one) or does not fire (activation of zero). Neurons in a McCulloch-Pitts network are connected by directed and weighted paths.

The McCulloch-Pitts Model
For the network shown below, the activation function for unit Y is
f(y_in) = 1 if y_in >= T, else 0
where y_in is the total input signal received and T is the threshold for Y. Inputs $x_1, \ldots, x_n$ feed into the output unit Y through weights $w_1, \ldots, w_n$, together with a bias b.

Example: Logical Functions
AND: $w_1 = w_2 = 1$, threshold $W_0 = 1.5$
OR: $w_1 = w_2 = 1$, threshold $W_0 = 0.5$
NOT: $w_1 = -1$, threshold $W_0 = -0.5$
McCulloch and Pitts: some Boolean functions can be implemented with a single artificial neuron (not XOR).
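These three gates are easy to verify in code; a sketch, with the weights and thresholds taken from the slide:

```python
def mp_neuron(inputs, weights, threshold):
    """McCulloch-Pitts unit: fires (1) iff the weighted input sum reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

AND = lambda a1, a2: mp_neuron([a1, a2], [1, 1], 1.5)    # fires only for (1, 1)
OR  = lambda a1, a2: mp_neuron([a1, a2], [1, 1], 0.5)    # fires unless (0, 0)
NOT = lambda a1:     mp_neuron([a1],     [-1],  -0.5)    # inverts its input

for a1, a2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a1, a2, AND(a1, a2), OR(a1, a2))
```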

A two-layer network is capable of calculating XOR.
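One illustrative decomposition (among several possible), reusing the AND, OR, and NOT units from the sketch above: XOR(a, b) = AND(OR(a, b), NOT(AND(a, b))).

```python
# Reuses mp_neuron, AND, OR, NOT from the previous sketch.
def XOR(a1, a2):
    or_out   = OR(a1, a2)            # first layer
    nand_out = NOT(AND(a1, a2))      # first layer
    return AND(or_out, nand_out)     # second layer

for a1, a2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a1, a2, "->", XOR(a1, a2))   # 0, 1, 1, 0
```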

Stochastic Model of a Neuron
Deterministic vs. stochastic: a stochastic neuron resides in a state with a certain probability:
$x = +1$ with probability $P(v)$, and $x = -1$ with probability $1 - P(v)$
where x is the state of the neuron and v is the induced local field (input sum). The probability of firing is
$P(v) = \dfrac{1}{1 + \exp(-v/T)}$
where T is a pseudotemperature; as $T \to 0$, the model reduces to the deterministic form.
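A sketch of this stochastic firing rule, directly implementing $P(v)$ above:

```python
import numpy as np

def stochastic_fire(v, T, rng=None):
    """Return +1 with probability P(v) = 1 / (1 + exp(-v / T)), else -1.
    As T -> 0, P(v) approaches a step function (the deterministic model)."""
    rng = rng or np.random.default_rng()
    p = 1.0 / (1.0 + np.exp(-v / T))
    return 1 if rng.random() < p else -1

rng = np.random.default_rng(0)
states = [stochastic_fire(0.5, T=1.0, rng=rng) for _ in range(10)]
print(states)   # mostly +1, since P(0.5) is about 0.62
```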

NNs as Directed Graphs
The block diagram can be simplified by the idea of a signal-flow graph: a node is associated with a signal, and a directed link is associated with a transfer function.
Synaptic links are governed by a linear input-output relation: the signal $x_j$ is multiplied by the synaptic weight $w_{kj}$.
Activation links are governed by a nonlinear input-output relation: the nonlinear activation function.

Signal-Flow Graph of a Neuron
A fixed input $x_0 = +1$ carries weight $w_{k0} = b_k$; inputs $x_1, \ldots, x_m$ carry weights $w_{k1}, \ldots, w_{km}$. The weighted signals sum to $v_k$, which passes through $\varphi(\cdot)$ to yield the output $y_k$.

Architectural Graph of a Neuron
A partially complete directed graph describing the layout. Three graphical representations:
Block diagram - provides a functional description of a NN.
Signal-flow graph - a complete description of signal flow.
Architectural graph - the network layout.

Network Architecture
Single-layer feedforward networks: an input layer and an output layer, with a single (computation) layer; feedforward, acyclic.
Multilayer feedforward networks: hidden layers of hidden neurons (hidden units) enable the network to extract higher-order statistics; e.g. a 10-4-2 network or a 100-30-10-3 network; a fully connected layered network.
Recurrent networks: at least one feedback loop, with or without hidden neurons.
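As an illustration, a forward pass through a fully connected 10-4-2 network; the tanh activation and the random weights are assumptions for the sketch:

```python
import numpy as np

def feedforward(x, layers, phi=np.tanh):
    """Fully connected layered network: layers is a list of (W, b) pairs."""
    for W, b in layers:
        x = phi(W @ x + b)
    return x

# A "10-4-2 network": 10 source nodes, 4 hidden neurons, 2 output neurons.
rng = np.random.default_rng(0)
layers = [(rng.normal(size=(4, 10)), np.zeros(4)),   # hidden layer
          (rng.normal(size=(2, 4)),  np.zeros(2))]   # output layer
print(feedforward(rng.normal(size=10), layers))      # two output values
```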

Network Architecture (figure)
Single-layer network; multilayer fully connected network; recurrent network without hidden units (unit-delay operators feed the outputs back to the inputs); recurrent network with hidden units.

Feedback
The output is fed back to the NN and is used in determining the output itself. For a single loop with weight w and a unit-delay operator $z^{-1}$ acting on the input $x_j(n)$:
$y_k(n) = \sum_{l=0}^{\infty} w^{l+1} x_j(n-l)$
Depending on w: $|w| < 1$ stable; $|w| = 1$ linear divergence; $|w| > 1$ exponential divergence. We are interested in the case $|w| < 1$: infinite memory, where the output depends on inputs of the infinite past. A NN with feedback loops is called a recurrent network.
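The three regimes are easy to see numerically; a sketch that unrolls the single feedback loop (the step-input test signal is an illustrative choice):

```python
def feedback_response(w, x):
    """Unrolls y(n) = sum_{l=0}^{n} w^(l+1) x(n-l) for a single feedback
    loop with weight w and a unit delay, driven by input sequence x."""
    y, state = [], 0.0
    for x_n in x:
        state = w * (x_n + state)   # output fed back through the delay
        y.append(state)
    return y

step = [1.0] * 20                             # unit-step input
print(feedback_response(0.5, step)[-1])       # |w| < 1: converges (stable)
print(feedback_response(1.0, step)[:4])       # |w| = 1: grows linearly
print(feedback_response(2.0, step)[:4])       # |w| > 1: grows exponentially
```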

Knowledge Representation
Knowledge refers to stored information or models used by a person or machine to interpret, predict, and appropriately respond to the outside world: what information is actually made explicit, and how the information is physically encoded for subsequent use. A good solution depends on a good representation of knowledge.
In a NN, knowledge is represented by the internal network parameters - the real challenge.
Knowledge of the world: the world state is represented by known facts (prior knowledge) and by observations obtained from (noisy) sensors - the training examples.

Knowledge Acquisition by NN Training
Training examples are either labeled or unlabeled: labeled examples pair an input signal with a desired response; unlabeled examples are different realizations of the input signal. The examples represent the knowledge of the environment.
Character recognition: 1. An appropriate architecture is selected for the NN: the number of source nodes equals the number of pixels of the input image; e.g. 26 output nodes, one for each letter. A subset of the examples is used to train the NN with a suitable learning algorithm. 2. Recognition performance is tested with the rest of the examples.
Positive and negative examples.

Classification: Optical Character Recognition
Determine whether the input image is the letter A, B, C, ... Create one output for each class (e.g. class 0: true or false, etc.): 26 outputs (A-Z). Each image is labeled with a class: image A will be (1, 0, 0, ..., 0), image B will be (0, 1, 0, ..., 0), etc. The network must be trained to recognize the letters. The network consists of an input layer, a hidden layer, and an output layer (A, B, C, D, E, ...).
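A sketch of this one-output-per-class labeling scheme (the helper name is illustrative):

```python
import numpy as np

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def one_hot(letter):
    """One output per class: 'A' -> (1, 0, ..., 0), 'B' -> (0, 1, 0, ..., 0)."""
    target = np.zeros(len(ALPHABET))
    target[ALPHABET.index(letter)] = 1.0
    return target

print(one_hot("B")[:5])   # [0. 1. 0. 0. 0.]
```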

Rules of Knowledge Representation in NN
Similar inputs from similar classes should produce similar representations. Similarity measures: Euclidean distance, dot (inner) product, cosine; for random variables, the Mahalanobis distance...
Separate classes should produce widely different representations.
More neurons should be involved in the representation of a more important feature (probability of detection / false alarm).
Prior information and invariances should be built into the design of the network (general purpose vs. specialized).

Building Priors into NN Design
A specialized structure learns fast because it has few free parameters and runs fast because of its simple structure. There are no well-defined rules for building a specialized NN - it is an ad hoc approach:
restricting the network architecture through local connections (receptive fields);
constraining the choice of synaptic weights (weight sharing, parameter tying).

Building Invariance into NN Design
We want the network to be capable of coping with transformations.
Invariance by structure: synaptic connections are arranged so as not to be affected by the transformation; e.g. rotation invariance by forcing $w_{ji} = w_{jk}$ for all k at the same distance from the center of the image.
Invariance by training: train on data with many different transformations - computationally infeasible.
Invariant feature space: use features invariant to the transformations.
There is no well-developed theory for optimizing the architecture of a NN, and NNs lack explanation capability.

AI and NN
Definition of AI; goal of AI: the art of creating machines that perform tasks that require intelligence when performed by people; the study of mental faculties through the use of computational models; to make computers perceive, reason, and act; to develop machines that perform cognitive tasks.
Functions of an AI system: store knowledge; apply the knowledge to solve problems; acquire new knowledge through experience.
Key components of AI: representation, reasoning, learning.

AI
AI is a goal, an objective, a dream. A NN is a model of an intelligent system, but it is not the only such system, and an intelligent system is not necessarily the same as a human (example: a chess machine). Symbolic AI is a tool, a paradigm, toward AI; NNs can be a good tool toward AI.

Reasoning
Reasoning is the ability to solve problems: the system must be able to express and solve a broad range of problems; must be able to make explicit and implicit information known to it; must have a control mechanism to select the operators to apply in a given situation.
Problem solving is a search problem: it must deal with incompleteness, inexactness, and uncertainty - probabilistic reasoning, plausible reasoning, fuzzy reasoning.

Learning
Model of machine learning: environment, learning element, knowledge base, and performance element, operating in a cycle.
Inductive learning: generate rules from raw data; similarity-based learning, case-based reasoning.
Deductive learning: general rules are used to determine specific facts; theorem proving.
Augmenting the knowledge base is not a simple task.

Reading
S. Haykin, Neural Networks: A Comprehensive Foundation, 2007 (Chapter 1).