Neural Networks: Textbook, Other Textbooks and Books, Course Info

636-600 Neural Networks

Instructor: Yoonsuck Choe
Contact info: HRBB 322B, 45-5466, choe@tamu.edu
Web page: http://faculty.cs.tamu.edu/choe

Textbook
Simon Haykin, Neural Networks: A Comprehensive Foundation, Second edition, Prentice-Hall, Upper Saddle River, NJ, 1999. ISBN 0-13-273350-1.
Code from the book: http://www.mathworks.com/books (click on Neural/Fuzzy and find the book title).
Text, figures, etc. will be quoted from the textbook without repeated acknowledgment. The instructor's perspective will be indicated by YC where appropriate.

Other Textbooks and Books
J. Hertz, A. Krogh, and R. Palmer, Introduction to the Theory of Neural Computation, Addison-Wesley, 1991.
C. M. Bishop, Neural Networks for Pattern Recognition, Oxford University Press, 1995.
M. A. Arbib, The Handbook of Brain Theory and Neural Networks, 2nd edition, MIT Press, 2003.

Course Info
Grading: Exams 40% (midterm 20%, final 20%); Assignments 60% (5 written+programming assignments, 12% each). No curving: 90 = A, 80 = B, etc.
Academic integrity: individual work, unless otherwise indicated; proper references should be given whenever online/offline resources are used.
Students with disabilities: see the online syllabus.
Lecture notes: check the course web page for updates. Download, print, and bring to class.
Computer accounts: talk to the CS helpdesk (HRBB 210).
Programming: Matlab, or better yet, Octave (http://www.octave.org); also C/C++, Java, etc. (programs should run on CS Unix or Windows machines).

Relation to Other Courses
Some overlaps:
Machine learning: neural network component, reinforcement learning.
Pattern analysis: PCA, support-vector machines, radial basis functions(?).
(Relatively) unique to this course: in-depth treatment of single/multilayer networks, neurodynamics, committee machines, information-theoretic models, recurrent networks, etc.

Neural Networks in the Brain
The human brain computes in an entirely different way from conventional digital computers.
The brain is highly complex, nonlinear, and parallel.
Neurons are organized to perform tasks much faster than computers (typical time taken for a visual recognition task is 100-200 ms).
Key feature of the biological brain: experience shapes the wiring through plasticity, and hence learning becomes the central issue in neural networks.

Neural Networks as an Adaptive Machine
A neural network is a massively parallel distributed processor made up of simple processing units, which has a natural propensity for storing experiential knowledge and making it available for use. Neural networks resemble the brain in two respects:
Knowledge is acquired from the environment through a learning process.
Interneuron connection strengths, known as synaptic weights, are used to store the acquired knowledge.
The procedure used for learning is called a learning algorithm. The weights, or even the topology, can be adjusted.

Benefits of Neural Networks
1. Nonlinearity: nonlinear components, distributed nonlinearity.
2. Input-output mapping: supervised learning, nonparametric statistical inference (model-free estimation, no prior assumptions).
3. Adaptivity: either retrain or adapt; can deal with nonstationary environments. Must overcome the stability-plasticity dilemma.
4. Evidential response: a decision plus the confidence in that decision can be provided.
5. Contextual information: every neuron in the network potentially influences every other neuron, so contextual information is dealt with naturally.

Benefits of Neural Networks (cont'd)
6. Fault tolerance: performance degrades gracefully.
7. VLSI implementability: a network of simple components.
8. Uniformity of analysis and design: common components (neurons), shareability of theories and learning algorithms, and seamless integration based on modularity.
9. Neurobiological analogy: neural nets are motivated by neurobiology, and neurobiology is also turning to neural networks for insights and tools.

Human Brain
Stimulus -> Receptors -> Neural Net -> Effectors -> Response (Arbib, 1987).
Pioneer: Santiago Ramón y Cajal, a Spanish neuroanatomist who introduced the neuron as the fundamental unit of brain function.
Neurons are slow: about $10^{-3}$ s per operation, compared to about $10^{-9}$ s for modern CPUs.
Huge number of neurons and connections: about $10^{10}$ neurons and $6 \times 10^{13}$ connections in the human brain.
Highly energy efficient: about $10^{-16}$ J per operation in the brain vs. about $10^{-6}$ J in modern computers.

The Neuron and the Synapse
Synapse: where two neurons meet. Presynaptic neuron: the source; postsynaptic neuron: the target.
Neurotransmitters: molecules that cross the synapse (with a positive, negative, or modulatory effect on postsynaptic activation).
Dendrite: a branch that receives input.
Axon: the branch that sends output (a spike, or action potential, travels down the axon and triggers neurotransmitter release at the axon terminals).

Structural Organization of the Brain
Small- to large-scale organization:
Molecules, synapses
Neural microcircuits, dendritic trees, neurons
Local circuits
Interregional circuits: pathways, columns, topographic maps
Central nervous system

Cytoarchitectural Map of the Cerebral Cortex
Map-like organization: Brodmann's cytoarchitectural map of the cerebral cortex.
Areas 17, 18, 19: visual cortices
Areas 41, 42: auditory cortices
Areas 1, 2, 3: somatosensory cortices (bodily sensation)

Topographic Maps in the Cortex
(Visual field mapped onto the cortex.) Nearby locations in the stimulus space are mapped to nearby neurons in the cortex. The result is like a map of the sensory space, hence the term topographic organization. Many regions of the cortex are organized this way: the visual (V1), auditory (A1), and somatosensory (S1) cortices.

Models of Neurons
Neuron: the information processing unit fundamental to neural network operation.
Synapses with associated weights: the weight from input $j$ to neuron $k$ is denoted $w_{kj}$.
Summing junction: $u_k = \sum_{j=1}^{m} w_{kj} x_j$
Activation function: $y_k = \varphi(u_k + b_k)$
Bias $b_k$: $v_k = u_k + b_k$, or, absorbing the bias into the weights, $v_k = \sum_{j=0}^{m} w_{kj} x_j$ with $x_0 = +1$ and $w_{k0} = b_k$.

Activation Functions
Threshold unit:
$\varphi(v) = \begin{cases} 1 & \text{if } v \ge 0 \\ 0 & \text{if } v < 0 \end{cases}$
Piecewise linear:
$\varphi(v) = \begin{cases} 1 & \text{if } v \ge +\frac{1}{2} \\ v & \text{if } +\frac{1}{2} > v > -\frac{1}{2} \\ 0 & \text{if } v \le -\frac{1}{2} \end{cases}$
Sigmoid: the logistic function ($a$: slope parameter):
$\varphi(v) = \frac{1}{1 + \exp(-av)}$
It is differentiable: $\varphi'(v) = a\,\varphi(v)\,(1 - \varphi(v))$.
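
A minimal Octave sketch (Octave being the tool suggested in the course info) of the neuron model and two of the activation functions above; the weights, inputs, bias, and slope parameter are arbitrary example values, not anything prescribed by the textbook:

    % Single neuron: v_k = sum_j w_kj*x_j + b_k, output y_k = phi(v_k).
    % All numeric values below are arbitrary illustrative choices.
    w = [0.5 -0.2 0.8];                             % synaptic weights w_k1..w_k3
    x = [1.0  0.3 -0.5];                            % input signals x_1..x_3
    b = 0.1;                                        % bias b_k

    v = w * x' + b;                                 % induced local field v_k

    phi_threshold = @(v) double(v >= 0);            % threshold unit
    a = 2;                                          % slope parameter of the logistic
    phi_logistic  = @(v) 1 ./ (1 + exp(-a .* v));   % phi(v) = 1/(1 + exp(-a*v))
    dphi_logistic = @(v) a .* phi_logistic(v) .* (1 - phi_logistic(v));   % phi'(v)

    printf('v = %.3f  threshold: %d  logistic: %.3f  logistic deriv: %.3f\n', ...
           v, phi_threshold(v), phi_logistic(v), dphi_logistic(v));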

Other Activation Functions
Signum function:
$\varphi(v) = \begin{cases} 1 & \text{if } v > 0 \\ 0 & \text{if } v = 0 \\ -1 & \text{if } v < 0 \end{cases}$
Sign function:
$\varphi(v) = \begin{cases} 1 & \text{if } v \ge 0 \\ -1 & \text{if } v < 0 \end{cases}$
Hyperbolic tangent function:
$\varphi(v) = \tanh(v)$

Stochastic Models
Instead of deterministic activation, stochastic activation can be used. Let $x$ be the state of the neuron ($+1$ or $-1$) and $P(v)$ the probability of firing:
$x = \begin{cases} +1 & \text{with probability } P(v) \\ -1 & \text{with probability } 1 - P(v) \end{cases}$
A typical choice of $P(v)$ is
$P(v) = \frac{1}{1 + \exp(-v/T)}$
where $T$ is a pseudotemperature. As $T \to 0$, the neuron becomes deterministic. In computer simulations, use the rejection method.

Signal-flow Graphs
A signal-flow graph consists of nodes and links.
Links: synaptic links ($y_k = w_{kj} x_j$, a linear input-output relation) and activation links ($y_k = \varphi(x_j)$, a nonlinear input-output relation).
Incoming edges are summed (e.g., $y_k = y_i + y_j$); outgoing edges are replicated.
The architectural graph simplifies the signal-flow graph and abstracts out the internal neuronal function.

Definition of a Neural Network
An information processing system that has been developed as a generalization of mathematical models of human cognition or neurobiology, based on the assumptions that:
Information processing occurs at many simple elements called neurons.
Signals are passed between neurons over connection links.
Each connection link has an associated weight, which typically multiplies the signal transmitted.
Each neuron applies an activation function (usually nonlinear) to its net input (the sum of weighted input signals) to determine its output signal.
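
A small Octave sketch of the stochastic model above: the firing probability $P(v)$ is sampled by comparing against a uniform random number; the local field and pseudotemperature values are arbitrary examples.

    % Stochastic neuron: x = +1 with probability P(v), -1 otherwise.
    % v and T are arbitrary example values; as T -> 0 the unit becomes deterministic.
    v = 0.4;                          % induced local field (example value)
    T = 0.5;                          % pseudotemperature (example value)
    P = 1 / (1 + exp(-v / T));        % firing probability P(v)

    x = 2 * (rand() < P) - 1;         % sample the binary state
    printf('P(fire) = %.3f, sampled state x = %+d\n', P, x);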

Signal-flow Graph Example: Exercise
$y_k(n) = A[x_j'(n)] \qquad (1)$
$x_j'(n) = x_j(n) + B[y_k(n)] \qquad (2)$
Turn the above into a signal-flow graph.

Feedback
Feedback gives rise to dynamics (a temporal aspect), and it is found in almost every part of the nervous system in every animal.
From (1) and (2), we get
$y_k(n) = \frac{A}{1 - AB}\,[x_j(n)]$
where $A/(1 - AB)$ is called the closed-loop operator and $AB$ the open-loop operator. Note that, in general, $AB \ne BA$.

Feedback (cont'd)
Substituting $w$ for $A$ and the unit-delay operator $z^{-1}$ for $B$, we get
$\frac{A}{1 - AB} = \frac{w}{1 - w z^{-1}} = w\,(1 - w z^{-1})^{-1}.$
Using the binomial expansion
$(1 - x)^{-r} = \sum_{k=0}^{\infty} \frac{(r)_k}{k!}\, x^k,$
where $(r)_k = r(r+1)\cdots(r+k-1)$ is the Pochhammer symbol, we get
$\frac{A}{1 - AB} = w\,(1 - w z^{-1})^{-1} = w \sum_{l=0}^{\infty} w^l z^{-l}.$
From this,
$y_k(n) = w \sum_{l=0}^{\infty} w^l z^{-l}\,[x_j(n)].$
With $z^{-l}[x_j(n)] = x_j(n - l)$, we obtain $y_k(n) = \sum_{l=0}^{\infty} w^{l+1} x_j(n - l)$.

Feedback (cont'd)
Since $y_k(n) = \sum_{l=0}^{\infty} w^{l+1} x_j(n - l)$, the weight $w$ determines the behavior. With a fixed input $x_j(0)$, the output $y_k(n)$ either converges or diverges:
$|w| < 1$: converges (infinite memory, fading)
$|w| = 1$: diverges linearly
$|w| > 1$: diverges exponentially
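
The convergence behavior above is easy to check numerically. The Octave sketch below is a minimal illustration, assuming the input is held fixed at $x_j(n) = x_j(0) = 1$ for all $n$ (so the $w = 1$ case grows linearly, as stated above); the particular weight values are arbitrary.

    % Feedback loop with A = w (weight) and B = z^-1 (unit delay):
    %   y_k(n) = w * ( x_j(n) + y_k(n-1) )
    % Assumption: x_j(n) = 1 for all n >= 0, so y_k(n) = sum_{l=0..n} w^(l+1).
    N = 10;
    for w = [0.5 1.0 1.5]               % arbitrary example weights
      y = 0;                            % y_k(-1) = 0
      for n = 0:N
        y = w * (1 + y);                % current input plus the fed-back output
      end
      printf('w = %.1f:  y_k(%d) = %.4f\n', w, N, y);
    end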

Network Architectures
The connectivity of a neural network is intimately linked with the learning algorithm.
Single-layer feedforward networks: one input layer and one layer of computing units (the output layer); connections are acyclic.
Multilayer feedforward networks: one input layer, one or more hidden layers, and one output layer. With more hidden layers, higher-order statistics can be processed.
Recurrent networks: feedback loops exist.
Layers can be fully connected or partially connected.

Knowledge Representation
Knowledge refers to stored information or models used by a person or a machine to interpret, predict, and appropriately respond to the outside world. Two issues:
What information is actually made explicit.
How the information is physically encoded for subsequent use.
Knowledge of the world consists of two kinds of information:
The known world state: what is and what has been known, i.e., prior information.
Observations (measurements) of the world, obtained by sensors (which can be noisy). They provide examples, which can be labeled or unlabeled.

Design of Neural Networks
Select an architecture, gather input samples, and train using a learning algorithm (learning phase). Then test with data not seen before (generalization phase). The design is thus data-driven, unlike conventional programming.

Design of Representations
1. Similar inputs from similar classes should produce similar representations, leading to classification into the same category.
2. Items to be categorized as separate classes should be given widely different representations in the network.
3. If a particular feature is important, a larger number of neurons should be involved in the representation of that item in the network.
4. Prior information and invariances should be built into the design of a neural network through a specialized structure: this is biologically plausible, uses fewer free parameters, allows faster information transfer, and lowers the cost of building the network.
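
To make the multilayer feedforward architecture concrete, here is a minimal Octave sketch of a forward pass through one hidden layer of logistic units; the layer sizes and the random weights are arbitrary illustrative choices, not values from the textbook.

    % Forward pass through a small multilayer feedforward network.
    phi = @(v) 1 ./ (1 + exp(-v));        % logistic activation, applied elementwise

    x  = [0.2; -0.7; 1.0];                % input layer: 3 source nodes (example values)
    W1 = randn(4, 3);  b1 = randn(4, 1);  % input -> hidden layer (4 neurons)
    W2 = randn(2, 4);  b2 = randn(2, 1);  % hidden -> output layer (2 neurons)

    h = phi(W1 * x + b1);                 % hidden-layer activations
    y = phi(W2 * h + b2);                 % output-layer activations
    disp(y')                              % the two network outputs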

Similarity Measures
Similar inputs from similar classes should produce similar representations, leading to classification into the same category.
Reciprocal of the Euclidean distance, $1/d(x_i, x_j)$: with $x_i = [x_{i1}, x_{i2}, \ldots, x_{im}]^T$,
$d(x_i, x_j) = \|x_i - x_j\| = \left[ \sum_{k=1}^{m} (x_{ik} - x_{jk})^2 \right]^{1/2}.$
Dot product (inner product):
$(x_i, x_j) = x_i^T x_j = \sum_{k=1}^{m} x_{ik} x_{jk} = \|x_i\|\,\|x_j\| \cos\theta_{ij}.$

Similarity Measures (cont'd)
When the two vectors $x_i$ and $x_j$ are drawn from two distributions:
Mean vectors: $\mu_i = E[x_i]$, $\mu_j = E[x_j]$.
Mahalanobis distance: $d_{ij}^2 = (x_i - \mu_i)^T \Sigma^{-1} (x_j - \mu_j)$.
The covariance matrix is assumed to be the same for both: $\Sigma = E[(x_i - \mu_i)(x_i - \mu_i)^T] = E[(x_j - \mu_j)(x_j - \mu_j)^T]$.
The two measures are related when $\|x_i\| = \|x_j\| = 1$:
$d^2(x_i, x_j) = \sum_{k=1}^{m} (x_{ik} - x_{jk})^2 = (x_i - x_j)^T (x_i - x_j) = 2 - 2\,x_i^T x_j.$

Building Prior Information into Neural Network Design
Restrict the network architecture: receptive fields.
Constrain the choice of synaptic weights: weight sharing.

Building Invariance into Neural Network Design
Invariance by structure
Invariance by training
Invariant feature space
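
The three similarity measures above can be computed directly; the Octave sketch below uses two arbitrary example vectors, example mean vectors, and an example covariance matrix purely for illustration.

    % Similarity measures between two vectors x_i and x_j (arbitrary example values).
    xi = [0.9; 0.1; 0.3];
    xj = [0.8; 0.2; 0.4];

    d_euclid = norm(xi - xj);             % Euclidean distance d(x_i, x_j)
    s_recip  = 1 / d_euclid;              % similarity as reciprocal of the distance
    s_dot    = xi' * xj;                  % inner product x_i' * x_j

    mu_i  = [1.0; 0.0; 0.5];              % mean vector for the distribution of x_i
    mu_j  = [0.7; 0.3; 0.5];              % mean vector for the distribution of x_j
    Sigma = 0.2 * eye(3);                 % shared covariance matrix (example)
    d2_mahal = (xi - mu_i)' * (Sigma \ (xj - mu_j));   % squared Mahalanobis distance

    printf('1/d = %.3f  dot = %.3f  Mahalanobis d^2 = %.3f\n', ...
           s_recip, s_dot, d2_mahal);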