Slide10 Haykin Chapter 14: Neurodynamics (3rd Ed. Chapter 13)


Slide10 Haykin Chapter 14: Neurodynamics (3rd Ed. Chapter 13)
CPSC 636-600, Instructor: Yoonsuck Choe, Spring 2012

Neural Networks with Temporal Behavior

- Inclusion of feedback gives temporal characteristics to neural networks: recurrent networks.
- Two ways to add feedback: local feedback and global feedback.
- Recurrent networks can become unstable or stable.
- The main interest is in the recurrent network's stability: neurodynamics.
- Stability is a property of the whole system: coordination between parts is necessary.

Stability in Nonlinear Dynamical Systems

- Lyapunov stability: more on this later.
- Study of neurodynamics:
  - Deterministic neurodynamics: expressed as nonlinear differential equations.
  - Stochastic neurodynamics: expressed in terms of stochastic nonlinear differential equations (recurrent networks perturbed by noise).

Preliminaries: Dynamical Systems

- A dynamical system is a system whose state varies with time.
- State-space model: the values of the state variables change over time.
- Example: x_1(t), x_2(t), ..., x_N(t) are state variables that hold different values under the independent variable t. This describes a system of order N, and x(t) is called the state vector.
- The dynamics of the system is expressed using ordinary differential equations:
    d/dt x_j(t) = F_j(x_j(t)),  j = 1, 2, ..., N,
  or, more conveniently,
    d/dt x(t) = F(x(t)).
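The following is a minimal sketch, not from the slides, of how such a state-space model can be simulated numerically; the particular vector field F (a damped linear oscillator) and the forward-Euler scheme are illustrative assumptions:

```python
import numpy as np

def F(x):
    """Illustrative vector field for a damped linear oscillator:
    dx1/dt = x2, dx2/dt = -x1 - 0.5*x2."""
    return np.array([x[1], -x[0] - 0.5 * x[1]])

def euler_trajectory(x0, dt=0.01, steps=2000):
    """Integrate dx/dt = F(x) with forward Euler; return the whole trajectory."""
    traj = [np.asarray(x0, dtype=float)]
    for _ in range(steps):
        traj.append(traj[-1] + dt * F(traj[-1]))
    return np.array(traj)

traj = euler_trajectory([1.0, 0.0])
print(traj[-1])  # the state decays toward the equilibrium at the origin
```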

Autonomous vs. Non-autonomous Dynamical Systems

- Autonomous: F(·) does not explicitly depend on time.
- Non-autonomous: F(·) explicitly depends on time.

State Space

- It is convenient to view the state-space equation dx/dt = F(x) as describing the motion of a point in N-dimensional space (Euclidean or non-Euclidean). Note: t is continuous!
- The points traversed over time are called the trajectory or the orbit.
- The tangent vector shows the instantaneous velocity at the initial condition.
- [Figure: a trajectory through points at t = 0, 1, 2, ... in the (x_1, x_2) plane, with the tangent vector dx/dt at t = 0.]

F as a Vector Field

- Since dx/dt can be seen as velocity, F(x) can be seen as a velocity vector field, or simply a vector field.
- In a vector field, each point in space (x) is associated with one unique vector (direction and magnitude). In a scalar field, each point has one scalar value.

Phase Portrait and Vector Field

- Red curves show the state (phase) portrait, represented by trajectories from different initial points.
- The blue arrows in the background show the vector field.
- Source: http://www.math.ku.edu/~byers/ode/b_cp_lab/pict.html

Conditions for the Solution of the State-Space Equation

- A unique solution to the state-space equation exists only under certain conditions, which restrict the form of F(x).
- For a solution to exist, it is sufficient for F(x) to be continuous in all of its arguments.
- For uniqueness, it must meet the Lipschitz condition.
- Lipschitz condition: let x and u be a pair of vectors in an open set M in a normed vector space. A vector function F(x) that satisfies
    ||F(x) − F(u)|| ≤ K ||x − u||
  for some constant K is said to be Lipschitz, and K is called the Lipschitz constant for F(x).
- If all partial derivatives ∂F_i/∂x_j are finite everywhere, F(x) meets the Lipschitz condition.
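To make the Lipschitz condition concrete, here is a small illustrative sketch (my own addition, not from the slides) that empirically estimates a lower bound on the Lipschitz constant K by sampling random pairs of points and taking the largest ratio ||F(x) − F(u)|| / ||x − u||:

```python
import numpy as np

def F(x):
    # Example vector field; tanh has bounded slope, so F is Lipschitz.
    return np.tanh(x)

rng = np.random.default_rng(0)
ratios = []
for _ in range(10000):
    x = rng.uniform(-5, 5, size=2)
    u = rng.uniform(-5, 5, size=2)
    ratios.append(np.linalg.norm(F(x) - F(u)) / np.linalg.norm(x - u))

# Empirical lower bound on K; for elementwise tanh the smallest valid K is 1.
print(max(ratios))
```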

Stability of Equilibrium States

- x̄ ∈ M is said to be an equilibrium state (or singular point) of the system if
    dx/dt |_{x = x̄} = F(x̄) = 0.
- How the system behaves near these equilibrium states is of great interest.
- Near these points, we can approximate the dynamics by linearizing F(x) around x̄ (using a Taylor expansion), i.e., with x(t) = x̄ + Δx(t):
    F(x) ≈ x̄ + A Δx(t),
  where A is the Jacobian:
    A = ∂F(x)/∂x |_{x = x̄}.

Stability in the Linearized System

- In the linearized system, the properties of the Jacobian matrix A determine the behavior near equilibrium points. This is because
    d/dt Δx(t) ≈ A Δx(t).
- If A is nonsingular, A⁻¹ exists, and this can be used to describe the local behavior near the equilibrium x̄.
- The eigenvalues of the matrix A characterize the different classes of behavior.

Eigenvalues/Eigenvectors

- For a square matrix A, if a vector x and a scalar value λ exist such that (A − λI)x = 0, then x is called an eigenvector of A and λ an eigenvalue. Note that the above is simply Ax = λx.
- An intuitive meaning: x is a direction in which applying the linear transformation A only changes the magnitude of x (by λ) but not its direction.
- There can be as many as n eigenvector/eigenvalue pairs for an n × n matrix.

Example: 2nd-Order System

- The positive/negative and real/imaginary character of the eigenvalues of the Jacobian determines the behavior:
  - Stable node (real, negative); unstable focus (complex, positive real part)
  - Stable focus (complex, negative real part); saddle point (real, opposite signs)
  - Unstable node (real, positive); center (complex, zero real part)
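A short sketch, assuming the classification table above, that reads off the class of an equilibrium from the eigenvalues of the Jacobian A; the specific matrices are made-up examples:

```python
import numpy as np

def classify(A):
    """Classify the equilibrium of d(dx)/dt = A*dx from the eigenvalues of A."""
    lam = np.linalg.eigvals(A)
    re, im = lam.real, lam.imag
    if np.all(im == 0):                     # purely real eigenvalues
        if np.all(re < 0): return "stable node"
        if np.all(re > 0): return "unstable node"
        return "saddle point"               # opposite signs
    if np.all(re < 0): return "stable focus"
    if np.all(re > 0): return "unstable focus"
    return "center"                         # complex with zero real part

print(classify(np.array([[0., 1.], [-1., -0.5]])))  # stable focus
print(classify(np.array([[0., 1.], [-1.,  0.]])))   # center
```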

Definitions of Stability

The equilibrium state x̄ is:
- Uniformly stable if, for an arbitrary ɛ > 0, there exists a positive δ such that ||x(0) − x̄|| < δ implies ||x(t) − x̄|| < ɛ for all t > 0.
- Convergent if there exists a positive δ such that ||x(0) − x̄|| < δ implies x(t) → x̄ as t → ∞.
- Asymptotically stable if both stable and convergent.
- Globally asymptotically stable if stable and all trajectories of the system converge to x̄ as time t approaches infinity.

Lyapunov's Theorems

- Theorem 1: The equilibrium state x̄ is stable if, in a small neighborhood of x̄, there exists a positive definite function V(x) such that its derivative with respect to time is negative semidefinite in that region.
- Theorem 2: The equilibrium state x̄ is asymptotically stable if, in a small neighborhood of x̄, there exists a positive definite function V(x) such that its derivative with respect to time is negative definite in that region.
- A scalar function V(x) that satisfies these conditions is called a Lyapunov function for the equilibrium state x̄. (A small numerical illustration of Theorem 2 follows after this group of slides.)

Attractors

- Dissipative systems are characterized by attracting sets or manifolds of dimensionality lower than that of the embedding space. These are called attractors.
- Regions of initial conditions of nonzero state-space volume converge to these attractors as time t increases.

Types of Attractors

- Point attractors (left)
- Limit-cycle attractors (right)
- Strange (chaotic) attractors (not shown)
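As a concrete illustration of Theorem 2 (my own example, not from the slides), take dx/dt = −x with the candidate Lyapunov function V(x) = ||x||². Along trajectories, dV/dt = ∇V(x) · F(x) = 2x · (−x) = −2||x||², which is negative definite, so the origin is asymptotically stable. A numerical spot-check:

```python
import numpy as np

def F(x):
    return -x                  # simple globally asymptotically stable system

def V(x):
    return float(x @ x)        # candidate Lyapunov function V(x) = ||x||^2

def Vdot(x):
    # dV/dt along trajectories = grad V(x) . F(x) = -2*||x||^2
    return float(2 * x @ F(x))

rng = np.random.default_rng(1)
samples = rng.normal(size=(1000, 3))
print(all(Vdot(x) < 0 for x in samples))  # True: derivative negative definite
```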

Neurodynamical Models

- We will focus on state variables that are continuous-valued, with dynamics expressed as differential equations or difference equations.
- Properties:
  - Large number of degrees of freedom
  - Nonlinearity
  - Dissipative (as opposed to conservative), i.e., an open system
  - Noise

Manipulation of Attractors as a Recurrent-Network Paradigm

- We can identify attractors with computational objects (associative memories, input-output mappers, etc.).
- In order to do so, we must exercise control over the locations of the attractors in the state space of the system.
- A learning algorithm manipulates the equations governing the dynamical behavior so that a desired set of attractor locations is established.
- One good way to do this is to use the energy-minimization paradigm (e.g., as done by Hopfield).

Hopfield Model

- Based on the McCulloch-Pitts model (neurons with +1 or −1 output).
- The energy function is defined as
    E = −(1/2) Σ_j Σ_i w_ji x_i x_j   (i ≠ j).
  (A sketch of this computation follows below.)
- The network dynamics will evolve in the direction that minimizes E.

Discrete Hopfield Model

- N units with full connections among every node (no self-feedback).
- Given M input patterns, each having the same dimensionality as the network, they can be memorized in attractors of the network.
- Implements a content-addressable memory.
- Starting with an initial pattern, the dynamics will converge toward the attractor of the basin of attraction where the initial pattern was placed.
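A minimal sketch of the energy computation referenced above; the weight matrix here is an arbitrary symmetric example with zero diagonal, not a trained network:

```python
import numpy as np

def hopfield_energy(W, x):
    """E = -1/2 * sum_{i != j} w_ji * x_i * x_j; assumes W has zero diagonal."""
    return -0.5 * float(x @ W @ x)

W = np.array([[0., 1.], [1., 0.]])   # symmetric, no self-feedback
print(hopfield_energy(W, np.array([1.,  1.])))  # -1.0 (low-energy state)
print(hopfield_energy(W, np.array([1., -1.])))  # +1.0 (high-energy state)
```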

Content-Addressable Memory

- Map a set of patterns to be memorized ξ_μ onto fixed points x_μ in the dynamical system realized by the recurrent network.
- Encoding: the mapping from ξ_μ to x_μ.
- Decoding: the reverse mapping from the state space x_μ back to ξ_μ.

Hopfield Model: Storage

- The learning rule is similar to Hebbian learning:
    w_ji = (1/N) Σ_{μ=1}^M ξ_μ,j ξ_μ,i,
  with w_ji = 0 if i = j. (Learning is one-shot.)
- In matrix form, the above becomes
    W = (1/N) Σ_{μ=1}^M ξ_μ ξ_μᵀ − (M/N) I.
- The resulting weight matrix W is symmetric: W = Wᵀ.

Hopfield Model: Activation (Retrieval)

- Initialize the network with a probe pattern ξ_probe: x_j(0) = ξ_probe,j.
- Update the output of each neuron (picking neurons at random) as
    x_j(n + 1) = sgn( Σ_{i=1}^N w_ji x_i(n) )
  until x reaches a fixed point. (A compact sketch combining storage and retrieval follows below.)

Spurious States

- The weight matrix W is symmetric, thus the eigenvalues of W are all real.
- For a large number of patterns M, the matrix is degenerate, i.e., several eigenvectors can have the same eigenvalue.
- These eigenvectors form a subspace, and when the associated eigenvalue is 0, it is called a null space. This is due to M being smaller than the number of neurons N.
- The Hopfield network as a content-addressable memory: the discrete Hopfield network acts as a vector projector (it projects the probe vector onto the subspace spanned by the training patterns), and the underlying dynamics drive the network to converge to one of the corners of the unit hypercube.
- Spurious states are those corners of the hypercube that do not belong to the training-pattern set.
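The sketch below combines the one-shot Hebbian storage rule and the asynchronous sign-update retrieval rule described above; the network size, pattern count, and corruption level are illustrative assumptions:

```python
import numpy as np

def store(patterns):
    """One-shot Hebbian storage: w_ji = (1/N) * sum_mu xi_mu,j * xi_mu,i,
    with the diagonal zeroed (no self-feedback)."""
    N = patterns.shape[1]
    W = patterns.T @ patterns / N
    np.fill_diagonal(W, 0.0)
    return W

def retrieve(W, probe, rng, max_sweeps=100):
    """Asynchronous updates x_j <- sgn(sum_i w_ji x_i) in random order,
    repeated until a full sweep changes nothing (a fixed point)."""
    x = probe.copy()
    for _ in range(max_sweeps):
        changed = False
        for j in rng.permutation(len(x)):
            new = 1.0 if W[j] @ x >= 0 else -1.0
            if new != x[j]:
                x[j], changed = new, True
        if not changed:
            break
    return x

rng = np.random.default_rng(42)
N, M = 100, 5
patterns = rng.choice([-1.0, 1.0], size=(M, N))
W = store(patterns)

probe = patterns[0].copy()
flip = rng.choice(N, size=10, replace=False)   # corrupt 10 of 100 bits
probe[flip] *= -1
recalled = retrieve(W, probe, rng)
print(np.array_equal(recalled, patterns[0]))   # usually True for small loads
```

With M = 5 and N = 100 the load is α = M/N = 0.05, well under the critical value α_c ≈ 0.14 discussed on the next slides, so recall of a mildly corrupted pattern usually succeeds.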

Storage Capacity of the Hopfield Network

- Given a probe equal to the stored pattern ξ_ν, the activation of the jth neuron can be decomposed into a signal term and a noise term:
    v_j = Σ_{i=1}^N w_ji ξ_ν,i
        = (1/N) Σ_{i=1}^N Σ_{μ=1}^M ξ_μ,j ξ_μ,i ξ_ν,i
        = ξ_ν,j  +  (1/N) Σ_{i=1}^N Σ_{μ=1, μ≠ν}^M ξ_μ,j ξ_μ,i ξ_ν,i,
  where the first term is the signal (using ξ_ν,i² = 1 and ξ_ν,j³ = ξ_ν,j, since ξ ∈ {±1}) and the second term is the noise.
- The signal-to-noise ratio is defined as
    ρ = (variance of signal) / (variance of noise) = 1 / ((M − 1)/N) ≈ N/M.
- The reciprocal of ρ, called the load parameter, is designated α. According to Amit and others, this value needs to be less than the critical value α_c = 0.14.

Storage Capacity of the Hopfield Network (cont'd)

- Given α_c = 0.14, the storage capacity becomes M_c = α_c N = 0.14 N when some error is allowed in the final patterns.
- For almost error-free performance, the storage capacity becomes
    M_c = N / (2 log_e N).
- Thus, the storage capacity of the Hopfield network scales less than linearly with the size N of the network. This is a major limitation of the Hopfield model. (A quick numerical check of these formulas follows below.)

Cohen-Grossberg Theorem

- Cohen and Grossberg (1983) showed how to assess the stability of a certain class of neural networks with the dynamics
    d/dt u_j = a_j(u_j) [ b_j(u_j) − Σ_{i=1}^N c_ji ϕ_i(u_i) ],  j = 1, 2, ..., N.
- A neural network with the above dynamics admits a Lyapunov function defined as
    E = (1/2) Σ_{j=1}^N Σ_{i=1}^N c_ji ϕ_i(u_i) ϕ_j(u_j) − Σ_{j=1}^N ∫_0^{u_j} b_j(λ) ϕ'_j(λ) dλ,
  where ϕ'_j(λ) = d ϕ_j(λ)/dλ.

Cohen-Grossberg Theorem (cont'd)

- For the definition on the previous slide to be valid, the following conditions need to be met:
  - The synaptic weights are symmetric.
  - The function a_j(u_j) satisfies the condition of nonnegativity: a_j(u_j) ≥ 0.
  - The nonlinear activation function ϕ_j(u_j) follows the monotonicity condition: ϕ'_j(u_j) = d ϕ_j(u_j)/du_j ≥ 0.
- With the above, dE/dt ≤ 0, ensuring global stability of the system.
- The Hopfield model can be seen as a special case of the Cohen-Grossberg theorem.
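A quick numerical check (my own addition) of the two capacity formulas for a few network sizes, showing the sublinear scaling of error-free capacity:

```python
import numpy as np

for N in (100, 1000, 10000):
    with_errors = 0.14 * N            # M_c = alpha_c * N, some recall errors allowed
    error_free = N / (2 * np.log(N))  # M_c = N / (2 ln N), almost error-free
    print(f"N={N:6d}  M_c(errors allowed)={with_errors:8.1f}  "
          f"M_c(error-free)={error_free:8.1f}")
```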

Demo

- Noisy input
- Partial input
- Capacity overload