
Chapter 6: Associative Models

Association is the task of mapping input patterns to target patterns ("attractors"). For instance, an associative memory may have to complete (or correct) an incomplete (or corrupted) pattern. Unlike computer memories, no "address" is known for each pattern. Learning consists of encoding the desired patterns as a weight matrix (network); retrieval (or "recall") refers to the generation of an output pattern when an input pattern is presented to the network.

1. Hetero-association: mapping input vectors to output vectors that range over a different vector space, e.g., translating English words to Spanish words.

2. Auto-association: input vectors and output vectors range over the same vector space, e.g., character recognition, or eliminating noise from pixel arrays, as in Figure 6.1.

Figure 6.1: A noisy input pattern that resembles "3" and "8," but not "1."

This example illustrates the presence of noise in the input, and also that more than one (but not all) target patterns may be reasonable for a given input pattern. The difference between input and target patterns allows us to evaluate network outputs.

Many associative learning models are based on variations of Hebb's observation: "When one cell repeatedly assists in firing another, the axon of the first cell develops synaptic knobs (or enlarges them if they already exist) in contact with the soma of the second cell."

We first discuss non-iterative, "one-shot" procedures for association, then proceed to iterative models with better error-correction capabilities, in which node states may be updated several times, until they "stabilize."

6.1 Non-iterative Procedures

In non-iterative association, the output pattern is generated from the input pattern in a single iteration. Hebb's law may be used to develop associative "matrix memories," or gradient descent can be applied to minimize recall error.

Consider hetero-association, using a two-layer network developed from the training set T = {(i_p, d_p) : p = 1, ..., P}, where i_p ∈ {-1, 1}^n and d_p ∈ {-1, 1}^m. Presenting i_p at the first layer leads to the instant generation of d_p at the second layer. Each node of the network corresponds to one component of the input (i) or desired output (d) patterns.

* :::::::::::::::::::::::::::::::::::::::::::::::::: *

Matrix Associative Memories: A weight matrix W is first used to premultiply the input vector:

    y = W x^(1).

If output node values must range over {-1, 1}, the signum function is applied to each coordinate:

    x^(2)_j = sgn(y_j),   for j = 1, ..., m.

The Hebbian weight update rule is

    Δw_{j,k} ∝ i_{p,k} d_{p,j}.

If all input and output patterns are available at once, we can perform a direct calculation of the weights:

    w_{j,k} = c Σ_{p=1..P} i_{p,k} d_{p,j},   for some constant c > 0.

Each w_{j,k} measures the correlation between the kth component of the input vectors and the jth component of the associated output vectors. The multiplier c can be omitted if we are only interested in the sign of each component of y, so that

    W = Σ_{p=1..P} d_p [i_p]^T = D I^T,

where the columns of matrix I are the input patterns i_p and the columns of D are the desired output patterns d_p.

Non-iterative procedures have low error-correction capabilities: multiplying W with even a slightly corrupted input vector often results in "spurious" output that differs from the patterns intended to be stored.

Example 6.1: Associate the input vector (1, 1) with the output vector (-1, 1), and (1, -1) with (-1, -1). Then

    W = d_1 [i_1]^T + d_2 [i_2]^T = [[-1, -1], [1, 1]] + [[-1, 1], [-1, 1]] = [[-2, 0], [0, 2]].

When the original input pattern (1, 1) is presented,

    y = W x^(1) = [[-2, 0], [0, 2]] (1, 1)^T = (-2, 2)^T,

and sgn(y) = (-1, 1) = x^(2), the correct output pattern associated with (1, 1). If the stimulus is a new input pattern (-1, -1), for which no stored association exists, the resulting output pattern is

    y = W x^(1) = [[-2, 0], [0, 2]] (-1, -1)^T = (2, -2)^T,

and sgn(y) = (1, -1), different from the stored patterns.
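The outer-product construction and one-shot recall can be checked with a few lines of code. The sketch below is an illustration, not code from the text; it assumes NumPy, bipolar {-1, +1} patterns, and the column-per-pattern convention, and it reproduces both recall computations of Example 6.1.

import numpy as np

# Stored associations: columns of I are inputs, columns of D are the desired outputs.
I = np.array([[1, 1], [1, -1]]).T        # i_1 = (1, 1), i_2 = (1, -1)
D = np.array([[-1, 1], [-1, -1]]).T      # d_1 = (-1, 1), d_2 = (-1, -1)

# Hebbian (correlation) matrix memory: W = sum_p d_p i_p^T = D I^T.
W = D @ I.T

def recall(x):
    """One-shot recall: premultiply by W, then apply signum (sgn(0) taken as +1)."""
    y = W @ x
    return np.where(y >= 0, 1, -1)

print(W)                                 # [[-2  0] [ 0  2]]
print(recall(np.array([1, 1])))          # [-1  1]  -> stored output
print(recall(np.array([-1, -1])))        # [ 1 -1]  -> spurious output

The last line shows the low error-correction capability mentioned above: an unstored stimulus produces an output that matches neither stored pattern.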

* :::::::::::::::::::::::::::::::::::::::::::::::::: *

Least squares procedure (Widrow-Hoff rule): When i_p is presented to the network, the resulting output W i_p must be as close as possible to the desired output pattern d_p. Hence W must be chosen to minimize

    E = Σ_{p=1..P} ||d_p - W i_p||^2
      = Σ_{p=1..P} [ (d_{p,1} - Σ_j w_{1,j} i_{p,j})^2 + ... + (d_{p,m} - Σ_j w_{m,j} i_{p,j})^2 ].

Since E is a quadratic function whose second derivative is positive, the minimizing weights are obtained by setting

    ∂E/∂w_{j,l} = -2 Σ_{p=1..P} ( d_{p,j} - Σ_{k=1..n} w_{j,k} i_{p,k} ) i_{p,l} = 0   for each pair (j, l),

which gives

    Σ_{p=1..P} d_{p,j} i_{p,l} = Σ_{k=1..n} w_{j,k} ( Σ_{p=1..P} i_{p,k} i_{p,l} ) = (jth row of W) · (lth column of I I^T).

Combining all such equations,

    D I^T = W (I I^T).

When (I I^T) is invertible, the mean square error is minimized by

    W = D I^T (I I^T)^{-1}.        (6.1)

The least squares method "normalizes" the Hebbian weight matrix D I^T using the inverse (I I^T)^{-1}. For auto-association, W = I I^T (I I^T)^{-1} is the identity matrix, which maps each vector to itself.

* :::::::::::::::::::::::::::::::::::::::::::::::::: *

What if I I^T has no inverse? Every matrix A has a unique "pseudo-inverse," defined to be the matrix A^+ that satisfies the following conditions:

    A A^+ A = A,    A^+ A A^+ = A^+,    A^+ A = (A^+ A)^T,    A A^+ = (A A^+)^T.

The Optimal Linear Associative Memory (OLAM) generalizes Equation 6.1:

    W = D [I]^+.

For auto-association, I = D, so that W = I [I]^+.
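A quick way to experiment with the OLAM rule is to let a linear-algebra library compute the pseudo-inverse. The sketch below is a minimal illustration, not from the text; it assumes NumPy and the column-per-pattern convention, and the pattern values are made-up example data. numpy.linalg.pinv returns the Moore-Penrose pseudo-inverse, so W = D pinv(I) works whether or not I I^T is invertible.

import numpy as np

def olam_weights(I, D):
    """Optimal Linear Associative Memory: W = D I^+ (columns of I, D are the pattern pairs)."""
    return D @ np.linalg.pinv(I)

# Example data: three 3-dimensional inputs mapped to 2-dimensional outputs.
I = np.array([[ 1,  1, -1],
              [ 1, -1,  1],
              [-1,  1,  1]]).T           # columns are the input patterns
D = np.array([[ 1,  1],
              [-1, -1],
              [ 1, -1]]).T               # columns are the associated outputs

W = olam_weights(I, D)
for p in range(I.shape[1]):
    # Each stored input should be mapped back onto its associated output.
    print(np.sign(W @ I[:, p]), D[:, p])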

When the input patterns form an orthonormal set of unit vectors, i.e.,

    i_p^T i_q = 1 if p = q, and 0 otherwise,

then [I]^T I is the identity matrix, so that all the conditions defining the pseudo-inverse are satisfied with [I]^+ = [I]^T. Then

    W = D [I]^+ = D [I]^T.

* :::::::::::::::::::::::::::::::::::::::::::::::::: *

Example 6.2: Hetero-association with three (i, d) pairs:

    ((1, 1, -1), (1, 1)),    ((1, -1, 1), (-1, -1)),    ((-1, 1, 1), (1, -1)).

The input matrix I has columns (1, 1, -1), (1, -1, 1), (-1, 1, 1), and the output matrix D has columns (1, 1), (-1, -1), (1, -1). The inverse of I I^T exists,

    (I I^T)^{-1} = [[0.5, 0.25, 0.25], [0.25, 0.5, 0.25], [0.25, 0.25, 0.5]],

and

    W = D I^T (I I^T)^{-1} = [[0, 1, 0], [0, 0, -1]].

An input pattern (-1, -1, -1) yields (-1, 1) on premultiplying by W.

* :::::::::::::::::::::::::::::::::::::::::::::::::: *

Example 6.3: Consider hetero-association where four input/output pattern pairs are to be stored. The input and output matrices I and D are formed as before, with one column per pattern; I I^T is again invertible, and W = D I^T (I I^T)^{-1} is computed in the same way. Premultiplying one of the stored input vectors by W again yields the associated output pattern.

* :::::::::::::::::::::::::::::::::::::::::::::::::: *

Noise Extraction: Given the stored patterns, a vector x can be written as the sum of two components:

    x = W x + (1 - W) x = x̂ + ã,

where x̂ = W x = Σ_i c_i i_i is the projection of x onto the vector space spanned by the stored vectors, ã is the noise component orthogonal to this space, W is the auto-associative (OLAM) matrix, and 1 denotes the identity matrix.
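For auto-association this projection matrix can be built directly from the stored patterns, which makes the decomposition above easy to try out. The following sketch is an illustration under the same NumPy and column-per-pattern assumptions as before (not code from the text); it splits a noisy vector into its stored-subspace component and its noise component.

import numpy as np

# Two stored 4-dimensional patterns (columns); fewer patterns than neurons.
I = np.array([[ 1,  1, -1, -1],
              [ 1, -1,  1, -1]]).T

W = I @ np.linalg.pinv(I)                 # auto-associative OLAM: projector onto span of stored patterns
x = np.array([1.0, 0.8, -1.2, -0.9])      # a noisy version of the first stored pattern

x_hat = W @ x                             # component inside the stored subspace
noise = (np.eye(4) - W) @ x               # orthogonal noise component, suppressed by the memory

print(np.round(x_hat, 3), np.round(noise, 3), np.round(x_hat + noise, 3))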

Matrix multiplication by W thus projects any vector onto the space of the stored vectors, whereas (1 - W) extracts the noise component, which is to be minimized. Noise is suppressed if the number of patterns being stored is less than the number of neurons; otherwise, the noise may be amplified.

* :::::::::::::::::::::::::::::::::::::::::::::::::: *

Nonlinear Transformations: Let

    x^(2) = f(W x^(1)) = ( f(w_{1,1} x^(1)_1 + ... + w_{1,n} x^(1)_n), ..., f(w_{m,1} x^(1)_1 + ... + w_{m,n} x^(1)_n) ),

where f is differentiable. The error E is minimized with respect to each weight w_{j,l} by setting

    Σ_{p=1..P} ( d_{p,j} - x^(2)_{p,j} ) f'( Σ_{k=1..n} w_{j,k} i_{p,k} ) i_{p,l} = 0,

where x^(2)_{p,j} = f( Σ_{k=1..n} w_{j,k} i_{p,k} ) and f'(z) = df(z)/dz. Iterative procedures are needed to solve these nonlinear equations.

* :::::::::::::::::::::::::::::::::::::::::::::::::: *

6.2 Hopfield Networks

Hopfield networks are auto-associators in which node values are iteratively updated based on a local computation principle: the new state of each node depends only on its net weighted input at a given time. The network is fully connected, as shown in Figure 6.2, and the weights are obtained by Hebb's rule. The system may undergo many state transitions before converging to a stable state.

Figure 6.2: A Hopfield network with five nodes; every connection is symmetric (w_{1,2} = w_{2,1}, w_{1,5} = w_{5,1}, w_{2,3} = w_{3,2}, w_{4,5} = w_{5,4}, w_{3,4} = w_{4,3}, and so on).

* :::::::::::::::::::::::::::::::::::::::::::::::::: *

Discrete Hopfield networks: One node corresponds to each vector dimension, taking values in {-1, 1}.

Each node applies a step function to the sum of the external input and the weighted outputs of the other nodes. Node output values may change when an input vector is presented. Computation proceeds in "jumps": the values of the time variable range over the natural numbers, not the real numbers.

NOTATION:
  T = {i_1, ..., i_P}: the training set.
  x_{p,j}(t): value generated by the jth node at time t, for the pth input pattern.
  w_{k,j}: weight of the connection from the jth node to the kth node.
  I_{p,k}: external input to the kth node, for the pth input vector (includes the threshold term).

The node output function is described by

    x_{p,k}(t+1) = sgn( Σ_{j=1..n} w_{k,j} x_{p,j}(t) + I_{p,k} ),

where sgn(x) = 1 if x >= 0 and sgn(x) = -1 if x < 0.

Asynchronicity: at every time instant, precisely one node's output value is updated. Node selection may be cyclic or random; each node in the system must have the opportunity to change state.

Network Dynamics: Select a node k ∈ {1, ..., n} to be updated. Then

    x_{p,l}(t+1) = x_{p,l}(t)                                           if l ≠ k,
    x_{p,l}(t+1) = sgn( Σ_{j=1..n} w_{l,j} x_{p,j}(t) + I_{p,l} )        if l = k.        (6.2)

By contrast, in Little's synchronous model, all nodes are updated simultaneously at every time instant. Cyclic behavior may result when two nodes update their values simultaneously, each attempting to move towards a different attractor. The network may then repeatedly shuttle between two network states. Any such cycle consists of only two states and hence can be detected easily.

The Hopfield network can be used to retrieve a stored pattern when a corrupted version of the stored pattern is presented. It can also be used to "complete" a pattern when parts of the pattern are missing, e.g., by using 0 for an unknown node input/activation value.

Figure 6.3: The initial network state (O) is equidistant from the attractors X and Y, in (a). Asynchronous computation instantly leads to one of the attractors, (b) or (c). In synchronous computation, the state moves instead from the initial position to a non-attractor position (d). Cycling behavior results in the synchronous case, with the network state oscillating between (d) and (e).

* :::::::::::::::::::::::::::::::::::::::::::::::::: *

Example 6.4: Consider a 4-node Hopfield network whose weights store the patterns (1, 1, 1, 1) and (-1, -1, -1, -1). Each weight w_{l,j} = 1 for l ≠ j, and w_{j,j} = 0 for all j.

I. Corrupted Input Pattern (1, 1, 1, -1): I_1 = I_2 = I_3 = 1 and I_4 = -1 are the initial values of the node outputs.

1. Assume that the second node is randomly selected for possible update. Its net input is w_{2,1} x_1 + w_{2,3} x_3 + w_{2,4} x_4 + I_2 = 1 + 1 - 1 + 1 = 2. Since sgn(2) = 1, this node's output remains at 1.

2. If the fourth node is selected for possible update, its net input is 1 + 1 + 1 - 1 = 2, and sgn(2) = 1 implies that this node changes state to 1.

3. No further changes of state occur from this network configuration (1, 1, 1, 1). Thus, the network has successfully recovered one of the stored patterns from the corrupted input vector.

II. Equidistant Case (1, 1, -1, -1): Both stored patterns are equally distant; one is chosen because the node function yields 1 when the net input is 0.

1. If the second node is selected for updating, its net input is 0, hence its state is not changed from 1.

2. If the third node is selected for updating, its net input is 0, hence its state is changed from -1 to 1.

3. Subsequently, the fourth node also changes state, resulting in the network configuration (1, 1, 1, 1).

Missing Input Element Case (0, 1, -1, -1): If the second node is selected for updating, its net input is w_{2,1} x_1 + w_{2,3} x_3 + w_{2,4} x_4 + I_2 = 0 - 1 - 1 + 1 < 0, implying that the updated output of this node is -1. Subsequently, the first node also switches state to -1, resulting in the network configuration (-1, -1, -1, -1).

Multiple Missing Input Elements Case (0, 0, 0, -1): Though most of the initial inputs are unknown, the network succeeds in switching the states of three nodes, resulting in the stored pattern (-1, -1, -1, -1). Thus, a significant amount of corruption, noise or missing data can be handled successfully in problems with a small number of stored patterns.
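The asynchronous recall procedure of Example 6.4 is easy to simulate. The sketch below is a minimal illustration, not taken from the text; it assumes NumPy, the convention sgn(0) = +1, and the 4-node network storing (1, 1, 1, 1) and (-1, -1, -1, -1).

import numpy as np

n = 4
W = np.ones((n, n)) - np.eye(n)        # w_lj = 1 for l != j, w_jj = 0
I_ext = np.array([1, 1, 1, -1])        # corrupted input pattern, also the initial state
x = I_ext.copy()

rng = np.random.default_rng(0)
for _ in range(20):                    # asynchronous dynamics: pick one node at a time
    k = rng.integers(n)
    net = W[k] @ x + I_ext[k]
    x[k] = 1 if net >= 0 else -1       # step (signum) node function, sgn(0) taken as +1

print(x)                               # converges to the stored pattern [1 1 1 1]

Presenting (0, 1, -1, -1) or (0, 0, 0, -1) as I_ext instead reproduces the missing-element cases, ending at (-1, -1, -1, -1).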

* :::::::::::::::::::::::::::::::::::::::::::::::::: *

Energy Function: The Hopfield network dynamics attempt to minimize an "energy" (cost) function, derived as follows, assuming each desired output vector is the stored attractor pattern nearest the input vector.

If w_{l,j} is positive and large, then we expect that the lth and jth nodes are frequently ON or OFF together in the attractor patterns, i.e., Σ_{p=1..P} (i_{p,l} i_{p,j}) is positive and large. Similarly, w_{l,j} is negative and large when the lth and jth nodes frequently have opposite activation values in different attractor patterns, i.e., Σ_{p=1..P} (i_{p,l} i_{p,j}) is negative and large.

So w_{l,j} should be proportional to Σ_{p=1..P} (i_{p,l} i_{p,j}). For strong positive or negative correlation, w_{l,j} and Σ_p (i_{p,l} i_{p,j}) have the same sign, hence Σ_{p=1..P} w_{l,j} (i_{p,l} i_{p,j}) > 0. The self-excitation coefficient w_{j,j} is often set to 0.

Summing over all pairs of nodes, Σ_l Σ_j w_{l,j} i_{p,l} i_{p,j} is positive and large for input vectors almost identical to some attractor. We expect that node correlations present in the attractors are absent in an input vector i distant from all attractor patterns, so that Σ_l Σ_j w_{l,j} i_l i_j is then low or negative. Therefore, the energy function contains the term

    - ( Σ_l Σ_j w_{l,j} x_l x_j ).

When this is minimized, the final values of the various node outputs are expected to correspond to an attractor pattern.

Network output should also be close to the input vector: when presented with a corrupted image of '3', we do not want the system to generate an output corresponding to '1'. For the external inputs I_l, another term -Σ_l I_l x_l is included in the energy expression; I_l x_l > 0 iff input = output for the lth node. Combining the two terms, the following "energy" or Lyapunov function must be minimized by modifying the x_l values:

    E = -a Σ_l Σ_j w_{l,j} x_l x_j - b Σ_l I_l x_l,

where a, b >= 0. The values a = 1/2 and b = 1 correspond to a reduction in energy whenever a node update occurs, as described below. Even if each I_l = 0, we can select the initial node inputs x_l(0) so that the network settles into a state close to the input pattern components.

* :::::::::::::::::::::::::::::::::::::::::::::::::: *
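The Lyapunov property can be observed numerically by evaluating E before and after each node update. The sketch below is an illustrative check, not from the text; it reuses the 4-node weights of Example 6.4 with a = 1/2 and b = 1, and it assumes NumPy.

import numpy as np

W = np.ones((4, 4)) - np.eye(4)           # symmetric Hebbian-style weights, zero diagonal
I_ext = np.array([1, 1, -1, -1])          # external inputs (the presented pattern)

def energy(x, a=0.5, b=1.0):
    """E = -a * sum_{l,j} w_lj x_l x_j - b * sum_l I_l x_l."""
    return -a * x @ W @ x - b * I_ext @ x

x = I_ext.copy().astype(float)
rng = np.random.default_rng(1)
for _ in range(20):
    k = rng.integers(4)
    e_before = energy(x)
    x[k] = 1.0 if W[k] @ x + I_ext[k] >= 0 else -1.0
    assert energy(x) <= e_before + 1e-9   # energy never increases under the update rule
print(x, energy(x))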

Energy Minimization: Steadily reducing E results in convergence to a stable state (which may or may not be one of the desired attractors). Let the kth node be selected for updating at time t. For the node update rule in Eqn. 6.2, the resulting change of energy is

    ΔE(t) = E(t+1) - E(t)
          = -a Σ_l Σ_{j≠l} w_{l,j} [ x_l(t+1) x_j(t+1) - x_l(t) x_j(t) ] - b Σ_l I_l ( x_l(t+1) - x_l(t) )
          = -a Σ_{j≠k} (w_{k,j} + w_{j,k}) ( x_k(t+1) - x_k(t) ) x_j(t) - b I_k ( x_k(t+1) - x_k(t) ),

because x_j(t+1) = x_j(t) for every node j ≠ k not selected for updating at this step. Hence

    ΔE(t) = - [ a Σ_{j≠k} (w_{k,j} + w_{j,k}) x_j(t) + b I_k ] ( x_k(t+1) - x_k(t) ).

For ΔE(t) to be negative, ( x_k(t+1) - x_k(t) ) and ( a Σ_{j≠k} (w_{k,j} + w_{j,k}) x_j(t) + b I_k ) must have the same sign. The weights are chosen to be proportional to the correlation terms, i.e., w_{l,j} = ( Σ_{p=1..P} i_{p,l} i_{p,j} ) / P. Hence w_{j,k} = w_{k,j}, i.e., the weights are symmetric.

For the choice of constants a = 1/2 and b = 1, the energy change expression simplifies to

    ΔE(t) = - net_k(t) ( x_k(t+1) - x_k(t) ),

where net_k(t) = Σ_{j≠k} w_{j,k} x_j(t) + I_k is the net input to the kth node at time t.

To reduce energy, the chosen (kth) node changes state iff its current state differs from the sign of its net input, i.e., iff net_k(t) x_k(t) < 0. Repeated application of this update rule results in a "stable state" in which all nodes stop changing their current values. Stable states may not be the desired attractor states, but may instead be "spurious."

* :::::::::::::::::::::::::::::::::::::::::::::::::: *

Example 6.5: The patterns (1, 1, -1, -1), (1, 1, 1, 1) and (-1, -1, 1, 1) are to be stored in a 4-node network.

The first and second nodes have exactly the same values in every stored pattern, hence w_{1,2} = 1. Similarly, w_{3,4} = 1. The first and third nodes agree in one stored pattern but disagree in the other two, hence

    w_{1,3} = ( (no. of agreements) - (no. of disagreements) ) / (no. of patterns) = -1/3.

Similarly, w_{1,4} = w_{2,3} = w_{2,4} = -1/3.

If the input vector is (-1, -1, -1, -1), and the fourth node is selected for possible update, its net input is

    w_{1,4} x_1 + w_{2,4} x_2 + w_{3,4} x_3 + I_4 = (-1/3)(-1) + (-1/3)(-1) + (1)(-1) + (-1) = -4/3 < 0,

hence the node does not change state. The same holds for every node in the network, so the network configuration remains at (-1, -1, -1, -1), different from the patterns that were to be stored.

If the input vector is (-1, -1, -1, 0), representing the case in which the fourth input value is missing, and the fourth node is selected for possible update, its net input is

    w_{1,4} x_1 + w_{2,4} x_2 + w_{3,4} x_3 + I_4 = (-1/3)(-1) + (-1/3)(-1) + (1)(-1) + 0 = -1/3 < 0,

and the node changes state to -1, resulting in the spurious pattern (-1, -1, -1, -1).

* :::::::::::::::::::::::::::::::::::::::::::::::::: *
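The correlation rule used in Example 6.5, (agreements - disagreements)/(no. of patterns), is simply the average product of the corresponding pattern components. A small sketch (illustrative, not from the text; it assumes NumPy and stores the patterns as rows) computes the full weight matrix:

import numpy as np

patterns = np.array([[ 1,  1, -1, -1],
                     [ 1,  1,  1,  1],
                     [-1, -1,  1,  1]])       # stored patterns, one per row

P, n = patterns.shape
W = (patterns.T @ patterns) / P               # w_lj = (1/P) * sum_p i_pl * i_pj
np.fill_diagonal(W, 0.0)                      # no self-excitation: w_jj = 0

print(np.round(W, 3))
# [[ 0.     1.    -0.333 -0.333]
#  [ 1.     0.    -0.333 -0.333]
#  [-0.333 -0.333  0.     1.   ]
#  [-0.333 -0.333  1.     0.   ]]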

Example 6.6: Images of four different objects are shown in Figure 6.4. We treat each image as a 19 × 19 binary pixel array, as in Figure 6.5, stored using a Hopfield network with 19 × 19 neurons.

Figure 6.4: Four images (XII, IX, III and VI) stored in a Hopfield network.

Figure 6.5: Binary representation of the objects in Fig. 6.4.

Figure 6.6: Corrupted versions of the objects in Fig. 6.4.

The network is stimulated by a distorted version of a stored image, shown in Figure 6.6. In each case, as long as the total amount of distortion is less than 30% of the number of neurons, the network recovers the correct image, even when big parts of an image are filled with ones or zeroes. Poorer performance would be expected if the network were much smaller, or if the number of patterns to be stored were much larger.

One drawback of a Hopfield network is the assumption of full connectivity: a million weights are needed for a thousand-node network; such large cases arise in image-processing applications with one node per pixel.

* :::::::::::::::::::::::::::::::::::::::::::::::::: *

Storage capacity refers to the quantity of information that can be stored and retrieved without error, and may be measured as

    C = (no. of stored patterns) / (no. of neurons).

Capacity depends on the connection weights, the stored patterns, and the difference between the stimulus patterns and the stored patterns. Let the training set contain P randomly chosen vectors i_1, ..., i_P, where each i_p ∈ {1, -1}^n. These vectors are stored using the connection weights

    w_{l,j} = (1/n) Σ_{p=1..P} i_{p,l} i_{p,j}.

How large can P be, so that the network responds to each i_p by correctly retrieving i_p?

Theorem 6.1: The maximum capacity of a Hopfield neural network (with n nodes) is bounded above by n / (4 ln n).

In other words, if

    ρ = Prob[ the lth bit of the pth stored vector is correctly retrieved, for every l and p ],

then lim_{n→∞} ρ = 1 whenever P < n / (4 ln n).

Proof: For stimulus i_1, the output of the first node is

    o_1 = Σ_{j=2..n} w_{1,j} i_{1,j} = (1/n) Σ_{j=2..n} Σ_{p=1..P} i_{p,1} i_{p,j} i_{1,j}.

This output will be correctly decoded if o_1 i_{1,1} > 0. Algebraic manipulation yields

    o_1 i_{1,1} = 1 + Z - 1/n,   where   Z = (1/n) Σ_{j=2..n} Σ_{p=2..P} i_{p,1} i_{p,j} i_{1,j} i_{1,1}.

The probability of correct retrieval of the first bit is

    φ = Prob[ o_1 i_{1,1} > 0 ] = Prob[ 1 + Z - 1/n > 0 ] ≥ Prob[ Z > -1 ].

By assumption, E(i_{p,j}) = 0 and E(i_{p,j} i_{q,l}) = 1 if p = q and j = l, and 0 otherwise. By the central limit theorem, Z is distributed approximately as a Gaussian random variable with mean 0 and variance (P-1)(n-1)/n^2 ≈ P/n for large n and P, with density function (2πP/n)^{-1/2} exp( -n x^2 / (2P) ).

Hence

    Prob[ Z > a ] = ∫_a^∞ (2πP/n)^{-1/2} exp( -n x^2 / (2P) ) dx,

and, since the density function of Z is symmetric,

    φ ≥ Prob[ Z > -1 ] = 1 - Prob[ Z > 1 ] = 1 - ∫_1^∞ (2πP/n)^{-1/2} exp( -n x^2 / (2P) ) dx.

If (n/P) is large,

    φ ≈ 1 - sqrt( P / (2πn) ) exp( -n / (2P) ).

The same probability expression φ applies to each bit of each pattern. Therefore, if P ≪ n, the probability of correct retrieval of all bits of all stored patterns is given by

    ρ = (φ)^{nP} ≈ ( 1 - sqrt(P/(2πn)) e^{-n/(2P)} )^{nP} ≈ 1 - nP sqrt(P/(2πn)) e^{-n/(2P)}.

If P/n = 1/(2β ln n) for some β > 0, then

    exp( -n/(2P) ) = exp( -β ln n ) = n^{-β},

which converges to zero as n → ∞, so that

    nP sqrt( P/(2πn) ) exp( -n/(2P) ) ≤ n^{2-β} / ( (2β ln n)^{3/2} sqrt(2π) ),

which converges to zero if β ≥ 2, i.e., if P ≤ n/(4 ln n). A better bound can be obtained by considering the correction of errors during the iterative evaluations.

* :::::::::::::::::::::::::::::::::::::::::::::::::: *

Hopfield network dynamics are not deterministic: the node to be updated at any given time is chosen randomly. Different sequences of choices of nodes to be updated may lead to different stable states.

Stochastic version (of the Hopfield network): The output of node l is +1 with probability 1/(1 + exp(-2 net_l)), for net node input net_l. Retrieval of stored patterns is then effective if P < 0.138 n.

* :::::::::::::::::::::::::::::::::::::::::::::::::: *
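The capacity statement can be probed empirically by storing random patterns with the correlation rule and counting how many bits a single update corrupts. The sketch below is a rough Monte Carlo illustration of my own (not from the text); the function name and parameters are arbitrary, and a one-step stability check is only a crude proxy for the theorem's exact-retrieval criterion.

import numpy as np

def fraction_stable_bits(n, P, trials=5, seed=0):
    """Store P random bipolar patterns in an n-node Hopfield net and report the
    fraction of bits left unchanged by one synchronous update of every node."""
    rng = np.random.default_rng(seed)
    ok = 0.0
    for _ in range(trials):
        patterns = rng.choice([-1, 1], size=(P, n))
        W = (patterns.T @ patterns) / n
        np.fill_diagonal(W, 0.0)
        recalled = np.where(patterns @ W >= 0, 1, -1)   # W is symmetric
        ok += np.mean(recalled == patterns)
    return ok / trials

for P in (5, 10, 20, 40):
    print(P, round(fraction_stable_bits(n=200, P=P), 4))
# Bit errors become noticeable once P grows well past n/(4 ln n), which is about 9 for n = 200.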

Continuous Hopfield networks:

1. Node outputs range over a continuous interval.
2. Time is continuous: each node constantly examines its net input and updates its output.
3. Changes in node outputs must be gradual over time.

For ease of implementation, assurance of convergence, and biological plausibility, we assume each node's output lies in [-1, 1], with the modified node update rule

    dx_l(t)/dt = 0                                    if x_l = 1 and f( Σ_j w_{j,l} x_j(t) + I_l ) > 0,
    dx_l(t)/dt = 0                                    if x_l = -1 and f( Σ_j w_{j,l} x_j(t) + I_l ) < 0,
    dx_l(t)/dt = f( Σ_j w_{j,l} x_j(t) + I_l )        otherwise.

The proof of convergence is similar to that for the discrete Hopfield model: an energy function with a lower bound is constructed, and it is shown that every change made in a node's output decreases the energy, assuming asynchronous dynamics. The energy function

    E = -(1/2) Σ_l Σ_{j≠l} w_{l,j} x_l(t) x_j(t) - Σ_l I_l x_l(t)

is minimized as x_1(t), ..., x_n(t) vary with time t. Given the weights and external inputs, E has a lower bound, since x_1(t), ..., x_n(t) have upper and lower bounds.

Since (1/2) w_{l,j} x_l x_j + (1/2) w_{j,l} x_j x_l = w_{l,j} x_l x_j for symmetric weights, we have

    ∂E/∂x_l(t) = - ( Σ_j w_{l,j} x_j + I_l ).        (6.3)

The Hopfield net update rule requires

    ∂x_l/∂t > 0   if   ( Σ_j w_{j,l} x_j + I_l ) > 0.        (6.4)

Equivalently, the update rule may be expressed in terms of changes occurring in the net input to a node, instead of the node output x_l. Whenever f is a monotonically increasing function (such as tanh) with f(0) = 0,

    f( Σ_j w_{j,l} x_j + I_l ) > 0   iff   ( Σ_j w_{j,l} x_j + I_l ) > 0.        (6.5)

From Equations 6.3, 6.4, and 6.5,

    ∂x_l/∂t > 0   ⟹   ( Σ_j w_{j,l} x_j + I_l ) > 0   ⟹   ∂E/∂x_l < 0,

and similarly ∂x_l/∂t < 0 implies ∂E/∂x_l > 0, so that each term of

    dE/dt = Σ_l (∂E/∂x_l) (∂x_l/∂t)

is non-positive, i.e., dE/dt < 0 whenever any node output is changing.

Computation terminates, since (a) each node update decreases the (lower-bounded) energy E, (b) the number of possible states is finite, (c) the number of possible node updates is limited, and (d) node output values are bounded. The continuous model generalizes the discrete model, but the size of the state space is larger and the energy function may have many more local minima.

* :::::::::::::::::::::::::::::::::::::::::::::::::: *

The Cohen-Grossberg Theorem gives sufficient conditions for a network to converge asymptotically to a stable state. Let u_l be the net input to the lth node, f_l its node function, a_j a rate-of-change term, and b_j a "loss" term.

Theorem 6.2: Let a_j(u_j) ≥ 0 and df_j(u_j)/du_j ≥ 0, where u_j is the net input to the jth node in a neural network with symmetric weights w_{j,l} = w_{l,j}, whose behavior is governed by the following differential equation:

    du_j/dt = a_j(u_j) [ b_j(u_j) - Σ_{l=1..n} w_{j,l} f_l(u_l) ],    for j = 1, ..., n.

Then there exists an energy function E for which dE/dt ≤ 0 whenever du_j/dt ≠ 0, i.e., the network dynamics lead to a stable state in which the energy ceases to change.

* :::::::::::::::::::::::::::::::::::::::::::::::::: *

6.3 Brain-State-in-a-Box (BSB) Network

The BSB network is similar to the Hopfield model, but all nodes are updated simultaneously. The node function used is the ramp function

    f(net) = min( 1, max( -1, net ) ),

which is bounded, continuous, and piecewise linear, as shown in Figure 6.7.

Figure 6.7: The ramp function, with output values in [-1, 1].

Node update rule: the initial activation is steadily amplified by positive feedback until saturation, |x_l| → 1:

    x_l(t+1) = f( Σ_{j=1..n} w_{l,j} x_j(t) );

the self-weight w_{l,l} may be fixed to equal 1.

The state of the network always remains inside an n-dimensional "box," giving rise to the name of the network, "Brain-State-in-a-Box." The network state steadily moves from an arbitrary point inside the box (A in Figure 6.8(a)) towards one side of the box (B in the figure), and then crawls along the side of the box to reach a corner of the box (C in the figure), a stored pattern.

Connections between nodes are Hebbian, representing correlations between node activations, and can be obtained by the non-iterative computation

    w_{l,j} = (1/P) Σ_{p=1..P} ( i_{p,l} i_{p,j} ),

where each i_{p,j} ∈ {-1, 1}.

Figure 6.8: State change trajectory (A → B → C) in a BSB: (a) the network, with 3 nodes; (b) the three stored patterns (1, 1, 1), (-1, -1, -1) and (1, -1, -1), indicated by darkened circles.

If the training procedure is iterative, training patterns are repeatedly presented to the network, and the weights are successively modified using the weight update rule

    Δw_{l,j} = μ i_{p,j} ( i_{p,l} - Σ_k w_{l,k} i_{p,k} ),    for p = 1, ..., P,

where μ > 0. This rule steadily reduces

    E = Σ_{p=1..P} Σ_{l=1..n} ( i_{p,l} - Σ_k w_{l,k} i_{p,k} )^2

and is applied repeatedly for each pattern until E ≈ 0. When training is completed, we expect Σ_p Δw_{l,j} = 0, implying that

    Σ_p i_{p,j} ( i_{p,l} - Σ_k w_{l,k} i_{p,k} ) = 0,

i.e.,

    Σ_p ( i_{p,j} i_{p,l} ) = Σ_p i_{p,j} ( Σ_k w_{l,k} i_{p,k} ),

an equality satisfied when

    i_{p,l} = Σ_k w_{l,k} i_{p,k}.

The trained network is hence "stable" for the trained patterns, i.e., presentation of a stored pattern does not result in any change in the network.

* :::::::::::::::::::::::::::::::::::::::::::::::::: *

Example 6.7: Let the training set contain the 3 patterns {(1, 1, 1), (-1, -1, -1), (1, -1, -1)}, as in Figure 6.8, with network connection weights

    w_{1,2} = w_{2,1} = (1 + 1 - 1)/3 = 1/3,
    w_{1,3} = w_{3,1} = (1 + 1 - 1)/3 = 1/3,
    w_{2,3} = w_{3,2} = (1 + 1 + 1)/3 = 1.

If the input pattern (0.5, 0.6, 0.1) is presented, the next network state is

    ( f(0.5 + 0.6/3 + 0.1/3), f(0.6 + 0.5/3 + 0.1), f(0.1 + 0.5/3 + 0.6) ) = (0.73, 0.87, 0.87),

where f is the ramp function described earlier. Note that the second and third nodes exhibit identical states, since w_{2,3} = 1. The very next network state is (1, 1, 1), a stored pattern.

If (0.5, 0.6, -0.7) is the input pattern, the network state changes to (0.47, 0.07, 0.07), then to (0.52, 0.30, 0.30), eventually converging to the stable memory (1, 1, 1). If the input pattern presented is (0, 0.5, -0.5), the network state changes instantly to (0, 0, 0), and does not change thereafter.

The network state may converge to a pattern which was not intended to be stored. E.g., if a 3-dimensional BSB network was intended to store only the two patterns {(1, 1, 1), (1, -1, -1)}, with weights

    w_{1,2} = w_{2,1} = (1 - 1)/2 = 0,
    w_{1,3} = w_{3,1} = (1 - 1)/2 = 0,
    w_{2,3} = w_{3,2} = (1 + 1)/2 = 1,

then (-1, -1, -1) is such a spurious attractor.

BSB computations steadily reduce -Σ_l Σ_j w_{l,j} x_l x_j when the weight matrix is symmetric and positive definite (i.e., all eigenvalues of W are positive). Stability and the number of spurious attractors tend to increase with the self-excitatory weights w_{j,j}. If the weight matrix is "diagonal-dominant," with

    w_{j,j} ≥ Σ_{l≠j} w_{l,j}   for each j ∈ {1, ..., n},

then every vertex of the "box" is a stable memory.

The hetero-associative version of the BSB network contains two layers of nodes; the connection from the jth node of the input layer to the lth node of the output layer carries the weight

    w_{l,j} = (1/P) Σ_{p=1..P} i_{p,j} d_{p,l}.

An example application is clustering radar pulses: distinguishing meaningful signals from noise in a radar surveillance environment where a detailed description of the signal sources is not known.
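The BSB trajectory of Example 6.7 can be reproduced with a short script. The sketch below is an illustrative implementation, not from the text; it assumes NumPy, synchronous ramp-function updates x(t+1) = f(W x(t)), and self-weights fixed at w_ll = 1, and it follows the second trajectory of the example.

import numpy as np

# Weights of Example 6.7, with self-weights fixed at 1.
W = np.array([[1.0, 1/3, 1/3],
              [1/3, 1.0, 1.0],
              [1/3, 1.0, 1.0]])

def ramp(v):
    """Piecewise-linear node function f(net) = min(1, max(-1, net))."""
    return np.clip(v, -1.0, 1.0)

x = np.array([0.5, 0.6, -0.7])
for _ in range(10):                   # all nodes are updated simultaneously
    x = ramp(W @ x)
    print(np.round(x, 2))
# Prints [0.47 0.07 0.07], then [0.51 0.29 0.29], ..., settling at the corner [1. 1. 1.].
# (The 0.52/0.30 values quoted in the example come from rounding the state at each step.)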

6.4 Hetero-associators

A hetero-associator maps input patterns to a different set of output patterns. Consider the task of translating English word inputs into Spanish word outputs. A simple feedforward network trained by backpropagation may be able to assert that a particular word supplied as input to the system is the 25th word in the dictionary. The desired output pattern may then be obtained by concatenating the feedforward network function

    f : R^n → {C_1, ..., C_k}

with a lookup-table mapping

    g : {C_1, ..., C_k} → {P_1, ..., P_k}

that associates each "address" C_i with a pattern P_i, as shown in Figure 6.9.

Two-layer networks with weights determined by Hebb's rule are much less computationally expensive. In a non-iterative model, the output-layer node activations are given by

    x^(2)_l(t+1) = f( Σ_j w_{l,j} x^(1)_j(t) + θ_l ).

Figure 6.9: Association between input and output patterns using a feedforward network and a simple memory: the feedforward network maps the input vector to one of the addresses C_1, ..., C_k (at most one C_i is "ON" for any input vector presentation), and the lookup table maps each C_i to the output pattern P_i.

For error correction, iterative models are more useful:

1. Compute the output node activations using the above (non-iterative) update rule, and then perform iterative auto-association within the output layer, leading to a stored output pattern.

2. Perform iterative auto-association within the input layer, resulting in a stored input pattern, which is then fed into the second layer of the hetero-associator network.

3. Use a Bidirectional Associative Memory (BAM) with no intra-layer connections, as in Figure 6.10:

    REPEAT
        (a) x^(2)_l(t+1) = f( Σ_j w_{l,j} x^(1)_j(t) + θ^(2)_l )
        (b) x^(1)_l(t+2) = f( Σ_j w_{j,l} x^(2)_j(t+1) + θ^(1)_l )
    UNTIL x^(1)_l(t+2) = x^(1)_l(t) and x^(2)_l(t+2) = x^(2)_l(t) for all l.

The weights can be chosen to be Hebbian (correlation) terms: w_{l,j} = c Σ_{p=1..P} i_{p,j} d_{p,l}. Sigmoid node functions can be used in continuous BAM models.

Figure 6.10: Bidirectional Associative Memory (BAM), with a first layer and a second layer of nodes and only inter-layer connections.

* :::::::::::::::::::::::::::::::::::::::::::::::::: *

Example 6.8: The goal is to establish the following three associations between 4-dimensional and 2-dimensional patterns:

    (+1, +1, -1, -1) → (+1, +1),
    (+1, +1, +1, +1) → (+1, -1),
    (-1, -1, +1, +1) → (-1, +1).

By the Hebbian rule (with c = 1/P = 1/3),

    w_{1,1} = ( Σ_{p=1..3} i_{p,1} d_{p,1} ) / 3 = 1.

Likewise, w_{1,2} = 1, and every other weight equals -1/3; e.g.,

    w_{1,3} = ( Σ_{p=1..3} i_{p,3} d_{p,1} ) / 3 = -1/3.

These weights constitute the weight matrix

    W = [[1, 1, -1/3, -1/3], [-1/3, -1/3, -1/3, -1/3]].

When an input vector such as i = (-1, -1, -1, -1) is presented at the first layer,

    x^(2)_1 = sgn( x^(1)_1 w_{1,1} + x^(1)_2 w_{1,2} + x^(1)_3 w_{1,3} + x^(1)_4 w_{1,4} ) = sgn( -1 - 1 + 1/3 + 1/3 ) = -1,

and

    x^(2)_2 = sgn( x^(1)_1 w_{2,1} + x^(1)_2 w_{2,2} + x^(1)_3 w_{2,3} + x^(1)_4 w_{2,4} ) = sgn( 1/3 + 1/3 + 1/3 + 1/3 ) = 1.

The resulting vector is (-1, 1), one of the stored 2-dimensional patterns. The corresponding first-layer pattern generated is (-1, -1, 1, 1), following computations such as

    x^(1)_1 = sgn( x^(2)_1 w_{1,1} + x^(2)_2 w_{2,1} ) = sgn( -1 - 1/3 ) = -1.

No further changes occur.

* :::::::::::::::::::::::::::::::::::::::::::::::::: *
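The bidirectional recall in Example 6.8 amounts to two alternating matrix-vector products. The sketch below is an illustrative NumPy implementation, not from the text; it uses the weight matrix derived above and takes sgn(0) = +1.

import numpy as np

# Hebbian BAM weights for Example 6.8: w_lj = (1/3) * sum_p i_pj * d_pl.
I = np.array([[ 1,  1, -1, -1],
              [ 1,  1,  1,  1],
              [-1, -1,  1,  1]])          # 4-dimensional input patterns (rows)
D = np.array([[ 1,  1],
              [ 1, -1],
              [-1,  1]])                  # associated 2-dimensional outputs (rows)
W = (D.T @ I) / 3                         # shape (2, 4)

sgn = lambda v: np.where(v >= 0, 1, -1)

x1 = np.array([-1, -1, -1, -1])           # noisy first-layer pattern
for _ in range(3):                        # alternate layer-2 and layer-1 updates
    x2 = sgn(W @ x1)                      # -> [-1  1]
    x1 = sgn(W.T @ x2)                    # -> [-1 -1  1  1]
print(x1, x2)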

The "additive" variant of the BAM separates out the effect of the previous activation value and of the external input I_l for the node under consideration, using the following state change rule:

    x^(2)_l(t+1) = a_l x^(2)_l(t) + b_l I_l + f( Σ_{j≠l} w_{l,j} x^(1)_j(t) ),

where a_l and b_l are frequently chosen from {0, 1}. If the BAM is discrete, bivalent, as well as additive, then

    x^(2)_i(t+1) = 1 if [ a_i x^(2)_i(t) + b_i I_i + Σ_{j≠i} w_{i,j} x^(1)_j(t) ] > θ_i, and 0 otherwise,

where θ_i is the threshold for the ith node. A similar expression is used for the first-layer updates.

BAM models have been shown to converge using a Lyapunov function such as

    L_1 = - Σ_l Σ_j x^(1)_l(t) x^(2)_j(t) w_{l,j}.

This energy function can be modified to take into account the external node inputs I_{k,l} as well as the thresholds θ^(k)_l of the nodes:

    L_2 = L_1 - Σ_{k=1..2} Σ_{l=1..N(k)} x^(k)_l ( I_{k,l} - θ^(k)_l ),

where N(k) denotes the number of nodes in the kth layer.

As in Hopfield networks, there is no guarantee that the system stabilizes to a desired output pattern. Stability is assured if the nodes of one layer are unchanged while the nodes of the other layer are being updated. All nodes in a layer may change state simultaneously, allowing greater parallelism than in Hopfield networks. When a new input pattern is presented, the rate at which the system stabilizes depends on the proximity of the new input pattern to a stored pattern, and not on the number of patterns stored. The number of patterns that can be stored in a BAM is limited by the network size.

* :::::::::::::::::::::::::::::::::::::::::::::::::: *

Importance Factor: The relative importance of different pattern pairs may be adjusted by attaching a "significance factor" γ_p to each pattern pair (i_p, d_p) used to modify the BAM weights:

    w_{j,i} = w_{i,j} = Σ_{p=1..P} γ_p i_{p,i} d_{p,j}.

Decay: Memory may change with time, allowing previously stored patterns to decay, by using a monotonically decreasing function for each importance factor:

    γ_p(t) = (1 - ε) γ_p(t-1)     or     γ_p(t) = max( γ_p(t-1) - ε, 0 ),

where 0 < ε ≪ 1 represents the "forgetting" rate. If (i_p, d_p) is added to the training set at time t, then

    w_{i,j}(t) = (1 - ε) w_{i,j}(t-1) + γ_p(t-1) i_{p,j} d_{p,i},

where (1 - ε) is the attenuation factor. Alternatively, the rate at which memory fades may be fixed to a global clock:

    w_{i,j}(t) = (1 - ε) w_{i,j}(t-1) + Σ_{p=1..π(t)} γ_p(t-1) i_{p,j} d_{p,i},

where π(t) is the number of new patterns being stored at time t. There may be many instants at which π(t) = 0, i.e., no new patterns are being stored, but existing memory continues to decay.

* :::::::::::::::::::::::::::::::::::::::::::::::::: *

6.5 Boltzmann Machines

The memory capacity of Hopfield models can be increased by introducing hidden nodes. A stochastic learning process is needed to allow the weights between hidden nodes and other ("visible") nodes to change to optimal values, beginning from randomly chosen values.

The principles of simulated annealing are invoked to minimize the energy function E defined earlier for Hopfield networks. A node is randomly chosen, and changes state with a probability that depends on (ΔE / T), where the temperature T > 0 is steadily lowered. The state change is accepted with probability 1 / (1 + exp(ΔE / T)). Annealing terminates when T ≈ 0. Many state changes occur at each temperature.

If such a system is allowed to reach equilibrium at any temperature, the ratio of the probabilities of two states a and b with energies E_a and E_b is given by

    P(a) / P(b) = exp( (E_b - E_a) / T ).

This is the Boltzmann distribution, and it does not depend on the initial state or on the path followed in reaching equilibrium.

Figure 6.11 describes the learning algorithm for hetero-association problems. For auto-association, no nodes are "clamped" in the second phase of the algorithm. The Boltzmann machine weight change rule conducts gradient descent on the relative entropy (cross-entropy)

    H(P, P') = Σ_s P_s ln( P_s / P'_s ),

where s ranges over all possible network states, P_s is the probability of network state s when the visible nodes are clamped, and P'_s is the probability of network state s when no nodes are clamped. H(P, P') compares the probability distributions P and P'; note that H(P, P') = 0 when P = P', the desired goal.

In using the Boltzmann machine, we cannot directly compute output node states from input node states, since the initial states of the hidden nodes are undetermined. Annealing the network states would result in a global optimum of the energy function, which may have nothing in common with the input pattern.

Algorithm Boltzmann;
  while weights continue to change and computational bounds are not exceeded, do
    Phase 1:
      for each training pattern, do
        Clamp all input and output nodes;
        ANNEAL, changing hidden node states;
        Update {..., p_{i,j}, ...}, the equilibrium probabilities with which nodes i, j have the same state;
      end-for;
    Phase 2:
      for each training pattern, do
        Clamp all input nodes;
        ANNEAL, changing hidden and output node states;
        Update {..., p'_{i,j}, ...}, the equilibrium probabilities with which nodes i, j have the same state;
      end-for;
    Increment each w_{i,j} by η (p_{i,j} - p'_{i,j});
  end-while.

Figure 6.11: The Boltzmann machine learning algorithm.
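The ANNEAL step above repeatedly proposes single-node state changes and accepts them with the stochastic rule described earlier. The sketch below is a simplified illustration of that inner loop only (my own code, not the book's algorithm); it assumes NumPy, bipolar node states, a symmetric weight matrix with zero diagonal, and a geometric cooling schedule, and it is not a full Boltzmann machine trainer.

import numpy as np

def anneal(W, I_ext, x, clamped, T0=10.0, Tmin=0.1, alpha=0.9, steps_per_T=50, seed=0):
    """Anneal node states x (values +/-1); indices listed in `clamped` are never changed."""
    rng = np.random.default_rng(seed)
    free = [k for k in range(len(x)) if k not in clamped]
    T = T0
    while T > Tmin:
        for _ in range(steps_per_T):
            k = rng.choice(free)                    # pick a free node at random
            dE = 2 * x[k] * (W[k] @ x + I_ext[k])   # energy change if node k flips (zero-diagonal W)
            if rng.random() < 1.0 / (1.0 + np.exp(dE / T)):
                x[k] = -x[k]                        # accept the state change
        T *= alpha                                  # lower the temperature
    return x

A full training loop would call anneal twice per pattern (Phase 1 with input and output nodes clamped, Phase 2 with only input nodes clamped) and accumulate the co-activation statistics p_{i,j} and p'_{i,j}.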

Since the input pattern may be corrupted, the best approach is to initially clamp the input nodes while annealing from a high temperature down to an intermediate temperature T_I. This leads the network towards a local minimum of the energy function near the input pattern. The visible nodes are then unclamped, and annealing continues from T_I to 0, allowing the visible node states also to be modified, correcting errors in the input pattern.

The cooling rate with which the temperature decreases must be extremely slow to assure convergence to global minima of E. Faster cooling rates are often used due to computational limitations. The BM learning algorithm is extremely slow: many observations have to be made at many temperatures before computation concludes.

* :::::::::::::::::::::::::::::::::::::::::::::::::: *

Mean field annealing improves on the speed of execution of the BM, using a "mean field" approximation in the weight change rule, e.g., approximating the weight update rule

    Δw_{l,j} = η ( p_{l,j} - p'_{l,j} )

(where p_{l,j} = E(x_l x_j) when the visible nodes are clamped, while p'_{l,j} = E(x_l x_j) when no nodes are clamped) by

    Δw_{l,j} = η ( q_l q_j - q'_l q'_j ),

where q_l is the average output of the lth node when the visible nodes are clamped, and q'_l the average output without clamping. For the Boltzmann distribution, the average output is

    q_l = tanh( Σ_j w_{l,j} x_j / T ).

The mean field approximation suggests replacing the random variable x_j by its expected value E(x_j), so that

    q_l = tanh( Σ_j w_{l,j} E(x_j) / T ).

These approximations improve the speed of execution of the Boltzmann machine, but convergence of the weight values to global optima is not assured.

* :::::::::::::::::::::::::::::::::::::::::::::::::: *
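The mean-field equations q_l = tanh(Σ_j w_{l,j} q_j / T) can be solved by simple fixed-point iteration, which is the deterministic step that replaces stochastic annealing. The following sketch is illustrative only (not from the text); it assumes NumPy, and the weight values are a hypothetical small example.

import numpy as np

def mean_field(W, T=1.0, iters=200, damping=0.5):
    """Iterate q <- (1-damping)*q + damping*tanh(W q / T) until the averages settle."""
    q = np.zeros(W.shape[0])              # start from unbiased average activations
    q[0] = 0.1                            # small perturbation to break the q = 0 fixed point
    for _ in range(iters):
        q = (1 - damping) * q + damping * np.tanh(W @ q / T)
    return q

W = np.array([[0.0, 1.0, 0.5],
              [1.0, 0.0, 0.5],
              [0.5, 0.5, 0.0]])           # a small symmetric weight matrix (example values)
print(np.round(mean_field(W, T=0.5), 3))  # mean activations at this temperature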

6.6 Conclusion

The biologically inspired Hebbian learning principle shows how to make connection weights represent the similarities and differences inherent in the various attributes or input dimensions of the available data. No extensive, slow "training" phase is required; the number of state changes executed before the network stabilizes is roughly proportional to the number of nodes.

Associative learning reinforces the magnitudes of connections between correlated nodes. Such networks can be used to respond to a corrupted input pattern with the correct output pattern. In auto-association, the input pattern space and output space are identical; these spaces are distinct in hetero-association tasks. Hetero-associative systems may be bidirectional, with a vector from either vector space being generated when a vector from the other vector space is presented. These tasks can be accomplished using one-shot, non-iterative procedures, as well as iterative mechanisms that repeatedly modify the weights as new samples are presented.

In differential Hebbian learning, changes in a weight w_{i,j} are caused by a change in the stimulation of the ith node by the jth node occurring at the time the output of the ith node changes; also, the weight change is governed by the sum of such changes over a period of time.

Not too many pattern associations can be stably stored in these networks. If few patterns are to be stored, perfect retrieval is possible even when the input stimulus is significantly noisy or corrupted. However, such networks often store spurious memories.


More information

= w 2. w 1. B j. A j. C + j1j2

= w 2. w 1. B j. A j. C + j1j2 Local Minima and Plateaus in Multilayer Neural Networks Kenji Fukumizu and Shun-ichi Amari Brain Science Institute, RIKEN Hirosawa 2-, Wako, Saitama 35-098, Japan E-mail: ffuku, amarig@brain.riken.go.jp

More information

Neural Networks DWML, /25

Neural Networks DWML, /25 DWML, 2007 /25 Neural networks: Biological and artificial Consider humans: Neuron switching time 0.00 second Number of neurons 0 0 Connections per neuron 0 4-0 5 Scene recognition time 0. sec 00 inference

More information

Lecture 4: Feed Forward Neural Networks

Lecture 4: Feed Forward Neural Networks Lecture 4: Feed Forward Neural Networks Dr. Roman V Belavkin Middlesex University BIS4435 Biological neurons and the brain A Model of A Single Neuron Neurons as data-driven models Neural Networks Training

More information

ARTIFICIAL NEURAL NETWORK PART I HANIEH BORHANAZAD

ARTIFICIAL NEURAL NETWORK PART I HANIEH BORHANAZAD ARTIFICIAL NEURAL NETWORK PART I HANIEH BORHANAZAD WHAT IS A NEURAL NETWORK? The simplest definition of a neural network, more properly referred to as an 'artificial' neural network (ANN), is provided

More information

Neural Network Training

Neural Network Training Neural Network Training Sargur Srihari Topics in Network Training 0. Neural network parameters Probabilistic problem formulation Specifying the activation and error functions for Regression Binary classification

More information

Basic Principles of Unsupervised and Unsupervised

Basic Principles of Unsupervised and Unsupervised Basic Principles of Unsupervised and Unsupervised Learning Toward Deep Learning Shun ichi Amari (RIKEN Brain Science Institute) collaborators: R. Karakida, M. Okada (U. Tokyo) Deep Learning Self Organization

More information

CSE 5526: Introduction to Neural Networks Hopfield Network for Associative Memory

CSE 5526: Introduction to Neural Networks Hopfield Network for Associative Memory CSE 5526: Introduction to Neural Networks Hopfield Network for Associative Memory Part VII 1 The basic task Store a set of fundamental memories {ξξ 1, ξξ 2,, ξξ MM } so that, when presented a new pattern

More information

Neural networks. Chapter 20, Section 5 1

Neural networks. Chapter 20, Section 5 1 Neural networks Chapter 20, Section 5 Chapter 20, Section 5 Outline Brains Neural networks Perceptrons Multilayer perceptrons Applications of neural networks Chapter 20, Section 5 2 Brains 0 neurons of

More information

Associative Memory : Soft Computing Course Lecture 21 24, notes, slides RC Chakraborty, Aug.

Associative Memory : Soft Computing Course Lecture 21 24, notes, slides   RC Chakraborty,  Aug. Associative Memory : Soft Computing Course Lecture 21 24, notes, slides www.myreaders.info/, RC Chakraborty, e-mail rcchak@gmail.com, Aug. 10, 2010 http://www.myreaders.info/html/soft_computing.html www.myreaders.info

More information

C4 Phenomenological Modeling - Regression & Neural Networks : Computational Modeling and Simulation Instructor: Linwei Wang

C4 Phenomenological Modeling - Regression & Neural Networks : Computational Modeling and Simulation Instructor: Linwei Wang C4 Phenomenological Modeling - Regression & Neural Networks 4040-849-03: Computational Modeling and Simulation Instructor: Linwei Wang Recall.. The simple, multiple linear regression function ŷ(x) = a

More information

3.3 Discrete Hopfield Net An iterative autoassociative net similar to the nets described in the previous sections has been developed by Hopfield

3.3 Discrete Hopfield Net An iterative autoassociative net similar to the nets described in the previous sections has been developed by Hopfield 3.3 Discrete Hopfield Net An iterative autoassociative net similar to the nets described in the previous sections has been developed by Hopfield (1982, 1984). - The net is a fully interconnected neural

More information

7.1 Basis for Boltzmann machine. 7. Boltzmann machines

7.1 Basis for Boltzmann machine. 7. Boltzmann machines 7. Boltzmann machines this section we will become acquainted with classical Boltzmann machines which can be seen obsolete being rarely applied in neurocomputing. It is interesting, after all, because is

More information

Learning in State-Space Reinforcement Learning CIS 32

Learning in State-Space Reinforcement Learning CIS 32 Learning in State-Space Reinforcement Learning CIS 32 Functionalia Syllabus Updated: MIDTERM and REVIEW moved up one day. MIDTERM: Everything through Evolutionary Agents. HW 2 Out - DUE Sunday before the

More information

Deep Learning. What Is Deep Learning? The Rise of Deep Learning. Long History (in Hind Sight)

Deep Learning. What Is Deep Learning? The Rise of Deep Learning. Long History (in Hind Sight) CSCE 636 Neural Networks Instructor: Yoonsuck Choe Deep Learning What Is Deep Learning? Learning higher level abstractions/representations from data. Motivation: how the brain represents sensory information

More information

CS:4420 Artificial Intelligence

CS:4420 Artificial Intelligence CS:4420 Artificial Intelligence Spring 2018 Neural Networks Cesare Tinelli The University of Iowa Copyright 2004 18, Cesare Tinelli and Stuart Russell a a These notes were originally developed by Stuart

More information

Artificial Neural Networks Examination, June 2004

Artificial Neural Networks Examination, June 2004 Artificial Neural Networks Examination, June 2004 Instructions There are SIXTY questions (worth up to 60 marks). The exam mark (maximum 60) will be added to the mark obtained in the laborations (maximum

More information

The error-backpropagation algorithm is one of the most important and widely used (and some would say wildly used) learning techniques for neural

The error-backpropagation algorithm is one of the most important and widely used (and some would say wildly used) learning techniques for neural 1 2 The error-backpropagation algorithm is one of the most important and widely used (and some would say wildly used) learning techniques for neural networks. First we will look at the algorithm itself

More information

Lecture 4: Perceptrons and Multilayer Perceptrons

Lecture 4: Perceptrons and Multilayer Perceptrons Lecture 4: Perceptrons and Multilayer Perceptrons Cognitive Systems II - Machine Learning SS 2005 Part I: Basic Approaches of Concept Learning Perceptrons, Artificial Neuronal Networks Lecture 4: Perceptrons

More information

Artificial Neural Networks. MGS Lecture 2

Artificial Neural Networks. MGS Lecture 2 Artificial Neural Networks MGS 2018 - Lecture 2 OVERVIEW Biological Neural Networks Cell Topology: Input, Output, and Hidden Layers Functional description Cost functions Training ANNs Back-Propagation

More information

CSC 578 Neural Networks and Deep Learning

CSC 578 Neural Networks and Deep Learning CSC 578 Neural Networks and Deep Learning Fall 2018/19 3. Improving Neural Networks (Some figures adapted from NNDL book) 1 Various Approaches to Improve Neural Networks 1. Cost functions Quadratic Cross

More information

Week 4: Hopfield Network

Week 4: Hopfield Network Week 4: Hopfield Network Phong Le, Willem Zuidema November 20, 2013 Last week we studied multi-layer perceptron, a neural network in which information is only allowed to transmit in one direction (from

More information

T sg. α c (0)= T=1/β. α c (T ) α=p/n

T sg. α c (0)= T=1/β. α c (T ) α=p/n Taejon, Korea, vol. 2 of 2, pp. 779{784, Nov. 2. Capacity Analysis of Bidirectional Associative Memory Toshiyuki Tanaka y, Shinsuke Kakiya y, and Yoshiyuki Kabashima z ygraduate School of Engineering,

More information

Boxlets: a Fast Convolution Algorithm for. Signal Processing and Neural Networks. Patrice Y. Simard, Leon Bottou, Patrick Haner and Yann LeCun

Boxlets: a Fast Convolution Algorithm for. Signal Processing and Neural Networks. Patrice Y. Simard, Leon Bottou, Patrick Haner and Yann LeCun Boxlets: a Fast Convolution Algorithm for Signal Processing and Neural Networks Patrice Y. Simard, Leon Bottou, Patrick Haner and Yann LeCun AT&T Labs-Research 100 Schultz Drive, Red Bank, NJ 07701-7033

More information

Artificial Neural Networks

Artificial Neural Networks 0 Artificial Neural Networks Based on Machine Learning, T Mitchell, McGRAW Hill, 1997, ch 4 Acknowledgement: The present slides are an adaptation of slides drawn by T Mitchell PLAN 1 Introduction Connectionist

More information

Artifical Neural Networks

Artifical Neural Networks Neural Networks Artifical Neural Networks Neural Networks Biological Neural Networks.................................. Artificial Neural Networks................................... 3 ANN Structure...........................................

More information

Ecient Higher-order Neural Networks. for Classication and Function Approximation. Joydeep Ghosh and Yoan Shin. The University of Texas at Austin

Ecient Higher-order Neural Networks. for Classication and Function Approximation. Joydeep Ghosh and Yoan Shin. The University of Texas at Austin Ecient Higher-order Neural Networks for Classication and Function Approximation Joydeep Ghosh and Yoan Shin Department of Electrical and Computer Engineering The University of Texas at Austin Austin, TX

More information

1 Solutions to selected problems

1 Solutions to selected problems Solutions to selected problems Section., #a,c,d. a. p x = n for i = n : 0 p x = xp x + i end b. z = x, y = x for i = : n y = y + x i z = zy end c. y = (t x ), p t = a for i = : n y = y(t x i ) p t = p

More information

4. Multilayer Perceptrons

4. Multilayer Perceptrons 4. Multilayer Perceptrons This is a supervised error-correction learning algorithm. 1 4.1 Introduction A multilayer feedforward network consists of an input layer, one or more hidden layers, and an output

More information

Introduction to Convolutional Neural Networks (CNNs)

Introduction to Convolutional Neural Networks (CNNs) Introduction to Convolutional Neural Networks (CNNs) nojunk@snu.ac.kr http://mipal.snu.ac.kr Department of Transdisciplinary Studies Seoul National University, Korea Jan. 2016 Many slides are from Fei-Fei

More information

Machine Learning. Neural Networks. (slides from Domingos, Pardo, others)

Machine Learning. Neural Networks. (slides from Domingos, Pardo, others) Machine Learning Neural Networks (slides from Domingos, Pardo, others) For this week, Reading Chapter 4: Neural Networks (Mitchell, 1997) See Canvas For subsequent weeks: Scaling Learning Algorithms toward

More information

1 R.V k V k 1 / I.k/ here; we ll stimulate the action potential another way.) Note that this further simplifies to. m 3 k h k.

1 R.V k V k 1 / I.k/ here; we ll stimulate the action potential another way.) Note that this further simplifies to. m 3 k h k. 1. The goal of this problem is to simulate a propagating action potential for the Hodgkin-Huxley model and to determine the propagation speed. From the class notes, the discrete version (i.e., after breaking

More information

Primal vector is primal infeasible till end. So when primal feasibility attained, the pair becomes opt. & method terminates. 3. Two main steps carried

Primal vector is primal infeasible till end. So when primal feasibility attained, the pair becomes opt. & method terminates. 3. Two main steps carried 4.1 Primal-Dual Algorithms Katta G. Murty, IOE 612 Lecture slides 4 Here we discuss special min cost ow problems on bipartite networks, the assignment and transportation problems. Algorithms based on an

More information

Shigetaka Fujita. Rokkodai, Nada, Kobe 657, Japan. Haruhiko Nishimura. Yashiro-cho, Kato-gun, Hyogo , Japan. Abstract

Shigetaka Fujita. Rokkodai, Nada, Kobe 657, Japan. Haruhiko Nishimura. Yashiro-cho, Kato-gun, Hyogo , Japan. Abstract KOBE-TH-94-07 HUIS-94-03 November 1994 An Evolutionary Approach to Associative Memory in Recurrent Neural Networks Shigetaka Fujita Graduate School of Science and Technology Kobe University Rokkodai, Nada,

More information

Back-Propagation Algorithm. Perceptron Gradient Descent Multilayered neural network Back-Propagation More on Back-Propagation Examples

Back-Propagation Algorithm. Perceptron Gradient Descent Multilayered neural network Back-Propagation More on Back-Propagation Examples Back-Propagation Algorithm Perceptron Gradient Descent Multilayered neural network Back-Propagation More on Back-Propagation Examples 1 Inner-product net =< w, x >= w x cos(θ) net = n i=1 w i x i A measure

More information

Optimization of Quadratic Forms: NP Hard Problems : Neural Networks

Optimization of Quadratic Forms: NP Hard Problems : Neural Networks 1 Optimization of Quadratic Forms: NP Hard Problems : Neural Networks Garimella Rama Murthy, Associate Professor, International Institute of Information Technology, Gachibowli, HYDERABAD, AP, INDIA ABSTRACT

More information

22c145-Fall 01: Neural Networks. Neural Networks. Readings: Chapter 19 of Russell & Norvig. Cesare Tinelli 1

22c145-Fall 01: Neural Networks. Neural Networks. Readings: Chapter 19 of Russell & Norvig. Cesare Tinelli 1 Neural Networks Readings: Chapter 19 of Russell & Norvig. Cesare Tinelli 1 Brains as Computational Devices Brains advantages with respect to digital computers: Massively parallel Fault-tolerant Reliable

More information

Artificial Neural Networks D B M G. Data Base and Data Mining Group of Politecnico di Torino. Elena Baralis. Politecnico di Torino

Artificial Neural Networks D B M G. Data Base and Data Mining Group of Politecnico di Torino. Elena Baralis. Politecnico di Torino Artificial Neural Networks Data Base and Data Mining Group of Politecnico di Torino Elena Baralis Politecnico di Torino Artificial Neural Networks Inspired to the structure of the human brain Neurons as

More information

Neural Networks. Mark van Rossum. January 15, School of Informatics, University of Edinburgh 1 / 28

Neural Networks. Mark van Rossum. January 15, School of Informatics, University of Edinburgh 1 / 28 1 / 28 Neural Networks Mark van Rossum School of Informatics, University of Edinburgh January 15, 2018 2 / 28 Goals: Understand how (recurrent) networks behave Find a way to teach networks to do a certain

More information

( ) T. Reading. Lecture 22. Definition of Covariance. Imprinting Multiple Patterns. Characteristics of Hopfield Memory

( ) T. Reading. Lecture 22. Definition of Covariance. Imprinting Multiple Patterns. Characteristics of Hopfield Memory Part 3: Autonomous Agents /8/07 Reading Lecture 22 Flake, ch. 20 ( Genetics and Evolution ) /8/07 /8/07 2 Imprinting Multiple Patterns Let x, x 2,, x p be patterns to be imprinted Define the sum-of-outer-products

More information

Gradient Descent Methods

Gradient Descent Methods Lab 18 Gradient Descent Methods Lab Objective: Many optimization methods fall under the umbrella of descent algorithms. The idea is to choose an initial guess, identify a direction from this point along

More information

Neural Network Essentials 1

Neural Network Essentials 1 Neural Network Essentials Draft: Associative Memory Chapter Anthony S. Maida University of Louisiana at Lafayette, USA June 6, 205 Copyright c 2000 2003 by Anthony S. Maida Contents 0. Associative memory.............................

More information

Lecture 6. Notes on Linear Algebra. Perceptron

Lecture 6. Notes on Linear Algebra. Perceptron Lecture 6. Notes on Linear Algebra. Perceptron COMP90051 Statistical Machine Learning Semester 2, 2017 Lecturer: Andrey Kan Copyright: University of Melbourne This lecture Notes on linear algebra Vectors

More information

p(z)

p(z) Chapter Statistics. Introduction This lecture is a quick review of basic statistical concepts; probabilities, mean, variance, covariance, correlation, linear regression, probability density functions and

More information