Hebbian Learning II. Robert Jacobs, Department of Brain & Cognitive Sciences, University of Rochester. July 20, 2017

Goals: Teach about one-half of an undergraduate course on Linear Algebra. Understand when supervised Hebbian learning works perfectly (and when it does not). Pattern completion: supervised Hebbian learning finds a weight matrix such that the input vectors are the eigenvectors of this weight matrix.

Vector: (age, height, weight). Joe = (37, 72, 175), Mary = (10, 30, 61). Vectors have both a length (magnitude) and a direction.

Graphical representation for the vector Mary:

Multiplication of a Vector by a Scalar: 2 (2, 1)^T = (4, 2)^T. Scalar multiplication corresponds to lengthening or shortening a vector (while leaving it pointing in the same direction).

Addition of Vectors: (1, 2, 1)^T + (2, 1, 3)^T = (3, 3, 4)^T

Linear Combination of Vectors: u = c_1 v_1 + c_2 v_2. The set of all linear combinations of the {v_i} is called the set spanned by the {v_i}.

The three vectors (1, 0, 0)^T, (0, 1, 0)^T, (0, 0, 1)^T span all of three-dimensional space, since any vector u = (a, b, c)^T can be written as a linear combination: u = a (1, 0, 0)^T + b (0, 1, 0)^T + c (0, 0, 1)^T. In general, n (linearly independent) vectors suffice to span n-dimensional space.

Linear Independence: If none of the vectors in a set can be written as a linear combination of the others, then the set is called linearly independent. An n-dimensional space is the set of vectors spanned by a set of n linearly independent vectors. The n vectors are referred to as a basis for the space.

(1, 1)^T and (2, 2)^T are collinear and, thus, linearly dependent. They span only a 1-dimensional space. (1, 1)^T and (2, 1)^T are linearly independent (and, thus, span a 2-dimensional space).

(1, 1)^T, (2, 1)^T, (-1, 3)^T are linearly dependent (v_3 = 7 v_1 - 4 v_2). (1, 2, 0)^T, (3, 2, 0)^T, (9, 10, 0)^T are linearly dependent, and they do not span three-dimensional space (no vector with a non-zero third component can be generated from this set).

For a given n-dimensional space, there are an infinite number of bases for that space. Every vector has a different representation (i.e., set of coordinates) for each basis. Which basis is best?

Inner (Dot) Product of Two Vectors: v = (3, -1, 2)^T, w = (1, 2, 1)^T. v . w = Σ_i v_i w_i = (3)(1) + (-1)(2) + (2)(1) = 3

Length of a Vector: The length of a vector (denoted ||v||) is the square root of the inner product of the vector with itself. Let v = (3, -1, 2)^T. Then ||v|| = (v . v)^(1/2) = (3^2 + (-1)^2 + 2^2)^(1/2) = sqrt(14)

Follows from the Pythagorean Theorem:

Angle Between Two Vectors: cos θ = (v . w) / (||v|| ||w||) = Σ_i v_i w_i / [(Σ_i v_i^2)^(1/2) (Σ_i w_i^2)^(1/2)]. This is roughly a measure of similarity between two vectors: If v and w are random variables (so v and w are vectors of values for these variables) with zero mean, then this formula is their correlation. If the inner product is zero, then cos θ = 0 (meaning that the two vectors are orthogonal).
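
To make the formulas above concrete, here is a minimal numpy sketch (not part of the original slides) that computes the inner product, the lengths, and the cosine of the angle for the example vectors v = (3, -1, 2) and w = (1, 2, 1) used above.

```python
import numpy as np

# Example vectors from the slides above.
v = np.array([3.0, -1.0, 2.0])
w = np.array([1.0, 2.0, 1.0])

dot = np.dot(v, w)                # inner product: (3)(1) + (-1)(2) + (2)(1) = 3
length_v = np.sqrt(np.dot(v, v))  # ||v|| = sqrt(14)
length_w = np.linalg.norm(w)      # ||w|| = sqrt(6)

cos_theta = dot / (length_v * length_w)  # cosine of the angle between v and w
print(dot, length_v, length_w, cos_theta)
```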

Projection of One Vector Onto Another Vector

Let x be the projection of v onto w (a number, not a vector): x = ||v|| cos θ = (v . w) / ||w||. If ||w|| = 1, then x = v . w

Example: Linear Neural Network with One Output Unit. [Diagram: inputs x_1, x_2, x_3 connect to the output y through weights w_1, w_2, w_3.] The output, y = w_1 x_1 + w_2 x_2 + w_3 x_3 = w . x, gives an indication of how close the input x is to the weight vector w: If y > 0, then x is similar to w. If y = 0, then x is orthogonal to w. If y < 0, then x is dissimilar to w.
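
A minimal sketch of the one-output linear unit described above; the weight and input values here are made up for illustration.

```python
import numpy as np

w = np.array([0.5, -0.3, 0.8])  # hypothetical weight vector
x = np.array([1.0, 0.0, 2.0])   # hypothetical input vector

y = np.dot(w, x)                # y = w_1 x_1 + w_2 x_2 + w_3 x_3 = w . x
if y > 0:
    print("x is similar to w, y =", y)
elif y == 0:
    print("x is orthogonal to w")
else:
    print("x is dissimilar to w, y =", y)
```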

Matrix: An array of numbers. For example: W = [3 4 5; 1 0 1]

Multiplication of a Matrix and a Vector: u = W v. Matrix W maps from one space of vectors (v) to a new space of vectors (u). In general, vectors v and u may have different dimensionalities.

Multiplication of a Matrix and a Vector: W = [3 4 5; 1 0 1], v = (1, 0, 2)^T. u = W v = ((3)(1) + (4)(0) + (5)(2), (1)(1) + (0)(0) + (1)(2))^T = (13, 3)^T

Multiplication of a Matrix and a Vector. The following are equivalent: (1) form the inner product of each row of the matrix with the vector; (2) u = W v is a linear combination of the columns of W, where the coefficients are the components of v.
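
The two equivalent views can be checked numerically. A small sketch (assuming numpy) using the W and v from the example above:

```python
import numpy as np

W = np.array([[3.0, 4.0, 5.0],
              [1.0, 0.0, 1.0]])
v = np.array([1.0, 0.0, 2.0])

# View 1: inner product of each row of W with v.
u_rows = np.array([np.dot(W[0], v), np.dot(W[1], v)])

# View 2: linear combination of the columns of W,
# with coefficients given by the components of v.
u_cols = v[0] * W[:, 0] + v[1] * W[:, 1] + v[2] * W[:, 2]

print(u_rows, u_cols, W @ v)  # all three equal [13, 3]
```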

Neural Network with Multiple Input and Multiple Output Units. [Diagram: inputs x_1, x_2, x_3 connect to outputs y_1, y_2 through weights w_11, ..., w_23.] (y_1, y_2)^T = [w_11 w_12 w_13; w_21 w_22 w_23] (x_1, x_2, x_3)^T, i.e., y = W x

Linearity: A function is said to be linear if: f(c x) = c f(x) and f(x_1 + x_2) = f(x_1) + f(x_2)

Implication: If we know how a system responds to the basis of a space, then we can easily compute how it responds to all vectors in that space. Let {v_i} be a basis for a space, and let v be an arbitrary vector in this space. Then: W v = W (c_1 v_1 + c_2 v_2 + ... + c_n v_n) = c_1 W v_1 + c_2 W v_2 + ... + c_n W v_n
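
A short sketch illustrating this implication: the response to an arbitrary vector equals the weighted sum of the responses to the basis vectors. The matrix and the coordinates are made-up values, and the standard basis is used for simplicity.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 3))          # hypothetical matrix (any linear system)

basis = [np.array([1.0, 0.0, 0.0]),  # standard basis for three-space
         np.array([0.0, 1.0, 0.0]),
         np.array([0.0, 0.0, 1.0])]
c = [2.0, -1.0, 0.5]                 # made-up coordinates of an arbitrary vector

v = sum(ci * vi for ci, vi in zip(c, basis))
direct = W @ v                                             # respond to v directly
combined = sum(ci * (W @ vi) for ci, vi in zip(c, basis))  # combine responses to the basis
print(np.allclose(direct, combined))                       # True
```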

Eigenvectors and Eigenvalues. Limit our attention to square matrices (i.e., v and u have the same dimensionality). In general, multiplication by a matrix changes both a vector's direction and length. However, there are some vectors that will change only in length, not direction. For these vectors, multiplication by the matrix is no different than multiplication by a scalar: W v = λ v, where λ is a scalar. Such vectors are called eigenvectors, and the scalar λ is called an eigenvalue.

[4 -1; 2 1] (1, 2)^T = 2 (1, 2)^T. Each vector that is collinear with an eigenvector is itself an eigenvector: [4 -1; 2 1] (2, 4)^T = 2 (2, 4)^T. We will reserve the term eigenvector only for vectors of length 1.

An n x n matrix can have up to (but no more than) n distinct eigenvalues. If it has n distinct eigenvalues, then the n associated eigenvectors are linearly independent. Thus, these eigenvectors form a basis.
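
A small sketch (assuming numpy) that verifies the eigenvector example above and confirms that the 2 x 2 matrix has two distinct eigenvalues:

```python
import numpy as np

W = np.array([[4.0, -1.0],
              [2.0,  1.0]])

v = np.array([1.0, 2.0])
v = v / np.linalg.norm(v)   # unit-length version of the eigenvector (1, 2)
print(W @ v, 2.0 * v)       # W v equals 2 v

eigvals, eigvecs = np.linalg.eig(W)
print(eigvals)              # two distinct eigenvalues: 2 and 3
print(eigvecs)              # columns are the associated unit-length eigenvectors
```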

Let {v_i} be linearly independent eigenvectors of matrix W, and let v be an arbitrary vector. Then: u = W v = W (c_1 v_1 + ... + c_n v_n) = c_1 W v_1 + ... + c_n W v_n = c_1 λ_1 v_1 + ... + c_n λ_n v_n. There are no matrices in this last equation, just a simple linear combination of eigenvectors.

Eigenvectors and eigenvalues reveal the directions in which matrix multiplication stretches and shrinks a space (i.e., they reveal which input vectors a system gives small and large responses to). [Diagram: v is mapped by W to u.] The power method finds the eigenvector with the largest (in magnitude) eigenvalue of a matrix.
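
The slide only names the power method; the following is a minimal sketch of one common version (repeated multiplication by W followed by renormalization), not necessarily the exact procedure from the course.

```python
import numpy as np

def power_method(W, n_iters=100):
    """Estimate the eigenvector of W whose eigenvalue has the largest magnitude."""
    v = np.random.default_rng(0).normal(size=W.shape[0])
    for _ in range(n_iters):
        v = W @ v                  # repeatedly multiply by W ...
        v = v / np.linalg.norm(v)  # ... and rescale back to unit length
    eigenvalue = v @ W @ v         # Rayleigh-quotient estimate of the eigenvalue
    return v, eigenvalue

W = np.array([[4.0, -1.0],
              [2.0,  1.0]])
v, lam = power_method(W)
print(v, lam)  # converges to the eigenvector with eigenvalue 3
```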

Transpose: Turn a column vector into a row vector. For example, we can re-write an inner product as follows: w . v = w^T v = [w_1 w_2 w_3] (v_1, v_2, v_3)^T = (w_1 v_1) + (w_2 v_2) + (w_3 v_3)

Outer Product: w v^T = (w_1, w_2, w_3)^T [v_1 v_2 v_3] = [w_1 v_1  w_1 v_2  w_1 v_3; w_2 v_1  w_2 v_2  w_2 v_3; w_3 v_1  w_3 v_2  w_3 v_3]. If w and v are random variables (the components of w and v are values of these variables) with zero means, and if w = v, then this is a covariance matrix.
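
A small numpy sketch of the outer product, plus a covariance matrix formed as an average of outer products of zero-mean vectors; the data here are randomly generated for illustration.

```python
import numpy as np

w = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])
print(np.outer(w, v))  # 3 x 3 matrix whose (i, j) entry is w_i * v_j

# Covariance as an average of outer products (randomly generated data).
rng = np.random.default_rng(0)
samples = rng.normal(size=(1000, 3))      # rows are observations of a 3-d variable
samples = samples - samples.mean(axis=0)  # make each component zero mean
cov = sum(np.outer(s, s) for s in samples) / len(samples)
print(np.allclose(cov, np.cov(samples.T, bias=True)))  # True
```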

Using Linear Algebra to Study Supervised Hebbian Learning

Neural Network With One Output Unit: One Input-Output Pattern. One input-output pattern: x -> y*. Assume ||x|| = 1. If we choose w = x, then y = w^T x = x^T x = 1. But we want y to equal y*. So let w = y* x. Then w^T x = (y* x)^T x = y* (x^T x) = y*

The problem of finding w corresponds to finding a vector whose projection onto x is y*. [Diagram: candidate weight vectors whose projections onto x equal y*.] There are an infinite number of solutions. On the previous slide, we made the simple choice of the vector that points in the same direction as x.

Neural Network With Multiple Output Units: One Input-Output Pattern. One input-output pattern: x -> y*. Assume ||x|| = 1. Let W = y* x^T. Then y = W x = (y* x^T) x = y* (x^T x) = y*
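
A minimal sketch of this one-pattern construction, with a made-up unit-length input x and target vector y*:

```python
import numpy as np

x = np.array([0.6, 0.8, 0.0])   # made-up unit-length input (||x|| = 1)
y_star = np.array([2.0, -1.0])  # made-up target output vector y*

W = np.outer(y_star, x)         # supervised Hebbian weights: W = y* x^T
print(W @ x)                    # recovers y* exactly, because x^T x = 1
```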

Example With Multiple Input-Output Patterns: x_1 = (0.577, 0.577, 0.577)^T, x_2 = (-0.816, 0.408, 0.408)^T, x_3 = (0.0, -0.707, 0.707)^T, with targets y_1 = 3, y_2 = 2, y_3 = 4

Based on the 1st pattern, w_1 = y_1 x_1 = (1.731, 1.731, 1.731)^T. Based on the 2nd pattern, w_2 = y_2 x_2 = (-1.632, 0.816, 0.816)^T. Based on the 3rd pattern, w_3 = y_3 x_3 = (0.0, -2.828, 2.828)^T. Next, set the weights W: W = w_1 + w_2 + w_3 = (0.099, -0.281, 5.375)^T

Verify: W x_1 = y_1, W x_2 = y_2, W x_3 = y_3. Q: Why does this work? A: If the input vectors are orthogonal, then the Hebb rule works perfectly (!!!)
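
A short sketch reproducing the worked example above: the summed Hebbian weights recover each target because the inputs are (approximately) orthonormal.

```python
import numpy as np

xs = [np.array([ 0.577,  0.577, 0.577]),
      np.array([-0.816,  0.408, 0.408]),
      np.array([ 0.0,   -0.707, 0.707])]
ys = [3.0, 2.0, 4.0]

# Hebbian weights: the sum of y_i * x_i over the three patterns.
w = sum(y * x for x, y in zip(xs, ys))
print(w)                    # approximately (0.099, -0.281, 5.375)

for x, y in zip(xs, ys):
    print(np.dot(w, x), y)  # w . x_i is approximately y_i
```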

Hebb Rule Works Perfectly When Inputs Are Orthogonal. Assume the input vectors are unit length and mutually orthogonal: x_i^T x_j = 1 if i = j, and 0 otherwise. Set W_i = y_i x_i^T. Set W = W_1 + ... + W_n

For all i: y = W x_i = (W_1 + ... + W_n) x_i = (y_1 x_1^T + ... + y_n x_n^T) x_i = y_1 x_1^T x_i + ... + y_n x_n^T x_i = 0 + ... + y_i + ... + 0 = y_i

Caveat: If the input vectors are not orthogonal, the Hebb rule is not guaranteed to work perfectly. If the input vectors are linearly independent, the LMS rule works perfectly.

Example: Hebb Fails. Input-output patterns (four inputs, one output):

Input            Output
 1  -1   1  -1      1
 1   1  -1   1      1
 1   1  -1  -1     -1
 1  -1  -1   1     -1

Hebb rule: the overall weight changes for w_1, w_2, and w_4 are 0 (i.e., the Hebb rule does not work). There are successful weights: w_1 = 1, w_2 = 1, w_3 = 2, and w_4 = 1 (but the Hebb rule won't find these values).
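
A small sketch contrasting the Hebbian weights with an exact least-squares solution (standing in for what the LMS rule would converge to) on the four patterns above:

```python
import numpy as np

# Input patterns (rows) and target outputs from the table above.
X = np.array([[1.0, -1.0,  1.0, -1.0],
              [1.0,  1.0, -1.0,  1.0],
              [1.0,  1.0, -1.0, -1.0],
              [1.0, -1.0, -1.0,  1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

w_hebb = X.T @ y    # Hebb rule: sum over patterns of y_p * x_p
print(w_hebb)       # [0, 0, 2, 0]: the changes to w_1, w_2, w_4 are zero
print(X @ w_hebb)   # outputs do not match the targets

# Because the inputs are linearly independent, an exact solution exists.
w_exact, *_ = np.linalg.lstsq(X, y, rcond=None)
print(w_exact)      # [1, 1, 2, 1]
print(X @ w_exact)  # reproduces the targets exactly
```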

Hebb Learning and Pattern Completion

Recurrent Network

Associate input vectors with scaled copies of themselves: y_i = λ_i x_i. Assume the λ_i are distinct. Assume the input vectors are unit length and mutually orthogonal: x_i^T x_j = 1 if i = j, and 0 otherwise.

Set W_i = y_i x_i^T = λ_i x_i x_i^T. Set W = W_1 + ... + W_n. Then: W x_i = (W_1 + ... + W_n) x_i = (λ_1 x_1 x_1^T + ... + λ_n x_n x_n^T) x_i = λ_1 x_1 x_1^T x_i + ... + λ_n x_n x_n^T x_i = 0 + ... + λ_i x_i + ... + 0 = λ_i x_i. The Hebb rule creates a weight matrix such that the input vectors are the eigenvectors of this matrix (!!!)
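
A short sketch of this construction, reusing the (approximately) orthonormal inputs from the earlier example with made-up, distinct values of λ_i:

```python
import numpy as np

xs = [np.array([ 0.577,  0.577, 0.577]),  # (approximately) orthonormal inputs
      np.array([-0.816,  0.408, 0.408]),
      np.array([ 0.0,   -0.707, 0.707])]
lambdas = [3.0, 2.0, 4.0]                 # made-up, distinct eigenvalues

# Hebbian weight matrix: W = sum_i lambda_i * x_i x_i^T
W = sum(lam * np.outer(x, x) for x, lam in zip(xs, lambdas))

for x, lam in zip(xs, lambdas):
    print(W @ x, lam * x)    # W x_i is approximately lambda_i x_i

print(np.linalg.eigvals(W))  # eigenvalues are approximately 3, 2, 4 (in some order)
```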