Unsupervised Learning

Unsupervised Learning
Kevin Swingler

What is Unsupervised Learning?
Most simply, it can be thought of as learning to recognise and recall things.
Recognition: "I've seen that before."
Recall: "I've seen that before, and I can recall more about it from memory."
There is no feedback or reward, as there is with reinforcement learning.
There is no given answer, as there is in supervised learning.

In Context
Supervised learning: "These are apples, these are pears. What is this new thing (apple or pear)?"
Reinforcement learning: like the "warmer, colder" game; actions are given rewards or feedback, which is learned for future use.
Unsupervised learning: remembering the route home in daylight and still being able to follow it at night, or in the snow when things look different, or when parked cars have moved.

Content Addressable Memory
One way to think about unsupervised learning is as a content addressable memory. That is, you look things up not by searching, but by describing some aspects of the thing you are looking for and having that trigger other things about it.

Associative Patterns
The important aspect of associative memory is that it stores patterns, or things that go together. Just as an exotic smell might evoke memories of a holiday, associative memories work by completing partial patterns.

A Simple Associative Memory: The Hopfield Network
Stores patterns in an associative memory.
Can recall a complete pattern when given only a part of that pattern as input.
Robust under noise: will recall the nearest pattern it has to the input stimulus.
Robust under damage: remove a few of the connections and it still works.

Characteristics of a Hopfield Network
A collection of nodes, which we will call neurons (though they are really just simple mathematical functions).
Each neuron is connected to every other neuron in the network (but not itself); we call the connections synapses.
Synapses have a weight that is either excitatory (+ve) or inhibitory (-ve).
Weights are symmetrical: w_ij = w_ji.
Neurons can be either on or off, represented as an output value of +1 or -1.
The neurons are of the McCulloch and Pitts type that we have already seen.

Recall in a Hopfield Network
The output (+1 or -1) from a neuron is calculated based on the incoming weights and the values carried along those weights.
The output from each neuron is multiplied by the weight on each connection leaving that neuron, to contribute to the input to its destination node.
So the input to a neuron is the sum of the products of the incoming weights and their values.
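As a minimal sketch of this structure (the NumPy representation and the particular numbers are ours, for illustration only), a three-neuron Hopfield network is just a symmetric weight matrix with a zero diagonal plus a state vector of +1/-1 values:

import numpy as np

# Symmetric weights, zero diagonal: no neuron connects to itself
W = np.array([[ 0.0,  0.5, -0.3],
              [ 0.5,  0.0,  0.8],
              [-0.3,  0.8,  0.0]])

# Each neuron is either on (+1) or off (-1)
v = np.array([1, -1, 1])

assert np.allclose(W, W.T) and np.all(np.diag(W) == 0)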

The Value of a Hopfield Neuron

u_i = Σ_j w_ij v_j + I_i

v_i = +1 if u_i ≥ 0, -1 if u_i < 0

u_i is the sum of the products of the weights w_ij and the outputs of their pre-synaptic neurons v_j, plus the input to the neuron itself (I_i).
v_i, the value of the neuron (its output), is either +1 or -1, depending on the sign of u_i (+ve or -ve).

Convergence
You might think that as each neuron's output value changes, it affects the input to, and so the output from, other neurons, which change all other neurons, and the whole system keeps changing forever.
But it doesn't. Given an input pattern, a Hopfield network will always settle on a fixed state after a number of iterations.
This state is known as the attractor for the given input, and it represents the pattern that has been recalled.
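As an illustration only (the function name and NumPy implementation are ours, not from the slides), here is a minimal sketch of this recall process: neurons are updated one at a time using the rule above, until none of them changes and the network has settled on an attractor.

import numpy as np

def hopfield_recall(W, v, I=None, max_iters=100):
    """Asynchronously update the +1/-1 neuron values until none of them changes.

    W: symmetric weight matrix with zero diagonal
    v: initial pattern (array of +1/-1 values)
    I: optional external input to each neuron
    """
    v = np.array(v)
    n = len(v)
    I = np.zeros(n) if I is None else I
    for _ in range(max_iters):
        changed = False
        for i in np.random.permutation(n):   # update neurons one at a time
            u = W[i] @ v + I[i]              # u_i = sum_j w_ij v_j + I_i
            new_v = 1 if u >= 0 else -1      # v_i = +1 if u_i >= 0, else -1
            if new_v != v[i]:
                v[i] = new_v
                changed = True
        if not changed:                      # no neuron changed: attractor reached
            break
    return v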

As the Network Settles
As the network settles into its steady state, the output from the neurons becomes consistent with the weights.
This consistency can be measured by multiplying the values of each pair of neurons with the weight of the synapse between them and summing the result:

Σ_ij w_ij v_i v_j

Simple Example
Take a single pair of neurons with a connection strength of 1:
1 × 1 × 1 = 1
-1 × -1 × 1 = 1
-1 × 1 × 1 = -1

Stable States
The sum of the products of the weights and the values gets larger as the network approaches an attractor.
In the previous example, (1, 1) and (-1, -1) are attractors for the given network.
Set them as inputs, and the network will not change.
Set (-1, 1) or (1, -1) as inputs, and it will change to (1, 1) or (-1, -1).

Energy Function
A network in a stable state is said to have low energy. A network in transition has higher energy.
So energy is somehow the opposite of the consistency measure we saw earlier.
In fact, if we make that measure negative, so that low values correspond to stable states, we get

E = -1/2 Σ_ij w_ij v_i v_j

We halve the value because we are counting the bi-directional synapses twice.
So, recall in a Hopfield network is the same as minimising the energy function.
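As a sketch (again with our own names, building on the NumPy recall example above), the energy of a state is a one-liner; each asynchronous update during recall never increases it, which is why the network always settles.

import numpy as np

def energy(W, v):
    """E = -1/2 * sum_ij w_ij v_i v_j (the factor 1/2 undoes the double-counting)."""
    return -0.5 * v @ W @ v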

Another Example: the Flip-Flop
Look at the simple network below: two neurons joined by a single synapse of weight -1.
Stable states are (1, -1) and (-1, 1).
Put it in (1, 1) or (-1, -1) and one neuron will flip. This is a so-called flip-flop network.
Its energy is

E = -1/2 [(-v1 v2) + (-v2 v1)]

which is

E = v1 v2

[Plot: the energy E of the flip-flop, with its two minima at the states (1, -1) and (-1, 1).]
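A quick numerical check of these energies (the 2×2 weight matrix below is just our encoding of this two-neuron network; it is not taken from the slides):

import numpy as np

W = np.array([[ 0, -1],
              [-1,  0]])
for v in ([1, 1], [1, -1], [-1, 1], [-1, -1]):
    print(v, -0.5 * np.array(v) @ W @ np.array(v))
# (1, -1) and (-1, 1) give E = -1 (the attractors); (1, 1) and (-1, -1) give E = +1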

Pattern Recall
So a Hopfield network minimises its energy function and settles into an attractor state, which represents the full pattern that the inputs are closest to.

Learning in a Hopfield Network
Now we turn to the question of learning in a Hopfield network: how do the weights get set to store the memories?
The goal is to ensure that the energy of the network at each of the patterns is locally minimal (i.e. the lowest near that pattern).

Hopfield Learning
To minimise the energy, we present a pattern p to the network and update the weights thus:

w_ij ← w_ij + p_i p_j

Simply update the weight between two neurons by the product of those two neurons' values in each pattern.
So neurons that are both present in many patterns have larger weights!

Learning
Finally, when all the patterns have been loaded, we divide each weight by the number of patterns to normalise the weights.
We also make sure the leading diagonal of the weights matrix contains zeros, to ensure no neuron links to itself.
What does this remind you of? Hebb's rule: when two neurons are on, strengthen the weight between them.
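A minimal sketch of this learning procedure (the helper name is ours; it follows the steps described above: accumulate p_i p_j over all patterns, divide by the number of patterns, and zero the diagonal):

import numpy as np

def hopfield_train(patterns):
    """Build the weight matrix from a list of +1/-1 patterns using the Hebbian rule."""
    patterns = np.asarray(patterns, dtype=float)
    n_patterns, n_neurons = patterns.shape
    W = np.zeros((n_neurons, n_neurons))
    for p in patterns:
        W += np.outer(p, p)       # w_ij <- w_ij + p_i * p_j
    W /= n_patterns               # normalise by the number of patterns
    np.fill_diagonal(W, 0.0)      # zero the leading diagonal: no self-connections
    return W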

Example: Learning Images
Train a network with clean images of characters.
Present corrupted text as inputs.
The network produces clean characters as outputs.

[The original slides illustrate this with images of the clean training characters, the corrupted inputs, and the recalled outputs.]
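The same idea can be tried at toy scale with the sketches from the earlier sections (hopfield_train and hopfield_recall; the 3×3 patterns below are made up for illustration): flatten small +1/-1 images, train, corrupt a couple of pixels, and recall.

import numpy as np

# Two tiny 3x3 'images', flattened to +1/-1 vectors
patterns = [
    [ 1, -1,  1, -1,  1, -1,  1, -1,  1],   # checkerboard
    [ 1,  1,  1, -1, -1, -1,  1,  1,  1],   # horizontal stripes
]
W = hopfield_train(patterns)                 # helper sketched earlier

corrupted = np.array(patterns[0])
corrupted[[0, 2]] *= -1                      # flip two 'pixels'
print(hopfield_recall(W, corrupted))         # recovers the checkerboard pattern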