The Hopfield model. 1 The Hebbian paradigm. Sebastian Seung Lecture 15: November 7, 2002

Size: px
Start display at page:

Download "The Hopfield model. 1 The Hebbian paradigm. Sebastian Seung Lecture 15: November 7, 2002"

Transcription

1 MIT Department of Bran and Cogntve Scences 9.29J, Sprng Introducton to Computatonal euroscence Instructor: Professor Sebastan Seung The Hopfeld model Sebastan Seung 9.64 Lecture 5: ovember 7, 2002 The Hebban paradgm In hs 949 book The Organzaton of Behavor, Donald Hebb predcted a form of synaptc plastcty wth the followng property When an axon of cell A s near enough to excte a cell B and repeatedly or persstently takes part n frng t, some growth process or metabolc change takes place n one or both cells such that A s effcency, as one of the cells frng B, s ncreased. In 973, the phenomenon of long-term potentaton (LTP) was dscovered. The study of LTP s now a small ndustry wthn the feld of neuroscence. Among the varous forms of LTP, those that depend on the MDA subtype of glutamate receptor are regarded as Hebban. These forms depend on temporal contguty of presynaptc and postsynaptc actvty. The requrement of temporal contguty s expressed n the followng dtty eurons that fre together, wre together. eurons that fre out of sync, fal to lnk. Hebban synaptc plastcty s remarkable for havng been predcted on theoretcal grounds well before t was dscovered expermentally. Ths s not a common occurrence n bology. How dd Hebb make hs remarkable predcton? He was motvated by an old tradton n Western thought known as assocatonsm. Ths s the dea that the bran s nothng more than an engne for storng and retrevng assocatons. In the late 9th century, t was found that the bran conssted of neurons connected by an ntrcate web of synapses. Ths transformed the older assocatonst tradton nto connectonsm, the doctrne that assocatons are stored as synaptc connectons. Ths s an example of the dea that structure determnes functon n bology. Hume and other emprcst phlosophers had already expressed the dea that assocatons were learned from temporal contguty. It only stood to reason that connectons should also be learned from temporal contguty. Whle these deas are very suggestve, they are admttedly rather vague and metaphorcal. The challenge of connectonsm s to make these deas precse. A number of theorsts have formulated neural network models wth the goal of explanng how Hebban synaptc plastcty could be used to store memores.

2 2 Bnary model neurons For the most part, we have studed neural network models n whch the actvty of each neuron s descrbed by a sngle analog varable. A more drastc smplfcaton s to use bnary neurons, whch are ether actve (s = ) or nactve (s = ). In such a dynamcs, each neuron updates tself accordng to the rule s = sgn W s, () where s s the state of neuron after the update. ote that sometmes s = s, n whch case the update results n no change. If the argument of the sgn functon happens to be zero, there s an ambguty to the defnton, whch could be resolved by a con flp or arbtrarly defnng sgn(0) =. Equaton () can be used to defne several types of network dynamcs, dependng on the exact manner n whch t s appled. In a sequental dynamcs, the neurons are updated one at a tme, typcally n a random order. In a parallel dynamcs, Eq. () s appled to all neurons smultaneously. It s often easer to prove theorems about the sequental case, but the parallel case s sometmes easer to mplement (for example n MATLAB t s more natural). In a smulaton on a dgtal computer, t s natural to make the updates at dscrete tmes. However, the update tmes could n prncple be contnuous. For example, they could occur at random for each neuron accordng to a Posson process wth some mean rate. As we shall see, all these verson of network dynamcs behave qualtatvely the same n general, but there can be subtle dfferences. 3 Hopfeld model Suppose that synapses change accordng to the learnng rule W = ηs s where η > 0 s a learnng rate. Ths could be called Hebban, although t has some werd features. The synapse s potentated f both neurons are actve, or both are nactve. The synapse s depressed f one neuron s actve, and the other s nactve. Suppose further that the network s exposed to P bnary patterns ξ,..., ξ P durng a tranng phase. More specfcally, the state s s set to each pattern n turn, and the synaptc weghts are changed by the Hebban update. If W = 0 at the begnnng of the tranng phase, Hebban learnng wll result n W = ξ ξ The prefactor / corresponds to a partcular choce of learnng rate η = /. It s a convenent choce for the calculatons we are about to perform, but s otherwse arb (2) 2

3 trary, as W can be multpled by any postve prefactor wthout affectng the dynamcs (). Ths synaptc weght matrx s the famous Hopfeld model, along wth the dynamcs (), and the assumpton that the patterns are chosen at random. Ths last assumpton makes sense f we assume that there s a data compresson stage that encodes sensory data effcently before t reaches the Hopfeld network. In hs orgnal paper, Hopfeld set the dagonal terms of the weght matrx to be zero (W = 0). If ths s not done, the dynamcs takes the form = sgn P s s + W s,,= The network stll bascally works, but there are some subtle dfferences, whch we wll dscuss later. As we shall see n the followng, the Hopfeld model functons as an assocatve memory, because the patterns are stored as dynamcal attractors. If the network s ntalzed wth a corrupted or ncomplete verson of a pattern, convergence to an attractor can recall the correct pattern. 4 Sngle pattern Let s start wth a smple case, a sngle pattern W = ξ ξ (3) The dynamcs takes the form s ξ ξ = sgn s In terms of the overlap m = ξ between ξ and s, we can wrte s = sgn (ξ m) Snce m = when s = ξ, t s a steady state of the dynamcs. Smlarly, m = when s = ξ, so t s also a steady state of the dynamcs. If m > 0, updates can only cause m to ncrease. Lkewse f m < 0, updates can only cause m to decrease. Therefore these steady states are attractors of the dynamcs. s 3

4 5 Many patterns The case of many patterns s more nterestng. The weght matrx (2) s ust the superposton of sngle pattern outer products (3). The danger here s that the patterns mght nterfere wth each other. If the patterns were exactly orthogonal to each other, there would be no nterference. If they are chosen randomly, there s some possblty of crosstalk, whch s quantfed below. To check whether the patterns are steady states, we calculate ξ W ξ ξ ξ = (4),=,= ξ = ξ + ξ ξ (5),=,= ξ ξ ξ = + ξ (6) ξ,=,= The last term s called the crosstalk or nterference term. If t s greater than for all, then the th pattern s a steady state ξ = sgn W ξ If we try to store too many patterns, then the nterference between them becomes large, and storage s not possble. To study the stablty of a pattern, magne that the network s ntalzed at the pattern, and try to estmate how many bt flps take place. We can approxmate the crosstalk term as the sum of p ndependent con flps. There s an error when the sum s more negatve than. The dervaton n the book uses the Gaussan approxmaton to the bnomal dstrbuton. We wll do somethng coarser, whch s the Hoeffdng bound on the devaton between frequency and probablty. Accordng to the one-sded Hoeffdng bound for m con tosses P rob[ˆp < p ] < exp( 2 2 m) Here we are nterested n devatons of the frequency of heads from 2 by more than 2p. We fnd that P error < exp( 2p(/2p) 2 ) = exp 2p There are several defntons of capacty. (fracton of bts are corrupted, P error < 0.0) Suppose that we would lke less than one percent of the bts corrupted. Then we need exp( /(2p)) < 0.0, or p < /(2 log(0.0)) = 0.. The problem wth ths calculaton s that the flppng of the frst bts could cause an avalanche. In realty, the capacty s about p = 0.4. Calculatng ths number requres some heavy mathematcs, lke the replca trck. 4

5 (fracton of patterns are corrupted, P error < 0.0/ ) We could mpose a more strngent requrement, whch s than less than one percent of the patterns have a bt corrupted. p = 2 log (fracton of samples are corrupted, P error < 0.0/(p ). The most strngent requrement s that no pattern n the sample s corrupted, wth confdence greater than 99%. Takng log p log, we obtan p = 4 log Is ths good? Accordng to the Cover argument, the maxmum should be p = 2. We can store patterns as attractors of the network dynamcs (wth some corrupton). However, there can be other attractors, or spurous states. These are of three types. There are reversed states ξ due to the ± symmetry of the network dynamcs. There are mxture states, whch are a superposton of an odd number of patterns. For example, ξ mx = sgn(±ξ ± ξ 2 ± ξ 3 ) An even number won t work, because the sum works out to zero on some stes. There are spn glass states for large p, whch are not correlated wth any fnte number of the patterns ξ. 6 Energy functon If the nteractons W are symmetrc, then E = W s s 2 s nonncreasng under the dynamcs (), assumng asynchronous update. To prove ths, note that the = terms n the sum are unchangng, snce s 2 =. Therefore the change n E due to the update () s gven by E = 2 (s s ) W s (7),= = 2 (s s ) W s + (s s )W s (8) 2 If s = s, then E = 0 and we are done. In the other case, s = s, and we can wrte E = s W s W s 2 0 5

6 Therefore the dynamcs can be understood as descent on an energy landscape. Statstcal physcsts lke bnary neurons because they are analogous to magnetc spns. The synaptc nteracton W can be compared to a magnetc nteracton. If W > 0, the spns want to lne up n the same drecton, as n a ferromagnet. If W < 0, the nteracton s called antferromagnetc. 7 Energy functon Let s consder the case W = for all and. Ths corresponds to a pure ferromagnet, n whch all nteractons try to make the spns lne up n the same drecton. In ths case, ( E = 2 Obvously there are two mnma, the fully magnetzed states s = for all and s = for all. Any ntal condton wth more up spns than down spns wll converge to the all up state, and a smlar statement can be made about the down state. Suppose that we would lke to store a sngle pattern ξ as an attractor of the dynamcs. Ths s bascally the same as the prevous case. If we choose then the energy functon s W = ξ ξ s 2 E = 2 ξ ξ s s (9) ( = ξ s (0) 2 The attractors of ths dynamcs are s = ξ and s = ξ, as requred. In statstcal mechancs, ths s known as the Matts model. ote that the energy functon for ths network s ( E = s ξ 2 It s plausble that the patterns should be local mnma, but they mght nterfere wth each other. 2 8 Content-addressable memory The nput s encoded n the ntal condton of the network dynamcs. Convergence to an attractor corresponds to recall of the closest stored memory. 6

Hopfield Training Rules 1 N

Hopfield Training Rules 1 N Hopfeld Tranng Rules To memorse a sngle pattern Suppose e set the eghts thus - = p p here, s the eght beteen nodes & s the number of nodes n the netor p s the value requred for the -th node What ll the

More information

1 Derivation of Rate Equations from Single-Cell Conductance (Hodgkin-Huxley-like) Equations

1 Derivation of Rate Equations from Single-Cell Conductance (Hodgkin-Huxley-like) Equations Physcs 171/271 -Davd Klenfeld - Fall 2005 (revsed Wnter 2011) 1 Dervaton of Rate Equatons from Sngle-Cell Conductance (Hodgkn-Huxley-lke) Equatons We consder a network of many neurons, each of whch obeys

More information

9 Derivation of Rate Equations from Single-Cell Conductance (Hodgkin-Huxley-like) Equations

9 Derivation of Rate Equations from Single-Cell Conductance (Hodgkin-Huxley-like) Equations Physcs 171/271 - Chapter 9R -Davd Klenfeld - Fall 2005 9 Dervaton of Rate Equatons from Sngle-Cell Conductance (Hodgkn-Huxley-lke) Equatons We consder a network of many neurons, each of whch obeys a set

More information

8 Derivation of Network Rate Equations from Single- Cell Conductance Equations

8 Derivation of Network Rate Equations from Single- Cell Conductance Equations Physcs 178/278 - Davd Klenfeld - Wnter 2015 8 Dervaton of Network Rate Equatons from Sngle- Cell Conductance Equatons We consder a network of many neurons, each of whch obeys a set of conductancebased,

More information

Unsupervised Learning

Unsupervised Learning Unsupervsed Learnng Kevn Swngler What s Unsupervsed Learnng? Most smply, t can be thought of as learnng to recognse and recall thngs Recognton I ve seen that before Recall I ve seen that before and I can

More information

Model of Neurons. CS 416 Artificial Intelligence. Early History of Neural Nets. Cybernetics. McCulloch-Pitts Neurons. Hebbian Modification.

Model of Neurons. CS 416 Artificial Intelligence. Early History of Neural Nets. Cybernetics. McCulloch-Pitts Neurons. Hebbian Modification. Page 1 Model of Neurons CS 416 Artfcal Intellgence Lecture 18 Neural Nets Chapter 20 Multple nputs/dendrtes (~10,000!!!) Cell body/soma performs computaton Sngle output/axon Computaton s typcally modeled

More information

8 Derivation of Network Rate Equations from Single- Cell Conductance Equations

8 Derivation of Network Rate Equations from Single- Cell Conductance Equations Physcs 178/278 - Davd Klenfeld - Wnter 2019 8 Dervaton of Network Rate Equatons from Sngle- Cell Conductance Equatons Our goal to derve the form of the abstract quanttes n rate equatons, such as synaptc

More information

CHAPTER III Neural Networks as Associative Memory

CHAPTER III Neural Networks as Associative Memory CHAPTER III Neural Networs as Assocatve Memory Introducton One of the prmary functons of the bran s assocatve memory. We assocate the faces wth names, letters wth sounds, or we can recognze the people

More information

Introduction to Vapor/Liquid Equilibrium, part 2. Raoult s Law:

Introduction to Vapor/Liquid Equilibrium, part 2. Raoult s Law: CE304, Sprng 2004 Lecture 4 Introducton to Vapor/Lqud Equlbrum, part 2 Raoult s Law: The smplest model that allows us do VLE calculatons s obtaned when we assume that the vapor phase s an deal gas, and

More information

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur Module 3 LOSSY IMAGE COMPRESSION SYSTEMS Verson ECE IIT, Kharagpur Lesson 6 Theory of Quantzaton Verson ECE IIT, Kharagpur Instructonal Objectves At the end of ths lesson, the students should be able to:

More information

Week 5: Neural Networks

Week 5: Neural Networks Week 5: Neural Networks Instructor: Sergey Levne Neural Networks Summary In the prevous lecture, we saw how we can construct neural networks by extendng logstc regresson. Neural networks consst of multple

More information

1 Convex Optimization

1 Convex Optimization Convex Optmzaton We wll consder convex optmzaton problems. Namely, mnmzaton problems where the objectve s convex (we assume no constrants for now). Such problems often arse n machne learnng. For example,

More information

For now, let us focus on a specific model of neurons. These are simplified from reality but can achieve remarkable results.

For now, let us focus on a specific model of neurons. These are simplified from reality but can achieve remarkable results. Neural Networks : Dervaton compled by Alvn Wan from Professor Jtendra Malk s lecture Ths type of computaton s called deep learnng and s the most popular method for many problems, such as computer vson

More information

Homework Assignment 3 Due in class, Thursday October 15

Homework Assignment 3 Due in class, Thursday October 15 Homework Assgnment 3 Due n class, Thursday October 15 SDS 383C Statstcal Modelng I 1 Rdge regresson and Lasso 1. Get the Prostrate cancer data from http://statweb.stanford.edu/~tbs/elemstatlearn/ datasets/prostate.data.

More information

} Often, when learning, we deal with uncertainty:

} Often, when learning, we deal with uncertainty: Uncertanty and Learnng } Often, when learnng, we deal wth uncertanty: } Incomplete data sets, wth mssng nformaton } Nosy data sets, wth unrelable nformaton } Stochastcty: causes and effects related non-determnstcally

More information

Internet Engineering. Jacek Mazurkiewicz, PhD Softcomputing. Part 3: Recurrent Artificial Neural Networks Self-Organising Artificial Neural Networks

Internet Engineering. Jacek Mazurkiewicz, PhD Softcomputing. Part 3: Recurrent Artificial Neural Networks Self-Organising Artificial Neural Networks Internet Engneerng Jacek Mazurkewcz, PhD Softcomputng Part 3: Recurrent Artfcal Neural Networks Self-Organsng Artfcal Neural Networks Recurrent Artfcal Neural Networks Feedback sgnals between neurons Dynamc

More information

j) = 1 (note sigma notation) ii. Continuous random variable (e.g. Normal distribution) 1. density function: f ( x) 0 and f ( x) dx = 1

j) = 1 (note sigma notation) ii. Continuous random variable (e.g. Normal distribution) 1. density function: f ( x) 0 and f ( x) dx = 1 Random varables Measure of central tendences and varablty (means and varances) Jont densty functons and ndependence Measures of assocaton (covarance and correlaton) Interestng result Condtonal dstrbutons

More information

Linear Feature Engineering 11

Linear Feature Engineering 11 Lnear Feature Engneerng 11 2 Least-Squares 2.1 Smple least-squares Consder the followng dataset. We have a bunch of nputs x and correspondng outputs y. The partcular values n ths dataset are x y 0.23 0.19

More information

Lecture Notes on Linear Regression

Lecture Notes on Linear Regression Lecture Notes on Lnear Regresson Feng L fl@sdueducn Shandong Unversty, Chna Lnear Regresson Problem In regresson problem, we am at predct a contnuous target value gven an nput feature vector We assume

More information

Lecture 3: Shannon s Theorem

Lecture 3: Shannon s Theorem CSE 533: Error-Correctng Codes (Autumn 006 Lecture 3: Shannon s Theorem October 9, 006 Lecturer: Venkatesan Guruswam Scrbe: Wdad Machmouch 1 Communcaton Model The communcaton model we are usng conssts

More information

Chapter 9: Statistical Inference and the Relationship between Two Variables

Chapter 9: Statistical Inference and the Relationship between Two Variables Chapter 9: Statstcal Inference and the Relatonshp between Two Varables Key Words The Regresson Model The Sample Regresson Equaton The Pearson Correlaton Coeffcent Learnng Outcomes After studyng ths chapter,

More information

6. Stochastic processes (2)

6. Stochastic processes (2) Contents Markov processes Brth-death processes Lect6.ppt S-38.45 - Introducton to Teletraffc Theory Sprng 5 Markov process Consder a contnuous-tme and dscrete-state stochastc process X(t) wth state space

More information

Temperature. Chapter Heat Engine

Temperature. Chapter Heat Engine Chapter 3 Temperature In prevous chapters of these notes we ntroduced the Prncple of Maxmum ntropy as a technque for estmatng probablty dstrbutons consstent wth constrants. In Chapter 9 we dscussed the

More information

6. Stochastic processes (2)

6. Stochastic processes (2) 6. Stochastc processes () Lect6.ppt S-38.45 - Introducton to Teletraffc Theory Sprng 5 6. Stochastc processes () Contents Markov processes Brth-death processes 6. Stochastc processes () Markov process

More information

3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X

3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X Statstcs 1: Probablty Theory II 37 3 EPECTATION OF SEVERAL RANDOM VARIABLES As n Probablty Theory I, the nterest n most stuatons les not on the actual dstrbuton of a random vector, but rather on a number

More information

Lecture 7: Boltzmann distribution & Thermodynamics of mixing

Lecture 7: Boltzmann distribution & Thermodynamics of mixing Prof. Tbbtt Lecture 7 etworks & Gels Lecture 7: Boltzmann dstrbuton & Thermodynamcs of mxng 1 Suggested readng Prof. Mark W. Tbbtt ETH Zürch 13 März 018 Molecular Drvng Forces Dll and Bromberg: Chapters

More information

MLE and Bayesian Estimation. Jie Tang Department of Computer Science & Technology Tsinghua University 2012

MLE and Bayesian Estimation. Jie Tang Department of Computer Science & Technology Tsinghua University 2012 MLE and Bayesan Estmaton Je Tang Department of Computer Scence & Technology Tsnghua Unversty 01 1 Lnear Regresson? As the frst step, we need to decde how we re gong to represent the functon f. One example:

More information

Lecture 12: Discrete Laplacian

Lecture 12: Discrete Laplacian Lecture 12: Dscrete Laplacan Scrbe: Tanye Lu Our goal s to come up wth a dscrete verson of Laplacan operator for trangulated surfaces, so that we can use t n practce to solve related problems We are mostly

More information

10-701/ Machine Learning, Fall 2005 Homework 3

10-701/ Machine Learning, Fall 2005 Homework 3 10-701/15-781 Machne Learnng, Fall 2005 Homework 3 Out: 10/20/05 Due: begnnng of the class 11/01/05 Instructons Contact questons-10701@autonlaborg for queston Problem 1 Regresson and Cross-valdaton [40

More information

Negative Binomial Regression

Negative Binomial Regression STATGRAPHICS Rev. 9/16/2013 Negatve Bnomal Regresson Summary... 1 Data Input... 3 Statstcal Model... 3 Analyss Summary... 4 Analyss Optons... 7 Plot of Ftted Model... 8 Observed Versus Predcted... 10 Predctons...

More information

Chapter 1. Probability

Chapter 1. Probability Chapter. Probablty Mcroscopc propertes of matter: quantum mechancs, atomc and molecular propertes Macroscopc propertes of matter: thermodynamcs, E, H, C V, C p, S, A, G How do we relate these two propertes?

More information

Module 9. Lecture 6. Duality in Assignment Problems

Module 9. Lecture 6. Duality in Assignment Problems Module 9 1 Lecture 6 Dualty n Assgnment Problems In ths lecture we attempt to answer few other mportant questons posed n earler lecture for (AP) and see how some of them can be explaned through the concept

More information

Lecture 23: Artificial neural networks

Lecture 23: Artificial neural networks Lecture 23: Artfcal neural networks Broad feld that has developed over the past 20 to 30 years Confluence of statstcal mechancs, appled math, bology and computers Orgnal motvaton: mathematcal modelng of

More information

Lecture 10 Support Vector Machines II

Lecture 10 Support Vector Machines II Lecture 10 Support Vector Machnes II 22 February 2016 Taylor B. Arnold Yale Statstcs STAT 365/665 1/28 Notes: Problem 3 s posted and due ths upcomng Frday There was an early bug n the fake-test data; fxed

More information

Lecture 4: Universal Hash Functions/Streaming Cont d

Lecture 4: Universal Hash Functions/Streaming Cont d CSE 5: Desgn and Analyss of Algorthms I Sprng 06 Lecture 4: Unversal Hash Functons/Streamng Cont d Lecturer: Shayan Oves Gharan Aprl 6th Scrbe: Jacob Schreber Dsclamer: These notes have not been subjected

More information

Lecture Space-Bounded Derandomization

Lecture Space-Bounded Derandomization Notes on Complexty Theory Last updated: October, 2008 Jonathan Katz Lecture Space-Bounded Derandomzaton 1 Space-Bounded Derandomzaton We now dscuss derandomzaton of space-bounded algorthms. Here non-trval

More information

Hopfield networks and Boltzmann machines. Geoffrey Hinton et al. Presented by Tambet Matiisen

Hopfield networks and Boltzmann machines. Geoffrey Hinton et al. Presented by Tambet Matiisen Hopfeld networks and Boltzmann machnes Geoffrey Hnton et al. Presented by Tambet Matsen 18.11.2014 Hopfeld network Bnary unts Symmetrcal connectons http://www.nnwj.de/hopfeld-net.html Energy functon The

More information

A how to guide to second quantization method.

A how to guide to second quantization method. Phys. 67 (Graduate Quantum Mechancs Sprng 2009 Prof. Pu K. Lam. Verson 3 (4/3/2009 A how to gude to second quantzaton method. -> Second quantzaton s a mathematcal notaton desgned to handle dentcal partcle

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 12 10/21/2013. Martingale Concentration Inequalities and Applications

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 12 10/21/2013. Martingale Concentration Inequalities and Applications MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.65/15.070J Fall 013 Lecture 1 10/1/013 Martngale Concentraton Inequaltes and Applcatons Content. 1. Exponental concentraton for martngales wth bounded ncrements.

More information

Additional Codes using Finite Difference Method. 1 HJB Equation for Consumption-Saving Problem Without Uncertainty

Additional Codes using Finite Difference Method. 1 HJB Equation for Consumption-Saving Problem Without Uncertainty Addtonal Codes usng Fnte Dfference Method Benamn Moll 1 HJB Equaton for Consumpton-Savng Problem Wthout Uncertanty Before consderng the case wth stochastc ncome n http://www.prnceton.edu/~moll/ HACTproect/HACT_Numercal_Appendx.pdf,

More information

Lecture 3: Probability Distributions

Lecture 3: Probability Distributions Lecture 3: Probablty Dstrbutons Random Varables Let us begn by defnng a sample space as a set of outcomes from an experment. We denote ths by S. A random varable s a functon whch maps outcomes nto the

More information

Supporting Information

Supporting Information Supportng Informaton The neural network f n Eq. 1 s gven by: f x l = ReLU W atom x l + b atom, 2 where ReLU s the element-wse rectfed lnear unt, 21.e., ReLUx = max0, x, W atom R d d s the weght matrx to

More information

Transfer Functions. Convenient representation of a linear, dynamic model. A transfer function (TF) relates one input and one output: ( ) system

Transfer Functions. Convenient representation of a linear, dynamic model. A transfer function (TF) relates one input and one output: ( ) system Transfer Functons Convenent representaton of a lnear, dynamc model. A transfer functon (TF) relates one nput and one output: x t X s y t system Y s The followng termnology s used: x y nput output forcng

More information

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification E395 - Pattern Recognton Solutons to Introducton to Pattern Recognton, Chapter : Bayesan pattern classfcaton Preface Ths document s a soluton manual for selected exercses from Introducton to Pattern Recognton

More information

Chapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems

Chapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems Numercal Analyss by Dr. Anta Pal Assstant Professor Department of Mathematcs Natonal Insttute of Technology Durgapur Durgapur-713209 emal: anta.bue@gmal.com 1 . Chapter 5 Soluton of System of Lnear Equatons

More information

Problem Set 9 Solutions

Problem Set 9 Solutions Desgn and Analyss of Algorthms May 4, 2015 Massachusetts Insttute of Technology 6.046J/18.410J Profs. Erk Demane, Srn Devadas, and Nancy Lynch Problem Set 9 Solutons Problem Set 9 Solutons Ths problem

More information

Lectures - Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix

Lectures - Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix Lectures - Week 4 Matrx norms, Condtonng, Vector Spaces, Lnear Independence, Spannng sets and Bass, Null space and Range of a Matrx Matrx Norms Now we turn to assocatng a number to each matrx. We could

More information

8.592J: Solutions for Assignment 7 Spring 2005

8.592J: Solutions for Assignment 7 Spring 2005 8.59J: Solutons for Assgnment 7 Sprng 5 Problem 1 (a) A flament of length l can be created by addton of a monomer to one of length l 1 (at rate a) or removal of a monomer from a flament of length l + 1

More information

NUMERICAL DIFFERENTIATION

NUMERICAL DIFFERENTIATION NUMERICAL DIFFERENTIATION 1 Introducton Dfferentaton s a method to compute the rate at whch a dependent output y changes wth respect to the change n the ndependent nput x. Ths rate of change s called the

More information

Lossy Compression. Compromise accuracy of reconstruction for increased compression.

Lossy Compression. Compromise accuracy of reconstruction for increased compression. Lossy Compresson Compromse accuracy of reconstructon for ncreased compresson. The reconstructon s usually vsbly ndstngushable from the orgnal mage. Typcally, one can get up to 0:1 compresson wth almost

More information

Generalized Linear Methods

Generalized Linear Methods Generalzed Lnear Methods 1 Introducton In the Ensemble Methods the general dea s that usng a combnaton of several weak learner one could make a better learner. More formally, assume that we have a set

More information

Here is the rationale: If X and y have a strong positive relationship to one another, then ( x x) will tend to be positive when ( y y)

Here is the rationale: If X and y have a strong positive relationship to one another, then ( x x) will tend to be positive when ( y y) Secton 1.5 Correlaton In the prevous sectons, we looked at regresson and the value r was a measurement of how much of the varaton n y can be attrbuted to the lnear relatonshp between y and x. In ths secton,

More information

Notes on Frequency Estimation in Data Streams

Notes on Frequency Estimation in Data Streams Notes on Frequency Estmaton n Data Streams In (one of) the data streamng model(s), the data s a sequence of arrvals a 1, a 2,..., a m of the form a j = (, v) where s the dentty of the tem and belongs to

More information

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4)

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4) I. Classcal Assumptons Econ7 Appled Econometrcs Topc 3: Classcal Model (Studenmund, Chapter 4) We have defned OLS and studed some algebrac propertes of OLS. In ths topc we wll study statstcal propertes

More information

Comparison of Regression Lines

Comparison of Regression Lines STATGRAPHICS Rev. 9/13/2013 Comparson of Regresson Lnes Summary... 1 Data Input... 3 Analyss Summary... 4 Plot of Ftted Model... 6 Condtonal Sums of Squares... 6 Analyss Optons... 7 Forecasts... 8 Confdence

More information

Outline and Reading. Dynamic Programming. Dynamic Programming revealed. Computing Fibonacci. The General Dynamic Programming Technique

Outline and Reading. Dynamic Programming. Dynamic Programming revealed. Computing Fibonacci. The General Dynamic Programming Technique Outlne and Readng Dynamc Programmng The General Technque ( 5.3.2) -1 Knapsac Problem ( 5.3.3) Matrx Chan-Product ( 5.3.1) Dynamc Programmng verson 1.4 1 Dynamc Programmng verson 1.4 2 Dynamc Programmng

More information

A particle in a state of uniform motion remain in that state of motion unless acted upon by external force.

A particle in a state of uniform motion remain in that state of motion unless acted upon by external force. The fundamental prncples of classcal mechancs were lad down by Galleo and Newton n the 16th and 17th centures. In 1686, Newton wrote the Prncpa where he gave us three laws of moton, one law of gravty,

More information

1 GSW Iterative Techniques for y = Ax

1 GSW Iterative Techniques for y = Ax 1 for y = A I m gong to cheat here. here are a lot of teratve technques that can be used to solve the general case of a set of smultaneous equatons (wrtten n the matr form as y = A), but ths chapter sn

More information

Section 8.3 Polar Form of Complex Numbers

Section 8.3 Polar Form of Complex Numbers 80 Chapter 8 Secton 8 Polar Form of Complex Numbers From prevous classes, you may have encountered magnary numbers the square roots of negatve numbers and, more generally, complex numbers whch are the

More information

THE SUMMATION NOTATION Ʃ

THE SUMMATION NOTATION Ʃ Sngle Subscrpt otaton THE SUMMATIO OTATIO Ʃ Most of the calculatons we perform n statstcs are repettve operatons on lsts of numbers. For example, we compute the sum of a set of numbers, or the sum of the

More information

Gaussian Mixture Models

Gaussian Mixture Models Lab Gaussan Mxture Models Lab Objectve: Understand the formulaton of Gaussan Mxture Models (GMMs) and how to estmate GMM parameters. You ve already seen GMMs as the observaton dstrbuton n certan contnuous

More information

Lecture 21: Numerical methods for pricing American type derivatives

Lecture 21: Numerical methods for pricing American type derivatives Lecture 21: Numercal methods for prcng Amercan type dervatves Xaoguang Wang STAT 598W Aprl 10th, 2014 (STAT 598W) Lecture 21 1 / 26 Outlne 1 Fnte Dfference Method Explct Method Penalty Method (STAT 598W)

More information

JAB Chain. Long-tail claims development. ASTIN - September 2005 B.Verdier A. Klinger

JAB Chain. Long-tail claims development. ASTIN - September 2005 B.Verdier A. Klinger JAB Chan Long-tal clams development ASTIN - September 2005 B.Verder A. Klnger Outlne Chan Ladder : comments A frst soluton: Munch Chan Ladder JAB Chan Chan Ladder: Comments Black lne: average pad to ncurred

More information

APPROXIMATE PRICES OF BASKET AND ASIAN OPTIONS DUPONT OLIVIER. Premia 14

APPROXIMATE PRICES OF BASKET AND ASIAN OPTIONS DUPONT OLIVIER. Premia 14 APPROXIMAE PRICES OF BASKE AND ASIAN OPIONS DUPON OLIVIER Prema 14 Contents Introducton 1 1. Framewor 1 1.1. Baset optons 1.. Asan optons. Computng the prce 3. Lower bound 3.1. Closed formula for the prce

More information

Some Comments on Accelerating Convergence of Iterative Sequences Using Direct Inversion of the Iterative Subspace (DIIS)

Some Comments on Accelerating Convergence of Iterative Sequences Using Direct Inversion of the Iterative Subspace (DIIS) Some Comments on Acceleratng Convergence of Iteratve Sequences Usng Drect Inverson of the Iteratve Subspace (DIIS) C. Davd Sherrll School of Chemstry and Bochemstry Georga Insttute of Technology May 1998

More information

Curve Fitting with the Least Square Method

Curve Fitting with the Least Square Method WIKI Document Number 5 Interpolaton wth Least Squares Curve Fttng wth the Least Square Method Mattheu Bultelle Department of Bo-Engneerng Imperal College, London Context We wsh to model the postve feedback

More information

Resource Allocation and Decision Analysis (ECON 8010) Spring 2014 Foundations of Regression Analysis

Resource Allocation and Decision Analysis (ECON 8010) Spring 2014 Foundations of Regression Analysis Resource Allocaton and Decson Analss (ECON 800) Sprng 04 Foundatons of Regresson Analss Readng: Regresson Analss (ECON 800 Coursepak, Page 3) Defntons and Concepts: Regresson Analss statstcal technques

More information

Note 10. Modeling and Simulation of Dynamic Systems

Note 10. Modeling and Simulation of Dynamic Systems Lecture Notes of ME 475: Introducton to Mechatroncs Note 0 Modelng and Smulaton of Dynamc Systems Department of Mechancal Engneerng, Unversty Of Saskatchewan, 57 Campus Drve, Saskatoon, SK S7N 5A9, Canada

More information

Convergence of random processes

Convergence of random processes DS-GA 12 Lecture notes 6 Fall 216 Convergence of random processes 1 Introducton In these notes we study convergence of dscrete random processes. Ths allows to characterze phenomena such as the law of large

More information

Expected Value and Variance

Expected Value and Variance MATH 38 Expected Value and Varance Dr. Neal, WKU We now shall dscuss how to fnd the average and standard devaton of a random varable X. Expected Value Defnton. The expected value (or average value, or

More information

Logistic Regression. CAP 5610: Machine Learning Instructor: Guo-Jun QI

Logistic Regression. CAP 5610: Machine Learning Instructor: Guo-Jun QI Logstc Regresson CAP 561: achne Learnng Instructor: Guo-Jun QI Bayes Classfer: A Generatve model odel the posteror dstrbuton P(Y X) Estmate class-condtonal dstrbuton P(X Y) for each Y Estmate pror dstrbuton

More information

Errors for Linear Systems

Errors for Linear Systems Errors for Lnear Systems When we solve a lnear system Ax b we often do not know A and b exactly, but have only approxmatons  and ˆb avalable. Then the best thng we can do s to solve ˆx ˆb exactly whch

More information

CHAPTER 14 GENERAL PERTURBATION THEORY

CHAPTER 14 GENERAL PERTURBATION THEORY CHAPTER 4 GENERAL PERTURBATION THEORY 4 Introducton A partcle n orbt around a pont mass or a sphercally symmetrc mass dstrbuton s movng n a gravtatonal potental of the form GM / r In ths potental t moves

More information

Using T.O.M to Estimate Parameter of distributions that have not Single Exponential Family

Using T.O.M to Estimate Parameter of distributions that have not Single Exponential Family IOSR Journal of Mathematcs IOSR-JM) ISSN: 2278-5728. Volume 3, Issue 3 Sep-Oct. 202), PP 44-48 www.osrjournals.org Usng T.O.M to Estmate Parameter of dstrbutons that have not Sngle Exponental Famly Jubran

More information

= z 20 z n. (k 20) + 4 z k = 4

= z 20 z n. (k 20) + 4 z k = 4 Problem Set #7 solutons 7.2.. (a Fnd the coeffcent of z k n (z + z 5 + z 6 + z 7 + 5, k 20. We use the known seres expanson ( n+l ( z l l z n below: (z + z 5 + z 6 + z 7 + 5 (z 5 ( + z + z 2 + z + 5 5

More information

Assortment Optimization under MNL

Assortment Optimization under MNL Assortment Optmzaton under MNL Haotan Song Aprl 30, 2017 1 Introducton The assortment optmzaton problem ams to fnd the revenue-maxmzng assortment of products to offer when the prces of products are fxed.

More information

arxiv: v1 [cs.lg] 17 Jan 2019

arxiv: v1 [cs.lg] 17 Jan 2019 LECTURE NOTES arxv:90.05639v [cs.lg] 7 Jan 209 Artfcal Neural Networks B. MEHLIG Department of Physcs Unversty of Gothenburg Göteborg, Sweden 209 PREFACE These are lecture notes for my course on Artfcal

More information

Chapter 7 Channel Capacity and Coding

Chapter 7 Channel Capacity and Coding Wreless Informaton Transmsson System Lab. Chapter 7 Channel Capacty and Codng Insttute of Communcatons Engneerng atonal Sun Yat-sen Unversty Contents 7. Channel models and channel capacty 7.. Channel models

More information

Basically, if you have a dummy dependent variable you will be estimating a probability.

Basically, if you have a dummy dependent variable you will be estimating a probability. ECON 497: Lecture Notes 13 Page 1 of 1 Metropoltan State Unversty ECON 497: Research and Forecastng Lecture Notes 13 Dummy Dependent Varable Technques Studenmund Chapter 13 Bascally, f you have a dummy

More information

ANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U)

ANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U) Econ 413 Exam 13 H ANSWERS Settet er nndelt 9 deloppgaver, A,B,C, som alle anbefales å telle lkt for å gøre det ltt lettere å stå. Svar er gtt . Unfortunately, there s a prntng error n the hnt of

More information

princeton univ. F 13 cos 521: Advanced Algorithm Design Lecture 3: Large deviations bounds and applications Lecturer: Sanjeev Arora

princeton univ. F 13 cos 521: Advanced Algorithm Design Lecture 3: Large deviations bounds and applications Lecturer: Sanjeev Arora prnceton unv. F 13 cos 521: Advanced Algorthm Desgn Lecture 3: Large devatons bounds and applcatons Lecturer: Sanjeev Arora Scrbe: Today s topc s devaton bounds: what s the probablty that a random varable

More information

LECTURE NOTES. Artifical Neural Networks. B. MEHLIG (course home page)

LECTURE NOTES. Artifical Neural Networks. B. MEHLIG (course home page) LECTURE NOTES Artfcal Neural Networks B. MEHLIG (course home page) Department of Physcs Unversty of Gothenburg Göteborg, Sweden 208 PREFACE These are lecture notes for my course on Artfcal Neural Networks

More information

Neural Networks. Perceptrons and Backpropagation. Silke Bussen-Heyen. 5th of Novemeber Universität Bremen Fachbereich 3. Neural Networks 1 / 17

Neural Networks. Perceptrons and Backpropagation. Silke Bussen-Heyen. 5th of Novemeber Universität Bremen Fachbereich 3. Neural Networks 1 / 17 Neural Networks Perceptrons and Backpropagaton Slke Bussen-Heyen Unverstät Bremen Fachberech 3 5th of Novemeber 2012 Neural Networks 1 / 17 Contents 1 Introducton 2 Unts 3 Network structure 4 Snglelayer

More information

Feature Selection: Part 1

Feature Selection: Part 1 CSE 546: Machne Learnng Lecture 5 Feature Selecton: Part 1 Instructor: Sham Kakade 1 Regresson n the hgh dmensonal settng How do we learn when the number of features d s greater than the sample sze n?

More information

Lecture 16 Statistical Analysis in Biomaterials Research (Part II)

Lecture 16 Statistical Analysis in Biomaterials Research (Part II) 3.051J/0.340J 1 Lecture 16 Statstcal Analyss n Bomaterals Research (Part II) C. F Dstrbuton Allows comparson of varablty of behavor between populatons usng test of hypothess: σ x = σ x amed for Brtsh statstcan

More information

Course 395: Machine Learning - Lectures

Course 395: Machine Learning - Lectures Course 395: Machne Learnng - Lectures Lecture 1-2: Concept Learnng (M. Pantc Lecture 3-4: Decson Trees & CC Intro (M. Pantc Lecture 5-6: Artfcal Neural Networks (S.Zaferou Lecture 7-8: Instance ased Learnng

More information

Excess Error, Approximation Error, and Estimation Error

Excess Error, Approximation Error, and Estimation Error E0 370 Statstcal Learnng Theory Lecture 10 Sep 15, 011 Excess Error, Approxaton Error, and Estaton Error Lecturer: Shvan Agarwal Scrbe: Shvan Agarwal 1 Introducton So far, we have consdered the fnte saple

More information

Limited Dependent Variables

Limited Dependent Variables Lmted Dependent Varables. What f the left-hand sde varable s not a contnuous thng spread from mnus nfnty to plus nfnty? That s, gven a model = f (, β, ε, where a. s bounded below at zero, such as wages

More information

Topic 23 - Randomized Complete Block Designs (RCBD)

Topic 23 - Randomized Complete Block Designs (RCBD) Topc 3 ANOVA (III) 3-1 Topc 3 - Randomzed Complete Block Desgns (RCBD) Defn: A Randomzed Complete Block Desgn s a varant of the completely randomzed desgn (CRD) that we recently learned. In ths desgn,

More information

Admin NEURAL NETWORKS. Perceptron learning algorithm. Our Nervous System 10/25/16. Assignment 7. Class 11/22. Schedule for the rest of the semester

Admin NEURAL NETWORKS. Perceptron learning algorithm. Our Nervous System 10/25/16. Assignment 7. Class 11/22. Schedule for the rest of the semester 0/25/6 Admn Assgnment 7 Class /22 Schedule for the rest of the semester NEURAL NETWORKS Davd Kauchak CS58 Fall 206 Perceptron learnng algorthm Our Nervous System repeat untl convergence (or for some #

More information

Multilayer Perceptron (MLP)

Multilayer Perceptron (MLP) Multlayer Perceptron (MLP) Seungjn Cho Department of Computer Scence and Engneerng Pohang Unversty of Scence and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjn@postech.ac.kr 1 / 20 Outlne

More information

CSC321 Tutorial 9: Review of Boltzmann machines and simulated annealing

CSC321 Tutorial 9: Review of Boltzmann machines and simulated annealing CSC321 Tutoral 9: Revew of Boltzmann machnes and smulated annealng (Sldes based on Lecture 16-18 and selected readngs) Yue L Emal: yuel@cs.toronto.edu Wed 11-12 March 19 Fr 10-11 March 21 Outlne Boltzmann

More information

a b a In case b 0, a being divisible by b is the same as to say that

a b a In case b 0, a being divisible by b is the same as to say that Secton 6.2 Dvsblty among the ntegers An nteger a ε s dvsble by b ε f there s an nteger c ε such that a = bc. Note that s dvsble by any nteger b, snce = b. On the other hand, a s dvsble by only f a = :

More information

MMA and GCMMA two methods for nonlinear optimization

MMA and GCMMA two methods for nonlinear optimization MMA and GCMMA two methods for nonlnear optmzaton Krster Svanberg Optmzaton and Systems Theory, KTH, Stockholm, Sweden. krlle@math.kth.se Ths note descrbes the algorthms used n the author s 2007 mplementatons

More information

EEE 241: Linear Systems

EEE 241: Linear Systems EEE : Lnear Systems Summary #: Backpropagaton BACKPROPAGATION The perceptron rule as well as the Wdrow Hoff learnng were desgned to tran sngle layer networks. They suffer from the same dsadvantage: they

More information

Numerical Heat and Mass Transfer

Numerical Heat and Mass Transfer Master degree n Mechancal Engneerng Numercal Heat and Mass Transfer 06-Fnte-Dfference Method (One-dmensonal, steady state heat conducton) Fausto Arpno f.arpno@uncas.t Introducton Why we use models and

More information

Linear Approximation with Regularization and Moving Least Squares

Linear Approximation with Regularization and Moving Least Squares Lnear Approxmaton wth Regularzaton and Movng Least Squares Igor Grešovn May 007 Revson 4.6 (Revson : March 004). 5 4 3 0.5 3 3.5 4 Contents: Lnear Fttng...4. Weghted Least Squares n Functon Approxmaton...

More information

Expectation Maximization Mixture Models HMMs

Expectation Maximization Mixture Models HMMs -755 Machne Learnng for Sgnal Processng Mture Models HMMs Class 9. 2 Sep 200 Learnng Dstrbutons for Data Problem: Gven a collecton of eamples from some data, estmate ts dstrbuton Basc deas of Mamum Lelhood

More information

Positive feedback Derivative feedback Pos. + der. feedback c Stronger. E I e f g Time (s)

Positive feedback Derivative feedback Pos. + der. feedback c Stronger. E I e f g Time (s) Frng rate (Hz) a f nput current Transent nput Step-lke nput Postve feedback Dervatve feedback Pos. + der. feedback b c Stronger d e f g Frng rate (Hz) h j Frng rate (Hz) 5 5 3 4 5 3 4 5 Tme (s) 6 4 5 3

More information

Statistical Inference. 2.3 Summary Statistics Measures of Center and Spread. parameters ( population characteristics )

Statistical Inference. 2.3 Summary Statistics Measures of Center and Spread. parameters ( population characteristics ) Ismor Fscher, 8//008 Stat 54 / -8.3 Summary Statstcs Measures of Center and Spread Dstrbuton of dscrete contnuous POPULATION Random Varable, numercal True center =??? True spread =???? parameters ( populaton

More information