The Hopfield model. 1 The Hebbian paradigm. Sebastian Seung Lecture 15: November 7, 2002
|
|
- Patience Greene
- 6 years ago
- Views:
Transcription
1 MIT Department of Bran and Cogntve Scences 9.29J, Sprng Introducton to Computatonal euroscence Instructor: Professor Sebastan Seung The Hopfeld model Sebastan Seung 9.64 Lecture 5: ovember 7, 2002 The Hebban paradgm In hs 949 book The Organzaton of Behavor, Donald Hebb predcted a form of synaptc plastcty wth the followng property When an axon of cell A s near enough to excte a cell B and repeatedly or persstently takes part n frng t, some growth process or metabolc change takes place n one or both cells such that A s effcency, as one of the cells frng B, s ncreased. In 973, the phenomenon of long-term potentaton (LTP) was dscovered. The study of LTP s now a small ndustry wthn the feld of neuroscence. Among the varous forms of LTP, those that depend on the MDA subtype of glutamate receptor are regarded as Hebban. These forms depend on temporal contguty of presynaptc and postsynaptc actvty. The requrement of temporal contguty s expressed n the followng dtty eurons that fre together, wre together. eurons that fre out of sync, fal to lnk. Hebban synaptc plastcty s remarkable for havng been predcted on theoretcal grounds well before t was dscovered expermentally. Ths s not a common occurrence n bology. How dd Hebb make hs remarkable predcton? He was motvated by an old tradton n Western thought known as assocatonsm. Ths s the dea that the bran s nothng more than an engne for storng and retrevng assocatons. In the late 9th century, t was found that the bran conssted of neurons connected by an ntrcate web of synapses. Ths transformed the older assocatonst tradton nto connectonsm, the doctrne that assocatons are stored as synaptc connectons. Ths s an example of the dea that structure determnes functon n bology. Hume and other emprcst phlosophers had already expressed the dea that assocatons were learned from temporal contguty. It only stood to reason that connectons should also be learned from temporal contguty. Whle these deas are very suggestve, they are admttedly rather vague and metaphorcal. The challenge of connectonsm s to make these deas precse. A number of theorsts have formulated neural network models wth the goal of explanng how Hebban synaptc plastcty could be used to store memores.
2 2 Bnary model neurons For the most part, we have studed neural network models n whch the actvty of each neuron s descrbed by a sngle analog varable. A more drastc smplfcaton s to use bnary neurons, whch are ether actve (s = ) or nactve (s = ). In such a dynamcs, each neuron updates tself accordng to the rule s = sgn W s, () where s s the state of neuron after the update. ote that sometmes s = s, n whch case the update results n no change. If the argument of the sgn functon happens to be zero, there s an ambguty to the defnton, whch could be resolved by a con flp or arbtrarly defnng sgn(0) =. Equaton () can be used to defne several types of network dynamcs, dependng on the exact manner n whch t s appled. In a sequental dynamcs, the neurons are updated one at a tme, typcally n a random order. In a parallel dynamcs, Eq. () s appled to all neurons smultaneously. It s often easer to prove theorems about the sequental case, but the parallel case s sometmes easer to mplement (for example n MATLAB t s more natural). In a smulaton on a dgtal computer, t s natural to make the updates at dscrete tmes. However, the update tmes could n prncple be contnuous. For example, they could occur at random for each neuron accordng to a Posson process wth some mean rate. As we shall see, all these verson of network dynamcs behave qualtatvely the same n general, but there can be subtle dfferences. 3 Hopfeld model Suppose that synapses change accordng to the learnng rule W = ηs s where η > 0 s a learnng rate. Ths could be called Hebban, although t has some werd features. The synapse s potentated f both neurons are actve, or both are nactve. The synapse s depressed f one neuron s actve, and the other s nactve. Suppose further that the network s exposed to P bnary patterns ξ,..., ξ P durng a tranng phase. More specfcally, the state s s set to each pattern n turn, and the synaptc weghts are changed by the Hebban update. If W = 0 at the begnnng of the tranng phase, Hebban learnng wll result n W = ξ ξ The prefactor / corresponds to a partcular choce of learnng rate η = /. It s a convenent choce for the calculatons we are about to perform, but s otherwse arb (2) 2
3 trary, as W can be multpled by any postve prefactor wthout affectng the dynamcs (). Ths synaptc weght matrx s the famous Hopfeld model, along wth the dynamcs (), and the assumpton that the patterns are chosen at random. Ths last assumpton makes sense f we assume that there s a data compresson stage that encodes sensory data effcently before t reaches the Hopfeld network. In hs orgnal paper, Hopfeld set the dagonal terms of the weght matrx to be zero (W = 0). If ths s not done, the dynamcs takes the form = sgn P s s + W s,,= The network stll bascally works, but there are some subtle dfferences, whch we wll dscuss later. As we shall see n the followng, the Hopfeld model functons as an assocatve memory, because the patterns are stored as dynamcal attractors. If the network s ntalzed wth a corrupted or ncomplete verson of a pattern, convergence to an attractor can recall the correct pattern. 4 Sngle pattern Let s start wth a smple case, a sngle pattern W = ξ ξ (3) The dynamcs takes the form s ξ ξ = sgn s In terms of the overlap m = ξ between ξ and s, we can wrte s = sgn (ξ m) Snce m = when s = ξ, t s a steady state of the dynamcs. Smlarly, m = when s = ξ, so t s also a steady state of the dynamcs. If m > 0, updates can only cause m to ncrease. Lkewse f m < 0, updates can only cause m to decrease. Therefore these steady states are attractors of the dynamcs. s 3
4 5 Many patterns The case of many patterns s more nterestng. The weght matrx (2) s ust the superposton of sngle pattern outer products (3). The danger here s that the patterns mght nterfere wth each other. If the patterns were exactly orthogonal to each other, there would be no nterference. If they are chosen randomly, there s some possblty of crosstalk, whch s quantfed below. To check whether the patterns are steady states, we calculate ξ W ξ ξ ξ = (4),=,= ξ = ξ + ξ ξ (5),=,= ξ ξ ξ = + ξ (6) ξ,=,= The last term s called the crosstalk or nterference term. If t s greater than for all, then the th pattern s a steady state ξ = sgn W ξ If we try to store too many patterns, then the nterference between them becomes large, and storage s not possble. To study the stablty of a pattern, magne that the network s ntalzed at the pattern, and try to estmate how many bt flps take place. We can approxmate the crosstalk term as the sum of p ndependent con flps. There s an error when the sum s more negatve than. The dervaton n the book uses the Gaussan approxmaton to the bnomal dstrbuton. We wll do somethng coarser, whch s the Hoeffdng bound on the devaton between frequency and probablty. Accordng to the one-sded Hoeffdng bound for m con tosses P rob[ˆp < p ] < exp( 2 2 m) Here we are nterested n devatons of the frequency of heads from 2 by more than 2p. We fnd that P error < exp( 2p(/2p) 2 ) = exp 2p There are several defntons of capacty. (fracton of bts are corrupted, P error < 0.0) Suppose that we would lke less than one percent of the bts corrupted. Then we need exp( /(2p)) < 0.0, or p < /(2 log(0.0)) = 0.. The problem wth ths calculaton s that the flppng of the frst bts could cause an avalanche. In realty, the capacty s about p = 0.4. Calculatng ths number requres some heavy mathematcs, lke the replca trck. 4
5 (fracton of patterns are corrupted, P error < 0.0/ ) We could mpose a more strngent requrement, whch s than less than one percent of the patterns have a bt corrupted. p = 2 log (fracton of samples are corrupted, P error < 0.0/(p ). The most strngent requrement s that no pattern n the sample s corrupted, wth confdence greater than 99%. Takng log p log, we obtan p = 4 log Is ths good? Accordng to the Cover argument, the maxmum should be p = 2. We can store patterns as attractors of the network dynamcs (wth some corrupton). However, there can be other attractors, or spurous states. These are of three types. There are reversed states ξ due to the ± symmetry of the network dynamcs. There are mxture states, whch are a superposton of an odd number of patterns. For example, ξ mx = sgn(±ξ ± ξ 2 ± ξ 3 ) An even number won t work, because the sum works out to zero on some stes. There are spn glass states for large p, whch are not correlated wth any fnte number of the patterns ξ. 6 Energy functon If the nteractons W are symmetrc, then E = W s s 2 s nonncreasng under the dynamcs (), assumng asynchronous update. To prove ths, note that the = terms n the sum are unchangng, snce s 2 =. Therefore the change n E due to the update () s gven by E = 2 (s s ) W s (7),= = 2 (s s ) W s + (s s )W s (8) 2 If s = s, then E = 0 and we are done. In the other case, s = s, and we can wrte E = s W s W s 2 0 5
6 Therefore the dynamcs can be understood as descent on an energy landscape. Statstcal physcsts lke bnary neurons because they are analogous to magnetc spns. The synaptc nteracton W can be compared to a magnetc nteracton. If W > 0, the spns want to lne up n the same drecton, as n a ferromagnet. If W < 0, the nteracton s called antferromagnetc. 7 Energy functon Let s consder the case W = for all and. Ths corresponds to a pure ferromagnet, n whch all nteractons try to make the spns lne up n the same drecton. In ths case, ( E = 2 Obvously there are two mnma, the fully magnetzed states s = for all and s = for all. Any ntal condton wth more up spns than down spns wll converge to the all up state, and a smlar statement can be made about the down state. Suppose that we would lke to store a sngle pattern ξ as an attractor of the dynamcs. Ths s bascally the same as the prevous case. If we choose then the energy functon s W = ξ ξ s 2 E = 2 ξ ξ s s (9) ( = ξ s (0) 2 The attractors of ths dynamcs are s = ξ and s = ξ, as requred. In statstcal mechancs, ths s known as the Matts model. ote that the energy functon for ths network s ( E = s ξ 2 It s plausble that the patterns should be local mnma, but they mght nterfere wth each other. 2 8 Content-addressable memory The nput s encoded n the ntal condton of the network dynamcs. Convergence to an attractor corresponds to recall of the closest stored memory. 6
Hopfield Training Rules 1 N
Hopfeld Tranng Rules To memorse a sngle pattern Suppose e set the eghts thus - = p p here, s the eght beteen nodes & s the number of nodes n the netor p s the value requred for the -th node What ll the
More information1 Derivation of Rate Equations from Single-Cell Conductance (Hodgkin-Huxley-like) Equations
Physcs 171/271 -Davd Klenfeld - Fall 2005 (revsed Wnter 2011) 1 Dervaton of Rate Equatons from Sngle-Cell Conductance (Hodgkn-Huxley-lke) Equatons We consder a network of many neurons, each of whch obeys
More information9 Derivation of Rate Equations from Single-Cell Conductance (Hodgkin-Huxley-like) Equations
Physcs 171/271 - Chapter 9R -Davd Klenfeld - Fall 2005 9 Dervaton of Rate Equatons from Sngle-Cell Conductance (Hodgkn-Huxley-lke) Equatons We consder a network of many neurons, each of whch obeys a set
More information8 Derivation of Network Rate Equations from Single- Cell Conductance Equations
Physcs 178/278 - Davd Klenfeld - Wnter 2015 8 Dervaton of Network Rate Equatons from Sngle- Cell Conductance Equatons We consder a network of many neurons, each of whch obeys a set of conductancebased,
More informationUnsupervised Learning
Unsupervsed Learnng Kevn Swngler What s Unsupervsed Learnng? Most smply, t can be thought of as learnng to recognse and recall thngs Recognton I ve seen that before Recall I ve seen that before and I can
More informationModel of Neurons. CS 416 Artificial Intelligence. Early History of Neural Nets. Cybernetics. McCulloch-Pitts Neurons. Hebbian Modification.
Page 1 Model of Neurons CS 416 Artfcal Intellgence Lecture 18 Neural Nets Chapter 20 Multple nputs/dendrtes (~10,000!!!) Cell body/soma performs computaton Sngle output/axon Computaton s typcally modeled
More information8 Derivation of Network Rate Equations from Single- Cell Conductance Equations
Physcs 178/278 - Davd Klenfeld - Wnter 2019 8 Dervaton of Network Rate Equatons from Sngle- Cell Conductance Equatons Our goal to derve the form of the abstract quanttes n rate equatons, such as synaptc
More informationCHAPTER III Neural Networks as Associative Memory
CHAPTER III Neural Networs as Assocatve Memory Introducton One of the prmary functons of the bran s assocatve memory. We assocate the faces wth names, letters wth sounds, or we can recognze the people
More informationIntroduction to Vapor/Liquid Equilibrium, part 2. Raoult s Law:
CE304, Sprng 2004 Lecture 4 Introducton to Vapor/Lqud Equlbrum, part 2 Raoult s Law: The smplest model that allows us do VLE calculatons s obtaned when we assume that the vapor phase s an deal gas, and
More informationModule 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur
Module 3 LOSSY IMAGE COMPRESSION SYSTEMS Verson ECE IIT, Kharagpur Lesson 6 Theory of Quantzaton Verson ECE IIT, Kharagpur Instructonal Objectves At the end of ths lesson, the students should be able to:
More informationWeek 5: Neural Networks
Week 5: Neural Networks Instructor: Sergey Levne Neural Networks Summary In the prevous lecture, we saw how we can construct neural networks by extendng logstc regresson. Neural networks consst of multple
More information1 Convex Optimization
Convex Optmzaton We wll consder convex optmzaton problems. Namely, mnmzaton problems where the objectve s convex (we assume no constrants for now). Such problems often arse n machne learnng. For example,
More informationFor now, let us focus on a specific model of neurons. These are simplified from reality but can achieve remarkable results.
Neural Networks : Dervaton compled by Alvn Wan from Professor Jtendra Malk s lecture Ths type of computaton s called deep learnng and s the most popular method for many problems, such as computer vson
More informationHomework Assignment 3 Due in class, Thursday October 15
Homework Assgnment 3 Due n class, Thursday October 15 SDS 383C Statstcal Modelng I 1 Rdge regresson and Lasso 1. Get the Prostrate cancer data from http://statweb.stanford.edu/~tbs/elemstatlearn/ datasets/prostate.data.
More information} Often, when learning, we deal with uncertainty:
Uncertanty and Learnng } Often, when learnng, we deal wth uncertanty: } Incomplete data sets, wth mssng nformaton } Nosy data sets, wth unrelable nformaton } Stochastcty: causes and effects related non-determnstcally
More informationInternet Engineering. Jacek Mazurkiewicz, PhD Softcomputing. Part 3: Recurrent Artificial Neural Networks Self-Organising Artificial Neural Networks
Internet Engneerng Jacek Mazurkewcz, PhD Softcomputng Part 3: Recurrent Artfcal Neural Networks Self-Organsng Artfcal Neural Networks Recurrent Artfcal Neural Networks Feedback sgnals between neurons Dynamc
More informationj) = 1 (note sigma notation) ii. Continuous random variable (e.g. Normal distribution) 1. density function: f ( x) 0 and f ( x) dx = 1
Random varables Measure of central tendences and varablty (means and varances) Jont densty functons and ndependence Measures of assocaton (covarance and correlaton) Interestng result Condtonal dstrbutons
More informationLinear Feature Engineering 11
Lnear Feature Engneerng 11 2 Least-Squares 2.1 Smple least-squares Consder the followng dataset. We have a bunch of nputs x and correspondng outputs y. The partcular values n ths dataset are x y 0.23 0.19
More informationLecture Notes on Linear Regression
Lecture Notes on Lnear Regresson Feng L fl@sdueducn Shandong Unversty, Chna Lnear Regresson Problem In regresson problem, we am at predct a contnuous target value gven an nput feature vector We assume
More informationLecture 3: Shannon s Theorem
CSE 533: Error-Correctng Codes (Autumn 006 Lecture 3: Shannon s Theorem October 9, 006 Lecturer: Venkatesan Guruswam Scrbe: Wdad Machmouch 1 Communcaton Model The communcaton model we are usng conssts
More informationChapter 9: Statistical Inference and the Relationship between Two Variables
Chapter 9: Statstcal Inference and the Relatonshp between Two Varables Key Words The Regresson Model The Sample Regresson Equaton The Pearson Correlaton Coeffcent Learnng Outcomes After studyng ths chapter,
More information6. Stochastic processes (2)
Contents Markov processes Brth-death processes Lect6.ppt S-38.45 - Introducton to Teletraffc Theory Sprng 5 Markov process Consder a contnuous-tme and dscrete-state stochastc process X(t) wth state space
More informationTemperature. Chapter Heat Engine
Chapter 3 Temperature In prevous chapters of these notes we ntroduced the Prncple of Maxmum ntropy as a technque for estmatng probablty dstrbutons consstent wth constrants. In Chapter 9 we dscussed the
More information6. Stochastic processes (2)
6. Stochastc processes () Lect6.ppt S-38.45 - Introducton to Teletraffc Theory Sprng 5 6. Stochastc processes () Contents Markov processes Brth-death processes 6. Stochastc processes () Markov process
More information3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X
Statstcs 1: Probablty Theory II 37 3 EPECTATION OF SEVERAL RANDOM VARIABLES As n Probablty Theory I, the nterest n most stuatons les not on the actual dstrbuton of a random vector, but rather on a number
More informationLecture 7: Boltzmann distribution & Thermodynamics of mixing
Prof. Tbbtt Lecture 7 etworks & Gels Lecture 7: Boltzmann dstrbuton & Thermodynamcs of mxng 1 Suggested readng Prof. Mark W. Tbbtt ETH Zürch 13 März 018 Molecular Drvng Forces Dll and Bromberg: Chapters
More informationMLE and Bayesian Estimation. Jie Tang Department of Computer Science & Technology Tsinghua University 2012
MLE and Bayesan Estmaton Je Tang Department of Computer Scence & Technology Tsnghua Unversty 01 1 Lnear Regresson? As the frst step, we need to decde how we re gong to represent the functon f. One example:
More informationLecture 12: Discrete Laplacian
Lecture 12: Dscrete Laplacan Scrbe: Tanye Lu Our goal s to come up wth a dscrete verson of Laplacan operator for trangulated surfaces, so that we can use t n practce to solve related problems We are mostly
More information10-701/ Machine Learning, Fall 2005 Homework 3
10-701/15-781 Machne Learnng, Fall 2005 Homework 3 Out: 10/20/05 Due: begnnng of the class 11/01/05 Instructons Contact questons-10701@autonlaborg for queston Problem 1 Regresson and Cross-valdaton [40
More informationNegative Binomial Regression
STATGRAPHICS Rev. 9/16/2013 Negatve Bnomal Regresson Summary... 1 Data Input... 3 Statstcal Model... 3 Analyss Summary... 4 Analyss Optons... 7 Plot of Ftted Model... 8 Observed Versus Predcted... 10 Predctons...
More informationChapter 1. Probability
Chapter. Probablty Mcroscopc propertes of matter: quantum mechancs, atomc and molecular propertes Macroscopc propertes of matter: thermodynamcs, E, H, C V, C p, S, A, G How do we relate these two propertes?
More informationModule 9. Lecture 6. Duality in Assignment Problems
Module 9 1 Lecture 6 Dualty n Assgnment Problems In ths lecture we attempt to answer few other mportant questons posed n earler lecture for (AP) and see how some of them can be explaned through the concept
More informationLecture 23: Artificial neural networks
Lecture 23: Artfcal neural networks Broad feld that has developed over the past 20 to 30 years Confluence of statstcal mechancs, appled math, bology and computers Orgnal motvaton: mathematcal modelng of
More informationLecture 10 Support Vector Machines II
Lecture 10 Support Vector Machnes II 22 February 2016 Taylor B. Arnold Yale Statstcs STAT 365/665 1/28 Notes: Problem 3 s posted and due ths upcomng Frday There was an early bug n the fake-test data; fxed
More informationLecture 4: Universal Hash Functions/Streaming Cont d
CSE 5: Desgn and Analyss of Algorthms I Sprng 06 Lecture 4: Unversal Hash Functons/Streamng Cont d Lecturer: Shayan Oves Gharan Aprl 6th Scrbe: Jacob Schreber Dsclamer: These notes have not been subjected
More informationLecture Space-Bounded Derandomization
Notes on Complexty Theory Last updated: October, 2008 Jonathan Katz Lecture Space-Bounded Derandomzaton 1 Space-Bounded Derandomzaton We now dscuss derandomzaton of space-bounded algorthms. Here non-trval
More informationHopfield networks and Boltzmann machines. Geoffrey Hinton et al. Presented by Tambet Matiisen
Hopfeld networks and Boltzmann machnes Geoffrey Hnton et al. Presented by Tambet Matsen 18.11.2014 Hopfeld network Bnary unts Symmetrcal connectons http://www.nnwj.de/hopfeld-net.html Energy functon The
More informationA how to guide to second quantization method.
Phys. 67 (Graduate Quantum Mechancs Sprng 2009 Prof. Pu K. Lam. Verson 3 (4/3/2009 A how to gude to second quantzaton method. -> Second quantzaton s a mathematcal notaton desgned to handle dentcal partcle
More informationMASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 12 10/21/2013. Martingale Concentration Inequalities and Applications
MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.65/15.070J Fall 013 Lecture 1 10/1/013 Martngale Concentraton Inequaltes and Applcatons Content. 1. Exponental concentraton for martngales wth bounded ncrements.
More informationAdditional Codes using Finite Difference Method. 1 HJB Equation for Consumption-Saving Problem Without Uncertainty
Addtonal Codes usng Fnte Dfference Method Benamn Moll 1 HJB Equaton for Consumpton-Savng Problem Wthout Uncertanty Before consderng the case wth stochastc ncome n http://www.prnceton.edu/~moll/ HACTproect/HACT_Numercal_Appendx.pdf,
More informationLecture 3: Probability Distributions
Lecture 3: Probablty Dstrbutons Random Varables Let us begn by defnng a sample space as a set of outcomes from an experment. We denote ths by S. A random varable s a functon whch maps outcomes nto the
More informationSupporting Information
Supportng Informaton The neural network f n Eq. 1 s gven by: f x l = ReLU W atom x l + b atom, 2 where ReLU s the element-wse rectfed lnear unt, 21.e., ReLUx = max0, x, W atom R d d s the weght matrx to
More informationTransfer Functions. Convenient representation of a linear, dynamic model. A transfer function (TF) relates one input and one output: ( ) system
Transfer Functons Convenent representaton of a lnear, dynamc model. A transfer functon (TF) relates one nput and one output: x t X s y t system Y s The followng termnology s used: x y nput output forcng
More information2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification
E395 - Pattern Recognton Solutons to Introducton to Pattern Recognton, Chapter : Bayesan pattern classfcaton Preface Ths document s a soluton manual for selected exercses from Introducton to Pattern Recognton
More informationChapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems
Numercal Analyss by Dr. Anta Pal Assstant Professor Department of Mathematcs Natonal Insttute of Technology Durgapur Durgapur-713209 emal: anta.bue@gmal.com 1 . Chapter 5 Soluton of System of Lnear Equatons
More informationProblem Set 9 Solutions
Desgn and Analyss of Algorthms May 4, 2015 Massachusetts Insttute of Technology 6.046J/18.410J Profs. Erk Demane, Srn Devadas, and Nancy Lynch Problem Set 9 Solutons Problem Set 9 Solutons Ths problem
More informationLectures - Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix
Lectures - Week 4 Matrx norms, Condtonng, Vector Spaces, Lnear Independence, Spannng sets and Bass, Null space and Range of a Matrx Matrx Norms Now we turn to assocatng a number to each matrx. We could
More information8.592J: Solutions for Assignment 7 Spring 2005
8.59J: Solutons for Assgnment 7 Sprng 5 Problem 1 (a) A flament of length l can be created by addton of a monomer to one of length l 1 (at rate a) or removal of a monomer from a flament of length l + 1
More informationNUMERICAL DIFFERENTIATION
NUMERICAL DIFFERENTIATION 1 Introducton Dfferentaton s a method to compute the rate at whch a dependent output y changes wth respect to the change n the ndependent nput x. Ths rate of change s called the
More informationLossy Compression. Compromise accuracy of reconstruction for increased compression.
Lossy Compresson Compromse accuracy of reconstructon for ncreased compresson. The reconstructon s usually vsbly ndstngushable from the orgnal mage. Typcally, one can get up to 0:1 compresson wth almost
More informationGeneralized Linear Methods
Generalzed Lnear Methods 1 Introducton In the Ensemble Methods the general dea s that usng a combnaton of several weak learner one could make a better learner. More formally, assume that we have a set
More informationHere is the rationale: If X and y have a strong positive relationship to one another, then ( x x) will tend to be positive when ( y y)
Secton 1.5 Correlaton In the prevous sectons, we looked at regresson and the value r was a measurement of how much of the varaton n y can be attrbuted to the lnear relatonshp between y and x. In ths secton,
More informationNotes on Frequency Estimation in Data Streams
Notes on Frequency Estmaton n Data Streams In (one of) the data streamng model(s), the data s a sequence of arrvals a 1, a 2,..., a m of the form a j = (, v) where s the dentty of the tem and belongs to
More informationEcon107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4)
I. Classcal Assumptons Econ7 Appled Econometrcs Topc 3: Classcal Model (Studenmund, Chapter 4) We have defned OLS and studed some algebrac propertes of OLS. In ths topc we wll study statstcal propertes
More informationComparison of Regression Lines
STATGRAPHICS Rev. 9/13/2013 Comparson of Regresson Lnes Summary... 1 Data Input... 3 Analyss Summary... 4 Plot of Ftted Model... 6 Condtonal Sums of Squares... 6 Analyss Optons... 7 Forecasts... 8 Confdence
More informationOutline and Reading. Dynamic Programming. Dynamic Programming revealed. Computing Fibonacci. The General Dynamic Programming Technique
Outlne and Readng Dynamc Programmng The General Technque ( 5.3.2) -1 Knapsac Problem ( 5.3.3) Matrx Chan-Product ( 5.3.1) Dynamc Programmng verson 1.4 1 Dynamc Programmng verson 1.4 2 Dynamc Programmng
More informationA particle in a state of uniform motion remain in that state of motion unless acted upon by external force.
The fundamental prncples of classcal mechancs were lad down by Galleo and Newton n the 16th and 17th centures. In 1686, Newton wrote the Prncpa where he gave us three laws of moton, one law of gravty,
More information1 GSW Iterative Techniques for y = Ax
1 for y = A I m gong to cheat here. here are a lot of teratve technques that can be used to solve the general case of a set of smultaneous equatons (wrtten n the matr form as y = A), but ths chapter sn
More informationSection 8.3 Polar Form of Complex Numbers
80 Chapter 8 Secton 8 Polar Form of Complex Numbers From prevous classes, you may have encountered magnary numbers the square roots of negatve numbers and, more generally, complex numbers whch are the
More informationTHE SUMMATION NOTATION Ʃ
Sngle Subscrpt otaton THE SUMMATIO OTATIO Ʃ Most of the calculatons we perform n statstcs are repettve operatons on lsts of numbers. For example, we compute the sum of a set of numbers, or the sum of the
More informationGaussian Mixture Models
Lab Gaussan Mxture Models Lab Objectve: Understand the formulaton of Gaussan Mxture Models (GMMs) and how to estmate GMM parameters. You ve already seen GMMs as the observaton dstrbuton n certan contnuous
More informationLecture 21: Numerical methods for pricing American type derivatives
Lecture 21: Numercal methods for prcng Amercan type dervatves Xaoguang Wang STAT 598W Aprl 10th, 2014 (STAT 598W) Lecture 21 1 / 26 Outlne 1 Fnte Dfference Method Explct Method Penalty Method (STAT 598W)
More informationJAB Chain. Long-tail claims development. ASTIN - September 2005 B.Verdier A. Klinger
JAB Chan Long-tal clams development ASTIN - September 2005 B.Verder A. Klnger Outlne Chan Ladder : comments A frst soluton: Munch Chan Ladder JAB Chan Chan Ladder: Comments Black lne: average pad to ncurred
More informationAPPROXIMATE PRICES OF BASKET AND ASIAN OPTIONS DUPONT OLIVIER. Premia 14
APPROXIMAE PRICES OF BASKE AND ASIAN OPIONS DUPON OLIVIER Prema 14 Contents Introducton 1 1. Framewor 1 1.1. Baset optons 1.. Asan optons. Computng the prce 3. Lower bound 3.1. Closed formula for the prce
More informationSome Comments on Accelerating Convergence of Iterative Sequences Using Direct Inversion of the Iterative Subspace (DIIS)
Some Comments on Acceleratng Convergence of Iteratve Sequences Usng Drect Inverson of the Iteratve Subspace (DIIS) C. Davd Sherrll School of Chemstry and Bochemstry Georga Insttute of Technology May 1998
More informationCurve Fitting with the Least Square Method
WIKI Document Number 5 Interpolaton wth Least Squares Curve Fttng wth the Least Square Method Mattheu Bultelle Department of Bo-Engneerng Imperal College, London Context We wsh to model the postve feedback
More informationResource Allocation and Decision Analysis (ECON 8010) Spring 2014 Foundations of Regression Analysis
Resource Allocaton and Decson Analss (ECON 800) Sprng 04 Foundatons of Regresson Analss Readng: Regresson Analss (ECON 800 Coursepak, Page 3) Defntons and Concepts: Regresson Analss statstcal technques
More informationNote 10. Modeling and Simulation of Dynamic Systems
Lecture Notes of ME 475: Introducton to Mechatroncs Note 0 Modelng and Smulaton of Dynamc Systems Department of Mechancal Engneerng, Unversty Of Saskatchewan, 57 Campus Drve, Saskatoon, SK S7N 5A9, Canada
More informationConvergence of random processes
DS-GA 12 Lecture notes 6 Fall 216 Convergence of random processes 1 Introducton In these notes we study convergence of dscrete random processes. Ths allows to characterze phenomena such as the law of large
More informationExpected Value and Variance
MATH 38 Expected Value and Varance Dr. Neal, WKU We now shall dscuss how to fnd the average and standard devaton of a random varable X. Expected Value Defnton. The expected value (or average value, or
More informationLogistic Regression. CAP 5610: Machine Learning Instructor: Guo-Jun QI
Logstc Regresson CAP 561: achne Learnng Instructor: Guo-Jun QI Bayes Classfer: A Generatve model odel the posteror dstrbuton P(Y X) Estmate class-condtonal dstrbuton P(X Y) for each Y Estmate pror dstrbuton
More informationErrors for Linear Systems
Errors for Lnear Systems When we solve a lnear system Ax b we often do not know A and b exactly, but have only approxmatons  and ˆb avalable. Then the best thng we can do s to solve ˆx ˆb exactly whch
More informationCHAPTER 14 GENERAL PERTURBATION THEORY
CHAPTER 4 GENERAL PERTURBATION THEORY 4 Introducton A partcle n orbt around a pont mass or a sphercally symmetrc mass dstrbuton s movng n a gravtatonal potental of the form GM / r In ths potental t moves
More informationUsing T.O.M to Estimate Parameter of distributions that have not Single Exponential Family
IOSR Journal of Mathematcs IOSR-JM) ISSN: 2278-5728. Volume 3, Issue 3 Sep-Oct. 202), PP 44-48 www.osrjournals.org Usng T.O.M to Estmate Parameter of dstrbutons that have not Sngle Exponental Famly Jubran
More information= z 20 z n. (k 20) + 4 z k = 4
Problem Set #7 solutons 7.2.. (a Fnd the coeffcent of z k n (z + z 5 + z 6 + z 7 + 5, k 20. We use the known seres expanson ( n+l ( z l l z n below: (z + z 5 + z 6 + z 7 + 5 (z 5 ( + z + z 2 + z + 5 5
More informationAssortment Optimization under MNL
Assortment Optmzaton under MNL Haotan Song Aprl 30, 2017 1 Introducton The assortment optmzaton problem ams to fnd the revenue-maxmzng assortment of products to offer when the prces of products are fxed.
More informationarxiv: v1 [cs.lg] 17 Jan 2019
LECTURE NOTES arxv:90.05639v [cs.lg] 7 Jan 209 Artfcal Neural Networks B. MEHLIG Department of Physcs Unversty of Gothenburg Göteborg, Sweden 209 PREFACE These are lecture notes for my course on Artfcal
More informationChapter 7 Channel Capacity and Coding
Wreless Informaton Transmsson System Lab. Chapter 7 Channel Capacty and Codng Insttute of Communcatons Engneerng atonal Sun Yat-sen Unversty Contents 7. Channel models and channel capacty 7.. Channel models
More informationBasically, if you have a dummy dependent variable you will be estimating a probability.
ECON 497: Lecture Notes 13 Page 1 of 1 Metropoltan State Unversty ECON 497: Research and Forecastng Lecture Notes 13 Dummy Dependent Varable Technques Studenmund Chapter 13 Bascally, f you have a dummy
More informationANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U)
Econ 413 Exam 13 H ANSWERS Settet er nndelt 9 deloppgaver, A,B,C, som alle anbefales å telle lkt for å gøre det ltt lettere å stå. Svar er gtt . Unfortunately, there s a prntng error n the hnt of
More informationprinceton univ. F 13 cos 521: Advanced Algorithm Design Lecture 3: Large deviations bounds and applications Lecturer: Sanjeev Arora
prnceton unv. F 13 cos 521: Advanced Algorthm Desgn Lecture 3: Large devatons bounds and applcatons Lecturer: Sanjeev Arora Scrbe: Today s topc s devaton bounds: what s the probablty that a random varable
More informationLECTURE NOTES. Artifical Neural Networks. B. MEHLIG (course home page)
LECTURE NOTES Artfcal Neural Networks B. MEHLIG (course home page) Department of Physcs Unversty of Gothenburg Göteborg, Sweden 208 PREFACE These are lecture notes for my course on Artfcal Neural Networks
More informationNeural Networks. Perceptrons and Backpropagation. Silke Bussen-Heyen. 5th of Novemeber Universität Bremen Fachbereich 3. Neural Networks 1 / 17
Neural Networks Perceptrons and Backpropagaton Slke Bussen-Heyen Unverstät Bremen Fachberech 3 5th of Novemeber 2012 Neural Networks 1 / 17 Contents 1 Introducton 2 Unts 3 Network structure 4 Snglelayer
More informationFeature Selection: Part 1
CSE 546: Machne Learnng Lecture 5 Feature Selecton: Part 1 Instructor: Sham Kakade 1 Regresson n the hgh dmensonal settng How do we learn when the number of features d s greater than the sample sze n?
More informationLecture 16 Statistical Analysis in Biomaterials Research (Part II)
3.051J/0.340J 1 Lecture 16 Statstcal Analyss n Bomaterals Research (Part II) C. F Dstrbuton Allows comparson of varablty of behavor between populatons usng test of hypothess: σ x = σ x amed for Brtsh statstcan
More informationCourse 395: Machine Learning - Lectures
Course 395: Machne Learnng - Lectures Lecture 1-2: Concept Learnng (M. Pantc Lecture 3-4: Decson Trees & CC Intro (M. Pantc Lecture 5-6: Artfcal Neural Networks (S.Zaferou Lecture 7-8: Instance ased Learnng
More informationExcess Error, Approximation Error, and Estimation Error
E0 370 Statstcal Learnng Theory Lecture 10 Sep 15, 011 Excess Error, Approxaton Error, and Estaton Error Lecturer: Shvan Agarwal Scrbe: Shvan Agarwal 1 Introducton So far, we have consdered the fnte saple
More informationLimited Dependent Variables
Lmted Dependent Varables. What f the left-hand sde varable s not a contnuous thng spread from mnus nfnty to plus nfnty? That s, gven a model = f (, β, ε, where a. s bounded below at zero, such as wages
More informationTopic 23 - Randomized Complete Block Designs (RCBD)
Topc 3 ANOVA (III) 3-1 Topc 3 - Randomzed Complete Block Desgns (RCBD) Defn: A Randomzed Complete Block Desgn s a varant of the completely randomzed desgn (CRD) that we recently learned. In ths desgn,
More informationAdmin NEURAL NETWORKS. Perceptron learning algorithm. Our Nervous System 10/25/16. Assignment 7. Class 11/22. Schedule for the rest of the semester
0/25/6 Admn Assgnment 7 Class /22 Schedule for the rest of the semester NEURAL NETWORKS Davd Kauchak CS58 Fall 206 Perceptron learnng algorthm Our Nervous System repeat untl convergence (or for some #
More informationMultilayer Perceptron (MLP)
Multlayer Perceptron (MLP) Seungjn Cho Department of Computer Scence and Engneerng Pohang Unversty of Scence and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjn@postech.ac.kr 1 / 20 Outlne
More informationCSC321 Tutorial 9: Review of Boltzmann machines and simulated annealing
CSC321 Tutoral 9: Revew of Boltzmann machnes and smulated annealng (Sldes based on Lecture 16-18 and selected readngs) Yue L Emal: yuel@cs.toronto.edu Wed 11-12 March 19 Fr 10-11 March 21 Outlne Boltzmann
More informationa b a In case b 0, a being divisible by b is the same as to say that
Secton 6.2 Dvsblty among the ntegers An nteger a ε s dvsble by b ε f there s an nteger c ε such that a = bc. Note that s dvsble by any nteger b, snce = b. On the other hand, a s dvsble by only f a = :
More informationMMA and GCMMA two methods for nonlinear optimization
MMA and GCMMA two methods for nonlnear optmzaton Krster Svanberg Optmzaton and Systems Theory, KTH, Stockholm, Sweden. krlle@math.kth.se Ths note descrbes the algorthms used n the author s 2007 mplementatons
More informationEEE 241: Linear Systems
EEE : Lnear Systems Summary #: Backpropagaton BACKPROPAGATION The perceptron rule as well as the Wdrow Hoff learnng were desgned to tran sngle layer networks. They suffer from the same dsadvantage: they
More informationNumerical Heat and Mass Transfer
Master degree n Mechancal Engneerng Numercal Heat and Mass Transfer 06-Fnte-Dfference Method (One-dmensonal, steady state heat conducton) Fausto Arpno f.arpno@uncas.t Introducton Why we use models and
More informationLinear Approximation with Regularization and Moving Least Squares
Lnear Approxmaton wth Regularzaton and Movng Least Squares Igor Grešovn May 007 Revson 4.6 (Revson : March 004). 5 4 3 0.5 3 3.5 4 Contents: Lnear Fttng...4. Weghted Least Squares n Functon Approxmaton...
More informationExpectation Maximization Mixture Models HMMs
-755 Machne Learnng for Sgnal Processng Mture Models HMMs Class 9. 2 Sep 200 Learnng Dstrbutons for Data Problem: Gven a collecton of eamples from some data, estmate ts dstrbuton Basc deas of Mamum Lelhood
More informationPositive feedback Derivative feedback Pos. + der. feedback c Stronger. E I e f g Time (s)
Frng rate (Hz) a f nput current Transent nput Step-lke nput Postve feedback Dervatve feedback Pos. + der. feedback b c Stronger d e f g Frng rate (Hz) h j Frng rate (Hz) 5 5 3 4 5 3 4 5 Tme (s) 6 4 5 3
More informationStatistical Inference. 2.3 Summary Statistics Measures of Center and Spread. parameters ( population characteristics )
Ismor Fscher, 8//008 Stat 54 / -8.3 Summary Statstcs Measures of Center and Spread Dstrbuton of dscrete contnuous POPULATION Random Varable, numercal True center =??? True spread =???? parameters ( populaton
More information