Hopfield networks and Boltzmann machines. Geoffrey Hinton et al. Presented by Tambet Matiisen


1 Hopfield networks and Boltzmann machines Geoffrey Hinton et al. Presented by Tambet Matiisen

2 Hopfield network Binary units Symmetrical connections

3 Energy function The global energy: $E = -\sum_i s_i b_i - \sum_{i<j} s_i s_j w_{ij}$ The energy gap: $\Delta E_i = E(s_i = 0) - E(s_i = 1) = b_i + \sum_j s_j w_{ij}$ Update rule: $s_i = 1$ if $b_i + \sum_j s_j w_{ij} \ge 0$, otherwise $s_i = 0$. Connections are symmetric with no self-connections: $w_{ij} = w_{ji}$, $w_{ii} = 0$.
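
A minimal NumPy sketch of these two formulas (the weight matrix W, bias vector b, and binary state vector s are illustrative assumptions; W is symmetric with a zero diagonal):

    import numpy as np

    def energy(s, W, b):
        # E = -sum_i s_i b_i - sum_{i<j} s_i s_j w_ij; with W symmetric and
        # zero-diagonal, the pairwise sum equals 0.5 * s . W . s
        return -s @ b - 0.5 * s @ W @ s

    def energy_gap(s, W, b, i):
        # Delta E_i = E(s_i = 0) - E(s_i = 1) = b_i + sum_j w_ij s_j
        return b[i] + W[i] @ s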

4 Example (diagram: a small Hopfield net; the units marked "?" are filled in one at a time by the update rule, settling at -E = goodness = 4)

5 Deeper energy minimum (diagram: the same net in a different state with -E = goodness = 5)

6 Is updating of a Hopfield network deterministic or non-deterministic? A. Deterministic B. Non-deterministic

7 How to update? Nodes must be updated sequentially, usually in randomized order. With parallel updating the energy could go up. If updates occur in parallel but with random timing, the oscillations are usually destroyed.
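
A sketch of sequential updating in this spirit (a hedged example, not the presenter's code; it assumes the NumPy weight matrix W and biases b from the sketch above):

    import numpy as np

    rng = np.random.default_rng(0)

    def settle(s, W, b, steps=1000):
        # Pick one unit at a time, in random order, and apply the
        # threshold rule; each step can only lower (or keep) the energy.
        s = s.copy()
        for _ in range(steps):
            i = rng.integers(len(s))
            s[i] = 1.0 if b[i] + W[i] @ s >= 0 else 0.0
        return s

Flipping a single unit to agree with the sign of its energy gap never increases the global energy, which is why sequential updating settles; simultaneous flips of two connected units can raise it.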

8 Content-addressable memory Using energy minima to represent memories gives a content-addressable memory. An item can be accessed by just knowing part of its content. Can fill out missing or corrupted pieces of information. It is robust against hardware damage.

9 Classical conditioning

10 Storing memories Energy landscape is determined by weights! If we use activities of -1 and 1: $\Delta w_{ij} = s_i s_j$ If we use states of 0 and 1, the rule is slightly more complicated: $\Delta w_{ij} = 4\,(s_i - \tfrac{1}{2})(s_j - \tfrac{1}{2})$
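
A sketch of this one-shot storage rule (an assumed setup where the memories arrive as rows of a 0/1 NumPy array):

    import numpy as np

    def store(patterns):
        # patterns: array of shape (num_memories, N) with 0/1 entries
        n = patterns.shape[1]
        W = np.zeros((n, n))
        for s in patterns:
            c = 2 * s - 1            # 4(s_i - 1/2)(s_j - 1/2) = c_i c_j
            W += np.outer(c, c)      # one-shot increment per memory
        np.fill_diagonal(W, 0.0)     # no self-connections
        return W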

11 Demo (choose Algorithm: Hopfield and Initialize)

12 How many weights did the example have? A. 100 B. … C. …

13 Storage capacity The capacity of a totally connected net with N units is only about 0.15 N memories. With N bits per memory this is only 0.15 N² bits. The net has N² weights and biases. After storing M memories, each connection weight has an integer value in the range [-M, M]. So the number of bits required to store the weights and biases is $N^2 \log_2(2M + 1)$.
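
A quick worked check of these numbers (N = 100 is an assumed example size):

    import math

    N = 100                                    # assumed number of units
    M = round(0.15 * N)                        # about 15 storable memories
    content_bits = 0.15 * N * N                # 1500 bits of retrievable content
    weight_bits = N**2 * math.log2(2 * M + 1)  # about 49,500 bits for the weights
    print(M, content_bits, weight_bits)

So the weights take far more bits than the memories they can reliably hold.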

14 How many bits are needed to represent the weights in the example? A. … B. … C. …

15 Spurious minima Each time we memorize a configuration, we hope to create a new energy minimum. But what if two minima merge to create a minimum at an intermediate location?

16 Reverse learning Let the net settle from a random initial state and then do unlearning. This will get rid of deep, spurious minima and increase memory capacity.

17 Increasing memory capacity Instead of trying to store vectors in one shot, cycle through the training set many times. Use the perceptron convergence procedure to train each unit to have the correct state given the states of all the other units in that vector: $\hat{x}_i = f\Big(\sum_j x_j w_{ij}\Big), \quad \Delta w_{ij} = \eta\,(x_i - \hat{x}_i)\,x_j$
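
A hedged sketch of this training loop (the learning rate, epoch count, and final symmetrization are assumptions added for illustration):

    import numpy as np

    def train(patterns, epochs=50, lr=0.1):
        # Cycle through the training set many times, nudging each unit
        # toward its correct state given all the other units.
        n = patterns.shape[1]
        W = np.zeros((n, n))
        for _ in range(epochs):
            for x in patterns:
                for i in range(n):
                    xhat = 1.0 if W[i] @ x >= 0 else 0.0   # predicted state
                    W[i] += lr * (x[i] - xhat) * x         # perceptron update
                    W[i, i] = 0.0                          # no self-connection
        return (W + W.T) / 2   # symmetrize so the result is a valid Hopfield net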

18 Hopfield nets with hidden units Instead of using the net to store memories, use it to construct interpretations of sensory input. The input is represented by the visible units. The interpretation is represented by the states of the hidden units. The badness of the interpretation is represented by the energy. (diagram: hidden units above visible units)

19 3D edges from 2D images (diagram: 2-D lines in a picture and the 3-D lines they could depict) You can only see one of these 3-D edges at a time because they occlude one another.

20 Noisy networks A Hopfield net tries to reduce the energy at each step. This makes it impossible to escape from local minima. We can use random noise to escape from poor minima. Start with a lot of noise so it's easy to cross energy barriers. Slowly reduce the noise so that the system ends up in a deep minimum. This is simulated annealing. (diagram: an energy landscape with minima A, B, C)

21 Temperature (diagram: transition probabilities between two minima A and B at high temperature, p(A→B) = 0.2 and p(B→A) = 0.1, and at low temperature, where both probabilities are much smaller)

22 Stochastic binary units Replace the binary threshold units by binary stochastic units that make biased random decisions. The temperature controls the amount of noise. Raising the noise level is equivalent to decreasing all the energy gaps between configurations. $p(s_i = 1) = \dfrac{1}{1 + e^{-\Delta E_i / T}}$ where $T$ is the temperature.
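
A sketch combining stochastic units with the annealing idea from slide 20 (the geometric cooling schedule and its endpoints are assumptions):

    import numpy as np

    rng = np.random.default_rng(0)

    def anneal(s, W, b, T_start=10.0, T_end=0.1, steps=5000):
        # Stochastic updates p(s_i = 1) = 1 / (1 + exp(-dE_i / T)),
        # with the temperature T slowly lowered over time.
        s = s.copy()
        for t in range(steps):
            T = T_start * (T_end / T_start) ** (t / steps)  # geometric cooling
            i = rng.integers(len(s))
            gap = b[i] + W[i] @ s                           # energy gap dE_i
            s[i] = 1.0 if rng.random() < 1 / (1 + np.exp(-gap / T)) else 0.0
        return s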

23 Why do we need stochastic binary units? A. Because we cannot get rid of inherent noise. B. Because it helps to escape local minima. C. Because we want the system to produce randomized results.

24 Thermal equilibrium Thermal equilibrium is a difficult concept! Reaching thermal equilibrium does not mean that the system has settled down into the lowest energy configuration. The thing that settles down is the probability distribution over configurations. This settles to the stationary distribution. Any given system keeps changing its configuration, but the fraction of systems in each configuration does not change.
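
A toy illustration of this ensemble view (the 2-unit net, its weights, and the temperature are assumptions): every copy keeps changing state, yet the fraction of copies in each configuration stops changing.

    import numpy as np

    rng = np.random.default_rng(0)
    W = np.array([[0., 2.], [2., 0.]])   # assumed tiny symmetric net
    b = np.array([-1., -1.])
    T = 1.0

    ensemble = rng.integers(0, 2, size=(10000, 2)).astype(float)
    for step in range(201):
        i = rng.integers(2)                  # unit to resample in every copy
        gap = b[i] + ensemble @ W[i]         # energy gap in every copy at once
        p_on = 1 / (1 + np.exp(-gap / T))
        ensemble[:, i] = (rng.random(10000) < p_on).astype(float)
        if step % 50 == 0:                   # fraction of copies per configuration
            codes = (2 * ensemble[:, 0] + ensemble[:, 1]).astype(int)
            print(step, np.bincount(codes, minlength=4) / 10000)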

25 Modeling binary data Given a training set of binary vectors, fit a model that will assign a probability to every possible binary vector. The model can be used for generating data with the same distribution as the original data. If a particular model (distribution) produced the observed data: $p(\text{model}_i \mid \text{data}) = \dfrac{p(\text{data} \mid \text{model}_i)\, p(\text{model}_i)}{\sum_j p(\text{data} \mid \text{model}_j)\, p(\text{model}_j)}$

26 Boltzmann machine …is defined in terms of the energies of joint configurations of the visible and hidden units. Probability of a joint configuration: $p(v,h) \propto e^{-E(v,h)}$ This is the probability of finding the network in that joint configuration after we have updated all of the stochastic binary units many times.

27 Energy of a joint configuration $-E(v,h) = \sum_{i \in \text{vis}} v_i b_i + \sum_{k \in \text{hid}} h_k b_k + \sum_{i<j} v_i v_j w_{ij} + \sum_{i,k} v_i h_k w_{ik} + \sum_{k<l} h_k h_l w_{kl}$ Here $v_i$ is the binary state of unit $i$ in $v$, $b_k$ is the bias of unit $k$, $i<j$ indexes every non-identical pair of $i$ and $j$ once, and $w_{ik}$ is the weight between visible unit $i$ and hidden unit $k$.

28 From energies to probabilities The probability of a joint configuration over both visible and hidden units depends on the energy of that joint configuration compared with the energy of all other joint configurations: $p(v,h) = \dfrac{e^{-E(v,h)}}{\sum_{u,g} e^{-E(u,g)}}$ where the denominator is the partition function. The probability of a configuration of the visible units is the sum of the probabilities of all the joint configurations that contain it: $p(v) = \dfrac{\sum_h e^{-E(v,h)}}{\sum_{u,g} e^{-E(u,g)}}$
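
For a net small enough to enumerate, these probabilities can be computed exactly by brute force (the 3-unit weights below are assumptions; biases are omitted for brevity):

    import numpy as np
    from itertools import product

    W = np.array([[0., 2., -1.],
                  [2., 0., 1.],
                  [-1., 1., 0.]])      # assumed 2 visible + 1 hidden unit

    def E(s):
        return -0.5 * s @ W @ s        # joint energy (biases omitted)

    states = [np.array(s, dtype=float) for s in product([0, 1], repeat=3)]
    boltzmann = np.array([np.exp(-E(s)) for s in states])
    Z = boltzmann.sum()                # partition function: 2^3 terms here
    for s, p in zip(states, boltzmann / Z):
        print(s, round(p, 4))          # p(v, h) = exp(-E(v, h)) / Z
    # p(v) sums p(v, h) over the hidden unit's two states.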

29 Example An example of how weights define a distribution (table: every joint configuration (v, h) with its energy E, $e^{-E}$, $p(v,h)$, and the marginal $p(v)$; diagram: a small net over visible units v1, v2 and hidden units h1, h2)

30 Getting a sample from the model We cannot compute the normalizing term (the partition function) because it has exponentially many terms. So we use Markov chain Monte Carlo to get samples from the model, starting from a random global configuration: keep picking units at random and allowing them to stochastically update their states based on their energy gaps. Run the Markov chain until it reaches its stationary distribution. The probability of a global configuration is then related to its energy by the Boltzmann distribution.
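
A minimal MCMC sketch of this procedure (assumed NumPy weights W and biases b; temperature fixed at 1):

    import numpy as np

    rng = np.random.default_rng(0)

    def sample_from_model(W, b, burn_in=1000):
        # Start from a random global configuration and keep picking units
        # at random, updating them stochastically from their energy gaps.
        s = rng.integers(0, 2, size=len(b)).astype(float)
        for _ in range(burn_in):
            i = rng.integers(len(s))
            gap = b[i] + W[i] @ s
            s[i] = 1.0 if rng.random() < 1 / (1 + np.exp(-gap)) else 0.0
        return s   # approximately a draw from the Boltzmann distribution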

31 Getting a sample from the posterior distribution for a given data vector The number of possible hidden configurations is exponential, so we need MCMC to sample from the posterior. It is just the same as getting a sample from the model, except that we keep the visible units clamped to the given data vector. Only the hidden units are allowed to change states. Samples from the posterior are required for learning the weights. Each hidden configuration is an explanation of an observed visible configuration. Better explanations have lower energy.
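
The clamped variant differs only in which units may be resampled (a sketch under the same assumptions, with the visible units stored in the first len(v) positions of the state vector):

    import numpy as np

    rng = np.random.default_rng(0)

    def sample_posterior(W, b, v, n_hidden, burn_in=1000):
        # Visible units stay clamped to the data vector v; only the
        # hidden units (the trailing indices here) are resampled.
        s = np.concatenate([v, rng.integers(0, 2, size=n_hidden).astype(float)])
        for _ in range(burn_in):
            i = len(v) + rng.integers(n_hidden)   # pick a hidden unit only
            gap = b[i] + W[i] @ s
            s[i] = 1.0 if rng.random() < 1 / (1 + np.exp(-gap)) else 0.0
        return s[len(v):]                         # one sampled explanation h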

32 What does a Boltzmann machine really do? A. Models the probability distribution of input data. B. Generates samples from the modeled distribution. C. Learns the probability distribution of input data from samples. D. All of the above. E. None of the above.
