Hopfield networks and Boltzmann machines. Geoffrey Hinton et al. Presented by Tambet Matiisen
|
|
- Norman Jones
- 6 years ago
- Views:
Transcription
1 Hopfeld networks and Boltzmann machnes Geoffrey Hnton et al. Presented by Tambet Matsen
2 Hopfeld network Bnary unts Symmetrcal connectons
3 Energy functon The global energy: E s b j s s j w j The energy gap: E E( s 0) E( s 1) Update rule: s s b j s 1,f b s j 0, otherwse j j w w j j 0
4 Example 1-4 0? ? 0? E = goodness = 34
5 Deeper energy mnmum E = goodness = 5
6 Is updatng of Hopfeld network determnstc or non-determnstc? A. Determnstc B. Non-determnstc 0% 0% Determnstc Non-determnstc
7 How to update? Nodes must be updated sequentally, usually n randomzed order. Wth parallel updatng energy could go up If updates occur n parallel but wth random tmng, the oscllatons are usually destroyed.
8 Content-addressable memory Usng energy mnma to represent memores gves a content-addressable memory. An tem can be accessed by just knowng part of ts content. Can fll out mssng or corrupted peces of nformaton. It s robust aganst hardware damage.
9 Classcal condtonng
10 Storng memores Energy landscape s determned by weghts! If we use actvtes -1 and 1: w j = s s j If we use states 0 and 1: f f s s s s j j then then w w j j 1 1 w j 4 1 ( s 1 ) ( s ) 2 j 2
11 Demo (choose Algorthm: Hopfeld and Intalze)
12 How many weghts the example had? A. 100 B C % 0% 0%
13 Storage capacty The capacty of a totally connected net wth N unts s only about 0.15 * N memores. Wth N bts per one memory ths s only 0.15 * N * N bts. The net has N 2 weghts and bases. After storng M memores, each connecton weght has an nteger value n the range [ M, M]. So the number of bts requred to store the weghts and bases s: N 2 log(2m +1)
14 How many bts are needed to represent weghts n the example? A B C % 0% 0%
15 Spurous mnma Each tme we memorze a confguraton, we hope to create a new energy mnmum. But what f two mnma merge to create a mnmum at an ntermedate locaton?
16 Reverse learnng Let the net settle from a random ntal state and then do unlearnng. Ths wll get rd of deep, spurous mnma and ncrease memory capacty.
17 Increasng memory capacty Instead of tryng to store vectors n one shot, cycle through the tranng set many tmes. Use the perceptron convergence procedure to tran each unt to have the correct state gven the states of all the other unts n that vector. x f j x j w j w ( x x) x j j
18 Hopfeld nets wth hdden unts Instead of usng the net to store memores, use t to construct nterpretatons of sensory nput. The nput s represented by the vsble unts. The nterpretaton s represented by the states of the hdden unts. The badness of the nterpretaton s represented by the energy. hdden unts vsble unts
19 3D edges from 2D mages 3-D lnes You can only see one of these 3-D edges at a tme because they occlude one another. 2-D lnes pcture
20 Nosy networks Hopfeld net tres reduce the energy at each step. Ths makes t mpossble to escape from local mnma. We can use random nose to escape from poor mnma. Start wth a lot of nose so ts easy to cross energy barrers. Slowly reduce the nose so that the system ends up n a deep mnmum. Ths s smulated annealng. A B C
21 Temperature p( A B ) 0.2 Hgh temperature transton probabltes A p( A B ) 0.1 p( A B) B Low temperature transton probabltes A p( A B) B
22 Stochastc bnary unts Replace the bnary threshold unts by bnary stochastc unts that make based random decsons. The temperature controls the amount of nose. Rasng the nose level s equvalent to decreasng all the energy gaps between confguratons. p(s =1) = 1 1+ e E T temperature
23 Why we need stochastc bnary unts? A. Because we cannot get rd of nherent nose. B. Because t helps to escape local mnma. C. Because we want system to produce randomzed results. Because we cannot get r... 0% 0% 0% Because t helps to escap... Because we want system..
24 Thermal equlbrum Thermal equlbrum s a dffcult concept! Reachng thermal equlbrum does not mean that the system has settled down nto the lowest energy confguraton. The thng that settles down s the probablty dstrbuton over confguratons. Ths settles to the statonary dstrbuton. Any gven system keeps changng ts confguraton, but the fracton of systems n each confguraton does not change.
25 Modelng bnary data Gven a tranng set of bnary vectors, ft a model that wll assgn a probablty to every possble bnary vector. Model can be used for generatng data wth the same dstrbuton as orgnal data. If partcular model (dstrbuton) produced the observed data: p( model data ) p( data model p( data j ) p( model model ) j )
26 Boltzmann machne...s defned n terms of the energes of jont confguratons of the vsble and hdden unts. Probablty of jont confguraton: p(v, h) e E(v,h) The probablty of fndng the network n that jont confguraton after we have updated all of the stochastc bnary unts many tmes.
27 Energy of a jont confguraton bnary state of unt n v bas of unt k E(v, h) = v b + h k b k + v v j w j + v h k w k + h k h l w kl vs k hd < j, k k<l Energy wth confguraton v on the vsble unts and h on the hdden unts ndexes every nondentcal par of and j once weght between vsble unt and hdden unt k
28 From energes to probabltes The probablty of a jont confguraton over both vsble and hdden unts depends on the energy of that jont confguraton compared wth the energy of all other jont confguratons. p(v, h) = partton functon u, g E(v, h) e E(u, g) e The probablty of a confguraton of the vsble unts s the sum of the probabltes of all the jont confguratons that contan t. p(v) = h u, g E(v, h) e E(u, g) e
29 Example v h E e E p(v, h ) p(v) An example of how weghts defne a dstrbuton h1-1 h v1 v2
30 Gettng a sample from model We cannot compute the normalzng term (the partton functon) because t has exponentally many terms. So we use Markov Chan Monte Carlo to get samples from the model startng from a random global confguraton: Keep pckng unts at random and allowng them to stochastcally update ther states based on ther energy gaps. Run the Markov chan untl t reaches ts statonary dstrbuton The probablty of a global confguraton s then related to ts energy by the Boltzmann dstrbuton.
31 Gettng a sample from the posteror dstrbuton for a gven data vector The number of possble hdden confguratons s exponental so we need MCMC to sample from the posteror. It s just the same as gettng a sample from the model, except that we keep the vsble unts clamped to the gven data vector. Only the hdden unts are allowed to change states Samples from the posteror are requred for learnng the weghts. Each hdden confguraton s an explanaton of an observed vsble confguraton. Better explanatons have lower energy.
32 What does Boltzmann machne really A. Models probablty dstrbuton of nput data. B. Generates samples from modeled dstrbuton. C. Learns probablty dstrbuton of nput data from samples. D. All of above. do? Generates samples from... Models probablty dstr... 20% 20% 20% 20% 20% Learns probablty dstrb... All of above. None of above.
CSC321 Tutorial 9: Review of Boltzmann machines and simulated annealing
CSC321 Tutoral 9: Revew of Boltzmann machnes and smulated annealng (Sldes based on Lecture 16-18 and selected readngs) Yue L Emal: yuel@cs.toronto.edu Wed 11-12 March 19 Fr 10-11 March 21 Outlne Boltzmann
More informationMarkov Chain Monte Carlo (MCMC), Gibbs Sampling, Metropolis Algorithms, and Simulated Annealing Bioinformatics Course Supplement
Markov Chan Monte Carlo MCMC, Gbbs Samplng, Metropols Algorthms, and Smulated Annealng 2001 Bonformatcs Course Supplement SNU Bontellgence Lab http://bsnuackr/ Outlne! Markov Chan Monte Carlo MCMC! Metropols-Hastngs
More informationNeural Networks for Machine Learning. Lecture 11a Hopfield Nets
Neural Networks for Machine Learning Lecture 11a Hopfield Nets Geoffrey Hinton Nitish Srivastava, Kevin Swersky Tijmen Tieleman Abdel-rahman Mohamed Hopfield Nets A Hopfield net is composed of binary threshold
More information6. Stochastic processes (2)
6. Stochastc processes () Lect6.ppt S-38.45 - Introducton to Teletraffc Theory Sprng 5 6. Stochastc processes () Contents Markov processes Brth-death processes 6. Stochastc processes () Markov process
More informationUnsupervised Learning
Unsupervsed Learnng Kevn Swngler What s Unsupervsed Learnng? Most smply, t can be thought of as learnng to recognse and recall thngs Recognton I ve seen that before Recall I ve seen that before and I can
More information6. Stochastic processes (2)
Contents Markov processes Brth-death processes Lect6.ppt S-38.45 - Introducton to Teletraffc Theory Sprng 5 Markov process Consder a contnuous-tme and dscrete-state stochastc process X(t) wth state space
More informationCHALMERS, GÖTEBORGS UNIVERSITET. SOLUTIONS to RE-EXAM for ARTIFICIAL NEURAL NETWORKS. COURSE CODES: FFR 135, FIM 720 GU, PhD
CHALMERS, GÖTEBORGS UNIVERSITET SOLUTIONS to RE-EXAM for ARTIFICIAL NEURAL NETWORKS COURSE CODES: FFR 35, FIM 72 GU, PhD Tme: Place: Teachers: Allowed materal: Not allowed: January 2, 28, at 8 3 2 3 SB
More informationMultilayer Perceptrons and Backpropagation. Perceptrons. Recap: Perceptrons. Informatics 1 CG: Lecture 6. Mirella Lapata
Multlayer Perceptrons and Informatcs CG: Lecture 6 Mrella Lapata School of Informatcs Unversty of Ednburgh mlap@nf.ed.ac.uk Readng: Kevn Gurney s Introducton to Neural Networks, Chapters 5 6.5 January,
More informationOther NN Models. Reinforcement learning (RL) Probabilistic neural networks
Other NN Models Renforcement learnng (RL) Probablstc neural networks Support vector machne (SVM) Renforcement learnng g( (RL) Basc deas: Supervsed dlearnng: (delta rule, BP) Samples (x, f(x)) to learn
More informationMultilayer Perceptron (MLP)
Multlayer Perceptron (MLP) Seungjn Cho Department of Computer Scence and Engneerng Pohang Unversty of Scence and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjn@postech.ac.kr 1 / 20 Outlne
More informationInternet Engineering. Jacek Mazurkiewicz, PhD Softcomputing. Part 3: Recurrent Artificial Neural Networks Self-Organising Artificial Neural Networks
Internet Engneerng Jacek Mazurkewcz, PhD Softcomputng Part 3: Recurrent Artfcal Neural Networks Self-Organsng Artfcal Neural Networks Recurrent Artfcal Neural Networks Feedback sgnals between neurons Dynamc
More informationEnsemble Methods: Boosting
Ensemble Methods: Boostng Ncholas Ruozz Unversty of Texas at Dallas Based on the sldes of Vbhav Gogate and Rob Schapre Last Tme Varance reducton va baggng Generate new tranng data sets by samplng wth replacement
More informationFor now, let us focus on a specific model of neurons. These are simplified from reality but can achieve remarkable results.
Neural Networks : Dervaton compled by Alvn Wan from Professor Jtendra Malk s lecture Ths type of computaton s called deep learnng and s the most popular method for many problems, such as computer vson
More informationMarkov Chain Monte Carlo Lecture 6
where (x 1,..., x N ) X N, N s called the populaton sze, f(x) f (x) for at least one {1, 2,..., N}, and those dfferent from f(x) are called the tral dstrbutons n terms of mportance samplng. Dfferent ways
More informationProblem Set 9 Solutions
Desgn and Analyss of Algorthms May 4, 2015 Massachusetts Insttute of Technology 6.046J/18.410J Profs. Erk Demane, Srn Devadas, and Nancy Lynch Problem Set 9 Solutons Problem Set 9 Solutons Ths problem
More informationVQ widely used in coding speech, image, and video
at Scalar quantzers are specal cases of vector quantzers (VQ): they are constraned to look at one sample at a tme (memoryless) VQ does not have such constrant better RD perfomance expected Source codng
More informationMulti-layer neural networks
Lecture 0 Mult-layer neural networks Mlos Hauskrecht mlos@cs.ptt.edu 5329 Sennott Square Lnear regresson w Lnear unts f () Logstc regresson T T = w = p( y =, w) = g( w ) w z f () = p ( y = ) w d w d Gradent
More informationA Bayesian methodology for systemic risk assessment in financial networks
A Bayesan methodology for systemc rsk assessment n fnancal networks Lutgard A. M. Veraart London School of Economcs and Poltcal Scence September 2015 Jont work wth Axel Gandy (Imperal College London) 7th
More informationCS 3750 Machine Learning Lecture 6. Monte Carlo methods. CS 3750 Advanced Machine Learning. Markov chain Monte Carlo
CS 3750 Machne Learnng Lectre 6 Monte Carlo methods Mlos Haskrecht mlos@cs.ptt.ed 5329 Sennott Sqare Markov chan Monte Carlo Importance samplng: samples are generated accordng to Q and every sample from
More informationInformation Geometry of Gibbs Sampler
Informaton Geometry of Gbbs Sampler Kazuya Takabatake Neuroscence Research Insttute AIST Central 2, Umezono 1-1-1, Tsukuba JAPAN 305-8568 k.takabatake@ast.go.jp Abstract: - Ths paper shows some nformaton
More informationMultilayer neural networks
Lecture Multlayer neural networks Mlos Hauskrecht mlos@cs.ptt.edu 5329 Sennott Square Mdterm exam Mdterm Monday, March 2, 205 In-class (75 mnutes) closed book materal covered by February 25, 205 Multlayer
More informationHopfield Training Rules 1 N
Hopfeld Tranng Rules To memorse a sngle pattern Suppose e set the eghts thus - = p p here, s the eght beteen nodes & s the number of nodes n the netor p s the value requred for the -th node What ll the
More informationMotion Perception Under Uncertainty. Hongjing Lu Department of Psychology University of Hong Kong
Moton Percepton Under Uncertanty Hongjng Lu Department of Psychology Unversty of Hong Kong Outlne Uncertanty n moton stmulus Correspondence problem Qualtatve fttng usng deal observer models Based on sgnal
More informationHidden Markov Models & The Multivariate Gaussian (10/26/04)
CS281A/Stat241A: Statstcal Learnng Theory Hdden Markov Models & The Multvarate Gaussan (10/26/04) Lecturer: Mchael I. Jordan Scrbes: Jonathan W. Hu 1 Hdden Markov Models As a bref revew, hdden Markov models
More informationCourse 395: Machine Learning - Lectures
Course 395: Machne Learnng - Lectures Lecture 1-2: Concept Learnng (M. Pantc Lecture 3-4: Decson Trees & CC Intro (M. Pantc Lecture 5-6: Artfcal Neural Networks (S.Zaferou Lecture 7-8: Instance ased Learnng
More informationGrover s Algorithm + Quantum Zeno Effect + Vaidman
Grover s Algorthm + Quantum Zeno Effect + Vadman CS 294-2 Bomb 10/12/04 Fall 2004 Lecture 11 Grover s algorthm Recall that Grover s algorthm for searchng over a space of sze wors as follows: consder the
More informationMean Field / Variational Approximations
Mean Feld / Varatonal Appromatons resented by Jose Nuñez 0/24/05 Outlne Introducton Mean Feld Appromaton Structured Mean Feld Weghted Mean Feld Varatonal Methods Introducton roblem: We have dstrbuton but
More informationAssociative Memories
Assocatve Memores We consder now modes for unsupervsed earnng probems, caed auto-assocaton probems. Assocaton s the task of mappng patterns to patterns. In an assocatve memory the stmuus of an ncompete
More informationHomework Assignment 3 Due in class, Thursday October 15
Homework Assgnment 3 Due n class, Thursday October 15 SDS 383C Statstcal Modelng I 1 Rdge regresson and Lasso 1. Get the Prostrate cancer data from http://statweb.stanford.edu/~tbs/elemstatlearn/ datasets/prostate.data.
More informationCSE 546 Midterm Exam, Fall 2014(with Solution)
CSE 546 Mdterm Exam, Fall 014(wth Soluton) 1. Personal nfo: Name: UW NetID: Student ID:. There should be 14 numbered pages n ths exam (ncludng ths cover sheet). 3. You can use any materal you brought:
More informationExpectation Maximization Mixture Models HMMs
-755 Machne Learnng for Sgnal Processng Mture Models HMMs Class 9. 2 Sep 200 Learnng Dstrbutons for Data Problem: Gven a collecton of eamples from some data, estmate ts dstrbuton Basc deas of Mamum Lelhood
More informationUsing deep belief network modelling to characterize differences in brain morphometry in schizophrenia
Usng deep belef network modellng to characterze dfferences n bran morphometry n schzophrena Walter H. L. Pnaya * a ; Ary Gadelha b ; Orla M. Doyle c ; Crstano Noto b ; André Zugman d ; Qurno Cordero b,
More informationCHAPTER 17 Amortized Analysis
CHAPTER 7 Amortzed Analyss In an amortzed analyss, the tme requred to perform a sequence of data structure operatons s averaged over all the operatons performed. It can be used to show that the average
More informationDiscriminative classifier: Logistic Regression. CS534-Machine Learning
Dscrmnatve classfer: Logstc Regresson CS534-Machne Learnng 2 Logstc Regresson Gven tranng set D stc regresson learns the condtonal dstrbuton We ll assume onl to classes and a parametrc form for here s
More informationAdmin NEURAL NETWORKS. Perceptron learning algorithm. Our Nervous System 10/25/16. Assignment 7. Class 11/22. Schedule for the rest of the semester
0/25/6 Admn Assgnment 7 Class /22 Schedule for the rest of the semester NEURAL NETWORKS Davd Kauchak CS58 Fall 206 Perceptron learnng algorthm Our Nervous System repeat untl convergence (or for some #
More informationUsing Immune Genetic Algorithm to Optimize BP Neural Network and Its Application Peng-fei LIU1,Qun-tai SHEN1 and Jun ZHI2,*
Advances n Computer Scence Research (ACRS), volume 54 Internatonal Conference on Computer Networks and Communcaton Technology (CNCT206) Usng Immune Genetc Algorthm to Optmze BP Neural Network and Its Applcaton
More information1 The Mistake Bound Model
5-850: Advanced Algorthms CMU, Sprng 07 Lecture #: Onlne Learnng and Multplcatve Weghts February 7, 07 Lecturer: Anupam Gupta Scrbe: Bryan Lee,Albert Gu, Eugene Cho he Mstake Bound Model Suppose there
More informationArtificial Intelligence Bayesian Networks
Artfcal Intellgence Bayesan Networks Adapted from sldes by Tm Fnn and Mare desjardns. Some materal borrowed from Lse Getoor. 1 Outlne Bayesan networks Network structure Condtonal probablty tables Condtonal
More informationProbabilistic Graphical Models
School of Computer Scence robablstc Graphcal Models Appromate Inference: Markov Chan Monte Carlo 05 07 Erc Xng Lecture 7 March 9 04 X X 075 05 05 03 X 3 Erc Xng @ CMU 005-04 Recap of Monte Carlo Monte
More informationNP-Completeness : Proofs
NP-Completeness : Proofs Proof Methods A method to show a decson problem Π NP-complete s as follows. (1) Show Π NP. (2) Choose an NP-complete problem Π. (3) Show Π Π. A method to show an optmzaton problem
More informationECE 534: Elements of Information Theory. Solutions to Midterm Exam (Spring 2006)
ECE 534: Elements of Informaton Theory Solutons to Mdterm Eam (Sprng 6) Problem [ pts.] A dscrete memoryless source has an alphabet of three letters,, =,, 3, wth probabltes.4,.4, and., respectvely. (a)
More informationSelf-organising Systems 2 Simulated Annealing and Boltzmann Machines
Ams Reference Keywords Plan Self-organsng Systems Smulated Annealng and Boltzmann Machnes to obtan a mathematcal framework for stochastc machnes to study smulated annealng to study the Boltzmann machne
More informationLecture 10 Support Vector Machines. Oct
Lecture 10 Support Vector Machnes Oct - 20-2008 Lnear Separators Whch of the lnear separators s optmal? Concept of Margn Recall that n Perceptron, we learned that the convergence rate of the Perceptron
More informationCSC 411 / CSC D11 / CSC C11
18 Boostng s a general strategy for learnng classfers by combnng smpler ones. The dea of boostng s to take a weak classfer that s, any classfer that wll do at least slghtly better than chance and use t
More informationENTROPIC QUESTIONING
ENTROPIC QUESTIONING NACHUM. Introucton Goal. Pck the queston that contrbutes most to fnng a sutable prouct. Iea. Use an nformaton-theoretc measure. Bascs. Entropy (a non-negatve real number) measures
More informationOptimal information storage in noisy synapses under resource constraints
Optmal nformaton storage n nosy synapses under resource constrants Lav arshney MIT Jesper Sjöström Brandes, UCL London Dmtr Mtya Chklovsk Cold Sprng Harbor Laboratory Modfcatons of synaptc connectons store
More informationSuggested solutions for the exam in SF2863 Systems Engineering. June 12,
Suggested solutons for the exam n SF2863 Systems Engneerng. June 12, 2012 14.00 19.00 Examner: Per Enqvst, phone: 790 62 98 1. We can thnk of the farm as a Jackson network. The strawberry feld s modelled
More information} Often, when learning, we deal with uncertainty:
Uncertanty and Learnng } Often, when learnng, we deal wth uncertanty: } Incomplete data sets, wth mssng nformaton } Nosy data sets, wth unrelable nformaton } Stochastcty: causes and effects related non-determnstcally
More informationReview: Fit a line to N data points
Revew: Ft a lne to data ponts Correlated parameters: L y = a x + b Orthogonal parameters: J y = a (x ˆ x + b For ntercept b, set a=0 and fnd b by optmal average: ˆ b = y, Var[ b ˆ ] = For slope a, set
More informationfind (x): given element x, return the canonical element of the set containing x;
COS 43 Sprng, 009 Dsjont Set Unon Problem: Mantan a collecton of dsjont sets. Two operatons: fnd the set contanng a gven element; unte two sets nto one (destructvely). Approach: Canoncal element method:
More informationDynamic Programming. Lecture 13 (5/31/2017)
Dynamc Programmng Lecture 13 (5/31/2017) - A Forest Thnnng Example - Projected yeld (m3/ha) at age 20 as functon of acton taken at age 10 Age 10 Begnnng Volume Resdual Ten-year Volume volume thnned volume
More informationContinuous Time Markov Chains
Contnuous Tme Markov Chans Brth and Death Processes,Transton Probablty Functon, Kolmogorov Equatons, Lmtng Probabltes, Unformzaton Chapter 6 1 Markovan Processes State Space Parameter Space (Tme) Dscrete
More informationLogistic Regression. CAP 5610: Machine Learning Instructor: Guo-Jun QI
Logstc Regresson CAP 561: achne Learnng Instructor: Guo-Jun QI Bayes Classfer: A Generatve model odel the posteror dstrbuton P(Y X) Estmate class-condtonal dstrbuton P(X Y) for each Y Estmate pror dstrbuton
More informationSpace of ML Problems. CSE 473: Artificial Intelligence. Parameter Estimation and Bayesian Networks. Learning Topics
/7/7 CSE 73: Artfcal Intellgence Bayesan - Learnng Deter Fox Sldes adapted from Dan Weld, Jack Breese, Dan Klen, Daphne Koller, Stuart Russell, Andrew Moore & Luke Zettlemoyer What s Beng Learned? Space
More informationCS 3710: Visual Recognition Classification and Detection. Adriana Kovashka Department of Computer Science January 13, 2015
CS 3710: Vsual Recognton Classfcaton and Detecton Adrana Kovashka Department of Computer Scence January 13, 2015 Plan for Today Vsual recognton bascs part 2: Classfcaton and detecton Adrana s research
More informationChapter 8 SCALAR QUANTIZATION
Outlne Chapter 8 SCALAR QUANTIZATION Yeuan-Kuen Lee [ CU, CSIE ] 8.1 Overvew 8. Introducton 8.4 Unform Quantzer 8.5 Adaptve Quantzaton 8.6 Nonunform Quantzaton 8.7 Entropy-Coded Quantzaton Ch 8 Scalar
More informationMATH 567: Mathematical Techniques in Data Science Lab 8
1/14 MATH 567: Mathematcal Technques n Data Scence Lab 8 Domnque Gullot Departments of Mathematcal Scences Unversty of Delaware Aprl 11, 2017 Recall We have: a (2) 1 = f(w (1) 11 x 1 + W (1) 12 x 2 + W
More informationLecture Notes on Linear Regression
Lecture Notes on Lnear Regresson Feng L fl@sdueducn Shandong Unversty, Chna Lnear Regresson Problem In regresson problem, we am at predct a contnuous target value gven an nput feature vector We assume
More informationGeneralized Linear Methods
Generalzed Lnear Methods 1 Introducton In the Ensemble Methods the general dea s that usng a combnaton of several weak learner one could make a better learner. More formally, assume that we have a set
More information10-701/ Machine Learning, Fall 2005 Homework 3
10-701/15-781 Machne Learnng, Fall 2005 Homework 3 Out: 10/20/05 Due: begnnng of the class 11/01/05 Instructons Contact questons-10701@autonlaborg for queston Problem 1 Regresson and Cross-valdaton [40
More informationConvergence of random processes
DS-GA 12 Lecture notes 6 Fall 216 Convergence of random processes 1 Introducton In these notes we study convergence of dscrete random processes. Ths allows to characterze phenomena such as the law of large
More informationWinter 2008 CS567 Stochastic Linear/Integer Programming Guest Lecturer: Xu, Huan
Wnter 2008 CS567 Stochastc Lnear/Integer Programmng Guest Lecturer: Xu, Huan Class 2: More Modelng Examples 1 Capacty Expanson Capacty expanson models optmal choces of the tmng and levels of nvestments
More informationBoostrapaggregating (Bagging)
Boostrapaggregatng (Baggng) An ensemble meta-algorthm desgned to mprove the stablty and accuracy of machne learnng algorthms Can be used n both regresson and classfcaton Reduces varance and helps to avod
More informationDynamical Systems and Information Theory
Dynamcal Systems and Informaton Theory Informaton Theory Lecture 4 Let s consder systems that evolve wth tme x F ( x, x, x,... That s, systems that can be descrbed as the evoluton of a set of state varables
More information1 Derivation of Rate Equations from Single-Cell Conductance (Hodgkin-Huxley-like) Equations
Physcs 171/271 -Davd Klenfeld - Fall 2005 (revsed Wnter 2011) 1 Dervaton of Rate Equatons from Sngle-Cell Conductance (Hodgkn-Huxley-lke) Equatons We consder a network of many neurons, each of whch obeys
More informationChannel Encoder. Channel. Figure 7.1: Communication system
Chapter 7 Processes The model of a communcaton system that we have been developng s shown n Fgure 7.. Ths model s also useful for some computaton systems. The source s assumed to emt a stream of symbols.
More informationp 1 c 2 + p 2 c 2 + p 3 c p m c 2
Where to put a faclty? Gven locatons p 1,..., p m n R n of m houses, want to choose a locaton c n R n for the fre staton. Want c to be as close as possble to all the house. We know how to measure dstance
More information1 Convex Optimization
Convex Optmzaton We wll consder convex optmzaton problems. Namely, mnmzaton problems where the objectve s convex (we assume no constrants for now). Such problems often arse n machne learnng. For example,
More informationHidden Markov Models
Note to other teachers and users of these sldes. Andrew would be delghted f you found ths source materal useful n gvng your own lectures. Feel free to use these sldes verbatm, or to modfy them to ft your
More informationThe Feynman path integral
The Feynman path ntegral Aprl 3, 205 Hesenberg and Schrödnger pctures The Schrödnger wave functon places the tme dependence of a physcal system n the state, ψ, t, where the state s a vector n Hlbert space
More informationLecture 3: Shannon s Theorem
CSE 533: Error-Correctng Codes (Autumn 006 Lecture 3: Shannon s Theorem October 9, 006 Lecturer: Venkatesan Guruswam Scrbe: Wdad Machmouch 1 Communcaton Model The communcaton model we are usng conssts
More informationApplied Stochastic Processes
STAT455/855 Fall 23 Appled Stochastc Processes Fnal Exam, Bref Solutons 1. (15 marks) (a) (7 marks) The dstrbuton of Y s gven by ( ) ( ) y 2 1 5 P (Y y) for y 2, 3,... The above follows because each of
More informationOutline and Reading. Dynamic Programming. Dynamic Programming revealed. Computing Fibonacci. The General Dynamic Programming Technique
Outlne and Readng Dynamc Programmng The General Technque ( 5.3.2) -1 Knapsac Problem ( 5.3.3) Matrx Chan-Product ( 5.3.1) Dynamc Programmng verson 1.4 1 Dynamc Programmng verson 1.4 2 Dynamc Programmng
More informationThe Cortex. Networks. Laminar Structure of Cortex. Chapter 3, O Reilly & Munakata.
Networks The Cortex Chapter, O Relly & Munakata. Bology of networks: The cortex Exctaton: Undrectonal (transformatons) Local vs. dstrbuted representatons Bdrectonal (pattern completon, amplfcaton) Inhbton:
More informationINF 5860 Machine learning for image classification. Lecture 3 : Image classification and regression part II Anne Solberg January 31, 2018
INF 5860 Machne learnng for mage classfcaton Lecture 3 : Image classfcaton and regresson part II Anne Solberg January 3, 08 Today s topcs Multclass logstc regresson and softma Regularzaton Image classfcaton
More informationHidden Markov Models
CM229S: Machne Learnng for Bonformatcs Lecture 12-05/05/2016 Hdden Markov Models Lecturer: Srram Sankararaman Scrbe: Akshay Dattatray Shnde Edted by: TBD 1 Introducton For a drected graph G we can wrte
More information10.34 Fall 2015 Metropolis Monte Carlo Algorithm
10.34 Fall 2015 Metropols Monte Carlo Algorthm The Metropols Monte Carlo method s very useful for calculatng manydmensonal ntegraton. For e.g. n statstcal mechancs n order to calculate the prospertes of
More informationI529: Machine Learning in Bioinformatics (Spring 2017) Markov Models
I529: Machne Learnng n Bonformatcs (Sprng 217) Markov Models Yuzhen Ye School of Informatcs and Computng Indana Unversty, Bloomngton Sprng 217 Outlne Smple model (frequency & profle) revew Markov chan
More information2 Laminar Structure of Cortex. 4 Area Structure of Cortex
Networks!! Lamnar Structure of Cortex. Bology: The cortex. Exctaton: Undrectonal (transformatons) Local vs. dstrbuted representatons Bdrectonal (pattern completon, amplfcaton). Inhbton: Controllng bdrectonal
More informationTarget tracking example Filtering: Xt. (main interest) Smoothing: X1: t. (also given with SIS)
Target trackng example Flterng: Xt Y1: t (man nterest) Smoothng: X1: t Y1: t (also gven wth SIS) However as we have seen, the estmate of ths dstrbuton breaks down when t gets large due to the weghts becomng
More informationLossy Compression. Compromise accuracy of reconstruction for increased compression.
Lossy Compresson Compromse accuracy of reconstructon for ncreased compresson. The reconstructon s usually vsbly ndstngushable from the orgnal mage. Typcally, one can get up to 0:1 compresson wth almost
More informationThe Gaussian classifier. Nuno Vasconcelos ECE Department, UCSD
he Gaussan classfer Nuno Vasconcelos ECE Department, UCSD Bayesan decson theory recall that we have state of the world X observatons g decson functon L[g,y] loss of predctng y wth g Bayes decson rule s
More informationCHAPTER 3: BAYESIAN DECISION THEORY
HATER 3: BAYESIAN DEISION THEORY Decson mang under uncertanty 3 Data comes from a process that s completely not nown The lac of nowledge can be compensated by modelng t as a random process May be the underlyng
More informationDiscriminative classifier: Logistic Regression. CS534-Machine Learning
Dscrmnatve classfer: Logstc Regresson CS534-Machne Learnng robablstc Classfer Gven an nstance, hat does a probablstc classfer do dfferentl compared to, sa, perceptron? It does not drectl predct Instead,
More informationReview of Taylor Series. Read Section 1.2
Revew of Taylor Seres Read Secton 1.2 1 Power Seres A power seres about c s an nfnte seres of the form k = 0 k a ( x c) = a + a ( x c) + a ( x c) + a ( x c) k 2 3 0 1 2 3 + In many cases, c = 0, and the
More informationProbability Theory (revisited)
Probablty Theory (revsted) Summary Probablty v.s. plausblty Random varables Smulaton of Random Experments Challenge The alarm of a shop rang. Soon afterwards, a man was seen runnng n the street, persecuted
More informationLinear Classification, SVMs and Nearest Neighbors
1 CSE 473 Lecture 25 (Chapter 18) Lnear Classfcaton, SVMs and Nearest Neghbors CSE AI faculty + Chrs Bshop, Dan Klen, Stuart Russell, Andrew Moore Motvaton: Face Detecton How do we buld a classfer to dstngush
More informationEquilibrium with Complete Markets. Instructor: Dmytro Hryshko
Equlbrum wth Complete Markets Instructor: Dmytro Hryshko 1 / 33 Readngs Ljungqvst and Sargent. Recursve Macroeconomc Theory. MIT Press. Chapter 8. 2 / 33 Equlbrum n pure exchange, nfnte horzon economes,
More informationStatistical Foundations of Pattern Recognition
Statstcal Foundatons of Pattern Recognton Learnng Objectves Bayes Theorem Decson-mang Confdence factors Dscrmnants The connecton to neural nets Statstcal Foundatons of Pattern Recognton NDE measurement
More informationLecture 10: May 6, 2013
TTIC/CMSC 31150 Mathematcal Toolkt Sprng 013 Madhur Tulsan Lecture 10: May 6, 013 Scrbe: Wenje Luo In today s lecture, we manly talked about random walk on graphs and ntroduce the concept of graph expander,
More informationStrong Markov property: Same assertion holds for stopping times τ.
Brownan moton Let X ={X t : t R + } be a real-valued stochastc process: a famlty of real random varables all defned on the same probablty space. Defne F t = nformaton avalable by observng the process up
More informationOn complexity and randomness of Markov-chain prediction
On complexty and randomness of Markov-chan predcton Joel Ratsaby Department of Electrcal and Electroncs Engneerng Arel Unversty Arel, ISRAEL Emal: ratsaby@arelacl Abstract Let {X t : t Z} be a sequence
More informationAn Experiment/Some Intuition (Fall 2006): Lecture 18 The EM Algorithm heads coin 1 tails coin 2 Overview Maximum Likelihood Estimation
An Experment/Some Intuton I have three cons n my pocket, 6.864 (Fall 2006): Lecture 18 The EM Algorthm Con 0 has probablty λ of heads; Con 1 has probablty p 1 of heads; Con 2 has probablty p 2 of heads
More informationSupporting Information
Supportng Informaton The neural network f n Eq. 1 s gven by: f x l = ReLU W atom x l + b atom, 2 where ReLU s the element-wse rectfed lnear unt, 21.e., ReLUx = max0, x, W atom R d d s the weght matrx to
More informationEEE 241: Linear Systems
EEE : Lnear Systems Summary #: Backpropagaton BACKPROPAGATION The perceptron rule as well as the Wdrow Hoff learnng were desgned to tran sngle layer networks. They suffer from the same dsadvantage: they
More informationarxiv: v1 [cs.lg] 17 Jan 2019
LECTURE NOTES arxv:90.05639v [cs.lg] 7 Jan 209 Artfcal Neural Networks B. MEHLIG Department of Physcs Unversty of Gothenburg Göteborg, Sweden 209 PREFACE These are lecture notes for my course on Artfcal
More informationHMMT February 2016 February 20, 2016
HMMT February 016 February 0, 016 Combnatorcs 1. For postve ntegers n, let S n be the set of ntegers x such that n dstnct lnes, no three concurrent, can dvde a plane nto x regons (for example, S = {3,
More informationModule 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur
Module 3 LOSSY IMAGE COMPRESSION SYSTEMS Verson ECE IIT, Kharagpur Lesson 6 Theory of Quantzaton Verson ECE IIT, Kharagpur Instructonal Objectves At the end of ths lesson, the students should be able to:
More informationLECTURE NOTES. Artifical Neural Networks. B. MEHLIG (course home page)
LECTURE NOTES Artfcal Neural Networks B. MEHLIG (course home page) Department of Physcs Unversty of Gothenburg Göteborg, Sweden 208 PREFACE These are lecture notes for my course on Artfcal Neural Networks
More information