MATH 567: Mathematical Techniques in Data Science Lab 8
|
|
- Melvyn Allison
- 5 years ago
- Views:
Transcription
1 1/14 MATH 567: Mathematcal Technques n Data Scence Lab 8 Domnque Gullot Departments of Mathematcal Scences Unversty of Delaware Aprl 11, 2017
2 Recall We have: a (2) 1 = f(w (1) 11 x 1 + W (1) 12 x 2 + W (1) 13 x 3 + b (1) 1 ) a (2) 2 = f(w (1) 21 x 1 + W (1) 22 x 2 + W (1) 23 x 3 + b (1) 2 ) a (2) 3 = f(w (1) 31 x 1 + W (1) 32 x 2 + W (1) 33 x 3 + b (1) 3 ) h W,b = a (3) 1 = f(w (2) 11 a(2) 1 + W (2) 12 a(2) 2 + W (2) 13 a(2) 3 + b (2) 1 ). 2/14
3 Recall (cont.) 3/14 Vector form: z (2) = W (1) x + b (1) a (2) = f(z (2) ) z (3) = W (2) a (2) + b (2) h W,b = a (3) = f(z (3) ).
4 Tranng neural networks 4/14 Suppose we have A neural network wth s l neurons n layer l (l = 1,..., n l ).
5 Tranng neural networks 4/14 Suppose we have A neural network wth s l neurons n layer l (l = 1,..., n l ). Observatons (x (1), y (1) ),..., (x (m), y (m) ) R s 1 R sn l. We would lke to choose W (l) and b (l) n some optmal way for all l.
6 Tranng neural networks 4/14 Suppose we have A neural network wth s l neurons n layer l (l = 1,..., n l ). Observatons (x (1), y (1) ),..., (x (m), y (m) ) R s 1 R sn l. We would lke to choose W (l) and b (l) n some optmal way for all l. Let J(W, b; x, y) := 1 2 h W,b(x) y 2 2 (Squared error for one sample).
7 Tranng neural networks 4/14 Suppose we have A neural network wth s l neurons n layer l (l = 1,..., n l ). Observatons (x (1), y (1) ),..., (x (m), y (m) ) R s 1 R sn l. We would lke to choose W (l) and b (l) n some optmal way for all l. Let J(W, b; x, y) := 1 2 h W,b(x) y 2 2 (Squared error for one sample). Dene J(W, b) := 1 m m J(W, b; x (), y () ) + λ 2 =1 (average squared error wth Rdge penalty). n l 1 s l s l+1 (W (l) j )2. l=1 =1 j=1
8 Tranng neural networks Suppose we have A neural network wth s l neurons n layer l (l = 1,..., n l ). Observatons (x (1), y (1) ),..., (x (m), y (m) ) R s 1 R sn l. We would lke to choose W (l) and b (l) n some optmal way for all l. Let J(W, b; x, y) := 1 2 h W,b(x) y 2 2 (Squared error for one sample). Dene J(W, b) := 1 m m J(W, b; x (), y () ) + λ 2 =1 (average squared error wth Rdge penalty). Note: The Rdge penalty prevents overttng. We do not penalze the bas terms b (l). n l 1 s l s l+1 (W (l) j )2. l=1 =1 j=1 4/14
9 Some remarks 5/14 Can use other loss functons (e.g. for classcaton). Can use other penaltes (e.g. l 1, elastc net, etc.).
10 Some remarks 5/14 Can use other loss functons (e.g. for classcaton). Can use other penaltes (e.g. l 1, elastc net, etc.). In classcaton problems, we choose the labels y {0, 1} (f workng wth sgmod) or y { 1, 1} (f workng wth tanh). For regresson problems, we scale the output so that y [0, 1] (f workng wth sgmod) or y [ 1, 1] (f workng wth tanh).
11 Some remarks 5/14 Can use other loss functons (e.g. for classcaton). Can use other penaltes (e.g. l 1, elastc net, etc.). In classcaton problems, we choose the labels y {0, 1} (f workng wth sgmod) or y { 1, 1} (f workng wth tanh). For regresson problems, we scale the output so that y [0, 1] (f workng wth sgmod) or y [ 1, 1] (f workng wth tanh). We can use gradent descent to mnmze J(W, b). Note that snce the functon J(W, b) s non-convex, we may only nd a local mnmum.
12 Some remarks 5/14 Can use other loss functons (e.g. for classcaton). Can use other penaltes (e.g. l 1, elastc net, etc.). In classcaton problems, we choose the labels y {0, 1} (f workng wth sgmod) or y { 1, 1} (f workng wth tanh). For regresson problems, we scale the output so that y [0, 1] (f workng wth sgmod) or y [ 1, 1] (f workng wth tanh). We can use gradent descent to mnmze J(W, b). Note that snce the functon J(W, b) s non-convex, we may only nd a local mnmum. We need an ntal choce for W (l) j and b (l). If we ntalze all the parameters to 0, then the parameters reman constant over the layers because of the symmetry of the problem.
13 Some remarks 5/14 Can use other loss functons (e.g. for classcaton). Can use other penaltes (e.g. l 1, elastc net, etc.). In classcaton problems, we choose the labels y {0, 1} (f workng wth sgmod) or y { 1, 1} (f workng wth tanh). For regresson problems, we scale the output so that y [0, 1] (f workng wth sgmod) or y [ 1, 1] (f workng wth tanh). We can use gradent descent to mnmze J(W, b). Note that snce the functon J(W, b) s non-convex, we may only nd a local mnmum. We need an ntal choce for W (l) j and b (l). If we ntalze all the parameters to 0, then the parameters reman constant over the layers because of the symmetry of the problem. As a result, we ntalze the parameters to a small constant at random (say, usng N(0, ɛ 2 ) for ɛ = 0.01).
14 Gradent descent and the backpropagaton algorthm 6/14 We update the parameters usng a gradent descent as follows: W (l) j b (l) W (l) j α b (l) α W (l) j b (l) J(W, b) J(W, b). Here α > 0 s a parameter (the learnng rate).
15 Gradent descent and the backpropagaton algorthm 6/14 We update the parameters usng a gradent descent as follows: W (l) j b (l) W (l) j α b (l) α W (l) j b (l) J(W, b) J(W, b). Here α > 0 s a parameter (the learnng rate). The partal dervatves can be cleverly computed usng the chan rule to avod repeatng calculatons (backpropagaton algorthm).
16 Sparse neural networks 7/14 Sparse networks can be bult by Penalzng coecents (e.g. usng a l 1 penalty). Droppng some of the connectons at random (dropout). Srvastava et al., JMLR 15 (2014). Useful to prevent overttng. Recent work: One-shot learners can be used to tran models wth a smaller sample sze.
17 Autoencoders 8/14 An autoencoder learns the dentty functon: Input: unlabeled data. Output = nput. Idea: lmt the number of hdden layers to dscover structure n the data. Learn a compressed representaton of the nput. Source: UFLDL tutoral.
18 Example (UFLDL) 9/14 Tran an autoencoder on mages wth one hdden layer.
19 Example (UFLDL) 9/14 Tran an autoencoder on mages wth one hdden layer. Each hdden unt computes: 100 = f a (2) j=1 j x j + b (1) j. W (1)
20 Example (UFLDL) 9/14 Tran an autoencoder on mages wth one hdden layer. Each hdden unt computes: 100 = f a (2) j=1 j x j + b (1) j. W (1) Thnk of a (2) as some non-lnear feature of the nput x.
21 Example (UFLDL) 9/14 Tran an autoencoder on mages wth one hdden layer. Each hdden unt computes: 100 = f a (2) j=1 j x j + b (1) j. W (1) Thnk of a (2) as some non-lnear feature of the nput x. Problem: Fnd x that maxmally actvates a (2) over x 2 1.
22 Example (UFLDL) 9/14 Tran an autoencoder on mages wth one hdden layer. Each hdden unt computes: 100 = f a (2) j=1 j x j + b (1) j. W (1) Thnk of a (2) as some non-lnear feature of the nput x. Problem: Fnd x that maxmally actvates a (2) over x 2 1. Clam: x j = W (1) j 100. (1) j=1 (W j )2
23 Example (UFLDL) Tran an autoencoder on mages wth one hdden layer. Each hdden unt computes: 100 = f a (2) j=1 j x j + b (1) j. W (1) Thnk of a (2) as some non-lnear feature of the nput x. Problem: Fnd x that maxmally actvates a (2) over x 2 1. Clam: x j = (Hnt: Use CauchySchwarz). W (1) j 100. (1) j=1 (W j )2 We can now dsplay the mage maxmzng a (2) for each. 9/14
24 Example (cont.) 10/ hdden unts on 10x10 pxel nputs: The derent hdden unts have learned to detect edges at derent postons and orentatons n the mage.
25 Usng convolutons 11/14 Idea: Certan sgnals are statonary,.e., ther statstcal propertes do not change n space or tme.
26 Usng convolutons 11/14 Idea: Certan sgnals are statonary,.e., ther statstcal propertes do not change n space or tme. For example, mages often have smlar statstcal propertes n derent regons n space.
27 Usng convolutons 11/14 Idea: Certan sgnals are statonary,.e., ther statstcal propertes do not change n space or tme. For example, mages often have smlar statstcal propertes n derent regons n space. That suggests that the features that we learn at one part of an mage can also be appled to other parts of the mage.
28 Usng convolutons 11/14 Idea: Certan sgnals are statonary,.e., ther statstcal propertes do not change n space or tme. For example, mages often have smlar statstcal propertes n derent regons n space. That suggests that the features that we learn at one part of an mage can also be appled to other parts of the mage. Can convolve the learned features wth the larger mage.
29 Usng convolutons 11/14 Idea: Certan sgnals are statonary,.e., ther statstcal propertes do not change n space or tme. For example, mages often have smlar statstcal propertes n derent regons n space. That suggests that the features that we learn at one part of an mage can also be appled to other parts of the mage. Can convolve the learned features wth the larger mage. Example: mage.
30 Usng convolutons 11/14 Idea: Certan sgnals are statonary,.e., ther statstcal propertes do not change n space or tme. For example, mages often have smlar statstcal propertes n derent regons n space. That suggests that the features that we learn at one part of an mage can also be appled to other parts of the mage. Can convolve the learned features wth the larger mage. Example: mage. Learn features on small 8 8 patches sampled randomly (e.g. usng a sparse autoencoder).
31 Usng convolutons 11/14 Idea: Certan sgnals are statonary,.e., ther statstcal propertes do not change n space or tme. For example, mages often have smlar statstcal propertes n derent regons n space. That suggests that the features that we learn at one part of an mage can also be appled to other parts of the mage. Can convolve the learned features wth the larger mage. Example: mage. Learn features on small 8 8 patches sampled randomly (e.g. usng a sparse autoencoder). Run the traned model through all 8 8 patches of the mage to get the feature actvatons.
32 Usng convolutons Idea: Certan sgnals are statonary,.e., ther statstcal propertes do not change n space or tme. For example, mages often have smlar statstcal propertes n derent regons n space. That suggests that the features that we learn at one part of an mage can also be appled to other parts of the mage. Can convolve the learned features wth the larger mage. Example: mage. Learn features on small 8 8 patches sampled randomly (e.g. usng a sparse autoencoder). Run the traned model through all 8 8 patches of the mage to get the feature actvatons. Source: UFLDL tutoral. 11/14
33 Poolng features 12/14 Once can also pool the features obtaned va convoluton.
34 Poolng features 12/14 Once can also pool the features obtaned va convoluton. For example, to descrbe a large mage, one natural approach s to aggregate statstcs of these features at varous locatons.
35 Poolng features 12/14 Once can also pool the features obtaned va convoluton. For example, to descrbe a large mage, one natural approach s to aggregate statstcs of these features at varous locatons. E.g. compute the mean, max, etc. over derent regons.
36 Poolng features 12/14 Once can also pool the features obtaned va convoluton. For example, to descrbe a large mage, one natural approach s to aggregate statstcs of these features at varous locatons. E.g. compute the mean, max, etc. over derent regons. Can lead to more robust features. Can lead to nvarant features.
37 Poolng features 12/14 Once can also pool the features obtaned va convoluton. For example, to descrbe a large mage, one natural approach s to aggregate statstcs of these features at varous locatons. E.g. compute the mean, max, etc. over derent regons. Can lead to more robust features. Can lead to nvarant features. For example, f the poolng regons are contguous, then the poolng unts wll be translaton nvarant,.e., they won't change much f objects n the mage are undergo a (small) translaton.
38 Poolng features 12/14 Once can also pool the features obtaned va convoluton. For example, to descrbe a large mage, one natural approach s to aggregate statstcs of these features at varous locatons. E.g. compute the mean, max, etc. over derent regons. Can lead to more robust features. Can lead to nvarant features. For example, f the poolng regons are contguous, then the poolng unts wll be translaton nvarant,.e., they won't change much f objects n the mage are undergo a (small) translaton.
39 R 13/14 We wll use the package h2o to tran neural networks wth R. To get you started, we wll construct a neural network wth 1 hdden layers contanng 2 neurons to learn the XOR functon: # Intalze h2o lbrary(h2o) h2o.nt(nthreads=-1, max_mem_sze="2g") h2o.removeall() # n case the cluster was # already runnng # Construct the XOR functon X = t(matrx(c(0,0,0,1,1,0,1,1),2,4)) y = matrx(c(-1,1,1,-1), 4) tran = as.h2o(cbnd(x,y))
40 R (cont.) 14/14 Tranng the model: # Tran model model <- h2o.deeplearnng(x = names(tran)[1:2], y = names(tran)[3], tranng_frame = tran, actvaton = "Tanh", hdden = c(2), nput_dropout_rato = 0.0, l1 = 0, epochs = 10000) # Test the model h2o.predct(model, tran) Some optons you may want to use when buldng more complcated models for data: actvaton = "RectferWthDropout" nput_dropout_rato = 0.2 l1 = 1e-5
Multilayer Perceptrons and Backpropagation. Perceptrons. Recap: Perceptrons. Informatics 1 CG: Lecture 6. Mirella Lapata
Multlayer Perceptrons and Informatcs CG: Lecture 6 Mrella Lapata School of Informatcs Unversty of Ednburgh mlap@nf.ed.ac.uk Readng: Kevn Gurney s Introducton to Neural Networks, Chapters 5 6.5 January,
More informationMulti-layer neural networks
Lecture 0 Mult-layer neural networks Mlos Hauskrecht mlos@cs.ptt.edu 5329 Sennott Square Lnear regresson w Lnear unts f () Logstc regresson T T = w = p( y =, w) = g( w ) w z f () = p ( y = ) w d w d Gradent
More informationMultilayer neural networks
Lecture Multlayer neural networks Mlos Hauskrecht mlos@cs.ptt.edu 5329 Sennott Square Mdterm exam Mdterm Monday, March 2, 205 In-class (75 mnutes) closed book materal covered by February 25, 205 Multlayer
More informationCS294A Lecture notes. Andrew Ng
CS294A Lecture notes Andrew Ng Sparse autoencoder 1 Introducton Supervsed learnng s one of the most powerful tools of AI, and has led to automatc zp code recognton, speech recognton, self-drvng cars, and
More informationAdmin NEURAL NETWORKS. Perceptron learning algorithm. Our Nervous System 10/25/16. Assignment 7. Class 11/22. Schedule for the rest of the semester
0/25/6 Admn Assgnment 7 Class /22 Schedule for the rest of the semester NEURAL NETWORKS Davd Kauchak CS58 Fall 206 Perceptron learnng algorthm Our Nervous System repeat untl convergence (or for some #
More informationCS294A Lecture notes. Andrew Ng
CS294A Lecture notes Andrew Ng Sparse autoencoder 1 Introducton Supervsed learnng s one of the most powerful tools of AI, and has led to automatc zp code recognton, speech recognton, self-drvng cars, and
More informationEEE 241: Linear Systems
EEE : Lnear Systems Summary #: Backpropagaton BACKPROPAGATION The perceptron rule as well as the Wdrow Hoff learnng were desgned to tran sngle layer networks. They suffer from the same dsadvantage: they
More informationFor now, let us focus on a specific model of neurons. These are simplified from reality but can achieve remarkable results.
Neural Networks : Dervaton compled by Alvn Wan from Professor Jtendra Malk s lecture Ths type of computaton s called deep learnng and s the most popular method for many problems, such as computer vson
More informationEvaluation of classifiers MLPs
Lecture Evaluaton of classfers MLPs Mlos Hausrecht mlos@cs.ptt.edu 539 Sennott Square Evaluaton For any data set e use to test the model e can buld a confuson matrx: Counts of examples th: class label
More informationSupporting Information
Supportng Informaton The neural network f n Eq. 1 s gven by: f x l = ReLU W atom x l + b atom, 2 where ReLU s the element-wse rectfed lnear unt, 21.e., ReLUx = max0, x, W atom R d d s the weght matrx to
More informationWeek 5: Neural Networks
Week 5: Neural Networks Instructor: Sergey Levne Neural Networks Summary In the prevous lecture, we saw how we can construct neural networks by extendng logstc regresson. Neural networks consst of multple
More information1 Convex Optimization
Convex Optmzaton We wll consder convex optmzaton problems. Namely, mnmzaton problems where the objectve s convex (we assume no constrants for now). Such problems often arse n machne learnng. For example,
More informationNeural networks. Nuno Vasconcelos ECE Department, UCSD
Neural networs Nuno Vasconcelos ECE Department, UCSD Classfcaton a classfcaton problem has two types of varables e.g. X - vector of observatons (features) n the world Y - state (class) of the world x X
More informationGeneralized Linear Methods
Generalzed Lnear Methods 1 Introducton In the Ensemble Methods the general dea s that usng a combnaton of several weak learner one could make a better learner. More formally, assume that we have a set
More informationMultilayer Perceptron (MLP)
Multlayer Perceptron (MLP) Seungjn Cho Department of Computer Scence and Engneerng Pohang Unversty of Scence and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjn@postech.ac.kr 1 / 20 Outlne
More informationMultigradient for Neural Networks for Equalizers 1
Multgradent for Neural Netorks for Equalzers 1 Chulhee ee, Jnook Go and Heeyoung Km Department of Electrcal and Electronc Engneerng Yonse Unversty 134 Shnchon-Dong, Seodaemun-Ku, Seoul 1-749, Korea ABSTRACT
More informationLecture 23: Artificial neural networks
Lecture 23: Artfcal neural networks Broad feld that has developed over the past 20 to 30 years Confluence of statstcal mechancs, appled math, bology and computers Orgnal motvaton: mathematcal modelng of
More informationLecture 10 Support Vector Machines II
Lecture 10 Support Vector Machnes II 22 February 2016 Taylor B. Arnold Yale Statstcs STAT 365/665 1/28 Notes: Problem 3 s posted and due ths upcomng Frday There was an early bug n the fake-test data; fxed
More informationMATH 829: Introduction to Data Mining and Analysis The EM algorithm (part 2)
1/16 MATH 829: Introducton to Data Mnng and Analyss The EM algorthm (part 2) Domnque Gullot Departments of Mathematcal Scences Unversty of Delaware Aprl 20, 2016 Recall 2/16 We are gven ndependent observatons
More informationIntroduction to the Introduction to Artificial Neural Network
Introducton to the Introducton to Artfcal Neural Netork Vuong Le th Hao Tang s sldes Part of the content of the sldes are from the Internet (possbly th modfcatons). The lecturer does not clam any onershp
More information1 Input-Output Mappings. 2 Hebbian Failure. 3 Delta Rule Success.
Task Learnng 1 / 27 1 Input-Output Mappngs. 2 Hebban Falure. 3 Delta Rule Success. Input-Output Mappngs 2 / 27 0 1 2 3 4 5 6 7 8 9 Output 3 8 2 7 Input 5 6 0 9 1 4 Make approprate: Response gven stmulus.
More informationMLE and Bayesian Estimation. Jie Tang Department of Computer Science & Technology Tsinghua University 2012
MLE and Bayesan Estmaton Je Tang Department of Computer Scence & Technology Tsnghua Unversty 01 1 Lnear Regresson? As the frst step, we need to decde how we re gong to represent the functon f. One example:
More informationCHALMERS, GÖTEBORGS UNIVERSITET. SOLUTIONS to RE-EXAM for ARTIFICIAL NEURAL NETWORKS. COURSE CODES: FFR 135, FIM 720 GU, PhD
CHALMERS, GÖTEBORGS UNIVERSITET SOLUTIONS to RE-EXAM for ARTIFICIAL NEURAL NETWORKS COURSE CODES: FFR 35, FIM 72 GU, PhD Tme: Place: Teachers: Allowed materal: Not allowed: January 2, 28, at 8 3 2 3 SB
More informationNeural Networks. Perceptrons and Backpropagation. Silke Bussen-Heyen. 5th of Novemeber Universität Bremen Fachbereich 3. Neural Networks 1 / 17
Neural Networks Perceptrons and Backpropagaton Slke Bussen-Heyen Unverstät Bremen Fachberech 3 5th of Novemeber 2012 Neural Networks 1 / 17 Contents 1 Introducton 2 Unts 3 Network structure 4 Snglelayer
More informationSolving Nonlinear Differential Equations by a Neural Network Method
Solvng Nonlnear Dfferental Equatons by a Neural Network Method Luce P. Aarts and Peter Van der Veer Delft Unversty of Technology, Faculty of Cvlengneerng and Geoscences, Secton of Cvlengneerng Informatcs,
More informationINF 5860 Machine learning for image classification. Lecture 3 : Image classification and regression part II Anne Solberg January 31, 2018
INF 5860 Machne learnng for mage classfcaton Lecture 3 : Image classfcaton and regresson part II Anne Solberg January 3, 08 Today s topcs Multclass logstc regresson and softma Regularzaton Image classfcaton
More information10-701/ Machine Learning, Fall 2005 Homework 3
10-701/15-781 Machne Learnng, Fall 2005 Homework 3 Out: 10/20/05 Due: begnnng of the class 11/01/05 Instructons Contact questons-10701@autonlaborg for queston Problem 1 Regresson and Cross-valdaton [40
More informationSupport Vector Machines. Vibhav Gogate The University of Texas at dallas
Support Vector Machnes Vbhav Gogate he Unversty of exas at dallas What We have Learned So Far? 1. Decson rees. Naïve Bayes 3. Lnear Regresson 4. Logstc Regresson 5. Perceptron 6. Neural networks 7. K-Nearest
More informationCIS526: Machine Learning Lecture 3 (Sept 16, 2003) Linear Regression. Preparation help: Xiaoying Huang. x 1 θ 1 output... θ M x M
CIS56: achne Learnng Lecture 3 (Sept 6, 003) Preparaton help: Xaoyng Huang Lnear Regresson Lnear regresson can be represented by a functonal form: f(; θ) = θ 0 0 +θ + + θ = θ = 0 ote: 0 s a dummy attrbute
More informationGradient Descent Learning and Backpropagation
Artfcal Neural Networks (art 2) Chrstan Jacob Gradent Descent Learnng and Backpropagaton CSC 533 Wnter 200 Learnng by Gradent Descent Defnton of the Learnng roble Let us start wth the sple case of lnear
More informationLinear Feature Engineering 11
Lnear Feature Engneerng 11 2 Least-Squares 2.1 Smple least-squares Consder the followng dataset. We have a bunch of nputs x and correspondng outputs y. The partcular values n ths dataset are x y 0.23 0.19
More informationEnsemble Methods: Boosting
Ensemble Methods: Boostng Ncholas Ruozz Unversty of Texas at Dallas Based on the sldes of Vbhav Gogate and Rob Schapre Last Tme Varance reducton va baggng Generate new tranng data sets by samplng wth replacement
More informationSDMML HT MSc Problem Sheet 4
SDMML HT 06 - MSc Problem Sheet 4. The recever operatng characterstc ROC curve plots the senstvty aganst the specfcty of a bnary classfer as the threshold for dscrmnaton s vared. Let the data space be
More informationLecture Notes on Linear Regression
Lecture Notes on Lnear Regresson Feng L fl@sdueducn Shandong Unversty, Chna Lnear Regresson Problem In regresson problem, we am at predct a contnuous target value gven an nput feature vector We assume
More informationClassification as a Regression Problem
Target varable y C C, C,, ; Classfcaton as a Regresson Problem { }, 3 L C K To treat classfcaton as a regresson problem we should transform the target y nto numercal values; The choce of numercal class
More informationDeep Learning. Boyang Albert Li, Jie Jay Tan
Deep Learnng Boyang Albert L, Je Jay Tan An Unrelated Vdeo A bcycle controller learned usng NEAT (Stanley) What do you mean, deep? Shallow Hdden Markov models ANNs wth one hdden layer Manually selected
More informationMulti layer feed-forward NN FFNN. XOR problem. XOR problem. Neural Network for Speech. NETtalk (Sejnowski & Rosenberg, 1987) NETtalk (contd.
NN 3-00 Mult layer feed-forard NN FFNN We consder a more general netor archtecture: beteen the nput and output layers there are hdden layers, as llustrated belo. Hdden nodes do not drectly send outputs
More informationHopfield networks and Boltzmann machines. Geoffrey Hinton et al. Presented by Tambet Matiisen
Hopfeld networks and Boltzmann machnes Geoffrey Hnton et al. Presented by Tambet Matsen 18.11.2014 Hopfeld network Bnary unts Symmetrcal connectons http://www.nnwj.de/hopfeld-net.html Energy functon The
More informationNeural Networks. Class 22: MLSP, Fall 2016 Instructor: Bhiksha Raj
Neural Networs Class 22: MLSP, Fall 2016 Instructor: Bhsha Raj IMPORTANT ADMINSTRIVIA Fnal wee. Project presentatons on 6th 18797/11755 2 Neural Networs are tang over! Neural networs have become one of
More informationLogistic Classifier CISC 5800 Professor Daniel Leeds
lon 9/7/8 Logstc Classfer CISC 58 Professor Danel Leeds Classfcaton strategy: generatve vs. dscrmnatve Generatve, e.g., Bayes/Naïve Bayes: 5 5 Identfy probablty dstrbuton for each class Determne class
More informationC4B Machine Learning Answers II. = σ(z) (1 σ(z)) 1 1 e z. e z = σ(1 σ) (1 + e z )
C4B Machne Learnng Answers II.(a) Show that for the logstc sgmod functon dσ(z) dz = σ(z) ( σ(z)) A. Zsserman, Hlary Term 20 Start from the defnton of σ(z) Note that Then σ(z) = σ = dσ(z) dz = + e z e z
More informationKernel Methods and SVMs Extension
Kernel Methods and SVMs Extenson The purpose of ths document s to revew materal covered n Machne Learnng 1 Supervsed Learnng regardng support vector machnes (SVMs). Ths document also provdes a general
More informationModule 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur
Module 3 LOSSY IMAGE COMPRESSION SYSTEMS Verson ECE IIT, Kharagpur Lesson 6 Theory of Quantzaton Verson ECE IIT, Kharagpur Instructonal Objectves At the end of ths lesson, the students should be able to:
More informationFourier Transform. Additive noise. Fourier Tansform. I = S + N. Noise doesn t depend on signal. We ll consider:
Flterng Announcements HW2 wll be posted later today Constructng a mosac by warpng mages. CSE252A Lecture 10a Flterng Exampel: Smoothng by Averagng Kernel: (From Bll Freeman) m=2 I Kernel sze s m+1 by m+1
More informationLINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity
LINEAR REGRESSION ANALYSIS MODULE IX Lecture - 31 Multcollnearty Dr. Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur 6. Rdge regresson The OLSE s the best lnear unbased
More informationTracking with Kalman Filter
Trackng wth Kalman Flter Scott T. Acton Vrgna Image and Vdeo Analyss (VIVA), Charles L. Brown Department of Electrcal and Computer Engneerng Department of Bomedcal Engneerng Unversty of Vrgna, Charlottesvlle,
More informationVideo Data Analysis. Video Data Analysis, B-IT
Lecture Vdeo Data Analyss Deformable Snakes Segmentaton Neural networks Lecture plan:. Segmentaton by morphologcal watershed. Deformable snakes 3. Segmentaton va classfcaton of patterns 4. Concept of a
More informationCSE 546 Midterm Exam, Fall 2014(with Solution)
CSE 546 Mdterm Exam, Fall 014(wth Soluton) 1. Personal nfo: Name: UW NetID: Student ID:. There should be 14 numbered pages n ths exam (ncludng ths cover sheet). 3. You can use any materal you brought:
More informationInternet Engineering. Jacek Mazurkiewicz, PhD Softcomputing. Part 3: Recurrent Artificial Neural Networks Self-Organising Artificial Neural Networks
Internet Engneerng Jacek Mazurkewcz, PhD Softcomputng Part 3: Recurrent Artfcal Neural Networks Self-Organsng Artfcal Neural Networks Recurrent Artfcal Neural Networks Feedback sgnals between neurons Dynamc
More informationModel of Neurons. CS 416 Artificial Intelligence. Early History of Neural Nets. Cybernetics. McCulloch-Pitts Neurons. Hebbian Modification.
Page 1 Model of Neurons CS 416 Artfcal Intellgence Lecture 18 Neural Nets Chapter 20 Multple nputs/dendrtes (~10,000!!!) Cell body/soma performs computaton Sngle output/axon Computaton s typcally modeled
More informationFeature Selection: Part 1
CSE 546: Machne Learnng Lecture 5 Feature Selecton: Part 1 Instructor: Sham Kakade 1 Regresson n the hgh dmensonal settng How do we learn when the number of features d s greater than the sample sze n?
More informationA neural network with localized receptive fields for visual pattern classification
Unversty of Wollongong Research Onlne Faculty of Informatcs - Papers (Archve) Faculty of Engneerng and Informaton Scences 2005 A neural network wth localzed receptve felds for vsual pattern classfcaton
More informationHomework Assignment 3 Due in class, Thursday October 15
Homework Assgnment 3 Due n class, Thursday October 15 SDS 383C Statstcal Modelng I 1 Rdge regresson and Lasso 1. Get the Prostrate cancer data from http://statweb.stanford.edu/~tbs/elemstatlearn/ datasets/prostate.data.
More informationPattern Classification
Pattern Classfcaton All materals n these sldes ere taken from Pattern Classfcaton (nd ed) by R. O. Duda, P. E. Hart and D. G. Stork, John Wley & Sons, 000 th the permsson of the authors and the publsher
More informationWhy feed-forward networks are in a bad shape
Why feed-forward networks are n a bad shape Patrck van der Smagt, Gerd Hrznger Insttute of Robotcs and System Dynamcs German Aerospace Center (DLR Oberpfaffenhofen) 82230 Wesslng, GERMANY emal smagt@dlr.de
More informationVQ widely used in coding speech, image, and video
at Scalar quantzers are specal cases of vector quantzers (VQ): they are constraned to look at one sample at a tme (memoryless) VQ does not have such constrant better RD perfomance expected Source codng
More informationLecture 10 Support Vector Machines. Oct
Lecture 10 Support Vector Machnes Oct - 20-2008 Lnear Separators Whch of the lnear separators s optmal? Concept of Margn Recall that n Perceptron, we learned that the convergence rate of the Perceptron
More informationOnline Classification: Perceptron and Winnow
E0 370 Statstcal Learnng Theory Lecture 18 Nov 8, 011 Onlne Classfcaton: Perceptron and Wnnow Lecturer: Shvan Agarwal Scrbe: Shvan Agarwal 1 Introducton In ths lecture we wll start to study the onlne learnng
More informationLogistic Regression Maximum Likelihood Estimation
Harvard-MIT Dvson of Health Scences and Technology HST.951J: Medcal Decson Support, Fall 2005 Instructors: Professor Lucla Ohno-Machado and Professor Staal Vnterbo 6.873/HST.951 Medcal Decson Support Fall
More informationFundamentals of Computational Neuroscience 2e
Fundamentals of Computatonal Neuroscence e Thomas Trappenberg February 7, 9 Chapter 6: Feed-forward mappng networks Dgtal representaton of letter A 3 3 4 5 3 33 4 5 34 35
More informationCSC321 Tutorial 9: Review of Boltzmann machines and simulated annealing
CSC321 Tutoral 9: Revew of Boltzmann machnes and smulated annealng (Sldes based on Lecture 16-18 and selected readngs) Yue L Emal: yuel@cs.toronto.edu Wed 11-12 March 19 Fr 10-11 March 21 Outlne Boltzmann
More informationTechnical Report: Multidimensional, Downsampled Convolution for Autoencoders
Techncal Report: Multdmensonal, Downsampled Convoluton for Autoencoders Ian Goodfellow August 9, 2010 Abstract Ths techncal report descrbes dscrete convoluton wth a multdmensonal kernel. Convoluton mplements
More informationUsing deep belief network modelling to characterize differences in brain morphometry in schizophrenia
Usng deep belef network modellng to characterze dfferences n bran morphometry n schzophrena Walter H. L. Pnaya * a ; Ary Gadelha b ; Orla M. Doyle c ; Crstano Noto b ; André Zugman d ; Qurno Cordero b,
More informationTHE SUMMATION NOTATION Ʃ
Sngle Subscrpt otaton THE SUMMATIO OTATIO Ʃ Most of the calculatons we perform n statstcs are repettve operatons on lsts of numbers. For example, we compute the sum of a set of numbers, or the sum of the
More informationLinear Classification, SVMs and Nearest Neighbors
1 CSE 473 Lecture 25 (Chapter 18) Lnear Classfcaton, SVMs and Nearest Neghbors CSE AI faculty + Chrs Bshop, Dan Klen, Stuart Russell, Andrew Moore Motvaton: Face Detecton How do we buld a classfer to dstngush
More informationEEL 6266 Power System Operation and Control. Chapter 3 Economic Dispatch Using Dynamic Programming
EEL 6266 Power System Operaton and Control Chapter 3 Economc Dspatch Usng Dynamc Programmng Pecewse Lnear Cost Functons Common practce many utltes prefer to represent ther generator cost functons as sngle-
More information2 S. S. DRAGOMIR, N. S. BARNETT, AND I. S. GOMM Theorem. Let V :(d d)! R be a twce derentable varogram havng the second dervatve V :(d d)! R whch s bo
J. KSIAM Vol.4, No., -7, 2 FURTHER BOUNDS FOR THE ESTIMATION ERROR VARIANCE OF A CONTINUOUS STREAM WITH STATIONARY VARIOGRAM S. S. DRAGOMIR, N. S. BARNETT, AND I. S. GOMM Abstract. In ths paper we establsh
More informationUsing T.O.M to Estimate Parameter of distributions that have not Single Exponential Family
IOSR Journal of Mathematcs IOSR-JM) ISSN: 2278-5728. Volume 3, Issue 3 Sep-Oct. 202), PP 44-48 www.osrjournals.org Usng T.O.M to Estmate Parameter of dstrbutons that have not Sngle Exponental Famly Jubran
More informationKernels in Support Vector Machines. Based on lectures of Martin Law, University of Michigan
Kernels n Support Vector Machnes Based on lectures of Martn Law, Unversty of Mchgan Non Lnear separable problems AND OR NOT() The XOR problem cannot be solved wth a perceptron. XOR Per Lug Martell - Systems
More informationA linear imaging system with white additive Gaussian noise on the observed data is modeled as follows:
Supplementary Note Mathematcal bacground A lnear magng system wth whte addtve Gaussan nose on the observed data s modeled as follows: X = R ϕ V + G, () where X R are the expermental, two-dmensonal proecton
More informationDeep Learning: A Quick Overview
Deep Learnng: A Quck Overvew Seungjn Cho Department of Computer Scence and Engneerng Pohang Unversty of Scence and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjn@postech.ac.kr http://mlg.postech.ac.kr/
More informationLecture 21: Numerical methods for pricing American type derivatives
Lecture 21: Numercal methods for prcng Amercan type dervatves Xaoguang Wang STAT 598W Aprl 10th, 2014 (STAT 598W) Lecture 21 1 / 26 Outlne 1 Fnte Dfference Method Explct Method Penalty Method (STAT 598W)
More informationDeep Learning for Causal Inference
Deep Learnng for Causal Inference Vkas Ramachandra Stanford Unversty Graduate School of Busness 655 Knght Way, Stanford, CA 94305 Abstract In ths paper, we propose the use of deep learnng technques n econometrcs,
More informationKristin P. Bennett. Rensselaer Polytechnic Institute
Support Vector Machnes and Other Kernel Methods Krstn P. Bennett Mathematcal Scences Department Rensselaer Polytechnc Insttute Support Vector Machnes (SVM) A methodology for nference based on Statstcal
More informationMachine Learning & Data Mining CS/CNS/EE 155. Lecture 4: Regularization, Sparsity & Lasso
Machne Learnng Data Mnng CS/CS/EE 155 Lecture 4: Regularzaton, Sparsty Lasso 1 Recap: Complete Ppelne S = {(x, y )} Tranng Data f (x, b) = T x b Model Class(es) L(a, b) = (a b) 2 Loss Functon,b L( y, f
More informationCS 3710: Visual Recognition Classification and Detection. Adriana Kovashka Department of Computer Science January 13, 2015
CS 3710: Vsual Recognton Classfcaton and Detecton Adrana Kovashka Department of Computer Scence January 13, 2015 Plan for Today Vsual recognton bascs part 2: Classfcaton and detecton Adrana s research
More informationManifold Learning for Complex Visual Analytics: Benefits from and to Neural Architectures
Manfold Learnng for Complex Vsual Analytcs: Benefts from and to Neural Archtectures Stephane Marchand-Mallet Vper group Unversty of Geneva Swtzerland Edgar Roman-Rangel, Ke Sun (Vper) A. Agocs, D. Dardans,
More informationNeural Networks. Neural Network Motivation. Why Neural Networks? Comments on Blue Gene. More Comments on Blue Gene
Motvaton for non-lnear Classfers Neural Networs CPS 27 Ron Parr Lnear methods are wea Mae strong assumptons Can only express relatvely smple functons of nputs Comng up wth good features can be hard Why
More informationTraining Convolutional Neural Networks
Tranng Convolutonal Neural Networks Carlo Tomas November 26, 208 The Soft-Max Smplex Neural networks are typcally desgned to compute real-valued functons y = h(x) : R d R e of ther nput x When a classfer
More informationRegression Analysis. Regression Analysis
Regresson Analyss Smple Regresson Multvarate Regresson Stepwse Regresson Replcaton and Predcton Error 1 Regresson Analyss In general, we "ft" a model by mnmzng a metrc that represents the error. n mn (y
More informationProblem Set 9 Solutions
Desgn and Analyss of Algorthms May 4, 2015 Massachusetts Insttute of Technology 6.046J/18.410J Profs. Erk Demane, Srn Devadas, and Nancy Lynch Problem Set 9 Solutons Problem Set 9 Solutons Ths problem
More informationP R. Lecture 4. Theory and Applications of Pattern Recognition. Dept. of Electrical and Computer Engineering /
Theory and Applcatons of Pattern Recognton 003, Rob Polkar, Rowan Unversty, Glassboro, NJ Lecture 4 Bayes Classfcaton Rule Dept. of Electrcal and Computer Engneerng 0909.40.0 / 0909.504.04 Theory & Applcatons
More informationCSCI B609: Foundations of Data Science
CSCI B609: Foundatons of Data Scence Lecture 13/14: Gradent Descent, Boostng and Learnng from Experts Sldes at http://grgory.us/data-scence-class.html Grgory Yaroslavtsev http://grgory.us Constraned Convex
More informationCOMP th April, 2007 Clement Pang
COMP 540 12 th Aprl, 2007 Cleent Pang Boostng Cobnng weak classers Fts an Addtve Model Is essentally Forward Stagewse Addtve Modelng wth Exponental Loss Loss Functons Classcaton: Msclasscaton, Exponental,
More informationSupport Vector Machines
Support Vector Machnes Konstantn Tretyakov (kt@ut.ee) MTAT.03.227 Machne Learnng So far Supervsed machne learnng Lnear models Least squares regresson Fsher s dscrmnant, Perceptron, Logstc model Non-lnear
More information1 Derivation of Point-to-Plane Minimization
1 Dervaton of Pont-to-Plane Mnmzaton Consder the Chen-Medon (pont-to-plane) framework for ICP. Assume we have a collecton of ponts (p, q ) wth normals n. We want to determne the optmal rotaton and translaton
More informationBoostrapaggregating (Bagging)
Boostrapaggregatng (Baggng) An ensemble meta-algorthm desgned to mprove the stablty and accuracy of machne learnng algorthms Can be used n both regresson and classfcaton Reduces varance and helps to avod
More informationGaussian Mixture Models
Lab Gaussan Mxture Models Lab Objectve: Understand the formulaton of Gaussan Mxture Models (GMMs) and how to estmate GMM parameters. You ve already seen GMMs as the observaton dstrbuton n certan contnuous
More informationLab 2e Thermal System Response and Effective Heat Transfer Coefficient
58:080 Expermental Engneerng 1 OBJECTIVE Lab 2e Thermal System Response and Effectve Heat Transfer Coeffcent Warnng: though the experment has educatonal objectves (to learn about bolng heat transfer, etc.),
More informationThe exam is closed book, closed notes except your one-page cheat sheet.
CS 89 Fall 206 Introducton to Machne Learnng Fnal Do not open the exam before you are nstructed to do so The exam s closed book, closed notes except your one-page cheat sheet Usage of electronc devces
More informationCOS 511: Theoretical Machine Learning. Lecturer: Rob Schapire Lecture #16 Scribe: Yannan Wang April 3, 2014
COS 511: Theoretcal Machne Learnng Lecturer: Rob Schapre Lecture #16 Scrbe: Yannan Wang Aprl 3, 014 1 Introducton The goal of our onlne learnng scenaro from last class s C comparng wth best expert and
More informationLogistic Regression. CAP 5610: Machine Learning Instructor: Guo-Jun QI
Logstc Regresson CAP 561: achne Learnng Instructor: Guo-Jun QI Bayes Classfer: A Generatve model odel the posteror dstrbuton P(Y X) Estmate class-condtonal dstrbuton P(X Y) for each Y Estmate pror dstrbuton
More informationBlocky models via the L1/L2 hybrid norm
Blocky models va the L1/L2 hybrd norm Jon Claerbout ABSTRACT Ths paper seeks to defne robust, effcent solvers of regressons of L1 nature wth two goals: (1) straghtforward parameterzaton, and (2) blocky
More informationSupport Vector Machines
Support Vector Machnes Konstantn Tretyakov (kt@ut.ee) MTAT.03.227 Machne Learnng So far So far Supervsed machne learnng Lnear models Non-lnear models Unsupervsed machne learnng Generc scaffoldng So far
More informationChapter 9: Statistical Inference and the Relationship between Two Variables
Chapter 9: Statstcal Inference and the Relatonshp between Two Varables Key Words The Regresson Model The Sample Regresson Equaton The Pearson Correlaton Coeffcent Learnng Outcomes After studyng ths chapter,
More informationThe Study of Teaching-learning-based Optimization Algorithm
Advanced Scence and Technology Letters Vol. (AST 06), pp.05- http://dx.do.org/0.57/astl.06. The Study of Teachng-learnng-based Optmzaton Algorthm u Sun, Yan fu, Lele Kong, Haolang Q,, Helongang Insttute
More informationCS 468 Lecture 16: Isometry Invariance and Spectral Techniques
CS 468 Lecture 16: Isometry Invarance and Spectral Technques Justn Solomon Scrbe: Evan Gawlk Introducton. In geometry processng, t s often desrable to characterze the shape of an object n a manner that
More informationClassification (klasifikácia) Feedforward Multi-Layer Perceptron (Dopredná viacvrstvová sieť) 14/11/2016. Perceptron (Frank Rosenblatt, 1957)
4//06 IAI: Lecture 09 Feedforard Mult-Layer Percetron (Doredná vacvrstvová seť) Lubca Benuskova AIMA 3rd ed. Ch. 8.6.4 8.7.5 Classfcaton (klasfkáca) In machne learnng and statstcs, classfcaton s the roblem
More informationNon-linear Canonical Correlation Analysis Using a RBF Network
ESANN' proceedngs - European Smposum on Artfcal Neural Networks Bruges (Belgum), 4-6 Aprl, d-sde publ., ISBN -97--, pp. 57-5 Non-lnear Canoncal Correlaton Analss Usng a RBF Network Sukhbnder Kumar, Elane
More informationMULTISPECTRAL IMAGE CLASSIFICATION USING BACK-PROPAGATION NEURAL NETWORK IN PCA DOMAIN
MULTISPECTRAL IMAGE CLASSIFICATION USING BACK-PROPAGATION NEURAL NETWORK IN PCA DOMAIN S. Chtwong, S. Wtthayapradt, S. Intajag, and F. Cheevasuvt Faculty of Engneerng, Kng Mongkut s Insttute of Technology
More information