Deep Learning CMSC 422 MARINE CARPUAT. Based on slides by Vlad Morariu
Transcription
1 Deep Learning CMSC 422 MARINE CARPUAT Based on slides by Vlad Morariu
2 Standard Application of Machine Learning to Computer Vision. Pipeline: feature extraction produces features, then classification produces predicted labels ("cat" or "background"). Training uses supervision (e.g. the label "cat"). Features: e.g., Scale Invariant Feature Transform (SIFT). Classifiers: SVM, Random Forests, KNN, ... Features are hand-crafted, not trained, so performance is eventually limited by feature quality. Cat image credit:
3 Deep learning: multi-layer neural networks learn features and classifiers directly ("end-to-end" training), so supervision trains both the features and the classifier. A breakthrough in Computer Vision, now in other AI areas. Image credit: LeCun, Y., Bottou, L., Bengio, Y., Haffner, P. Gradient-based learning applied to document recognition. Proceedings of the IEEE.
4 Speech Recognition. Slide credit: Bohyung Han
5 Image Classification Performance. [Figure: image classification top-5 errors (%).] Figure from: K. He, X. Zhang, S. Ren, J. Sun. Deep Residual Learning for Image Recognition. arXiv (slides). Slide credit: Bohyung Han
6 Today's lecture: key concepts. Convolutional Neural Networks. Revisiting Backpropagation and Gradient Descent for Deep Networks.
7 Multi-Layer Perceptron (MLP). Image source:
8 Neural Networks Applied to Vision. LeCun, Y.; Boser, B.; Denker, J.; Henderson, D.; Howard, R.; Hubbard, W.; Jackel, L. Backpropagation Applied to Handwritten Zip Code Recognition. Neural Computation, 1989. USPS digit recognition, later check reading. Convolution, pooling ("weight sharing"), fully connected layers. Image credit: LeCun, Y., Bottou, L., Bengio, Y., Haffner, P. Gradient-based learning applied to document recognition. Proceedings of the IEEE.
9 Architecture overview. Components: convolution layers, pooling/subsampling layers, fully connected layers. Image credit: LeCun, Y., Bottou, L., Bengio, Y., Haffner, P. Gradient-based learning applied to document recognition. Proceedings of the IEEE.
10 Convolutional Layer. 32x32x3 image: width 32, height 32, depth 3. Slide credit: Fei-Fei Li, Andrej Karpathy, and Justin Johnson
11 Convolutional Layer. 32x32x3 image, 5x5x3 filter. Convolve the filter with the image, i.e. "slide over the image spatially, computing dot products". Slide credit: Fei-Fei Li, Andrej Karpathy, and Justin Johnson, 11 Jan 2016
12 Convolutional Layer. 32x32x3 image, 5x5x3 filter. Filters always extend the full depth of the input volume. Convolve the filter with the image, i.e. "slide over the image spatially, computing dot products". Slide credit: Fei-Fei Li, Andrej Karpathy, and Justin Johnson
13 Convolutional Layer. 32x32x3 image, 5x5x3 filter. One number: the result of taking a dot product between the filter and a small 5x5x3 chunk of the image (i.e. a 5*5*3 = 75-dimensional dot product + bias). Slide credit: Fei-Fei Li, Andrej Karpathy, and Justin Johnson
14 Convolutional Layer. 32x32x3 image, 5x5x3 filter. Convolve (slide) over all spatial locations to obtain a 28x28 activation map. Slide credit: Fei-Fei Li, Andrej Karpathy, and Justin Johnson
15 Convolutional Layer. Consider a second, green filter. 32x32x3 image, 5x5x3 filter. Convolve (slide) over all spatial locations to obtain a second 28x28 activation map. Slide credit: Fei-Fei Li, Andrej Karpathy, and Justin Johnson
16 Convolutional Layer. For example, if we had 6 5x5 filters, we'll get 6 separate activation maps. We stack these up to get a "new image" of size 28x28x6! Slide credit: Fei-Fei Li, Andrej Karpathy, and Justin Johnson
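The shape arithmetic on slides 13 to 16 can be checked with a small sketch. It assumes stride 1 and no padding, which is what the 32 -> 28 reduction implies; the function name `conv2d_valid` and the random image and filter values are illustrative and not from the slides.

```python
import random

def conv2d_valid(image, filt, bias=0.0):
    """Slide one k x k x C filter over an H x W x C image (stride 1, no padding).

    Each output value is a dot product between the filter and one
    k x k x C chunk of the image, plus a bias.
    """
    H, W, C = len(image), len(image[0]), len(image[0][0])
    k = len(filt)
    out = []
    for i in range(H - k + 1):
        row = []
        for j in range(W - k + 1):
            s = bias
            for di in range(k):
                for dj in range(k):
                    for c in range(C):
                        s += image[i + di][j + dj][c] * filt[di][dj][c]
            row.append(s)
        out.append(row)
    return out

random.seed(0)
# Hypothetical 32x32x3 image and six 5x5x3 filters with random values.
image = [[[random.random() for _ in range(3)] for _ in range(32)] for _ in range(32)]
filters = [[[[random.gauss(0, 1) for _ in range(3)] for _ in range(5)]
            for _ in range(5)] for _ in range(6)]

# One activation map per filter; stacking them gives a 28x28x6 volume.
maps = [conv2d_valid(image, f) for f in filters]
print(len(maps), len(maps[0]), len(maps[0][0]))  # 6 28 28
```

The output side length is 32 - 5 + 1 = 28, and the depth of the new volume equals the number of filters.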
17 Convolutional Layer. A ConvNet is a sequence of Convolutional Layers, interspersed with activation functions: 32x32x3 input -> CONV, ReLU (e.g. 6 5x5x3 filters) -> 28x28x6. Slide credit: Fei-Fei Li, Andrej Karpathy, and Justin Johnson
18 Convolutional Layer. A ConvNet is a sequence of Convolutional Layers, interspersed with activation functions: 32x32x3 input -> CONV, ReLU (e.g. 6 5x5x3 filters) -> 28x28x6 -> CONV, ReLU (e.g. 10 5x5x6 filters) -> 24x24x10 -> CONV, ReLU -> ... Slide credit: Fei-Fei Li, Andrej Karpathy, and Justin Johnson
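19 Rectified Linear Units (ReLU). Use a rectified linear function instead of the sigmoid: $\mathrm{ReLU}(x) = \max(0, x)$. Advantages: fast to compute; no vanishing gradient for active units (the derivative is 1 whenever $x > 0$).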
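A minimal sketch comparing the two activations and their derivatives. The helper names are illustrative, and the derivative of ReLU at exactly 0 is set to 0 here by convention (an assumption, since the slide does not specify it).

```python
import math

def relu(x):
    return max(0.0, x)

def relu_grad(x):
    # 1 for positive inputs, 0 otherwise (value at x == 0 chosen by convention).
    return 1.0 if x > 0 else 0.0

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)

for x in (-5.0, 0.5, 5.0):
    print(x, relu(x), relu_grad(x), sigmoid_grad(x))
```

For large positive inputs the sigmoid derivative shrinks toward 0 while the ReLU derivative stays at 1; for negative inputs the ReLU derivative is 0.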
20 Pooling Layer: makes the representations smaller and more manageable; operates over each activation map independently. Slide credit: Fei-Fei Li, Andrej Karpathy, and Justin Johnson
21 Pooling Layer: MAX POOLING. On a single depth slice (axes x and y), max pool with 2x2 filters and stride 2. Slide credit: Fei-Fei Li, Andrej Karpathy, and Justin Johnson
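Max pooling on one depth slice can be sketched as follows, assuming 2x2 windows and stride 2 (non-overlapping windows). The 4x4 input values are illustrative, since the numbers on the original slide did not survive transcription.

```python
def max_pool_2x2(slice2d):
    """Max pool a single depth slice with 2x2 windows and stride 2."""
    H, W = len(slice2d), len(slice2d[0])
    return [[max(slice2d[i][j], slice2d[i][j + 1],
                 slice2d[i + 1][j], slice2d[i + 1][j + 1])
             for j in range(0, W - 1, 2)]
            for i in range(0, H - 1, 2)]

x = [[1, 1, 2, 4],
     [5, 6, 7, 8],
     [3, 2, 1, 0],
     [1, 2, 3, 4]]
print(max_pool_2x2(x))  # [[6, 8], [3, 4]]
```

Each 2x2 block is replaced by its largest value, so a 4x4 slice becomes 2x2; in a full layer the same operation is applied to every activation map independently.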
22 Convolutional filter visualization. [From recent Yann LeCun slides] Slide credit: Fei-Fei Li, Andrej Karpathy, and Justin Johnson
23 Convolutional filter visualization. One filter => one activation map. Example 5x5 filters (32 total). We call the layer convolutional because it is related to convolution of two signals: elementwise multiplication and sum of a filter and the signal (image). Slide credit: Fei-Fei Li, Andrej Karpathy, and Justin Johnson
24 Today's lecture: key concepts. Convolutional Neural Networks. Revisiting Backpropagation and Gradient Descent for Deep Networks.
25 Multi-Layer Perceptron (MLP). Image source:
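26 Single neuron gradient. Inputs $x_1, \dots, x_d$ with weights $w_1, \dots, w_d$ and bias $b$ feed a sum $\Sigma$ followed by a sigmoid, which produces $\hat{y}$ and the loss $L$.
$z = b + \sum_i w_i x_i$, $\hat{y} = \sigma(z) = \frac{1}{1 + e^{-z}}$, $L = \frac{1}{2}(y - \hat{y})^2$
$\frac{\partial z}{\partial w_i} = x_i$, $\frac{d\hat{y}}{dz} = \hat{y}(1 - \hat{y})$, $\frac{\partial L}{\partial \hat{y}} = \hat{y} - y$
Chain rule: $\frac{\partial L}{\partial w_i} = \frac{\partial \hat{y}}{\partial w_i}\frac{\partial L}{\partial \hat{y}} = \frac{\partial z}{\partial w_i}\frac{d\hat{y}}{dz}\frac{\partial L}{\partial \hat{y}} = x_i\,\hat{y}(1 - \hat{y})(\hat{y} - y)$
Slide credit: Adapted from Bohyung Han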
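The chain-rule result for a single sigmoid neuron with squared loss can be verified numerically. This is a sketch under illustrative values for `w`, `b`, `x` and `y` (not from the slides); it compares the closed-form gradient with a central finite difference.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(w, b, x):
    # z = b + sum_i w_i x_i, y_hat = sigmoid(z)
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))

def loss(w, b, x, y):
    return 0.5 * (y - forward(w, b, x)) ** 2

def grad_w(w, b, x, y):
    # dL/dw_i = x_i * y_hat * (1 - y_hat) * (y_hat - y)
    y_hat = forward(w, b, x)
    return [xi * y_hat * (1.0 - y_hat) * (y_hat - y) for xi in x]

w, b, x, y = [0.3, -0.2], 0.1, [1.0, 2.0], 1.0
analytic = grad_w(w, b, x, y)

# Central finite differences as an independent check.
h = 1e-6
numeric = []
for i in range(len(w)):
    w_plus = list(w)
    w_plus[i] += h
    w_minus = list(w)
    w_minus[i] -= h
    numeric.append((loss(w_plus, b, x, y) - loss(w_minus, b, x, y)) / (2 * h))

print(analytic)
print(numeric)
```

The two lists should agree to several decimal places, which is a common sanity check before trusting a hand-derived gradient.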
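27 Single neuron training
for $t = 1, \dots, T$
  $\hat{y}_n = f(x_n; w^{(t)})$, $n = 1, \dots, N$
  $\frac{\partial L}{\partial w_i} = \sum_{n=1}^{N} x_{n,i}\,\hat{y}_n(1 - \hat{y}_n)(\hat{y}_n - y_n)$, $i = 1, \dots, d$, where $L$ is summed over the $N$ samples
  $w^{(t+1)} = w^{(t)} + \Delta w$, with $\Delta w_i = -\varepsilon \frac{\partial L}{\partial w_i}$
endfor
Each pass over all $N$ training samples is an epoch.
Slide credit: Adapted from Bohyung Han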
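The training loop can be sketched as full-batch gradient descent on a sigmoid neuron with squared loss. The step size `eps`, the number of epochs, the bias update, and the toy OR dataset are assumptions chosen for illustration; the slide does not specify them.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(data, d, epochs=2000, eps=0.5):
    """Full-batch gradient descent for one sigmoid neuron with squared loss."""
    w, b = [0.0] * d, 0.0
    history = []  # total loss measured at the start of each epoch
    for _ in range(epochs):
        gw, gb, total = [0.0] * d, 0.0, 0.0
        for x, y in data:
            y_hat = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))
            delta = y_hat * (1.0 - y_hat) * (y_hat - y)  # dL/dz for this sample
            for i in range(d):
                gw[i] += x[i] * delta
            gb += delta
            total += 0.5 * (y - y_hat) ** 2
        # One update per epoch: w <- w - eps * dL/dw
        w = [wi - eps * gi for wi, gi in zip(w, gw)]
        b -= eps * gb
        history.append(total)
    return w, b, history

# Hypothetical toy problem: logical OR, which is linearly separable.
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
w, b, history = train(data, d=2)
preds = [round(sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))) for x, _ in data]
print(preds, history[0], history[-1])
```

Here one iteration of the outer loop is one epoch, since every sample contributes to the gradient before the weights change.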
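28 Multi-Layer: Backpropagation. Input $x_k$ connects through weight $w_{ki}$ to neuron $i$ (sum $z_i$, sigmoid output $\hat{y}_i$), which connects through weight $w_{ij}$ to neuron $j$ (sum $z_j$, sigmoid output $\hat{y}_j$), which feeds the loss $L$.
$\frac{\partial L}{\partial z_j} = \frac{d\hat{y}_j}{dz_j}\frac{\partial L}{\partial \hat{y}_j}$
$\frac{\partial L}{\partial \hat{y}_i} = \sum_j \frac{\partial z_j}{\partial \hat{y}_i}\frac{\partial L}{\partial z_j} = \sum_j w_{ij}\frac{\partial L}{\partial z_j} = \sum_j w_{ij}\frac{d\hat{y}_j}{dz_j}\frac{\partial L}{\partial \hat{y}_j}$
$\frac{\partial L}{\partial w_{ki}} = \frac{\partial z_i}{\partial w_{ki}}\frac{d\hat{y}_i}{dz_i}\frac{\partial L}{\partial \hat{y}_i} = \frac{\partial z_i}{\partial w_{ki}}\frac{d\hat{y}_i}{dz_i}\sum_j w_{ij}\frac{d\hat{y}_j}{dz_j}\frac{\partial L}{\partial \hat{y}_j}$
Slide credit: Bohyung Han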
29 Backpropagation in practice. Two passes per iteration. Forward pass: compute the value of the loss function (and of the intermediate neurons) given the inputs. Backward pass: propagate the gradient of the loss (error) backwards through the network using the chain rule.
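The two passes can be sketched for a network with one hidden layer of sigmoid units, sigmoid outputs, and squared loss. This is a minimal illustration without bias terms; the layer sizes, weight values, and function names are assumptions and not taken from the slides.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, W1, W2):
    # Hidden outputs y_i = sigmoid(sum_k w_ki x_k); W1[k][i] plays the role of w_ki.
    h = [sigmoid(sum(W1[k][i] * x[k] for k in range(len(x))))
         for i in range(len(W1[0]))]
    # Outputs y_j = sigmoid(sum_i w_ij y_i); W2[i][j] plays the role of w_ij.
    o = [sigmoid(sum(W2[i][j] * h[i] for i in range(len(h))))
         for j in range(len(W2[0]))]
    return h, o

def loss(x, y, W1, W2):
    _, o = forward(x, W1, W2)
    return 0.5 * sum((yj - oj) ** 2 for yj, oj in zip(y, o))

def backward(x, y, W1, W2):
    h, o = forward(x, W1, W2)                        # forward pass
    # dL/dz_j = dy_j/dz_j * dL/dy_j
    dz_out = [oj * (1 - oj) * (oj - yj) for oj, yj in zip(o, y)]
    # dL/dy_i = sum_j w_ij * dL/dz_j
    dh = [sum(W2[i][j] * dz_out[j] for j in range(len(o))) for i in range(len(h))]
    # dL/dz_i = dy_i/dz_i * dL/dy_i
    dz_hid = [hi * (1 - hi) * dhi for hi, dhi in zip(h, dh)]
    gW2 = [[h[i] * dz_out[j] for j in range(len(o))] for i in range(len(h))]
    gW1 = [[x[k] * dz_hid[i] for i in range(len(h))] for k in range(len(x))]
    return gW1, gW2

x, y = [1.0, -0.5], [1.0, 0.0]
W1 = [[0.1, -0.2, 0.3], [0.4, 0.5, -0.6]]    # 2 inputs x 3 hidden units
W2 = [[0.7, -0.8], [0.9, 0.1], [-0.2, 0.3]]  # 3 hidden units x 2 outputs
gW1, gW2 = backward(x, y, W1, W2)

def numeric(W, k, i, h=1e-6):
    """Central finite difference for one weight, used only as a check."""
    old = W[k][i]
    W[k][i] = old + h
    up = loss(x, y, W1, W2)
    W[k][i] = old - h
    down = loss(x, y, W1, W2)
    W[k][i] = old
    return (up - down) / (2 * h)

print(gW1[0][1], numeric(W1, 0, 1))
print(gW2[1][0], numeric(W2, 1, 0))
```

The backward pass reuses the activations stored by the forward pass, which is why both passes are needed in every iteration.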
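30 Stochastic Gradient Descent (SGD)
SGD: update weights for each sample. $E = \frac{1}{2}(y - \hat{y})^2$, $w_i(t+1) = w_i(t) - \varepsilon \frac{\partial E}{\partial w_i}$. (+) Fast, online. (-) Sensitive to noise.
Minibatch SGD: update weights for a small set of samples $B$. $E_B = \frac{1}{2}\sum_{n \in B}(y_n - \hat{y}_n)^2$, $w_i(t+1) = w_i(t) - \varepsilon \frac{\partial E_B}{\partial w_i}$. (+) Fast, online. (+) Robust to noise.
Slide credit: Bohyung Han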
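Both variants fit in one loop where the batch size is a parameter: `batch_size=1` gives per-sample SGD and larger values give minibatch SGD. The sketch below uses a one-parameter linear model $\hat{y} = w x$ purely to keep the gradient short; the model, data, and hyperparameters are assumptions for illustration.

```python
import random

def minibatch_sgd(data, batch_size, eps=0.05, epochs=50, seed=0):
    """Fit y_hat = w * x by minimizing E_B = 1/2 * sum over batch of (y - y_hat)^2."""
    rng = random.Random(seed)
    data = list(data)
    w = 0.0
    for _ in range(epochs):
        rng.shuffle(data)  # visit samples in a new random order each epoch
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            # dE_B/dw = sum over batch of (y_hat - y) * x
            grad = sum((w * x - y) * x for x, y in batch)
            w -= eps * grad
    return w

# Hypothetical noiseless data generated from y = 3x.
data = [(k / 10.0, 3.0 * k / 10.0) for k in range(1, 11)]
w_sgd = minibatch_sgd(data, batch_size=1)
w_mini = minibatch_sgd(data, batch_size=5)
print(w_sgd, w_mini)
```

On noisy data the per-sample updates fluctuate more from step to step, while averaging over a batch smooths the gradient estimate at the cost of fewer updates per epoch.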
31 SGD improvements: Momentum. Remember the previous direction. (+) Converges faster. (+) Avoids oscillation.
$v_i(t) = \alpha v_i(t-1) - \varepsilon \frac{\partial E}{\partial w_i}(t)$, $w(t+1) = w(t) + v(t)$
Slide credit: Bohyung Han
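A sketch of the momentum update on a one-dimensional quadratic $E(w) = \frac{1}{2} a w^2$ with small curvature, where plain gradient descent is slow. The objective and the values of `a`, `eps`, and `alpha` are illustrative assumptions; setting `alpha=0` recovers plain gradient descent.

```python
def minimize(a, w0, eps, alpha, steps):
    """Gradient descent with momentum on E(w) = 0.5 * a * w**2."""
    w, v = w0, 0.0
    for _ in range(steps):
        grad = a * w                  # dE/dw
        v = alpha * v - eps * grad    # v(t) = alpha * v(t-1) - eps * dE/dw
        w = w + v                     # w(t+1) = w(t) + v(t)
    return w

w_plain = minimize(a=0.01, w0=5.0, eps=0.1, alpha=0.0, steps=500)
w_momentum = minimize(a=0.01, w0=5.0, eps=0.1, alpha=0.9, steps=500)
print(w_plain, w_momentum)
```

In this flat setting the velocity accumulates consistent gradient directions, so the momentum run ends much closer to the minimum at 0 after the same number of steps. The benefit depends on the curvature and on the chosen `eps` and `alpha`.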
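32 SGD improvements: Weight Decay. Penalize the size of the weights. (+) Improves generalization a lot!
$C = E + \frac{\lambda}{2}\sum_i w_i^2$
$w_i(t+1) = w_i(t) - \varepsilon \frac{\partial C}{\partial w_i} = w_i(t) - \varepsilon \frac{\partial E}{\partial w_i} - \varepsilon\lambda w_i$
Slide credit: Bohyung Han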
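A sketch of one update step with an L2 penalty, assuming the cost $C = E + \frac{\lambda}{2}\sum_i w_i^2$. To isolate the decay term, the data gradient is set to zero here, so each step simply shrinks every weight by the factor $(1 - \varepsilon\lambda)$; the function name and the numbers are illustrative.

```python
def sgd_step_weight_decay(w, grad_E, eps, lam):
    """One step of w_i <- w_i - eps * dE/dw_i - eps * lam * w_i."""
    return [wi - eps * gi - eps * lam * wi for wi, gi in zip(w, grad_E)]

w = [1.0, -2.0]
for _ in range(3):
    # With a zero data gradient, only the decay term acts on the weights.
    w = sgd_step_weight_decay(w, [0.0, 0.0], eps=0.1, lam=0.5)
print(w)
```

With `eps=0.1` and `lam=0.5` the shrink factor is 0.95 per step, so weights that the data gradient does not actively support drift toward zero.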
33 Key concepts. Convolutional Neural Networks. Revisiting Backpropagation and Gradient Descent for Deep Networks.
34 History: NN Revival in the 1980s. Backpropagation was discovered in the 1970s but popularized in 1986: David E. Rumelhart, Geoffrey E. Hinton, Ronald J. Williams. Learning representations by back-propagating errors. Nature, 1986. An MLP is a universal approximator: it can approximate any non-linear function in theory, given enough neurons and data. Kurt Hornik, Maxwell Stinchcombe, Halbert White. Multilayer feedforward networks are universal approximators. Neural Networks, 1989. Generated lots of excitement and applications.
35 Neural Networks Applied to Vision. LeNet vision application. LeCun, Y.; Boser, B.; Denker, J.; Henderson, D.; Howard, R.; Hubbard, W.; Jackel, L. Backpropagation Applied to Handwritten Zip Code Recognition. Neural Computation, 1989. USPS digit recognition, later check reading. Convolution, pooling ("weight sharing"), fully connected layers. Image credit: LeCun, Y., Bottou, L., Bengio, Y., Haffner, P. Gradient-based learning applied to document recognition. Proceedings of the IEEE.
36 Issues in Deep Neural Networks. Prohibitive training time: especially with lots of training data; many epochs typically required for optimization; expensive gradient computations. Overfitting: the learned function fits the training data well, but performs poorly on new data (high capacity model, not enough training data). Slide credit: adapted from Bohyung Han
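37 Issues in Deep Neural Networks: vanishing gradient problem (sigmoid units).
$\frac{\partial E}{\partial w_{ki}} = \frac{\partial z_i}{\partial w_{ki}}\frac{d\hat{y}_i}{dz_i}\frac{\partial E}{\partial \hat{y}_i} = \frac{\partial z_i}{\partial w_{ki}}\frac{d\hat{y}_i}{dz_i}\sum_j w_{ij}\frac{d\hat{y}_j}{dz_j}\frac{\partial E}{\partial \hat{y}_j}$
Each sigmoid layer contributes a factor $\frac{d\hat{y}}{dz} = \hat{y}(1 - \hat{y}) \le \frac{1}{4}$, so gradients in the lower layers are typically extremely small. Optimizing multi-layer neural networks takes a huge amount of time.
Slide credit: adapted from Bohyung Han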
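The shrinking effect can be made concrete with a chain of single sigmoid units, one per layer. This is a simplified sketch: a real network has many units per layer and learned weights, whereas here every weight is fixed to the illustrative value `w=1.0`.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gradient_through_chain(depth, w=1.0, x=0.5):
    """Derivative of the last output with respect to the input of a chain
    y_l = sigmoid(w * y_{l-1}), computed as the product of per-layer factors."""
    y, factor = x, 1.0
    for _ in range(depth):
        y = sigmoid(w * y)
        factor *= w * y * (1.0 - y)  # w * dy/dz for this layer, at most w / 4
    return factor

for depth in (1, 5, 10, 20):
    print(depth, gradient_through_chain(depth))
```

Because each factor is at most 1/4 when the weight is 1, the gradient reaching the first layer decays at least geometrically with depth.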
38 New winter and revival in early 2000s. New "winter" in the early 2000s due to problems with training NNs; Support Vector Machines (SVMs) and Random Forests (RF) were easy to train and had nice theory. Revival again: name change ("neural networks" -> "deep learning") + algorithmic developments (unsupervised layer-wise pre-training, ReLU, dropout, layer normalization) + big data + GPU computing = large outperformance on many datasets (Vision: ILSVRC'12).
39 Big Data. ImageNet Large Scale Visual Recognition Challenge: 1000 categories w/ 1000 images per category; 1.2 million training images, 50,000 validation, 150,000 testing. O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg and L. Fei-Fei. ImageNet Large Scale Visual Recognition Challenge. IJCV, 2015.
40 AlexNet Architecture. 60 million parameters! Various tricks: ReLU nonlinearity; overlapping pooling; local response normalization; dropout (set hidden neuron output to 0 with probability 0.5); data augmentation; training on GPUs. Figure credit: Krizhevsky et al., NIPS 2012. Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton. ImageNet Classification with Deep Convolutional Neural Networks. NIPS, 2012.
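The dropout rule named on the slide can be sketched as a random mask applied during training. Only the training-time masking is shown; how outputs are rescaled at test time is a separate design choice not covered by the slide. The function name and the all-ones activations are illustrative.

```python
import random

def dropout(activations, p=0.5, rng=random):
    """Set each activation to 0 independently with probability p (training time)."""
    return [0.0 if rng.random() < p else a for a in activations]

rng = random.Random(0)  # seeded generator so the run is repeatable
out = dropout([1.0] * 10000, p=0.5, rng=rng)
zeros = out.count(0.0)
print(zeros / len(out))  # fraction of dropped units, close to 0.5
```

A fresh mask is drawn for every training example, so each example effectively trains a different thinned sub-network.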
41 GPU Computing. Big data and big models require lots of computational power. GPUs: thousands of cores for parallel operations; multiple GPUs. It still took about 5-6 days to train AlexNet on two NVIDIA GTX 580 3GB GPUs (much faster today).
42 Recurrent Neural Networks. Networks with loops: the output of a layer is used as input for the same (or lower) layer. Can model dynamics (e.g. in space or time). Image credit: Christopher Olah's blog. Sepp Hochreiter (1991), Untersuchungen zu dynamischen neuronalen Netzen, Diploma thesis. Institut f. Informatik, Technische Univ. Munich. Advisor: J. Schmidhuber. Y. Bengio, P. Simard, P. Frasconi. Learning Long-Term Dependencies with Gradient Descent is Difficult. In TNN 1994.
43 Recurrent Neural Networks. Let's unroll the loops: the result is a standard feed-forward network with many layers, which suffers from the vanishing gradient problem. In theory it can learn long-term memory, in practice not (Bengio et al., 1994). Image credit: Christopher Olah's blog. Sepp Hochreiter (1991), Untersuchungen zu dynamischen neuronalen Netzen, Diploma thesis. Institut f. Informatik, Technische Univ. Munich. Advisor: J. Schmidhuber. Y. Bengio, P. Simard, P. Frasconi. Learning Long-Term Dependencies with Gradient Descent is Difficult. In TNN 1994.
44 Long Short Term Memory (LSTM). A type of RNN explicitly designed not to have the vanishing or exploding gradient problem. Models long-term dependencies. Memory is propagated and accessed by gates. Used for speech recognition, language modeling. Hochreiter, Sepp; and Schmidhuber, Jürgen. Long Short-Term Memory. Neural Computation. Image credit: Christopher Olah's blog,
45 Unsupervised Neural Networks. Autoencoders: encode then decode the same input; no supervision needed. [Diagram: input x -> hidden layer -> output x.] H. Bourlard and Y. Kamp. Auto-association by multilayer perceptrons and singular value decomposition. Biol. Cybern. 59, 4-5 (September 1988),
UVA CS 6316 / CS 4501-004 Machine Learning Fall 2016 Lecture 17: Neural Networks and Deep Learning Jack Lanchantin Dr. Yanjun Qi 1 Neurons 1-Layer Neural Network Multi-layer Neural Network Loss Functions
More informationECE 8527: Introduction to Machine Learning and Pattern Recognition Midterm # 1. Vaishali Amin Fall, 2015
ECE 8527: Itroductio to Machie Learig ad Patter Recogitio Midterm # 1 Vaishali Ami Fall, 2015 tue39624@temple.edu Problem No. 1: Cosider a two-class discrete distributio problem: ω 1 :{[0,0], [2,0], [2,2],
More informationQuantile regression with multilayer perceptrons.
Quatile regressio with multilayer perceptros. S.-F. Dimby ad J. Rykiewicz Uiversite Paris 1 - SAMM 90 Rue de Tolbiac, 75013 Paris - Frace Abstract. We cosider oliear quatile regressio ivolvig multilayer
More informationIntroduction to Artificial Intelligence CAP 4601 Summer 2013 Midterm Exam
Itroductio to Artificial Itelligece CAP 601 Summer 013 Midterm Exam 1. Termiology (7 Poits). Give the followig task eviromets, eter their properties/characteristics. The properties/characteristics of the
More informationDiscrete-Time System Properties. Discrete-Time System Properties. Terminology: Implication. Terminology: Equivalence. Reference: Section 2.
Professor Deepa Kudur Uiversity of oroto Referece: Sectio 2.2 Joh G. Proakis ad Dimitris G. Maolakis, Digital Sigal Processig: Priciples, Algorithms, ad Applicatios, 4th editio, 2007. Professor Deepa Kudur
More informationMachine Learning: Chenhao Tan University of Colorado Boulder LECTURE 16
Machine Learning: Chenhao Tan University of Colorado Boulder LECTURE 16 Slides adapted from Jordan Boyd-Graber, Justin Johnson, Andrej Karpathy, Chris Ketelsen, Fei-Fei Li, Mike Mozer, Michael Nielson
More information6.3 Testing Series With Positive Terms
6.3. TESTING SERIES WITH POSITIVE TERMS 307 6.3 Testig Series With Positive Terms 6.3. Review of what is kow up to ow I theory, testig a series a i for covergece amouts to fidig the i= sequece of partial
More informationIntro to Learning Theory
Lecture 1, October 18, 2016 Itro to Learig Theory Ruth Urer 1 Machie Learig ad Learig Theory Comig soo 2 Formal Framework 21 Basic otios I our formal model for machie learig, the istaces to be classified
More informationMachine Learning for Data Science (CS 4786)
Machie Learig for Data Sciece CS 4786) Lecture 9: Pricipal Compoet Aalysis The text i black outlies mai ideas to retai from the lecture. The text i blue give a deeper uderstadig of how we derive or get
More informationPattern Classification, Ch4 (Part 1)
Patter Classificatio All materials i these slides were take from Patter Classificatio (2d ed) by R O Duda, P E Hart ad D G Stork, Joh Wiley & Sos, 2000 with the permissio of the authors ad the publisher
More informationApproximation by Superpositions of a Sigmoidal Function
Zeitschrift für Aalysis ud ihre Aweduge Joural for Aalysis ad its Applicatios Volume 22 (2003, No. 2, 463 470 Approximatio by Superpositios of a Sigmoidal Fuctio G. Lewicki ad G. Mario Abstract. We geeralize
More informationNon-Linear Maximum Likelihood Feature Transformation For Speech Recognition
No-Liear Maximum Likelihood Feature Trasformatio For Speech Recogitio Mohamed Kamal Omar, Mark Hasegawa-Johso Departmet of Electrical Ad Computer Egieerig, Uiversity of Illiois at Urbaa-Champaig, Urbaa,
More informationBIOINF 585: Machine Learning for Systems Biology & Clinical Informatics
BIOINF 585: Machie Learig for Systems Biology & Cliical Iformatics Lecture 14: Dimesio Reductio Jie Wag Departmet of Computatioal Medicie & Bioiformatics Uiversity of Michiga 1 Outlie What is feature reductio?
More informationKernel Methods: Support Vector Machines
Kerel Methods: Support Vector Machies Marco ricavelli 8//0 Mobile Robotics ad Olfactio Lab AASS Research Cetre, Örebro Uiversity State of the Art Methods of Data Modelig ad Machie Learig, IMRIS program,
More informationQuestion1 Multiple choices (circle the most appropriate one):
Philadelphia Uiversity Studet Name: Faculty of Egieerig Studet Number: Dept. of Computer Egieerig Fial Exam, First Semester: 2014/2015 Course Title: Digital Sigal Aalysis ad Processig Date: 01/02/2015
More informationarxiv: v4 [math.st] 16 Mar 2019
Submitted to the Aals of Statistics arxiv: 1708.06633 NONPARAMETRIC REGRESSION USING DEEP NEURAL NETWORKS WITH RELU ACTIVATION FUNCTION arxiv:1708.06633v4 [math.st] 16 Mar 2019 By Johaes Schmidt-Hieber
More informationMachine Learning Theory Tübingen University, WS 2016/2017 Lecture 11
Machie Learig Theory Tübige Uiversity, WS 06/07 Lecture Tolstikhi Ilya Abstract We will itroduce the otio of reproducig kerels ad associated Reproducig Kerel Hilbert Spaces (RKHS). We will cosider couple
More informationIntroduction to Signals and Systems, Part V: Lecture Summary
EEL33: Discrete-Time Sigals ad Systems Itroductio to Sigals ad Systems, Part V: Lecture Summary Itroductio to Sigals ad Systems, Part V: Lecture Summary So far we have oly looked at examples of o-recursive
More informationEE123 Digital Signal Processing
Aoucemets HW solutios posted -- self gradig due HW2 due Friday EE2 Digital Sigal Processig ham radio licesig lectures Tue 6:-8pm Cory 2 Lecture 6 based o slides by J.M. Kah SDR give after GSI Wedesday
More informationRun-length & Entropy Coding. Redundancy Removal. Sampling. Quantization. Perform inverse operations at the receiver EEE
Geeral e Image Coder Structure Motio Video (s 1,s 2,t) or (s 1,s 2 ) Natural Image Samplig A form of data compressio; usually lossless, but ca be lossy Redudacy Removal Lossless compressio: predictive
More information