Universität Potsdam Institut für Informatik Lehrstuhl Maschinelles Lernen. Neural Networks. Tobias Scheffer


1 Universität Potsdam Institut für Informatik Lehrstuhl Maschinelles Lernen Neural Networks Tobias Scheffer

2 Overview Neural information processing. Feed-forward networks. Training feed-forward networks, back propagation. Unsupervised learning: auto encoders. Training auto encoders via back propagation. Restricted Boltzmann machines. Convolutional networks.

3 Learning Problems can be Impossible without the Right Features. (Figure: raw image; motorcycle or not?)

4 Learning Problems can be Impossible without the Right Features. (Figure: with the features Wheel, Handlebar, Frame, Wheel detected, deciding "motorcycle or not?" becomes easy.)

5 Learning Problems can be Impossible without the Right Features. Abstract features (higher level): Wheel, Handlebar, Frame, Wheel. A feature function maps the raw data (low level) to these features.

6 Neural Networks. Model of neural information processing. Waves of popularity: Perceptron (Rosenblatt, 1958). The perceptron is only a linear classifier (Minsky, Papert, 1969). Multilayer perceptrons (1990s). Popularity of SVMs (late 1990s). Deep learning (late 2000s). Now state of the art for voice recognition (Google DeepMind) and face recognition (DeepFace, 2014).

7 Neural Networks. Deep learning, unsupervised feature learning: unsupervised discovery of features which can then be used for supervised learning. Implementation on GPUs; able to process vast amounts of data. Seen as a step towards AI.

8 Deep Learning Records. Neural networks are the best-performing algorithms for object classification (CIFAR/NORB/PASCAL VOC benchmarks), video classification (various benchmarks), sentiment analysis (MR benchmark), pedestrian detection, speech recognition, and psychedelic art (Deep Dream).

9 Supervised and Unsupervised Learning. Supervised learning: entire network trained on labeled data. Unsupervised learning: entire network trained on unlabeled data. Unsupervised pre-training + supervised learning: the network (except the top-most layer) is trained layer-wise on unlabeled data; then the entire network is trained on labeled data. Good when many unlabeled but few labeled data are available.

10 Neural Information Processing. Input signals arrive over connections from other nerve cells. The weighted input signals are aggregated; the synaptic weights are strengthened and weakened by learning processes. Output signals are electric spikes sent along the axon; the aggregated input determines the probability of an output spike.

11 Neural Information Processing: Model. Input vector x, synaptic weight vector θ. The weighted input signals are aggregated: h = θᵀx. The output σ(h) models the probability of an output spike.

12 Feed-Forward Networks. Forward propagation: the input layer receives the input vector, x⁰ = x. Linear model: each unit k in layer i has a parameter vector θ_k^i, and layer i has a matrix of parameters Θ^i; h_k^i = θ_k^iᵀ x^(i-1). The network consists of the input layer, hidden layers, and the output layer.

13 Feed-Forward Networks. Forward propagation: input vector x⁰ = x. Linear model: h_k^i = θ_k^iᵀ x^(i-1). Activation function and propagation: x_k^i = σ(h_k^i). The output layer produces the output vector.

14 Feed-Forward Networks. Bias unit: the constant element of the linear model is replaced by an additional unit with constant output 1, x_0^i = 1, so that h_k^i = Σ_{l=0..n_(i-1)} θ_lk^i x_l^(i-1) for k ∈ [1..n_i].

15 Feed-Forward Networks. Forward propagation per layer in vector notation: h^i = Θ^iᵀ x^(i-1), x^i = σ(h^i).
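The per-layer vector form translates almost directly into code. Below is a minimal sketch, assuming NumPy, a logistic activation, and illustrative layer sizes and random weights (none of which come from the slides); bias units are omitted for brevity.

```python
import numpy as np

def sigmoid(h):
    return 1.0 / (1.0 + np.exp(-h))

def forward(x, thetas):
    """Forward propagation. thetas[i] is the weight matrix Theta^i of layer i,
    with shape (units in layer i-1) x (units in layer i)."""
    activations = [x]
    for theta in thetas:
        h = theta.T @ activations[-1]    # h^i = Theta^i^T x^(i-1)
        activations.append(sigmoid(h))   # x^i = sigma(h^i)
    return activations

# Illustrative network: 3 inputs, 4 hidden units, 2 output units.
rng = np.random.default_rng(0)
thetas = [rng.normal(scale=0.1, size=(3, 4)), rng.normal(scale=0.1, size=(4, 2))]
print(forward(np.array([1.0, 0.5, -0.2]), thetas)[-1])
```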

16 Feed-Forward Networks. Stochastic gradient descent. Squared loss: R̂(θ) = (1/m) Σ_{j=1..m} ‖y_j − x_j^d‖², where x_j^d is the network output for input x_j. The gradient is the average of the per-instance gradients; the stochastic gradient for a single instance x_j uses only ∇_θ ‖y_j − x_j^d‖².

17 Feed-Forward Nets: Back Propagation. Stochastic gradient for instance x. For the top-level weights θ_k of output unit k: the derivative of the squared error (y_k − σ(θ_kᵀ x̄))² with respect to θ_k is proportional to −(y_k − σ(θ_kᵀ x̄)) σ'(h_k) x̄ = −δ_k x̄, with δ_k = σ'(h_k)(y_k − x_k) and x̄ the activation vector of the layer below.

18 Feed-Forward Nets: Back Propagation. For weights θ_k^i at layer i, the chain rule gives the update direction δ_k^i x^(i-1), with δ_k^i = σ'(h_k^i) Σ_l δ_l^(i+1) θ_kl^(i+1): the δ values are propagated backwards from the output layer through the hidden layers, layer by layer.

19 Activation Function. Any differentiable sigmoidal function is suitable. Example: σ(h) = 1 / (1 + e^(−h)), with derivative σ'(h) = σ(h)(1 − σ(h)).

20 Back Propagation: Algorithm. Iterate over training instances (x, y). Forward propagation: for i = 1..d, for k = 1..n_i: h_k^i = θ_k^iᵀ x^(i-1), x_k^i = σ(h_k^i). Back propagation: for each output unit k = 1..n_d: δ_k^d = σ'(h_k^d)(y_k − x_k^d), θ_k^d' = θ_k^d + ε δ_k^d x^(d-1). For i = d−1..1, for k = 1..n_i: δ_k^i = σ'(h_k^i) Σ_l δ_l^(i+1) θ_kl^(i+1), θ_k^i' = θ_k^i + ε δ_k^i x^(i-1). Repeat until convergence.
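The algorithm above can be sketched compactly for a single training instance. This is an illustrative NumPy sketch (squared loss, logistic activation, no bias units, arbitrary learning rate), not the lecture's reference implementation.

```python
import numpy as np

def sigmoid(h):
    return 1.0 / (1.0 + np.exp(-h))

def backprop_update(x, y, thetas, eps=0.1):
    """One stochastic-gradient step of back propagation for instance (x, y)."""
    # Forward propagation: xs[i] holds the activation vector x^i of layer i.
    xs = [x]
    for theta in thetas:
        xs.append(sigmoid(theta.T @ xs[-1]))
    # Output layer: delta_k = sigma'(h_k) * (y_k - x_k), with sigma'(h) = x (1 - x).
    delta = xs[-1] * (1 - xs[-1]) * (y - xs[-1])
    # Walk backwards through the layers, updating weights and propagating delta.
    for i in reversed(range(len(thetas))):
        grad = np.outer(xs[i], delta)                       # update direction for Theta^i
        if i > 0:
            delta = xs[i] * (1 - xs[i]) * (thetas[i] @ delta)
        thetas[i] = thetas[i] + eps * grad                  # theta' = theta + eps * delta * x
    return thetas

# Illustrative usage: one update on random weights.
rng = np.random.default_rng(0)
thetas = [rng.normal(scale=0.1, size=(3, 4)), rng.normal(scale=0.1, size=(4, 2))]
thetas = backprop_update(np.array([1.0, 0.5, -0.2]), np.array([1.0, 0.0]), thetas)
```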

21 Back Propagation. The loss function is not convex: each permutation of the hidden units is a local minimum. Learned features (hidden units) may be acceptable, but are not usually globally optimal. Hope: local minima can still be arbitrarily good; many local minima can be equally good. Reality: back propagation works well for few (one or two) hidden layers.

22 Regularization. L2-regularized loss R̂(θ) = (1/m) Σ_j ‖y_j − x_j^d‖² + λ θᵀθ corresponds to a normal prior on the parameters. The gradient gains an additional term proportional to θ, so the update becomes θ' = θ + ε (δ x − λ θ); this is called weight decay. Additional regularization schemes: early stopping (stop training before convergence); delete units with small weights; dropout (during training, set some units' output to zero at random); normalize the length of the propagated vectors.

23 Regularization: Dropout. In complex networks, complex co-adaptation relationships can form between units; these are not robust for new data. Dropout: in each training step, draw a fraction of the units at random and set their output to zero. At application time, use all units. This improves overall robustness: each unit has to function within varying combinations of units.
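A sketch of how such a dropout mask could be applied to a layer's activations, assuming NumPy; the drop probability of 0.5 is illustrative. (Common implementations additionally rescale the kept activations, which the slide does not discuss.)

```python
import numpy as np

def dropout(activations, drop_prob=0.5, training=True, rng=None):
    """During training, set each unit's output to zero with probability drop_prob.
    At application time (training=False), use all units."""
    if not training:
        return activations
    rng = rng or np.random.default_rng()
    mask = rng.random(activations.shape) >= drop_prob
    return activations * mask

print(dropout(np.array([0.2, 0.9, 0.5, 0.7]), rng=np.random.default_rng(0)))
```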

24 Regularization: Stochastic Binary Units. Deterministic units propagate x_k^i = σ(h_k^i). Stochastic-binary units calculate the activation σ(h_k^i), then propagate x_k^i = 1 with probability σ(h_k^i) and x_k^i = 0 otherwise. Similar to dropout: with some probability, each unit does not produce output. Biological neurons behave like this.
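A minimal sketch of a stochastic-binary unit, assuming NumPy: compute the activation σ(h) and emit 1 with that probability, 0 otherwise.

```python
import numpy as np

def stochastic_binary(h, rng=None):
    """Propagate 1 with probability sigma(h), and 0 otherwise."""
    rng = rng or np.random.default_rng()
    p = 1.0 / (1.0 + np.exp(-h))          # activation sigma(h)
    return (rng.random(np.shape(p)) < p).astype(float)

print(stochastic_binary(np.array([-2.0, 0.0, 2.0]), np.random.default_rng(0)))
```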

25 Back Propagation: Tricks. Use cross-entropy as the loss for classification. Stochastic gradient on small batches. Permute the training data at random. Decrease the learning rate during optimization. Initialize weights randomly (the origin can be a saddle point). Initialize weights via unsupervised pre-training.

26 Overview Neural information processing. Feed-forward networks. Training feed-forward networks, back propagation. Unsupervised learning: auto encoders. Training auto encoders via back propagation. Restricted Boltzmann machines. Convolutional networks.

27 Auto Encoders. Auto encoders learn the identity function: n input units to n' hidden units to n output units, with n > n'. On the hidden layer, the input has to be compressed. The learning algorithm derives a representation that preserves the information from the input.

28 Auto Encoders: Example. Input: binary vectors with a single 1. 4 input units, 2 hidden units, 4 output units. Inputs: (1,0,0,0), (0,1,0,0), (0,0,1,0), (0,0,0,1).

29 Auto Encoders: Example. Possible activation of the hidden units after training.


33 Auto Encoders: Example. There are several local minima of the loss function (how many?).

34 Auto Encoders: Example. Input: one unit per pixel; each unit represents the grey value of a pixel. Hidden layer: k units. Output: as many units as the input.

35 Auto Encoders: Example. Each of the hidden units is a detector for a base face. The weights from one hidden unit to the output units encode the image of the base face. Input faces are represented as a combination of these base faces.

36 Auto Encoders: Example. The weights from one hidden unit to the output units encode the image of the base face.

37 Auto Encoders: Example. Feeding an input of 1 into one of the hidden units produces the base face that the hidden unit represents.

38 Auto Encoders: Example. The weights from one hidden unit to the output units encode the image of the base face. Weights from all hidden units to the output units after training with a set of aligned faces:

39 Auto Encoders: Example. After training on hand-written digits.

40 Auto Encoders via Backpropagation. Desired output: y_j = x_j. Empirical risk: R̂ = (1/m) Σ_{j=1..m} ‖x_j − x_j^d‖², where x_j^d is the network's reconstruction of x_j. Train with standard back propagation, using the input as the target output values.

41 Auto Encoders via Backpropagation. Additional regularization: hidden units should be sparse (i.e., have activation 0 most of the time). Minimize the KL divergence between ρ = (ρ, …, ρ) and the average activation x̄¹ of the hidden units: KL(ρ ‖ x̄¹) = Σ_i [ ρ log(ρ / x̄_i¹) + (1 − ρ) log((1 − ρ) / (1 − x̄_i¹)) ]. The backprop update for the hidden layer gains an additional term derived from this sparsity penalty.
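A sketch of the sparsity penalty and its derivative, assuming NumPy; the target activation ρ is illustrative. The derivative with respect to the mean hidden activations is the extra term that enters the modified backprop update.

```python
import numpy as np

def sparsity_penalty(hidden_activations, rho=0.05):
    """KL divergence between the target activation rho and the mean activation
    of each hidden unit, plus its derivative w.r.t. the mean activations."""
    rho_hat = hidden_activations.mean(axis=0)          # mean activation per unit
    kl = np.sum(rho * np.log(rho / rho_hat)
                + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))
    grad = -rho / rho_hat + (1 - rho) / (1 - rho_hat)  # added to delta in backprop
    return kl, grad

acts = np.random.default_rng(0).random((100, 3)) * 0.2  # 100 instances, 3 hidden units
print(sparsity_penalty(acts))
```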

42 Deep Learning: Stacked Autoencoders. Multiple hidden layers, each layer has fewer units than the previous one; the output layer has as many units as the input layer. The autoencoder has to reproduce the input vector: R̂ = (1/m) Σ_{j=1..m} ‖x_j − x_j^d‖².

43 Stacked Autoencoders: Learning. Step 1: Learn the first autoencoder layer using back propagation; run back propagation until convergence of R̂ = (1/m) Σ_j ‖x_j − x_j^d‖².

44 Stacked Autoencoders: Learning. Step 2: Freeze the weights of the first layer and add another layer. Train the new layers using back propagation of R̂ = (1/m) Σ_j ‖x_j − x_j^d‖².

45 Stacked Autoencoders: Learning. Step i: Freeze all previously trained layers and add another layer; train the new layer using back propagation of R̂ = (1/m) Σ_j ‖x_j − x_j^d‖².

46 Stacked Autoencoders: Learning. Final step: remove the reconstruction (output) layer. The result is a hierarchical feature representation. These features can be very useful for supervised learning, in particular for convolutional networks.

47 Denoising Autoencoders. Additional regularization: reconstruct the input from a corrupted version of the input. Randomly set a fraction of the input to zero; use the uncorrupted input as the target for the loss function. Tends to lead to more robust representations.
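A sketch of the corruption step, assuming NumPy; the corruption fraction is illustrative. The corrupted vector is fed into the network while the uncorrupted input remains the target.

```python
import numpy as np

def corrupt(x, fraction=0.3, rng=None):
    """Randomly set roughly the given fraction of the input components to zero."""
    rng = rng or np.random.default_rng()
    mask = rng.random(x.shape) >= fraction
    return x * mask

x = np.array([0.9, 0.1, 0.8, 0.4, 0.7])
x_tilde = corrupt(x, 0.3, np.random.default_rng(1))
# Train with x_tilde as the network input and the uncorrupted x as the target.
print(x_tilde)
```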

48 Stacked Autoencoder: Example. 2D visualization of a document corpus: a term-frequency (TF) vector over many terms is compressed through successively smaller hidden layers down to 2 units (dimensions), then decompressed again.

49 Auto Encoders: What are they Good for? Hidden units represent a clustering of the inputs. Hidden units are features that can now be used for a classification task, for instance face identification or hand-written letter recognition.

50 Restricted Boltzmann Machine. Unsupervised learning. Input layer and hidden layer. Binary stochastic units, one bias unit per layer (the bias units are not connected to each other). Generative, probabilistic model. Energy function: E(x⁰, x¹) = −(x⁰)ᵀ Θ x¹.

51 Restricted Boltzmann Machine. The energy function corresponds to the negative log-probability of an activation: E(x⁰, x¹) = −(x⁰)ᵀ Θ x¹. Joint distribution: P(x⁰, x¹) = (1/Z) e^(−E(x⁰, x¹)). Marginal distribution: P(x⁰) = (1/Z) Σ_{x¹} e^(−E(x⁰, x¹)). Z is the normalization constant.
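A sketch of the energy function and the corresponding unnormalized probability of a joint state, assuming NumPy and that the bias units are simply included as constant-1 components of x⁰ and x¹; the weights are illustrative.

```python
import numpy as np

def energy(x0, x1, theta):
    """E(x0, x1) = -(x0)^T Theta x1."""
    return -x0 @ theta @ x1

rng = np.random.default_rng(0)
theta = rng.normal(scale=0.1, size=(4, 3))   # 4 input units x 3 hidden units
x0 = np.array([1.0, 0.0, 1.0, 1.0])          # first component plays the bias role
x1 = np.array([1.0, 1.0, 0.0])
print(np.exp(-energy(x0, x1, theta)))        # proportional to P(x0, x1); Z omitted
```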

52 Inference in RBMs. The RBM is a generative model; it generates states like a Bayesian network. Inference by Markov chain Monte Carlo: iterate over the units, alternating between input and hidden units, and draw each unit's activation given the activations of the neighboring units. After a burn-in phase, the Markov chain of activations is governed by the distribution encoded by the RBM.

53 RBM: Sampling of States. Initialize the states at random. Then alternate between the layers (slides 53-58 show this unit by unit): each hidden unit k is drawn given the input layer, P(x_k¹ = 1 | x⁰) = 1 / (1 + e^(−Σ_l θ_lk x_l⁰)), and each input unit l is drawn given the hidden layer, P(x_l⁰ = 1 | x¹) = 1 / (1 + e^(−Σ_k θ_lk x_k¹)).
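A sketch of this alternating (block) Gibbs sampling, assuming NumPy; bias units are ignored for brevity and the weights are illustrative. Within each layer the units are conditionally independent given the other layer, so a whole layer can be drawn at once.

```python
import numpy as np

def sigmoid(h):
    return 1.0 / (1.0 + np.exp(-h))

def gibbs_chain(theta, steps=500, rng=None):
    """Alternate between hidden and input units; after a burn-in phase the
    states follow the distribution encoded by the RBM."""
    rng = rng or np.random.default_rng()
    n_input, n_hidden = theta.shape
    x0 = (rng.random(n_input) < 0.5).astype(float)        # initialize at random
    for _ in range(steps):
        x1 = (rng.random(n_hidden) < sigmoid(theta.T @ x0)).astype(float)
        x0 = (rng.random(n_input) < sigmoid(theta @ x1)).astype(float)
    return x0, x1

theta = np.random.default_rng(0).normal(scale=0.5, size=(6, 3))
print(gibbs_chain(theta, rng=np.random.default_rng(1)))
```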

59 Restricted Boltzmann Machine: Learning. Learning: maximize the log-likelihood of the input vectors, argmax_Θ Σ_{x⁰} log P(x⁰). Gradient: ∂ log p(x⁰)/∂θ_ij = −Σ_{x¹} p(x¹ | x⁰) ∂E(x⁰, x¹)/∂θ_ij + Σ_{x⁰, x¹} p(x⁰, x¹) ∂E(x⁰, x¹)/∂θ_ij, i.e., the energy gradient for the observed input minus the marginal energy gradient. With E(x⁰, x¹) = −(x⁰)ᵀ Θ x¹, the energy gradient is ∂E(x⁰, x¹)/∂θ_ij = −x_i⁰ x_j¹.

60 Restricted Boltzmann Machine: Learning. Gradient: ∂ log p(x⁰)/∂θ_ij = Σ_{x¹} p(x¹ | x⁰) x_i⁰ x_j¹ − Σ_{x⁰, x¹} p(x⁰, x¹) x_i⁰ x_j¹, the expectation under the observed input minus the expectation under the model. Weight update: θ_ij' = θ_ij + ε (x_i h_j − x̃_i h̃_j), where x_i h_j is computed from the observed input and x̃_i h̃_j from the input inferred in an MCMC step.
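A sketch of this update as one step of contrastive divergence, assuming NumPy: the positive statistics come from the observed input, the negative statistics from a single reconstruction (MCMC) step. The learning rate and the omission of bias handling are illustrative simplifications.

```python
import numpy as np

def sigmoid(h):
    return 1.0 / (1.0 + np.exp(-h))

def cd1_update(x0, theta, eps=0.1, rng=None):
    """One contrastive-divergence (CD-1) step for a single observed input x0."""
    rng = rng or np.random.default_rng()
    h = sigmoid(theta.T @ x0)                              # hidden given the data
    x1 = (rng.random(h.shape) < h).astype(float)           # sample hidden states
    x0_recon = sigmoid(theta @ x1)                         # reconstructed input
    h_recon = sigmoid(theta.T @ x0_recon)                  # hidden given reconstruction
    return theta + eps * (np.outer(x0, h) - np.outer(x0_recon, h_recon))

theta = np.random.default_rng(0).normal(scale=0.1, size=(6, 3))
x0 = np.array([1.0, 0.0, 1.0, 1.0, 0.0, 0.0])
theta = cd1_update(x0, theta, rng=np.random.default_rng(1))
```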

61 Restricted Boltzmann Machine: Example. Unsupervised learning of a representation, similar to the hidden layer of an autoencoder. Weight vectors θ_i after training with a set of aligned faces:

62 Restricted Boltzmann Machine: Example. Unsupervised learning of a representation, similar to the hidden layer of an autoencoder. Weights θ_i after training with a set of hand-written digits:

63 Stacked RBM Learning. Stacked RBMs are analogous to stacked backpropagation autoencoders. Step 1: argmax_{Θ¹} { Σ log P(x⁰) }.

64 Stacked RBM Learning. Stacked RBMs are analogous to stacked backpropagation autoencoders. Step 1: argmax_{Θ¹} { Σ log P(x⁰) }. Step 2: argmax_{Θ²} { Σ log P(x¹) }, where x¹ are the hidden activations produced by the first RBM.

65 RBMs: What are they Good for? Hidden units represent a clustering of the inputs. Hidden units are features that can then be used for supervised learning.

66 Overview Neural information processing. Feed-forward networks. Training feed-forward networks, back propagation. Unsupervised learning: auto encoders. Training auto encoders via back propagation. Restricted Boltzmann machines. Convolutional networks.

67 Convolutional Networks. What happens when an autoencoder is trained with unaligned images of faces? (Figure: training images.)


69 Convolutional Networks. What happens when an autoencoder is trained with unaligned images of faces? The weights become blurry; hidden units tend to represent different face positions rather than different looks of faces.

70 Convolutional Networks. Idea: have detectors find common patterns at different positions of the input image and at different scales.

71 Convolution. Multiplication of a filter with an area of the input gives the intensity of the filter signal at that position: x'_ij = Σ_k Σ_l θ_kl x_(i+k),(j+l). Each pixel in the result image is the result of an n × n convolution. Used for images and audio signals, e.g., detection of edges (grey-value gradients).
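A direct sketch of this formula, assuming NumPy: each output pixel is the weighted sum of the filter with the image patch at that position (only positions where the filter fits entirely, so the result image is smaller).

```python
import numpy as np

def convolve2d(image, kernel):
    """x'_{ij} = sum_{k,l} theta_{kl} * x_{i+k, j+l}, over valid positions."""
    kh, kw = kernel.shape
    out = np.zeros((image.shape[0] - kh + 1, image.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(kernel * image[i:i + kh, j:j + kw])
    return out

image = np.arange(25, dtype=float).reshape(5, 5)
edge_filter = np.array([[-1.0, 0.0, 1.0]] * 3)   # horizontal grey-value gradient
print(convolve2d(image, edge_filter))
```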

72 Convolutional Networks. Apply one or several filters to every position, and possibly at several scales. Each unit produces the output of the same filter for a different position; all units for one filter share the same weights. Example: a convolutional network with fixed scale and a single filter.

73 Multiple Convolutions. Multiple detectors per location and scale, e.g., edges of varying orientation and edges of varying scale. Results in an array of convolved result images.

74 Convolutional Networks: Example. Local receptive fields; shared weights (all units that compute the same filter share one weight vector).

75 Convolutional Networks: Example. A convolutional layer with k filters takes an input image and produces k images. The k images have to be aggregated by pooling layers.

76 Convolutional Networks: Pooling Layers. Max-pooling layer: split the layer into non-overlapping areas; for each area, pass on the maximal value to the next layer. (Figure: receptive field, aggregated feature.)
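A sketch of a max-pooling layer, assuming NumPy and an input whose side lengths are multiples of the pooling size:

```python
import numpy as np

def max_pool(image, size=2):
    """Split the image into non-overlapping size x size areas; keep each maximum."""
    h, w = image.shape
    blocks = image.reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

image = np.array([[1., 3., 2., 0.],
                  [4., 2., 1., 1.],
                  [0., 1., 5., 6.],
                  [2., 2., 7., 0.]])
print(max_pool(image))   # -> [[4. 2.] [2. 7.]]
```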

77 Stacked Convolutional Networks. Local receptive fields are trained layer-wise, e.g., as a restricted Boltzmann machine. Weight coupling has to be observed in the implementation (one parameter vector per filter, applied to all areas and at all scales). Iteratively, train the next layer on the output of the previous pooling layer.

78 Stacked Convolutional RBM. Trained on faces, cars, motorbikes, and airplanes.

79 DeepFace: Face Identification. Discriminative layer: same person or not? This layer is trained on labeled data. Below it, a stack of convolution and pooling layers; the convolutional layers are trained with unlabeled data.

80 GPU Training. GPUs are suitable for parallelizing neural network training: matrix multiplications, convolutions, element-wise operations. GPU software: CUDA (NVIDIA C API), OpenCL (not specific to NVIDIA), PyCUDA (Python API), PyOpenCL (not specific to NVIDIA).

81 Deep Learning. Step-wise transformation of the input space into more abstract feature spaces; the features emerge as the solution of an optimization problem. Image processing: pixels → local grey-value gradients → object parts → objects. Natural-language text: characters → words → chunks → clauses → sentences. Spoken language: signal → spectral band → phone → phoneme → word.

82 Summary. Supervised neural network learning: stochastic gradient, back propagation. Unsupervised learning: autoencoders learn features that preserve the information in the training instances; stacked autoencoders. (Stacked) restricted Boltzmann machines: generative models; inference by sampling. (Stacked) convolutional networks: learn filters, apply filters to different regions and scales, aggregate filter banks by pooling layers; increasingly abstract detectors applied to regions.

83 Seminar. Lecture on Neural Networks? 3 x 3 minutes lecture incl. some time for questions. Topics: natural-language description of images, word2vec, speech recognition.
