Lecture 17: Neural Networks and Deep Learning
|
|
- Phyllis Caroline Little
- 6 years ago
- Views:
Transcription
1 UVA CS 6316 / CS Machine Learning Fall 2016 Lecture 17: Neural Networks and Deep Learning Jack Lanchantin Dr. Yanjun Qi 1
2 Neurons 1-Layer Neural Network Multi-layer Neural Network Loss Functions Backpropagation Nonlinearity Functions NNs in Practice 2
3 W1 x1 x x2 W2 w3 ŷ x3 3
4 Logistic Regression wx+b e P(Y=1 x) = wx+b 1+e Sigmoid Function (aka logistic, logit, S, soft-step) x 4
5 Expanded Logistic Regression x1 x2 Input x p=3 x3 +1 w1 w2 w 3 1 b Σ z Summing Function ŷ = P(Y=1 x,w) Sigmoid Function Multiply by weights T. z = w x +b 1x1 px1 1xp y = sigmoid(z) = ez 1 + ez 5
6 Neuron x1 x2 x3 w1 w2 w3 Σ z b1 +1 6
7 Neurons 7
8 Neuron x1 Input x x2 w1 w2 Σ w3 z ŷ x3 From here on, we leave out bias for simplicity T. z = w x 1x1 px1 1xp ŷ = sigmoid(z) = ez 1 + ez 8
9 Block View of a Neuron parameterized block w x * Input Dot Product z T. z = w x 1x1 px1 1xp ŷ = sigmoid(z) = ŷ Sigmoid ez 1 + ez output 9
10 Neuron Representation x1 x x2 w1 w2 Σ ŷ w3 x3 The linear transformation and nonlinearity together is typically considered a single neuron 10
11 Neuron Representation x w * ŷ The linear transformation and nonlinearity together is typically considered a single neuron 11
12 Neurons 1-Layer Neural Network Multi-layer Neural Network Loss Functions Backpropagation Nonlinearity Functions NNs in Practice 12
13 W1 x1 x x2 W2 w3 ŷ x3 13
14 1-Layer Neural Network (with 4 neurons) vector z matrix W ŷ Σ 1 x1 ŷ Σ Input x 2 x2 Output ŷ ŷ Σ x3 3 ŷ Σ Linear 4 Sigmoid 1 layer 14
15 1-Layer Neural Network (with 4 neurons) z W ŷ Σ 1 x1 ŷ Σ Input x 2 x2 Output ŷ ŷ Σ x3 3 p=3 ŷ Σ 4 Element-wise on vector z d=4 z =WT x dx1 dxp px1 ŷ = sigmoid(z) = dx1 dx1 ez 1 + ez 15
16 1-Layer Neural Network (with 4 neurons) ŷ W Σ 1 x1 ŷ Σ Input x 2 x2 Output ŷ ŷ Σ x3 3 p=3 ŷ Σ 4 Element-wise on vector z d=4 z =WT x dx1 dxp px1 ŷ = sigmoid(z) = dx1 dx1 ez 1 + ez 16
17 Block View of a Neural Network W is now a matrix W x * Input Dot Product z is now a vector z ŷ Sigmoid output z =WT x dx1 dxp px1 ŷ = sigmoid(z) = dx1 dx1 ez 1 + ez 17
18 Neurons 1-Layer Neural Network Multi-layer Neural Network Loss Functions Backpropagation Nonlinearity Functions NNs in Practice 18
19 W1 x1 x x2 W2 w3 ŷ x3 19
20 Multi-Layer Neural Network (Multi-Layer Perceptron (MLP) Network) weight subscript represents layer number W1 w2 x1 x ŷ x2 x3 Hidden layer Output layer 2-layer NN 20
21 Multi-Layer Neural Network (MLP) W2 W1 w3 x1 x ŷ x2 x3 1st hidden layer 2nd hidden layer Output layer 3-layer NN 21
22 Multi-Layer Neural Network (MLP) h1 W1 h2 W2 w3 x1 x ŷ x2 x3 hidden layer 1 output z1 =W1T x h1 = sigmoid(z1) z2 =W2T h1 h2 = sigmoid(z2) z3 =w3t h2 ŷ = sigmoid(z3) 22
23 Multi-Class Output MLP W1 x1 x h1 W2 h2 w3 x2 ŷ x3 z1 =W1T x h1 = sigmoid(z1) z2 =W2T h1 h2 = sigmoid(z2) z3 =W3T h2 ŷ = sigmoid(z2) 23
24 Block View Of MLP x W1 * z1 1st hidden layer h1 W2 * z2 2nd hidden layer h2 W3 * z3 y Output layer 24
25 Deep Neural Networks (i.e. > 1 hidden layer) Researchers have successfully used 1000 layers to train an object classifier 25
26 Neurons 1-Layer Neural Network Multi-layer Neural Network Loss Functions Backpropagation Nonlinearity Functions NNs in Practice 26
27 W1 x1 x x2 W2 w3 ŷ E (ŷ) x3 27
28 Binary Classification Loss W1 W2 w3 x1 x ŷ = P(y=1 X,W) x2 x3 this example is for a single sample x E= loss = - logp(y = ŷ X= x ) = - y log(ŷ) - (1 - y) log(1-ŷ) 28 true output
29 Regression Loss W1 x1 x W2 w3 Σ x2 ŷ x3 E = loss= 12 ( y - ŷ )2 true output 29
30 Multi-Class Classification Loss W1 W2 W3 x z1 Σ x1 ŷ1 z2 x2 Σ x3 Σ ŷ2 z3 ŷi = P( ŷi = 1 x ) ŷ3 K=3 Softmax function. Normalizing function which converts each class output to a probability. E = loss = - Σ yj ln ŷj j = 1...K 0 for all except true class 30
31 Neurons 1-Layer Neural Network Multi-layer Neural Network Loss Functions Backpropagation Nonlinearity Functions NNs in Practice 31
32 W1 x1 x x2 W2 w3 ŷ E (ŷ) x3 32
33 Training Neural Networks How do we learn the optimal weights WL for our task?? Gradient descent: WL(t+1) = WL(t) - E WL(t) W1 x W2 w3 ŷ But how do we get gradients of lower layers? Backpropagation! Repeated application of chain rule of calculus Locally minimize the objective Requires all blocks of the network to be differentiable 33 LeCun et. al. Efficient Backpropagation. 1998
34 Backpropagation Intro 34
35 Backpropagation Intro 35
36 Backpropagation Intro 36
37 Backpropagation Intro 37
38 Backpropagation Intro 38
39 Backpropagation Intro 39
40 Backpropagation Intro 40
41 Backpropagation Intro 41
42 Backpropagation Intro 42
43 Backpropagation Intro 43
44 Backpropagation Intro 44
45 Backpropagation Intro 45
46 Backpropagation Intro Tells us: by increasing x by a scale of 1, we decrease f by a scale of
47 Backpropagation (binary classification example) W1 x1 x x2 w2 ŷ = P(y=1 X,W) x3 Example on 1-hidden layer NN for binary classification 47
48 Backpropagation (binary classification example) x W1 * z1 h1 w2 * z2 y 48
49 Backpropagation (binary classification example) x W1 * z1 h1 w2 * z2 y E = loss = Gradient Descent to Minimize loss: Need to find these! 49
50 Backpropagation (binary classification example) x W1 * z1 h1 w2 * z2 y 50
51 Backpropagation (binary classification example) x W1 * z1 h1 w2 * z2 y =?? =?? 51
52 Backpropagation (binary classification example) x W1 * z1 h1 w2 * z2 y =?? =?? Exploit the chain rule! 52
53 Backpropagation (binary classification example) x W1 * z1 h1 w2 * z2 y 53
54 Backpropagation (binary classification example) x W1 * z1 h1 w2 * z2 y chain rule 54
55 Backpropagation (binary classification example) x W1 * z1 h1 w2 * z2 y 55
56 Backpropagation (binary classification example) x W1 * z1 h1 w2 * z2 y 56
57 Backpropagation (binary classification example) x W1 * z1 h1 w2 * z2 y 57
58 Backpropagation (binary classification example) x W1 * z1 h1 w2 * z2 y 58
59 Backpropagation (binary classification example) x W1 * z1 h1 w2 * z2 y 59
60 Backpropagation (binary classification example) x W1 * z1 h1 w2 * z2 y 60
61 Backpropagation (binary classification example) x W1 * z1 h1 w2 * z2 y 61
62 Backpropagation (binary classification example) x W1 * z1 h1 w2 * z2 y already computed 62
63 Local-ness of Backpropagation activations local gradients x f y gradients 63
64 Local-ness of Backpropagation activations local gradients x f y gradients 64
65 Example: Sigmoid Block x sigmoid(x) = (x) 65
66 Deep Learning = Concatenation of Differentiable Parameterized Layers (linear & nonlinearity functions) x W1 * z1 1st hidden layer h1 W2 * z2 2nd hidden layer h2 W3 * z3 y Output layer Want to find optimal weights W to minimize some loss function E! 66
67 Backprop Whiteboard Demo w1 x1 w2 x2 Σ w3 w4 b1 z1 h1 w5 Σ Σ h2 z2 b2 1 ŷ w6 b3 1 f1 z1 = x1w1 + x2w3 + b1 z2 = x1w2 + x2w4 + b2 f2 exp(z1) h1 = 1 + exp(z ) 1 exp(z2) h2 = 1 + exp(z2) f3 ŷ = h1w5 + h2w6 + b3 f4 E = ( y - ŷ )2 w(t+1) = w(t) - E w E w(t) =?? 67
68 Neurons 1-Layer Neural Network Multi-layer Neural Network Loss Functions Backpropagation Nonlinearity Functions NNs in Practice 68
69 Nonlinearity Functions (i.e. transfer or activation functions) x1 x2 x3 +1 w1 w2 w3 b1 Σ Summing Function z Sigmoid Function Multiply by weights 69
70 Nonlinearity Functions (i.e. transfer or activation functions) Name Plot Equation Derivative ( w.r.t x ) 70
71 Nonlinearity Functions (i.e. transfer or activation functions) Name Plot Equation Derivative ( w.r.t x ) 71
72 Nonlinearity Functions (aka transfer or activation functions) Name Plot Equation Derivative ( w.r.t x ) 72 usually works best in practice
73 Neurons 1-Layer Neural Network Multi-layer Neural Network Loss Functions Backpropagation Nonlinearity Functions NNs in Practice 73
74 Neural Net Pipeline W2 W1 x w3 ŷ E 1. Initialize weights 2. For each batch of input x samples S: a. Run the network Forward on S to compute outputs and loss b. Run the network Backward using outputs and loss to compute gradients c. Update weights using SGD (or a similar method) 3. Repeat step 2 until loss convergence 74
75 Non-Convexity of Neural Nets In very high dimensions, there exists many local minimum which are about the same. Pascanu, et. al. On the saddle point problem for non-convex optimization
76 Building Deep Neural Nets y x f 76
77 Building Deep Neural Nets GoogLeNet for Object Classification 77
78 Block Example Implementation 78
79 Advantage of Neural Nets As long as it s fully differentiable, we can train the model to automatically learn features for us. 79
80 Advanced Deep Learning Models: Convolutional Neural Networks & Recurrent Neural Networks Most slides from 80
81 Convolutional Neural Networks (aka CNNs and ConvNets) 81
82 Challenges in Visual Recognition
83 Challenges in Visual Recognition
84 Problems with Fully Connected Networks on Images Each neuron corresponds to a specific pixel 1000x1000 Image 1M hidden units: 10^12 parameters Spatial Correlation is local! Connect units 84
85 Problems with Fully Connected Networks on Images Each neuron corresponds to a specific pixel 1000x1000 Image 1M hidden units: 10^12 parameters Spatial Correlation is local! Connect units How do we deal with multiple dimensions in the input? Length,height, channel (R,G,B) 85
86 Convolutional Layer 86
87 Convolutional Layer 87
88 Convolutional Layer 88
89 Convolutional Layer 89
90 Convolution 90
91 Convolution 91
92 Neuron View of Convolutional Layer 92
93 Neuron View of Convolutional Layer 93
94 Convolutional Layer 94
95 Convolutional Layer 95
96 Neuron View of Convolutional Layer 96
97 Convolutional Neural Networks 97
98 Convolutional Neural Networks 98
99 Convolutional Neural Networks 99
100 Convolutional Neural Networks 100
101 Pooling Layer 101
102 Pooling Layer (Max Pooling example) 102
103 History of ConvNets Gradient-based learning applied to document recognition [LeCun, ImageNet Classification with Deep Convolutional Neural Networks Bottou, Bengio, Haffner] [Krizhevsky, Sutskever, Hinton, 2012] 103
104 104
105 105
106 106
107 107
108 Recurrent Neural Networks 108
109 Standard Feed-Forward Neural Network 109
110 Standard Feed-Forward Neural Network 110
111 Recurrent Neural Networks (RNNs) RNNs can handle 111
112 Recurrent Neural Networks (RNNs) RNNs can handle 112
113 Recurrent Neural Networks (RNNs) Traditional Feed Forward Neural Network Recurrent Neural Network predict a vector at each timestep output output hidden hidden input input
114 Recurrent Neural Networks ht ht-1 114
115 Recurrent Neural Networks ht ht-1 115
116 Vanilla Recurrent Neural Network ht ht-1 116
117 Character Level Language Model with an RNN 117
118 Character Level Language Model with an RNN 118
119 Character Level Language Model with an RNN 119
120 Character Level Language Model with an RNN 120
121 Character Level Language Model with an RNN 121
122 Character Level Language Model with an RNN 122
123 Character Level Language Model with an RNN 123
124 Example: Generating Shakespeare with RNNs 124
125 Example: Generating C Code with RNNs 125
126 Long Short Term Memory Networks (LSTMs) Recurrent networks suffer from the vanishing gradient problem Aren t able to model long term dependencies in sequences output hidden input
127 Long Short Term Memory Networks (LSTMs) Recurrent networks suffer from the vanishing gradient problem Aren t able to model long term dependencies in sequences Use gating units to learn when to remember output input
128 RNNs and CNNs Together 128
Intro to Neural Networks and Deep Learning
Intro to Neural Networks and Deep Learning Jack Lanchantin Dr. Yanjun Qi UVA CS 6316 1 Neurons 1-Layer Neural Network Multi-layer Neural Network Loss Functions Backpropagation Nonlinearity Functions NNs
More informationMaking Deep Learning Understandable for Analyzing Sequential Data about Gene Regulation
Making Deep Learning Understandable for Analyzing Sequential Data about Gene Regulation Dr. Yanjun Qi Department of Computer Science University of Virginia Tutorial @ ACM BCB-2018 8/29/18 Yanjun Qi / UVA
More informationMachine Learning for Large-Scale Data Analysis and Decision Making A. Neural Networks Week #6
Machine Learning for Large-Scale Data Analysis and Decision Making 80-629-17A Neural Networks Week #6 Today Neural Networks A. Modeling B. Fitting C. Deep neural networks Today s material is (adapted)
More informationNeural Networks 2. 2 Receptive fields and dealing with image inputs
CS 446 Machine Learning Fall 2016 Oct 04, 2016 Neural Networks 2 Professor: Dan Roth Scribe: C. Cheng, C. Cervantes Overview Convolutional Neural Networks Recurrent Neural Networks 1 Introduction There
More informationApprentissage, réseaux de neurones et modèles graphiques (RCP209) Neural Networks and Deep Learning
Apprentissage, réseaux de neurones et modèles graphiques (RCP209) Neural Networks and Deep Learning Nicolas Thome Prenom.Nom@cnam.fr http://cedric.cnam.fr/vertigo/cours/ml2/ Département Informatique Conservatoire
More informationNeural Networks. Intro to AI Bert Huang Virginia Tech
Neural Networks Intro to AI Bert Huang Virginia Tech Outline Biological inspiration for artificial neural networks Linear vs. nonlinear functions Learning with neural networks: back propagation https://en.wikipedia.org/wiki/neuron#/media/file:chemical_synapse_schema_cropped.jpg
More informationMachine Learning for Signal Processing Neural Networks Continue. Instructor: Bhiksha Raj Slides by Najim Dehak 1 Dec 2016
Machine Learning for Signal Processing Neural Networks Continue Instructor: Bhiksha Raj Slides by Najim Dehak 1 Dec 2016 1 So what are neural networks?? Voice signal N.Net Transcription Image N.Net Text
More informationECE521 Lectures 9 Fully Connected Neural Networks
ECE521 Lectures 9 Fully Connected Neural Networks Outline Multi-class classification Learning multi-layer neural networks 2 Measuring distance in probability space We learnt that the squared L2 distance
More informationDEEP LEARNING AND NEURAL NETWORKS: BACKGROUND AND HISTORY
DEEP LEARNING AND NEURAL NETWORKS: BACKGROUND AND HISTORY 1 On-line Resources http://neuralnetworksanddeeplearning.com/index.html Online book by Michael Nielsen http://matlabtricks.com/post-5/3x3-convolution-kernelswith-online-demo
More informationIntroduction to Deep Neural Networks
Introduction to Deep Neural Networks Presenter: Chunyuan Li Pattern Classification and Recognition (ECE 681.01) Duke University April, 2016 Outline 1 Background and Preliminaries Why DNNs? Model: Logistic
More informationCSE446: Neural Networks Spring Many slides are adapted from Carlos Guestrin and Luke Zettlemoyer
CSE446: Neural Networks Spring 2017 Many slides are adapted from Carlos Guestrin and Luke Zettlemoyer Human Neurons Switching time ~ 0.001 second Number of neurons 10 10 Connections per neuron 10 4-5 Scene
More informationNeural Networks: Backpropagation
Neural Networks: Backpropagation Machine Learning Fall 2017 Based on slides and material from Geoffrey Hinton, Richard Socher, Dan Roth, Yoav Goldberg, Shai Shalev-Shwartz and Shai Ben-David, and others
More informationMachine Learning and Data Mining. Multi-layer Perceptrons & Neural Networks: Basics. Prof. Alexander Ihler
+ Machine Learning and Data Mining Multi-layer Perceptrons & Neural Networks: Basics Prof. Alexander Ihler Linear Classifiers (Perceptrons) Linear Classifiers a linear classifier is a mapping which partitions
More informationDeep Learning Recurrent Networks 2/28/2018
Deep Learning Recurrent Networks /8/8 Recap: Recurrent networks can be incredibly effective Story so far Y(t+) Stock vector X(t) X(t+) X(t+) X(t+) X(t+) X(t+5) X(t+) X(t+7) Iterated structures are good
More informationLecture 8: Introduction to Deep Learning: Part 2 (More on backpropagation, and ConvNets)
COS 402 Machine Learning and Artificial Intelligence Fall 2016 Lecture 8: Introduction to Deep Learning: Part 2 (More on backpropagation, and ConvNets) Sanjeev Arora Elad Hazan Recap: Structure of a deep
More informationClassification goals: Make 1 guess about the label (Top-1 error) Make 5 guesses about the label (Top-5 error) No Bounding Box
ImageNet Classification with Deep Convolutional Neural Networks Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton Motivation Classification goals: Make 1 guess about the label (Top-1 error) Make 5 guesses
More informationDeep Learning (CNNs)
10-601 Introduction to Machine Learning Machine Learning Department School of Computer Science Carnegie Mellon University Deep Learning (CNNs) Deep Learning Readings: Murphy 28 Bishop - - HTF - - Mitchell
More informationConvolutional Neural Networks
Convolutional Neural Networks Books» http://www.deeplearningbook.org/ Books http://neuralnetworksanddeeplearning.com/.org/ reviews» http://www.deeplearningbook.org/contents/linear_algebra.html» http://www.deeplearningbook.org/contents/prob.html»
More informationLecture 5 Neural models for NLP
CS546: Machine Learning in NLP (Spring 2018) http://courses.engr.illinois.edu/cs546/ Lecture 5 Neural models for NLP Julia Hockenmaier juliahmr@illinois.edu 3324 Siebel Center Office hours: Tue/Thu 2pm-3pm
More informationConvolutional Neural Networks II. Slides from Dr. Vlad Morariu
Convolutional Neural Networks II Slides from Dr. Vlad Morariu 1 Optimization Example of optimization progress while training a neural network. (Loss over mini-batches goes down over time.) 2 Learning rate
More informationCSC321 Lecture 16: ResNets and Attention
CSC321 Lecture 16: ResNets and Attention Roger Grosse Roger Grosse CSC321 Lecture 16: ResNets and Attention 1 / 24 Overview Two topics for today: Topic 1: Deep Residual Networks (ResNets) This is the state-of-the
More informationWhat Do Neural Networks Do? MLP Lecture 3 Multi-layer networks 1
What Do Neural Networks Do? MLP Lecture 3 Multi-layer networks 1 Multi-layer networks Steve Renals Machine Learning Practical MLP Lecture 3 7 October 2015 MLP Lecture 3 Multi-layer networks 2 What Do Single
More informationIntroduction to Convolutional Neural Networks (CNNs)
Introduction to Convolutional Neural Networks (CNNs) nojunk@snu.ac.kr http://mipal.snu.ac.kr Department of Transdisciplinary Studies Seoul National University, Korea Jan. 2016 Many slides are from Fei-Fei
More informationMachine Learning Lecture 10
Machine Learning Lecture 10 Neural Networks 26.11.2018 Bastian Leibe RWTH Aachen http://www.vision.rwth-aachen.de leibe@vision.rwth-aachen.de Today s Topic Deep Learning 2 Course Outline Fundamentals Bayes
More informationSlide credit from Hung-Yi Lee & Richard Socher
Slide credit from Hung-Yi Lee & Richard Socher 1 Review Recurrent Neural Network 2 Recurrent Neural Network Idea: condition the neural network on all previous words and tie the weights at each time step
More informationArtificial Neural Networks D B M G. Data Base and Data Mining Group of Politecnico di Torino. Elena Baralis. Politecnico di Torino
Artificial Neural Networks Data Base and Data Mining Group of Politecnico di Torino Elena Baralis Politecnico di Torino Artificial Neural Networks Inspired to the structure of the human brain Neurons as
More informationMachine Learning for Computer Vision 8. Neural Networks and Deep Learning. Vladimir Golkov Technical University of Munich Computer Vision Group
Machine Learning for Computer Vision 8. Neural Networks and Deep Learning Vladimir Golkov Technical University of Munich Computer Vision Group INTRODUCTION Nonlinear Coordinate Transformation http://cs.stanford.edu/people/karpathy/convnetjs/
More informationUnderstanding How ConvNets See
Understanding How ConvNets See Slides from Andrej Karpathy Springerberg et al, Striving for Simplicity: The All Convolutional Net (ICLR 2015 workshops) CSC321: Intro to Machine Learning and Neural Networks,
More informationMachine Learning Lecture 12
Machine Learning Lecture 12 Neural Networks 30.11.2017 Bastian Leibe RWTH Aachen http://www.vision.rwth-aachen.de leibe@vision.rwth-aachen.de Course Outline Fundamentals Bayes Decision Theory Probability
More informationNeural Networks. Yan Shao Department of Linguistics and Philology, Uppsala University 7 December 2016
Neural Networks Yan Shao Department of Linguistics and Philology, Uppsala University 7 December 2016 Outline Part 1 Introduction Feedforward Neural Networks Stochastic Gradient Descent Computational Graph
More informationBackpropagation Introduction to Machine Learning. Matt Gormley Lecture 12 Feb 23, 2018
10-601 Introduction to Machine Learning Machine Learning Department School of Computer Science Carnegie Mellon University Backpropagation Matt Gormley Lecture 12 Feb 23, 2018 1 Neural Networks Outline
More informationCOMP 551 Applied Machine Learning Lecture 14: Neural Networks
COMP 551 Applied Machine Learning Lecture 14: Neural Networks Instructor: Ryan Lowe (ryan.lowe@mail.mcgill.ca) Slides mostly by: Class web page: www.cs.mcgill.ca/~hvanho2/comp551 Unless otherwise noted,
More information(Feed-Forward) Neural Networks Dr. Hajira Jabeen, Prof. Jens Lehmann
(Feed-Forward) Neural Networks 2016-12-06 Dr. Hajira Jabeen, Prof. Jens Lehmann Outline In the previous lectures we have learned about tensors and factorization methods. RESCAL is a bilinear model for
More informationSGD and Deep Learning
SGD and Deep Learning Subgradients Lets make the gradient cheating more formal. Recall that the gradient is the slope of the tangent. f(w 1 )+rf(w 1 ) (w w 1 ) Non differentiable case? w 1 Subgradients
More informationRecurrent Neural Networks. Jian Tang
Recurrent Neural Networks Jian Tang tangjianpku@gmail.com 1 RNN: Recurrent neural networks Neural networks for sequence modeling Summarize a sequence with fix-sized vector through recursively updating
More informationECE G: Special Topics in Signal Processing: Sparsity, Structure, and Inference
ECE 18-898G: Special Topics in Signal Processing: Sparsity, Structure, and Inference Neural Networks: A brief touch Yuejie Chi Department of Electrical and Computer Engineering Spring 2018 1/41 Outline
More informationCSC321 Lecture 5: Multilayer Perceptrons
CSC321 Lecture 5: Multilayer Perceptrons Roger Grosse Roger Grosse CSC321 Lecture 5: Multilayer Perceptrons 1 / 21 Overview Recall the simple neuron-like unit: y output output bias i'th weight w 1 w2 w3
More informationLecture 11 Recurrent Neural Networks I
Lecture 11 Recurrent Neural Networks I CMSC 35246: Deep Learning Shubhendu Trivedi & Risi Kondor University of Chicago May 01, 2017 Introduction Sequence Learning with Neural Networks Some Sequence Tasks
More informationNeural Architectures for Image, Language, and Speech Processing
Neural Architectures for Image, Language, and Speech Processing Karl Stratos June 26, 2018 1 / 31 Overview Feedforward Networks Need for Specialized Architectures Convolutional Neural Networks (CNNs) Recurrent
More informationIntroduction to Convolutional Neural Networks 2018 / 02 / 23
Introduction to Convolutional Neural Networks 2018 / 02 / 23 Buzzword: CNN Convolutional neural networks (CNN, ConvNet) is a class of deep, feed-forward (not recurrent) artificial neural networks that
More informationLecture 3 Feedforward Networks and Backpropagation
Lecture 3 Feedforward Networks and Backpropagation CMSC 35246: Deep Learning Shubhendu Trivedi & Risi Kondor University of Chicago April 3, 2017 Things we will look at today Recap of Logistic Regression
More informationArtificial Neural Networks. Introduction to Computational Neuroscience Tambet Matiisen
Artificial Neural Networks Introduction to Computational Neuroscience Tambet Matiisen 2.04.2018 Artificial neural network NB! Inspired by biology, not based on biology! Applications Automatic speech recognition
More informationLecture 3 Feedforward Networks and Backpropagation
Lecture 3 Feedforward Networks and Backpropagation CMSC 35246: Deep Learning Shubhendu Trivedi & Risi Kondor University of Chicago April 3, 2017 Things we will look at today Recap of Logistic Regression
More informationCS 179: LECTURE 16 MODEL COMPLEXITY, REGULARIZATION, AND CONVOLUTIONAL NETS
CS 179: LECTURE 16 MODEL COMPLEXITY, REGULARIZATION, AND CONVOLUTIONAL NETS LAST TIME Intro to cudnn Deep neural nets using cublas and cudnn TODAY Building a better model for image classification Overfitting
More informationSequence Modeling with Neural Networks
Sequence Modeling with Neural Networks Harini Suresh y 0 y 1 y 2 s 0 s 1 s 2... x 0 x 1 x 2 hat is a sequence? This morning I took the dog for a walk. sentence medical signals speech waveform Successes
More informationNeural Network Training
Neural Network Training Sargur Srihari Topics in Network Training 0. Neural network parameters Probabilistic problem formulation Specifying the activation and error functions for Regression Binary classification
More informationCSC 411 Lecture 10: Neural Networks
CSC 411 Lecture 10: Neural Networks Roger Grosse, Amir-massoud Farahmand, and Juan Carrasquilla University of Toronto UofT CSC 411: 10-Neural Networks 1 / 35 Inspiration: The Brain Our brain has 10 11
More informationStatistical NLP for the Web
Statistical NLP for the Web Neural Networks, Deep Belief Networks Sameer Maskey Week 8, October 24, 2012 *some slides from Andrew Rosenberg Announcements Please ask HW2 related questions in courseworks
More informationIntroduction to RNNs!
Introduction to RNNs Arun Mallya Best viewed with Computer Modern fonts installed Outline Why Recurrent Neural Networks (RNNs)? The Vanilla RNN unit The RNN forward pass Backpropagation refresher The RNN
More informationGrundlagen der Künstlichen Intelligenz
Grundlagen der Künstlichen Intelligenz Neural networks Daniel Hennes 21.01.2018 (WS 2017/18) University Stuttgart - IPVS - Machine Learning & Robotics 1 Today Logistic regression Neural networks Perceptron
More informationTTIC 31230, Fundamentals of Deep Learning David McAllester, April Vanishing and Exploding Gradients. ReLUs. Xavier Initialization
TTIC 31230, Fundamentals of Deep Learning David McAllester, April 2017 Vanishing and Exploding Gradients ReLUs Xavier Initialization Batch Normalization Highway Architectures: Resnets, LSTMs and GRUs Causes
More informationLecture 11 Recurrent Neural Networks I
Lecture 11 Recurrent Neural Networks I CMSC 35246: Deep Learning Shubhendu Trivedi & Risi Kondor niversity of Chicago May 01, 2017 Introduction Sequence Learning with Neural Networks Some Sequence Tasks
More informationDeep Learning: a gentle introduction
Deep Learning: a gentle introduction Jamal Atif jamal.atif@dauphine.fr PSL, Université Paris-Dauphine, LAMSADE February 8, 206 Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, 206 / Why
More informationLogistic Regression Introduction to Machine Learning. Matt Gormley Lecture 8 Feb. 12, 2018
10-601 Introduction to Machine Learning Machine Learning Department School of Computer Science Carnegie Mellon University Logistic Regression Matt Gormley Lecture 8 Feb. 12, 2018 1 10-601 Introduction
More informationJakub Hajic Artificial Intelligence Seminar I
Jakub Hajic Artificial Intelligence Seminar I. 11. 11. 2014 Outline Key concepts Deep Belief Networks Convolutional Neural Networks A couple of questions Convolution Perceptron Feedforward Neural Network
More informationNeural Networks Learning the network: Backprop , Fall 2018 Lecture 4
Neural Networks Learning the network: Backprop 11-785, Fall 2018 Lecture 4 1 Recap: The MLP can represent any function The MLP can be constructed to represent anything But how do we construct it? 2 Recap:
More informationDeep Feedforward Networks. Sargur N. Srihari
Deep Feedforward Networks Sargur N. srihari@cedar.buffalo.edu 1 Topics Overview 1. Example: Learning XOR 2. Gradient-Based Learning 3. Hidden Units 4. Architecture Design 5. Backpropagation and Other Differentiation
More informationNeural Networks, Computation Graphs. CMSC 470 Marine Carpuat
Neural Networks, Computation Graphs CMSC 470 Marine Carpuat Binary Classification with a Multi-layer Perceptron φ A = 1 φ site = 1 φ located = 1 φ Maizuru = 1 φ, = 2 φ in = 1 φ Kyoto = 1 φ priest = 0 φ
More informationFrom perceptrons to word embeddings. Simon Šuster University of Groningen
From perceptrons to word embeddings Simon Šuster University of Groningen Outline A basic computational unit Weighting some input to produce an output: classification Perceptron Classify tweets Written
More informationDeep Learning Lab Course 2017 (Deep Learning Practical)
Deep Learning Lab Course 207 (Deep Learning Practical) Labs: (Computer Vision) Thomas Brox, (Robotics) Wolfram Burgard, (Machine Learning) Frank Hutter, (Neurorobotics) Joschka Boedecker University of
More informationDemystifying deep learning. Artificial Intelligence Group Department of Computer Science and Technology, University of Cambridge, UK
Demystifying deep learning Petar Veličković Artificial Intelligence Group Department of Computer Science and Technology, University of Cambridge, UK London Data Science Summit 20 October 2017 Introduction
More informationCSC321 Lecture 10 Training RNNs
CSC321 Lecture 10 Training RNNs Roger Grosse and Nitish Srivastava February 23, 2015 Roger Grosse and Nitish Srivastava CSC321 Lecture 10 Training RNNs February 23, 2015 1 / 18 Overview Last time, we saw
More informationDeep Neural Networks (1) Hidden layers; Back-propagation
Deep Neural Networs (1) Hidden layers; Bac-propagation Steve Renals Machine Learning Practical MLP Lecture 3 4 October 2017 / 9 October 2017 MLP Lecture 3 Deep Neural Networs (1) 1 Recap: Softmax single
More informationBACKPROPAGATION. Neural network training optimization problem. Deriving backpropagation
BACKPROPAGATION Neural network training optimization problem min J(w) w The application of gradient descent to this problem is called backpropagation. Backpropagation is gradient descent applied to J(w)
More informationNeural Networks. Bishop PRML Ch. 5. Alireza Ghane. Feed-forward Networks Network Training Error Backpropagation Applications
Neural Networks Bishop PRML Ch. 5 Alireza Ghane Neural Networks Alireza Ghane / Greg Mori 1 Neural Networks Neural networks arise from attempts to model human/animal brains Many models, many claims of
More informationFeed-forward Networks Network Training Error Backpropagation Applications. Neural Networks. Oliver Schulte - CMPT 726. Bishop PRML Ch.
Neural Networks Oliver Schulte - CMPT 726 Bishop PRML Ch. 5 Neural Networks Neural networks arise from attempts to model human/animal brains Many models, many claims of biological plausibility We will
More informationStochastic gradient descent; Classification
Stochastic gradient descent; Classification Steve Renals Machine Learning Practical MLP Lecture 2 28 September 2016 MLP Lecture 2 Stochastic gradient descent; Classification 1 Single Layer Networks MLP
More informationBackpropagation Introduction to Machine Learning. Matt Gormley Lecture 13 Mar 1, 2018
10-601 Introduction to Machine Learning Machine Learning Department School of Computer Science Carnegie Mellon University Backpropagation Matt Gormley Lecture 13 Mar 1, 2018 1 Reminders Homework 5: Neural
More informationarxiv: v3 [cs.lg] 14 Jan 2018
A Gentle Tutorial of Recurrent Neural Network with Error Backpropagation Gang Chen Department of Computer Science and Engineering, SUNY at Buffalo arxiv:1610.02583v3 [cs.lg] 14 Jan 2018 1 abstract We describe
More informationStatistical Machine Learning (BE4M33SSU) Lecture 5: Artificial Neural Networks
Statistical Machine Learning (BE4M33SSU) Lecture 5: Artificial Neural Networks Jan Drchal Czech Technical University in Prague Faculty of Electrical Engineering Department of Computer Science Topics covered
More informationStatistical Machine Learning from Data
January 17, 2006 Samy Bengio Statistical Machine Learning from Data 1 Statistical Machine Learning from Data Multi-Layer Perceptrons Samy Bengio IDIAP Research Institute, Martigny, Switzerland, and Ecole
More informationLogistic Regression Introduction to Machine Learning. Matt Gormley Lecture 9 Sep. 26, 2018
10-601 Introduction to Machine Learning Machine Learning Department School of Computer Science Carnegie Mellon University Logistic Regression Matt Gormley Lecture 9 Sep. 26, 2018 1 Reminders Homework 3:
More informationCSCI567 Machine Learning (Fall 2018)
CSCI567 Machine Learning (Fall 2018) Prof. Haipeng Luo U of Southern California Sep 12, 2018 September 12, 2018 1 / 49 Administration GitHub repos are setup (ask TA Chi Zhang for any issues) HW 1 is due
More informationBayesian Networks (Part I)
10-601 Introduction to Machine Learning Machine Learning Department School of Computer Science Carnegie Mellon University Bayesian Networks (Part I) Graphical Model Readings: Murphy 10 10.2.1 Bishop 8.1,
More informationIntroduction to Neural Networks
Introduction to Neural Networks Steve Renals Automatic Speech Recognition ASR Lecture 10 24 February 2014 ASR Lecture 10 Introduction to Neural Networks 1 Neural networks for speech recognition Introduction
More informationNeural networks and optimization
Neural networks and optimization Nicolas Le Roux Criteo 18/05/15 Nicolas Le Roux (Criteo) Neural networks and optimization 18/05/15 1 / 85 1 Introduction 2 Deep networks 3 Optimization 4 Convolutional
More informationSPSS, University of Texas at Arlington. Topics in Machine Learning-EE 5359 Neural Networks
Topics in Machine Learning-EE 5359 Neural Networks 1 The Perceptron Output: A perceptron is a function that maps D-dimensional vectors to real numbers. For notational convenience, we add a zero-th dimension
More informationDeep Feedforward Networks
Deep Feedforward Networks Liu Yang March 30, 2017 Liu Yang Short title March 30, 2017 1 / 24 Overview 1 Background A general introduction Example 2 Gradient based learning Cost functions Output Units 3
More informationNeural Networks and the Back-propagation Algorithm
Neural Networks and the Back-propagation Algorithm Francisco S. Melo In these notes, we provide a brief overview of the main concepts concerning neural networks and the back-propagation algorithm. We closely
More informationArtificial Neuron (Perceptron)
9/6/208 Gradient Descent (GD) Hantao Zhang Deep Learning with Python Reading: https://en.wikipedia.org/wiki/gradient_descent Artificial Neuron (Perceptron) = w T = w 0 0 + + w 2 2 + + w d d where
More informationNeural Networks. Single-layer neural network. CSE 446: Machine Learning Emily Fox University of Washington March 10, /9/17
3/9/7 Neural Networks Emily Fox University of Washington March 0, 207 Slides adapted from Ali Farhadi (via Carlos Guestrin and Luke Zettlemoyer) Single-layer neural network 3/9/7 Perceptron as a neural
More informationy(x n, w) t n 2. (1)
Network training: Training a neural network involves determining the weight parameter vector w that minimizes a cost function. Given a training set comprising a set of input vector {x n }, n = 1,...N,
More informationDeep Feedforward Networks
Deep Feedforward Networks Yongjin Park 1 Goal of Feedforward Networks Deep Feedforward Networks are also called as Feedforward neural networks or Multilayer Perceptrons Their Goal: approximate some function
More informationLearning Deep Architectures for AI. Part I - Vijay Chakilam
Learning Deep Architectures for AI - Yoshua Bengio Part I - Vijay Chakilam Chapter 0: Preliminaries Neural Network Models The basic idea behind the neural network approach is to model the response as a
More informationRecurrent neural networks
12-1: Recurrent neural networks Prof. J.C. Kao, UCLA Recurrent neural networks Motivation Network unrollwing Backpropagation through time Vanishing and exploding gradients LSTMs GRUs 12-2: Recurrent neural
More informationMachine Learning (CSE 446): Neural Networks
Machine Learning (CSE 446): Neural Networks Noah Smith c 2017 University of Washington nasmith@cs.washington.edu November 6, 2017 1 / 22 Admin No Wednesday office hours for Noah; no lecture Friday. 2 /
More informationLogistic Regression & Neural Networks
Logistic Regression & Neural Networks CMSC 723 / LING 723 / INST 725 Marine Carpuat Slides credit: Graham Neubig, Jacob Eisenstein Logistic Regression Perceptron & Probabilities What if we want a probability
More informationCourse 395: Machine Learning - Lectures
Course 395: Machine Learning - Lectures Lecture 1-2: Concept Learning (M. Pantic) Lecture 3-4: Decision Trees & CBC Intro (M. Pantic & S. Petridis) Lecture 5-6: Evaluating Hypotheses (S. Petridis) Lecture
More informationSpeech and Language Processing
Speech and Language Processing Lecture 5 Neural network based acoustic and language models Information and Communications Engineering Course Takahiro Shinoaki 08//6 Lecture Plan (Shinoaki s part) I gives
More informationNeural Networks and Deep Learning
Neural Networks and Deep Learning Professor Ameet Talwalkar November 12, 2015 Professor Ameet Talwalkar Neural Networks and Deep Learning November 12, 2015 1 / 16 Outline 1 Review of last lecture AdaBoost
More informationNEURAL LANGUAGE MODELS
COMP90042 LECTURE 14 NEURAL LANGUAGE MODELS LANGUAGE MODELS Assign a probability to a sequence of words Framed as sliding a window over the sentence, predicting each word from finite context to left E.g.,
More informationDeep Learning for Natural Language Processing. Sidharth Mudgal April 4, 2017
Deep Learning for Natural Language Processing Sidharth Mudgal April 4, 2017 Table of contents 1. Intro 2. Word Vectors 3. Word2Vec 4. Char Level Word Embeddings 5. Application: Entity Matching 6. Conclusion
More informationEngineering Part IIB: Module 4F10 Statistical Pattern Processing Lecture 6: Multi-Layer Perceptrons I
Engineering Part IIB: Module 4F10 Statistical Pattern Processing Lecture 6: Multi-Layer Perceptrons I Phil Woodland: pcw@eng.cam.ac.uk Michaelmas 2012 Engineering Part IIB: Module 4F10 Introduction In
More informationNeed for Deep Networks Perceptron. Can only model linear functions. Kernel Machines. Non-linearity provided by kernels
Need for Deep Networks Perceptron Can only model linear functions Kernel Machines Non-linearity provided by kernels Need to design appropriate kernels (possibly selecting from a set, i.e. kernel learning)
More informationNeural Networks. Learning and Computer Vision Prof. Olga Veksler CS9840. Lecture 10
CS9840 Learning and Computer Vision Prof. Olga Veksler Lecture 0 Neural Networks Many slides are from Andrew NG, Yann LeCun, Geoffry Hinton, Abin - Roozgard Outline Short Intro Perceptron ( layer NN) Multilayer
More informationRecurrent Neural Networks. COMP-550 Oct 5, 2017
Recurrent Neural Networks COMP-550 Oct 5, 2017 Outline Introduction to neural networks and deep learning Feedforward neural networks Recurrent neural networks 2 Classification Review y = f( x) output label
More informationComments. Assignment 3 code released. Thought questions 3 due this week. Mini-project: hopefully you have started. implement classification algorithms
Neural networks Comments Assignment 3 code released implement classification algorithms use kernels for census dataset Thought questions 3 due this week Mini-project: hopefully you have started 2 Example:
More informationNeed for Deep Networks Perceptron. Can only model linear functions. Kernel Machines. Non-linearity provided by kernels
Need for Deep Networks Perceptron Can only model linear functions Kernel Machines Non-linearity provided by kernels Need to design appropriate kernels (possibly selecting from a set, i.e. kernel learning)
More informationA summary of Deep Learning without Poor Local Minima
A summary of Deep Learning without Poor Local Minima by Kenji Kawaguchi MIT oral presentation at NIPS 2016 Learning Supervised (or Predictive) learning Learn a mapping from inputs x to outputs y, given
More informationCSC321 Lecture 15: Exploding and Vanishing Gradients
CSC321 Lecture 15: Exploding and Vanishing Gradients Roger Grosse Roger Grosse CSC321 Lecture 15: Exploding and Vanishing Gradients 1 / 23 Overview Yesterday, we saw how to compute the gradient descent
More information