Introduction to Deep Learning


A. G. Schwing & S. Fidler, University of Toronto, 2015 (CSC420: Intro to Image Understanding)

Outline
1. Universality of Neural Networks
2. Learning Neural Networks
3. Deep Learning
4. Applications
5. References

What are neural networks? Let's ask... [Figure: a biological neuron next to a computational network with an input layer (Input #1 through Input #4), a hidden layer, and an output layer.]

What are neural networks? "...neural networks (NNs) are computational models inspired by biological neural networks [...] and are used to estimate or approximate functions..." [Wikipedia]

What are neural networks? Origins:
- Threshold logic [W. McCulloch and W. Pitts, 1943]
- Perceptron [F. Rosenblatt, 1958]

What are neural networks? Use cases:
- Classification
- Playing video games
- Captcha
- Neural Turing Machine (e.g., learning how to sort) [Alex Graves]

What are neural networks? Example: input x; parameters w1, w2, b. [Figure: network with input x ∈ R, weights w1 and w2, bias b ∈ R, one hidden unit h1, and output f.]

How to compute the function? Forward propagation/pass (also called inference or prediction): given the input x and the parameters w, b, compute the (latent) intermediate results in a feed-forward manner until we obtain the output f.

How to compute the function? Example: input x, parameters w1, w2, b:
h1 = σ(w1 x + b)
f = w2 h1
Sigmoid function: σ(z) = 1/(1 + exp(−z))
For x = ln 2, b = −ln 3, w1 = 2, w2 = 2: h1 = ?, f = ?
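To make the forward pass concrete, here is a minimal NumPy sketch of this computation. One caveat: minus signs were lost in the transcription, so b = −ln 3 is a reconstruction, not a value confirmed by the slide.

```python
import numpy as np

# Forward pass for the one-neuron example: h1 = sigmoid(w1*x + b), f = w2*h1.
# Assumes b = -ln 3 (the minus sign appears lost in transcription).
x, b, w1, w2 = np.log(2), -np.log(3), 2.0, 2.0

h1 = 1 / (1 + np.exp(-(w1 * x + b)))  # sigmoid(2*ln2 - ln3) = (4/3)/(1 + 4/3)
f = w2 * h1
print(h1, f)                          # 4/7 = 0.571..., 8/7 = 1.142...
```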

How to compute the function? Given the parameters, what is f for x = 0, x = 1, x = 2, ...?
f = w2 σ(w1 x + b)
[Plot: f as a function of x.]

Let's mess with the parameters:
h1 = σ(w1 x + b), f = w2 h1, σ(z) = 1/(1 + exp(−z))
First keep w1 = 1.0 and change b: the bias shifts the sigmoid along the x-axis. [Plot: f vs. x for three values of b.]
Then keep b = 0 and change w1: the weight controls how steep the transition is. [Plot: f vs. x for three values of w1.]
Keep in mind the step function: for large w1 the sigmoid approaches a step.
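The following sketch reproduces the two sweeps numerically (the exact b and w1 values on the slide did not survive transcription, so the values below are illustrative):

```python
import numpy as np

# Varying b shifts the sigmoid along x; varying w1 changes its steepness.
def f(x, w1, b, w2=1.0):
    return w2 / (1 + np.exp(-(w1 * x + b)))

x = np.linspace(-6, 6, 5)
for b in (-2.0, 0.0, 2.0):          # w1 fixed at 1.0, b changes
    print("b =", b, f(x, 1.0, b).round(3))
for w1 in (0.5, 1.0, 5.0):          # b fixed at 0, w1 changes
    print("w1 =", w1, f(x, w1, 0.0).round(3))
```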

How to use neural networks for binary classification?
Feature/measurement: x. Output: how likely is the input to be a cat? [Plots: data points along x with a fitted sigmoid f.]
If the features shift, the same parameters no longer fit. [Plots: f for the previous features vs. the shifted features.]
Learning/Training means finding the right parameters.
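As a sketch of what "finding the right parameters" means, here is a tiny 1-D classifier f = σ(wx + b) fit by gradient descent on the log loss; the feature values and labels are made up for illustration:

```python
import numpy as np

# Fit the one-neuron cat classifier to 1-D features (made-up data).
x = np.array([-2.0, -1.5, -1.0, 1.0, 1.5, 2.0])   # features
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])      # 1 = cat

def sigma(z):
    return 1 / (1 + np.exp(-z))

w, b, eta = 0.0, 0.0, 0.5
for _ in range(2000):
    p = sigma(w * x + b)              # predicted probability of "cat"
    dw = np.mean((p - y) * x)         # gradient of the mean log loss
    db = np.mean(p - y)
    w, b = w - eta * dw, b - eta * db

print(w, b, sigma(w * np.array([-1.0, 1.0]) + b))
```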

So far we are able to scale and translate sigmoids. How well can we approximate an arbitrary function? With this simple model we are obviously not going very far:
- If the features are good, a simple classifier suffices. [Plot]
- If the features are noisy, a more complex classifier is needed. [Plot]
How can we generalize?

Let's use more hidden variables:
h1 = σ(w1 x + b1)
h2 = σ(w3 x + b2)
f = w2 h1 + w4 h2
Combining two step functions gives a bump [plot], e.g. for w1 = 100, b1 = −40, w3 = 100, b2 = −60, w2 = 1, w4 = −1.
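A quick check of the bump, using the reconstructed signs (b1 = −40, b2 = −60, w4 = −1; the minus signs were lost in transcription, but these values place the bump on the interval (0.4, 0.6)):

```python
import numpy as np

# Two sigmoid steps combined into a bump: f ~ 1 on (0.4, 0.6), ~ 0 elsewhere.
def sigma(z):
    return 1 / (1 + np.exp(-z))

def bump(x, w1=100, b1=-40, w3=100, b2=-60, w2=1, w4=-1):
    h1 = sigma(w1 * x + b1)          # steps up near x = 0.4
    h2 = sigma(w3 * x + b2)          # steps up near x = 0.6
    return w2 * h1 + w4 * h2         # their difference is a bump

x = np.array([0.2, 0.45, 0.5, 0.55, 0.8])
print(bump(x).round(3))              # ~[0, 1, 1, 1, 0]
```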

So let's simplify: we replace a pair of hidden nodes by a single bump function Bump(x1, x2, h) that
- starts at x1
- ends at x2
- has height h

Now we can represent bumps very well. How can we generalize?
f = Bump(0.0, 0.2, h1) + Bump(0.2, 0.4, h2) + Bump(0.4, 0.6, h3) + Bump(0.6, 0.8, h4) + Bump(0.8, 1.0, h5)
[Plot: target function vs. approximation.]
More bumps give a more accurate approximation. This corresponds to a single-layer network.
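A sketch of the bump-sum approximation; the target function on the slide is not recoverable, so sin(2πx) stands in as an assumed target:

```python
import numpy as np

# Approximate a target on [0, 1] with five adjacent bumps, heights chosen
# as the target value at each bin's center.
def sigma(z):
    return 1 / (1 + np.exp(-z))

def bump(x, x1, x2, h, k=200.0):     # smooth bump of height h on [x1, x2]
    return h * (sigma(k * (x - x1)) - sigma(k * (x - x2)))

target = lambda x: np.sin(2 * np.pi * x)
edges = np.linspace(0.0, 1.0, 6)     # 0.0, 0.2, ..., 1.0
x = np.linspace(0, 1, 11)
approx = sum(bump(x, a, b, target((a + b) / 2))
             for a, b in zip(edges[:-1], edges[1:]))
print(np.abs(approx - target(x)).max())  # crude; improves with more bumps
```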

Universality: theoretically we can approximate an arbitrary function. So we can learn a really complex cat classifier. Where is the catch?
- Complexity: we might need quite a few hidden units
- Overfitting: we might just memorize the training data

Generalizations are possible to:
- include more input dimensions
- capture more output dimensions
- employ multiple layers for more efficient representations
See [link] for a great read!

How do we find the parameters to obtain a good approximation? How do we tell a computer to do that?
Intuitive explanation: compute the approximation error at the output, then propagate the error back by computing the individual contributions of the parameters to the error. [Fig. from H. Lee]

Example for backpropagation of error:
- Target function: 5x²
- Approximation: f(x, w)
- Domain of interest: x ∈ {0, 1, 2, 3}
- Error: e(w) = Σ_{x ∈ {0,1,2,3}} (5x² − f(x, w))²
Program of interest: min_w e(w) = min_w Σ_{x ∈ {0,1,2,3}} (5x² − f(x, w))², where each summand is the loss l(x, w).
How to optimize? Gradient descent.

Gradient descent for min_w e(w). Algorithm: start with w0, t = 0.
1. Compute the gradient g_t = ∂e/∂w |_{w = w_t}
2. Update w_{t+1} = w_t − η g_t
3. Set t ← t + 1
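A runnable sketch of this loop on the slide's program of interest; since the slides leave f(x, w) unspecified, the sketch assumes a simple parametric form f(x, w) = w0 + w1 x + w2 x², for which gradient descent recovers the target exactly:

```python
import numpy as np

# Gradient descent on e(w) = sum_x (5x^2 - f(x, w))^2 over x in {0,1,2,3},
# with the assumed form f(x, w) = w0 + w1*x + w2*x^2.
xs = np.array([0.0, 1.0, 2.0, 3.0])
t = 5 * xs**2                        # target 5x^2 on the domain of interest
Phi = np.stack([np.ones_like(xs), xs, xs**2], axis=1)

def e(w):
    return np.sum((t - Phi @ w)**2)

w = np.zeros(3)
eta = 5e-3                           # step size
for step in range(20000):
    g = -2 * Phi.T @ (t - Phi @ w)   # gradient of e at w_t
    w = w - eta * g                  # update: w_{t+1} = w_t - eta * g_t
print(w, e(w))                       # w approaches [0, 0, 5], error -> 0
```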

The chain rule is important to compute gradients:
min_w e(w) = min_w Σ_{x ∈ {0,1,2,3}} (5x² − f(x, w))², with loss l(x, w).
Common loss functions l(x, w): squared loss, log loss, hinge loss.
Derivatives:
∂e/∂w = Σ_{x ∈ {0,1,2,3}} ∂l(x, w)/∂w = Σ_{x ∈ {0,1,2,3}} (∂l(x, w)/∂f) · (∂f/∂w)
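Applied to the earlier one-neuron network with a squared loss, the chain rule gives the following hand-derived gradients (a sketch; the x, y, and parameter values are arbitrary):

```python
import numpy as np

# Chain rule on f = w2 * sigmoid(w1*x + b) with squared loss l = (y - f)^2.
def sigma(z):
    return 1 / (1 + np.exp(-z))

def grads(x, y, w1, w2, b):
    z = w1 * x + b
    h = sigma(z)
    f = w2 * h
    dl_df = -2 * (y - f)             # dl/df
    df_dw2 = h                       # df/dw2
    df_dh = w2                       # df/dh
    dh_dz = h * (1 - h)              # sigmoid'(z) = sigma(z)(1 - sigma(z))
    dz_dw1, dz_db = x, 1.0
    return (dl_df * df_dh * dh_dz * dz_dw1,   # dl/dw1
            dl_df * df_dw2,                   # dl/dw2
            dl_df * df_dh * dh_dz * dz_db)    # dl/db

print(grads(x=1.5, y=2.0, w1=0.3, w2=-1.2, b=0.1))
```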

Slightly more complex example: a composite function represented as a directed acyclic graph, l(x, w) = f1(w1, f2(w2, f3(...))). The chain rule yields ∂l/∂w1 = ∂f1/∂w1, ∂l/∂w2 = (∂f1/∂f2) · (∂f2/∂w2), ∂l/∂f3 = (∂f1/∂f2) · (∂f2/∂f3), and so on. Repeated application of the chain rule allows efficient computation of all gradients.
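This is exactly what reverse-mode automatic differentiation does. The following is a minimal sketch (an illustration, not the lecture's code): each node stores its parents and the local derivatives, and backward() applies the chain rule through the graph in reverse topological order:

```python
import math

# Tiny reverse-mode autodiff: repeated chain rule over a DAG.
class Value:
    def __init__(self, data, parents=(), grad_fns=()):
        self.data, self.grad = data, 0.0
        self._parents, self._grad_fns = parents, grad_fns

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data + other.data, (self, other),
                     (lambda g: g, lambda g: g))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data * other.data, (self, other),
                     (lambda g: g * other.data, lambda g: g * self.data))

    def sigmoid(self):
        s = 1 / (1 + math.exp(-self.data))
        return Value(s, (self,), (lambda g: g * s * (1 - s),))

    def backward(self):
        order, seen = [], set()
        def visit(v):                  # topological order via DFS
            if v not in seen:
                seen.add(v)
                for p in v._parents:
                    visit(p)
                order.append(v)
        visit(self)
        self.grad = 1.0
        for v in reversed(order):      # chain rule, node by node
            for p, fn in zip(v._parents, v._grad_fns):
                p.grad += fn(v.grad)

# f = w2 * sigmoid(w1*x + b): one backward pass yields all gradients.
x, w1, w2, b = Value(1.5), Value(0.3), Value(-1.2), Value(0.1)
f = w2 * (w1 * x + b).sigmoid()
f.backward()
print(w1.grad, w2.grad, b.grad)
```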

Backpropagation doesn't work well for deep sigmoid networks:
- Diffusion of the gradient signal (multiplication of many small numbers)
- Attractivity of many local minima (a random initialization is very far from good points)
- Requires a lot of training samples
- Needs significant computational power
Solution: a 2-step approach
1. Greedy layerwise pre-training
2. Full fine-tuning at the end

Why go deep?
- Representation efficiency (fewer computational units for the same function)
- Hierarchical representation (non-local generalization)
- Combinatorial sharing (re-use of earlier computation)
- Works very well [Fig. from H. Lee]

To obtain more flexibility/non-linearity we use additional function prototypes:
- Sigmoid
- Rectified linear unit (ReLU)
- Pooling
- Dropout
- Convolutions

Convolutions. What do the numbers mean? See Sanja's lecture 14 for the answers... [Fig. adapted from A. Krizhevsky: f_conv(image, filter) = filtered image.]
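For concreteness, a minimal sketch of the operation itself: a 2-D convolution (implemented, as in most deep learning libraries, as cross-correlation) with stride 1 and no padding:

```python
import numpy as np

# "Valid" 2-D convolution: slide the kernel over the image and take
# the sum of elementwise products at each position.
def conv2d(image, kernel):
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

img = np.arange(25, dtype=float).reshape(5, 5)
edge = np.array([[1., 0., -1.]] * 3)   # simple vertical-edge filter
print(conv2d(img, edge))
```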

Max Pooling. What is happening here? [Fig. adapted from A. Krizhevsky]
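A minimal sketch of max pooling with non-overlapping 2×2 windows (assuming the input dimensions divide evenly):

```python
import numpy as np

# Max pooling: keep the maximum of each k x k window.
def max_pool(x, k=2):
    H, W = x.shape
    return x.reshape(H // k, k, W // k, k).max(axis=(1, 3))

x = np.array([[1., 3., 2., 4.],
              [5., 6., 1., 0.],
              [7., 2., 9., 8.],
              [0., 1., 3., 4.]])
print(max_pool(x))   # [[6. 4.] [7. 9.]]
```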

Rectified Linear Unit (ReLU):
- Drops information if smaller than zero
- Fixes the problem of vanishing gradients to some degree
Dropout:
- Drops information at random
- A kind of regularization, enforcing redundancy
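Both operations are one-liners; the sketch below uses "inverted" dropout, which rescales by 1/(1−p) at training time, a common convention stated here as an assumption rather than something the slides prescribe:

```python
import numpy as np

# ReLU zeroes out negatives; dropout zeroes out units at random.
def relu(x):
    return np.maximum(0.0, x)

def dropout(x, p=0.5, rng=np.random.default_rng(0)):
    mask = rng.random(x.shape) >= p   # keep each unit with prob 1-p
    return x * mask / (1.0 - p)       # inverted-dropout rescaling

a = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
print(relu(a))          # [0. 0. 0. 1.5 3.]
print(dropout(relu(a)))
```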

A famous deep learning network called AlexNet. The network won the ImageNet competition in 2012. How many parameters? Given an image, what is happening?
Inference time: about 2 ms per image when processing many images in parallel on the GPU.
Training time: a few days given a single recent GPU. [Fig. adapted from A. Krizhevsky]
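To get a feel for "how many parameters", here is back-of-the-envelope counting for convolutional and fully connected layers; the AlexNet layer shapes quoted in the comments are the commonly cited ones, not taken from the slide:

```python
# Parameter counts: a conv layer has one (kh x kw x c_in) filter plus a
# bias per output channel; a fully connected layer has one weight per
# input-output pair plus a bias per output.
def conv_params(kh, kw, c_in, c_out):
    return (kh * kw * c_in + 1) * c_out

def fc_params(n_in, n_out):
    return (n_in + 1) * n_out

print(conv_params(11, 11, 3, 96))   # AlexNet conv1: ~35K parameters
print(fc_params(4096, 4096))        # one of AlexNet's FC layers: ~16.8M
```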

Demo

Neural networks have been used for many applications:
- Classification and recognition in computer vision
- Text parsing in natural language processing
- Playing video games
- Stock market prediction
- Captcha
Demos: Russ' website; Antonio's Places website

Classification in computer vision: ImageNet Challenge. Since it's the end of the semester, let's find the beach...

Classification in computer vision: ImageNet Challenge. A place to maybe prepare for exams...

Links / Tutorials:
- Toronto demo by Russ and students
- MIT demo by Antonio and students
- Honglak Lee: workshop-tutorial-final.pdf
- Yann LeCun: yann/talks/lecun-ranzato-icml2013.pdf
- Richard Socher

Videos:
- Video games
- Captcha
- Stock exchange
