Learning from Data: Multi-layer Perceptrons

Learning from Data: Multi-layer Perceptrons. Amos Storkey, School of Informatics, University of Edinburgh.

Layered Neural Networks (outline): Background; Single Neurons; Relationship to Logistic Regression; Multiple Neurons; The Transfer Function; Different Output Types; The Whole Model.

Neural Networks. The field of neural networks grew out of simple models of neurons. Research was done into what networks of these neurons could achieve. Neural networks proved to be a reasonable modelling tool. Which is funny really, as they never were good models of neurons... or of neural networks. But when understood in terms of learning from data, they proved to be powerful.

Simple Neural Model. Input: x. Output: h(x) = g(a(x)) for activation a(x) = w^T x + b and transfer function g(·). Most commonly g(·) is logistic, but it could also be Gaussian shaped. w is called the weight vector and b is called the bias; these are the parameters of a neuron. Note that if the output of a neuron is understood as a class probability, and g(·) is logistic, this is just a logistic regression model. It has all the same properties!
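As a concrete illustration, here is a minimal NumPy sketch of a single neuron's forward pass; the particular weight and input values are made up for the example.

import numpy as np

def logistic(z):
    # Logistic (sigmoid) transfer function g(z) = 1 / (1 + exp(-z))
    return 1.0 / (1.0 + np.exp(-z))

def neuron(x, w, b):
    # Activation a(x) = w^T x + b, passed through the transfer function
    return logistic(w @ x + b)

# Example with arbitrary illustrative parameters
w = np.array([2.0, -1.0])   # weight vector
b = 0.5                     # bias
x = np.array([0.3, 0.7])
print(neuron(x, w, b))      # a number in (0, 1), interpretable as a class probability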

Single Neuron Function. [Figure: output of a single neuron plotted over inputs x(1) and x(2); left: Gaussian transfer, right: sigmoid transfer.] A single neuron returns a function of the projected distance along some line in input space.

Logistic Regression. The logistic regression model only copes with linear decision boundaries. What if we wanted to model more complicated systems, with nonlinear class boundaries? Logistic regression would not work. However, suppose you had some features that could select different parts of the input space, and you had a different model for each of those parts. But logistic regression is precisely a model which selects between two different parts of the input space! So what happens when you use a number of neurons to select different regions of the input space, and then have a neuron which does logistic regression in this feature space?

Layered Neural Models. Suppose we have a layer of K simple neurons, all taking the same input, but each producing a different output. The output of this first neural layer (the hidden layer) can now be viewed as a new input space. If the parameters of the hidden layer neurons are chosen well, we may find that the two classes we are interested in have nearly linear decision boundaries in the output space of this hidden layer. Then we are away! A standard logistic regression model will solve our problem in this case. So we just need to put one neuron in the next layer (the output layer) to produce the final classification.
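A small sketch of this idea, using an XOR-style problem. The hidden weights below are hand-picked for illustration (not learned); in the hidden-layer feature space the classes become nearly linearly separable, so a single logistic output neuron finishes the job.

import numpy as np

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

# XOR-style data: not linearly separable in the original input space
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])

# Two hand-picked hidden neurons (illustrative weights, not learned):
# h1 roughly detects "at least one input on", h2 roughly detects "both inputs on"
W = np.array([[10.0, 10.0],    # weights of hidden neuron 1
              [10.0, 10.0]])   # weights of hidden neuron 2
b = np.array([-5.0, -15.0])    # biases of hidden neurons 1 and 2

H = logistic(X @ W.T + b)      # hidden-layer outputs: the new feature space

# In (h1, h2) space the classes are nearly linearly separable,
# so one logistic-regression output neuron suffices.
v, c = np.array([10.0, -20.0]), -5.0
print(np.round(logistic(H @ v + c), 3))   # close to [0, 1, 1, 0]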

The Transfer Function. The function used to add nonlinearity in the hidden layer need not be a logistic; it need not return values in the range [0, 1]. This function is called the transfer function. Choose the transfer function g(·) so that the outputs of the net are continuous and differentiable functions of the weights. The logistic or sigmoid function is the most common choice (the tanh function is equivalent up to additive and multiplicative constants): g(z) = 1 / (1 + e^(-z)), and tanh(z/2) = 2g(z) - 1.
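A quick numerical check of the tanh identity above, as a minimal NumPy sketch:

import numpy as np

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

z = np.linspace(-5, 5, 11)
# tanh(z/2) and 2*g(z) - 1 should agree to numerical precision
print(np.allclose(np.tanh(z / 2), 2 * logistic(z) - 1))   # True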

Logistic Function. [Figure: plot of the logistic function g(z) against z.] The logistic function has a simple derivative: g'(z) = g(z)(1 - g(z)).
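The derivative formula is easy to verify numerically against a central finite difference; this is just an illustrative sketch.

import numpy as np

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

z = np.linspace(-4, 4, 9)
analytic = logistic(z) * (1 - logistic(z))                     # g'(z) = g(z)(1 - g(z))
eps = 1e-6
numeric = (logistic(z + eps) - logistic(z - eps)) / (2 * eps)  # central difference
print(np.max(np.abs(analytic - numeric)))                      # ~1e-10 or smaller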

Different Outputs. What if we want a real-valued output? Answer: we can give the output neuron a linear transfer function. What if we want a multivariate output? Answer: we can have many neurons in the output layer, each returning a different variable. What if we want many classes? Answer: we pipe our multivariate output through a logit or softmax model (see next lecture).
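A rough sketch of the two output choices mentioned above: linear outputs for real-valued targets, and a softmax over the same linear outputs for multiple classes. The weights here are random placeholders, not from the slides.

import numpy as np

def softmax(a):
    # Numerically stable softmax: subtract the max before exponentiating
    a = a - np.max(a)
    e = np.exp(a)
    return e / np.sum(e)

h = np.array([0.2, 0.9, 0.4])          # outputs of a hidden layer (made up)
V = np.random.randn(3, 3)              # output-layer weights, one row per output unit
c = np.zeros(3)                        # output-layer biases

a_out = V @ h + c
print(a_out)            # linear outputs: suitable for real-valued (regression) targets
print(softmax(a_out))   # softmax outputs: a probability distribution over classes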

Combining Neurons. So what happens when we combine the outputs of different neurons? Suppose we have two neurons, and we sum the outputs of those two neurons. What sorts of functions can we now represent?
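For instance, one simple weighted combination of two sigmoid neurons (here with weights +1 and -1, hand-picked for illustration) produces a localised bump along one input direction:

import numpy as np

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.linspace(-5, 5, 101)
# Two neurons looking at the same scalar input, with hand-picked weights:
combined = logistic(4 * (x + 1)) - logistic(4 * (x - 1))
# 'combined' forms a bump: close to 1 near x = 0, falling towards 0 outside (-1, 1)
print(np.round(combined[::20], 2))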

Combined Output of Two Neurons. [Figure: surface plot of the combined output, a representation built from two hidden neurons.]

Examples. [Figure: surface plots over inputs x(1) and x(2) of outputs built from several hidden neurons.] Multiple neurons return more complicated functions.

The Model. Input layer: x. K hidden layer neurons; neuron i computes h_i(x) = g(a_i(x)) for activation a_i(x) = w_i^T x + b_i. Output neuron: r(Σ_{i=1}^{K} v_i g(w_i^T x + b_i) + b). Typically g is logistic and r is linear or logistic.
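Putting the pieces together, a minimal NumPy sketch of this forward pass; the names, shapes, and parameter values are my own choices for the example, not from the slides.

import numpy as np

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

def mlp_forward(x, W, b, v, b_out, output_transfer=logistic):
    # One-hidden-layer MLP: r(sum_i v_i * g(w_i^T x + b_i) + b)
    h = logistic(W @ x + b)                 # hidden layer: K values h_i(x)
    return output_transfer(v @ h + b_out)   # output neuron with transfer r

# Tiny example with K = 3 hidden neurons and arbitrary parameters
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 2))   # one weight vector w_i per hidden neuron
b = rng.normal(size=3)        # hidden biases b_i
v = rng.normal(size=3)        # output weights v_i
b_out = 0.0                   # output bias b

print(mlp_forward(np.array([0.5, -1.0]), W, b, v, b_out))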

So What Are Neural Networks? They are simply nonlinear functions with many parameters. There are weight and bias parameters for each unit. That's it really. Don't over-glamorise them.

MLP Architecture. [Diagram: example with one hidden layer, reading bottom to top: input units, then hidden units, then output units.]

[Diagram: a small network with inputs x and y, two hidden units, and one output; the edge labels are the weights and biases used below.] Output is g(6 g(7x + 3y - 8) + 2 g(9x + 5y - 4) - 1).
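A short sketch evaluating this particular network, using the weights as reconstructed above:

import numpy as np

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

def example_net(x, y):
    # Two-hidden-unit network from the slide (weights as reconstructed above)
    h1 = logistic(7 * x + 3 * y - 8)
    h2 = logistic(9 * x + 5 * y - 4)
    return logistic(6 * h1 + 2 * h2 - 1)

# Evaluate at a few input points to see the nonlinear response
for x, y in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    print((x, y), round(example_net(x, y), 3))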

Summary. Neural network history. Simple neurons and logistic regression. What can a simple neuron do? Combining neurons. Layered neural networks / multi-layer perceptrons.