Multilayer Perceptrons

Lecture Slides for INTRODUCTION TO Machine Learning 2nd Edition, CHAPTER 11: Multilayer Perceptrons. ETHEM ALPAYDIN, The MIT Press, 2010. Edited and expanded for CS 4641 by Chris Simpkins. alpaydin@boun.edu.tr http://www.cmpe.boun.edu.tr/~ethem/i2ml2e

Overview: Neural networks, brains, and computers. Perceptrons. Training. Classification and regression. Linear separability. Multilayer perceptrons. Universal approximation. Backpropagation.

Neural Networks: Networks of processing units (neurons) with connections (synapses) between them. Large number of neurons: 10^10. Large connectivity: 10^5. Parallel processing, distributed computation/memory, robust to noise and failures.

Understanding the Brain. Levels of analysis (Marr, 1982): 1. Computational theory 2. Representation and algorithm 3. Hardware implementation. Reverse engineering: from hardware to theory. Parallel processing: SIMD vs MIMD. Neural net: SIMD with modifiable local memory. Learning: update by training/experience.

Perceptron (Rosenblatt, 1962)

What a Perceptron Does. Regression: y = wx + w_0, a linear fit. Classification: y = 1(wx + w_0 > 0), a linear discrimination. The bias is implemented as a weight w_0 on an extra input fixed at x_0 = +1.
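The two uses of the same unit can be sketched as follows; the weights and inputs are made-up values for illustration:

```python
import numpy as np

def perceptron(w, w0, x):
    """Net input of the perceptron: weighted sum of inputs plus bias w0."""
    return float(np.dot(w, x) + w0)

def predict_regression(w, w0, x):
    # Regression: the linear fit y = w.x + w0 is used directly.
    return perceptron(w, w0, x)

def predict_class(w, w0, x):
    # Classification: threshold the net input, y = 1(w.x + w0 > 0).
    return 1 if perceptron(w, w0, x) > 0 else 0

w, w0 = np.array([2.0, -1.0]), 0.5       # illustrative weights
print(predict_regression(w, w0, np.array([1.0, 1.0])))  # 2 - 1 + 0.5 = 1.5
print(predict_class(w, w0, np.array([0.0, 1.0])))       # -1 + 0.5 <= 0 -> 0
```

The same linear unit serves both tasks; only the interpretation of the output (and, later, the training rule) differs.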

Regression with K outputs: each output is a separate linear fit, y_i = w_i^T x. Classification with K outputs: softmax, y_i = exp(w_i^T x) / sum_k exp(w_k^T x).

Training. Online (instances seen one by one) vs. batch (whole sample) learning. Online: no need to store the whole sample; the problem may change in time; wear and degradation in system components. Stochastic gradient descent: update after a single pattern. Generic update rule (LMS rule): update = learning factor x (desired output - actual output) x input.
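The stochastic update can be sketched as follows; the pattern, target, and learning rate are made up for illustration:

```python
import numpy as np

def lms_update(w, x, r, eta=0.1):
    """One stochastic gradient-descent step of the LMS rule:
    w_j <- w_j + eta * (r - y) * x_j, applied after a single pattern."""
    y = np.dot(w, x)              # linear output for this one pattern
    return w + eta * (r - y) * x

# Online learning: update after each instance, no need to store the sample.
w = np.zeros(2)
x, r = np.array([1.0, 0.5]), 2.0  # hypothetical pattern and target
for _ in range(100):
    w = lms_update(w, x, r)
print(w, np.dot(w, x))  # w.x approaches the desired output r = 2.0
```

Because each step uses only the current pattern, the same loop works when patterns arrive as a stream and the underlying problem drifts over time.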

Training a Perceptron: Regression. Regression (linear output): E(w | X) = (1/2) sum_t (r^t - y^t)^2 with y^t = w^T x^t, giving the update delta_w_j = eta (r^t - y^t) x_j^t.

Classification. Single sigmoid output for two classes; K > 2: softmax outputs. Same as for linear discriminants from Chapter 10, except we update after each instance.
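The two output nonlinearities named above can be sketched as follows (input values are illustrative):

```python
import numpy as np

def sigmoid(a):
    """Single output for two classes: y = P(C1 | x)."""
    return 1.0 / (1.0 + np.exp(-a))

def softmax(a):
    """K > 2 classes: y_i = exp(a_i) / sum_k exp(a_k), a valid distribution."""
    e = np.exp(a - np.max(a))  # shift by the max for numerical stability
    return e / e.sum()

# As with linear discriminants, training uses the per-instance update
# delta_w = eta * (r - y) * x rather than a whole-batch step.
probs = softmax(np.array([2.0, 1.0, 0.1]))
print(sigmoid(0.0))       # 0.5: undecided between the two classes
print(probs, probs.sum()) # K probabilities summing to 1
```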

Learning Boolean AND

XOR. No w_0, w_1, w_2 satisfy all four constraints at once: w_0 <= 0, w_2 + w_0 > 0, w_1 + w_0 > 0, w_1 + w_2 + w_0 <= 0 (Minsky and Papert, 1969). A single perceptron cannot learn XOR.

Multilayer Perceptrons (Rumelhart et al., 1986)

MLP as Universal Approximator: x_1 XOR x_2 = (x_1 AND ~x_2) OR (~x_1 AND x_2)
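This decomposition can be wired directly into a two-layer network of threshold units; the weights below are hand-chosen (not learned) to realize the two AND terms in the hidden layer and the final OR at the output:

```python
import numpy as np

def step(a):
    """Threshold unit: 1 if the net input is positive, else 0."""
    return (np.asarray(a) > 0).astype(int)

def xor_mlp(x1, x2):
    """Two hidden units implement the two AND terms; the output unit ORs them."""
    x = np.array([x1, x2])
    W = np.array([[ 1.0, -1.0],   # z1 = 1(x1 - x2 - 0.5 > 0) = x1 AND NOT x2
                  [-1.0,  1.0]])  # z2 = 1(-x1 + x2 - 0.5 > 0) = NOT x1 AND x2
    b = np.array([-0.5, -0.5])
    z = step(W @ x + b)
    return int(step(z.sum() - 0.5))  # OR: fires if either hidden unit fires

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_mlp(a, b))  # matches a XOR b on all four inputs
```

The point of the slide is existence, not learning: once one hidden layer is allowed, a weight setting for XOR exists, and backpropagation can in principle find one.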

Backpropagation

Regression. Forward pass: hidden units compute z_h = sigmoid(w_h^T x), and the output is y = sum_h v_h z_h + v_0. Backward pass: the output error is propagated back to update v and then w.

Regression with Multiple Outputs: y_i = sum_h v_ih z_h + v_i0, with hidden units z_h = sigmoid(w_h^T x). The network diagram labels outputs y_i, second-layer weights v_ih, hidden units z_h, first-layer weights w_hj, and inputs x_j.
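A minimal backpropagation sketch for this one-hidden-layer regression network; the toy data (y = x^2), layer sizes, and learning rate are illustrative choices, with sigmoid hidden units and linear outputs as on the slide:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: learn y = x^2 on [-1, 1].
X = np.linspace(-1, 1, 50).reshape(-1, 1)
R = X ** 2

H, eta = 8, 0.1
W = rng.normal(0, 0.5, (H, 1)); w0 = np.zeros(H)  # input -> hidden
V = rng.normal(0, 0.5, (1, H)); v0 = np.zeros(1)  # hidden -> output

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

for epoch in range(2000):
    for x, r in zip(X, R):            # stochastic: one pattern at a time
        z = sigmoid(W @ x + w0)       # forward: hidden activations z_h
        y = V @ z + v0                # forward: linear outputs y_i
        e = r - y                     # output error
        # Backward: delta_v_ih = eta * e_i * z_h; the hidden error is the
        # output error propagated through V times the sigmoid derivative.
        dz = (V.T @ e) * z * (1 - z)
        V += eta * np.outer(e, z); v0 += eta * e
        W += eta * np.outer(dz, x); w0 += eta * dz

mse = float(np.mean((V @ sigmoid(X @ W.T + w0).T + v0[:, None] - R.T) ** 2))
print(mse)  # small after training
```

The two update lines are the whole algorithm: the chain rule applied once per layer, with each layer's error computed from the one above it.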


Figure: each hidden unit computes a sigmoid z_h of its net input w_h^T x + w_h0, and the output sums the weighted contributions v_h z_h.

Two-Class Discrimination. One sigmoid output: y^t estimates P(C_1 | x^t), and P(C_2 | x^t) = 1 - y^t.

K > 2 Classes

Multiple Hidden Layers. An MLP with one hidden layer is a universal approximator (Hornik et al., 1989), but using multiple layers may lead to simpler networks.

Improving Convergence: momentum; adaptive learning rate.
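The momentum heuristic can be sketched on a toy one-dimensional objective; the function, learning rate eta, and momentum factor alpha are illustrative:

```python
def momentum_step(w, grad, velocity, eta=0.1, alpha=0.9):
    """Gradient descent with momentum: part of the previous update is carried
    along, delta_w(t) = -eta * grad + alpha * delta_w(t-1)."""
    velocity = alpha * velocity - eta * grad
    return w + velocity, velocity

# Minimize f(w) = w^2 (gradient 2w) from w = 5. Momentum accelerates motion
# along consistently signed gradients and damps back-and-forth oscillation.
w, v = 5.0, 0.0
for _ in range(200):
    w, v = momentum_step(w, 2 * w, v)
print(w)  # near the minimum at 0
```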

Overfitting/Overtraining. Number of weights: H(d+1) + (H+1)K for d inputs, H hidden units, and K outputs.

Conclusion. Perceptrons handle linearly separable problems. Multilayer perceptrons can approximate any function. Logistic discrimination functions enable gradient-descent-based backpropagation, which solves the structural credit assignment problem. Susceptible to local optima. Susceptible to overfitting.


Structured MLP (Le Cun et al., 1989)

Weight Sharing

Hints (Abu-Mostafa, 1995). Invariance to translation, rotation, size. Virtual examples. Augmented error: E' = E + lambda_h E_h. If x and x' are known to be the same: E_h = [g(x | theta) - g(x' | theta)]^2. Approximation hint.

Tuning the Network Size. Destructive: weight decay, delta_w_i = -eta dE/dw_i - lambda w_i, equivalent to minimizing E' = E + (lambda/2) sum_i w_i^2. Constructive: growing networks (Ash, 1989; Fahlman and Lebiere, 1989).
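The destructive (weight-decay) update can be sketched as follows; eta and lambda are illustrative values:

```python
def weight_decay_step(w, grad, eta=0.1, lam=0.01):
    """One weight-decay update: delta_w_i = -eta * dE/dw_i - lambda * w_i,
    i.e., gradient descent on the penalized error E' = E + (lam/2) * sum w_i^2."""
    return w - eta * grad - lam * w

# With zero task gradient, the decay term alone shrinks a weight toward 0,
# so connections the error does not support are gradually pruned away.
w = 1.0
for _ in range(100):
    w = weight_decay_step(w, grad=0.0)
print(w)  # 0.99 ** 100, roughly 0.366
```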

Bayesian Learning. Consider weights w_i as random variables with prior p(w_i). Weight decay, ridge regression, regularization: cost = data-misfit + lambda x complexity. More about Bayesian methods in Chapter 14.

Dimensionality Reduction


Learning Time. Applications: sequence recognition (speech recognition), sequence reproduction (time-series prediction), sequence association. Network architectures: time-delay networks (Waibel et al., 1989), recurrent networks (Rumelhart et al., 1986).

Time-Delay Neural Networks

Recurrent Networks

Unfolding in Time