Single layer NN. Neuron Model

Size: px
Start display at page:

Download "Single layer NN. Neuron Model"

Transcription

1 Single layer NN We consider the simple architecture consisting of just one neuron. Generalization to a single layer with more neurons as illustrated below is easy because: M M The output units are independent among each other Each weight only affects one of the outputs Neural Networks NN Neuron Model The (McCulloch-Pitts) perceptron is a single layer NN with a non-linear ϕ, the sign function + if v 0 ϕ(v) = if v < 0 b (bias) x x w w v ϕ(v) y w n x n Neural Networks NN Rossella Cancelliere

2 Training How can we train a perceptron for a classification task? We try to find suitable values for the weights in such a way that the training examples are correctly classified. Geometrically, we try to find a hyperplane that separates the examples of the two classes. Neural Networks NN 3 Geometric View m i= The equation below describes a (hyper-)plane in the input space consisting of real valued m-dimensional vectors. The plane splits the input space into two regions, each of them describing one class. decision region for C w xi + w0 i = 0 decision boundary x w x + w x + w 0 >= 0 C C x w x + w x + w 0 = 0 Neural Networks NN 4 Rossella Cancelliere

3 Classification The perceptron is used for binary classification. Given training examples of classes C, C train the perceptron in such a way that it classifies correctly the training examples: If the output of the perceptron is + then the input is assigned to class C If the output is - then the input is assigned to C Neural Networks NN 5 The learning algorithm C a c ax + by + c = 0 y = x b b P(4,3) Example: C r : x y + = 0 y = x + x(n)=[-, 4, 3] w(n)=[,, -] w(n+)=x(n)+w(n) = [0, 5, ] y = 5 x T Training rule: w(n+)=w(n) if w(n) x(n) > 0 and x(n) belongs to class C T w(n+)=w(n) if w(n) x(n) < 0 and x(n) belongs to class C T w(n+)=w(n) -η(n) x(n) if w(n) x(n) > 0 and x(n) belongs to class C T w(n+)=w(n) + η(n) x(n) if w(n) x(n) < 0 and x(n) belongs to class C Neural Networks NN 6 Rossella Cancelliere 3

4 The learning algorithm n=; initialize w(n) randomly; while (there are misclassified training examples) Select a misclassified augmented example (x(n),d(n)) w(n+) = w(n) + ηd(n)x(n); n = n+; end-while; η = learning rate parameter (real number) Neural Networks NN 7 Convergence theorem (proof) Suppose the classes C, C are linearly separable (that is, there exists a hyper-plane that separates them). Then the perceptron algorithm applied to C C terminates successfully after a finite number of iterations. Proof: Consider the set C containing the inputs of C C transformed by replacing x with -x for each x with class label -. For simplicity assume w() = 0, η =. Let x() x(k) C be the sequence of inputs that have been used after k iterations. Then w() = w() + x() w(3) = w() + x() w(k+) = x() + + x(k) w(k+) = w(k) + x(k) Neural Networks NN 8 Rossella Cancelliere 4

5 Convergence theorem (proof) Since C and C are linearly separable then there exists w * such that w * T x > 0, x C. Let α = min w * T x Then w * T w(k+) = w * T x() + + w * T x(k) kα By the Cauchy-Schwarz inequality we get: w * w(k+) [w * T w(k+)] k α w(k+) (A) w * Neural Networks NN 9 Convergence theorem (proof) Now we consider another route: w(k+) = w(k) + x(k) w(k+) = w(k) + x(k) + w T (k)x(k) euclidean norm because x(k) is misclassified w(k+) w(k) + x(k) =0 w() w() + x() w(3) w() + x() w(k+) k i= x(i) Neural Networks NN 0 Rossella Cancelliere 5

6 Convergence theorem (proof) Let β = max x(n) x(n) C w(k+) k β (B) For sufficiently large values of k: (B) becomes in conflict with (A). Then k cannot be greater than k max such that (A) and (B) are both satisfied with the equality sign. kmaxα w* = kmaxβ kmax = β w* α β w * The algorithm terminates successfully in at most iterations, i.e. α and w( k + ) w( k ) 0 lim w( k ) = k w( k max ) lim = k Neural Networks NN Example Consider the -dimensional training set C C, C = {(,), (, -), (0, -)} with class label C = {(-,-), (-,), (0,)} with class label - Train a perceptron on C C Neural Networks NN Rossella Cancelliere 6

7 A possible implementation Consider the augmented training set C C, with first entry fixed to (to deal with the bias as extra weight): (,, ), (,, -), (, 0, -),(,-, -), (,-, ), (,0, ) Replace x with -x for all x C and use the following update rule: T w( n) + ηx( n) if w ( n) x( n) 0 w( n + ) = w( n) otherwise Epoch = the application of the update rule to each example of the training set. Then terminate the execution of the learning algorithm if the weights do not change after one epoch. Neural Networks NN 3 Execution the execution of the perceptron learning algorithm for each epoch is illustrated below, with w()=(,0,0), η =, and transformed inputs (,, ), (,, -), (,0, -), (-,, ), (-,, -), (-,0, -) Adjusted pattern Weight applied w(n) x(n) Update? New weight (,, ) (, 0, 0) No (, 0, 0) (,, -) (, 0, 0) No (, 0, 0) (,0, -) (, 0, 0) No (, 0, 0) (-,, ) (, 0, 0) - Yes (0,, ) (-,, -) (0,, ) 0 Yes (-,, 0) (-,0, -) (-,, 0) No (-,, 0) End epoch Neural Networks NN 4 Rossella Cancelliere 7

8 Execution Adjusted pattern Weight applied w(n) x(n) Update? New weight (,, ) (-,, 0) No (-,, 0) (,, -) (-,, 0) No (-,, 0) (,0, -) (-,, 0) - Yes (0,, -) (-,, ) (0,, -) No (0,, -) (-,, -) (0,, -) 3 No (0,, -) (-,0, -) (0,, -) No (0,, -) End epoch At epoch 3 no weight changes. (check!) stop execution of algorithm. Final weight vect.: (0,, -) decision hyperplane is x -x = 0. Neural Networks NN 5 Result C x Decision boundary: x -x = 0 - / x C Neural Networks NN 6 Rossella Cancelliere 8

9 Limitations The perceptron can only model linearly separable classes, like (those described by) the following Boolean functions: AND OR It cannot model the XOR! Neural Networks NN 7 Adaline: Adaptive Linear Element When the two classes are not linearly separable, it may be desirable to obtain a linear separator that minimizes the mean squared error. Adaline (Adaptive Linear Element): E uses a linear neuron model and the Least-Mean-Square (LMS) learning algorithm For the k-th example ( k ) ( k ) x, d (w) = j= 0 m ( k ) ( k ) ( d y ) = d x ( k ) ( k ) ( k ) ( ) Neural Networks NN 8 j w j Total error: N E tot = E k = ( k ) Rossella Cancelliere 9

10 Adaline The total error E_tot is the sum of the squared errors of all the examples. E_tot is a quadratic function of the weights whose derivative exists everywhere. Then incremental gradient descent may be used to minimize E_tot. At each iteration LMS algorithm selects an example and decreases the network error E of that example. Neural Networks NN 9 Incremental Gradient Descent start from an arbitrary point in the weight space the direction in which the error E of an example (as a function of the weights) is decreasing most rapidly is the opposite of the gradient of E: ( k ( E ) (w)) = w m take a small step (of size η) in that direction Neural Networks NN 0 E E, K, w ( k w(k + ) = w(k) η( E ) (w)) Rossella Cancelliere 0

11 Weights Update Rule Computation of Gradient(E): N ( k ) E tot E (w ) = w w j ( k ) ( k ) E (w ) w j w(k k = j E = y + ) ( k ) ( k ) (w ) y w Delta rule for weight update: = w(k) ( k ) ( k ) ( d y ) x k j Neural Networks NN j = + ηe ( k ) ( k ) x LMS learning algorithm k=; initialize w(k) randomly; while (E_tot unsatisfactory and k<max_iterations) ( k ) ( k ) Select an example x,d e ( ) ( w(k) x ) ( k) ( k) T ( k) = d w( k + ) = w(k) + η e ( k ) ( ) x k k = k+; end-while; η = learning rate parameter (real number) Neural Networks NN Rossella Cancelliere

12 Comparison: and Adaline Adaline Architecture Single-layer Single-layer Neuron model Learning algorithm Application Non-linear Minimize number of misclassified examples Linear classification linear Minimize total error Linear classification regression Neural Networks NN 3 Rossella Cancelliere

In the Name of God. Lecture 11: Single Layer Perceptrons

In the Name of God. Lecture 11: Single Layer Perceptrons 1 In the Name of God Lecture 11: Single Layer Perceptrons Perceptron: architecture We consider the architecture: feed-forward NN with one layer It is sufficient to study single layer perceptrons with just

More information

PMR5406 Redes Neurais e Lógica Fuzzy Aula 3 Single Layer Percetron

PMR5406 Redes Neurais e Lógica Fuzzy Aula 3 Single Layer Percetron PMR5406 Redes Neurais e Aula 3 Single Layer Percetron Baseado em: Neural Networks, Simon Haykin, Prentice-Hall, 2 nd edition Slides do curso por Elena Marchiori, Vrije Unviersity Architecture We consider

More information

Simple Neural Nets For Pattern Classification

Simple Neural Nets For Pattern Classification CHAPTER 2 Simple Neural Nets For Pattern Classification Neural Networks General Discussion One of the simplest tasks that neural nets can be trained to perform is pattern classification. In pattern classification

More information

Chapter 2 Single Layer Feedforward Networks

Chapter 2 Single Layer Feedforward Networks Chapter 2 Single Layer Feedforward Networks By Rosenblatt (1962) Perceptrons For modeling visual perception (retina) A feedforward network of three layers of units: Sensory, Association, and Response Learning

More information

Chapter ML:VI. VI. Neural Networks. Perceptron Learning Gradient Descent Multilayer Perceptron Radial Basis Functions

Chapter ML:VI. VI. Neural Networks. Perceptron Learning Gradient Descent Multilayer Perceptron Radial Basis Functions Chapter ML:VI VI. Neural Networks Perceptron Learning Gradient Descent Multilayer Perceptron Radial asis Functions ML:VI-1 Neural Networks STEIN 2005-2018 The iological Model Simplified model of a neuron:

More information

Radial-Basis Function Networks

Radial-Basis Function Networks Radial-Basis Function etworks A function is radial basis () if its output depends on (is a non-increasing function of) the distance of the input from a given stored vector. s represent local receptors,

More information

Radial-Basis Function Networks

Radial-Basis Function Networks Radial-Basis Function etworks A function is radial () if its output depends on (is a nonincreasing function of) the distance of the input from a given stored vector. s represent local receptors, as illustrated

More information

The Perceptron Algorithm 1

The Perceptron Algorithm 1 CS 64: Machine Learning Spring 5 College of Computer and Information Science Northeastern University Lecture 5 March, 6 Instructor: Bilal Ahmed Scribe: Bilal Ahmed & Virgil Pavlu Introduction The Perceptron

More information

The perceptron learning algorithm is one of the first procedures proposed for learning in neural network models and is mostly credited to Rosenblatt.

The perceptron learning algorithm is one of the first procedures proposed for learning in neural network models and is mostly credited to Rosenblatt. 1 The perceptron learning algorithm is one of the first procedures proposed for learning in neural network models and is mostly credited to Rosenblatt. The algorithm applies only to single layer models

More information

Linear models: the perceptron and closest centroid algorithms. D = {(x i,y i )} n i=1. x i 2 R d 9/3/13. Preliminaries. Chapter 1, 7.

Linear models: the perceptron and closest centroid algorithms. D = {(x i,y i )} n i=1. x i 2 R d 9/3/13. Preliminaries. Chapter 1, 7. Preliminaries Linear models: the perceptron and closest centroid algorithms Chapter 1, 7 Definition: The Euclidean dot product beteen to vectors is the expression d T x = i x i The dot product is also

More information

Neural Networks (Part 1) Goals for the lecture

Neural Networks (Part 1) Goals for the lecture Neural Networks (Part ) Mark Craven and David Page Computer Sciences 760 Spring 208 www.biostat.wisc.edu/~craven/cs760/ Some of the slides in these lectures have been adapted/borrowed from materials developed

More information

3.4 Linear Least-Squares Filter

3.4 Linear Least-Squares Filter X(n) = [x(1), x(2),..., x(n)] T 1 3.4 Linear Least-Squares Filter Two characteristics of linear least-squares filter: 1. The filter is built around a single linear neuron. 2. The cost function is the sum

More information

Engineering Part IIB: Module 4F10 Statistical Pattern Processing Lecture 5: Single Layer Perceptrons & Estimating Linear Classifiers

Engineering Part IIB: Module 4F10 Statistical Pattern Processing Lecture 5: Single Layer Perceptrons & Estimating Linear Classifiers Engineering Part IIB: Module 4F0 Statistical Pattern Processing Lecture 5: Single Layer Perceptrons & Estimating Linear Classifiers Phil Woodland: pcw@eng.cam.ac.uk Michaelmas 202 Engineering Part IIB:

More information

Linear Discrimination Functions

Linear Discrimination Functions Laurea Magistrale in Informatica Nicola Fanizzi Dipartimento di Informatica Università degli Studi di Bari November 4, 2009 Outline Linear models Gradient descent Perceptron Minimum square error approach

More information

Linear discriminant functions

Linear discriminant functions Andrea Passerini passerini@disi.unitn.it Machine Learning Discriminative learning Discriminative vs generative Generative learning assumes knowledge of the distribution governing the data Discriminative

More information

Rosenblatt s Perceptron

Rosenblatt s Perceptron M01_HAYK1399_SE_03_C01.QXD 10/1/08 7:03 PM Page 47 C H A P T E R 1 Rosenblatt s Perceptron ORGANIZATION OF THE CHAPTER The perceptron occupies a special place in the historical development of neural networks:

More information

Artificial Neural Network

Artificial Neural Network Artificial Neural Network Eung Je Woo Department of Biomedical Engineering Impedance Imaging Research Center (IIRC) Kyung Hee University Korea ejwoo@khu.ac.kr Neuron and Neuron Model McCulloch and Pitts

More information

ADALINE for Pattern Classification

ADALINE for Pattern Classification POLYTECHNIC UNIVERSITY Department of Computer and Information Science ADALINE for Pattern Classification K. Ming Leung Abstract: A supervised learning algorithm known as the Widrow-Hoff rule, or the Delta

More information

Course 395: Machine Learning - Lectures

Course 395: Machine Learning - Lectures Course 395: Machine Learning - Lectures Lecture 1-2: Concept Learning (M. Pantic) Lecture 3-4: Decision Trees & CBC Intro (M. Pantic & S. Petridis) Lecture 5-6: Evaluating Hypotheses (S. Petridis) Lecture

More information

Preliminaries. Definition: The Euclidean dot product between two vectors is the expression. i=1

Preliminaries. Definition: The Euclidean dot product between two vectors is the expression. i=1 90 8 80 7 70 6 60 0 8/7/ Preliminaries Preliminaries Linear models and the perceptron algorithm Chapters, T x + b < 0 T x + b > 0 Definition: The Euclidean dot product beteen to vectors is the expression

More information

Lecture 6. Notes on Linear Algebra. Perceptron

Lecture 6. Notes on Linear Algebra. Perceptron Lecture 6. Notes on Linear Algebra. Perceptron COMP90051 Statistical Machine Learning Semester 2, 2017 Lecturer: Andrey Kan Copyright: University of Melbourne This lecture Notes on linear algebra Vectors

More information

Intelligent Systems Discriminative Learning, Neural Networks

Intelligent Systems Discriminative Learning, Neural Networks Intelligent Systems Discriminative Learning, Neural Networks Carsten Rother, Dmitrij Schlesinger WS2014/2015, Outline 1. Discriminative learning 2. Neurons and linear classifiers: 1) Perceptron-Algorithm

More information

Linear & nonlinear classifiers

Linear & nonlinear classifiers Linear & nonlinear classifiers Machine Learning Hamid Beigy Sharif University of Technology Fall 1394 Hamid Beigy (Sharif University of Technology) Linear & nonlinear classifiers Fall 1394 1 / 34 Table

More information

The Perceptron Algorithm

The Perceptron Algorithm The Perceptron Algorithm Greg Grudic Greg Grudic Machine Learning Questions? Greg Grudic Machine Learning 2 Binary Classification A binary classifier is a mapping from a set of d inputs to a single output

More information

ARTIFICIAL NEURAL NETWORK PART I HANIEH BORHANAZAD

ARTIFICIAL NEURAL NETWORK PART I HANIEH BORHANAZAD ARTIFICIAL NEURAL NETWORK PART I HANIEH BORHANAZAD WHAT IS A NEURAL NETWORK? The simplest definition of a neural network, more properly referred to as an 'artificial' neural network (ANN), is provided

More information

Last updated: Oct 22, 2012 LINEAR CLASSIFIERS. J. Elder CSE 4404/5327 Introduction to Machine Learning and Pattern Recognition

Last updated: Oct 22, 2012 LINEAR CLASSIFIERS. J. Elder CSE 4404/5327 Introduction to Machine Learning and Pattern Recognition Last updated: Oct 22, 2012 LINEAR CLASSIFIERS Problems 2 Please do Problem 8.3 in the textbook. We will discuss this in class. Classification: Problem Statement 3 In regression, we are modeling the relationship

More information

CSC Neural Networks. Perceptron Learning Rule

CSC Neural Networks. Perceptron Learning Rule CSC 302 1.5 Neural Networks Perceptron Learning Rule 1 Objectives Determining the weight matrix and bias for perceptron networks with many inputs. Explaining what a learning rule is. Developing the perceptron

More information

Neural Networks. CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington

Neural Networks. CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington Neural Networks CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington 1 Perceptrons x 0 = 1 x 1 x 2 z = h w T x Output: z x D A perceptron

More information

Serious limitations of (single-layer) perceptrons: Cannot learn non-linearly separable tasks. Cannot approximate (learn) non-linear functions

Serious limitations of (single-layer) perceptrons: Cannot learn non-linearly separable tasks. Cannot approximate (learn) non-linear functions BACK-PROPAGATION NETWORKS Serious limitations of (single-layer) perceptrons: Cannot learn non-linearly separable tasks Cannot approximate (learn) non-linear functions Difficult (if not impossible) to design

More information

CSE 417T: Introduction to Machine Learning. Lecture 11: Review. Henry Chai 10/02/18

CSE 417T: Introduction to Machine Learning. Lecture 11: Review. Henry Chai 10/02/18 CSE 417T: Introduction to Machine Learning Lecture 11: Review Henry Chai 10/02/18 Unknown Target Function!: # % Training data Formal Setup & = ( ), + ),, ( -, + - Learning Algorithm 2 Hypothesis Set H

More information

Classification with Perceptrons. Reading:

Classification with Perceptrons. Reading: Classification with Perceptrons Reading: Chapters 1-3 of Michael Nielsen's online book on neural networks covers the basics of perceptrons and multilayer neural networks We will cover material in Chapters

More information

AN INTRODUCTION TO NEURAL NETWORKS. Scott Kuindersma November 12, 2009

AN INTRODUCTION TO NEURAL NETWORKS. Scott Kuindersma November 12, 2009 AN INTRODUCTION TO NEURAL NETWORKS Scott Kuindersma November 12, 2009 SUPERVISED LEARNING We are given some training data: We must learn a function If y is discrete, we call it classification If it is

More information

Neural Networks. Nicholas Ruozzi University of Texas at Dallas

Neural Networks. Nicholas Ruozzi University of Texas at Dallas Neural Networks Nicholas Ruozzi University of Texas at Dallas Handwritten Digit Recognition Given a collection of handwritten digits and their corresponding labels, we d like to be able to correctly classify

More information

ARTIFICIAL INTELLIGENCE. Artificial Neural Networks

ARTIFICIAL INTELLIGENCE. Artificial Neural Networks INFOB2KI 2017-2018 Utrecht University The Netherlands ARTIFICIAL INTELLIGENCE Artificial Neural Networks Lecturer: Silja Renooij These slides are part of the INFOB2KI Course Notes available from www.cs.uu.nl/docs/vakken/b2ki/schema.html

More information

Neural Networks. Xiaojin Zhu Computer Sciences Department University of Wisconsin, Madison. slide 1

Neural Networks. Xiaojin Zhu Computer Sciences Department University of Wisconsin, Madison. slide 1 Neural Networks Xiaoin Zhu erryzhu@cs.wisc.edu Computer Sciences Department University of Wisconsin, Madison slide 1 Terminator 2 (1991) JOHN: Can you learn? So you can be... you know. More human. Not

More information

Pattern Recognition Prof. P. S. Sastry Department of Electronics and Communication Engineering Indian Institute of Science, Bangalore

Pattern Recognition Prof. P. S. Sastry Department of Electronics and Communication Engineering Indian Institute of Science, Bangalore Pattern Recognition Prof. P. S. Sastry Department of Electronics and Communication Engineering Indian Institute of Science, Bangalore Lecture - 27 Multilayer Feedforward Neural networks with Sigmoidal

More information

Artificial Neural Networks The Introduction

Artificial Neural Networks The Introduction Artificial Neural Networks The Introduction 01001110 01100101 01110101 01110010 01101111 01101110 01101111 01110110 01100001 00100000 01110011 01101011 01110101 01110000 01101001 01101110 01100001 00100000

More information

Lecture 4: Perceptrons and Multilayer Perceptrons

Lecture 4: Perceptrons and Multilayer Perceptrons Lecture 4: Perceptrons and Multilayer Perceptrons Cognitive Systems II - Machine Learning SS 2005 Part I: Basic Approaches of Concept Learning Perceptrons, Artificial Neuronal Networks Lecture 4: Perceptrons

More information

Multilayer Perceptron

Multilayer Perceptron Outline Hong Chang Institute of Computing Technology, Chinese Academy of Sciences Machine Learning Methods (Fall 2012) Outline Outline I 1 Introduction 2 Single Perceptron 3 Boolean Function Learning 4

More information

1 Machine Learning Concepts (16 points)

1 Machine Learning Concepts (16 points) CSCI 567 Fall 2018 Midterm Exam DO NOT OPEN EXAM UNTIL INSTRUCTED TO DO SO PLEASE TURN OFF ALL CELL PHONES Problem 1 2 3 4 5 6 Total Max 16 10 16 42 24 12 120 Points Please read the following instructions

More information

From Binary to Multiclass Classification. CS 6961: Structured Prediction Spring 2018

From Binary to Multiclass Classification. CS 6961: Structured Prediction Spring 2018 From Binary to Multiclass Classification CS 6961: Structured Prediction Spring 2018 1 So far: Binary Classification We have seen linear models Learning algorithms Perceptron SVM Logistic Regression Prediction

More information

Optimization and Gradient Descent

Optimization and Gradient Descent Optimization and Gradient Descent INFO-4604, Applied Machine Learning University of Colorado Boulder September 12, 2017 Prof. Michael Paul Prediction Functions Remember: a prediction function is the function

More information

Artificial Neural Networks Examination, June 2005

Artificial Neural Networks Examination, June 2005 Artificial Neural Networks Examination, June 2005 Instructions There are SIXTY questions. (The pass mark is 30 out of 60). For each question, please select a maximum of ONE of the given answers (either

More information

Νεςπο-Ασαυήρ Υπολογιστική Neuro-Fuzzy Computing

Νεςπο-Ασαυήρ Υπολογιστική Neuro-Fuzzy Computing Νεςπο-Ασαυήρ Υπολογιστική Neuro-Fuzzy Computing ΗΥ418 Διδάσκων Δημήτριος Κατσαρός @ Τμ. ΗΜΜΥ Πανεπιστήμιο Θεσσαλίαρ Διάλεξη 4η 1 Perceptron s convergence 2 Proof of convergence Suppose that we have n training

More information

Vorlesung Neuronale Netze - Radiale-Basisfunktionen (RBF)-Netze -

Vorlesung Neuronale Netze - Radiale-Basisfunktionen (RBF)-Netze - Vorlesung Neuronale Netze - Radiale-Basisfunktionen (RBF)-Netze - SS 004 Holger Fröhlich (abg. Vorl. von S. Kaushik¹) Lehrstuhl Rechnerarchitektur, Prof. Dr. A. Zell ¹www.cse.iitd.ernet.in/~saroj Radial

More information

COMP 652: Machine Learning. Lecture 12. COMP Lecture 12 1 / 37

COMP 652: Machine Learning. Lecture 12. COMP Lecture 12 1 / 37 COMP 652: Machine Learning Lecture 12 COMP 652 Lecture 12 1 / 37 Today Perceptrons Definition Perceptron learning rule Convergence (Linear) support vector machines Margin & max margin classifier Formulation

More information

Midterm: CS 6375 Spring 2015 Solutions

Midterm: CS 6375 Spring 2015 Solutions Midterm: CS 6375 Spring 2015 Solutions The exam is closed book. You are allowed a one-page cheat sheet. Answer the questions in the spaces provided on the question sheets. If you run out of room for an

More information

Multilayer Perceptron = FeedForward Neural Network

Multilayer Perceptron = FeedForward Neural Network Multilayer Perceptron = FeedForward Neural Networ History Definition Classification = feedforward operation Learning = bacpropagation = local optimization in the space of weights Pattern Classification

More information

Neural Networks and the Back-propagation Algorithm

Neural Networks and the Back-propagation Algorithm Neural Networks and the Back-propagation Algorithm Francisco S. Melo In these notes, we provide a brief overview of the main concepts concerning neural networks and the back-propagation algorithm. We closely

More information

Inf2b Learning and Data

Inf2b Learning and Data Inf2b Learning and Data Lecture : Single layer Neural Networks () (Credit: Hiroshi Shimodaira Iain Murray and Steve Renals) Centre for Speech Technology Research (CSTR) School of Informatics University

More information

Machine Learning Support Vector Machines. Prof. Matteo Matteucci

Machine Learning Support Vector Machines. Prof. Matteo Matteucci Machine Learning Support Vector Machines Prof. Matteo Matteucci Discriminative vs. Generative Approaches 2 o Generative approach: we derived the classifier from some generative hypothesis about the way

More information

Revision: Neural Network

Revision: Neural Network Revision: Neural Network Exercise 1 Tell whether each of the following statements is true or false by checking the appropriate box. Statement True False a) A perceptron is guaranteed to perfectly learn

More information

Neural Networks biological neuron artificial neuron 1

Neural Networks biological neuron artificial neuron 1 Neural Networks biological neuron artificial neuron 1 A two-layer neural network Output layer (activation represents classification) Weighted connections Hidden layer ( internal representation ) Input

More information

Linear & nonlinear classifiers

Linear & nonlinear classifiers Linear & nonlinear classifiers Machine Learning Hamid Beigy Sharif University of Technology Fall 1396 Hamid Beigy (Sharif University of Technology) Linear & nonlinear classifiers Fall 1396 1 / 44 Table

More information

Lecture 5: Logistic Regression. Neural Networks

Lecture 5: Logistic Regression. Neural Networks Lecture 5: Logistic Regression. Neural Networks Logistic regression Comparison with generative models Feed-forward neural networks Backpropagation Tricks for training neural networks COMP-652, Lecture

More information

CSE 417T: Introduction to Machine Learning. Final Review. Henry Chai 12/4/18

CSE 417T: Introduction to Machine Learning. Final Review. Henry Chai 12/4/18 CSE 417T: Introduction to Machine Learning Final Review Henry Chai 12/4/18 Overfitting Overfitting is fitting the training data more than is warranted Fitting noise rather than signal 2 Estimating! "#$

More information

The Perceptron algorithm

The Perceptron algorithm The Perceptron algorithm Tirgul 3 November 2016 Agnostic PAC Learnability A hypothesis class H is agnostic PAC learnable if there exists a function m H : 0,1 2 N and a learning algorithm with the following

More information

Unit III. A Survey of Neural Network Model

Unit III. A Survey of Neural Network Model Unit III A Survey of Neural Network Model 1 Single Layer Perceptron Perceptron the first adaptive network architecture was invented by Frank Rosenblatt in 1957. It can be used for the classification of

More information

Binary Classification / Perceptron

Binary Classification / Perceptron Binary Classification / Perceptron Nicholas Ruozzi University of Texas at Dallas Slides adapted from David Sontag and Vibhav Gogate Supervised Learning Input: x 1, y 1,, (x n, y n ) x i is the i th data

More information

18.6 Regression and Classification with Linear Models

18.6 Regression and Classification with Linear Models 18.6 Regression and Classification with Linear Models 352 The hypothesis space of linear functions of continuous-valued inputs has been used for hundreds of years A univariate linear function (a straight

More information

4. Multilayer Perceptrons

4. Multilayer Perceptrons 4. Multilayer Perceptrons This is a supervised error-correction learning algorithm. 1 4.1 Introduction A multilayer feedforward network consists of an input layer, one or more hidden layers, and an output

More information

NN V: The generalized delta learning rule

NN V: The generalized delta learning rule NN V: The generalized delta learning rule We now focus on generalizing the delta learning rule for feedforward layered neural networks. The architecture of the two-layer network considered below is shown

More information

SPSS, University of Texas at Arlington. Topics in Machine Learning-EE 5359 Neural Networks

SPSS, University of Texas at Arlington. Topics in Machine Learning-EE 5359 Neural Networks Topics in Machine Learning-EE 5359 Neural Networks 1 The Perceptron Output: A perceptron is a function that maps D-dimensional vectors to real numbers. For notational convenience, we add a zero-th dimension

More information

MIDTERM: CS 6375 INSTRUCTOR: VIBHAV GOGATE October,

MIDTERM: CS 6375 INSTRUCTOR: VIBHAV GOGATE October, MIDTERM: CS 6375 INSTRUCTOR: VIBHAV GOGATE October, 23 2013 The exam is closed book. You are allowed a one-page cheat sheet. Answer the questions in the spaces provided on the question sheets. If you run

More information

NONLINEAR CLASSIFICATION AND REGRESSION. J. Elder CSE 4404/5327 Introduction to Machine Learning and Pattern Recognition

NONLINEAR CLASSIFICATION AND REGRESSION. J. Elder CSE 4404/5327 Introduction to Machine Learning and Pattern Recognition NONLINEAR CLASSIFICATION AND REGRESSION Nonlinear Classification and Regression: Outline 2 Multi-Layer Perceptrons The Back-Propagation Learning Algorithm Generalized Linear Models Radial Basis Function

More information

Machine Learning and Data Mining. Multi-layer Perceptrons & Neural Networks: Basics. Prof. Alexander Ihler

Machine Learning and Data Mining. Multi-layer Perceptrons & Neural Networks: Basics. Prof. Alexander Ihler + Machine Learning and Data Mining Multi-layer Perceptrons & Neural Networks: Basics Prof. Alexander Ihler Linear Classifiers (Perceptrons) Linear Classifiers a linear classifier is a mapping which partitions

More information

Vote. Vote on timing for night section: Option 1 (what we have now) Option 2. Lecture, 6:10-7:50 25 minute dinner break Tutorial, 8:15-9

Vote. Vote on timing for night section: Option 1 (what we have now) Option 2. Lecture, 6:10-7:50 25 minute dinner break Tutorial, 8:15-9 Vote Vote on timing for night section: Option 1 (what we have now) Lecture, 6:10-7:50 25 minute dinner break Tutorial, 8:15-9 Option 2 Lecture, 6:10-7 10 minute break Lecture, 7:10-8 10 minute break Tutorial,

More information

Linear Discriminant Functions

Linear Discriminant Functions Linear Discriminant Functions Linear discriminant functions and decision surfaces Definition It is a function that is a linear combination of the components of g() = t + 0 () here is the eight vector and

More information

Need for Deep Networks Perceptron. Can only model linear functions. Kernel Machines. Non-linearity provided by kernels

Need for Deep Networks Perceptron. Can only model linear functions. Kernel Machines. Non-linearity provided by kernels Need for Deep Networks Perceptron Can only model linear functions Kernel Machines Non-linearity provided by kernels Need to design appropriate kernels (possibly selecting from a set, i.e. kernel learning)

More information

Linear models and the perceptron algorithm

Linear models and the perceptron algorithm 8/5/6 Preliminaries Linear models and the perceptron algorithm Chapters, 3 Definition: The Euclidean dot product beteen to vectors is the expression dx T x = i x i The dot product is also referred to as

More information

More about the Perceptron

More about the Perceptron More about the Perceptron CMSC 422 MARINE CARPUAT marine@cs.umd.edu Credit: figures by Piyush Rai and Hal Daume III Recap: Perceptron for binary classification Classifier = hyperplane that separates positive

More information

Linear smoother. ŷ = S y. where s ij = s ij (x) e.g. s ij = diag(l i (x))

Linear smoother. ŷ = S y. where s ij = s ij (x) e.g. s ij = diag(l i (x)) Linear smoother ŷ = S y where s ij = s ij (x) e.g. s ij = diag(l i (x)) 2 Online Learning: LMS and Perceptrons Partially adapted from slides by Ryan Gabbard and Mitch Marcus (and lots original slides by

More information

Artificial Neural Networks

Artificial Neural Networks Artificial Neural Networks Math 596 Project Summary Fall 2016 Jarod Hart 1 Overview Artificial Neural Networks (ANNs) are machine learning algorithms that imitate neural function. There is a vast theory

More information

Lecture 7 Artificial neural networks: Supervised learning

Lecture 7 Artificial neural networks: Supervised learning Lecture 7 Artificial neural networks: Supervised learning Introduction, or how the brain works The neuron as a simple computing element The perceptron Multilayer neural networks Accelerated learning in

More information

Data Mining Part 5. Prediction

Data Mining Part 5. Prediction Data Mining Part 5. Prediction 5.5. Spring 2010 Instructor: Dr. Masoud Yaghini Outline How the Brain Works Artificial Neural Networks Simple Computing Elements Feed-Forward Networks Perceptrons (Single-layer,

More information

Neural networks III: The delta learning rule with semilinear activation function

Neural networks III: The delta learning rule with semilinear activation function Neural networks III: The delta learning rule with semilinear activation function The standard delta rule essentially implements gradient descent in sum-squared error for linear activation functions. We

More information

Pattern Classification

Pattern Classification Pattern Classification All materials in these slides were taen from Pattern Classification (2nd ed) by R. O. Duda,, P. E. Hart and D. G. Stor, John Wiley & Sons, 2000 with the permission of the authors

More information

Artificial Neural Networks" and Nonparametric Methods" CMPSCI 383 Nov 17, 2011!

Artificial Neural Networks and Nonparametric Methods CMPSCI 383 Nov 17, 2011! Artificial Neural Networks" and Nonparametric Methods" CMPSCI 383 Nov 17, 2011! 1 Todayʼs lecture" How the brain works (!)! Artificial neural networks! Perceptrons! Multilayer feed-forward networks! Error

More information

Part 8: Neural Networks

Part 8: Neural Networks METU Informatics Institute Min720 Pattern Classification ith Bio-Medical Applications Part 8: Neural Netors - INTRODUCTION: BIOLOGICAL VS. ARTIFICIAL Biological Neural Netors A Neuron: - A nerve cell as

More information

Artificial Neural Networks Examination, March 2004

Artificial Neural Networks Examination, March 2004 Artificial Neural Networks Examination, March 2004 Instructions There are SIXTY questions (worth up to 60 marks). The exam mark (maximum 60) will be added to the mark obtained in the laborations (maximum

More information

ECE521 Lectures 9 Fully Connected Neural Networks

ECE521 Lectures 9 Fully Connected Neural Networks ECE521 Lectures 9 Fully Connected Neural Networks Outline Multi-class classification Learning multi-layer neural networks 2 Measuring distance in probability space We learnt that the squared L2 distance

More information

GRADIENT DESCENT AND LOCAL MINIMA

GRADIENT DESCENT AND LOCAL MINIMA GRADIENT DESCENT AND LOCAL MINIMA 25 20 5 15 10 3 2 1 1 2 5 2 2 4 5 5 10 Suppose for both functions above, gradient descent is started at the point marked red. It will walk downhill as far as possible,

More information

CSE 352 (AI) LECTURE NOTES Professor Anita Wasilewska. NEURAL NETWORKS Learning

CSE 352 (AI) LECTURE NOTES Professor Anita Wasilewska. NEURAL NETWORKS Learning CSE 352 (AI) LECTURE NOTES Professor Anita Wasilewska NEURAL NETWORKS Learning Neural Networks Classifier Short Presentation INPUT: classification data, i.e. it contains an classification (class) attribute.

More information

A MULTILAYER EXTENSION OF THE SIMILARITY NEURAL NETWORK

A MULTILAYER EXTENSION OF THE SIMILARITY NEURAL NETWORK Master in Artificial Intelligence (UPC- URV- UB) Master of Science Thesis A MULTILAYER EXTENSION OF THE SIMILARITY NEURAL NETWORK David Buchaca Prats Advisor: Lluís Belanche November 24, 2014 Acknowledgements

More information

Introduction: The Perceptron

Introduction: The Perceptron Introduction: The Perceptron Haim Sompolinsy, MIT October 4, 203 Perceptron Architecture The simplest type of perceptron has a single layer of weights connecting the inputs and output. Formally, the perceptron

More information

Lecture 9: Large Margin Classifiers. Linear Support Vector Machines

Lecture 9: Large Margin Classifiers. Linear Support Vector Machines Lecture 9: Large Margin Classifiers. Linear Support Vector Machines Perceptrons Definition Perceptron learning rule Convergence Margin & max margin classifiers (Linear) support vector machines Formulation

More information

Last update: October 26, Neural networks. CMSC 421: Section Dana Nau

Last update: October 26, Neural networks. CMSC 421: Section Dana Nau Last update: October 26, 207 Neural networks CMSC 42: Section 8.7 Dana Nau Outline Applications of neural networks Brains Neural network units Perceptrons Multilayer perceptrons 2 Example Applications

More information

9 Classification. 9.1 Linear Classifiers

9 Classification. 9.1 Linear Classifiers 9 Classification This topic returns to prediction. Unlike linear regression where we were predicting a numeric value, in this case we are predicting a class: winner or loser, yes or no, rich or poor, positive

More information

Machine Learning for Large-Scale Data Analysis and Decision Making A. Neural Networks Week #6

Machine Learning for Large-Scale Data Analysis and Decision Making A. Neural Networks Week #6 Machine Learning for Large-Scale Data Analysis and Decision Making 80-629-17A Neural Networks Week #6 Today Neural Networks A. Modeling B. Fitting C. Deep neural networks Today s material is (adapted)

More information

Need for Deep Networks Perceptron. Can only model linear functions. Kernel Machines. Non-linearity provided by kernels

Need for Deep Networks Perceptron. Can only model linear functions. Kernel Machines. Non-linearity provided by kernels Need for Deep Networks Perceptron Can only model linear functions Kernel Machines Non-linearity provided by kernels Need to design appropriate kernels (possibly selecting from a set, i.e. kernel learning)

More information

Mining Classification Knowledge

Mining Classification Knowledge Mining Classification Knowledge Remarks on NonSymbolic Methods JERZY STEFANOWSKI Institute of Computing Sciences, Poznań University of Technology COST Doctoral School, Troina 2008 Outline 1. Bayesian classification

More information

Machine Learning. Linear Models. Fabio Vandin October 10, 2017

Machine Learning. Linear Models. Fabio Vandin October 10, 2017 Machine Learning Linear Models Fabio Vandin October 10, 2017 1 Linear Predictors and Affine Functions Consider X = R d Affine functions: L d = {h w,b : w R d, b R} where ( d ) h w,b (x) = w, x + b = w

More information

Linear Classifiers. Michael Collins. January 18, 2012

Linear Classifiers. Michael Collins. January 18, 2012 Linear Classifiers Michael Collins January 18, 2012 Today s Lecture Binary classification problems Linear classifiers The perceptron algorithm Classification Problems: An Example Goal: build a system that

More information

Learning from Examples

Learning from Examples Learning from Examples Data fitting Decision trees Cross validation Computational learning theory Linear classifiers Neural networks Nonparametric methods: nearest neighbor Support vector machines Ensemble

More information

Artificial Neural Networks

Artificial Neural Networks Artificial Neural Networks 鮑興國 Ph.D. National Taiwan University of Science and Technology Outline Perceptrons Gradient descent Multi-layer networks Backpropagation Hidden layer representations Examples

More information

Introduction to Neural Networks

Introduction to Neural Networks Introduction to Neural Networks What are (Artificial) Neural Networks? Models of the brain and nervous system Highly parallel Process information much more like the brain than a serial computer Learning

More information

Unit 8: Introduction to neural networks. Perceptrons

Unit 8: Introduction to neural networks. Perceptrons Unit 8: Introduction to neural networks. Perceptrons D. Balbontín Noval F. J. Martín Mateos J. L. Ruiz Reina A. Riscos Núñez Departamento de Ciencias de la Computación e Inteligencia Artificial Universidad

More information

Learning with multiple models. Boosting.

Learning with multiple models. Boosting. CS 2750 Machine Learning Lecture 21 Learning with multiple models. Boosting. Milos Hauskrecht milos@cs.pitt.edu 5329 Sennott Square Learning with multiple models: Approach 2 Approach 2: use multiple models

More information

The Perceptron. Volker Tresp Summer 2014

The Perceptron. Volker Tresp Summer 2014 The Perceptron Volker Tresp Summer 2014 1 Introduction One of the first serious learning machines Most important elements in learning tasks Collection and preprocessing of training data Definition of a

More information

Neural Networks Lecture 2:Single Layer Classifiers

Neural Networks Lecture 2:Single Layer Classifiers Neural Networks Lecture 2:Single Layer Classifiers H.A Talebi Farzaneh Abdollahi Department of Electrical Engineering Amirkabir University of Technology Winter 2011. A. Talebi, Farzaneh Abdollahi Neural

More information