In the Name of God. Lecture 11: Single Layer Perceptrons
2 Perceptron: architecture We consider a feed-forward neural network with one layer. It is sufficient to study single layer perceptrons with just one neuron:
3 Single layer perceptrons Generalization to single layer perceptrons with more neurons is easy because the output units are independent of each other: each weight affects only one of the outputs.
4 Perceptron: Neuron Model The (McCulloch-Pitts) perceptron is a single layer NN with a non-linear activation function, the sign function.
5 Perceptron The perceptron is based on a nonlinear neuron. With the bias absorbed into the weight vector, $x(n) = [+1, x_1(n), x_2(n), \ldots, x_m(n)]^T$ and $w(n) = [b(n), w_1(n), w_2(n), \ldots, w_m(n)]^T$. The induced local field $v(n) = w^T(n)\,x(n)$ defines a hyperplane: $v = \sum_{i=1}^{m} w_i x_i + b$, and the output is $y = \mathrm{sign}\!\left(b + \sum_{i=1}^{m} w_i x_i\right)$.
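As an aside not in the original slides, a minimal Python sketch of this neuron model with the bias absorbed into the weight vector as above; the weights shown are an illustrative choice that happens to implement AND on bipolar inputs:

```python
import numpy as np

def perceptron_output(w, x):
    """Sign-activation neuron. Both w and x carry the bias term:
    x[0] = +1 and w[0] = b, so w . x = b + sum_i w_i * x_i."""
    v = np.dot(w, x)           # induced local field v(n) = w^T(n) x(n)
    return 1 if v > 0 else -1  # sign non-linearity

# w = [b, w1, w2]; these weights happen to implement bipolar AND
w = np.array([-0.5, 1.0, 1.0])
print(perceptron_output(w, np.array([1.0,  1.0, 1.0])))  # +1
print(perceptron_output(w, np.array([1.0, -1.0, 1.0])))  # -1
```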
6 Perceptron: Applications The perceptron is used for classification: classify correctly a set of examples into one of two classes C1 and C2. If the output of the perceptron is +1, the input is assigned to class C1; if the output is -1, the input is assigned to C2.
7 Perceptron Training How can we train a perceptron for a classification task? We try to find suitable values for the weights such that the training examples are correctly classified. Geometrically, we try to find a hyperplane that separates the examples of the two classes.
8 Perceptron: Classification The equation $\sum_{i=1}^{m} w_i x_i + b = 0$ describes a hyperplane in the input space. This hyperplane is used to separate the two classes C1 and C2. In two dimensions: the decision region for C1 is $w_1 x_1 + w_2 x_2 + b > 0$, the decision region for C2 is $w_1 x_1 + w_2 x_2 + b \le 0$, and the decision boundary is the line $w_1 x_1 + w_2 x_2 + b = 0$.
9 Example: AND Here is a representation of the AND function (in the figure, white means false and black means true for the output; -1 means false and +1 means true for the input). A linear decision surface separates false from true instances: -1 AND -1 = false, -1 AND +1 = false, +1 AND -1 = false, +1 AND +1 = true.
10 Example: AND Watch a perceptron learn the AND function (animation).
11 Example: XOR Here's the XOR function: -1 XOR -1 = false, -1 XOR +1 = true, +1 XOR -1 = true, +1 XOR +1 = false. Perceptrons cannot learn such linearly inseparable functions.
12 Example: XOR Watch a perceptron try to learn XOR
13 Perceptron learning If the n-th member of the training set, x(n), is correctly classified by the weight vector w(n) computed at the n-th iteration of the algorithm, no correction is made to the weight vector:
$w(n+1) = w(n)$ if $w^T x(n) > 0$ and x(n) belongs to class C1;
$w(n+1) = w(n)$ if $w^T x(n) \le 0$ and x(n) belongs to class C2.
Otherwise, the weight vector of the perceptron is updated:
$w(n+1) = w(n) - \eta(n)\,x(n)$ if $w^T(n)\,x(n) > 0$ and x(n) belongs to class C2;
$w(n+1) = w(n) + \eta(n)\,x(n)$ if $w^T(n)\,x(n) \le 0$ and x(n) belongs to class C1.
The learning-rate parameter $\eta(n)$ controls the adjustment applied to the weight vector at iteration n.
14 Perceptron learning Or, simply: $w(n+1) = w(n) + \eta(n)\,e(n)\,x(n)$, where $e(n) = d(n) - y(n)$ is the error. The perceptron uses the McCulloch-Pitts formal model of a neuron. The learning process is performed for a finite number of iterations and then stops.
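A sketch of this error-correction rule in Python, trained on the bipolar AND example from the earlier slides; the function name, the learning rate $\eta = 0.5$, and the epoch cap are illustrative choices, not from the lecture:

```python
import numpy as np

def train_perceptron(X, d, eta=0.5, max_epochs=100):
    """Error-correction learning: w(n+1) = w(n) + eta * e(n) * x(n),
    where e(n) = d(n) - y(n). Rows of X already include x0 = +1."""
    w = np.zeros(X.shape[1])
    for epoch in range(max_epochs):
        errors = 0
        for x, target in zip(X, d):
            y = 1 if np.dot(w, x) > 0 else -1
            e = target - y              # e(n) = d(n) - y(n), in {-2, 0, +2}
            if e != 0:
                w = w + eta * e * x     # update only on misclassification
                errors += 1
        if errors == 0:                 # converged: a full epoch with no updates
            return w, epoch
    return w, max_epochs

# AND with bipolar inputs/targets; bias input x0 = +1 prepended
X = np.array([[1, -1, -1], [1, -1, 1], [1, 1, -1], [1, 1, 1]], dtype=float)
d = np.array([-1, -1, -1, 1])
w, epochs = train_perceptron(X, d)
print(w, epochs)  # a separating hyperplane for AND; XOR targets never converge
```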
15 Learning algorithm Epoch: presentation of the entire training set to the neural network. In the case of the AND function, an epoch consists of four sets of inputs being presented to the network (i.e. [0,0], [0,1], [1,0], [1,1]). Error: the amount by which the value output by the network differs from the target value. For example, if we required the network to output 0 and it outputs 1, then Error = -1.
16 Decision boundary In simple cases, divide feature space by drawing a hyperplane across it, known as a decision boundary. Discriminant function: returns different values on opposite sides (a straight line in 2D). Problems which can be classified in this way are linearly separable. (Figure: a linearly separable point set split by a straight line, beside a non-linearly separable one.)
17 Example Consider the 2-dimensional training set C1 ∪ C2,
C1 = {(1, 1, 1), (1, 1, -1), (1, 0, -1)} with class label 1
C2 = {(1, -1, -1), (1, -1, 1), (1, 0, 1)} with class label 0
Train a perceptron on C1 ∪ C2. The first pass is as follows:

Input       | Weight     | Desired | Actual | Update? | New weight
(1, 1, 1)   | (1, 0, 0)  | 1       | 1      | No      | (1, 0, 0)
(1, 1, -1)  | (1, 0, 0)  | 1       | 1      | No      | (1, 0, 0)
(1, 0, -1)  | (1, 0, 0)  | 1       | 1      | No      | (1, 0, 0)
(1, -1, -1) | (1, 0, 0)  | 0       | 1      | Yes     | (0, 1, 1)
(1, -1, 1)  | (0, 1, 1)  | 0       | 0      | No      | (0, 1, 1)
(1, 0, 1)   | (0, 1, 1)  | 0       | 1      | Yes     | (-1, 1, 0)
18 Example C1: {(1, 1, 1), (1, 1, -1), (1, 0, -1)}; C2: {(1, -1, -1), (1, -1, 1), (1, 0, 1)}. Fill out this table sequentially (second pass):

Input       | Weight      | Desired | Actual | Update? | New weight
(1, 1, 1)   | (-1, 1, 0)  | 1       | 0      | Yes     | (0, 2, 1)
(1, 1, -1)  | (0, 2, 1)   | 1       | 1      | No      | (0, 2, 1)
(1, 0, -1)  | (0, 2, 1)   | 1       | 0      | Yes     | (1, 2, 0)
(1, -1, -1) | (1, 2, 0)   | 0       | 0      | No      | (1, 2, 0)
(1, -1, 1)  | (1, 2, 0)   | 0       | 0      | No      | (1, 2, 0)
(1, 0, 1)   | (1, 2, 0)   | 0       | 1      | Yes     | (0, 2, -1)
19 Example C1: {(1, 1, 1), (1, 1, -1), (1, 0, -1)}; C2: {(1, -1, -1), (1, -1, 1), (1, 0, 1)}. Fill out this table sequentially (third pass):

Input       | Weight      | Desired | Actual | Update? | New weight
(1, 1, 1)   | (0, 2, -1)  | 1       | 1      | No      | (0, 2, -1)
(1, 1, -1)  | (0, 2, -1)  | 1       | 1      | No      | (0, 2, -1)
(1, 0, -1)  | (0, 2, -1)  | 1       | 1      | No      | (0, 2, -1)
(1, -1, -1) | (0, 2, -1)  | 0       | 0      | No      | (0, 2, -1)
(1, -1, 1)  | (0, 2, -1)  | 0       | 0      | No      | (0, 2, -1)
(1, 0, 1)   | (0, 2, -1)  | 0       | 0      | No      | (0, 2, -1)

At epoch 3 no weight changes occur, so execution of the algorithm stops. Final weight vector: (0, 2, -1). The decision hyperplane is $2x_1 - x_2 = 0$.
20 Example: final results (Figure: the decision boundary $2x_1 - x_2 = 0$ in the $(x_1, x_2)$ plane separates the C1 points from the C2 points, with the weight vector w perpendicular to the boundary.)
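The three passes above can be checked mechanically. A sketch assuming, as the tables do, fixed-increment updates ($\eta = 1$: add the input when the desired output is 1, subtract it when it is 0) and a threshold output of 1 when $w^T x > 0$ and 0 otherwise:

```python
import numpy as np

C1 = [(1, 1, 1), (1, 1, -1), (1, 0, -1)]    # desired output 1
C2 = [(1, -1, -1), (1, -1, 1), (1, 0, 1)]   # desired output 0
data = [(np.array(x, float), 1) for x in C1] + \
       [(np.array(x, float), 0) for x in C2]

w = np.array([1.0, 0.0, 0.0])  # initial weight from the first table row
for epoch in range(1, 10):
    changed = False
    for x, d in data:
        y = 1 if np.dot(w, x) > 0 else 0    # threshold output, as in the tables
        if y != d:
            w = w + x if d == 1 else w - x  # fixed-increment correction (eta = 1)
            changed = True
    if not changed:
        print("converged at epoch", epoch, "w =", w)  # epoch 3, w = [0, 2, -1]
        break
```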
21 Some Unhappiness About Perceptron Training The perceptron learning rule fails to converge if the examples are not linearly separable. It can only model linearly separable classes, such as those described by the Boolean functions AND and OR, but not XOR. When a perceptron gives the right answer, no learning takes place. Anything below the threshold is interpreted as "no", even if it is just below the threshold; it might be better to train the neuron based on how far below the threshold it is.
22 Perceptron: Convergence Theorem Suppose the datasets C1 and C2 are linearly separable. The perceptron learning algorithm converges after $n_0$ iterations, with $n_0 \le n_{\max}$, on the training set C1 ∪ C2. Proof: suppose $x \in C_1 \Rightarrow$ output $= 1$ and $x \in C_2 \Rightarrow$ output $= -1$. For simplicity assume $w(1) = 0$ and $\eta = 1$. Suppose the perceptron incorrectly classifies $x(1), \ldots, x(n) \in C_1$, i.e. $w^T(k)\,x(k) \le 0$ for each k. The error-correction rule $w(k+1) = w(k) + x(k)$ then gives $w(2) = w(1) + x(1)$, $w(3) = w(2) + x(2)$, and so on, so $w(n+1) = x(1) + \cdots + x(n)$.
23 Perceptron: Convergence Theorem Let $w_0$ be such that $w_0^T x(n) > 0$ for all $x(n) \in C_1$; $w_0$ exists because C1 and C2 are linearly separable. Let $\alpha = \min_{x(n) \in C_1} w_0^T x(n)$. Then $w_0^T w(n+1) = w_0^T x(1) + \cdots + w_0^T x(n) \ge n\alpha$. By the Cauchy-Schwarz inequality, $\|w_0\|^2 \|w(n+1)\|^2 \ge [w_0^T w(n+1)]^2$, so
$\|w(n+1)\|^2 \ge \dfrac{n^2 \alpha^2}{\|w_0\|^2}$.  (A)
24 Perceptron: Convergence Theorem Now we consider another route. From $w(k+1) = w(k) + x(k)$, taking squared Euclidean norms, $\|w(k+1)\|^2 = \|w(k)\|^2 + \|x(k)\|^2 + 2\,w^T(k)\,x(k)$. Because $x(k)$ is misclassified, $w^T(k)\,x(k) \le 0$, hence $\|w(k+1)\|^2 \le \|w(k)\|^2 + \|x(k)\|^2$ for $k = 1, \ldots, n$. With $w(1) = 0$: $\|w(2)\|^2 \le \|x(1)\|^2$, $\|w(3)\|^2 \le \|w(2)\|^2 + \|x(2)\|^2$, and so on, giving $\|w(n+1)\|^2 \le \sum_{k=1}^{n} \|x(k)\|^2$.
25 Perceptron: Convergence Theorem Let $\beta = \max_{x(n) \in C_1} \|x(n)\|^2$. Then $\|w(n+1)\|^2 \le n\beta$.  (B) For sufficiently large values of n, (B) comes into conflict with (A). Thus n cannot be greater than the value $n_{\max}$ for which (A) and (B) are both satisfied with equality: $\dfrac{n_{\max}^2 \alpha^2}{\|w_0\|^2} = n_{\max}\,\beta$. The perceptron convergence algorithm therefore terminates in at most $n_{\max} = \dfrac{\beta\,\|w_0\|^2}{\alpha^2}$ iterations.
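To make the bound concrete, a sketch that evaluates $n_{\max} = \beta\|w_0\|^2/\alpha^2$ on the worked example's data; taking the solution vector $w_0 = (0, 2, -1)$ found there is an illustrative choice, and C2 is folded into C1 by negating its vectors, the standard reduction behind the proof:

```python
import numpy as np

C1 = np.array([[1, 1, 1], [1, 1, -1], [1, 0, -1]], dtype=float)
C2 = np.array([[1, -1, -1], [1, -1, 1], [1, 0, 1]], dtype=float)
samples = np.vstack([C1, -C2])      # negate C2 so w0^T x > 0 must hold for all
w0 = np.array([0.0, 2.0, -1.0])     # a solution vector (from the worked example)

alpha = np.min(samples @ w0)               # alpha = min_n w0^T x(n)   -> 1.0
beta = np.max(np.sum(samples**2, axis=1))  # beta  = max_n ||x(n)||^2  -> 3.0
n_max = beta * np.dot(w0, w0) / alpha**2
print(n_max)  # 15.0: at most 15 weight corrections before convergence
```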
26 Fixed-increment convergence theorem Let the subsets of training vectors C1 and C2 be linearly separable, and let the inputs presented to the perceptron originate from these two subsets. The perceptron converges after some $n_0$ iterations, in the sense that $w(n_0) = w(n_0+1) = w(n_0+2) = \cdots$ is a solution vector, for $n_0 \le n_{\max}$.
27 Learning-Rate Annealing Schedules When the learning rate is large, the trajectory may follow a zigzagging path; when it is small, the procedure may be slow. The simplest choice is a constant learning-rate parameter, $\eta(n) = \eta_0$. The stochastic-approximation schedule is time-varying: $\eta(n) = c/n$ (c a constant); when c is large, there is a danger of parameter blowup for small n.
28 Learning-Rate Annealing Schedules Search-then-converge schedule: $\eta(n) = \dfrac{\eta_0}{1 + n/\tau}$ ($\eta_0$ and $\tau$ constants). In the early stages, the learning-rate parameter is approximately equal to $\eta_0$ (the search phase); for n large compared to the search-time constant $\tau$, the learning-rate parameter approximates $c/n$ (the converge phase).
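A sketch of the two schedules side by side; the constants c, $\eta_0$, and $\tau$ are illustrative values:

```python
def stochastic_approximation(n, c=1.0):
    """eta(n) = c / n: shrinks over time, but a large c risks
    parameter blowup during the first few iterations."""
    return c / n

def search_then_converge(n, eta0=0.1, tau=100.0):
    """eta(n) = eta0 / (1 + n / tau): roughly constant (eta0) while
    n << tau (search phase), then decays like c / n (converge phase)."""
    return eta0 / (1.0 + n / tau)

for n in (1, 10, 100, 1000):
    print(n, stochastic_approximation(n), search_then_converge(n))
```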
29 Summary 1. Initialization: set $w(0) = 0$. 2. Activation: at time step n, activate the perceptron by applying the continuous-valued input vector $x(n)$ and desired response $d(n)$. 3. Computation of actual response: $y(n) = \mathrm{sgn}[w^T(n)\,x(n)]$. 4. Adaptation of weight vector: $w(n+1) = w(n) + \eta\,[d(n) - y(n)]\,x(n)$, where $d(n) = +1$ if $x(n)$ belongs to class C1 and $d(n) = -1$ if $x(n)$ belongs to class C2. 5. Continuation: increment time step n and go back to step 2.
30 Perceptron Learning Example (Figure: a neuron with inputs $x_1, \ldots, x_n$, weights $w_1, \ldots, w_n$, and a bias input $x_0 = -1$ with weight $w_0$.) The activation is $a = \sum_{i=0}^{n} w_i x_i$, and the output is $y = 1$ if $a \ge 0$ and $y = 0$ if $a < 0$; thus $y = \mathrm{sign}(a) = 0$ or 1.
31 Perceptron Learning Example (Figure: a worked trace of perceptron learning on the samples $(x, t) = ([-1,-1], 1)$, $([2,1], -1)$, and $([1,1], 1)$ with targets $t \in \{+1, -1\}$, showing the output $o = \mathrm{sgn}(\cdot)$ and the weight vector w after each update.)
32 Adaline: Adaptive Linear Element The output y is a linear combination of the inputs: $y(n) = \sum_{j=0}^{m} x_j(n)\,w_j(n)$. (Figure: a linear neuron with inputs $x_0, x_1, \ldots, x_m$ and weights $w_0, w_1, \ldots, w_m$.)
33 Adaline: Adaptive Linear Element Adaline uses a linear neuron model and the Least-Mean-Square (LMS) learning algorithm. The idea: try to minimize the squared error, which is a function of the weights: $E(w(n)) = \frac{1}{2} e^2(n)$, where $e(n) = d(n) - \sum_{j=0}^{m} x_j(n)\,w_j(n)$. We can find the minimum of the error function E by means of the steepest-descent method.
34 Steepest Descent Method Start with an arbitrary point; find the direction in which E decreases most rapidly, which is opposite to the gradient $\nabla E(w) = \left[\frac{\partial E}{\partial w_1}, \ldots, \frac{\partial E}{\partial w_m}\right]^T$; make a small step in that direction: $w(n+1) = w(n) - \eta\,\nabla E(w(n))$.
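A sketch of Adaline trained by LMS / steepest descent on the instantaneous squared error; the gradient $\partial E/\partial w_j = -e(n)\,x_j(n)$ follows from the definitions above, while the learning rate, epoch count, and toy regression data are illustrative:

```python
import numpy as np

def train_adaline(X, d, eta=0.05, epochs=50):
    """LMS: minimize E = 0.5 * e(n)^2 with e(n) = d(n) - w^T x(n).
    Steepest-descent step: w <- w - eta * grad(E) = w + eta * e(n) * x(n)."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x, target in zip(X, d):
            e = target - np.dot(w, x)  # linear output: no sign function here
            w = w + eta * e * x        # gradient step on the instantaneous error
    return w

# x0 = +1 carries the bias; targets follow d = 1 + 2 * x1
X = np.array([[1, 0.0], [1, 1.0], [1, 2.0], [1, 3.0]])
d = np.array([1.0, 3.0, 5.0, 7.0])
print(train_adaline(X, d))  # approaches [1, 2]
```

Unlike the perceptron rule, LMS updates on every sample (the error is rarely exactly zero), which is what makes Adaline usable for regression as well as classification.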
35 Comparison of Perceptron and Adaline

                   | Perceptron                     | Adaline
Architecture       | Single-layer                   | Single-layer
Neuron model       | Non-linear                     | Linear
Learning algorithm | Minimize number of             | Minimize total
                   | misclassified examples         | squared error
Application        | Linear classification          | Linear classification, regression
36 Reading S. Haykin, Neural Networks: A Comprehensive Foundation, 2007 (Chapter 4).