NEURAL NETWORKS (10/25/16). Admin: Assignment 7; Class 11/22; Schedule for the rest of the semester. Perceptron learning algorithm. Our Nervous System.


Admin (10/25/16): Assignment 7. Class 11/22. Schedule for the rest of the semester.

NEURAL NETWORKS
David Kauchak, CS158, Fall 2016

Perceptron learning algorithm:

repeat until convergence (or for some # of iterations):
    for each training example (f1, f2, ..., fn, label):
        prediction = b + Σ_i w_i f_i
        if prediction * label ≤ 0:   // they don't agree
            for each w_i:
                w_i = w_i + f_i * label
            b = b + label

Why is it called the perceptron learning algorithm if what it learns is a line? Why not the line learning algorithm?

Our Nervous System: [diagram of a neuron: dendrites, synapses (+ + + - -), axon]. What do you know?
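A minimal sketch of the update rule above in Python; the function name, iteration count, and toy dataset are illustrative choices, not from the slides.

```python
import numpy as np

def train_perceptron(examples, num_iters=100):
    """Perceptron learning algorithm: examples are (features, label) pairs
    with label in {-1, +1}."""
    d = len(examples[0][0])
    w = np.zeros(d)                      # weights w_i
    b = 0.0                              # bias
    for _ in range(num_iters):           # "or for some # of iterations"
        for f, label in examples:
            f = np.asarray(f, dtype=float)
            prediction = b + np.dot(w, f)
            if prediction * label <= 0:  # they don't agree
                w += f * label           # w_i = w_i + f_i * label
                b += label               # b = b + label
    return w, b

# toy, linearly separable data
data = [([1.0, 1.0], 1), ([2.0, 0.5], 1), ([-1.0, -1.0], -1), ([-0.5, -2.0], -1)]
print(train_perceptron(data))
```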

Our nervous system: the computer science view. The human brain is a large collection of interconnected neurons. A NEURON is a brain cell: they collect, process, and disseminate electrical signals; they are connected via synapses; they FIRE depending on the conditions of the neighboring neurons. [diagram: dendrites, synapses (+ + + - -), axon]

A neuron/perceptron: Input x1 (weight w1), Input x2 (weight w2), Input x3 (weight w3), Input x4 (weight w4) feed into g(in), the activation function, where in = Σ_j w_j x_j, producing Output y.

How is this a linear classifier (i.e. perceptron)? Hard threshold = linear classifier:

hard threshold: g(in) = 1 if in > -b, 0 otherwise
equivalently: output = 1 if Σ_i w_i x_i + b > 0, 0 otherwise

Neural Networks try to mimic the structure and function of our nervous system. People like biologically motivated approaches. [diagram: x1 w1, x2 w2, ..., xm wm feeding g(in), producing output]
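A short sketch of the single neuron above in Python; the example weights, bias, and inputs are hypothetical, and only the hard-threshold rule comes from the slide.

```python
import numpy as np

def neuron_output(x, w, b):
    """One neuron/perceptron: weighted sum, then hard-threshold activation."""
    weighted_sum = np.dot(w, x)              # in = sum_j w_j * x_j
    return 1 if weighted_sum + b > 0 else 0  # hard threshold = linear classifier

# hypothetical weights and inputs
x = np.array([1.0, 0.0, 1.0, 1.0])
w = np.array([0.5, -0.2, 0.3, -0.4])
print(neuron_output(x, w, b=-0.1))
```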

Artificial Neural Networks: Node A, Weight w, Node B. Node = neuron/perceptron; Edge = synapse. Our approximation:

output = 1 if Σ_i w_i x_i + b > 0, 0 otherwise

w is the strength of the signal sent between A and B. If A fires and w is positive, then A stimulates B. If A fires and w is negative, then A inhibits B.

Other activation functions. Why other threshold functions?

hard threshold: g(in) = 1 if in > -b, 0 otherwise
sigmoid: g(x) = 1 / (1 + e^(-ax))
tanh(x)

Neural network: individual perceptrons/neurons connected together.
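The activation functions listed above, sketched in Python; the sigmoid gain a defaults to 1 here as an assumption.

```python
import numpy as np

def hard_threshold(in_, b=0.0):
    return 1.0 if in_ + b > 0 else 0.0

def sigmoid(x, a=1.0):
    # g(x) = 1 / (1 + e^(-a*x)); a controls the steepness
    return 1.0 / (1.0 + np.exp(-a * x))

xs = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(sigmoid(xs))
print(np.tanh(xs))
```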

Neural network: some inputs are provided/entered; each perceptron computes and calculates an answer; those answers become inputs for the next level; we finally get the answer after all levels compute.

Activation spread: http://www.youtube.com/watch?v=yq7d4rovz6i

Computation (assume 0 bias), with the hard threshold g(in) = 1 if in > -b, 0 otherwise: [diagram: a small network with weights 0.5, -0.5, 0.5, 0.5 propagating the inputs forward]

Computation with sigmoid units: the hidden sums are -0.05 - 0.02 = -0.07 and -0.03 + 0.01 = -0.02, giving hidden outputs σ(-0.07) ≈ 0.483 and σ(-0.02) ≈ 0.495; the output sum is 0.483*0.5 + 0.495 = 0.7365, giving final output σ(0.7365) ≈ 0.676.

Neural networks: different kinds/characteristics of networks. How are these different?
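A quick Python check of the sigmoid computation above; the layer shapes and the output-layer weights (0.5 and 1) are inferred from the slide's arithmetic, so treat them as assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# hidden sums taken directly from the slide's arithmetic
hidden_in = np.array([-0.05 - 0.02, -0.03 + 0.01])   # [-0.07, -0.02]
hidden_out = sigmoid(hidden_in)                       # ~[0.483, 0.495]

# output layer: 0.483*0.5 + 0.495 = 0.7365 suggests weights [0.5, 1.0]
out_in = np.dot([0.5, 1.0], hidden_out)               # ~0.7365
print(hidden_out, out_in, sigmoid(out_in))            # final output ~0.676
```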

Hidden units/layers: can have many layers of hidden units, of differing sizes. [diagrams with differing numbers of hidden units/layer] To count the number of layers, you count all but the inputs.

Feed forward networks.

Alternate ways of visualizing: sometimes the input layer will be drawn with nodes as well. [examples labeled 2-layer network, 3-layer network, 2-layer network, 2-layer network]

Multiple outputs: can be used to model multiclass datasets or more interesting predictors, e.g. images: input image → output image (edge detection).

Recurrent network: the output is fed back to the input [diagram: x1 w1, x2 w2, ..., xm wm feeding g(in), producing output]. Can support memory! Good for temporal data.

NN decision boundary: what does the decision boundary of a perceptron look like? A line (linear set of weights).

NN decision boundary: what does the decision boundary of a 2-layer network look like? Is it linear? What types of things can and can't it model?

XOR: [network: Input x1 and Input x2 feed two hidden perceptrons (biases b = ?), which feed an output perceptron (bias b = ?); Output = x1 xor x2]. Each node computes output = 1 if Σ_i w_i x_i + b > 0, 0 otherwise.

x1  x2  x1 xor x2
0   0   0
0   1   1
1   0   1
1   1   0

XOR: what does the decision boundary look like? [network: Input x1 with weights 1 and -1, Input x2 with weights -1 and 1, into the two hidden nodes; Output = x1 xor x2]

What does the decision boundary look like? [network: Input x1 with weights 1 and -1, Input x2 with weights -1 and 1; Output = x1 xor x2]

NN decision boundary: what does this (first hidden) perceptron's decision boundary look like? Let x2 = 0, then: x1 - 0.5 = 0, x1 = 0.5 (without the bias). [plot in the (x1, x2) plane with the point (-1, 1) marked]

NN decision boundary: what does this (second hidden) perceptron's decision boundary look like?

NN decision boundary: [plot in the (x1, x2) plane for the second hidden perceptron, with the point (1, -1) marked]. Let x2 = 0, then: x1 - 0.5 = 0, x1 = 0.5 (without the bias).

What does the decision boundary look like? Fill in the truth table for the two hidden outputs (out1, out2) of the XOR network. What operation does the output perceptron perform on the result?

x1  x2  out1  out2  x1 xor x2
0   0   ?     ?     0
0   1   ?     ?     1
1   0   ?     ?     1
1   1   ?     ?     0

OR. The output perceptron performs an OR on the hidden outputs:

x1  x2  out1  out2  x1 xor x2
0   0   0     0     0
0   1   0     1     1
1   0   1     0     1
1   1   0     0     0

What does the decision boundary look like? [network: Input x1 with weights 1 and -1, Input x2 with weights -1 and 1, hidden outputs combined at the output node; Output = x1 xor x2; the two hidden decision boundaries plotted in the (x1, x2) plane]
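Putting the pieces together, a sketch of the two-hidden-node XOR network in Python. The biases of -0.5 are an assumption consistent with the x1 = 0.5 boundary worked out above, not values quoted directly from the slides.

```python
import numpy as np

def perceptron(x, w, b):
    # output = 1 if sum_i w_i x_i + b > 0, 0 otherwise
    return 1 if np.dot(w, x) + b > 0 else 0

def xor_net(x1, x2):
    x = np.array([x1, x2])
    out1 = perceptron(x, np.array([1, -1]), -0.5)   # fires only for (1, 0)
    out2 = perceptron(x, np.array([-1, 1]), -0.5)   # fires only for (0, 1)
    return perceptron(np.array([out1, out2]), np.array([1, 1]), -0.5)  # OR

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, xor_net(x1, x2))   # reproduces the XOR truth table
```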

What does the decision boundary look like? Linear splits of the feature space, then a combination of these linear spaces. [network: Input x1, Input x2, two hidden nodes and an output node with their biases; Output = x1 xor x2; each node computes output = 1 if Σ_i w_i x_i + b > 0, 0 otherwise]

This decision boundary? This decision boundary? [two more small networks, with weights of -1 and biases b = 0.5 on the nodes; each node again computes output = 1 if Σ_i w_i x_i + b > 0, 0 otherwise]

NOR: [a perceptron with weights -1, -1 and b = 0.5 applied to out1 and out2]

out1  out2  output
0     0     1
0     1     0
1     0     0
1     1     0

What does the decision boundary look like with three hidden nodes? [network: Input x1, Input x2, three hidden nodes; Output = x1 xor x2]. Still: linear splits of the feature space, then a combination of these linear spaces.

NN decision boundaries: for DTs, as the tree gets larger, the model gets more complex. The same is true for neural networks: more hidden nodes = more complexity. Or, in colloquial terms, two-layer networks can approximate any function. Adding more layers adds even more complexity (and much more quickly). Good rule of thumb: number of 2-layer hidden nodes ≈ number of examples / number of dimensions.

Training: how do we learn the weights? [network: Input x1, Input x2, hidden nodes with biases; Output = x1 xor x2; XOR truth table]

Training multilayer networks:
perceptron learning: if the perceptron's output is different than the expected output, update the weights.
gradient descent: compare output to label and adjust based on a loss function.

Any other problem with these for general NNs? [perceptron/linear model: weights w connect the inputs directly to the output; neural network: weights w spread across multiple layers]

Learning in multilayer networks. Challenge: for multilayer networks, we don't know what the expected output/error is for the internal nodes! How do we learn these weights? [perceptron/linear model: weights w with a known expected output; neural network: weights w in layers, expected output known only at the output]

Backpropagation: intuition. Gradient descent method for learning weights by optimizing a loss function.
1. calculate output of all nodes
2. calculate the weights for the output layer based on the error
3. backpropagate errors through hidden layers

Backpropagation: intuition. We can calculate the actual error here (at the output layer). Key idea: propagate the error back to this (hidden) layer.

Backpropagation: intuition. Backpropagate the error: assume all of these nodes were responsible for some of the error. How can we figure out how much they were responsible for? With output-layer weights w1, w2, w3 and an error at the output, the error for node i is: w_i * error. For the layer before that (weights w4, w5, w6), calculate as normal, using this (e.g. w3 * error) as the error.

Backpropagation: the details. Gradient descent method for learning weights by optimizing a loss function.
1. calculate output of all nodes
2. calculate the updates directly for the output layer
3. backpropagate errors through hidden layers

What loss function?

Backpropagation: the details. Gradient descent method for learning weights by optimizing a loss function.
1. calculate output of all nodes
2. calculate the updates directly for the output layer
3. backpropagate errors through hidden layers

loss = Σ_x ½ (y - ŷ)²   (squared error)
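To make steps 1-3 concrete, here is a minimal sketch of backpropagation for a 2-layer sigmoid network trained with the squared-error loss above. The network size, learning rate, epoch count, and toy XOR data are illustrative assumptions, not values from the slides.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(X, y, hidden=4, lr=0.5, epochs=5000, seed=0):
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(hidden, X.shape[1]))  # hidden-layer weights
    b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.5, size=hidden)                # output-layer weights
    b2 = 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            # 1. calculate output of all nodes (forward pass)
            h = sigmoid(W1 @ x + b1)
            out = sigmoid(W2 @ h + b2)
            # loss = 1/2 (t - out)^2; its gradient drives the updates
            # 2. calculate the updates directly for the output layer
            delta_out = (out - t) * out * (1 - out)
            # 3. backpropagate the error through the hidden layer
            delta_h = delta_out * W2 * h * (1 - h)
            W2 -= lr * delta_out * h
            b2 -= lr * delta_out
            W1 -= lr * np.outer(delta_h, x)
            b1 -= lr * delta_h
    return W1, b1, W2, b2

# toy usage: learn XOR
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0], dtype=float)
W1, b1, W2, b2 = train(X, y)
print([round(float(sigmoid(W2 @ sigmoid(W1 @ x + b1) + b2)), 2) for x in X])
```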