Artificial Neural Network

Artificial Neural Network
Eung Je Woo
Department of Biomedical Engineering, Impedance Imaging Research Center (IIRC), Kyung Hee University, Korea
ejwoo@khu.ac.kr

Neuron and Neuron Model
McCulloch and Pitts (1943): $y_k = \varphi(v_k) = \varphi\left( \sum_{j=0}^{m} w_{kj} x_j \right)$, with $x_0 = 1$ and $w_{k0} = b_k$.
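A minimal numeric sketch of this neuron model; the function and variable names below are illustrative, not from the slides.

import numpy as np

# Neuron model y_k = phi(sum_{j=0..m} w_kj x_j), with x_0 = 1 carrying the bias w_k0 = b_k.
def neuron_output(w, x, phi=np.tanh):
    """w, x: length-(m+1) arrays where x[0] = 1 and w[0] plays the role of the bias b_k."""
    v = w @ x          # induced local field v_k
    return phi(v)      # output y_k = phi(v_k)

x = np.array([1.0, 0.5, -1.2])   # x_0 = 1, followed by two input signals
w = np.array([0.1, 0.8, 0.3])    # w_k0 = b_k, followed by two synaptic weights
print(neuron_output(w, x))       # a single scalar output y_k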

Activation Function
Threshold (Heaviside) function: $\varphi(v) = \begin{cases} 1 & \text{if } v \ge 0 \\ 0 & \text{if } v < 0 \end{cases}$
Signum function: $\varphi(v) = \begin{cases} 1 & \text{if } v > 0 \\ 0 & \text{if } v = 0 \\ -1 & \text{if } v < 0 \end{cases}$
S-shaped (sigmoid) function: $\varphi(v) = \dfrac{1}{1 + e^{-av}}$
Hyperbolic tangent function: $\varphi(v) = \tanh(v)$
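A short numpy sketch of these activation functions, assuming $a$ denotes the sigmoid slope parameter as is conventional.

import numpy as np

def threshold(v):           # Heaviside: 1 if v >= 0, else 0
    return np.where(v >= 0, 1.0, 0.0)

def signum(v):              # +1 if v > 0, 0 if v = 0, -1 if v < 0
    return np.sign(v)

def sigmoid(v, a=1.0):      # S-shaped (logistic) function 1 / (1 + exp(-a v))
    return 1.0 / (1.0 + np.exp(-a * v))

def tanh_act(v):            # hyperbolic tangent
    return np.tanh(v)

v = np.array([-2.0, 0.0, 2.0])
print(threshold(v), signum(v), sigmoid(v), tanh_act(v))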

Activation Function
[Figure: common activation functions, from M. Hagan et al., 2017, Neural Network Design]

Representation of Neuron
[Figure: signal-flow graph, architectural graph, and architectural diagram of a neuron, from M. Hagan et al., 2017, Neural Network Design]

Feedforward Network
[Figure: single-layer feedforward network and multilayer feedforward network with (m-h-q) = (10-4-2)]

Feedforward Network
[Figure: single-layer and multilayer feedforward networks, from M. Hagan et al., 2017, Neural Network Design]

Network using Delay
[Figure: delay element, recurrent network, Hamming network, and Hopfield network, from M. Hagan et al., 2017, Neural Network Design]

Recurrent Network
[Figure: recurrent network with no self-feedback loops and no hidden neurons; recurrent network with hidden neurons]

Basic Setting
Knowledge refers to stored information or models used by a person or machine to interpret, predict, and appropriately respond to the outside world.
The known state of the world is represented by facts about what is and what has been known; this form of knowledge is referred to as prior information.
Observations (noisy measurements) of the world are obtained by sensors designed to probe the environment in which the neural network is supposed to operate. The observations provide the pool of information from which the examples used to train the neural network are drawn.
Labeled examples are pairs of (input signal, target output), i.e., training samples; unlabeled examples are input signals without target outputs.
Three steps of machine learning: training (learning), testing, and generalization.

Neural Network Rules
Rule 1. Similar inputs (i.e., patterns drawn from similar classes) should usually produce similar representations inside the network and should therefore be classified as belonging to the same class.
Rule 2. Items to be categorized as separate classes should be given widely different representations in the network.
Rule 3. If a particular feature is important, then there should be a large number of neurons involved in the representation of that item in the network.
Rule 4. Prior information and invariances should be built into the design of a neural network whenever they are available, so as to simplify the network design by not having to learn them.

Build Prior Information
Receptive field
Weight sharing: $v_j = \sum_{i=1}^{6} w_i x_{i+j-1}$ for $j = 1, 2, 3, 4$ (the same six weights are reused at each position).
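A small sketch of this weight-sharing sum, assuming a length-9 input so that the four windows of six samples each fit; the names are illustrative.

import numpy as np

# v_j = sum_{i=1..6} w_i x_{i+j-1}, j = 1..4: the same six weights (one receptive field)
# are applied to four overlapping windows of the input.
def shared_weight_fields(w, x):
    """w: length-6 shared weights; x: length-9 input; returns [v_1, v_2, v_3, v_4]."""
    return np.array([w @ x[j:j + 6] for j in range(4)])

w = np.ones(6) / 6.0        # e.g., a simple averaging receptive field
x = np.arange(9.0)
print(shared_weight_fields(w, x))          # [2.5, 3.5, 4.5, 5.5]
print(np.correlate(x, w, mode="valid"))    # same values via a 1-D correlation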

Build Invariance
The system must be capable of coping with a range of transformations of the observed signal, such as image rotations and signal amplitude changes.
Invariance by structure: use the same weights for chosen connections.
Invariance by training: use a large set of examples that includes all possible transformations.
Invariance by feature space: use preprocessing to extract invariant features.

Perceptron
Rosenblatt (1958): $y = \varphi(v) = \varphi(\mathbf{w}^T \mathbf{x}) = \operatorname{sgn}(\mathbf{w}^T \mathbf{x}) = \begin{cases} 1 & \text{if } \mathbf{w}^T \mathbf{x} > 0 \\ -1 & \text{if } \mathbf{w}^T \mathbf{x} \le 0 \end{cases}$
$\mathbf{w}^T \mathbf{x} = 0$: hyperplane in the m-dimensional signal space.
Linear classifier. [Figure: linearly separable vs. non-linearly separable class distributions]

Perceptron Convergence Algorithm
[Figure: signal-flow diagram of the perceptron with input x(n), weight vector w(n), output y(n), desired response d(n), and error signal e(n)]

Error-correction Learning
Quantized response: $y(n) = \operatorname{sgn}\left( \mathbf{w}^T(n)\, \mathbf{x}(n) \right)$
Quantized desired response: $d(n) = \begin{cases} +1 & \text{if } \mathbf{x}(n) \text{ belongs to class } C_1 \\ -1 & \text{if } \mathbf{x}(n) \text{ belongs to class } C_2 \end{cases}$
Error signal: $e(n) = d(n) - y(n)$
Error-correction learning rule: $\mathbf{w}(n+1) = \mathbf{w}(n) + \eta\, e(n)\, \mathbf{x}(n)$
Learning-rate parameter $0 < \eta \le 1$: a small $\eta$ gives more averaging and stable weight estimates; a large $\eta$ gives fast adaptation.
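A minimal sketch of this error-correction rule, assuming the bias is folded into the weight vector via $x_0 = 1$; the data and parameter names are illustrative.

import numpy as np

def train_perceptron(X, d, eta=0.1, epochs=100):
    """X: (N, m) inputs with x_0 = 1 as the first column; d: (N,) labels in {+1, -1}."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x_n, d_n in zip(X, d):
            y_n = 1.0 if w @ x_n > 0 else -1.0   # quantized response sgn(w^T(n) x(n))
            e_n = d_n - y_n                      # error signal e(n) = d(n) - y(n)
            w = w + eta * e_n * x_n              # w(n+1) = w(n) + eta e(n) x(n)
    return w

# Linearly separable example (logical AND with bipolar targets)
X = np.array([[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]], dtype=float)
d = np.array([-1, -1, -1, 1], dtype=float)
w = train_perceptron(X, d)
print(np.where(X @ w > 0, 1, -1))   # expected: [-1 -1 -1  1]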

Batch Perceptron Algorithm
Perceptron cost function, for the set $H$ of misclassified samples: $J(\mathbf{w}) = \sum_{\mathbf{x}(n) \in H} \left( -\mathbf{w}^T \mathbf{x}(n)\, d(n) \right)$
Gradient vector, with $\nabla = \left[ \dfrac{\partial}{\partial w_1}, \dfrac{\partial}{\partial w_2}, \ldots, \dfrac{\partial}{\partial w_m} \right]^T$: $\nabla J(\mathbf{w}) = -\sum_{\mathbf{x}(n) \in H} \mathbf{x}(n)\, d(n)$
Steepest-descent method to minimize $J(\mathbf{w})$: $\mathbf{w}(n+1) = \mathbf{w}(n) - \eta(n)\, \nabla J(\mathbf{w}) = \mathbf{w}(n) + \eta(n) \sum_{\mathbf{x}(n) \in H} \mathbf{x}(n)\, d(n)$
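A sketch of one steepest-descent step of this batch rule; the names are illustrative, and the misclassification test assumes the sign convention from the perceptron slide above.

import numpy as np

def batch_perceptron_step(w, X, d, eta=0.1):
    """One step: w <- w - eta * grad J(w), where J sums -w^T x d over misclassified samples."""
    predictions = np.where(X @ w > 0, 1, -1)
    H = predictions != d                         # set H of misclassified samples
    grad = -(X[H] * d[H, None]).sum(axis=0)      # grad J(w) = -sum_H x(n) d(n)
    return w - eta * grad                        # w(n+1) = w(n) + eta sum_H x(n) d(n)

Iterating this step with a suitable step size until H is empty implements the batch perceptron algorithm on linearly separable data.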

Bayes Classifier
Average risk, with observation space $H = H_1 + H_2$, costs $c_{11} < c_{21}$, $c_{22} < c_{12}$, and $\int_{H} p_X(\mathbf{x} \mid C_1)\, d\mathbf{x} = \int_{H} p_X(\mathbf{x} \mid C_2)\, d\mathbf{x} = 1$:
$R = c_{11} p_1 \int_{H_1} p_X(\mathbf{x} \mid C_1)\, d\mathbf{x} + c_{22} p_2 \int_{H_2} p_X(\mathbf{x} \mid C_2)\, d\mathbf{x} + c_{21} p_1 \int_{H_2} p_X(\mathbf{x} \mid C_1)\, d\mathbf{x} + c_{12} p_2 \int_{H_1} p_X(\mathbf{x} \mid C_2)\, d\mathbf{x}$
Writing $H_2 = H - H_1$,
$R = c_{21} p_1 + c_{22} p_2 + \int_{H_1} \left[ p_2 (c_{12} - c_{22})\, p_X(\mathbf{x} \mid C_2) - p_1 (c_{21} - c_{11})\, p_X(\mathbf{x} \mid C_1) \right] d\mathbf{x}$
Bayes classifier: if $p_1 (c_{21} - c_{11})\, p_X(\mathbf{x} \mid C_1) > p_2 (c_{12} - c_{22})\, p_X(\mathbf{x} \mid C_2)$, assign $\mathbf{x}$ to $H_1$ (class $C_1$); otherwise, assign $\mathbf{x}$ to $H_2$ (class $C_2$).
Equivalently, with likelihood ratio $\Lambda(\mathbf{x}) = \dfrac{p_X(\mathbf{x} \mid C_1)}{p_X(\mathbf{x} \mid C_2)}$ and threshold $\xi = \dfrac{p_2 (c_{12} - c_{22})}{p_1 (c_{21} - c_{11})}$: if $\Lambda(\mathbf{x}) > \xi$, assign $\mathbf{x}$ to $H_1$ (class $C_1$); otherwise, assign $\mathbf{x}$ to $H_2$ (class $C_2$).
Log-likelihood ratio and log-threshold: $\log \Lambda(\mathbf{x}) = \log \dfrac{p_X(\mathbf{x} \mid C_1)}{p_X(\mathbf{x} \mid C_2)}$, $\quad \log \xi = \log \dfrac{p_2 (c_{12} - c_{22})}{p_1 (c_{21} - c_{11})}$
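As a sanity check not on the slide: with zero costs for correct decisions, $c_{11} = c_{22} = 0$, and unit costs for errors, $c_{12} = c_{21} = 1$, the threshold reduces to the prior ratio and the test becomes the familiar MAP comparison:
$\xi = \dfrac{p_2 (c_{12} - c_{22})}{p_1 (c_{21} - c_{11})} = \dfrac{p_2}{p_1}, \qquad \Lambda(\mathbf{x}) = \dfrac{p_X(\mathbf{x} \mid C_1)}{p_X(\mathbf{x} \mid C_2)} > \dfrac{p_2}{p_1} \;\Longleftrightarrow\; p_1\, p_X(\mathbf{x} \mid C_1) > p_2\, p_X(\mathbf{x} \mid C_2).$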

Bayes Classifier for Gaussian Distributions
[Figure: Gaussian classifier]
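The figure itself is not reproduced here; as a sketch of the standard result it illustrates, assume the two classes are Gaussian with means $\boldsymbol{\mu}_1, \boldsymbol{\mu}_2$ and a common covariance matrix $\mathbf{C}$. The log-likelihood ratio is then linear in $\mathbf{x}$, so the Gaussian Bayes classifier has the same form as a single linear neuron:
$\log \Lambda(\mathbf{x}) = (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_2)^T \mathbf{C}^{-1} \mathbf{x} + \tfrac{1}{2} \left( \boldsymbol{\mu}_2^T \mathbf{C}^{-1} \boldsymbol{\mu}_2 - \boldsymbol{\mu}_1^T \mathbf{C}^{-1} \boldsymbol{\mu}_1 \right) = \mathbf{w}^T \mathbf{x} + b,$
and the decision rule compares $\mathbf{w}^T \mathbf{x} + b$ with $\log \xi$.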

Multilayer Perceptron

Backpropagation Learning

Backpropagation Learning
Output layer, at the $n$th iteration and $j$th neuron ($j \in C$, the set of output neurons):
Neural network: $v_j(n) = \sum_{i=0}^{M} w_{ji}(n)\, y_i(n)$, $\quad y_j(n) = \varphi_j\!\left( v_j(n) \right)$
Error and objective function: $e_j(n) = d_j(n) - y_j(n)$, $\quad E(n) = \frac{1}{2} \sum_{j \in C} e_j^2(n)$
Chain rule: $\dfrac{\partial E(n)}{\partial w_{ji}(n)} = \dfrac{\partial E(n)}{\partial e_j(n)} \dfrac{\partial e_j(n)}{\partial y_j(n)} \dfrac{\partial y_j(n)}{\partial v_j(n)} \dfrac{\partial v_j(n)}{\partial w_{ji}(n)} = e_j(n) \cdot (-1) \cdot \varphi_j'\!\left( v_j(n) \right) \cdot y_i(n)$
Local gradient: $\delta_j(n) = -\dfrac{\partial E(n)}{\partial v_j(n)} = -\dfrac{\partial E(n)}{\partial e_j(n)} \dfrac{\partial e_j(n)}{\partial y_j(n)} \dfrac{\partial y_j(n)}{\partial v_j(n)} = e_j(n)\, \varphi_j'\!\left( v_j(n) \right)$
LMS-style update: $\Delta w_{ji}(n) = -\eta\, \dfrac{\partial E(n)}{\partial w_{ji}(n)} = \eta\, \delta_j(n)\, y_i(n)$

Backpropagation Learning
Hidden layer, at the $n$th iteration and $j$th neuron ($k$ indexes the neurons in the next layer):
Local gradient: $\delta_j(n) = -\dfrac{\partial E(n)}{\partial v_j(n)} = -\dfrac{\partial E(n)}{\partial y_j(n)} \dfrac{\partial y_j(n)}{\partial v_j(n)} = -\dfrac{\partial E(n)}{\partial y_j(n)}\, \varphi_j'\!\left( v_j(n) \right)$, with $E(n) = \frac{1}{2} \sum_{k \in C} e_k^2(n)$
$\dfrac{\partial E(n)}{\partial y_j(n)} = \sum_k e_k(n)\, \dfrac{\partial e_k(n)}{\partial y_j(n)} = \sum_k e_k(n)\, \dfrac{\partial e_k(n)}{\partial v_k(n)} \dfrac{\partial v_k(n)}{\partial y_j(n)}$
Neural network: $e_k(n) = d_k(n) - y_k(n) = d_k(n) - \varphi_k\!\left( v_k(n) \right)$, so $\dfrac{\partial e_k(n)}{\partial v_k(n)} = -\varphi_k'\!\left( v_k(n) \right)$
Neural network: $v_k(n) = \sum_{j=0}^{M} w_{kj}(n)\, y_j(n)$, so $\dfrac{\partial v_k(n)}{\partial y_j(n)} = w_{kj}(n)$
Error backpropagation: $\dfrac{\partial E(n)}{\partial y_j(n)} = -\sum_k e_k(n)\, \varphi_k'\!\left( v_k(n) \right) w_{kj}(n) = -\sum_k \delta_k(n)\, w_{kj}(n)$, hence $\delta_j(n) = \varphi_j'\!\left( v_j(n) \right) \sum_k \delta_k(n)\, w_{kj}(n)$
LMS-style update: $\Delta w_{ji}(n) = \eta\, \delta_j(n)\, y_i(n)$
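A minimal numpy sketch combining the two rules above for a network with one hidden layer of tanh units and tanh output units; the shapes and names are illustrative assumptions, not the slide's notation.

import numpy as np

def backprop_step(x, d, W1, W2, eta=0.1):
    """x: (m,) input with bias term x[0] = 1; d: (q,) target; W1: (h, m); W2: (q, h+1)."""
    # Forward pass through the network
    v1 = W1 @ x                                   # hidden induced local fields v_j(n)
    y1 = np.concatenate(([1.0], np.tanh(v1)))     # hidden outputs with bias prepended
    v2 = W2 @ y1                                  # output induced local fields v_k(n)
    y2 = np.tanh(v2)                              # network outputs y_k(n)
    # Output layer: delta_k(n) = e_k(n) * phi'(v_k(n)), with tanh'(v) = 1 - tanh(v)^2
    e = d - y2
    delta2 = e * (1.0 - y2 ** 2)
    # Hidden layer: delta_j(n) = phi'(v_j(n)) * sum_k delta_k(n) w_kj(n)  (bias column skipped)
    delta1 = (1.0 - np.tanh(v1) ** 2) * (W2[:, 1:].T @ delta2)
    # Weight updates: Delta w_ji(n) = eta * delta_j(n) * y_i(n)
    W2 = W2 + eta * np.outer(delta2, y1)
    W1 = W1 + eta * np.outer(delta1, x)
    return W1, W2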

Output Representation (M-class Classification)
The ANN is trained using the training data set $\{ (\mathbf{x}[i], \mathbf{d}[i]) \}_{i=1}^{N}$.
For a new input $\mathbf{x}_j$, the ANN produces the output $\mathbf{y}_j = \left[ y_{1,j}, \ldots, y_{M,j} \right]^T = \left[ F_1(\mathbf{x}_j), \ldots, F_M(\mathbf{x}_j) \right]^T = \mathbf{F}(\mathbf{x}_j)$.
Assign the input to a single class: $\mathbf{x}_j \in C_k$ if $F_k(\mathbf{x}_j) > F_l(\mathbf{x}_j)$ for all $l \neq k$.
Assign the input to multiple classes: $\mathbf{x}_j \in C_k$ if $F_k(\mathbf{x}_j) >$ threshold (e.g., 0.5).
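A small sketch of the two assignment rules; the names are illustrative.

import numpy as np

def assign_single_class(F_x):
    """Single-class assignment: the class C_k with the largest output F_k(x_j)."""
    return int(np.argmax(F_x))

def assign_multiple_classes(F_x, threshold=0.5):
    """Multi-class assignment: every class C_k whose output exceeds the threshold."""
    return [k for k, f in enumerate(F_x) if f > threshold]

F_x = np.array([0.1, 0.7, 0.6])        # outputs F_1(x_j), F_2(x_j), F_3(x_j)
print(assign_single_class(F_x))        # 1  (0-based index of the winning class)
print(assign_multiple_classes(F_x))    # [1, 2]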

Generalization
An ANN generalizes well when its input-output mapping is correct for test data never used in creating or training the network.
Apply cross-validation.
Use proper sizes for the training and validation data sets.
Avoid overfitting: trade off bias against variance and adjust the ANN architecture.

Multilayer Perceptron
[Figure: multilayer perceptron, from T. Hastie et al., 2008, The Elements of Statistical Learning]

Multilayer Perceptron
[Figure: multilayer perceptron example involving the variables obesity, systolic blood pressure, corpus callosum, and age, from T. Hastie et al., 2008, The Elements of Statistical Learning]

Algorithm Selection
[Figure: decision boundaries for linear regression of a 0/1 response, the 15-nearest-neighbor classifier, the 1-nearest-neighbor classifier, and the Bayes optimal classifier, from T. Hastie et al., 2008, The Elements of Statistical Learning]

EOD