Artificial Neural Networks. Introduction to Computational Neuroscience. Tambet Matiisen, 2.04.2018

Artificial neural network. NB! Inspired by biology, not based on biology!

Applications: automatic speech recognition, automatic image tagging, machine translation.

Learning objectives: How do artificial neural networks work? What types of artificial neural networks are used for which tasks? What state-of-the-art results have been achieved with artificial neural networks?

Part 1: HOW DO NEURAL NETWORKS WORK?

Frank Rosenblatt (1957) added a learning rule to the McCulloch-Pitts neuron.

Perceptron. Prediction: $z = 1$ if $\sum_i x_i w_i + b > 0$, otherwise $z = 0$. Learning: $w_i \leftarrow w_i + (y - z)\,x_i$, $b \leftarrow b + (y - z)$. (Diagram: inputs $x_1, x_2$ with weights $w_1, w_2$ and bias $b$ feed a summation unit $\Sigma$ that produces the output $z$.)

Let's try it out! Target: $y = x_1$ OR $x_2$.

x1  x2 | y
 0   0 | 0
 0   1 | 1
 1   0 | 1
 1   1 | 1

Algorithm: repeat
  $z = 1$ if $x_1 w_1 + x_2 w_2 + b > 0$, otherwise $z = 0$
  $w_1 \leftarrow w_1 + (y - z)\,x_1$
  $w_2 \leftarrow w_2 + (y - z)\,x_2$
  $b \leftarrow b + (y - z)$
until $y = z$ holds for the entire dataset.
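A minimal sketch of this learning loop in Python/NumPy (not from the slides; the zero initialization and the implicit learning rate of 1 are assumptions):

```python
import numpy as np

# OR truth table: inputs and targets
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 1])

w = np.zeros(2)   # weights w1, w2
b = 0.0           # bias

# Repeat until the perceptron predicts every example correctly
while True:
    errors = 0
    for xi, yi in zip(X, y):
        z = 1 if xi @ w + b > 0 else 0   # prediction
        w = w + (yi - z) * xi            # learning rule: w <- w + (y - z) x
        b = b + (yi - z)                 # bias update:   b <- b + (y - z)
        errors += int(z != yi)
    if errors == 0:
        break

print(w, b)  # weights and bias that separate the OR problem
```

Because OR is linearly separable, the loop terminates with a perfect classifier.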

Perceptron limitations. The perceptron learning algorithm converges only for linearly separable problems (Minsky & Papert, Perceptrons, 1969).

Multi-layer perceptrons. Add non-linear activation functions and hidden layer(s). Universal approximation theorem: any continuous function can be approximated to a given precision by a feed-forward neural network with a single hidden layer containing a finite number of neurons.

Forward propagation:
$a_1 = w_{01} + x_1 w_{11} + x_2 w_{21}$, $h_1 = \sigma(a_1)$
$a_2 = w_{02} + x_1 w_{12} + x_2 w_{22}$, $h_2 = \sigma(a_2)$
$z = v_0 + h_1 v_1 + h_2 v_2$
where $\sigma(x) = \frac{1}{1 + e^{-x}}$.
(Diagram: inputs $x_1, x_2$ plus a constant $+1$ feed two hidden units through weights $w_{ij}$; the hidden activations $h_1, h_2$ plus a constant $+1$ feed the output unit through weights $v_i$.)
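As a sketch, the same forward pass for this 2-2-1 network in Python/NumPy (the concrete weight values are placeholders, not from the slides):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, W, w0, v, v0):
    """Forward pass of the 2-2-1 network from the slide.

    x  : input vector (x1, x2)
    W  : 2x2 matrix of input-to-hidden weights w_ij
    w0 : hidden biases (w01, w02)
    v  : hidden-to-output weights (v1, v2)
    v0 : output bias
    """
    a = w0 + x @ W      # a_j = w_0j + x1*w_1j + x2*w_2j
    h = sigmoid(a)      # h_j = sigma(a_j)
    z = v0 + h @ v      # z = v0 + h1*v1 + h2*v2
    return a, h, z

# Placeholder weights, just to make the sketch runnable
W  = np.array([[0.5, -0.3], [0.8, 0.2]])
w0 = np.array([0.1, -0.1])
v  = np.array([1.0, -1.0])
v0 = 0.05

a, h, z = forward(np.array([1.0, 2.0]), W, w0, v, v0)
print(z)
```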

Loss function. Function approximation: $L = \frac{1}{2}(y - z)^2$, for example $\frac{1}{2}(10 - z)^2$ when the target is $y = 10$. Now we just need to find weight values that minimize the loss function for all inputs. How do you do that?

Backpropagation:
Output error: $e_z = y - z$
Hidden errors: $e_{h_1} = e_z v_1$, $e_{a_1} = e_{h_1}\,\sigma'(a_1) = e_{h_1}\,h_1(1 - h_1)$; $e_{h_2} = e_z v_2$, $e_{a_2} = e_{h_2}\,\sigma'(a_2) = e_{h_2}\,h_2(1 - h_2)$
Weight updates: $\Delta v_0 = e_z$, $\Delta v_1 = e_z h_1$, $\Delta v_2 = e_z h_2$; $\Delta w_{01} = e_{a_1}$, $\Delta w_{11} = e_{a_1} x_1$, $\Delta w_{21} = e_{a_1} x_2$; $\Delta w_{02} = e_{a_2}$, $\Delta w_{12} = e_{a_2} x_1$, $\Delta w_{22} = e_{a_2} x_2$
where $\sigma'(x) = \sigma(x)(1 - \sigma(x))$ and $\sigma(x) = \frac{1}{1 + e^{-x}}$.

Gradient Descent. Update rule: $w_{ij} \leftarrow w_{ij} + \eta\,\Delta w_{ij}$, $v_i \leftarrow v_i + \eta\,\Delta v_i$, where $\eta$ is the learning rate. Gradient descent finds weight values that result in a small loss. Gradient descent is guaranteed to find only a local minimum. But there are plenty of them and they are often good enough!
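Putting the last three slides together, here is a hedged sketch of one full training step (forward pass, backpropagation, gradient update) for the same 2-2-1 network; the learning rate value is an assumption:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_step(x, y, W, w0, v, v0, lr=0.1):
    # Forward propagation
    a = w0 + x @ W              # pre-activations of the hidden units
    h = sigmoid(a)              # hidden activations
    z = v0 + h @ v              # network output

    # Backpropagation of the error e_z = y - z
    e_z = y - z
    e_h = e_z * v               # e_h_j = e_z * v_j
    e_a = e_h * h * (1 - h)     # e_a_j = e_h_j * sigma'(a_j)

    # Gradient descent updates: weight <- weight + lr * delta
    v0 = v0 + lr * e_z
    v  = v  + lr * e_z * h
    w0 = w0 + lr * e_a
    W  = W  + lr * np.outer(x, e_a)   # delta w_ij = e_a_j * x_i

    loss = 0.5 * (y - z) ** 2
    return W, w0, v, v0, loss
```

Calling `train_step` repeatedly over the dataset implements the gradient descent loop described on this slide.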

Other loss functions. Binary classification: $p = \sigma(z)$, $L = -\left[\,y \log p + (1 - y)\log(1 - p)\,\right]$. Multi-class classification: $p = \mathrm{softmax}(z)$ with $p_i = \frac{e^{z_i}}{\sum_j e^{z_j}}$, $L = -\sum_i y_i \log p_i$. Here $\sigma(x) = \frac{1}{1 + e^{-x}}$. (Plot: $-\log(p)$ and $-\log(1 - p)$ as functions of $p$.)
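A small sketch of these two loss functions in Python/NumPy (function names are my own, not from the slides):

```python
import numpy as np

def binary_cross_entropy(y, z):
    """y in {0, 1}, z is the raw network output (logit)."""
    p = 1.0 / (1.0 + np.exp(-z))                       # p = sigma(z)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def categorical_cross_entropy(y, z):
    """y is a one-hot target vector, z is a vector of logits."""
    p = np.exp(z - z.max())                            # softmax, shifted for numerical stability
    p /= p.sum()
    return -np.sum(y * np.log(p))

print(binary_cross_entropy(1, 2.0))
print(categorical_cross_entropy(np.array([0, 1, 0]), np.array([1.0, 2.0, 0.5])))
```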

Things to remember... The perceptron was the first artificial neuron model, invented in the late 1950s. The perceptron can learn only linearly separable classification problems. Feed-forward networks with non-linear activation functions and hidden layers can overcome the limitations of the perceptron. Multi-layer artificial neural networks are trained using backpropagation and gradient descent.

Part 2: NEURAL NETWORK TAXONOMY

Simple feed-forward networks. Architecture: each node is connected to all nodes of the previous layer; information moves in one direction only. Used for: function approximation, simple classification problems, not too many inputs (~100). (Diagram: input layer, hidden layer, output layer.)

Convolutional neural networks. Architecture: convolutional layers provide local connections and weight sharing; pooling layers provide translation invariance. Used for: images and spatial data, and any other data with a locality property, e.g. adjacent characters make up a word. (Diagram: input layer, convolutional layer with shared weights, and max-pooling layer.)

Hubel & Wiesel (1959) performed experiments with an anesthetized cat. They discovered topographical mapping, sensitivity to orientation, and hierarchical processing.

Convolution. Convolution matches the same pattern over the entire image and calculates a score for each match.
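For illustration, a hedged sketch of a 2D convolution (technically cross-correlation, as used in most neural network libraries) over a grayscale image in Python/NumPy; the kernel values are an assumption:

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide `kernel` over `image` and record the match score at every position (valid padding)."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A simple vertical-edge-detector kernel, chosen for illustration
kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]])
image = np.random.rand(8, 8)
print(convolve2d(image, kernel).shape)  # (6, 6)
```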

Example: edge detector https://developer.apple.com/library/ios/documentation/performance/conceptual/vimage/convolutionoperations/convolutionoperations.html

Pooling. Pooling achieves translation invariance by taking the maximum of adjacent convolution scores.
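A minimal sketch of 2x2 max pooling in Python/NumPy (the window size is an assumption):

```python
import numpy as np

def max_pool2d(feature_map, size=2):
    """Take the maximum over non-overlapping size x size windows."""
    h, w = feature_map.shape
    h, w = h - h % size, w - w % size          # crop to a multiple of the window size
    blocks = feature_map[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

scores = np.random.rand(6, 6)
print(max_pool2d(scores).shape)  # (3, 3)
```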

Example: handwritten digit recognition. Y. LeCun et al., Handwritten digit recognition: Applications of neural net chips and automatic learning, 1989.

Recurrent neural networks. Architecture: hidden layer nodes are connected to themselves, which allows the network to retain internal state and memory. Used for: speech recognition, machine translation, language modeling, any time series. (Diagram: input layer, hidden layer with recurrent connections, output layer.)
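To make the recurrence concrete, a hedged sketch of one step of a vanilla recurrent network in Python/NumPy (the tanh activation, weight names, and dimensions are assumptions, not from the slides):

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, W_hy, b_h, b_y):
    """One time step of a vanilla recurrent network.

    The hidden state h_t depends on the current input x_t AND the previous
    hidden state h_prev, which is how the network retains memory.
    """
    h_t = np.tanh(x_t @ W_xh + h_prev @ W_hh + b_h)   # new hidden state
    z_t = h_t @ W_hy + b_y                            # output at this time step
    return h_t, z_t

# Placeholder dimensions: 3 input features, 5 hidden units, 2 outputs
rng = np.random.default_rng(0)
W_xh, W_hh, W_hy = rng.normal(size=(3, 5)), rng.normal(size=(5, 5)), rng.normal(size=(5, 2))
b_h, b_y = np.zeros(5), np.zeros(2)

h = np.zeros(5)
for x_t in rng.normal(size=(4, 3)):   # a toy sequence of 4 time steps
    h, z = rnn_step(x_t, h, W_xh, W_hh, W_hy, b_h, b_y)
```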

Different configurations

Backpropagation through time. (Diagram: the network is unrolled over time; inputs $x_1 \dots x_4$ feed hidden states $h_1 \dots h_4$ (starting from $h_0$), which produce outputs $z_1 \dots z_4$ compared against targets $y_1 \dots y_4$.)

Autoencoders. Architecture: the input and output layers are the same; the hidden layer functions as a bottleneck; the network is trained to reconstruct its input from the hidden layer activations. Used for: image semantic hashing, dimensionality reduction. (Diagram: input layer, bottleneck hidden layer, output layer equal to the input layer.)
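As a sketch, the shape of a one-hidden-layer autoencoder and its reconstruction loss in Python/NumPy (the dimensions and activation are assumptions; training by gradient descent works as in Part 1):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def autoencoder_forward(x, W_enc, b_enc, W_dec, b_dec):
    """Encode the input into a smaller bottleneck, then try to reconstruct it."""
    h = sigmoid(x @ W_enc + b_enc)          # bottleneck code (here 64 -> 8 units)
    x_hat = sigmoid(h @ W_dec + b_dec)      # reconstruction of the input
    loss = 0.5 * np.sum((x - x_hat) ** 2)   # reconstruction error
    return h, x_hat, loss

rng = np.random.default_rng(0)
W_enc, b_enc = rng.normal(scale=0.1, size=(64, 8)), np.zeros(8)
W_dec, b_dec = rng.normal(scale=0.1, size=(8, 64)), np.zeros(64)

x = rng.random(64)                          # a toy 64-dimensional input
h, x_hat, loss = autoencoder_forward(x, W_enc, b_enc, W_dec, b_dec)
print(h.shape, loss)
```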

We didn't talk about... Long Short-Term Memory networks (LSTMs), Restricted Boltzmann Machines (RBMs), Echo State Networks / Liquid State Machines, Hopfield networks, self-organizing maps (SOMs), radial basis function networks (RBFs). But we covered the most important ones!

Things to remember... Simple feed-forward networks are usually used for function approximation and classification with few input features. Convolutional neural networks are mostly used for images and spatial data. Recurrent neural networks are used for language modeling and time series. Autoencoders are used for image semantic hashing and dimensionality reduction.

Part 3: SOME STATE-OF-THE-ART RESULTS

Deep Learning. Artificial neural networks and backpropagation have been around since the 1980s. What's all this fuss about deep learning? What has changed: we have much bigger datasets, we have much faster computers (think GPUs), and we have learned a few tricks for training neural networks with very many layers.

Revolution of Depth (human error ~5.1%)

Neural Image Processing

Instance Segmentation https://www.youtube.com/watch?v=oot3uixzzte https://github.com/matterport/mask_rcnn

Image Captioning

Image Captioning Errors

Reinforcement learning. (Diagram: the network receives the game screen as input and outputs actions, learning from the score; games shown include Pong, Breakout, Space Invaders, Seaquest, Beam Rider, and Enduro.) http://sodeepdude.blogspot.com/2015/03/deepminds-atari-paper-replicated.html Mnih et al., Human-level control through deep reinforcement learning (2015)

Skype Translator https://www.youtube.com/watch?v=nhxcg2pa3zi

Adversarial Examples https://www.youtube.com/watch?v=xaqu7kkqbpc

Things to remember... Artificial neural networks are state-of-the-art in image recognition, speech recognition, machine translation and many other fields. Anything that you can do in 1 second, we can probably train a neural network to do as well, i.e. neural nets can do perception. But in the end they are just reactive function approximators and can be easily fooled. In particular, they do not think like humans (yet).

Thank you! tambet@ut.ee