INTRODUCTION TO ARTIFICIAL INTELLIGENCE

INTRODUCTION TO ARTIFICIAL INTELLIGENCE DATA15001 EPISODE 8: NEURAL NETWORKS

TODAY'S MENU 1. NEURAL COMPUTATION 2. FEEDFORWARD NETWORKS (PERCEPTRON) 3. RECURRENT NETWORKS 4. SOM

NEURAL COMPUTATION The traditional model of computation based on Turing machines is only one of many frameworks. Computation in natural nervous systems is quite different from the Turing machine model: parallel, stochastic and adaptive. Neural computation is one of the key approaches to AI. The 1960s saw a decline in interest ("AI Winter"), but currently there is a new wave ("deep learning").

NEURAL COMPUTATION

NEURAL COMPUTATION In our dichotomy of AI approaches, neural networks belong to "modern AI" rather than GOFAI: SUBSYMBOLIC/DIGITAL VS SYMBOLIC. Some NNs are probabilistic (but not all), so probabilistic methods can be applied. Perhaps the bigger question mark: a NN is a "black box", i.e., it is very hard to interpret and say why a given output is produced; many learning algorithms are poorly understood ("optimal brain damage", "Cuckoo search", ...).

NEURAL COMPUTATION Source: Ertel: Introduction to Artificial Intelligence, Springer, 2011.

NEURAL COMPUTATION NATURAL NEURAL NETWORK Source: Ertel: Introduction to Artificial Intelligence, Springer, 2011.

NEURAL COMPUTATION ARTIFICIAL NEURAL NETWORK COPY JUST THE IDEA: A NUMBER OF SIMPLE PROCESSING UNITS CONNECTED TOGETHER AS A LARGE NETWORK Source: Ertel: Introduction to Artificial Intelligence, Springer, 2011.

NATURAL VS ARTIFICIAL NATURAL NEURAL NETWORKS are usually: asynchronous, binary (spike or not), recurrent (feedback), massive. ARTIFICIAL NEURAL NETWORKS are usually: synchronous, continuous-valued, feedforward (no feedback), large but not massive.

BASIC NEURON Real-valued or binary inputs x_1, ..., x_n. Real-valued weights w_i1, ..., w_in. Activation function f => output x_i. Note: one neuron's output is another's input. Image source: Ertel: Introduction to Artificial Intelligence, Springer, 2011.
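To make the basic neuron concrete, here is a minimal Python sketch (illustrative only; the function names and the example numbers are not from the slides). It computes the weighted sum z of the inputs and passes it through an activation function f:

def neuron_output(weights, inputs, activation):
    # weighted sum of the inputs, z = sum_j w_j * x_j
    z = sum(w * x for w, x in zip(weights, inputs))
    return activation(z)

def step(z):
    # step activation used by the perceptron below: -1 for negative z, +1 otherwise
    return 1 if z >= 0 else -1

# Example: weights (0.5, -1.0) applied to the input (2, 1); 0.5*2 - 1.0*1 = 0, so the output is 1
print(neuron_output([0.5, -1.0], [2, 1], step))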

CASE: PERCEPTRON A perceptron neuron can be used in isolation (a single neuron). The activation function is the step function: f(z) = -1 if z < 0, and 1 otherwise, where z = Σ_{j=1..n} w_ij x_j. Image source: Ertel: Introduction to Artificial Intelligence, Springer, 2011.

CASE: PERCEPTRON The output is binary {-1, 1}. The goal is to adjust the weights so that the output of the neuron (from the activation function) is as we like. The input can be, for example, a set of pixels in an image. The goal would be to recognize whether the image represents a given pattern (e.g., the number '3'). Adjusting the weights can be hard. Solution: machine learning on some training data!

PERCEPTRON: THE MATH PART The effect of the weights on the argument of the activation function, z, is linear. Therefore, the decision boundary is a "hyperplane" (the generalization of a line or plane to higher dimensions). For example, if the input is two-dimensional (x_1, x_2): f(z) = 1 iff the point (x_1, x_2) is on the "right" side of a straight line that passes through the origin. DECISION BOUNDARY: w_1 x_1 + w_2 x_2 = 0
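As a concrete illustration (the numbers are made up, not from the slides): with weights w_1 = 2 and w_2 = -1, the decision boundary is the line 2 x_1 - x_2 = 0. The point (1, 1) gives z = 2*1 - 1*1 = 1 > 0, so f(z) = 1, while the point (1, 3) gives z = 2*1 - 1*3 = -1 < 0, so f(z) = -1.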

PERCEPTRON ALGORITHM (ROSENBLATT, 1958) The following simple algorithm finds a separating hyperplane (weights)... if the data are "linearly separable":

perceptron(data):
    w = [0, ..., 0]                        # weight vector of size p
    while error(data, w) > 0:
        (x, y) = choose_random_item(data)
        z = w[0]x[0] + ... + w[p-1]x[p-1]
        if z >= 0 and y = -1:              # -1 classified as 1
            w = w - x                      # subtract vector x
        if z < 0 and y = 1:                # 1 classified as -1
            w = w + x                      # add vector x
    return(w)
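As a hedged sketch (not part of the slides), the same algorithm can be written as runnable Python; the data format (a list of (x, y) pairs with y in {-1, +1}) and the helpers predict and error are assumptions made for illustration:

import random

def predict(w, x):
    # perceptron output: +1 if the weighted sum is >= 0, otherwise -1
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1

def error(data, w):
    # number of misclassified training items
    return sum(1 for x, y in data if predict(w, x) != y)

def perceptron(data):
    # data: list of (x, y) pairs, x a list of p numbers, y in {-1, +1};
    # if the data are not linearly separable, this loop never terminates
    p = len(data[0][0])
    w = [0.0] * p
    while error(data, w) > 0:
        x, y = random.choice(data)
        z = sum(wi * xi for wi, xi in zip(w, x))
        if z >= 0 and y == -1:                          # -1 classified as 1
            w = [wi - xi for wi, xi in zip(w, x)]       # subtract vector x
        if z < 0 and y == 1:                            # 1 classified as -1
            w = [wi + xi for wi, xi in zip(w, x)]       # add vector x
    return w

# Example: learning the logical AND with inputs/outputs coded as -1/+1.
# A constant 1 is appended to each input so the hyperplane need not pass through the origin.
data = [([-1, -1, 1], -1), ([-1, 1, 1], -1), ([1, -1, 1], -1), ([1, 1, 1], 1)]
print(perceptron(data))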

PERCEPTRON ALGORITHM (ROSENBLATT, 1958) An illustration in 2D. Image source: Ertel: Introduction to Artificial Intelligence, Springer, 2011.

PERCEPTRON ALGORITHM The problem is that usually the data are not linearly separable; then the algorithm keeps updating the weights forever. Variants exist for: finding the weights with minimal error; finding the hyperplane that maximizes the margin between the two classes: the "support vector machine" (SVM). In any case, the linearity of the classifier is a severe restriction. Two approaches for constructing non-linear classifiers: the multilayer perceptron; the kernel trick (a similar idea as in non-linear regression).

MULTILAYER PERCEPTRON (MLP) Connecting many perceptron units together; the activation function is usually the sigmoid. The output is differentiable w.r.t. the parameters, which makes it easier to optimize the weights. Rule for learning MLPs: backpropagation (OUTSIDE THE SCOPE OF THIS COURSE).

MULTILAYER PERCEPTRON (MLP) The MLP can represent "anything" (with enough hidden layers). The backpropagation algorithm doesn't guarantee the optimal weights, only a local optimum. Restarting from a different starting point may give a different (better or worse) solution.
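Backpropagation itself is outside the scope of the course, but the forward pass of a small MLP is easy to sketch; the layer sizes and weights below are made-up values for illustration, not taken from the slides:

import math

def sigmoid(z):
    # smooth, differentiable replacement for the perceptron's step function
    return 1.0 / (1.0 + math.exp(-z))

def layer(weights, inputs):
    # one layer of sigmoid units: each row of weights feeds one unit
    return [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in weights]

def mlp(x, hidden_weights, output_weights):
    # input -> hidden layer -> output layer (a feedforward network)
    return layer(output_weights, layer(hidden_weights, x))

# A 2-input, 2-hidden-unit, 1-output network with made-up weights
hidden_weights = [[0.5, -0.4], [0.3, 0.8]]
output_weights = [[1.0, -1.0]]
print(mlp([1.0, 2.0], hidden_weights, output_weights))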

RECURRENT NEURAL NETWORKS The MLP is a feedforward network because the information always flows in one direction (towards the output layer). In recurrent networks, this is not the case, and there can be feedback loops. This can cause complex dynamic interactions which are often harder to model.

HOPFIELD NETWORK EVERYTHING IS CONNECTED TO EVERYTHING

HOPFIELD NETWORK Learning occurs when a series of input configurations (each neuron's value) is presented. The weights measure how often two neurons are in the same state (either both on, or both off). After learning: 1. the network is initialized in a new input configuration; 2. each neuron may change its state according to the states of the other neurons (inputs) and the weights; 3. the new states then become the input; 4. this is repeated until convergence.

HOPFIELD NETWORK The learning rule: w_ij = (1/n) Σ_{k=1..n} q_ik q_jk, where q_ik = +1 if neuron i is on in the k'th training sample, and -1 otherwise. The activation rule is the same as in a perceptron: x_i = +1 if Σ_{j≠i} w_ij x_j > 0, and -1 otherwise.
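A minimal Python sketch of these two rules (the +/-1 list encoding of patterns and the fixed number of update sweeps are assumptions for illustration, not from the slides):

def hopfield_train(patterns):
    # learning rule: w_ij = (1/n) * sum_k q_ik * q_jk, with the diagonal w_ii left at 0
    n = len(patterns)
    size = len(patterns[0])
    w = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(size):
            if i != j:
                w[i][j] = sum(q[i] * q[j] for q in patterns) / n
    return w

def hopfield_recall(w, state, sweeps=10):
    # activation rule: x_i = +1 if sum_{j != i} w_ij x_j > 0, otherwise -1
    state = list(state)
    for _ in range(sweeps):
        for i in range(len(state)):
            s = sum(w[i][j] * state[j] for j in range(len(state)) if j != i)
            state[i] = 1 if s > 0 else -1
    return state

# Store one pattern and recall it from a corrupted version (second unit flipped)
w = hopfield_train([[1, -1, 1, -1]])
print(hopfield_recall(w, [1, 1, 1, -1]))   # returns [1, -1, 1, -1]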

BOLTZMANN MACHINE Another example of a recurrent network. A probabilistic version of the Hopfield network. The input is usually a subset of the neurons. "Restricted Boltzmann machine": not everything is connected to everything. A somewhat different learning rule.

BOLTZMANN MACHINE

SELF-ORGANIZING MAP (SOM) (KOHONEN, 1982)

SELF-ORGANIZING MAP (SOM) The neurons form a two-dimensional grid. Each neuron has a state vector. The input vector activates the neuron whose state vector is nearest to the input vector: the "winner".

SELF-ORGANIZING MAP (SOM) The neurons form a two-dimensional grid. Each neuron has a state vector. The input vector activates the neuron whose state vector is nearest to the input vector: the "winner". The state of the winner is updated to be more similar to the input.

SELF-ORGANIZING MAP (SOM) The neighbors of the winner are also updated.

SELF-ORGANIZING MAP (SOM) The neighbors of the winner are also updated. The neighbors gradually become more and more similar. As the learning proceeds, the size of the neighborhood is made smaller. The updates also get smaller and the network converges.
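A minimal sketch of the SOM training loop in Python (the grid size, learning rate schedule and neighborhood radius below are illustrative choices, not values from the slides):

import math
import random

def som_step(grid, x, learning_rate, radius):
    # grid: dict mapping grid coordinates (row, col) to state vectors
    # 1. the "winner" is the neuron whose state vector is nearest to the input x
    winner = min(grid, key=lambda pos: sum((s - xi) ** 2 for s, xi in zip(grid[pos], x)))
    # 2. the winner and its neighbors on the grid are moved towards the input
    for pos, state in grid.items():
        if math.dist(pos, winner) <= radius:
            grid[pos] = [s + learning_rate * (xi - s) for s, xi in zip(state, x)]

# A 3x3 grid of neurons with random two-dimensional state vectors
grid = {(r, c): [random.random(), random.random()] for r in range(3) for c in range(3)}
for step in range(100):
    x = [random.random(), random.random()]            # training input
    # the learning rate and the neighborhood shrink as learning proceeds
    som_step(grid, x, learning_rate=0.5 * (1 - step / 100), radius=2.0 * (1 - step / 100))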

SELF-ORGANIZING MAP (SOM) The input can be any vector. Examples: speech recognition: input = audio recording; process control: input = status of a machine (e.g. a paper machine); information retrieval: input = word occurrences in a document.

SELF-ORGANIZING MAP (SOM)

SUMMARY ON NEURAL NETWORKS Feedforward networks: perceptron, multilayer perceptron (MLP), ... Recurrent networks: Hopfield network, Boltzmann machine, ... Self-organizing map

SUMMARY ON NEURAL NETWORKS Different network types have different applications. Feedforward networks can be used for supervised machine learning and as function approximators; deep learning is typically based on feedforward networks with "convolutional" layers (recognizing image patterns). Recurrent networks can be used as a model of associative memory. Self-organizing maps can be used for visualization.