Neural Networks Introduction CIS 32


Functionalia: Office hours (last change!) - location moved to 0317 N (Bridges Room). Today: the Alpha-Beta example, neural networks, and learning with a T-R agent (from before).

Alpha-Beta Example (search runs left to right over a MAX root with three MIN children; the leaf values are 3 12 8, 2 4 1, and 14 5 2)

The root starts with the interval [-inf, +inf]. Exploring the first MIN node narrows its interval from [-inf, 3] down to [3, 3], so the root's interval becomes [3, +inf] and alpha = 3. At the second MIN node the first leaf gives [-inf, 2]; now beta = 2 <= alpha = 3, so the remaining leaves (4 and 1) are pruned. At the third MIN node the leaves 14, 5, 2 narrow the interval from [-inf, 14] to [-inf, 5] to [2, 2], which cannot improve on 3, so the root's final interval is [3, 3] and the value of the game is 3.
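
For concreteness, here is a minimal alpha-beta sketch in Python (not from the slides): the tree is written as nested lists, a leaf is a number, and the example tree above evaluates to 3.

    def alphabeta(node, alpha=float("-inf"), beta=float("inf"), maximizing=True):
        if not isinstance(node, list):      # leaf: return its static value
            return node
        if maximizing:
            value = float("-inf")
            for child in node:
                value = max(value, alphabeta(child, alpha, beta, False))
                alpha = max(alpha, value)
                if alpha >= beta:           # remaining children cannot matter
                    break                   # prune
            return value
        else:
            value = float("inf")
            for child in node:
                value = min(value, alphabeta(child, alpha, beta, True))
                beta = min(beta, value)
                if alpha >= beta:
                    break                   # prune
            return value

    # The tree from the example: MAX root, three MIN children.
    tree = [[3, 12, 8], [2, 4, 1], [14, 5, 2]]
    print(alphabeta(tree))                  # -> 3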

Neural Networks: Introduction Now we will look at neural networks, so called because they mimic the structure of the brain. [Figure: a biological neuron, showing the cell body (soma) with its nucleus, dendrites, the axon with its arborization, and synapses to and from other cells.] However, they don't have to be viewed in this way. We will start by thinking of them as an implementation of the kind of stimulus-response agents we looked at in the last lecture. They also provide us with our first taste of learning. The learning angle means we don't have to figure out the model parameters for ourselves.

Networks for Stimulus-Response Production systems can be easily implemented as computer programs. They may also be implemented directly as electronic circuits, as combinations of AND, OR, and NOT gates (or as simulations of electronic circuits). One useful kind of circuit is built of elements whose output is a nonlinear, discrete function of a weighted combination of its inputs. One such unit is the threshold logic unit (TLU). It computes a weighted sum of its inputs, compares this to a threshold, and outputs 1 if the threshold is exceeded, 0 otherwise.

The Boolean functions that can be computed using a TLU are called linearly separable functions.

We can use TLUs to implement some Boolean functions, for instance a simple conjunction, but we can't implement an exclusive-or this way (because XOR is not linearly separable, as we will see).
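
As a small sketch (not from the slides), here is a TLU in Python; the weights and threshold shown for conjunction are one illustrative choice among many.

    def tlu(weights, threshold, inputs):
        """Threshold logic unit: output 1 if the weighted sum exceeds the threshold."""
        total = sum(w * x for w, x in zip(weights, inputs))
        return 1 if total > threshold else 0

    # Illustrative weights implementing conjunction (AND): fires only when both inputs are 1.
    for x1 in (0, 1):
        for x2 in (0, 1):
            print(x1, x2, tlu([1, 1], 1.5, [x1, x2]))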

Boundary Following Agent (from before) We can implement the kind of function used for boundary following.

s2  s3  s4  s5  Activation                        Output
0   0   0   0   0(1)+0(1)+0(-2)+0(-2) =  0        0
0   0   0   1   0(1)+0(1)+0(-2)+1(-2) = -2        0
0   0   1   0   0(1)+0(1)+1(-2)+0(-2) = -2        0
0   0   1   1   0(1)+0(1)+1(-2)+1(-2) = -4        0
0   1   0   0   0(1)+1(1)+0(-2)+0(-2) =  1        1
0   1   0   1   0(1)+1(1)+0(-2)+1(-2) = -1        0
0   1   1   0   0(1)+1(1)+1(-2)+0(-2) = -1        0
0   1   1   1   0(1)+1(1)+1(-2)+1(-2) = -3        0
1   0   0   0   1(1)+0(1)+0(-2)+0(-2) =  1        1
1   0   0   1   1(1)+0(1)+0(-2)+1(-2) = -1        0
1   0   1   0   1(1)+0(1)+1(-2)+0(-2) = -1        0
1   0   1   1   1(1)+0(1)+1(-2)+1(-2) = -3        0
1   1   0   0   1(1)+1(1)+0(-2)+0(-2) =  2        1
1   1   0   1   1(1)+1(1)+0(-2)+1(-2) =  0        0
1   1   1   0   1(1)+1(1)+1(-2)+0(-2) =  0        0
1   1   1   1   1(1)+1(1)+1(-2)+1(-2) = -2        0
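
The same table can be generated with a short script. The weights (1, 1, -2, -2) are read off the activation column above; a threshold of 0 (output 1 when the activation is strictly positive) is consistent with the output column, though any threshold between 0 and 1 would do.

    from itertools import product

    # Weights on (s2, s3, s4, s5), read from the table above; threshold assumed to be 0.
    weights = (1, 1, -2, -2)
    for s in product((0, 1), repeat=4):
        activation = sum(w * x for w, x in zip(weights, s))
        output = 1 if activation > 0 else 0
        print(*s, activation, output)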

Simple TLU When we have a simple problem, it is possible that a single TLU can compute the right action. For this to happen we need there to be only two possible actions. For more complex problems (categories), we need a network of TLUs. These are often called neural networks because they have some similarity to the networks of neurons from which the brain is constructed. We can use such a network to implement a T-R program.

TISA Network This network implements a set of production rules.

TISA Network Inputs The input to each unit on the left is the 1 or 0 of the rule's condition. (This might be computed from the sensory inputs by another circuit.)

TISA Network Inputs The input to each unit from the top is a 1-or-0 Inhibit signal from another unit.

TISA Network Each rule is implemented as a Test, Inhibit, Squelch, Act (TISA) circuit.

TISA Network Within each TISA circuit, one TLU computes a conjunction and another TLU computes a disjunction.

TISA Behavior Inhibit is 0 when no rules above have a true condition. Test is 1 if the condition is true. If Test is 1 and Inhibit is 0, Act is 1. If either Test is 1 or Inhibit is 1 then Squelch is 1. If Squelch is 1 then every TISA below is Inhibited.
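
A small simulation of the behaviour just described (a sketch; the actual circuit wiring is in the figure, not reproduced here): each rule's Act fires only if its Test is 1 and nothing above inhibits it, and its Squelch (Test OR Inhibit) inhibits every rule below.

    def tisa_network(tests):
        """Simulate a column of TISA circuits: tests[i] is 1 if rule i's condition holds.
        Returns the Act output of each rule; only the first true rule acts."""
        acts = []
        inhibit = 0                      # the topmost rule is never inhibited
        for test in tests:
            act = 1 if (test == 1 and inhibit == 0) else 0     # Test AND NOT Inhibit
            squelch = 1 if (test == 1 or inhibit == 1) else 0  # Test OR Inhibit
            acts.append(act)
            inhibit = squelch            # squelch inhibits every rule below
        return acts

    print(tisa_network([0, 1, 1, 0]))    # -> [0, 1, 0, 0]: only the first true rule acts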

Learning in neural networks So far we have assumed that the mapping between stimulus and response was programmed by the agent designer. That is not always convenient or possible. When it isn't, it is possible to learn the right mapping. We will look at the case of learning the mapping for a single TLU.

Learning Process In brief, the learning procedure is as follows. We start with some set of weights (random or uniform). Then, using a training set, we run a set of inputs and look at the outputs: if they don't match the desired outputs, we alter the weights. We keep learning until the weights are right.

Back to the T-R Agent from Before A feature vector X maps to a particular action a.

Supervised Learning Now consider that we have a set of these: every element is a feature vector X with a corresponding action a. This is a training set, and the set A of all the actions a is called the set of classes or labels. The learning problem here is to find a way of describing the mapping from each feature vector in the training set to the appropriate member of A. We want to find a function f(X) which is acceptable, that is, one that produces an action which agrees with the examples for as many members of the training set as possible. Because we have a set of examples to learn from, we call this supervised learning.

Learning in a single TLU We train a TLU by adjusting the input weights. We assume that the feature vector X is numerical, so that a weighted sum makes sense. The set of weights is written as the vector W and the threshold as θ, so: output is 1 if W · X > θ, and output is 0 otherwise. The dot product W · X is just a way of writing w1x1 + w2x2 + ... + wnxn.
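
The slides do not spell out the update rule itself, so the following is only a sketch using the standard perceptron rule: whenever the TLU's output is wrong, the weights are nudged toward (or away from) the input, with a bias weight playing the role of -θ as described below.

    def train_tlu(examples, epochs=20, lr=1.0):
        """Perceptron-style training of a single TLU (a standard rule, not given in the slides).
        examples: list of (inputs, target) pairs with targets 0 or 1."""
        n = len(examples[0][0])
        w = [0.0] * n                      # weights
        b = 0.0                            # bias weight, standing in for -theta
        for _ in range(epochs):
            for x, target in examples:
                output = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
                error = target - output    # -1, 0, or +1
                w = [wi + lr * error * xi for wi, xi in zip(w, x)]
                b += lr * error
        return w, b

    # Learning AND from its truth table (linearly separable, so this converges).
    examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
    print(train_tlu(examples))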

Hyperplane A TLU divides the space of feature vectors into two regions.

In two dimensions, the TLU defines a boundary (a line) between two parts of a plane (as in the previous picture). In three dimensions, the TLU defines a plane which separates two parts of the space. In higher-dimensional spaces the boundary defined by the TLU is a hyperplane. Whatever it is, it separates the region where W · X > θ from the region where W · X < θ.

Changing W and θ Changing θ moves the boundary relative to the origin; changing W alters the orientation of the boundary. By convention we will assume that θ = 0 (this simplifies the maths :-) ). An arbitrary threshold can be obtained by adding an extra weight w(n+1) equal to -θ, with the (n+1)th element of the input vector always set to 1. So we don't restrict ourselves by making this assumption.
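
A quick numerical check of this trick (the values here are illustrative, not from the slides): comparing W · X against θ gives the same answer as comparing the augmented dot product against 0.

    # Bias-weight trick: W . X > theta  iff  W' . X' > 0,
    # where W' = W + [-theta] and X' = X + [1].
    W, theta = [2.0, -1.0, 0.5], 0.7
    X = [1.0, 0.0, 1.0]

    lhs = sum(w * x for w, x in zip(W, X)) > theta
    W_aug = W + [-theta]
    X_aug = X + [1.0]
    rhs = sum(w * x for w, x in zip(W_aug, X_aug)) > 0.0
    print(lhs, rhs)   # both True here: 2.5 > 0.7 and 1.8 > 0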

Demonstrated in Logic Functions [Figure: TLUs implementing logic functions, each with a bias weight W0 on an input hard-wired to 1. AND: W0 = 1.5, W1 = 1, W2 = 1. OR: W0 = 0.5, W1 = 1, W2 = 1. NOT: W0 = 0.5, W1 = -1.]

Linear Separability [Figure: the input space (I1, I2) for (a) I1 AND I2, (b) I1 OR I2, (c) I1 XOR I2, with each corner marked by whether the output is 0 or 1.] A separating boundary is clear for AND and OR, but not for XOR.
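
To see why no single TLU can compute XOR (a short check using the output rule w1 I1 + w2 I2 > θ): the four input/output pairs would require 0 ≤ θ (for input 0,0), w1 > θ and w2 > θ (for inputs 1,0 and 0,1), and w1 + w2 ≤ θ (for input 1,1). But the middle two conditions give w1 + w2 > 2θ ≥ θ, contradicting the last one, so no choice of weights and threshold exists.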

Summary: Neural Networks So, we introduced neural networks. We first considered them as an implementation of stimulus-response agents. In this incarnation we adjust the weights by hand. We also started thinking about how to learn the weights automatically.