Learning and Neural Networks


 Diane Manning
 11 months ago
 Views:
Transcription
1 Artificial Intelligence Learning and Neural Networks Readings: Chapter 19 & 20.5 of Russell & Norvig
2 Example: A Feedforward Network w 13 I 1 H 3 w 35 w 14 O 5 I 2 w 23 w 24 H 4 w 45 a 5 = g 5 (W 3,5 a 3 + W 4,5 a 4 ) = g 5 (W 3,5 g 3 (W 1,3 a 1 + W 2,3 a 2 ) + W 4,5 g 4 (W 1,4 a 1 + W 2,4 a 2 )) where a i is the output and g i is the activation function of node i.
3 Computing with NNs Different functions are implemented by different network topologies and unit weights. The lure of NNs is that a network need not be explicitly programmed to compute a certain function f. Given enough nodes and links, a NN can learn the function by itself. It does so by looking at a training set of input/output pairs for f and modifying its topology and weights so that its own input/output behavior agrees with the training pairs. In other words, NNs learn by induction, too.
4 Learning = Training in Neural Networks Neural networks are trained using data referred to as a training set. The process is one of computing outputs, compare outputs with desired answers, adjust weights and repeat. The information of a Neural Network is in its structure, activation functions, weights, and Learning to use different structures and activation functions is very difficult. These weights are used to express the relative strength of an input value or from a connecting unit (i.e., in another layer). It is by adjusting these weights that a neural network learns.
5 Process for Developing Neural Networks 1. Collect data Ensure that application is amenable to a NN approach and pick data randomly. 2. Separate Data into Training Set and Test Set 3. Define a Network Structure Are perceptrons sufficient? 4. Select a Learning Algorithm Decided by available tools 5. Set Parameter Values They will affect the length of the training period. 6. Training Determine and revise weights 7. Test If not acceptable, go back to steps 1, 2,..., or Delivery of the product
6 The Perceptron Learning Method Weight updating in perceptrons is very simple because each output node is independent of the other output nodes. I j W j,i O i I j W j O Input Units Output Units Input Units Output Unit Perceptron Network Single Perceptron With no loss of generality then, we can consider a perceptron with a single output node.
7 Normalizing Unit Thresholds. Notice that, if t is the threshold value of the output unit, then step t ( n j=1 W j I j ) = step 0 ( n j=0 W j I j ) where W 0 = t and I 0 = 1. Therefore, we can always assume that the unit s threshold is 0 if we include the actual threshold as the weight of an extra link with a fixed input value. This allows thresholds to be learned like any other weight. Then, we can even allow output values in [0, 1] by replacing step 0 by the sigmoid function.
8 The Perceptron Learning Method If O is the value returned by the output unit for a given example and T is the expected output, then the unit s error is Err = T O If the error Err is positive we need to increase O; otherwise, we need to decrease O.
9 The Perceptron Learning Method Since O = g( n j=0 W ji j ), we can change O by changing each W j. Assuming g is monotonic, to increase O we should increase W j if I j is positive, decrease W j if I j is negative. Similarly, to decrease O we should decrease W j if I j is positive, increase W j if I j is negative. This is done by updating each W j as follows W j W j + α I j (T O) where α is a positive constant, the learning rate.
10 Theoretic Background Learn by adjusting weights to reduce error on training set The squared error for an example with input x and true output y is E = 1 2 Err (y h W (x))2, Perform optimization search by gradient descent: E = Err Err = Err n y g( W j x j ) W j W j W j = Err g (in) x j j = 0
11 Weight update rule W j W j α E W j = W j + α Err g (in) x j E.g., positive error = increasing network output = increasing weights on positive inputs and decreasing on negative inputs Simple weight update rule (assuming g (in) constant): W j W j + α Err x j
12 Learning a 5place Minority Function At first, collect the data (see below), then choose a structure (a perceptron with five inputs and one output) and the activation function (i.e., step 3 ). Finally, set up parameters (i.e., W i = 0) and start to learn: Assumping α = 1, we have Sum = 5 i=1 W ii i, Out = step 3 (Sum), Err = T Out, and W j W j + I j Err. I 1 I 2 I 3 I 4 I 5 T W 1 W 2 W 3 W 4 W 5 Sum Out Err e e e e e e e e
13 Learning a 5place Minority Function The same as the last example, except that α = 0.5 instead of α = 1. Sum = 5 i=1 W ii i, Out = step 3 (Sum), Err = T Out, and W j W j + I j Err. I 1 I 2 I 3 I 4 I 5 T W 1 W 2 W 3 W 4 W 5 Sum Out Err e e e e e e e e
14 Multilayer perceptrons Layers are usually fully connected; numbers of hidden units typically chosen by hand Output units a i W j,i Hidden units a j W k,j Input units a k
15 Expressiveness of MLPs All continuous functions w/ 2 layers, all functions w/ 3 layers h W (x 1, x 2 ) x x 2 h W (x 1, x 2 ) x x 2
16 Backpropagation learning Output layer: same as for singlelayer perceptron, W j,i W j,i + α a j i where i = Err i g (in i ) Hidden layer: backpropagate the error from the output layer: j = g (in j ) i W j,i i. Update rule for weights in hidden layer: W k,j W k,j + α a k j. (Most neuroscientists deny that backpropagation occurs in the brain)
17 Backpropagation derivation The squared error on a single example is defined as E = 1 2 (y i a i ) 2, where the sum is over the nodes in the output layer. i E W j,i = (y i a i ) a i W j,i = (y i a i ) g(in i) W j,i = (y i a i )g (in i ) in i W j,i = (y i a i )g (in i ) W j,i j W j,i a j = (y i a i )g (in i )a j = a j i
18 Backpropagation derivation E W k,j = i = i (y i a i ) a i W k,j = i (y i a i )g (in i ) in i W k,j = i (y i a i ) g(in i) W k,j i W k,j j W j,i = i a j i W j,i = W k,j i i W j,i g(in j ) W k,j = i = i i W j,i g (in j ) in j W k,j i W j,i g (in j ) W k,j ( k W k,j a k ) = i W j,i g (in j )a k = a k j
19 Decision Trees for Classification Examples described by attribute values (Boolean, discrete, continuous, etc.) E.g., situations where I will/won t wait for a table: Example Attributes Targe Bar F ri Hun P at P rice Rain Res T ype Est WillWa X 1 F F T Some $$$ F T French 0 10 T X 2 F F T Full $ F F Thai F X 3 T F F Some $ F F Burger 0 10 T X 4 F T T Full $ F F Thai T X 5 F T F Full $$$ F T French >60 F X 6 T F T Some $$ T T Italian 0 10 T X 7 T F F None $ T F Burger 0 10 F X 8 F F T Some $$ T T Thai 0 10 T X 9 T T F Full $ T F Burger >60 F X 10 T T T Full $$$ F T Italian F X 11 F F F None $ F F Thai 0 10 F X 12 T T T Full $ F F Burger T Classification of examples is positive (T) or negative (F)
20 Decision trees One possible representation for hypotheses. E.g., here is the true tree for deciding whether to wait: Patrons? None Some Full F T WaitEstimate? > F Alternate? Hungry? T No Yes No Yes Reservation? Fri/Sat? T Alternate? No Yes No Yes No Yes Bar? T F T T Raining? No Yes No Yes F T F T
21 Expressiveness Decision trees can express any function of the input attributes. E.g., for Boolean functions, each truth table row is a path from root to leaf: A B A xor B F F F F T T T F T T T F F F B F T A T T B T F T Trivially, a consistent decision tree for any training set w/ one path to leaf for each example (unless f nondeterministic in x) but it probably won t generalize to new examples Prefer to find more compact decision trees. F
22 Hypothesis spaces How many distinct decision trees with n Boolean attributes
23 Hypothesis spaces How many distinct decision trees with n Boolean attributes = number of Boolean functions
24 Hypothesis spaces How many distinct decision trees with n Boolean attributes = number of Boolean functions = number of distinct truth tables with 2 n rows = 2 2n
25 Hypothesis spaces How many distinct decision trees with n Boolean attributes = number of Boolean functions = number of distinct truth tables with 2 n rows = 2 2n E.g., with 6 Boolean attributes, there are 18,446,744,073,709,551,616 trees How many purely conjunctive hypotheses (e.g., Hungry Rain)
26 Hypothesis spaces How many distinct decision trees with n Boolean attributes = number of Boolean functions = number of distinct truth tables with 2 n rows = 2 2n E.g., with 6 Boolean attributes, there are 18,446,744,073,709,551,616 trees How many purely conjunctive hypotheses (e.g., Hungry Rain) Each attribute can be in (positive), in (negative), or out.
27 Hypothesis spaces How many distinct decision trees with n Boolean attributes = number of Boolean functions = number of distinct truth tables with 2 n rows = 2 2n E.g., with 6 Boolean attributes, there are 18,446,744,073,709,551,616 trees How many purely conjunctive hypotheses (e.g., Hungry Rain) Each attribute can be in (positive), in (negative), or out. 3 n distinct conjunctive hypotheses
28 Decision tree learning Aim: find a small tree consistent with the training examples Idea: (recursively) choose most significant attribute as root of (sub)tree Reason: A good attribute splits the examples into subsets that are (ideally) all positive or all negative Patrons? Type? None Some Full French Italian Thai Burger P atrons? is a better choice gives information about the classification
29 Information Information answers questions. The more clueless I am about the answer initially, the more information is contained in the answer. Scale: 1 = answer to Boolean question with prior 0.5, = answer to Boolean question with known result. Information in an answer when prior is P 1,..., P n is I( P 1,..., P n ) = n i = 1 P i log 2 P i (also called entropy of the prior)
30 Information Suppose we have p positive and n negative examples at the root. Hence I( p/(p + n), n/(p + n) ) bits needed to classify a new example. E.g., for 12 restaurant examples and p = n = 6, we need 1 bit. Suppose an attribute A splits the examples E into subsets E i, each of which (we hope) needs less information to complete the classification. Let E i have p i positive and n i negative examples. Then I( p i /(p i + n i ), n i /(p i + n i ) ) bits needed to classify E i. The remaining bits after choosing A will be Remainder(A) = i p i + n i p + n I( p i/(p i + n i ), n i /(p i + n i ) ) Idea: Choose A such that Remainder(A) is minimal.
31 Information Patrons? Type? None Some Full French Italian Thai Burger Remainder(P atrons?) = I(0, 1) + 12I(1, 0) + 12 I( 2 6, 4 6 ) = Remainder(T ype?) = 2 12 I( 1 2, 1 2 ) I( 1 2, 1 2 ) I( 2 4, 2 4 ) I( 2 4, 2 4 ) = 1. So we choose the attribute P atrons? that minimizes the remaining information.
32 Example contd. Decision tree learned from the 12 examples: Patrons? None Some Full F T Hungry? Yes No Type? F French Italian Thai Burger T F Fri/Sat? T No Yes F T Substantially simpler than true tree a more complex hypothesis isn t justified by small amount of data
33 Summary Learning needed for unknown environments, lazy designers Learning agent = performance element + learning element Learning method depends on type of performance element, available feedback, type of component to be improved, and its representation For supervised learning, the aim is to find a simple hypothesis approximately consistent with training examples Learning performance = prediction accuracy measured on test set Most brains have lots of neurons; each neuron linear threshold unit (?) Perceptrons (onelayer networks) insufficiently expressive Multilayer networks are sufficiently expressive; can be trained by gradient descent, i.e., error backpropagation Many applications: speech, driving, handwriting, credit cards, etc.
Learning from Observations. Chapter 18, Sections 1 3 1
Learning from Observations Chapter 18, Sections 1 3 Chapter 18, Sections 1 3 1 Outline Learning agents Inductive learning Decision tree learning Measuring learning performance Chapter 18, Sections 1 3
More informationCS 380: ARTIFICIAL INTELLIGENCE
CS 380: ARTIFICIAL INTELLIGENCE MACHINE LEARNING 11/11/2013 Santiago Ontañón santi@cs.drexel.edu https://www.cs.drexel.edu/~santi/teaching/2013/cs380/intro.html Summary so far: Rational Agents Problem
More informationCS 380: ARTIFICIAL INTELLIGENCE MACHINE LEARNING. Santiago Ontañón
CS 380: ARTIFICIAL INTELLIGENCE MACHINE LEARNING Santiago Ontañón so367@drexel.edu Summary so far: Rational Agents Problem Solving Systematic Search: Uninformed Informed Local Search Adversarial Search
More informationLearning Decision Trees
Learning Decision Trees CS19410 Fall 2011 Lecture 8 CS19410 Fall 2011 Lecture 8 1 Outline Decision tree models Tree construction Tree pruning Continuous input features CS19410 Fall 2011 Lecture 8 2
More informationFrom inductive inference to machine learning
From inductive inference to machine learning ADAPTED FROM AIMA SLIDES Russel&Norvig:Artificial Intelligence: a modern approach AIMA: Inductive inference AIMA: Inductive inference 1 Outline Bayesian inferences
More informationIntroduction to Artificial Intelligence. Learning from Oberservations
Introduction to Artificial Intelligence Learning from Oberservations Bernhard Beckert UNIVERSITÄT KOBLENZLANDAU Winter Term 2004/2005 B. Beckert: KI für IM p.1 Outline Learning agents Inductive learning
More informationStatistical Learning. Philipp Koehn. 10 November 2015
Statistical Learning Philipp Koehn 10 November 2015 Outline 1 Learning agents Inductive learning Decision tree learning Measuring learning performance Bayesian learning Maximum a posteriori and maximum
More information1. Courses are either tough or boring. 2. Not all courses are boring. 3. Therefore there are tough courses. (Cx, Tx, Bx, )
Logic FOL Syntax FOL Rules (Copi) 1. Courses are either tough or boring. 2. Not all courses are boring. 3. Therefore there are tough courses. (Cx, Tx, Bx, ) Dealing with Time Translate into firstorder
More informationNeural networks. Chapter 19, Sections 1 5 1
Neural networks Chapter 19, Sections 1 5 Chapter 19, Sections 1 5 1 Outline Brains Neural networks Perceptrons Multilayer perceptrons Applications of neural networks Chapter 19, Sections 1 5 2 Brains 10
More informationLearning from Examples
Learning from Examples Data fitting Decision trees Cross validation Computational learning theory Linear classifiers Neural networks Nonparametric methods: nearest neighbor Support vector machines Ensemble
More informationNeural Networks. Chapter 18, Section 7. TB Artificial Intelligence. Slides from AIMA 1/ 21
Neural Networks Chapter 8, Section 7 TB Artificial Intelligence Slides from AIMA http://aima.cs.berkeley.edu / 2 Outline Brains Neural networks Perceptrons Multilayer perceptrons Applications of neural
More informationDecision Trees. None Some Full > No Yes. No Yes. No Yes. No Yes. No Yes. No Yes. No Yes. Patrons? WaitEstimate? Hungry? Alternate?
Decision rees Decision trees is one of the simplest methods for supervised learning. It can be applied to both regression & classification. Example: A decision tree for deciding whether to wait for a place
More informationBayesian learning Probably Approximately Correct Learning
Bayesian learning Probably Approximately Correct Learning Peter Antal antal@mit.bme.hu A.I. December 1, 2017 1 Learning paradigms Bayesian learning Falsification hypothesis testing approach Probably Approximately
More informationNeural networks. Chapter 20. Chapter 20 1
Neural networks Chapter 20 Chapter 20 1 Outline Brains Neural networks Perceptrons Multilayer networks Applications of neural networks Chapter 20 2 Brains 10 11 neurons of > 20 types, 10 14 synapses, 1ms
More informationIntroduction To Artificial Neural Networks
Introduction To Artificial Neural Networks Machine Learning Supervised circle square circle square Unsupervised group these into two categories Supervised Machine Learning Supervised Machine Learning Supervised
More information22c145Fall 01: Neural Networks. Neural Networks. Readings: Chapter 19 of Russell & Norvig. Cesare Tinelli 1
Neural Networks Readings: Chapter 19 of Russell & Norvig. Cesare Tinelli 1 Brains as Computational Devices Brains advantages with respect to digital computers: Massively parallel Faulttolerant Reliable
More informationCh.8 Neural Networks
Ch.8 Neural Networks Hantao Zhang http://www.cs.uiowa.edu/ hzhang/c145 The University of Iowa Department of Computer Science Artificial Intelligence p.1/?? Brains as Computational Devices Motivation: Algorithms
More informationNeural networks. Chapter 20, Section 5 1
Neural networks Chapter 20, Section 5 Chapter 20, Section 5 Outline Brains Neural networks Perceptrons Multilayer perceptrons Applications of neural networks Chapter 20, Section 5 2 Brains 0 neurons of
More informationMachine Learning (CS 419/519): M. Allen, 14 Sept. 18 made, in hopes that it will allow us to predict future decisions
Review: Decisions Based on Attributes } raining set: cases where patrons have decided to wait or not, along with the associated attributes for each case Class #05: Mutual Information & Decision rees Machine
More informationLast update: October 26, Neural networks. CMSC 421: Section Dana Nau
Last update: October 26, 207 Neural networks CMSC 42: Section 8.7 Dana Nau Outline Applications of neural networks Brains Neural network units Perceptrons Multilayer perceptrons 2 Example Applications
More informationCS:4420 Artificial Intelligence
CS:4420 Artificial Intelligence Spring 2018 Neural Networks Cesare Tinelli The University of Iowa Copyright 2004 18, Cesare Tinelli and Stuart Russell a a These notes were originally developed by Stuart
More informationChapter 18. Decision Trees and Ensemble Learning. Recall: Learning Decision Trees
CSE 473 Chapter 18 Decision Trees and Ensemble Learning Recall: Learning Decision Trees Example: When should I wait for a table at a restaurant? Attributes (features) relevant to Wait? decision: 1. Alternate:
More informationDecision Trees. Ruy Luiz Milidiú
Decision Trees Ruy Luiz Milidiú Resumo Objetivo Examinar o conceito de Árvores de Decisão e suas aplicações Sumário Surpresa e Informação Entropia Informação Mútua Divergência e Entropia Cruzada Ganho
More informationEECS 349:Machine Learning Bryan Pardo
EECS 349:Machine Learning Bryan Pardo Topic 2: Decision Trees (Includes content provided by: Russel & Norvig, D. Downie, P. Domingos) 1 General Learning Task There is a set of possible examples Each example
More informationIncremental Stochastic Gradient Descent
Incremental Stochastic Gradient Descent Batch mode : gradient descent w=w  η E D [w] over the entire data D E D [w]=1/2σ d (t d o d ) 2 Incremental mode: gradient descent w=w  η E d [w] over individual
More informationDecision Trees. CS 341 Lectures 8/9 Dan Sheldon
Decision rees CS 341 Lectures 8/9 Dan Sheldon Review: Linear Methods Y! So far, we ve looked at linear methods! Linear regression! Fit a line/plane/hyperplane X 2 X 1! Logistic regression! Decision boundary
More information2015 Todd Neller. A.I.M.A. text figures 1995 Prentice Hall. Used by permission. Neural Networks. Todd W. Neller
2015 Todd Neller. A.I.M.A. text figures 1995 Prentice Hall. Used by permission. Neural Networks Todd W. Neller Machine Learning Learning is such an important part of what we consider "intelligence" that
More information} It is nonzero, and maximized given a uniform distribution } Thus, for any distribution possible, we have:
Review: Entropy and Information H(P) = X i p i log p i Class #04: Mutual Information & Decision rees Machine Learning (CS 419/519): M. Allen, 1 Sept. 18 } Entropy is the information gained on average when
More informationArtificial neural networks
Artificial neural networks Chapter 8, Section 7 Artificial Intelligence, spring 203, Peter Ljunglöf; based on AIMA Slides c Stuart Russel and Peter Norvig, 2004 Chapter 8, Section 7 Outline Brains Neural
More informationArtificial Neural Networks" and Nonparametric Methods" CMPSCI 383 Nov 17, 2011!
Artificial Neural Networks" and Nonparametric Methods" CMPSCI 383 Nov 17, 2011! 1 Todayʼs lecture" How the brain works (!)! Artificial neural networks! Perceptrons! Multilayer feedforward networks! Error
More informationSupervised Learning (contd) Decision Trees. Mausam (based on slides by UWAI faculty)
Supervised Learning (contd) Decision Trees Mausam (based on slides by UWAI faculty) Decision Trees To play or not to play? http://www.sfgate.com/blogs/images/sfgate/sgreen/2007/09/05/2240773250x321.jpg
More informationIntroduction to Machine Learning
Introduction to Machine Learning Reading for today: R&N 18.118.4 Next lecture: R&N 18.618.12, 20.120.3.2 Outline The importance of a good representation Different types of learning problems Different
More informationCOMP 551 Applied Machine Learning Lecture 14: Neural Networks
COMP 551 Applied Machine Learning Lecture 14: Neural Networks Instructor: Ryan Lowe (ryan.lowe@mail.mcgill.ca) Slides mostly by: Class web page: www.cs.mcgill.ca/~hvanho2/comp551 Unless otherwise noted,
More informationData Mining Part 5. Prediction
Data Mining Part 5. Prediction 5.5. Spring 2010 Instructor: Dr. Masoud Yaghini Outline How the Brain Works Artificial Neural Networks Simple Computing Elements FeedForward Networks Perceptrons (Singlelayer,
More informationSections 18.6 and 18.7 Artificial Neural Networks
Sections 18.6 and 18.7 Artificial Neural Networks CS4811  Artificial Intelligence Nilufer Onder Department of Computer Science Michigan Technological University Outline The brain vs artifical neural networks
More informationCSC 411 Lecture 3: Decision Trees
CSC 411 Lecture 3: Decision Trees Roger Grosse, Amirmassoud Farahmand, and Juan Carrasquilla University of Toronto UofT CSC 411: 03Decision Trees 1 / 33 Today Decision Trees Simple but powerful learning
More informationNonlinear Classification
Nonlinear Classification INFO4604, Applied Machine Learning University of Colorado Boulder October 510, 2017 Prof. Michael Paul Linear Classification Most classifiers we ve seen use linear functions
More informationSections 18.6 and 18.7 Analysis of Artificial Neural Networks
Sections 18.6 and 18.7 Analysis of Artificial Neural Networks CS4811  Artificial Intelligence Nilufer Onder Department of Computer Science Michigan Technological University Outline Univariate regression
More informationMultilayer Perceptron
Outline Hong Chang Institute of Computing Technology, Chinese Academy of Sciences Machine Learning Methods (Fall 2012) Outline Outline I 1 Introduction 2 Single Perceptron 3 Boolean Function Learning 4
More informationbrainlinksystem.com $25+ / hr AI Decision Tree Learning Part I Outline Learning 11/9/2010 Carnegie Mellon
I Decision Tree Learning Part I brainlinksystem.com $25+ / hr Illah Nourbakhsh s version Chapter 8, Russell and Norvig Thanks to all past instructors Carnegie Mellon Outline Learning and philosophy Induction
More informationNeural Networks and the Backpropagation Algorithm
Neural Networks and the Backpropagation Algorithm Francisco S. Melo In these notes, we provide a brief overview of the main concepts concerning neural networks and the backpropagation algorithm. We closely
More informationSections 18.6 and 18.7 Artificial Neural Networks
Sections 18.6 and 18.7 Artificial Neural Networks CS4811  Artificial Intelligence Nilufer Onder Department of Computer Science Michigan Technological University Outline The brain vs. artifical neural
More informationMultilayer Neural Networks
Multilayer Neural Networks Multilayer Neural Networks Discriminant function flexibility NONLinear But with sets of linear parameters at each layer Provably general function approximators for sufficient
More informationUnit 8: Introduction to neural networks. Perceptrons
Unit 8: Introduction to neural networks. Perceptrons D. Balbontín Noval F. J. Martín Mateos J. L. Ruiz Reina A. Riscos Núñez Departamento de Ciencias de la Computación e Inteligencia Artificial Universidad
More informationArtifical Neural Networks
Neural Networks Artifical Neural Networks Neural Networks Biological Neural Networks.................................. Artificial Neural Networks................................... 3 ANN Structure...........................................
More informationAN INTRODUCTION TO NEURAL NETWORKS. Scott Kuindersma November 12, 2009
AN INTRODUCTION TO NEURAL NETWORKS Scott Kuindersma November 12, 2009 SUPERVISED LEARNING We are given some training data: We must learn a function If y is discrete, we call it classification If it is
More informationArtificial Intelligence
Artificial Intelligence Jeff Clune Assistant Professor Evolving Artificial Intelligence Laboratory Announcements Be making progress on your projects! Three Types of Learning Unsupervised Supervised Reinforcement
More informationNeural Networks. CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington
Neural Networks CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington 1 Perceptrons x 0 = 1 x 1 x 2 z = h w T x Output: z x D A perceptron
More informationCSE446: Neural Networks Spring Many slides are adapted from Carlos Guestrin and Luke Zettlemoyer
CSE446: Neural Networks Spring 2017 Many slides are adapted from Carlos Guestrin and Luke Zettlemoyer Human Neurons Switching time ~ 0.001 second Number of neurons 10 10 Connections per neuron 10 45 Scene
More information16.4 Multiattribute Utility Functions
285 Normalized utilities The scale of utilities reaches from the best possible prize u to the worst possible catastrophe u Normalized utilities use a scale with u = 0 and u = 1 Utilities of intermediate
More informationMultilayer Perceptrons (MLPs)
CSE 5526: Introduction to Neural Networks Multilayer Perceptrons (MLPs) 1 Motivation Multilayer networks are more powerful than singlelayer nets Example: XOR problem x 2 1 AND x o x 1 x 2 +11 o x x 11
More informationMultilayer Neural Networks
Multilayer Neural Networks Introduction Goal: Classify objects by learning nonlinearity There are many problems for which linear discriminants are insufficient for minimum error In previous methods, the
More informationClassification Algorithms
Classification Algorithms UCSB 290N, 2015. T. Yang Slides based on R. Mooney UT Austin 1 Table of Content roblem Definition Rocchio Knearest neighbor case based Bayesian algorithm Decision trees 2 Given:
More informationCSC242: Intro to AI. Lecture 23
CSC242: Intro to AI Lecture 23 Administrivia Posters! Tue Apr 24 and Thu Apr 26 Idea! Presentation! 2wide x 4high landscape pages Learning so far... Input Attributes Alt Bar Fri Hun Pat Price Rain Res
More informationIntroduction to Natural Computation. Lecture 9. Multilayer Perceptrons and Backpropagation. Peter Lewis
Introduction to Natural Computation Lecture 9 Multilayer Perceptrons and Backpropagation Peter Lewis 1 / 25 Overview of the Lecture Why multilayer perceptrons? Some applications of multilayer perceptrons.
More informationSPSS, University of Texas at Arlington. Topics in Machine LearningEE 5359 Neural Networks
Topics in Machine LearningEE 5359 Neural Networks 1 The Perceptron Output: A perceptron is a function that maps Ddimensional vectors to real numbers. For notational convenience, we add a zeroth dimension
More informationCourse 395: Machine Learning  Lectures
Course 395: Machine Learning  Lectures Lecture 12: Concept Learning (M. Pantic) Lecture 34: Decision Trees & CBC Intro (M. Pantic & S. Petridis) Lecture 56: Evaluating Hypotheses (S. Petridis) Lecture
More informationAssignment 1: Probabilistic Reasoning, Maximum Likelihood, Classification
Assignment 1: Probabilistic Reasoning, Maximum Likelihood, Classification For due date see https://courses.cs.sfu.ca This assignment is to be done individually. Important Note: The university policy on
More information18.6 Regression and Classification with Linear Models
18.6 Regression and Classification with Linear Models 352 The hypothesis space of linear functions of continuousvalued inputs has been used for hundreds of years A univariate linear function (a straight
More informationArtificial Neural Networks
Artificial Neural Networks Oliver Schulte  CMPT 310 Neural Networks Neural networks arise from attempts to model human/animal brains Many models, many claims of biological plausibility We will focus on
More informationMultilayer Neural Networks. (sometimes called Multilayer Perceptrons or MLPs)
Multilayer Neural Networks (sometimes called Multilayer Perceptrons or MLPs) Linear separability Hyperplane In 2D: w x + w 2 x 2 + w 0 = 0 Feature x 2 = w w 2 x w 0 w 2 Feature 2 A perceptron can separate
More informationAI Programming CS F20 Neural Networks
AI Programming CS6622008F20 Neural Networks David Galles Department of Computer Science University of San Francisco 200: Symbolic AI Most of this class has been focused on Symbolic AI Focus or symbols
More informationCS 4700: Foundations of Artificial Intelligence
CS 4700: Foundations of Artificial Intelligence Prof. Bart Selman selman@cs.cornell.edu Machine Learning: Neural Networks R&N 18.7 Intro & perceptron learning 1 2 Neuron: How the brain works # neurons
More informationARTIFICIAL NEURAL NETWORK PART I HANIEH BORHANAZAD
ARTIFICIAL NEURAL NETWORK PART I HANIEH BORHANAZAD WHAT IS A NEURAL NETWORK? The simplest definition of a neural network, more properly referred to as an 'artificial' neural network (ANN), is provided
More informationCSC242: Intro to AI. Lecture 21
CSC242: Intro to AI Lecture 21 Administrivia Project 4 (homeworks 18 & 19) due Mon Apr 16 11:59PM Posters Apr 24 and 26 You need an idea! You need to present it nicely on 2wide by 4high landscape pages
More informationCMPT 310 Artificial Intelligence Survey. Simon Fraser University Summer Instructor: Oliver Schulte
CMPT 310 Artificial Intelligence Survey Simon Fraser University Summer 2017 Instructor: Oliver Schulte Assignment 3: Chapters 13, 14, 18, 20. Probabilistic Reasoning and Learning Instructions: The university
More informationArtificial Intelligence Roman Barták
Artificial Intelligence Roman Barták Department of Theoretical Computer Science and Mathematical Logic Introduction We will describe agents that can improve their behavior through diligent study of their
More informationMultilayer Neural Networks. (sometimes called Multilayer Perceptrons or MLPs)
Multilayer Neural Networks (sometimes called Multilayer Perceptrons or MLPs) Linear separability Hyperplane In 2D: w 1 x 1 + w 2 x 2 + w 0 = 0 Feature 1 x 2 = w 1 w 2 x 1 w 0 w 2 Feature 2 A perceptron
More informationNeural Networks. Bishop PRML Ch. 5. Alireza Ghane. Feedforward Networks Network Training Error Backpropagation Applications
Neural Networks Bishop PRML Ch. 5 Alireza Ghane Neural Networks Alireza Ghane / Greg Mori 1 Neural Networks Neural networks arise from attempts to model human/animal brains Many models, many claims of
More informationSupervised Learning Artificial Neural Networks
Lecture 11 upervised Learning Artificial Marco Chiarandini Department of Mathematics & Computer cience University of outhern Denmark lides by tuart Russell and Peter Norvig Course Overview Introduction
More informationARTIFICIAL INTELLIGENCE. Artificial Neural Networks
INFOB2KI 20172018 Utrecht University The Netherlands ARTIFICIAL INTELLIGENCE Artificial Neural Networks Lecturer: Silja Renooij These slides are part of the INFOB2KI Course Notes available from www.cs.uu.nl/docs/vakken/b2ki/schema.html
More informationFeedforward Networks Network Training Error Backpropagation Applications. Neural Networks. Oliver Schulte  CMPT 726. Bishop PRML Ch.
Neural Networks Oliver Schulte  CMPT 726 Bishop PRML Ch. 5 Neural Networks Neural networks arise from attempts to model human/animal brains Many models, many claims of biological plausibility We will
More informationCS6375: Machine Learning Gautam Kunapuli. Decision Trees
Gautam Kunapuli Example: Restaurant Recommendation Example: Develop a model to recommend restaurants to users depending on their past dining experiences. Here, the features are cost (x ) and the user s
More informationNeural Networks. Singlelayer neural network. CSE 446: Machine Learning Emily Fox University of Washington March 10, /9/17
3/9/7 Neural Networks Emily Fox University of Washington March 0, 207 Slides adapted from Ali Farhadi (via Carlos Guestrin and Luke Zettlemoyer) Singlelayer neural network 3/9/7 Perceptron as a neural
More informationLab 5: 16 th April Exercises on Neural Networks
Lab 5: 16 th April 01 Exercises on Neural Networks 1. What are the values of weights w 0, w 1, and w for the perceptron whose decision surface is illustrated in the figure? Assume the surface crosses the
More informationNeural Networks. Nicholas Ruozzi University of Texas at Dallas
Neural Networks Nicholas Ruozzi University of Texas at Dallas Handwritten Digit Recognition Given a collection of handwritten digits and their corresponding labels, we d like to be able to correctly classify
More informationEEE 241: Linear Systems
EEE 4: Linear Systems Summary # 3: Introduction to artificial neural networks DISTRIBUTED REPRESENTATION An ANN consists of simple processing units communicating with each other. The basic elements of
More informationArtificial Neural Networks Examination, June 2005
Artificial Neural Networks Examination, June 2005 Instructions There are SIXTY questions. (The pass mark is 30 out of 60). For each question, please select a maximum of ONE of the given answers (either
More informationRevision: Neural Network
Revision: Neural Network Exercise 1 Tell whether each of the following statements is true or false by checking the appropriate box. Statement True False a) A perceptron is guaranteed to perfectly learn
More informationCS 4700: Foundations of Artificial Intelligence
CS 4700: Foundations of Artificial Intelligence Prof. Bart Selman selman@cs.cornell.edu Machine Learning: Neural Networks R&N 18.7 Intro & perceptron learning 1 2 Neuron: How the brain works # neurons
More informationIntroduction to Statistical Learning Theory. Material para Máster en Matemáticas y Computación
Introduction to Statistical Learning Theory Material para Máster en Matemáticas y Computación 1 Learning agents Inductive learning Decision tree learning First Part: Outline 2 Learning Learning is essential
More informationData Mining. Preamble: Control Application. Industrial Researcher s Approach. Practitioner s Approach. Example. Example. Goal: Maintain T ~Td
Data Mining Andrew Kusiak 2139 Seamans Center Iowa City, Iowa 522421527 Preamble: Control Application Goal: Maintain T ~Td Tel: 319335 5934 Fax: 319335 5669 andrewkusiak@uiowa.edu http://www.icaen.uiowa.edu/~ankusiak
More informationPattern Classification
Pattern Classification All materials in these slides were taen from Pattern Classification (2nd ed) by R. O. Duda,, P. E. Hart and D. G. Stor, John Wiley & Sons, 2000 with the permission of the authors
More informationData Mining. CS57300 Purdue University. Bruno Ribeiro. February 8, 2018
Data Mining CS57300 Purdue University Bruno Ribeiro February 8, 2018 Decision trees Why Trees? interpretable/intuitive, popular in medical applications because they mimic the way a doctor thinks model
More informationECLT 5810 Classification Neural Networks. Reference: Data Mining: Concepts and Techniques By J. Hand, M. Kamber, and J. Pei, Morgan Kaufmann
ECLT 5810 Classification Neural Networks Reference: Data Mining: Concepts and Techniques By J. Hand, M. Kamber, and J. Pei, Morgan Kaufmann Neural Networks A neural network is a set of connected input/output
More informationDecision Trees. CS57300 Data Mining Fall Instructor: Bruno Ribeiro
Decision Trees CS57300 Data Mining Fall 2016 Instructor: Bruno Ribeiro Goal } Classification without Models Well, partially without a model } Today: Decision Trees 2015 Bruno Ribeiro 2 3 Why Trees? } interpretable/intuitive,
More informationDecision Trees. Tirgul 5
Decision Trees Tirgul 5 Using Decision Trees It could be difficult to decide which pet is right for you. We ll find a nice algorithm to help us decide what to choose without having to think about it. 2
More informationNeural Networks, Computation Graphs. CMSC 470 Marine Carpuat
Neural Networks, Computation Graphs CMSC 470 Marine Carpuat Binary Classification with a Multilayer Perceptron φ A = 1 φ site = 1 φ located = 1 φ Maizuru = 1 φ, = 2 φ in = 1 φ Kyoto = 1 φ priest = 0 φ
More informationNeural Nets Supervised learning
6.034 Artificial Intelligence Big idea: Learning as acquiring a function on feature vectors Background Nearest Neighbors Identification Trees Neural Nets Neural Nets Supervised learning y s(z) w w 0 w
More information(FeedForward) Neural Networks Dr. Hajira Jabeen, Prof. Jens Lehmann
(FeedForward) Neural Networks 20161206 Dr. Hajira Jabeen, Prof. Jens Lehmann Outline In the previous lectures we have learned about tensors and factorization methods. RESCAL is a bilinear model for
More informationMachine Learning. Neural Networks
Machine Learning Neural Networks Bryan Pardo, Northwestern University, Machine Learning EECS 349 Fall 2007 Biological Analogy Bryan Pardo, Northwestern University, Machine Learning EECS 349 Fall 2007 THE
More information<Special Topics in VLSI> Learning for Deep Neural Networks (Backpropagation)
Learning for Deep Neural Networks (Backpropagation) Outline Summary of Previous Standford Lecture Universal Approximation Theorem Inference vs Training Gradient Descent BackPropagation
More informationLecture 4: Perceptrons and Multilayer Perceptrons
Lecture 4: Perceptrons and Multilayer Perceptrons Cognitive Systems II  Machine Learning SS 2005 Part I: Basic Approaches of Concept Learning Perceptrons, Artificial Neuronal Networks Lecture 4: Perceptrons
More informationCMSC 421: Neural Computation. Applications of Neural Networks
CMSC 42: Neural Computation definition synonyms neural networks artificial neural networks neural modeling connectionist models parallel distributed processing AI perspective Applications of Neural Networks
More informationMIDTERM: CS 6375 INSTRUCTOR: VIBHAV GOGATE October,
MIDTERM: CS 6375 INSTRUCTOR: VIBHAV GOGATE October, 23 2013 The exam is closed book. You are allowed a onepage cheat sheet. Answer the questions in the spaces provided on the question sheets. If you run
More informationArtificial Neural Networks. Edward Gatt
Artificial Neural Networks Edward Gatt What are Neural Networks? Models of the brain and nervous system Highly parallel Process information much more like the brain than a serial computer Learning Very
More informationNeural Networks Lecturer: J. Matas Authors: J. Matas, B. Flach, O. Drbohlav
Neural Networks 30.11.2015 Lecturer: J. Matas Authors: J. Matas, B. Flach, O. Drbohlav 1 Talk Outline Perceptron Combining neurons to a network Neural network, processing input to an output Learning Cost
More information) (d o f. For the previous layer in a neural network (just the rightmost layer if a single neuron), the required update equation is: 2.
1 Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.034 Artificial Intelligence, Fall 2011 Recitation 8, November 3 Corrected Version & (most) solutions
More informationCSE 352 (AI) LECTURE NOTES Professor Anita Wasilewska. NEURAL NETWORKS Learning
CSE 352 (AI) LECTURE NOTES Professor Anita Wasilewska NEURAL NETWORKS Learning Neural Networks Classifier Short Presentation INPUT: classification data, i.e. it contains an classification (class) attribute.
More information