AIBO experiences change of surface incline.
- Grant Norton
- 5 years ago
Transcription
1 NW Computational Intelligence Laboratory AIBO experiences change of surface incline. Decisions based only on kinesthetic experience vector ICNC 07, China 8/27/07 # 51
3 CONCLUSION We conjecture that the proposed experience-based approach will open a new phase of development in the decision and control fields, making a significant stride toward more human-like decision and control. We also conjecture that the context discernment concepts, together with the manifold representation, will provide a basis for constructing learning agents capable of long-term, rapidly accessible memory. If so, this could pave the way for scaling neural systems to brain-like capabilities.
4 The Geometric Topology construct of a manifold provides a useful formalism: 1) a set of elements, S, and 2) a coordinate system (a one-to-one mapping from S to R^n that specifies each element in S via a vector of n real numbers, a.k.a. the coordinates of the element). We let the experience repository be the set portion of a manifold. The manifold's coordinate space serves as a searchable indexing vehicle for the repository, and since the coordinate space is R^n, the Euclidean distance provides a natural metric for nearness.
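The indexing idea on this slide can be sketched in a few lines. This is a minimal illustration only, not the authors' implementation: the class name `ExperienceRepository`, the stored labels, and the coordinate values are all hypothetical, and a linear scan stands in for whatever search structure a real repository would use.

```python
import math

class ExperienceRepository:
    """Hypothetical repository: elements of the set S indexed by coordinates in R^n."""

    def __init__(self):
        self._entries = []  # list of (coordinates, element) pairs

    def add(self, coords, element):
        """Store an element under its coordinate vector (its 'address')."""
        self._entries.append((tuple(float(c) for c in coords), element))

    def nearest(self, query):
        """Return (coords, element, distance) for the entry closest to query."""
        if not self._entries:
            raise LookupError("repository is empty")
        # Euclidean distance in the coordinate space is the nearness metric.
        coords, element = min(self._entries, key=lambda e: math.dist(e[0], query))
        return coords, element, math.dist(coords, query)

# Illustrative entries; the labels and coordinates are made up for this sketch.
repo = ExperienceRepository()
repo.add((0.0, 0.0), "controller_flat")
repo.add((1.0, 0.0), "controller_incline")
repo.add((0.0, 2.0), "controller_ice")

coords, element, distance = repo.nearest((0.9, 0.2))
print(element, round(distance, 3))
```

The query point need not coincide with any stored address; the nearest stored element is returned along with how far away it is, which is the sense in which the coordinate space acts as a searchable index.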
5 Demonstration Example: Let the manifold set be a collection of neural networks (NNs) generated from a NN whose structure is fixed and whose adjustable parameters (weights and biases) take on all possible value combinations. Each such combination yields a distinct member of the set, and the parameter values may serve as the coordinates; a point in the coordinate space may be called the set element's address.
6 When the manifold's set consists of NNs, we use the label "neural manifold". Care is needed relative to two aspects: 1) while each coordinate point corresponds to a distinct NN instantiation, many such points may all perform the same mapping, and 2) the set of (distinct) mappings that can be performed by this set of NNs is typically just a subset of all possible mappings from the NN's input domain to its output range (called the NN's Performance Subset).
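One familiar way the first caveat arises is hidden-unit permutation: reordering the hidden units of a network changes its weight vector (its address) but not the function it computes. The sketch below illustrates this under assumptions not in the slides: a one-hidden-layer network with tanh units and a single linear output, and arbitrary made-up weight values.

```python
import math

def forward(x, W1, b1, W2, b2):
    """One-hidden-layer NN: tanh hidden units, single linear output."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return sum(w * h for w, h in zip(W2, hidden)) + b2

def address(W1, b1, W2, b2):
    """Flatten all weights and biases into one coordinate vector."""
    return [w for row in W1 for w in row] + list(b1) + list(W2) + [b2]

# Arbitrary illustrative parameters (2 inputs, 3 hidden units, 1 output).
W1 = [[0.5, -1.0], [2.0, 0.3], [-0.7, 0.8]]
b1 = [0.1, -0.2, 0.4]
W2 = [1.5, -0.6, 0.9]
b2 = 0.05

# Swap hidden units 0 and 1: reorder rows of W1 and b1, and entries of W2.
perm = [1, 0, 2]
W1p = [W1[j] for j in perm]
b1p = [b1[j] for j in perm]
W2p = [W2[j] for j in perm]

inputs = [(0.0, 0.0), (1.0, -2.0), (0.3, 0.7)]
same_mapping = all(
    math.isclose(forward(x, W1, b1, W2, b2),
                 forward(x, W1p, b1p, W2p, b2), abs_tol=1e-12)
    for x in inputs
)
distinct_address = address(W1, b1, W2, b2) != address(W1p, b1p, W2p, b2)
print(same_mapping, distinct_address)
```

Both networks agree on every sampled input while occupying different points in the coordinate space, so Euclidean nearness between addresses does not by itself imply nearness between mappings.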
7 The mapping from Context Space to the policy manifold may in general be many-to-one (in the controls vocabulary, changes in the plant dynamics or in its environment do not necessarily imply a needed change in control policy).
8 The indexing schemas for both a plant and a policy neural manifold may employ the weights of their respective sets of NNs. So far so good. But how does one go about crafting a mapping from, say, the plant manifold's coordinate space to that of the policy manifold? Clearly, such a mapping will be required for the Agent to select a policy based on information about the plant model. More generally, how does one craft an appropriate mapping from the full Context Space (whatever form of representation is employed) to the coordinate system of the policy manifold? The task of answering these questions is assigned to a Higher Level Learning Algorithm (HLLA); i.e., the answers are to be learned.
9 For another aspect of mappings, consider a linear plant example, and assume the plant transfer functions in the plant manifold are factored polynomials but the CF requirements are given in terms of an expanded polynomial representation (e.g., requirements given in terms of the damping coefficient for a second-order system). While the two representations are equivalent, the Agent would need a mapping between the two to accomplish the controller selection (e.g., via factoring the polynomial, or equivalently, multiplying out the factored polynomial). In the second-order case, the notion of nearness in the CF sub-space would be in terms of the damping coefficient, whereas in the corresponding plant manifold coordinate space, nearness would be in terms of s-plane pole locations.
10 As an intuitive example of notions such as efficiency, nearness, and mappings, consider a store that rents movie DVDs, shelved either alphabetically or by content type. Which is more efficient for the customer depends on the customer's needs and knowledge.
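For the second-order case, the mapping between the two representations can be written down directly. The sketch below assumes the standard form s^2 + 2*zeta*wn*s + wn^2 for the expanded denominator and an underdamped complex-conjugate pole pair; the function names and the example pole values are illustrative, not taken from the slides.

```python
import math

def expand_second_order(p1, p2):
    """Multiply out (s - p1)(s - p2) into s^2 + a1*s + a0; returns (a1, a0)."""
    a1 = -(p1 + p2)
    a0 = p1 * p2
    # For a real or complex-conjugate pole pair the coefficients are real.
    return a1.real, a0.real

def damping_and_natural_frequency(a1, a0):
    """Match s^2 + a1*s + a0 to s^2 + 2*zeta*wn*s + wn^2 (requires a0 > 0)."""
    wn = math.sqrt(a0)
    zeta = a1 / (2.0 * wn)
    return zeta, wn

def factor_second_order(zeta, wn):
    """Inverse map: pole locations from zeta and wn (underdamped, zeta < 1)."""
    real = -zeta * wn
    imag = wn * math.sqrt(1.0 - zeta ** 2)
    return complex(real, imag), complex(real, -imag)

# Example pole pair in the s-plane (illustrative values).
p1, p2 = complex(-3.0, 4.0), complex(-3.0, -4.0)
a1, a0 = expand_second_order(p1, p2)
zeta, wn = damping_and_natural_frequency(a1, a0)
q1, q2 = factor_second_order(zeta, wn)
print(a1, a0, zeta, wn, q1, q2)
```

Going from poles to the damping coefficient and back recovers the original pole pair, which is the equivalence the slide describes; the two coordinate systems nonetheless induce different notions of nearness, since equal steps in pole location do not correspond to equal steps in damping coefficient.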
11 Key exploration steps ahead: Refine and further develop ideas related to Context Discernment / System Identification thus far developed. Move into the Controller Selection aspect of the suggested EB Control method. Expand the exploration to multiple-level considerations.
12 Key exploration steps ahead (cont.): Provide one or more feasibility demonstrations in support of developing theory and techniques for populating the experience repositories, progressing from the synthetic methods already demonstrated to more and more general ones.
13 Key exploration steps ahead (cont.): Formalize ideas about and develop demonstration experiments for incorporating application domain knowledge as repository constraints of a nature that facilitates the larger objectives of improved speed of access and good generalization.
14 Key exploration steps ahead (cont.): Formalize ideas about constructing and achieving needed mappings between components of Context and from Context to Repository, and develop demonstration examples useful for theory development. [Diagram labels: CONTEXT (A. PLANT, B. ENVIRONMENT, C. CF); CONTROL LAW REPOSITORY (EXPERIENCE)]
15 Key exploration steps ahead (cont.): Further develop ideas related to multi-level aspects of EB Identification & Control, leading to a Context Space Hierarchy notion, and use the associated ideas as a guide in refining and further defining the HLLA concepts and training methods.
16 Key exploration steps ahead (cont.): Formalize and further develop the role of the human designer in providing higher level knowledge for crafting the RL process entailed in the HLLAs, particularly the designation of state variables and CFs specialized to the multi-level conceptualization. Develop demonstration examples for EB controls, similar to the successful demonstrations of the Context Discernment (systems identification) part of the EB process thus far accomplished.
17 QUESTIONS? OGI Talk 7/26/2006 # 67
19 Agent: a computational intelligence device (that, in this paper, is to perform the acts of context discernment and selection, along with possible design refinement).
Context Variables (Agent centric): those attributes of i) the environment and ii) the plant/process whose variations could engender changes to the decision rule / control policy employed by the Agent while accomplishing the Agent's current objective or goal; and in addition, iii) the criteria (representing the objective or goal) to be used for designing and subsequently selecting the decision rule or control law. [The term Criterion Function (CF) is used here to represent these criteria.]
Context Space (Agent centric): a vector space in which each context variable is assigned to a dimension. The Context Space concept comprises three sub-spaces, one each associated with the i) Plant, ii) Environment, and iii) Criterion Function.
Context (Agent centric): a point in Context Space; the set of values taken on by the context variables in a given situation.
Context Awareness: the act of monitoring the application to take notice (become aware) that a change may be occurring in the Context.
Context Discernment: the act or process of determining the current values of the context variables (current point in Context Space) appropriate to the task being performed. [Webster on-line for "discern": to recognize or identify as separate and distinct.]
Experience-Based approach: a two-component concept. Component A: a repository of previously developed context-specific models (controller or plant models). Component B: algorithms used by the Agent to effectively and efficiently select a model from the repository as changes in context occur. [Note: A key task of the Higher Level Learning Algorithm (defined below) is to train the Agent to learn Component B.]
20 Selection: the act of choosing/retrieving an appropriate element of the repository corresponding to the discerned context.
Higher-Level Learning Algorithm (HLLA): The reference level for the term "higher" is the case where the learning algorithms are applied directly to the design of optimal controllers (as in Learning Control), ones that would be accumulated in the repository (cf. Fig. 1). "Higher-Level" here means applying the learning method to create a strategy for selecting an appropriate controller from the repository, where the process of selection is optimized; thus, the focus of the learning process is at the next level up. Definition of the Utility function (a specific type of CF) is key for application of this process. Note: When the Contextual Hierarchy ideas mentioned in Section I are developed, more levels will be involved.
World Space (Agent centric): a vector space whose dimensions are associated with designated attributes of the Agent's relevant environment, its physical body, and the external CF. [Note: This definition is included for completeness. It is not explicitly used in this paper, but is used in related publications in terms of mappings from World Space to Context Space, e.g. [39].]
Guidelines: Parametric models/equations are used to represent the Plant, Criterion Function (CF), and Environment (for the latter, measurements may serve as parameters without an explicit model). Construct (conceptually) a Parameter Space that comprises three sub-spaces: (Plant, Environment, CF). The associated parameters serve as Context variables for the discernment activity; the Agent's Context Space may be a sub-space of Parameter Space. Controllers are also represented via parametric models.
23 To develop a feel for the weight update rule in the Adaptive Critic, consider a partial block diagram and a little math (discrete time). [Block diagram labels: R(t), Controller ($w_{ij}$), u(t), PLANT, R(t+1), Critic, J(t+1)] We desire a training Delta Rule for $w_{ij}$ that minimizes the cost-to-go $J(t)$. Obtain this via $\partial J(t) / \partial w_{ij}(t)$ and the chain rule of differentiation.
24 Family of Adaptive Critic Methods: The critic approximates either 1) $J(t)$: Heuristic Dynamic Programming (HDP) (cf. Q-Learning), or 2) the gradient of $J(t)$ with respect to the state vector $R(t)$, $\nabla J(R)$: Dual Heuristic Programming (DHP), with $\nabla J(R(t)) \equiv \lambda(t)$. [Today, focus on DHP]
25 Overview of Adaptive Critic method: The control engineer provides the design objectives / criteria for success through a Utility Function, $U(t)$ (local cost). Then a new utility function is defined (Bellman Eqn.),
$$J(t) = \sum_{k=0}^{\infty} \gamma^{k} \, U(t+k) \qquad \text{[cost-to-go]}$$
which is to be minimized [~ Dynamic Programming]. [We note: $J(t) = U(t) + \gamma J(t+1)$, the Bellman Recursion.]
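The relationship between the discounted sum and the Bellman recursion can be checked numerically. The sketch below is only an illustration under a stated assumption: the horizon is truncated to a finite utility sequence (with the cost-to-go taken as zero beyond it), whereas the slide's sum runs over an unbounded horizon. The utility values and discount factor are made up.

```python
import math

def cost_to_go_recursive(utilities, gamma):
    """J(t) for every t, computed backward via J(t) = U(t) + gamma * J(t+1)."""
    J = [0.0] * (len(utilities) + 1)  # J past the end of the horizon is 0
    for t in range(len(utilities) - 1, -1, -1):
        J[t] = utilities[t] + gamma * J[t + 1]
    return J[:-1]

def cost_to_go_direct(utilities, gamma, t):
    """J(t) computed directly as the discounted sum of U(t+k), k = 0, 1, ..."""
    return sum(gamma ** k * u for k, u in enumerate(utilities[t:]))

# Illustrative local costs U(0), ..., U(4) and discount factor.
U = [1.0, 0.5, 0.25, 0.0, 2.0]
gamma = 0.9

J = cost_to_go_recursive(U, gamma)
agree = all(math.isclose(J[t], cost_to_go_direct(U, gamma, t))
            for t in range(len(U)))
print(J, agree)
```

The backward recursion visits each time step once, whereas the direct sum recomputes the tail for every t; both give the same cost-to-go values.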
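26 The weights in the controller NN are updated with the objective of minimizing $J(t)$:
$$\Delta w_{ij}(t) = -\,\mathit{lcoef} \cdot \frac{\partial J(t)}{\partial w_{ij}(t)}$$
where
$$\frac{\partial J(t)}{\partial w_{ij}(t)} = \sum_{k=1}^{a} \frac{\partial J(t)}{\partial u_k(t)} \, \frac{\partial u_k(t)}{\partial w_{ij}(t)}$$
and
$$\frac{\partial J(t)}{\partial u_k(t)} = \frac{\partial U(t)}{\partial u_k(t)} + \frac{\partial J(t+1)}{\partial u_k(t)}$$
and
$$\frac{\partial J(t+1)}{\partial u_k(t)} = \sum_{s=1}^{n} \frac{\partial J(t+1)}{\partial R_s(t+1)} \, \frac{\partial R_s(t+1)}{\partial u_k(t)}$$
Call the term $\partial J(t+1) / \partial R_s(t+1)$ by the name $\lambda_s(t+1)$ (to be the output of the critic).
27 It follows that Controller training is based on:
$$\frac{\partial J(t)}{\partial u_k(t)} = \frac{\partial U(t)}{\partial u_k(t)} + \sum_{s=1}^{n} \frac{\partial J(t+1)}{\partial R_s(t+1)} \, \frac{\partial R_s(t+1)}{\partial u_k(t)}$$
where $\partial J(t+1) / \partial R_s(t+1)$ is obtained via the CRITIC and $\partial R_s(t+1) / \partial u_k(t)$ via the Plant Model. Similarly, Critic training is based on:
$$\frac{\partial J(t)}{\partial R_s(t)} = \frac{dU(t)}{dR_s(t)} + \sum_{k=1}^{n} \frac{\partial J(t+1)}{\partial R_k(t+1)} \left[ \frac{\partial R_k(t+1)}{\partial R_s(t)} + \sum_{m} \frac{\partial R_k(t+1)}{\partial u_m(t)} \, \frac{\partial u_m(t)}{\partial R_s(t)} \right]$$
where the partial derivatives of $R_k(t+1)$ are obtained via the Plant Model. [Bellman Recursion & Chain Rule used in above.] A plant model is needed to calculate the partial derivatives for DHP.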
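To make the controller-training expression concrete, the following sketch evaluates it for one time step with made-up numbers. Several things here are assumptions for illustration and not from the slides: the controller is taken to be linear, u_k = sum_j w[k][j] * R[j], so that the derivative of u_k with respect to w[i][j] is R[j] when k equals i and zero otherwise; the update uses the usual gradient-descent sign; and the critic output, plant Jacobian, and utility gradient are arbitrary values rather than outputs of trained networks.

```python
import math

def dJ_du(dU_du, lam_next, dR_du):
    """dJ(t)/du_k = dU(t)/du_k + sum_s lambda_s(t+1) * dR_s(t+1)/du_k.

    lam_next: critic output, length n (one entry per state variable).
    dR_du:    plant-model Jacobian, n rows by a columns.
    dU_du:    utility gradient with respect to the controls, length a.
    """
    n, a = len(lam_next), len(dU_du)
    return [dU_du[k] + sum(lam_next[s] * dR_du[s][k] for s in range(n))
            for k in range(a)]

def delta_w_linear_controller(grad_u, R, lcoef):
    """Gradient-descent step for an assumed linear controller u = W R.

    Since du_k/dw_ij = R[j] if k == i else 0, the chain-rule sum over k
    collapses to delta_w[i][j] = -lcoef * dJ/du_i * R[j].
    """
    return [[-lcoef * g * r for r in R] for g in grad_u]

# Hypothetical values: n = 2 state variables, a = 1 control.
lam_next = [0.5, -1.0]      # lambda(t+1), as if produced by the critic
dU_du = [0.2]               # dU(t)/du(t)
dR_du = [[1.0], [0.1]]      # dR(t+1)/du(t), as if from the plant model
R = [2.0, -1.0]             # current state R(t)

grad_u = dJ_du(dU_du, lam_next, dR_du)
dw = delta_w_linear_controller(grad_u, R, lcoef=0.1)
print(grad_u, dw)
```

The critic supplies only the vector lambda(t+1); everything else in the controller update comes from the utility function and the plant model, which is why DHP needs a differentiable plant model.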
28 Utility Functions for three Design Scenarios [different combinations of above criteria]: 1. U(1,2,3) 2. U(1,2,3,5) 3. U(1,2,3,4,5). All applied to the task of designing a controller for an autonomous 2-axle terrestrial vehicle.
32 Design Scenario 2: Add Criterion 5 ("friction sense") in U2. This is intended to 1) allow aggressive lane changes on dry pavement, and 2) make lane changes under icy road conditions as aggressively as the icy road will allow. [This was our first foray into use of a CONTEXT variable: this one via the Utility function.]
35 Conclusions from Utility Function Experiments: Controller designs resulting from DHP satisfy the intuitive sense of being good: each looks and feels like one a human designer might have designed. The control engineer knows that controller design requires careful specification of the objective, and that as the design criteria change, the controller changes. For DHP, control objectives are contained in the Utility Function. The DHP process embodied the different requirements for the three design scenarios in qualitatively distinct controllers, all yielding intuitively good results according to the design constraints.
Short Course: Multiagent Systems Lecture 1: Basics Agents Environments Reinforcement Learning Multiagent Systems This course is about: Agents: Sensing, reasoning, acting Multiagent Systems: Distributed
More informationECE521 Lecture 7/8. Logistic Regression
ECE521 Lecture 7/8 Logistic Regression Outline Logistic regression (Continue) A single neuron Learning neural networks Multi-class classification 2 Logistic regression The output of a logistic regression
More informationReinforcement Learning Active Learning
Reinforcement Learning Active Learning Alan Fern * Based in part on slides by Daniel Weld 1 Active Reinforcement Learning So far, we ve assumed agent has a policy We just learned how good it is Now, suppose
More informationLecture 7 Artificial neural networks: Supervised learning
Lecture 7 Artificial neural networks: Supervised learning Introduction, or how the brain works The neuron as a simple computing element The perceptron Multilayer neural networks Accelerated learning in
More informationChapter 4: Dynamic Programming
Chapter 4: Dynamic Programming Objectives of this chapter: Overview of a collection of classical solution methods for MDPs known as dynamic programming (DP) Show how DP can be used to compute value functions,
More informationCS599 Lecture 1 Introduction To RL
CS599 Lecture 1 Introduction To RL Reinforcement Learning Introduction Learning from rewards Policies Value Functions Rewards Models of the Environment Exploitation vs. Exploration Dynamic Programming
More informationNeed for Deep Networks Perceptron. Can only model linear functions. Kernel Machines. Non-linearity provided by kernels
Need for Deep Networks Perceptron Can only model linear functions Kernel Machines Non-linearity provided by kernels Need to design appropriate kernels (possibly selecting from a set, i.e. kernel learning)
More informationChapter 6: Conclusion
Chapter 6: Conclusion As stated in Chapter 1, the aim of this study is to determine to what extent GIS software can be implemented in order to manage, analyze and visually illustrate an IT-network between
More informationLecture 2. 1 More N P-Compete Languages. Notes on Complexity Theory: Fall 2005 Last updated: September, Jonathan Katz
Notes on Complexity Theory: Fall 2005 Last updated: September, 2005 Jonathan Katz Lecture 2 1 More N P-Compete Languages It will be nice to find more natural N P-complete languages. To that end, we ine
More information1 Differentiable manifolds and smooth maps. (Solutions)
1 Differentiable manifolds and smooth maps Solutions Last updated: March 17 2011 Problem 1 The state of the planar pendulum is entirely defined by the position of its moving end in the plane R 2 Since
More informationThe Markov Decision Process Extraction Network
The Markov Decision Process Extraction Network Siegmund Duell 1,2, Alexander Hans 1,3, and Steffen Udluft 1 1- Siemens AG, Corporate Research and Technologies, Learning Systems, Otto-Hahn-Ring 6, D-81739
More informationRecurrent Neural Networks 2. CS 287 (Based on Yoav Goldberg s notes)
Recurrent Neural Networks 2 CS 287 (Based on Yoav Goldberg s notes) Review: Representation of Sequence Many tasks in NLP involve sequences w 1,..., w n Representations as matrix dense vectors X (Following
More informationΝεςπο-Ασαυήρ Υπολογιστική Neuro-Fuzzy Computing
Νεςπο-Ασαυήρ Υπολογιστική Neuro-Fuzzy Computing ΗΥ418 Διδάσκων Δημήτριος Κατσαρός @ Τμ. ΗΜΜΥ Πανεπιστήμιο Θεσσαλίαρ Διάλεξη 21η BackProp for CNNs: Do I need to understand it? Why do we have to write the
More informationCSE 417T: Introduction to Machine Learning. Final Review. Henry Chai 12/4/18
CSE 417T: Introduction to Machine Learning Final Review Henry Chai 12/4/18 Overfitting Overfitting is fitting the training data more than is warranted Fitting noise rather than signal 2 Estimating! "#$
More informationAPPLICATION OF A KERNEL METHOD IN MODELING FRICTION DYNAMICS
APPLICATION OF A KERNEL METHOD IN MODELING FRICTION DYNAMICS Yufeng Wan, Chian X. Wong, Tony J. Dodd, Robert F. Harrison Department of Automatic Control and Systems Engineering, The University of Sheffield,
More informationML4NLP Multiclass Classification
ML4NLP Multiclass Classification CS 590NLP Dan Goldwasser Purdue University dgoldwas@purdue.edu Social NLP Last week we discussed the speed-dates paper. Interesting perspective on NLP problems- Can we
More informationSequences and infinite series
Sequences and infinite series D. DeTurck University of Pennsylvania March 29, 208 D. DeTurck Math 04 002 208A: Sequence and series / 54 Sequences The lists of numbers you generate using a numerical method
More informationLecture 3: The Reinforcement Learning Problem
Lecture 3: The Reinforcement Learning Problem Objectives of this lecture: describe the RL problem we will be studying for the remainder of the course present idealized form of the RL problem for which
More informationPolicy Gradient Reinforcement Learning for Robotics
Policy Gradient Reinforcement Learning for Robotics Michael C. Koval mkoval@cs.rutgers.edu Michael L. Littman mlittman@cs.rutgers.edu May 9, 211 1 Introduction Learning in an environment with a continuous
More informationDesigning and Evaluating Generic Ontologies
Designing and Evaluating Generic Ontologies Michael Grüninger Department of Industrial Engineering University of Toronto gruninger@ie.utoronto.ca August 28, 2007 1 Introduction One of the many uses of
More informationIntegrated CME Project Mathematics I-III 2013
A Correlation of -III To the North Carolina High School Mathematics Math I A Correlation of, -III, Introduction This document demonstrates how, -III meets the standards of the Math I. Correlation references
More informationDecision Theory: Markov Decision Processes
Decision Theory: Markov Decision Processes CPSC 322 Lecture 33 March 31, 2006 Textbook 12.5 Decision Theory: Markov Decision Processes CPSC 322 Lecture 33, Slide 1 Lecture Overview Recap Rewards and Policies
More informationModel-Based Reinforcement Learning with Continuous States and Actions
Marc P. Deisenroth, Carl E. Rasmussen, and Jan Peters: Model-Based Reinforcement Learning with Continuous States and Actions in Proceedings of the 16th European Symposium on Artificial Neural Networks
More informationA Review of Kuiper s: Spatial Semantic Hierarchy
A Review of Kuiper s: Spatial Semantic Hierarchy Okuary Osechas Comp-150: Behavior Based Robotics 4 November 2010 Outline Introduction 1 Introduction 2 Summary of ideas 3 Ontological Levels 4 Interfacing
More informationA Probabilistic Relational Model for Characterizing Situations in Dynamic Multi-Agent Systems
A Probabilistic Relational Model for Characterizing Situations in Dynamic Multi-Agent Systems Daniel Meyer-Delius 1, Christian Plagemann 1, Georg von Wichert 2, Wendelin Feiten 2, Gisbert Lawitzky 2, and
More informationGrundlagen der Künstlichen Intelligenz
Grundlagen der Künstlichen Intelligenz Formal models of interaction Daniel Hennes 27.11.2017 (WS 2017/18) University Stuttgart - IPVS - Machine Learning & Robotics 1 Today Taxonomy of domains Models of
More informationComputational Intelligence Lecture 6: Associative Memory
Computational Intelligence Lecture 6: Associative Memory Farzaneh Abdollahi Department of Electrical Engineering Amirkabir University of Technology Fall 2011 Farzaneh Abdollahi Computational Intelligence
More information