Trajectory encoding methods in robotics
1 JOŽEF STEFAN INTERNATIONAL POSTGRADUATE SCHOOL Humanoid and Service Robotics Trajectory encoding methods in robotics Andrej Gams Jožef Stefan Institute Ljubljana, Slovenia
2 Outline DMP variations: adding additional terms; task-space DMPs; Compliant Movement Primitives; Probabilistic Movement Primitives; Interaction Primitives; GMM; GMR; HMM
3 Motivation and approaches Movement Primitives (MPs) are a well-established approach for representing movement policies in robotics Beneficial properties: generalization, temporal modulation, co-activation, sequencing, easy to encode, small number of parameters Typically only some aspects are included in different representations Authors add their own touch and functionality
4 Dynamic Movement Primitives Trajectory representation DMPs are not explicitly dependent on time. Every DoF is described by its own DMP.
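As a sketch of this representation, a single-DoF discrete DMP can be integrated with Euler steps as below. The gain values, basis-function choices and integration settings are illustrative assumptions, not the exact values from any one paper.

```python
import numpy as np

# Minimal discrete DMP for one DoF: transformation system driven by a
# phase-dependent forcing term, plus an exponentially decaying canonical system.
def dmp_rollout(y0, g, w, c, h, tau=1.0, alpha_z=25.0, beta_z=6.25,
                alpha_x=1.0, dt=0.001, T=1.0):
    y, z, x = y0, 0.0, 1.0
    ys = []
    for _ in range(int(T / dt)):
        psi = np.exp(-h * (x - c) ** 2)            # Gaussian basis functions
        f = (psi @ w) / (psi.sum() + 1e-10) * x    # forcing term, scaled by phase
        z_dot = (alpha_z * (beta_z * (g - y) - z) + f) / tau
        y_dot = z / tau
        x_dot = -alpha_x * x / tau                 # canonical system
        z += z_dot * dt
        y += y_dot * dt
        x += x_dot * dt
        ys.append(y)
    return np.array(ys)

# With zero weights the DMP behaves like a critically damped spring toward g.
traj = dmp_rollout(y0=0.0, g=1.0, w=np.zeros(10),
                   c=np.linspace(0, 1, 10), h=np.full(10, 100.0))
```

Learning the weights w from a demonstration (by regression on the required forcing term) then shapes the transient while the attractor still guarantees convergence to g.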
5 DMPs with additional terms
6 Obstacle avoidance A potential field is added to the differential equation (y: robot position, o: obstacle position, Φ: angle between the velocity of the robot and the vector from the robot tip to the obstacle):
τ ż = α_z (β_z (g − y) − z) + f(x, w) + C
C(y, Φ) = γ R ẏ Φ exp(−β Φ)
Φ = arccos( (o − y)ᵀ ẏ / (‖o − y‖ ‖ẏ‖) )
r = (o − y) × ẏ,  R: rotation matrix for a rotation by π/2 about the axis r/‖r‖
7 DMPs with additional terms at acceleration I Typically used for obstacle avoidance.
P. Pastor, L. Righetti, M. Kalakrishnan and S. Schaal, "Online movement adaptation based on previous sensor experiences," 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems, San Francisco, CA, 2011.
Transformation system: τ ż = α_z (β_z (g − y) − z) + f(x, w),  τ ẏ = z
Canonical system (phase x): τ ẋ = −α_x x
New transformation system: τ ż = α_z (β_z (g − y) − z) + f(x, w) + ζ
The coupling term ζ is added at the acceleration level and relies on the relation τ = H q̈.
8 DMPs with additional terms at acceleration II τ ż = α_z (β_z (g − y) − z) + f(x, w) + ζ
ζ as a force controller: τ = Jᵀ (F_des − F_mes)
ζ as a position controller: q̇_d = J⁺ ( ẋ_d + K_x (x_d − x) + K_i ∫_{t−Δt}^{t} (F_des − F_mes) dt )
ζ as a combined position and force controller
9 DMPs with additional terms at acceleration III
10 DMPs with additional terms at acceleration IV
11 Including joint limits Joint limits can be taken into account with a modification of the original differential equations: the velocity equation τ ẏ = z is replaced by a version scaled by the distance to the joint limit y_l, τ ẏ = z ρ (y_l − y)³, so that the velocity vanishes as the joint approaches its limit.
12 DMPs with additional terms at velocity I A coupling term C is added at the velocity level; originally applied for joint limits.
A. Gams, B. Nemec, A. J. Ijspeert and A. Ude, "Coupling Movement Primitives: Interaction With the Environment and Bimanual Tasks," IEEE Transactions on Robotics, vol. 30, no. 4, Aug. 2014.
Transformation system: τ ż = α_z (β_z (g − y) − z) + f(x, w),  τ ẏ = z
New transformation system: τ ẏ = z + C
C = k (F_d − F) + F_c(x), where the term F_c(x) is learned!
13 DMPs with additional terms at velocity II Learned using Iterative Learning Control (ILC) → Bojan's lectures.
e(j) = F_d(j) − F_i(j)
F_{c,i+1}(j) = Q ( F_{c,i}(j) + L e_i(j + 1) )
Stability can be shown (for a given parameter set). Applicable for contact with the environment or with people.
14 DMPs with additional terms at velocity III
15 Task-space DMPs I In task space we apply 1 DMP per DOF: 3 for position and 3 for orientation, for example Euler angles. Minimal orientation representations suffer from singularities; Cartesian-space DMPs therefore use unit quaternions q = u + v ∈ S³, the unit sphere in ℝ⁴.
16 Task-space DMPs II
τ η̇ = α_z (β_z 2 log(g_o * q̄) − η) + f_o(x)
τ q̇ = ½ η * q
τ ẋ = −α_x x
The nonlinear forcing term encoding an orientation trajectory {q_j, ω_j, ω̇_j} is given by
f_o(x) = ( Σ_{i=1}^N w_i Ψ_i(x) / Σ_{i=1}^N Ψ_i(x) ) x
Because the forcing term contains free parameters w_i ∈ ℝ³, the exponential map ℝ³ → S³ is used in the orientation integration to map back to a unit quaternion Δq ∈ S³. Orientation integration is defined as
q(t + Δt) = exp( (Δt/2) ω ) * q(t) = exp( (Δt/(2τ)) η ) * q(t)
A. Ude, B. Nemec, T. Petrič, and J. Morimoto, "Orientation in Cartesian space dynamic movement primitives," IEEE International Conference on Robotics and Automation (ICRA), Hong Kong, 2014.
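The quaternion machinery behind this integration rule can be sketched in a few lines. The [w, x, y, z] storage convention, the step size and the scaled-angular-velocity input are assumptions of this sketch:

```python
import numpy as np

# Quaternion exponential map and one orientation-integration step,
# q(t + dt) = exp(dt/(2*tau) * eta) * q(t). Quaternions are [w, x, y, z] arrays.
def quat_exp(r):
    """Map r in R^3 to a unit quaternion on S^3."""
    n = np.linalg.norm(r)
    if n < 1e-12:
        return np.array([1.0, 0.0, 0.0, 0.0])
    return np.concatenate(([np.cos(n)], np.sin(n) * r / n))

def quat_mul(a, b):
    """Hamilton product of two quaternions."""
    w1, v1 = a[0], a[1:]
    w2, v2 = b[0], b[1:]
    return np.concatenate(([w1 * w2 - v1 @ v2],
                           w1 * v2 + w2 * v1 + np.cross(v1, v2)))

def integrate_orientation(q, eta, tau, dt):
    """One step of the quaternion DMP orientation integration."""
    return quat_mul(quat_exp(dt / (2.0 * tau) * eta), q)

q = np.array([1.0, 0.0, 0.0, 0.0])                 # identity orientation
q_next = integrate_orientation(q, eta=np.array([0.0, 0.0, 1.0]),
                               tau=1.0, dt=0.01)
```

Because the update multiplies by an exponential of a pure rotation, the result stays exactly on the unit sphere S³, which is the whole point of avoiding direct Euler integration of q.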
17 Variations and extensions of Motor Primitives
18 Compliant Movement Primitives - CMPs I Compliant movement primitives are a combination of desired position trajectories and corresponding torque signals: h(t) = [q_d(t), τ_ff(t)]
Motion trajectories are encoded as Dynamic Movement Primitives (DMPs).
Transformation system: τ ż = α_z (β_z (g − y) − z) + f(s, w),  τ ẏ = z
Canonical system (phase s): τ ṡ = −α_x s
Forcing term: f(s, w) = ( Σ_i w_i Ψ_i(s) / Σ_i Ψ_i(s) ) s, with Ψ_i(s) = exp( −(s − c_i)² / (2 σ_i²) )
Torque trajectories are encoded as a linear combination of basis functions: τ_ff(s) = Σ_{i=1}^N w_{τ,i} ψ_i(s) / Σ_{i=1}^N ψ_i(s)
Both use the same phase signal to drive them.
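The torque encoding can be sketched as a least-squares fit of normalized basis-function weights to a demonstrated signal. The basis count, widths and the synthetic torque profile below are assumptions for illustration:

```python
import numpy as np

# Encode a torque signal as a normalized linear combination of radial basis
# functions of the phase s, fitting the weights by least squares.
def rbf_features(s, c, h):
    psi = np.exp(-h * (s[:, None] - c[None, :]) ** 2)
    return psi / psi.sum(axis=1, keepdims=True)   # normalized basis activations

s = np.linspace(1.0, 0.01, 200)          # phase samples, decaying from 1
tau_demo = np.sin(2 * np.pi * (1 - s))   # hypothetical demonstrated torque profile
Phi = rbf_features(s, c=np.linspace(0, 1, 25), h=np.full(25, 200.0))
w_tau, *_ = np.linalg.lstsq(Phi, tau_demo, rcond=None)
tau_hat = Phi @ w_tau                    # reconstructed feed-forward torque
```

At execution time the same phase s that drives the DMP indexes the basis functions, so position and torque primitives stay synchronized by construction.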
19 Compliant Movement Primitives - CMPs II Position and torque signals are used as the input to the robot impedance controller, given by
τ_u = K_q (q_d − q) + D (q̇_d − q̇) + f_dyn(q, q̇, q̈) + τ_ff
Low stiffness gains (K_q) result in compliant behavior but high trajectory-tracking errors. Adding a feedforward torque (τ_ff) compensates for the tracking errors while maintaining compliance. The feedforward torque is learned from demonstration. The additional torque compensates for the task-specific dynamics and/or the robot's flawed or non-existing dynamical model.
20 Compliant Movement Primitives - CMPs III Three-step process: 1. The motion trajectory q_d(t) is obtained by human demonstration. 2. Iterative learning of the torque primitive τ_ff(t); the learning update is based on the kinematic trajectory. 3. The movement and torque primitives are obtained, stored and possibly executed.
Petrič T., Gams A., Žlajpah L., Ude A., "Online learning of task-specific dynamics for periodic tasks," 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2014), September 14-18, 2014, Chicago, IL.
21 Compliant Movement Primitives - CMPs IV
22 Compliant Movement Primitives - CMPs V
23 Compliant Movement Primitives - CMPs VI
24 Compliant Movement Primitives - CMPs VII
25 Probabilistic Movement Primitives ProMPs I A probabilistic formulation for Motor Primitives (MPs); a framework for implementing the desirable properties of MPs: co-activation, modulation, optimality, coupling, learning, temporal modulation, rhythmic movements. ProMPs capture the variability of the demonstrations from a teacher as a probability distribution over trajectories.
A. Paraschos, C. Daniel, J. Peters, and G. Neumann, "Probabilistic Movement Primitives," Neural Information Processing Systems, 2013.
26 Probabilistic Movement Primitives ProMPs II To capture variance over the trajectories, we introduce a distribution over the weights. ProMPs parametrize the desired trajectory distribution of the primitive by a hierarchical Bayesian model with Gaussian distributions. This distribution is typically Gaussian, θ = {μ_w, Σ_w}. The trajectory distribution is computed by marginalizing over the weights, p(τ; θ) = ∫ p(τ | w) p(w; θ) dw. This defines the model.
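A minimal sketch of this model: project each demonstration onto basis-function weights by least squares, then fit a Gaussian (μ_w, Σ_w) over the weights, which induces a mean and variance at every time step. The basis choices and the synthetic noisy-sine demonstrations are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 100)
c = np.linspace(0, 1, 15)
# Normalized Gaussian basis matrix Psi (time steps x basis functions).
Psi = np.exp(-((t[:, None] - c[None, :]) ** 2) / (2 * 0.05 ** 2))
Psi /= Psi.sum(axis=1, keepdims=True)

# Synthetic demonstrations: noisy sine trajectories.
demos = [np.sin(2 * np.pi * t) + 0.05 * rng.standard_normal(t.size)
         for _ in range(20)]
# Per-demonstration weights via least squares.
W = np.array([np.linalg.lstsq(Psi, d, rcond=None)[0] for d in demos])

mu_w = W.mean(axis=0)       # mean of the weight distribution
Sigma_w = np.cov(W.T)       # covariance of the weight distribution

# Induced trajectory mean and per-time-step variance.
mean_traj = Psi @ mu_w
var_traj = np.einsum('ti,ij,tj->t', Psi, Sigma_w, Psi)
```

Conditioning (e.g. on a via-point) then reduces to a Gaussian update of (μ_w, Σ_w), which is what makes the probabilistic formulation convenient.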
27 Probabilistic Movement Primitives ProMPs III To fully exploit the properties of trajectory distributions, a policy for controlling the robot is needed that reproduces these distributions. The robot is controlled with u = K_t y_t + k_t + ε_u, with feedback gain K_t, feed-forward component k_t, and system noise ε_u. The gains are computed from the mean and variance of the distribution for the current state and from the inverse of the input matrix B_t, where the linearized discrete-time system with system matrix A_t is y_{t+dt} = (I + A_t dt) y_t + B_t dt u + c_t dt.
Paraschos, A.; Daniel, C.; Peters, J.; Neumann, G. (2013). "Probabilistic Movement Primitives," Advances in Neural Information Processing Systems (NIPS), MIT Press.
28 Probabilistic Movement Primitives ProMPs IV Demonstrations (left), ProMP (right). The distribution is easily obtained from demonstrations. Many possibilities: conditioning, combination.
29 Probabilistic Movement Primitives ProMPs V
30 Interaction Primitives IPs I To engage in cooperative activities with human partners, robots have to possess basic interactive abilities and skills. Inspired by dynamic motor primitives. Probabilistic encoding of joint behavior; Bayesian reasoning to generate responses: a distribution over DMP parameters is used to infer the further movement. Generalization of the concept of imitation learning to human-robot interaction scenarios. Two frameworks: DMPs and ProMPs.
H. Ben Amor, G. Neumann, S. Kamthe, O. Kroemer, and J. Peters, "Interaction primitives for human-robot cooperation tasks," IEEE International Conference on Robotics and Automation, 2014.
31 Interaction Primitives IPs II A compact representation of a joint physical activity between two persons, used in human-robot interaction. An interaction primitive specifies how a person adapts his movements to the movements of the interaction partner, and vice versa. For example, in a handing-over task, the receiving person adapts his arm movements to the reaching motion of the person performing the handing-over.
32 Interaction Primitives IPs III
33 Interaction Primitives IPs IV Three steps: Phase estimation: the robot adapts its timing so that it matches the timing of the human partner (dynamic time warping). Predictive DMP distributions: predictions of the behavior of an agent given a partial trajectory, using a probabilistic approach. Correlating the agents: we condition only on the DoFs of the observed agent (the human).
34 Interaction Primitives IPs V The DMP parameters are collected in the parameter vector θ = [w_1ᵀ, g_1, …, w_Nᵀ, g_N], where N denotes the DOFs of the agent. Given parameter-vector samples θ_j from multiple demonstrations, the distribution over parameters p(θ) can be calculated. From the observed partial trajectory τ_o the likelihood p(τ_o | θ) is determined. Both are used to calculate the required updated parameter distribution p(θ | τ_o).
35 Interaction Primitives IPs VI The updated parameters for the cooperative movement need to be determined. The parameter vector is extended to incorporate the cooperative DMP parameters: θ = [θ_hᵀ, θ_rᵀ]ᵀ. The appropriate part of the estimated updated parameter vector can then be used to determine the cooperative DMP parameters.
36 Interaction Primitives IPs VII Phase estimation is needed to temporally align both agents; Dynamic Time Warping is used. Specifically, the accumulated cost matrix D is used to determine the frame in the reference movement
n* = argmin_n D(n, M),
where M denotes the number of frames in the observed trajectory. Frame n* produces minimal cost w.r.t. the observed query movement. The estimate of the current DMP phase is then inferred from the frame as x = exp(−α_x n*/N), where N denotes the number of frames in the reference trajectory. Which reference movement to use for comparison?
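The phase-estimation step can be sketched as follows; the 1-D trajectories, absolute-difference cost and α_x value are simplifying assumptions:

```python
import numpy as np

# Build the DTW accumulated cost matrix D between a reference movement and a
# partial observation, pick the reference frame n* = argmin_n D(n, M), and map
# it to a DMP phase value.
def dtw_cost(ref, obs):
    N, M = len(ref), len(obs)
    D = np.full((N + 1, M + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, N + 1):
        for j in range(1, M + 1):
            d = abs(ref[i - 1] - obs[j - 1])
            D[i, j] = d + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[1:, 1:]

ref = np.sin(np.linspace(0, np.pi, 50))      # full reference movement
obs = np.sin(np.linspace(0, np.pi / 2, 20))  # partial observation (rising half)
D = dtw_cost(ref, obs)
n_star = int(np.argmin(D[:, -1]))            # best-matching reference frame
alpha_x = 2.0
x = np.exp(-alpha_x * n_star / len(ref))     # estimated DMP phase
```

Since the observation covers only the rising half of the reference, n* lands near the middle of the reference, and the phase estimate x reflects how far the interaction has progressed.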
37 Interaction Primitives IPs VIII IPs on ProMPs: the whole interaction is recorded, i.e. two humans are recorded during the interaction, and the trajectories are stored as a ProMP. While the human performs the trajectory, the robot recalculates the new means and variances at every time step. The principal difference between DMP-based and ProMP-based IPs: in DMP-based IPs the weights are calculated from the forcing term, whereas in ProMP-based IPs the weights are calculated from the positions. For both approaches several demonstrations are needed so that the weight distribution can be determined. In DMP-based methods the forcing-term signal can be noisy and create high accelerations during execution.
38 Interaction Primitives IPs IX
39 Gaussian models
40 Gaussian Mixture Models I probabilistic model for representing normally distributed subpopulations within an overall population Example: modeling human height data, height is typically modeled as a normal distribution for each gender with a mean of approximately 5'10" for males and 5'5" for females One hint that data might follow a mixture model is that the data looks multimodal, i.e. there is more than one "peak" in the distribution of data.
41 Gaussian Mixture Models II A GMM is parameterized by two types of values: the mixture component weights and the component means and variances/covariances. For a Gaussian mixture model with K components, the k-th component has a mean μ_k and variance σ_k² in the univariate case, and a mean μ_k and covariance matrix Σ_k in the multivariate case. The mixture component weights are defined as φ_k for component C_k, with the constraint Σ_{i=1}^K φ_i = 1 so that the total probability distribution normalizes to 1.
42 Gaussian Mixture Models III One-dimensional Model Multi-dimensional Model
43 Gaussian Mixture Models IV If the number of components K is known, expectation maximization is the technique most commonly used to estimate the mixture model's parameters Expectation maximization (EM) is a numerical technique for maximum likelihood estimation, and is usually used when closed form expressions for updating the model parameters can be calculated Guaranteed to approach a local maximum
44 Expectation Maximization I Two steps: Expectation (E) and Maximization (M) Expectation: calculating the expectation of the component assignments C k for each data point x i X given the model parameters φ k, μ k, σ k. Maximization: maximizing the expectations calculated in the E step with respect to the model parameters. This step consists of updating the values φ k, μ k, σ k. This iterative process repeats until the algorithm converges, giving a maximum likelihood estimate
45 Expectation Maximization II [figure: iterate Initialize → Expectation → Maximization]
46 Task-parameterized Gaussian mixture model Task-parameterized models of movements aim at automatically adapting movements to new situations encountered by a robot (generalization).
S. Calinon, "A tutorial on task-parameterized movement learning and retrieval," Intelligent Service Robotics, vol. 9, no. 1, pp. 1-29, 2016.
47 TP-GMM
48 TP-GMM II
49 Gaussian Mixture Regression I GMR is used to retrieve smooth generalized trajectories with associated covariance matrices describing the variations and correlations across the different variables. Given a set of predictor variables ξ_I and response variables ξ_O, it estimates the conditional expectation of ξ_O given ξ_I on the basis of a set of observations. GMR offers a way of extracting a single generalized trajectory made up from the set of trajectories used to train the model; the generalized trajectory is not part of the dataset but instead encapsulates all of its essential features.
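A minimal sketch of GMR with a 1-D input and 1-D output: the conditional expectation is a responsibility-weighted sum of per-component linear regressions. The mixture parameters below are illustrative assumptions, not learned values:

```python
import numpy as np

# Gaussian mixture regression for a GMM over (input, output) pairs.
def gmr(x, phi, mu, Sigma):
    """mu[k] = [mu_in, mu_out]; Sigma[k] = 2x2 joint covariance of component k."""
    # Responsibilities h_k(x) from the input marginals.
    h = np.array([
        p * np.exp(-(x - m[0]) ** 2 / (2 * S[0, 0])) / np.sqrt(2 * np.pi * S[0, 0])
        for p, m, S in zip(phi, mu, Sigma)
    ])
    h /= h.sum()
    # Per-component conditional means (linear in x).
    cond = [m[1] + S[1, 0] / S[0, 0] * (x - m[0]) for m, S in zip(mu, Sigma)]
    return float(h @ np.array(cond))

phi = [0.5, 0.5]
mu = [np.array([0.0, 0.0]), np.array([4.0, 2.0])]
Sigma = [np.eye(2), np.eye(2)]
y0 = gmr(0.0, phi, mu, Sigma)   # near the first component: output close to 0
y4 = gmr(4.0, phi, mu, Sigma)   # near the second component: output close to 2
```

Evaluating this along a time or phase input yields the smooth generalized trajectory; the same component quantities also give the conditional covariance describing the allowed variation.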
50 Gaussian Mixture Regression II Calinon, S. and Billard, A. (2009). Statistical Learning by Imitation of Competing Constraints in Joint Space and Task Space. RSJ Advanced Robotics, 23,
51 Gaussian Mixture Regression III Calinon, S. and Billard, A. (2009). Statistical Learning by Imitation of Competing Constraints in Joint Space and Task Space. RSJ Advanced Robotics, 23,
52 Markov Models
53 Markov models I A Markov model is a stochastic model used to model randomly changing systems. Future states depend only on the current state, not on the events that occurred before it → it assumes the Markov property. This assumption enables reasoning and computation with the model that would otherwise be intractable. Markov chain: the simplest Markov model. It models the state of a system with a random variable that changes through time; the Markov property implies that the distribution of this variable depends only on the distribution of the previous state. A Markov process transitions between states: at every time step t, the process moves from one state to another with a probability assigned to each pair of states.
54 Markov models II The transition probability is independent of the previous states visited by the process (Markov property). The probabilities are encoded in the so-called transition matrix A, with dimensions N_s × N_s, where N_s denotes the number of states. Each element a_{i,j} defines the probability of transitioning to state s_j given the current state s_i. A prior distribution of states is needed to completely define a Markov chain; it defines the possible states and their probabilities at time t = 0. The prior distribution is usually denoted π and is of size N_s × 1. A Markov chain is therefore defined by λ_MC = {π, A}.
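A minimal sketch of a Markov chain λ_MC = {π, A} and the propagation of its state distribution (the two-state transition matrix is an illustrative choice):

```python
import numpy as np

# Two-state Markov chain: rows of A sum to 1; the distribution at time t
# is obtained by repeated multiplication p_{t+1} = p_t @ A.
A = np.array([[0.9, 0.1],
              [0.5, 0.5]])
pi0 = np.array([1.0, 0.0])   # prior: start in state 0 with certainty

p = pi0.copy()
for _ in range(100):
    p = p @ A                # propagate the distribution one step

# For this A the chain converges to its stationary distribution (5/6, 1/6),
# obtained by solving p = p @ A with p summing to 1.
```

This propagation is exactly what the forward recursions of the HMM algorithms build on, with an extra emission term per step.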
55 Hidden Markov Model HMM I A hidden Markov model is a Markov chain whose states are concealed from the outside observer. Every time the process visits a (hidden) state, it outputs a symbol. This symbol depends on an output probability distribution, which is defined for the given state. Discrete case: similar to the transition probabilities, an output probability matrix B. Continuous case: the columns B_i, corresponding to the output probabilities of state s_i, are replaced with continuous probability density functions, most commonly mixtures of Gaussians, where μ_{i,k} is the mean and Σ_{i,k} the covariance of the k-th mixture component. Thus, a hidden Markov model is defined as λ_HMM = {π, A, B}.
56 Hidden Markov Model HMM II Apart from its recognition capabilities, a single model can also be used for motion generation, thus mimicking human learning processes. Classical HMM problems: Evaluation, for recognition of the movement: choosing the model that best matches the observation. Decoding: finding the state sequence that best explains the observation. Parameter estimation.
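The evaluation problem can be sketched with the forward algorithm for a discrete HMM; the model parameters below are illustrative, not taken from any experiment:

```python
import numpy as np

# Forward algorithm: likelihood of an observation sequence under {pi, A, B}.
# alpha_t(i) accumulates the probability of the prefix ending in state i.
def forward(obs, pi, A, B):
    alpha = pi * B[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
    return alpha.sum()

pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],    # state 0 mostly emits symbol 0
              [0.2, 0.8]])   # state 1 mostly emits symbol 1
p = forward([0, 0, 1], pi, A, B)
```

For recognition, this likelihood is evaluated under each candidate model and the model with the highest value is chosen; decoding replaces the sum over states with a max (the Viterbi algorithm).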
57 Parametric Hidden Markov Model - PHMM An extension of the continuous HMM where the output probability density function is a Gaussian with one kernel. The output probability density function b_i depends on means and covariances. In the parametric HMM framework the PDFs are defined to depend also on another, open parameter, denoted θ: μ_i = W_i θ + b_i, where W_i denotes the matrix of coefficients defining the linear mapping and b_i is the y-intercept for μ_i. λ_PHMM = {π, A, B, W, b}.
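The parametric mean can be sketched directly; W_i, b_i and θ below are hypothetical values, not learned ones:

```python
import numpy as np

# Parametric output mean mu_i = W_i @ theta + b_i: a single trained PHMM
# shifts its Gaussian means with the task parameter theta
# (here a hypothetical 2-D object position).
W_i = np.eye(2)               # linear-mapping matrix (assumed identity here)
b_i = np.array([0.1, -0.2])   # offset (y-intercept) for mu_i
theta = np.array([0.5, 0.3])  # open task parameter, e.g. object position
mu_i = W_i @ theta + b_i      # state mean adapted to the task
```

Changing θ slides every state's emission Gaussian accordingly, which is what lets one model cover a family of spatially shifted movements.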
58 PHMM - example Consider a Cartesian trajectory for a reaching movement. The trajectory needs to be spatially shifted depending on the position of the object being reached. This can easily be encoded with a PHMM: θ would in this case represent the position of the object, and W and b would be learned in such a way that μ of the final state always corresponds to the object's position. In some cases a linear mapping does not suffice to correctly model the variations in the data; neural networks can be used to describe it instead. The parameters {π, A, B, W, b} need to be estimated from example sequences of a particular class; an expectation-maximization procedure, the so-called Baum-Welch algorithm, can be used.
The geometry of Gaussian processes and Bayesian optimization. Contal CMLA, ENS Cachan Background: Global Optimization and Gaussian Processes The Geometry of Gaussian Processes and the Chaining Trick Algorithm
More informationLecture 16 Deep Neural Generative Models
Lecture 16 Deep Neural Generative Models CMSC 35246: Deep Learning Shubhendu Trivedi & Risi Kondor University of Chicago May 22, 2017 Approach so far: We have considered simple models and then constructed
More informationGentle Introduction to Infinite Gaussian Mixture Modeling
Gentle Introduction to Infinite Gaussian Mixture Modeling with an application in neuroscience By Frank Wood Rasmussen, NIPS 1999 Neuroscience Application: Spike Sorting Important in neuroscience and for
More informationHuman Mobility Pattern Prediction Algorithm using Mobile Device Location and Time Data
Human Mobility Pattern Prediction Algorithm using Mobile Device Location and Time Data 0. Notations Myungjun Choi, Yonghyun Ro, Han Lee N = number of states in the model T = length of observation sequence
More informationADVANCED MACHINE LEARNING ADVANCED MACHINE LEARNING. Non-linear regression techniques Part - II
1 Non-linear regression techniques Part - II Regression Algorithms in this Course Support Vector Machine Relevance Vector Machine Support vector regression Boosting random projections Relevance vector
More informationInformation geometry for bivariate distribution control
Information geometry for bivariate distribution control C.T.J.Dodson + Hong Wang Mathematics + Control Systems Centre, University of Manchester Institute of Science and Technology Optimal control of stochastic
More informationLecture 11: Hidden Markov Models
Lecture 11: Hidden Markov Models Cognitive Systems - Machine Learning Cognitive Systems, Applied Computer Science, Bamberg University slides by Dr. Philip Jackson Centre for Vision, Speech & Signal Processing
More informationPattern Recognition and Machine Learning
Christopher M. Bishop Pattern Recognition and Machine Learning ÖSpri inger Contents Preface Mathematical notation Contents vii xi xiii 1 Introduction 1 1.1 Example: Polynomial Curve Fitting 4 1.2 Probability
More informationMachine Learning Techniques for Computer Vision
Machine Learning Techniques for Computer Vision Part 2: Unsupervised Learning Microsoft Research Cambridge x 3 1 0.5 0.2 0 0.5 0.3 0 0.5 1 ECCV 2004, Prague x 2 x 1 Overview of Part 2 Mixture models EM
More informationA Higher-Order Interactive Hidden Markov Model and Its Applications Wai-Ki Ching Department of Mathematics The University of Hong Kong
A Higher-Order Interactive Hidden Markov Model and Its Applications Wai-Ki Ching Department of Mathematics The University of Hong Kong Abstract: In this talk, a higher-order Interactive Hidden Markov Model
More informationLinear Dynamical Systems
Linear Dynamical Systems Sargur N. srihari@cedar.buffalo.edu Machine Learning Course: http://www.cedar.buffalo.edu/~srihari/cse574/index.html Two Models Described by Same Graph Latent variables Observations
More informationINF 5860 Machine learning for image classification. Lecture 14: Reinforcement learning May 9, 2018
Machine learning for image classification Lecture 14: Reinforcement learning May 9, 2018 Page 3 Outline Motivation Introduction to reinforcement learning (RL) Value function based methods (Q-learning)
More informationLecture 4: Hidden Markov Models: An Introduction to Dynamic Decision Making. November 11, 2010
Hidden Lecture 4: Hidden : An Introduction to Dynamic Decision Making November 11, 2010 Special Meeting 1/26 Markov Model Hidden When a dynamical system is probabilistic it may be determined by the transition
More informationLearning Dynamical System Modulation for Constrained Reaching Tasks
In Proceedings of the IEEE-RAS International Conference on Humanoid Robots (HUMANOIDS'6) Learning Dynamical System Modulation for Constrained Reaching Tasks Micha Hersch, Florent Guenter, Sylvain Calinon
More informationHidden Markov Models Part 2: Algorithms
Hidden Markov Models Part 2: Algorithms CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington 1 Hidden Markov Model An HMM consists of:
More informationModeling Multiple-mode Systems with Predictive State Representations
Modeling Multiple-mode Systems with Predictive State Representations Britton Wolfe Computer Science Indiana University-Purdue University Fort Wayne wolfeb@ipfw.edu Michael R. James AI and Robotics Group
More informationRecent Advances in Bayesian Inference Techniques
Recent Advances in Bayesian Inference Techniques Christopher M. Bishop Microsoft Research, Cambridge, U.K. research.microsoft.com/~cmbishop SIAM Conference on Data Mining, April 2004 Abstract Bayesian
More informationBayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2014
Bayesian Networks: Construction, Inference, Learning and Causal Interpretation Volker Tresp Summer 2014 1 Introduction So far we were mostly concerned with supervised learning: we predicted one or several
More informationNonparametric Bayesian Methods (Gaussian Processes)
[70240413 Statistical Machine Learning, Spring, 2015] Nonparametric Bayesian Methods (Gaussian Processes) Jun Zhu dcszj@mail.tsinghua.edu.cn http://bigml.cs.tsinghua.edu.cn/~jun State Key Lab of Intelligent
More informationMulti-task Learning with Gaussian Processes, with Applications to Robot Inverse Dynamics
1 / 38 Multi-task Learning with Gaussian Processes, with Applications to Robot Inverse Dynamics Chris Williams with Kian Ming A. Chai, Stefan Klanke, Sethu Vijayakumar December 2009 Motivation 2 / 38 Examples
More informationHuman Pose Tracking I: Basics. David Fleet University of Toronto
Human Pose Tracking I: Basics David Fleet University of Toronto CIFAR Summer School, 2009 Looking at People Challenges: Complex pose / motion People have many degrees of freedom, comprising an articulated
More informationAugust 17, 2017 Estimation of Phases for Compliant Motion
August 17, 2017 Estimation of Phases for Compliant Motion Tesfamichael Marikos Hagos School of Electrical Engineering Thesis submitted for examination for the degree of Master of Science in Technology.
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 7 Approximate
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 7 Approximate
More informationStatistical learning. Chapter 20, Sections 1 4 1
Statistical learning Chapter 20, Sections 1 4 Chapter 20, Sections 1 4 1 Outline Bayesian learning Maximum a posteriori and maximum likelihood learning Bayes net learning ML parameter learning with complete
More informationAdministration. CSCI567 Machine Learning (Fall 2018) Outline. Outline. HW5 is available, due on 11/18. Practice final will also be available soon.
Administration CSCI567 Machine Learning Fall 2018 Prof. Haipeng Luo U of Southern California Nov 7, 2018 HW5 is available, due on 11/18. Practice final will also be available soon. Remaining weeks: 11/14,
More informationBayesian Networks BY: MOHAMAD ALSABBAGH
Bayesian Networks BY: MOHAMAD ALSABBAGH Outlines Introduction Bayes Rule Bayesian Networks (BN) Representation Size of a Bayesian Network Inference via BN BN Learning Dynamic BN Introduction Conditional
More informationAn Evolutionary Programming Based Algorithm for HMM training
An Evolutionary Programming Based Algorithm for HMM training Ewa Figielska,Wlodzimierz Kasprzak Institute of Control and Computation Engineering, Warsaw University of Technology ul. Nowowiejska 15/19,
More informationL23: hidden Markov models
L23: hidden Markov models Discrete Markov processes Hidden Markov models Forward and Backward procedures The Viterbi algorithm This lecture is based on [Rabiner and Juang, 1993] Introduction to Speech
More informationUndirected Graphical Models
Outline Hong Chang Institute of Computing Technology, Chinese Academy of Sciences Machine Learning Methods (Fall 2012) Outline Outline I 1 Introduction 2 Properties Properties 3 Generative vs. Conditional
More informationAfternoon Meeting on Bayesian Computation 2018 University of Reading
Gabriele Abbati 1, Alessra Tosi 2, Seth Flaxman 3, Michael A Osborne 1 1 University of Oxford, 2 Mind Foundry Ltd, 3 Imperial College London Afternoon Meeting on Bayesian Computation 2018 University of
More informationActive Policy Iteration: Efficient Exploration through Active Learning for Value Function Approximation in Reinforcement Learning
Active Policy Iteration: fficient xploration through Active Learning for Value Function Approximation in Reinforcement Learning Takayuki Akiyama, Hirotaka Hachiya, and Masashi Sugiyama Department of Computer
More informationSENSOR-ASSISTED ADAPTIVE MOTOR CONTROL UNDER CONTINUOUSLY VARYING CONTEXT
SENSOR-ASSISTED ADAPTIVE MOTOR CONTROL UNDER CONTINUOUSLY VARYING CONTEXT Heiko Hoffmann, Georgios Petkos, Sebastian Bitzer, and Sethu Vijayakumar Institute of Perception, Action and Behavior, School of
More informationPractical Bayesian Optimization of Machine Learning. Learning Algorithms
Practical Bayesian Optimization of Machine Learning Algorithms CS 294 University of California, Berkeley Tuesday, April 20, 2016 Motivation Machine Learning Algorithms (MLA s) have hyperparameters that
More informationSum-Product Networks. STAT946 Deep Learning Guest Lecture by Pascal Poupart University of Waterloo October 17, 2017
Sum-Product Networks STAT946 Deep Learning Guest Lecture by Pascal Poupart University of Waterloo October 17, 2017 Introduction Outline What is a Sum-Product Network? Inference Applications In more depth
More informationRelevance Vector Machines for Earthquake Response Spectra
2012 2011 American American Transactions Transactions on on Engineering Engineering & Applied Applied Sciences Sciences. American Transactions on Engineering & Applied Sciences http://tuengr.com/ateas
More informationPredictive analysis on Multivariate, Time Series datasets using Shapelets
1 Predictive analysis on Multivariate, Time Series datasets using Shapelets Hemal Thakkar Department of Computer Science, Stanford University hemal@stanford.edu hemal.tt@gmail.com Abstract Multivariate,
More informationHidden Markov Models and Gaussian Mixture Models
Hidden Markov Models and Gaussian Mixture Models Hiroshi Shimodaira and Steve Renals Automatic Speech Recognition ASR Lectures 4&5 25&29 January 2018 ASR Lectures 4&5 Hidden Markov Models and Gaussian
More informationImitation Learning of Globally Stable Non-Linear Point-to-Point Robot Motions using Nonlinear Programming
Imitation Learning of Globally Stable Non-Linear Point-to-Point Robot Motions using Nonlinear Programming S. Mohammad Khansari-Zadeh and Aude Billard Abstract This paper presents a methodology for learning
More informationSTATS 306B: Unsupervised Learning Spring Lecture 5 April 14
STATS 306B: Unsupervised Learning Spring 2014 Lecture 5 April 14 Lecturer: Lester Mackey Scribe: Brian Do and Robin Jia 5.1 Discrete Hidden Markov Models 5.1.1 Recap In the last lecture, we introduced
More informationA Probabilistic Relational Model for Characterizing Situations in Dynamic Multi-Agent Systems
A Probabilistic Relational Model for Characterizing Situations in Dynamic Multi-Agent Systems Daniel Meyer-Delius 1, Christian Plagemann 1, Georg von Wichert 2, Wendelin Feiten 2, Gisbert Lawitzky 2, and
More informationLearning from Sequential and Time-Series Data
Learning from Sequential and Time-Series Data Sridhar Mahadevan mahadeva@cs.umass.edu University of Massachusetts Sridhar Mahadevan: CMPSCI 689 p. 1/? Sequential and Time-Series Data Many real-world applications
More informationShankar Shivappa University of California, San Diego April 26, CSE 254 Seminar in learning algorithms
Recognition of Visual Speech Elements Using Adaptively Boosted Hidden Markov Models. Say Wei Foo, Yong Lian, Liang Dong. IEEE Transactions on Circuits and Systems for Video Technology, May 2004. Shankar
More informationA Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models
A Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models Jeff A. Bilmes (bilmes@cs.berkeley.edu) International Computer Science Institute
More information