Trajectory encoding methods in robotics
1 JOŽEF STEFAN INTERNATIONAL POSTGRADUATE SCHOOL Humanoid and Service Robotics Trajectory encoding methods in robotics Andrej Gams Jožef Stefan Institute Ljubljana, Slovenia
2 Outline DMP variations: adding additional terms; task-space DMPs; Compliant Movement Primitives; Probabilistic Movement Primitives; Interaction Primitives; GMM; GMR; HMM
3 Motivation and approaches Movement Primitives (MPs) are a well-established approach for representing movement policies in robotics Beneficial properties: generalization, temporal modulation, co-activation, sequencing, easy to encode, small number of parameters Typically only some aspects are included in different representations Authors add their own touch and functionality
4 Dynamic Movement Primitives Trajectory representation DMPs are not explicitly dependent on time. Every DoF is described by its own DMP.
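As a sketch of this representation, a single-DoF discrete DMP can be integrated with Euler steps as below. The gain values, basis-function choices and integration settings are illustrative assumptions, not the exact values from any one paper.

```python
import numpy as np

# Minimal discrete DMP for one DoF: transformation system driven by a
# phase-dependent forcing term, plus an exponentially decaying canonical system.
def dmp_rollout(y0, g, w, c, h, tau=1.0, alpha_z=25.0, beta_z=6.25,
                alpha_x=1.0, dt=0.001, T=1.0):
    y, z, x = y0, 0.0, 1.0
    ys = []
    for _ in range(int(T / dt)):
        psi = np.exp(-h * (x - c) ** 2)            # Gaussian basis functions
        f = (psi @ w) / (psi.sum() + 1e-10) * x    # forcing term, scaled by phase
        z_dot = (alpha_z * (beta_z * (g - y) - z) + f) / tau
        y_dot = z / tau
        x_dot = -alpha_x * x / tau                 # canonical system
        z += z_dot * dt
        y += y_dot * dt
        x += x_dot * dt
        ys.append(y)
    return np.array(ys)

# With zero weights the DMP behaves like a critically damped spring toward g.
traj = dmp_rollout(y0=0.0, g=1.0, w=np.zeros(10),
                   c=np.linspace(0, 1, 10), h=np.full(10, 100.0))
```

Learning the weights w from a demonstration (by regression on the required forcing term) then shapes the transient while the attractor still guarantees convergence to g.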
5 DMPs with additional terms
6 Obstacle avoidance A potential field is added to the differential equation (y: robot position, o: obstacle position, Φ: angle between the velocity of the robot and the vector from the robot tip to the obstacle):
τ ż = α_z (β_z (g − y) − z) + f(x, w) + C
C(y, Φ) = γ R ẏ Φ exp(−β Φ)
Φ = arccos( (o − y)ᵀ ẏ / (‖o − y‖ ‖ẏ‖) )
r = (o − y) × ẏ,  R: rotation matrix for a rotation by π/2 about the axis r/‖r‖
7 DMPs with additional terms at acceleration I Typically used for obstacle avoidance.
P. Pastor, L. Righetti, M. Kalakrishnan and S. Schaal, "Online movement adaptation based on previous sensor experiences," 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems, San Francisco, CA, 2011.
Transformation system: τ ż = α_z (β_z (g − y) − z) + f(x, w),  τ ẏ = z
Canonical system (phase x): τ ẋ = −α_x x
New transformation system: τ ż = α_z (β_z (g − y) − z) + f(x, w) + ζ
The coupling term ζ is added at the acceleration level and relies on the relation τ = H q̈.
8 DMPs with additional terms at acceleration II τ ż = α_z (β_z (g − y) − z) + f(x, w) + ζ
ζ as a force controller: τ = Jᵀ (F_des − F_mes)
ζ as a position controller: q̇_d = J⁺ ( ẋ_d + K_x (x_d − x) + K_i ∫_{t−Δt}^{t} (F_des − F_mes) dt )
ζ as a combined position and force controller
9 DMPs with additional terms at acceleration III
10 DMPs with additional terms at acceleration IV
11 Including joint limits Joint limits can be taken into account with a modification of the original differential equations: the velocity equation τ ẏ = z is replaced by a version scaled by the distance to the joint limit y_l, τ ẏ = z ρ (y_l − y)³, so that the velocity vanishes as the joint approaches its limit.
12 DMPs with additional terms at velocity I A coupling term C is added at the velocity level; originally applied for joint limits.
A. Gams, B. Nemec, A. J. Ijspeert and A. Ude, "Coupling Movement Primitives: Interaction With the Environment and Bimanual Tasks," IEEE Transactions on Robotics, vol. 30, no. 4, Aug. 2014.
Transformation system: τ ż = α_z (β_z (g − y) − z) + f(x, w),  τ ẏ = z
New transformation system: τ ẏ = z + C
C = k (F_d − F) + F_c(x), where the term F_c(x) is learned!
13 DMPs with additional terms at velocity II Learned using Iterative Learning Control (ILC) → Bojan's lectures.
e(j) = F_d(j) − F_i(j)
F_{c,i+1}(j) = Q ( F_{c,i}(j) + L e_i(j + 1) )
Stability can be shown (for a given parameter set). Applicable for contact with the environment or with people.
14 DMPs with additional terms at velocity III
15 Task-space DMPs I In task space we apply 1 DMP per DOF: 3 for position and 3 for orientation, for example Euler angles. Minimal orientation representations suffer from singularities; Cartesian-space DMPs therefore use unit quaternions q = u + v ∈ S³, the unit sphere in ℝ⁴.
16 Task-space DMPs II
τ η̇ = α_z (β_z 2 log(g_o * q̄) − η) + f_o(x)
τ q̇ = ½ η * q
τ ẋ = −α_x x
The nonlinear forcing term encoding an orientation trajectory {q_j, ω_j, ω̇_j} is given by
f_o(x) = ( Σ_{i=1}^N w_i Ψ_i(x) / Σ_{i=1}^N Ψ_i(x) ) x
Because the forcing term contains free parameters w_i ∈ ℝ³, the exponential map ℝ³ → S³ is used in the orientation integration to map back to a unit quaternion Δq ∈ S³. Orientation integration is defined as
q(t + Δt) = exp( (Δt/2) ω ) * q(t) = exp( (Δt/(2τ)) η ) * q(t)
A. Ude, B. Nemec, T. Petrič, and J. Morimoto, "Orientation in Cartesian space dynamic movement primitives," IEEE International Conference on Robotics and Automation (ICRA), Hong Kong, 2014.
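The quaternion machinery behind this integration rule can be sketched in a few lines. The [w, x, y, z] storage convention, the step size and the scaled-angular-velocity input are assumptions of this sketch:

```python
import numpy as np

# Quaternion exponential map and one orientation-integration step,
# q(t + dt) = exp(dt/(2*tau) * eta) * q(t). Quaternions are [w, x, y, z] arrays.
def quat_exp(r):
    """Map r in R^3 to a unit quaternion on S^3."""
    n = np.linalg.norm(r)
    if n < 1e-12:
        return np.array([1.0, 0.0, 0.0, 0.0])
    return np.concatenate(([np.cos(n)], np.sin(n) * r / n))

def quat_mul(a, b):
    """Hamilton product of two quaternions."""
    w1, v1 = a[0], a[1:]
    w2, v2 = b[0], b[1:]
    return np.concatenate(([w1 * w2 - v1 @ v2],
                           w1 * v2 + w2 * v1 + np.cross(v1, v2)))

def integrate_orientation(q, eta, tau, dt):
    """One step of the quaternion DMP orientation integration."""
    return quat_mul(quat_exp(dt / (2.0 * tau) * eta), q)

q = np.array([1.0, 0.0, 0.0, 0.0])                 # identity orientation
q_next = integrate_orientation(q, eta=np.array([0.0, 0.0, 1.0]),
                               tau=1.0, dt=0.01)
```

Because the update multiplies by an exponential of a pure rotation, the result stays exactly on the unit sphere S³, which is the whole point of avoiding direct Euler integration of q.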
17 Variations and extensions of Motor Primitives
18 Compliant Movement Primitives - CMPs I Compliant movement primitives are a combination of desired position trajectories and corresponding torque signals: h(t) = [q_d(t), τ_ff(t)]
Motion trajectories are encoded as Dynamic Movement Primitives (DMPs).
Transformation system: τ ż = α_z (β_z (g − y) − z) + f(s, w),  τ ẏ = z
Canonical system (phase s): τ ṡ = −α_x s
Forcing term: f(s, w) = ( Σ_i w_i Ψ_i(s) / Σ_i Ψ_i(s) ) s, with Ψ_i(s) = exp( −(s − c_i)² / (2 σ_i²) )
Torque trajectories are encoded as a linear combination of basis functions: τ_ff(s) = Σ_{i=1}^N w_{τ,i} ψ_i(s) / Σ_{i=1}^N ψ_i(s)
Both use the same phase signal to drive them.
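The torque encoding can be sketched as a least-squares fit of normalized basis-function weights to a demonstrated signal. The basis count, widths and the synthetic torque profile below are assumptions for illustration:

```python
import numpy as np

# Encode a torque signal as a normalized linear combination of radial basis
# functions of the phase s, fitting the weights by least squares.
def rbf_features(s, c, h):
    psi = np.exp(-h * (s[:, None] - c[None, :]) ** 2)
    return psi / psi.sum(axis=1, keepdims=True)   # normalized basis activations

s = np.linspace(1.0, 0.01, 200)          # phase samples, decaying from 1
tau_demo = np.sin(2 * np.pi * (1 - s))   # hypothetical demonstrated torque profile
Phi = rbf_features(s, c=np.linspace(0, 1, 25), h=np.full(25, 200.0))
w_tau, *_ = np.linalg.lstsq(Phi, tau_demo, rcond=None)
tau_hat = Phi @ w_tau                    # reconstructed feed-forward torque
```

At execution time the same phase s that drives the DMP indexes the basis functions, so position and torque primitives stay synchronized by construction.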
19 Compliant Movement Primitives - CMPs II Position and torque signals are used as the input to the robot impedance controller, given by
τ_u = K_q (q_d − q) + D (q̇_d − q̇) + f_dyn(q, q̇, q̈) + τ_ff
Low stiffness gains (K_q) result in compliant behavior but high trajectory-tracking errors. Adding a feedforward torque (τ_ff) compensates for the tracking errors while maintaining compliance. The feedforward torque is learned from demonstration. The additional torque compensates for the task-specific dynamics and/or the robot's flawed or non-existing dynamical model.
20 Compliant Movement Primitives - CMPs III Three-step process: 1. The motion trajectory q_d(t) is obtained by human demonstration. 2. Iterative learning of the torque primitive τ_ff(t); the learning update is based on the kinematic trajectory. 3. The movement and torque primitives are obtained, stored and possibly executed.
Petrič T., Gams A., Žlajpah L., Ude A., "Online learning of task-specific dynamics for periodic tasks," 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2014), September 14-18, 2014, Chicago, IL.
21 Compliant Movement Primitives - CMPs IV
22 Compliant Movement Primitives - CMPs V
23 Compliant Movement Primitives - CMPs VI
24 Compliant Movement Primitives - CMPs VII
25 Probabilistic Movement Primitives ProMPs I A probabilistic formulation for Motor Primitives (MPs); a framework for implementing the desirable properties of MPs: co-activation, modulation, optimality, coupling, learning, temporal modulation, rhythmic movements. ProMPs capture the variability of the demonstrations from a teacher as a probability distribution over trajectories.
A. Paraschos, C. Daniel, J. Peters, and G. Neumann, "Probabilistic Movement Primitives," Neural Information Processing Systems, 2013.
26 Probabilistic Movement Primitives ProMPs II To capture variance over the trajectories, we introduce a distribution over the weights. ProMPs parametrize the desired trajectory distribution of the primitive by a hierarchical Bayesian model with Gaussian distributions. This distribution is typically Gaussian, θ = {μ_w, Σ_w}. The trajectory distribution is computed by marginalizing over the weights, p(τ; θ) = ∫ p(τ | w) p(w; θ) dw. This defines the model.
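A minimal sketch of this model: project each demonstration onto basis-function weights by least squares, then fit a Gaussian (μ_w, Σ_w) over the weights, which induces a mean and variance at every time step. The basis choices and the synthetic noisy-sine demonstrations are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 100)
c = np.linspace(0, 1, 15)
# Normalized Gaussian basis matrix Psi (time steps x basis functions).
Psi = np.exp(-((t[:, None] - c[None, :]) ** 2) / (2 * 0.05 ** 2))
Psi /= Psi.sum(axis=1, keepdims=True)

# Synthetic demonstrations: noisy sine trajectories.
demos = [np.sin(2 * np.pi * t) + 0.05 * rng.standard_normal(t.size)
         for _ in range(20)]
# Per-demonstration weights via least squares.
W = np.array([np.linalg.lstsq(Psi, d, rcond=None)[0] for d in demos])

mu_w = W.mean(axis=0)       # mean of the weight distribution
Sigma_w = np.cov(W.T)       # covariance of the weight distribution

# Induced trajectory mean and per-time-step variance.
mean_traj = Psi @ mu_w
var_traj = np.einsum('ti,ij,tj->t', Psi, Sigma_w, Psi)
```

Conditioning (e.g. on a via-point) then reduces to a Gaussian update of (μ_w, Σ_w), which is what makes the probabilistic formulation convenient.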
27 Probabilistic Movement Primitives ProMPs III To fully exploit the properties of trajectory distributions, a policy for controlling the robot is needed that reproduces these distributions. The robot is controlled with u = K_t y_t + k_t + ε_u, with feedback gain K_t, feed-forward component k_t, and system noise ε_u. The gains are computed from the mean and variance of the distribution for the current state and from the inverse of the input matrix B_t, where the linearized discrete-time system with system matrix A_t is y_{t+dt} = (I + A_t dt) y_t + B_t dt u + c_t dt.
Paraschos, A.; Daniel, C.; Peters, J.; Neumann, G. (2013). "Probabilistic Movement Primitives," Advances in Neural Information Processing Systems (NIPS), MIT Press.
28 Probabilistic Movement Primitives ProMPs IV Demonstrations (left), ProMP (right). The distribution is easily obtained from demonstrations. Many possibilities: conditioning, combination.
29 Probabilistic Movement Primitives ProMPs V
30 Interaction Primitives IPs I To engage in cooperative activities with human partners, robots have to possess basic interactive abilities and skills. Inspired by dynamic motor primitives. Probabilistic encoding of joint behavior; Bayesian reasoning to generate responses: a distribution over DMP parameters is used to infer the further movement. Generalization of the concept of imitation learning to human-robot interaction scenarios. Two frameworks: DMPs and ProMPs.
H. Ben Amor, G. Neumann, S. Kamthe, O. Kroemer, and J. Peters, "Interaction primitives for human-robot cooperation tasks," IEEE International Conference on Robotics and Automation, 2014.
31 Interaction Primitives IPs II A compact representation of a joint physical activity between two persons, used in human-robot interaction. An interaction primitive specifies how a person adapts his movements to the movements of the interaction partner, and vice versa. For example, in a handing-over task, the receiving person adapts his arm movements to the reaching motion of the person performing the handing-over.
32 Interaction Primitives IPs III
33 Interaction Primitives IPs IV Three steps: Phase estimation: the robot adapts its timing so that it matches the timing of the human partner (dynamic time warping). Predictive DMP distributions: predictions of the behavior of an agent given a partial trajectory, using a probabilistic approach. Correlating the agents: we condition only on the DoFs of the observed agent (the human).
34 Interaction Primitives IPs V The DMP parameters are collected in the parameter vector θ = [w_1ᵀ, g_1, …, w_Nᵀ, g_N], where N denotes the DOFs of the agent. Given parameter-vector samples θ_j from multiple demonstrations, the distribution over parameters p(θ) can be calculated. From the observed partial trajectory τ_o the likelihood p(τ_o | θ) is determined. Both are used to calculate the required updated parameter distribution p(θ | τ_o).
35 Interaction Primitives IPs VI The updated parameters for the cooperative movement need to be determined. The parameter vector is extended to incorporate the cooperative DMP parameters: θ = [θ_hᵀ, θ_rᵀ]ᵀ. The appropriate part of the estimated updated parameter vector can then be used to determine the cooperative DMP parameters.
36 Interaction Primitives IPs VII Phase estimation is needed to temporally align both agents; Dynamic Time Warping is used. Specifically, the accumulated cost matrix D is used to determine the frame in the reference movement
n* = argmin_n D(n, M),
where M denotes the number of frames in the observed trajectory. Frame n* produces minimal cost w.r.t. the observed query movement. The estimate of the current DMP phase is then inferred from the frame as x = exp(−α_x n*/N), where N denotes the number of frames in the reference trajectory. Which reference movement to use for comparison?
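The phase-estimation step can be sketched as follows; the 1-D trajectories, absolute-difference cost and α_x value are simplifying assumptions:

```python
import numpy as np

# Build the DTW accumulated cost matrix D between a reference movement and a
# partial observation, pick the reference frame n* = argmin_n D(n, M), and map
# it to a DMP phase value.
def dtw_cost(ref, obs):
    N, M = len(ref), len(obs)
    D = np.full((N + 1, M + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, N + 1):
        for j in range(1, M + 1):
            d = abs(ref[i - 1] - obs[j - 1])
            D[i, j] = d + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[1:, 1:]

ref = np.sin(np.linspace(0, np.pi, 50))      # full reference movement
obs = np.sin(np.linspace(0, np.pi / 2, 20))  # partial observation (rising half)
D = dtw_cost(ref, obs)
n_star = int(np.argmin(D[:, -1]))            # best-matching reference frame
alpha_x = 2.0
x = np.exp(-alpha_x * n_star / len(ref))     # estimated DMP phase
```

Since the observation covers only the rising half of the reference, n* lands near the middle of the reference, and the phase estimate x reflects how far the interaction has progressed.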
37 Interaction Primitives IPs VIII IPs on ProMPs: the whole interaction is recorded, i.e. two humans are recorded during the interaction, and the trajectories are stored as a ProMP. While the human performs the trajectory, the robot recalculates the new means and variances at every time step. The principal difference between DMP-based and ProMP-based IPs: in DMP-based IPs the weights are calculated from the forcing term, whereas in ProMP-based IPs the weights are calculated from the positions. For both approaches several demonstrations are needed so that the weight distribution can be determined. In DMP-based methods the forcing-term signal can be noisy and create high accelerations during execution.
38 Interaction Primitives IPs IX
39 Gaussian models
40 Gaussian Mixture Models I probabilistic model for representing normally distributed subpopulations within an overall population Example: modeling human height data, height is typically modeled as a normal distribution for each gender with a mean of approximately 5'10" for males and 5'5" for females One hint that data might follow a mixture model is that the data looks multimodal, i.e. there is more than one "peak" in the distribution of data.
41 Gaussian Mixture Models II A GMM is parameterized by two types of values: the mixture component weights and the component means and variances/covariances. For a Gaussian mixture model with K components, the k-th component has a mean μ_k and variance σ_k² in the univariate case, and a mean μ_k and covariance matrix Σ_k in the multivariate case. The mixture component weights are defined as φ_k for component C_k, with the constraint Σ_{i=1}^K φ_i = 1 so that the total probability distribution normalizes to 1.
42 Gaussian Mixture Models III One-dimensional Model Multi-dimensional Model
43 Gaussian Mixture Models IV If the number of components K is known, expectation maximization is the technique most commonly used to estimate the mixture model's parameters Expectation maximization (EM) is a numerical technique for maximum likelihood estimation, and is usually used when closed form expressions for updating the model parameters can be calculated Guaranteed to approach a local maximum
44 Expectation Maximization I Two steps: Expectation (E) and Maximization (M) Expectation: calculating the expectation of the component assignments C k for each data point x i X given the model parameters φ k, μ k, σ k. Maximization: maximizing the expectations calculated in the E step with respect to the model parameters. This step consists of updating the values φ k, μ k, σ k. This iterative process repeats until the algorithm converges, giving a maximum likelihood estimate
45 Expectation Maximization II [figure: iterate Initialize → Expectation → Maximization]
46 Task-parameterized Gaussian mixture model Task-parameterized models of movements aim at automatically adapting movements to new situations encountered by a robot (generalization).
S. Calinon, "A tutorial on task-parameterized movement learning and retrieval," Intelligent Service Robotics, vol. 9, no. 1, pp. 1-29, 2016.
47 TP-GMM
48 TP-GMM II
49 Gaussian Mixture Regression I GMR is used to retrieve smooth generalized trajectories with associated covariance matrices describing the variations and correlations across the different variables. Given a set of predictor variables ξ_I and response variables ξ_O, it estimates the conditional expectation of ξ_O given ξ_I on the basis of a set of observations. GMR offers a way of extracting a single generalized trajectory made up from the set of trajectories used to train the model; the generalized trajectory is not part of the dataset but instead encapsulates all of its essential features.
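A minimal sketch of GMR with a 1-D input and 1-D output: the conditional expectation is a responsibility-weighted sum of per-component linear regressions. The mixture parameters below are illustrative assumptions, not learned values:

```python
import numpy as np

# Gaussian mixture regression for a GMM over (input, output) pairs.
def gmr(x, phi, mu, Sigma):
    """mu[k] = [mu_in, mu_out]; Sigma[k] = 2x2 joint covariance of component k."""
    # Responsibilities h_k(x) from the input marginals.
    h = np.array([
        p * np.exp(-(x - m[0]) ** 2 / (2 * S[0, 0])) / np.sqrt(2 * np.pi * S[0, 0])
        for p, m, S in zip(phi, mu, Sigma)
    ])
    h /= h.sum()
    # Per-component conditional means (linear in x).
    cond = [m[1] + S[1, 0] / S[0, 0] * (x - m[0]) for m, S in zip(mu, Sigma)]
    return float(h @ np.array(cond))

phi = [0.5, 0.5]
mu = [np.array([0.0, 0.0]), np.array([4.0, 2.0])]
Sigma = [np.eye(2), np.eye(2)]
y0 = gmr(0.0, phi, mu, Sigma)   # near the first component: output close to 0
y4 = gmr(4.0, phi, mu, Sigma)   # near the second component: output close to 2
```

Evaluating this along a time or phase input yields the smooth generalized trajectory; the same component quantities also give the conditional covariance describing the allowed variation.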
50 Gaussian Mixture Regression II Calinon, S. and Billard, A. (2009). Statistical Learning by Imitation of Competing Constraints in Joint Space and Task Space. RSJ Advanced Robotics, 23,
51 Gaussian Mixture Regression III Calinon, S. and Billard, A. (2009). Statistical Learning by Imitation of Competing Constraints in Joint Space and Task Space. RSJ Advanced Robotics, 23,
52 Markov Models
53 Markov models I A Markov model is a stochastic model used to model randomly changing systems. Future states depend only on the current state, not on the events that occurred before it → it assumes the Markov property. This assumption enables reasoning and computation with the model that would otherwise be intractable. Markov chain: the simplest Markov model. It models the state of a system with a random variable that changes through time; the Markov property implies that the distribution of this variable depends only on the distribution of the previous state. A Markov process transitions between states: at every time step t, the process moves from one state to another with a probability assigned to each pair of states.
54 Markov models II The transition probability is independent of the previous states visited by the process (Markov property). The probabilities are encoded in the so-called transition matrix A, with dimensions N_s × N_s, where N_s denotes the number of states. Each element a_{i,j} defines the probability of transitioning to state s_j given the current state s_i. A prior distribution of states is needed to completely define a Markov chain; it defines the possible states and their probabilities at time t = 0. The prior distribution is usually denoted π and is of size N_s × 1. A Markov chain is therefore defined by λ_MC = {π, A}.
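A minimal sketch of a Markov chain λ_MC = {π, A} and the propagation of its state distribution (the two-state transition matrix is an illustrative choice):

```python
import numpy as np

# Two-state Markov chain: rows of A sum to 1; the distribution at time t
# is obtained by repeated multiplication p_{t+1} = p_t @ A.
A = np.array([[0.9, 0.1],
              [0.5, 0.5]])
pi0 = np.array([1.0, 0.0])   # prior: start in state 0 with certainty

p = pi0.copy()
for _ in range(100):
    p = p @ A                # propagate the distribution one step

# For this A the chain converges to its stationary distribution (5/6, 1/6),
# obtained by solving p = p @ A with p summing to 1.
```

This propagation is exactly what the forward recursions of the HMM algorithms build on, with an extra emission term per step.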
55 Hidden Markov Model HMM I A hidden Markov model is a Markov chain whose states are concealed from the outside observer. Every time the process visits a (hidden) state, it outputs a symbol. This symbol depends on an output probability distribution, which is defined for the given state. Discrete case: similar to the transition probabilities, an output probability matrix B. Continuous case: the columns B_i, corresponding to the output probabilities of state s_i, are replaced with continuous probability density functions, most commonly mixtures of Gaussians, where μ_{i,k} is the mean and Σ_{i,k} the covariance of the k-th mixture component. Thus, a hidden Markov model is defined as λ_HMM = {π, A, B}.
56 Hidden Markov Model HMM II Apart from its recognition capabilities, a single model can also be used for motion generation, thus mimicking human learning processes. Classical HMM problems: Evaluation, for recognition of the movement: choosing the model that best matches the observation. Decoding: finding the state sequence that best explains the observation. Parameter estimation.
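The evaluation problem can be sketched with the forward algorithm for a discrete HMM; the model parameters below are illustrative, not taken from any experiment:

```python
import numpy as np

# Forward algorithm: likelihood of an observation sequence under {pi, A, B}.
# alpha_t(i) accumulates the probability of the prefix ending in state i.
def forward(obs, pi, A, B):
    alpha = pi * B[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
    return alpha.sum()

pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],    # state 0 mostly emits symbol 0
              [0.2, 0.8]])   # state 1 mostly emits symbol 1
p = forward([0, 0, 1], pi, A, B)
```

For recognition, this likelihood is evaluated under each candidate model and the model with the highest value is chosen; decoding replaces the sum over states with a max (the Viterbi algorithm).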
57 Parametric Hidden Markov Model - PHMM An extension of the continuous HMM where the output probability density function is a Gaussian with one kernel. The output probability density function b_i depends on means and covariances. In the parametric HMM framework the PDFs are defined to depend also on another, open parameter, denoted θ: μ_i = W_i θ + b_i, where W_i denotes the matrix of coefficients defining the linear mapping and b_i is the y-intercept for μ_i. λ_PHMM = {π, A, B, W, b}.
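The parametric mean can be sketched directly; W_i, b_i and θ below are hypothetical values, not learned ones:

```python
import numpy as np

# Parametric output mean mu_i = W_i @ theta + b_i: a single trained PHMM
# shifts its Gaussian means with the task parameter theta
# (here a hypothetical 2-D object position).
W_i = np.eye(2)               # linear-mapping matrix (assumed identity here)
b_i = np.array([0.1, -0.2])   # offset (y-intercept) for mu_i
theta = np.array([0.5, 0.3])  # open task parameter, e.g. object position
mu_i = W_i @ theta + b_i      # state mean adapted to the task
```

Changing θ slides every state's emission Gaussian accordingly, which is what lets one model cover a family of spatially shifted movements.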
58 PHMM - example Consider a Cartesian trajectory for a reaching movement. The trajectory needs to be spatially shifted depending on the position of the object being reached. This can easily be encoded with a PHMM: θ would in this case represent the position of the object, and W and b would be learned in such a way that μ of the final state always corresponds to the object's position. In some cases a linear mapping does not suffice to correctly model the variations in the data; neural networks can be used to describe it instead. The parameters {π, A, B, W, b} need to be estimated from example sequences of a particular class; an expectation-maximization procedure, the so-called Baum-Welch algorithm, can be used.
The geometry of Gaussian processes and Bayesian optimization. Contal CMLA, ENS Cachan Background: Global Optimization and Gaussian Processes The Geometry of Gaussian Processes and the Chaining Trick Algorithm
More informationLecture 16 Deep Neural Generative Models
Lecture 16 Deep Neural Generative Models CMSC 35246: Deep Learning Shubhendu Trivedi & Risi Kondor University of Chicago May 22, 2017 Approach so far: We have considered simple models and then constructed
More informationGentle Introduction to Infinite Gaussian Mixture Modeling
Gentle Introduction to Infinite Gaussian Mixture Modeling with an application in neuroscience By Frank Wood Rasmussen, NIPS 1999 Neuroscience Application: Spike Sorting Important in neuroscience and for
More informationHuman Mobility Pattern Prediction Algorithm using Mobile Device Location and Time Data
Human Mobility Pattern Prediction Algorithm using Mobile Device Location and Time Data 0. Notations Myungjun Choi, Yonghyun Ro, Han Lee N = number of states in the model T = length of observation sequence
More informationADVANCED MACHINE LEARNING ADVANCED MACHINE LEARNING. Non-linear regression techniques Part - II
1 Non-linear regression techniques Part - II Regression Algorithms in this Course Support Vector Machine Relevance Vector Machine Support vector regression Boosting random projections Relevance vector
More informationInformation geometry for bivariate distribution control
Information geometry for bivariate distribution control C.T.J.Dodson + Hong Wang Mathematics + Control Systems Centre, University of Manchester Institute of Science and Technology Optimal control of stochastic
More informationLecture 11: Hidden Markov Models
Lecture 11: Hidden Markov Models Cognitive Systems - Machine Learning Cognitive Systems, Applied Computer Science, Bamberg University slides by Dr. Philip Jackson Centre for Vision, Speech & Signal Processing
More informationPattern Recognition and Machine Learning
Christopher M. Bishop Pattern Recognition and Machine Learning ÖSpri inger Contents Preface Mathematical notation Contents vii xi xiii 1 Introduction 1 1.1 Example: Polynomial Curve Fitting 4 1.2 Probability
More informationMachine Learning Techniques for Computer Vision
Machine Learning Techniques for Computer Vision Part 2: Unsupervised Learning Microsoft Research Cambridge x 3 1 0.5 0.2 0 0.5 0.3 0 0.5 1 ECCV 2004, Prague x 2 x 1 Overview of Part 2 Mixture models EM
More informationA Higher-Order Interactive Hidden Markov Model and Its Applications Wai-Ki Ching Department of Mathematics The University of Hong Kong
A Higher-Order Interactive Hidden Markov Model and Its Applications Wai-Ki Ching Department of Mathematics The University of Hong Kong Abstract: In this talk, a higher-order Interactive Hidden Markov Model
More informationLinear Dynamical Systems
Linear Dynamical Systems Sargur N. srihari@cedar.buffalo.edu Machine Learning Course: http://www.cedar.buffalo.edu/~srihari/cse574/index.html Two Models Described by Same Graph Latent variables Observations
More informationINF 5860 Machine learning for image classification. Lecture 14: Reinforcement learning May 9, 2018
Machine learning for image classification Lecture 14: Reinforcement learning May 9, 2018 Page 3 Outline Motivation Introduction to reinforcement learning (RL) Value function based methods (Q-learning)
More informationLecture 4: Hidden Markov Models: An Introduction to Dynamic Decision Making. November 11, 2010
Hidden Lecture 4: Hidden : An Introduction to Dynamic Decision Making November 11, 2010 Special Meeting 1/26 Markov Model Hidden When a dynamical system is probabilistic it may be determined by the transition
More informationLearning Dynamical System Modulation for Constrained Reaching Tasks
In Proceedings of the IEEE-RAS International Conference on Humanoid Robots (HUMANOIDS'6) Learning Dynamical System Modulation for Constrained Reaching Tasks Micha Hersch, Florent Guenter, Sylvain Calinon
More informationHidden Markov Models Part 2: Algorithms
Hidden Markov Models Part 2: Algorithms CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington 1 Hidden Markov Model An HMM consists of:
More informationModeling Multiple-mode Systems with Predictive State Representations
Modeling Multiple-mode Systems with Predictive State Representations Britton Wolfe Computer Science Indiana University-Purdue University Fort Wayne wolfeb@ipfw.edu Michael R. James AI and Robotics Group
More informationRecent Advances in Bayesian Inference Techniques
Recent Advances in Bayesian Inference Techniques Christopher M. Bishop Microsoft Research, Cambridge, U.K. research.microsoft.com/~cmbishop SIAM Conference on Data Mining, April 2004 Abstract Bayesian
More informationBayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2014
Bayesian Networks: Construction, Inference, Learning and Causal Interpretation Volker Tresp Summer 2014 1 Introduction So far we were mostly concerned with supervised learning: we predicted one or several
More informationNonparametric Bayesian Methods (Gaussian Processes)
[70240413 Statistical Machine Learning, Spring, 2015] Nonparametric Bayesian Methods (Gaussian Processes) Jun Zhu dcszj@mail.tsinghua.edu.cn http://bigml.cs.tsinghua.edu.cn/~jun State Key Lab of Intelligent
More informationMulti-task Learning with Gaussian Processes, with Applications to Robot Inverse Dynamics
1 / 38 Multi-task Learning with Gaussian Processes, with Applications to Robot Inverse Dynamics Chris Williams with Kian Ming A. Chai, Stefan Klanke, Sethu Vijayakumar December 2009 Motivation 2 / 38 Examples
More informationHuman Pose Tracking I: Basics. David Fleet University of Toronto
Human Pose Tracking I: Basics David Fleet University of Toronto CIFAR Summer School, 2009 Looking at People Challenges: Complex pose / motion People have many degrees of freedom, comprising an articulated
More informationAugust 17, 2017 Estimation of Phases for Compliant Motion
August 17, 2017 Estimation of Phases for Compliant Motion Tesfamichael Marikos Hagos School of Electrical Engineering Thesis submitted for examination for the degree of Master of Science in Technology.
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 7 Approximate
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 7 Approximate
More informationStatistical learning. Chapter 20, Sections 1 4 1
Statistical learning Chapter 20, Sections 1 4 Chapter 20, Sections 1 4 1 Outline Bayesian learning Maximum a posteriori and maximum likelihood learning Bayes net learning ML parameter learning with complete
More informationAdministration. CSCI567 Machine Learning (Fall 2018) Outline. Outline. HW5 is available, due on 11/18. Practice final will also be available soon.
Administration CSCI567 Machine Learning Fall 2018 Prof. Haipeng Luo U of Southern California Nov 7, 2018 HW5 is available, due on 11/18. Practice final will also be available soon. Remaining weeks: 11/14,
More informationBayesian Networks BY: MOHAMAD ALSABBAGH
Bayesian Networks BY: MOHAMAD ALSABBAGH Outlines Introduction Bayes Rule Bayesian Networks (BN) Representation Size of a Bayesian Network Inference via BN BN Learning Dynamic BN Introduction Conditional
More informationAn Evolutionary Programming Based Algorithm for HMM training
An Evolutionary Programming Based Algorithm for HMM training Ewa Figielska,Wlodzimierz Kasprzak Institute of Control and Computation Engineering, Warsaw University of Technology ul. Nowowiejska 15/19,
More informationL23: hidden Markov models
L23: hidden Markov models Discrete Markov processes Hidden Markov models Forward and Backward procedures The Viterbi algorithm This lecture is based on [Rabiner and Juang, 1993] Introduction to Speech
More informationUndirected Graphical Models
Outline Hong Chang Institute of Computing Technology, Chinese Academy of Sciences Machine Learning Methods (Fall 2012) Outline Outline I 1 Introduction 2 Properties Properties 3 Generative vs. Conditional
More informationAfternoon Meeting on Bayesian Computation 2018 University of Reading
Gabriele Abbati 1, Alessra Tosi 2, Seth Flaxman 3, Michael A Osborne 1 1 University of Oxford, 2 Mind Foundry Ltd, 3 Imperial College London Afternoon Meeting on Bayesian Computation 2018 University of
More informationActive Policy Iteration: Efficient Exploration through Active Learning for Value Function Approximation in Reinforcement Learning
Active Policy Iteration: fficient xploration through Active Learning for Value Function Approximation in Reinforcement Learning Takayuki Akiyama, Hirotaka Hachiya, and Masashi Sugiyama Department of Computer
More informationSENSOR-ASSISTED ADAPTIVE MOTOR CONTROL UNDER CONTINUOUSLY VARYING CONTEXT
SENSOR-ASSISTED ADAPTIVE MOTOR CONTROL UNDER CONTINUOUSLY VARYING CONTEXT Heiko Hoffmann, Georgios Petkos, Sebastian Bitzer, and Sethu Vijayakumar Institute of Perception, Action and Behavior, School of
More informationPractical Bayesian Optimization of Machine Learning. Learning Algorithms
Practical Bayesian Optimization of Machine Learning Algorithms CS 294 University of California, Berkeley Tuesday, April 20, 2016 Motivation Machine Learning Algorithms (MLA s) have hyperparameters that
More informationSum-Product Networks. STAT946 Deep Learning Guest Lecture by Pascal Poupart University of Waterloo October 17, 2017
Sum-Product Networks STAT946 Deep Learning Guest Lecture by Pascal Poupart University of Waterloo October 17, 2017 Introduction Outline What is a Sum-Product Network? Inference Applications In more depth
More informationRelevance Vector Machines for Earthquake Response Spectra
2012 2011 American American Transactions Transactions on on Engineering Engineering & Applied Applied Sciences Sciences. American Transactions on Engineering & Applied Sciences http://tuengr.com/ateas
More informationPredictive analysis on Multivariate, Time Series datasets using Shapelets
1 Predictive analysis on Multivariate, Time Series datasets using Shapelets Hemal Thakkar Department of Computer Science, Stanford University hemal@stanford.edu hemal.tt@gmail.com Abstract Multivariate,
More informationHidden Markov Models and Gaussian Mixture Models
Hidden Markov Models and Gaussian Mixture Models Hiroshi Shimodaira and Steve Renals Automatic Speech Recognition ASR Lectures 4&5 25&29 January 2018 ASR Lectures 4&5 Hidden Markov Models and Gaussian
More informationImitation Learning of Globally Stable Non-Linear Point-to-Point Robot Motions using Nonlinear Programming
Imitation Learning of Globally Stable Non-Linear Point-to-Point Robot Motions using Nonlinear Programming S. Mohammad Khansari-Zadeh and Aude Billard Abstract This paper presents a methodology for learning
More informationSTATS 306B: Unsupervised Learning Spring Lecture 5 April 14
STATS 306B: Unsupervised Learning Spring 2014 Lecture 5 April 14 Lecturer: Lester Mackey Scribe: Brian Do and Robin Jia 5.1 Discrete Hidden Markov Models 5.1.1 Recap In the last lecture, we introduced
More informationA Probabilistic Relational Model for Characterizing Situations in Dynamic Multi-Agent Systems
A Probabilistic Relational Model for Characterizing Situations in Dynamic Multi-Agent Systems Daniel Meyer-Delius 1, Christian Plagemann 1, Georg von Wichert 2, Wendelin Feiten 2, Gisbert Lawitzky 2, and
More informationLearning from Sequential and Time-Series Data
Learning from Sequential and Time-Series Data Sridhar Mahadevan mahadeva@cs.umass.edu University of Massachusetts Sridhar Mahadevan: CMPSCI 689 p. 1/? Sequential and Time-Series Data Many real-world applications
More informationShankar Shivappa University of California, San Diego April 26, CSE 254 Seminar in learning algorithms
Recognition of Visual Speech Elements Using Adaptively Boosted Hidden Markov Models. Say Wei Foo, Yong Lian, Liang Dong. IEEE Transactions on Circuits and Systems for Video Technology, May 2004. Shankar
More informationA Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models
A Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models Jeff A. Bilmes (bilmes@cs.berkeley.edu) International Computer Science Institute
More information