Understanding of Positioning Skill based on Feedforward / Feedback Switched Dynamical Model

Hiroyuki Okuda, Hidenori Takeuchi, Shinkichi Inagaki, Tatsuya Suzuki and Soichiro Hayakawa

Abstract— To realize harmonious cooperation with the operator, a man-machine cooperative system must be designed to accommodate the characteristics of the operator's skill. One of the important considerations in skill analysis is to investigate the switching mechanism underlying the skill dynamics. On the other hand, the combination of feedforward and feedback schemes has been proved to work successfully in the modeling of human skill. In this paper, a new stochastic switched skill model for the sliding task, wherein a minimum jerk motion and feedback schemes are embedded in different discrete states, is proposed. Then, the parameter estimation algorithm for the proposed switched skill model is derived. Finally, some advantages and applications of the proposed model are discussed.

I. INTRODUCTION

The man-machine cooperative system is attracting great attention in many fields, such as manufacturing, medicine, welfare and so on. To realize harmonious cooperation with the operator, the assisting system must be designed to accommodate the characteristics of the operator's skill. The authors have developed a human skill model based on hybrid dynamical system modeling, under the consideration that the operator appropriately switches among simple motion control laws instead of adopting a complex nonlinear motion control law [][]. In [], the positioning task was particularly focused on and considered as a reaching task with precise adjustment. This task was identified as variable gain feedback control based on the hybrid system framework []. From the viewpoint of biology and computational neuroscience, there are a number of studies indicating that feedforward control is dominant in voluntary movements [4][].
A reaching movement is considered to be achieved by a combination of trajectory planning, such as the minimum jerk trajectory [6][7][8], and the inverse dynamics that realizes the trajectory [9]. On the other hand, it is also reported that feedback information, such as visual information, is necessary to accomplish precise control []. The combination of the feedforward and feedback schemes therefore seems natural; however, it is unlikely that the human operator always activates both the feedforward and feedback schemes simultaneously. It seems more natural that the feedforward and feedback schemes are switched smoothly according to the progress of the task.

H. Okuda, H. Takeuchi, S. Inagaki and T. Suzuki are with the Department of Mechanical Science and Engineering, Graduate School of Engineering, Nagoya University, Furo-cho, Chikusa-ku, Nagoya, Japan h okuda@nuem.nagoya-u.ac.jp Soichiro Hayakawa is with the Toyota Technological Institute, Hisakata, Tempaku-ku, Nagoya, Japan s hayakawa@toyota-ti.ac.jp

From this viewpoint, this paper proposes a new stochastic switched skill model for the sliding task wherein the feedforward and feedback schemes are embedded in different discrete states (FF/FB switched skill model). In particular, in the discrete state of the feedforward scheme, a minimum jerk motion [8] is embedded, while in the discrete states of the feedback scheme, standard linear feedback control laws are embedded. Then, the parameter estimation algorithm for the proposed FF/FB switched model is derived. One of the promising applications of the proposed model is the estimation of the switching condition from the feedforward to the feedback scheme based only on observed data. The estimated switching condition can be exploited for the design of a switched assisting controller wherein the assisting mode is switched according to the change of the operator's control mode.
Furthermore, the proposed model can be exploited for skill recognition, which is useful for the analysis of experience, skillfulness, and so on.

II. SLIDING TASK AND DATA ACQUISITION

Throughout this paper, the sliding task shown in Figs. 1 and 2 is considered. The developed system consists of a one-d.o.f. linear slider controlled by impedance control. The impedance parameters were set as follows: the mass M, the damping D and the stiffness K were set to be [kg], [Ns/m] and [N/m], respectively. These parameters were found by trial and error. A force sensor is attached to the slider head to detect the examinee's force, which is used for the impedance control. The position x_t, the velocity ẋ_t and the acceleration ẍ_t of the slider head are observed every [µsec] and used for the skill modeling. The examinee was requested to manipulate the grip toward the target position (origin), and to stop it in the range of -[mm] to [mm]. This positioning accuracy requires a feedback control scheme in the operator's action. The moving distance was set to be 400[mm]. In this experiment, twenty trials were made by three examinees. As an example, one of the profiles of examinee A is shown in Fig. 3. The horizontal axis represents the time and the vertical axes represent the position, velocity and acceleration, respectively.

III. FF/FB SWITCHED SKILL MODEL

A. Structure of proposed model

For the sliding task shown in Section II, it is quite natural to consider that this task is achieved by both the feedforward scheme and the feedback control scheme. In the early part of the task, the feedforward scheme must be dominant
because the operator does not have to be conscious of the final positioning. Since the feedforward scheme includes the generation of the reference trajectory, a motion optimized under some cost function must be adopted as the feedforward scheme. On the other hand, in the latter part of the task, the feedback control scheme must be dominant to accomplish the precise positioning. In order to verify this scenario, we introduce a new stochastic switched skill model wherein the feedforward and feedback schemes are embedded in different discrete states. A state transition diagram of the proposed model is shown in Fig. 4. This model has three discrete states. The feedforward model is assigned to state 1, and the different feedback models are assigned to states 2 and 3, respectively. Because the transition from the feedforward scheme to the feedback scheme should be smooth enough, state 2 is provided to express the transient mode. The switching between modes is specified by the transition probabilities a_{ij}.

Fig. 1. Experimental system of sliding task (linear slider with shaft, force sensor, slider head and grip; target at the origin).

Fig. 2. Illustrative picture of sliding task (slider head position x_t, moving direction toward the origin, moving distance 0.4 m).

As the feedforward scheme, the Minimum Hand Jerk Motion (MHJM) [8] is considered. The MHJM is defined as the motion which minimizes the integral of the squared jerk of the hand motion,

J = ∫_0^{t_f} (d³x(t)/dt³)² dt,  (1)

where t_f is the task time of the MHJM. In the MHJM, the position, velocity and acceleration profiles of the motion are derived analytically under the conditions ẋ(0) = 0, ẋ(t_f) = 0, x(0) = 0 and x(t_f) = x_f as follows:

x(t) = x_f (10τ³ − 15τ⁴ + 6τ⁵)  (2)
ẋ(t) = (x_f / t_f)(30τ² − 60τ³ + 30τ⁴)  (3)
ẍ(t) = (x_f / t_f²)(60τ − 180τ² + 120τ³)  (4)

Fig. 3. Sample of observed profiles (position [m], velocity [m/s] and acceleration [m/s²] versus time [s]).

Here, τ = t/t_f, t_f is the task time and x_f is the moving distance of the hand. The resulting velocity profile is called the bell-shaped velocity profile.
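The minimum-jerk profiles above can be checked numerically. The sketch below (Python/NumPy; the function name is ours, not from the paper) evaluates the position, velocity and acceleration for the stated boundary conditions:

```python
import numpy as np

def min_jerk_profiles(x_f, t_f, t):
    """Minimum-jerk position, velocity and acceleration at times t,
    for boundary conditions x(0)=0, x(t_f)=x_f and zero velocity and
    acceleration at both ends (Flash-Hogan form)."""
    tau = np.asarray(t, dtype=float) / t_f
    x = x_f * (10 * tau**3 - 15 * tau**4 + 6 * tau**5)
    v = (x_f / t_f) * (30 * tau**2 - 60 * tau**3 + 30 * tau**4)
    a = (x_f / t_f**2) * (60 * tau - 180 * tau**2 + 120 * tau**3)
    return x, v, a
```

The velocity is the bell-shaped profile, peaking at 15 x_f / (8 t_f) at mid-motion.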
On the other hand, the feedback control scheme is represented by a linear feedback controller model using the position and the velocity of the grip as the feedback information. Furthermore, the feedback control scheme is divided into two discrete states because the human operator seems to change the control parameters in the feedback scheme, that is, to change from rough control to precise control. This leads to the switched model shown in Fig. 4. Now, the proposed model is formally described as follows:

Regressor vector: r_t = [t  x_t  ẋ_t]^T = [r_{1,t}  r_{2,t}  r_{3,t}]^T  (5)
Output signal: y_t = ẍ_t  (6)
Skill model:
P{s_t = S_j | s_{t−1} = S_i} = a_{ij},  (7)
y_t = F_i(r_t, θ_i) + e_{i,t}  if s_t = S_i  (i = 1, 2, 3)  (8)
F_1(r_t, θ_1) = θ_{1,1}(−180 r_{1,t}²/θ_{1,2}⁴ + 120 r_{1,t}³/θ_{1,2}⁵ + 60 r_{1,t}/θ_{1,2}³),  (9)
F_2(r_t, θ_2) = θ_{2,1} r_{2,t} + θ_{2,2} r_{3,t},  (10)
F_3(r_t, θ_3) = θ_{3,1} r_{2,t} + θ_{3,2} r_{3,t}.  (11)

Fig. 4. FF/FB switched skill model (3 states): state 1 (feedforward), states 2 and 3 (feedback), connected by transition probabilities a_{ij}.
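As a sketch of how the switched model produces its predicted acceleration, the hypothetical helper below evaluates F_i for each discrete state: the feedforward state returns the MHJM acceleration with θ_{1,1} = x_f and θ_{1,2} = t_f, while the feedback states apply linear state feedback (the function name and the theta layout are our own):

```python
def model_output(state, r, theta):
    """Predicted output y_t = F_state(r_t, theta_state); r = (t, x, xdot)."""
    t, x, xdot = r
    if state == 1:
        # feedforward state: MHJM acceleration, theta[1] = (x_f, t_f)
        x_f, t_f = theta[1]
        tau = t / t_f
        return (x_f / t_f**2) * (60 * tau - 180 * tau**2 + 120 * tau**3)
    # feedback states 2 and 3: theta[i] = (position gain, velocity gain)
    k_x, k_v = theta[state]
    return k_x * x + k_v * xdot
```

The gain values below are placeholders, not the estimated parameters of the paper.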
F_i represents the scheme in state i, and s_t is the state at time t. Furthermore, e_{i,t} is the equation error, and is assumed to have a Gaussian distribution given by

p_{e_i}(e_{i,t}) = (1/√(2πσ_i²)) exp{ −(F_i(r_t, θ_i) − y_t)² / (2σ_i²) }.  (12)

The definitions of the parameters are listed in the following:
S_i : Discrete state (i = 1, 2, 3)
a_{ij} : Transition probability (i = 1, 2, 3; j = 1, 2, 3)
π_i : Initial state probability (i = 1, 2, 3)
θ_i : Parameters in the dynamical model assigned to S_i (i = 1, 2, 3) (θ_{1,1} = x_f and θ_{1,2} = t_f; θ_{2,1}, θ_{2,2}, θ_{3,1} and θ_{3,2} are feedback gains.)
σ_i² : Variance of the equation error e_{i,t} in the dynamical model assigned to S_i (i = 1, 2, 3)

We denote the set of parameters in the FF/FB switched skill model by λ = (π_i, a_{ij}, θ_i, σ_i).

B. Three fundamental problems

To address several fundamental problems that are necessary for the skill analysis, the observed signal and its occurrence probability are defined for the proposed model as follows: First of all, the observed signal o_{l,t} at time t (∈ {1, 2, ..., T}) is defined as the combination of the output y_{l,t} and the regressor r_{l,t}, that is, o_{l,t} = (y_{l,t}, r_{l,t}), where T is the length of the observed sequence and l is the index of the observed sequence, i.e. the index of the trial. Then, its occurrence probability b_i(o_{l,t}) is defined by the assumption of the Gaussian distribution of the equation error, and is given by

b_i(o_{l,t}) = (1/√(2πσ_i²)) exp{ −(F_i(r_{l,t}, θ_i) − y_{l,t})² / (2σ_i²) }.  (13)

Based on these definitions, the following three fundamental problems must be addressed for the proposed model.

1) Evaluation problem: In the evaluation problem, the probability P(O_l | λ) that the observed signal sequence O_l = (o_{l,1}, o_{l,2}, ..., o_{l,t}, ..., o_{l,T}) occurs from the model λ = (π_i, a_{ij}, θ_i, σ_i) is calculated. This problem can be solved by applying the forward algorithm [].
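The evaluation problem reduces to the standard forward recursion over the occurrence probabilities b_i(o_{l,t}). A minimal sketch (variable names are ours; B[t, i] plays the role of b_i(o_t)):

```python
import numpy as np

def forward_prob(pi, A, B):
    """P(O | lambda) by the forward algorithm.

    pi : (N,)   initial state probabilities pi_i
    A  : (N, N) transition probabilities a_ij
    B  : (T, N) occurrence probabilities, B[t, i] = b_i(o_t)
    """
    alpha = pi * B[0]                 # alpha(i, 1) = pi_i b_i(o_1)
    for t in range(1, len(B)):
        # alpha(j, t) = sum_i alpha(i, t-1) a_ij b_j(o_t)
        alpha = (alpha @ A) * B[t]
        # (in practice alpha is rescaled here to avoid underflow)
    return float(alpha.sum())
```

For long sequences the per-step rescaling noted in the comment is essential; the sketch omits it for clarity.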
2) Decoding problem: In the decoding problem, the most likely underlying state sequence s_l = (s_{l,1}, s_{l,2}, ..., s_{l,t}, ..., s_{l,T}), which yields the observed signal sequence O_l = (o_{l,1}, o_{l,2}, ..., o_{l,t}, ..., o_{l,T}), is found for the model λ = (π_i, a_{ij}, θ_i, σ_i). This state estimation can be realized by applying the Viterbi algorithm [].

3) Estimation problem: In the estimation problem, the model parameter λ = (π_i, a_{ij}, θ_i, σ_i), which gives the highest occurrence probability for the observed signal sequence O_l = (o_{l,1}, o_{l,2}, ..., o_{l,t}, ..., o_{l,T}), is estimated.

C. Parameter estimation based on EM algorithm

The solutions for the evaluation problem and the decoding problem are almost the same as those for the standard Hidden Markov Model (HMM). The parameter estimation algorithm for the proposed FF/FB skill model, however, is not a straightforward extension of the one for the standard HMM. In this section, the parameter estimation algorithm for the proposed model is derived based on the Expectation-Maximization (EM) algorithm.

1) EM algorithm: First of all, we consider an unobservable state sequence s_l = (s_{l,1}, s_{l,2}, ..., s_{l,t}, ..., s_{l,T}) and the observable signal sequence O_l = (o_{l,1}, o_{l,2}, ..., o_{l,t}, ..., o_{l,T}), where l represents the index of the observed signal sequence at the lth trial. Since the state sequence s is unobservable, the maximization of the likelihood of the s_l and O_l, ∏_{l=1}^{L} L(s_l, O_l; λ) = ∏_{l=1}^{L} P(s_l, O_l | λ), is not directly tractable (L is the number of trials). In the EM algorithm, instead of the optimization of the likelihood itself, the expected value of the log-likelihood Γ over the unobservable state sequence s is locally optimized by an iterative procedure. Γ is given by

Γ = Σ_{l=1}^{L} E[log{P(s, O_l | λ)}].  (14)
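The decoding problem above can likewise be sketched as a log-domain Viterbi recursion (again a generic sketch, not the authors' code):

```python
import numpy as np

def viterbi_path(pi, A, B):
    """Most likely state sequence for initial probs pi, transitions A
    and occurrence probabilities B[t, i] = b_i(o_t)."""
    logA = np.log(A)
    delta = np.log(pi) + np.log(B[0])
    back = []
    for t in range(1, len(B)):
        score = delta[:, None] + logA        # best score of each i -> j move
        back.append(score.argmax(axis=0))
        delta = score.max(axis=0) + np.log(B[t])
    path = [int(delta.argmax())]             # backtrack from the best end state
    for bp in reversed(back):
        path.append(int(bp[path[-1]]))
    return path[::-1]
```

For a left-to-right model, zero entries of A become −inf in the log domain, which the recursion handles naturally.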
Suppose that the current parameters of the model are given by λ = {π_i, a_{ij}, θ_i, σ_i}; the EM algorithm tries to find the new parameter λ' which maximizes the following Q function:

Q(λ, λ') = Σ_{l=1}^{L} E[log{P(s, O_l | λ')} | O_l, λ]  (15)
= Σ_{l=1}^{L} Σ_s P(s | O_l, λ) log{P(s, O_l | λ')}.  (16)

By using the definition

P(s, O_l | λ) = π_{s_1} b_{s_1}(o_{l,1}) a_{s_1 s_2} b_{s_2}(o_{l,2}) a_{s_2 s_3} b_{s_3}(o_{l,3}) ··· a_{s_{T−1} s_T} b_{s_T}(o_{l,T}),  (17)

the Q(λ, λ') can be decomposed as follows:

Q(λ, λ') = Q_1(λ, π'_i) + Q_2(λ, a'_{ij}) + Q_3(λ, (θ'_i, σ'_i)).  (18)

Here, introducing the forward probability α(l, i, t) and the backward probability β(l, i, t) defined as follows,

α(l, i, t) = Σ_{s_1} Σ_{s_2} ··· Σ_{s_{t−1}} π_{s_1} b_{s_1}(o_{l,1}) a_{s_1 s_2} b_{s_2}(o_{l,2}) ··· a_{s_{t−1} i} b_i(o_{l,t})  (19)
β(l, i, t) = Σ_{s_{t+1}} Σ_{s_{t+2}} ··· Σ_{s_T} a_{i s_{t+1}} b_{s_{t+1}}(o_{l,t+1}) a_{s_{t+1} s_{t+2}} b_{s_{t+2}}(o_{l,t+2}) ··· a_{s_{T−1} s_T} b_{s_T}(o_{l,T})  (20)
equation (18) is rewritten as follows:

Q_1(λ, π'_i) = Σ_{l=1}^{L} Σ_{i=1}^{3} k_l π_i b_i(o_{l,1}) β(l, i, 1) log{π'_i}  (21)
Q_2(λ, a'_{ij}) = Σ_{l=1}^{L} Σ_{t=2}^{T} Σ_{i=1}^{3} Σ_{j=1}^{3} k_l α(l, i, t−1) a_{ij} b_j(o_{l,t}) β(l, j, t) log{a'_{ij}}  (22)
Q_3(λ, (θ'_i, σ'_i)) = Σ_{l=1}^{L} Σ_{t=1}^{T} Σ_{i=1}^{3} k_l α(l, i, t) β(l, i, t) log{b'_i(o_{l,t})}  (23)

The meaning of α(l, i, t) is the probability for the model λ to generate the lth observed signal subsequence (o_{l,1}, o_{l,2}, ..., o_{l,t}) up to t and reach the state S_i at t (i.e. s_t = S_i). Also, the meaning of β(l, i, t) is the probability for the model λ to generate the lth observed signal subsequence (o_{l,t+1}, o_{l,t+2}, ..., o_{l,T}) starting from S_i at t (i.e. s_t = S_i) and reach the final state at T. In summary, the following procedure is executed iteratively to maximize Γ:

1) Specify an initial parameter λ = λ_0.
2) Find the λ' which maximizes Q(λ, λ').
3) If λ' = λ, finish the procedure; if λ' ≠ λ, substitute λ' for λ and go to step 2).

2) Local maximization of Q function: The parameter λ' = (π'_i, a'_{ij}, θ'_i, σ'_i) which locally maximizes Q(λ, λ') can be obtained by solving the following equations:

∂Q_1/∂π'_i = 0,  ∂Q_2/∂a'_{ij} = 0,  ∂Q_3/∂θ'_i = 0,  ∂Q_3/∂σ'_i = 0.  (24)

The resulting parameter update laws of π'_i and a'_{ij} are given as follows:

π'_i = Σ_{l=1}^{L} k_l π_i b_i(o_{l,1}) β(l, i, 1) / Σ_{i=1}^{3} Σ_{l=1}^{L} k_l π_i b_i(o_{l,1}) β(l, i, 1)  (25)
a'_{ij} = Σ_{t=2}^{T} Σ_{l=1}^{L} k_l α(l, i, t−1) a_{ij} b_j(o_{l,t}) β(l, j, t) / Σ_{j=1}^{3} Σ_{t=2}^{T} Σ_{l=1}^{L} k_l α(l, i, t−1) a_{ij} b_j(o_{l,t}) β(l, j, t)  (26)

Furthermore, the parameter update laws of θ_2, θ_3, σ_2 and σ_3 are obtained as the following weighted least mean square solution, where ψ_{l,t} denotes the regressor of the feedback schemes:

θ'_i = { Σ_{t=1}^{T} Σ_{l=1}^{L} k_l ψ_{l,t} ψ_{l,t}^T α(l, i, t) β(l, i, t) }^{−1} { Σ_{t=1}^{T} Σ_{l=1}^{L} k_l ψ_{l,t} y_{l,t} α(l, i, t) β(l, i, t) }  (27)
σ'_i = Σ_{t=1}^{T} Σ_{l=1}^{L} k_l (θ'^T_i ψ_{l,t} − y_{l,t})² α(l, i, t) β(l, i, t) / Σ_{t=1}^{T} Σ_{l=1}^{L} k_l α(l, i, t) β(l, i, t)  (i = 2, 3)  (28)

On the other hand, the parameter update laws of θ_1 and σ_1 cannot be obtained in an analytical form because F_1 is a nonlinear function of the parameter θ_1, so the procedure above cannot overcome this problem.
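The weighted least mean square update for the feedback-state parameters amounts to solving a weighted normal equation. A sketch (w_t stands in for the posterior weights k_l α(l, i, t) β(l, i, t); names are ours):

```python
import numpy as np

def wls_update(Psi, y, w):
    """Weighted least-squares M-step for one feedback state.

    Psi : (T, d) regressor rows psi_t
    y   : (T,)   observed outputs y_t
    w   : (T,)   posterior weights
    Returns the updated gains theta and error variance sigma2.
    """
    PsiW = Psi * w[:, None]                       # weight each regressor row
    theta = np.linalg.solve(Psi.T @ PsiW, PsiW.T @ y)
    resid = Psi @ theta - y
    sigma2 = float((w * resid**2).sum() / w.sum())  # weighted mean squared error
    return theta, sigma2
```

If the data were generated exactly by a linear feedback law, the update recovers the gains and a near-zero variance.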
Therefore, the steepest descent optimization is applied to the maximization of Q_3 instead of solving ∂Q_3/∂θ'_1 = 0 and ∂Q_3/∂σ'_1 = 0. The update laws are given by

θ_1^{(k+1)} = θ_1^{(k)} + γ_1 ∂Q_3(λ, (θ'_1, σ'_1))/∂θ'_1 |_{θ'_1 = θ_1^{(k)}},  (29)
σ_1^{(k+1)} = σ_1^{(k)} + γ_2 ∂Q_3(λ, (θ'_1, σ'_1))/∂σ'_1 |_{σ'_1 = σ_1^{(k)}}.  (30)

By embedding this steepest descent optimization for the parameters θ_1 and σ_1 into the EM algorithm, the parameter λ can be locally optimized.

IV. PARAMETER ESTIMATION RESULTS

In this section, the parameter estimation results are shown and discussed. Several initial parameters were tested to find a semi-optimal solution. All twenty trials were used for the parameter estimation independently, and twenty sets of parameters were estimated as the result. An example of the estimated parameters is shown in Tables I to III. Since the parameters θ_{1,1} and θ_{1,2} represent the moving distance x_f and the task time t_f in the MHJM, the estimated θ_{1,1} and θ_{1,2} can be considered to represent the virtual moving distance and the virtual task time caused by the virtual target point specified by the examinee. According to Table I, examinee A seems to set the virtual moving distance to be about .[m] from the starting point, and the virtual task time to be .7[sec]. Table IV shows the estimated θ_{1,1} and θ_{1,2} of the trials of examinee A. Although there is some variance in these parameters, all estimated parameters seem appropriate considering that the actual moving distance of the task is 0.4[m]. In addition, the observed output and the estimated output based on the model are shown in the upper parts of Figs. 5, 6 and 7. The lower parts show the estimated mode switching sequence found by the Viterbi algorithm. The model parameters used in Figs. 5, 6 and 7 are the same as those given in Tables I, II and III, respectively. The solid vertical lines show the estimated switching time from the feedforward scheme (state 1) to the feedback scheme (state 2), and the dashed vertical lines show the estimated switching time between the two feedback schemes.
In the upper figures, the estimated outputs agree well with the observed outputs for all examinees. Thus, the proposed model can capture the dynamical characteristics of the human skill well. Furthermore, the switching time from the feedforward scheme to the feedback scheme is estimated as .4[sec], .[sec] and .79[sec] in Figs. 5, 6 and 7, respectively. This information will be useful for the design of the switched assisting controller of the man-machine cooperative system.

V. COMPARISON WITH OTHER SKILL MODELS

In the proposed skill model, the feedforward and feedback schemes were synthesized together with their switching mechanism. In this section, to verify the validity of the
TABLE I
ESTIMATED PARAMETERS (EXAM. A)
(1) Parameter θ_i
          state 1   state 2   state 3
θ_{i,1}   .         -66.      -.49
θ_{i,2}   .7        -.9       -8.6
(2) Parameter a_{ij}
          i = 1     i = 2     i = 3
j = 1     .97       .4
j = 2               .888      .
j = 3                         .

TABLE II
ESTIMATED PARAMETERS (EXAM. B)
(1) Parameter θ_i
          state 1   state 2   state 3
θ_{i,1}   .74       -9.8      -8.78
θ_{i,2}   .76       -7.84     -.67
(2) Parameter a_{ij}
          i = 1     i = 2     i = 3
j = 1     .944      .6
j = 2               .94       .8
j = 3                         .

TABLE III
ESTIMATED PARAMETERS (EXAM. C)
(1) Parameter θ_i
          state 1   state 2   state 3
θ_{i,1}   .78       -8.68     -4.64
θ_{i,2}   .97       -4.49     -4.4
(2) Parameter a_{ij}
          i = 1     i = 2     i = 3
j = 1     .97       .
j = 2               .99       .7
j = 3                         .

TABLE IV
ESTIMATED PARAMETERS θ_{1,1} AND θ_{1,2} (TRIALS, EXAM. A)
Trial   θ_{1,1}   θ_{1,2}   |   Trial   θ_{1,1}   θ_{1,2}
1       .         .7        |   6       .         .7
2       .76       .         |   7       .7        .7
3       .7        .7        |   8       .4        .79
4       .         .68       |   9       .4        .789
5       .         .678      |   10      .         .649

proposed model, other types of skill models are considered and compared.

A. Switched feedback skill model

First of all, a stochastic switched skill model, wherein only feedback schemes are embedded in the discrete states, is considered and compared. The feedback switched skill model consists only of feedback schemes as follows:

y_t = F_i(r_t, θ_i) + e_{i,t}  (i = 1, 2, 3)  (31)
y_t = ẍ_t,  r_t = [x_t  ẋ_t]^T = [r_{1,t}  r_{2,t}]^T  (32)
F_1(r_t, θ_1) = θ_{1,1} r_{1,t} + θ_{1,2} r_{2,t}  (33)
F_2(r_t, θ_2) = θ_{2,1} r_{1,t} + θ_{2,2} r_{2,t}  (34)
F_3(r_t, θ_3) = θ_{3,1} r_{1,t} + θ_{3,2} r_{2,t}  (35)

Fig. 5. Estimated mode sequence and comparison between model output and observation (exam. A).

Fig. 6. Estimated mode sequence and comparison between model output and observation (exam. B).

Fig. 7. Estimated mode sequence and comparison between model output and observation (exam. C).

For the same data shown in Fig. 3 (examinee A), the parameter estimation was executed. The estimated output calculated by the switched feedback model is depicted in Fig. 8. As shown in Fig. 8, we can see a big difference between the observed and estimated outputs, particularly in the early part of the task. All three examinees show a similar tendency in this comparison.
This is obviously due to the lack of an appropriate feedforward scheme in the early part of the task.
Fig. 8. Estimated mode sequence and comparison between model output and observation (switched feedback model: exam. A).

Fig. 9. Comparison between model output and observation (MHJM: exam. A).

B. Minimum hand jerk motion

A pure feedforward skill model, which consists only of the MHJM (no switching mechanism), is considered and compared. The output of the MHJM can be calculated from (4) by specifying the parameters x_f = 0.4[m] and t_f. t_f was set to be .7[sec] by trial and error so as to generate the most appropriate profile. Again, the data of examinee A was used. The calculated output is depicted in Fig. 9. In contrast with the switched feedback skill model, there is a big difference between the observed and calculated outputs, particularly in the latter part of the task. This clearly indicates that some feedback scheme must be introduced in the latter part of the task to accomplish the positioning action. These comparisons support the validity of the proposed FF/FB switched skill model.

VI. DISCUSSION AND APPLICATIONS

Based on the comparison shown in Section V, the proposed skill model can be regarded as a natural combination of existing models. In this section, some interesting applications are described.

1) Design of assisting system based on switched impedance control: As shown in Section IV, the proposed skill model enables us to estimate the operator's switching point from the feedforward scheme to the feedback scheme. This information can be exploited for the switching of the impedance parameters in the assisting impedance control. For example, in the latter part of the task, the damping coefficient should be raised to assist the precise positioning, while it must be small in the early part to realize a quick startup. This kind of assisting scenario can be realized by using the proposed skill model.
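The switched-impedance assisting scenario can be sketched as a simple damping schedule keyed to the estimated FF-to-FB switching time (all names and numeric values here are hypothetical, not taken from the paper):

```python
def assist_damping(t, t_switch, d_low=5.0, d_high=40.0):
    """Hypothetical switched impedance assist [Ns/m]: low damping
    before the estimated feedforward-to-feedback switching time
    (quick startup), high damping after it (precise positioning)."""
    return d_low if t < t_switch else d_high
```

In a real assisting controller the switch would be smoothed, and t_switch would come from the decoded mode sequence of the skill model.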
2) Skill recognition: Since it is straightforward to calculate the likelihood of the observed signal over the proposed skill model, the quantitative evaluation of the observed skill can be realized from the viewpoint of the stochastic dynamics. The quantitative evaluation can be used for the analysis of experience, skillfulness, and so on. Furthermore, the learning characteristics of the human operator can be analyzed by investigating the estimated model parameters.

VII. CONCLUSION

In this paper, a new stochastic switched skill model for the sliding task, wherein the feedforward and feedback schemes are embedded in different discrete states (FF/FB switched skill model), has been proposed. In particular, in the discrete state corresponding to the feedforward scheme, a minimum jerk motion is embedded, while in the discrete states corresponding to the feedback scheme, standard linear feedback control laws are embedded. Then, the parameter estimation algorithm for the proposed switched model was derived. Finally, the usefulness of the proposed modeling has been verified and discussed through experiments.

REFERENCES

[] H. Okuda, S. Hayakawa, T. Suzuki and T. Tsuchida, "Modeling of Human Behavior in Man-Machine Cooperative System Based on Hybrid System Framework," Proc. of IEEE Intl. Conf. on Robotics and Automation, 2007.
[] D. Del Vecchio, R.M. Murray and P. Perona, "Decomposition of human motion into dynamics-based primitives with application to drawing tasks," Automatica, Vol. 39, pp. 2085-2098, 2003.
[] G. Ferrari-Trecate, M. Muselli, D. Liberati and M. Morari, "A clustering technique for the identification of piecewise affine systems," Automatica, Vol. 39, No. 2, pp. 205-217, 2003.
[4] E. Bizzi, N. Accornero, W. Chapple and N. Hogan, "Posture control and trajectory formation during arm movement," J. Neuroscience, Vol. 4, pp. 2738-2744, 1984.
[] A. Polit and E. Bizzi, "Characteristics of the motor programs underlying arm movements in monkeys," J. Neurophysiology, Vol. 42, pp. 183-194, 1979.
[6] Y. Uno, M. Kawato and R. Suzuki, "Formation and control of optimal trajectory in human multijoint arm movement: minimum torque-change model," Biological Cybernetics, Vol. 61, pp. 89-101, 1989.
[7] D.A. Rosenbaum et al., "Planning reaches by evaluating stored postures," Psychological Review, Vol. 102, pp. 28-67, 1995.
[8] T. Flash and N. Hogan, "The coordination of arm movements: An experimentally confirmed mathematical model," Journal of Neuroscience, Vol. 5, pp. 1688-1703, 1985.
[9] M. Kawato and H. Gomi, "A computational model of four regions of the cerebellum based on feedback-error learning," Biological Cybernetics, Vol. 68, pp. 95-103, 1992.
[] W. Spijkers and P. Lochner, "Partial visual feedback and spatial end-point accuracy of discrete aiming movements," J. Motor Behavior, Vol. 26, 1994.
[] L.R. Rabiner, "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition," Proceedings of the IEEE, Vol. 77, No. 2, pp. 257-286, 1989.