On The Role Of Head Motion In Affective Expression Atanu Samanta, Tanaya Guha March 9, 2017 Department of Electrical Engineering Indian Institute of Technology, Kanpur, India
Introduction

Applications of affect analysis and recognition:
- Human-computer interaction (HCI): e-learning; indexing, searching and retrieval of audiovisual content
- Mental and behavioral health: counseling; screening for autism
- Assistive technology: teaching social skills to children with autism; helping improve public-speaking skills by predicting audience response
- Marketing: impact of ads, movies

1
Introduction

Modalities of affective cues: biophysical, audio, visual

2
Introduction

Visual cues pertinent to affect:
- Facial expressions
- Body pose
- Hand and body gestures

3
Related Work

Facial expression in affect analysis [Zeng et al. 2009] (image: Pantic et al. 2007)
- Facial Action Coding System (FACS) [Ekman et al. 1976]
- Automatic detection of Action Units (AUs) [Tian et al. 2001; Tong et al. 2007; Valstar et al. 2012]
- LBP [Shan et al. 2009]; LBP-TOP [Zhao et al. 2007]

4
Related Work

Head motion in affect analysis:
- Participants recognize emotion with 70% accuracy from head movement alone [Livingstone et al. 2016]
- Realistic head-motion synthesis using speech features [Yehia et al. 2002; Busso et al. 2005; Busso et al. 2007]
- Predicting emotion in the continuous domain from head motion and head gestures [Gunes et al. 2007]
- Interpersonal coordination of head motion in interactions between distressed couples [Hammal et al. 2014]
- Spontaneous affect from head motion and temperature change in infrared thermal video [Liu et al. 2015]
- Decoupling head motion from facial expression [Adams et al. 2015]
- [Hammal et al. 2015] show that head motion is significantly faster in negative than in positive emotions in infants

5
Objectives

Does head motion alone contain significant information to distinguish one emotion from the others?

6
Objectives

Does head motion alone contain significant information to distinguish one emotion from the others? Is this information complementary to facial expression?

7
Objectives

Does head motion contain significant information for distinguishing one emotion from the others? Is this information complementary to facial expression?

Note: we do not intend to build the best classifier for emotion recognition.

8
Database

Acted Facial Expression in the Wild (AFEW) [Dhall et al. 2012]
- video clips from 54 movies
- labels: anger, disgust, fear, joy, neutral, sadness, and surprise
- each clip contains only one primary face or character
- recorded at 25 frames/sec

A few video clips were manually removed. Video clips used in our study per emotion class:

Anger: 161, Disgust: 107, Fear: 103, Joy: 192, Neutral: 180, Sadness: 147, Surprise: 105

9
Rigid Head Motion

Head pose as Euler angles: θ^i = [θ_p^i, θ_y^i, θ_r^i]^T (pitch, yaw, roll)

10
Head Pose Estimation

Face detection [Viola et al. 2004]; Incremental Face Alignment [Asthana et al. 2014]

s(p) = s R (s̄ + Φ_s g) + t,  with p = [s; θ_p; θ_y; θ_r; t_x; t_y; g]^T

where R is the 3D rotation matrix, s is scale, t = [t_x, t_y, 0]^T is translation, g is the non-rigid transformation, and s̄ is the mean face shape.

Note: (1) cubic interpolation to estimate head pose at frames where no face is detected; (2) Gaussian smoothing to remove noise

11
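The two post-processing steps in the note (cubic interpolation over missed frames, then Gaussian smoothing) can be sketched as follows. This is a minimal illustration using SciPy, not the authors' implementation; the function name and the toy frame values are hypothetical.

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.ndimage import gaussian_filter1d

def clean_pose_series(frames, angles, n_frames, sigma=1.0):
    """Estimate one Euler-angle series at every frame of a clip:
    cubic interpolation fills frames where no face was detected,
    then a 1-D Gaussian filter suppresses tracker noise."""
    spline = CubicSpline(frames, angles)          # fit detected frames
    full = spline(np.arange(n_frames))            # pose at every frame index
    return gaussian_filter1d(full, sigma=sigma)   # denoised series

# toy example: pitch detected on only a subset of a 10-frame clip
detected = np.array([0, 2, 3, 5, 7, 9])
pitch = np.array([1.0, 1.5, 1.7, 2.0, 1.8, 1.2])
smooth = clean_pose_series(detected, pitch, n_frames=10)
```

The same routine would be applied independently to the pitch, yaw, and roll series of each clip.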
Head Pose Estimation

Examples of detected facial landmark points and estimated head pose in terms of θ_p, θ_y, and θ_r, e.g. (θ_p = 7.66, θ_y = 36.48, θ_r = 0.64) and (θ_p = 1.25, θ_y = 9.99, θ_r = 1.88)

12
Characterizing Head Motion

Non-parametric approach: RMS measurements of head motion dynamics

Displacement, velocity and acceleration time-series:

θ_d^i = θ^i − (1/N) Σ_{j=1}^{N} θ^j
θ_v^i = (θ_d^{i+1} − θ_d^i) × FrameRate
θ_a^i = (θ_v^{i+1} − θ_v^i) × FrameRate

9D feature vector: RMS of all nine time-series (three measurements × three angles)

13
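The 9-D RMS feature above can be computed in a few lines. A minimal sketch, assuming the input is an (N, 3) array of per-frame [pitch, yaw, roll] angles; the function name is hypothetical.

```python
import numpy as np

def head_motion_rms(theta, frame_rate=25.0):
    """theta: (N, 3) array of [pitch, yaw, roll] per frame (degrees).
    Returns the 9-D feature: RMS of displacement, velocity and
    acceleration for each of the three Euler angles."""
    disp = theta - theta.mean(axis=0)          # θ_d: deviation from mean pose
    vel = np.diff(disp, axis=0) * frame_rate   # θ_v: deg/sec
    acc = np.diff(vel, axis=0) * frame_rate    # θ_a: deg/sec^2
    rms = lambda x: np.sqrt((x ** 2).mean(axis=0))
    return np.concatenate([rms(disp), rms(vel), rms(acc)])

# toy 100-frame pose track (AFEW clips are recorded at 25 frames/sec)
feat = head_motion_rms(np.random.randn(100, 3))
```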
RMS Measurements of Head Movement Dynamics

[Scatter plots: RMS angular displacement (deg), RMS angular velocity (deg/sec), and RMS angular acceleration (deg/sec^2) over pitch, yaw, and roll, for anger vs. neutral]

RMS measurements of head motion dynamics for anger vs. neutral

14
Characterizing Head Motion

Autoregressive modeling: AR(3) coefficients

θ(t) = a_0 + a_1 θ(t−1) + a_2 θ(t−2) + a_3 θ(t−3)

12D feature vector: (a_0, a_1, a_2, a_3) for each of the three time-series

15
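One standard way to obtain the AR(3) coefficients is ordinary least squares on lagged copies of the series. The slides do not specify the fitting method, so this is a hedged sketch; the function name is hypothetical.

```python
import numpy as np

def fit_ar3(theta):
    """Least-squares fit of the AR(3) model
    θ(t) = a0 + a1·θ(t-1) + a2·θ(t-2) + a3·θ(t-3).
    Returns the coefficient vector [a0, a1, a2, a3]."""
    y = theta[3:]                              # targets θ(t) for t >= 3
    X = np.column_stack([np.ones(len(y)),      # intercept a0
                         theta[2:-1],          # θ(t-1)
                         theta[1:-2],          # θ(t-2)
                         theta[:-3]])          # θ(t-3)
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs

rng = np.random.default_rng(0)
# toy angle series; real input would be one smoothed Euler-angle series per clip
coeffs = fit_ar3(rng.standard_normal(100))
```

Concatenating the four coefficients from the pitch, yaw, and roll series gives the 12-D feature vector.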
Statistical Significance Test

Hypothesis: head motion characteristics differ across emotions

Tests:
- ANOVA using each RMS measurement separately
- Post-hoc multiple comparison and paired t-tests

16
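The testing procedure can be sketched with SciPy. The slides report paired t-tests for one emotion vs. all others; as an approximation, the sketch below uses an unpaired Welch two-sample t-test for the one-vs-rest comparison, and the per-group sample values are entirely made up for illustration.

```python
import numpy as np
from scipy.stats import f_oneway, ttest_ind

rng = np.random.default_rng(0)
# hypothetical per-clip RMS pitch-velocity samples for three emotion classes
# (real values would come from the AFEW clips; these numbers are invented)
groups = {
    "anger":   rng.normal(30.0, 5.0, 40),
    "joy":     rng.normal(28.0, 5.0, 40),
    "neutral": rng.normal(15.0, 5.0, 40),
}

# one-way ANOVA across all emotion groups for this RMS measurement
F, p_anova = f_oneway(*groups.values())

# one-vs-rest comparison for a single emotion (Welch two-sample t-test)
rest = np.concatenate([v for k, v in groups.items() if k != "anger"])
t_stat, p_anger = ttest_ind(groups["anger"], rest, equal_var=False)
```

Each of the nine RMS measurements would be tested this way, giving one p-value per (emotion, measurement) pair as in the table on the next slide.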
Multiple Comparison

[Plots: 95% confidence intervals of in-class means for each of the nine RMS measurements (displacement, velocity, and acceleration of pitch, yaw, and roll), per emotion. For most measurements, 2 to 4 emotion groups have means significantly different from Neutral.]

17
Paired t-test

Paired t-test results (p-values) for one emotion vs. all others:

Emotion  | Displacement (Pitch/Yaw/Roll) | Velocity (Pitch/Yaw/Roll) | Acceleration (Pitch/Yaw/Roll)
Anger    | 0.000 / 0.432 / 0.851         | 0.000 / 0.024 / 0.015     | 0.000 / 0.015 / 0.041
Disgust  | 0.389 / 0.000 / 0.002         | 0.222 / 0.031 / 0.167     | 0.001 / 0.839 / 0.220
Fear     | 0.322 / 0.096 / 0.067         | 0.083 / 0.052 / 0.722     | 0.496 / 0.405 / 0.605
Joy      | 0.007 / 0.747 / 0.001         | 0.000 / 0.000 / 0.000     | 0.000 / 0.000 / 0.000
Neutral  | 0.000 / 0.000 / 0.002         | 0.000 / 0.000 / 0.000     | 0.000 / 0.000 / 0.000
Sadness  | 0.510 / 0.693 / 0.876         | 0.000 / 0.002 / 0.001     | 0.000 / 0.000 / 0.000
Surprise | 0.000 / 0.414 / 0.005         | 0.000 / 0.310 / 0.000     | 0.000 / 0.006 / 0.000

18
Observations
- Differences among emotions are more significant in velocity and acceleration than in displacement.
- Anger, joy, and neutral are easily distinguishable from other emotions.
- Sadness, surprise, and neutral are not distinguishable from each other.
- Pitch velocity and acceleration are significantly higher for anger and joy.
- Roll velocity and acceleration are significantly higher for joy.

19
Classification (using RMS measurements)

kNN classifier (k = 5); feature: 9-dimensional vector of RMS measurements; 10-fold cross-validation

% accuracy of one-against-all emotion classification:
Anger: 81, Disgust: 82, Fear: 79, Joy: 83, Neutral: 83, Sadness: 82, Surprise: 83

Comparable with a previously reported psychological experiment [Livingstone et al. 2016]

20
Classification (using RMS measurements)

kNN classifier with k = 5; training set: 80% of the data, test set: remaining 20%; feature: 9D vector of RMS measurements

Confusion matrix (%):
          Anger  Disgust  Fear  Joy  Neutral  Sadness  Surprise
Anger       47      00     04    25     16       08       00
Disgust     16      13     06    09     25       31       00
Fear        17      07     03    23     43       07       00
Joy         25      05     05    56     05       04       00
Neutral     07      02     02    06     61       20       02
Sadness     20      05     07    07     38       18       05
Surprise    09      07     00    22     50       09       03

Overall classification accuracy: 34%

21
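A kNN classifier with k = 5 over such feature vectors is straightforward to sketch in NumPy. This is a minimal illustration on a made-up two-class toy set, not the study's pipeline; the function name and toy data are hypothetical.

```python
import numpy as np

def knn_predict(X_train, y_train, X_test, k=5):
    """Predict each test point's label by majority vote among its
    k nearest training points under Euclidean distance."""
    preds = []
    for x in X_test:
        d = np.linalg.norm(X_train - x, axis=1)      # distance to all training points
        nearest = y_train[np.argsort(d)[:k]]         # labels of the k nearest
        vals, counts = np.unique(nearest, return_counts=True)
        preds.append(vals[np.argmax(counts)])        # majority vote
    return np.array(preds)

# toy check: two well-separated clusters of 9-D "RMS features"
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.3, (20, 9)), rng.normal(5, 0.3, (20, 9))])
y = np.array([0] * 20 + [1] * 20)
pred = knn_predict(X, y, X, k=5)
```

In the study, X would hold the 9-D RMS (or 12-D AR(3)) vectors and y the seven emotion labels, evaluated with an 80/20 split or 10-fold cross-validation.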
Classification (using AR(3) coefficients)

kNN classifier with k = 5; training set: 80% of the data, test set: remaining 20%; feature: 12D vector of AR(3) coefficients

Confusion matrix (%):
          Anger  Disgust  Fear  Joy  Neutral  Sadness  Surprise
Anger       31      10     06    28     19       06       00
Disgust     18      09     05    27     23       18       00
Fear        19      00     10    19     42       05       05
Joy         19      06     05    30     24       16       00
Neutral     17      08     03    22     42       08       00
Sadness     10      04     00    41     31       14       00
Surprise    25      15     00    25     35       00       00

Overall classification accuracy: 22%

22
Comparison With Facial Expression

Non-verbal cue                        | Accuracy (%)
Head motion (RMS measures)            | 34.23
Facial expression + head motion (RMS) | 36.15
Facial expression (AFEW baseline¹)    | 39.33

¹ Dhall et al. 2015.

23
Conclusion

Contribution: a systematic study of the significance of head motion in communicating affect.

Summary:
- Head motion alone carries significant information to distinguish any basic emotion from the rest.
- Head motion information is complementary to that of facial expression.
- Angular velocity and acceleration discriminate better among emotions than displacement.

24
Questions? 24