Juergen Gall. Analyzing Human Behavior in Video Sequences

Size: px

Start display at page:

Download "Juergen Gall. Analyzing Human Behavior in Video Sequences"

Ophelia Hancock
5 years ago
Views:

1 Juergen Gall Analyzing Human Behavior in Video Sequences

2 Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 2 Analyzing Human Behavior Analyzing Human Behavior Human Pose Low level features, e.g., gradients, optical flow Videos

3 Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 3 21 Actions from HMDB HMDB51 (Kuehne et al, ICCV 2011) 928 clips, frames

4 Puppet Annotation Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 4

5 Joint-annotated HMDB (JHMDB) [ H. Jhuang et al. Towards Understanding Action Recognition. ICCV 2013 ] [ ] Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 5

(2013) Low Mid High baseline given puppet flow given puppet mask given joint positions baseline given flow given

6 Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 6 Study with Annotated Data (2013) Low Mid High baseline given puppet flow given puppet mask given joint positions baseline given flow given mask pose features GT + ~11% + ~9% + ~20% Large potential gain for pose feature Not with existing 2d human pose methods [ H. Jhuang et al. Towards Understanding Action Recognition. ICCV 2013 ] [ ]

7 Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 7 CNNs for Pose Estimation Stack CNNs: [ S.-E. Wei et al. Convolutional Pose Machines. CVPR 2016 ]

8 Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 16 Coupled Action Recognition and Pose Estimation [ U. Iqbal et al. Pose for Action Action for Pose. FG 2017 ]

9 Pose Estimation in Videos Video datasets for human pose in unconstrained videos does not exist. [ U. Iqbal et al. Pose-Track: Joint Multi-Person Pose Estimation and Tracking. CVPR 2017 ] Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 18

10 Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 19 Pose Estimation in Videos Video datasets for human pose in unconstrained videos does not exist. Unconstrained means Public available content from the Internet (e.g. Youtube) Multiple persons in a video (no assumption about position) Arbitrary number of visible joints (truncation and occlusion) Large scale variations (unknown scale) [ U. Iqbal et al. Pose-Track: Joint Multi-Person Pose Estimation and Tracking. CVPR 2017 ]

11 Pose-Track Dataset [ U. Iqbal et al. Pose-Track: Joint Multi-Person Pose Estimation and Tracking. CVPR 2017 ] Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 20

12 Joint-annotated HMDB (JHMDB) [ H. Jhuang et al. Towards Understanding Action Recognition. ICCV 2013 ] [ ] Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 22

13 Pose-Track Dataset [ U. Iqbal et al. Pose-Track: Joint Multi-Person Pose Estimation and Tracking. CVPR 2017 ] Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 23

14 Pose-Track Dataset Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 24

15 Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 25 Challenge ICCV 2017 [ ]

16 Pose Track: Simultaneous Pose Estimation and Tracking Estimate pose + person association over time: [ U. Iqbal et al. Pose-Track: Joint Multi-Person Pose Estimation and Tracking. CVPR 2017 ] Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 26

17 Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 27 Pose Track: Simultaneous Pose Estimation and Tracking Estimate pose + person association over time: Predict body joints (CNN trained on MPII Pose)

18 Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 28 Pose Track: Simultaneous Pose Estimation and Tracking Estimate pose + person association over time: Predict body joints (CNN trained on MPII Pose) Build a graph with temporal and spatial edges [ U. Iqbal et al. Pose-Track: Joint Multi-Person Pose Estimation and Tracking. CVPR 2017 ]

19 Pose Track: Simultaneous Pose Estimation and Tracking f f f [ U. Iqbal et al. Pose-Track: Joint Multi-Person Pose Estimation and Tracking. CVPR 2017 ] Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 29

20 Pose Track: Simultaneous Pose Estimation and Tracking f f f [ U. Iqbal et al. Pose-Track: Joint Multi-Person Pose Estimation and Tracking. CVPR 2017 ] Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 30

21 Pose Track: Simultaneous Pose Estimation and Tracking Unaries: Confidences of detected joints [ U. Iqbal et al. Pose-Track: Joint Multi-Person Pose Estimation and Tracking. CVPR 2017 ] Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 31

22 Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 32 Pose Track: Simultaneous Pose Estimation and Tracking Spatial binaries: Extract quadratic bounding box around detection Two cases: Different joint type: Logistic regression based on distance and orientation [ U. Iqbal et al. Pose-Track: Joint Multi-Person Pose Estimation and Tracking. CVPR 2017 ]

23 Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 33 Pose Track: Simultaneous Pose Estimation and Tracking Spatial binaries: Extract quadratic bounding box around detection Two cases: Same joint type: [ U. Iqbal et al. Pose-Track: Joint Multi-Person Pose Estimation and Tracking. CVPR 2017 ]

Estimation and Tracking. CVPR 2017 ] 09.

24 Pose Track: Simultaneous Pose Estimation and Tracking [ U. Iqbal et al. Pose-Track: Joint Multi-Person Pose Estimation and Tracking. CVPR 2017 ] Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 34

25 Pose Track: Simultaneous Pose Estimation and Tracking Temporal binaries: Compute optical flow (DeepMatching) [ U. Iqbal et al. Pose-Track: Joint Multi-Person Pose Estimation and Tracking. CVPR 2017 ] Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 35

26 Pose Track: Simultaneous Pose Estimation and Tracking Temporal binaries: Compute optical flow (DeepMatching) Logistic regression: [ U. Iqbal et al. Pose-Track: Joint Multi-Person Pose Estimation and Tracking. CVPR 2017 ] Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 36 f f

27 Pose Track: Simultaneous Pose Estimation and Tracking Solve integer linear program: f f f [ U. Iqbal et al. Pose-Track: Joint Multi-Person Pose Estimation and Tracking. CVPR 2017 ] Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 37

28 Pose Track: Simultaneous Pose Estimation and Tracking Solve integer linear program: [ U. Iqbal et al. Pose-Track: Joint Multi-Person Pose Estimation and Tracking. CVPR 2017 ] Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 38

29 Pose Track: Simultaneous Pose Estimation and Tracking Solve integer linear program: f f f [ U. Iqbal et al. Pose-Track: Joint Multi-Person Pose Estimation and Tracking. CVPR 2017 ] Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 39

30 Pose Track: Simultaneous Pose Estimation and Tracking To obtain plausible pauses, constraints are added: Spatial transitivity: [ U. Iqbal et al. Pose-Track: Joint Multi-Person Pose Estimation and Tracking. CVPR 2017 ] Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 40

31 Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 42 Pose Track: Simultaneous Pose Estimation and Tracking To obtain plausible pauses, constraints are added: Spatial transitivity: Temporal transitivity: [ U. Iqbal et al. Pose-Track: Joint Multi-Person Pose Estimation and Tracking. CVPR 2017 ]

32 Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 43 Pose Track: Simultaneous Pose Estimation and Tracking To obtain plausible pauses, constraints are added: Spatial transitivity: Temporal transitivity: Spatio-temporal trans.: [ U. Iqbal et al. Pose-Track: Joint Multi-Person Pose Estimation and Tracking. CVPR 2017 ]

Pose Track: Simultaneous Pose Estimation and Tracking 09. 10.

33 Pose Track: Simultaneous Pose Estimation and Tracking Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 44

34 Pose Track: Simultaneous Pose Estimation and Tracking To obtain plausible pauses, constraints are added: Spatio-temporal consistency: [ U. Iqbal et al. Pose-Track: Joint Multi-Person Pose Estimation and Tracking. CVPR 2017 ] Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 45

35 Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 46 Pose Track: Simultaneous Pose Estimation and Tracking Estimate pose + person association over time: Predict body joints (CNN trained on MPII Pose) Build a graph with temporal and spatial edges Partition spatio-temporal graph [ U. Iqbal et al. Pose-Track: Joint Multi-Person Pose Estimation and Tracking. CVPR 2017 ]

36 Pose Track: Simultaneous Pose Estimation and Tracking Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 47

37 Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 48 Pose Track: Evaluation Pose estimation accuracy (map) Person association (MOTA) [ U. Iqbal et al. Pose-Track: Joint Multi-Person Pose Estimation and Tracking. CVPR 2017 ]

38 Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 49 Pose Track: Evaluation Pose estimation accuracy (map) Person association (MOTA) [ U. Iqbal et al. Pose-Track: Joint Multi-Person Pose Estimation and Tracking. CVPR 2017 ]

39 Joint-annotated HMDB (JHMDB) [ H. Jhuang et al. Towards Understanding Action Recognition. ICCV 2013 ] [ ] Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 50

Video Analysis for Studying the Behavior of Mice 09. 10.

40 Video Analysis for Studying the Behavior of Mice Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 72

41 10/ 9 / Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 74 Recurrent Neural Networks Gated units (LSTM/GRU)

42 Weakly Supervised Learning Fully supervised: Weakly supervised (transcripts) [ A. Richard et al. Weakly Supervised Action Learning with RNN based Fine-to-Coarse Modeling. CVPR 2017 ] Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 75

43 Weakly Supervised Learning Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 77

44 Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 78 Weakly Supervised Learning Represent an activity a like spoon_powder by latent sub-activities s 1 (a),s 2 (a),s 3 (a), s (a) 1 s (a) 2 s (a) 3 s (a) 4 s (a) 5 s (a) 6 Optimal number of sub-activities is unknown: Many sub-activities for long activities Few sub-activities for short activities

45 Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 79 Model RNN with Gated Recurrent Units (GRU)

46 Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 80 Model Hidden Markov Model (HMM) enforce fixed order of sub-activities: s 1 (a),s 2 (a),s 3 (a), HMMs use probabilities of RNN as input

47 Model Hidden Markov Model (HMM) for each activity Juergen Gall Institute of Computer Science III Computer Vision Group 81

48 Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 82 Model The transcripts define the order of activities:

49 Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 83 Model The transcripts define the order of activities:

50 Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 84 Model The transcripts define the order of activities:

51 Weakly Supervised Learning [ A. Richard et al. Weakly Supervised Action Learning with RNN based Fine-to-Coarse Modeling. CVPR 2017 ] Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 85

52 Weakly Supervised Learning [ A. Richard et al. Weakly Supervised Action Learning with RNN based Fine-to-Coarse Modeling. CVPR 2017 ] Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 86

53 Weakly Supervised Learning [ A. Richard et al. Weakly Supervised Action Learning with RNN based Fine-to-Coarse Modeling. CVPR 2017 ] Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 87

54 Weakly Supervised Learning [ A. Richard et al. Weakly Supervised Action Learning with RNN based Fine-to-Coarse Modeling. CVPR 2017 ] Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 88

55 Weakly Supervised Learning [ A. Richard et al. Weakly Supervised Action Learning with RNN based Fine-to-Coarse Modeling. CVPR 2017 ] Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 89

56 Results Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 90

57 Results Accuracy on unseen sequences (video without transcript) [ A. Richard et al. Weakly Supervised Action Learning with RNN based Fine-to-Coarse Modeling. CVPR 2017 ] Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 91

58 Results Accuracy on unseen sequences (video without transcript) [ A. Richard et al. Weakly Supervised Action Learning with RNN based Fine-to-Coarse Modeling. CVPR 2017 ] Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 92

59 Results Accuracy on unseen sequences (video with transcript) [ A. Richard et al. Weakly Supervised Action Learning with RNN based Fine-to-Coarse Modeling. CVPR 2017 ] Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 93

Research Unit - Anticipating Human Behavior [ https://pages.iai.uni-bonn.de/for2535 ] 2 0.

60 Research Unit - Anticipating Human Behavior [ ] R e s e a r c h U n i t Antici p a t i n g H u m a n B e h a vi o r 94

Research Unit - Anticipating Human Behavior 2 0. 0 9.

61 Research Unit - Anticipating Human Behavior R e s e a r c h U n i t Antici p a t i n g H u m a n B e h a vi o r 95

62 Thank you for your attention Juer gen Gall Instit u t e of Com puter Science III Com puter Vision Gr oup 96

Deep Reinforcement Learning for Unsupervised Video Summarization with Diversity- Representativeness Reward

Deep Reinforcement Learning for Unsupervised Video Summarization with Diversity- Representativeness Reward Kaiyang Zhou, Yu Qiao, Tao Xiang AAAI 2018 What is video summarization? Goal: to automatically