Singlets: Multi-resolution Motion Singularities for Soccer Video Abstraction
Katy Blanc, Diane Lingrand, Frederic Precioso
I3S Laboratory, Sophia Antipolis, France
Overview
- Video & Motion
- Singularities & Singlets
- Soccer Salient Moments
Video Analysis
- Burst of video content production: new sources of videos; big databases such as YouTube-8M [1]; most-analyzed types: meetings/conferences, movies, news and sports
- Diverse applications: database browsing, automatic video surveillance, driverless cars
- Exponential amount of information: one HDTV soccer match is 324,000 images of 1920 x 1080 pixels each
- Collaboration with Wildmoka, themselves in collaboration with INA and BeIN
Related work: video description
- Modelling human motion [2]
- Handcrafted features: STIP [3], iDT [4]
- Deep learning representations [5]
Related work: sport abstraction
- Clue detection to find highlights: ground color, jersey color, shot segmentation and view classification
- Line mark positions and shot detection: Ye et al. [6]
- Goals, attacks and other events using logo and score appearances and goal mouth position: Zawbaa et al. [7]
- Face and skin detection, whistle detector and user specifications: Raventos et al. [8]
Inspiration: fluid movement
Inspired by the work of Druon et al. [9] and the subsequent work of Kihl et al. [10]
Optical Flow Approximation
- Optical flow = discrete bivariate vector field $F : \Omega \to \mathbb{R}^2$, $(x_1, x_2) \mapsto (U(x_1, x_2), V(x_1, x_2))$, with $\Omega = [-1, 1]^2$
- Polynomial subspace and the Legendre basis: $P_{K,L}(x_1, x_2) = \sum_{k=0}^{K} \sum_{l=0}^{L} \gamma_{k,l}\, x_1^k x_2^l$
- Projection on the Legendre basis with $K + L \le D$, shown here for degree 1:
  $U = u_{0,0} P_{0,0} + u_{0,1} P_{0,1} + u_{1,0} P_{1,0}$
  $V = v_{0,0} P_{0,0} + v_{0,1} P_{0,1} + v_{1,0} P_{1,0}$
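The degree-1 projection can be sketched numerically. This is a minimal sketch, not the authors' implementation: it assumes unnormalized Legendre polynomials (so that $P_0 = 1$ and $P_1(x) = x$) sampled on a regular grid over $[-1, 1]^2$, and it uses a least-squares fit, which coincides with the orthogonal projection for the discrete scalar product on that grid. The names `legendre_design` and `project_flow` are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre


def legendre_design(h, w, degree):
    """Sample every P_{k,l}(x1, x2) = P_k(x1) * P_l(x2) with k + l <= degree
    on an h x w grid covering [-1, 1]^2 (x1 along columns, x2 along rows)."""
    x1 = np.linspace(-1.0, 1.0, w)
    x2 = np.linspace(-1.0, 1.0, h)
    X1, X2 = np.meshgrid(x1, x2)
    columns, index = [], []
    for k in range(degree + 1):
        for l in range(degree + 1 - k):
            ck = np.zeros(k + 1)
            ck[k] = 1.0  # selects the Legendre polynomial of degree k
            cl = np.zeros(l + 1)
            cl[l] = 1.0  # selects the Legendre polynomial of degree l
            basis = legendre.legval(X1, ck) * legendre.legval(X2, cl)
            columns.append(basis.ravel())
            index.append((k, l))
    return np.stack(columns, axis=1), index


def project_flow(U, V, degree=1):
    """Return two dicts {(k, l): coefficient}, one for U and one for V."""
    B, index = legendre_design(U.shape[0], U.shape[1], degree)
    targets = np.stack([U.ravel(), V.ravel()], axis=1)
    coeffs, *_ = np.linalg.lstsq(B, targets, rcond=None)
    u = dict(zip(index, coeffs[:, 0]))
    v = dict(zip(index, coeffs[:, 1]))
    return u, v
```

With a synthetic flow such as `U = 1.38 + 0.5 * x1`, the fit should return `u[(0, 0)]` close to 1.38 (the global horizontal displacement) and `u[(1, 0)]` close to 0.5.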
Polynomial projection
[Figure: original flow $F = (U, V)$, its components $U$ and $V$, their projections on the Legendre basis, and the resulting affine form $\begin{pmatrix} U \\ V \end{pmatrix} = A \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + b$]
Approximation and coefficient analysis
From a simple analysis example to production, scenarization, event semantization
- Order-0 example: $\begin{pmatrix} U \\ V \end{pmatrix} = \begin{pmatrix} 1.38 \\ 0 \end{pmatrix}$
- $u_{0,0}$: horizontal global displacement; $v_{0,0}$: vertical global displacement
[Figure: coefficient value against frame number for $u_{0,0}$ and $v_{0,0}$, shown as global displacement and as global position, with a counterattack annotated]
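Singularity
- First, projection on the Legendre basis:
  $U = u_{0,0} P_{0,0} + u_{0,1} P_{0,1} + u_{1,0} P_{1,0}$
  $V = v_{0,0} P_{0,0} + v_{0,1} P_{0,1} + v_{1,0} P_{1,0}$
- Then on the canonical basis: $\begin{pmatrix} U \\ V \end{pmatrix} = A \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + b$, with $A \in M_{2,2}$ and $b \in \mathbb{R}^2$
- 6 types of singularities, characterized by $\Delta_A = \operatorname{tr}(A)^2 - 4 \det(A)$ and by $\lambda_1$ and $\lambda_2$, the eigenvalues of $A$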
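The passage from the degree-1 coefficients to a singularity type can be sketched as follows. This is a hedged illustration: it assumes unnormalized Legendre polynomials (so the degree-1 coefficients are directly the entries of $A$ and $b$), it locates the singularity where the affine flow vanishes, and it uses the standard phase-portrait classification of linear systems, which is assumed here to correspond to the six types on the slide. The function names and the tolerance `tol` are hypothetical.

```python
import numpy as np


def affine_from_coefficients(u, v):
    """Build A and b from dicts {(k, l): coefficient}, k being the degree in x1."""
    A = np.array([[u[(1, 0)], u[(0, 1)]],
                  [v[(1, 0)], v[(0, 1)]]], dtype=float)
    b = np.array([u[(0, 0)], v[(0, 0)]], dtype=float)
    return A, b


def classify_singularity(A, b, tol=1e-6):
    """Return (kind, position, delta), or None when A is singular."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    det = np.linalg.det(A)
    if abs(det) < tol:
        return None  # no isolated point where the flow vanishes
    position = -np.linalg.solve(A, b)  # solves A x + b = 0
    tr = np.trace(A)
    delta = tr ** 2 - 4.0 * det
    if abs(delta) <= tol:
        # Repeated eigenvalue: star node if A is a multiple of the identity.
        is_star = np.allclose(A, A[0, 0] * np.eye(2), atol=tol)
        kind = "star node" if is_star else "improper node"
    elif delta > 0:
        kind = "saddle" if det < 0 else "node"
    else:
        kind = "center" if abs(tr) <= tol else "spiral"
    return kind, position, delta
```

The tolerance matters in practice: on estimated flows $\Delta_A$ is never exactly zero, so any real use needs a relaxed test rather than an equality.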
Singularity extraction
From a multi-resolution analysis of the optical flow: $\begin{pmatrix} U \\ V \end{pmatrix} = A \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + b$
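A multi-resolution extraction can be sketched with sliding windows of several sizes, each window being mapped to $[-1, 1]^2$ and fitted with an affine model. This is only a sketch under assumptions: the window sizes, the half-window stride, the rule that a singularity is kept only when it falls inside its window, and the names `fit_affine` and `extract_singularities` are all hypothetical and not taken from the slides.

```python
import numpy as np


def fit_affine(U, V):
    """Least-squares fit of (U, V) = A (x1, x2) + b on a window mapped to [-1, 1]^2."""
    h, w = U.shape
    X1, X2 = np.meshgrid(np.linspace(-1.0, 1.0, w), np.linspace(-1.0, 1.0, h))
    B = np.stack([X1.ravel(), X2.ravel(), np.ones(h * w)], axis=1)
    targets = np.stack([U.ravel(), V.ravel()], axis=1)
    coef, *_ = np.linalg.lstsq(B, targets, rcond=None)
    A = coef[:2].T  # rows: U then V; columns: x1 then x2
    b = coef[2]
    return A, b


def extract_singularities(U, V, sizes=(32, 64), tol=1e-6):
    """Return a list of (window_size, row, col) in pixel coordinates."""
    found = []
    H, W = U.shape
    for size in sizes:
        stride = size // 2
        for top in range(0, H - size + 1, stride):
            for left in range(0, W - size + 1, stride):
                win_u = U[top:top + size, left:left + size]
                win_v = V[top:top + size, left:left + size]
                A, b = fit_affine(win_u, win_v)
                if abs(np.linalg.det(A)) < tol:
                    continue  # e.g. pure translation: no singular point
                x = -np.linalg.solve(A, b)  # singular point in window coordinates
                if np.all(np.abs(x) <= 1.0):
                    col = left + (x[0] + 1.0) / 2.0 * (size - 1)
                    row = top + (x[1] + 1.0) / 2.0 * (size - 1)
                    found.append((size, row, col))
    return found
```

The same physical singularity is typically reported by several overlapping windows and at several sizes, which is why a selection step is needed afterwards.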
Singlets: matching singularities over time
From singularities on each optical flow frame to tracks of singularities: singlets
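One way to turn per-frame singularities into tracks is a greedy nearest-neighbour linking. The slides do not specify the matching rule, so everything below is an assumption made for illustration: matches require the same singularity type and a distance below a hypothetical `max_dist`, and a track ends as soon as one frame has no match. The name `link_singlets` is hypothetical.

```python
import math


def link_singlets(frames, max_dist=0.1):
    """frames: one list per frame of (kind, (x, y)) detections.
    Returns a list of tracks, each a list of (frame_index, kind, (x, y))."""
    finished, active = [], []
    for t, detections in enumerate(frames):
        unmatched = list(detections)
        still_active = []
        for track in active:
            _, kind, (px, py) = track[-1]
            best, best_dist = None, max_dist
            for det in unmatched:
                det_kind, (x, y) = det
                dist = math.hypot(x - px, y - py)
                if det_kind == kind and dist <= best_dist:
                    best, best_dist = det, dist
            if best is None:
                finished.append(track)  # the track ends at the previous frame
            else:
                unmatched.remove(best)
                track.append((t, best[0], best[1]))
                still_active.append(track)
        # Every detection left unmatched starts a new track.
        for det_kind, pos in unmatched:
            still_active.append([(t, det_kind, pos)])
        active = still_active
    return finished + active
```

Track lengths, `[len(track) for track in tracks]`, can then be binned into a length histogram.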
Application to soccer abstraction
- Our database
- Zoom
- Slow motion
- Global excitement
- Soccer salient moment
Soccer Summarization: Our database
Lack of a standard benchmark for comparison
- Germany vs Portugal: 4-0
- Nigeria vs Argentina: 2-3
- France vs Honduras: 3-0
- Switzerland vs France: 2-5
http://www.fifa.com/worldcup/matches/
Zoom detection
- Zoom motion is a pure star node singularity; zoom combined with a translation is an improper node.
- Conditions:
  - there is a singularity
  - it is a star or improper node: $\Delta_A = 0$, where $\Delta_A = \operatorname{tr}(A)^2 - 4 \det(A)$
  - it is time consistent: it lasts for one second
- Positive eigenvalues -> zoom out; negative eigenvalues -> zoom in
- Determine the zoom direction and center
Zoom detection
- The condition $\Delta_A = 0$ is relaxed to a threshold of 0.2 on $\Delta_A$ averaged over one second
- Comparison:
  - Global motion estimation [6, 11]
  - Duan et al. method [12]: two histograms, one on magnitudes and one on angles, to detect diagonal patterns (DL)
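The thresholding step can be sketched as a one-second sliding average. This is a hedged sketch and not the authors' code: the frame rate of 25 fps, the use of the absolute value of the discriminant, the input layout (one discriminant and one trace per frame, NaN when the frame has no singularity) and the name `detect_zoom` are assumptions. The direction uses the sign of $\operatorname{tr}(A)$, which equals the sign of the eigenvalues when they are (nearly) equal, with the convention stated on the slides: positive means zoom out.

```python
import numpy as np


def detect_zoom(deltas, traces, fps=25, threshold=0.2):
    """Label each frame 'in', 'out' or None.

    deltas[t]: discriminant tr(A)^2 - 4 det(A) of the singularity at frame t (NaN if none).
    traces[t]: tr(A) at frame t, i.e. the sum of the two eigenvalues.
    """
    deltas = np.asarray(deltas, dtype=float)
    traces = np.asarray(traces, dtype=float)
    labels = [None] * len(deltas)
    for start in range(0, len(deltas) - fps + 1):
        window = deltas[start:start + fps]
        if np.isnan(window).any():
            continue  # a singularity must be present during the whole second
        if np.abs(window).mean() <= threshold:
            mean_trace = traces[start:start + fps].mean()
            direction = "out" if mean_trace > 0 else "in"
            for t in range(start, start + fps):
                labels[t] = direction
    return labels
```

Averaging over a full second enforces the time-consistency condition and absorbs isolated noisy frames at the same time.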
Slow Motion Detection
How to differentiate a fast motion that has been artificially slowed down from a slow motion?
Slow Motion Detection
[Figure: two panels, "Fast Motion" and "Slow Motion"]
Global excitement
- Spatial histogram of 3x3 cells per frame
- Sum of the spatial histograms over 10 frames
- Threshold set to 1500 to select the most agitated moments
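The three steps can be sketched as below. The slides do not say what is counted in the histogram nor how the threshold is applied, so this sketch makes two loud assumptions: the histogram counts singularity positions (given in $[-1, 1]^2$) per cell, and the threshold of 1500 is compared with the total of the summed histogram. The function names are hypothetical.

```python
import numpy as np


def spatial_histogram(points, grid=3):
    """Count points of [-1, 1]^2 in a grid x grid array (rows follow y, columns follow x)."""
    hist = np.zeros((grid, grid))
    for x, y in points:
        col = min(int((x + 1.0) / 2.0 * grid), grid - 1)
        row = min(int((y + 1.0) / 2.0 * grid), grid - 1)
        hist[row, col] += 1
    return hist


def excited_frames(points_per_frame, window=10, threshold=1500):
    """Return the indices of frames ending a window whose summed histogram exceeds the threshold."""
    hists = [spatial_histogram(points) for points in points_per_frame]
    flagged = []
    for t in range(window - 1, len(hists)):
        summed = sum(hists[t - window + 1:t + 1])  # 3x3 histogram summed over the window
        if summed.sum() > threshold:
            flagged.append(t)
    return flagged
```

Keeping the 3x3 layout, rather than a single count, leaves room to test where on screen the agitation happens.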
Soccer Salient Moment
Within 30 seconds:
- at least two zoom direction changes
- an activity peak higher than 1500 in the farthest view
- a slow motion replay in a close-up view
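The rule above can be sketched as a scan over 30-second windows. This is an illustrative sketch: it assumes the three detectors have already produced timestamped events, that activity peaks were measured in the farthest view and slow-motion replays in a close-up view, and that windows are scanned with a hypothetical one-second step. The name `salient_windows` is hypothetical.

```python
def salient_windows(zoom_changes, activity_peaks, slow_motions,
                    duration, window=30.0, step=1.0, peak_threshold=1500):
    """Return the start times (s) of the windows satisfying the three conditions.

    zoom_changes: times (s) of zoom direction changes.
    activity_peaks: (time, value) pairs measured in the farthest view.
    slow_motions: times (s) of slow-motion replays in a close-up view.
    """
    starts = []
    start = 0.0
    while start + window <= duration:
        end = start + window
        n_zoom = sum(start <= t < end for t in zoom_changes)
        has_peak = any(start <= t < end and value > peak_threshold
                       for t, value in activity_peaks)
        has_replay = any(start <= t < end for t in slow_motions)
        if n_zoom >= 2 and has_peak and has_replay:
            starts.append(start)
        start += step
    return starts
```

Consecutive start times describe the same moment, so they would normally be merged into one segment before building the summary.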
Example of Summarization
[Figure: detected zooms, agitation and slow motion]
Extension to other sports
Set and trained on soccer, tested on handball
All hyperparameters, parameters and the SVM were set and trained on soccer videos; we tested our framework on 5 minutes of the Qatar Handball World Championship final without any adjustment or retraining.
Conclusion
- Optical flow approximation: possibility to use a different basis, a different degree, or a space other than the polynomial one
- Singularity extraction: vanishing point; 6 types for the affine approximation
- Motion description: global distribution
- Singlets: track singularities; length histograms; compute a description as for iDT
- Sport video abstraction: zoom detection, slow motion detection, global excitement
Because there is more to life than soccer
- Concert abstraction
- Facial emotion
- Action recognition
Vision in Sports: CVSports
References
[1] S. Abu-El-Haija, N. Kothari, J. Lee, P. Natsev, G. Toderici, B. Varadarajan, and S. Vijayanarasimhan. YouTube-8M: A large-scale video classification benchmark. CoRR, abs/1609.08675, 2016.
[2] L. Gorelick et al. Actions as space-time shapes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(12):2247-2253, 2007.
[3] I. Laptev. On space-time interest points. Int. J. Comput. Vision, 64(2-3):107-123, Sept. 2005.
[4] H. Wang, A. Kläser, C. Schmid, and C.-L. Liu. Action recognition by dense trajectories. In IEEE Conference on Computer Vision & Pattern Recognition, pages 3169-3176, Colorado Springs, United States, June 2011.
[5] D. Tran, L. Bourdev, R. Fergus, L. Torresani, and M. Paluri. Learning spatiotemporal features with 3D convolutional networks. In Proceedings of the IEEE International Conference on Computer Vision, pages 4489-4497, 2015.
[6] Q. Ye, Q. Huang, W. Gao, and S. Jiang. Exciting event detection in broadcast soccer video with mid-level description and incremental learning. In Proceedings of the 13th Annual ACM International Conference on Multimedia, 2005.
[7] H. M. Zawbaa, N. El-Bendary, A. E. Hassanien, and T. Kim. Event detection based approach for soccer video summarization using machine learning. International Journal of Multimedia and Ubiquitous Engineering, 2012.
[8] A. Raventos, R. Quijada, L. Torres, and F. Tarres. Automatic summarization of soccer highlights using audio-visual descriptors. arXiv preprint arXiv:1411.6496, 2014.
[9] M. Druon. Modélisation du mouvement par polynômes orthogonaux: application à l'étude d'écoulements fluides. PhD thesis, Université de Poitiers, 2009.
[10] O. Kihl, B. Tremblais, and B. Augereau. Multivariate orthogonal polynomials to extract singular points. In IEEE International Conference on Image Processing (ICIP 2008), San Diego, CA, United States, Oct. 2008.
[11] X. Qian. Global Motion Estimation and Its Applications. INTECH Open Access Publisher, 2012.
[12] L.-Y. Duan, M. Xu, Q. Tian, C.-S. Xu, and J. S. Jin. A unified framework for semantic shot classification in sports video. IEEE Transactions on Multimedia, 2005.
Optical Flow Extraction
Gunnar Farnebäck method:
- Pixel neighborhood approximation
- Translation of the approximation
- Relations
- Speed vector estimation
Polynomial Space Projection
- Scalar product: the Legendre basis is obtained with $w(x_1, x_2) = 1$ and $\Omega = [-1, 1]^2$
- Polynomial degree $D$: $n_D = \frac{(D+1)(D+2)}{2}$ polynomials in the basis, and so $2\, n_D$ coefficients (one set for $U$, one for $V$)
- Optical flow blurring (decreasing with the degree)
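Both the size of the basis and its orthogonality for the weight $w = 1$ on $[-1, 1]^2$ can be checked numerically. The sketch below assumes unnormalized tensor-product Legendre polynomials $P_{k,l}(x_1, x_2) = P_k(x_1) P_l(x_2)$ with $k + l \le D$ and evaluates the scalar products with Gauss-Legendre quadrature; the names `basis_indices` and `gram_matrix` are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre


def basis_indices(D):
    """All (k, l) with k + l <= D."""
    return [(k, l) for k in range(D + 1) for l in range(D + 1 - k)]


def gram_matrix(D, n_nodes=16):
    """Matrix of scalar products <P_{k,l}, P_{m,n}> over [-1, 1]^2 with weight 1."""
    x, w = legendre.leggauss(n_nodes)  # quadrature nodes and weights on [-1, 1]

    def P(n):
        c = np.zeros(n + 1)
        c[n] = 1.0  # selects the Legendre polynomial of degree n
        return legendre.legval(x, c)

    idx = basis_indices(D)
    G = np.zeros((len(idx), len(idx)))
    for i, (k, l) in enumerate(idx):
        for j, (m, n) in enumerate(idx):
            # The 2D integral factorizes into a product of two 1D integrals.
            G[i, j] = (w @ (P(k) * P(m))) * (w @ (P(l) * P(n)))
    return G
```

A diagonal Gram matrix confirms orthogonality; for degree 1 the basis has 3 polynomials, hence 6 coefficients for the pair $(U, V)$.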
Multi-Resolution Singularity
- One sliding window -> possibly one singularity
- The same singularity can be found at different sizes and approximately the same position: we keep the one with the smallest angular deviation from the original flow.
- $\mathrm{dev} = \sum_{\omega \in \Omega} \frac{1}{2} \sin(\theta_\omega - \hat{\theta}_\omega)$, with $\theta_\omega$ the angle of the motion vector at pixel position $\omega$ for the original flow and $\hat{\theta}_\omega$ for the polynomial approximation