Singlets. Multi-resolution Motion Singularities for Soccer Video Abstraction. Katy Blanc, Diane Lingrand, Frederic Precioso


Singlets: Multi-resolution Motion Singularities for Soccer Video Abstraction
Katy Blanc, Diane Lingrand, Frederic Precioso
I3S Laboratory, Sophia Antipolis, France

Overview
- Video & motion
- Singularities & Singlets
- Soccer salient moments

Video Analysis
- Burst of video content production: new sources of videos and big databases such as YouTube-8M [1]
- Most analyzed types: meetings/conferences, movies, news and sports
- Diverse applications: database browsing, automatic video surveillance, driverless cars
- Massive amount of information: one HDTV soccer match is 324000 images of 1920 × 1080 pixels each
- Collaboration with Wildmoka, themselves in collaboration with l'INA and BeIN

Related works: video description
- Modelling human motion [2]
- Handcrafted features: STIP [3], iDT [4]
- Deep learning representations [5]

Related works: sport abstraction
- Clue detection to find highlights: ground color, jersey color, shot segmentation and view classification
- Line mark positions and shot detection: Ye et al. [6]
- Goals, attacks and other events using logo and score appearances and goal mouth position: Zawbaa et al. [7]
- Face and skin detection, whistle detector and user specifications: Raventos et al. [8]


Inspiration: fluid movement
- Inspired by the work of Druon et al. [9] and the further work of Kihl et al. [10]

Optical Flow Approximation
- Optical flow = discrete bivariate vector field $F : \Omega \to \mathbb{R}^2$, $(x_1, x_2) \mapsto \big(U(x_1, x_2),\, V(x_1, x_2)\big)$, with $\Omega = [-1, 1]^2$
- Polynomial subspace and the Legendre basis: $P_{K,L}(x_1, x_2) = \sum_{k=0}^{K} \sum_{l=0}^{L} \gamma_{k,l}\, x_1^k x_2^l$
- Projection on the Legendre basis with $K + L \le D$ (shown for degree $D = 1$):
  $U = u_{0,0} P_{0,0} + u_{0,1} P_{0,1} + u_{1,0} P_{1,0}$
  $V = v_{0,0} P_{0,0} + v_{0,1} P_{0,1} + v_{1,0} P_{1,0}$
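The degree-1 projection can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names `project_degree1` and `to_canonical` are hypothetical, the grid is assumed to map pixel positions linearly onto $[-1, 1]^2$, and the projection is computed as a least-squares fit, which coincides with the orthogonal projection onto the span of the three basis polynomials.

```python
import numpy as np

# Normalisation of the degree-1 Legendre terms on [-1, 1]^2 with weight 1.
C = np.sqrt(3.0) / 2.0


def project_degree1(U, V):
    """Return coefficients (u00, u01, u10) and (v00, v01, v10) of a flow (U, V)."""
    h, w = U.shape
    # Assumption: pixel columns/rows are mapped linearly onto [-1, 1].
    X1, X2 = np.meshgrid(np.linspace(-1.0, 1.0, w), np.linspace(-1.0, 1.0, h))
    # Columns are P_{0,0} = 1/2, P_{0,1} = C * x2, P_{1,0} = C * x1.
    basis = np.stack(
        [np.full(U.size, 0.5), C * X2.ravel(), C * X1.ravel()], axis=1
    )
    # Least-squares fit onto the span of the basis (one fit per component).
    u, *_ = np.linalg.lstsq(basis, U.ravel(), rcond=None)
    v, *_ = np.linalg.lstsq(basis, V.ravel(), rcond=None)
    return u, v


def to_canonical(u, v):
    """Rewrite the Legendre coefficients as (U, V)^T = A (x1, x2)^T + b."""
    A = C * np.array([[u[2], u[1]], [v[2], v[1]]])
    b = 0.5 * np.array([u[0], v[0]])
    return A, b
```

On an exactly affine synthetic flow, `to_canonical(*project_degree1(U, V))` recovers the matrix and offset used to generate it, which is a convenient sanity check before applying the fit to real optical flow.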

Polynomial projection
[Figure: original flow F = (U, V), its components U and V, and their projection onto the polynomial basis, with the resulting affine coefficients.]

Approximation and coefficient analysis
- From a simple analysis example to production, scenarization and event semantization
[Figure: coefficient value versus frame number during a counterattack, for the approximation $(U, V) = (1.38, 0)$. Curves: $u_{0,0}$, horizontal global displacement and position; $v_{0,0}$, vertical global displacement and position.]

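Singularity
- First, projection on the Legendre basis:
  $U = u_{0,0} P_{0,0} + u_{0,1} P_{0,1} + u_{1,0} P_{1,0}$
  $V = v_{0,0} P_{0,0} + v_{0,1} P_{0,1} + v_{1,0} P_{1,0}$
- Then, on the canonical basis: $\begin{pmatrix} U \\ V \end{pmatrix} = A \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + b$ with $A \in M_{2,2}$ and $b \in \mathbb{R}^2$
- 6 types of singularities, determined by $\Delta = \operatorname{tr}(A)^2 - 4 \det(A)$ and by $\lambda_1$ and $\lambda_2$, the eigenvalues of $A$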
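The classification from $A$ and $b$ can be sketched as follows. The slide names only the star node and the improper node; the other four labels used here (node, saddle, center, spiral) are an assumption taken from the standard phase-portrait taxonomy of linear systems, and the function name and the tolerance `eps` are hypothetical.

```python
def classify_singularity(A, b, eps=1e-9):
    """Classify the singular point of the affine flow A x + b.

    Returns (label, (x1, x2)), or None when A is singular and the flow
    has no isolated singular point. Labels are an assumed taxonomy.
    """
    (a11, a12), (a21, a22) = A
    tr = a11 + a22
    det = a11 * a22 - a12 * a21
    if abs(det) < eps:
        return None
    # The singular point solves A x + b = 0, i.e. x = -A^{-1} b.
    x1 = (-a22 * b[0] + a12 * b[1]) / det
    x2 = (a21 * b[0] - a11 * b[1]) / det
    delta = tr * tr - 4.0 * det  # discriminant of the characteristic polynomial
    if delta > eps:
        # Two distinct real eigenvalues: opposite signs give a saddle.
        label = "saddle" if det < 0 else "node"
    elif delta < -eps:
        # Complex eigenvalues: purely imaginary when the trace vanishes.
        label = "center" if abs(tr) < eps else "spiral"
    else:
        # Repeated eigenvalue: A is a multiple of the identity for a star node.
        is_scalar = abs(a12) < eps and abs(a21) < eps
        label = "star node" if is_scalar else "improper node"
    return label, (x1, x2)
```

For a pure zoom centred at the origin, `classify_singularity([[1, 0], [0, 1]], [0, 0])` yields a star node at `(0, 0)`.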

Singularities extraction
- From a multi-resolution analysis of the optical flow, using the affine model $\begin{pmatrix} U \\ V \end{pmatrix} = A \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + b$

Singlets: matching singularities over time
- From singularities on single optical flow frames to tracks of singularities: Singlets


Application to soccer abstraction
- Our database
- Zoom
- Slow motion
- Global excitement
- Soccer salient moment

Soccer Summarization: Our database
- No standard benchmark is available for comparison
- Matches used (source: http://www.fifa.com/worldcup/matches/):
  Germany vs Portugal: 4-0
  Nigeria vs Argentina: 2-3
  France vs Honduras: 3-0
  Switzerland vs France: 2-5


Zoom detection
- Zoom motion is a pure star node singularity. Zoom combined with a translation is an improper node.
- Conditions: there is a singularity; it is a star or improper node ($\Delta = 0$); it is time consistent (lasts for one second)
- Positive eigenvalues -> zoom out; negative eigenvalues -> zoom in
- Determine the zoom direction center

Zoom detection
- In practice the exact condition $\Delta = 0$ is relaxed: the average of $\Delta$ over one second is thresholded at 0.2
- Comparison with:
  - Global motion estimation [6, 11]
  - The method of Duan et al. [12]: two histograms, one on magnitude and one on angles, to detect a diagonal pattern (DL)
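The zoom rule can be sketched as a per-second test. This is only an illustration under stated assumptions: the function name `detect_zoom`, the frame rate of 25 fps, the use of non-overlapping one-second windows, and averaging the absolute value of $\Delta$ are all choices made here, not details given in the slides. For a star or improper node the repeated eigenvalue equals $\operatorname{tr}(A)/2$, so the sign of the trace is used as the sign of the eigenvalues. The mapping of signs to zoom in and zoom out follows the slide.

```python
def detect_zoom(frames, fps=25, threshold=0.2):
    """Return (start_frame, label) for each one-second window flagged as a zoom.

    frames[i] is (delta, trace) for the singularity found in frame i,
    or None when frame i has no singularity.
    """
    events = []
    for start in range(0, len(frames) - fps + 1, fps):
        window = frames[start:start + fps]
        # Time consistency: a singularity must be present during the whole second.
        if any(f is None for f in window):
            continue
        mean_delta = sum(abs(d) for d, _ in window) / fps
        mean_trace = sum(t for _, t in window) / fps
        # Star or improper node: discriminant close to zero on average.
        if mean_delta <= threshold and mean_trace != 0:
            # Sign convention as stated on the slide.
            label = "zoom out" if mean_trace > 0 else "zoom in"
            events.append((start, label))
    return events
```

A sliding window with a one-frame step would give finer temporal localisation at the cost of overlapping detections that then need merging.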

Slow Motion Detection
- How to differentiate a fast motion that has been artificially slowed down from a genuinely slow motion?

Slow Motion Detection
[Figure: fast motion versus slow motion.]

Global excitement
- Spatial histogram of 3x3 per frame
- Sum of the spatial histograms over 10 frames
- Threshold set to 1500 to select the most agitated moments

Soccer Salient Moment
Within 30 seconds:
- at least two zoom direction changes
- an activity peak higher than 1500 in the farthest view
- a slow motion replay in a close-up view
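The three conditions combine into a simple window test. The sketch below assumes that the upstream detectors have already produced timestamps in seconds; the function name `is_salient` and the input layout are hypothetical, and the view classification (farthest view, close-up view) is assumed to have been applied before the lists are built.

```python
def is_salient(t0, zoom_changes, far_activity, closeup_replays,
               window=30.0, peak=1500):
    """Check the salient-moment rule on the window [t0, t0 + window).

    zoom_changes:    times of zoom direction changes
    far_activity:    (time, activity value) pairs measured in the farthest view
    closeup_replays: times of slow motion replays shown in a close-up view
    """
    def inside(t):
        return t0 <= t < t0 + window

    n_changes = sum(1 for t in zoom_changes if inside(t))
    has_peak = any(inside(t) and value > peak for t, value in far_activity)
    has_replay = any(inside(t) for t in closeup_replays)
    return n_changes >= 2 and has_peak and has_replay
```

Scanning `t0` over the match with a fixed step and merging overlapping positive windows would then give candidate segments for the summary.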

Example of Summarization
[Figure: detected zooms, agitation and slow motion.]

Extension to other sports
- Set and trained on soccer, tested on handball
- All hyperparameters, parameters and the SVM were set and trained on soccer videos. The framework was then tested on 5 minutes of the Qatar Handball World Championship final without any adjustment or retraining.

Conclusion
- Optical flow approximation: possibility to use a different basis, a different degree, or a space other than the polynomial one
- Singularity extraction: vanishing point; 6 types for the affine approximation
- Motion description: global distribution
- Singlets: track singularities; length histograms; compute a description like for iDT
- Sport video abstraction: zoom detection, slow motion detection, global excitement

Because there is more to life than soccer
- Abstraction of concerts
- Facial emotion
- Action recognition

Vision in Sports: CVSports

References
[1] S. Abu-El-Haija, N. Kothari, J. Lee, P. Natsev, G. Toderici, B. Varadarajan, and S. Vijayanarasimhan. YouTube-8M: A large-scale video classification benchmark. CoRR, abs/1609.08675, 2016.
[2] L. Gorelick et al. Actions as space-time shapes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(12):2247-2253, 2007.
[3] I. Laptev. On space-time interest points. Int. J. Comput. Vision, 64(2-3):107-123, Sept. 2005.
[4] H. Wang, A. Kläser, C. Schmid, and C.-L. Liu. Action recognition by dense trajectories. In IEEE Conference on Computer Vision & Pattern Recognition, pages 3169-3176, Colorado Springs, United States, June 2011.
[5] D. Tran, L. Bourdev, R. Fergus, L. Torresani, and M. Paluri. Learning spatiotemporal features with 3D convolutional networks. In Proceedings of the IEEE International Conference on Computer Vision, pages 4489-4497, 2015.
[6] Q. Ye, Q. Huang, W. Gao, and S. Jiang. Exciting event detection in broadcast soccer video with mid-level description and incremental learning. In Proceedings of the 13th annual ACM international conference on Multimedia, 2005.
[7] H. M. Zawbaa, N. El-Bendary, A. E. Hassanien, and T. Kim. Event detection based approach for soccer video summarization using machine learning. International Journal of Multimedia and Ubiquitous Engineering, 2012.
[8] A. Raventos, R. Quijada, L. Torres, and F. Tarres. Automatic summarization of soccer highlights using audio-visual descriptors. arXiv preprint arXiv:1411.6496, 2014.
[9] M. Druon. Modélisation du mouvement par polynômes orthogonaux: application à l'étude d'écoulements fluides. PhD thesis, Université de Poitiers, 2009.
[10] O. Kihl, B. Tremblais, and B. Augereau. Multivariate orthogonal polynomials to extract singular points. In IEEE International Conference on Image Processing (ICIP 2008), San Diego, CA, United States, Oct. 2008.
[11] X. Qian. Global Motion Estimation and Its Applications. INTECH Open Access Publisher, 2012.
[12] L.-Y. Duan, M. Xu, Q. Tian, C.-S. Xu, and J. S. Jin. A unified framework for semantic shot classification in sports video. IEEE Transactions on Multimedia, 2005.

Optical Flow Extraction
- Gunnar Farnebäck method
- Pixel neighborhood approximation
- Approximation translation relations
- Speed vector estimation

Polynomial Space Projection
- Scalar product: Legendre basis with $w(x_1, x_2) = 1$ and $\Omega = [-1, 1]^2$
- Polynomial degree $D$: $n_D = \frac{(D+1)(D+2)}{2}$ polynomials in the basis, and so $2\, n_D$ coefficients
- Optical flow blurring (decreasing with the degree)

Multi-Resolution Singularity
- One sliding window -> possibly one singularity
- The same singularity can appear at different sizes and approximately the same position: we keep the one with the smallest angular deviation from the original flow.
- $\mathrm{dev} = \sum_{\omega \in \Omega} \frac{1}{2} \sin\left(\theta_\omega - \hat{\theta}_\omega\right)$, with $\theta_\omega$ the angle of the motion vector at pixel position $\omega$ for the original flow and $\hat{\theta}_\omega$ the angle for the polynomial approximation