Tennis player segmentation for semantic behavior analysis

Tennis player segmentation for semantic behavior analysis
Vito Renò, Nicola Mosca, Massimiliano Nitti, Tiziana D'Orazio, Donato Campagnoli, Andrea Prati, Ettore Stella
{reno, stella}@ba.issia.cnr.it
Institute of Intelligent Systems for Automation, National Research Council of Italy (CNR)
Via G. Amendola 122 D/O, Bari
www.issia.cnr.it

Outline
- Introduction
- Methodology: BG initialization, BG update, energy processing, variance processing, one-step frame differencing, fine tuning
- Experiments and results
- Conclusions and future work

Introduction: Computer vision & sports
"Sports is said to be the social glue of society. [...] Technology is therefore becoming more and more crucial [...] Since the use of sensors or other devices fixed to players or equipment is generally not possible, a rich set of opportunities exist for the application of computer vision techniques to help the competitors, trainers and audience."
Source: Computer Vision in Sports, Preface, ISBN 978-3-319-09396-3

Introduction: Motivation (why another background model?)
- Results obtained with the Mixture of Gaussians show a tendency to break players' silhouettes
- Goal: investigate how to produce better low-level outputs without post-processing

Introduction: Background (BG) modeling in artificial vision systems
[Diagram: cameras send raw data to the artificial vision system (BG process, object segmentation, object tracking, scene understanding), which serves the users]
- One of the first low-level computational tasks executed
- Input for other high-level software modules (e.g. object tracking, scene understanding)
- Very high throughputs achieved by state-of-the-art cameras

Introduction: System overview
- Preliminary step of a system aimed at addressing coaching needs
- Four cameras cover all game areas with at least two views
- Trade-off between computational complexity and reliable results in real time

Introduction: Globally Intrinsic VariancE for BACKground (GIVEBACK)
- Segments active entities in the tennis context (balls, players)
- Models the intrinsic sensor variance of each returned value in the range 0 to 255
- Scalability (single or multiple cameras), robustness and reliable results in real time

Methodology: Algorithm description
Background Initialization
for each frame
    Variance process
    One step frame differencing
    if (Background is learned)
        Foreground extraction
        Fine tuning process
    Background Update
    Energy process

Methodology: BG initialization
- BG image set to half intensity (all-gray logic)
- No a priori knowledge of the scene

Methodology: BG update
$$BG_t(u, v) = \begin{cases} BG_{t-1}(u, v) - \kappa & \text{if } BG_{t-1}(u, v) > I_t(u, v) \\ BG_{t-1}(u, v) & \text{if } BG_{t-1}(u, v) = I_t(u, v) \\ BG_{t-1}(u, v) + \kappa & \text{if } BG_{t-1}(u, v) < I_t(u, v) \end{cases}$$
- Each BG pixel value is increased or decreased by $\kappa$
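The update rule can be illustrated with a minimal sketch in plain Python. This is not the authors' C++ implementation: the function name `update_background`, the default `kappa=1`, and the clamping to the 0 to 255 range are assumptions made for this example.

```python
def update_background(bg, frame, kappa=1):
    """Move each background pixel one step of size kappa toward the frame.

    bg and frame are lists of rows of integer gray levels (0..255).
    kappa=1 and the clamping to [0, 255] are assumptions of this sketch.
    """
    updated = []
    for bg_row, frame_row in zip(bg, frame):
        new_row = []
        for b, i in zip(bg_row, frame_row):
            if b > i:
                b = max(b - kappa, 0)      # BG above the frame: decrease
            elif b < i:
                b = min(b + kappa, 255)    # BG below the frame: increase
            new_row.append(b)              # equal values stay unchanged
        updated.append(new_row)
    return updated


# Half-intensity initialization followed by one update step.
bg = [[128, 128], [128, 128]]
frame = [[130, 128], [10, 255]]
bg = update_background(bg, frame)
```

Because each pixel changes by at most `kappa` per frame, a fast-moving object only perturbs the model slightly, while a static scene is learned after enough frames.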

Methodology: Energy process
$$\varepsilon = \lVert BG_{t-1} - I_t \rVert_1$$
- $L_1$ norm (for speed reasons)
- Used to stop the learning phase when it reaches its minimum
- The BG is substituted with the last captured frame during the learning phase
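A minimal sketch of the energy computation follows. The slides do not say how the minimum is detected, so `learning_finished` and its `patience` parameter are purely hypothetical: the rule stops learning once the lowest energy seen so far is at least `patience` frames old.

```python
def energy(bg_prev, frame):
    """L1 norm of the difference between the previous BG and the current frame."""
    return sum(abs(b - i)
               for bg_row, frame_row in zip(bg_prev, frame)
               for b, i in zip(bg_row, frame_row))


def learning_finished(energies, patience=3):
    """Hypothetical stop rule (an assumption, not taken from the slides).

    Returns True when none of the last `patience` energies improves on the
    minimum observed before them.
    """
    if len(energies) <= patience:
        return False
    return min(energies[-patience:]) >= min(energies[:-patience])


history = [100, 60, 30, 12, 12, 13, 12]
done = learning_finished(history)
```

The sum of absolute differences needs no multiplications or square roots, which is consistent with the speed argument on the slide.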

Methodology: Variance process
- Variance is not related to the observations of a single pixel over time, but is a function of the gray level returned by the sensor
- It models different responses to different light intensities
$$Obs(\gamma) = \{\, k = (u, v) \mid BG(u, v) = \gamma \,\}$$
$$V_t(\gamma) = \frac{V_{t-1}(\gamma)\, N_{t-1}(\gamma) + \sum_{k \in Obs(\gamma)} \left( I_t(k) - BG(k) \right)^2}{N_t(\gamma)}$$
- $N(\gamma)$: frequency of the $\gamma$-th gray level over time
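The per-gray-level variance can be sketched as a running average over 256 bins. The slides do not spell out how the frequency is updated, so the code assumes $N_t(\gamma) = N_{t-1}(\gamma) + |Obs(\gamma)|$; the function name and the list-based layout are also illustrative choices.

```python
def update_variance(var, count, bg, frame):
    """Update V(gamma) and N(gamma) in place using one frame.

    var, count: lists of length 256 indexed by gray level gamma.
    bg, frame: lists of rows of integer gray levels (0..255).
    Assumption: N_t(gamma) = N_{t-1}(gamma) + |Obs(gamma)|.
    """
    sq_sum = [0.0] * 256   # sum of squared deviations per BG gray level
    n_obs = [0] * 256      # |Obs(gamma)| for this frame
    for bg_row, frame_row in zip(bg, frame):
        for b, i in zip(bg_row, frame_row):
            sq_sum[b] += (i - b) ** 2
            n_obs[b] += 1
    for gamma in range(256):
        if n_obs[gamma] == 0:
            continue  # gray level not present in the BG: nothing to update
        new_count = count[gamma] + n_obs[gamma]
        var[gamma] = (var[gamma] * count[gamma] + sq_sum[gamma]) / new_count
        count[gamma] = new_count


var = [0.0] * 256
count = [0] * 256
bg = [[100, 100], [200, 100]]
update_variance(var, count, bg, [[102, 98], [200, 104]])
```

All pixels whose background value equals a given gray level contribute to the same bin, so the model stores 256 variances in total rather than one per pixel.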

Methodology: One-step frame differencing
$$AD = \lvert I_t - I_{t-1} \rvert$$
$$M_{os}(u, v) = \begin{cases} 0 & \text{if } AD(u, v) \le \tau(I_{t-1}(u, v)) \\ 255 & \text{if } AD(u, v) > \tau(I_{t-1}(u, v)) \end{cases}$$
$$\tau(\gamma) = 3.5\,\sigma(\gamma), \qquad \sigma(\gamma) = \sqrt{V(\gamma)}$$
- Binary mask calculated at each iteration
- The threshold is also a function of the specific gray value
- Robust approach (e.g. no BG update on moving players)

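Methodology: Foreground extraction
$$AD = \lvert I_t - BG_{t-1} \rvert$$
$$M_{fg}(u, v) = \begin{cases} 0 & \text{if } AD(u, v) \le \tau(BG_{t-1}(u, v)) \\ 255 & \text{if } AD(u, v) > \tau(BG_{t-1}(u, v)) \end{cases}$$
$$\tau(\gamma) = 3.5\,\sigma(\gamma), \qquad \sigma(\gamma) = \sqrt{V(\gamma)}$$
- Similar to the one-step frame differencing
- The current frame is compared to the BG model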
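Since one-step frame differencing and foreground extraction share the same thresholding rule and differ only in the reference image, both can be sketched with a single helper. The name `difference_mask` and the argument layout are assumptions for illustration; `var` is a 256-entry list holding $V(\gamma)$.

```python
import math


def difference_mask(frame, reference, var, k=3.5):
    """Return a 0/255 mask: 255 where |frame - reference| > k * sigma.

    sigma is looked up from the gray level of the reference pixel, so the
    threshold adapts to the light intensity. Pass the previous frame as
    `reference` for M_os, or the BG model for M_fg.
    """
    mask = []
    for f_row, r_row in zip(frame, reference):
        row = []
        for f, r in zip(f_row, r_row):
            tau = k * math.sqrt(var[r])  # gray-level-dependent threshold
            row.append(255 if abs(f - r) > tau else 0)
        mask.append(row)
    return mask


var = [4.0] * 256                      # toy model: sigma = 2 for every level
frame = [[100, 110], [50, 57]]
reference = [[100, 100], [58, 50]]
mask = difference_mask(frame, reference, var)
```

With $\sigma = 2$ the threshold is 7, so differences of 8 and 10 are marked as changed while a difference of exactly 7 is not.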

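Methodology: Fine tuning process
- Blob analysis is done on both $M_{os}$ and $M_{fg}$ to obtain two sets of connected regions and build the update mask
$$B_{os} = \{ b_1, b_2, \dots, b_n \}, \qquad B_{fg} = \{ b_1, b_2, \dots, b_m \}$$
$$M_{upd} = \{\, (b_i, b_j) \mid b_i \in B_{os},\ b_j \in B_{fg},\ b_i \cap b_j \neq \emptyset \,\}$$
- $(b_i, b_j)$: minimum circumscribed rectangle of the pair
- $b_i \cap b_j \neq \emptyset$: overlapping blobs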
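The pairing step can be sketched as follows, assuming blob extraction (connected-component labeling) has already been done and each blob is available as a set of `(u, v)` pixel coordinates. Reading the minimum circumscribed rectangle as the axis-aligned bounding box of the union of the two blobs is an assumption of this example.

```python
def bounding_rectangle(pixels):
    """Minimum axis-aligned rectangle (u_min, v_min, u_max, v_max) around pixels."""
    us = [u for u, _ in pixels]
    vs = [v for _, v in pixels]
    return (min(us), min(vs), max(us), max(vs))


def update_rectangles(blobs_os, blobs_fg):
    """One rectangle per overlapping pair (b_i in B_os, b_j in B_fg)."""
    rects = []
    for b_i in blobs_os:
        for b_j in blobs_fg:
            if b_i & b_j:  # non-empty intersection: the blobs overlap
                rects.append(bounding_rectangle(b_i | b_j))
    return rects


blobs_os = [{(0, 0), (0, 1)}, {(5, 5)}]
blobs_fg = [{(0, 1), (1, 1), (2, 3)}, {(9, 9)}]
rects = update_rectangles(blobs_os, blobs_fg)
```

Only the first blob of each set shares a pixel, so a single rectangle is produced; blobs that appear in just one of the two masks are discarded.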

Experiments and results: Dataset description
- Video sequences that represent a tennis training session
- Raw videos acquired by an AVT Prosilica GT1920C camera (frame size 1920x1024 at 50 fps) equipped with an auto-iris lens
- Typical situations of a tennis training session, e.g. players similar in appearance to the ground, fast balls, stripes on red ground

Experiments and results: Test case
- GIVEBACK, MoGv2, GMG and Kalman filter based BG are evaluated
- Starting from frame $f_0$, 10 images are taken every 500 frames
- GMG and MoGv2 implementations are available online at https://github.com/andrewssobral/bgslibrary; the Kalman filter based BG is available in MVTec Halcon 12
- Precision, Recall and F-Measure are used as metrics for the quantitative results
- Qualitative results are presented in terms of ground truth and foreground masks

Experiments and results: Qualitative results
- Parts of the player considered as BG
- Ghosting issues
- Well-cut player silhouette (ghost reduction while preserving the shape)

Experiments and results: Quantitative results
$$P = \frac{TP}{TP + FP}, \qquad R = \frac{TP}{TP + FN}$$
[Figure: Precision vs. Recall scatter plot (x-axis: Precision, y-axis: Recall) for GMG, MOG2, MOG2 FILT, KALMAN, KALMAN FILT, and GIVEBACK with and without fine tuning]
- Each point refers to a run
- Ground truth in the upper right position (1,1)

Experiments and results: Quantitative results
$$F = \frac{2 \cdot P \cdot R}{P + R}$$
[Figure: box plot of the F-Measure per algorithm (GMG, MOG2, MOG2 FILT, KALMAN, KALMAN FILT, and GIVEBACK with and without fine tuning), showing the median F, the 25th to 75th percentile range, and outliers marked with +]
- Reproducible results over time
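The three metrics can be computed per frame from a foreground mask and its ground truth, as in the sketch below. Returning 0.0 when a denominator is zero is a convention chosen for this example, not something stated in the slides.

```python
def precision_recall_f(mask, truth):
    """Pixel-wise Precision, Recall and F-Measure of a binary mask.

    mask and truth are lists of rows; any non-zero value counts as foreground.
    """
    tp = fp = fn = 0
    for m_row, t_row in zip(mask, truth):
        for m, t in zip(m_row, t_row):
            if m and t:
                tp += 1          # foreground correctly detected
            elif m and not t:
                fp += 1          # background marked as foreground
            elif t and not m:
                fn += 1          # foreground missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


mask = [[255, 255], [0, 0]]
truth = [[255, 0], [255, 0]]
p, r, f = precision_recall_f(mask, truth)
```

In this toy case one pixel is a true positive, one a false positive and one a false negative, so all three metrics equal 0.5.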

Experiments and results: Performance
- C++ implementation on a PC with an Intel Xeon E5-2603 @ 1.60 GHz, 32 GB RAM, Windows 7 64-bit OS
- The algorithm can run at 50 fps

Conclusions and future work
- Efficient method to segment active entities in the tennis context, preserving players' silhouettes
- BG modeled as a mean image, while the variance is related to the specific gray level captured by the sensor (enriched by a selective update mask)
- Future implementations and optimizations directly on smart cameras (e.g. FPGA or ARM architectures)
- High-level analyses such as posture recognition and semantic analysis