Activity related recognition in AmI environments

Similar documents
Biometrics: Introduction and Examples. Raymond Veldhuis

CITS 4402 Computer Vision

Temporal Modeling and Basic Speech Recognition

Score Normalization in Multimodal Biometric Systems

Dynamic Data Modeling, Recognition, and Synthesis. Rui Zhao Thesis Defense Advisor: Professor Qiang Ji

Joint Factor Analysis for Speaker Verification

This is an accepted version of a paper published in Elsevier Information Fusion. If you wish to cite this paper, please use the following reference:

HMM and IOHMM Modeling of EEG Rhythms for Asynchronous BCI Systems

Recognition Using Class Specific Linear Projection. Magali Segal Stolrasky Nadav Ben Jakov April, 2015

Information geometry for bivariate distribution control

Face detection and recognition. Detection Recognition Sally

Face Detection and Recognition

ECE 661: Homework 10 Fall 2014

Robust Speaker Identification

Unsupervised Anomaly Detection for High Dimensional Data

Time Series Classification

Orientation Map Based Palmprint Recognition

Fingerprint Individuality

Outline: Ensemble Learning. Ensemble Learning. The Wisdom of Crowds. The Wisdom of Crowds - Really? Crowd wiser than any individual

Modeling Complex Temporal Composition of Actionlets for Activity Prediction

2D Image Processing Face Detection and Recognition

Human Mobility Pattern Prediction Algorithm using Mobile Device Location and Time Data

If you wish to cite this paper, please use the following reference:

Boosting: Algorithms and Applications

Face recognition Computer Vision Spring 2018, Lecture 21

Palmprint based Verification System Robust to Occlusion using Low-order Zernike Moments of Sub-images

Jun Zhang Department of Computer Science University of Kentucky

Artificial Intelligence & Neuro Cognitive Systems Fakultät für Informatik. Robot Dynamics. Dr.-Ing. John Nassour J.

Score calibration for optimal biometric identification

Latent Variable Models and Expectation Maximization

Forward algorithm vs. particle filtering

Aruna Bhat Research Scholar, Department of Electrical Engineering, IIT Delhi, India

FINAL: CS 6375 (Machine Learning) Fall 2014

Video and Motion Analysis Computer Vision Carnegie Mellon University (Kris Kitani)

Eigenface-based facial recognition

E = K + U. p mv. p i = p f. F dt = p. J t 1. a r = v2. F c = m v2. s = rθ. a t = rα. r 2 dm i. m i r 2 i. I ring = MR 2.

Advanced Machine Learning & Perception

Change Detection in Optical Aerial Images by a Multi-Layer Conditional Mixed Markov Model

STA 4273H: Statistical Machine Learning

Tennis player segmentation for semantic behavior analysis

Shankar Shivappa University of California, San Diego April 26, CSE 254 Seminar in learning algorithms

Advanced Machine Learning & Perception

HMM part 1. Dr Philip Jackson

PCA FACE RECOGNITION

Recent Advances in Bayesian Inference Techniques

Learning to Disentangle Factors of Variation with Manifold Learning

Around the Speaker De-Identification (Speaker diarization for de-identification ++) Itshak Lapidot Moez Ajili Jean-Francois Bonastre

Robotics 2 AdaBoost for People and Place Detection

Universität Potsdam Institut für Informatik Lehrstuhl Maschinelles Lernen. Decision Trees. Tobias Scheffer

Mobile Robot Localization

I D I A P R E S E A R C H R E P O R T. Samy Bengio a. November submitted for publication

10. Hidden Markov Models (HMM) for Speech Processing. (some slides taken from Glass and Zue course)

COS 429: COMPUTER VISON Face Recognition

Machine Learning - MT Clustering

Expectation Maximization

Technical Report. Results from 200 billion iris cross-comparisons. John Daugman. Number 635. June Computer Laboratory

Introduction to Boosting and Joint Boosting

Undirected Graphical Models

Latent Variable Models and Expectation Maximization

Spectral Clustering with Eigenvector Selection

Course 495: Advanced Statistical Machine Learning/Pattern Recognition

Human Pose Tracking I: Basics. David Fleet University of Toronto

Spectral clustering with eigenvector selection

Activity Mining in Sensor Networks

Example: Face Detection

Reconnaissance d objetsd et vision artificielle

Spectral Clustering of Polarimetric SAR Data With Wishart-Derived Distance Measures

CSCI-567: Machine Learning (Spring 2019)

FINAL EXAM: FALL 2013 CS 6375 INSTRUCTOR: VIBHAV GOGATE

Machine Learning for OR & FE

EECS490: Digital Image Processing. Lecture #26

User Requirements, Modelling e Identification. Lezione 1 prj Mesa (Prof. Ing N. Muto)

Deriving Principal Component Analysis (PCA)

Advanced Machine Learning & Perception

A brief introduction to Conditional Random Fields

Final Examination CS540-2: Introduction to Artificial Intelligence

Classification for High Dimensional Problems Using Bayesian Neural Networks and Dirichlet Diffusion Trees

Predictive analysis on Multivariate, Time Series datasets using Shapelets

Shape of Gaussians as Feature Descriptors

Hidden Markov Models Part 1: Introduction

Symmetries 2 - Rotations in Space

Machine Learning Basics

Relational Event Modeling: Basic Framework and Applications. Aaron Schecter & Noshir Contractor Northwestern University

INTEGRATING RECOGNITION AND REASONING IN SMART ENVIRONMENTS

Qualifier: CS 6375 Machine Learning Spring 2015

Independent Component Analysis and Unsupervised Learning. Jen-Tzung Chien

Independent Component Analysis and Unsupervised Learning

Recurrent Autoregressive Networks for Online Multi-Object Tracking. Presented By: Ishan Gupta

Speaker Representation and Verification Part II. by Vasileios Vasilakakis

Graphical Object Models for Detection and Tracking

Rotation. Rotational Variables

STA 414/2104: Machine Learning

Statistical and Learning Techniques in Computer Vision Lecture 2: Maximum Likelihood and Bayesian Estimation Jens Rittscher and Chuck Stewart

ECE 521. Lecture 11 (not on midterm material) 13 February K-means clustering, Dimensionality reduction

Optimal Control Problems in Eye Movement. Bijoy K. Ghosh Department of Mathematics and Statistics Texas Tech University, Lubbock, TX

Master of Intelligent Systems - French-Czech Double Diploma. Hough transform

Role of Assembling Invariant Moments and SVM in Fingerprint Recognition

Missing Data and Dynamical Systems

Clustering and Gaussian Mixtures

Transcription:

Informatics & Telematics Institute Presentation Activity related recognition in AmI environments Mr. Anastasios Drosou (Dr. Tzovaras Group)

Outline Introduction Activity Detection Activity related recognition Human Tracking Reaching Part Grasping Part Classifiers Results Conclusions Presentation, Thessaloniki 16 March 2011 2

Definition of Biometrics Biometrics measure the unique physical or behavioural characteristics of individuals as a means to recognize or authenticate their identity. Establish someone s identity based on who he/she is rather than on what he/she poseses (e.g. ID cards,) or what he/she remembers (e.g. password). Verification problem: Is the user Mr. X? 1:1 comparison. Identification: Who is the user? 1:many comparison. Presentation, Thessaloniki 16 March 2011 3

Types of Biometrics Description Types Pros Cons Physiological Biometrics A physical attribute unique to a person Fingerprint, palm, iris, geometry, retina, facial characteristics, etc. High Accuracy. Obtrusive, Uncomfortable, Easy to spoof, Intolerant of changes over time. Behavioural Biometrics Traits that are learned or acquired from a person Facial Dynamics, Gait, Voice, Key-stroke Patterns, Activity related signals, etc. Unobtrusive, Difficult to spoof, Continuous & on-the-move authentication, Rare changes. Lower accuracy, yet... Presentation, Thessaloniki 16 March 2011 4

Motivation We claim recognition potential because: 1. Personal behavioural patterns in everyday movements (i.e. gait, grimaces, standard movements, etc) 2. Physiology of the human body (height, arm length, finger lengths, etc.) 3. Bodymetric restrictions (i.e. impairments, etc.) 4. Minimum Jerk Model (Flash & Hogan) 5. Minimum Torque Change Model (Uno, Kawato & Suzuki) 6. End state comfort effect (Rosenbaum et al.) 7. Presentation, Thessaloniki 16 March 2011 5

Case study: Prehension biometrics Response of the person to specific stimuli. Work related, everyday activities with no special protocol. Continuous authentication/recognition. Multi-activity concept for human authentication. 2 stages of a prehension movement: i. a fast initial movement (reaching) ii. a slow approach movement (grasping) Presentation, Thessaloniki 16 March 2011 6

System-Level Criteria Characteristics of a biometric that must be present in order to use the system for authentication purposes Universality Universality: Every person should possess this characteristic In practice, this may not be the case Otherwise, population of non-universality must be small < 1% Uniqueness Uniqueness: No two individuals possess the same characteristic. Genotypical Genetically linked (e.g. identical twins will have same biometric) Phenotypical Non-genetically linked, different perhaps even on same individual Establishing uniqueness is difficult to prove analytically May be unique, but uniqueness must be distinguishable Permanence Permanence: The characteristic does not change in time, that is, it is time invariant At best this is an approximation Degree of permanence has a major impact on the system design and long term operation of biometrics. (e.g. enrollment, adaptive matching design, etc.) Long vs. short-term stability Collectability - Measurability Collectability: The characteristic can be quantitatively measured. In practice, the biometric collection must be: Non-intrusive Reliable and robust Cost effective for a given application Measurability: The trait can be measured with simple technical instruments Performance accuracy, speed, and robustness of technology used Circumvention The trait is easily measured with minimal discomfort.then it can potentially serve as a biometric for a given application. 7

Outline Introduction Activity Detection Activity related recognition Human Tracking Reaching Part Grasping Part Classifiers Results Conclusions Presentation, Thessaloniki 16 March 2011 8

Motion Templates Motion Energy Image (MEI) Describes the motion energy for a given view of action. Binary image. E T ( x, y, t) = τ 1 i= 0 D( x, y, t i) Motion History Image (MHI) Intensity image. MHI T (, x y,) t τ if Dxyt (,, ) = 1 = max(0, MHIT ( x, y, t 1) 1) otherwise Presentation, Thessaloniki 16 March 2011 9

Radon Transforms (I) 1. Face detection 2. Centre of Face 3. RIT Extraction 0.025 0.02 0.015 0.01 Radial Integration Transform (RIT): 0.005 0 0 20 40 60 80 100 120 1 RIT ( t θ) MHI ( r j u co ( t sθ), y j u sin t ()) θ J = J j= 1 0 + 0 + o for t = 1,..., T with T = 360 / θ Presentation, Thessaloniki 16 March 2011 10

Radon Transforms (II) 1. Face detection 2. Centre of Face 3. CIT Extraction Circular Integration Transform (CIT): 1 CIT ( t ρ) MHI ( x k ρcos( t θ), y k ρsin( t θ)) T = T t= 1 0 + 0 + o for k = 1,..., K with T = 360 / θ Presentation, Thessaloniki 16 March 2011 1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 0 10 20 30 40 50 60 70 80 11

Similarity Metrics Euclidian Distance 2 n 2 E( xy, ) = ( x y) = ( ) i= 1 i i D x y Correlation Factor - Coefficients corr( x, y) cov( xy, ) = ρxy, = = σσ E(( x µ )( y µ )) x σσ x y x y y Presentation, Thessaloniki 16 March 2011 12

Human Body Tracking Camera based tracking: Custom Tracker OpenNI Library Tracker Ascension Technology Corp. Magnetic Tracker (ground truth). Cyberglove finger tracking. Presentation, Thessaloniki 16 March 2011

Face Detection & Tracking Viola & Jones s real-time face detection method: T weak classifiers form a strong one, integral Image, haar like features (subtracting the pixel values in the dark from the bright rectangles), AdaBoost algorithm. Mean Shift algorithm: Bases on spatial and colour information between sequential frames. Bhattacharyya distance Presentation, Thessaloniki 16 March 2011 14

Background Extraction Skin Colour Filtering Motion Detection Location of the face Disparity images (stereoscopic Camera) DI ( ) = SI ( ) BI ( ) No training required. Computational inexpensive. Simple approach - explicit rules. Skin cluster boundaries of RGB & HSV. M ( I) D ( I) D ( I) t 1 2 MHI ( x, y) t 2, if Mt ( Ixy (, )) = 1 = max(0, MHIt 1( x, y) 1), otherwise Presentation, Thessaloniki 16 March 2011 15

Movement s Tracking Presentation, Thessaloniki 16 March 2011 16

Body Tracker Evaluation Presentation, Thessaloniki 16 March 2011

Primesense TOF Sensor - OpenNI Open Source Library Real time & accurate. Human body form recognition and segmentation ecognizes the human Human body tracking by simultaneousl tracking of 48 essential points of the human body in the 3D space. Trained machine learning algorithm by millions of manually tagged images of people in different poses. OpenNI manages to adjust the most appropriate skeleton model to each human body in terms of size and pose. Presentation, Thessaloniki 16 March 2011

Cyberglove finger tracker 4 DoF for each finger 3 phalanxes for each finger 3 DoF for the orientation of the whole palm Time-stamped Data Presentation, Thessaloniki 16 March 2011

Outline Introduction Activity Detection Activity related recognition Human Tracking Reaching Part Grasping Part Classifiers Results Conclusions Presentation, Thessaloniki 16 March 2011 20

Feature Extraction (I) x Head Right Hand Left Hand *The diameter of the dots is anti proportional to the depth of the tracked point. y Presentation, Thessaloniki 16 March 2011 21

Phone Conversation Trajectories Subject 1 Subject 2 Subject 3 1 2 Presentation, Thessaloniki 16 March 2011 22

Office Panel Trajectories Subject 1 Subject 2 Subject 3 1 2 Presentation, Thessaloniki 16 March 2011 23

Warping the trajectory Enrollment Authentication Client: PD ( ') qpd ( ) q = = Impostor: P P Head Phone P P Head Phone Presentation, Thessaloniki 16 March 2011 24

Feature Extraction (I) Exact location of the head/hand 3D position, at time t (timestamp tracking). v x, y, z = ds x, y, z dt Dynamic Spatial Cost a xyz,, = dv xyz,, dt S t S t x x 2 2 2 p ( ) = p ( 1) + ( t ( t 1) ) + ( t ( t 1) ) + ( t ( t 1) ) y y z z Presentation, Thessaloniki 16 March 2011 25

Feature Extraction (II) 26 Curvature - Derivative Torsion - Derivative abc c s b s a s s P i ) )( )( ( 4 ) ( ^ ^ ^ ^ * = κ g d b a P P P i i i s + + + = + 2 2 ) ( ) ( 3 ) ( 1 * 1 * * κ κ κ ) ( 6 ) ( (6 2 1 ) ( * * * i i i P gmm H P def H P κ κ τ + = + g h d b a P P P r P P P i i s i i i i s + + + + + = + 2 2 2 )) ( 6 ) / ( ) ( ( ) ( ) ( 4 ) ( * * * 1 * 1 * * κ κ τ τ τ τ Presentation, Thessaloniki 16 March 2011

Spatiotemporal Activity Surface 1. PCA on 3D Trajectories 2. 3-dimensional manifold trajectories using time as 3 rd dimension 3. Triangulation Presentation, Thessaloniki 16 March 2011 f ( θφ, ) = min { d( RP, s )} RP k = 1,, K i k i

Intra-similarity vs. Inter-variance of AS Presentation, Thessaloniki 16 March 2011

Static Anthropometric Attributed Graph Matching 7 Attributed Edges 8 Nodes G 7 = { V, E,{A},{B} i=1} Supports: Full-graph matching (n`=n) Sub-graph matching (n`>n) (AGM) problem: ( s ) q W j 1 j+ r B j PB = j min ' p B = PB' P ( +ε M ) T j 0 0 j j Presentation, Thessaloniki 16 March 2011

Confidence Factor Ergonomy 0.1 dtorso, object + 0.5, if dtorso, object < 5cm b = 1, if 5cm dtorso, object 35cm 0.02 dtorso, object + 1.7, if dtorso, object > 35cm f q = 1 N + N + N 3N misshead missrhand misslhand frames f q, final = bf q Presentation, Thessaloniki 16 March 2011

Outline Introduction Activity Detection Activity related recognition Human Tracking Reaching Part Grasping Part Classifiers Results Conclusions Presentation, Thessaloniki 16 March 2011 31

F = { θ, θ,..., θ } a 0 1 22 Feature Extraction Angle between the phalanxes of each finger, at time t (timestamp tracking). ω = dθ dt Dynamic Travel Cost θ a θ dω = = dt 2 d θ 2 dt * 2 ka j j [ Tj Tj ( aj)] Vj( aj( t), Tj) = ( )(1 + ) 2 r s T * ( a ( t)) = k ln( a + 1), k 0 The cost of moving joint j through an angle of size in the given time Presentation, Thessaloniki 16 March 2011 j j j j j Joint s optimal time 32

Dynamic Time Warping (DTW) Classifier Based on dynamic Programming No training required Simple Implementation Fast Area size The problem of finding the optimal warping path can be reduced to finding this sequence of nodes (x k ; y k ), which minimizes D(x k ; y k ) along the complete path. The main aim is to find the path for which the least cost is associated. Dx (, y) = Dx (, y ) + cx (, y) = cx (, y) k k ( k 1) ( k 1) k k m m m= 1 Presentation, Thessaloniki 16 March 2011 k S L L = V( p, q ) c i j i= 1 j= 1 D = S D ( LL, ) M c min Total dissimilarity

Hidden Markov Model (HMM) Classifier Fully connected, left-to-right, five-state (N=5) HMM. The triplet: λ = { π, α, b } j ij j π j is the probability of the j th trajectories, state being the first state among all the a ij is the probability of the j th state occurring immediately after the i th state, b j denotes the PDF of the j th state. The observational data from each state of the HMM are generated according to a PDF dependent on the instant of t th state,. b ( O ) Given HMMs for the L enrolled subjects and the new trajectory vectors, we assign user label m as the HMM that maximizes the likelihood (ML principle). S 5 j S 4 S 3 t S 1 S 2 Presentation, Thessaloniki 16 March 2011 35

Spherical Harmonics 2l+ 1 ( l m)! Y P cos cos m isin m 4 π ( l+ m)! m ( θφ m l, ) = ( θ l )( ( φ ) + ( φ )) Spherical Harmonics series 2 l (, ) m (, ) π π (, ) m m = RPi l Ω= Ω 0 0 RPi l c f θφy θφd f θφy dθdφ Rotational Invariance: Z-Normalization: c z l c * l K l = c * l m= K µ σ Fusion among Reference Points: Presentation, Thessaloniki 16 March 2011 = y y l c m l N Spherical Harmonics coeffs. S = ws = ws + ws + + w S tot j j 1 1 2 2 N N j= 1

Database ACTIBIO dataset (AmI indoor environment): 29 Subjects, 2 Time sessions - 8 repetitions in total, Annotated frame sequences, 5 cameras in total: 1 stereo camera (BumblebeeXB3 Point Grey Inc), 2 usb cameras (Lateral-Zenithal) Presentation, Thessaloniki 16 March 2011 37

Outline Introduction Activity Detection Activity related recognition Human Tracking Reaching Part Grasping Part Classifiers Results Conclusions Presentation, Thessaloniki 16 March 2011 38

Activity Detection Results Events Phone Panel Mic. Panel Drink Hands Up Phone 93.1% 0% 0% 6.9% 0% Panel 0% 89.7% 10.3% 0% 0% Mic. Panel 0% 3.44% 3.44% 93.1% 100% Drink 0% 10.3% 86.2% 3.44% 0% Presentation, Thessaloniki 16 March 2011 39

Tracking Results Subject s No. Confidence Score Subject s No. Confidence Score Subject 1 0.835227 Subject 3 0.818681 Subject 3 0.937500 Subject 4 0.846774 Subject 5 0.933333 Subject 6 0.927778 Subject 27 0.935897 Subject 28 0.720588 Subject 29 0.866667 Presentation, Thessaloniki 16 March 2011 40

Authentication HMM Results Phone Conv. (EER=15%) Office Panel (EER=9.8%) Presentation, Thessaloniki 16 March 2011 41

Authentication Performance ACTIBIO Database (29 Subjects) Internal Database (15 Subjects ACTIBIO Database (29 Subjects) Score Level Fusion: S = 0.25* S + 0.75* S total Phone Panel Presentation, Thessaloniki 16 March 2011

Recognition Performance (HMM) Presentation, Thessaloniki 16 March 2011 Score Level Fusion: S = 0.25* S + 0.75* S total Phone Panel

Spherical Harmonics Results (EER=11.4%) (EER=15%) (EER=16.7%) Presentation, Thessaloniki 16 March 2011

Conclusions Recognition using just a common stereoscopic camera. Unobtrusive, on the move, continuous authentication. Significant recognition potential just by the spatial information of the trajectories. Higher authentication rates are expected, given the relative entropies. Rotational invariance from spherical harmonics analysis. Privacy enabled method, since no obtrusive information is stored. Activity-related biometric authentication provided very promising results and is expected to maximize the performance of a multimodal biometric system. Presentation, Thessaloniki 16 March 2011 45