Switch Mechanism Diagnosis using a Pattern Recognition Approach

The 4th IET International Conference on Railway Condition Monitoring (RCM 2008), Derby, UK

Switch Mechanism Diagnosis using a Pattern Recognition Approach

F. Chamroukhi, A. Samé, P. Aknin (The French National Institute for Transport and Safety Research, INRETS-LTN); M. Antoni (The French National Railway Company, SNCF)

Overview of the presentation (slide 2)

- Context
- The proposed pattern recognition approach:
  - general principle of a pattern recognition approach
  - signal parameterization
  - parameter learning: mixture distribution modeling
  - signal classification
- Experimental study
- Conclusion and future work

Context (slide 3)

Switch mechanism diagnosis. The considered switches are:
- operated by an electric motor
- equipped with a clamp-lock system ("Verrou-Carter-Coussinet" in French)

[Figure: the electric motor and the moving points]

Acquired signals (slide 4)

Measurements of the electrical power consumption during the switch actuation period.
- Sampling frequency: 100 Hz
- Length of each signal: 550 points

[Figure: example power consumption signal, power (Watt) versus time (s)]

The proposed pattern recognition approach (slide 5)

General principle of a pattern recognition approach:
- Learning: sensors -> signals -> preprocessing -> feature extraction -> features -> supervised learning algorithm -> model
- Classification: new signals -> preprocessing and feature extraction -> decision rule -> signal classes
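As a rough illustration of this pipeline (a sketch, not the authors' code), the classification path can be written as a chain of stages; the smoothing step and the two callables are hypothetical placeholders:

```python
import numpy as np

def classify_signal(raw_signal, extract_features, decision_rule):
    """Pattern recognition pipeline: preprocessing -> feature extraction -> decision.

    `extract_features` and `decision_rule` are the outputs of the learning stage;
    the moving-average smoothing below is only a toy stand-in for preprocessing.
    """
    smoothed = np.convolve(raw_signal, np.ones(5) / 5.0, mode="same")  # toy preprocessing
    x = extract_features(smoothed)   # feature vector describing the signal
    return decision_rule(x)          # predicted class label
```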

Signal parameterization: feature extraction (slide 6)

The switch actuation consists of successive mechanical motions of different parts of the mechanism:
- starting phase
- points unlocking
- points translation
- points locking
- friction phase

These motions are observed in the shape of the signal.

The different phases of a switch actuation (slides 7 to 11)

Each phase is highlighted in turn on the power signal:
[Figure: power (Watt) versus time (s), successively annotated with the starting, unlocking, translation, locking and friction phases]

Feature extraction (slide 12)

Each signal is described by the set of features of its three main phases:
- Polynomial fitting of each segment: a_0 + a_1 t + ... + a_p t^p, giving the coefficient vector α_i = (a_0, a_1, ..., a_p)
- Parameter vector: the polynomial coefficients together with the min, max, mean and variance of the signal in each segment
- Feature vector of a signal: x_i = (α_1; α_2; α_3; min; max; mean; variances)

[Figure: signal with the fitted polynomial on each of the three segments]

For each signal we thus have 21 parameters instead of 550 points!
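To make this parameterization concrete, here is a minimal Python sketch (not the authors' code): `np.polyfit` stands in for whatever fitting routine was actually used, and the segment boundaries of the three main phases are assumed to be known:

```python
import numpy as np

def segment_features(y, t, degree):
    """Polynomial coefficients plus min, max, mean and variance of one segment."""
    coeffs = np.polyfit(t, y, degree)  # (a_p, ..., a_0), highest degree first
    return np.concatenate([coeffs, [y.min(), y.max(), y.mean(), y.var()]])

def signal_features(signal, fs=100.0, boundaries=(150, 400), degree=3):
    """Feature vector x_i built from the three main phases of one signal.

    The sample indices in `boundaries` are hypothetical: the slide does not
    give the exact segmentation of the 550-point signal.
    """
    t = np.arange(len(signal)) / fs
    b1, b2 = boundaries
    segments = [(signal[:b1], t[:b1]),
                (signal[b1:b2], t[b1:b2]),
                (signal[b2:], t[b2:])]
    return np.concatenate([segment_features(y, tt, degree) for y, tt in segments])
```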

Learning parameters (slide 13)

The three considered classes:
- C1: class without defect
- C2: class with minor defect (lack of lubrication)
- C3: class with critical defect (critical lack of lubrication)

[Figure: one example power signal per class, power (Watt) versus time (s)]

Learning parameters: Mixture Discriminant Analysis (MDA) (slide 14)

Why Mixture Discriminant Analysis?
- In classical Linear Discriminant Analysis (LDA), each class is modeled by a single Gaussian density
- For complex classes, a single density is insufficient
- Proposed solution: a Gaussian mixture density per class

MDA is a probabilistic discrimination method based on the Gaussian Mixture Model (GMM). Its advantages:
- MDA models the classes more precisely
- It improves the correct classification rate

Gaussian Mixture Model (GMM) (slide 15)

The mixture density for class C_k:

p(x_i | C_k) = Σ_{r=1..R_k} π_{kr} N(x_i; m_{kr}, Σ_{kr})

where:
- x_i is the feature vector extracted from the i-th signal
- R_k is the number of densities in the mixture
- the mixture proportions satisfy π_{kr} ≥ 0 and Σ_r π_{kr} = 1
- N(x_i; m_{kr}, Σ_{kr}) is the Gaussian probability density function with mean m_{kr} and covariance matrix Σ_{kr}
- θ_k = {π_{kr}, m_{kr}, Σ_{kr}; r = 1, ..., R_k} is the parameter vector of class C_k to be estimated
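As a small illustrative sketch (an assumption, using SciPy rather than the authors' implementation), the class-conditional mixture density above can be evaluated directly from its definition:

```python
import numpy as np
from scipy.stats import multivariate_normal

def gmm_density(x, weights, means, covs):
    """p(x | C_k) = sum_r pi_kr * N(x; m_kr, Sigma_kr).

    `weights` must be nonnegative and sum to one; `means` and `covs` hold the
    component means m_kr and covariance matrices Sigma_kr of class C_k.
    """
    return sum(w * multivariate_normal.pdf(x, mean=m, cov=S)
               for w, m, S in zip(weights, means, covs))
```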

Examples of GMM distributions (slide 16)

[Figure, left: a GMM density in dimension 1; right: a bidimensional data set simulated according to two GMM distributions]

Estimation of the mixture model parameters (slide 17)

Maximum likelihood method. The log-likelihood for class C_k is

L(θ_k) = Σ_{i ∈ C_k} log Σ_{r=1..R_k} π_{kr} N(x_i; m_{kr}, Σ_{kr})

The maximization is performed by a dedicated algorithm: the Expectation-Maximization (EM) algorithm. The optimal number of Gaussian distributions R_k for each class is computed by maximizing the Bayesian Information Criterion (BIC).
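A minimal sketch of this estimation step with scikit-learn (an assumption; the slides do not say which implementation was used). `GaussianMixture.fit` runs EM internally, and its `bic` method follows the convention that lower values are better, which is equivalent to maximizing the BIC as stated on the slide:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_class_gmm(X_k, max_components=5, seed=0):
    """Fit a GMM to the feature vectors X_k of one class, selecting R_k by BIC."""
    best_model, best_bic = None, np.inf
    for r in range(1, max_components + 1):
        gmm = GaussianMixture(n_components=r, covariance_type="full",
                              random_state=seed).fit(X_k)  # EM algorithm
        bic = gmm.bic(X_k)  # scikit-learn's BIC: lower is better
        if bic < best_bic:
            best_model, best_bic = gmm, bic
    return best_model
```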

How to classify signals? (slide 18)

Use the Maximum A Posteriori (MAP) rule. A new signal to be classified goes through preprocessing and feature extraction; the decision rule (MAP) then outputs the signal's class from its feature vector.

Assign each signal, represented by x_i, to the class k* that maximizes the posterior probability:

k* = arg max_k P(C_k | x_i)

where P(C_k | x_i) = P(C_k) p(x_i | C_k) / Σ_{k'} P(C_{k'}) p(x_i | C_{k'})
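Putting the pieces together, a sketch of the MAP rule under the assumptions of the previous snippets (one fitted scikit-learn `GaussianMixture` per class, and class priors estimated for example from class frequencies):

```python
import numpy as np

def map_classify(x, class_models, priors):
    """Assign x to the class k* maximizing P(C_k) * p(x | C_k).

    `class_models[k].score_samples` returns log p(x | C_k) for a fitted
    scikit-learn GaussianMixture; `priors[k]` is the class prior P(C_k).
    The shared normalizing denominator can be ignored in the argmax.
    """
    log_post = [np.log(priors[k]) + model.score_samples(x.reshape(1, -1))[0]
                for k, model in enumerate(class_models)]
    return int(np.argmax(log_post))  # index of the most probable class
```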

Experimental study (slide 19)

Database: 119 labelled signals
- 90 signals used for learning (supervised learning)
- 29 signals used to evaluate the classifier

Comparison with alternative classification approaches:
- neural network (based on a multilayer perceptron)
- K-nearest neighbours
- Bayesian discrimination approach (a single Gaussian density for each class)

Estimated GMM distributions in the principal factor discriminant plane (slide 20)

[Figure: estimated GMM distributions projected onto the factor discriminant plane (FDA axes); class 1 = green circles, class 2 = blue triangles, class 3 = red squares]

Results (slide 21)

Correct classification rates obtained with the four methods:

  Approach                            Correct classification rate
  MDA                                 95 %
  Neural network (NN)                 90 %
  K-nearest neighbours (KNN)          88 %
  Bayesian disc. with one Gaussian    75 %

Number of mixture components selected per class:

  Class                           C1   C2   C3
  Number of mixture components    4    2    2

Conclusion (slide 22)

- Development of a classification method based on Mixture Discriminant Analysis (MDA) in a switch mechanism diagnosis context
- This type of approach can be applied to various switch mechanisms, since it only requires the electrical power consumption signals
- The experimental study on real signals showed good performance of our approach compared with the alternative methods

Future work (slide 23)

Monitoring of the point state over time, across a sequence of actuations. Envisaged approaches:
- regressive and autoregressive mixture models
- Hidden Markov Models