Vassilis Katsouros, Vassilis Papavassiliou and Christos Emmanouilidis

Similar documents
Information-based Feature Selection

Chapter 7. Support Vector Machine

Enterprise interoperability measurement - Basic concepts

ECE 8527: Introduction to Machine Learning and Pattern Recognition Midterm # 1. Vaishali Amin Fall, 2015

15-780: Graduate Artificial Intelligence. Density estimation

Vector Quantization: a Limiting Case of EM

Title: Damage Identification of Structures Based on Pattern Classification Using Limited Number of Sensors

CEE 522 Autumn Uncertainty Concepts for Geotechnical Engineering

Statistical Pattern Recognition

Research on Dependable level in Network Computing System Yongxia Li 1, a, Guangxia Xu 2,b and Shuangyan Liu 3,c

1 Review of Probability & Statistics

Hybridized Heredity In Support Vector Machine

6.867 Machine learning

Stochastic Matrices in a Finite Field

CS434a/541a: Pattern Recognition Prof. Olga Veksler. Lecture 5

Discrete Mathematics and Probability Theory Spring 2013 Anant Sahai Lecture 18

Mixtures of Gaussians and the EM Algorithm

Expectation-Maximization Algorithm.

Continuous Domain Analysis of Graph Laplacian Regularization for Image Denoisng. Presenter: Gene Cheung National Institute of Informatics

Short Term Load Forecasting Using Artificial Neural Network And Imperialist Competitive Algorithm

Clustering. CM226: Machine Learning for Bioinformatics. Fall Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar.

Outline. CSCI-567: Machine Learning (Spring 2019) Outline. Prof. Victor Adamchik. Mar. 26, 2019

As stated by Laplace, Probability is common sense reduced to calculation.

Probability and MLE.

Generalized Semi- Markov Processes (GSMP)

Uncertainty. Variables. assigns to each sentence numerical degree of belief between 0 and 1. uncertainty

Linear Algebra Issues in Wireless Communications

Supplementary Material: HCP: A Flexible CNN Framework for Multi-label Image Classification

DERIVING THE 12-LEAD ECG FROM EASI ELECTRODES VIA NONLINEAR REGRESSION

Pattern Classification

μ are complex parameters. Other

Structuring Element Representation of an Image and Its Applications

Machine Learning.

Accuracy assessment methods and challenges

Outline. Linear regression. Regularization functions. Polynomial curve fitting. Stochastic gradient descent for regression. MLE for regression

IP Reference guide for integer programming formulations.

Statistical Analysis on Uncertainty for Autocorrelated Measurements and its Applications to Key Comparisons

Lectures on Stochastic System Analysis and Bayesian Updating

Forecasting SO 2 air pollution in Salamanca, Mexico using an ADALINE.

1 Review and Overview

Presenting A Framework To Study Linkages Among Tqm Practices, Supply Chain Management Practices, And Performance Using Dematel Technique

Generating Functions. 1 Operations on generating functions

3/8/2016. Contents in latter part PATTERN RECOGNITION AND MACHINE LEARNING. Dynamical Systems. Dynamical Systems. Linear Dynamical Systems

Soo King Lim Figure 1: Figure 2: Figure 3: Figure 4: Figure 5: Figure 6: Figure 7:

INF Introduction to classifiction Anne Solberg Based on Chapter 2 ( ) in Duda and Hart: Pattern Classification

Chapter 3: Other Issues in Multiple regression (Part 1)

Quality prediction for polypropylene production process based on. CLGPR model

Research Article A Unified Weight Formula for Calculating the Sample Variance from Weighted Successive Differences

Channel coding, linear block codes, Hamming and cyclic codes Lecture - 8

EDGE AND SECANT IDEALS OF SHARED-VERTEX GRAPHS

Lecture 10: Performance Evaluation of ML Methods

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity

A Unified Model between the OWA Operator and the Weighted Average in Decision Making with Dempster-Shafer Theory

Intermittent demand forecasting by using Neural Network with simulated data

Improvement of Generic Attacks on the Rank Syndrome Decoding Problem

ECE 901 Lecture 12: Complexity Regularization and the Squared Loss

Discrete Mathematics and Probability Theory Fall 2009 Satish Rao,David Tse Lecture 16. Multiple Random Variables and Applications to Inference

FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING. Lectures

Machine Learning Theory Tübingen University, WS 2016/2017 Lecture 12

The target reliability and design working life

7. Modern Techniques. Data Encryption Standard (DES)

Estimation of Population Mean Using Co-Efficient of Variation and Median of an Auxiliary Variable

A Unified Approach on Fast Training of Feedforward and Recurrent Networks Using EM Algorithm

EE 6885 Statistical Pattern Recognition

Warped, Chirp Z-Transform: Radar Signal Processing

Extreme Value Charts and Analysis of Means (ANOM) Based on the Log Logistic Distribution

1 Efficient Splice Site Prediction with Context-Sensitive Distance Kernels

Solution of Final Exam : / Machine Learning

The Pascal Fibonacci numbers

The AMSU Observation Bias Correction and Its Application Retrieval Scheme, and Typhoon Analysis

Evapotranspiration Estimation Using Support Vector Machines and Hargreaves-Samani Equation for St. Johns, FL, USA

Linear Classifiers III

Sequential Monte Carlo Methods - A Review. Arnaud Doucet. Engineering Department, Cambridge University, UK

RAINFALL PREDICTION BY WAVELET DECOMPOSITION

The prediction of cement energy demand based on support vector machine Xin Zhang a, *, Shaohong Jing b

2 1. The r.s., of size n2, from population 2 will be. 2 and 2. 2) The two populations are independent. This implies that all of the n1 n2

Empirical Process Theory and Oracle Inequalities

FLOWSHOP SCHEDULING USING A NETWORK APPROACH

Bi-Magic labeling of Interval valued Fuzzy Graph

Tracking Bearing Degradation using Gaussian Wavelets

Optimally Sparse SVMs

INTEGRATION BY PARTS (TABLE METHOD)

Bayesian Methods: Introduction to Multi-parameter Models

ANY TIME PROBABILISTIC REASONING FOR SENSOR VALIDATION

TEACHER CERTIFICATION STUDY GUIDE

AUTOMATED ANALYSIS OF EDDY CURRENT (EC) GIANT MAGNETO RESISTIVE (GMR) DATA

Axioms of Measure Theory

Parallel Vector Algorithms David A. Padua

Provläsningsexemplar / Preview TECHNICAL REPORT INTERNATIONAL SPECIAL COMMITTEE ON RADIO INTERFERENCE

HARMONIC ANALYSIS FOR OPTICALLY MODULATING BODIES USING THE HARMONIC STRUCTURE FUNCTION (HSF) Lockheed Martin Hawaii

Pattern Classification

An Introduction to Randomized Algorithms

62. Power series Definition 16. (Power series) Given a sequence {c n }, the series. c n x n = c 0 + c 1 x + c 2 x 2 + c 3 x 3 +

Deep Neural Networks CMSC 422 MARINE CARPUAT. Deep learning slides credit: Vlad Morariu

Kinetics of Complex Reactions

PARETO-OPTIMAL SOLUTION OF A SCHEDULING PROBLEM ON A SINGLE MACHINE WITH PERIODIC MAINTENANCE AND NON-PRE-EMPTIVE JOBS

CHAPTER 5 SOME MINIMAX AND SADDLE POINT THEOREMS

GUIDELINES ON REPRESENTATIVE SAMPLING

A Note on Effi cient Conditional Simulation of Gaussian Distributions. April 2010

Hypothesis Testing. Evaluation of Performance of Learned h. Issues. Trade-off Between Bias and Variance

Transcription:

Vassilis Katsouros, Vassilis Papavassiliou ad Christos Emmaouilidis ATHENA Research & Iovatio Cetre, Greece www.athea-iovatio.gr www.ceti.athea-iovatio.gr/compsys e-mail: christosem AT ieee.org

Problem defiitio Modelig Results Discussio - coclusio

Applied research, prototypes ad applicatios developmet Dissemiatio ad R&D results valorizatio Research Iovatio support Traiig ad kowledge trasfer 3

Idustrial Systems Istitute Istitute for the Maagemet of Iformatio Systems Istitute for Laguage ad Speech Processig Idustrial Iformatio & Commuicatio Systems Embedded Systems Eterprise Itegratio Security & Privacy Idustrial Automatio & Cotrol ICT i Maufacturig Large-scale iformatio systems Data maagemet Big data Geo-iformatics Databases Software egieerig Digital curatio Bio-iformatics E-Govermet Natural & Embodied Laguage Processig Multimedia, 2D/3D imagig & VR Itelliget Systems Speech ad Music Techology Techology-Ehaced Learig Cultural Iformatics Corallia Iovatio Clusters Develop iovative ecosystems i specific sectors ad regios of the coutry, ad where a competitive advatage ad export orietatio exists Nao/Microelectroics-based Systems ad Applicatios (mi-cluster) Iovative Gamig Techologies ad Creative Cotet (gi-cluster) Space Techologies ad Applicatios (si-cluster) Iitiatio of R&D Projects

welcom-project.ceti.gr Wireless sesor etworks for egieerig asset lifecycle optimal maagemet

Worked with differet modelig approaches Best results obtaied with a Bayesia classificatio approach defie distict problem classes oe for each problem type calculate the posterior probability of a test case for each problem class recommed the problem class that correspods to the maximum a posteriori (MAP) probability

case: cosists of a collectio of evet codes, each of which correspods to a umber of parameters a record of a case ca be defied as a sigle evet code alog with the respective measuremet parameters 30 parameters of oboard measuremets, recorded each time a evet code was geerated 1 dataset of cases with their evet codes ad respective parameters for traiig ad aother 1 for testig/evaluatio The traiig dataset icluded the classificatio of the cases ito uisace or problem for the cases classified as problems, the correspodig problem label/idetifier was also provided.

Traiig dataset: 1.316.653 records that correspod to 10.459 cases of which 10.295 were characterized as uisace ad 164 as problem (13 distict problem IDs) Testig dataset: 1.893.882 records of evet codes measured parameters that correspod to 9.358 distict cases. Groud truth of the testig dataset ivolved 174 problem cases, with the remaiig 9.184 beig uisace cases A recommeder should idetify the problem cases from the 9.358 testig cases ad for each oe of them provide the respective problem idetifier. Evaluatio metric: calculated o a set of cases that ivolved all 174 groud truth problem cases ad a radom selectio of 174 from the total of 9.184 uisace cases.

maximum performace: 348, i.e. sum of 174 groud truth problem cases ad 174 uisace that the traiig dataset icludes oe case with a problem type idetifier (P7940) that is ot foud i the test dataset test dataset icludes cases that have bee categorized i two problem types (P0932 ad P6880) for which there is o available traiig data Problem ID Traiig dataset Test dataset Number of Cases P0159 19 15 P0898 4 6 P0932-2 P1737 2 2 P2584 53 26 P2651 13 13 P3600 17 20 P6559 3 1 P6880-15 P7547 6 4 P7695 17 37 P7940 1 - P9766 14 12 P9965 2 5 P9975 13 16 Total 164 174

Pr P j C k Pr C k P j Pr[ C k Pr[ P ] j ] Bayes rule

P 1 C k P 2 P P C 3 P arg max log Pr Pr C P Pr log P PrP P M k jj j1... MMM j k k k jj j j

C C C C P Pr E E... E P Pr k j 1 2 k k k Nk j C C C E P Pr E P...Pr E P Pr j 1 2 k k k j Nk j Assumig idepedece amog evets

Cosider the case IDs as colums of a matrix NK N N K K K e e e e e e e e e e e e 2 1 3 32 31 2 22 21 1 12 11 Cosider the evet IDs as rows of a matrix The elemets e ij of the matrix is the umber of occurreces of evet i beig observed i case j

Agai case IDs are the colums of the matrix 0 0 1 0 1 0 0 0 0 1 0 0 With the rows beig the problem IDs Ad elemets 0 or 1

C k P 1 C log Pr[ Ck Pj ] log Pr E Pj P i1 2 P 3 P M N k k i

Cosider the problem IDs as colums of a matrix NM N N M M M 2 1 3 32 31 2 22 21 1 12 11 Cosider the evet IDs as rows of a matrix The elemets ij of the matrix is the umber of occurreces of evet i is observed i a case that is classified i problem j

Add a ε term whe ij = 0 to avoid zero probabilities E 1 E 2 P 1 P 2 P M p p p p 11 21 31 N1 p p p p 12 32 22 E PrE 3 i Pj E N N 2 N p p p i1 p 1M 2M ij 3M NM ij

P 1 P 2 P M p p... 1 Pr P j 2 N i1 M N j1 i1 pij M ij

Problem ID Number of Cases Top-1 Top-3 Top-5 P0159 19 15 16 18 P0898 4 4 4 4 P1737 2 1 1 1 P2584 53 49 53 53 P2651 13 13 13 13 P3600 17 15 17 17 P6559 3 1 2 2 P7547 6 2 5 5 P7695 17 15 17 17 P7940 1 1 1 1 P9766 14 14 14 14 P9965 2 2 2 2 P9975 13 10 11 12 Total 164 142 156 159 Overall (%) 86.59 95.12 96.95

Problem ID Number of Cases Top-1 Top-2 Top- 3 Top-4 Top-5 P0159 15 1 2 4 7 10 P0898 6 0 1 1 5 6 P1737 2 0 0 0 0 0 P2584 26 12 19 21 24 25 P2651 13 6 7 7 7 11 P3600 20 9 15 18 19 19 P6559 1 0 0 0 0 0 P7547 4 1 1 1 1 1 P7695 37 24 30 32 35 37 P9766 12 5 5 5 8 9 P9965 5 0 0 0 1 1 P9975 16 2 3 4 7 7 Total 157 60 83 93 114 126 Overall (%) 38.22 52.87 59.24 72.61 80.25

Gaussia Mixture Models (GMMs) for each problem ID traied o the case to evets parameters Diagoal covariace type No. of Gaussia mixtures 10 Best score: 18 Support Vector Machies (SVM) classifiers with features the case to evets parameters Radial Basis Kerel Best score: 35

A hybrid techique comprisig: (a) A recommeder for problem type ID (b) A classifier for uisace / problem Tried the proposed (a) approach with a SVM-based (b) approach but the gai i uisace rate detectio was balaced by the loss i problem ID idetificatio ad there was o time left for further improvemets

A Bayesia Approach for the Maiteace Actio Recommedatio 2013 PHM Data Challege was proposed A hybrid techique could achieve improved results The defiitio of evaluatio metrics for differet PHM problems is a key issue How about data challeges with multiple objectives?

Vassilis Katsouros, Vassilis Papavassiliou ad Christos Emmaouilidis ATHENA Research & Iovatio Cetre, Greece www.athea-iovatio.gr www.ceti.athea-iovatio.gr/compsys e-mail: christosem AT ieee.org