Heartbeats Arrhythmia Classification Using Probabilistic Multi-Class Support Vector Machines: A Comparative Study
M. Hendel, SIMPA Laboratory, Department of Computer Science, Faculty of Science, University of Science & Technology of Oran Mohammed Boudiaf (USTO-MB), Algeria. mounia_90@hotmail.com

A. Benyettou, SIMPA Laboratory, Department of Computer Science, Faculty of Science, University of Science & Technology of Oran Mohammed Boudiaf (USTO-MB), Algeria. a_benyettou@yahoo.fr

Abstract

Support vector machines were originally created to classify binary problems. Their extension to multi-class problems has been the subject of several studies. Usually, a multi-class classifier is obtained by combining several binary classifiers. In recent years, attention has focused on four main models of Multi-class Support Vector Machines (M-SVMs), which consider all classes simultaneously: the Weston and Watkins (WW) model [1], the Crammer and Singer (CS) model [2], the Lee, Lin and Wahba (LLW) model [3], and the quadratic-loss multi-class support vector machine (M-SVM²) model [4] introduced by Guermeur and Monfrini. This study develops a new method based on the four M-SVM models to classify seven different arrhythmias, plus normal ECG, obtained from the PhysioBank database [5]. The four models work separately; each outputs class posterior probability estimates. The results indicate that the models achieve an average accuracy between 95.8% and 98.42%, and that the generalization ability of M-SVM² is better than that of the other models. These results show that M-SVMs can be useful classifiers for the automatic detection of heart diseases.

Keywords: Probabilistic M-SVMs, Electrocardiogram (ECG), Discrete Wavelet Transform (DWT), MIT-BIH database.

1. Introduction

The electrocardiogram (ECG) is a bio-electric signal that records the electrical activities of the heart.
It is non-invasive and provides helpful and valuable information about the functional aspects of the heart and the cardiovascular system. However, in some situations, disease symptoms may not be present all the time, but may manifest at irregular intervals during the day. Therefore, for a correct diagnosis, the ECG signal must be studied over several hours, which implies the need for automatic processing of the ECG signal. This processing consists of two main procedures: feature extraction and classification. The study of such signals is difficult, however, because they are contaminated with artifacts (power-line interference, electrode movements, muscle movements, etc.), so a good classification system plays an important role in diagnostic performance. In this sense, several studies have been carried out in recent years: artificial neural networks (NNs) [6, 7], support vector machines (SVMs) [8, 9], mixtures of experts [10], etc. Among these classification methods, SVMs have given the best results. However, all the proposed systems are based on SVMs involving decomposition methods; M-SVMs have never been used in ECG signal classification problems. We therefore decided to use the four M-SVM models for the classification of seven arrhythmias: Premature Ventricular Contraction (PVC), Atrial Premature Beat (APB), Right Bundle Branch Block (RBBB), Left Bundle Branch Block (LBBB), Paced Beat (PB), Ventricular Flutter Wave (VFW) and Ventricular Escape Beat (VEB), in addition to normal (N) ECG. The four modules receive as input the statistical features of the Discrete Wavelet Transform (DWT) decomposition and output class posterior probability estimates. The obtained results are then compared (Fig. 1). The outline of this study is as follows. In section 2, the database used is described.
In section 3, we explain the construction of our parameter vectors and describe the four M-SVMs. In section 4, we display the results obtained on a set of real ECG signals taken from the MIT-BIH database, and finally, in section 5, we present some conclusions.

Figure 1: Diagram of the proposed classification method (heart-beat acquisition of 360 samples centred on the R peak; feature extraction by DWT evaluation, statistical features plus RR intervals and normalization, giving 17 parameters; parallel classification by the WW, CS, LLW and M-SVM² models; comparison of the resulting arrhythmia decisions).
2. Description of the Database

The data for this study were obtained from the MIT-BIH Arrhythmia database, which has been used in a number of studies. It contains records obtained from 48 subjects. Each record is sampled at 360 Hz, is approximately 30 min in length, and has a corresponding annotation file created by qualified cardiologists that identifies the category of each beat. We used 24 ECG records of this database, and we considered eight ECG types, including the normal beat. The origins of the ECG beats are summarized in Table 1.

Table 1: Origins and numbers of ECG samples used in the study

Type | Records
N    | 103, 113, 115, 123, 220, … (6 records)
LBBB | 109, 111, 207, … (4 records)
RBBB | 118, 124, 212, … (4 records)
PVC  | … (2 records)
APB  | …
PB   | 107, … (2 records)
VFW  | …
VEB  | …

3. Methodology

A. Estimated parameters using the DWT

Several feature-extraction methods have been proposed in this framework, including time-domain [11, 12], frequency-domain [13, 14], statistical [10] and time-frequency-domain [7] methods.

i. DWT evaluation

The ECG is a bio-electric signal that records the electrical activities of the heart in time and frequency. Therefore, the study of ECG signals requires methods able to provide information in both the time and frequency domains. The DWT is selected because it can represent non-stationary signals in these two domains. First, each cardiac beat is centred on the R peak and sampled at 360 Hz, so that all beat waves are included and the morphology of the beat is preserved. Then, we apply five levels of decomposition to each cardiac beat. Finally, we select the three last details (D3-D5) and the last approximation (A5), because the ECG signal lies between 0.5 Hz and 40 Hz and the energy of the wavelet coefficients is concentrated mostly in the lower sub-bands.
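As an illustration of this decomposition step, the sketch below applies a five-level wavelet decomposition to a 360-sample beat and keeps the D3-D5 details and the A5 approximation. The paper does not state which mother wavelet was used, so a Haar filter is assumed here purely for illustration, and the function names are ours.

```python
import numpy as np

def haar_dwt_step(x):
    """One level of the Haar DWT: returns (approximation, detail)."""
    x = x[: len(x) // 2 * 2]              # drop a trailing sample if the length is odd
    a = (x[0::2] + x[1::2]) / np.sqrt(2)  # low-pass output (approximation)
    d = (x[0::2] - x[1::2]) / np.sqrt(2)  # high-pass output (detail)
    return a, d

def wavelet_subbands(beat, levels=5):
    """Decompose a beat into details D1..D5 and approximation A5,
    then keep only D3, D4, D5 and A5 as described in the text."""
    details = {}
    a = np.asarray(beat, dtype=float)
    for lev in range(1, levels + 1):
        a, d = haar_dwt_step(a)
        details["D%d" % lev] = d
    return {"D3": details["D3"], "D4": details["D4"],
            "D5": details["D5"], "A5": a}

# A 360-sample beat centred on the R peak (synthetic here).
beat = np.sin(2 * np.pi * 7 * np.arange(360) / 360)
bands = wavelet_subbands(beat)
print({k: len(v) for k, v in bands.items()})  # sub-band lengths after 5 levels
```

Each level halves the length, so the retained sub-bands of a 360-sample beat have 45, 22, 11 and 11 coefficients respectively.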
ii. Statistical feature extraction

From the sub-bands (D3, D4, D5 and A5) and the original signal, we extracted three statistical features [7] to characterize each heart beat: 1) the signal variance, 2) the variance of the autocorrelation function of the signal, and 3) the relative amplitude. Some ECG arrhythmias, like APB and PVC, have shorter RR intervals than other types of ECG signals. Therefore, we added the RR intervals to our vector as another important feature for characterizing an ECG beat. An RR interval is defined as the time duration between two adjacent R peaks (R_i and R_{i+1}). Thus, each beat is characterized by 17 parameters. Finally, in order to bring all descriptors to the same scale, a normalization process is applied:

X'_{ij} = (X_{ij} - X̄_j) / σ_{X_j}    (1)

B. Classification using probabilistic M-SVMs

SVMs were originally designed to solve binary classification problems. Different extensions of SVMs to multi-class classification problems have since been proposed; they include direct and indirect methods. The indirect approach is mainly based on the fusion of bi-class SVMs and covers three main methods: one-against-one, one-against-all and Error-Correcting Output Codes. The indirect approach, however, treats the problem only partially; in contrast, the direct approach considers all classes simultaneously and treats the learning problem of the SVMs as a global optimization problem. There are four main direct models: WW, CS, LLW and M-SVM², and in 2012 Guermeur developed a generic model of M-SVMs which includes all four. The following gives the mathematical details of the generic model (for more detail, see [15]).

Definition 1 (Generic model of M-SVM, Definition 4 in [15]). Like all SVMs, the M-SVMs belong to the family of kernel machines [16, 17], which implies that the functional class on which they operate is induced by a positive type function/kernel [18].
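A minimal sketch of the feature-vector construction described above: three statistical features per sub-band and per original signal (3 × 5 = 15), plus the two RR intervals, followed by the z-score normalization of equation (1). The exact definition of the "relative amplitude" is not spelled out in the text, so the max-over-sum ratio used below is an assumption, and all names are illustrative.

```python
import numpy as np

def beat_features(subbands, signal, rr_pre, rr_post):
    """Build the 17-dimensional descriptor of one beat:
    3 statistical features for each of the 4 sub-bands and the
    original signal, plus the pre- and post-RR intervals."""
    feats = []
    for x in list(subbands) + [signal]:
        x = np.asarray(x, dtype=float)
        ac = np.correlate(x, x, mode="full")   # autocorrelation function of the signal
        feats.append(np.var(x))                # 1) signal variance
        feats.append(np.var(ac))               # 2) variance of the autocorrelation
        # 3) "relative amplitude" -- assumed here to be max / sum of |x|
        feats.append(np.max(np.abs(x)) / (np.sum(np.abs(x)) + 1e-12))
    feats.extend([rr_pre, rr_post])            # RR intervals (in seconds)
    return np.array(feats)

def zscore(X):
    """Column-wise standardization: X'_ij = (X_ij - mean_j) / std_j."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0)

v = beat_features([np.arange(8.0)] * 4, np.arange(16.0), 0.8, 0.75)
print(v.shape)  # (17,)
```

In practice the normalization statistics would be computed on the training folds only and reused on the test fold.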
Let X (a non-empty set) be the description space, let the set of categories be of cardinality C with 3 ≤ C < +∞, let κ be a real-valued positive type function on X², and let (H_k, ⟨·,·⟩_{H_k}) be the RKHS with kernel κ. For m ∈ ℕ*, let z_m = ((x_i, y_i))_{1 ≤ i ≤ m} ∈ (X × [1, C])^m, y_m = (y_i)_{1 ≤ i ≤ m}, and ξ ∈ ℝ^{Cm}. A C-category M-SVM with kernel κ and "training set" z_m is a large-margin discriminant model trained by solving a convex quadratic programming problem of the following form:
Problem 1 (Learning problem of an M-SVM, primal formulation)

min_{h,ξ}  λ Σ_{k=1}^{C} ||h_k||²_{H_k} + ||Mξ||_p^p

subject to

∀i ∈ [1, m], ∀k ∈ [1, C] \ {y_i}:  K₁ h_{y_i}(x_i) − h_k(x_i) ≥ K₂ − ξ_{(i−1)C+k}
∀i ∈ [1, m], ∀(k, l) ∈ ([1, C] \ {y_i})²:  K₃ (ξ_{(i−1)C+k} − ξ_{(i−1)C+l}) = 0
∀i ∈ [1, m], ∀k ∈ [1, C] \ {y_i}:  (2 − p) ξ_{(i−1)C+k} ≥ 0
(1 − K₁) Σ_{k=1}^{C} h_k = 0

where λ ∈ ℝ₊*, (K₁, K₃) ∈ {0, 1}², K₂ ∈ ℝ₊, M ∈ M_{Cm,Cm}(ℝ) is a matrix of rank (C − 1)m and p ∈ {1, 2}; if p = 1, then M is a diagonal matrix.

To characterize the four M-SVM models (WW, CS, LLW, M-SVM²) as instances of the generic model, we use the hyper-parameter values given in Table 2, where I_{Cm}(z_m) and M² are the matrices of M_{Cm,Cm}(ℝ) whose general terms are respectively

m_{ik,jl} = δ_{i,j} δ_{k,l} (1 − δ_{y_i,k})    (2)

and

m_{ik,jl} = (1 − δ_{y_i,k})(1 − δ_{y_j,l})(δ_{k,l} + 1/(C − 1)) δ_{i,j},    (3)

where δ is the Kronecker symbol.

Table 2: Hyper-parameters of the four M-SVMs

M-SVM  | M               | p | K₁ | K₂        | K₃
WW     | I_{Cm}(z_m)     | 1 | 1  | 1         | 0
CS     | C · I_{Cm}(z_m) | 1 | 1  | 1         | 1
LLW    | I_{Cm}(z_m)     | 1 | 0  | 1/(C − 1) | 0
M-SVM² | M²              | 2 | 0  | 1/(C − 1) | 0

In our experiments we used the Multi-class Support Vector Machine package MSVMpack [19], which allowed us to implement the four M-SVM models (WW, CS, LLW, M-SVM²). We tested the three types of kernel available in the package (linear, polynomial and Gaussian RBF) and kept the Gaussian RBF kernel, which gave the best results.

i. Post-processing the outputs of the M-SVMs

To derive class posterior probability estimates from the outputs of the M-SVMs, we extend Platt's bi-class solution [20] to the multi-class case. This corresponds to applying the softmax function to the outputs:

∀k ∈ [1, C]:  p_k = exp(h_k(x)) / Σ_{l=1}^{C} exp(h_l(x))    (4)

4. Results and Discussion

The performance of our system is determined using four-fold cross-validation. In each cross-validation fold, 1/4 of the beats was chosen randomly as test data and the remaining beats constituted the training data. Table 3 shows the results obtained, in terms of testing accuracy, with the WW, CS, LLW and M-SVM² models.
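The softmax post-processing described above can be sketched in a few lines (names are illustrative; a numerically stable shifted form is used):

```python
import numpy as np

def posteriors(h):
    """Softmax over the C outputs h_k(x) of an M-SVM,
    yielding class posterior probability estimates."""
    h = np.asarray(h, dtype=float)
    e = np.exp(h - h.max())   # subtracting the max avoids overflow
    return e / e.sum()

# Eight hypothetical M-SVM outputs, one per class.
p = posteriors([2.1, -0.3, 0.4, 1.0, -1.2, 0.0, -0.5, 0.7])
print(p)  # the entries are positive and sum to 1
```

Shifting by the maximum output leaves the softmax values unchanged while keeping the exponentials in a safe numeric range.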
We consider that a beat is correctly classified by a model only if the probability of a class is higher than 0.75; otherwise the beat is assigned to the reject class (it is better to say "I don't know" than to make a false diagnosis), since our main objective is to minimize the risk of error as much as possible. From these classification rates, it can be observed that the average classification accuracy of all models exceeds 95.8%, and that the best average classification accuracy was obtained with the M-SVM² classifier. One can also see that the testing accuracy for VFW and VEB is low in comparison with the other six arrhythmias. This is explained by the fact that the number of examples of the other categories is much higher. On the other hand, a good classification system should have both low false negatives (FN) and low false positives (FP), so to examine more precisely the effect of M-SVMs in ECG beat classification, we add them as further performance indices of our proposed method. The numbers of misclassified beats, represented as FP and FN, are given in Table 4. The number of misclassified beats lies between 56 and 143 for FP and between 56 and 142 for FN. These values represent respectively 0.23%-0.60% and 0.23%-0.59% of the total number of beats (23776), which shows that the numbers of FP and FN are very low. These observations confirm that our classifiers generalize well and that the misclassification rate is minimized.
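The decision rule with the 0.75 rejection threshold can be written as follows (class labels and function names are illustrative):

```python
import numpy as np

CLASSES = ["N", "LBBB", "RBBB", "PVC", "APB", "PB", "VFW", "VEB"]

def decide(prob, threshold=0.75):
    """Assign the beat to the most probable class only if its posterior
    exceeds the threshold; otherwise reject ("I don't know")."""
    prob = np.asarray(prob, dtype=float)
    k = int(prob.argmax())
    return CLASSES[k] if prob[k] > threshold else "reject"

print(decide([0.9, 0.02, 0.02, 0.02, 0.01, 0.01, 0.01, 0.01]))  # -> N
print(decide([0.4, 0.3, 0.3, 0, 0, 0, 0, 0]))                   # -> reject
```

Raising the threshold trades coverage for reliability: more beats are rejected, but the beats that are classified are classified with higher confidence.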
Table 3: Classification results of the proposed classifiers (%) (per-class testing accuracy for N, LBBB, RBBB, PVC, APB, PB, VFW and VEB, plus average accuracy and reject rate, for each of the WW, CS, LLW and M-SVM² models)

Table 4: False positives and false negatives of the proposed classifiers (number of beats) (per-class FP and FN counts and their totals for the WW, CS, LLW and M-SVM² models)

It is also interesting to compare our approach with the methods summarized in Table 5. It can be seen that our results are comparable with those presented in the table. Our results are slightly lower, which can be explained by the fact that we consider more classes and more examples. We also wanted to associate a degree of confidence with the outputs of our classifiers, so we used probabilistic models and considered that a beat is correctly classified only if the probability of a class is higher than 0.75. It is more informative to take a decision knowing that there is, say, a 90% chance that the patient has an arrhythmia than having only the indication "arrhythmia".

Remark: We note that we were strict in deciding the winning class. We also tested the four models taking as winning class the one that maximizes the function h_k; the results thus obtained were better than those presented in Table 4, in particular for the LLW and M-SVM² models.

Table 5: Results of related works (method, number of ECG classes, number of ECG samples and accuracy (%) for ICA+PNN [6], ICA+SVM [9], DWT+PNN [7], PCA+LS-SVM [8] and the four proposed M-SVM classifiers)

5. Conclusion

In this study, we have proposed a new approach based on probabilistic M-SVMs to discriminate between seven different arrhythmias, plus normal ECG. The four main models of M-SVMs (WW, CS, LLW, M-SVM²) were implemented. They received as input the statistical features of the DWT decomposition and of the original signal. The classifiers were validated on ECG records taken from the MIT-BIH Arrhythmia database.
The results indicate that M-SVM² gives a better average accuracy than the other models. The results obtained are also comparable with those of previous systems proposed for this application. In future work, we would like to study the effect of fusing the four classifiers and to enhance the parameter-extraction phase, in order to maximize the reliability and recognition rate of the system.

References

[1] J. Weston and C. Watkins. Multi-class support vector machines, Technical Report CSD-TR-98-04, Royal Holloway, University of London, Department of Computer Science, 1998.

[2] K. Crammer and Y. Singer. On the algorithmic implementation of multiclass kernel-based vector
machines, Journal of Machine Learning Research, Vol. 2, pp. 265-292, 2001.

[3] Y. Lee, Y. Lin, and G. Wahba. Multicategory support vector machines: theory and application to the classification of microarray data and satellite radiance data, Journal of the American Statistical Association, Vol. 99, No. 465, pp. 67-81, 2004.

[4] Y. Guermeur and E. Monfrini. A quadratic loss multi-class SVM for which a radius-margin bound applies, INFORMATICA, Vol. 22, No. 1, pp. 73-96, 2011.

[5] PhysioBank: MIT-BIH Arrhythmia Database, http://www.physionet.org/.

[6] S-N. Yu and K-T. Chou. Integration of independent component analysis and neural networks for ECG beat classification, Expert Systems with Applications, Vol. 34, 2008.

[7] S-N. Yu and Y-H. Chen. Electrocardiogram beat classification based on wavelet transformation and probabilistic neural network, Pattern Recognition Letters, Vol. 28, pp. 1142-1150, 2007.

[8] R.J. Martis, U.R. Acharya, K.M. Mandana, A.K. Ray, and C. Chakraborty. Application of principal component analysis to ECG signals for automated diagnosis of cardiac health, Expert Systems with Applications, Vol. 39, 2012.

[9] S-N. Yu and K-T. Chou. Selection of significant independent components for ECG beat classification, Expert Systems with Applications, Vol. 36, 2009.

[10] S. Osowski and T.H. Linh. ECG beat recognition using fuzzy hybrid neural network, IEEE Transactions on Biomedical Engineering, Vol. 48, 2001.

[11] P. De Chazal. Automatic classification of heartbeats using ECG morphology and heartbeat interval features, IEEE Transactions on Biomedical Engineering, Vol. 51, No. 7, 2004.

[12] Y.H. Hu, S.H. Palreddy, and W. Tompkins. A patient-adaptable ECG beat classifier using a mixture of experts approach, IEEE Transactions on Biomedical Engineering, Vol. 44, 1997.

[13] L. Khadra, A. Al-Fahoum, and S. Binajjaj. A quantitative analysis approach for cardiac arrhythmia classification using higher order spectral techniques, IEEE Transactions on Biomedical Engineering, Vol. 52, No. 11, 2005.

[14] K. Minami, H. Nakajima, and T. Toyoshima.
Real-time discrimination of ventricular tachyarrhythmia with Fourier-transform neural network, IEEE Transactions on Biomedical Engineering, Vol. 46, pp. 179-185, 1999.

[15] Y. Guermeur. A generic model of multi-class support vector machine, International Journal of Intelligent Information and Database Systems, Vol. 6, No. 6, 2012.

[16] J. Shawe-Taylor and N. Cristianini. Kernel Methods for Pattern Analysis, Cambridge University Press, Cambridge, 2004.

[17] B. Schölkopf and A.J. Smola. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond, The MIT Press, Cambridge, MA, 2002.

[18] A. Berlinet and C. Thomas-Agnan. Reproducing Kernel Hilbert Spaces in Probability and Statistics, Kluwer Academic Publishers, Boston, 2004.

[19] F. Lauer and Y. Guermeur. MSVMpack: a multi-class support vector machine package, Journal of Machine Learning Research, Vol. 12, 2011.

[20] J.C. Platt. Probabilities for SV machines, in A.J. Smola, P.L. Bartlett, B. Schölkopf, and D. Schuurmans, editors, Advances in Large Margin Classifiers, The MIT Press, Cambridge, MA, 2000, chapter 5.
More informationA GENERAL FORMULATION FOR SUPPORT VECTOR MACHINES. Wei Chu, S. Sathiya Keerthi, Chong Jin Ong
A GENERAL FORMULATION FOR SUPPORT VECTOR MACHINES Wei Chu, S. Sathiya Keerthi, Chong Jin Ong Control Division, Department of Mechanical Engineering, National University of Singapore 0 Kent Ridge Crescent,
More informationSUPPORT VECTOR REGRESSION WITH A GENERALIZED QUADRATIC LOSS
SUPPORT VECTOR REGRESSION WITH A GENERALIZED QUADRATIC LOSS Filippo Portera and Alessandro Sperduti Dipartimento di Matematica Pura ed Applicata Universit a di Padova, Padova, Italy {portera,sperduti}@math.unipd.it
More informationSupport Vector Machines.
Support Vector Machines www.cs.wisc.edu/~dpage 1 Goals for the lecture you should understand the following concepts the margin slack variables the linear support vector machine nonlinear SVMs the kernel
More informationDEPARTMENT OF COMPUTER SCIENCE Autumn Semester MACHINE LEARNING AND ADAPTIVE INTELLIGENCE
Data Provided: None DEPARTMENT OF COMPUTER SCIENCE Autumn Semester 203 204 MACHINE LEARNING AND ADAPTIVE INTELLIGENCE 2 hours Answer THREE of the four questions. All questions carry equal weight. Figures
More informationPoS(CENet2017)018. Privacy Preserving SVM with Different Kernel Functions for Multi-Classification Datasets. Speaker 2
Privacy Preserving SVM with Different Kernel Functions for Multi-Classification Datasets 1 Shaanxi Normal University, Xi'an, China E-mail: lizekun@snnu.edu.cn Shuyu Li Shaanxi Normal University, Xi'an,
More informationEvaluation. Andrea Passerini Machine Learning. Evaluation
Andrea Passerini passerini@disi.unitn.it Machine Learning Basic concepts requires to define performance measures to be optimized Performance of learning algorithms cannot be evaluated on entire domain
More informationMargin Maximizing Loss Functions
Margin Maximizing Loss Functions Saharon Rosset, Ji Zhu and Trevor Hastie Department of Statistics Stanford University Stanford, CA, 94305 saharon, jzhu, hastie@stat.stanford.edu Abstract Margin maximizing
More informationA Tutorial on Support Vector Machine
A Tutorial on School of Computing National University of Singapore Contents Theory on Using with Other s Contents Transforming Theory on Using with Other s What is a classifier? A function that maps instances
More informationScale-Invariance of Support Vector Machines based on the Triangular Kernel. Abstract
Scale-Invariance of Support Vector Machines based on the Triangular Kernel François Fleuret Hichem Sahbi IMEDIA Research Group INRIA Domaine de Voluceau 78150 Le Chesnay, France Abstract This paper focuses
More informationReducing Multiclass to Binary: A Unifying Approach for Margin Classifiers
Reducing Multiclass to Binary: A Unifying Approach for Margin Classifiers Erin Allwein, Robert Schapire and Yoram Singer Journal of Machine Learning Research, 1:113-141, 000 CSE 54: Seminar on Learning
More informationNonparametric Bayesian Methods (Gaussian Processes)
[70240413 Statistical Machine Learning, Spring, 2015] Nonparametric Bayesian Methods (Gaussian Processes) Jun Zhu dcszj@mail.tsinghua.edu.cn http://bigml.cs.tsinghua.edu.cn/~jun State Key Lab of Intelligent
More information1 Training and Approximation of a Primal Multiclass Support Vector Machine
1 Training and Approximation of a Primal Multiclass Support Vector Machine Alexander Zien 1,2 and Fabio De Bona 1 and Cheng Soon Ong 1,2 1 Friedrich Miescher Lab., Max Planck Soc., Spemannstr. 39, Tübingen,
More informationKobe University Repository : Kernel
Kobe University Repository : Kernel タイトル Title 著者 Author(s) 掲載誌 巻号 ページ Citation 刊行日 Issue date 資源タイプ Resource Type 版区分 Resource Version 権利 Rights DOI JaLCDOI URL Analysis of support vector machines Abe,
More informationA Note on Extending Generalization Bounds for Binary Large-Margin Classifiers to Multiple Classes
A Note on Extending Generalization Bounds for Binary Large-Margin Classifiers to Multiple Classes Ürün Dogan 1 Tobias Glasmachers 2 and Christian Igel 3 1 Institut für Mathematik Universität Potsdam Germany
More informationKernel methods and the exponential family
Kernel methods and the exponential family Stephane Canu a Alex Smola b a 1-PSI-FRE CNRS 645, INSA de Rouen, France, St Etienne du Rouvray, France b Statistical Machine Learning Program, National ICT Australia
More informationFormulation with slack variables
Formulation with slack variables Optimal margin classifier with slack variables and kernel functions described by Support Vector Machine (SVM). min (w,ξ) ½ w 2 + γσξ(i) subject to ξ(i) 0 i, d(i) (w T x(i)
More informationECS289: Scalable Machine Learning
ECS289: Scalable Machine Learning Cho-Jui Hsieh UC Davis Oct 27, 2015 Outline One versus all/one versus one Ranking loss for multiclass/multilabel classification Scaling to millions of labels Multiclass
More informationEvaluation requires to define performance measures to be optimized
Evaluation Basic concepts Evaluation requires to define performance measures to be optimized Performance of learning algorithms cannot be evaluated on entire domain (generalization error) approximation
More informationSupport Vector Machine (SVM) and Kernel Methods
Support Vector Machine (SVM) and Kernel Methods CE-717: Machine Learning Sharif University of Technology Fall 2015 Soleymani Outline Margin concept Hard-Margin SVM Soft-Margin SVM Dual Problems of Hard-Margin
More informationCS534 Machine Learning - Spring Final Exam
CS534 Machine Learning - Spring 2013 Final Exam Name: You have 110 minutes. There are 6 questions (8 pages including cover page). If you get stuck on one question, move on to others and come back to the
More informationSupport Vector Machine & Its Applications
Support Vector Machine & Its Applications A portion (1/3) of the slides are taken from Prof. Andrew Moore s SVM tutorial at http://www.cs.cmu.edu/~awm/tutorials Mingyue Tan The University of British Columbia
More informationLecture 18: Multiclass Support Vector Machines
Fall, 2017 Outlines Overview of Multiclass Learning Traditional Methods for Multiclass Problems One-vs-rest approaches Pairwise approaches Recent development for Multiclass Problems Simultaneous Classification
More informationFACTORIZATION MACHINES AS A TOOL FOR HEALTHCARE CASE STUDY ON TYPE 2 DIABETES DETECTION
SunLab Enlighten the World FACTORIZATION MACHINES AS A TOOL FOR HEALTHCARE CASE STUDY ON TYPE 2 DIABETES DETECTION Ioakeim (Kimis) Perros and Jimeng Sun perros@gatech.edu, jsun@cc.gatech.edu COMPUTATIONAL
More informationGeneralization to a zero-data task: an empirical study
Generalization to a zero-data task: an empirical study Université de Montréal 20/03/2007 Introduction Introduction and motivation What is a zero-data task? task for which no training data are available
More informationLinear Dependency Between and the Input Noise in -Support Vector Regression
544 IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 14, NO. 3, MAY 2003 Linear Dependency Between the Input Noise in -Support Vector Regression James T. Kwok Ivor W. Tsang Abstract In using the -support vector
More informationMyoelectrical signal classification based on S transform and two-directional 2DPCA
Myoelectrical signal classification based on S transform and two-directional 2DPCA Hong-Bo Xie1 * and Hui Liu2 1 ARC Centre of Excellence for Mathematical and Statistical Frontiers Queensland University
More informationContent. Learning. Regression vs Classification. Regression a.k.a. function approximation and Classification a.k.a. pattern recognition
Content Andrew Kusiak Intelligent Systems Laboratory 239 Seamans Center The University of Iowa Iowa City, IA 52242-527 andrew-kusiak@uiowa.edu http://www.icaen.uiowa.edu/~ankusiak Introduction to learning
More informationA Novel Rejection Measurement in Handwritten Numeral Recognition Based on Linear Discriminant Analysis
009 0th International Conference on Document Analysis and Recognition A Novel Rejection easurement in Handwritten Numeral Recognition Based on Linear Discriminant Analysis Chun Lei He Louisa Lam Ching
More informationLecture 9: Large Margin Classifiers. Linear Support Vector Machines
Lecture 9: Large Margin Classifiers. Linear Support Vector Machines Perceptrons Definition Perceptron learning rule Convergence Margin & max margin classifiers (Linear) support vector machines Formulation
More informationSupport Vector Machine (continued)
Support Vector Machine continued) Overlapping class distribution: In practice the class-conditional distributions may overlap, so that the training data points are no longer linearly separable. We need
More informationLecture Support Vector Machine (SVM) Classifiers
Introduction to Machine Learning Lecturer: Amir Globerson Lecture 6 Fall Semester Scribe: Yishay Mansour 6.1 Support Vector Machine (SVM) Classifiers Classification is one of the most important tasks in
More informationSupport vector comparison machines
THE INSTITUTE OF ELECTRONICS, INFORMATION AND COMMUNICATION ENGINEERS TECHNICAL REPORT OF IEICE Abstract Support vector comparison machines Toby HOCKING, Supaporn SPANURATTANA, and Masashi SUGIYAMA Department
More informationCS-E4830 Kernel Methods in Machine Learning
CS-E4830 Kernel Methods in Machine Learning Lecture 5: Multi-class and preference learning Juho Rousu 11. October, 2017 Juho Rousu 11. October, 2017 1 / 37 Agenda from now on: This week s theme: going
More informationSupport Vector Machines for Classification: A Statistical Portrait
Support Vector Machines for Classification: A Statistical Portrait Yoonkyung Lee Department of Statistics The Ohio State University May 27, 2011 The Spring Conference of Korean Statistical Society KAIST,
More informationMath for Machine Learning Open Doors to Data Science and Artificial Intelligence. Richard Han
Math for Machine Learning Open Doors to Data Science and Artificial Intelligence Richard Han Copyright 05 Richard Han All rights reserved. CONTENTS PREFACE... - INTRODUCTION... LINEAR REGRESSION... 4 LINEAR
More informationSUPERVISED LEARNING: INTRODUCTION TO CLASSIFICATION
SUPERVISED LEARNING: INTRODUCTION TO CLASSIFICATION 1 Outline Basic terminology Features Training and validation Model selection Error and loss measures Statistical comparison Evaluation measures 2 Terminology
More informationReinforced Multicategory Support Vector Machines
Supplementary materials for this article are available online. PleaseclicktheJCGSlinkathttp://pubs.amstat.org. Reinforced Multicategory Support Vector Machines Yufeng LIU and Ming YUAN Support vector machines
More informationν =.1 a max. of 10% of training set can be margin errors ν =.8 a max. of 80% of training can be margin errors
p.1/1 ν-svms In the traditional softmargin classification SVM formulation we have a penalty constant C such that 1 C size of margin. Furthermore, there is no a priori guidance as to what C should be set
More informationSupport Vector Machine. Industrial AI Lab.
Support Vector Machine Industrial AI Lab. Classification (Linear) Autonomously figure out which category (or class) an unknown item should be categorized into Number of categories / classes Binary: 2 different
More informationMachine Learning Linear Classification. Prof. Matteo Matteucci
Machine Learning Linear Classification Prof. Matteo Matteucci Recall from the first lecture 2 X R p Regression Y R Continuous Output X R p Y {Ω 0, Ω 1,, Ω K } Classification Discrete Output X R p Y (X)
More informationKERNEL LOGISTIC REGRESSION-LINEAR FOR LEUKEMIA CLASSIFICATION USING HIGH DIMENSIONAL DATA
Rahayu, Kernel Logistic Regression-Linear for Leukemia Classification using High Dimensional Data KERNEL LOGISTIC REGRESSION-LINEAR FOR LEUKEMIA CLASSIFICATION USING HIGH DIMENSIONAL DATA S.P. Rahayu 1,2
More informationSupport Vector Machine. Industrial AI Lab. Prof. Seungchul Lee
Support Vector Machine Industrial AI Lab. Prof. Seungchul Lee Classification (Linear) Autonomously figure out which category (or class) an unknown item should be categorized into Number of categories /
More informationCS6375: Machine Learning Gautam Kunapuli. Support Vector Machines
Gautam Kunapuli Example: Text Categorization Example: Develop a model to classify news stories into various categories based on their content. sports politics Use the bag-of-words representation for this
More informationMachine learning for pervasive systems Classification in high-dimensional spaces
Machine learning for pervasive systems Classification in high-dimensional spaces Department of Communications and Networking Aalto University, School of Electrical Engineering stephan.sigg@aalto.fi Version
More information