Multiple Wavelet Coefficients Fusion in Deep Residual Networks for Fault Diagnosis


Minghang Zhao, Myeongsu Kang, Baoping Tang, Michael Pecht

Background
Accurate fault diagnosis is important for ensuring the safety of automobiles and helicopters, the long-term generation of electric power, and the reliable operation of other electrical and mechanical systems. The discrete wavelet packet transform (DWPT), an effective tool for decomposing non-stationary vibration signals into various frequency bands, has been widely applied to machine fault diagnosis [1]. In addition, deep learning methods are increasingly used to automatically learn discriminative features from vibration signals and thereby improve diagnostic accuracy [2].
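As a rough sketch of what the DWPT does, the following numpy example implements a wavelet packet decomposition with the simplest Haar (DB1) filter pair. The function names and signal are illustrative assumptions; a practical implementation would use a wavelet library with the longer DB2/DB3 filters. The defining property shown here is that, unlike the plain wavelet transform, both the approximation and the detail branches are split at every level, yielding 2^levels frequency bands.

```python
import numpy as np

def haar_step(x):
    """One level of the orthonormal Haar transform: split a signal into
    an approximation (low-pass) half and a detail (high-pass) half."""
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2.0)
    detail = (even - odd) / np.sqrt(2.0)
    return approx, detail

def haar_wpt(x, levels):
    """Wavelet packet decomposition: decompose BOTH branches at every
    level, producing 2**levels coefficient arrays (frequency bands)."""
    bands = [np.asarray(x, dtype=float)]
    for _ in range(levels):
        next_bands = []
        for band in bands:
            next_bands.extend(haar_step(band))
        bands = next_bands
    return bands

# Hypothetical stand-in for a vibration signal.
signal = np.sin(2 * np.pi * 5 * np.linspace(0, 1, 64, endpoint=False))
bands = haar_wpt(signal, levels=3)
print(len(bands), bands[0].shape)  # 8 bands, each of length 8
```

Because the Haar transform is orthonormal, the total signal energy is preserved across the bands, which is one reason wavelet coefficients make faithful inputs for a diagnostic network.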

Motivations
However, there is still no consensus as to which wavelet (e.g., DB1, DB2, or DB3) achieves optimal performance in fault diagnosis. Moreover, different wavelets may be optimal for recognizing different kinds of faults under different working conditions, and it is unlikely that any single wavelet is the most effective at recognizing all fault types (such as bearing inner raceway faults, outer raceway faults, and rolling element faults). Therefore, fusing multiple wavelets into deep neural networks has the potential to improve the accuracy of fault diagnostic tasks that involve recognizing various fault types.

Input Data Configuration
The wavelet coefficients at the various frequency bands obtained using a given wavelet can be stacked into a 2D matrix; the 2D matrices derived from multiple wavelets can then be combined into a 3D matrix.
[Figure: a signal is decomposed using various wavelets (the 1st through the Nw-th); for each wavelet, the coefficients W(1,0) and W(1,1) at the 1st decomposition level down to W(i,0), W(i,1), ..., W(i,2^i − 1) at the i-th level form a 2D matrix whose axes are frequency band and time.]
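The stacking described above can be sketched in a few lines of numpy; the shapes and the number of wavelets here are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
n_bands, band_len, n_wavelets = 8, 8, 3  # e.g. a level-3 DWPT and 3 candidate wavelets

# One 2D matrix per wavelet: rows = frequency bands, columns = time.
matrices_2d = [rng.standard_normal((n_bands, band_len)) for _ in range(n_wavelets)]

# Stacking along a new leading axis yields the 3D input matrix, analogous
# to the colour channels of an RGB image fed to a CNN.
matrix_3d = np.stack(matrices_2d, axis=0)
print(matrix_3d.shape)  # (3, 8, 8): (wavelet, frequency band, time)
```

Each wavelet thus plays the role of one input channel, which is what allows the standard convolutional machinery of the DRN to consume all wavelets at once.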

An Overview of Deep Residual Networks
The deep residual network (DRN) is an improved variant of the convolutional neural network (CNN) that uses identity shortcuts to ease the difficulty of training [3]-[4].
[Figure: a residual building unit (RBU) consisting of BN, ReLU, and Conv 3×3 applied twice, with an identity shortcut around the branch; and a deep residual network consisting of an input Conv 3×3 layer, a number of RBUs, then BN, ReLU, GAP, and a fully connected output layer. BN: batch normalization; ReLU: rectified linear unit; Conv 3×3: convolution with 3×3 kernels; GAP: global average pooling.]
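To make the pre-activation RBU concrete, here is a minimal single-channel numpy sketch. The weights are random, BN is reduced to inference-style normalization, and the loops are for clarity only; a real DRN would use a deep learning framework with multi-channel convolutions.

```python
import numpy as np

def bn(x, eps=1e-5):
    # Inference-style batch normalization over spatial positions.
    return (x - x.mean()) / np.sqrt(x.var() + eps)

def relu(x):
    return np.maximum(x, 0.0)

def conv3x3(x, w):
    # "Same"-padded 3x3 convolution for a single-channel 2D feature map.
    xp = np.pad(x, 1)
    out = np.zeros_like(x)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + 3, j:j + 3] * w)
    return out

def rbu(x, w1, w2):
    """Pre-activation residual building unit:
    output = x + Conv(ReLU(BN(Conv(ReLU(BN(x))))))."""
    f = conv3x3(relu(bn(x)), w1)
    f = conv3x3(relu(bn(f)), w2)
    return x + f  # identity shortcut

rng = np.random.default_rng(1)
x = rng.standard_normal((8, 8))
y = rbu(x, 0.1 * rng.standard_normal((3, 3)), 0.1 * rng.standard_normal((3, 3)))
print(y.shape)  # (8, 8): same shape as the input, so the identity shortcut is valid
```

Note that if the residual branch outputs zero, the unit reduces exactly to the identity mapping, which is what makes very deep stacks of RBUs easy to train.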

The First Developed Method
A simple way to fuse multiple wavelet coefficients is to concatenate the 2D matrices of wavelet coefficients and feed them into a DRN. This method is named Multiple Wavelet Coefficients Fusion in a Deep Residual Network by Concatenation (MWCF-DRN-C).
[Figure: a vibration signal is decomposed by DWPTs into 2D matrices 1 through N, which are combined in a concatenation layer and then passed through the RBUs (BN, ReLU, Conv), followed by BN, ReLU, GAP, (dropout), and a fully connected output layer; m denotes the number of convolutional kernels.]
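One way to see why concatenation suffices as a fusion mechanism: after channel-wise concatenation, every kernel in the first convolutional layer spans all wavelet channels, so each output feature map is already a learned mixture of all wavelets. A minimal numpy sketch with a hypothetical 1×1 kernel (the mixing weights are made up for illustration; in the DRN they are learned):

```python
import numpy as np

rng = np.random.default_rng(2)
# Concatenated input: (wavelet channel, frequency band, time).
mats = np.stack([rng.standard_normal((8, 8)) for _ in range(3)])

# A 1x1 convolutional kernel over the concatenated channels reduces to a
# per-channel weighted sum over the wavelet axis.
mix = np.array([0.5, 0.3, 0.2])                 # hypothetical mixing weights
feature_map = np.tensordot(mix, mats, axes=1)   # weighted sum across wavelets
print(feature_map.shape)  # (8, 8)
```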

The Second Developed Method
An individual convolutional layer with trainable parameters is applied to each 2D matrix of wavelet coefficients, so that the important wavelet coefficients are converted into large feature values. A maximization layer then takes the element-wise maximum of these features as its output [5]. This method is named Multiple Wavelet Coefficients Fusion in a Deep Residual Network by Maximization (MWCF-DRN-M).
[Figure: a vibration signal is decomposed by DWPTs into 2D matrices 1 through N; each matrix passes through its own convolutional layer, the results are fused in a maximization layer, and the fused features flow through the RBUs (BN, ReLU, Conv), then BN, ReLU, GAP, (dropout), and a fully connected output layer.]
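The maximization layer itself is simply an element-wise maximum across the per-wavelet feature maps, in the spirit of the competitive multiple-filter module of [5]. A numpy sketch with random stand-in features:

```python
import numpy as np

rng = np.random.default_rng(3)
# Stand-ins for the outputs of the per-wavelet convolutional layers.
features = [rng.standard_normal((8, 8)) for _ in range(3)]

# Element-wise maximum across the branches: at every position, only the
# strongest response among the wavelets is passed on to the DRN.
fused = np.maximum.reduce(features)
print(fused.shape)  # (8, 8)
```

Because only one branch "wins" at each position, each per-wavelet convolutional layer is pushed during training to produce large values exactly where its wavelet is informative.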

Explanations on the Second Developed Method
The 2D matrices of wavelet coefficients are different representations of the same vibration signal, so they inevitably contain a great deal of redundant and repetitive information.
[Figure: the MWCF-DRN-M architecture as above, with the DWPT outputs annotated as containing much redundancy.]

Explanations on the Second Developed Method
The maximization layer, together with the convolutional layers before it, can be interpreted as a trainable feature selection process that passes the important features on to subsequent layers while the relatively unimportant features are discarded.
[Figure: the MWCF-DRN-M architecture as above, with the convolutional and maximization layers annotated as trainable feature selection.]

Experimental Setup
A drivetrain dynamics simulator [6] was used to simulate the faults. Experiments were conducted under a 10-fold cross-validation scheme, and comparisons were made with a conventional CNN and DRN to demonstrate the efficacy of the developed MWCF-DRN-C and MWCF-DRN-M.

Results
[Figure: experimental results not included in the transcription.]

Conclusions
The fusion of multiple wavelet coefficients in deep neural networks can improve fault diagnostic performance. In the experiments, MWCF-DRN-M performed slightly better than MWCF-DRN-C, yielding a 0.80% improvement in overall average testing accuracy.

References
1. R. Yan, R. X. Gao, and X. Chen, "Wavelets for fault diagnosis of rotary machines: A review with applications," Signal Process., vol. 96, pp. 1–15, 2014.
2. M. Zhao, M. Kang, B. Tang, and M. Pecht, "Deep residual networks with dynamically weighted wavelet coefficients for fault diagnosis of planetary gearboxes," IEEE Trans. Ind. Electron., vol. 65, no. 5, pp. 4290–4300, 2018.
3. K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proc. IEEE Conf. Comput. Vision Pattern Recognit., Las Vegas, NV, USA, Jun. 27–30, 2016, pp. 770–778.
4. K. He, X. Zhang, S. Ren, and J. Sun, "Identity mappings in deep residual networks," in Computer Vision – ECCV 2016 (Lecture Notes in Computer Science 9908), B. Leibe, J. Matas, N. Sebe, and M. Welling, Eds. Cham, Switzerland: Springer, 2016, pp. 630–645.
5. Z. Liao and G. Carneiro, "A deep convolutional neural network module that promotes competition of multiple-size filters," Pattern Recognit., vol. 71, pp. 94–105, 2017.
6. Drivetrain Diagnostics Simulator, SpectraQuest, Richmond, VA, USA. [Online]. Available: http://spectraquest.com/drivetrains/details/dds/