Keywords: Multimode process monitoring, Joint probability, Weighted probabilistic PCA, Coefficient of variation.

2016 International Conference on Artificial Intelligence: Techniques and Applications (AITA 2016)
ISBN: 978-1-60595-389-2

Joint Probability Density and Weighted Probabilistic PCA Based on Coefficient of Variation for Multimode Process Monitoring

Tian-xian ZHU, Jian HUANG and Xue-feng YAN*

Key Laboratory of Advanced Control and Optimization for Chemical Processes of Ministry of Education, East China University of Science and Technology, Shanghai 200237, P. R. China

*Corresponding author

Abstract. For probabilistic monitoring of multimode processes, this paper introduces a monitoring scheme that integrates joint probability density with weighted probabilistic principal component analysis based on the coefficient of variation (CV-WPPCA). A joint probability based on the $T^2$ statistic is constructed for mode identification. The new approach first concentrates the maximum fault-relevant information into the dominant subspace by identifying and extracting important noise factors from the residual subspace, and then applies a weighting strategy based on the coefficient of variation to highlight the useful information in the reconstructed dominant subspace. A case study on the Tennessee Eastman process demonstrates the efficiency of the proposed method.

Introduction

Multivariate statistical process monitoring (MSPM) has received much attention from academia and industry [1, 2]. As the most widely used MSPM method, principal component analysis (PCA) [3, 4] assumes that process variables are noiseless, deterministic, and operated under a single mode. However, most data obtained from complex processes behave randomly because they are contaminated by random noise, and they may come from different operating modes.
In consideration of the randomness of process variables, probabilistic PCA (PPCA) [5] has been developed to define an appropriate probabilistic model for traditional PCA. It integrates noise information into the generative model, which makes the description of process data more accurate. However, no explicit mapping exists between fault information and any particular probabilistic principal components (PPCs), and useful information might be scattered across different subspaces when a fault occurs. Ge and Song [6] analyzed fault samples and presented a novel monitoring performance-driven IC selection method. Moreover, if all the selected PPCs are used to construct the $T^2$ statistic with equal importance, a large amount of useless information might mask the fault-relevant information, which degrades the fault detection results. Therefore, many weighting strategies [7-9] have been proposed to solve this problem. Jiang and Yan [10] introduced a double-weighted strategy into ICA process monitoring (DWICA) to improve the detection performance.

In this article, weighted PPCA based on the coefficient of variation method (CV-WPPCA) is introduced to deal with the above-mentioned shortcomings. CV-WPPCA uses normal process data to construct a conventional PPCA model. The monitoring space is divided into a dominant subspace and a residual subspace, and the $T^2$ and SPE statistics are constructed to monitor the two subspaces, respectively. Given that all PPCs and noise factors are mutually uncorrelated, PPCA can scale the variation directly along the direction of each PPC in the dominant subspace and along the direction of each noise factor in the residual subspace. Therefore, given the compatibility of the two subspaces under the PPCA model, it is reasonable to identify the fault-relevant noise factors in the residual subspace and integrate them into the dominant subspace. This strategy concentrates as much useful information as possible into a specific subspace for further analysis.

Because of differing requirements on product quality and quantity, modern industrial processes usually have multiple operating modes. Hence, monitoring methods based on a single-mode assumption may not apply to multimode processes, and constructing a multimode model for integration into an online monitoring scheme has become a new research focus. Many related studies have been reported. Chen and Liu [11] developed a mixture PCA model method for multimode fault detection. Ge and Song [12] employed a joint probability scheme for mode identification when using the PCA-ICA model. This study aims to improve the monitoring performance under different operating modes based on the CV-WPPCA method.

The remainder of this article is organized as follows. The conventional PPCA method is briefly reviewed in section 2, followed by a detailed presentation of CV-WPPCA for multimode process monitoring. The proposed method is tested in a benchmark case study of the Tennessee Eastman (TE) process in section 4, and conclusions are presented in the last section.

Probabilistic Principal Component Analysis

The generative model of PPCA can be expressed as

$$x = Pt + e, \tag{1}$$

where the observed variable $x \in \mathbb{R}^m$ is regarded as a linear combination of the latent variable $t \in \mathbb{R}^k$ and the noise variable $e \in \mathbb{R}^m$, $P \in \mathbb{R}^{m \times k}$ is the loading matrix, and $k < m$ is the number of PPCs retained. The prior distribution of the latent variable is assumed to be a standard Gaussian distribution, and the noise variable follows a Gaussian distribution with zero mean and variance $\sigma^2 I$, namely, $p(t) = N(t \mid 0, I)$ and $p(e) = N(e \mid 0, \sigma^2 I)$. Thus, the marginal distribution of $x$ is

$$p(x) = N(x \mid 0, PP^T + \sigma^2 I). \tag{2}$$

For a given set of observations $(x_1, x_2, \ldots, x_n)$, the expectation maximization (EM) algorithm is used to determine the parameter set $\{P, \sigma^2\}$ by maximizing the following likelihood function:

$$L(P, \sigma^2) = \ln \prod_{i=1}^{n} p(x_i \mid P, \sigma^2). \tag{3}$$

Once the parameter set of PPCA has been calculated, the corresponding monitoring scheme can be constructed.

CV-WPPCA for Multimode Process Monitoring

The $T_j^2(i)$ statistic, known as the $T^2$ statistic in the direction of the jth PPC, is defined as $T_j^2(i) = t_j(i)\,\lambda_j^{-1}\,t_j(i)$ and is used to identify the mode to which the current sample belongs. First, the traditional PPCA model is constructed in each mode. Then, the $T_j^2(i)$ statistic in each mode is converted into a mode probability for mode identification, which can be calculated as

$$p(t_j \in M_c) = e^{-T_{j,c}^2}, \tag{4}$$

where $M_c$ is the cth operation mode, and $T_{j,c}^2$ is the $T_j^2$ statistic on the cth mode. Because all PPCs contained in the dominant subspace are mutually uncorrelated, the probability of the event $x \in M_c$ can be obtained as follows:

$$p(x \in M_c) = p(t_1 \in M_c)\,p(t_2 \in M_c) \cdots p(t_{k_1} \in M_c) = e^{-(T_{1,c}^2 + T_{2,c}^2 + \cdots + T_{k_1,c}^2)}. \tag{5}$$
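The mode identification rule in Eqs. (4) and (5) can be sketched in a few lines of Python. The function names and the input layout (one list of per-PPC $T^2$ values for each mode model) are assumptions made for illustration, and the negative sign in the exponent is assumed so that a smaller $T^2$ gives a larger probability. The sketch compares log-probabilities, which selects the same mode as Eq. (5) while avoiding numerical underflow for large statistics.

```python
import math

def joint_log_probability(t2_values):
    """Log of Eq. (5): ln p(x in M_c) = -(T_1^2 + ... + T_k1^2)."""
    return -sum(t2_values)

def identify_mode(t2_by_mode):
    """Return (index of the most probable mode, joint probability per mode).

    t2_by_mode[c] holds the per-PPC T^2 values of the current sample
    under the PPCA model of mode c (hypothetical input layout).
    """
    log_probs = [joint_log_probability(t2) for t2 in t2_by_mode]
    best = max(range(len(log_probs)), key=log_probs.__getitem__)
    probs = [math.exp(lp) for lp in log_probs]
    return best, probs

# Illustrative numbers only (not taken from the paper): one sample scored
# against three mode models, each with three PPCs.
t2_by_mode = [
    [4.0, 2.5, 6.0],   # scores under the mode 1 model
    [0.3, 0.8, 0.4],   # scores under the mode 2 model
    [9.0, 1.0, 3.5],   # scores under the mode 3 model
]
mode_index, probs = identify_mode(t2_by_mode)
print(mode_index, probs)
```

Here `mode_index` is zero-based, so a caller that numbers modes from 1 would add one before reporting it.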

The joint probability density indicates whether a sample point conforms to a given mode. The mode of the current sample is taken as the mode whose joint probability is highest. Once the mode has been identified, the process is monitored under that mode.

When a PPCA model is constructed, the maximum fault-relevant information is supposed to be assembled into the dominant subspace. In practice, some fault-relevant information that cannot be ignored may be dispersed in the residual subspace. $SPE_r(i)$ is defined to measure the variation along the direction of each noise factor, where $r = 1, 2, \ldots, m$ indexes the noise factors. The $SPE_r(i)$ statistic can be constructed as

$$SPE_r(i) = e_r(i)\,(\sigma^{-2})\,e_r(i). \tag{6}$$

The control limit of this statistic can be calculated from the $\chi^2$ distribution with one degree of freedom. If the value of the $SPE_r(i)$ statistic exceeds the control limit during online monitoring, the rth direction shows significant variation when a fault occurs, and the corresponding noise factor is extracted into the dominant subspace. This strategy concentrates as much fault-relevant information as possible into the dominant subspace.

The idea of WPPCA is to estimate the importance of each new PPC online and to assign a different weight to each new PPC so as to highlight the useful information. We define a weighting matrix $W = \mathrm{diag}(w_1, w_2, \ldots, w_K)$, where $K$ is the number of new PPCs contained in the reconstructed dominant subspace. The weighted statistic $GT^2$ can be calculated as

$$GT^2(i) = \sum_{j=1}^{k_1} w_j T_j^2(i) + \sum_{r=1}^{k_2} w_{k_1+r}\,SPE_r(i), \tag{7}$$

where $k_1$ is the number of PPCs and $k_2$ is the number of selected noise factors. As an objective weight assignment method, the coefficient of variation is used in this work to determine the weighting matrix $W$. Because it is dimensionless, it eliminates the influence of new PPCs having different dimensions, and it measures the degree of variation of each new PPC.
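The selection and weighting steps around Eqs. (6) and (7) can be illustrated with a minimal Python sketch. Several choices here are assumptions rather than details given in the text: the coefficient of variation is computed with its usual definition (standard deviation divided by mean) over a window of recent statistic values, the population standard deviation is used, and the weights are normalized to sum to one. The one-degree-of-freedom $\chi^2$ limit is obtained from the standard normal quantile, since the square of a standard normal variable follows that distribution.

```python
from statistics import NormalDist, mean, pstdev

def chi2_limit_1dof(alpha):
    """alpha-quantile of the chi-square distribution with one degree of freedom.

    Uses chi2_1 = Z^2 for standard normal Z, so the quantile equals the
    squared normal quantile at (1 + alpha) / 2.
    """
    z = NormalDist().inv_cdf((1.0 + alpha) / 2.0)
    return z * z

def select_noise_factors(spe_values, alpha=0.99):
    """Indices r whose SPE_r (Eq. 6) exceeds the one-dof chi-square limit."""
    limit = chi2_limit_1dof(alpha)
    return [r for r, value in enumerate(spe_values) if value > limit]

def cv_weights(window):
    """Weights from the coefficient of variation of each statistic.

    window is a list of rows, one per recent sample, each holding the K
    statistic values (hypothetical layout). Column means must be nonzero.
    """
    columns = list(zip(*window))
    cvs = [pstdev(col) / mean(col) for col in columns]
    total = sum(cvs)
    return [v / total for v in cvs]

def weighted_statistic(stats, weights):
    """Eq. (7): weighted sum over the K = k1 + k2 statistics of one sample."""
    if len(stats) != len(weights):
        raise ValueError("need exactly one weight per statistic")
    return sum(w * s for w, s in zip(weights, stats))

# Illustrative numbers only (not taken from the paper).
spe_values = [0.25, 20.25, 0.0625, 1.0]      # SPE_r(i) for four noise factors
selected = select_noise_factors(spe_values)  # factors moved to the dominant subspace
window = [                                   # recent values of K = 3 statistics
    [1.0, 2.0, 1.0],
    [1.0, 4.0, 3.0],
    [1.0, 6.0, 5.0],
]
weights = cv_weights(window)
gt2 = weighted_statistic(window[-1], weights)
print(selected, weights, gt2)
```

In this toy window the first statistic never changes, so its coefficient of variation and weight are zero, which shows how the weighting suppresses directions that carry no variation.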
Given the new statistic matrix $CONT^2(i) = [T_j^2(i), SPE_r(i)]$, the coefficient of variation and the weight of each new PPC can be obtained through Eq. (8) and Eq. (9):

$$V_k = \frac{\sigma_k}{\bar{x}_k}, \tag{8}$$

$$w_k = \frac{V_k}{\sum_{k=1}^{K} V_k}, \tag{9}$$

where $\sigma_k$ and $\bar{x}_k$ denote the standard deviation and mean value of the kth new PPC, respectively. Notably, the $GT^2$ statistic no longer follows a particular distribution, so kernel density estimation (KDE) [13] is introduced to determine the new confidence limit.

Benchmark Case Study of TE Process

The TE process introduced by Downs and Vogel [14] is regarded as a benchmark for the simulation of chemical production processes and has been widely used to evaluate the performance of monitoring methods. In this study, 31 variables are selected for monitoring purposes. Four typical cases, listed in Table 1, are used to evaluate the multimode monitoring performance of CV-WPPCA. The number of PPCs is determined by the variance explanation ratio [15], and the confidence level α is set to 0.99.

In Case 1, the normal operating conditions of the different modes are used to evaluate mode identification and to determine whether the proposed method is effective for fault detection. The joint probabilities of the modes are shown in Figure 1a, which illustrates that the process is assigned to the correct modes. The monitoring performance of both methods is shown in Figure 1b; the false alarm rates of PPCA and CV-WPPCA are acceptable for practical applications. The monitoring results of the two methods for fault 10 and fault 11 are illustrated in Figure 2a and Figure 2b, respectively. Compared with traditional PPCA, the missed detection rate of the CV-WPPCA method decreases noticeably because the fault-relevant information is highlighted.
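The two quantities used in this evaluation, a KDE-based control limit and the resulting detection rates, can be sketched as follows. This is only an illustration under stated assumptions: a Gaussian kernel, a Silverman-style rule-of-thumb bandwidth, and a bisection search for the α-quantile are choices made here, not details given in the text, and the function names are hypothetical.

```python
from statistics import NormalDist, stdev

def kde_control_limit(normal_stats, alpha=0.99, bandwidth=None):
    """Approximate alpha-quantile of a Gaussian-kernel density estimate.

    normal_stats: statistic values from normal operation (at least two,
    not all equal). The default bandwidth is an assumed rule of thumb.
    """
    n = len(normal_stats)
    if bandwidth is None:
        bandwidth = 1.06 * stdev(normal_stats) * n ** (-0.2)
    std_normal = NormalDist()

    def cdf(x):
        # CDF of the KDE: average of kernel CDFs centred on each sample.
        return sum(std_normal.cdf((x - s) / bandwidth) for s in normal_stats) / n

    lo = min(normal_stats) - 10.0 * bandwidth   # cdf(lo) is close to 0
    hi = max(normal_stats) + 10.0 * bandwidth   # cdf(hi) is close to 1
    for _ in range(100):                        # bisection on a monotone function
        mid = 0.5 * (lo + hi)
        if cdf(mid) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def missed_detection_rate(fault_stats, limit):
    """Fraction of faulty samples whose statistic stays at or below the limit."""
    return sum(1 for s in fault_stats if s <= limit) / len(fault_stats)

def false_alarm_rate(normal_stats, limit):
    """Fraction of normal samples whose statistic exceeds the limit."""
    return sum(1 for s in normal_stats if s > limit) / len(normal_stats)

# Illustrative numbers only (not taken from the paper).
normal_stats = [0.5 + 0.1 * (k % 17) for k in range(200)]
fault_stats = [0.8, 3.0, 5.0, 1.0, 4.2]
limit = kde_control_limit(normal_stats, alpha=0.99)
mdr = missed_detection_rate(fault_stats, limit)
far = false_alarm_rate(normal_stats, limit)
print(limit, mdr, far)
```

The same two rate functions apply to any monitoring statistic, so they can be reused unchanged for $T^2$, SPE, and $GT^2$ once each has its own control limit.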

The missed detection rates of traditional PPCA, DWICA [10], and CV-WPPCA for all predetermined faults under the three modes are presented in Table 2. The comparison indicates that the proposed CV-WPPCA performs most effectively and significantly reduces the missed detection rates for most of the faults when the process runs in different operating modes.

Figure 1. The monitoring results of the TE process Case 1: (a) joint probabilities, (b) monitoring results.

Figure 2. The monitoring results of the TE process: (a) monitoring results in Case 3, (b) monitoring results in Case 4.

Table 1. Test cases of the TE process.

| Case no. | Test cases |
|---|---|
| Case 1 | Normal operation from the 1st to 500th samples on mode 1; normal operation from the 501st to 1000th samples on mode 2; normal operation from the 1001st to 1500th samples on mode 3; normal operation from the 1501st to 2000th samples on mode 1. |
| Case 2 | Normal operation from the 1st to 160th samples on mode 1; fault 4 occurs from the 161st to 700th samples on mode 1; normal operation from the 701st to 860th samples on mode 2; fault 4 occurs from the 861st to 1400th samples on mode 2; normal operation from the 1401st to 1560th samples on mode 3; fault 4 occurs from the 1561st to 2100th samples on mode 3. |
| Case 3 | Normal operation from the 1st to 160th samples on mode 1; fault 10 occurs from the 161st to 700th samples on mode 1; normal operation from the 701st to 860th samples on mode 2; fault 10 occurs from the 861st to 1400th samples on mode 2; normal operation from the 1401st to 1560th samples on mode 3; fault 10 occurs from the 1561st to 2100th samples on mode 3. |
| Case 4 | Normal operation from the 1st to 160th samples on mode 1; fault 11 occurs from the 161st to 700th samples on mode 1; normal operation from the 701st to 860th samples on mode 2; fault 11 occurs from the 861st to 1400th samples on mode 2; normal operation from the 1401st to 1560th samples on mode 3; fault 11 occurs from the 1561st to 2100th samples on mode 3. |

Table 2. Missed detection rates of PPCA, DWICA and CV-WPPCA.

| Fault no. | Mode 1: DWICA $DI^2$ | Mode 1: PPCA $T^2$ | Mode 1: PPCA SPE | Mode 1: WPPCA $GT^2$ | Mode 2: DWICA $DI^2$ | Mode 2: PPCA $T^2$ | Mode 2: PPCA SPE | Mode 2: WPPCA $GT^2$ | Mode 3: DWICA $DI^2$ | Mode 3: PPCA $T^2$ | Mode 3: PPCA SPE | Mode 3: WPPCA $GT^2$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.004 | 0 | 0.006 | 0 | 0.006 | 0.001 | 0.003 | 0.001 | 0.006 | 0.003 | 0.003 | 0.003 |
| 2 | 0.021 | 0.009 | 0.317 | 0.009 | 0.011 | 0.004 | 0.014 | 0.003 | 0.035 | 0.015 | 0.02 | 0.018 |
| 3 | 0.837 | 0.94 | 0.995 | 0.884 | 0.978 | 0.948 | 1 | 0.916 | 0.889 | 0.765 | 0.862 | 0.789 |
| 4 | 0.004 | 0 | 0.999 | 0 | 0.002 | 0.001 | 0.029 | 0.001 | 0.002 | 0.001 | 0.001 | 0.001 |
| 5 | 0 | 0.716 | 0.004 | 0 | 0.969 | 0.94 | 1 | 0.911 | 0.002 | 0.001 | 0.001 | 0.001 |
| 7 | 0 | 0 | 0.693 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 8 | 0.022 | 0.015 | 0.413 | 0.014 | 0.059 | 0.036 | 0.046 | 0.035 | 0.061 | 0.029 | 0.039 | 0.029 |
| 9 | 0.913 | 0.951 | 0.999 | 0.912 | 0.98 | 0.94 | 1 | 0.91 | 0.756 | 0.5 | 0.998 | 0.617 |
| 10 | 0.076 | 0.14 | 0.261 | 0.099 | 0.044 | 0.183 | 0.158 | 0.063 | 0.041 | 0.068 | 0.065 | 0.032 |
| 11 | 0.161 | 0.175 | 0.993 | 0.156 | 0.024 | 0.004 | 0.34 | 0.004 | 0.004 | 0.005 | 0.314 | 0.007 |
| 12 | 0.002 | 0.008 | 0.179 | 0.003 | 0.67 | 0.523 | 0.944 | 0.442 | 0.011 | 0.008 | 0.006 | 0.006 |
| 13 | 0.07 | 0.046 | 0.247 | 0.045 | 0.106 | 0.067 | 0.078 | 0.066 | 0.191 | 0.083 | 0.248 | 0.075 |
| 14 | 0.002 | 0 | 0.97 | 0 | 0.009 | 0.005 | 0.009 | 0.005 | 0.009 | 0.006 | 0.416 | 0.006 |
| 15 | 0.97 | 0.844 | 0.995 | 0.804 | 0.976 | 0.941 | 1 | 0.907 | 0.963 | 0.908 | 1 | 0.961 |
| 16 | 0.448 | 0.51 | 0.834 | 0.4 | 0.981 | 0.946 | 1 | 0.909 | 0.944 | 0.901 | 1 | 0.912 |
| 17 | 0.043 | 0.03 | 0.455 | 0.027 | 0.017 | 0.008 | 0.099 | 0.009 | 0.011 | 0.006 | 0.128 | 0.008 |
| 18 | 0.139 | 0.091 | 0.119 | 0.088 | 0.141 | 0.103 | 0.305 | 0.093 | 0.019 | 0.01 | 0.021 | 0.01 |
| 19 | 0.791 | 0.728 | 0.805 | 0.356 | 0.004 | 0.004 | 0.028 | 0.003 | 0.007 | 0.004 | 0.01 | 0.004 |
| 20 | 0.119 | 0.375 | 0.316 | 0.156 | 0.03 | 0.025 | 0.025 | 0.02 | 0.085 | 0.035 | 0.139 | 0.03 |

Conclusions

In this study, joint probability density is introduced for mode identification, and CV-WPPCA is proposed to improve the monitoring performance of the PPCA method under different operating modes. The CV-WPPCA method aims to concentrate as much fault-relevant information as possible into the dominant subspace. A weighting strategy based on the coefficient of variation is then used to highlight the useful information. The effectiveness of the proposed scheme for multimode process monitoring is validated on the TE benchmark process.

References

[1] Q. Chen, U. Kruger, M. Meronk, A.Y.T. Leung, Synthesis of T2 and Q statistics for process monitoring, Control Engineering Practice, 12 (2004) 745-755.
[2] A. Alghazzawi, B. Lennox, Monitoring a complex refining process using multivariate statistics, Control Engineering Practice, 16 (2008) 294-307.
[3] X. Wang, U. Kruger, G.W. Irwin, Process monitoring approach using fast moving window PCA, Industrial & Engineering Chemistry Research, 44 (2005) 5691-5702.
[4] H. Abdi, L.J. Williams, Principal component analysis, Wiley Interdisciplinary Reviews: Computational Statistics, 2 (2010) 433-459.

[5] M.E. Tipping, C.M. Bishop, Probabilistic principal component analysis, Journal of the Royal Statistical Society: Series B (Statistical Methodology), 61 (1999) 611-622.
[6] Z. Ge, Z. Song, Performance-driven ensemble learning ICA model for improved non-Gaussian process monitoring, Chemometrics and Intelligent Laboratory Systems, 123 (2013) 1-8.
[7] Q. Jiang, X. Yan, Weighted kernel principal component analysis based on probability density estimation and moving window and its application in nonlinear chemical process monitoring, Chemometrics and Intelligent Laboratory Systems, 127 (2013) 121-131.
[8] Q. Jiang, X. Yan, Probabilistic monitoring of chemical processes using adaptively weighted factor analysis and its application, Chemical Engineering Research and Design, 92 (2014) 127-138.
[9] X.B. He, Y.P. Yang, Y.H. Yang, Fault diagnosis based on variable-weighted kernel Fisher discriminant analysis, Chemometrics and Intelligent Laboratory Systems, 93 (2008) 27-33.
[10] Q. Jiang, X. Yan, Joint probability density and double-weighted independent component analysis for multimode non-Gaussian process monitoring, Industrial & Engineering Chemistry Research, 53 (2014) 20168-20176.
[11] J. Chen, J. Liu, Mixture principal component analysis models for process monitoring, Industrial & Engineering Chemistry Research, 38 (1999) 1478-1488.
[12] Z. Ge, Z. Song, Bayesian inference and joint probability analysis for batch process monitoring, AIChE Journal, 59 (2013) 3702-3713.
[13] J.-M. Lee, C. Yoo, I.-B. Lee, Statistical process monitoring with independent component analysis, Journal of Process Control, 14 (2004) 467-485.
[14] J.J. Downs, E.F. Vogel, A plant-wide industrial process control problem, Computers & Chemical Engineering, 17 (1993) 245-255.
[15] D. Kim, I.-B. Lee, Process monitoring based on probabilistic PCA, Chemometrics and Intelligent Laboratory Systems, 67 (2003) 109-123.