WWTP diagnosis based on robust principal component analysis
Y. Tharrault, G. Mourot, J. Ragot, M.-F. Harkat
Centre de Recherche en Automatique de Nancy, Nancy-Université, CNRS
2, Avenue de la forêt de Haye, 54516 Vandœuvre-lès-Nancy Cedex
7th IFAC Symposium on Fault Detection, Supervision and Safety of Technical Processes
July 3, 2009, Barcelona, Spain
Tharrault, Mourot, Ragot, Harkat (CRAN) WWTP diagnosis with robust PCA SAFEPROCESS 09 1 / 21
Outline

1 Principle of the Principal component analysis
2 Robust PCA
3 Fault detection and isolation
4 Application to hydraulic part of a wastewater treatment plant
Principle of the Principal component analysis

Data matrix $X \in \mathbb{R}^{N \times n}$ in normal process operation

PCA: maximization of the variance of the projections
$$T = X P$$
$T \in \mathbb{R}^{N \times n}$: principal component matrix
$P \in \mathbb{R}^{n \times n}$: projection matrix

Decomposition in eigenvalues/eigenvectors of the covariance matrix
$$\Sigma = \frac{1}{N-1} X^T X = P \Lambda P^T \quad \text{with} \quad P P^T = P^T P = I_n$$
$$\Sigma = \begin{bmatrix} P_l & P_{n-l} \end{bmatrix} \begin{bmatrix} \Lambda_l & 0 \\ 0 & \Lambda_{n-l} \end{bmatrix} \begin{bmatrix} P_l^T \\ P_{n-l}^T \end{bmatrix}$$
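As a minimal sketch of the decomposition on this slide, the NumPy snippet below builds the covariance matrix and its eigendecomposition. The function name `pca_eig`, the synthetic data, and the explicit centring step are assumptions made for illustration; the slide takes X as already centred.

```python
import numpy as np

def pca_eig(X):
    """Eigendecomposition of the sample covariance of X (N x n)."""
    N = X.shape[0]
    Xc = X - X.mean(axis=0)              # centring (assumed; not stated on the slide)
    sigma = Xc.T @ Xc / (N - 1)          # Sigma = X^T X / (N - 1)
    eigvals, P = np.linalg.eigh(sigma)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]    # reorder by decreasing variance
    return sigma, eigvals[order], P[:, order]

# Synthetic correlated data, purely illustrative
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4)) @ rng.normal(size=(4, 4))

sigma, lam, P = pca_eig(X)
T = (X - X.mean(axis=0)) @ P             # principal components T = X P
```

The sample variance of each column of `T` equals the corresponding eigenvalue in `lam`, which is the "maximization of the variance of the projections" property.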
Principle of the Principal component analysis

Decomposition

Principal part
$$\hat{X} = X C_l \quad \text{with} \quad C_l = P_l P_l^T$$

Residual part
$$E = X - \hat{X} = X (I_n - C_l) = X P_{n-l} P_{n-l}^T$$

[Figure: decomposition of the data in the $(x_1, x_2)$ plane into the principal part $\hat{X}$ and the residual part $E$]

Determination of the number of principal components $l$
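The split into principal and residual parts can be illustrated as follows. This is a sketch on synthetic data; the value `l = 2` is arbitrary and only serves the example.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))
X = X - X.mean(axis=0)                   # centred data matrix (N x n)
n, l = X.shape[1], 2                     # l chosen arbitrarily for the example

eigvals, P = np.linalg.eigh(X.T @ X / (X.shape[0] - 1))
P = P[:, ::-1]                           # columns in decreasing eigenvalue order
P_l, P_res = P[:, :l], P[:, l:]          # P_l and P_{n-l}

C_l = P_l @ P_l.T                        # projector onto the principal space
X_hat = X @ C_l                          # principal part
E = X - X_hat                            # residual part
```

Each row of `X_hat` is orthogonal to the matching row of `E`, since $C_l (I_n - C_l) = 0$.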
PCA weakness

Sensitive to outliers

Outliers: data different from the normal operating conditions (faulty data, data obtained during shutdown or startup periods, or data issued from a different operating mode)

[Figure: data in the $(z_1, z_2)$ plane with the principal directions $u_1$ and $u_2$ estimated with and without outliers]

Robust PCA with respect to outliers
Outlier detection and isolation
Outliers

$n = 2$ variables $(x_1, x_2)$, $X = [x_1 \; x_2]$, $l = 1$

Outlier 1: green observation
Outlier 2: red observation

[Figure: data in the $(x_1, x_2)$ plane with the directions $P_1$ and $P_2$, and the projections of the data onto $P_1$ and onto $P_2$]
Robust PCA

Residual $r(k)$
$$r(k) = \left\| P_{n-l}^T (x(k) - \mu) \right\|_2$$
with
$x(k)$: an observation
$\mu$: the mean of the data $X$
$P_{n-l}$: the eigenvector matrix of the robust covariance matrix corresponding to its $n-l$ smallest eigenvalues

PCA minimises the following criterion with the constraint $P_{n-l}^T P_{n-l} = I_{n-l}$:
$$\frac{1}{N-1} \sum_{k=1}^{N} r(k)^2$$

[Figure: objective function and weight function plotted against $r(k)$]
Robust PCA

The general scale M-estimator minimizes the following objective function with the constraint $P_{n-l}^T P_{n-l} = I_{n-l}$:
$$\frac{1}{N} \sum_{k=1}^{N} \rho\left( \frac{r(k)}{\hat{\sigma}} \right)$$
with $\rho$ the objective function and $\hat{\sigma}$ the robust scale of the residual $r(k)$, obtained from the following condition:
$$\frac{1}{N} \sum_{k=1}^{N} \rho\left( \frac{r(k)}{\hat{\sigma}} \right) = \delta$$

[Figure: objective function $\rho$ and weight function $w$ plotted against $r(k)/\hat{\sigma}$]
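The robust scale defined by the condition on $\delta$ can be computed numerically. The sketch below is only an illustration under stated assumptions: the slide does not specify $\rho$, so a Tukey biweight with tuning constant `c = 1.547` and `delta = 0.5` is assumed, and the names `rho_biweight` and `m_scale` are hypothetical. Bisection is used because the mean of $\rho(r/\sigma)$ decreases as $\sigma$ grows.

```python
import numpy as np

def rho_biweight(u, c=1.547):
    """Tukey biweight scaled to [0, 1]; an assumed choice of rho."""
    v = np.clip(np.abs(u) / c, 0.0, 1.0)
    return 1.0 - (1.0 - v**2) ** 3

def m_scale(r, delta=0.5, rho=rho_biweight, n_iter=200):
    """Solve mean(rho(r / sigma)) = delta for sigma by bisection on a log scale."""
    r = np.abs(np.asarray(r, dtype=float))
    lo, hi = 1e-9 * r.max(), 1e9 * r.max()
    for _ in range(n_iter):
        mid = np.sqrt(lo * hi)
        if np.mean(rho(r / mid)) > delta:
            lo = mid                     # scaled residuals too large: sigma must grow
        else:
            hi = mid
    return np.sqrt(lo * hi)

# Illustrative residuals: clean sample, then the same sample with gross outliers
rng = np.random.default_rng(0)
r_clean = np.abs(rng.normal(size=1000))
r_dirty = np.concatenate([r_clean, np.full(100, 100.0)])
sigma_clean = m_scale(r_clean)
sigma_dirty = m_scale(r_dirty)
```

Because $\rho$ is bounded, the 100 gross outliers move the scale estimate only slightly, whereas the standard deviation of `r_dirty` is dominated by them.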
Robust PCA

Initialization with a robust covariance matrix

Robust covariance matrix
$$V = \frac{\sum_{i=1}^{N-1} \sum_{j=i+1}^{N} w(i,j) \, (x(i) - x(j)) (x(i) - x(j))^T}{\sum_{i=1}^{N-1} \sum_{j=i+1}^{N} w(i,j)}$$
where the weights $w(i,j)$ are defined by:
$$w(i,j) = \exp\left( -\frac{\beta}{2} (x(i) - x(j))^T \Sigma^{-1} (x(i) - x(j)) \right)$$
with $\beta$ a tuning parameter.

For $\beta = 0$, $V = 2\Sigma$.
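A direct, unoptimised sketch of this pairwise estimator is given below. The function name `robust_covariance` and the test data are assumptions, and $\Sigma$ is taken here to be the classical sample covariance. The memory cost grows with the number of pairs, $N(N-1)/2$.

```python
import numpy as np

def robust_covariance(X, beta):
    """Pairwise-weighted covariance V built from all pairs i < j of rows of X."""
    sigma_inv = np.linalg.inv(np.cov(X, rowvar=False))   # Sigma assumed: sample covariance
    i, j = np.triu_indices(X.shape[0], k=1)              # index pairs with i < j
    diff = X[i] - X[j]                                   # x(i) - x(j), one row per pair
    d2 = np.einsum("pa,ab,pb->p", diff, sigma_inv, diff) # squared Mahalanobis-type distances
    w = np.exp(-0.5 * beta * d2)                         # weights w(i, j)
    return (diff * w[:, None]).T @ diff / w.sum()

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))
V0 = robust_covariance(X, beta=0.0)      # all weights equal to 1
V2 = robust_covariance(X, beta=2.0)      # distant pairs are down-weighted
```

With `beta=0.0` every weight equals one and `V0` coincides with twice the sample covariance, as stated on the slide.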
Robust PCA

The residual-space criterion is only robust to outliers with a projection onto the residual space.

Scale M-estimator

The scale M-estimator maximizes the following criterion with the constraint $P_l^T P_l = I_l$:
$$\frac{1}{N} \sum_{k=1}^{N} \rho\left( \frac{\left\| P_l^T (x(k) - \mu) \right\|_2}{\hat{\sigma}} \right)$$
with $\hat{\sigma}$ the robust scale of the residual $r$ and $\rho$ the objective function.
Fault detection and isolation

The reconstruction $\hat{x}_R$: minimizing the influence of the fault
$$\hat{x}_R(k) = x(k) - \Xi_R f_R$$
with
$x(k)$: an observation
$f_R$: the fault magnitude (unknown)
$\Xi_R$: matrix of reconstruction directions

For example, to reconstruct 2 variables ($R = \{2, 4\}$) among 5 variables:
$$\Xi_R = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \end{bmatrix}^T$$

Estimation of the fault magnitude $\hat{f}_R$:
$$\hat{f}_R = \arg\min_{f_R} \{ D_R(k) \} \quad \text{with} \quad D_R(k) = \hat{x}_R(k)^T P \Lambda^{-1} P^T \hat{x}_R(k)$$
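Since $D_R(k)$ is quadratic in $f_R$, the minimisation reduces to a weighted least-squares problem. The sketch below illustrates it on synthetic data; the function name `reconstruct`, the 0-based indexing of `R`, and the mixing matrix `A` are assumptions for the example.

```python
import numpy as np

def reconstruct(x, P, lam, R):
    """Reconstruct the variables listed in R (0-based) by minimising D_R."""
    Xi = np.eye(x.shape[0])[:, R]                  # reconstruction directions (n x r)
    M = P @ np.diag(1.0 / lam) @ P.T               # P Lambda^{-1} P^T
    f_hat = np.linalg.solve(Xi.T @ M @ Xi, Xi.T @ M @ x)   # estimated fault magnitude
    x_rec = x - Xi @ f_hat                         # reconstructed observation
    return x_rec, f_hat, float(x_rec @ M @ x_rec)  # last value is D_R

# Synthetic, well-conditioned correlated data (illustrative only)
rng = np.random.default_rng(2)
A = np.eye(5) + 0.5 * np.ones((5, 5))
X = rng.normal(size=(300, 5)) @ A
X -= X.mean(axis=0)
lam, P = np.linalg.eigh(np.cov(X, rowvar=False))

x = X[0].copy()
x[1] += 10.0                                       # simulated fault on the second variable
x_rec, f_hat, D_R = reconstruct(x, P, lam, [1, 3]) # R = {2, 4} in 1-based numbering
```

Only the entries listed in `R` are modified; the other components of `x` are left untouched, and `D_R` can never exceed the indicator of the unreconstructed observation.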
Fault detection and isolation

The reconstruction vector $\hat{x}_R(k)$ of the vector $x(k)$ is given by:
$$\hat{x}_R(k) = \left( I - \Xi_R \left( \Xi_R^T P \Lambda^{-1} P^T \Xi_R \right)^{-1} \Xi_R^T P \Lambda^{-1} P^T \right) x(k)$$

Condition of reconstruction: existence of $\left( \Xi_R^T P \Lambda^{-1} P^T \Xi_R \right)^{-1}$, i.e. the matrix $\Xi_R^T P \Lambda^{-1} P^T \Xi_R$ must be of full rank.

To reconstruct a fault, it must be projected at least onto the principal space ($r \le l$) or onto the residual space ($r \le n - l$). This implies that the number of reconstructed variables $r$ must respect the following inequality:
$$r \le \max(n - l, l)$$
with
$r$: number of reconstructed variables
$l$: number of principal components
$n$: number of variables

The maximum number of reconstructions is
$$\sum_{r=1}^{\max(n-l,\,l)-1} C_n^r$$
where $C_n^r$ denotes the number of combinations of $r$ among $n$.
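The count of candidate reconstructions can be checked with a few lines of Python. The function name `max_reconstructions` is an assumption; the values `n = 9` and `l = 4` are those used in the application part of the talk.

```python
from math import comb

def max_reconstructions(n, l):
    """Sum of C(n, r) for r = 1 .. max(n - l, l) - 1."""
    r_max = max(n - l, l) - 1
    return sum(comb(n, r) for r in range(1, r_max + 1))

count = max_reconstructions(9, 4)   # 9 + 36 + 84 + 126
print(count)                        # 255
```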
Fault detection and isolation

Fault detection indicator $D_R$
$$D_R(k) = \hat{x}_R(k)^T P \Lambda^{-1} P^T \hat{x}_R(k)$$

Fault detection
A fault is detected if:
$$D_R(k) > \gamma_\alpha^2$$
with $\gamma_\alpha^2$ the detection threshold of the indicator $D_R$.

Fault isolation
For the faulty observations, the faulty variables $\hat{R}$ are determined as follows:
$$\hat{R} = \arg_{R \in I} \left\{ D_R(k) < \gamma_\alpha^2 \right\}$$
with $\gamma_\alpha^2$ the detection threshold of the indicator $D_R$ and $I$ the set of all combinations of possible reconstruction directions.
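A hedged sketch of the detection and isolation logic follows. Several points are assumptions, not taken from the slides: detection uses the indicator without reconstruction (empty `R`), the threshold is passed in rather than computed, the search stops at the smallest set size that brings the indicator under the threshold, and the names `indicator` and `isolate` are hypothetical. The matrix `M` stands for $P \Lambda^{-1} P^T$.

```python
import numpy as np
from itertools import combinations

def indicator(x, M, R):
    """D_R for the 0-based index set R; an empty R means no reconstruction."""
    if len(R) == 0:
        return float(x @ M @ x)
    Xi = np.eye(x.shape[0])[:, list(R)]
    f_hat = np.linalg.solve(Xi.T @ M @ Xi, Xi.T @ M @ x)
    x_rec = x - Xi @ f_hat
    return float(x_rec @ M @ x_rec)

def isolate(x, M, threshold, r_max):
    """Return None if no fault is detected, else the smallest sets R with D_R < threshold."""
    if indicator(x, M, ()) <= threshold:
        return None
    n = x.shape[0]
    for r in range(1, r_max + 1):
        hits = [R for R in combinations(range(n), r)
                if indicator(x, M, R) < threshold]
        if hits:
            return hits
    return []

# Toy example with M = I so the arithmetic is easy to follow by hand
M = np.eye(4)
faulty = np.array([0.1, 5.0, -0.2, 0.1])
normal = np.array([0.1, 0.2, 0.1, 0.1])
result_faulty = isolate(faulty, M, threshold=1.0, r_max=2)
result_normal = isolate(normal, M, threshold=1.0, r_max=2)
```

In the toy case only the reconstruction of the second variable (index 1) brings the indicator under the threshold, so that variable is flagged.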
Fault detection and isolation

Reduction of the number of reconstructions

Determination of the collinear projected directions: a global indicator $K$ is built
$$K(R_1, R_2) = \max\left\{ \hat{d}(R_1, R_2), \; \tilde{d}(R_1, R_2) \right\}$$
where $R_1$ and $R_2$ are sets of reconstructed variables, $\hat{d}(R_1, R_2)$ is the distance between the two sub-spaces projected onto the principal space, and $\tilde{d}(R_1, R_2)$ is the distance between the two sub-spaces projected onto the residual space:
$$\hat{d}(R_1, R_2) = \left\| \hat{\Xi}_{R_1} (\hat{\Xi}_{R_1}^T \hat{\Xi}_{R_1})^{-1} \hat{\Xi}_{R_1}^T - \hat{\Xi}_{R_2} (\hat{\Xi}_{R_2}^T \hat{\Xi}_{R_2})^{-1} \hat{\Xi}_{R_2}^T \right\|_2$$
$$\tilde{d}(R_1, R_2) = \left\| \tilde{\Xi}_{R_1} (\tilde{\Xi}_{R_1}^T \tilde{\Xi}_{R_1})^{-1} \tilde{\Xi}_{R_1}^T - \tilde{\Xi}_{R_2} (\tilde{\Xi}_{R_2}^T \tilde{\Xi}_{R_2})^{-1} \tilde{\Xi}_{R_2}^T \right\|_2$$
with $\hat{\Xi}_{R_1} = \Lambda_l^{-1/2} P_l^T \Xi_{R_1}$ and $\tilde{\Xi}_{R_1} = \Lambda_{n-l}^{-1/2} P_{n-l}^T \Xi_{R_1}$.

For example, to reconstruct 2 variables ($R_1 = \{2, 4\}$) among 5 variables:
$$\Xi_{R_1} = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \end{bmatrix}^T$$
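The indicator $K$ compares orthogonal projectors built from the whitened reconstruction directions. The sketch below assumes eigenvalues sorted in decreasing order, 0-based index lists, and set sizes small enough for the projectors to exist; `projector`, `K_indicator` and the synthetic data are hypothetical names and inputs.

```python
import numpy as np

def projector(A):
    """Orthogonal projector onto the column space of A (full column rank assumed)."""
    return A @ np.linalg.solve(A.T @ A, A.T)

def K_indicator(P, lam, l, R1, R2):
    """max of the principal-space and residual-space distances between R1 and R2."""
    n = P.shape[0]
    def dist(cols):
        W = np.diag(lam[cols] ** -0.5) @ P[:, cols].T    # Lambda^{-1/2} P^T on one subspace
        A1, A2 = W @ np.eye(n)[:, R1], W @ np.eye(n)[:, R2]
        return np.linalg.norm(projector(A1) - projector(A2), 2)  # spectral norm
    return max(dist(np.arange(l)), dist(np.arange(l, n)))

# Synthetic model with n = 5 variables and l = 2 principal components
rng = np.random.default_rng(3)
A = np.eye(5) + 0.5 * np.ones((5, 5))
X = rng.normal(size=(300, 5)) @ A
lam, P = np.linalg.eigh(np.cov(X, rowvar=False))
lam, P = lam[::-1], P[:, ::-1]           # decreasing eigenvalue order

K_12 = K_indicator(P, lam, 2, [0], [1])  # single-variable directions 1 and 2
K_11 = K_indicator(P, lam, 2, [0], [0])  # identical directions
```

The value lies between 0 and 1 because it is the spectral norm of a difference of two orthogonal projectors; identical sets give 0.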
Fault detection and isolation

Algorithm for the determination of the detectable and isolable faults

1 Set $r = 1$.
2 Calculate the indicator $K(R_1, R_2)$ for all available directions ($R_1 \in I$ and $R_2 \in I$).
  If $K(R_1, R_2)$ is equal to zero: only a set of potentially faulty variables can be determined, i.e. the faulty variables are associated with the indices $R_1$, or $R_2$, or $R_1$ and $R_2$. It is then only required to keep one direction, for example $R_1$.
  If $K(R_1, R_2)$ is close to zero: the magnitude of the fault has to be large to ensure fault isolation.
  Otherwise, the faults are isolable.
3 Set $r = r + 1$.
4 While $r < \max(l, n - l)$, go to step 2.
Application to hydraulic part of a wastewater treatment plant

Description of the station

[Figure: diagram of the station showing the bar screen, sump, pumping station, valve, overflow, the Biology 1 and Biology 2 basins, the sludge recirculation and extraction circuits, and the river Sûre; numbered items are listed below]

1 Height before the bar screen (H1)
2 Command of the cleaning of the bar screens
3 Water level after the bar screen (H2)
4 Water level in the sump (H3)
5 Pumping station (C4)
6 Flow (Q5)
7 Pump for recirculation sludge
8 Pump for extraction sludge
9 Pump for extraction surface sludge
10 Water level in the overflow (H6)

Construction of the data matrix

Dynamic process: temporal lags
Nonlinear process: transformed variables
$$x(k) = \left[ H_1(k) \;\; H_2(k) \;\; H_3(k) \;\; Q_5(k) \;\; H_6(k) \;\; \tanh\left( \frac{Q_5(k-1) - 550}{150} \right) \;\; H_1(k-1) \;\; H_6(k-1) \;\; C_4(k) \right]^T \qquad (1)$$
The data matrix $X$ is constituted of $N$ observations of the vector $x(k)$.
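Equation (1) can be assembled from the raw signals as sketched below. The function name `build_data_matrix` and the toy signals are assumptions; the first sample is dropped because the lagged terms need a value at $k-1$.

```python
import numpy as np

def build_data_matrix(H1, H2, H3, C4, Q5, H6):
    """Stack the 9 components of x(k) from equation (1), one row per time k >= 1."""
    k = slice(1, None)        # current sample k
    km1 = slice(0, -1)        # lagged sample k - 1
    return np.column_stack([
        H1[k], H2[k], H3[k], Q5[k], H6[k],
        np.tanh((Q5[km1] - 550.0) / 150.0),   # nonlinear transform of the lagged flow
        H1[km1], H6[km1], C4[k],
    ])

# Toy signals of length 10, purely illustrative
t = np.arange(10.0)
X = build_data_matrix(H1=t, H2=2 * t, H3=3 * t, C4=np.ones(10),
                      Q5=500.0 + 10.0 * t, H6=4 * t)
```

With 10 samples per signal the resulting matrix has 9 rows and 9 columns.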
Application to hydraulic part of a wastewater treatment plant

Construction of the model

4 principal components are determined

[Figure: Measure and estimation of H1. Top panel: measurement, classical estimation and robust estimation of H1 (cm) against time. Bottom panel: classical residual and robust residual against time.]
Application to hydraulic part of a wastewater treatment plant

Analysis of the reconstruction directions

The maximum number of reconstructions is then equal to 255 ($\max(n - l, l) - 1 = 4$)

Indicator K for r = 1 (rows: R_2, columns: R_1)

R_2 \ R_1     1     2     3     4     5     6     7     8     9
    1         0  0.94  0.99  1.00  0.99  0.98  0.97  0.99  1.00
    2               0  0.52  1.00  0.98  1.00  1.00  1.00  1.00
    3                     0  0.99  1.00  1.00  1.00  1.00  0.99
    4                           0  1.00  0.99  1.00  1.00  0.33
    5                                 0  0.82  0.96  0.90  0.99
    6                                       0  0.80  1.00  1.00
    7                                             0  1.00  0.97
    8                                                   0  0.97

The number of useful reconstructions can be reduced to 202
Application to hydraulic part of a wastewater treatment plant

Fault detection

[Figure: Fault detection with Mahalanobis distance. Detected faults are labelled 1 to 22 along the time axis (observations 1 to 2500).]
Application to hydraulic part of a wastewater treatment plant

Fault isolation

Fault index                       Reconstruction direction under the detection threshold
16, 17                            D_1
3, 10, 11, 14, 15, 19, 20, 21     D_3
7, 12                             D_{1,6}
6, 8, 13, 18, 22                  D_{3,9}
1, 2, 5                           D_{1,3,4}
4                                 D_{3,7,9}
9                                 D_{1,2,3,7}

Table: Summary of fault isolation

Faults 16 and 17 are under the threshold when the first variable (H1(k)) is reconstructed, so variable H1(k) is faulty.
Conclusion

Robust PCA with respect to outliers
  => directly applicable on data containing potential faults
Use of the principle of reconstruction and projection of the reconstructed data together
  => outlier detection and isolation
Reduction of the computational load
  => determination of the detectable and isolable faults
Application to the hydraulic part of a wastewater treatment plant