Using Principal Component Analysis Modeling to Monitor Temperature Sensors in a Nuclear Research Reactor


Rosani M. L. Penha
Centro de Energia Nuclear, Instituto de Pesquisas Energéticas e Nucleares - Ipen, São Paulo, SP, Brasil
Email: rmpenha@net.ipen.br

J. Wesley Hines
Nuclear Engineering Department, The University of Tennessee, Knoxville, TN, USA
Email: hines@utkux.utk.edu

Abstract

Principal Component Analysis (PCA) is a data-driven modeling technique that transforms a set of correlated variables into a smaller set of new variables (principal components) that are uncorrelated and retain most of the original information. Thus, a reduced-dimension PC model can be used to detect and diagnose abnormalities in the original system in a robust way. This paper presents an application of the PCA modeling technique to detect abnormalities in temperature sensors used to monitor the primary loop of a MW research pool reactor. The PCA model maps the temperature variables into a lower dimensional space and tracks their behavior using T² and Q statistics. Hotelling's T² statistic measures the variation within the PCA model, while the Q statistic measures the variation outside of the PCA model. Five temperature sensors are considered in the model. Three sensors are located inside the pool and two sensors are located at the primary loop piping. The reduced-dimension PCA model has well-behaved T² and Q statistics. To test the functionality of the model, a drift was imposed sequentially on each temperature sensor, and the PCA model was able to detect and identify the faulty sensors at very low thresholds.

Introduction

Many researchers have addressed the use of Principal Component Analysis (PCA) modeling in the monitoring and fault detection of process sensors [1, 2, 3].
The objective of this work is to construct a PCA model that best maps a research reactor's primary loop temperature sensors into a lower dimensional space, in order to characterize the behavior of these variables through the use of T² and Q statistics. Five sensors have been chosen to be monitored. Three sensors are located inside the pool: T1 is located near the surface of the pool, T2 is located midway down into the pool, and T3 is located just above the reactor core. Temperature sensor T4 is located in the outlet pipe that takes the water from the core to the decay tank. Sensor T6 is at the outlet of the decay tank. A schematic diagram of the pool research reactor is shown in the Appendix.

Principal Component Analysis

PCA is a method used to transform a set of correlated variables into a smaller set of new variables that are uncorrelated and retain most of the original information, where the variation in the signals is considered

to be the information. PCA takes advantage of the redundant information present in highly correlated variables to reduce the dimensionality. After developing a model using good (training) data, the reduced-dimension PC model can be used to detect and diagnose process abnormalities in a robust way [4]. For a basic reference book on PCA see Jolliffe [5].

PCA decomposes the data matrix X (m samples, n variables) as the sum of the outer products of vectors t_i and p_i plus a residual matrix E []:

X = t_1 p_1^T + t_2 p_2^T + ... + t_k p_k^T + E = T_k P_k^T + E    (1)

The vectors p_i are orthonormal and the vectors t_i are orthogonal, that is:

p_i^T p_j = 1 if i = j, and p_i^T p_j = 0 if i ≠ j    (2)

t_i^T t_j = 0 if i ≠ j    (3)

Also, t_i is the linear combination of the original X data defined by the transformation vector p_i:

X p_i = t_i    (4)

The vectors t_i are known as the principal component scores and contain information on how the samples are related to each other. The p_i vectors are the eigenvectors of the covariance matrix of X. They are known as the principal component loadings and contain information on how the variables are related to each other. In effect, PCA splits the data matrix X into two parts: one describes the system variation (the process model T_k P_k^T) and the other captures the noise or unmodeled information (the residual matrix E). This division is not always perfect, but it routinely provides good results []. It is very important to determine the number of components (the dimension) to keep in the model. The number of principal components k retained in the model must be less than or equal to the smaller dimension of X, i.e., k ≤ min{m, n} []. The goodness of the model depends on a good choice of how many PCs to keep. There are different criteria for choosing the number of PCs [6]. The eigenvalue associated with each eigenvector (principal component) p_i tells us how much information (variation) that PC explains.
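Before turning to the selection criteria, the decomposition in equations (1)-(4) can be sketched numerically. The NumPy snippet below is an illustration, not the authors' MATLAB code; the data matrix is a random stand-in for the temperature measurements. It computes the loadings as eigenvectors of the covariance matrix, forms the scores, and verifies that the retained part plus the residual reproduces X, as in equation (1).

```python
# Sketch of the PCA decomposition of equations (1)-(4).
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for the data matrix: m = 100 samples, n = 5 variables.
X = rng.normal(size=(100, 5))
X = X - X.mean(axis=0)                      # mean-center the columns

cov = np.cov(X, rowvar=False)               # covariance matrix of the variables
eigvals, P = np.linalg.eigh(cov)            # columns of P are the loadings p_i
order = np.argsort(eigvals)[::-1]           # sort by decreasing eigenvalue
eigvals, P = eigvals[order], P[:, order]

k = 2                                       # number of retained PCs
T_k = X @ P[:, :k]                          # scores t_i = X p_i (equation 4)
E = X - T_k @ P[:, :k].T                    # residual matrix

# Retained model plus residual reproduces X (equation 1),
# and the loadings are orthonormal (equation 2).
assert np.allclose(T_k @ P[:, :k].T + E, X)
assert np.allclose(P.T @ P, np.eye(5))
```

The eigenvalues returned alongside the loadings give the variance explained by each PC, which drives the selection criteria discussed next.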
Then we can look at the cumulative percent variance captured by the first few PCs and choose a number of PCs that accounts for most of the variability of the data (i.e., 90% to 99%). Alternatively, we can look for a knee point in the residual percent variance plotted against the number of principal components; this is taken to be the natural break between the useful PCs and residual noise. Another criterion is to accept the PCs whose eigenvalues are above the average eigenvalue. In correlation-based PCA, the average eigenvalue is 1. It is advisable to investigate more than one criterion, since there is no universally accepted methodology.

PCA Model

The concept of principal components is shown graphically in Figure 1. The figure shows a PCA model constructed for a data set of three variables. We can see that the samples lie mainly on a plane, so the data is well described by a two principal component model (2 PCs). The first PC aligns with the greatest variation in the data, while the second PC aligns with the greatest remaining variance that is orthogonal to the first PC.

Figure 1 - Principal Component Model Staying on a Plane, Showing T² and Q Outliers

Two statistics are commonly employed in evaluating new data against a previously developed PCA model: the Q statistic and Hotelling's T² statistic. The Q statistic measures the lack of model fit for each sample. It indicates how well each sample conforms to the PCA model by measuring the distance a data point falls from the PC model; it is calculated as the difference between the data point and its projection onto the PC model [].

Q_i = e_i e_i^T = x_i (I - P_k P_k^T) x_i^T    (5)

Hotelling's T² measures the variation within the PCA model. T² is the sum of the normalized squared scores, defined as []:

T²_i = t_i (T_k^T T_k)^(-1) t_i^T    (6)

Statistical limits can be developed for the sample scores, T², Q, and the individual residuals. If a point falls outside the limits for a specific confidence interval (95%, for example), the point may be considered an outlier that is not representative of the data used to develop the PCA model. The PCA model of a data matrix includes the mean and variance scaling vectors, eigenvalues, loadings, and statistical limits on the scores, Q, and T². The model can be used with new process data to detect changes in the system that generated the original data. After detecting a probable outlier due to an extreme T² or Q value, we can investigate the inputs that contribute to the extreme statistical value.

Methodology

A PCA model was developed to describe the correlations of the primary loop temperature variables of the IPEN research reactor located in São Paulo, Brazil (see Appendix). The reactor data acquisition system records a snapshot of data once a minute for a complete fuel cycle that usually lasts a couple of days. A representative data set corresponding to 5 hours of reactor operation was used to construct the model. A second data set, corresponding to 5 hours of another cycle's operation, was used to validate the PCA model.
Lastly, an artificial drift was imposed on each sensor to test the sensitivity of the model. The input data used to develop the model was arranged in a matrix with 5 rows (samples) and 5 columns (temperature variables). The time plot of these variables is shown in Figure 2.
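The monitoring statistics of equations (5) and (6) translate directly into code. The sketch below uses NumPy with synthetic correlated data as a stand-in for the reactor measurements (the paper used MATLAB): it autoscales the training data, fits a 2-PC model by SVD, and returns T² and Q for a new sample. The diagonal of T_k^T T_k equals the squared singular values, which gives the normalization in equation (6).

```python
# Minimal sketch of a PCA monitor computing T^2 (eq. 6) and Q (eq. 5).
import numpy as np

rng = np.random.default_rng(1)
latent = rng.normal(size=(200, 2))
mix = rng.normal(size=(2, 5))
X = latent @ mix + 0.01 * rng.normal(size=(200, 5))   # correlated stand-in data

mean, std = X.mean(axis=0), X.std(axis=0)
Xs = (X - mean) / std                                  # mean-center, unit variance

_, S, Vt = np.linalg.svd(Xs, full_matrices=False)
k = 2
P_k = Vt[:k].T                                         # loadings of the 2-PC model
s2 = S[:k] ** 2                                        # diagonal of T_k^T T_k

def monitor(x_new):
    """Return (T2, Q) for one sample, following equations (6) and (5)."""
    x = (x_new - mean) / std            # apply the training-set scaling
    t = x @ P_k                         # scores on the retained PCs
    T2 = float(np.sum(t ** 2 / s2))     # variation within the model
    e = x - t @ P_k.T                   # residual outside the model
    Q = float(e @ e)                    # lack of model fit
    return T2, Q

T2, Q = monitor(X[0])
```

A sample identical to the training mean scores to zero and yields T² = Q = 0; confidence limits for flagging outliers would be computed from the training-set distributions of these statistics.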

Figure 2 - Temperature Model Development Data (time plots of T1, T2, T3, T4 and T6, in ºC, versus sample number)

As shown in Table 1, the variables T1, T2, T3, T4 and T6 are highly correlated, meaning that they vary together. This redundancy in the measurements allows us to build a PCA model that retains most of the information in a few principal components.

corr. coef.   T1      T2      T3      T4      T6
T1            1.      .9963   .996    .998    .993
T2            .9963   1.      .9945   .9887   .986
T3            .996    .9945   1.      .9939   .997
T4            .998    .9887   .9939   1.      .9975
T6            .993    .986    .997    .9975   1.

Table 1 - Correlation Coefficients

To construct the PCA model, the input matrix was divided into a training set (to develop the model) and a test set in an odd-even manner. The input matrix was mean centered and scaled to unit variance; this is necessary for PCA model development. PCA functions in MATLAB were used to calculate the principal components, the eigenvalues, and the amount of variance explained by each PC. Figure 3 is a plot of the eigenvalues versus the PC number and is used to help choose the number of PCs to keep in the model. The size of an eigenvalue equals the amount of variance explained by the corresponding PC. We use a log plot, which can show the break when the eigenvalues cover several orders of magnitude, as in this case. This plot is used to identify a knee point where the PCs above the knee contain information and the PCs below the knee represent noise. The knee occurs between 1 and 2 PCs. The second PC does contribute useful information, so we will keep 2 PCs, which explain 99.8% of the variance.
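The selection criteria applied above can be sketched in NumPy. The first two eigenvalues below are reconstructed from the variance percentages quoted in this paper (99.4% and 0.38% of a total of 5 for five autoscaled variables); the remaining three values and the 99.5% cumulative-variance target are illustrative assumptions.

```python
# Sketch of two criteria for choosing the number of retained PCs.
import numpy as np

# Hypothetical eigenvalue spectrum, sorted in decreasing order.
eigvals = np.array([4.97, 0.019, 0.006, 0.003, 0.002])

# Criterion 1: smallest k whose cumulative variance reaches a target.
frac = np.cumsum(eigvals) / eigvals.sum()
k_cum = int(np.searchsorted(frac, 0.995) + 1)   # -> 2 PCs here

# Criterion 2: keep PCs whose eigenvalue exceeds the average eigenvalue
# (for correlation-based PCA the average eigenvalue is 1).
k_avg = int(np.sum(eigvals > eigvals.mean()))   # -> 1 PC here

print(k_cum, k_avg)
```

Note that the two criteria disagree here (2 PCs versus 1 PC), which is exactly why the text advises investigating more than one criterion.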

Figure 3 - Eigenvalues versus Principal Components (log scale)

Next we plot and analyze the loadings on the retained principal components, PC1 and PC2.

Figure 4 - Loadings on Principal Components 1 and 2

The principal component loadings are the weightings of each input to the specific PC. They show the underlying relationship among the variables. PC1 weights all the variables positively, so it is a gross measurement of the temperature in the reactor. PC2 accounts for the differences between variables {T1, T2, T3} and {T4, T6}. This makes sense, since the first three sensors are located physically near each other and are related to the temperature inside the pool, while the last two sensors are located outside the pool. Although PC2 accounts for a small amount of variation compared to PC1, it is important for describing the differences between the two groups of variables. Figure 5 is a plot of PC1 versus PC2 and shows that PC1 and PC2 are uncorrelated. If there were a noticeable relationship in this plot, it would be attributed to non-linear relationships in the data. The PCA

technique removes all linear correlations and results in an unstructured scatter plot when the non-linear relationships are small or nonexistent.

Figure 5 - Scores on PC2 (0.38%) versus Scores on PC1 (99.4%)

The T² and Q statistics are shown in Figure 6. The dashed lines represent a 95% confidence interval used to identify possible outliers. The T² and Q residuals show that the data fits the model well.

Figure 6 - T² Statistic on the 2-PC Model (left) and Q Statistic on the 2-PC Model (right)

Validation

To validate the PCA model, another data set, corresponding to another week of operation, was applied to the PCA model. The resultant T² and Q residuals are shown in Figure 7.

Figure 7 - T² Statistic on Validation Data (left) and Q Statistic on Validation Data (right)

Since the T² and Q statistics are within the confidence limits, the model represents the validation data set well. PCA gives the best linear model in the sense of minimum squared error.

Drifted Sensor

To verify the model's ability to detect and identify drifting sensors, an artificial drift (ramp) was applied separately to each input variable. The drift simulates a common problem that affects process sensors and may result from aging. The simulated drift is a ramp that grows to a maximum value of .5ºC for a temperature variable that originally varies from 8.87ºC to 3.ºC. This small drift corresponds to a .7% change and is imperceptible in a time profile. Figure 8 shows the T² and Q statistics for this case.

Figure 8 - T² Statistic on Drifted Data (left) and Q Statistic on Drifted Data (right)
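The drift test described above can be sketched as a small helper that superimposes a linear ramp on one sensor's column. The 0.5 ºC maximum and the flat 30 ºC baseline below are illustrative stand-ins, not the paper's data.

```python
# Sketch of injecting an artificial ramp drift into one sensor channel.
import numpy as np

def add_ramp_drift(X, column, max_drift):
    """Return a copy of X with a 0 -> max_drift linear ramp added to one column."""
    Xd = X.copy()
    Xd[:, column] += np.linspace(0.0, max_drift, len(X))
    return Xd

X = np.full((100, 5), 30.0)                    # flat 30 degC stand-in signals
Xd = add_ramp_drift(X, column=0, max_drift=0.5)

assert Xd[0, 0] == 30.0                        # drift starts at zero
assert abs(Xd[-1, 0] - 30.5) < 1e-9            # and ends at the maximum
assert np.allclose(Xd[:, 1:], 30.0)            # other sensors untouched
```

Running the drifted matrix through a previously fitted PCA monitor is what produces the slowly increasing Q statistic discussed next.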

The T² statistic does not seem out of the ordinary, but the Q statistic plot shows that its values are increasing over time. This indicates that something is occurring that was not present in the original data. We can look at the contribution of each input to the large Q statistic; through this analysis, it is possible to determine which variable is responsible for the unusual Q behavior. Samples 89, 38 and 43 were investigated. The contributions to the Q statistics are plotted in Figures 9 and 10.

Figure 9 - Contributions of (T1, T2, T3, T4, T6) to the Q statistics of samples 89 (left) and 38 (right)

Figure 10 - Contributions of (T1, T2, T3, T4, T6) to the Q statistic of sample 43

From Figures 9 and 10, it is easy to see that variable T1 is responsible for the unusually large Q statistic. This agrees with the fact that the drift was added to variable T1. Artificial drifts added to each of the other variables were likewise detected, and the drifted variable was identified using the Q statistic. When a ramp drift with a .5ºC maximum value is added to the T1 variable, the deviation of the Q statistic is even more evident, as shown in Figure 11.
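The contribution analysis used here amounts to examining the per-variable elements of a sample's residual vector, whose squares sum to Q: the variable with the largest residual element is the likely fault source. A sketch under synthetic, highly correlated data (not the reactor measurements); the fitted model and scaling are assumed to come from the training step.

```python
# Sketch of per-variable contributions to the Q statistic.
import numpy as np

rng = np.random.default_rng(2)
base = rng.normal(size=(300, 1)) @ np.ones((1, 5))     # five strongly correlated vars
X = base + 0.01 * rng.normal(size=(300, 5))

mean, std = X.mean(axis=0), X.std(axis=0)
Xs = (X - mean) / std
_, _, Vt = np.linalg.svd(Xs, full_matrices=False)
P_k = Vt[:1].T                                         # 1-PC model suffices here

def q_contributions(x_new):
    """Per-variable residual elements e_j; Q is the sum of e_j**2."""
    x = (x_new - mean) / std
    return x - (x @ P_k) @ P_k.T

x_fault = X[0].copy()
x_fault[0] += 1.0                       # drift-like offset on the first sensor
e = q_contributions(x_fault)
suspect = int(np.argmax(np.abs(e)))     # index of the largest contributor
```

With the offset injected on the first sensor, that sensor dominates the residual, mirroring how T1 stands out in the contribution plots.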

Figure 11 - Q Statistic on Drifted Data

The contributions of the variables to the Q residuals for samples 379 and 43 are shown in Figure 12.

Figure 12 - Contributions of (T1, T2, T3, T4 and T6) to the Q statistics of samples 379 and 43

Again we observe the large contribution of sensor T1 to the Q statistic.

Conclusions

A PCA model with two principal components was developed to describe five temperature sensors in a nuclear research reactor. The model fitted the data well, as shown by the T² and Q statistics. The model was then tested with a validation data set from a separate reactor fuel cycle, and again it performed well. Artificial drifts were added to the variables, and the model both detected and identified the drifted variables. The PCA model was determined to be a good method for monitoring the temperature sensors in this plant, owing to the high correlations in the data and the insignificant non-linearities present.

If non-linearities or time delays were present in the system, other methods such as non-linear partial least squares or neural networks could be used.

Acknowledgments

The authors wish to thank the National Science Foundation (NSF) and the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), which supported this joint research project.

References

[1] Gallagher, N.B., B.M. Wise, S.W. Butler, D.D. White Jr. and G.G. Barna (1997), Development and Benchmarking of Multivariate Statistical Process Control Tools for a Semiconductor Etch Process: Improving Robustness through Model Updating, ADCHEM 1997, Banff.

[2] Wise, B.M., N.B. Gallagher, S.W. Butler, D.D. White Jr. and G.G. Barna (1996), Development and Benchmarking of Multivariate Statistical Process Control Tools for a Semiconductor Etch Process: Impact of Measurement Selection and Data Treatment on Sensitivity, Safeprocess '97, Hull, England.

[3] Dunia, R. and S.J. Qin (1998), Joint Diagnosis of Process and Sensor Faults Using Principal Component Analysis, Control Engineering Practice 6 (1998) 457-469.

[4] Hines, J.W., PCA Models, NE 589 class notes, The University of Tennessee.

[5] Jolliffe, I.T. (1986), Principal Component Analysis, Springer-Verlag.

[6] Valle, S., W. Li and S.J. Qin (1999), Selection of the Number of Principal Components: The Variance of the Reconstruction Error Criterion with a Comparison to Other Methods, Ind. Eng. Chem. Res., 38, 4389-4401.

Appendix - Schematic Diagram of the IEA-R Pool Research Reactor