Research of power plant parameter based on the Principal Component Analysis method

Yang Yang*a, Di Zhang b

a School of Engineering, Bohai University, Liaoning Jinzhou, 3; b Liaoning Datang International Jinzhou Power Generation Co., Ltd., Liaoning Jinzhou

ABSTRACT

With the development of power technology and the expansion of power plants, the number of plant operation monitoring points keeps increasing. The large number of data parameters gives technicians more information about unit running, but it also makes the data inconvenient to adjust and process. Principal Component Analysis was used for real-time data analysis of a running thermal power plant unit. New variables are obtained from the multi-parameter indicators by knowledge mining. Since the new variables are pairwise uncorrelated and reflect most of the original data information, they can provide the basis for optimal operation and adjustment of the actual production units. The approach can also play an important role in factory data processing and related fields.

Keywords: Principal Component Analysis (PCA), Knowledge mining, Power plant parameter

1. INTRODUCTION

The Distributed Control System (DCS) and information technologies such as the Supervisory Information System (SIS) and Management Information Systems (MIS) are widely applied in thermal power plants. At the same time, people pay more attention to real-time data processing and integration in thermal power plants. The core task of SIS in a thermal power plant is to process the real-time data efficiently, mine knowledge about the devices as fully as possible, and support performance optimization, fault diagnosis, expert systems and so on [1,2]. The real-time data, which record the operation of the equipment and the control actions of personnel, can provide a basis for decision making, maintenance and incident handling. These data are significant for plant productivity, economic security, optimization of power plant operation, and fault diagnosis and maintenance technology.
However, in practice, the effects among the different factors are complex. The various parameters include the unit load, environment temperature, fuel composition, operation mode and so on. They must be controlled jointly, which makes adjusting the equipment inconvenient. Meanwhile, with the development of SIS and MIS stations, a large amount of historical data is stored in the database. Information hidden in these data, such as data characteristics and overall trends, has important value for decision-making.

The study of a thermal power plant reveals the eddy turbulence, fluid friction, and pressure/temperature dependent viscosity variations of the working fluid, which are not considered under ideal conditions. Decisions regarding the analysis of these effects and the related technological advancements become vital when looking to improve the efficiency of thermal power plants. The turbine, the steam generator (boiler) and a pump are the basic components of a steam thermal power plant. Ref. [3] gives an analytical discussion of the steam turbine and pump functions, as these affect the Rankine cycle efficiency.

Load-cycling operation of thermal power plants leads to changes in operating point right across the whole operating range. This results in non-linear variations in most of the plant variables. In Ref. [4] the authors investigate methods to account for non-linearities without resorting to on-line parameter estimation as done in self-tuning control. A constrained multivariable long range predictive controller (LRPC), based on the generalised predictive control (GPC) algorithm, is designed and implemented in a simulation of a MW oil-fired drum-boiler thermal power plant. In order to take system non-linearity into account, the LRPC is evaluated using two types of predictive models: approximate single global linear models and local model networks (LMN). In Ref. [5] a discrete time adaptive control system was applied to the steam temperature control of a boiler for electric power generation. In the system, the adaptive controllers are installed in parallel with the conventional PID controllers for superheated and reheated steam temperatures. The plant dynamics was described with polynomials whose parameters are adaptively estimated from the plant data. The adaptive control signals are synthesized on the basis of the estimated parameters so as to achieve the specified control objective. The system was applied to a 37MW plant and found to realize far better control performance than the conventional PID control system.

*yangy9@63.com
International Workshop on Image Processing and Optical Engineering, edited by Hai Guo, Qun Ding, Proc. of SPIE Vol. 833, 8338 SPIE CCC code: 77-786X//$8 doi:.7/.974
Proc. of SPIE Vol. 833 8338- Downloaded From: http://proceedings.spiedigitallibrary.org/ on /8/6 Terms of Use: http://spiedigitallibrary.org/ss/termsofuse.aspx

Principal component analysis (PCA) is a mathematical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of uncorrelated variables called principal components. The number of principal components is less than or equal to the number of original variables. This transformation is defined in such a way that the first principal component has as high a variance as possible (that is, it accounts for as much of the variability in the data as possible), and each succeeding component in turn has the highest variance possible under the constraint that it be orthogonal to (uncorrelated with) the preceding components. Principal components are guaranteed to be independent only if the data set is jointly normally distributed. PCA is sensitive to the relative scaling of the original variables. Depending on the field of application, it is also named the discrete Karhunen-Loève transform (KLT), the Hotelling transform or proper orthogonal decomposition (POD) [6].
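The properties stated in this definition can be checked numerically. The following is a minimal sketch, assuming NumPy and a small synthetic data set (the data, the seed and all variable names are illustrative and are not taken from the plant measurements). It verifies that the transformation is orthogonal, that the resulting components are uncorrelated, and that their variances appear in decreasing order.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in for correlated measurements: 300 samples, 3 variables.
# (Illustrative data only; not the plant data used in the paper.)
latent = rng.normal(size=(300, 1))
X = np.hstack([
    latent + 0.3 * rng.normal(size=(300, 1)),
    2.0 * latent + 0.5 * rng.normal(size=(300, 1)),
    rng.normal(size=(300, 1)),
])

Xc = X - X.mean(axis=0)                      # mean-center each variable
cov = np.cov(Xc, rowvar=False)               # sample covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)       # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]            # reorder by decreasing variance
eigvals, W = eigvals[order], eigvecs[:, order]

scores = Xc @ W                              # principal components of each sample
score_cov = np.cov(scores, rowvar=False)     # should be diagonal

print("W^T W is identity:", np.allclose(W.T @ W, np.eye(3)))
print("component variances:", np.round(np.diag(score_cov), 4))
```

The diagonal of `score_cov` reproduces the eigenvalues, and its off-diagonal entries vanish up to rounding, which is the "uncorrelated" property in the definition. Independence is a stronger property and is not tested here.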
Principal component analysis of a data matrix extracts the dominant patterns in the matrix in terms of a complementary set of score and loading plots. It is the responsibility of the data analyst to formulate the scientific issue at hand in terms of PC projections, PLS regressions, etc. [7]. In this paper, principal component analysis is used for real-time data analysis and unit data mining in a thermal power plant. Multi-parameter indicators are transformed into new variables based on the knowledge mining result, which gives a clear relationship among the original factors. Since the new variables are pairwise uncorrelated and reflect most of the information in the original data, they can provide the basis for optimal operation and adjustment and can be used for the actual production units.

2. DATA PREPARING

The data are actual unit operating data recorded from :3 pm to 7: pm in a thermal power plant in Liaoning Province, China. The measurement collects a group of data every seconds. In total, 66 groups of data were collected in the specified time in our experiment. Since there are many parameters in the actual measurement, only eight parameters are selected here as representatives. The parameter names are listed in Table 1, and the data curves are shown in Figure 1.

Table 1. Parameter names

Num  Parameter name                                  Unit
1    Generator output power                          MW
2    Load setting                                    MW
3    Turbine master output                           %
4    Main steam pressure                             MPa
5    Main steam pressure setting                     MPa
6    Main steam temperature behind the furnace
7    Total amount of fuel                            t/h
8    Temperature reduction of hot water pipe flow    t/h

Fig. 1. Curves of the eight parameters (one panel per parameter, parameter 1 to parameter 8)

3. PRINCIPAL COMPONENTS ANALYSIS (PCA) [6-8]

PCA was invented in 1901 by Karl Pearson. Now it is mostly used as a tool in exploratory data analysis and for making predictive models. PCA can be done by eigenvalue decomposition of a data covariance matrix or singular value decomposition of a data matrix, usually after mean centering the data for each attribute. The results of a PCA are usually discussed in terms of component scores (the transformed variable values corresponding to a particular case in the data) and loadings (the weight by which each standardized original variable should be multiplied to get the component score).

PCA is the simplest of the true eigenvector-based multivariate analyses. Often, its operation can be thought of as revealing the internal structure of the data in a way which best explains the variance in the data. If a multivariate dataset is visualised as a set of coordinates in a high-dimensional data space (1 axis per variable), PCA can supply the user with a lower-dimensional picture, a "shadow" of this object when viewed from its (in some sense) most informative viewpoint. This is done by using only the first few principal components so that the dimensionality of the transformed data is reduced.

PCA is closely related to factor analysis; indeed, some statistical packages (such as Stata) deliberately conflate the two techniques. True factor analysis makes different assumptions about the underlying structure and solves eigenvectors of a slightly different matrix.

A common problem in data processing is that large amounts of data are expensive to transmit, store or process. PCA seeks to extract from a set of variables a reduced set of n components or factors that accounts for most of the variance in the variables. In other words, we wish to reduce a set of variables to a set of n underlying superordinate dimensions.
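The two computational routes mentioned above, eigenvalue decomposition of the covariance matrix and singular value decomposition of the centered data matrix, can be compared directly. The sketch below assumes NumPy and synthetic data. The function names `pca_eig` and `pca_svd` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def pca_eig(X):
    """PCA by eigendecomposition of the sample covariance matrix."""
    Xc = X - X.mean(axis=0)                  # mean-center each variable
    cov = np.cov(Xc, rowvar=False)           # p x p covariance (divides by n - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)   # ascending order for symmetric input
    order = np.argsort(eigvals)[::-1]        # sort by decreasing variance
    return eigvals[order], eigvecs[:, order]

def pca_svd(X):
    """PCA by singular value decomposition of the centered data matrix."""
    Xc = X - X.mean(axis=0)
    _, s, vt = np.linalg.svd(Xc, full_matrices=False)
    eigvals = s ** 2 / (X.shape[0] - 1)      # singular values -> component variances
    return eigvals, vt.T                     # columns are the principal directions

rng = np.random.default_rng(0)
# Synthetic stand-in data: 200 samples of 4 correlated variables.
X = rng.normal(size=(200, 4)) @ rng.normal(size=(4, 4))

vals_eig, vecs_eig = pca_eig(X)
vals_svd, vecs_svd = pca_svd(X)

# Keeping only the first two components gives the lower-dimensional "shadow".
scores_2d = (X - X.mean(axis=0)) @ vecs_eig[:, :2]

print("variances (eig):", np.round(vals_eig, 4))
print("variances (svd):", np.round(vals_svd, 4))
```

Both routes return the same variances, and the same directions up to an arbitrary sign per component. The SVD route avoids forming the covariance matrix explicitly, which is often preferred numerically when there are many variables.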
Given a data matrix with p variables and n samples, the data are first centered on the means of each variable.

This ensures that the cloud of data is centered on the origin of our principal components, but it affects neither the spatial relationships of the data nor the variances along our variables. The first principal component is given by the linear combination of the variables $X_1, X_2, \ldots, X_p$:

$$F_1 = a_{11}X_1 + a_{12}X_2 + \cdots + a_{1p}X_p \tag{1}$$

The first principal component is calculated such that it accounts for the greatest possible variance in the data set. Of course, one could make the variance of $F_1$ as large as possible by choosing large values for the weights. To prevent this, the weights are calculated with the constraint that their sum of squares is 1, that is,

$$a_{11}^2 + a_{12}^2 + \cdots + a_{1p}^2 = 1 \tag{2}$$

The second principal component is calculated in the same way, with the condition that it is uncorrelated with (i.e., perpendicular to) the first principal component and that it accounts for the next highest variance. Collectively, the transformation of the original variables to the principal components is $F = AX$. The contribution rate of principal component $i$ is

$$\frac{\operatorname{Var}(F_i)}{\sum_{j=1}^{p}\operatorname{Var}(F_j)} = \frac{\lambda_i}{\sum_{j=1}^{p}\lambda_j} \tag{3}$$

where $\lambda_i$ is the $i$-th largest eigenvalue of the covariance matrix, which equals $\operatorname{Var}(F_i)$. The cumulative contribution rate of the first $k$ principal components is

$$\frac{\sum_{i=1}^{k}\lambda_i}{\sum_{i=1}^{p}\lambda_i} \tag{4}$$

PCA has the distinction of being the optimal orthogonal transformation for keeping the subspace that has the largest "variance" (as defined above). This advantage, however, comes at the price of greater computational requirements if compared, for example and when applicable, to the discrete cosine transform. Nonlinear dimensionality reduction techniques tend to be more computationally demanding than PCA.

PCA is sensitive to the scaling of the variables. If we have just two variables and they have the same sample variance and are positively correlated, then the PCA will entail a rotation by 45° and the "loadings" for the two variables with respect to the principal component will be equal.
But if we multiply all values of the first variable by 100, then the first principal component will be almost the same as that variable, with a small contribution from the other variable, whereas the second component will be almost aligned with the second original variable. This means that whenever the different variables have different units (like temperature and mass), PCA is a somewhat arbitrary method of analysis. (Different results would be obtained if one used Fahrenheit rather than Celsius, for example.) Note that Pearson's original paper was entitled "On Lines and Planes of Closest Fit to Systems of Points in Space"; "in space" implies physical Euclidean space, where such concerns do not arise. One way of making the PCA less arbitrary is to use variables scaled so as to have unit variance.

In practice, the number of principal components is chosen according to the amount of information they reflect. The number of principal components is sufficient if the cumulative contribution rate reaches 85%. Most commonly, 2 or 3 principal components are retained. This process not only reduces the number of variables but also makes the analysis more convenient.

In practice, the operating parameters are often strongly correlated, which makes mutual adjustment, checking and control difficult. In this paper, principal component analysis is used to mine knowledge about the parameters.
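The scaling argument can be reproduced with two synthetic variables. The sketch below assumes NumPy. The data, the seed and the helper names `first_loading`, `unit_variance` and `angle_deg` are illustrative assumptions rather than part of the original study. It shows the 45° direction for equal variances, the near-alignment with the first variable after multiplying it by 100, and the return to 45° once both variables are rescaled to unit variance.

```python
import numpy as np

def first_loading(X):
    """Unit-length weight vector of the first principal component of X."""
    Xc = X - X.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    v = eigvecs[:, -1]                  # eigh sorts ascending: last = largest variance
    return v if v.sum() >= 0 else -v    # eigenvector sign is arbitrary; fix it

def unit_variance(X):
    """Scale every column to zero mean and unit sample variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

def angle_deg(v):
    """Angle of a 2-D direction relative to the first variable's axis."""
    return float(np.degrees(np.arctan2(v[1], v[0])))

rng = np.random.default_rng(1)
common = rng.normal(size=500)
noise = rng.normal(size=500)
# Two positively correlated variables, forced to equal sample variance.
X = unit_variance(np.column_stack([common + 0.5 * noise, common - 0.5 * noise]))

w_equal = first_loading(X)              # equal variances: 45 degree rotation

X_scaled = X.copy()
X_scaled[:, 0] *= 100.0                 # change the "unit" of the first variable
w_scaled = first_loading(X_scaled)      # now dominated by the first variable

w_restored = first_loading(unit_variance(X_scaled))   # rescaling undoes the effect

print("equal variance :", np.round(w_equal, 4), angle_deg(w_equal))
print("first var x100 :", np.round(w_scaled, 4), angle_deg(w_scaled))
print("re-standardized:", np.round(w_restored, 4), angle_deg(w_restored))
```

Because the plant parameters are expressed in different units (MW, %, MPa, t/h), this is the reason standardization is applied before the analysis.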

Before using PCA, the data must first be standardized. Suppose the test matrix is $x_{ij}$ ($i$ is the sample number, $i = 1, 2, \ldots, n$, where $n$ is the total number of samples; $j$ is the indicator number, $j = 1, 2, \ldots, p$, where $p$ is the total number of indicators). The standardized data are

$$x_{ij}^{*} = \frac{x_{ij} - \bar{x}_j}{S_j} \tag{5}$$

where the mean and the standard deviation of indicator $j$, in the usual sample form, are

$$\bar{x}_j = \frac{1}{n}\sum_{i=1}^{n} x_{ij} \tag{6}$$

and

$$S_j = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_{ij} - \bar{x}_j\right)^2} \tag{7}$$

Table 2. Correlation matrix between parameters

            parameter1  parameter2  parameter3  parameter4  parameter5  parameter6  parameter7  parameter8
parameter1  .           .94         .7          .           .4          .98         .8          -.773
parameter2  .94         .           .888        -.9         -.66        .           .8648       -.7
parameter3  .7          .888        .           -.96        -.68        .79         .979        -.9
parameter4  .           -.9         -.96        .           .344        -.389       -.44        -.44
parameter5  .4          -.66        -.68        .344        .           .86         -.9         .6
parameter6  .98         .           .79         -.389       .86         .           .34         .486
parameter7  .8          .8648       .979        -.44        -.9         .34         .           -.47
parameter8  -.773       -.7         -.9         -.44        .6          .486        -.47        .

Table 3. Principal components analysis result

                                  F1        F2        F3        F4        F5        F6        F7        F8
parameter1                        -.899     -.349     .9        -.36      .644      -.76      .66       .97
parameter2                        -.937     -.39      .63       -.49      .984      -.368     -.3       -.6
parameter3                        -.9367    -.6       -.8       .6        .37       .3        -.6       .36
parameter4                        .499      -.77      .367      -.34      -.399     .         -.39      -.48
parameter5                        .3        -.9       .899      .383      .4        -.7       -.6       .69
parameter6                        -.398     .6867     .47       .         -.4689    -.6       -.        .7
parameter7                        -.96      -.9       -.        .8        .63       .6        .9        -.96
parameter8                        .89       .84       .6        -.399     .3        .4        -.3       .6
eigenvalue                        3.78      .886      .4        .49       .46       .6        .867      .783
contribution rate (%)             46.84     3.33      .67       6.37      .766      .9449     .833      .979
Cumulative contribution rate (%)  46.84     69.84     84.784    9.96      9.997     97.9376   99.9
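The processing chain behind Tables 2 and 3 (standardization by Eqs. (5) to (7), correlation matrix, eigenvalues, contribution rates and parameter-to-component correlations) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation. It uses NumPy, synthetic eight-column data in place of the plant measurements, the sample standard deviation (`ddof=1`), and hypothetical helper names (`standardize`, `pca_on_standardized`, `n_components_for`). The 0.85 default mirrors the cumulative contribution criterion used in this paper.

```python
import numpy as np

def standardize(X, ddof=1):
    """Eqs. (5)-(7): subtract the column mean and divide by the column std."""
    mean = X.mean(axis=0)                  # Eq. (6)
    std = X.std(axis=0, ddof=ddof)         # Eq. (7); ddof=1 is an assumption
    return (X - mean) / std                # Eq. (5)

def pca_on_standardized(X):
    """PCA of standardized data, i.e. PCA of the correlation matrix."""
    Z = standardize(X)
    R = np.corrcoef(X, rowvar=False)       # correlation matrix between parameters
    lam, vecs = np.linalg.eigh(R)          # ascending eigenvalues
    order = np.argsort(lam)[::-1]          # largest variance first
    lam, vecs = lam[order], vecs[:, order]
    loadings = vecs * np.sqrt(lam)         # corr(parameter j, F_i); rows = parameters
    scores = Z @ vecs                      # new variables F_1 ... F_p
    rate = lam / lam.sum()                 # contribution rate, Eq. (3)
    cumulative = np.cumsum(rate)           # cumulative contribution rate, Eq. (4)
    return R, lam, loadings, scores, rate, cumulative

def n_components_for(cumulative, threshold=0.85):
    """Smallest k whose cumulative contribution rate reaches the threshold."""
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(cumulative))

# Synthetic stand-in for 8 correlated parameters in mixed units (not plant data).
rng = np.random.default_rng(7)
n, p = 400, 8
X = rng.normal(size=(n, 3)) @ rng.normal(size=(3, p)) + 0.3 * rng.normal(size=(n, p))
X = X * rng.uniform(0.5, 50.0, size=p) + rng.uniform(-10.0, 10.0, size=p)

R, lam, loadings, scores, rate, cumulative = pca_on_standardized(X)
k = n_components_for(cumulative)

print("eigenvalues           :", np.round(lam, 3))
print("contribution rate (%) :", np.round(100 * rate, 2))
print("cumulative rate (%)   :", np.round(100 * cumulative, 2))
print("components kept       :", k)
```

For standardized data the eigenvalues sum to the number of parameters, so each contribution rate is simply the eigenvalue divided by 8 here. The `loadings` array has the same layout as Table 3, with one row per parameter and one column per component.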

After standardizing the data, we applied principal component analysis to understand the relationships among the parameters. The correlation matrix is given in Table 2, and the results of the principal component analysis are given in Table 3 and Fig. 2. Table 3 shows that the cumulative contribution rate of the first three principal components is close to 85%. The first principal component has a large negative correlation with parameters 1, 2, 3 and 7, and it can be seen as the relationship between the power plant fuel and the load. The second principal component has a large negative correlation with parameter 4 and a large positive correlation with parameters 6 and 8. It can represent the relationship between the steam temperature and the steam pressure. The third principal component has a large positive correlation with parameter 5. As an internal parameter of the unit, the third component reflects the energy balance between the amount of steam produced by the boiler-turbine unit and the amount of steam used.

Fig. 2. Curves of the eight new variables after PCA (one panel per component, F1 to F8)

5. CONCLUSION

Based on the plant's requirements, knowledge extraction from the parameters of the unit operating data was studied in this paper. Principal component analysis was used to create new variables. Since the new variables are pairwise uncorrelated and reflect most of the original information, they can be used for the actual production units. The optimized parameter results can provide the basis for optimal operation and adjustment. At the same time, the smaller set of principal components can also be used for parameter data compression, on-line fault diagnosis and further data analysis. Because the computational complexity is reduced, the proposed method will improve computational efficiency and recognition accuracy to a certain extent. It will also play a bigger role in related fields.

6. ACKNOWLEDGEMENT

The authors thank the Education Department of Liaoning Province, China, for its support. The project number is L6.

REFERENCES

[1] Wang, P.H., Chen, Q., Dong, Y.H. et al., Data mining and its application of performance analysis in thermal power units [J], Automation of Electric Power System, 8 (8), 76-79 (4).
[2] Chen, X.Y., Zhang, X.H., Qu, F. et al., Survey on applications of data mining in power system, Journal of Electric Power Science and Technology, (3), -6 (7).
[3] Kapooria, R. K., Kumar, S. and Kasana, K.S., An analysis of a thermal power plant working on a Rankine cycle: A theoretical investigation, Journal of Energy in Southern Africa, 9(), 77-83 (8).
[4] Prasad, G., Swidenbank, E. and Hogg, B. W., A Local Model Networks Based Multivariable Long-Range Predictive Control Strategy for Thermal Power Plants, Automatica, 34(), 8-4 (998).
[5] Matsumura, S., Ogata, K., Fujii, S. et al., Adaptive control for the steam temperature of thermal power plants, Control Engineering Practice, (4), 67-7 (994).
[6] http://en.wikipedia.org/wiki/Principal_component_analysis
[7] Wold, S., Esbensen, K. and Geladi, P., Principal component analysis, Chemometrics and Intelligent Laboratory Systems, (-3), 37- (987).
[8] Kreinin, A., Merkoulovitch, L., Rosen, D. et al., Principal Component Analysis in Quasi Monte Carlo Simulation, Algo Research Quarterly, (), -9 (998).