Automated Statistical Recognition of Partial Discharges in Insulation Systems.

Massih-Reza AMINI, Patrick GALLINARI, Florence d'ALCHÉ-BUC
LIP6, Université Paris 6, 4 Place Jussieu, F-75252 Paris cedex 5
{Massih-Reza.Amini, Patrick.Gallinari, Florence.dAlche}@lip6.fr

François BONNARD, Edouard FERNANDEZ
Schneider Electric, Research Center, A2 plant, 4 rue Volta, F-38050 Grenoble 09
{francois_bonnard, edouard_fernandez}@mail.schneider.fr

Abstract

We present the development of the successive stages of a statistical system for a classical diagnosis problem in the domain of power systems. We show how the different steps may be designed in a statistically sound way in order to build a complete and efficient diagnosis system that provides the end user with a maximum of information about the system behaviour.

1 Introduction

We describe here the use of Neural Networks (NNs) and statistical techniques for a real-life application in the domain of power systems. This application, the detection of electrical Partial Discharges (PD) in power apparatus, has been a challenging problem for more than 20 years. Efforts to automate PD detection began with expert systems [1]. Later on, simple statistical models [2] and, more recently, Hidden Markov Models [3] and neural networks [4] have also been tested. However, most of this work has been exploratory and has addressed specific aspects of the problem. Some of the major statistical issues involved in the development of the successive processing steps were never considered: in most approaches, feature selection relies on expertise, validation of the results is missing, and databases are too small to allow a sound comparison of different classifiers and feature sets. In the study presented here, a complete and valid approach ranging from feature extraction to statistical validation is developed. This allows us to compare both the performance and the reliability of different systems, and to show the value of flexible methods such as neural networks in real-world problems.

The PD phenomenon in insulating systems is described in Section 2. Section 3 is devoted to feature extraction and selection, and Section 4 to the validation of the classifier. Experimental results are presented in Section 5.

2 Apparent charge detection

In high-voltage power systems, small defects introduced during manufacturing lead to the occurrence of Partial Discharges.

Their accumulation is recognized as one of the major sources of deterioration for power apparatus and can trigger breakdown. Commercially available PD detectors allow PD pulses to be measured and processed, and display the PD patterns on a time basis. Some of these detectors also offer basic diagnosis tools. The usual representation of these discharges is via 3-d fingerprints defined by the phase angle ϕ_i, the apparent charge q_j and the pulse count H_n(ϕ_i, q_j) (Figure 1, left). Other useful statistics may be derived from this representation (Figure 1, right): H_qmax(ϕ), the maximum pulse height distribution; H_qn(ϕ), the mean pulse height distribution; H_n(ϕ), the pulse count distribution; H(q), the number of discharges versus discharge magnitude; and H(p), the number of discharges versus discharge energy.

Figure 1: Left: 3-d representation of a PD; an entry n_ij represents the number of PD pulses having magnitude q_j and phase position ϕ_i. Right: some of the distributions which may be derived from the 3-d pattern, from top to bottom H_qmax(ϕ), H_qn(ϕ) and H_n(ϕ).

PDs are random phenomena and are influenced by several factors such as aging, voltage amplitude and the frequency of voltage application. PD measurements are corrupted by interference and intrinsic background noise. These 3-d patterns are thus complex and their characterization is a challenging task. The problem we tackle here is the detection and classification of PDs for apparatus insulated with SF6 gas. More precisely, we are interested in classifying PD signals into four classes: corona discharges (sharp points in an electric field), surface discharges (surface irregularities), floating parts (small particles in an electric field) and background measurement noise.

3 Feature extraction and selection

In the dielectric literature, features derived from the 3-d representation are used to characterize PDs. We have considered here some of the most promising feature sets in order to evaluate them; we have also performed variable selection on each set and on a combination of all these feature sets. The features we have considered are: fractal characteristics of the 3-d patterns; simple statistics over the H_qmax(ϕ), H_qn(ϕ), H_n(ϕ), H(q) and H(p) distributions; and the Fourier transform of H_qn(ϕ). We first briefly describe these methods below and then present variable selection.
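All of these feature sets start from the distributions introduced in Section 2. As a concrete illustration, the following minimal sketch derives them from a phase-by-charge count matrix; the array and function names are ours and purely illustrative, since the paper does not prescribe an implementation.

```python
import numpy as np

def derived_distributions(H, q):
    """Derive 1-D PD statistics from a 3-d fingerprint.

    H : (n_phase, n_charge) array of pulse counts H_n(phi_i, q_j)
    q : (n_charge,) array of apparent-charge bin centres [pC], non-negative
    """
    h_n = H.sum(axis=1)                              # H_n(phi): pulse count per phase window
    h_qmax = np.where(H > 0, q, 0.0).max(axis=1)     # H_qmax(phi): largest charge seen per phase window
    h_qn = np.divide(H @ q, h_n,                     # H_qn(phi): mean pulse height per phase window
                     out=np.zeros_like(h_qmax), where=h_n > 0)
    h_q = H.sum(axis=0)                              # H(q): number of discharges vs. magnitude
    return h_qmax, h_qn, h_n, h_q                    # H(p) additionally requires the test-voltage waveform
```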

3.1 Pre-processing

Fractal features. Fractal measures provide global characteristics of an image. We have used two fractal measures on the 3-d fingerprints, the fractal dimension and the lacunarity [5], which quantify respectively the irregularity and the denseness of an image surface.

Statistical operators. Useful information about the distributions H_qmax(ϕ), H_qn(ϕ), H_n(ϕ), H(q) and H(p) may be inferred from their moments. The following features have been found useful for PD detection [2], where $\mu_i$ denotes the i-th order central moment of the variable x for a given distribution: the skewness $Sk = \mu_3 / \mu_2^{3/2}$, which measures the degree of tilting (asymmetry) of a distribution; the kurtosis $Ku = \mu_4 / \mu_2^{2} - 3$, which measures its peakedness; the number of modes of the distribution; and other features which characterize the asymmetry of the charges.

Fourier analysis. The normalized charge distribution can be described by its spectral components in the frequency domain. The normalized average discharge distribution is given by
$$\tilde{H}_{qn}(\varphi) = \frac{a_0}{2} + \sum_{k=1}^{\infty} \left[ a_k \cos(k\varphi) + b_k \sin(k\varphi) \right],$$
where the $a_k$ and $b_k$ are the Fourier coefficients and ϕ is the phase angle. The first 8 spectral components $a_k$ and $b_k$ have been used here to approximate the charge distribution H_qn(ϕ).

3.2 Variable selection

Not all the measurements proposed by domain experts are equally informative; it is thus useful to test automatic selection methods on these variable sets. Several methods have been proposed for variable selection with NNs. We have chosen here a method proposed in [6]. The relevance of a set of input variables is defined as the mutual information between these variables and the corresponding desired outputs. This dependence measure is well suited for capturing the non-linear dependencies modelled by NNs. For two variables x and d, it is defined as
$$MI(x, d) = \sum_{x, d} P(x, d) \log \frac{P(x, d)}{P(x)\,P(d)}.$$
Starting from an empty set, variables are added one at a time, the variable $x_i$ selected at step i being the one which maximizes $MI(\{Sv_{i-1}, x_i\}, d)$, where $Sv_{i-1}$ is the set of the i-1 already selected variables and d the desired output. Selection is stopped when the MI increase falls below a fixed threshold. Densities were estimated with Epanechnikov kernels. For classification, we used multilayer perceptrons trained according to an MSE criterion plus a regularization term.
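To make the pre-processing of Section 3.1 concrete, here is a minimal sketch of the moment-based operators and of the truncated Fourier description of H_qn(ϕ). It assumes the distributions are sampled on a uniform phase grid covering one full voltage cycle; the helper names are ours, not the paper's.

```python
import numpy as np

def skewness_kurtosis(phi, h):
    """Sk = mu3 / mu2**1.5 and Ku = mu4 / mu2**2 - 3 of a distribution h(phi),
    treating h as (unnormalised) weights over the phase axis."""
    w = h / h.sum()
    mu = np.sum(w * phi)
    m2, m3, m4 = (np.sum(w * (phi - mu) ** k) for k in (2, 3, 4))
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

def fourier_features(h_qn, n_harmonics=8):
    """First n_harmonics pairs (a_k, b_k) of the Fourier expansion of H_qn(phi),
    assuming h_qn is sampled uniformly over one period."""
    c = np.fft.rfft(h_qn) / len(h_qn)
    a = 2.0 * c.real[1:n_harmonics + 1]
    b = -2.0 * c.imag[1:n_harmonics + 1]
    return np.concatenate([a, b])      # 16 values, matching the size of the Fourier feature set
```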

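The variable selection of Section 3.2 can be sketched in the same spirit. For simplicity, densities are estimated here with Gaussian kernels (scipy.stats.gaussian_kde) rather than the Epanechnikov kernels used in the paper, and the mutual information between a candidate subset X and the discrete class label d is computed as I(X; d) = H(X) − Σ_c P(c) H(X | d = c); the function names and the stopping threshold are illustrative.

```python
import numpy as np
from scipy.stats import gaussian_kde

def _entropy(X):
    """Resubstitution estimate of the differential entropy of the rows of X."""
    kde = gaussian_kde(X.T)                      # scipy expects shape (n_dims, n_points)
    return -np.mean(np.log(kde(X.T) + 1e-12))

def joint_mi(X, d):
    """I(X; d) = H(X) - sum_c P(c) H(X | d = c), with Gaussian kernel density
    estimates (the paper uses Epanechnikov kernels instead)."""
    h_x = _entropy(X)
    h_x_given_d = sum(np.mean(d == c) * _entropy(X[d == c]) for c in np.unique(d))
    return h_x - h_x_given_d

def greedy_mi_selection(X, d, threshold=0.01):
    """Forward selection: at each step add the variable that maximises the MI
    between the selected subset and the class label; stop when the MI gain
    falls below `threshold`."""
    selected = []
    remaining = list(range(X.shape[1]))
    best_mi = 0.0
    while remaining:
        mi, j = max((joint_mi(X[:, selected + [jj]], d), jj) for jj in remaining)
        if mi - best_mi < threshold:
            break
        selected.append(j)
        remaining.remove(j)
        best_mi = mi
    return selected
```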
4 Validation

The validation of classifier decisions is important for any real-world application. It includes both performance and confidence evaluation; we focus here on the latter. Two aspects of confidence are of interest: the computation of global confidence intervals, which allows a classifier to be calibrated, and the computation of a local confidence measure for each classifier decision.

For the global confidence intervals, we have used both a classical parametric estimate, which relies on the hypothesis of binomially distributed classifier outputs, and a bootstrap estimate [7]. For the latter, the quantity to be estimated is the percentage of correct classification for each class. We compute, for each bootstrap replicate i and output class k, the percentage of correct classification $\theta_{ik}$, the mean $\bar{\theta}_k$ of these estimates, their standard error $\sigma_k$, and the studentized values $z_{ik} = (\theta_{ik} - \bar{\theta}_k)/\sigma_k$. With $t_k^{(1-\alpha)}$ and $t_k^{(\alpha)}$ the left and right $\alpha$-th percentiles of the $z_{ik}$ distribution, the bootstrap-t confidence interval is
$$\left[\, \bar{\theta}_k - t_k^{(1-\alpha)}\sigma_k,\;\; \bar{\theta}_k + t_k^{(\alpha)}\sigma_k \,\right].$$
Note that the confidence could be computed in the same way for the global classification performance instead of the per-class performances.

For the local confidence, several heuristic measures have been proposed in the literature. Most of them assume that the estimates of the posterior class probabilities are accurate enough. We have used the following two estimates: (i) $\max_k y_k - \mathrm{max2}_k\, y_k$, where $\mathrm{max2}_k\, y_k$ is the second largest output value; and (ii) $1 + \frac{1}{\log p}\sum_{k=1}^{p} P(k\,|\,x)\log P(k\,|\,x)$, i.e. one minus the normalized entropy of the outputs, where p is the number of classes and $P(k\,|\,x)$ is the k-th output normalized so that the outputs sum to one.
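A minimal sketch of the bootstrap-t computation described above, following the interval formula as stated (replicates studentized with the overall standard error σ_k); the function name, the per-class resampling scheme and the number of replicates are our own illustrative choices.

```python
import numpy as np

def bootstrap_t_intervals(correct, labels, n_classes, n_boot=1000, alpha=0.05, seed=0):
    """Per-class bootstrap-t confidence intervals for the correct-classification
    rate (see Efron & Tibshirani [7]).

    correct : (N,) boolean array, True where the classifier decision was right
    labels  : (N,) integer array of true class indices
    """
    rng = np.random.default_rng(seed)
    intervals = {}
    for k in range(n_classes):
        idx = np.flatnonzero(labels == k)
        # theta_ik: correct-classification rate of class k in each bootstrap replicate
        theta = np.array([correct[rng.choice(idx, size=idx.size)].mean()
                          for _ in range(n_boot)])
        theta_bar, sigma = theta.mean(), theta.std(ddof=1)
        if sigma == 0.0:
            # Class always (or never) recognised: the interval degenerates to a
            # singleton, as observed for three of the four classes in Table 2.
            intervals[k] = (theta_bar, theta_bar)
            continue
        z = (theta - theta_bar) / sigma                          # studentized replicates z_ik
        t_lo, t_hi = np.percentile(z, [100 * alpha / 2, 100 * (1 - alpha / 2)])
        intervals[k] = (theta_bar + t_lo * sigma, theta_bar + t_hi * sigma)
    return intervals
```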

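The two local confidence measures can be computed directly from the raw network outputs, as in the sketch below; the function name is illustrative and the outputs are assumed non-negative.

```python
import numpy as np

def local_confidence(y):
    """Two per-decision confidence measures from Section 4.

    y : (N, p) array of raw network outputs, one row per example, assumed non-negative.
    Returns (margin, one_minus_entropy).
    """
    top2 = np.sort(y, axis=1)[:, -2:]
    margin = top2[:, 1] - top2[:, 0]                   # max_k y_k minus the second largest output
    p = y / y.sum(axis=1, keepdims=True)               # normalise so that the outputs sum to one
    logp = np.log(np.clip(p, 1e-12, None))             # guard against log(0)
    entropy = -np.sum(p * logp, axis=1)
    one_minus_entropy = 1.0 - entropy / np.log(y.shape[1])   # 1 + (1/log p) * sum_k P(k|x) log P(k|x)
    return margin, one_minus_entropy
```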
5 Results

5.1 Experiments

Data were generated using an experimental design plan in order to ensure a good representativity of the different classes with a minimum set of measurements. The design factors for the PD source discharges and the background noise were: measurement attenuation [0, 25] dB, gas pressure [1, 3] bar and test voltage [3.5, 64.1] kV. Simulations were carried out to compare the different pre-processing methods and to show that their combination leads to a satisfactory discrimination. Variable selection reduces the initial input dimension with a mean classification increase of about 1%. The number of variables is reduced from 28 to 12, from 16 to 7 and from 46 to 9, respectively, for the statistical operators, the Fourier components, and the combination of these two feature sets together with the two fractal measures. Performances are shown in Table 1; the feature combination plus variable selection offers a significant performance increase compared to the other feature sets.

Table 1: Performance (%) of the NN models retrained on the selected variables, for the different feature spaces.

Feature space           Architecture   Training   Test
Fractal dimension       2-6-4          88.34      81.25
Statistical operators   12-16-4        97.25      93.12
Fourier components      7-16-4         98.75      94.38
Combination             9-16-4         100        96.25

5.2 Validation

Table 2 gives the 95% confidence intervals for the different classes using the 9 variables selected from the combination of the three feature sets. The intervals have been computed using the binomial assumption and the t-bootstrap method. With the t-bootstrap method, the confidence interval can be a singleton when a class has been recognized with a 0% or a 100% correct classification rate.

Table 2: 95% confidence intervals (on the test set) for the 9 input variables selected from the combination set, computed with two methods (binomial assumption and t-bootstrap estimates).

Class                 Binomial assumption   t-bootstrap interval
Corona discharge      [80.12%, 95.41%]      [84.31%, 97.18%]
Surface discharge     [96.62%, 100%]        {100%}
Floating parts        [98.17%, 100%]        {100%}
Noise                 [96.62%, 100%]        {100%}
Global performance    [94.54%, 97.43%]      [94.16%, 97.18%]

Additional information about the reliability of a classifier is provided by confidence intervals computed directly on the outputs of the classifier, instead of on the classifier decisions as in Table 2. These intervals have been computed via t-bootstrap, and Figure 2 shows the corresponding box-plots for the combination of feature sets. In this case the precision of the output values is high, significantly higher than for the other feature sets.

Figure 2: Box-plots of the classifier outputs for the combination feature set (one box per class: corona discharge, surface discharge, floating parts, background noise). The bold dot in each box indicates the median value; the lower and upper edges mark the 2.5% and 97.5% percentiles.

Local confidence values have been computed with the two measures described in Section 4. Figure 3 shows the correct classification rate as a function of the percentage of rejected examples. These experiments have been performed on the selected combination set. For both estimates of the local confidence, performance increases rapidly with the reject threshold and reaches perfect recognition at 65% reject. These estimates allow the user to easily set the confidence level that fits the application.

Figure 3: Percentage of correct classification versus percentage of rejected examples, for the two confidence measures.
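The curve of Figure 3 can be reproduced from the local confidence values with a few lines; the confidence argument can be either of the two measures above, and the names are illustrative.

```python
import numpy as np

def accuracy_vs_reject(confidence, correct, reject_rates=(0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.65)):
    """Correct-classification rate on the retained examples as a function of the
    fraction of rejected (lowest-confidence) examples, as in Figure 3."""
    order = np.argsort(confidence)                  # least confident first
    n = len(confidence)
    curve = []
    for r in reject_rates:
        kept = order[int(round(r * n)):]            # reject the r fraction with lowest confidence
        curve.append((r, float(np.mean(correct[kept])) if kept.size else float("nan")))
    return curve
```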

6 Conclusion

We have described the development of the successive stages of a diagnosis system in the domain of power systems. Our goal was to illustrate how the different processing steps can be carried out in order to develop, validate and calibrate, as well as possible, an operational system for a challenging real-world problem. To this end, we have used at the different steps (feature pre-processing and selection, classification and validation) methods which are well suited to the problem and complementary to one another. Such software sensors could easily be implemented for several industrial diagnosis problems.

References

[1] Wootton RE. Computer Assistance for the Performance and Interpretation of H.V. AC Discharge Tests. 5th Int. Conf. on H.V. Eng., Braunschweig, 1987; 41.12.
[2] Gulski E. Computer-Aided Recognition of Partial Discharges Using Statistical Tools. PhD thesis, Delft University, 1991.
[3] Satish L, Gururaj BI. Use of Hidden Markov Models for Partial Discharge Pattern Classification. IEEE Trans. on Electrical Insulation, 1993; 28: 172-182.
[4] Gulski E, Krivda A. Neural Networks as a Tool for Recognition of Partial Discharges. IEEE Trans. on Electrical Insulation, 1993; 28: 984-1001.
[5] Satish L, Zaengl WS. Can Fractal Features be Used for Recognizing 3-d Partial Discharge Patterns? IEEE Trans. on Dielectrics and Electrical Insulation, 1995; 2: 352-359.
[6] Bonnlander BV, Weigend AS. Selecting Input Variables Using Mutual Information and Nonparametric Density Estimation. In Proceedings of ISANN 94, Taiwan, 1994; 42-50.
[7] Efron B, Tibshirani R. An Introduction to the Bootstrap. Chapman & Hall, London, 1993.