Abstract—A synthesized method for transformer fault diagnosis is presented in this paper. The model combines principal component analysis with the support vector machine. Firstly, principal component analysis extracts the characteristics of the sample data, retains the main information, and creates a new sample set. Then a support vector machine model is built and trained on the new sample set. The method combines the advantages of the two algorithms, and the accuracy of transformer fault diagnosis is improved when the sample information is noisy or incomplete. Experimental results show that the method is valid and feasible and has better diagnostic accuracy.

Keywords—Principal Component Analysis; Support Vector Machine; Transformer; Fault Diagnosis

I. INTRODUCTION

At present, three-ratio fault diagnosis based on dissolved gas in oil is one of the simplest and most effective methods for fault diagnosis of power transformers. However, the dissolved gas in transformer oil does not carry all of the fault information, so in recent years comprehensive diagnosis methods that combine dissolved gas analysis with the preventive test results of the transformer have been introduced into transformer fault diagnosis. Among them, the support vector machine, rough set theory and fuzzy neural network algorithms have achieved good results in transformer fault diagnosis. However, owing to the limitations of these algorithms themselves, it is difficult to overcome the problem that the sample data are easily polluted by noise and the information is incomplete.

Principal component analysis (PCA) is a data mining technique from multivariate statistics. The purpose of PCA is to reduce the dimension of the data and to convert a large number of indicators into a few comprehensive indexes without losing the main information, thereby excluding noise pollution and overlapping information among the numerous indicators [10-11]. The support vector machine (SVM) is a newer pattern recognition method. SVM adopts the principle of structural risk minimization, taking into account both the training error and the generalization ability, and it has advantages in solving small-sample, nonlinear, local-minimum and other pattern recognition problems [12-16].

This paper presents a synthesized method based on PCA and SVM for transformer fault diagnosis. Firstly, the method analyses the impact factors by PCA to achieve dimension reduction and noise filtering of the data; then the SVM is used to train and test the denoised data samples, and good results are obtained.

II. PCA-SVM THEORY

A. Principal Component Analysis

Assume that there are p attributes x_1, x_2, \ldots, x_p describing the various characteristics of the object under study, and let x = (x_1, x_2, \ldots, x_p)^T be a p-dimensional vector. Denote by \mu = E(x) and \Sigma = D(x) the mean vector and covariance matrix of x. Consider the linear transformation

y_1 = l_1^T x = l_{11} x_1 + l_{21} x_2 + \cdots + l_{p1} x_p
\cdots
y_p = l_p^T x = l_{1p} x_1 + l_{2p} x_2 + \cdots + l_{pp} x_p    (1)

We then get

Var(y_i) = l_i^T \Sigma l_i,   Cov(y_i, y_j) = l_i^T \Sigma l_j,   i, j = 1, \ldots, p    (2)

Using y_1 instead of the original p variables x_1, x_2, \ldots, x_p, with the variance of y_1 chosen to reflect as much as possible of the information contained in these variables, is the most classical approach: the greater Var(y_1) is, the more information y_1 contains. In order to avoid Var(y_1) \to \infty, the coefficient vectors l_i have to be constrained:

l_i^T l_i = 1,   i = 1, \ldots, p    (3)
Under constraint (3), the l_1 that maximizes Var(y_1) defines the first principal component y_1; under constraint (3) together with the additional constraint (4) that y_2 be uncorrelated with y_1, the l_2 that maximizes Var(y_2) defines the second principal component y_2. The third, fourth, etc. principal components are defined similarly.

Let \lambda_i be the eigenvalues of \Sigma and t_i the corresponding unit eigenvectors, and let the eigenvectors be used as the extraction directions. According to the theory of principal component analysis,

Var(y_i) = \lambda_i   and   \sum_{i=1}^{p} \lambda_i = \sum_{i=1}^{p} \sigma_{ii}

so the share of the variation of the original p variables that is reflected by the k-th principal component can be represented by the ratio \lambda_k / \sum_{i=1}^{p} \lambda_i, which is called the contribution of that principal component, and \sum_{i=1}^{m} \lambda_i / \sum_{i=1}^{p} \lambda_i is known as the cumulative contribution rate of the principal components y_1, y_2, \ldots, y_m.

In short, principal component analysis maps the original attributes onto one or a few mutually uncorrelated principal components while keeping most of the original attribute information. The information that is not represented by the principal components is removed as noise, because this part of the information is not important.
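As a concrete illustration of this selection rule, the sketch below (not taken from the paper; the function name, data shapes and the 0.90 threshold are assumptions) extracts the principal components from the covariance matrix of a sample matrix and keeps just enough of them to reach a chosen cumulative contribution rate.

```python
import numpy as np

def pca_by_contribution(X, threshold=0.90):
    """Keep the smallest number of principal components whose cumulative
    contribution rate reaches `threshold` (an assumed value)."""
    # Center the data and form the covariance matrix
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)

    # Eigen-decomposition, eigenvalues sorted in descending order
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Contribution of each component and cumulative contribution rate
    contribution = eigvals / eigvals.sum()
    cumulative = np.cumsum(contribution)
    m = int(np.searchsorted(cumulative, threshold)) + 1

    # Project the centered samples onto the first m principal components
    Y = Xc @ eigvecs[:, :m]
    return Y, eigvecs[:, :m], cumulative[:m]

# Example with made-up data: 200 samples, 14 attributes
X = np.random.rand(200, 14)
Y, components, cum_rate = pca_by_contribution(X, threshold=0.90)
print(Y.shape, cum_rate[-1])
```

In the example analysis of Section IV the retained components reach a cumulative contribution rate of 90.2%, which corresponds to a threshold of roughly 0.90 in this sketch.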
B. Support Vector Machine

At present the SVM has two main types of application, namely pattern recognition and regression analysis. In this paper we discuss the classification and recognition problem, which belongs to pattern recognition; without loss of generality, the classification problem can finally be reduced to a two-class problem. The goal is to learn, from known samples, a function that separates the two types of objects.

The given training set consists of samples from two categories:

S = \{(x_1, y_1), (x_2, y_2), \ldots, (x_l, y_l)\},   x_i \in R^n,   y \in \{-1, 1\}    (5)

After the samples are mapped into a high-dimensional feature space by \varphi(\cdot), the optimization problem is

\min  \frac{1}{2} \|w\|^2 + C \sum_{i=1}^{l} (\xi_i + \xi_i^*)
s.t.  y_i - w \cdot \varphi(x_i) - b \le \varepsilon + \xi_i
      w \cdot \varphi(x_i) + b - y_i \le \varepsilon + \xi_i^*
      \xi_i, \xi_i^* \ge 0,   i = 1, \ldots, l    (7)

In the above formula, C is the fault-tolerance penalty coefficient, C > 0, and \xi_i is the relaxation factor.

The dual of the above optimization problem is

\max L_D = -\frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*) K(x_i, x_j) - \varepsilon \sum_{i=1}^{l} (\alpha_i + \alpha_i^*) + \sum_{i=1}^{l} y_i (\alpha_i - \alpha_i^*)    (8)
s.t.  \sum_{i=1}^{l} (\alpha_i - \alpha_i^*) = 0,   0 \le \alpha_i, \alpha_i^* \le C

where K(x_i, x_j) = \varphi(x_i)^T \varphi(x_j) is called the kernel function, and SV and NSV denote the support vector set and the standard support vector set, respectively. The nonlinear classifier is then obtained as

f(x) = sgn\left( \sum_{x_i \in SV} (\alpha_i - \alpha_i^*) K(x, x_i) + b \right)    (9)

where b can be calculated according to the following formula, N_{NSV} being the number of standard support vectors:

b = \frac{1}{N_{NSV}} \left\{ \sum_{0 < \alpha_i < C} \left[ y_i - \sum_{x_j \in SV} (\alpha_j - \alpha_j^*) K(x_i, x_j) - \varepsilon \right] + \sum_{0 < \alpha_i^* < C} \left[ y_i - \sum_{x_j \in SV} (\alpha_j - \alpha_j^*) K(x_i, x_j) + \varepsilon \right] \right\}    (10)

Fig. 1 SVM model diagram
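Formulas (7)-(10) follow an ε-insensitive, regression-style formulation whose kernel expansion is finally thresholded by sgn in (9). The sketch below mirrors that structure with scikit-learn's SVR (which solves the dual internally) and signs its output to obtain the class label; the RBF kernel and the values of C, ε and γ are illustrative assumptions, not choices reported in the paper.

```python
import numpy as np
from sklearn.svm import SVR

# Toy two-class data with labels in {-1, +1}; shapes and values are
# illustrative and do not reproduce the paper's data set.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 10))
y_train = np.where(X_train[:, 0] + 0.5 * X_train[:, 1] > 0, 1.0, -1.0)
X_test = rng.normal(size=(50, 10))

# epsilon-insensitive SVM with RBF kernel K(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2);
# C, epsilon and gamma are assumed values.
model = SVR(kernel="rbf", C=10.0, epsilon=0.1, gamma=0.1)
model.fit(X_train, y_train)

# Equation (9): the predicted class is the sign of the kernel expansion
# sum_i (alpha_i - alpha_i*) K(x, x_i) + b, which SVR.predict evaluates.
y_pred = np.sign(model.predict(X_test))
print(y_pred[:10])
```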
C. PCA-SVM Fault Diagnosis

Fault diagnosis based on PCA-SVM can be divided into the following steps (a sketch of the whole pipeline is given after the list):

1) Use PCA to process the sample set: apply the linear transformation to the sample set, select the principal components according to the cumulative contribution rate, and obtain a new sample set by reducing the dimension of the transformed sample set to the selected principal components.

2) Train on the new sample set and construct the form of the input variables.

3) Feed the sample to be diagnosed into the support vector machine model and calculate its predicted value.

4) Map the obtained predicted value back to a fault diagnosis result, using the coefficients obtained in the principal component analysis step.
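The following sketch strings these four steps together. It is illustrative only: the data shapes, the fault-code mapping, the 0.90 contribution threshold and the SVM parameters are assumptions rather than values taken from the paper.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in data: 200 training and 50 test samples with 14 attributes and
# 4 fault classes (sizes and encodings are assumptions).
rng = np.random.default_rng(1)
X_train, X_test = rng.normal(size=(200, 14)), rng.normal(size=(50, 14))
y_train = rng.integers(0, 4, size=200)
fault_names = {0: "normal", 1: "thermal fault", 2: "low-energy discharge",
               3: "high-energy discharge"}  # hypothetical code-to-fault mapping

# Steps 1-2: PCA keeps enough components for ~90% cumulative contribution,
# then an RBF-kernel SVM is trained on the reduced samples.
model = make_pipeline(StandardScaler(),
                      PCA(n_components=0.90),
                      SVC(kernel="rbf", C=10.0, gamma="scale"))
model.fit(X_train, y_train)

# Steps 3-4: predict codes for the samples to be diagnosed and map them
# back to fault diagnosis results.
codes = model.predict(X_test)
diagnoses = [fault_names[int(c)] for c in codes]
print(diagnoses[:5])
```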
IV. EXAMPLE ANALYSIS

In this paper, based on the decision table provided in the literature [7], combined with transformer test data collected by the author in power production units, 250 samples were collected; 200 cases were taken as the training set and 50 cases as the test set.

After calculation, the number of principal components retained by the PCA analysis is determined to be 10, and the cumulative contribution rate is 90.2%. Experience shows that such a cumulative contribution rate retains the main information of the original attributes.

Owing to the limitations of experimental conditions, environment and personnel, the transformer test data actually obtained may deviate somewhat from the true values. In this paper, the incompleteness and the deviation of the information are understood as noise pollution of the attribute data. In order to simulate this kind of noise pollution, on the basis of the original test set, one known attribute value of each test sample was deliberately changed to form a noisy test set of 50 cases, and 2 to 4 known attribute values of each test sample were randomly changed to form a further test set of 500 cases. Table 4 presents the comparison of the results of several methods on the noise-polluted samples.
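The noise-injection procedure described above can be sketched as follows; the perturbation distribution (drawing replacement values from each attribute's observed range) and the array shapes are assumptions, since the paper does not specify how the altered attribute values are chosen.

```python
import numpy as np

def perturb_samples(X_test, k_min, k_max, n_copies, rng):
    """Randomly alter between k_min and k_max attribute values of each test
    sample to mimic noise pollution (the replacement rule is an assumption)."""
    lo, hi = X_test.min(axis=0), X_test.max(axis=0)
    noisy = []
    for _ in range(n_copies):
        Xn = X_test.copy()
        for row in Xn:
            k = int(rng.integers(k_min, k_max + 1))
            idx = rng.choice(row.size, size=k, replace=False)
            # Replace the chosen attributes with values drawn from the
            # observed range of those attributes.
            row[idx] = rng.uniform(lo[idx], hi[idx])
        noisy.append(Xn)
    return np.vstack(noisy)

rng = np.random.default_rng(2)
X_test = rng.normal(size=(50, 14))                  # 50 original test cases (illustrative)
noisy_50 = perturb_samples(X_test, 1, 1, 1, rng)    # 50 cases, one attribute changed
noisy_500 = perturb_samples(X_test, 2, 4, 10, rng)  # 500 cases, 2 to 4 attributes changed
print(noisy_50.shape, noisy_500.shape)
```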
From the results in Table 4 it can be seen that transformer fault diagnosis based on PCA-SVM deals with noisy data better than the SVM alone and the method of literature [8]; as the noise pollution gradually increases, that is, as more attribute values are modified, the advantage becomes more obvious.

V. CONCLUSIONS

A synthesized transformer fault diagnosis method combining principal component analysis and the support vector machine has been studied in this paper. The method overcomes the problem that transformer fault diagnosis sample data are easily contaminated by noise and that the information is incomplete, and it ensures a high diagnostic accuracy when there is noise pollution in the sample data, so the method has high practical value.

REFERENCES

[1] GB 7252-2001, Guide for the analysis and judgment of dissolved gases in transformer oil[S], 1987.
[2] CAI Jinding, WANG Shaofang, "Application of rough set theory in IEC-6059 three ratio fault diagnosis decision rules[J]", Proceedings of the CSEE, 2005, 25(11), pp.134-139.
[3] WANG Yong-qiang, LV Fang-cheng, LI He-ming, "Transformer fault diagnosis based on Bayesian network and DGA[J]", High Voltage Engineering, 2004, 30(5), pp.12-13.
[4] YU Jie, ZHOU Hao, "Fault diagnosis model of dissolved gas in transformer based on immune algorithm[J]", High Voltage Engineering, 2006, 32(3), pp.49-50.
[5] LU Gan-yun, CHENG Hao-zhong, DONG Li-xin, et al, "Fault identification of power transformer based on multi-layer SVM classifier[J]", Proceedings of the CSU-EPSA, 2005, 17(1), pp.19-22.
[6] LIU Na, GAO Wen-sheng, TAN Kexiong, "Fault diagnosis of power transformer based on combined neural network model[J]", Transactions of China Electrotechnical Society, 2003, 18(2), pp.83-86.
[7] MO Juan, WANG Xue, DONG Ming, et al, "Fault diagnosis method of power transformer based on Rough Set Theory[J]", Proceedings of the CSEE, 2004, 24(7), pp.162-167.
[8] ZHU Yong-li, WU Li-zeng, LI Xue-yu, "Synthesized diagnosis on transformer fault based on Bayesian classifier and Rough set[J]", Proceedings of the CSEE, 2005, 25(10), pp.159-165.
[9] WU Zhong-li, YANG Jian, ZHU Yong-li, "Fault diagnosis of transformer based on rough set theory and support vector machine[J]", 2010, 38(18), pp.80-83.
[10] Tipping M E, "Probabilistic principal component analysis[J]", Journal of the Royal Statistical Society, 1999, 61(3), pp.611-622.
[11] XIANG Dongjin, Practical multivariate statistical analysis[M], Wuhan: China University of Geosciences Press, 2005.
[12] Vapnik V, The Nature of Statistical Learning Theory[M], New York: Springer-Verlag, 1995.
[13] DENG Naiyang, TIAN Yingjie, SVM - A new method in data mining[M], Beijing: Science Press, 2002.
[14] Nello Cristianini, John Shawe-Taylor, An Introduction to Support Vector Machines and Other Kernel-based Learning Methods[M], Beijing: Publishing House of Electronics Industry, 2004.
[15] S. K. Shevade, S. S. Keerthi, C. Bhattacharyya and K. R. K. Murthy, "Improvements to the SMO Algorithm for SVM Regression", IEEE Transactions on Neural Networks, 2000, 11(5), pp.1188-1193.
[16] Danian Zheng, Jiaxin Wang, Yannan Zhao, "Non-flat function estimation with a multi-scale support vector regression[J]", Neurocomputing, 2006, 70, pp.420-429.