DGA Study
Article
Power Transformer Fault Diagnosis Based on Improved BP
Neural Network
Yongshuang Jin1,2 , Hang Wu 1 , Jianfeng Zheng 1,3, * , Ji Zhang 4, * and Zhi Liu 2
1 School of Mechanical Engineering and Rail Transit, Changzhou University, Changzhou 213164, China;
[email protected] (Y.J.); [email protected] (H.W.)
2 Jiangsu Yude Xingyan Intelligent Technology Co., Ltd., Changzhou 213164, China; [email protected]
3 Jiangsu Province Engineering Research Center of High-Level Energy and Power Equipment,
Changzhou University, Changzhou 213164, China
4 School of Computer Science and Artificial Intelligence, Changzhou University, Changzhou 213164, China
* Correspondence: [email protected] (J.Z.); [email protected] (J.Z.);
Tel.: +86-137-7510-9878 (J.Z.); +86-0519-8633-0287 (J.Z.)
Abstract: Power transformers are complex and extremely important pieces of electrical equipment
in a power system, playing an important role in changing voltage and transmitting electricity. Their
operational status directly affects the stability and safety of power grids, and once a fault occurs, it
may lead to significant economic losses and social impacts. Traditional detection methods rely on
the technical level of power system operation and maintenance personnel, and are based on Dissolved
Gas Analysis (DGA) technology, which analyzes the components of dissolved gases in transformer oil
for preliminary fault diagnosis. However, with the increasing accuracy and intelligence requirements
for transformer fault diagnosis in power grids, the DGA analysis method alone can no longer meet
these requirements. Therefore, this article proposes an improved transformer fault diagnosis method
based on a residual BP neural network. This method deepens the BP neural network by stacking multiple
residual network modules, and fuses and expands gas feature information through an improved
BP neural network. In the improved residual BP neural network, SVM is introduced to judge the
extracted feature vectors at each layer, screen out feature vectors with high accuracy, and increase
their weights. The feature vector with the highest cumulative weight is selected as the input for
transformer fault diagnosis. This method utilizes multi-layer neural network mapping to extract
gas feature information with more significant feature differences after fusion expansion, thereby
effectively improving diagnostic accuracy. The experimental results show that, compared with
traditional BP neural network methods, the proposed algorithm has higher accuracy in transformer
fault diagnosis, with an accuracy rate of 92%, which can help ensure the sustainable, normal, and safe
operation of power grids.
Keywords: transformer; fault diagnosis; improved BP neural network
Citation: Jin, Y.; Wu, H.; Zheng, J.; Zhang, J.; Liu, Z. Power Transformer Fault Diagnosis Based on
Improved BP Neural Network. Electronics 2023, 12, 3526. https://doi.org/10.3390/electronics12163526
Academic Editors: Hyeongon Park, Chun-Kwon Lee and Seung Jin Chang
the power grid through the step-up transformer, and each region and system are intercon-
nected through the transformer. On the power side, the voltage is reduced to the required
voltage level through the step-down transformer [7]. Therefore, the running state of the
transformer directly affects the reliable operation of the power grid system, and plays a
vital role in the normal power supply of people’s daily life [8,9].
Due to defects arising from the transformer operating environment, transportation and
installation [10,11], a transformer inevitably has some local defects, such as bubbles,
cracks, electrode burrs, long-term deterioration and aging insulation; therefore,
failures occur from time to time [12]. IEC 60599:2007, the “Interpretation guidelines for
dissolved gas analysis of electrical equipment in operation with mineral insulating oils”,
reduced the original nine typical faults to six, namely, partial discharge, low-energy discharge,
high-energy discharge, low-temperature overheating, medium-temperature overheating,
and high-temperature overheating. Medium-temperature and low-temperature overheating
can be combined into a single category, called medium-to-low-temperature overheating,
resulting in a total of five fault types. In an analysis of the types of transformer faults
in domestic scenarios [13], overheating faults have the highest probability of occurrence,
followed by high-energy discharge faults, low-energy discharge faults, and partial
discharge faults; the lowest occurrence rate is for faults caused by transformer
dampness or partial discharge.
The transformer plays an important role in the power grid system. If failure or acci-
dents occur during operation, they will affect the power quality, damage related equipment,
and cause personal injury and other malignant accidents. For example, in 2005, the winding
insulation of Laiwu Station (110 kV) of Shandong Power Grid was seriously damaged
due to an inter-turn short-circuit discharge fault. With the continuous development of
science and technology, the power department attaches more and more importance to
transformer failures and hidden dangers [14], and adopts methods such as inspection
and elimination to reduce the probability of transformer failures, but it is still difficult to
avoid them. Therefore, it is urgent to study transformer fault diagnosis
and operation monitoring, so that the power department can find existing faults and
hidden dangers as early as possible, deal with problems
at an early stage, and extend the service life of the transformer. With the continuous progress of science and
technology, the capacity of a single transformer is increasing, its internal structure is becom-
ing more and more complex, and the mutual influence and interaction between different
internal components are becoming closer, which increases the difficulty of transformer fault
diagnosis to some extent [15,16]. By improving the accuracy of transformer fault diagnosis
methods, it is possible to promptly detect and repair faults, ensuring the reliability and
stability of the electricity supply.
Currently, Dissolved Gas Analysis (DGA) technology for transformer oil has
become relatively mature [17]. Based on DGA technology, some traditional diagnostic
methods, such as the IEC three-ratio method, modified three-ratio method, and Rogers ratio
method, have been applied in practical transformer fault diagnosis. However, these tradi-
tional diagnostic methods inevitably have certain flaws, such as limited usage conditions
and incomplete ratio encoding [18,19]. With the development of intelligent algorithms,
most researchers have started combining DGA diagnostic results with intelligent diagnostic
methods to improve the accuracy of transformer fault prediction. Common intelligent
diagnostic methods include the BP neural network [20] and the Support Vector Machine
(SVM) model [21], etc.
Reference [22] proposes a method that combines neural networks with the three-ratio
method, passing samples misdiagnosed by the neural network to the three-ratio method
for diagnosis. However, the accuracy of neural network judgments depends
on the selection of weights and thresholds, and requires a large amount of training data,
making the operation complex and the stability insufficient. Moreover, the optimal
threshold is prone to change with different quantities of sample data. Xu Xin et al. utilized locust
swarm optimization to optimize certain parameters of the BP neural network, improving
its speed and search abilities, but its network performance was poor and the learning rate
was unstable [23]. In [24], an intelligent transformer fault diagnosis method
combining the empirical wavelet transform and an improved convolutional neural network is
proposed. The results show that this diagnostic model can effectively identify the fault states
of transformers. When using the SVM model for fault diagnosis, the kernel function and
penalty factor in the model limit its classification performance, and improper values can lead to
significant prediction errors. Therefore, various bio-inspired optimization algorithms [25]
have been introduced to enhance the predictive ability of the SVM model for transformer faults.
In practical applications, the aforementioned methods can avoid the cumbersome steps and
absolute diagnosis issues of traditional methods, but further optimization of training
accuracy and improvements in adaptability are still needed.
In addition, during the daily operation of the transformer, the failure is an accidental
event, and the probability is small, so it is difficult to obtain a large amount of fault
data [26]. The lack of samples and the diversity of faults further increase the difficulties
in transformer fault diagnosis and make it difficult to diagnose and predict transformer
faults only by human experience. Accurate transformer fault diagnosis methods can assist
maintenance personnel in quickly identifying the type and location of faults, thereby
improving the efficiency of troubleshooting. This is crucial for power grids because rural
areas typically have limited human and material resources, and the efficiency of fault-
handling directly affects the normal operation of the grid. When fault diagnosis is carried
out using artificial intelligence, sufficient historical data are required. Only with
large and comprehensive data can the accuracy and practicability of the whole diagnosis
system be guaranteed; with limited information, such methods suffer from low diagnostic
accuracy [27–29]. Although the diagnostic methods mentioned
above have achieved certain results in the diagnosis of transformer faults, the overall
diagnostic accuracy is still insufficient. In particular, diagnosis methods
based on a traditional BP neural network still suffer from low diagnostic accuracy in
shallow models and poor diagnostic accuracy when there are few sample data.
The method proposed in this paper is based on the residual backpropagation (BP) neural
network, and its purpose is to improve the accuracy and reliability of transformer fault
diagnosis, ensuring the stability and safe operation of the power grid. Each residual network
module consists of two layers of BP neural networks. Residual learning is used in place of the
identity mapping learning of the conventional BP neural network. In addition, the input
information of each residual network module can be passed directly across the layers of the
module to better extract feature information from transformer fault data. In the improved
residual BP neural network, a support vector machine (SVM) is introduced to evaluate the
feature vector extracted by each layer; the vector with high diagnostic accuracy is selected
and its weight is increased. Finally, the feature vector with the highest cumulative weight
is selected for transformer fault diagnosis. Therefore, the improved residual BP neural
network model exhibits an excellent diagnostic performance even with few sample data.
In conclusion, the improved residual backpropagation (BP) neural network trans-
former fault diagnosis method proposed in this paper effectively addresses the cumber-
some steps and absolutization issues present in traditional methods, while maintaining
high training accuracy and adaptability. The fault diagnosis based on the residual BP neural
network makes the diagnosis of transformer faults easier, significantly reducing mainte-
nance frequency, minimizing repair costs, shortening planned outage time for transformers,
saving resources in terms of manpower and materials, alleviating maintenance burden,
and enhancing the operational reliability of transformers [30].
Relevant data indicate that implementing fault diagnosis measures under transformer
operating conditions can lead to a reduction in annual maintenance costs of 25–50% and a
decrease in fault-related downtime of 75% [31,32]. A survey conducted in the UK on 2000
state-owned projects showed that adopting fault diagnosis technology could save up to
300 million pounds in annual maintenance costs [33]. Therefore, adopting the residual BP
neural-network-based fault diagnosis method for power transformers can effectively pro-
decompose, producing a large amount of CO2 and CO gas, and a small amount of water and
hydrocarbon gas.
(c) Other sources
The gas produced by the transformer is not necessarily caused by transformer failure;
there are also some objective reasons for the production of gas, such as the environment,
weather and manufacturing process. When the temperature is high, the trace oxygen (O2)
dissolved in the oil reacts with the paint of the internal devices of the transformer to produce
a certain amount of H2. Trace amounts of water (H2O) dissolved in the oil react with the
ferrous components inside the transformer to produce a small amount of H2. During
later maintenance and repair of the transformer, gases such as carbon dioxide (CO2) in the
air are also absorbed by the transformer’s insulating oil, and the oil also produces
certain gases under sunlight.
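$$ \gamma_a = \frac{C_{i,2} - C_{i,1}}{\Delta t} \cdot \frac{m}{p} \tag{1} $$
$$ \gamma_{\gamma} = \frac{C_{i,2} - C_{i,1}}{C_{i,1}} \cdot \frac{1}{\Delta t} \cdot 100\% \tag{2} $$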
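As a hedged illustration of Equations (1) and (2), the sketch below computes an absolute and a relative gas production rate. The excerpt does not define the symbols, so the reading used here is an assumption: C_{i,1} and C_{i,2} are taken as the concentrations of gas i at two successive samplings, Δt as the interval between them, m as the total oil mass and p as the oil density. The function names are hypothetical.

```python
def absolute_gas_rate(c1, c2, dt, oil_mass, oil_density):
    """Equation (1): (C_i2 - C_i1) / dt * (m / p). Symbol meanings are assumed."""
    if dt <= 0 or oil_density <= 0:
        raise ValueError("dt and oil_density must be positive")
    return (c2 - c1) / dt * (oil_mass / oil_density)


def relative_gas_rate(c1, c2, dt):
    """Equation (2): (C_i2 - C_i1) / C_i1 * (1 / dt) * 100, in percent per unit time."""
    if dt <= 0 or c1 <= 0:
        raise ValueError("dt and c1 must be positive")
    return (c2 - c1) / c1 / dt * 100.0
```

For example, a concentration rising from 50 to 60 µL/L over two sampling intervals gives a relative rate of 10% per interval under this reading.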
When a latent fault occurs in the transformer and gas is generated at a high rate,
bubbles form, and some of the gas in these bubbles dissolves into the transformer
oil. The smaller the bubble and the greater the viscosity of the oil, the more
slowly the bubble rises, so the bubble stays in fuller contact with the transformer
oil and the gas exchange is more complete. Some losses also occur when the
gas is dissolved in the transformer oil by diffusion and convection. For example, due to
temperature differences, gas is transferred from inside the transformer tank
to the oil surface and the storage tank. When the transformer is operating, the gases adsorbed by
these solids are redissolved in the transformer oil.
Serial Number | Nature of the Failure | Characteristic Gas Characteristics
1 | General overheating failure | Total hydrocarbons are too high; C2H2 < 5 µL/L
2 | Severe overheating failure | Total hydrocarbons are high; C2H2 > 5 µL/L, but C2H2 does not constitute the main component of total hydrocarbons, and the H2 content is high
3 | Partial discharge | Total hydrocarbons are not high; H2 > 100 µL/L; CH4 accounts for the main component of total hydrocarbons
4 | Patialdi | Total hydrocarbons are not high; C2H2 > 10 µL/L; Totahid Rocabens–Anotte Sea
5 | Arcing | The total hydrocarbon content is high, the H2 content is high, and the C2H2 content is high, but it is not the main component
As shown in the above table, when a gas ratio is less than 0.1, the codes
of the three ratios are 0, 1, and 0, respectively. When a ratio is
greater than or equal to 0.1 and less than 1, the codes are 1, 0 and 0, respectively.
When a ratio is greater than or equal to 1 and less than 3, the
codes are 1, 2 and 1, respectively. When a ratio is greater than or equal
to 3, the codes are 2, 2, and 2, respectively. When using the three-ratio method for
transformer fault diagnosis, it is necessary to obtain the three gas ratio codes according to
Table 7, and then compare the code combination with the fault types.
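The coding rule just described can be sketched as a small lookup. This is a minimal illustration, not the authors' implementation: Table 7 is not reproduced in this excerpt, so the ordering of the three ratios (C2H2/C2H4, CH4/H2, C2H4/C2H6, the ratios used by the IEC three-ratio method) is an assumption, and the function name `ratio_codes` is hypothetical.

```python
def ratio_codes(r_c2h2_c2h4, r_ch4_h2, r_c2h4_c2h6):
    """Return the three ratio codes for the ranges described in the text."""
    # (upper bound of range, codes for the three ratios in the assumed order)
    table = [
        (0.1, (0, 1, 0)),           # ratio < 0.1
        (1.0, (1, 0, 0)),           # 0.1 <= ratio < 1
        (3.0, (1, 2, 1)),           # 1 <= ratio < 3
        (float("inf"), (2, 2, 2)),  # ratio >= 3
    ]

    def code(value, position):
        for upper, codes in table:
            if value < upper:
                return codes[position]
        raise ValueError("ratio must be a finite number")

    ratios = (r_c2h2_c2h4, r_ch4_h2, r_c2h4_c2h6)
    return tuple(code(v, i) for i, v in enumerate(ratios))
```

The resulting code triple would then be compared with the fault-type combinations, which are not listed in this excerpt.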
2. No coding ratio method
The IEC three-ratio method is a simple and convenient transformer fault diagnosis
method, but the fault code is sometimes missing, so the
coding information corresponding to the fault state cannot be found. To overcome
this shortcoming of the IEC three-ratio method, many experts and scholars at
home and abroad have proposed the “no coding ratio method” by simulating a large number
of actual cases of transformer failure. The troubleshooting procedure of the no coding ratio method is
shown in Table 8.
Table 9.
Algorithm | Characteristics
Characteristic gas method | Can visually determine whether there is a latent fault, but cannot determine the type and status of the fault
Ratio method | Simple and convenient, but codes may be missing, so the coding information corresponding to the fault state cannot be found
Fuzzy theory | Deals better with the uncertainty and ambiguity between fault types and symptoms, but the membership function is determined based on experience; there is more human intervention, and a convincing objective basis is lacking
Expert system | A large amount of experimental data and monitoring information can be comprehensively evaluated and analyzed, but expert knowledge is difficult to express in rules, and the reasoning of expert systems has some uncertainties
Support vector machine | Mainly used for binary classification problems, and not effective for multi-classification problems
BP neural network | There is a clear input–output relationship, and it has a good effect in multi-classification
According to the definition of the feature matrix, feature vector and sample label, it
is assumed that the sample points are linearly separable. Using the distance
formula in n-dimensional space, the distance from the sample point $x$, taken as a support vector
in Figure 2, to the decision hyperplane $l_1: w^{T} x + b = 0$ is:
$$ d = \frac{|w^{T} x + b|}{\|w\|} \tag{4} $$
where $\|w\| = \sqrt{w_1^2 + w_2^2 + \cdots + w_n^2}$ is the modulus of the weight $w$, and $b$ is the
hyperplane intercept. Let the sample points with label $y = 1$ lie above the decision boundary in Figure 2
and those with label $y = -1$ lie below it. Because the distance of every sample point $x_i$
from the decision boundary should be at least $d$, according to Equation (4):
$$ \begin{cases} \dfrac{w^{T} x_i + b}{\|w\|} \ge d, & \forall y_i = 1 \\ \dfrac{w^{T} x_i + b}{\|w\|} \le -d, & \forall y_i = -1 \end{cases} \tag{5} $$
Divide both sides of the inequalities in Equation (5) by $d$;
since $\|w\|$ and $d$ are constants, we can set $w_d^{T} = \frac{w^{T}}{\|w\| d}$ and $b_d = \frac{b}{\|w\| d}$, so that Equation (5)
is equivalent to:
$$ \begin{cases} w_d^{T} \cdot x_i + b_d \ge 1, & \forall y_i = 1 \\ w_d^{T} \cdot x_i + b_d \le -1, & \forall y_i = -1 \end{cases} \tag{6} $$
As can be seen from Equation (6), in Figure 2, $l_2: w_d^{T} x + b_d = 1$ and $l_3: w_d^{T} x + b_d = -1$. Applying
the same scaling to the expression of the hyperplane $l_1$ gives
$l_1: w_d^{T} x + b_d = 0$. Whether the support vector $x$ lies on the hyperplane $l_2$ or $l_3$,
$|w_d^{T} x + b_d| = 1$, so maximizing $d$ is equivalent to minimizing the norm of the rescaled weight
vector, since $\|w_d\| = 1/d$. The optimization objective of SVM is therefore to minimize $\|w\|$,
that is, $\min \|w\|$. Simplifying Equation (6) yields $y_i (w_d^{T} x_i + b_d) \ge 1$; in summary, the core
idea of the SVM algorithm is the following constrained optimization problem:
$$ \begin{cases} \min: \|w\| \\ \text{subject to } y_i \left(w_d^{T} x_i + b_d\right) \ge 1 \end{cases} \tag{7} $$
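The rescaling step between Equations (5) and (6) can be checked numerically. The sketch below is only an illustration with hypothetical function names: it takes a hyperplane (w, b) and one support vector, computes the distance d using the absolute-value form of Equation (4), and returns the rescaled pair (w_d, b_d) for which the support vector has decision value ±1 and the norm of w_d equals 1/d.

```python
import math


def decision_value(w, b, x):
    """Compute w^T x + b for plain Python lists."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def rescale_hyperplane(w, b, support_vector):
    """Return (w_d, b_d, d) with w_d = w / (||w|| d) and b_d = b / (||w|| d)."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    d = abs(decision_value(w, b, support_vector)) / norm  # distance to the hyperplane
    scale = norm * d
    return [wi / scale for wi in w], b / scale, d
```

With w = [3, 4], b = -5 and the support vector [3, 4], the distance is 4, and the rescaled hyperplane gives a decision value of 1 at that point.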
For the above optimization objective function, the primary task of SVM is to ensure
that the decision hyperplane completely separates the two types of feature samples.
However, the generalization ability of the model under this goal is limited, and the common
problem of linearly inseparable data samples cannot be solved; therefore, the optimization
objective function of SVM should be transformed into the following form:
Its aim is to introduce a slack variable $\zeta$ to shift the hyperplane $l_2$ down ($l_3$ up),
thereby increasing the fault tolerance of the model to the training data set and improving the
generalization ability of the model. The slack variable $\zeta = [\zeta_1, \zeta_2, \cdots, \zeta_i, \cdots, \zeta_m]$ corresponds
to the $m$ feature vectors in the feature matrix $X$, and its components are calculated as follows:
$$ \zeta_i = \max\left(0,\ 1 - y_i \left(w_d^{T} x_i + b_d\right)\right) \tag{9} $$
At the same time, to limit the fault tolerance of the model, $\sum_{i=1}^{m} \zeta_i$ is introduced
into the optimization objective function as the regularization term. $C$ in Equation (8)
is the weight hyperparameter; its value can be adjusted dynamically
by an intelligent optimization algorithm to balance the
proportions of $\|w\|$ and $\sum_{i=1}^{m} \zeta_i$ in the optimization process.
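Equation (9) is the hinge form of the slack term, and a short sketch makes its behavior concrete. The function name and the toy values are hypothetical; samples are plain lists and labels are +1 or -1.

```python
def slack_variables(w, b, samples, labels):
    """Equation (9): zeta_i = max(0, 1 - y_i * (w^T x_i + b))."""
    slacks = []
    for x, y in zip(samples, labels):
        margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
        slacks.append(max(0.0, 1.0 - margin))
    return slacks
```

A sample beyond its margin gets zero slack, a sample inside the margin gets a value between 0 and 1, and a misclassified sample gets a value above 1, which is why the sum of the slacks measures how much the separation constraints are violated.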
The implementation logic of the linear SVM classifier is given above. However, for
transformer fault classification, the distribution of data samples in the feature
space is more complex, so a linear classifier cannot achieve the best classification effect.
It is therefore necessary to use a dimension-raising method to map the originally linearly inseparable
data into a higher-dimensional feature space to achieve linear separability. To do this, the
concept of the kernel function needs to be introduced.
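In Equation (8), by solving the Lagrangian dual problem, the objective function of the
optimization problem can be transformed into the following form:
$$ \begin{cases} \max: \sum_{i=1}^{m} \alpha_i - \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} \alpha_i \alpha_j y_i y_j x_i^{T} x_j \\ \text{subject to } 0 \le \alpha_i \le C \text{ and } \sum_{i=1}^{m} \alpha_i y_i = 0 \end{cases} \tag{10} $$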
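To make the dual form in Equation (10) concrete, the following sketch evaluates the objective for a given multiplier vector and checks the two constraints. It assumes the plain inner product between samples (a linear kernel); the function names are hypothetical, and no solver is included.

```python
def dual_objective(alpha, labels, samples):
    """Value of the objective in Equation (10), using the inner product x_i^T x_j."""
    m = len(alpha)
    total = sum(alpha)
    for i in range(m):
        for j in range(m):
            dot = sum(a * b for a, b in zip(samples[i], samples[j]))
            total -= 0.5 * alpha[i] * alpha[j] * labels[i] * labels[j] * dot
    return total


def is_feasible(alpha, labels, C, tol=1e-9):
    """Check 0 <= alpha_i <= C and sum_i alpha_i * y_i = 0."""
    in_box = all(-tol <= a <= C + tol for a in alpha)
    balanced = abs(sum(a * y for a, y in zip(alpha, labels))) <= tol
    return in_box and balanced
```

Replacing the inner product with a kernel function would give the nonlinear variant mentioned in the text.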
The first-layer residual network first weights its input data x and applies the ReLU
activation function to obtain the output F1(x), and the output of the
second-layer residual network F2(x) is as follows:
where S is the number of neurons in layer (l-1) of the hidden layer, and α1 is the feature vector
output by layer l of the hidden layer.
Taking the residual network modules V and VI as an example, when the two hidden
layers are in the same residual network module, the output feature vector $\alpha_1$ is $F_{2,5}(x)$.
When the two hidden layers are in different network modules, the output feature vector
$\alpha_1 = R[w_3 F_{3,5}(x) + b_5 + F_4(x)]$ is calculated by Equation (12), that is, the output of the
module, $[w_3 F_{3,5}(x) + b_5]$, plus the input $F_4(x)$ of the module.
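The structure described here, a module output added to the module input before the ReLU activation R, can be sketched as a generic two-layer residual block. This is an assumption-laden illustration rather than the paper's network: Equations (11) to (13) are not reproduced in this excerpt, the layer sizes are arbitrary, and all names are hypothetical.

```python
def relu(v):
    return [max(0.0, x) for x in v]


def dense(weights, bias, v):
    """Fully connected layer; weights is a list of rows, one per output neuron."""
    return [sum(w * x for w, x in zip(row, v)) + b for row, b in zip(weights, bias)]


def residual_module(x, w1, b1, w2, b2):
    """Two-layer module with a skip connection: R[w2 * R(w1 x + b1) + b2 + x]."""
    hidden = relu(dense(w1, b1, x))
    pre_activation = dense(w2, b2, hidden)
    return relu([p + xi for p, xi in zip(pre_activation, x)])
```

If both layers have zero weights and biases, the module reduces to R(x), which shows how the skip connection lets a stacked module fall back to passing its input through.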
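The backpropagation of the error is shown in Equation (14), which updates $\omega$ and $b$
according to the stochastic gradient descent method.
$$ \begin{cases} \Delta\omega = -\alpha \dfrac{\partial E}{\partial \omega}, & \omega = \omega + \Delta\omega \\ \Delta b = -\alpha \dfrac{\partial E}{\partial b}, & b = b + \Delta b \end{cases} \tag{14} $$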
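A single update of Equation (14) can be illustrated on the simplest possible case. The sketch assumes one linear neuron with scalar weight and bias and the squared error E = 0.5 (ωx + b - target)², which is a choice made here for illustration only; `lr` plays the role of the learning rate α.

```python
def sgd_step(w, b, x, target, lr):
    """One update of Equation (14) for E = 0.5 * (w * x + b - target) ** 2."""
    error = w * x + b - target
    grad_w = error * x      # dE/dw
    grad_b = error          # dE/db
    delta_w = -lr * grad_w  # Delta w = -alpha * dE/dw
    delta_b = -lr * grad_b  # Delta b = -alpha * dE/db
    return w + delta_w, b + delta_b
```

Starting from w = 0 and b = 0 with x = 2, target 4 and learning rate 0.1, one step moves the prediction from 0 to 2, halving the error.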
(b) Extract the feature vectors α11 and α12 of module V and module VI, together with their corresponding
category labels, in the improved BP neural network to form the new training sets (α11, Y) and
(α12, Y).
(c) Use the new training sets formed in step (b) to train the models TSVM1 and TSVM2,
respectively; the trained models are SVM1 and SVM2.
(d) Input the validation set data (Xval, Yval) into the network to extract the feature vectors
vα11 and vα12 of residual modules V and VI, respectively, and then use
SVM1 and SVM2 to diagnose the feature vectors vα11 and vα12, respectively, and output
the corresponding accuracies P11 and P12.
(e) If P11 > P12, calculate the weights of the feature vectors according to Equation (15).
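$$ \omega(P_{11}, P_{12}) = \begin{cases} \omega_{11} = e^{\frac{\left(\frac{P_{11}}{P_{12}} - \mu\right)^{2}}{\sigma^{2}}} \\ \omega_{12} = e^{-\frac{\left(\frac{P_{12}}{P_{11}} - \mu\right)^{2}}{\sigma^{2}}} \end{cases} \tag{15} $$
where $\omega_{11}$ and $\omega_{12}$ are the feature vector weights; $P_{11}$ and $P_{12}$ are the accuracies; $\mu$
corresponds to the average; and $\sigma^{2}$ corresponds to the variance.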
(f) Update the feature vectors α11 and α12 of modules V and VI in step (b) to w′11 α11 and
w′12 α12, according to the new weights w′11 and w′12 obtained in step (e).
(g) According to the expected output yi of the i-th group of samples at the output layer,
calculate the error e; see Equation (16).
$$ e = \frac{1}{2n} \sum_{n} \left\| y_i - z_{y_i} \right\|^{2} \tag{16} $$
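Steps (e) and (f) can be sketched as follows. Equation (15) is damaged in the available text, so the form used here is only one reading of it: the more accurate feature vector gets exp((P11/P12 - µ)²/σ²), which is at least 1, and the less accurate one gets exp(-(P12/P11 - µ)²/σ²), which is at most 1. The excerpt does not say over which values µ and σ² are computed, so they are passed in as parameters, and all names are hypothetical.

```python
import math


def feature_weights(p11, p12, mu, sigma2):
    """Weights for the case P11 > P12, under the assumed reading of Equation (15)."""
    if not p11 > p12 > 0:
        raise ValueError("this sketch only covers P11 > P12 > 0")
    w11 = math.exp((p11 / p12 - mu) ** 2 / sigma2)   # more accurate vector: weight >= 1
    w12 = math.exp(-(p12 / p11 - mu) ** 2 / sigma2)  # less accurate vector: weight <= 1
    return w11, w12


def reweight(vector, weight):
    """Step (f): scale a feature vector by its weight."""
    return [weight * v for v in vector]
```

The case P11 ≤ P12 is not given in the text and is therefore left out rather than guessed.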
$$ X_{\mathrm{new},i} = \frac{X_i - X_{\min,i}}{X_{\max,i} - X_{\min,i}} \tag{17} $$
where $X_i$ is the original gas content value or gas content ratio,
$i \in 1, 2, 3, \ldots, n$; $X_{\max,i}$ and $X_{\min,i}$ are the maximum and minimum values of the gas content or
content ratio in the training set, respectively; and $X_{\mathrm{new},i}$ is the value after normalization.
Note that test set data also need to be normalized.
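A minimal sketch of Equation (17) is shown below. It takes the minimum and maximum of each feature from the training set only and reuses them for the test set, as the text requires. The function names are hypothetical, and mapping a constant feature to 0.0 is an assumption made here to avoid division by zero.

```python
def fit_min_max(train_rows):
    """Return per-feature minima and maxima computed on the training set."""
    columns = list(zip(*train_rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def normalize(rows, mins, maxs):
    """Apply Equation (17) feature by feature using training-set statistics."""
    out = []
    for row in rows:
        out.append([
            (x - lo) / (hi - lo) if hi > lo else 0.0  # constant feature: assumed 0.0
            for x, lo, hi in zip(row, mins, maxs)
        ])
    return out
```

Because the statistics come from the training set, a test value outside the training range can fall outside [0, 1].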
$$ \gamma_{PCCa}(X_i, X_j) = 1 - \frac{\sum_{k=1}^{m} \left(X_{ik} - \bar{X}_i\right)\left(X_{jk} - \bar{X}_j\right)}{\sqrt{\sum_{k=1}^{m} \left(X_{ik} - \bar{X}_i\right)^{2}} \cdot \sqrt{\sum_{k=1}^{m} \left(X_{jk} - \bar{X}_j\right)^{2}}} \tag{19} $$
where $\gamma_{PCCa}(X_i, X_j)$ represents the Pearson coefficient of the vectors $X_i$ and $X_j$,
whose k-th components are $X_{ik}$ and $X_{jk}$.
According to Equations (18) and (19), the Euclidean distance and Pearson coefficient of the
training data are calculated, respectively, and the data with the minimum value of both
types of indicators are taken as the training data for this class.
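The two indicators can be sketched in a few lines. Equation (18) is not reproduced in this excerpt, so the standard Euclidean distance is assumed for it; the second function follows Equation (19), that is, one minus the Pearson correlation of two vectors. Function names are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Standard Euclidean distance (assumed form of Equation (18))."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pearson_distance(a, b):
    """Equation (19): 1 minus the Pearson correlation of a and b."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("Pearson coefficient is undefined for a constant vector")
    return 1.0 - cov / (norm_a * norm_b)
```

Under this definition, perfectly correlated vectors score 0 and perfectly anti-correlated vectors score 2, so a small value of both indicators marks a sample that is close to the reference in both magnitude and shape.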
Using the random sampling method, the interpolated and expanded training sample
is divided into three training subsamples, and all of the validation set must be put into
each subsample. The data are stored in HDFS (Hadoop Distributed File System) in the
format <label, data, type>. “train” indicates training data, whose lines begin with a
label of 0 or 1; “val” indicates validation data, whose line prefix label is empty.
Interpolating and sampling the training data, on the one hand, can increase the size
of training samples, provide more feature information to the neural network, and make
the training data meet the data requirements of distributed learning. On the other hand,
random sampling can alleviate the imbalance in training data to a certain extent and solve
the problem of bias in the training results.
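The partitioning described above can be sketched with the standard library. This is a hypothetical illustration: records are kept in memory as (label, data, type) tuples instead of being written to HDFS, the empty label of validation records is represented by None, and the function name and seed parameter are not from the paper.

```python
import random


def make_subsamples(train_rows, val_rows, n_parts=3, seed=0):
    """Randomly split train_rows into n_parts and add every validation row to each part."""
    rng = random.Random(seed)
    shuffled = list(train_rows)
    rng.shuffle(shuffled)
    parts = [shuffled[i::n_parts] for i in range(n_parts)]
    subsamples = []
    for part in parts:
        records = [(label, data, "train") for label, data in part]
        records += [(None, data, "val") for data in val_rows]  # empty label for "val"
        subsamples.append(records)
    return subsamples
```

Each training row lands in exactly one subsample, while the validation rows are repeated in all of them.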
Model | Residual Network Module | SVM | Precision/%
BP neural network | × | × | 87.2
BP neural network | ✓ | × | 91.3
BP neural network | × | ✓ | 89.8
BP neural network | ✓ | ✓ | 92.7
“×” means that this module is not added to the network model, while “✓” means that this module is added to the
network model.
Table 12 shows the diagnostic results for transformer fault data on the corresponding
training and test sets for the improved residual BP neural network model, the traditional
deep BP neural network model and the traditional shallow BP neural network model.
As can be seen from Table 12, the improved residual BP neural network model
maintains a high fault diagnosis accuracy under different test sets, with an average accuracy
of 92.51%, and the diagnostic effect on each test set is relatively stable, remaining
at 91.52% or above. The diagnostic accuracy of the traditional shallow BP neural network
model is higher than that of the traditional deep BP neural network model, which
shows that the diagnostic performance of the traditional BP neural network decreases to a
certain extent as the network depth increases. In addition, the results in Table 12
show that the transformer fault diagnosis accuracy of the improved residual BP
neural network model is higher than that of the traditional shallow and
deep BP neural network models under different test sets. This indicates
that deepening the BP neural network by stacking multiple residual network modules
does not degrade diagnostic performance
relative to the traditional BP neural network, but instead leads to a significant improvement.
Compared with the traditional shallow BP neural network and the traditional deep BP
neural network, the diagnostic accuracy of the improved residual BP neural network model
is improved by an average of 2.57% and 5.66%, respectively.
Table 13 shows the transformer fault diagnosis results based on an improved residual
BP neural network model, traditional deep BP neural network model and traditional
shallow BP neural network model with few sample data. It can be seen from Table 13
that, when there are few sample data, although the diagnostic accuracy of the improved
residual BP neural network model decreases slightly with the increase in the number of
test sample sets, the overall average diagnostic accuracy remains at 90.38%, which is 5.76%
and 7.15% higher than that of the traditional shallow and deep BP neural network models.
Furthermore, it is shown that the improved residual BP neural network model still has a
good diagnostic performance with few sample data.
Table 13. Diagnostic accuracy of different models with few sample data.
Figure 9 shows the accuracy of the improved residual BP neural network and the
traditional deep BP neural network for the specific fault types of transformers in test set
T1. As can be seen from Figure 9, for all six fault types, the diagnostic
accuracy of the improved residual BP neural network model proposed in this paper is
higher than that of the traditional deep BP neural network.
In the diagnosis of different fault types, the improved residual BP neural network
model has strong diagnostic stability, and the diagnostic accuracy of each fault type is
maintained at above 90.9%. Table 14 shows the test results of some test data on the trained
model. It can be seen from the table that the data of different fault types are tested, and the
improved residual BP neural network model accurately predicts the type of corresponding
fault data.
However, if the IEC 60599 code, Duval triangle, and Rogers methods are used for diagnosis,
incorrect diagnoses occur. Analysis of the erroneous results shows that, because of differences
in fault severity, location and cause, transformers belonging to the same fault
type can have very different dissolved gas contents in the oil, causing
the samples to be classified into other fault types. The diagnostic model in this paper
achieves a high diagnostic accuracy for the several transformer states
shown above. In addition, these traditional methods have a limited ability to diagnose complex
or multiple faults. Interpreting their results requires expertise and experience, is
relatively complex, and demands a lot of data collection and analysis; it may also
be subjective and depend on expert judgment. Therefore, the method used in this paper
has certain advantages.
In order to comprehensively measure the superiority of the model proposed in this
paper and avoid errors caused by the data, each model uses the same training set and test
set to conduct 20 experiments; the actual diagnosis results are shown in Table 15. According
to the diagnosis results of the different models on the same data set, the diagnosis accuracy
of the proposed model reaches 92%, which is higher than that of the other algorithm models. Therefore,
the model proposed in this paper can judge the state of the transformer very well.
During the process of data collection, there may sometimes be missed sampling or
incorrect sampling. In order to prove that the model proposed in this paper still has a high
diagnostic accuracy in the case of sampling errors, part of the data set was set to 10% wrong
sampling. In this paper, the method of setting data to 0 was used to simulate the sampling
error, and the fault diagnosis results are shown in Table 16.
Table 16. Comparison of comprehensive accuracy under the data set with errors.
Table 16 shows that the diagnostic accuracy of the model proposed in this paper
declined when the sampling error of some data sets was considered, but the decline
was only 6%, whereas the diagnostic accuracy of the other algorithms dropped
significantly. This indicates that the proposed model has strong robustness.
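The sampling-error experiment can be imitated with a short helper. The text only states that 10% wrong sampling was simulated by setting data to 0, so the details here are assumptions: a random 10% of the samples is chosen and every feature of those samples is zeroed. The function name and the seed parameter are hypothetical.

```python
import random


def inject_sampling_errors(rows, fraction=0.10, seed=0):
    """Return a copy of rows in which a random fraction of samples is set to all zeros."""
    rng = random.Random(seed)
    n_bad = round(len(rows) * fraction)
    bad = set(rng.sample(range(len(rows)), n_bad))
    return [[0.0] * len(row) if i in bad else list(row) for i, row in enumerate(rows)]
```

The original rows are left untouched, so the clean and corrupted data sets can be evaluated side by side.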
grades, provide a stable and reliable power supply, and contribute to sustainable economic
and social development.
The method has the following characteristics: (a) After stacking multiple residual
network modules, this method does not deepen the network layers, resulting in more
accurate transformer fault diagnosis based on the residual BP neural network model.
(b) Experimental results show that the proposed transformer fault diagnosis method,
based on the improved residual BP neural network, outperforms the traditional deep BP
neural network and shallow BP neural network diagnosis methods, and the proposed
method maintains a good diagnostic performance with few sample data. (c) Interpolation
methods are used to expand the positive and negative samples in the training and test sets,
meeting the training requirements of the neural network, and enhancing the robustness
and generalization performance of the network. By using the Euclidean distance coefficient
and the Pearson coefficient, together with random sampling and data partitioning, the two
problems caused by data imbalance (insufficient learning for classes with few samples and
excessively biased learning for classes with many samples) can be alleviated. (d) The model is
trained on a distributed computing platform, making it suitable for transformer fault type
diagnosis with larger datasets.
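Characteristic (c) can be illustrated with a minimal sketch of interpolation-based sample expansion and of the two similarity measures. The interpolation shown is plain linear interpolation between two randomly chosen samples of the same class, which is an assumption; the distance coefficient is assumed to be the Euclidean distance; and this section does not specify how the two coefficients enter the sampling procedure, so the sketch only computes them. All function names and sample values are hypothetical.

```python
import math
import random

def euclidean_distance(a, b):
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pearson_coefficient(a, b):
    """Pearson correlation of two vectors (undefined if either is constant)."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def interpolate_samples(class_samples, n_new, seed=0):
    """Create n_new synthetic samples on segments between same-class pairs."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(class_samples, 2)  # two distinct source samples
        t = rng.random()                     # interpolation factor in [0, 1)
        synthetic.append([x + t * (y - x) for x, y in zip(a, b)])
    return synthetic

# Hypothetical minority-class samples (placeholder values).
minority = [[12.0, 3.0, 1.0], [15.0, 4.0, 2.0], [11.0, 2.5, 1.5]]
synthetic = interpolate_samples(minority, n_new=4, seed=1)
```

Because every synthetic sample lies between two real samples of the same class, the expansion stays inside the observed range of each feature. The two measures can then be used, for example, to check how close a synthetic sample is to its sources.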
Our research method employs a combined approach, using the residual backpropagation (BP)
neural network and support vector machine (SVM) for transformer fault diagnosis.
Many recent studies by experts in the field have also explored methods for diagnosing
faults in power transformers. For instance, Liu Chang et al. [36] applied the bee colony
algorithm to optimize the BP neural network for transformer fault diagnosis; Fu Baoying et al. [37]
used an adaptive particle swarm optimization algorithm to optimize the BP neural network for
transformer fault diagnosis; Han Qingchun [38] proposed a transformer fault diagnosis method based
on a BP neural network optimized by the cuckoo algorithm; Zeng Zhi et al. [39] developed a
BP neural network fault diagnosis system for transformers based on the ant colony algorithm.
Researchers have employed various swarm intelligence algorithms, such as the bee colony
algorithm, genetic algorithm, particle swarm algorithm, cuckoo algorithm, artificial fish
swarm algorithm and ant colony algorithm, to optimize the BP neural network in studying
transformer fault diagnosis techniques. These approaches have achieved results, but they also
suffer from limitations and deficiencies in the iterative optimization process, including high
computational complexity, a slow convergence speed, and susceptibility to local optima.
As a consequence, these algorithms often lead to misdiagnoses when a transformer fault
occurs, affecting the accuracy of transformer fault diagnosis.
Compared to these methods, the transformer fault diagnosis approach presented
in this paper, which combines the residual backpropagation (BP) neural network with
support vector machines (SVM), offers several advantages. The residual BP neural network
enhances diagnostic accuracy by introducing residual learning. The improved combination
of the residual BP neural network with SVM, through feature vector selection and weighting,
demonstrates a better generalization performance under small sample data. Leveraging
the deep feature learning ability of the residual BP neural network and the feature selection
ability of SVM, our method can effectively handle complex fault scenarios, including
low-probability faults and cross-impacts. By optimizing the model and adjusting parameters,
our approach offers the real-time performance and practicality required in applications.
In the real-time monitoring and fault handling of power systems, rapid and accurate fault
diagnosis is crucial, and our method meets this demand, ensuring the stability and safe
operation of the power system.
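The residual learning idea mentioned above can be illustrated with a minimal sketch of one residual block, y = ReLU(x + F(x)), where F is a small stack of fully connected layers. The two-layer form of F, the ReLU activation, and all names and values are assumptions made for illustration; the block structure actually used in the model is defined in the earlier sections of this paper.

```python
def dense(x, weights, bias):
    """Fully connected layer; `weights` holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def relu(x):
    return [max(0.0, v) for v in x]

def residual_block(x, w1, b1, w2, b2):
    """y = relu(x + F(x)), with F = dense -> relu -> dense (assumed form)."""
    hidden = relu(dense(x, w1, b1))
    residual = dense(hidden, w2, b2)
    # Skip connection: the input is added back before the final activation.
    return relu([xi + ri for xi, ri in zip(x, residual)])

# Hypothetical 3-dimensional feature vector and weights.
x = [0.2, 0.5, 0.1]
zeros = [[0.0] * 3 for _ in range(3)]
identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
half = [[0.5 if i == j else 0.0 for j in range(3)] for i in range(3)]
no_bias = [0.0, 0.0, 0.0]

unchanged = residual_block(x, zeros, no_bias, zeros, no_bias)  # F(x) = 0
scaled = residual_block(x, identity, no_bias, half, no_bias)   # F(x) = 0.5 x
```

With all weights at zero the block returns its non-negative input unchanged, which is why stacking such blocks lets the network grow deeper without forcing each added layer to relearn the identity mapping.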
However, when using our method, the acquisition of transformer fault data may be
limited by the actual collection process and the frequency and types of transformer faults.
Insufficient fault samples may affect the model’s generalization ability and robustness.
The quality and scale of the dataset are critical to the method’s performance. Future research
can build larger and more diverse datasets of real fault records and share these data with the
academic and industrial communities to promote research and progress in this field.
Electronics 2023, 12, 3526 24 of 26
When using SVM for feature vector evaluation and weight allocation in our method,
appropriate parameter tuning may be necessary. Different parameter choices can impact
the model’s performance; hence, further optimization and adjustments are required.
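The sensitivity to parameter choices can be illustrated with a minimal sketch of the selection and weighting step. It assumes that each layer's feature vector has already been scored by an SVM, that layers are kept when their accuracy reaches a threshold, and that the kept layers are weighted in proportion to accuracy. The threshold rule, the proportional weighting, the function name, and the accuracy values are all hypothetical and are not results from this paper.

```python
def select_and_weight(layer_accuracy, threshold=0.8):
    """Keep layers whose SVM accuracy reaches `threshold`; normalize weights.

    Assumption: weights are proportional to accuracy and sum to 1.
    Returns an empty dict if no layer passes the threshold.
    """
    kept = {name: acc for name, acc in layer_accuracy.items()
            if acc >= threshold}
    total = sum(kept.values())
    return {name: acc / total for name, acc in kept.items()} if kept else {}

# Placeholder per-layer accuracies, for illustration only.
scores = {"block1": 0.70, "block2": 0.85, "block3": 0.90}
weights_strict = select_and_weight(scores, threshold=0.8)
weights_loose = select_and_weight(scores, threshold=0.6)
```

Lowering the threshold from 0.8 to 0.6 admits a third layer and redistributes the weights, which shows how a single parameter choice changes the features that reach the classifier.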
In the future, further research and applications of this method are necessary to continuously
improve transformer fault diagnosis technology and drive the modernization
of power systems. While this paper primarily focuses on fault diagnosis, in the future,
the proposed method can be extended to the long-term operational health monitoring of
transformers. Through real-time monitoring and diagnosis, potential faults can be better
predicted, and preventive maintenance measures can be taken, thereby extending the life
of transformers and increasing their reliability.
Author Contributions: Conceptualization, Y.J. and J.Z. (Jianfeng Zheng); methodology, Y.J., J.Z.
(Jianfeng Zheng) and J.Z. (Ji Zhang); software, H.W.; validation, Z.L. and H.W.; formal analysis, H.W.
and Y.J.; investigation, H.W.; resources, Y.J. and J.Z. (Jianfeng Zheng); data curation, Y.J. and H.W.;
writing—original draft preparation, Y.J.; writing—review and editing, Y.J., J.Z. (Jianfeng Zheng) and
J.Z. (Ji Zhang); visualization, Y.J.; supervision, Z.L. All authors have read and agreed to the published
version of the manuscript.
Funding: This research received no external funding.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Dataset link: https://github.com/jeson2017/Transformer_Fault_Diagnosis_Dataset.git
(accessed on 30 May 2023).
Acknowledgments: I would like to express my heartfelt gratitude to many people who helped during
the process of completing this study. Their help, support, and encouragement played an important
role in my completing this work. In addition, I would like to thank the members of the laboratory.
They have provided me with a lot of help in the operation of experimental equipment and data
collection. Their cooperation and teamwork spirit have enabled my research to proceed smoothly.
Conflicts of Interest: The authors declare no conflicts of interest.
Abbreviations
The following abbreviations are used in this manuscript:
BP Back Propagation
DGA Dissolved Gas Analysis
SVM Support Vector Machine
IEC International Electrotechnical Commission
Bi-LSTM Bi-directional Long Short-Term Memory
References
1. Equbal, M.; Khan, S.A.; Islam, T. Transformer incipient fault diagnosis on the basis of energy-weighted DGA using an artificial
neural network. Turk. J. Electr. Eng. Comput. 2018, 26, 77–88. [CrossRef]
2. Wang, D.; Lei, Q. Fault diagnosis of power transformer based on BR-DBN. Electr. Power Autom. Equip. 2018, 38, 129–135.
3. Maofa, G.; Yanni, L.; Laihe, W.; Baoye, S.; Wenqiang, Z. Fault diagnosis of power transformers based on chaos particle swarm
optimization BP neural network. Electr. Meas. Instrum. 2016, 53, 13–16.
4. Feng, Z.; Shuo, L. Fault Diagnosis of Traction Transformer Based on DGA and Improved Association Degree Model. High Volt.
Appar. 2015, 51, 41–45.
5. Zhang, W.; Yuan, J.; Zhang, T.; Zhang, K. An improved three-ratio method for transformer fault diagnosis using B-spline theory.
Proc. CSEE 2014, 34, 4129–4136.
6. Yang, X.; Chen, W.; Li, A.; Yang, C.; Xie, Z.; Dong, H. BA-PNN-based methods for power transformer fault diagnosis. Adv. Eng.
Inform. 2019, 39, 178–185. [CrossRef]
7. Weihua, Z.; Jinsha, Y.; Shan, W. A calculation method for transformer fault basic probability assignment based on improved
three-ratio method. Power Syst. Prot. Control 2015, 43, 115–121.
8. Li, Y.; Shu, N. Transformer fault diagnosis based on fuzzy clustering and complete binary tree support vector machine. Trans.
China Electrotech. Soc. 2016, 31, 64–70.
9. Kari, T.; Gao, W.; Zhao, D.; Abiderexiti, K.; Mo, W.; Wang, Y.; Luan, L. Hybrid feature selection approach for power transformer
fault diagnosis based on support vector machine and genetic algorithm. IET Gener. Transm. Distrib. 2018, 12, 5672–5680.
[CrossRef]
10. Yuan, F.; Guo, J.; Xiao, Z. A transformer fault diagnosis model based on chemical reaction optimization and twin support vector
machine. Energies 2019, 12, 960. [CrossRef]
11. Xiao, Y.; Pan, W.; Guo, X.; Bi, S.; Lin, S. Fault Diagnosis of Traction Transformer Based on Bayesian Network. Energies 2020,
13, 4966. [CrossRef]
12. Zeng, B.; Guo, J.; Zhu, W.; Xiao, Z.; Yuan, F.; Huang, S. A Transformer Fault Diagnosis Model Based On Hybrid Grey Wolf
Optimizer and LS-SVM. Energies 2019, 12, 4170. [CrossRef]
13. Yang, Z. “Guidelines for Dissolved Gas Analysis and Fault Diagnosis of Transformers”—A Discussion on Transformer Fault
Diagnosis. Transformer 2008, 45, 24–27.
14. Zhu, Y.; Yin, J. Study on application of multi-kernel learning relevance vector machines in fault diagnosis of power transformers.
IEEE Inst. Electr. Electron. Eng. 2013, 33, 68–74.
15. Hanbo, Z.; Wei, W.; Xiaogang, L.; Linan, W.; Yuquan, L.; Jinhua, H. Fault diagnosis method of power transformers using
multi-class LS-SVM and improved PSO. High Volt. Eng. 2014, 40, 3424–3429.
16. Shijun, H.; Ju, Z.; Jigui, M. Fault Diagnosis of Transformer Based on Particle Swarm Optimization-Based Support Vector Machine.
Electr. Meas. Instrum. 2014, 51, 71–75.
17. Ali, M.S.; Abu Bakar, A.H.; Omar, A.; Abdul Jaafar, A.S.; Mohamed, S.H. Conventional methods of dissolved gas analysis using
oil-immersed power transformer for fault diagnosis: A review. Electr. Power Syst. Res. 2023, 216, 109064. [CrossRef]
18. Taha, I.; Hoballah, A.; Ghoneim, S. Optimal Ratio Limits of Rogers’ Four-Ratios and IEC 60599 Code Methods Using Particle
Swarm Optimization Fuzzy-Logic Approach. IEEE Trans. Dielectr. Electr. Insul. 2020, 27, 222–230. [CrossRef]
19. Deng, X.; Zhu, H.; Liu, S. Research on Digital Twin Modeling Technology for Transformer Protection. Power Syst. Technol. 2022,
46, 4982–4992.
20. Yuan, J.; Xu, P.; Li, L. Prediction of transformer oil-paper insulation aging based on BP neural networks with the chicken swarm
optimization algorithm. J. Electr. Power Sci. Technol. 2020, 35, 33–41.
21. Wang, B.; Yang, Y.; Zhang, S. Fault diagnosis of support vector machine transformer based on improved BP neural network.
Electr. Meas. Instrum. 2019, 56, 53–58.
22. Li, P.; Hu, G. Transformer fault diagnosis method based on the fusion of improved neural network and ratio method. High Volt.
Eng. 2022, 7, 1–9. [CrossRef]
23. Xu, X.; Jiang, B.; Cao, W. Application of locust optimized neural network in transformer fault diagnosis. Power Syst. Clean Energy
2021, 37, 17–23.
24. Xian, R.; Fan, H.; Li, F. Power Transformer Fault Diagnosis Based on Improved GSA-SVM Model. Smart Power 2022, 50, 50–56.
25. Xiong, Y.; Liao, X.; Ke, F. Life cycle cost analysis of main transformer based on the multi-system data fusion. J. Electr. Power Sci.
Technol. 2020, 35, 3–11.
26. Fu, H.; Ren, R.; Yan, Z.; Ma, Y. Fault Diagnosis Method of Power Transformers Using Multi-kernel RVM and QPSO. Gaoya
Dianqi/High Volt. Appar. 2017, 53, 131–135 + 141.
27. Song, Z.J.; Wang, J. Transformer Fault Diagnosis Based on BP Neural Network Optimized by Fuzzy Clustering and LM Algorithm.
High Volt. Appar. 2013, 49, 54–59.
28. Li, X.; Chen, Z.; Fan, X. Fault Diagnosis of Transformer Based on BP Neural Network and ACS-SA. High Volt. Appar. 2018,
54, 134–139 + 146.
29. Yuan, P.; Mao, J.; Xiao, F. Grid Fault Diagnosis Based on Improved Genetic Optimization BP Neural Network. J. Electr. Power Syst.
Autom. 2017, 29, 118–122.
30. Wu, R.; Li, C. Design of Transformer Fault Intelligent Diagnosis System. In Proceedings of the 2021 International Conference on
Networking Systems of AI (INSAI), Shanghai, China, 19–20 November 2021; pp. 293–296. [CrossRef]
31. Zhao, W. Study for Transformer Fault Diagnosis and Forecast Based on Data Mining. Master’s Thesis, North China Electric
Power University, Beijing, China, 2009.
32. Huang, Y.; Huang, S. Condition Maintenance of Power Generation Equipment; China Power Press: Beijing, China, 2000.
33. Du, J. Transformer Fault Diagnosis Expert System. Master’s Thesis, North China Electric Power University, Beijing, China, 2003.
34. Crowley, T.H.; Hagman, W.H.; Tabors, R.D.; Cooke, C.M. Expert system for on-line monitoring of large power transformers.
Expert Syst. Appl. Electr. Power Ind. 1990, 1, 629–660.
35. Yin, J. Research on Fault Diagnosis Method of Oil-immersed Power Transformers Based on Correlation Vector Machine.
Ph.D. Thesis, North China Electric Power University, Beijing, China, 2013.
36. Liu, C.; Wu, J.; Gao, Y. Transformer Fault Diagnosis Based on BP Neural Network and Bee Colony Algorithm. Industrialization
2020, 10, 7–11.
37. Fu, B.; Wng, Q. Transformer fault diagnosis based on adaptive Particle swarm optimization BP neural network. J. Huaqiao Univ.
2013, 34, 262–266.
38. Han, Q. Research on Transformer Fault Diagnosis Based on the Optimization of BP Neural Network Using the Cuckoo
Algorithm. Master’s Thesis, Shandong University of Science and Technology, Qingdao, China, 2019.
39. Zeng, Z.; Zhang, H.; Yang, T.; Zeng, X.; Zeng, C. Power transformer incipient fault diagnosis based on neural network optimized
by combined ant colony optimization. Electrotech. Appl. 2019, 38, 43–49.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.