Detection of Malfunctioning Photovoltaic Modules Based On Machine Learning Algorithms

This document proposes a hybrid scheme using three machine learning methods to detect malfunctioning photovoltaic modules based on infrared thermographic images. The first method uses a convolutional neural network trained on infrared images preprocessed with an improved gamma correction function. The second method trains a CNN using infrared temperatures preprocessed with a threshold function. The third replaces the CNN with an XGBoost model trained on selected temperature statistics. Experimental results show all three methods can detect malfunctions with high accuracy and efficiency, and the hybrid scheme provides even better performance.


Received February 16, 2021, accepted February 25, 2021, date of publication March 2, 2021, date of current version March 10, 2021.


Digital Object Identifier 10.1109/ACCESS.2021.3063461

Detection of Malfunctioning Photovoltaic Modules Based on Machine Learning Algorithms
HUMBLE PO-CHING HWANG 1,2, COOPER CHENG-YUAN KU 1 (Member, IEEE), AND JAMES CHI-CHANG CHAN 2
1 Institute of Information Management, National Yang Ming Chiao Tung University, Hsinchu 30010, Taiwan
2 Industrial Technology Research Institute (ITRI), Hsinchu 31040, Taiwan
Corresponding author: Cooper Cheng-Yuan Ku ([email protected])
This work was supported in part by the Ministry of Science and Technology in Taiwan under Grant MOST 106-2410-H-009-025-MY3,
and in part by the Industrial Technology Research Institute in Hsinchu under Grant 107C411 and Grant 108A406.

ABSTRACT In recent years, with the rise of environmental awareness worldwide, the number of solar
power plants has significantly increased. However, the maintenance of solar power plants is not an easy
job, especially the detection of malfunctioning photovoltaic (PV) cells in large-scale or remote power
plants. Therefore, finding these cells and replacing them in time before severe events occur is increasingly
important. In this paper, we propose a hybrid scheme with three embedded learning methods to enhance
the detection of malfunctioning PV modules with validated efficiencies. For the first method, we combine
the improved gamma correction function (preprocess) with a convolutional neural network (CNN). Infrared
(IR) thermographic images of solar modules are used to train the abovementioned improved algorithm. For
the second method, we train a CNN model using the IR temperatures of PV modules with the preprocessing
of a threshold function. A compression procedure is then designed to cut the time-consuming preprocesses.
The third method is to replace the CNN with the eXtreme Gradient Boosting (XGBoost) algorithm and the
selected temperature statistics. The experimental results show that all three methods can be implemented
with high detection accuracy and low time consumption, and furthermore, the hybrid scheme provides an
even better accuracy.

INDEX TERMS Solar power generation, fault detection, infrared imaging, image processing, machine
learning.

The associate editor coordinating the review of this manuscript and approving it for publication was Xiaodong Liang.

I. INTRODUCTION
Recently, renewable energy sources (RESs), such as wave power, wind power, and solar energy, have undergone impressively fast evolution. In Taiwan, the solar industry is specifically playing a critical role in the clean and low-carbon energy industry due to climatic factors; therefore, the quantity of solar power plants has grown rapidly. In addition to many factories, many families install PV modules on the roofs of houses, buildings, yards, etc. However, solar modules may suffer from malfunctions such as open circuits, cracks, bird droppings or heavy dust from time to time. Hence, the maintenance of solar modules is crucially important to avoid safety events, especially for very large solar power plants or remote plants. The traditional method involves maintenance personnel patrolling the whole plant to take IR images for the detection of malfunctioning PV modules. This measure can be used to detect different kinds of malfunctions, and it is nondestructive, contactless and efficient. Actually, in both [1] and [2], authors have proven that infrared thermography is able to detect all kinds of malfunctioning PV modules. There are three major types of malfunctioning PV modules, i.e., hot spots, potential-induced degradation (PID) and open circuits. They are summarized as follows, and the corresponding IR images are illustrated in Fig. 1.
1) Hot spot: A hot spot is the most common PV module defect. A hot spot results in a higher temperature and may be caused by many reasons, including short circuits, overhead objects, surface fouling, cell material defects, cell cracks, broken glass, and so on.
2) PID: PID is a condition that may occur a few years after installation. It can be caused by humidity, heat or voltage. The temperature of the malfunctioned cell

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
37210 VOLUME 9, 2021
H. P.-C. Hwang et al.: Detection of Malfunctioning PV Modules Based on Machine Learning Algorithms

is also higher than others and results in a larger and extremely hot area.
3) Open circuit: An open circuit of a PV module causes a higher temperature in this array than other arrays. Therefore, as shown in the IR image of Fig. 1(c), the temperature of an open-circuit module array is extraordinarily red.

FIGURE 1. IR images of different PV module malfunctions, (a) hot spot, (b) PID, and (c) open circuit.

However, the detection of malfunctioning solar modules is not effortless because it takes large amounts of time and human labor, especially for very large or remote solar plants. With the advancement of technology, some detection teams use unmanned aerial vehicles (UAVs) equipped with infrared cameras to search for malfunctioning PV modules. In addition to a wide working range, drones may also patrol some unreachable locations. Therefore, in this paper, we design a hybrid scheme with three procedures that use an infrared camera on a UAV to take IR images and then analyze them using machine/deep learning algorithms to detect malfunctioning PV modules or classify malfunction types. In the first procedure, the enhanced infrared images of solar panels are used to train the deep learning model to automatically identify malfunctions. In the second one, the adjusted temperatures of all pixels of the IR image are used to train the deep learning model. For the third one, the characteristics of temperatures are used for training the machine learning model. The experimental results validate that all three methods can save labor and time when detecting malfunctioning modules. Furthermore, a hybrid scheme adopting the above three methods shows even better performance.

The contributions of this study are briefly summarized as follows. (I) We improve the "gamma correction" image processing algorithm that is used to enhance the contrast between normal and abnormal cells and then build the CNN-based procedure to detect PV module defects. (II) To stress the contrast of malfunctioning locations for the temperature dataset, we also design a threshold function to preprocess the temperatures. (III) To the best of our knowledge, our proposals are the first to adopt temperature data for training. (IV) Our model can also differentiate PIDs, open circuits and hot spots well. (V) A hybrid detection scheme with even better performance is proposed.

II. RELATED TECHNIQUES AND WORKS
A. MACHINE LEARNING AND DEEP LEARNING
With the advancement of information technology, artificial intelligence (AI) has become the most important research and application topic. Many fields need AI to intelligently save time and human labor costs. Machine learning, a branch of AI, is a type of smart method that learns from data, identifies patterns and makes decisions with less human intervention. Machine learning is applied in many fields, such as finance, image recognition, object detection, weather, and medical research. Recently, deep learning, which is a subset of machine learning that includes networks capable of conducting unsupervised learning from unstructured or unlabeled data, has boomed. Many deep learning algorithms have been proven to be very powerful, such as CNNs, recurrent neural networks (RNNs), and long short-term memory (LSTM). Among them, the CNN has been widely used in the field of image processing because it is especially good at image classification and recognition.

The CNN is a backpropagation algorithm built by modeling brain functions. This network employs a mathematical operation named the convolution, which is a specialized kind of linear operation. The CNN extracts the feature boundaries of the object and learns to perform recognition through a series of layered operations. The architecture of the CNN is composed of multiple convolutional layers, pooling layers, rectified linear unit (ReLU) layers, loss layers and fully connected layers, as shown in Fig. 2.

FIGURE 2. Illustration of the CNN architecture.

In [3], the authors trained a CNN model to classify 1.2 million images into 1000 different classes in the ImageNet LSVRC 2010 contest. A customized CNN was also used to classify lung image patches with interstitial lung disease [4]. The results of both [3] and [4] demonstrate that the CNN is suitable for image recognition and classification.

XGBoost, a well-known machine learning technique, is an improved version of the gradient boosting decision tree (GBDT) [5]. XGBoost combines the gradient boosting and gradient descent algorithms and is primarily adopted for supervised learning. The different features of XGBoost and GBDT include the clever penalization of trees, a proportional shrinking of leaf nodes, an extra randomization parameter and Newton boosting [6]. The objective function of XGBoost is composed of two parts, as indicated in (1) [5]. The first term, $\sum_i l(\hat{y}_i, y_i)$, is the training loss of the model and measures the difference between the predicted value and the target value. The second term, $\sum_k \Omega(f_k)$, is a regularization part in which the penalty is generated to control the complexity (the number of leaves) of the model, and the score weight of each leaf node is added to prevent overfitting [5].

$$\mathrm{obj}(\phi) = \sum_i l(\hat{y}_i, y_i) + \sum_k \Omega(f_k) \tag{1}$$

where $\Omega(f) = \gamma T + \frac{1}{2}\lambda \lVert w \rVert^{2}$.
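As a small numerical illustration of (1), the sketch below evaluates the objective for a toy ensemble. It assumes a squared-error loss for $l$ (the text does not fix a particular loss); the function names `omega` and `objective` and all numbers are hypothetical.

```python
def omega(leaf_weights, gamma, lam):
    """Penalty of one tree: gamma * T + 0.5 * lam * ||w||^2, with T the number of leaves."""
    num_leaves = len(leaf_weights)
    return gamma * num_leaves + 0.5 * lam * sum(w * w for w in leaf_weights)


def objective(y_true, y_pred, trees, gamma=1.0, lam=1.0):
    """Objective (1): training loss plus the summed penalty of every tree.

    Assumption: l is the squared error; `trees` is a list of leaf-weight lists.
    """
    loss = sum((p - t) ** 2 for p, t in zip(y_pred, y_true))
    penalty = sum(omega(w, gamma, lam) for w in trees)
    return loss + penalty


# Hypothetical example: two samples and one tree with two leaves.
value = objective([1.0, 0.0], [0.8, 0.1], trees=[[0.5, -0.5]])
print(round(value, 4))
```

In this toy case the penalty dominates the loss, which shows how larger values of γ or λ push the model toward fewer leaves and smaller leaf weights.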


Another major difference between XGBoost and GBDT is that it uses the Taylor series in the objective function, which makes the convergence of XGBoost faster than that of GBDT. XGBoost is widely used by data scientists to solve many machine learning problems in practice [7]–[8].

B. IMAGE PREPROCESSING
1) GAMMA CORRECTION FUNCTION
The gamma correction function is often used to correct an image's luminance [9]. This function is defined by the following power-law expression:

$$V_{\mathrm{out}} = a V_{\mathrm{in}}^{\gamma} \tag{2}$$

where $V_{\mathrm{in}}$ is the input value, $V_{\mathrm{out}}$ is the output value, $a$ is a constant and $\gamma$ is the power. To make the difference between the abnormal and normal locations of IR images more obvious, we design an improved gamma correction function to achieve better preprocessing performance.

2) OTHER IMAGE PROCESSING TECHNIQUES
Some image-processing techniques, such as Canny edge detection and Gaussian filtering, have also been proposed to enhance the detection of malfunctioning PV modules. In [10], the authors used a Gaussian filter and a binary model to determine the defect and degradation percentages of PV modules. In [11], both the thermal image process and Canny edge detection technique were used to detect the module-related faults that lead to hot-spot malfunctions.

C. LITERATURE REVIEW
There have been some published methods for detecting damaged PV modules. Chouder and Silvestre presented a method based on power-loss data analysis to automatically detect faults in a PV system [12]. To calculate the main parameters of the PV system from monitoring data, a parameter extraction method was adopted. In [13], the authors proposed a deep learning-based method to detect and classify the defects of PV modules. The CNN was used to extract features from 2-D scalograms of system data. This approach could effectively classify five different faulty cases. Another type of fault detection method was based on electroluminescence (EL) images. Both [14] and [15] used the EL images of solar cells as the input dataset for a deep learning method to automatically detect and classify the defects of solar cells. In [16], the authors also built two detection models using a support vector machine (SVM) and CNN for an EL image dataset.

Furthermore, two other studies proposed different methods for detecting the defects of PV modules. In [17], an independent component analysis reconstruction algorithm was used to detect surface faults. Another novel algorithm was used in [18]. In this paper, local detection and global detection methods were proposed. In local detection, a water filling algorithm was used to determine the local maximum temperature of the PV panel region. Then, global detection, namely, multiframe recognition of PV faults, was adopted to further improve the anomalous detection accuracy of the local fault [18].

Pierdicca et al. proposed a CNN model to detect PV cell degradation using the VGG-16 network [19]. The authors used PV IR images to train the deep learning model, and an automatic recognition algorithm was then developed to detect PV module faults. Li et al. proposed a CNN solution for the defect detection of PV farms by using a drone to take IR pictures [20]. In [21], Nie et al. presented a CNN-based model to detect the hot spots of PV modules, and IR image data were used to train the CNN model. Grimaccia et al. also suggested a method with an image processing algorithm to detect defects using UAVs [22]. The PV modules could be classified into healthy, hot spots, bypass diodes, and disconnected. In [23], another method with an image processing algorithm was presented for the thermography defect detection of PV modules. These improved IR images could provide more details about the types of defects. In [24], an IR thermography system on a drone was developed to detect and locate malfunctioning PV modules. The K neighbors mean filter and Canny technique were used to preprocess these images.

However, based on a survey of previous works, we believe that the training image dataset could be further modified by the specifically designed image preprocessing technique, which should enhance the learning result with the emphasized contrast between normal cells and malfunctioning cells and then provide higher accuracy. Similarly, a numerical preprocessing method is proposed for the temperature dataset. Furthermore, we also select some particular statistical features to better train the machine learning model. Finally, a hybrid scheme embedded with these three detection procedures is proposed with even better accuracy. Many experiments are implemented to validate our proposals.

III. PROPOSED DETECTION SCHEME
In this section, we detail three proposed detection methods and one corresponding hybrid scheme that includes the deep learning algorithm, the image preprocessing method, the threshold function, and the machine learning algorithm. The parameter adjustment and feature selection steps are also introduced.

A. CNN DEEP LEARNING ALGORITHM
For the first two methods, the same CNN deep learning algorithm is chosen to train the detection models to assess whether a PV module has malfunctioned. Table 1 shows the CNN structure and feature extraction adopted in this study. The architecture of the CNN consists of six convolutional layers, six max pooling layers and one fully connected layer. The convolutional layer is used to extract the features of the input data using kernels, and the output of the convolutional layer is a feature map. The objective of max pooling is to downsample an input image, and the output of the max pooling layer is the maximum value in each area. The fully


connected layer is used to flatten the output of the max pooling layer and then feed the results into neural networks.

TABLE 1. CNN structure and feature extraction.

B. IMPROVED GAMMA CORRECTION FUNCTION
In the first method, IR images are used to train the detection model. In general, an image preprocessing technique, the gamma correction function, can correct an image's luminance. However, to increase the detection efficiency, an improved gamma correction function is designed to enhance the contrast of the images of malfunctioning cells. In the gamma correction operation, an RGB (red, green and blue) image is converted into the HSV format first, where H represents the hue, S represents the saturation and V represents the value. In the improved gamma correction, we would like to make the red pixels more obvious than the other colors. Accordingly, the S and V values of red pixels are emphasized and adjusted by ρ thereafter. Fig. 3 (c) indicates that the red pixels become darker than the red pixels in Fig. 3 (a). In contrast to dark red pixels, the light red pixels do not become redder, as they are revised by the gamma correction in Fig. 3 (b). Creating a larger contrast for the red and light red pixels can cause larger differences between the defective locations and normal locations. The greater the difference, the easier the CNN model can distinguish between normal and abnormal PV modules during the detection process.

Algorithm 1 is the improved gamma correction function and includes the following four steps:
1) Convert the RGB image to the HSV format;
2) Multiply the S and V values of the pixels in the red range by ρ, i.e., (20, 255, 255) ≥ (H, S, V) ≥ (0, 43, 46);
3) Convert the HSV image back to the RGB format;
4) Output the gamma correction with gamma = 2, as indicated in (3).

$$\mathrm{Output} = \left(\frac{\mathrm{image}}{255}\right)^{\mathrm{gamma}} \times 255 \tag{3}$$

Algorithm 1. Improved gamma correction (image, gamma = 2)
Input: PV module RGB-images (m × n)
Output: PV module RGB-images (m × n)
 1: for x = 1; x <= n do
 2:     for y = 1; y <= m do
 3:         convert image[x][y] (R,G,B) to image[x][y] (H,S,V)
 4:         if (20,255,255) >= image[x][y] (H,S,V) >= (0,43,46) then
 5:             S = S × ρ
 6:             V = V × ρ
 7: for x = 1; x <= n do
 8:     for y = 1; y <= m do
 9:         convert image[x][y] (H,S,V) to image[x][y] (R,G,B)
10: output = (image / 255)^gamma × 255
11: return output

Figs. 3 (a), (b) and (c) demonstrate the original image, the image processed by gamma correction and the image processed by improved gamma correction, respectively. Figs. 3 (d), (e) and (f) show the image processed by the color mask, the image processed by Sobel and the image processed by Canny, respectively. From these images, it is straightforward to conclude that Fig. 3 (c) presents a sharper contrast around the boundary.

FIGURE 3. Illustration of image processing results. (a) No image processing, (b) Processed by gamma, (c) Processed by improved gamma, (d) Processed by the color mask, (e) Processed by Sobel and (f) Processed by Canny.

C. THRESHOLD FUNCTION
In the second method, temperatures are used to train the learning model. Therefore, a threshold function is proposed to enhance the detection accuracy. This threshold function with two parts enlarges the numerical difference of the temperatures around the damaged area. The first part determines the value of threshold. The average temperature of all pixels in the image is calculated, and then the threshold value is chosen thereafter. If the average is larger than S, then the threshold is set as $T_U$; otherwise, it is set as $T_L$ as described in (4), where $T_U > T_L$. The second part enlarges the contrast of the boundary of the damaged area. If temperature x is greater than


the average temperature plus the threshold, then return this temperature plus d; otherwise, return this temperature minus d, as shown in (5).

$$\mathrm{threshold} = \begin{cases} T_L, & \mathrm{mean} \le S \\ T_U, & \mathrm{mean} > S \end{cases} \tag{4}$$

$$f(x) = \begin{cases} x - d, & x < \mathrm{mean} + \mathrm{threshold} \\ x + d, & x \ge \mathrm{mean} + \mathrm{threshold} \end{cases} \tag{5}$$

D. XGBOOST AND THE HYBRID SCHEME
Since the abovementioned image and temperature CNN models learn and detect from the perspectives of colors and numbers, it is highly possible that they can complement each other. Moreover, to implement a hybrid detection-making scheme, we introduce the third machine learning algorithm called XGBoost for the temperature dataset. The XGBoost algorithm is designed to retrieve the statistical characteristics of the temperatures; nevertheless, the original CNN models are used to learn the general variation and distribution of temperatures. Thus, the major features of XGBoost will be chosen based on this criterion thereafter. Finally, the detection results of image CNN model 1, temperature CNN model 2 and temperature XGBoost model 3 are combined by using a decision maker. The final decision is made by the linear combination of these models. The flow chart of this hybrid detection scheme is described in Fig. 4.

FIGURE 4. Flow chart of the hybrid scheme.

E. PARAMETER ADJUSTMENT AND FEATURE SELECTION
As introduced earlier, the CNN models are trained to classify the malfunctioning modules and normal modules by using images and temperatures, respectively. Two preprocessing functions are designed to provide better performance. During two individual procedures that select better parameters, the key parameters of the improved gamma correction and threshold function are tuned according to the accuracy of the training result. The accuracies are chosen to be better than the 0.981 of the original gamma correction. Finally, two sets of parameters are decided. Fig. 5 shows the two respective adjustment processes using images with the improved gamma and temperature values with the threshold, and the detailed steps are described as follows.
1) A picture of a PV module was taken with an IR thermal imaging camera.
2) IR thermography was converted into image and temperature data.
3) Preprocess the image using an image processing technique, i.e., the improved gamma correction function, and preprocess the temperatures using the threshold function.
4) Input the image and temperature files into the CNN deep learning model individually.
5) Identify whether the PV module is malfunctioning or normal.
6) Adjust the parameters of the preprocessing functions and then repeat the above steps if the accuracy is less than 0.981.
7) Stop adjusting the parameters when the model is good enough (i.e., accuracy > 0.981).

For the image dataset, we observe, test and conclude that the best ρ for the improved gamma correction function is 0.5 to emphasize red pixels within the range of (20, 255, 255) ≥ (H, S, V) ≥ (0, 43, 46).

For the temperature dataset, if the temperature file has an average (i.e., mean) larger than 27 °C, the threshold should be equal to 2.5. Otherwise, it is 1, as indicated in (6). Then, all temperature values have 3 added to them if they are larger than or equal to the mean + threshold. Otherwise, all temperature values have 3 subtracted, as shown in (7). This is because we find that the best d is 3 for this dataset after repeated testing.

$$\mathrm{threshold} = \begin{cases} 1, & \mathrm{mean} \le 27 \\ 2.5, & \mathrm{mean} > 27 \end{cases} \tag{6}$$

$$f(x) = \begin{cases} x - 3, & x < \mathrm{mean} + \mathrm{threshold} \\ x + 3, & x \ge \mathrm{mean} + \mathrm{threshold} \end{cases} \tag{7}$$

The XGBoost algorithm is designed to learn the statistical characteristics of the temperature variations, and therefore, the following five major features, as listed in Table 2, are selected.

TABLE 2. XGBoost features.

IV. EXPERIMENTAL RESULTS
In this section, we detail the dataset, experimental environment and results. The thermographic inspection data taken from a thermographic camera are converted into Comma


Separated Values (CSV) files of temperatures first. Both the image files and temperature files are adjusted by the correction function and threshold function, respectively. Then, these outputs and selected features are inputted into the CNN models and the XGBoost model for individual learning and identification.

FIGURE 5. Two parameter adjustment flows using images with improved gamma and temperatures with a threshold function.

A. EXPERIMENTAL DATASET
The thermal images are collected from the roof of the Industrial Technology Research Institute in Hsinchu, Taiwan. This thermographic inspection dataset includes 684 images and 684 converted temperature CSV files as follows.
1) These 240 × 320 thermographic images include 189 images for normal PV modules and 495 images for malfunctioning PV modules. Some samples are shown in Fig. 6.
2) These 240 × 320 CSV files include 189 CSV files for normal PV modules and 495 CSV files for malfunctioning PV modules.

Approximately 76% of the data are used as the training dataset, and approximately 24%, i.e., 161 photovoltaic modules, are used as the detection test set. To deal with the imbalanced data, we double train the minor classes of data during the training processes [25].

FIGURE 6. Thermographic images.

B. EXPERIMENTAL ENVIRONMENT
The hardware used for this experiment is a server equipped with an Nvidia GeForce RTX 2080 Ti GPU, an 8 core CPU and 64 GB of RAM. The software includes two parts. The first one, TensorFlow, is a deep learning framework and is an open source software library developed by Google that is widely used for machine learning and deep learning. The second is Keras, which is a high-level deep learning API written in Python. Keras allows users to build deep learning models with minimal code and time.

C. CLASSIFICATION ACCURACY
To evaluate the performance of the proposed methods, the following formula for the classification accuracy is employed and shown in (8). The formula is calculated based on the confusion matrix in Table 3.

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{8}$$

TABLE 3. Confusion matrix.

D. CLASSIFICATION STEPS
After the thermal image dataset is generated from the thermographic camera, the images are converted to temperatures first. Then, either dataset is forwarded to the enhancement function as in step (1) and is further fed into the CNN model for classification as in step (2). In step (3), five major features are calculated using the temperatures and then inputted to the XGBoost model. Finally, the outputs of the above models are used to decide if a PV module is malfunctioning or normal, as in step (4).
1) Preprocess the IR thermographic images using the improved gamma correction function and preprocess the temperatures using the threshold function.
2) Forward the enhanced image file and the temperature file to the CNN deep learning models. In the second model, temperatures are used to complete the one-dimensional matrix.
3) Five major features were calculated and then fed into the XGBoost model.
4) Identify the PV module as malfunctioning or normal using the outputs of the above models.

E. DETECTION ACCURACY USING IMAGE AND TEMPERATURE CNNS
TABLE 4. Comparison of some image processing techniques.
TABLE 5. Confusion matrix for test with temperatures (240 × 320).
TABLE 6. Comparison of various compression ratios.

The performances of the proposed image CNN model with the improved gamma correction function and some other image processing methods are listed in Table 4. We also test many different kinds of image processing techniques to demonstrate the usefulness of the improved gamma. Each technique is implemented three times, and then the average accuracy is calculated. As predicted, preprocessing of the improved gamma correction achieves the highest detection accuracy of 0.938.
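For reference, the improved gamma correction evaluated here (Algorithm 1) can be sketched in plain Python as follows. This is an illustrative sketch rather than the authors' implementation: it assumes the red-range test is applied per channel on OpenCV-style HSV scales (H from 0 to 180, S and V from 0 to 255), uses ρ = 0.5 and gamma = 2 as reported in the parameter adjustment section, and the function name `improved_gamma` is hypothetical.

```python
import colorsys


def improved_gamma(image, rho=0.5, gamma=2.0):
    """Apply the four steps of Algorithm 1 to rows of 8-bit (R, G, B) tuples."""
    output = []
    for row in image:
        new_row = []
        for r, g, b in row:
            # Step 1: RGB -> HSV (colorsys works on values in [0, 1]).
            h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            # Assumed OpenCV-style scales for the red-range check.
            h_cv, s_cv, v_cv = h * 180, s * 255, v * 255
            # Step 2: scale S and V of pixels inside the red range by rho.
            if 0 <= h_cv <= 20 and 43 <= s_cv <= 255 and 46 <= v_cv <= 255:
                s, v = s * rho, v * rho
            # Step 3: HSV -> RGB.
            rgb = colorsys.hsv_to_rgb(h, s, v)
            # Step 4: gamma correction, (image / 255) ** gamma * 255.
            new_row.append(tuple(round((c ** gamma) * 255) for c in rgb))
        output.append(new_row)
    return output


# Hypothetical 1 x 2 image: one saturated red pixel and one mid-gray pixel.
print(improved_gamma([[(255, 0, 0), (128, 128, 128)]]))
```

Only the red pixel has its saturation and value scaled before the gamma step; the gray pixel falls outside the red range and receives the plain gamma correction.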
The average detection accuracy of the proposed CNN model using the temperatures of PV modules with the threshold function is 0.946, and the best one is 0.981. Table 5 displays the confusion matrix for the best result.
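The threshold preprocessing behind this temperature model, (6) and (7), can be sketched as below. It is a minimal illustration that takes the temperatures of one image as a flat list; the function name `threshold_preprocess` and the sample values are hypothetical, while the constants 27, 1, 2.5 and d = 3 are the ones reported for this dataset.

```python
def threshold_preprocess(temps, split=27.0, t_low=1.0, t_high=2.5, d=3.0):
    """Widen the gap between hot and normal pixels, following (6) and (7)."""
    mean = sum(temps) / len(temps)
    # (6): pick the threshold from the mean temperature of the image.
    threshold = t_high if mean > split else t_low
    cutoff = mean + threshold
    # (7): add d at or above the cutoff, subtract d below it.
    return [x + d if x >= cutoff else x - d for x in temps]


# Hypothetical readings in degrees Celsius: four normal pixels and one hot pixel.
print(threshold_preprocess([25.0, 25.0, 25.0, 25.0, 40.0]))
```

In this example the mean is 28 °C, so the threshold is 2.5 and the cutoff is 30.5 °C: the hot pixel is raised by 3 and the others are lowered by 3, widening the gap from 15 to 21 degrees.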
This temperature CNN method also performs well with
high accuracy. However, it takes a longer time to gener-
ate detection results, as does the detection model using the
images with the improved gamma correction function. The
inefficiency should be due to both of the mathematical oper-
ations, i.e., the improved gamma correction and the threshold
function. Therefore, an image compression method is adopted to
reduce the IR images from 240 × 320 to sizes ranging from
240 × 160 down to 60 × 80. We expect this method to decrease the
sizes of both the image and temperature datasets and thereby
shorten the calculation times. The accuracies and time consumption
for the various compression ratios are measured and summarized in
Table 6. Each compression ratio is tested three times, and the best
accuracy is emphasized in bold. The average accuracies are
calculated and listed in the next column. Comparing the first and
second rows in the upper half and in the lower half of Table 6,
respectively, we conclude that both improvement functions, i.e.,
the improved gamma correction and the threshold function, enhance
the detection accuracy but consume more calculation time.
Contrasting all rows except the first one in the upper half and
lower half of Table 6, we find that compression indeed reduces the
calculation times; however, overcompression degrades detection
accuracy. Figs. 7 and 8 illustrate the highest, lowest, and average
accuracies for the various compression ratios for the image and
temperature detection methods, respectively.
120 × 160 is the best size for the image CNN detection model, and
80 × 160 is the best size for the temperature CNN detection model.
These two rows are thus shaded in Table 6. We believe that datasets
that are too large or too small may cause the learning methods to
focus too little or too much, respectively, on irregularities.
Therefore, medium compression ratios are more appropriate for both
CNN models.

FIGURE 7. Comparison of various compression ratios for image CNN model.

F. HYBRID DETECTION SCHEME WITH THREE MODELS
In addition to the two CNN models, the third model we chose is the
XGBoost algorithm. We select XGBoost because it performs better
than many other machine learning algorithms on our experimental
dataset, as indicated in Table 7. To optimize the accuracy of the
proposed hybrid detection scheme, we use the combination of the
best compression

37216 VOLUME 9, 2021


H. P.-C. Hwang et al.: Detection of Malfunctioning PV Modules Based on Machine Learning Algorithms

FIGURE 8. Comparison of various compression ratios for temperature CNN model.

TABLE 7. Comparison of some machine learning techniques.

TABLE 8. Comparison on different methods.

TABLE 9. Number of modules after using data augmentation.

ratios of the image and temperature CNN models, i.e., 120 × 160 for
the image CNN algorithm and 80 × 160 for the temperature CNN
algorithm, together with the XGBoost model. The three models carry
equal weights in the decision-making procedure. The detailed
procedure of the hybrid scheme is illustrated in Fig. 9. From the
IR camera, the image and temperature datasets are created. The
images are first compressed to 120 × 160 and preprocessed by the
improved gamma correction function. Then, they are forwarded to
Image CNN model 1 for detection. The temperatures are compressed to
80 × 160 and preprocessed by a threshold function. These values are
passed to Temperature CNN model 2 for judgment. The specific
features of the temperatures are also retrieved and then sent to
Temperature XGBoost model 3 for the third detection. The outcomes
of these models are aggregated by the decision maker to generate
the final detection result.
A total of nine tests are run, and the average accuracy is 0.992.
Six of these nine results have an accuracy of 0.994, and three have
an accuracy of 0.987. As predicted earlier, the complementary
design works fairly well. To show that the proposed methods are
effective, we also compare our method with other approaches, and
the details are shown in Table 8. As observed, our methods rank in
the top tier. The CNN is the principal deep learning method used
for detecting the defects of PV modules. Both of our CNN proposals
outperform the other CNN schemes proposed in [13], [16], [19], [21]
and achieve performance very close to that of the first-tier
methods presented in [14], [20]. In fact, [14] used an EL image
dataset for detection, but EL images are typically collected in a
dark environment to reduce background light [26]. This may not be
suitable for UAV operation. Reference [20] adopted a visible image
dataset; however, visible images are strongly affected by the
brightness of the sky, which may vary the accuracy of detection.
Finally, the proposed hybrid scheme performs better than any single
method, as expected.

G. FURTHER CLASSIFICATION OF THE DEFECTS OF PHOTOVOLTAIC MODULES
Since different PV module defects may require different maintenance
procedures, this study also tries to further classify the defect
types. As introduced earlier, the major defect types include hot
spots, PIDs and open circuits. However, the numbers of PID and open
circuit modules are insufficient. To train a detection model, we
generate more PID and open circuit data using a data augmentation
method [27], as shown in Table 9. To achieve better accuracy, we
select the image CNN model with 120 × 160 images. The performance
of our classification model is 0.938, and Table 10 shows the
confusion matrix of the classification results. From this table, it
is observed that actual PID and open circuit defects are classified
less accurately than hot spot defects. Therefore, if the
classification is critical for some maintenance programs, a
preprocessing algorithm that stresses the characteristics of PID
and open circuit defects may be necessary to improve the accuracy
of the classification models.

TABLE 10. Confusion matrix of testing.




FIGURE 9. Detailed procedure of the hybrid detection scheme.
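The flow summarized by this figure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the paper says only that the three models carry equal weights, so simple majority voting is assumed for the decision maker, and block averaging is assumed for the compression step. The names `block_mean` and `majority_vote` are hypothetical, and the model outputs are placeholders.

```python
def block_mean(grid, out_h, out_w):
    """Downsample a 2-D list by averaging non-overlapping blocks.

    Assumed compression method; the paper does not name one here.
    """
    h, w = len(grid), len(grid[0])
    if h % out_h or w % out_w:
        raise ValueError("target size must evenly divide the source size")
    fh, fw = h // out_h, w // out_w
    return [
        [
            sum(grid[r * fh + i][c * fw + j]
                for i in range(fh) for j in range(fw)) / (fh * fw)
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def majority_vote(votes):
    """Equal-weight decision over binary votes (1 = malfunctioning).

    Assumed aggregation rule for the decision maker.
    """
    return 1 if 2 * sum(votes) > len(votes) else 0


# A 240 x 320 grid of placeholder values standing in for one IR frame.
frame = [[25.0] * 320 for _ in range(240)]
input_120x160 = block_mean(frame, 120, 160)  # size used by the image CNN
input_80x160 = block_mean(frame, 80, 160)    # size used by the temperature CNN

# Hypothetical outputs of Image CNN, Temperature CNN, and XGBoost.
final_label = majority_vote([1, 0, 1])

# Consistency check on the reported results: six runs at 0.994, three at 0.987.
mean_accuracy = (6 * 0.994 + 3 * 0.987) / 9
```

With three binary voters there are no ties, so the assumed rule always yields a decision. The last line shows that the reported average of 0.992 is consistent with the nine individual results.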

V. CONCLUSION
We have proposed a hybrid detection scheme with three embedded
learning methods that can be used to detect malfunctioning PV
modules with high accuracy. For the first method, the CNN model is
trained using IR images, which are preprocessed by the improved
gamma correction to emphasize the high temperatures with deeper
reds. We test different image processing techniques that are
expected to highlight the color of malfunctioning modules, and the
improved gamma correction we design is the most suitable algorithm
for our purpose. For the second method, the CNN model is trained
using the temperature dataset, which is preprocessed using a
threshold function, and it can accurately detect malfunctioning
modules as well. The threshold function is designed to process the
temperature files in advance and enhance the contrast of the
malfunctioning modules. However, the improved gamma correction and
the threshold function consume considerable processing time. To
offset the time consumed by these extra preprocessing steps, we
adopt a compression method. That is, the sizes of the IR images and
corresponding temperature files are reduced from 240 × 320 to
smaller sizes ranging from 240 × 160 down to 60 × 80. The time
consumption is thereby greatly reduced because fewer calculations
are needed. In addition, we retrieve specific statistical features
regarding the variation in temperatures to train the third model,
an XGBoost algorithm. This hybrid detection scheme achieves very
good detection accuracy for our dataset. Because the major symptom
of malfunctioning PV modules is unusually high temperatures, all of
our designs focus on identifying them. Consequently, we believe
that the good detection accuracy of our hybrid scheme can be
validated on other datasets as well.
To also classify the kinds of PV defects, we train a CNN model
using 120 × 160 IR images that are preprocessed using the improved
gamma function. This CNN model also achieves a decent performance.
As explained earlier, we think that a UAV equipped with an IR
camera is a good carrier for taking IR pictures. After these
pictures have been retrieved, the corresponding images and
temperatures and the proposed detection methods can be used to
construct an efficient maintenance program for medium- to
large-scale solar power plants.

REFERENCES
[1] R. Ebner, B. Kubicek, and G. Ujvari, "Non-destructive techniques for quality control of PV modules: Infrared thermography, electro- and photoluminescence imaging," in Proc. 39th Annu. Conf. IEEE Ind. Electron. Soc. (IECON), Nov. 2013, pp. 8104–8109, doi: 10.1109/iecon.2013.6700488.
[2] C. Buerhop, D. Schlegel, M. Niess, C. Vodermayer, R. Weißmann, and C. J. Brabec, "Reliability of IR-imaging of PV-plants under operating conditions," Sol. Energy Mater. Sol. Cells, vol. 107, pp. 154–164, Dec. 2012, doi: 10.1016/j.solmat.2012.07.011.
[3] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," Commun. ACM, vol. 60, no. 6, pp. 84–90, May 2017, doi: 10.1145/3065386.
[4] Q. Li, W. Cai, X. Wang, Y. Zhou, D. D. Feng, and M. Chen, "Medical image classification with convolutional neural network," in Proc. 13th Int. Conf. Control Automat. Robot. Vis. (ICARCV), Dec. 2014, pp. 844–848, doi: 10.1109/icarcv.2014.7064414.
[5] T. Chen and C. Guestrin, "XGBoost," in Proc. 22nd ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, Aug. 2016, pp. 1–4, doi: 10.1145/2939672.2939785.
[6] D. Nielsen, "Tree boosting with XGBoost: Why does XGBoost win every machine learning competition," M.S. thesis, Dept. Phys. Math., Norwegian Univ. Sci. Technol., Trondheim, Norway, Dec. 2016. [Online]. Available: https://ntnuopen.ntnu.no/ntnu-xmlui/bitstream/handle/11250/2433761/16128_FULLTEXT.pdf?sequence=1&isAllowed=y
[7] D. Zhang, L. Qian, B. Mao, C. Huang, B. Huang, and Y. Si, "A data-driven design for fault detection of wind turbines using random forests and XGboost," IEEE Access, vol. 6, pp. 21020–21031, 2018, doi: 10.1109/access.2018.2818678.
[8] Z. Chen, F. Jiang, Y. Cheng, X. Gu, W. Liu, and J. Peng, "XGBoost classifier for DDoS attack detection and analysis in SDN-based cloud," in Proc. IEEE Int. Conf. Big Data Smart Comput. (BigComp), Jan. 2018, pp. 251–256, doi: 10.1109/bigcomp.2018.00044.
[9] C. Poynton, Digital Video and HD: Algorithms and Interfaces, 2nd ed. Amsterdam, The Netherlands: Elsevier, 2012.
[10] M. Aghaei, A. Gandelli, F. Grimaccia, S. Leva, and R. E. Zich, "IR real-time analyses for PV system monitoring by digital image processing techniques," in Proc. Int. Conf. Event-Based Control, Commun., Signal Process. (EBCCSP), Jun. 2015, pp. 1–6, doi: 10.1109/ebccsp.2015.7300708.
[11] J. A. Tsanakas, D. Chrysostomou, P. N. Botsaris, and A. Gasteratos, "Fault diagnosis of photovoltaic modules through image processing and canny edge detection on field thermographic measurements," Int. J. Sustain. Energy, vol. 34, no. 6, pp. 351–372, Jul. 2015, doi: 10.1080/14786451.2013.826223.
[12] A. Chouder and S. Silvestre, "Automatic supervision and fault detection of PV systems based on power losses analysis," Energy Convers. Manage., vol. 51, no. 10, pp. 1929–1937, Oct. 2010, doi: 10.1016/j.enconman.2010.02.025.
[13] F. Aziz, A. Ul Haq, S. Ahmad, Y. Mahmoud, M. Jalal, and U. Ali, "A novel convolutional neural network-based approach for fault classification in photovoltaic arrays," IEEE Access, vol. 8, pp. 41889–41904, 2020, doi: 10.1109/access.2020.2977116.
[14] M. R. U. Rahman and H. Chen, "Defects inspection in polycrystalline solar cells electroluminescence images using deep learning," IEEE Access, vol. 8, pp. 40547–40558, 2020, doi: 10.1109/access.2020.2976843.
[15] A. Bartler, L. Mauch, B. Yang, M. Reuter, and L. Stoicescu, "Automated detection of solar cell defects with deep learning," in Proc. 26th Eur. Signal Process. Conf. (EUSIPCO), Sep. 2018, pp. 2035–2039, doi: 10.23919/eusipco.2018.8553025.
[16] S. Deitsch, V. Christlein, S. Berger, C. Buerhop-Lutz, A. Maier, F. Gallwitz, and C. Riess, "Automatic classification of defective photovoltaic module cells in electroluminescence images," Sol. Energy, vol. 185, pp. 455–468, Jun. 2019, doi: 10.1016/j.solener.2019.02.067.
[17] X. Zhang, H. Sun, Y. Zhou, J. Xi, and M. Li, "A novel method for surface defect detection of photovoltaic module based on independent component analysis," Math. Problems Eng., vol. 2013, Sep. 2013, Art. no. 520568, doi: 10.1155/2013/520568.


[18] V. Carletti, A. Greco, A. Saggese, and M. Vento, "An intelligent flying system for automatic detection of faults in photovoltaic plants," J. Ambient Intell. Hum. Comput., vol. 11, no. 5, pp. 2027–2040, May 2020, doi: 10.1007/s12652-019-01212-6.
[19] R. Pierdicca, E. S. Malinverni, F. Piccinini, M. Paolanti, A. Felicetti, and P. Zingaretti, "Deep convolutional neural network for automatic detection of damaged photovoltaic cells," Int. Arch. Photogramm., Remote Sens. Spatial Inf. Sci., vol. 42, pp. 893–900, May 2018, doi: 10.5194/isprs-archives-xlii-2-893-2018.
[20] X. Li, Q. Yang, Z. Lou, and W. Yan, "Deep learning based module defect analysis for large-scale photovoltaic farms," IEEE Trans. Energy Convers., vol. 34, no. 1, pp. 520–529, Mar. 2019, doi: 10.1109/tec.2018.2873358.
[21] J. Nie, T. Luo, and H. Li, "Automatic hotspots detection based on UAV infrared images for large-scale PV plant," Electron. Lett., vol. 56, no. 19, pp. 993–995, Sep. 2020, doi: 10.1049/el.2020.1542.
[22] F. Grimaccia, S. Leva, and A. Niccolai, "PV plant digital mapping for modules' defects detection by unmanned aerial vehicles," IET Renew. Power Gener., vol. 11, no. 10, pp. 1221–1228, Aug. 2017, doi: 10.1049/iet-rpg.2016.1041.
[23] M. W. Akram, G. Li, Y. Jin, X. Chen, C. Zhu, X. Zhao, M. Aleem, and A. Ahmad, "Improved outdoor thermography and processing of infrared images for defect detection in PV modules," Sol. Energy, vol. 190, pp. 549–560, Sep. 2019, doi: 10.1016/j.solener.2019.08.061.
[24] P. Zhang, L. Zhang, T. Wu, H. Zhang, and X. Sun, "Detection and location of fouling on photovoltaic panels using a drone-mounted infrared thermography system," J. Appl. Remote Sens., vol. 11, no. 1, Feb. 2017, Art. no. 016026, doi: 10.1117/1.jrs.11.016026.
[25] M. Buda, A. Maki, and M. A. Mazurowski, "A systematic study of the class imbalance problem in convolutional neural networks," Neural Netw., vol. 106, pp. 249–259, Oct. 2018, doi: 10.1016/j.neunet.2018.07.011.
[26] S. Johnston, "Contactless electroluminescence imaging for cell and module characterization," in Proc. IEEE 42nd Photovolt. Spec. Conf. (PVSC), New Orleans, LA, USA, Jun. 2015, pp. 1–6, doi: 10.1109/PVSC.2015.7356423.
[27] C. Shorten and T. M. Khoshgoftaar, "A survey on image data augmentation for deep learning," J. Big Data, vol. 6, no. 1, Dec. 2019, Art. no. 60, doi: 10.1186/s40537-019-0197-0.

HUMBLE PO-CHING HWANG received the B.S. degree from the Department
of Business Administration, National Chung Cheng University, Chiayi,
Taiwan, in 2017. He is currently pursuing the Ph.D. degree with the
Institute of Information Management, National Yang Ming Chiao Tung
University, Hsinchu, Taiwan. His research interests include machine
and deep learning, solar energy, information security, and game
theory.

COOPER CHENG-YUAN KU (Member, IEEE) received the B.S. degree in
control engineering from National Chiao Tung University, Taiwan,
in 1987, and the M.S. and Ph.D. degrees in electrical engineering
from Northwestern University, USA, in 1993 and 1995, respectively.
From 1999 to 2014, he was with the Department of Information
Management, National Chung Cheng University. Since 2014, he has
been with the Institute of Information Management, National Yang
Ming Chiao Tung University, Hsinchu, Taiwan. His current research
interests include artificial intelligence, information security and
management, blockchain systems, the Internet of Things, and data
communication networks.

JAMES CHI-CHANG CHAN was born in Taipei, Taiwan, in January 1958.
He received the Ph.D. degree from the Computer-Aided Engineering
(CAE) Group, Department of Civil Engineering, National Taiwan
University. Since 1989, he has been with the Laboratory of Research
and Development, Industrial Technology Research Institute (ITRI).
His research interests include computer-aided engineering,
engineering information technology, structure safety
non-destructive detection, civil and disaster prevention, solar PV
system planning and evaluation design, solar PV system detection
and evaluation, solar PV system data value-added analysis, and
solar BIPV systems.
