
Received 18 March 2023, accepted 20 March 2023, date of publication 28 March 2023, date of current version 5 April 2023.

Digital Object Identifier 10.1109/ACCESS.2023.3262701

A Deep Learning-Based Experiment on Forest Wildfire Detection in Machine Vision Course

LIDONG WANG1,2 (Member, IEEE), HUIXI ZHANG3, YIN ZHANG2, KEYONG HU1 (Member, IEEE), AND KANG AN1 (Member, IEEE)
1School of Engineering, Hangzhou Normal University, Hangzhou 310018, China
2College of Computer Science and Technology, Zhejiang University, Hangzhou 310012, China
3School of Information Science and Technology, Hangzhou Normal University, Hangzhou 310018, China
Corresponding authors: Lidong Wang ([email protected]), Yin Zhang ([email protected]), and Kang An ([email protected])
This work was supported in part by the Zhejiang Provincial High-Education Teaching Reform Project under Grant jg20220770, in part by
the Zhejiang Provincial Educational Science Planning Project under Grant 2022SCG022, in part by the AI Micro-Major Project under
Grant QJWZY0201, in part by the China Knowledge Centre for Engineering Sciences and Technology (CKCEST), and in part by the Joint
Funds of the Zhejiang Provincial Natural Science Foundation of China under Grant LHY21E090004.

ABSTRACT As an interdisciplinary course, Machine Vision combines AI and digital image processing methods. This paper develops a comprehensive experiment on forest wildfire detection that organically integrates digital image processing, machine learning and deep learning technologies. Although research on wildfire detection has made great progress, many experiments are not suitable for students to operate, and detection with high accuracy is still a big challenge. In this paper, we divide the task of forest wildfire detection into two modules: wildfire image classification and wildfire region detection. We propose a novel wildfire image classification algorithm based on Reduce-VGGNet, and a wildfire region detection algorithm based on an optimized CNN with the combination of spatial and temporal features. The experimental results show that the proposed Reduce-VGGNet model can reach 91.20% accuracy, and the optimized CNN model with the combination of spatial and temporal features can reach 97.35% accuracy. Our framework is a novel way to combine research and teaching. It achieves good detection performance and can be used as a comprehensive experiment for a Machine Vision course, supporting talent cultivation in the machine vision area.

INDEX TERMS Machine vision, computer science education, wildfire detection, comprehensive experiment, CNN.

The associate editor coordinating the review of this manuscript and approving it for publication was Senthil Kumar.

I. INTRODUCTION
With the rapid development of computer technology and the popularity of cameras, machine vision technology based on artificial intelligence (AI) and digital image processing has been applied to an increasing number of fields, such as face detection [1], wildfire detection [2], object measurement [3] and surface defect detection [4]. As an interdisciplinary course, Machine Vision combines AI and digital image processing. With the development of AI, machine vision can replace human beings with intelligent programs for some automated operations and measurements [5]. A complete machine vision system includes a camera and an image processing device. The camera first obtains the images, then the target object is recognized through the computer's visual recognition algorithm, and finally the image processing device outputs the target recognition result through the terminal [3]. At present, machine vision has become one of the essential skills of image and video processing practitioners, and it is also an important professional course in intelligent manufacturing, computer science and technology, and other majors. With the rapid development of AI in recent years, there is an increasing demand for talents in two main application fields: natural language processing and digital image processing.
In recent years, experts at home and abroad have been exploring the reform of the Machine Vision course. For example, Min and Lu [6] focused on production practice
and proposed multimedia teaching and guided interactive teaching. They also suggested that experiments should not only be closely related to classroom teaching, but also accord with practical application needs and arouse students' interest. Wang et al. [7], aiming at the principles and applications of machine vision in the postgraduate curriculum, integrated scientific research, teaching and practical projects into the classroom. In this way, students can associate project research with the development of products. Shao et al. [8] designed a cocoon-sorting experiment in the field of machine vision in order to cultivate intelligent manufacturing talents under the background of new engineering subjects. Han and Liu [9] designed a machine vision experiment platform with multiple modules using the TensorFlow and OpenCV libraries to solve the problems of insufficient experiments related to machine vision, unreasonable experimental design and lack of practical data. The reform of the Machine Vision course in foreign countries focuses on social awareness education and the reform of teaching methods of basic technology. For example, Sigut et al. [10] believed that the teaching theme of machine vision depends on the use of new technologies; to enable students to better understand the concepts, they developed an application for the Android operating system that performs real-time presentation of OpenCV image processing technology to help students better understand concepts related to image processing. Cote and Albu [11] advocated integrating a social awareness module into the Machine Vision course, so as to study both the social impact of the technology and the technology itself. Spurlock and Duvall [12], in order to expand the educational audience of machine vision beyond postgraduates and doctoral students, increased the development of practical cases in the field of machine vision applications and reduced the derivation of mathematical formulas to better adapt to undergraduate teaching. From the above reform research, it can be found that most teaching methods focus on the combination of theory and practice, while in practice they focus on how to design experiments that have industrial practicality and can arouse students' interest.
Machine Vision is one of the courses that most closely links theory with practice. However, at present, most universities' comprehensive experiments for undergraduates/postgraduates have problems such as outdated design and lack of practicality, and most of them only use traditional machine learning [13], [14]. Although research on wildfire detection has made great progress, detection with high accuracy is still a big challenge. In order to solve the above problems, this paper designs an automatic forest wildfire detection framework that can also be used as a comprehensive experiment for a Machine Vision course. This framework uses image processing, machine learning, and deep learning technology to achieve automatic detection and annotation of forest wildfire regions, which is a novel way to combine research and teaching. To the best of our knowledge, no previous work has explored the preceding problems to such an extent. The main contributions of this paper can be summarized as follows:
(1) We propose a novel wildfire image classification algorithm based on Reduce-VGGNet, which reduces the training parameters of VGGNet and achieves 91.20% accuracy.
(2) We propose a novel wildfire region detection algorithm based on an optimized CNN with the combination of spatial and temporal features. The experimental results on the FLAME dataset show the effectiveness of our method.
(3) We combine the wildfire image classification module and the wildfire region detection module into a comprehensive experiment for a Machine Vision course. It is a novel way to effectively combine teaching and scientific research, and it incorporates teachers' research into the teaching process.
The rest of our paper is structured as follows. In Section II, we describe the background of the task of wildfire region detection. The design of our framework is presented in Section III. The results and analysis are presented in Section IV. Finally, the conclusion is provided in Section V.

II. EXPERIMENT REQUIREMENTS AND BACKGROUND
A. EXPERIMENT REQUIREMENTS
The comprehensive experiment designed in this paper can be applied to undergraduate/postgraduate Machine Vision courses, so that students can understand the popular technology of machine vision and cultivate their ability to develop novel algorithms for image processing and automatic recognition. Students need to understand the following knowledge: ① Digital image processing. For example, video reading technology is used to read image frames, image enhancement technology is used to achieve image contrast expansion, and image segmentation technology is used to achieve target object segmentation. ② Machine learning. This includes the basic principles and experimental evaluation of traditional machine learning algorithms, such as SVM (Support Vector Machine) and AdaBoost. ③ Deep learning. This includes the fundamentals, principles and popular deep network models of deep learning, such as VGGNet (Visual Geometry Group Network), CNN (Convolutional Neural Network) and RNN (Recurrent Neural Network).

B. EXPERIMENT BACKGROUND
The occurrence of forest wildfires often causes great damage to national economic property. In recent years, the number of trees near electricity transmission areas has increased dramatically. In addition, extreme weather such as drought has greatly increased the frequency and intensity of forest fires [15]. If machine vision technology can be used to monitor wildfires in real time, it can effectively avoid the serious damage caused by fire spreading, thus effectively reducing the direct losses caused thereby.
Because the efficiency of manual fire detection is extremely low, an increasing number of researchers use modern means to detect fire, such as UAV-IoT [16], satellite remote sensing, wireless sensors, and image fire detectors.

In recent years, with the improvement of image processing, wildfires can be detected automatically from on-site video collected by cameras installed on line towers, which greatly improves the accuracy of wildfire detection.
Forest wildfires can be detected through smoke and flame [17], [18]. Smoke-based detection has difficulty distinguishing wildfire smoke from other, non-dangerous smoke, such as smoke from cooking and industrial chimneys, so it is not as reliable as flame-based detection. Therefore, the comprehensive experiment designed in our manuscript mainly detects wildfires by flames.

C. RELATED WORKS
There are two ways of flame detection: static and dynamic flame detection. Static flame detection aims at a single image and detects the flame region through image processing and machine learning. Dynamic flame detection aims at a video image sequence, using both the static spatial features and the time sequence information of the images to detect the fire target. The current research on these two detection methods is introduced below.
(1) Static flame detection. It often detects the flame by extracting the color and texture features of the image. For example, Jia and Jiong [19] proposed a method combining saturation and Otsu threshold segmentation for flame detection, which was judged based on SVM by combining the shape features of the flame area with LBP texture analysis. Tan et al. [20] used RGB and HSI dual color spaces to detect flame objects. Hossain et al. [21] proposed a novel algorithm capable of detecting both flame and smoke from a single image using block-based color features and texture features. The static detection methods mainly rely on color, texture, or shape features, but many images contain background noise and interference signals, such as the sun and sunset glow. Therefore, these features cannot effectively detect the flame objects, and the detection accuracy is unsatisfactory.
(2) Dynamic flame detection. This method combines the temporal information of video with the static features of the image for detection. Schultze et al. [22] proposed to use spectrograms and acoustic spectrograms to obtain flame features according to flame flicker and movement direction; this method can also monitor the movement direction of the flame. Xie et al. [23] used dynamic features and deep image features for recognition. Shahid et al. [24] obtained candidate flame regions by combining the shape features and the motion stroboscopic features of the flame, and then used a classifier to identify them. Zhang et al. [25] improved the target detection network YOLOv5 by combining the static and dynamic features of the flame, and solved the problem of unbalanced positive and negative samples. Yuan et al. [26] detected forest wildfires by combining deep static spatial features with deep dynamic features, and achieved good detection results. Wang et al. [27] proposed to integrate the bottom color features and motion features of the flame to design a multistage flame detection method, but failed to detect the flame object in real time. As shown above, most research on flame-based detection focuses on dynamic detection, which can achieve better results than static detection [28].
Besides the above research, lots of studies use machine learning and deep learning algorithms to detect wildfires. Most of the recent detection algorithms use Convolutional Neural Networks (CNNs), including different versions of YOLO, U-Net, and DeepLab [33]. For example, Rashkovetsky et al. [34] proposed a single-input convolutional neural network based on the well-known U-Net architecture to detect wildfire areas in satellite images. Sousa et al. [2] proposed a transfer learning-based method and data augmentation for wildfire early warning. Treneska and Stojkoska [36] also utilized transfer learning by fine-tuning ResNet50 to detect wildfire in UAV-collected images, obtaining 88% accuracy. Jindal et al. [35] utilized an algorithm based on YOLOv3 and YOLOv4 to detect forest smoke; the results show that YOLOv3 outperforms YOLOv4 in all evaluation metrics. However, the above methods cannot obtain satisfactory detection results, which may be further improved by parameter optimization and different data augmentation techniques.

FIGURE 1. The framework of forest wildfire region detection.

III. DESIGN OF THE EXPERIMENT
To enable students to master both traditional machine learning and deep learning, and to enhance the accuracy of wildfire
detection, this paper divides the forest wildfire detection task into two modules, namely wildfire image classification and wildfire region detection (see Figure 1). At the same time, we propose a novel wildfire image classification algorithm based on Reduce-VGGNet, and a wildfire region detection algorithm based on an optimized CNN with the combination of spatial and temporal features. The wildfire image classification module extracts the video image frames, then extracts the shape, texture and color features of the images and normalizes them; on this basis, we design wildfire image classification algorithms based on traditional machine learning (SVM) and on Reduce-VGGNet. The wildfire region detection module further annotates the fire regions on the classified wildfire images: it uses the ViBe algorithm to detect candidate fire regions, and designs an optimized CNN to extract temporal and spatial features for wildfire region detection. This comprehensive experiment is designed hierarchically from different angles, which is consistent with students' gradual understanding and cognitive process of image processing and image recognition.

A. WILDFIRE IMAGE CLASSIFICATION
The purpose of this module is to let students understand and master image preprocessing, image feature extraction, and image classification. Before classification, we need to extract the image frames and extract features from the images. In this experiment, we propose the Reduce-VGGNet model to classify forest wildfire images, and require students to compare this method with the traditional machine learning algorithm SVM.
Preprocessing First, we extract the video image frames with the OpenCV module. OpenCV has powerful video editing capabilities and encapsulates many image processing API functions, including image reading, scanning and face recognition. To extract meaningful information from a video or image, the VideoCapture(File_path), read() and imwrite(filename, img[, params]) functions can be used for video reading, and each image frame can be saved to a specified file.
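As a minimal sketch of this preprocessing step (the video path and sampling interval are hypothetical, not the paper's settings), frame extraction with OpenCV can look as follows:

```python
import cv2

VIDEO_PATH = "forest_fire.mp4"   # hypothetical input video
FRAME_STEP = 10                  # keep one frame out of every ten

cap = cv2.VideoCapture(VIDEO_PATH)
index = saved = 0
while True:
    ok, frame = cap.read()       # ok becomes False at the end of the video
    if not ok:
        break
    if index % FRAME_STEP == 0:
        # Save the current frame to a numbered image file.
        cv2.imwrite(f"frame_{saved:05d}.jpg", frame)
        saved += 1
    index += 1
cap.release()
print(f"saved {saved} frames")
```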
Feature extraction Next, we extract image features based on color, texture and shape. To extract the color features, we transform the RGB image to a gray image and extract gray histogram features, including the mean and standard deviation of brightness and the probability of each gray value. To extract the texture features, we compute the gray-level co-occurrence matrix and extract seven invariant moments based on it. For the shape features, the area, roundness, boundary circumference and boundary roughness of the fire region are extracted. The above feature extraction belongs to the digital image processing part of Machine Vision; students can further deepen their understanding of the basics of digital image processing through this module.
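The sketch below illustrates one possible implementation of these descriptors. It is a simplified stand-in, not the exact feature set of the experiment: Hu's seven invariant moments replace the co-occurrence-based moments, the shape features are measured on the largest bright contour, and the brightness threshold of 200 is an assumption.

```python
import cv2
import numpy as np

def extract_features(path):
    """Sketch of per-image color, texture and shape descriptors."""
    gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)

    # Color: gray-histogram statistics (mean and standard deviation
    # of brightness, plus the entropy of the gray-value probabilities).
    hist = cv2.calcHist([gray], [0], None, [256], [0, 256]).ravel()
    prob = hist / hist.sum()
    entropy = -np.sum(prob[prob > 0] * np.log2(prob[prob > 0]))

    # Texture: Hu's seven invariant moments (a stand-in for the
    # co-occurrence-matrix moments described in the text).
    hu = cv2.HuMoments(cv2.moments(gray)).ravel()

    # Shape: area, boundary circumference and roundness of the largest
    # bright region, used here as a rough proxy for the fire region.
    _, mask = cv2.threshold(gray, 200, 255, cv2.THRESH_BINARY)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if contours:
        c = max(contours, key=cv2.contourArea)
        area = cv2.contourArea(c)
        perim = cv2.arcLength(c, True)
        roundness = 4 * np.pi * area / (perim ** 2 + 1e-6)
    else:
        area = perim = roundness = 0.0

    return np.array([gray.mean(), gray.std(), entropy,
                     *hu, area, perim, roundness])
```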
Normalization Due to the different ranges of the different features, it is necessary to normalize them to the range [0, 1] to accelerate the convergence speed of the algorithm:

$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$  (1)

where $x'$ denotes the data after normalization, $x$ denotes the extracted feature, $x_{\min}$ denotes the minimum value of a feature, and $x_{\max}$ denotes the maximum value of a feature. Finally, we input the normalized features to SVM and compare the performance of SVM and Reduce-VGGNet.

1) SVM
SVM has been a very popular classifier in recent years. It can realize nonlinear segmentation of feature vectors. Kernel functions in SVM can simplify the number of inner product calculations, reduce the running time, and convert the inner product of a high-dimensional space to a low-dimensional space. The performance of a support vector machine mainly depends on the selection of the kernel function, and the selection of the kernel function depends on the actual dataset. In addition, students are required to determine the penalty factor c, which affects the generalization ability of the classifier, and the kernel function parameter g through cross-validation. The parameters with the best performance on the training set are selected as the final parameters of the model.
The kernel functions in our experiment mainly include the Radial Basis Function (RBF), the polynomial kernel and the sigmoid kernel. We use the LIBSVM package to establish the classification model. The process of SVM classification is shown in Figure 2.

FIGURE 2. The flow chart of the SVM classification.
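As a hedged illustration of this classification step, the sketch below trains an RBF-kernel SVM with scikit-learn, whose SVC class wraps LIBSVM; the cached feature files are hypothetical names, and C and gamma here are placeholders for the cross-validated c and g.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Features and 0/1 labels from the extraction step above
# (file names are hypothetical placeholders).
X = np.load("features.npy")
y = np.load("labels.npy")

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Min-max normalization to [0, 1] as in Eq. (1); the scaler is fit on
# the training set only, to avoid leaking test-set statistics.
scaler = MinMaxScaler().fit(X_train)

# RBF-kernel SVM; C and gamma play the roles of the penalty factor c
# and the kernel parameter g discussed in the text.
clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(scaler.transform(X_train), y_train)
print("test accuracy:", clf.score(scaler.transform(X_test), y_test))
```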

2) REDUCE-VGGNET
The Reduce-VGGNet model proposed in this experiment takes VGG-16 as the basic network structure. The VGG16 network consists of 13 convolutional layers and 3 fully connected layers. After each group of convolutional layers, a max pooling layer is connected, followed by a ReLU activation function to alleviate the gradient dispersion problem.
As a deep convolutional neural network, VGGNet has been widely used in image classification tasks. Based on the idea of transfer learning, this experiment transfers the weight parameters obtained from the network's original training set to the wildfire image set. As shown in Figure 3, the weight coefficients of the first 13 layers are transferred, the original three fully connected layers are removed, and two fully connected layers and a softmax layer are used instead. The numbers of neurons in the two fully connected layers are set to 1024 and 2, respectively. We use the forest wildfire image dataset to train the fully connected layers and the softmax classifier, and fine-tune VGG16 for classification. The purpose of this design is to reduce the training parameters and training time of the VGGNet model.

FIGURE 3. The structure of Reduce-VGGNet.

In this experiment, stochastic gradient descent (SGD) and momentum are combined to train the model. We set the number of epochs to 100, the batch size to 64, the momentum parameters to β1 = 0.9 and β2 = 0.999, and the initial learning rate to 0.001. We use the cross-entropy loss function to train the model:

$\text{Logloss} = -\frac{1}{T}\sum_{t=1}^{T}\big(y_t \log(\hat{y}_t) + (1 - y_t)\log(1 - \hat{y}_t)\big)$  (2)

where $T$ represents the number of training samples, $y_t$ is the expected category and $\hat{y}_t$ is the predicted category.
The learning rate of this experiment is updated by the exponential decay method:

$\eta = \eta_0 \cdot \alpha^{\lfloor l/d \rfloor}$  (3)

where $\eta$ represents the updated learning rate, $\eta_0$ is the initial learning rate, $\alpha$ is the decay coefficient, and $\lfloor l/d \rfloor$ denotes the downward rounding of the quotient of the number of iterations $l$ and the decay step size $d$. In the training process, the loss value is calculated after each learning rate is set; when the loss value is stable, the learning rate can be reduced, so that a suitable minimum learning rate can be obtained by repeated experiments.
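A minimal Keras sketch of this design is given below. The frozen VGG16 convolutional base, the two new fully connected layers (1024 and 2 units), SGD with momentum, the exponential learning-rate decay of Eq. (3) and the cross-entropy loss of Eq. (2) follow the text; the decay steps, decay rate and momentum value of 0.9 are illustrative assumptions.

```python
import tensorflow as tf

# Transfer the 13 convolutional layers of VGG16 and freeze them.
base = tf.keras.applications.VGG16(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False

# Replace the original three fully connected layers with two new
# ones (1024 and 2 neurons) followed by softmax.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1024, activation="relu"),
    tf.keras.layers.Dense(2, activation="softmax"),
])

# Exponential decay eta = eta0 * alpha**floor(l/d), mirroring Eq. (3);
# decay_steps (d) and decay_rate (alpha) are illustrative values.
schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=1e-3, decay_steps=1000,
    decay_rate=0.9, staircase=True)

# SGD with momentum and the cross-entropy loss of Eq. (2).
model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=schedule,
                                      momentum=0.9),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"])
# model.fit(train_images, train_labels, epochs=100, batch_size=64)
```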
B. WILDFIRE REGION DETECTION
This module is designed to enable students to understand and master how to build a deep neural network model to extract spatial and temporal features; these features are then used to detect wildfire regions.
The wildfire region detection contains two stages: the first stage detects moving objects with the ViBe algorithm, and the second stage uses 16 × 16 blocks to traverse the moving objects and classifies each block as a wildfire block or a non-wildfire block. The detection of wildfire regions can thus be considered a binary classification problem in our work. Considering that deep CNN architectures are the best choice for classification due to their ability to extract highly representative features, and also that forest wildfires occur at different spatial and temporal scales, we design new CNNs to extract both spatial and temporal features in this paper.
The specific steps of this module are designed as follows: we set the moving objects detected by the ViBe algorithm as candidate wildfire regions, then we use 16 × 16 blocks to traverse these regions, and optimize the CNN through an appropriate network depth and multiple convolution kernel sizes to extract the spatial features of each block and classify them. If a block is classified as a flame block, the region is annotated. Furthermore, we extract the temporal features of candidate flame blocks with an optimized CNN model to annotate the wildfire region. We finally detect the wildfire region through the combination of temporal and spatial features.

1) VIBE ALGORITHM
ViBe is an effective object detection algorithm proposed by Barnich and Van Droogenbroeck [29]. First, we initialize the background model. The ViBe algorithm initializes the background model with the first frame of the video: for each pixel, considering that its adjacent pixels are likely to have similar values, a pixel value from its neighborhood is randomly selected as its sample value. Then, background modeling and foreground detection are carried out. The main idea of this algorithm is to determine whether a pixel is a background point. Background modeling stores a sample set $M(x) = \{v_1, v_2, \ldots, v_n\}$ for each background point $x$, where $n$ is the size of the sample set. For each new pixel, we compare it with each value in the sample set and count the samples that lie within a distance $R$ of it; when this count is greater than or equal to the threshold $T$, the new pixel is considered background, otherwise it is considered foreground:

$\#\{d(N_R(x), \{v_1, v_2, \ldots, v_n\})\} \ge T$  (4)
where $N_R(x)$ represents the neighborhood with radius $R$ of the point $x$, and $\#$ represents the count. Students can try different parameters in their experiments: for example, the sample-set size $n$ can be set to 10, 20, and 30, the threshold $T$ can be varied between 1 and 10, and the radius $R$ between 10 and 30, from which the optimal values can be selected. Finally, we update the background. The background model needs to be updated: a value in the pixel's sample set is randomly replaced with the new pixel value. In addition, we update the background according to a certain probability: when a pixel is determined to be background, we update it with probability 1/r, where r is the time sampling factor, which we set to 16 in our experiment. A compact sketch of this procedure follows.
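The sketch below is a deliberately simplified, single-channel NumPy version of the algorithm, written for readability rather than speed; the full ViBe algorithm additionally propagates updates to a random neighbor's model, which is omitted here.

```python
import numpy as np

class MiniViBe:
    """Simplified ViBe sketch: n samples per pixel, match radius R,
    match-count threshold T and time sampling factor r."""

    def __init__(self, first_frame, n=20, R=20, T=2, r=16):
        self.n, self.R, self.T, self.r = n, R, T, r
        h, w = first_frame.shape
        # Initialize each pixel's sample set from the first frame,
        # jittered to imitate sampling from its neighborhood.
        jitter = np.random.randint(-10, 11, size=(n, h, w))
        self.samples = np.clip(first_frame[None].astype(int) + jitter,
                               0, 255)

    def apply(self, frame):
        f = frame.astype(int)[None]                  # shape (1, h, w)
        # Eq. (4): count samples within distance R of the new pixel;
        # at least T matches means the pixel is background.
        matches = (np.abs(self.samples - f) < self.R).sum(axis=0)
        background = matches >= self.T
        # Conservative update: with probability 1/r, overwrite one
        # random sample at each background pixel with the new value.
        update = background & (np.random.rand(*frame.shape) < 1 / self.r)
        idx = np.random.randint(0, self.n, size=frame.shape)
        ys, xs = np.nonzero(update)
        self.samples[idx[ys, xs], ys, xs] = frame[ys, xs]
        return (~background).astype(np.uint8) * 255   # foreground mask
```

Initializing MiniViBe with the first grayscale frame and calling apply() on each subsequent frame yields the moving-foreground mask used as candidate wildfire regions.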
2) SPATIAL FEATURE EXTRACTION
First, we use 16 × 16 blocks to traverse the foreground area detected by the ViBe algorithm. Then, when the number of foreground pixels in the current block is greater than a certain threshold, an optimized CNN is designed to extract the features of the current block for classification.
As shown in Figure 4, the box in the upper left corner of the left image defines the region to be detected, and the right image shows the moving foreground detected by the ViBe algorithm for this region. The small box in the upper left corner of the right image represents a 16 × 16 block. We use the block to traverse the region from top to bottom for CNN forest wildfire detection. Before inputting the image blocks to the CNN, we resize each 16 × 16 block to 28 × 28 using bilinear interpolation, as sketched below.

FIGURE 4. An example of candidate flame area traversal.
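A sketch of this traversal is shown below; the minimum foreground-pixel count per block is an assumed value, since the text only says that the count must exceed a certain threshold.

```python
import cv2

def candidate_blocks(frame, fg_mask, block=16, min_fg=64):
    """Slide a 16x16 window over the ViBe foreground mask and yield
    28x28 candidate blocks resized with bilinear interpolation."""
    h, w = fg_mask.shape
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            # Only blocks with enough foreground pixels go to the CNN.
            if cv2.countNonZero(fg_mask[y:y + block, x:x + block]) >= min_fg:
                patch = frame[y:y + block, x:x + block]
                yield (y, x), cv2.resize(patch, (28, 28),
                                         interpolation=cv2.INTER_LINEAR)
```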
Convolutional Neural Network (CNN) is a commonly used deep learning architecture based on convolution, and it is an important model for image recognition, speech recognition [37] and other problems [38], [39]. Compared with general neural networks, its three main characteristics are local receptive fields, weight sharing and pooling. Designing the CNN is the difficult part of this experiment, and it is necessary to design an optimized model that adapts to the characteristics of forest wildfire images. To extract the spatial features of image blocks, we design the CNN shown in Table 1.

TABLE 1. The structure of the spatial CNN network.

In the "Type" column of Table 1, "b" represents a batch normalization layer, "c" a convolution layer, "r" a ReLU layer, "p" a pooling layer, "d" a dropout layer, and "s" a softmax layer. A batch normalization layer is set before each convolution layer to normalize the data in batches; therefore, layers 1, 5, and 9 are all batch normalization layers, each followed by a convolution layer. Layers 2, 6, 10 and 12 are convolution layers: layers 2 and 6 use a 5 × 5 convolution kernel, layer 10 uses a 4 × 4 kernel, and layer 12 uses a 1 × 1 kernel, which maps the 100 feature maps to 2-dimensional vectors for binary classification. Layers 3, 7 and 11 are ReLU activation layers placed after the convolution layers. Layers 4 and 8 are max pooling layers that expand the receptive field. Layer 13 is a dropout layer whose drop factor is set to 0.5 in this experiment to enhance the generalization ability of the model. Layer 14 is the softmax layer. The "size" column in the table gives the size of the convolution kernel, the "step" column the stride, and the "input" and "output" columns the shapes of the input and output tensors.
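Since Table 1 itself is not reproduced here, the Keras sketch below reconstructs the 14-layer sequence from the prose: the layer types, kernel sizes, dropout factor and final softmax follow the description, while the intermediate channel widths (32 and 64) and the global average pooling used to collapse the remaining spatial dimensions are assumptions, as only the 100 feature maps before the 1 × 1 convolution are stated explicitly.

```python
import tensorflow as tf
from tensorflow.keras import layers

spatial_cnn = tf.keras.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.BatchNormalization(),             # layer 1:  b
    layers.Conv2D(32, 5, padding="same"),    # layer 2:  c, 5x5 kernel
    layers.ReLU(),                           # layer 3:  r
    layers.MaxPooling2D(2),                  # layer 4:  p
    layers.BatchNormalization(),             # layer 5:  b
    layers.Conv2D(64, 5, padding="same"),    # layer 6:  c, 5x5 kernel
    layers.ReLU(),                           # layer 7:  r
    layers.MaxPooling2D(2),                  # layer 8:  p
    layers.BatchNormalization(),             # layer 9:  b
    layers.Conv2D(100, 4),                   # layer 10: c, 4x4 -> 100 maps
    layers.ReLU(),                           # layer 11: r
    layers.Conv2D(2, 1),                     # layer 12: c, 1x1 -> 2 maps
    layers.Dropout(0.5),                     # layer 13: d
    layers.GlobalAveragePooling2D(),         # collapse to 2 logits (assumed)
    layers.Softmax(),                        # layer 14: s
])
```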

3) TEMPORAL FEATURE EXTRACTION
This module extracts optical flow sequence features as the input of a CNN model in order to extract dynamic temporal features. The optical flow represents the instantaneous speed of a moving object. The optical flow is defined as the displacement vector field $d_t$ between anterior and posterior frames, where $d_t(m, n)$ represents the displacement vector of the pixel $(m, n)$ from time $t$ to time $t + 1$, $d_t^x$ represents its horizontal component and $d_t^y$ its vertical component. In order to capture the temporal behavior, for a group of $L$ consecutive optical flows, the horizontal components $d_t^x$ and the vertical components $d_t^y$ are concatenated into a $2L$-channel optical flow sequence, denoted $d_t^{x,y} \in \mathbb{R}^{w \times h \times 2L}$.
Figure 5 shows a wildfire image taken from two consecutive frames and its corresponding optical flow fields: panel (b) shows the optical flow field in the horizontal direction, and panel (c) shows the optical flow field in the vertical direction. It can be seen that the optical flow field in the flame region varies greatly, while the optical flow field in other, largely static regions is relatively smooth. Flame movement is generally reflected in the vertical direction, so the change of the optical flow field in the vertical direction is more significant than that in the horizontal direction.

FIGURE 5. An example of the optical flow field.

We input $d_t^{x,y} \in \mathbb{R}^{w \times h \times 2L}$ to an optimized CNN to extract the temporal features and classify them; the value of $L$ is set to 5 in our experiment. The designed CNN structure is similar to that in Table 1: as shown in Table 2, only the structures of layers 1 to 6 are different, and the parameters of the other layers are the same.

TABLE 2. The structure of the temporal CNN network.
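A sketch of how the $2L$-channel input can be assembled is given below; Farneback optical flow is used as a stand-in, since the text does not name the specific optical-flow method.

```python
import cv2
import numpy as np

def flow_stack(gray_frames, L=5):
    """Build the 2L-channel optical-flow sequence d_t^{x,y} from
    L + 1 consecutive grayscale frames."""
    assert len(gray_frames) == L + 1
    channels = []
    for t in range(L):
        # Dense optical flow between frames t and t+1; the returned
        # array has shape (h, w, 2) holding (dx, dy) per pixel.
        flow = cv2.calcOpticalFlowFarneback(
            gray_frames[t], gray_frames[t + 1], None,
            0.5, 3, 15, 3, 5, 1.2, 0)
        channels.append(flow[..., 0])   # horizontal component d_t^x
        channels.append(flow[..., 1])   # vertical component   d_t^y
    return np.stack(channels, axis=-1)  # shape (h, w, 2L)
```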

IV. RESULTS AND ANALYSIS
For the experiment of wildfire image classification, accuracy is used to evaluate our model [26]:

$\text{accuracy} = \frac{N_{pos}}{N_{total}}$  (5)

where $N_{pos}$ represents the number of images that are classified correctly and $N_{total}$ the total number of images. For wildfire region detection, we use precision, recall and accuracy. The calculation of accuracy is the same as Eq. (5), which can also be written as

$\text{accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$  (6)

The precision and recall are defined as follows:

$\text{precision} = \frac{TP}{TP + FP}$  (7)

$\text{recall} = \frac{TP}{TP + FN}$  (8)

where TP is True Positive, TN is True Negative, FP is False Positive, and FN is False Negative.
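A minimal helper implementing Eqs. (6)-(8) for block-level labels might look as follows:

```python
import numpy as np

def detection_metrics(y_true, y_pred):
    """Accuracy, precision and recall per Eqs. (6)-(8);
    label 1 marks a wildfire block, label 0 a non-wildfire block."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall
```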
In order to verify our experiment, we used the FLAME dataset, a forest wildfire dataset released by Northern Arizona University [30]. A total of 900 consecutive images of wildfire flames were collected as positive samples, and 1000 non-flame images were collected as negative samples from FLAME. We divided the dataset into three parts: a training set, a validation set and a testing set. The wildfire region detection based on the optimized CNN needs to identify the specific location of the wildfire, so it is necessary to manually annotate the regions containing wildfire. Each wildfire image was divided into 16 × 16 areas, and we manually annotated each block in each image, while the blocks from the non-wildfire images could directly be annotated as non-wildfire blocks. In addition, we set the batch size of the CNN to 100. The wildfire classification experiment based on SVM and Reduce-VGGNet only needs to know whether an image is a positive or a negative sample. The detailed information about our dataset is shown in Table 3. Figure 6 shows some wildfire images, and Figure 7 shows some non-wildfire images.

TABLE 3. The detailed information of our dataset.

FIGURE 6. Examples of wildfire images.

FIGURE 7. Examples of non-wildfire images.

A. RESULTS OF WILDFIRE IMAGE CLASSIFICATION
Considering the input requirements of the VGG16 model, all images are scaled to 224 × 224 in our experiment. For wildfire image classification, we only use the training set and the testing set in Table 3.

1) RESULTS OF SVM
The parameters in LIBSVM are set to their default values. The model is trained on 1140 images and tested on 380 images. The classification results are closely related to the selection of parameters. First, we compare training with and without normalization; the results are shown in Table 4. It can be seen that the classification results after normalization are much better than those without normalization: the accuracy reaches 83.64% after normalization.

TABLE 4. The effects of normalization.

Secondly, we compare the effects of different kernel functions after normalization, including the radial basis kernel function, the polynomial kernel function and the sigmoid kernel function. The results are shown in Table 5. It can be seen from Table 5 that the performance of the radial basis kernel is superior to the polynomial and sigmoid kernels.

TABLE 5. The effects of kernel functions.

Thirdly, for the selection of the optimal parameters c and g, students are required to search the interval [−10, 10] with a step length of 0.5 using cross-validation, and to select the combination of c and g with the best performance, as sketched below. On the basis of this module, students can be encouraged to compare SVM with other algorithms, such as AdaBoost.
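A sketch of this search is given below. Following the usual LIBSVM convention, the interval [−10, 10] with step 0.5 is interpreted as exponents, i.e. C = 2^i and g = 2^j; this log2 reading is our assumption, and the feature files are hypothetical placeholders.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X = np.load("features.npy")   # normalized features (placeholder file)
y = np.load("labels.npy")     # 0/1 labels (placeholder file)

exponents = np.arange(-10, 10.5, 0.5)
best = (None, None, -1.0)
for i in exponents:
    for j in exponents:
        clf = SVC(kernel="rbf", C=2.0 ** i, gamma=2.0 ** j)
        # 5-fold cross-validated accuracy for this (c, g) pair.
        score = cross_val_score(clf, X, y, cv=5).mean()
        if score > best[2]:
            best = (2.0 ** i, 2.0 ** j, score)

print(f"best C={best[0]:g}, gamma={best[1]:g}, cv accuracy={best[2]:.4f}")
```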

2) RESULTS OF REDUCE-VGGNET
Figure 8 shows the experimental results of the Reduce-VGGNet network. When the number of iterations approaches 100, the network reaches the convergence state, and the accuracy on the test set reaches 91.20%. It can be seen that the experimental result of this method is superior to that of the SVM algorithm. Figure 9 shows the loss curves during model training and testing: as the number of iterations grows, the model tends to converge, and the convergence speed of this model is very fast.

FIGURE 8. The accuracy curves of Reduce-VGGNet with different epochs.

FIGURE 9. The loss curves of Reduce-VGGNet with different epochs.

B. RESULTS OF WILDFIRE REGION DETECTION
For the CNN-based spatial feature extraction and detection, after 1000 iterations on the validation set, the accuracy of wildfire region detection reaches 92.6%. For the CNN-based temporal feature detection, after 1000 iterations on the validation set, the accuracy reaches 93.92%. It can be seen from Tables 6 and 7 that the accuracy of the model reaches about 88% after 100 iterations for both the temporal and the spatial features. As shown in Table 8, if the spatial and temporal features are combined, the accuracy reaches 97.35% with the prior classification operation, and 94.37% without it. This result shows that the classification operation before wildfire region detection helps improve the performance. Figure 10 shows some examples of detected fire regions. It can be seen from the experimental results that the optimized CNN model can effectively detect wildfire regions.

TABLE 6. The results of spatial feature detection based on the optimized CNN.

TABLE 7. The results of temporal feature detection based on the optimized CNN.

TABLE 8. The results of the combination of spatial and temporal features.

Moreover, we compare our method with the well-known wildfire detection algorithms proposed by Habiboğlu [31] and Oh [32] on the testing set. The former divides the video into blocks and adopts covariance-based features extracted from these blocks to detect fire regions; the latter adopts a lightweight CNN model to tackle early detection of wildfire regions. These two methods have detailed steps, which makes them suitable for students to use for experimental comparison. We also compare our method with some state-of-the-art object detection algorithms, adopting the Faster R-CNN (Faster Region-based Convolutional Network) and SSD (Single Shot Multibox Detector) models [40] for comparison. The results are shown in Table 9. It can be seen that the method proposed in our manuscript performs better than the other methods. The accuracy of our method is 2.34% higher
than that of Faster R-CNN, and 3.50% higher than that of Habiboğlu.

TABLE 9. The experimental results of different methods.

FIGURE 10. The results of some detected wildfire regions.

V. CONCLUSION
This paper designs a comprehensive experiment on forest wildfire detection, which covers digital image processing, machine learning and deep learning, and meets the requirements of a comprehensive experiment for a Machine Vision course. The topic of this experiment is still a hot topic in current image processing research. The designed experiments focus on wildfire image classification and wildfire region detection. We propose the Reduce-VGGNet model for classification, and propose the combination of spatial and temporal features based on the optimized CNN model for wildfire region detection. The experimental results show that Reduce-VGGNet achieves better results than SVM, and that the optimized CNN with the prior classification operation achieves better performance than other state-of-the-art methods. Students can further explore recent research on the basis of this experiment and improve our methods to achieve more innovative results in the field of forest wildfire detection. The experiment effectively combines teachers' scientific research with teaching. It can not only mobilize students' subjective initiative, but also improve their practical ability and innovative thinking, so as to meet the needs of fostering talents in machine vision for artificial intelligence.
Our framework may not be directly applicable to satellite images. Although the combination of spatial and temporal features in our method addresses wildfires occurring at different spatial and temporal scales, there are still challenges in satellite data processing, such as the complexity of noise information (clouds, lighting, smoke, etc.); traditional deep learning frameworks may not solve these challenges.
In our future work, 1) we will explore other CNN-based methods and pre-processing methods to further improve both the speed and the accuracy of wildfire region detection; 2) we will utilize satellite images and explore the detection of wildfire at the pixel level. We can explore multi-sensor data from different sources (openly available satellite imagery) to improve the results in noisy conditions; specifically, we can combine single-sensor detection results and quantify the improvement.

REFERENCES
[1] A. Kumar, A. Kaur, and M. Kumar, "Face detection techniques: A review," Artif. Intell. Rev., vol. 52, no. 2, pp. 927–948, 2019.
[2] M. J. Sousa, A. Moutinho, and M. Almeida, "Wildfire detection using transfer learning on augmented datasets," Exp. Syst. Appl., vol. 142, Mar. 2020, Art. no. 112975.
[3] L. Wang, Y. Zhang, and X. Xu, "A novel group detection method for finding related Chinese herbs," J. Inf. Sci. Eng., vol. 31, no. 4, pp. 1387–1411, 2015.
[4] B. Tang, J. Kong, and S. Wu, "Review of surface defect detection based on machine vision," J. Image Graph., vol. 22, no. 12, pp. 1640–1663, 2017.
[5] S. Yin, Y. J. Ren, T. Liu, S. Guo, J. Zhao, and J. Zhu, "Review on application of machine vision in modern automobile manufacturing," Acta Optica Sinica, vol. 38, no. 8, 2018, Art. no. 0815001.
[6] F. Min and T. Lu, "Exploration and practice of machine vision course teaching for production practice," Comput. Educ., vol. 10, pp. 41–43, Jan. 2017.
[7] Z. Wang, G. Xiao, and H. Liu, "Teaching reform of the course of machine vision principles and applications for professional degree postgraduates," Educ. Teaching Forum, vol. 3, no. 11, pp. 37–40, 2021.
[8] T. Shao, J. Tang, and Y. Liao, "Comprehensive experiment system of cocoon sorting based on machine vision," Experim. Technol. Manag., vol. 37, no. 289, pp. 125–127 and 132, 2020.
[9] Z. Han and X. Liu, "Design of machine vision experiment teaching platform based on Python," Comput. Meas. Control, vol. 258, no. 3, pp. 249–253, 2020.
[10] J. Sigut, M. Castro, R. Arnay, and M. Sigut, "OpenCV basics: A mobile application to support the teaching of computer vision concepts," IEEE Trans. Educ., vol. 63, no. 4, pp. 328–335, Nov. 2020.
[11] M. Cote and A. B. Albu, "Teaching computer vision and its societal effects: A look at privacy and security issues from the students' perspective," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. Workshops (CVPRW), Honolulu, HI, USA, Jul. 2017, pp. 1378–1386.
[12] S. Spurlock and S. Duvall, "Making computer vision accessible for undergraduates," J. Comput. Sci. Colleges, vol. 33, no. 2, pp. 215–221, 2017.
[13] Y. Jiang and B. Li, "Exploration on the teaching reform measure for machine learning course system of artificial intelligence specialty," Sci. Program., vol. 2021, Nov. 2021, Art. no. 8971588.
[14] Y. Wang, M. Stankovic, A. Smith, and E. T. Matson, "Leader-follower system in convoys: An experimental design focusing on computer vision," in Proc. IEEE Sensors Appl. Symp. (SAS), Aug. 2021, pp. 1–6.
[15] N. J. Abram, B. J. Henley, A. Sen Gupta, T. J. R. Lippmann, H. Clarke, A. J. Dowdy, J. J. Sharples, R. H. Nolan, T. Zhang, M. J. Wooster, J. B. Wurtzel, K. J. Meissner, A. J. Pitman, A. M. Ukkola, B. P. Murphy, N. J. Tapper, and M. M. Boer, "Connections of climate change and variability to large and extreme forest fires in Southeast Australia," Commun. Earth Environ., vol. 2, no. 1, p. 8, Jan. 2021.

[16] O. M. Bushnaq, A. Chaaban, and T. Y. Al-Naffouri, "The role of UAV-IoT networks in future wildfire detection," IEEE Internet Things J., vol. 8, no. 23, pp. 16984–16999, Dec. 2021.
[17] G. Saldamli, S. Deshpande, K. Jawalekar, P. Gholap, L. Tawalbeh, and L. Ertaul, "Wildfire detection using wireless mesh network," in Proc. 4th Int. Conf. Fog Mobile Edge Comput. (FMEC), Jun. 2019, pp. 229–234.
[18] N. T. Toan, P. Thanh Cong, N. Q. V. Hung, and J. Jo, "A deep learning approach for early wildfire detection from hyperspectral satellite images," in Proc. 7th Int. Conf. Robot Intell. Technol. Appl. (RiTA), Nov. 2019, pp. 38–45.
[19] L. Jia and Y. Jiong, "Multi-information fusion flame detection based on Ohta color space," J. East China Univ. Sci. Technol., vol. 45, no. 6, pp. 962–969, 2019.
[20] Y. Tan, L. Xie, H. Feng, L. Peng, and Z. Zhang, "Flame detection algorithm based on image processing technology," Laser Optoelectron. Prog., vol. 56, no. 16, 2019, Art. no. 161012.
[21] F. M. A. Hossain, Y. Zhang, C. Yuan, and C.-Y. Su, "Wildfire flame and smoke detection using static image features and artificial neural network," in Proc. 1st Int. Conf. Ind. Artif. Intell. (IAI), Jul. 2019, pp. 1–6.
[22] T. Schultze, T. Kempka, and I. Willms, "Audio-video fire-detection of open fires," Fire Saf. J., vol. 41, no. 4, pp. 311–314, Jun. 2006.
[23] Y. Xie, J. Zhu, Y. Cao, Y. Zhang, D. Feng, Y. Zhang, and M. Chen, "Efficient video fire detection exploiting motion-flicker-based dynamic features and deep static features," IEEE Access, vol. 8, pp. 81904–81917, 2020.
[24] M. Shahid, I.-F. Chien, W. Sarapugdi, L. Miao, and K.-L. Hua, "Deep spatial–temporal networks for flame detection," Multimedia Tools Appl., vol. 80, nos. 28–29, pp. 35297–35318, Nov. 2021.
[25] D. Zhang, H. Xiao, J. Wen, and Y. Xu, "Real-time fire detection method with multi-feature fusion on YOLOv5," Pattern Recognit. Artif. Intell., vol. 35, no. 6, pp. 548–561, 2022.
[26] J. Yuan, L. Wang, P. Wu, C. Gao, and L. Sun, "Detection of wildfires along transmission lines using deep time and space features," Pattern Recognit. Image Anal., vol. 28, no. 4, pp. 805–812, Oct. 2018.
[27] Z. Wang, D. Wei, and X. Hu, "Research on two stage flame detection algorithm based on fire feature and machine learning," in Proc. Int. Conf. Robot., Intell. Control Artif. Intell., Sep. 2019, pp. 574–578.
[28] J. Ryu and D. Kwak, "Flame detection using appearance-based pre-processing and convolutional neural network," Appl. Sci., vol. 11, no. 11, p. 5138, May 2021.
[29] O. Barnich and M. Van Droogenbroeck, "ViBe: A universal background subtraction algorithm for video sequences," IEEE Trans. Image Process., vol. 20, no. 6, pp. 1709–1724, Jun. 2011.
[30] A. Shamsoshoara, F. Afghah, A. Razi, L. Zheng, P. Z. Fulé, and E. Blasch, "Aerial imagery pile burn detection using deep learning: The FLAME dataset," Comput. Netw., vol. 193, Jul. 2021, Art. no. 108001.
[31] Y. H. Habiboğlu, O. Günay, and A. E. Çetin, "Covariance matrix-based fire and flame detection method in video," Mach. Vis. Appl., vol. 23, no. 6, pp. 1103–1113, 2012.
[32] S. H. Oh, S. W. Ghyme, S. K. Jung, and G. W. Kim, "Early wildfire detection using convolutional neural network," in Proc. Int. Workshop Frontiers Comput. Vis. Singapore: Springer, 2020, pp. 18–30.
[33] A. Bouguettaya, H. Zarzour, A. M. Taberkit, and A. Kechida, "A review on early wildfire detection from unmanned aerial vehicles using deep learning-based computer vision algorithms," Signal Process., vol. 190, Jan. 2022, Art. no. 108309.
[34] D. Rashkovetsky, F. Mauracher, M. Langer, and M. Schmitt, "Wildfire detection from multisensor satellite imagery using deep semantic segmentation," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 14, pp. 7001–7016, 2021.
[35] P. Jindal, H. Gupta, N. Pachauri, V. Sharma, and O. P. Verma, "Real-time wildfire detection via image-based deep learning algorithm," in Soft Computing: Theories and Applications (Advances in Intelligent Systems and Computing), vol. 2. Singapore: Springer, 2021, pp. 539–550.
[36] S. Treneska and B. R. Stojkoska, "Wildfire detection from UAV collected images using transfer learning," in Proc. 18th Int. Conf. Informat. Inf. Technol., Skopje, North Macedonia, 2021, pp. 6–7.
[37] A. A. Abdelhamid, E.-S.-M. El-Kenawy, B. Alotaibi, G. M. Amer, M. Y. Abdelkader, A. Ibrahim, and M. M. Eid, "Robust speech emotion recognition using CNN+LSTM based on stochastic fractal search optimization algorithm," IEEE Access, vol. 10, pp. 49265–49284, 2022.
[38] N. Lopac, F. Hrzic, I. P. Vuksanovic, and J. Lerga, "Detection of non-stationary GW signals in high noise from Cohen's class of time-frequency representations using deep learning," IEEE Access, vol. 10, pp. 2408–2428, 2022.
[39] L. Wang, Y. Zhang, and K. Hu, "FEUI: Fusion embedding for user identification across social networks," Int. J. Speech Technol., vol. 52, no. 7, pp. 8209–8225, May 2022.
[40] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, "SSD: Single shot multibox detector," in Proc. Eur. Conf. Comput. Vis. Cham, Switzerland: Springer, Sep. 2016, pp. 21–37.

LIDONG WANG (Member, IEEE) was born in Wenzhou, Zhejiang, China, in 1982. She received the Ph.D. degree from the College of Computer Science and Technology, Zhejiang University, in 2013. She was a Visiting Scholar with Tongji University, in 2022. She is currently an Associate Professor with Hangzhou Normal University. She has published more than 30 articles on social network analysis, data mining, pattern recognition, and computer education. Her current research interests include image processing, machine learning, and text mining.

HUIXI ZHANG received the master's degree in circuit and system from Zhejiang University, in 2005. She joined Hangzhou Normal University as a Lecturer. Her research interests include signal processing, system design, and the Internet of Things technology.

YIN ZHANG was born in Lanzhou. She received the Ph.D. degree in computer science from Zhejiang University. She is currently an Assistant Professor with the College of Computer Science and Technology, Zhejiang University. Her current research interests include knowledge discovery, machine learning, digital library, and information and knowledge management.

KEYONG HU (Member, IEEE) received the Ph.D. degree in mechatronic engineering from the Zhejiang University of Technology, Hangzhou, China, in 2016. He is currently a Teacher of electronic information engineering with Qianjiang College, Hangzhou Normal University. His research interests include artificial intelligence and new energy technology.

KANG AN (Member, IEEE) received the master's degree in circuit and systems from the Guilin University of Electronic Technology, in 2007. He joined Hangzhou Normal University as an Associate Professor. His research interests include the Internet of Things technology and machine learning.
