
SPECIAL SECTION ON ARTIFICIAL INTELLIGENCE (AI)-EMPOWERED INTELLIGENT TRANSPORTATION SYSTEMS

Received December 17, 2019, accepted December 29, 2019, date of publication January 6, 2020, date of current version January 14, 2020.
Digital Object Identifier 10.1109/ACCESS.2020.2964055

Vehicle-Damage-Detection Segmentation
Algorithm Based on Improved Mask RCNN
QINGHUI ZHANG 1,2, XIANING CHANG 2, AND SHANFENG BIAN 2
1 Key Laboratory of Grain Information Processing and Control, Ministry of Education, Henan University of Technology, Zhengzhou 450001, China
2 College of Information Science and Engineering, Henan University of Technology, Zhengzhou 450001, China

Corresponding author: Qinghui Zhang ([email protected])


This work was supported by the National Natural Science Foundation of China under Grant U1404617.

ABSTRACT Traffic congestion due to vehicular accidents seriously affects normal travel, and accurate
and effective mitigating measures and methods must be studied. To resolve traffic accident compensation
problems quickly, a vehicle-damage-detection segmentation algorithm based on transfer learning and an
improved mask regional convolutional neural network (Mask RCNN) is proposed in this paper. The
experiment first collects car damage pictures for preprocessing and uses Labelme to make data set labels,
which are divided into training sets and test sets. The residual network (ResNet) is optimized, and feature
extraction is performed in combination with Feature Pyramid Network (FPN). Then, the proportion and
threshold of the Anchor in the region proposal network (RPN) are adjusted. The spatial information of
the feature map is preserved by bilinear interpolation in ROIAlign, and different weights are introduced
in the loss function for different-scale targets. Finally, the results of self-made dedicated dataset training and
testing show that the improved Mask RCNN has better Average Precision (AP) value, detection accuracy
and masking accuracy, and improves the efficiency of solving traffic accident compensation problems.

INDEX TERMS Mask RCNN, vehicle-damage-detection, loss function, detection accuracy.

The associate editor coordinating the review of this manuscript and approving it for publication was Amr Tolba.

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see http://creativecommons.org/licenses/by/4.0/
VOLUME 8, 2020 6997

I. INTRODUCTION
Object detection is one of the main research topics in computer vision. Its goal is to determine, at the instance level, the category and location of each object of interest in an image. Currently, the most popular target detection algorithms include RCNN [1], Fast RCNN [2], Faster RCNN [3], and SSD [4]. However, these frameworks require a large amount of training data and cannot achieve end-to-end detection. The positioning ability of the detection frame is limited, and, as the number of convolution layers increases during feature extraction, vanishing or exploding gradients often occur. To address these drawbacks, He et al. proposed the residual network (ResNet) [5], [25], which uses residual modules to help the model converge and to accelerate the training of the neural network, and which is combined with the target detection model Mask RCNN [6], [26], [27] to realize object detection and segmentation, greatly improving detection accuracy. Mask RCNN is the first deep learning model that combines both target detection and segmentation in one network [7]. It can handle challenging instance segmentation tasks: it not only accurately segments individuals in different categories, but also labels each pixel in the image to distinguish different individuals in the same category [8].
Most current instance segmentation algorithms are based on candidate regions. Pinheiro et al. [9] proposed the DeepMask segmentation model, which outputs candidate masks for the instances appearing in the input image in order to segment each instance object, but its boundary segmentation accuracy is low [10]. Li et al. [11] proposed the first end-to-end instance segmentation framework, fully convolutional instance segmentation (FCIS). By improving the position-sensitive score map, FCIS predicts both the bounding box and the instance segmentation, but it can only roughly detect the boundary of each instance when processing overlapping object instances [12]. He et al. [6] proposed the Mask RCNN framework, which gives relatively fine instance segmentation results among existing segmentation algorithms [13].
Compared with traditional target detection methods, Mask RCNN not only greatly improves detection accuracy, but also has great advantages in small-target detection. It is widely used in
agriculture [14], construction [15], medical image segmentation [16], and other fields. Lin et al. [17] used Mask RCNN to classify rice planthoppers, realizing effective and rapid identification of rice planthoppers and non-rice planthoppers with an average recognition accuracy of 0.923. Wang et al. [18] applied Mask RCNN to ship-target detection, showing that Mask RCNN performs better on closely aligned targets and multi-scale targets. Shi et al. [19] applied Mask RCNN to an existing home-service-robot platform to obtain the category, location, and item-mask information of the target, and obtained an mAP value of 85%. Li et al. [20] proposed a building target detection algorithm based on Mask RCNN; in remote sensing images of different scenes, it detects building targets with an accuracy of 94.6%. The Mask RCNN algorithm thus has a very wide range of applications, but it has not yet been used in the field of automobile damage detection.
This paper uses the Mask RCNN algorithm to detect and segment the damaged areas of automobiles in traffic accidents, which has important research value and broad application scenarios in the field of transportation. Because car damage detection and segmentation is complex, it suffers from problems such as low detection and segmentation accuracy and slow detection speed. This paper improves the model's network structure by reducing the number of layers in the residual network and adjusting its internal structure to strengthen the regularization of the model and enhance its generalization ability, and then adjusts the parameters of the anchor box and the loss function to improve the accuracy of car damage detection and segmentation. The improved Mask RCNN is applied to automobile damage detection, and a model based on it is proposed for detecting and segmenting the damaged area of a vehicle in an accident. Photos can be taken by both parties to the accident and uploaded for assessment. Insurance companies can also use this model to process claims quickly.

II. CAR-DAMAGE-DETECTION ALGORITHM FRAMEWORK
The vehicle-damage-detection and segmentation system based on the Mask RCNN model designed in this paper is shown in Figure 1.

FIGURE 1. Car-damage-detection segmentation system framework.

As the figure shows, images of the damaged parts of cars are selected and collected as required, and the data are annotated with the LabelMe annotation tool to make a dataset in the .json format, which is divided into a training set and a test set. The data are sent to the Mask RCNN for feature extraction, classification prediction, and segmentation masking, and the car-damage-detection result is output.

A. MASK RCNN ALGORITHM
Mask RCNN is an instance segmentation framework extended from Faster RCNN. It is divided into two stages: the first stage scans the image and generates proposals, and the second classifies the proposals and generates the bounding boxes and masks. The network structure of the Mask RCNN algorithm is shown in Figure 2.

FIGURE 2. Mask RCNN network framework model.

The algorithm flow is the following.
(1) Input the image to be processed into a pre-trained ResNet50+FPN network model to extract features and obtain the corresponding feature maps.
(2) The RPN generates a large number of candidate frames (i.e., regions of interest, or ROIs) from these feature maps, then uses a softmax classifier to perform binary classification of foreground and background, uses frame regression to obtain more accurate candidate-frame positions, and filters out part of the ROIs by non-maximum suppression.
(3) The feature maps and the remaining ROIs are sent to the RoIAlign layer, so that each ROI generates a fixed-size feature map.
(4) Finally, the flow goes through two branches, one entering the fully connected layers for object classification and frame regression, and the other entering the fully convolutional network (FCN) for pixel segmentation.

B. BACKBONE NETWORK STRUCTURE IMPROVEMENT
Generally, the backbone network of Mask RCNN adopts ResNet101; that is, the number of network layers is 101, but too many layers greatly reduce the speed of the network. The car-damage category trained in this paper is relatively simple, and the requirements on network depth are lower; thus, to further improve the running speed of the algorithm, this paper uses ResNet50.
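As a quick worked check of the depth figures quoted for the two backbones, the sketch below counts the weighted layers of the standard ResNet50 and ResNet101 designs from [5]. The per-stage bottleneck counts (3, 4, 6, 3) and (3, 4, 23, 3) belong to those standard designs, not to the improved network proposed in this paper, and the names `STAGE_BLOCKS` and `weighted_layer_count` are illustrative only.

```python
# Bottleneck blocks per stage (conv2_x .. conv5_x) in the standard ResNet designs.
STAGE_BLOCKS = {
    "ResNet50": (3, 4, 6, 3),
    "ResNet101": (3, 4, 23, 3),
}

CONVS_PER_BOTTLENECK = 3  # 1x1 -> 3x3 -> 1x1 convolutions


def weighted_layer_count(stage_blocks):
    """Count the stem convolution, all bottleneck convolutions, and the final FC layer."""
    stem = 1
    classifier = 1
    return stem + CONVS_PER_BOTTLENECK * sum(stage_blocks) + classifier


for name, blocks in STAGE_BLOCKS.items():
    print(name, weighted_layer_count(blocks))
# ResNet50 50
# ResNet101 101
```

The two designs differ only in the third stage (6 versus 23 bottleneck blocks), so choosing ResNet50 removes 17 blocks, or 51 convolution layers, from the backbone.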


Because the size of the car damage varies between images, a single convolutional neural network alone cannot extract all the image attributes well. Therefore, a backbone structure combining ResNet50 with a feature pyramid network (FPN) is used in this paper. FPN [21] uses a top-down hierarchy with lateral connections to build a network feature pyramid from a single-scale input, which solves the multi-scale problem of extracting target objects in images. This structure has strong robustness and adaptability, and requires fewer parameters.
To further improve the detection accuracy, the backbone network structure is improved and the order of the layers is adjusted, as shown in Figure 3. The right-hand part of each diagram in the figure is called the ‘‘residual’’ branch and the left-hand part the ‘‘identity’’ branch. The value on the ‘‘identity’’ branch should not be changed lightly: its input and output must be kept consistent, otherwise information transmission is affected and the reduction of the loss is hindered. With the order of the layers on the ‘‘residual’’ branch adjusted, the improved ResNet structure has two advantages. First, back-propagation basically meets the requirements, and information transmission is unimpeded. Second, the BN layer acts as a pre-activation, where ‘‘pre’’ is relative to the weight (conv) layer. This can enhance the regularization of the model and gives better generalization performance.

FIGURE 3. ResNet structure and improved ResNetV2 structure.

C. RPN MODEL IMPROVEMENT
In this paper, the Feature Pyramid Network structure is adopted, and the images are scaled to different sizes to generate features corresponding to different sizes. The shallow features can distinguish simple large targets and the deep features can distinguish small targets. The different-size feature maps generated by the FPN are input into the RPN [22], and then the RPN can extract the RoI features from different levels of the feature pyramid according to the size of the target object. Thus, a simple change to the network structure, without substantially increasing the amount of computation, greatly improves the detection performance for small objects and achieves excellent improvements in accuracy and speed.
RPN is equivalent to a sliding-window-based, class-agnostic target detector built on a convolutional neural network. The sliding-window scan produces anchor boxes (anchors). A proposed area can generate a large number of anchors of different sizes and aspect ratios, which overlap to cover as much of the image as possible; the overlap between the proposed area and the desired area (IoU) directly affects the classification result. To adapt to more kinds of damaged car areas, the algorithm adjusts the anchor scales to {32 × 32, 64 × 64, 128 × 128, 256 × 256, 512 × 512} and changes the anchor aspect ratios to {1:2, 1:1, 3:1}, as shown in Figure 4. The IoU measures how well the predicted box covers the real box; its value equals the area of the intersection of the two boxes divided by the area of their union. In this paper, the IoU threshold is set to 0.8; that is, when the overlap ratio between the area corresponding to an anchor and the real target area is greater than 0.8, the anchor is foreground; when the overlap ratio is less than 0.2, it is background; anchors between the two values are discarded. This reduces the amount of computation in the model and saves time, and the improved RPN produces fewer ROIs, which, in turn, increases the efficiency of the model.

FIGURE 4. ROI generated by the original RPN and the improved RPN.

D. RoIAlign MODEL
In the Mask RCNN network structure, the mask branch must determine whether a given pixel is part of the target, and the accuracy must be at the pixel level. After the original image has passed through many convolution and pooling layers, its size has changed, so if pixel-level segmentation is performed directly, the target object cannot be accurately positioned. Mask RCNN therefore improves on Faster RCNN by replacing the RoI Pooling layer with the region-of-interest alignment layer (RoIAlign). Its bilinear interpolation [23] preserves the spatial information on the feature map, which largely removes the error caused by the two quantizations of the feature map in the RoI Pooling layer and solves the problem of regional mismatch of the image object. Pixel-level detection segmentation can thus be achieved.
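To illustrate why bilinear interpolation avoids the quantization error mentioned above, the following minimal sketch samples a small feature map at a fractional position and compares the result with rounding to the nearest cell. It is not the paper's implementation; the function name `bilinear_sample` and the 3 × 3 toy feature map are assumptions made for illustration.

```python
import math


def bilinear_sample(feature, y, x):
    """Sample a 2-D feature map (list of rows) at a real-valued (y, x) position."""
    height, width = len(feature), len(feature[0])
    # Clamp so that the four neighbours always lie inside the map.
    y = min(max(y, 0.0), height - 1.0)
    x = min(max(x, 0.0), width - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, height - 1), min(x0 + 1, width - 1)
    dy, dx = y - y0, x - x0
    # Interpolate along x on the two neighbouring rows, then along y.
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


# Toy feature map (hypothetical values): feature[y][x] = 3 * y + x.
feature_map = [
    [0.0, 1.0, 2.0],
    [3.0, 4.0, 5.0],
    [6.0, 7.0, 8.0],
]

interpolated = bilinear_sample(feature_map, 1.25, 0.75)   # 4.5
quantized = feature_map[round(1.25)][round(0.75)]         # 4.0, position snapped to cell (1, 1)
print(interpolated, quantized)
```

Rounding the sample position to cell (1, 1) shifts the value from 4.5 to 4.0; this is the kind of misalignment that accumulates when RoI boundaries and bins are quantized.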

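Looking back at the RPN settings of Section II-C, the anchor configuration and the IoU labeling rule can be sketched as follows. The scales, ratios, and the 0.8 and 0.2 thresholds come from the text; reading each ratio as width divided by height, keeping the anchor area equal to the squared scale, and the (x1, y1, x2, y2) box format are assumptions of this sketch, as are all function names.

```python
import math

ANCHOR_SCALES = (32, 64, 128, 256, 512)   # side of the square anchor, in pixels
ANCHOR_RATIOS = (0.5, 1.0, 3.0)           # 1:2, 1:1, 3:1, read here as width / height (assumption)
FG_IOU, BG_IOU = 0.8, 0.2                 # thresholds stated in Section II-C


def anchor_shapes(scales=ANCHOR_SCALES, ratios=ANCHOR_RATIOS):
    """Return (width, height) pairs that keep the area scale**2 for each ratio."""
    shapes = []
    for scale in scales:
        for ratio in ratios:
            width = scale * math.sqrt(ratio)
            height = scale / math.sqrt(ratio)
            shapes.append((width, height))
    return shapes


def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_anchor(anchor, gt_box):
    """Label one anchor against one ground-truth box using the two thresholds."""
    overlap = iou(anchor, gt_box)
    if overlap > FG_IOU:
        return "foreground"
    if overlap < BG_IOU:
        return "background"
    return "discarded"


gt = (0.0, 0.0, 100.0, 100.0)                           # hypothetical ground-truth box
print(len(anchor_shapes()))                             # 15 shapes: 5 scales x 3 ratios
print(label_anchor((0.0, 0.0, 100.0, 90.0), gt))        # IoU 0.9 -> foreground
print(label_anchor((0.0, 0.0, 100.0, 50.0), gt))        # IoU 0.5 -> discarded
print(label_anchor((200.0, 200.0, 300.0, 300.0), gt))   # IoU 0.0 -> background
```

With a foreground threshold as strict as 0.8, many partially overlapping anchors fall into the discarded band, which is consistent with the text's observation that the improved RPN produces fewer ROIs.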

The region-of-interest alignment layer RoIAlign differs from RoI pooling in that it eliminates the quantization operation: it does not quantize the ROI boundary or the units, but uses bilinear interpolation to compute the values at the exact, fractional positions of the sample points in each unit, and then uses max pooling or average pooling to output the final fixed-size RoI. As shown in Fig. 5, the blue dotted lines are the 5 × 5 feature map after convolution, the solid lines are the feature block corresponding to the ROI in the feature map, and RoIAlign keeps the floating-point boundary without quantization. First, the feature block is divided into 2 × 2 units (the unit boundaries are not quantized), and each unit is then divided into four small blocks whose center points are taken as four sampling positions, shown as blue dots in the figure. Then, the values at the four positions are calculated by bilinear interpolation, and finally max pooling or average pooling is performed to obtain a feature map of size 2 × 2.

FIGURE 5. RoIAlign schematic.

E. IMPROVEMENT OF LOSS FUNCTION
The multitask loss function of Mask RCNN is

$$L = L_{cls} + L_{box} + L_{mask} \tag{1}$$

In the above equation, $L_{cls}$ and $L_{box}$ are the same as in the loss function of the Faster RCNN model and represent the classification error and detection error, respectively. The mask branch and the class prediction branch are decoupled, and a binary mask is independently predicted for each category, without relying on the prediction results of the classification branch. The loss function in Faster RCNN is

$$L(\{p_i\}, \{t_i\}) = \frac{1}{N_{cls}} \sum_i L_{cls}(p_i, p_i^*) + \lambda \frac{1}{N_{reg}} \sum_i p_i^* L_{reg}(t_i, t_i^*) \tag{2}$$

In the above formula, $i$ is the index of the anchor box in the mini-batch; $N_{cls}$ and $N_{reg}$ normalize the classification term and the regression term, respectively; $p_i$ represents the predicted probability of anchor $i$ being an object; $p_i^*$ is 0 if the anchor box is negative and 1 if the anchor box is positive; $t_i$ denotes the 4 parameterized coordinates of the predicted candidate box; $t_i^*$ denotes the 4 parameterized coordinates of the ground-truth region; $L_{cls}$ and $L_{reg}$ represent the classification loss and the regression loss, respectively; and $\lambda$ is the balance coefficient, which controls the proportion of the two loss functions.
In Faster RCNN, a hyperparameter $\lambda = 10$ is introduced to balance the classification loss and the regression loss, and large-scale and small-scale targets share this one parameter.
The error function of the class prediction branch in Mask RCNN can be calculated by the following formula:

$$L(p, u, t^u, v) = L_{cls}(p, u) + \lambda [u \ge 1] L_{loc}(t^u, v) \tag{3}$$

where $p$ is the predicted class, $u$ is the ground-truth (GT) class, $t^u$ is the predicted bounding box for class $u$, and $v$ is the GT bounding box.
If the hyperparameter $\lambda = 10$ is still used in Mask RCNN, the following phenomenon arises. When high-level semantic information is introduced into the low-level features, small-scale targets show an obvious gain, while large-scale targets do not. When more low-level feature information is introduced or maintained in the high-level features, large-scale targets show an obvious gain, while small-scale targets do not. The frame of a large target is actually more accurate, but its position drift is more serious, so the low-level information that is good for positioning is needed, which contributes to the improvement of large targets on the mAP index. The candidate position of a small target is more accurate, but the judgment of its semantic information is relatively weak, so high-level semantic information is needed to assist discrimination, which contributes to the improvement of small targets on the mAP index. To summarize, for large targets the focus is on optimizing the location information, and for small targets the focus is on optimizing category prediction. That is, for targets of different scales, different weights should be introduced in the loss function to improve the accuracy of the detection branches.

III. EXPERIMENTAL RESULTS AND ANALYSIS
To reduce the number of steps in making dataset labels and to improve the detection accuracy on car-damage images, transfer learning and Mask RCNN are used in this paper to process and detect images showing damage.

A. TRANSFER LEARNING
Deep learning requires a significant amount of data, but in most cases it is difficult to find enough training data for a specific problem within a certain range. To solve this problem, transfer learning [24] is used. Transfer learning involves a source domain and a target domain, defined as

$$D(s) = \{\mathcal{X}, P(X)\}, \quad D(t) = \{\mathcal{X}, P(X)\} \tag{4}$$

where $D(s)$ is the source domain, $D(t)$ the target domain, $\mathcal{X}$ the feature space, and $P(X)$ the marginal probability distribution, with $X = \{x_1, \ldots, x_n\} \in \mathcal{X}$.
Transfer learning is thus used to transfer the model parameters already trained in the source domain to new models in the target domain to help train the new model. Considering that most of the


image data have similar basic features, such as color and mode, in this paper pre-training is first done on the large COCO dataset, and the trained weight files are then transferred to the dedicated dataset collected for this paper for training, fine-tuning the network parameters. This allows the convolutional neural network to achieve good results on a small dataset, thereby alleviating the problem of insufficient data sources. An idealized comparison of successful transfer learning with starting from scratch is shown in Figure 6.

FIGURE 6. Comparison between transfer learning and starting from scratch.

It can be seen that transfer learning can bring three advantages. First, the initial performance of the model is higher. Second, the rate of performance improvement of the model is greater during the training process. Third, the final performance of the trained model is better.

B. BUILDING A DATASET
The main research objects of this paper are images of scratched vehicles. The experiment collected 2,000 damaged-vehicle images (1,600 for training and 400 for testing) from online downloads and daily photographs.
The main steps in acquiring a dedicated dataset for detecting vehicle scratches in complex environments include the following two parts.
1) Image collection: Images of damaged vehicles at different angles and of different sizes in different scenes are photographed and downloaded from the Internet. Because the downloaded images vary in size, and the input samples of Mask RCNN must be normalized to a uniform size, a script is used to normalize the images to 1024 × 1024 pixels, and the missing portions are padded with 0.
2) Image processing: The captured images are annotated using the annotation tool Labelme and divided into training sets and test sets. The specific steps in this process are the following. First, a folder ‘‘datasets’’ is created, and then two subfolders, ‘‘train’’ and ‘‘val’’, are created for storing training samples and test samples. Each image in each folder corresponds to a .json annotation file with the same name. The labeling interface is shown in Figure 7.

FIGURE 7. Labelme-marked car-damage image.

C. EXPERIMENTAL ENVIRONMENT
See Table 1.

TABLE 1. Experimental environment information table.

D. PARAMETER SETTINGS
See Table 2.

TABLE 2. Experimental part parameter table.

E. EVALUATION INDEX
The evaluation index of the experimental results consists of two aspects: detection performance and segmentation performance. In this experiment, the P-R curve and the AP value were used to evaluate target detection performance, and the mean intersection over union (MIoU) and the running speed were used to evaluate image segmentation performance.

$$P = \frac{TP}{TP + FP}, \quad R = \frac{TP}{TP + FN} \tag{5}$$

where TP is the number of positive samples that are correctly classified, FP is the number of negative samples that are incorrectly marked as positive, and FN is the number of positive samples that are incorrectly marked as negative.
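A minimal sketch of Eq. (5), and of how the points of a P-R curve can be collected by sweeping over detections in order of decreasing confidence, is given below. The detection list and the counts are hypothetical values chosen for illustration, not results from the paper, and the helper names are assumptions.

```python
def precision_recall(tp, fp, fn):
    """Return (P, R) as in Eq. (5); a zero denominator is reported as 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def pr_curve(detections, num_ground_truth):
    """detections: list of (confidence, is_correct) pairs.

    Returns one (P, R) point per detection, accumulated in order of
    decreasing confidence.
    """
    points = []
    tp = fp = 0
    for _, is_correct in sorted(detections, key=lambda d: d[0], reverse=True):
        if is_correct:
            tp += 1
        else:
            fp += 1
        fn = num_ground_truth - tp
        points.append(precision_recall(tp, fp, fn))
    return points


# Hypothetical counts: 80 correct detections, 20 false alarms, 10 missed targets.
p, r = precision_recall(tp=80, fp=20, fn=10)
print(f"P = {p:.3f}, R = {r:.3f}")   # P = 0.800, R = 0.889

# Hypothetical detections for an image set with 4 damaged regions.
detections = [(0.9, True), (0.8, True), (0.7, False), (0.6, True)]
print(pr_curve(detections, num_ground_truth=4))
```

Each detection added to the sweep either raises recall (a correct detection) or lowers precision (a false alarm), which is what gives the P-R curve its characteristic shape.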


P is the precision and R is the recall. A P-R curve is drawn from the prediction results of the network model on the test set, and the average precision (AP) of the model is then obtained from the area under the P-R curve. The larger the AP value, the better the detection performance.

$$MIoU = \frac{1}{k+1} \sum_{i=1}^{k} \frac{p_{ii}}{\sum_{j=0}^{k} p_{ij} + \sum_{j=0}^{k} p_{ji} - p_{ii}} \tag{6}$$

where $k$ is the total number of output classes in the model, and $p_{ij}$ represents the number of pixels that belong to category $i$ but have been misjudged as category $j$. $p_{ii}$ indicates the number of pixels correctly classified, while $p_{ij}$ and $p_{ji}$ represent pixels that are misclassified.

F. EXPERIMENTAL RESULTS AND ANALYSIS
To study the detection performance of the improved algorithm on the car damage dataset, it is compared with the advanced detection algorithm Mask RCNN. Figure 8 shows the P-R curves obtained using the two algorithms. The area under each P-R curve is then obtained by integration, giving the average precision of the two algorithms for car damage detection, that is, the AP value; the result is shown in Figure 9.

FIGURE 8. P-R curve.

FIGURE 9. AP values of the two algorithms.

It can be seen from Fig. 9 and Fig. 10 that the improved Mask RCNN algorithm significantly improves detection performance compared with the Mask RCNN algorithm. As can be seen from Figure 10, the AP value of the Mask RCNN is 0.75, and the AP value of the improved detection algorithm is 0.83, which is 0.08 higher than that of the advanced target detection algorithm Mask RCNN.

FIGURE 10. Vehicle damage detection result based on Mask RCNN algorithm.

TABLE 3. Comparison of test results accuracy and time (fps denotes frames per second).

As can be seen from Table 3, compared with the Mask RCNN, the improved Mask RCNN improves the detection accuracy by 2.15%, the mask accuracy by 1.89%, and the running speed by 0.52 fps. It can be seen that the improved algorithm not only improves the accuracy, but also speeds up detection, has better performance advantages,


and has higher applicability to the damaged areas of automobiles.
To verify the accuracy and reliability of the improved Mask RCNN for automobile damage detection, experiments were conducted on images under the following conditions: normal illumination, weak illumination, close distance, multiple damage, strong exposure, and insignificant damage. These conditions correspond to images (a)–(f), respectively, which were tested with both the original Mask RCNN algorithm and the improved Mask RCNN, and the test results are shown in Table 4 and Table 5. The rectangular box indicates the detected target position, the number on the rectangular box the probability of belonging to the damaged area of the car, and the binary mask the approximate outline of the damaged area of the car.

FIGURE 11. Car damage detection results based on the improved Mask RCNN algorithm.

Statistics on the car damage detection results of the Mask RCNN algorithm above are shown in the following table:

TABLE 4. Statistical Table of Automobile Damage Detection Results Based on Mask RCNN.

Statistics on the car damage detection results of the improved Mask RCNN algorithm in the above figure are shown in the following table:

TABLE 5. Statistical Table of Automobile Damage Detection Results Based on Improved Mask RCNN.

Comparing Figure 10 with Figure 11 and Table 4 with Table 5 shows that the improved Mask RCNN reduces missed detections and improves on low accuracy. The improved algorithm thus shows strong robustness and adaptability for vehicle-damage detection. The comparison of experimental results further shows that it is difficult to detect the damaged area of a vehicle under high exposure using the original Mask RCNN. Areas in which the damage is not obvious are also difficult to detect, but the improved Mask RCNN performs considerably better in these cases.

IV. CONCLUSION
In the work described in this paper, a detection algorithm based on deep learning for vehicle-damage detection is used to deal with the compensation problem in traffic accidents. After testing and improvement, the proposed vehicle-damage-detection method based on transfer learning and the improved Mask RCNN is more universal, and can better adapt to various aspects of car-damage images. The algorithm achieved good detection results in different scenarios. Regardless of the strength of the light, the presence of multiple damaged areas, or a scene with overly high exposure, the fitting effect is better and the robustness is strong.
Although the robust Mask RCNN algorithm is adopted in this paper, and the improved version outperforms the original algorithm and obtained ideal experimental results, some aspects have yet to be studied. For example, the detection accuracy is very high, but the mask instance segmentation cannot be completely correct, and some areas in which the damage is not obvious cannot be segmented. In future work, data expansion can be


carried out to increase the size of the dataset, collect more car damage images under different weather conditions and different levels of illumination, enhance the data, improve the edge-contour enhancement of images, and make the masking of the damaged areas of the car more accurate.

REFERENCES
[1] R. Girshick, J. Donahue, T. Darrell, and J. Malik, ‘‘Rich feature hierarchies for accurate object detection and semantic segmentation,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2014, vol. 13, no. 1, pp. 580–587.
[2] R. Girshick, ‘‘Fast R-CNN,’’ in Proc. IEEE Int. Conf. Comput. Vis. (ICCV), Dec. 2015, pp. 1440–1448.
[3] S. Ren, K. He, R. Girshick, and J. Sun, ‘‘Faster R-CNN: Towards real-time object detection with region proposal networks,’’ IEEE Trans. Pattern Anal. Mach. Intell., vol. 39, no. 6, pp. 1137–1149, Jun. 2017, doi: 10.1109/tpami.2016.2577031.
[4] W. Liu, D. Anguelov, and D. Erhan, ‘‘SSD: Single shot multibox detector,’’ in Proc. IEEE Eur. Conf. Comput. Vision, Jun. 2016, pp. 21–37.
[5] K. He, X. Zhang, S. Ren, and J. Sun, ‘‘Deep residual learning for image recognition,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2016, pp. 770–778.
[6] K. He, G. Gkioxari, P. Dollar, and R. Girshick, ‘‘Mask R-CNN,’’ in Proc. IEEE Int. Conf. Comput. Vis. (ICCV), Oct. 2017, pp. 2980–2988.
[7] N. Kumar and R. Verma, ‘‘A multi-organ nucleus segmentation challenge,’’ IEEE Trans. Med. Imag., vol. 11, no. 1, pp. 34–39, Oct. 2019, doi: 10.1109/TMI.2019.2947628.
[8] A. K. Jaiswal, P. Tiwari, S. Kumar, D. Gupta, A. Khanna, and J. J. Rodrigues, ‘‘Identifying pneumonia in chest X-rays: A deep learning approach,’’ Measurement, vol. 145, pp. 511–518, Oct. 2019, doi: 10.1016/j.measurement.2019.05.076.
[9] P. Pinheiro and R. Collobert, ‘‘Learning to segment object candidates,’’ in Proc. Adv. Neural Inf. Process. Syst., 2015, pp. 1990–1998.
[10] W. Tang, H.-L. Liu, L. Chen, K. C. Tan, and Y.-M. Cheung, ‘‘Fast hypervolume approximation scheme based on a segmentation strategy,’’ Inf. Sci., vol. 509, pp. 320–342, Jan. 2020, doi: 10.1016/j.ins.2019.02.054.
[11] Y. Li, H. Qi, J. Dai, X. Ji, and Y. Wei, ‘‘Fully convolutional instance-aware semantic segmentation,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jul. 2017, pp. 4438–4446.
[12] X. Rong, C. Yi, and Y. Tian, ‘‘Unambiguous scene text segmentation with referring expression comprehension,’’ IEEE Trans. Image Process., vol. 29, pp. 591–601, Jul. 2019, doi: 10.1109/tip.2019.2930176.
[13] Y. L. Qiao, M. Truman, and S. Sukkarieh, ‘‘Cattle segmentation and contour extraction based on mask R-CNN for precision livestock farming,’’ Comput. Electron. Agricult., vol. 165, Oct. 2019, Art. no. 104958, doi: 10.1016/j.compag.2019.104958.
[14] C. Shuhong, Z. Shijun, and Z. Dianfan, ‘‘Water quality monitoring method based on feedback self correcting dense connected convolution network,’’ Neurocomputing, vol. 349, pp. 301–313, Jul. 2019, doi: 10.1016/j.neucom.2019.03.023.
[15] J. Yang, L. Ji, X. Geng, X. Yang, and Y. Zhao, ‘‘Building detection in high spatial resolution remote sensing imagery with the U-rotation detection network,’’ Int. J. Remote Sens., vol. 40, no. 15, pp. 6036–6058, Aug. 2019, doi: 10.1080/01431161.2019.1587200.
[16] E. K. Wang, X. Zhang, L. Pan, C. Cheng, A. Dimitrakopoulou-Strauss, Y. Li, and N. Zhe, ‘‘Multi-path dilated residual network for nuclei segmentation and detection,’’ Cells, vol. 8, no. 5, p. 499, May 2019, doi: 10.3390/cells8050499.
[17] X. Lin, S. Zhu, and J. Zhang, ‘‘Rice planthopper image classification method based on transfer learning and mask R-CNN,’’ Trans. Chin. Soc. Agricult. Mach., vol. 13, no. 4, pp. 181–184, Dec. 2019.
[18] G. Wang and S. Liang, ‘‘Ship object detection based on mask RCNN,’’ in Proc. Radio Eng., 2018, pp. 947–952.
[19] J. Shi, Y. Zhou, and Q. Zhang, ‘‘Service robot item recognition system based on improved mask RCNN and Kinect,’’ in Proc. Appl. Res. Comput., Jun. 2019, pp. 1–9.
[20] J. Li and W. He, ‘‘Building target detection algorithm based on mask RCNN,’’ in Proc. Sci. Surv. Mapping, Apr. 2019, pp. 1–13.
[21] T.-Y. Lin, P. Dollar, R. Girshick, K. He, B. Hariharan, and S. Belongie, ‘‘Feature pyramid networks for object detection,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jul. 2017, pp. 320–329.
[22] X. Zhang, J. Zou, X. Ming, K. He, and J. Sun, ‘‘Efficient and accurate approximations of nonlinear convolutional networks,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2015, pp. 1984–1992.
[23] S. Wang and K. Yang, ‘‘An image scaling algorithm based on bilinear interpolation with VC++,’’ in Proc. Techn. Autom. Appl., 2008, pp. 168–176.
[24] A. Mathew, J. Mathew, M. Govind, and A. Mooppan, ‘‘An improved transfer learning approach for intrusion detection,’’ Procedia Comput. Sci., vol. 115, pp. 251–257, Jan. 2017.
[25] G. Han, J. Su, and C. Zhang, ‘‘A method based on multi-convolution layers joint and generative adversarial networks for vehicle detection,’’ in Proc. KSII Trans. Internet Inf. Syst., 2019, pp. 1795–1811.
[26] Y. Yu, K. Zhang, L. Yang, and D. Zhang, ‘‘Fruit detection for strawberry harvesting robot in non-structural environment based on mask-RCNN,’’ Comput. Electron. Agricult., vol. 163, Aug. 2019, Art. no. 104846.
[27] Y. Liu, P. Zhang, Q. Song, A. Li, P. Zhang, and Z. Gui, ‘‘Automatic segmentation of cervical nuclei based on deep learning and a conditional random field,’’ IEEE Access, vol. 6, pp. 53709–53721, 2018.

QINGHUI ZHANG received the B.S.E. degree from the College of Fire Control, Zhengzhou Institute of Anti-Aircraft, in 1996, the M.E. degree in navigation guidance and control from the Ordnance Engineering College, Shijiazhuang, in 2003, and the Ph.D. degree from the Beijing Institute of Technology, Beijing, China, in 2006. He is currently a Professor with the College of Information Science and Engineering, Henan University of Technology. His research interests include artificial intelligence information processing and embedded systems.

XIANING CHANG received the B.S.E. degree from the College of Science, Henan Agricultural University, in 2018. She is currently pursuing the master’s degree in computer technology with the College of Information Science and Engineering, Henan University of Technology, China. Her research interests include artificial intelligence information processing, road scene target detection, and deep learning.

SHANFENG BIAN received the B.S.E. degree from the College of Science, Huanghuai University, in 2017. He is currently pursuing the master’s degree in signal and information processing with the College of Information Science and Engineering, Henan University of Technology, China. His research interests include intelligent information processing and embedded systems, vehicle detection, and deep learning.
