
A Very Deep Transfer Learning Model for Vehicle Damage Detection and Localization


Najmeddine Dhieb∗§, Hakim Ghazzai∗, Hichem Besbes§, and Yehia Massoud∗
∗ School of Systems and Enterprises, Stevens Institute of Technology, Hoboken, NJ, USA.
E-mails: {ndhieb, hghazzai, ymassoud}@stevens.edu
§ University of Carthage, Higher School of Communications of Tunis, Tunisia

E-mails: {najmeddine.dhieb, hichem.besbes}@supcom.tn

Abstract—Claims leakage is a major problem engendering tremendous losses for insurance companies. Those losses are due to the difference between the amount paid by insurance companies and the exact amount that should be spent, which costs millions of dollars yearly. Experts assert that these losses are caused by inefficient claims processing, frauds, and poor decision-making in the company. With the huge advances in Artificial Intelligence (AI) and machine and deep learning algorithms, those technologies have started being used in the insurance industry to solve such problems and cope with their negative consequences. In this paper, we propose automated and efficient deep learning-based architectures for vehicle damage detection and localization. The proposed solution combines deep learning, instance segmentation, and transfer learning techniques for feature extraction and damage identification. Its objective is to automatically detect damages in vehicles, locate them, classify their severity levels, and visualize them by contouring their exact locations. Numerical results reveal that our proposed transfer learning solution, based on the Inception-ResnetV2 pre-trained model followed by a fully connected neural network, achieves higher performance in feature extraction and damage detection/localization than another pre-trained model, i.e., VGG16.

Index Terms—Damage Detection, Deep Learning, Insurance, Transfer Learning.

I. INTRODUCTION

The insurance industry is one of the first industries to have invested in innovation, high tech, and Artificial Intelligence (AI) [1]. Nowadays, car insurance companies lose a tremendous amount of money due to claims leakage [2], which is defined as the difference between the actual amount paid by the insurer and the exact amount that should be spent in reality. Often, this difference is caused by inefficient claims processing, poor decision-making, and/or poor customer service. AI technology has shown remarkable improvement in helping to make accurate decisions in many fields such as robotics, computer vision, and medical science. Many AI techniques are also designed to help solve a variety of issues in the insurance industry, such as analyzing and processing data, detecting frauds, minimizing risks, and automating the claims process. Such a technology can also be utilized to automate visual inspection and validation in order to help cope with claims leakage.

AI has proved its efficiency in fraud detection for suspected collusion claims, as shown in [3]. However, few researchers have worked on developing automated visual recognition services in order to create custom solutions for insurance companies to detect and locate vehicle damage. To the best of our knowledge, only the research presented in [4] adopted a deep learning-based solution to detect damage in vehicles. Three different approaches have focused on such a topic: training Convolutional Neural Networks (CNNs) from random initialization, a convolutional auto-encoder-based pre-training followed by fine-tuning, and transfer learning based on the VGG16 pre-trained model [5]. Despite that, the previous study is limited to damage identification in vehicles without providing extra details. In addition, it cannot evaluate the damage level or accurately locate it, as it is highly sensitive to overfitting.

Damage detection and visualization have been studied in other fields. For instance, the study presented in [6] proposed a deep learning pipeline solution for fine-grained classification of building images taken by Unmanned Aerial Vehicles (UAVs) for damage assessment. An integrated deep learning pipeline was suggested to identify structures, followed by a fine-grained damage classification for those buildings. A road damage detection and street monitoring solution was developed in [7], where the You Only Look Once (YOLO) pre-trained model was adopted to detect various road damage types as identifiable objects in images.

In this paper, we present a novel framework to detect, locate, and identify damage severity on vehicles using CNN, transfer learning, and Mask R-CNN techniques. Unlike previous studies, our approach not only detects damages but also identifies their severity levels, localizes them, and visualizes them on the vehicle's images. In this context, we propose to use the Inception-Resnet pre-trained model [8] as a feature extractor, where we replace the last flatten layer used for classification by other neural networks for damage detection and classification. Since object instance segmentation followed by object detection and classification has proved its efficiency for road and building damage detection, we propose to use Mask R-CNN not only for its performance in object detection but also for its efficiency in instance segmentation. Data augmentation and regularization techniques are employed to reduce the overfitting problems. Then, we provide a comparison between the proposed pre-trained model, Inception-ResnetV2, and the VGG16 model employed in [4]. Results show that Inception-ResnetV2 outperforms VGG16 due to its large number of hidden layers and its residual connections. For instance, the proposed model exceeds VGG16 by more than 10% in damage localization.


Fig. 1: Model architecture for damage detection and classification.

II. DAMAGE DETECTION AND CLASSIFICATION

In this section, we present our approach to detect, identify, and localize damages in vehicle images. To this end, deep learning, CNN, and transfer learning techniques are used. The workflow in Fig. 1 details our proposed methodology.

• Dataset:
Due to the lack of public and accessible datasets for vehicle damage, we manually collect and label images available on the Internet and in our local area. We work with three datasets: the first one contains images of damaged and non-damaged vehicles; the second one is composed of three classes to evaluate damage severity: minor (e.g., a scratch), moderate (e.g., large dents), and major (e.g., a complete side of the vehicle is damaged); finally, the third one describes different damage locations. As we are dealing with small datasets, we use data augmentation to synthetically expand and modify the data in order to enhance performance and reduce the risk of overfitting during training. We use random rotation, dimension shift, zooming, and flipping transformations to vary the generated data, as illustrated in the sketch below.
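The paper does not include its augmentation code; the following is a minimal sketch of the listed transformations (random rotation, dimension shift, zooming, and flipping) using the Keras ImageDataGenerator. The directory layout, input size, and parameter values are illustrative assumptions, not the authors' settings.

```python
# Minimal data-augmentation sketch with Keras; parameter values are
# illustrative assumptions, not the settings used in the paper.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

augmenter = ImageDataGenerator(
    rescale=1.0 / 255,        # normalize pixel values
    rotation_range=30,        # random rotation
    width_shift_range=0.2,    # horizontal dimension shift
    height_shift_range=0.2,   # vertical dimension shift
    zoom_range=0.2,           # random zooming
    horizontal_flip=True,     # random flipping
)

# Hypothetical directory layout: one sub-folder per class
# (e.g., damaged/ and whole/ for the first dataset).
train_generator = augmenter.flow_from_directory(
    "data/damage_vs_whole/train",
    target_size=(299, 299),   # Inception-ResNetV2 default input size
    batch_size=32,
    class_mode="categorical",
)
```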
• Transfer Learning:
Transfer learning is inspired by the concept of transferring learned knowledge to solve similar problems faster and/or with better performance. It is one of the most effective techniques for small labeled datasets, where a pre-trained model is used to extract features for the targeted task while guaranteeing a low overfitting risk. Multiple models pre-trained on ImageNet, such as VGG16 and VGG19 [5] as well as Inception [9], are publicly available. In traditional machine learning, a model learns each task from the ground up, whereas transfer learning focuses on extracting features and relevant information from source tasks and applying the acquired knowledge to a target task. As a result, when the source and target domains are similar, knowledge transfer may improve the performance on the target tasks. In our case, the source tasks are the classes of the pre-trained model, and the target tasks are the damages to be detected, their locations, and their severity levels.
• Model Details and Settings:
In order to efficiently detect, classify, and localize vehicle damages from images, two major challenges must be addressed: i) the high inter-class similarity, and ii) the varying image side and orientation. To build our model, we use a pre-trained model as a deep feature extractor followed by another neural network to detect and classify the damages. We propose to employ Inception-ResnetV2 [8], a CNN composed of 572 layers and trained on more than one million images from ImageNet, from which we remove the last layer used for object classification. Instead, we add two neural network layers, pooling, and softmax layers to adapt the model to our objectives. A dropout layer is used to improve performance and reduce the risk of overfitting. Since we are using a small dataset, we freeze all the weights of the pre-trained model and only train the last two neural network layers. Regularization parameters are also applied to increase the performance of our model and avoid overfitting. A minimal sketch of this architecture is given below.
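The exact layer sizes and regularization settings are not specified in the text; the sketch below only illustrates the described structure, assuming a frozen Inception-ResNetV2 base, global pooling, two trainable dense layers with dropout and L2 regularization, and a softmax output (here sized for the three severity classes).

```python
# Sketch of the transfer-learning head described above; layer sizes,
# dropout rate, and regularization strength are assumptions.
from tensorflow.keras import Model, layers, regularizers
from tensorflow.keras.applications import InceptionResNetV2

base = InceptionResNetV2(weights="imagenet", include_top=False,
                         input_shape=(299, 299, 3))
base.trainable = False  # freeze all pre-trained weights

x = layers.GlobalAveragePooling2D()(base.output)
x = layers.Dense(256, activation="relu",
                 kernel_regularizer=regularizers.l2(1e-4))(x)
x = layers.Dropout(0.5)(x)                           # reduce overfitting
x = layers.Dense(64, activation="relu",
                 kernel_regularizer=regularizers.l2(1e-4))(x)
outputs = layers.Dense(3, activation="softmax")(x)   # e.g., severity classes

model = Model(inputs=base.input, outputs=outputs)
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
```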
Since CNN and transfer learning based models are time consuming during the training phase, we use a learning strategy in order to obtain the best learner parameters in a shorter time. Our learning strategy consists of training the model for k epochs and then evaluating its performance. As long as the validation performance converges towards the expected values, we train the model for a longer time; otherwise, we adjust the regularization and hyper-parameters. One possible reading of this strategy is sketched below.
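The sketch below is an assumed interpretation of this evaluate-every-k-epochs loop; the chunk size k, round limit, accuracy target, and the learning-rate decay used as the "adjustment" step are placeholders, not values reported in the paper.

```python
# Assumed interpretation of the k-epoch learning strategy; the threshold,
# chunk size k, and the hyper-parameter adjustment are placeholders.
import tensorflow as tf

def train_in_chunks(model, train_gen, val_gen, k=10, max_rounds=10,
                    target_val_accuracy=0.90):
    previous_val_acc = 0.0
    for _ in range(max_rounds):
        model.fit(train_gen, validation_data=val_gen, epochs=k)
        _, val_acc = model.evaluate(val_gen, verbose=0)
        if val_acc >= target_val_accuracy:
            break  # validation performance converged: stop training
        if val_acc <= previous_val_acc:
            # Not improving: adjust a hyper-parameter (here, the learning rate).
            lr = float(tf.keras.backend.get_value(model.optimizer.learning_rate))
            tf.keras.backend.set_value(model.optimizer.learning_rate, lr * 0.5)
        previous_val_acc = val_acc
    return model
```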



Fig. 2: Mask R-CNN model architecture.

III. DAMAGE LOCALIZATION AND VISUALIZATION

In this section, we propose to use Mask R-CNN [10], [11], which is an object detection, classification, and segmentation method, to localize and visualize the damage in vehicle images. We use a Resnet-101 feature pyramid network as a backbone, initialize our model weights from a model pre-trained on the Microsoft Common Objects in Context (MS-COCO) dataset [12], and train only the network heads to increase our model performance. A minimal training sketch based on the implementation of [11] is shown below.
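Since the paper cites the Matterport Keras/TensorFlow implementation [11], the sketch below shows how heads-only training from MS-COCO weights is typically set up with that library; the dataset objects, class count, file paths, and epoch count are assumptions rather than the authors' configuration.

```python
# Sketch of heads-only Mask R-CNN training with the Matterport
# implementation [11]; paths, class count, and epochs are assumptions.
from mrcnn.config import Config
from mrcnn import model as modellib

class DamageConfig(Config):
    NAME = "vehicle_damage"
    NUM_CLASSES = 1 + 1          # background + damage (assumed)
    IMAGES_PER_GPU = 2
    STEPS_PER_EPOCH = 100
    DETECTION_MIN_CONFIDENCE = 0.7

config = DamageConfig()
model = modellib.MaskRCNN(mode="training", config=config, model_dir="logs/")

# Start from MS-COCO weights, skipping the class-specific output layers.
model.load_weights("mask_rcnn_coco.h5", by_name=True,
                   exclude=["mrcnn_class_logits", "mrcnn_bbox_fc",
                            "mrcnn_bbox", "mrcnn_mask"])

# dataset_train / dataset_val are assumed mrcnn.utils.Dataset subclasses
# holding the annotated damage images.
model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE,
            epochs=30, layers="heads")   # train only the network heads
```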
Mask R-CNN is an extension of Faster R-CNN [13], where, as a first step, a CNN is used to extract accurate features, followed by a Region Proposal Network to create Regions of Interest (RoIs). Then, a RoI-Align operation allows the construction of instance segmentation masks. Afterwards, Mask R-CNN introduces another fully convolutional network to extract the useful and essential segmentation instances, in addition to fully connected neural networks for classification and bounding box prediction. As a final step, the network heads independently predict the classes, bounding boxes, and the desired masks.

• Backbone Network:
The backbone network in the Mask R-CNN model is a CNN used as a feature extractor, where low-level features are obtained from the primary layers and deep ones are extracted from later layers. In addition, by passing through the backbone network, images are converted into feature maps. The Feature Pyramid Network (FPN) [14] is a top-down pyramid architecture used to extract features from the top pyramid layers and transfer their outputs to the following lower layers through lateral connections between the pyramid layers. As a result, each level in the pyramid has access to higher and lower levels of visual features. In this context, we use a Resnet-101-FPN backbone as the feature extractor for the Mask R-CNN to increase its speed and performance.
• Region Proposal Network:
The Region Proposal Network (RPN) is a fully convolutional network where the features extracted by the backbone network are used as input to predict the probability of an anchor being background or foreground. In this context, a sliding window browsing the feature maps generates sets of anchors with different ratios and scales to be used as bounding box candidates for background or foreground objects. Since the anchors overlap each other, we adopt the Non-Maximum Suppression (NMS) technique with an Intersection over Union (IoU) threshold set to 0.7 to minimize the redundancy, as in the sketch below.
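Non-maximum suppression itself is standard; the following is a small NumPy sketch of greedy NMS with the IoU threshold of 0.7 mentioned above, written here for illustration rather than taken from the paper's code.

```python
# Greedy non-maximum suppression sketch (NumPy); boxes are [x1, y1, x2, y2].
import numpy as np

def nms(boxes, scores, iou_threshold=0.7):
    order = scores.argsort()[::-1]          # highest-scoring anchors first
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(best)
        rest = order[1:]
        # Intersection of the best box with the remaining boxes.
        x1 = np.maximum(boxes[best, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[best, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[best, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[best, 3], boxes[rest, 3])
        inter = np.maximum(0, x2 - x1) * np.maximum(0, y2 - y1)
        area_best = (boxes[best, 2] - boxes[best, 0]) * (boxes[best, 3] - boxes[best, 1])
        area_rest = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_best + area_rest - inter)
        # Drop boxes that overlap the kept box above the IoU threshold.
        order = rest[iou <= iou_threshold]
    return keep
```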
• Region of Interests Alignment:
Due to the bounding box refinement process in the RPN, the RoI boxes may have different sizes. In order to get an accurate mask with Mask R-CNN, RoI features should be aligned to maintain the same size as the RoI boxes. Unlike the RoIPool method used in Faster R-CNN, which applies quantization to discretize the feature map and introduces misalignments between the RoI and the extracted features, He et al. [10] proposed the RoI-Align method, which avoids quantization and applies bi-linear interpolation [15] to compute accurate feature values that can then be aggregated. A small interpolation sketch is given below.
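To make the bi-linear interpolation step concrete, the sketch below samples a feature map at a real-valued (non-integer) coordinate, which is the core operation RoI-Align uses instead of quantizing coordinates; it is an illustrative helper, not code from the paper.

```python
# Bi-linear interpolation of a 2-D feature map at a real-valued point,
# the sampling primitive used by RoI-Align (illustrative sketch).
import numpy as np

def bilinear_sample(feature_map, y, x):
    h, w = feature_map.shape
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    # Weighted average of the four neighbouring feature values.
    return ((1 - dy) * (1 - dx) * feature_map[y0, x0]
            + (1 - dy) * dx * feature_map[y0, x1]
            + dy * (1 - dx) * feature_map[y1, x0]
            + dy * dx * feature_map[y1, x1])
```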
IV. PERFORMANCE EVALUATION

In this section, we evaluate the performance of the proposed AI technique to detect, classify, and visualize damages in vehicles. Two different pre-trained models are investigated in this work, VGG16 [5] and Inception-ResnetV2 [8]. We provide both models with the same training and testing data. The models are trained for 100 epochs. Comparisons between both approaches for damage detection, localization, and severity classification are provided in Table I.

To evaluate the performance of the different transfer learning models, we use three different metrics: Precision, Recall, and F1-score; the higher those metrics are, the better our model is (a short computation sketch is given below). We observe that the Inception-ResnetV2 pre-trained model outperforms VGG16, most notably in damage localization and severity classification. The proposed pre-trained model is more efficient in damage localization, with a precision of 86.8% in contrast to 75.4%. In addition, VGG16 shows weaker performance, with only 69% precision in damage severity classification compared to 80% for the Inception-ResnetV2 model. Overall, the best performance in all challenges is achieved by Inception-ResnetV2. The notable performance of the proposed model can be validated by the confusion matrices provided in Fig. 3, which summarize the normalized predicted values for each class. The difference between those models can be explained by the fact that Inception-ResnetV2 has residual connections that allow shortcuts in the model to train very deep neural networks without overfitting problems, which results in better performance.
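Precision, Recall, and F1-score are standard metrics; the following sketch computes them from predicted and true class labels with scikit-learn and is added purely for illustration (the label arrays are placeholders, not the paper's data).

```python
# Computing the reported metrics with scikit-learn (illustrative only;
# y_true and y_pred are placeholder label arrays, not the paper's data).
from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [0, 1, 2, 2, 1, 0, 2, 1]   # e.g., severity classes: 0=minor,
y_pred = [0, 1, 2, 1, 1, 0, 2, 2]   # 1=moderate, 2=major (assumed coding)

precision = precision_score(y_true, y_pred, average="macro")
recall = recall_score(y_true, y_pred, average="macro")
f1 = f1_score(y_true, y_pred, average="macro")
print(f"Precision={precision:.2f}, Recall={recall:.2f}, F1={f1:.2f}")
```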
The Mask R-CNN model performance during the training and evaluation processes is measured using a multi-task loss function which combines the classification, localization, and segmentation mask losses. The classification and bounding-box losses are the same as those defined in [13]. Since there is no competition among classes for mask generation, the segmentation mask loss is defined as the average binary cross-entropy loss, including only the k-th mask if the region is associated with the k-th ground-truth class (see the formulation sketched below). The evolutions of the overall loss function and of the mask loss are shown in Fig. 4. The lower the loss function is, the better the model is. Since the loss functions computed on the training and validation datasets are close to zero, we can ensure that the proposed model performs well in contouring damages.
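For clarity, the multi-task loss described above can be written as in the standard Mask R-CNN formulation [10], [13]; the notation below is a sketch added here and is not reproduced from the paper.

```latex
% Standard Mask R-CNN multi-task loss (sketch; notation added here)
L = L_{\mathrm{cls}} + L_{\mathrm{box}} + L_{\mathrm{mask}}, \qquad
L_{\mathrm{mask}} = -\frac{1}{m^{2}} \sum_{i,j}
\Big[ y_{ij}\,\log \hat{y}^{\,k}_{ij} + (1 - y_{ij})\,\log\big(1 - \hat{y}^{\,k}_{ij}\big) \Big]
```

Here m x m is the mask resolution, y_ij is the ground-truth binary mask, and the predicted mask is taken only from the branch of the k-th (ground-truth) class, so the other classes do not compete during mask generation.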
Fig. 5 visualizes some examples of the proposed AI-based damage localization. The model can identify damages from different locations and orientations, as well as different types and sizes of car damages.



TABLE I: Performances for damage severity classification, damage localization, and damage detection

Pre-trained Model    | Severity classification         | Damage localization             | Damage detection
                     | Precision (%)  Recall  F1-score | Precision (%)  Recall  F1-score | Precision (%)  Recall  F1-score
VGG16                | 69.4           0.68    0.67     | 75.4           0.75    0.75     | 94.5           0.95    0.94
Inception-ResnetV2   | 80.2           0.78    0.78     | 86.8           0.86    0.85     | 96.8           0.96    0.96

Fig. 3: Confusion matrices for damage detection, localization, and severity classification.

Fig. 4: Loss function evolution.

Fig. 5: Examples of localization and visualization.

V. CONCLUSION

In this study, we introduced a novel deep learning architecture to detect, locate, classify, and visualize the damages in vehicle images using transfer learning and instance segmentation techniques. We proposed the use of the Inception-Resnet pre-trained model as a feature extractor followed by fully connected neural networks to identify and classify the damages. We also proposed the use of Mask R-CNN to outline the damage location. The empirical results show that the suggested pre-trained model not only detects damaged vehicles but also identifies the damage location and severity level. This solution constitutes an important asset for the insurance industry in the fight against claims leakage problems.

REFERENCES

[1] N. Dhieb, H. Ghazzai, H. Besbes, and Y. Massoud, "Extreme gradient boosting machine learning algorithm for safe auto insurance operations," in IEEE International Conference on Vehicular Electronics and Safety (ICVES'19), Cairo, Egypt, Sept. 2019.
[2] M. Wassel, "Property Casualty: Deterring Claims Leakage in the Digital Age," Cognizant Insurance Practice, Tech. Rep., 2018.
[3] K. Supraja and S. J. Saritha, "Robust fuzzy rule based technique to detect frauds in vehicle insurance," in IEEE Int. Conf. Ener. Comm. Data Analy. Soft Comp. (ICECDS'17), Chennai, India, Aug. 2017.
[4] K. Patil, M. Kulkarni, A. Sriraman, and S. Karande, "Deep learning based car damage classification," in IEEE Int. Conf. Machine Learning App. (ICMLA'17), Cancun, Mexico, Dec. 2017.
[5] K. Simonyan and A. Zisserman, "Very Deep Convolutional Networks for Large-Scale Image Recognition," arXiv e-prints, Sept. 2014.
[6] N. Attari, F. Ofli, M. Awad, J. Lucas, and S. Chawla, "Nazr-CNN: Fine-grained classification of UAV imagery for damage assessment," in IEEE Int. Conf. Data Sc. Adv. Analy. (DSAA'17), Tokyo, Japan, Oct. 2017.
[7] A. Alfarrarjeh, D. Trivedi, S. H. Kim, and C. Shahabi, "A deep learning approach for road damage detection from smartphone images," in IEEE Int. Conf. Big Data (Big Data'18), Seattle, Washington, USA, Dec. 2018.
[8] C. Szegedy, S. Ioffe, V. Vanhoucke, and A. Alemi, "Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning," arXiv e-prints, Feb. 2016.
[9] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, "Rethinking the inception architecture for computer vision," in IEEE Conf. Comp. Vis. Patt. Recog. (CVPR'16), June 2016.
[10] K. He, G. Gkioxari, P. Dollár, and R. Girshick, "Mask R-CNN," in IEEE Int. Conf. Comp. Vis. (ICCV'17), Venice, Italy, Oct. 2017.
[11] W. Abdulla, "Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow," https://github.com/matterport/Mask_RCNN, 2017.
[12] T.-Y. Lin, M. Maire, S. Belongie, L. Bourdev, R. Girshick, J. Hays, P. Perona, D. Ramanan, C. L. Zitnick, and P. Dollár, "Microsoft COCO: Common Objects in Context," arXiv e-prints, May 2014.
[13] S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards real-time object detection with region proposal networks," IEEE Trans. Patt. Analy. Mach. Intel., vol. 39, no. 6, pp. 1137–1149, June 2017.
[14] T. Lin, P. Dollár, R. Girshick, K. He, B. Hariharan, and S. Belongie, "Feature pyramid networks for object detection," in IEEE Conf. Comp. Vis. Patt. Recog. (CVPR'17), July 2017.
[15] M. Jaderberg, K. Simonyan, A. Zisserman, and K. Kavukcuoglu, "Spatial Transformer Networks," arXiv e-prints, Jun. 2015.


