NITCAD - Developing An Object Detection Classification

Available online at www.sciencedirect.com
ScienceDirect
Procedia Computer Science 171 (2020) 207–216
www.elsevier.com/locate/procedia
Abstract

Autonomous vehicles with various levels of autonomy are becoming popular in developed countries due to their effectiveness in reducing the fatalities caused by road accidents. A developing country like India, with the second largest population in the world, creates unique road scenarios for an autonomous car, which require a lot of testing and fine tuning before implementation. This leads to the importance of datasets providing information about various traffic situations in India. For planning its path ahead, an autonomous vehicle has to detect, classify and estimate the depth of obstacles that it encounters on roads. The purpose of this paper is to provide a dataset for object classification, detection and stereo vision corresponding to Indian roads, which can serve as a platform for developing effective algorithms for autonomous cars on Indian roads. In this work, we benchmarked object classification using confusion matrices obtained from various deep learning models, evaluated detection using Faster R-CNN, and compared the depth estimation produced by a RealSense stereo camera with that of convolutional neural network based algorithms.
© 2020 The Authors. Published by Elsevier B.V.
This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Peer-review under responsibility of the scientific committee of the Third International Conference on Computing and Network Communications (CoCoNet’19).
Keywords: Dataset, Object classification, Object detection, Stereo vision, Indian roads
1. Introduction
The future of transportation lies in the development of autonomous vehicles that can eliminate the factor of human error, thereby reducing the chance of road accidents. The technology will only be effective if implemented across all the vehicles on the roads, thereby reducing the uncertainties associated with drivers. For perfecting the technology to be implemented on a large scale, it should be tested across the various traffic situations that will be encountered by these vehicles. Several large datasets like Imagenet [1] and COCO [2] are available for image classification, but
1877-0509 © 2020 The Authors. Published by Elsevier B.V.
10.1016/j.procs.2020.04.022
Namburi GNVV Satya Sai Srinath et al. / Procedia Computer Science 171 (2020) 207–216
for localisation and segmentation, which are primary tasks for autonomous vehicles, only a few datasets like Pascal
VOC [3] can be used. But it has a wide variety of classes that are not necessary for autonomous navigation. There
are some datasets available exclusively for pedestrian detection such as Caltech Pedestrian Dataset [4], Citypersons
dataset [5], and Daimler Pedestrian [6]. Krause et al. [7] created a dataset which can be used for classification of cars. But
autonomous vehicles need to detect different classes of obstacles present in its path. For that, Oxford robotcar dataset
[8] has collected data in the city of Oxford, UK at various times in the same location. KITTI [9] is another dataset
which is collected on the urban roads of Karlsruhe, Germany. Cityscapes [10], and Mapillary Vistas [11] created
datasets for semantic understanding of urban streets. More recently Apolloscapes [12], BDD100k [13], nuScenes [14]
provided datasets that are collected across various weather conditions, times and places, along with labels which are
crowd sourced.
To test an autonomous vehicle in India, the above mentioned datasets cannot be used due to the lack of information
about classes like auto rickshaws that are exclusively found on Indian roads. Also, unstructured traffic scenarios can
be observed frequently on Indian roads. These factors make it essential to have a dataset exclusively for Indian roads
which can give information regarding the same. Varma et al. [15] have studied the situations on Indian roads
and created a dataset named IDD. For an autonomous vehicle to plan its path, it should be aware of distances to
other vehicles in its vicinity so that the velocity of other vehicles can be estimated. This can be achieved by using
a 3D LiDAR which can scan its surroundings and give distances to different obstacles but this is rather expensive.
A cost effective alternative is to use a stereo camera which can provide the depth information about the obstacles.
The performance of this method purely depends upon the algorithms for evaluating depth information. Due to the
introduction of better algorithms this method will be more suitable for a price conscious market like India.
In this context, there is a requirement for a new dataset, that can be used to develop autonomous navigation systems
for Indian roads. So, a dataset named National Institute of Technology, Calicut Autonomous Driving (NITCAD) is
presented in this paper. NITCAD primarily consists of NITCAD object dataset which can be used for classification,
detection and NITCAD stereo vision dataset for depth estimation on Indian roads, thereby leading to the development
of level 3 autonomous vehicles capable of handling Indian road scenarios.
2. Methodology
The autonomous vehicles that are being developed for Indian roads should be able to detect and classify different
vehicle classes that are exclusively found here. This will be advantageous for performing the path planning operation
since different vehicle classes behave differently on roads. NITCAD object dataset provided here was created under
this objective.
In order to keep track of the detected objects on the road, the autonomous vehicle needs to evaluate the velocity
of these objects. For developing stereo vision based velocity estimation algorithms, NITCAD stereo vision dataset
provides image data collected from synchronised left and right cameras having global shutter.
The NITCAD object dataset was evaluated with different deep learning architectures and the respective confusion
matrix, precision, recall values were found out. For the NITCAD stereo vision dataset, the relative difference between
the disparity maps obtained by different methods are evaluated and interpreted.
India has one of the highest numbers of road fatalities in the world. Autonomous navigation is one of the solutions to reduce the number of fatalities due to road accidents. To build an efficient and reliable system, the knowledge
of the traffic structure in India is highly essential. Traffic in India is highly heterogeneous and the models trained for
autonomous navigation in other countries may not be sufficient to characterize Indian traffic. India's roads are functionally classified as expressways, national highways, state highways, district roads, and rural roads. Of these, only
expressways and some of the national highways are four-lane or six-lane. An example of unlaned traffic junction can
be observed in Fig. 1(a). Most roads are unpaved with potholes and with ambiguous boundaries. Compared to other
countries, India has low road density per 1000 people. This has led to many problems like traffic congestion and
irregular traffic speeds. Indian roads carry almost 90 per cent of the country's passenger traffic. Passengers use a multitude of vehicles for daily transport, including cars, two-wheelers, and auto-rickshaws. Auto-rickshaws
(a) An unlaned traffic junction. (b) Unstructured environment prevalent in India.
Fig. 1: General traffic scenarios on Indian Roads.
are a class of vehicles that are truly unique to the Indian traffic system. Further, the frequency and variety of trucks and buses are also high as compared to other countries. Another huge bottleneck for autonomous navigation in India is that pedestrians and drivers are less likely to follow traffic rules. Pedestrians often cross the road at arbitrary locations and drivers sometimes overtake from the wrong side. An example of this unstructuredness in traffic can be observed in Fig. 1(b).
2.2. Collection of data
To collect data, one RGB camera and a stereo camera were mounted on a car, which was made to travel on the rural and urban roads of Kerala where various traffic situations arise. The route along which the data was collected is shown in Fig. 2.
Fig. 2: Route followed for data collection.
2.3. NITCAD Object dataset
For training the system to classify different objects that could be encountered on an Indian road, a dataset including a variety of classes needs to be created. Using a Noise Play 2 action camera, traffic in and around Kottayam district, Kerala was recorded in 720p at 30 fps along the route shown in Fig. 2. A set of images at a rate of 5 images per second was generated from this recorded video footage. These images were annotated using an online tool, Labelbox, and a text file was created for each scene with information about the locations of the various objects in that particular frame. These files can then be used for visualization of the dataset as well as for training a system to detect and
Fig. 3: (a) Number of objects per class. (b) Number of images having a class.
Fig. 4: An example of various classes present in NITCAD object dataset. From left to right: Car, Bus, Pedestrian, Two wheeler, Truck, Van and Auto rickshaw.
classify the objects on the road. There are seven classes in the dataset, namely car, pedestrian, auto rickshaw, truck, two wheeler, bus and van. A total of 11000 images were collected under different traffic conditions, out of which 4800 images were manually labelled.
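The exact layout of the exported per-scene annotation text files is not described here, so the sketch below assumes a hypothetical one-object-per-line format (`class,xmin,ymin,xmax,ymax`) purely to illustrate how such files can be loaded for visualization or training:

```python
from dataclasses import dataclass

# Hypothetical annotation format: one object per line as
# "class,xmin,ymin,xmax,ymax". The paper does not specify the
# actual layout of the exported files.
@dataclass
class Box:
    label: str
    xmin: int
    ymin: int
    xmax: int
    ymax: int

def parse_scene(text: str) -> list:
    """Parse one scene's annotation text into bounding boxes."""
    boxes = []
    for line in text.strip().splitlines():
        label, *coords = line.split(",")
        xmin, ymin, xmax, ymax = map(int, coords)
        boxes.append(Box(label, xmin, ymin, xmax, ymax))
    return boxes

sample = "car,100,200,260,330\nauto rickshaw,400,210,520,350"
for box in parse_scene(sample):
    print(box.label, box.xmax - box.xmin, box.ymax - box.ymin)
```

A real loader would substitute whatever field order and delimiter the actual export uses.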
Fig. 3(a) gives details about the frequency of each class. From Fig. 3(a), it can be observed that the number of cars present in the dataset is the maximum and the number of vans is the minimum. Fig. 3(b) gives details about the number of images or frames having a particular class, from which it can be inferred that cars, auto rickshaws and two wheelers are present in almost all the frames while vans occur only at rare instances on roads. Fig. 4 gives a typical example of all the classes present in our dataset.
The camera intrinsic matrix and distortion coefficients of the camera were computed according to Zhang [18] and were obtained as below:

camera intrinsic matrix =
    [ 791.6965      0       632.9851 ]
    [     0      791.4219   347.7182 ]
    [     0          0          1    ]

radial distortion coefficients = [ −0.3454  0.1593  −0.0344 ]
tangential distortion coefficients = [ 0.0021  0.0016 ]
With the camera intrinsic matrix and distortion coefficients obtained above, the images taken by the Noise Play action camera were undistorted and provided in the dataset. An example of a distorted and the corresponding undistorted image is shown in Fig. 5, where the distortion is clearly visible near the edges. Around 10,000 undistorted images (of which 3600 are labelled) are also provided.
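As a sanity check on these numbers, the calibration can be exercised with the standard Brown-Conrady projection model that Zhang-style calibration tools use; interpreting the coefficients in the usual (k1, k2, k3) radial and (p1, p2) tangential ordering is an assumption here, not something the paper states:

```python
import numpy as np

# Intrinsics and distortion coefficients as reported above; the
# projection below is the Brown-Conrady model that undistortion
# routines invert. Coefficient ordering is assumed, not stated.
K = np.array([[791.6965, 0.0,      632.9851],
              [0.0,      791.4219, 347.7182],
              [0.0,      0.0,      1.0]])
k1, k2, k3 = -0.3454, 0.1593, -0.0344   # radial coefficients
p1, p2 = 0.0021, 0.0016                 # tangential coefficients

def project(xn, yn):
    """Map a normalized camera coordinate (xn, yn) to a distorted pixel."""
    r2 = xn * xn + yn * yn
    radial = 1 + k1 * r2 + k2 * r2**2 + k3 * r2**3
    xd = xn * radial + 2 * p1 * xn * yn + p2 * (r2 + 2 * xn * xn)
    yd = yn * radial + p1 * (r2 + 2 * yn * yn) + 2 * p2 * xn * yn
    u, v, _ = K @ np.array([xd, yd, 1.0])
    return float(u), float(v)

# The optical axis (0, 0) is unaffected by distortion and lands
# exactly on the principal point (cx, cy).
print(project(0.0, 0.0))   # -> (632.9851, 347.7182)
```

With the negative k1 reported here, off-axis points are pulled inward, which matches the barrel distortion visible near the image edges in Fig. 5.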
For performing the task of depth estimation, data was collected in and around Kottayam district, Kerala using Intel
RealSense Depth camera D435. This depth camera has 2 infrared cameras having global shutter that are triggered
simultaneously so that calculation of disparity for a particular scene is possible. By using its inbuilt vision processor
a disparity map can be generated which can be used for depth estimation. More efficient algorithms can be developed
to improve its accuracy so that it becomes a cost effective depth approximation technique using stereo vision. Depth
estimation obtained by the inbuilt vision processor can be used for validation of the results obtained after performing
stereo algorithms. An example image is shown in Fig. 6 where a small disparity can be observed.
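The depth recovery behind this is the standard rectified-stereo relation Z = f · B / d; the focal length and baseline below are illustrative placeholders, not the D435's calibrated values:

```python
# For a rectified stereo pair: Z = f * B / d, where f is the focal
# length in pixels, B the stereo baseline in metres, and d the
# disparity in pixels. Illustrative values only, not the D435's
# calibrated parameters.
F_PX = 640.0        # assumed focal length (pixels)
BASELINE_M = 0.05   # assumed baseline (metres)

def depth_m(disparity_px):
    """Convert a pixel disparity to metric depth."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return F_PX * BASELINE_M / disparity_px

print(depth_m(16.0))   # -> 2.0
```

The inverse relationship explains why distant objects, whose disparity shrinks toward zero, are the hardest to range accurately with stereo.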
Fig. 6: An example image pair from NITCAD stereo vision dataset.
Table 2: Evaluation of various architectures on NITCAD object dataset.

Architecture       Accuracy*   Precision   F1 score
DenseNet [19]      0.828       0.795       0.782
Inceptionv3 [20]   0.836       0.801       0.805
Mobilenet [21]     0.839       0.78        0.796
NASNet [22]        0.811       0.757       0.760
VGG16 [23]         0.789       0.779       0.750
Xception [24]      0.854       0.832       0.825

*Accuracy and Recall values are the same as micro-averaging is considered for the multiclass confusion matrix.
3. Evaluation
NITCAD can be considered challenging only if the classes present in it are difficult to classify. Thus the dataset is processed accordingly and the cropped images are fed to various classification algorithms to obtain confusion matrices. To evaluate detection algorithms on our dataset, Faster R-CNN is used. The stereo dataset is evaluated on the basis of the average of the relative difference between the disparity maps generated (R_diff), taking the output obtained from the Intel RealSense as the ground truth.
3.1. NITCAD object dataset - Evaluation for classification
To evaluate how well the classification algorithms perform on the dataset, the confusion matrix for different deep learning architectures was computed. Each image is cropped to extract all individual classes and a subset of the labelled data was considered for training, validation and testing. Models pre-trained on Imagenet were taken and their initial layers were frozen, as these layers learn simple features like edges and lines which are common to all the objects. Validation accuracy is used as a metric while training for 20 epochs and the best weights are used for testing. The confusion matrices for the various architectures are represented in Fig. 7.
It can be inferred that the class van is being confused with car/auto by these architectures. Also, the auto-rickshaw, which is the most common class on Indian roads, has been classified well as the dataset collected has enough autos to train the networks. The accuracy and precision values of the various classes and models can be seen in Table 2 and Table 3.
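The footnote of Table 2 states that accuracy and recall coincide because micro-averaging is used; a short sketch with a toy confusion matrix (not the paper's results) shows why every false positive of one class is a false negative of another, forcing micro-precision = micro-recall = accuracy:

```python
import numpy as np

# C[i, j] = number of samples of true class i predicted as class j.
# Toy 3-class matrix for illustration only.
C = np.array([[50,  3,  2],
              [ 4, 40,  6],
              [ 1,  5, 44]])

tp = np.diag(C).astype(float)
fp = C.sum(axis=0) - tp    # column sums minus diagonal
fn = C.sum(axis=1) - tp    # row sums minus diagonal

accuracy = tp.sum() / C.sum()
micro_precision = tp.sum() / (tp.sum() + fp.sum())
micro_recall = tp.sum() / (tp.sum() + fn.sum())

# Per-class values, analogous to the columns of Table 3.
per_class_precision = tp / C.sum(axis=0)
per_class_recall = tp / C.sum(axis=1)

print(accuracy, micro_precision, micro_recall)  # all three identical
```

Because fp.sum() and fn.sum() are both the total number of off-diagonal samples, the three micro-averaged quantities are always equal; the per-class values, as in Table 3, are where the architectures actually differ.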
Table 3: Precision, Recall values for different classes present in NITCAD object dataset.

                 Precision                                  Recall
Architecture     A    B    C    P    Tr   Tw   V           A    B    C    P    Tr   Tw   V
DenseNet        0.77 1.00 0.82 0.53 0.65 0.97 0.29        0.98 0.03 0.95 0.95 0.4  0.94 0.02
Inceptionv3     0.86 0.94 0.77 0.76 0.4  0.96 0.27        0.99 0.25 0.89 0.94 0.69 0.96 0.07
Mobilenet       0.9  0.48 0.72 0.82 0.39 0.96 0.2         0.99 0.25 0.94 0.97 0.17 0.96 0.02
NASNet          0.98 0.2  0.79 0.8  1.00 0.76 0.26        0.91 0.04 0.97 0.5  0.32 0.99 0.01
VGG16           0.66 0.73 0.87 0.51 0.42 0.94 0.4         0.98 0.17 0.9  0.96 0.59 0.85 0.01
Xception        0.88 0.88 0.92 0.84 0.14 0.94 0.21        0.99 0.25 0.94 0.99 0.69 0.99 0.009

A-auto rickshaw, B-bus, C-car, P-pedestrian, Tr-truck, Tw-two wheeler, V-van
3.2. NITCAD object dataset - Evaluation for detection
To evaluate the detection, Faster R-CNN is chosen. 1200 images are trained for 70 epochs, with each epoch having 200 iterations, on a 4GB GPU system. ResNet is used as the base architecture to train and extract features, and the metrics are tabulated in Table 4.
Table 4: Different metrics obtained after training Faster R-CNN.

Classifier Accuracy         0.894
Loss RPN Classifier         0.057
Loss RPN Regression         0.0846
Loss Detector Classifier    0.26
Loss Detector Regression    0.11
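Faster R-CNN optimises a multi-task objective, and the four losses in Table 4 are its components: a classification and a regression term each for the RPN and the detector head. Summing them with equal weights (an assumption, since the weighting used in training is not stated) gives a single unweighted total:

```python
# The four loss components reported in Table 4; equal weighting
# below is an assumption, not the paper's stated configuration.
losses = {
    "rpn_cls": 0.057,    # RPN objectness classification
    "rpn_reg": 0.0846,   # RPN anchor box regression
    "det_cls": 0.26,     # detector head classification
    "det_reg": 0.11,     # detector head box regression
}
total = sum(losses.values())
print(round(total, 4))   # -> 0.5116
```

The split makes it easy to see that, at convergence, most of the remaining loss sits in the detector's classification term rather than in the region proposals.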
3.3. Evaluation of NITCAD Stereo vision dataset
Intel RealSense stereo camera generates a depth map of the scene that is being recorded using the built-in vision processor. This was taken as the ground truth of depth. For improving the depth estimation, a neural network based approach, MC-CNN [16], was applied. For a pair of images corresponding to a scene, MC-CNN generates a disparity map which is used to obtain the depth information. A network pre-trained on KITTI was chosen to estimate the disparity maps. The disparity map for that particular scene was also obtained using inbuilt functions provided by the OpenCV library.
Let D_Intel(x, y) and D_method(x, y) correspond to the disparity maps generated by the Intel RealSense stereo camera and by one of the two above mentioned methods, respectively, for a particular scene. The average of the relative difference between the disparity maps generated (R_diff) can be obtained as

R_diff = (1 / (w × h)) Σ_{x ∈ w, y ∈ h} |D_Intel(x, y) − D_method(x, y)|        (1)

If the value of R_diff is small, then it can be implied that the disparity map obtained by the method is accurate.
Obtaining the depth information is useful in estimating the velocity of the objects in the scene. The output obtained from the Intel RealSense is taken as ground truth; the R_diff obtained with MC-CNN is 14 and with OpenCV is 18, averaged over about 100 image pairs.
Fig. 8: Disparity maps obtained from different algorithms for the image pair shown in Fig. 6. (Images thresholded for visual analysis)
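Eq. (1) above is a per-pixel mean absolute difference between disparity maps; a minimal numpy sketch, with toy 2x2 maps standing in for real RealSense output:

```python
import numpy as np

# Eq. (1): mean absolute difference between the RealSense disparity
# map (taken as ground truth) and a candidate method's disparity map.
def r_diff(d_intel, d_method):
    d_intel = np.asarray(d_intel, dtype=float)
    d_method = np.asarray(d_method, dtype=float)
    h, w = d_intel.shape
    return np.abs(d_intel - d_method).sum() / (w * h)

# Toy 2x2 disparity maps (hypothetical values, not dataset output).
gt = np.array([[10.0, 12.0], [14.0, 16.0]])
est = np.array([[11.0, 12.0], [13.0, 18.0]])
print(r_diff(gt, est))   # -> 1.0
```

Lower values mean the candidate map agrees more closely with the RealSense output, which is how the reported MC-CNN (14) versus OpenCV (18) averages are compared.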
4. Conclusion

A challenging dataset is presented which includes various test cases that are frequently encountered on Indian roads. To get information regarding velocity, a stereo dataset is presented which can be used to develop algorithms to obtain depth information. Various classes are labelled for object classification and are evaluated with the confusion matrices obtained from different architectures. For detection, Faster R-CNN is used. The stereo dataset is evaluated by the average absolute difference between the output obtained from the Intel camera and the methods described, i.e. MC-CNN and OpenCV. Our dataset can be further extended by collecting data which includes new classes like animals, lorries, sign boards, etc. Research on the development of novel architectures that can detect and classify in various conditions, including many edge cases, needs to be carried out.
5. Acknowledgements
We would like to thank TEQIP - III for providing funds to acquire the Intel RealSense D435 depth camera.
References
[1] Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: Imagenet: A large-scale hierarchical image database. In: Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on. pp. 248–255. IEEE (2009)
[2] Lin, T.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. In: European conference on computer vision. pp. 740–755. Springer (2014)
[3] Everingham, M., Van Gool, L., Williams, C.K., Winn, J., Zisserman, A.: The pascal visual object classes (voc) challenge. International journal of computer vision 88(2), 303–338 (2010)
[4] Dollar, P., Wojek, C., Schiele, B., Perona, P.: Pedestrian detection: A benchmark. In: Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on. pp. 304–311. IEEE (2009)
[5] Zhang, Shanshan, Rodrigo Benenson, and Bernt Schiele. “Citypersons: A diverse dataset for pedestrian detection.” Proceedings of the IEEE
Conference on Computer Vision and Pattern Recognition. 2017.
[6] Keller, Christoph Gustav, Markus Enzweiler, and Dariu M. Gavrila. “A new benchmark for stereo-based pedestrian detection.” 2011 IEEE
Intelligent Vehicles Symposium (IV). IEEE, 2011.
[7] Krause, Jonathan, et al. “3d object representations for fine-grained categorization.” Proceedings of the IEEE International Conference on
Computer Vision Workshops. 2013.
[8] Maddern, W., Pascoe, G., Linegar, C., Newman, P.: 1 year, 1000 km: The oxford robotcar dataset. IJ Robotics Res. 36(1), 3–15 (2017)
[9] Geiger, Andreas, et al. “Vision meets robotics: The KITTI dataset.” The International Journal of Robotics Research 32.11 (2013): 1231-1237.
[10] Cordts, M., Omran, M., Ramos, S., Rehfeld, T., Enzweiler, M., Benenson, R., Franke, U., Roth, S., Schiele, B.: The cityscapes dataset for semantic urban scene understanding. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 3213–3223 (2016)
[11] Neuhold, G., Ollmann, T., Bulò, S.R., Kontschieder, P.: The mapillary vistas dataset for semantic understanding of street scenes. In: International
Conference on Computer Vision (ICCV) (2017)
[12] Huang, Xinyu, et al. “The apolloscape dataset for autonomous driving.” Proceedings of the IEEE Conference on Computer Vision and Pattern
Recognition Workshops. 2018.
[13] Yu, Fisher, et al. “BDD100K: A diverse driving video database with scalable annotation tooling.” arXiv preprint arXiv:1805.04687 (2018).
[14] Caesar, Holger, et al. “nuScenes: A multimodal dataset for autonomous driving.” arXiv preprint arXiv:1903.11027 (2019).
[15] Varma, Girish, et al. “IDD: A dataset for exploring problems of autonomous navigation in unconstrained environments.” 2019 IEEE Winter
Conference on Applications of Computer Vision (WACV). IEEE, 2019.
[16] J. Zbontar and Y. LeCun, "Stereo matching by training a convolutional neural network to compare image patches," Journal of Machine Learning Research, vol. 17, pp. 1–32, 2016.
[17] E. Rosten and T. Drummond, "Machine learning for high-speed corner detection," in European Conference on Computer Vision, vol. 1, May 2006, pp. 430–443. [Online]. Available: http://www.edwardrosten.com/work/rosten2006machine.pdf
[18] Z. Zhang, "A flexible new technique for camera calibration," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, 2000.
[19] Huang, Gao, et al. "Densely connected convolutional networks." Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.
[20] Szegedy, Christian, et al. “Rethinking the inception architecture for computer vision.” Proceedings of the IEEE conference on computer vision
and pattern recognition. 2016.
[21] Howard, Andrew G., et al. “Mobilenets: Efficient convolutional neural networks for mobile vision applications.” arXiv preprint
arXiv:1704.04861 (2017).
[22] Zoph, Barret, et al. “Learning transferable architectures for scalable image recognition.” Proceedings of the IEEE conference on computer
vision and pattern recognition. 2018.
[23] Simonyan, Karen, and Andrew Zisserman. “Very deep convolutional networks for large-scale image recognition.” arXiv preprint
arXiv:1409.1556 (2014).
[24] Chollet, François. "Xception: Deep learning with depthwise separable convolutions." Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.