Vehicle Detection, Tracking and Counting
6 authors, including:
Javed Ahmed
Vision Systems
All content following this page was uploaded by Javed Ahmed on 04 March 2019.
Javed Ahmad
Military College of Signals
National University of Sciences and Technology
Rawalpindi, Pakistan
e-mail: [email protected]
Abstract—Traffic congestion and occlusions are major problems in metropolitan cities nowadays and lead to an ever-growing number of traffic accidents. Traffic flux management is therefore very important in order to avoid congestion, unnecessary waste of time and tragic accidents. Regulating traffic by optimizing the timing of traffic control signals is one solution. This paper presents a low-cost, camera-based algorithm for controlling traffic flow on a road. The algorithm consists of three main steps: vehicle detection, counting and tracking. Background subtraction is used to isolate vehicles from their background, a Kalman filter is used to track the vehicles, and the Hungarian algorithm is used to associate labels with the tracked vehicles. The algorithm is applied to both daytime and night-time videos acquired from a CCTV camera and an IR camera. Experimental results show the efficacy of the algorithm.

Keywords-Computer vision, vehicle detection, counting, tracking

I. INTRODUCTION

Nowadays, people prefer to commute in their own vehicles rather than by public transport, and this trend causes a huge traffic flux on roads. It leads to problems such as traffic congestion, air and noise pollution, irritable behavior, etc. Traffic flow management is therefore important to alleviate such issues. With ubiquitously available cameras, vision-based systems are more suitable and lower-cost solutions than loop detectors (which require massive hardware to automate the traffic flow), and they also reduce the number of traffic wardens required [1]. Moreover, the cameras may also be used for surveillance purposes.

Detection of vehicles in a sequence of frames is an active field of research in computer vision that helps in overcoming the ever-growing complications in traffic surveillance and security. Fig. 1 shows the block diagram of the system based on computational analysis. Computer vision deals with the automatic extraction, understanding and analysis of useful information from a single image or from a sequence of images. It involves the theoretical and algorithmic steps needed to achieve visual understanding automatically. In this paper, we propose a computer-vision-based vehicle detection, counting and tracking method that uses a Gaussian mixture model for background subtraction, which yields a foreground mask (binary mask). The foreground mask is then processed using morphological operators (e.g., dilation, erosion, opening and closing) to eliminate noise. BLOB analysis then helps in clearly discerning the vehicles from the background: it detects clusters of connected pixels that may correspond to moving objects. Afterwards, a binary classifier distinguishes vehicles from pedestrians. This classifier relies on the assumption that the width-to-height ratio of a vehicle is always greater than 1 and that of a pedestrian is always less than 1. A Kalman filter then predicts the locations of the vehicles during the next update interval, and the Hungarian algorithm is used to associate labels with the tracked vehicles. To increase the computational speed, all of this processing is done within a region of interest (ROI) set in the video frames, so that only vehicles that enter the ROI are detected. Thus, the vehicle counter is incremented only when a vehicle enters the ROI.

According to Fig. 1, in the first step the vehicles are extracted from the frames of the input video using the Gaussian mixture model. After the extraction of vehicles, noise is removed from the binary image obtained as a result of the second step; this noise filtering is performed using morphological operators. After the removal of noise, BLOB analysis is used to highlight the detected vehicles, and a counter counts the vehicles highlighted in Step 4. Counting (labelling) in Fig. 1 refers to cost estimation, which is achieved in this framework using the Hungarian method. This cost estimation helps in assigning tracks to their corresponding detections.
The rest of the paper is organized as follows: Section II discusses related work, Section III describes the proposed solution, Section IV presents the implementation and results, and Section V gives the conclusions and future work.

II. RELATED WORK

Jang et al. [2] employ a Gaussian mixture background model along with the pyramidal Lucas-Kanade method, first to extract the objects and then to measure the displacement of an object between two consecutive frames of a video. The Lucas-Kanade method is a differential method for optical flow estimation. Since one of the two frames is taken as a reference, problems such as the aperture problem, and therefore correspondence issues, arise and may yield quite uncertain results. In [3], the first frame of a video file is assumed to be the reference background for background subtraction and image segmentation. To eliminate the background, the video input is first converted to frames, and the difference between the current frame and the background is calculated. This difference is then used to discard the pixels that have the same values. This algorithm treats every object that appears displaced in the next frame as a vehicle even if it is not one, which makes its results uncertain.

In [4], a frame differencing method for vehicle detection is discussed in which sub-features of a vehicle account for its detection. Feature extraction is performed using Haar wavelets. Wavelet analysis is similar to Fourier analysis in that the target function is represented in terms of an orthonormal basis, which provides a non-redundant representation of the input image. In the wavelet pyramidal decomposition process discussed in [5], each frame of the input video is compressed after passing through a series of high-pass and low-pass filters and downsampling. This provides a richer model along with spatial resolution and is better suited to capturing complex patterns [5]. The complexity of this algorithm increases when attributes that belong to the same class and have the same absolute values are detected as two different features.

In [6], different tracking algorithms are discussed and classified into three main categories, namely tracking based on region, tracking based on active contour, and tracking based on features and combined methods. In region-based tracking, labels are assigned to the detected vehicles in order to track them through cross-correlation measurements over time. However, this approach is applicable only to large objects. Furthermore, it works only when there are few vehicles on the road and thus cannot, in general, handle occlusion reliably. In active-contour-based tracking, the main idea is to represent the bounding contour of a vehicle and update it dynamically [6]. The problem with this approach is that occlusion problems are almost inevitable because the tracking precision at the contour position is limited. In feature-based tracking, features of the intended vehicle are used as the key to tracking it. In each frame, features are extracted from the vehicle under observation, and the vehicle is tracked by analyzing these features across consecutive frames of the video sequence. This method is widely used in many systems [7].

To monitor the status of traffic, vehicle counting is usually used to estimate the traffic flow. In [8], the normalized color and edge map method was discussed for vehicle counting. This technique uses a color transform model to find vehicle colors and so efficiently locate possible vehicle candidates [9]. In [10], cascade Haar with a pyramidal Kanade-Lucas-Tomasi tracker is used for vehicle counting. A tracked object is said to be matched with a detection if a threshold percentage of its points are contained in the detection window [10]. Moreover, to reduce false counts, an object that is not matched with a detection in the maximum number of frames is ignored in the final count [10]. When a set threshold percentage of the points associated with an object leave the window, the object is counted [10].

The system proposed in this paper uses a Gaussian mixture model, which has the advantage of capturing finer details in foreground extraction because it computes a PDF for every pixel in a frame, which makes it more flexible in terms of cluster covariance. Although other filters may outperform the Kalman filter, it is employed for tracking in the proposed system because it has the advantage that it can correct its estimates using both prior and posterior knowledge. It also takes into account quantities that are partially or completely neglected in other techniques, and its recursive nature allows real-time execution of the algorithm without storing past observations or estimates.
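The recursive predict/correct cycle of the Kalman filter mentioned above can be illustrated with a minimal sketch. It is not the implementation used in this paper (which was written in MATLAB). It assumes a constant-velocity motion model for a single centroid coordinate, and the class name, the noise values `q` and `r`, and the initial covariance are hypothetical choices made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class ConstantVelocityKalman1D:
    """Tracks one centroid coordinate with state [position, velocity].

    Assumed model (illustrative only): F = [[1, dt], [0, 1]], H = [1, 0].
    The covariance P is symmetric and stored as p00, p01, p11.
    """
    x: float             # position estimate
    v: float = 0.0       # velocity estimate
    p00: float = 100.0   # assumed large initial uncertainty
    p01: float = 0.0
    p11: float = 100.0
    q: float = 0.01      # assumed process noise (added to the diagonal of P)
    r: float = 1.0       # assumed measurement noise variance

    def predict(self, dt: float = 1.0) -> float:
        # State prediction: x_k = F x_{k-1}
        self.x += dt * self.v
        # Covariance prediction: P = F P F^T + Q
        p00 = self.p00 + 2.0 * dt * self.p01 + dt * dt * self.p11 + self.q
        p01 = self.p01 + dt * self.p11
        p11 = self.p11 + self.q
        self.p00, self.p01, self.p11 = p00, p01, p11
        return self.x

    def correct(self, z: float) -> float:
        # Innovation and Kalman gain for a position-only measurement
        s = self.p00 + self.r
        k0, k1 = self.p00 / s, self.p01 / s
        y = z - self.x
        self.x += k0 * y
        self.v += k1 * y
        # Covariance update: P = (I - K H) P
        p00 = (1.0 - k0) * self.p00
        p01 = (1.0 - k0) * self.p01
        p11 = self.p11 - k1 * self.p01
        self.p00, self.p01, self.p11 = p00, p01, p11
        return self.x


# Usage: a centroid moving 2 pixels per frame along one axis.
kf = ConstantVelocityKalman1D(x=0.0)
for k in range(1, 20):
    kf.predict()
    kf.correct(2.0 * k)
next_position = kf.predict()  # predicted location for the next update interval
```

Only the current state and covariance are kept between frames, which is the property the text refers to when it says that no past observations or estimates need to be stored. A two-dimensional tracker can run one such filter per image axis under the same assumptions.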
Here, we have four tracks m, n, o and p and a total of four detections. The task is to find the least-cost assignment of the tracks to the detections. Since the cost matrix has as many rows as columns, each track can be assigned to only one detection at a time.

Figure 5. Blob Detection (BLOB analysis)

In the first step, we subtract the smallest element of each row from every element of that row. This results in at least one zero in the row. The process is repeated for all the rows of the matrix.

0   m2  m3  m4
n1  n2  0   n4
o1  o2  o3  0
p1  0   p3  p4

[...] defined as the state transition matrix [15]. Note that k-1 represents the previous state.

IV. IMPLEMENTATION AND RESULTS

The above-mentioned algorithm for vehicle detection is implemented in MATLAB 2017a. The results of the algorithm are compared with the ground truth values of vehicles in frames.

Figure 6. Comparison of results with the ground truth (green boxes show actual detection through this algorithm whereas yellow boxes represent ground truth positions of cars in a frame)
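The track-to-detection assignment described above (four tracks, four detections, row reduction as the first step) can be sketched as follows. The cost values are hypothetical, chosen so that the zeros after row reduction fall in the same positions as in the matrix shown above. The exhaustive search is a stand-in for the remaining steps of the Hungarian algorithm, not the algorithm itself; it returns the same minimum-cost assignment but is practical only for small matrices.

```python
from itertools import permutations


def reduce_rows(cost):
    """First step from the text: subtract each row's minimum from that row."""
    return [[c - min(row) for c in row] for row in cost]


def min_cost_assignment(cost):
    """Exhaustive one-to-one assignment of tracks (rows) to detections (columns).

    Stand-in for the full Hungarian algorithm; feasible only for small n.
    Returns (assignment, total_cost), where assignment[i] is the detection
    index assigned to track i.
    """
    n = len(cost)
    best = min(
        permutations(range(n)),
        key=lambda p: sum(cost[i][p[i]] for i in range(n)),
    )
    return list(best), sum(cost[i][best[i]] for i in range(n))


# Hypothetical costs for tracks m, n, o, p (rows) against four detections
# (columns), for example distances between predicted and detected centroids.
cost = [
    [4, 9, 7, 8],  # track m
    [6, 5, 2, 7],  # track n
    [8, 6, 9, 3],  # track o
    [5, 1, 6, 8],  # track p
]

reduced = reduce_rows(cost)
assignment, total = min_cost_assignment(cost)
```

Row reduction subtracts a constant from every entry of a row, so it lowers the cost of every possible assignment by the same amount and leaves the optimal assignment unchanged. This is why the search gives the same assignment on `cost` and on `reduced`.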
[...] ground truth bounding box, whereas the area of union corresponds to the area encompassed by both the predicted and the ground truth bounding boxes [16]. Fig. 6 shows the comparison of the results with the ground truth values.
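The Intersection over Union measure used for this comparison can be sketched as below. The box format `(x_min, y_min, x_max, y_max)` and the function name are assumptions made for illustration; the paper does not state the representation it uses.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Each box is (x_min, y_min, x_max, y_max); this format is an assumption.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap rectangle (zero if the boxes are disjoint)
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    # Area covered by both boxes together, counting the overlap only once
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0


# Usage: a predicted box shifted by half its width against the ground truth box.
score = iou((0, 0, 10, 10), (5, 0, 15, 10))
```

A value of 1 means the predicted box coincides with the ground truth box, and a value of 0 means the two boxes do not overlap.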
When the above-mentioned algorithm is run at other times of the day, for example at night (CCTV camera), it yields the results shown in Fig. 7.
Figure 11. Original frame and processed frame (from left to right) in low traffic density.
Figure 12. Original frame and processed frame (from left to right) in medium traffic density.
Figure 13. Graph of the detection accuracy of our algorithm.
Figure 14. Graph of the counting accuracy of our algorithm.
TABLE III. RESULTS OF IMPLEMENTATION OF OUR ALGORITHM ON BENCHMARK VIDEOS [19] (LOW TRAFFIC DENSITY)
[Table columns: Dataset#1 to Dataset#20]

Reference No.    %age accuracy for detections (average value)    %age accuracy for counting (average value)
[18]             94.22%                                          -
In [17], vehicle detection using a triangle thresholding method was proposed. According to [17], the threshold is determined from the histogram by normalizing its brightness values and the dynamic range of its intensity values. A line is drawn between the maximum and the minimum of the histogram brightness values, and the brightness value at the maximum distance from this line is set as the threshold [17]. Similarly, in [8], as discussed before, the normalized color and edge map method was used for vehicle counting, and in [18], pixel domain methods were employed for vehicle detection. To assess the accuracy of our proposed algorithm, we used the benchmark videos of [18], which were obtained from a UCSD database [19]. The accuracy of our algorithm compared with that of [18] is depicted in figures (see Fig. 11 and Fig. 12). TABLE III shows the results of applying our algorithm to the benchmark videos in [19]. Detections are reported as Intersection over Union percentages, and the counting accuracy of our algorithm on each dataset is also reported as a percentage. Our algorithm attains a detection accuracy of 94.77%, which is better than the 94.22% reported in [18]. The counting accuracy of our algorithm is 89.57%, as TABLE IV depicts. The lower counting percentage in some datasets is due to an error in the first frame of the benchmark videos. Fig. 13 and Fig. 14 show the graphs of our results discussed in TABLE III and TABLE IV.

Dataset:
Owing to the lack of an augmented dataset online for vehicle detection in videos taken with a stationary camera, the ground truth for the videos was constructed by us using object annotation tools. The daytime video (CCTV camera) used for this algorithm was acquired from a YouTube source [20], whereas the dataset for the night-time video was constructed by us using the online annotation tool LabelMe to obtain the ground truth positions of cars in a video frame. The daytime video has a frame rate of 25 fps. The night-time video was acquired online from [21], and its ground truth was also constructed by us using object annotation tools. The night-time video (CCTV camera) has a frame rate of 29.97 fps. The IR-camera video was acquired from the database in [22] and has a frame rate of 30 fps. The benchmark videos in [18] that were used for comparison with our algorithm were obtained from the UCSD database [19]. Each video in this dataset has a frame rate of 10 fps.

Figure 15. Difference between the plots of coordinates of ACTUAL DETECTION (through this framework) and that of GROUND TRUTH coordinates

V. CONCLUSION AND FUTURE WORK

In this paper, a framework for vehicle detection, tracking and counting was proposed. Its first function is to detect the vehicles that form the foreground of a frame using a Gaussian mixture model [11], [12], which gives a binary mask. This binary mask is then subjected to morphological operators [23]. Morphology is applied in order to suppress noise in the binary foreground image. After the removal of noise, connected regions are detected using Blob Analysis [13], and these regions are then assigned to detections using the Hungarian algorithm [14] on the basis of cost estimation. Next, a Kalman filter is employed to track the trajectory of each vehicle within the next update interval. Comparing the results of our algorithm with those discussed in [8] and [15], we conclude that our system is more efficient and accurate, since it yields results that are much closer to the ground truth values of vehicles. In future, the proposed algorithm can be implemented using OpenCV to work online.

The limitations of this work include vehicle shadows that become part of the foreground, a nonstationary camera, high vehicle velocities, and intense sunlight that causes reflections from car windows and thus makes extra objects part of the foreground. These issues can be addressed using techniques such as thresholding the video frames to remove noise or using the histogram of gradient (HOG) for vehicle detection. For fast vehicles, the Kalman filter model can be improved so that it can track fast-moving vehicles.

REFERENCES

[1] Gupte, Surendra, et al. “Detection and Classification of Vehicles,” ITS Institute of Minnesota, 2002.
[2] Jang, Hyeok, In-Su Won, and Dong-Seok Jeong. “Automatic Vehicle Detection and Counting Algorithm,” IJCSNS International Journal of Computer Science and Network Security, vol. 14, 2014.
[3] P.M. Daigavane, P.R. Bajaj. “Real Time Vehicle Detection and Counting Method for Unsupervised Traffic Video on Highways,” IJCSNS International Journal of Computer Science and Network Security, vol. 10, 2010.
[4] Hadi Raad Ahmed, Sulong Ghazali, George Loay Edwar. “Vehicle Detection and Tracking Techniques: A Concise Review,” International Journal (SIPIJ), vol. 5, no. 1, 2014.
[5] Wen Xuezhi, et al. “Improved Wavelet Feature Extraction Methods Based on HSV Space for Vehicle Detection,” IAPR Conference on Machine Vision Application, Tokyo, 2007.
[6] Farahani, Gholamreza. “Dynamic and Robust Method for Detection and Locating Vehicles in the Video Images Sequences with the Use of Image Processing Algorithm,” EURASIP Journal on Image and Video Processing, 2017.
[7] B. Han, L. Davis. “Object Tracking by Adaptive Feature Extraction,” International Conference on Image Processing, Institute of Electrical and Electronics Engineers (IEEE), Singapore, 2004.
[8] Yang Zi, Pun-Cheng S.C. Lilian. “Vehicle Detection in Intelligent Transportation Systems and Its Applications under Varying Environments,” The Department of Land Surveying and Geo-Informatics, The Hong Kong Polytechnic University, Hong Kong, 2017.
[9] Luo-Wei Tsai, Jun-Wei Hsieh, Kuo-Chin Fan. “Vehicle Detection using Normalized Color and Edge Map,” IEEE Transactions on Image Processing, 2007.
[10] Patel, Chirag I., Patel Ripal. “Counting Cars in Traffic using Cascade Haar with KLP,” International Journal of Computer and Electrical Engineering, 2013.
[11] Hardas, Adesh, et al. “Moving Object Detection using Background Subtraction Shadow Removal and Post Processing,” International Journal of Computer Applications, 0975-8887, 2015.
[12] Alandkar, Lajari, Gengaje, Sachin R. “Dealing Background Issues in Object Detection using GMM: A Survey,” International Journal of Computer Applications, vol. 150, no. 5, 0975-8887, 2016.
[13] “Machine vision guide”, retrieved from: https://docs.adaptive-vision.com/current/studio/machine_vision_guide/BlobAnalysis.html, last accessed 2018/01/05.
[14] “The free encyclopedia”, retrieved from: https://en.wikipedia.org/wiki/Hungarian_algorithm, last accessed 2018/01/05.
[15] Rezaei, Mahdi. “Computer Vision for Road Safety: A System for Simultaneous Monitoring of Driver Behavior and Road Hazards,” University of Auckland, New Zealand, 2014.
[16] PyImageSearch, “Machine learning,” https://www.pyimagesearch.com/2016/11/07/intersection-over-union-iou-for-object-detection/, last accessed 2018/02/12.
[17] Mohamed A. El-Khoreby, Syed Abd Rehman Abu Bakr. “Vehicle Detection and Counting in Complex Weather Conditions,” IEEE International Conference on Signal and Image Processing, Malaysia, 2017.
[18] Xu Liu, Zilei Wang, et al. “Highway Vehicle Counting in Compressed Domain,” CVPR, 3016-3024, 2015.
[19] Statistical Visual Computing Lab, UC San Diego, retrieved from: http://www.svcl.ucsd.edu/projects/traffic/, last accessed 2018/04/23.
[20] Day time video using CCTV camera, retrieved from: https://www.youtube.com/watch?v=eO19UTm93GQ.
[21] Shutterstock, Inc [US], “Cars run low speed in late night traffic on motorway”, retrieved from: https://www.shutterstock.com/video/clip-3756275-stock-footage-cars-run-low-speed-in-late-night-traffic-on-motorway-dusseldorf-germany-area.html?src=rel/3756290:3
[22] OTCBVS Benchmark Dataset Collection, Dataset#11, retrieved from: http://vcipl-okstate.org/pbvs/bench/, 10th May, 2018.
[23] “Image processing basics”, retrieved from: http://www.coe.utah.edu/~cs4640/slides/Lecture11.pdf, last accessed 2018/01/05.