Social Distancing Analyzer Using Computer Vision and Deep Learning
Retraction
This article (and all articles in the proceedings volume relating to the same conference) has been
retracted by IOP Publishing following an extensive investigation in line with the COPE guidelines.
This investigation has uncovered evidence of systematic manipulation of the publication process and
considerable citation manipulation.
IOP Publishing respectfully requests that readers consider all work within this volume potentially
unreliable, as the volume has not been through a credible peer review process.
IOP Publishing regrets that our usual quality checks did not identify these issues before publication,
and has since put additional measures in place to try to prevent these issues from reoccurring. IOP
Publishing wishes to credit anonymous whistleblowers and the Problematic Paper Screener [1] for
bringing some of the above issues to our attention, prompting us to investigate further.
[1] Cabanac G, Labbé C and Magazinov A 2021 arXiv:2107.06751v1
Retraction published: 23 February 2022
Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution
of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
Published under licence by IOP Publishing Ltd
Social Distancing Analyzer Using Computer Vision and Deep Learning
G V Shalini1, M Kavitha Margret2, M J Sufiya Niraimathi1, S Subashree1
1 Student, Department of Computer Science, Sri Krishna College of Technology, Coimbatore, Tamilnadu, India
2 Assistant Professor, Department of Computer Science, Sri Krishna College of Technology, Coimbatore, Tamilnadu, India
Email - [email protected]
Abstract. In the fight against the coronavirus, social distancing has proven to be an effective
measure to hamper the spread of the disease. The system presented is for analyzing social
distancing by calculating the distance between people in order to slow down the spread of the
virus. This system utilizes input from video frames to figure out the distance between individuals
to alleviate the effect of this pandemic. This is done by evaluating a video feed obtained by a
surveillance camera. The video is calibrated into a bird's-eye view and fed as input to the YOLOv3
model, a pre-trained object detection model trained on the Common Objects in Context (COCO)
dataset. The proposed system was validated on a pre-recorded video. The results obtained show that the
system evaluates the distance between multiple individuals and determines whether distancing rules
are violated. If the distance between individuals is less than the minimum threshold value, they are
marked with a red bounding box; otherwise they are marked with a green bounding box. This system
can be further developed to detect social distancing in real-time applications.
1. Introduction
The World Health Organization has declared the spread of coronavirus a global pandemic because of
the increase in the number of coronavirus cases reported across the world. To slow the pandemic,
numerous nations have imposed strict curfews and lockdowns in which the public authorities required
residents to stay in their homes. Various healthcare organizations have made it clear that the best way
to hinder the spread of the virus is for people to distance themselves from others and to reduce close
contact, in order to flatten the curve and relieve pressure on the healthcare system during this pandemic.
A recent report shows that practicing social distancing and wearing masks is a significant containment
measure to slow down the spread of SARS-CoV-2, since individuals with mild or no symptoms at all
may unknowingly carry the infection and spread the virus to others. To study such spread, data-driven
and mathematical models are consistently the most favored choice. In the fight against the coronavirus,
social distancing has proven to be an effective measure to hamper the spread of the disease. As the
name suggests, people are advised to maintain physical distance from one another and reduce close
contact, thereby reducing the spread of coronavirus.
By referring to already existing works, enhancements are made in the proposed system.
The system to be developed aims to promote social distancing by providing an analyzer tool to monitor
public areas, workplaces, schools, and colleges, to analyze and detect any social distancing violation and
to generate warnings. This is done using computer vision and a deep learning model. Computer vision,
alongside image processing, machine learning, and deep learning, provides effective solutions to measure
social distancing among humans across moving frames. Computer vision extracts information from
the input images and videos to obtain a correct understanding of them and interpret the visual input just
like the human brain. To achieve the above objective, objects are detected in real time using YOLO
(You Only Look Once), an algorithm based on convolutional neural networks, which is employed to
detect people and determine the distance between them in clusters of pedestrians in a neighborhood by
grabbing the feed from a video.
2. Related Works
This section highlights some works related to object detection and person detection using
deep learning. A body of recent work focused on the classification and detection of objects using
deep learning is also discussed. Detection of humans using computer vision is considered a part of
object detection. The detected objects are localized and classified based on their shape with the help
of a predefined model [1]. Techniques that use convolutional neural networks (CNN) and deep learning
have been shown to achieve better performance on visual recognition benchmarks. A CNN is a
multilayered neural network that contains many fully connected layers, sub-sampling layers, and
convolutional layers. It is powerful in detecting different objects from different inputs and it is a
supervised feature learning method. Because of its outstanding performance on large datasets such as
ImageNet, this model has achieved tremendous success in large-scale image classification tasks [2].
Object detection and recognition have achieved great success due to the neural network
structure, which is capable of constructing object representations on its own with the help of descriptors
and can learn distinguishing features that are not explicitly given in the dataset. But each approach has
its own set of advantages and disadvantages in terms of speed and accuracy. Real-time object detection
algorithms that use the CNN model, such as Region-Based Convolutional Neural Networks (R-CNN)
[3-5] and You Only Look Once (YOLO), have been developed for the detection of multiple classes in
various regions. YOLO is a prominent technique in terms of speed and accuracy in deep CNN based
object detection.
Figure 1 shows how object detection is done based on the YOLO model.
Building on the objectives and findings of [6-8], the proposed system presents a method for detecting
people using computer vision. Instead of using drone technology, the input is a video stream from an
installed CCTV camera. The camera's field of view covers the pedestrians passing within the range of
the installed camera. The people in the frame are represented by bounding boxes using deep CNN
models. The deep CNN based YOLO algorithm is used to detect the people in the sequence of video
frames taken by the CCTV camera. The calculations are done by measuring the centroid distance
between the pedestrians, which indicates whether the pedestrians are maintaining a safe distance from
one another.
3. Proposed System
In the proposed system, the social distancing analyzer tool is developed using computer vision, deep
learning, and Python to detect the interval between people in order to maintain safety. The YOLOv3
model based on convolutional neural networks, computer vision, and deep learning algorithms is
employed in the development of this work. Initially, for the detection of people in the image or frame,
an object detection network based on the YOLOv3 algorithm is used [9-11]. From the result obtained,
only the "Person" class is retained, ignoring objects of other classes. The bounding boxes are mapped
in the frame. The distance is measured using the result obtained by this process.
3.1. Approach
The working of the Social Distancing Analyzer is depicted using a flowchart shown in Figure 2.
Figure 2. The flow chart for the social distancing analyzer model.
4. Design Methodology
Figure 3. Process flow of the Social Distancing Analyzer model.
This section discusses the design methodology and working of the Social Distancing Analyzer model.
The region of interest (ROI) of an image or video frame, focused on the people walking and captured
using a CCTV camera, is transformed into a two-dimensional bird's-eye view. The transformed view
has a dimension of 480 pixels on each side. The calibration is done by transforming the captured view
frame into the two-dimensional bird's-eye view. The camera calibration is done straightforwardly using
OpenCV. The transformation of view is done using a calibration function that selects 4 points in the
input image/video frame and then maps each point to the corners of the rectangular two-dimensional
image frame. On performing this transformation, every person in the image/frame is considered to be
standing on a leveled horizontal plane. Now the distance between people in the frame can be calculated
easily, as it corresponds to the number of pixels between them in the transformed bird's-eye view.
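The calibration code itself is not included in the paper; the following is a minimal sketch of such a four-point bird's-eye-view transformation using OpenCV, where the source points, the image file name, and the example ground point are illustrative placeholders rather than values from the original work.

# Minimal sketch of the four-point bird's-eye-view calibration (illustrative values only)
import cv2
import numpy as np

SIDE = 480  # the transformed view is 480 pixels on each side

# Four points selected in the input frame (placeholder coordinates)
src_pts = np.float32([[200, 150], [620, 150], [700, 470], [120, 470]])
# Corners of the 480 x 480 bird's-eye view they are mapped to
dst_pts = np.float32([[0, 0], [SIDE, 0], [SIDE, SIDE], [0, SIDE]])

M = cv2.getPerspectiveTransform(src_pts, dst_pts)

frame = cv2.imread("frame.jpg")                      # one video frame (file name is illustrative)
bird_view = cv2.warpPerspective(frame, M, (SIDE, SIDE))

# A person's ground point (bottom-centre of the bounding box) can be mapped the same way
pt = np.float32([[[410, 460]]])                      # shape (1, 1, 2) as required by OpenCV
bird_pt = cv2.perspectiveTransform(pt, M)[0][0]
print(bird_pt)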
A deep convolutional neural network model is a simple and efficient model for object detection. This
model considers only the regions which contain the "Person" class and discards the regions that are not
likely to contain any object. This process of extracting only the regions that contain objects is called
region proposal. The regions predicted by the region proposal step can vary in size and can overlap
with other regions. So, to discard the redundant bounding boxes surrounding overlapping regions,
non-maximum suppression is applied based on the Intersection over Union (IoU) score.
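The non-maximum suppression step is described but not shown as code in the paper; a minimal sketch using OpenCV's built-in NMSBoxes helper is given below, with illustrative boxes, confidences, and thresholds that are not taken from the original work.

# Minimal non-maximum suppression sketch using OpenCV (all values are illustrative)
import cv2
import numpy as np

# Candidate boxes as [x, y, w, h] and their confidence scores from the detector
boxes = [[100, 120, 60, 160], [105, 118, 62, 158], [300, 140, 55, 150]]
confidences = [0.91, 0.88, 0.76]

SCORE_THRESHOLD = 0.5   # discard weak detections
NMS_THRESHOLD = 0.4     # overlaps with IoU above this are suppressed

keep = cv2.dnn.NMSBoxes(boxes, confidences, SCORE_THRESHOLD, NMS_THRESHOLD)
final_boxes = [boxes[i] for i in np.array(keep).flatten()]
print(final_boxes)      # the duplicate overlapping box is removed, two people remain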
The object detection approach used in the social distancing analyzer model reduces computational
complexity. It does so by formulating the detection of objects as a single regression problem [5].
Among deep-learning-based object detection models, the You Only Look Once (YOLO) model is used
here. This model is suitable for real-time applications, as it is fast and provides accurate results. Figure
5 shows the pedestrian detection using the YOLOv3 model. YOLOv3 is an object detection model that
takes an image or a video as input and can simultaneously learn and draw bounding box coordinates
(tx, ty, tw, th), the corresponding class label probabilities (P1 to Pc), and the object confidence.
YOLOv3 is a model already trained on the Common Objects in Context (COCO) dataset [4]. This
dataset consists of 80 labels, including a person class used here as the pedestrian class. Figure 5
represents the YOLOv3 model used in the Social Distancing Analyzer. The parameters used in the
detection of pedestrians are as follows:
● Box Coordinates - tx, ty, tw, th
● Object Confidence - C
● Class probabilities - P1, P2, … Pc
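The detection code itself is not reproduced in the paper; the sketch below shows one common way to run a pre-trained YOLOv3 COCO model through OpenCV's DNN module and keep only the person class. The configuration and weights file names, the 416 × 416 input size, and the 0.5 confidence threshold are assumptions, not values from the original work.

# Sketch: running a pre-trained YOLOv3 (COCO) model with OpenCV's DNN module
# File names, input size, and threshold are assumed, not taken from the original paper.
import cv2
import numpy as np

net = cv2.dnn.readNetFromDarknet("yolov3.cfg", "yolov3.weights")
layer_names = net.getUnconnectedOutLayersNames()     # the YOLO output layers

frame = cv2.imread("frame.jpg")
h, w = frame.shape[:2]
blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416), swapRB=True, crop=False)
net.setInput(blob)
outputs = net.forward(layer_names)

boxes, confidences = [], []
for output in outputs:
    for det in output:                               # det = [tx, ty, tw, th, objectness, P1..Pc]
        scores = det[5:]
        class_id = int(np.argmax(scores))
        conf = float(scores[class_id])
        if class_id == 0 and conf > 0.5:             # keep only the "person" class
            cx, cy, bw, bh = det[0:4] * np.array([w, h, w, h])
            boxes.append([int(cx - bw / 2), int(cy - bh / 2), int(bw), int(bh)])
            confidences.append(conf)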
There are different objects present in a single frame; the goal is to identify only the "Person" class and
map bounding boxes only to the people. The code for drawing the bounding boxes is given below,
and the output of this code is shown in Figure 6.
# To identify only the "Person" class (class id 0 in the COCO labels)
# `classes` and `box` hold the class ids and bounding boxes returned by the detector
import numpy as np

x = np.where(classes == 0)[0]   # indices of detections labelled as person
p = box[x]                      # bounding boxes of the detected people
count = len(p)                  # number of people in the frame
x1, y1, x2, y2 = p[0]           # coordinates of the first person's box
print(x1, y1, x2, y2)
Figure 6. Output obtained from Bounding Box method.
For each person in the input frame, the position in the bird's-eye view transformation is calculated
based on the central axis point of every person in the input frame. The distance between every pair of
people can then be estimated in the bird's-eye view by calculating the Euclidean distance between their
centroids. As the camera is calibrated, more accurate results can be obtained.
Any pair of individuals whose distance is lower than the preset minimum threshold value is considered
a violation. The people who violate the condition are marked with a red box, and the remaining people
are marked with a green box. The code for computing the centers of the bounding boxes is given below:
# To compute the center (bottom mid-point) of a person's bounding box
import cv2
import matplotlib.pyplot as plt

def mid(image, p, id):
    # p[id] holds the box of one detected person as (x1, y1, x2, y2)
    x1, y1, x2, y2 = p[id]
    _ = cv2.rectangle(image, (x1, y1), (x2, y2), (0, 0, 255), 2)
    x_c = int((x1 + x2) / 2)        # horizontal center of the box
    y_c = int(y2)                   # bottom edge of the box
    c = (x_c, y_c)
    _ = cv2.circle(image, c, 5, (255, 0, 0), -1)
    return c

plt.figure(figsize=(20, 10))
plt.imshow(image)
The code to compute the pairwise distances between all detected people in a frame is given below:
%%time
# Compute the pairwise Euclidean distances between the centers of all detected people
import numpy as np
from scipy.spatial import distance

def dist(midpt, n):
    # midpt: list of center points, n: number of detected people
    d = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            if i != j:
                dst = distance.euclidean(midpt[i], midpt[j])
                d[i][j] = dst
    return d
If the distance obtained in the previous step is less than the minimum acceptable threshold value, then
the boxes around that pair of people are drawn in red. The code that defines a function to change the
color of the boxes of people who are too close to red is given below:
def red(image, p, p1, p2):
    # p1 and p2 are the index lists of the pairs of people that violate the threshold
    unsafe = np.unique(p1 + p2)
    for i in unsafe:
        x1, y1, x2, y2 = p[i]
        _ = cv2.rectangle(image, (x1, y1), (x2, y2), (255, 0, 0), 2)
    return image
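The paper's snippets do not show how the index lists p1 and p2 passed to red() are obtained from the distance matrix; a minimal bridging sketch is given below, reusing mid(), dist(), count, p, and image from the earlier snippets, with MIN_DISTANCE as an assumed threshold in bird's-eye-view pixels.

# Sketch: deriving the violating index lists from the snippets above
# MIN_DISTANCE is an assumed value, not taken from the original paper.
import numpy as np

MIN_DISTANCE = 80                                  # assumed minimum acceptable distance in pixels

midpt = [mid(image, p, i) for i in range(count)]   # centers of all detected people
d = dist(midpt, count)                             # pairwise distance matrix

# d[i][j] is filled only for j > i, so zero entries are simply ignored here
p1, p2 = np.where((d > 0) & (d < MIN_DISTANCE))
p1, p2 = list(p1), list(p2)

image = red(image, p, p1, p2)                      # mark everyone in a violating pair with a red box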
Figure 8. Social distancing analyzer detecting pedestrians in video frame - Unsafe Distance.
Figure 9. Social distancing analyzer detecting pedestrians in video frame - Safe Distance.
Even though people within the range are detected, some detection errors occur, possibly because of
overlapping bounding boxes or people walking too close to each other. Such a detection error is shown
in Figure 10, where there are six people within the range of detection but only five people are detected;
this is due to overlapping detections, as two people are standing too close to each other.
Figure 10. Error in detecting people within the range.
The accuracy of the calculated distance between individuals depends upon the algorithm. The YOLOv3
algorithm can detect a pedestrian as an object even if only half of the body is visible; the bounding box
is then mapped to the half-visible body. In such cases, the position of the person, taken as the midpoint
of the lowermost side of the bounding box, is comparatively less precise. To eliminate the error
occurring due to such overlaps, a quadrilateral box is added to represent the detection range. Figure 11
shows the range of detection; only the people within this range are considered.
Figure 11. Pedestrians who are out of the specified range are not considered.
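The quadrilateral range check is described but its code is not included; a minimal sketch of how a person's ground point could be tested against such a region with OpenCV's pointPolygonTest is shown below, with placeholder quadrilateral coordinates and example center points.

# Sketch: keep only the people whose ground point lies inside the detection range
# The quadrilateral coordinates and example centers are placeholders, not from the original work.
import cv2
import numpy as np

range_quad = np.array([[100, 200], [620, 200], [700, 470], [50, 470]],
                      dtype=np.int32).reshape((-1, 1, 2))

def in_range(center):
    # pointPolygonTest returns +1 inside, 0 on the edge, -1 outside the polygon
    return cv2.pointPolygonTest(range_quad, (float(center[0]), float(center[1])), False) >= 0

centers = [(410, 460), (30, 100)]                        # example center points (illustrative)
centers_in_range = [c for c in centers if in_range(c)]
print(centers_in_range)                                  # only the first point lies inside the range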
5. Conclusion
The proposed system is capable of estimating the distance between people. The social distancing
patterns are distinguished and classified as "Safe" and "Unsafe" distances. Additionally, the system
displays labels as per the object detection and classification. The classifier can be implemented for live
video streams and can be used for developing real-time applications. This system can be integrated
with CCTV for the surveillance of people during pandemics [9]. Mass screening is feasible, and hence
the system can be utilized in crowded places like railway stations, bus stops, markets, streets, mall
entrances, schools, colleges, work environments, and restaurants. By monitoring the space between two
individuals, we can confirm that a safe distance is maintained, which can help to curb the spread of the
virus.
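As an illustration of the real-time extension mentioned above, a frame-by-frame loop over a live camera stream could look like the sketch below; process_frame is a hypothetical placeholder standing in for the detection, bird's-eye-view transformation, and distance-checking pipeline described in this paper.

# Sketch of a real-time loop; process_frame is a hypothetical placeholder for the
# detection + bird's-eye-view + distance-checking pipeline described in this paper.
import cv2

def process_frame(frame):
    # Placeholder: run YOLOv3, transform to bird's-eye view, measure distances,
    # and draw red/green boxes on the frame.
    return frame

cap = cv2.VideoCapture(0)          # 0 = default camera; a CCTV RTSP URL could be used instead
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    annotated = process_frame(frame)
    cv2.imshow("Social Distancing Analyzer", annotated)
    if cv2.waitKey(1) & 0xFF == ord("q"):   # press q to stop
        break
cap.release()
cv2.destroyAllWindows()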
References
[1] D.T. Nguyen, W. Li, P.O. Ogunbona, Human detection from images and videos: A survey,
Pattern Recognition, 51:148-75, 2016.
[2] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, L. Fei-Fei, ImageNet: A Large-Scale Hierarchical
Image Database, In Computer Vision and Pattern Recognition, 2009.
[3] R. Girshick, J. Donahue, T. Darrell, J. Malik. Rich feature hierarchies for accurate object
detection and semantic segmentation. In Proceedings of the IEEE conference on computer
vision and pattern recognition, pp. 580-587. 2014.
[4] J. Redmon, S. Divvala, R. Girshick, A. Farhadi, You only look once: Unified, real-time object
detection, In Proceedings of the IEEE conference on computer vision and pattern recognition,
pp. 779-788, 2016.
[5] A. Haldorai and A. Ramu, Security and channel noise management in cognitive radio networks,
Computers & Electrical Engineering, vol. 87, p. 106784, Oct. 2020.
doi:10.1016/j.compeleceng.2020.106784
[6] A. Haldorai and A. Ramu, Canonical Correlation Analysis Based Hyper Basis Feedforward
Neural Network Classification for Urban Sustainability, Neural Processing Letters, Aug. 2020.
doi:10.1007/s11063-020-10327-3
[7] Landing AI Creates an AI Tool to Help Customers Monitor Social Distancing in the Workplace
[Online] (Accessed on 4 May 2020).
[8] Ahmed, I., Ahmad, M., Rodrigues, J. J. P. C., Jeon, G., & Din, S. (2020). A deep learning-based
social distance monitoring framework for COVID-19. Sustainable Cities and Society, 102571.
doi:10.1016/j.scs.2020.102571
[9] Dhaya, R. CCTV Surveillance for Unprecedented Violence and Traffic Monitoring. Journal of
Innovative Image Processing (JIIP) 2, no. 01 (2020): 25-34.
[10] Ramadass, Lalitha, Sushanth Arunachalam, and Z. Sagayasree. Applying deep learning
algorithms to maintain social distance in public places through drone technology. International
Journal of Pervasive Computing and Communications (2020).
[11] Degadwala, Sheshang, et al. Visual Social Distance Alert System Using Computer Vision &
Deep Learning. 2020 4th International Conference on Electronics, Communication and
Aerospace Technology (ICECA). IEEE, 2020.