Thesis ADS11
Of
KANAK ARORA
NIDHI SAKHARE
HIMANSHU DHOMANE
YOGESH RATHOD
CERTIFICATE OF APPROVAL
Certified that the project report entitled “Crowd Detection And Notifier System” has been
successfully completed by Kanak Arora, Nidhi Sakhare, Himanshu Dhomane, and Yogesh Rathod
under the guidance of Prof. Shweta A. Gode, in partial fulfillment of the requirements for the award
of the degree in Artificial Intelligence and Data Science, Yeshwantrao Chavan College of
Engineering, Nagpur (An Autonomous Institution Affiliated to Rashtrasant Tukdoji Maharaj Nagpur
University).
Certificate of collaboration (industry/research organization)
This is to certify that the following final-year students of Artificial Intelligence and Data Science
have completed the live/industry/joint research project titled “Crowd Detection And Notifier System”
under the guidance of Prof. Shweta A. Gode and Mr. Vishal Deshmukh, in collaboration with Name
of Industry, for the session 2024-25.
DECLARATION
We certify that
a. The work contained in this project has been done by us under the guidance of the supervisor(s).
b. The work has not been submitted to any other Institute for any degree or diploma.
c. We have followed the guidelines provided by the Institute in preparing the project report.
d. We have conformed to the norms and guidelines given in the Ethical Code of Conduct of the
Institute.
e. Whenever we have used materials (data, theoretical analysis, figures, and text) from other sources,
we have given due credit to them by citing them in the text of the report and giving their details in
the references. Further, we have taken permission from the copyright owners of the sources,
whenever necessary.
Kanak Arora
Nidhi Sakhare
Himanshu Dhomane
Yogesh Rathod
ACKNOWLEDGEMENT
This project work is one of the major milestones in our journey of learning. We wish to express
our sincere thanks and sense of gratitude to our guide Prof. Shweta A. Gode and co-guide Mr. Vishal
Deshmukh, for their guidance, constant inspiration, and continued support throughout the tenure of
this project. The blessings, help, and guidance given by them from time to time shall carry us a long
way in the journey of technical research.
We also want to thank our Head of Department, Dr. Kavita R. Singh. She was always kind and
ready to help. Her words of support made us feel more confident. She gave us helpful suggestions that
helped us learn and grow. We truly appreciate the time she gave us.
A big thank you to our Principal, Dr. U. P. Waghe, for always being there to support us. He
gave us permission to use the lab and all the other things we needed to complete our project well. His
support made it easy for us to stay focused on our project.
We are also thankful to Prof. Nilesh U. Sambhe, our project coordinator. He gave us advice
when we needed it most. He helped us stay on track. His timely suggestions were simple but very
useful.
Our special thanks go to Mrs. B. H. Kulkarni, our technical assistant. She was always kind and
cooperative. Whenever we needed technical help, she was there with a solution. Her support made our
work much easier.
Lastly, we express our gratitude to all our teachers at every level, especially those who taught us
fundamental concepts and investigative strategies and who fostered a sense of wonder.
TABLE OF CONTENTS
Title Page i
Certificate of Approval ii
Certificate of Collaboration iii
Declaration iv
Acknowledgement v
Table of Contents vi
List of Tables viii
List of Figures ix
List of Abbreviations x
List of Symbols xi
Abstract xii
CHAPTER 1: Introduction 1
1.1 Overview 1
1.2 Literature Survey 1
1.3 Problem Statement 2
1.4 Project Objectives 2
1.5 Project Contributions 3
TITLE PAGE NO.
3.6 Training 21
3.7 Modes of Detection 21
3.8 Web Portal 21
3.9 YOLOv8 Integration and Inference Optimization 22
3.10 Architecture and Communication Between SIM800L and Raspberry Pi 4 22
3.11 Alert Mechanism and SMS Integration 23
3.12 Flutter App for Viewing Images via URL 23
Social Utility 33
Appendix 35
References 40
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
LIST OF SYMBOLS
SYMBOL DESCRIPTION
ABSTRACT
This work proposes a real-time surveillance system that detects crowd density and violent behaviour
using machine learning. A YOLOv8n model, trained on custom datasets prepared with Roboflow,
performs accurate detection on live video streams. Video from a Tapo C200 CCTV camera is
processed on a solar-powered Raspberry Pi 4, with inference accelerated by a Google Edge TPU.
The Edge TPU delivers up to 4 TOPS while consuming only about 2 W, enabling low-latency,
real-time performance.
The Tapo C200 provides 360° panoramic coverage, 1080p resolution, and night vision for reliable
video capture. The Raspberry Pi 4, with 4 GB of RAM and a quad-core Cortex-A72 processor,
handles streaming and inference tasks smoothly.
When crowd or violence thresholds are exceeded, the SIM800L GSM module sends real-time SMS
notifications over the GSM network, so alerts reach users even without an internet connection.
Event snapshots are pushed to a Flask web portal and can be accessed through a Flutter mobile
application for remote viewing and detection history.
By combining a CCTV camera, Raspberry Pi 4, Edge TPU, and SIM800L, the system delivers fast
detection, timely alerts, and reliable operation, making it well suited to scalable surveillance in busy
or unsafe locations.
Keywords: Real-time Surveillance, YOLOv8n, Google Edge TPU, Tapo C200 CCTV Camera,
Raspberry Pi 4, Crowd Detection, Violence Detection, Flask Web Portal, GSM Alert System, Flutter
Mobile Application, Machine Learning, Edge Computing.
CHAPTER-1: INTRODUCTION
1. INTRODUCTION
1.1 OVERVIEW:
Surveillance systems are critical to safety: they capture and monitor areas where populations are at
high risk. The goal of this pipeline is to perform surveillance on the fly, accurately detecting crowd
density and, where possible, violent activities. It runs YOLOv8n, fine-tuned on custom Roboflow
datasets, over live video streams to detect objects in real time. A standard Raspberry Pi 4 paired with
a Google Edge TPU for inference acceleration provides up to 4 TOPS of processing power at only
about 2 W. Video from a Tapo C200 security (CCTV) camera supplies continuous monitoring. When
a threat is detected, the SIM800L GSM module sends instant SMS alerts, while event snapshots are
streamed to the Flask web portal and rendered in the Flutter mobile application. This
hardware-software integration makes real-time operation possible and adds safety and accountability
in life-critical environments.
1.2 LITERATURE SURVEY:
The literature review covers 21 research papers that summarize real-time object and crowd detection
systems and the technologies and methodologies involved. Research on YOLO versions ([1]–[4])
traces the evolution from YOLOv1 to YOLOv8, together with YOLO-NAS, the latest advancement
in the YOLO series, which further improves localization accuracy and detection speed for
autonomous driving and surveillance research. References [5]–[15] cover object and crowd detection,
including ViTPose for human pose estimation and CNN- and Transformer-based approaches to
traffic accident detection and violence recognition. These works underscore the importance of deep
learning, benchmark datasets, and real-time analytics for public safety and surveillance. References
[16]–[19] show that such models can also run on resource-constrained devices such as the Raspberry
Pi, enabling object detection, theft prevention, and smart surveillance. The study of the GSM module
([20]) introduces a programmable real-time alert device that can notify the public within seconds via
SMS, which is essential for event monitoring. Work on WebRTC ([21]) supports live video
streaming across remote networks, further improving the effectiveness of surveillance systems.
Together, the reviewed literature provides a solid foundation for building smart, real-time monitoring
solutions that combine YOLO with embedded hardware, GSM, and live communication via
WebRTC.
1.3 PROBLEM STATEMENT:
Monitoring crowd behaviour and spotting violent activities in real time is a major challenge in areas
that require strict control. Traditional and even modern surveillance systems cannot perform true
real-time analysis on their own and generally rely on human intervention to detect a threat. This
means responses to issues that demand immediate attention are delayed, compromising safety and
security. To address these problems, an automated surveillance system is needed that can detect
crowd density and violent behaviour on the fly, generate real-time alerts, and offer remote
monitoring capabilities.
1.4 PROJECT OBJECTIVES:
The system under development focuses on a surveillance solution that enables proactive threat
detection and increases situational awareness by keeping the number of people in an area under
control. Its objectives are:
1. Develop a crowd-counting system using YOLOv8 with user-defined thresholds for
overcrowding prevention and automatic alerts.
2. Develop a violence detection mode that captures violent activities.
3. Ensure continuous monitoring from surveillance cameras, capturing video with minimal latency.
4. Provide a user-friendly interface for real-time monitoring, configurable thresholds, and
toggleable modes.
5. Enable instant alerting through SMS and image snapshots for immediate response and user
awareness.
6. Ensure the system is portable and easily deployable across various environments without
complex setup.
1.5 PROJECT CONTRIBUTIONS:
In this project, we present a streamlined, smart surveillance system that blends cutting-edge
technology with practical needs in a user-friendly way. Its main contributions are:
1. Crowd and Violent-Behaviour Detection: Real-time analysis of live video using YOLOv8n
trained on custom datasets detects crowd density and violent behaviour within a split second.
2. Reduced Load on Human Monitoring: Automated threat detection lets human operators focus
on responding rather than watching a screen all day.
3. Environmentally Friendly: The system is lightweight and power-efficient (~2 W), making it
an eco-friendly technology deployment.
4. Limited Data Sharing for Better Security: Alerts and images are shared only when triggered,
and web and mobile access is restricted to authorized users.
5. Efficient Monitoring Tool: The lightweight, low-carbon design scales easily to schools,
homes, and business offices.
6. Enhanced Public Safety: Accurate early detection reduces the risk of overcrowding and
violence in schools, offices, events, and public spaces.
CHAPTER-2: REVIEW OF LITERATURE
2. REVIEW OF LITERATURE
2.1 OVERVIEW:
A total of 21 research papers have been examined for this work. In particular, references [1]–[3]
focus on YOLO versions, [5]–[15] pertain to object detection and crowd detection, [16]–[19] relate
to the Raspberry Pi, and [20] discusses the GSM module.
A summary of the research and review papers studied above follows:
YOLO Versions:
Peiyuan Jiang, Daji Ergu, Fangyao Liu, Ying Cai, and Bo Ma, “A Review of YOLO Algorithm
Developments” (2024) [1] offers a comprehensive survey of advances in real-time object detection
with the YOLO (You Only Look Once) algorithm. From YOLOv1 to YOLOv4, it discusses changes
in detection performance and deployment speed as well as architectural improvements, describing
how each version addresses the shortcomings of the previous generation. It then examines the
different ways YOLO can be applied (autonomous driving, medical imaging) and compares
performance between versions, benchmarking against alternative algorithms. The authors discuss
remaining obstacles in small-object detection and changing conditions, and indicate directions for
future work to push the technology forward.
Kaiming Gu and Boyu Su, “A Study of Human Pose Estimation in Low-Light Environments Using
YOLOv8 Model” (2024) [2] presents a comprehensive examination of the YOLOv8 family of
models for human pose estimation in low-light conditions, one of the challenging problems in
computer vision. The authors evaluate six variants of YOLOv8 to investigate how well the models
detect and interpret human body key points under poor lighting. The methodology compares the
models on a dedicated low-light dataset, evaluating precision, recall, and processing rate. Results
show that the larger YOLOv8 models recognize poses better, but at the cost of computational
resources in terms of memory and processing time. This raises serious engineering challenges for
deploying these models in real-time systems such as surveillance cameras, mobile devices, or
embedded systems where resources are constrained.
Mehmet Şirin Gündüz and Gültekin Işık, “A New YOLO-Based Method for Real-Time Crowd
Detection from Video and Performance Analysis of YOLO Models” (2023) [3] addresses real-time
crowd detection using YOLO models, especially for managing indoor capacity limits during
COVID-19. The study introduces a mechanism that counts the people within a given area of a video
and indicates when its capacity limit is reached. A YOLO object detection model with weights
pretrained on the Microsoft COCO dataset is used to detect and label people. Performance across the
YOLO models is measured by mean average precision (mAP), frames per second (fps), and
accuracy, comparing YOLOv3 with YOLOv5s. YOLOv3 achieved the highest accuracy and mAP,
while YOLOv5s outperformed all non-Tiny models in fps.
F. Sultana, A. Sufian, and P. Dutta, “A Review of Object Detection Models Based on Convolutional
Neural Network” (2019) [5] provides a survey of state-of-the-art CNN-based object detection
models. The review organizes the models into two families, two-stage and one-stage, and traces the
evolution from R-CNN to RefineDet, describing each model and its training process. The paper also
reports comparative simulation results, tracking the evolution of object detection systems.
Yufei Xu, Jing Zhang, Qiming Zhang, and Dacheng Tao, “ViTPose: Simple Vision Transformer
Baselines for Human Pose Estimation” (2022) [6] evaluates plain vision transformers on human
pose estimation. ViTPose, the baseline model reviewed in the paper, uses a plain, non-hierarchical
transformer backbone with a lightweight decoder for pose estimation. ViTPose is strikingly simple,
composable, and scalable from about 100M to 1B parameters, offering high throughput and
performance with many options for attention types, resolutions, and training schemes. The model
sets a new state of the art on the MS COCO Keypoint Detection benchmark, reaching 80.9 AP on
the MS COCO test-dev set.
Hadi Ghahremaninezhad, Hang Shi, and Chengjun Liu, “Real-Time Accident Detection in Traffic
Surveillance Using Deep Learning” (2022) [7] describes a computer-vision framework for detecting
traffic accidents at intersections. The framework fuses three components: YOLOv4 for precise
object detection, a Kalman filter combined with the Hungarian algorithm for object tracking, and
trajectory conflict analysis in the accident decision stage. The authors propose a cost function for
improved object association under occlusion, overlap, and shape changes. Based on the velocity,
angle, and distance of object trajectories, the framework derives different trajectory conflicts,
covering vehicle-to-vehicle, vehicle-to-pedestrian, and vehicle-to-bicycle interactions. Experimental
results show that the proposed method succeeds in real-time traffic surveillance with a high
detection rate and a low false-alarm rate, even in complex lighting conditions.
İrem Üstek et al., “Two-Stage Violence Detection Using ViTPose and Classification Models at
Smart Airports” (2022) [8] demonstrates a framework that combines ViTPose for pose estimation
with a CNN-BiLSTM classifier for real-time violence detection. Integrated into the SAAB SAFE
system and tested on the AIRTLab dataset, it improves security by offering better accuracy and
fewer false alarms, enabling faster threat response in post-pandemic airport environments.
Licheng Jiao, Fan Zhang, Fang Liu, Shuyuan Yang, Lingling Li, Zhixi Feng, and Rong Qu, “A
Survey of Deep Learning-Based Object Detection” (2019) [9] reviews object detection in computer
vision, discussing current results and upcoming methods in detail. The paper describes how deep
learning algorithms have evolved to drastically improve performance on object detection problems
across security and autonomous driving. It provides a rigorous review of one-stage and two-stage
detection models, introduces benchmark datasets and their roles, and comprehensively covers both
classical and modern applications along the main branches of object detection, together with the
architectural choices for building effective detectors. It also suggests directions for future research to
keep pace with state-of-the-art algorithms.
Xinyi Zhou, Wei Gong, WenLong Fu, and Fengtong Du, “Application of Deep Learning in Object
Detection” (2017) [10] examines deep-learning-based object detection in computer vision. The paper
offers a survey of widely used datasets and algorithms in this domain, proposes a new dataset
generated from existing ones, and conducts experiments with Faster R-CNN. The study demonstrates
the importance of deep learning frameworks and shows that state-of-the-art object detection results
can be significantly improved with better datasets.
Abdul Vahab, Maruti S Naik, Prasanna G Raikar, and Prasad S R, “Applications of Object Detection
System” (2019) [11] examines the productivity, usefulness, and variety of object detection
technology in computer and robot vision systems. As the paper points out, object detection has seen
a massive uptick in real-world adoption due to improvements in machine learning, deep learning,
and computer vision. The paper discusses how object detection boosts system performance, from
recognizing realistic gestures to tracking features quickly and supporting real-time analysis and
action. It highlights how the technology can disrupt industries in terms of safety, efficiency, and
automation. The paper also discusses technical difficulties such as dataset issues and variations,
detection accuracy in different environments, and computational overheads, covering the research
areas that attempt to surmount these challenges.
Sanket Kshirsagar et al., “Crowd Monitoring and Alert System” (2024) [12] proposes a real-time
AI/ML-based crowd surveillance system for crowded areas. It employs behaviour analytics and
anomaly detection to identify suspicious behaviour, and security teams are alerted immediately with
instantaneous notifications. The study also balances privacy and ethics with its dedication to public
safety, and points out future directions for crowd monitoring.
Esraa Samkari, Muhammad Arif, Manal Alghamdi, and Mohammed A. Al Ghamdi, “Human Pose
Estimation Using Deep Learning: A Systematic Literature Review” (2023) [13] covers the problems
of Human Pose Estimation (HPE) with deep learning approaches in depth. The work discusses
different models and mechanisms for localizing human joints in an image or video stream, with an
emphasis on sports analysis and surveillance applications. It summarizes more than 100 articles
published from 2014 onward, focusing on single- and multi-task HPE as well as datasets, loss
functions, and pretrained feature extraction models. The review finds that Convolutional Neural
Networks (CNNs) and Recurrent Neural Networks (RNNs) are the most frequently applied
methodologies. It also identifies complications such as occlusion and evaluation on crowded scenes,
which affect model performance, providing remedies and suggesting possible directions for future
research on HPE.
Dushyant Kumar Singh, Sumit Paroothi, Mayank Kumar Rusia, and Mohd. Aquib Ansari, “Human
Crowd Detection for City Wide Surveillance” (2019) [14], presented at the Third International
Conference on Computing and Network Communications (CoCoNet’19), provides an autonomous
solution for improving city-wide surveillance. The paper presents a system concept that makes use
of existing CCTV infrastructure to monitor public places. The system employs computer vision
methods to recognize crimes in video streams, with real-time analysis performed on the feeds,
reducing dependence on frequent manual surveillance by security forces. It includes a prompt
communication mechanism that speeds up responses, issuing alarm signals along with textual
warnings about abnormal activities. The methodology seeks to make surveillance and enforcement
efficient while reducing the need for intensive human supervision.
Li Liu, Wanli Ouyang, Xiaogang Wang, Paul Fieguth, Jie Chen, Xinwang Liu, and Matti Pietikäinen,
“Deep Learning for Generic Object Detection: A Survey” (2019) [15] presents an extensive review
of methodology in object detection fueled by deep learning. A central problem in computer vision,
object detection aims to find instances of predefined categories in images of natural scenes. The
paper surveys more than 300 research papers, covering key aspects such as detection frameworks,
feature extraction, object proposal methods, context modeling, training methodology, and evaluation
metrics. The review's concluding remarks highlight the substantial advances made by deep learning
methods and unravel new directions for future research.
Raspberry Pi:
Madhura Vajarekar, Krutika Patil, Meera Yadate, and T. N. Sawant, “Vehicle Theft Detection using
GSM on Raspberry Pi” (2020) [16] presents a Raspberry Pi 3-based system for low-cost vehicle
theft detection. The system uses a MEMS accelerometer sensor mounted on the target vehicle to
monitor key insertion, engine start, and drive mode. When suspicious activity is detected, the system
sends alert messages to the vehicle owner's mobile number via the Global System for Mobile
Communication (GSM), together with GPS information. The device works in two modes, user mode
and theft mode. The article underscores the system as a low-cost, compact solution for vehicle
security and theft detection.
Kashaboina Radhika and Dr. Velmani Ramasamy, “Bluetooth and GSM based Smart Security
System using Raspberry Pi” (2020) [17] introduces an intelligent security system that combines
Bluetooth and GSM connectivity for expanded security in homes, banks, and businesses. The system
uses a Raspberry Pi for fast data processing, giving it robust processing power and real-time wireless
data access. By pairing Bluetooth for proximity-based access with GSM for remote notifications and
alerts, the system delivers secure and efficient smart door access. This integration of wireless
technologies provides a fast, secure, and reliable smart security application. The paper stresses the
system's effectiveness in boosting security through advanced technology and real-time operation.
R. Sai Sree, P. Chandu, and B. Pranavi, “Real Time Object Detection Using Raspberry Pi” (2023)
[18] looks into real-time object detection on the Raspberry Pi, an important problem in many current
applications such as autonomous vehicles, drones, and smartphones. The paper focuses on the
challenges of object detection on embedded devices that lack memory and computation. A
lightweight detection system is built on an intelligent low-powered device to show that the
Raspberry Pi can perform accurate object recognition with an acceptable loss in performance. The
proposed hardware configuration takes a simple approach that works for detection in 2D and 3D
environments. Referencing popular models such as R-CNN, RetinaNet, and YOLO, the work
demonstrates the rising trend of object detection in computer vision and argues that the Raspberry Pi
should be considered a pragmatic platform for real-time applications, emphasizing the ability of
embedded systems to carry out complex computer vision tasks.
S. Srikanth, Ch. Sai Kumar, N. Uday Rao, and R. Srinivasa Rao, “Raspberry Pi Based Smart
Surveillance System” (2022) [19] introduces a Raspberry Pi home security solution for surveillance.
Easy to access and set up, the system uses a Pi camera to check for human intrusion and emails users
images of the intruder on their mobile device or computer. The Raspberry Pi 3 administers the
security system via Python programming and provides live streaming from the camera through a
local server. Because it is so accessible, this IoT-based method lets the user keep an eye on their
property from anywhere in the world, making the implementation practical and convenient for home
and office security. The Raspberry Pi, running the compact Raspbian OS with straightforward
programming languages, makes a good platform for such development.
GSM Module:
Reference [20] presents a GSM-based alert prototype built around a controller with sensors that
detect threats and critical events. Its GSM capability sends out alerts so that responses can be made
in time, reducing risk through effective communication.
A total of 3 key patents have been examined for this system. In particular, patent [4] focuses on
real-time crowd measurement and management systems, patent [8] addresses camera pose estimation
devices and control methods, and patent [9] explores crowd behaviour anomaly detection based on
video analysis.
A summary of the above-studied patents follows:
Andrew Tatrai, Travis Lachlan Semmens, “Real-time Crowd Measurement and Management Systems
and Methods Thereof” (CA3143637A) [4] presents a real-time crowd measurement and management
system designed to operate across multiple zones. The system uses data-capturing devices to
continuously monitor crowd dynamics and an analysis module to evaluate crowd characteristics,
emotional states, and behavioral trends. The technology enables real-time prediction of emergent
crowd behaviors and potential risks. Predicted outcomes and alerts are then displayed through an
integrated display module. This approach moves beyond traditional passive surveillance, offering
proactive management of crowd safety. However, the system's effectiveness is contingent upon the
precision of its data interpretation models and real-time computational performance, suggesting
opportunities for optimization and integration with intelligent video analytics.
Atsunori Moteki, Nobuyasu Yamaguchi, and Toshiyuki Yoshitake, “Camera Pose Estimation Device
and Control Method” (US10088294B2) [8] describes a device and method for deriving the pose of a
camera using simplified motion models and feature-matching methods across a video sequence. It
computes the changing 3D position and orientation of a camera, a capability required in vision-based
systems such as autonomous vehicles, augmented reality, and surveillance to maintain scene
consistency. However, complex translations and rotations cause drift and pose-resolution issues that
are costly for the system to resolve. These limitations point to the need for more sophisticated
modeling approaches or hybrid sensor-fusion methods to improve accuracy in dynamic or
texture-deficient environments.
Milan Redzic, Jian Tang, Zhilan Hu, Joseph Antony, Haolin Wei, Noel O'Connor, and Alan Smeaton,
“Crowd Anomaly Detection in Video Analysis Based on Front Collection of Observed Persons”
(WO2021069053A1) [9] presents a methodology and system for advanced video analysis to detect
anomalies in crowd behaviour. The approach stacks two feature sets: one extracted from single
images using machine learning models pre-trained on normal crowd behaviour, and the other
derived from optical flow computed over pairs of consecutive frames to capture motion patterns. The
concatenated features are passed to classification algorithms for outlier detection of atypical
behaviour, enabling robust detection of emergent crowd anomalies in real-time surveillance. The
performance of the system relies heavily on the quality of the training dataset and the accuracy of
motion estimation, which suggests room for improvement through deep learning and the integration
of multi-modal sensors.
CHAPTER-3: WORK DONE
3. WORK DONE
The real-time surveillance system works as a streamlined pipeline built for efficient detection and
alerting. The Tapo C200 CCTV camera streams live video over its RTSP URL. The real-time video
feed is processed by the YOLOv8 model, accelerated by the Google Edge TPU for fast inference.
The system offers two modes, Crowd Detection and Violence Detection, which users can switch
between and configure via the Flask-based web portal. When a detection threshold is crossed, the
SIM800L GSM module generates an alert. The system also records images of these time-critical
events, which are made viewable on the Flask instance and in the Flutter application for real-time
updates and monitoring.
The pre-trained YOLOv8n model was fine-tuned using custom datasets from Roboflow for violence
and crowd detection. The datasets were formatted in YOLO style with annotations for training and
validation. A data.yaml file was used to define class names and paths. Transfer learning helped
the model quickly adapt to new detection tasks. The fine-tuned model weights were saved for real-
time video stream inference. This enabled accurate detection of violence and crowd density in live
surveillance.
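The transfer-learning step described above can be sketched with the Ultralytics API as follows. The epoch count and image size here are illustrative assumptions, not the project's actual hyperparameters, and the `training_args` helper is introduced only for this sketch.

```python
# Minimal fine-tuning sketch using the Ultralytics YOLOv8 API.
# Hyperparameter values are illustrative assumptions.

def training_args(data_yaml: str = "data.yaml",
                  epochs: int = 100, imgsz: int = 640) -> dict:
    """Collect and sanity-check the arguments passed to model.train()."""
    if epochs <= 0 or imgsz % 32 != 0:
        raise ValueError("epochs must be positive and imgsz a multiple of 32")
    return {"data": data_yaml, "epochs": epochs, "imgsz": imgsz}

def finetune(**kwargs) -> None:
    from ultralytics import YOLO   # assumes `pip install ultralytics`
    model = YOLO("yolov8n.pt")     # pretrained nano weights as starting point
    model.train(**training_args(**kwargs))
    # Best weights land under runs/detect/train/weights/best.pt and can be
    # reloaded for real-time inference with YOLO("path/to/best.pt").

# finetune(data_yaml="data.yaml", epochs=100)
```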
The block diagram represents the complete architecture as shown in fig 3.2.2 of the surveillance
system, starting from video capture to real-time alerting and monitoring. The Tapo C200 CCTV
Camera streams live video using the RTSP protocol, which is fed into Raspberry Pi 4 for processing.
The YOLOv8n model, fine-tuned for violence and crowd detection, runs on the Raspberry Pi with
acceleration from the Google Edge TPU for fast inference. When an event is detected, an alert is
triggered and sent through the SIM800L GSM Module as an SMS notification. At the same time, a
snapshot of the event is captured and uploaded to the Flask Web Portal, which displays the detection
results in real-time. These snapshots are accessible to the Flutter Mobile Application via the Flask
server's URL, enabling users to view live alerts and detection history remotely. This interconnected
system facilitates efficient real-time detection, instant alerts, and seamless remote monitoring.
● Focus Layer:
This layer down-samples the input image by a factor of 4 at inference time while
preserving the main spatial information, reducing computational cost.
𝑆𝑖𝐿𝑈(𝑥) = 𝑥 · 𝜎(𝑥) = 𝑥 / (1 + 𝑒⁻ˣ)
𝜎(𝑥) = 1 / (1 + 𝑒⁻ˣ)
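These activation functions can be expressed directly in code; a minimal sketch:

```python
import math

def sigmoid(x: float) -> float:
    # σ(x) = 1 / (1 + e^(−x))
    return 1.0 / (1.0 + math.exp(-x))

def silu(x: float) -> float:
    # SiLU(x) = x · σ(x)
    return x * sigmoid(x)

print(sigmoid(0.0))  # 0.5
print(silu(0.0))     # 0.0
```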
● C2f Module (Cross Stage Partial Network):
This module promotes efficient feature reuse by splitting and merging feature maps, thus
reducing redundant computations.
𝑌 = 𝑆𝑖𝐿𝑈 (𝐵𝑎𝑡𝑐ℎ𝑁𝑜𝑟𝑚(𝐶𝑜𝑛𝑣(𝑋)))
The neck aggregates features at multiple scales to improve detection performance for objects of
different sizes.
FPN formula:
𝑃ᵢ = 𝐶𝑜𝑛𝑣(𝐹ᵢ) + 𝑈𝑝𝑠𝑎𝑚𝑝𝑙𝑒(𝑃ᵢ₊₁)
PANet formula:
𝑃ᵢ = 𝐶𝑜𝑛𝑣(𝐶𝑜𝑛𝑐𝑎𝑡(𝑃ᵢ , 𝐹ᵢ))
where 𝐹ᵢ denotes the feature map at scale i, and 𝑃ᵢ represents the aggregated feature map at scale i.
The head predicts bounding boxes, objectness scores, and class probabilities for each grid cell.
These calculations ensure bounding boxes accurately represent object location and scale relative to
the input image.
3.4 DATASET
The annotated dataset, obtained from Roboflow, was exported in YOLOv8 PyTorch format, which contains
the images, the annotations, and a configuration file (data.yaml). Roboflow offers an API to download
the dataset programmatically, so the data is prepared in a consistent way. The dataset was accessed
through the Roboflow client using an API key, workspace, project, and dataset version. The download
script is as follows:
from roboflow import Roboflow
rf = Roboflow(api_key="YOUR_ROBOFLOW_API_KEY")
project = rf.workspace().project("PROJECT_NAME")
dataset = project.version(1).download("yolov8")  # 1 = dataset version number
This process automatically prepared the dataset folder structure and generated the necessary files for
YOLOv8 training.
For training, the Ultralytics YOLOv8 framework was utilized. A pretrained yolov8n.pt model
served as the starting point, enabling transfer learning to improve efficiency and accuracy. The training
procedure was initiated by specifying the dataset configuration file and training parameters such as
epochs, image size, and batch size:
Python Code:
from ultralytics import YOLO
model = YOLO("yolov8n.pt")
model.train(data="data.yaml", epochs=100, imgsz=640, batch=16)  # example parameter values
This method ensured a reproducible training pipeline with minimal manual intervention, facilitating
consistent results and allowing for efficient optimization of the crowd detection model.
The Crowd Detection Dataset (a head-based dataset), taken from Roboflow Universe, is designed to
detect human heads in crowded scenes with cluttered backgrounds. It is well suited for person
counting because it contains high-resolution images with bounding boxes marking head locations. The
annotations are in YOLO (You Only Look Once) format, allowing seamless integration with YOLO-family
architectures such as YOLOv5 and YOLOv8. The dataset covers diverse backgrounds and camera
viewpoints, making the model more robust in deployment scenarios, and the data augmentation
techniques applied during training help the model generalize well. In this project, the dataset was
used to fine-tune the YOLOv8 model for real-time crowd detection by counting the number of visible
heads, providing both accuracy and speed in surveillance applications.
The GunW Dataset, also taken from Roboflow Universe, was collected for detecting firearms in
different environments, which makes it highly applicable to violence detection. Guns in the images
are labeled with bounding boxes so that object detection models can be trained to detect and locate
firearms correctly. Annotations are in YOLO format, allowing easy use with YOLO-family architectures
(e.g., YOLOv5, YOLOv8). The dataset features a wide range of conditions, including varying
backgrounds, lighting, and viewpoints, which helps the model perform better in real-life settings.
In this project, the GunW Dataset was used to train the YOLOv8 model for real-time violence
detection, so the system can detect armed threats quickly and accurately.
The SIM800L module handles alerting in the system. It is configured to send an SMS to the user in
real time when the crowd threshold is crossed or violence is detected, raising awareness as quickly
as possible.
3.5.4 RASPBERRY PI 4:
3.5.5 BUZZER:
A buzzer is included in the system to give a prompt audible alert to people
on site when a critical event is detected. It serves as a local warning
system to signal the public in an immediate emergency.
3.5.8 BATTERY:
The 3.7V Li-Po battery powers the SIM800L GSM Module, preventing the
unstable voltage drops and sudden restarts that can occur while transmitting
SMS. It provides a stable connection even under heavy power draw.
The training process involved fine-tuning the YOLOv8 model using datasets specifically curated for
crowd detection and violence identification. The model was trained using Google Colab with
annotated data for accurate object detection. After training, the model was optimized and exported in
.pt format.
In Crowd Detection mode, the system monitors the number of people in a frame and triggers alerts if
the count exceeds the user-defined threshold. In Violence Detection mode, the model actively looks
for aggressive behavior patterns and triggers alerts if violence is detected.
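The alert decision in Crowd Detection mode reduces to comparing the per-frame person count against the configured threshold, typically with a cooldown so one ongoing event does not flood the GSM module with SMS. A minimal sketch (the names and the cooldown value are illustrative, not from the report):

```python
import time
from typing import Optional

def should_alert(person_count: int, threshold: int) -> bool:
    """Trigger when the detected head count exceeds the user-defined threshold."""
    return person_count > threshold

class AlertDebouncer:
    """Suppress repeat alerts within a cooldown window (value is illustrative)."""
    def __init__(self, cooldown_s: float = 60.0):
        self.cooldown_s = cooldown_s
        self._last = -float("inf")

    def fire(self, now: Optional[float] = None) -> bool:
        # Returns True only if enough time has passed since the last alert.
        now = time.monotonic() if now is None else now
        if now - self._last >= self.cooldown_s:
            self._last = now
            return True
        return False

print(should_alert(8, 10))   # False
print(should_alert(12, 10))  # True
```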
The Flask-driven Web Portal is the UI for configuring the system. The user can enter the RTSP URL
for the live feed, select the detection mode, and set the crowd-size threshold. The portal also
shows the live video stream from the Tapo C200 CCTV Camera and updates in real time.
The SIM800L GSM Module is connected to the Raspberry Pi 4 to enable real-time alerting through
SMS notifications. The communication is established using UART (Universal Asynchronous
Receiver-Transmitter) protocol, which allows serial communication between the two devices. Below
is the connection setup:
Wiring Configuration:
● SIM800L VCC → 5V Power Supply (or 3.7V Li-Po Battery for stability)
Communication Setup:
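Sending the SMS over this UART link uses the SIM800L's standard text-mode AT commands. The sketch below builds the command sequence as a pure function; the phone number is a placeholder, and actually writing the commands to the port would use a serial library such as pyserial, which is an assumption here:

```python
CTRL_Z = "\x1a"  # terminates the SMS body in text mode

def sms_command_sequence(number: str, text: str) -> list:
    """AT command sequence to send one SMS in text mode on a SIM800L."""
    return [
        "AT",                    # check the module responds
        "AT+CMGF=1",             # switch to SMS text mode
        f'AT+CMGS="{number}"',   # start a message to this recipient
        text + CTRL_Z,           # message body, terminated with Ctrl+Z
    ]

cmds = sms_command_sequence("+911234567890", "ALERT: crowd threshold exceeded")
# Each command would be written to the Pi's UART (e.g. /dev/serial0 at 9600 baud)
# followed by "\r\n", waiting for the module's "OK"/">" response between steps.
```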
CHAPTER-4: RESULTS AND DISCUSSION
1. Crowd Detection
2. Violence Detection
Users interact with the web portal through an intuitive interface built with HTML and CSS, where
they can:
● Input the RTSP URL of the CCTV camera to initiate the live feed.
To achieve real-time inference, the system is optimized with a Google Edge TPU. This accelerates the
processing speed of YOLOv8, enabling fast and accurate detections on the Raspberry Pi 4, which is
essential for live video analysis.
● The alert includes a snapshot of the event, captured at the moment of detection.
● The Flask application serializes the snapshot data to JSON and makes it available through
a REST API.
● This data is accessible from a Flutter app, where the user can view every captured image by
simply entering the Flask server URL.
● This functionality ensures that users have visual evidence of the incident, accessible in real-
time from their mobile application.
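The payload served to the Flutter app can be sketched as a JSON listing of the captured snapshots; the field names and directory layout here are illustrative assumptions, not taken from the report:

```python
import json
import os

def snapshot_payload(snapshot_dir: str, base_url: str) -> str:
    """Build the JSON a REST endpoint could return: one entry per saved snapshot."""
    entries = [
        {"file": name, "url": f"{base_url}/snapshots/{name}"}
        for name in sorted(os.listdir(snapshot_dir))
        if name.endswith(".jpg")
    ]
    return json.dumps({"count": len(entries), "snapshots": entries})
```

A Flask route would simply return this string (or use `jsonify` on the same structure), and the Flutter app fetches it from the server URL the user enters.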
Inference time, on the other hand, measures how long the model takes to process a single image;
low inference time is necessary for live applications (e.g., surveillance). Model size is the
storage footprint of the trained model, which is especially important for edge devices like the
Raspberry Pi. The F1-score is the harmonic mean of precision and recall (2 × P × R / (P + R)) and
provides a single all-round measure of how good the model is. Finally, IoU (Intersection over
Union) [7] underlies most of the above metrics: it measures the overlap between the predicted
bounding box and the ground-truth box. Together, these metrics quantify whether the YOLOv8
framework detects objects, localizes them correctly, and does so fast.
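Both IoU and the F1-score are straightforward to compute; a minimal sketch with boxes given as (x1, y1, x2, y2):

```python
def iou(a, b):
    """Intersection over Union of two boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall: 2PR / (P + R)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # 1/7 ≈ 0.1429 (overlap 1×1, union 4 + 4 − 1)
print(f1_score(0.8, 0.6))               # ≈ 0.6857
```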
4.7 Discussion:
This system includes a real-time surveillance setup that efficiently monitors live video feeds from the
Tapo C200 CCTV camera using a structured workflow. The primary objective of the system is to
ensure safety and crowd control by detecting the number of people and any violent activities in real
time.
The Flask-based web portal accepts user inputs for operating the system. Users provide the RTSP
URL of the camera feed, choose between the two detection modes (Crowd or Violence Detection), and
set their own crowd-size threshold. A friendly interface, written in HTML and CSS, activates the
video stream with a single click.
Once the RTSP URL has been given, YOLOv8 processes the live feed, with the Google Edge TPU used to
accelerate inference. This enables the accurate, real-time detection required for live monitoring
use cases. When the number of detected individuals exceeds the user-defined threshold, or violence
is detected, the system lights up a red LED and uses the SIM800L GSM module to send an SMS
notification to the user's phone. At the same time, it captures a snapshot of the event and saves
it on the server. The Flask application converts this snapshot data to JSON and exposes it through
an API.
Users simply paste the Flask server URL into the Flutter application, and the incident snapshots
become visible directly through this API integration. This ensures that users are alerted in real
time and have visual proof of the incident when required, improving situational awareness.
The critical-event path is designed for low latency and high reliability, reflecting the system's
ability to react within milliseconds. Using Flask for the backend also eases communication between
the detection module, the GSM alerts, and the mobile interface, making the overall architecture
robust and scalable.
CHAPTER-5: SUMMARY AND DISCUSSION
This chapter summarizes the vision, design, and development of a real-time crowd detection and
alert system intended to improve safety and security in busy areas such as public places and
educational institutions. The system builds on real-time video feeds from standard CCTV cameras to
observe crowd density and detect violent activity on the go. Specifically, it adopts a simple yet
powerful object detection framework based on YOLOv8, a state-of-the-art algorithm for fast and
accurate multi-object detection.
A web-based portal for user interaction and customization was developed using basic web
technologies, with HTML for structure and CSS for presentation. Through RTSP, the portal delivers
live video from connected CCTV streams to a user-friendly interface in a continuous, low-latency
way. Its most useful feature is that users can configure crowd-density threshold limits. These
thresholds reflect the safety regulations or operational requirements of the monitored area and
tell the system when an instance of overcrowding has been detected.
The alert mechanism of the system is a Global System for Mobile communication (GSM) module
connected to the central processing unit, a Raspberry Pi single-board computer running Raspbian.
The Raspberry Pi acts as the command center: when the YOLOv8 algorithm identifies either a crowd
volume above the threshold or recognizable signs of violence, it sends an immediate notification
via the GSM module to the named recipients. The purpose is to inform staff as soon as possible so
they can act quickly against potential safety hazards or incidents.
The review of the state of the art (subsection IV-A) rests on an extensive literature survey used
to select these technologies: YOLOv8 for intelligent video analysis, the Raspberry Pi for
processing and surveillance at a fraction of the usual cost, RTSP for rapid web-based monitoring,
and GSM for reliable out-of-band alerts. The review traces the evolution of the YOLO algorithms,
state-of-the-art methods for object and crowd detection, the use of the Raspberry Pi in
surveillance systems, the use of GSM modules for wireless data transmission, and RTSP streaming
for near real-time video.
5.2 CONCLUSION:
● In real time, the system efficiently detects crowd density using YOLOv8 and raises an alert
when the crowd exceeds the safe threshold.
● It also has a violence detection module trained on labeled data to detect violence in videos.
● Although violence detection runs offline rather than in real time, it remains valuable for
post-event analysis, especially in educational institutes where student safety is a major concern.
● Integrating object detection models with live camera feeds in real time was technically
difficult.
● Creating a violence detection model required a large amount of labeled video data and
computational power.
● A significant challenge was making the system handle both fast online crowd detection and
offline video-based violence detection.
● Crowd patterns at schools and colleges differ, so customizing the system for a classroom or
college environment requires careful fine-tuning.
● The violence detection feature is offline, so it cannot notify police in real time.
● The system is mainly designed for the educational context and must be reconfigured for other
environments.
● Detection quality depends on the quality and diversity of the datasets used, particularly for
violence detection, which limits generalization to new scenarios.
● Optimize the models or use edge computing devices to move violence detection to real time.
● Expand the dataset with more in-house footage (i.e., violent incidents recorded within the
institute) to improve detection accuracy.
● Develop behavior prediction algorithms to foresee potential risks at school and college
gatherings.
● Extend the system with crowd flow analysis and access control integration on educational
campuses.
● The combination of real-time crowd management and offline violence detection gives the system
a dual benefit: safer environments within educational institutions.
● Low-cost implementation using existing surveillance infrastructure and open-source tools.
● Reduces dependency on manual supervision and lets staff devote themselves to prevention and
rescue.
● Highly flexible and can be scaled to different types of educational setups: schools, colleges,
universities.
● Developed primarily with educational institutes in mind, the system includes built-in crowd
analysis to enhance campus safety after every event.
● Future work will focus on making violence detection real time and ready for live learning
environments.
● Extending to multi-camera, multi-location setups across multiple institutes or campuses.
● Integrating with emergency alert systems for prompt notification of security staff.
● Admin dashboards for analyzing trends in crowd behavior and historical violence incident
reports at deeper levels.
SOCIAL UTILITY
APPENDIX
Kanak Arora
Email: [email protected] | Phone no: 9309524723 | LinkedIn: kanak-arora | GitHub: arorakanak
Education:
Yeshwantrao Chavan College of Engineering, B-Tech in Artificial Intelligence and Data Science,
CGPA: 7.30, December 2021 - May 2025.
HSC, Prerana Junior College, Percentage: 93.67, August 2021.
SSC, Swami Awadheshanad Public School, Percentage: 81.60, May 2019.
Projects:
Final Year Project | Crowd Detection and Notifier System | Aug 2024 – Present
Developing a real-time surveillance system using Raspberry Pi, YOLOv8, GSM module, and WebRTC
to monitor crowd size and detect violent behaviour in high-density areas. Instantly sends alerts and
streams live video via a user-friendly HTML/CSS web portal.
Crop Production Analysis in India | Power BI | June 2024
Conducted an in-depth analysis of crop production trends in India using Power BI for data
visualization.
Analysed datasets to identify patterns, regional variations, and key factors influencing crop yield.
Technical Skills:
Languages: C, Python, HTML, CSS, R, JavaScript.
Technologies & Tools: Microsoft Power BI, MySQL, VSCode, Google Colab.
Certifications:
● Introduction to Deep learning.
● Data Visualization for Deep Learning using Power BI and Tableau, VNRVJIET, Hyderabad.
Extracurricular Activities:
● Vice President, Nrutyakala, YCCE (Present).
● Co-head (Content Writer), Nrutyakala (YCCE) – 2023-24.
● Organizer, YIN (YCCE) – 2023-24.
● Visharad in Kathak, ABGMV, Mumbai – November 2023.
Nidhi Sakhare
Email: [email protected] | Phone no: 9730109343 | LinkedIn: NidhiSakhare | GitHub: NidhiSakhare
Education:
Yeshwantrao Chavan College of Engineering, B-Tech in Artificial Intelligence and Data Science
HSC, St. George College.
SSC, Guru Nanak High School.
Experience:
Java Developer Intern | Informatrix IT Solution Pvt Ltd | Jan 2025 – Present
● Learned core Java concepts, OOP principles, and real-time project development. Gained hands-on
experience with tools like NetBeans, Apache Tomcat, and MySQL.
● Exploring 3-tier architecture by working on backend logic, frontend integration, and database
connectivity.
Projects:
Final Year Project | Crowd Detection and Notifier System | Aug 2024 – Present
Developing a real-time surveillance system using Raspberry Pi, YOLOv8, GSM module, and WebRTC to
monitor crowd size and detect violent behavior in high-density areas. Instantly sends alerts and streams
live video via a user-friendly HTML/CSS web portal.
DIWALI SALES ANALYSIS | JULY 2024
Performed Exploratory Data Analysis (EDA) to analyze sales trends by state, city, gender, age group, and
marital status using pandas and matplotlib.
Technical Skills:
Languages: C, Python, HTML, CSS, R, JavaScript.
Technologies & Tools: Microsoft Power BI, MySQL, Pandas, Numpy, Databases, Data visualizations,
Data Analysis, Microsoft Excel, VS Code, Google Colab.
Certifications:
● Introduction to Deep Learning.
Extracurricular Activities:
● Completed an online short-term course on Data Visualization for Deep Learning Using Power BI
and Tableau conducted by the Dept. of CSE, NIT Warangal, and Dept. IT, VNRVJIET,
Hyderabad.
● Completed a virtual internship on Data Visualization: Empowering Business with Effective
Insights by TATA, where I gained hands-on experience in using data visualizations to take
informed decisions.
HIMANSHU DHOMANE
Email: [email protected] | Phone: (+91) 8788663472 | GitHub: https://www.github.com/himanshuio
PROJECTS
Crowd Detection and Notifier System (Flask, HTML, CSS, YOLOv8n, Raspberry Pi 4, Google Coral Edge TPU, SIM800L)
● Developed a real-time crowd monitoring system using Raspberry Pi 4, Coral Edge TPU, Sim800L and
YOLOv8n(pretrained) to analyze CCTV footage via RTSP.
● Additionally integrated a violence detection model trained on Roboflow dataset with 7000 images and also
implemented SMS alerts using the SIM800L GSM module.
Shophouse E-commerce Website (Flask, PostgreSQL, HTML/CSS, Render) View: https://shophouse-xh8n.onrender.com/
● Developed an e-commerce platform with product display, cart management, user authentication, and a
responsive frontend using HTML and CSS.
● Used Flask for the backend, managed the database with PostgreSQL (via DBeaver), and deployed the
website on Render for public access.
Notes Classifier (TensorFlow, Keras, Streamlit, Google Colab)
● Built a CNN-based notes classifier trained on 390 'Notes' images and 392 'Others' images for accurate
classification.
● Designed a Streamlit-based UI allowing users to upload images and classify them as 'Notes' or 'Others'.
TECHNICAL SKILLS
Java | Python | SQL | HTML/CSS | Flask | Dart | Flutter | MySQL | Git/GitHub | Figma
WORK EXPERIENCE
OceanZen — Flutter Intern (January 2025 – Present)
● Learned Dart fundamentals, Stateless and Stateful widgets, layouts, and navigation in Flutter.
● Gained knowledge of OOP concepts and implemented API integration to fetch and display real-time
cricket data in Flutter.
EDUCATION
Yeshwantrao Chavan College of Engineering, Nagpur, India CGPA: 6.78
(Expected May2025)
B.Tech. in Artificial Intelligence & Data Science Engineering
Balaji Junior College, Butibori, Nagpur, India Percentage:
71.83% (May 2021)
HSC
Holy Cross English Medium High School, Butibori, Nagpur, India Percentage: 73.20%
(May 2019)
SSC
EXTRA-CURRICULAR ACTIVITIES
● National Level UI/UX Competition — Marathwada Mitramandal College of Engineering, Pune
Designed a UI in Figma for a student course explorer app as part of a problem statement challenge.
● Innovation ‘R’ Us — Yeshwantrao Chavan College of Engineering, Nagpur
Presented a Period Tracker app as a team of four at Innovation 'R' Us.
● International Conference on Advances in Computing, Control & Telecommunication Technologies
(ACT 2025) – YCCE, Nagpur
Presented the research paper titled "Crowd Detection and Notifier System” in international conference.
REFERENCES
[1] Jiang, Peiyuan, Daji Ergu, Fangyao Liu, Ying Cai, and Bo Ma. 2021. “A Review of YOLO
Algorithm Developments.” The 8th International Conference on Information Technology and
Quantitative Management (ITQM 2020 & 2021). Published by Elsevier B.V, doi:
10.1016/j.procs.2022.01.135.
[2] Gu, Kaiming, and Boyu Su. 2024. “A Study of Human Pose Estimation in Low-Light
Environments Using the YOLOv8 Model.” International Engineering College, Xi’an University of
Technology & School of Intelligent Engineering, Zhengzhou University of Aeronautics, doi:
10.54254/2755-2721/32/20230200.
[3] Gündüz, M.Ş., Işık, G. “A new YOLO-based method for real-time crowd detection from video and
performance analysis of YOLO models”. J Real-Time Image Proc(2023).
[5] F. Sultana, A. Sufian, P. Dutta. “A Review of Object Detection Models Based on Convolutional
Neural Networks” (2019), arXiv, 2019, arXiv:1905.01614.
[6] Yufei Xu, Jing Zhang, Qiming Zhang, Dacheng Tao. “ViTPose: Simple Vision Transformer
Baselines for Human Pose Estimation”(2022), arXiv, 2022, arXiv:2204.12484v3.
[7] H. Ghahremaninezhad, H. Shi and C. Liu, “Real-Time Accident Detection in Traffic Surveillance
Using Deep Learning,” 2022 IEEE International Conference on Imaging Systems and Techniques
(IST), Kaohsiung, Taiwan, 2022, pp. 1-6, doi:10.1109/IST55454.2022.9827736.
[8] İrem Üstek, Jay Desai, Iván López Torrecillas, Sofiane Abadou, Jinjie Wang, Quentin Fever,
Sandhya Rani Kasthuri, Yang Xing, Weisi Guo, Antonios Tsourdos. “Two-Stage Violence Detection
Using ViTPose and Classification Models at Smart Airports” (2022), IEEE Access, arXiv, 2022.
[9] L. Jiao et al., “A Survey of Deep Learning-Based Object Detection,” in IEEE Access, vol. 7, pp.
128837-128868, 2019, doi: 10.1109/ACCESS.2019.2939201.
[10] X. Zhou, W. Gong, W. Fu, and F. Du, “Application of deep learning in object detection,” 2017
IEEE/ACIS 16th International Conference on Computer and Information Science (ICIS), Wuhan,
China, 2017, pp. 631-634, doi: 10.1109/ICIS.2017.7960069.
[11] Abdul Vahab, Maruti S Naik, Prasanna G Raikar, Prasad S R. “Applications of Object Detection
System” (2019).
[12] Sanket Kshirsagar, Rushikesh Matele, Atharva Patil, Prof. B.B. Waghmode. “Crowd Monitoring
and Alert System” (2024).
[13] Esraa Samkari, Muhammad Arif, Manal Alghamdi, Mohammed A. Al Ghamdi. “Human Pose
Estimation Using Deep Learning: A Systematic Literature Review”(2023), Mach. Learn. Knowl. Extr.
2023, 5(4), 1612-1659.
[14] Dushyant Kumar Singh, Sumit Paroothi, Mayank Kumar Rusia, Mohd. Aquib Ansari. “Human
Crowd Detection for City Wide Surveillance” (2020), doi: 10.1016/j.procs.2020.04.036.
[15] Liu, L., Ouyang, W., Wang, X. et al. Deep Learning for Generic Object Detection: A Survey. Int
J Comput Vis 128, 261–318 (2020), doi: 10.1007/s11263-019-01247-4.
[16] Krutika Patil, Madhura Vajarekar, Meera Yadate, T. N. Sawant “Vehicle Theft Detection Using
GSM on Raspberry Pi” Iconic Research And Engineering Journals Volume 3 Issue 11 2020 Page 119-
124.
[17] Kashaboina Radhika and Ramasamy Velmani. 2020. IOP Conf. Ser.: Mater. Sci. Eng. 981
042009, doi: 10.1088/1757-899X/981/4/042009.
[18] R. Sai Sree, P. Chandu, B. Pranavi. “Real-Time Object Detection Using Raspberry Pi”
(2023).
[19] S. Srikanth, Ch. Sai Kumar, N. Uday Rao, R. Srinivasa Rao. “Raspberry Pi Based Smart
Surveillance System” (2022).
[21] H. Fateh Ali Khan, A. Akash, R. Avinash, and C. Lokesh, “WebRTC Peer to Peer Learning,”
Department of Information Technology, Valliammai Engineering College, Chennai, India, 2020.
[22] A. Tatrai and T. L. Semmens, "Real-time crowd measurement and management systems and
methods thereof," Patent, Australia, filed July 24, 2019, published 2021, CA3143637A.
[23] A. Moteki, N. Yamaguchi, and T. Yoshitake, "Camera pose estimation device and control
method," Patent (US), United States, filed October 12, 2023, published May 10, 2024,
US10088294B2.
[24] M. Redzic, J. Tang, Z. Hu, J. Antony, H. Wei, N. O’Connor, and A. Smeaton, "Crowd behavior
anomaly detection based on video analysis," Patent (WIPO International), filed October 7, 2019,
published April 15, 2021, WO2021069053A1.