Uday Nani
BACHELOR OF TECHNOLOGY
in
by
G. VIVEK 21K81A7415
V. ARAVIND 22K85A7404
R. UDAY 22K85A7407
Mr. V. SATHISH
ASSISTANT PROFESSOR
DEPARTMENT OF CSD
NOVEMBER - 2024
St. MARTIN'S ENGINEERING COLLEGE
UGC Autonomous
Affiliated to JNTUH, Approved by AICTE
NBA & NAAC A+ Accredited
Dhulapally, Secunderabad - 500 100
Certificate
This is to certify that the project entitled “REAL TIME VIDEO BASED VIOLENCE
DETECTION SYSTEM IN PUBLIC AREA USING NEURAL NETWORK” is being
submitted by G. VIVEK (21K81A7415), V. ARAVIND (22K85A7404), R. UDAY
(22K85A7407) in fulfilment of the requirement for the award of the degree of BACHELOR
OF TECHNOLOGY IN COMPUTER SCIENCE AND DESIGN and is a record of
bonafide work carried out by them. The results embodied in this report have been verified
and found satisfactory.
Place:
Date:
St. MARTIN'S ENGINEERING COLLEGE
UGC Autonomous
Affiliated to JNTUH, Approved by AICTE,
Accredited by NBA & NAAC A+, ISO 9001:2008 Certified
Dhulapally, Secunderabad - 500 100
DECLARATION
We, the students of “Bachelor of Technology in Department of Computer Science and Design”,
session: 2021 - 2025, St. Martin’s Engineering College, Dhulapally, Kompally, Secunderabad,
hereby declare that the work presented in this project work entitled “REAL TIME VIDEO BASED
VIOLENCE DETECTION SYSTEM IN PUBLIC AREA USING NEURAL NETWORK” is the
outcome of our own bonafide work and is correct to the best of our knowledge, and that this work
has been undertaken taking care of Engineering Ethics. The results embodied in this project report
have not been submitted to any university for the award of any degree.
G. VIVEK 21K81A7415
V. ARAVIND 22K85A7404
R. UDAY 22K85A7407
ACKNOWLEDGEMENT
The satisfaction and euphoria that accompany the successful completion of any task
would be incomplete without the mention of the people who made it possible and whose
encouragement and guidance have crowned our efforts with success.
First and foremost, we would like to express our deep sense of gratitude and indebtedness
to our College Management for their kind support and permission to use the facilities available in
the Institute.
We especially would like to express our deep sense of gratitude and indebtedness to
Dr. P. SANTOSH KUMAR PATRA, Professor and Group Director, St. Martin’s Engineering
College, Dhulapally, for permitting us to undertake this project.
We are also thankful to Dr. P. SAI PRASAD, Head of the Department, Computer
Science and Design, St. Martin’s Engineering College, Dhulapally, Secunderabad, for his
support and guidance throughout our project as well as Project Coordinator Dr. B. CHANDRA
SHEKAR, Professor, Computer Science and Design department for his valuable support.
We would like to express our sincere gratitude and indebtedness to our project supervisor
Mr. V. SATHISH, Professor, Information Technology, St. Martin’s Engineering College,
Dhulapally, for his support and guidance throughout our project.
Finally, we express our thanks to all those who helped us complete this project
successfully. We would also like to thank our family and friends for their moral support and
encouragement.
G. VIVEK 21K81A7415
V. ARAVIND 22K85A7404
R. UDAY 22K85A7407
ABSTRACT
The increasing concerns over public safety have necessitated the development of
advanced surveillance systems capable of real-time violence detection. This paper
presents a real-time video-based violence detection system leveraging the capabilities of
YOLO (You Only Look Once) and Convolutional Neural Networks (CNN). The system
aims to enhance public security by accurately identifying violent actions in surveillance
footage. YOLO, a state-of-the-art object detection framework, is utilized for its high-
speed processing and efficiency in detecting objects within frames. Combined with
CNNs, which excel in feature extraction and classification, the system is designed to
recognize and categorize violent behaviors effectively. The architecture integrates a
YOLO-based detector to identify and track individuals in the video stream, while a CNN
model is employed to analyze the behavior of detected individuals and classify actions as
violent or non-violent.
The system operates in real-time, processing video feeds from public area surveillance
cameras to provide immediate alerts when violent activities are detected. This capability
not only aids in swift response and intervention but also enhances the overall
effectiveness of public safety measures. The proposed solution is validated through
extensive experimentation, demonstrating its robustness and accuracy in various
scenarios. This paper outlines the design, implementation, and performance evaluation of
the system, highlighting its potential impact on public area surveillance and security.
CONTENTS
ACKNOWLEDGEMENT
ABSTRACT
LIST OF FIGURES
LIST OF TABLES
1.2 Overview
4.1 Database
4.2 CNN Algorithm
4.3 Design
INTRODUCTION
In today's digital era, the prevalence of video content across online platforms has
grown significantly, drawing attention to the need for efficient, automated violence
detection systems. Identifying violence in real-world recordings is essential for
protecting users, safeguarding online communities, and helping law enforcement
authorities respond quickly to harmful content. Researchers have turned to pre-trained
models as a possible approach to this problem, relying on their capacity to generalize
from in-depth training on substantial datasets. Public video surveillance systems are
widely used worldwide and can give accurate and detailed information in various security
applications. However, the need to review hours of video material compromises the
capacity to make decisions quickly, which is essential in video surveillance for the
prevention of crime and violence.
CNNs are developing quickly in several research areas, and it is anticipated that future
solutions will accelerate their adoption. Given the availability of massive data and the
rapid increase in computing power, these learning algorithms still have considerable room
for progress. Spatial CNNs, which are used for image recognition tasks, have recently
been extended to the temporal domain for human activity recognition (HAR) in videos
using a variety of effective methods.
OBJECTIVE
Video and image processing technology has advanced at an unprecedented pace because of
its importance in extracting intricate content for applications such as search,
summarization, and action recognition. The emphasis on recognizing actions from video
streams has grown in recent years with the rise of violent acts, ranging from attacks by
terror groups to attacks by one or more individuals. This has led to the ubiquitous use
of surveillance cameras throughout the world. The footage these cameras stream is
inspected manually for violent acts, which is tedious and not feasible in the long run
if the operation is to scale. Manual inspection is also error-prone: operators can make
mistakes, or miss a significant event while inspecting other feeds.
There are millions of surveillance cameras around the world, so even a low error rate
can leave thousands of people at risk. This rough estimate prompted us to consider the
current state of violence detection systems and methodologies. It calls for fast
detection of violence without relying on human operators. Hence, we turn to deep
learning methods, which learn from data on their own and can be used to detect
potentially violent activity before a human observer would notice it. Our system
proposes to use deep learning algorithms to detect violent activity automatically.
This involves several stages, such as object detection, action detection, and video
classification. In this system we propose methodologies that can recognize violent
threats and activities using deep learning.
OVERVIEW
The project puts in place a violence detection system that is accurate and efficient,
able to scan real-world videos and distinguish between violent and nonviolent content.
It includes a mail notification system that, when violence is identified in a video,
delivers quick alerts to a list of recipients. The aim of this work is to examine and
develop a violence detection system based on real-world video.
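The mail notification step could be sketched as follows, using only Python's standard library. The report does not specify the sender address, SMTP host, or message layout, so these values and the helper names `build_alert` and `send_alert` are assumptions made for illustration.

```python
import smtplib
from email.message import EmailMessage


def build_alert(recipients, camera_id, timestamp, confidence,
                sender="alerts@example.com"):
    """Compose an alert e-mail for one detected violent event.

    The sender address and message wording are hypothetical.
    """
    msg = EmailMessage()
    msg["Subject"] = f"Violence detected on camera {camera_id}"
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg.set_content(
        f"Violence was detected on camera {camera_id}.\n"
        f"Time: {timestamp}\n"
        f"Model confidence: {confidence:.0%}\n"
    )
    return msg


def send_alert(msg, host="localhost", port=25):
    """Deliver the message through an SMTP server (host and port are assumed)."""
    with smtplib.SMTP(host, port) as server:
        server.send_message(msg)
```

Keeping message construction separate from delivery lets the alert content be checked without a mail server; in a deployment, `send_alert` would be called only when the classifier flags a clip as violent.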
The study's objectives are as follows.
Create and improve pre-trained models: The project will concentrate on choosing
appropriate pre-trained models for violence detection, such as two-stream CNNs or 3D
CNNs. Using annotated datasets of violent and nonviolent videos, these models will be
refined and tailored to the unique properties of real-life settings.
Improve accuracy and robustness.
A Convolutional Neural Network (CNN) is used to extract frame-level features from a
video, which are then aggregated using a variant of Long Short-Term Memory (LSTM)
that uses convolutional gates. The CNN and LSTM are used together to analyse local
motion in a video.
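The aggregation of frame-level features by an LSTM with convolutional gates could be sketched as below. The report does not state which variant is used, so this follows the standard convolutional LSTM formulation without peephole connections. Random arrays stand in for the CNN feature maps, and the channel counts, kernel size, and names (`ConvLSTMCell`, `aggregate`) are hypothetical.

```python
import numpy as np


def conv2d_same(x, w):
    """Zero-padded 'same' 2-D cross-correlation.

    x: (C_in, H, W); w: (C_out, C_in, k, k) with odd k.
    """
    c_out, _, k, _ = w.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    _, h, wd = x.shape
    out = np.zeros((c_out, h, wd))
    for dy in range(k):
        for dx in range(k):
            patch = xp[:, dy:dy + h, dx:dx + wd]  # shifted view, (C_in, H, W)
            out += np.einsum("oc,chw->ohw", w[:, :, dy, dx], patch)
    return out


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


class ConvLSTMCell:
    """LSTM cell whose gates are convolutions over [input, hidden state]."""

    def __init__(self, in_ch, hid_ch, k=3, seed=0):
        rng = np.random.default_rng(seed)
        # One kernel bank yields all four gates: input, forget, output, candidate.
        self.w = rng.normal(0.0, 0.1, (4 * hid_ch, in_ch + hid_ch, k, k))
        self.b = np.zeros((4 * hid_ch, 1, 1))
        self.hid_ch = hid_ch

    def step(self, x, h, c):
        z = conv2d_same(np.concatenate([x, h], axis=0), self.w) + self.b
        i, f, o, g = np.split(z, 4, axis=0)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)  # update cell memory
        h = sigmoid(o) * np.tanh(c)                   # new hidden state
        return h, c


def aggregate(frame_features, cell):
    """Run the cell over per-frame feature maps of shape (T, C, H, W)."""
    _, _, hgt, wid = frame_features.shape
    h = np.zeros((cell.hid_ch, hgt, wid))
    c = np.zeros_like(h)
    for x in frame_features:
        h, c = cell.step(x, h, c)
    return h  # final hidden state summarises the clip


# Stand-in for CNN output: 8 frames, 16 channels, 7x7 feature maps.
features = np.random.default_rng(1).normal(size=(8, 16, 7, 7))
summary = aggregate(features, ConvLSTMCell(in_ch=16, hid_ch=32))
```

Because the gates are convolutions rather than dense layers, the hidden state keeps its spatial layout, which is what allows the model to track where in the frame motion occurs. A classifier head (not shown) would map `summary` to a violent or non-violent label.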
CHAPTER 2
LITERATURE SURVEY
S. No | Authors | Title | Year
1. | | |
2. | A. Patel, M. Sharma, and R. Kumar | Deep Learning Approaches for Real-Time Detection of Violent Behavior in Video Surveillance Systems | 2021
3. | R. Singh, P. Gupta, and K. Mehta | Efficient Real-Time Violence Detection Using YOLO and LSTM Networks | 2023