Violence Detection Robotic Surveillance System using IoT and Machine Learning
Report submitted to the SASTRA Deemed to be University in partial fulfillment of the requirements
For the award of the degree of
Bachelor of Technology
Submitted by
May 2024
SCHOOL OF COMPUTING
THANJAVUR, TAMIL NADU, INDIA – 613 401
SCHOOL OF COMPUTING
THANJAVUR – 613 401
Bonafide Certificate
This is to certify that the project report titled “Violence Detection Robotic Surveillance
System using IoT and Machine Learning” submitted in partial fulfillment of the
requirements for the award of the degree of B.Tech. in Computer Science and Engineering is a
bona fide record of the work done during the academic year 2023-24, in the School of
Computing, under my supervision. This report has not formed the basis for the award of any
degree, diploma, associateship, fellowship or other similar title to any candidate of any
University.
Date:
Examiner 1 Examiner 2
SCHOOL OF COMPUTING
THANJAVUR – 613 401
Declaration
We declare that the project report titled “Violence Detection Robotic Surveillance System using IoT
and Machine Learning” submitted by us is an original work done by us under the guidance of Prof.
N. Kalyani, Assistant Professor-II, School of Computing, SASTRA Deemed to be University, during
the final semester of the academic year 2023-24, in the School of Computing. The work is original,
and wherever we have used material from other sources, we have given due credit and cited it in the
text of the report. This report has not formed the basis for the award of any degree, diploma,
associateship, fellowship or other similar title to any candidate of any University.
Acknowledgement
We would like to thank our Honorable Chancellor Prof. R. Sethuraman for providing
us with an opportunity and the necessary infrastructure for carrying out this project as a part of
our curriculum.
Our guide, Prof. N. Kalyani, Assistant Professor-II, School of Computing, was the driving
force behind this whole idea from the start. Her deep insight in the field and invaluable
suggestions helped us make progress throughout our project work. We also thank the project
review panel members for their valuable comments and insights, which made this project better.
We would like to extend our gratitude to all the teaching and non-teaching staff of
the School of Computing who have directly or indirectly helped us in the completion of
the project.
Table of Contents
Title Page No.
Bonafide Certificate ii
Declaration iii
Acknowledgements v
List of Figures vi
List of Tables vii
Abstract viii
1. Introduction 1
1.1. Robot car 1
1.2 ESP32 2
1.3. Convolutional LSTM 3
1.4. RGB frames 4
1.5. Mobile cameras 5
1.6. Existing System 6
1.7. Motivation 7
2. Objectives 7
3. Experimental Work/Methodology 8
3.1. Proposed System 8
3.2. Data Collection 9
3.3. Data Preprocessing 10
3.4. Training Model and Violence Detection 11
4. Results and Discussion 12
4.1. Violence and non-violence prediction and its accuracy 13
4.2. Comparison of Algorithms - Charts 14
5. Conclusion and Future Plans 18
5.1. Conclusion 18
5.2. Future Plans 19
6. References 20
7. Appendix 21
7.1 Similarity Check Report 22
7.2 Sample Source Code 23
List of Figures
List of Tables
Abbreviations
RGB Red, Green, Blue
AI Artificial Intelligence
CNN Convolutional Neural Network
LSTM Long Short-Term Memory
IoT Internet of Things
Abstract
Urban violence and crime pose significant threats to society, necessitating effective measures
for prevention and intervention. This project examines the escalating global trends in violence
and the proliferation of surveillance cameras as a response to enhance public safety. Despite
the ubiquity of surveillance systems, the sheer volume of video footage overwhelms human
operators, hampering real-time monitoring and response. In response to these challenges,
recent advancements in image processing and automation have emerged as promising
solutions for violence detection and anomaly recognition in surveillance videos. Leveraging
techniques such as computer vision, convolutional neural networks (CNNs), and deep
reinforcement learning, such models can rapidly analyze video streams to identify and alert
authorities to potential threats. The project outlines the critical role of AI in augmenting
traditional surveillance methods, emphasizing the need for rapid detection and intervention to
mitigate the impact of violent incidents. It presents a comprehensive methodology for
collecting real-time crime scene video datasets and processing them to detect abnormal
activities. Key contributions include the extraction of video frames using spatiotemporal
analysis and the classification of features using deep reinforcement neural networks. To
account for both spatial and temporal information, we use a convolutional variant of the
standard LSTM, the ConvLSTM. In real-time deployment, when the surveillance system
identifies violence, it automatically alerts the respective authorized personnel.
Experimental results demonstrate the efficacy and accuracy of the proposed approach in
identifying and characterizing violent behavior in diverse surveillance settings. The report
also discusses how automated surveillance systems enhance public safety and outlines future
research directions in the field.
CHAPTER 1
INTRODUCTION
Violent behavior in public places is an issue that should be addressed. Communities are also
eroded by violence, which reduces productivity, lowers property values, and disrupts social
services. Across the world, violence is a severe public health issue. It affects people at various
phases of life, from infants to the elderly. Recognizing violence is challenging since it must be
done on real-time videos captured by many surveillance cameras at any time and in any location.
A detection system should be able to make reliable real-time detections and alert the corresponding authorities as soon as
violent activities occur. Public video surveillance systems are widespread around the world and
can provide accurate and complete information in many security applications. However, having to
watch videos for hours reduces an operator's ability to make quick decisions. Video surveillance is
essential to prevent crime and violence.
In this regard, several studies have been published on the automatic detection of scenes of violence
in video. This is so that authorities do not have to watch videos for hours to identify events that
only last a few seconds. Recent studies have highlighted the accuracy of deep learning approaches
to violence detection. Indeed, deep learning methods have proven effective for extracting
spatiotemporal features from videos. Such features represent the motion information contained
in a series of frames in addition to the spatial information contained in a single frame. In this work
we describe the implementation of a real-time violence alert system using a
convolutional LSTM. The frames obtained as output from the model are enhanced. These
frames, along with the time and location of the recorded incident, are then sent to the nearby police
station as an alert via the alert module of the proposed system.
1.1 Robot Car
The robot car seamlessly transitions between autonomous navigation and manual control, ensuring smooth and
accurate movement across different terrains, from city streets to rugged landscapes, making it a
versatile asset for dynamic surveillance tasks. Equipped with advanced camera systems, thermal
imaging sensors, and sophisticated motion detection algorithms, the robot car provides
comprehensive surveillance coverage with remarkable accuracy. Its ability to detect, track, and
analyze objects in real time empowers security teams with valuable insights and heightened
situational awareness. The robot car's intelligent data processing capabilities enable proactive
threat detection, anomaly identification, and pattern recognition, enabling swift and effective
response strategies. Its seamless integration with existing security infrastructure streamlines
monitoring and control, enhancing operational efficiency and security management.
1.2 ESP32
The ESP32 microcontroller is a compact but robust chip that has gained widespread acclaim for its
versatility and performance across a range of applications, particularly in IoT (Internet of Things) and
embedded systems. It features a dual-core processor that allows for multitasking and real-time operation,
making it ideal for handling complex tasks seamlessly. One of the standout features of the ESP32 is its
connectivity capabilities, including built-in Wi-Fi and Bluetooth functionality. This enables effortless
wireless communication, whether it's connecting to the internet or interfacing with other devices, which is
crucial for IoT projects requiring seamless connectivity. Moreover, the ESP32 comes equipped with a rich
array of peripherals like GPIO pins, SPI, I2C, and UART interfaces, as well as ADCs and DACs. These
peripherals facilitate easy interfacing with various sensors, actuators, and external devices, enhancing the
microcontroller's versatility and usability. The ESP32 also supports advanced features such as low-power modes,
secure boot mechanisms, cryptographic accelerators, and OTA firmware updates. These features contribute
to power efficiency, security, and ease of maintenance, making the ESP32 a reliable choice for embedded
applications.
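
To illustrate how little code the chip needs to get online, the following is a minimal sketch, assuming the board runs MicroPython firmware; the SSID and password are the same placeholders that appear in the appendix code.

import network
import time

# Join a Wi-Fi access point from the ESP32 (credentials are placeholders)
wlan = network.WLAN(network.STA_IF)
wlan.active(True)
wlan.connect('200gb', '12345678')
while not wlan.isconnected():
    time.sleep(0.5)
print('Connected, IP address:', wlan.ifconfig()[0])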
Fig. 1.2 ESP32
1.3 Convolutional LSTM
Convolutional LSTM (Long Short-Term Memory) is a sophisticated neural network architecture that
combines the power of convolutional layers and LSTM units. It's commonly used in tasks involving
sequential data with spatial and temporal dependencies, such as video analysis and prediction. The
convolutional part helps in capturing spatial patterns and features, while the LSTM part maintains memory
over time, making it adept at recognizing patterns and making predictions in dynamic environments.
Convolutional LSTM finds applications in diverse fields like natural language processing, weather
forecasting, and medical imaging. Its ability to capture both spatial and temporal patterns makes it suitable
for tasks requiring context awareness and long-term dependencies. Additionally, researchers are exploring
novel architectures and optimizations to further improve its performance and efficiency in various
domains.
Fig. 1.3 Convolutional LSTM
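
As a concrete illustration, a ConvLSTM video classifier can be assembled in a few lines of Keras. The sketch below is illustrative only: it assumes 16-frame clips of 128 x 128 RGB input and a single sigmoid output for the violence probability; the layer sizes are not the exact configuration used in this project.

from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(16, 128, 128, 3)),      # (frames, height, width, channels)
    layers.ConvLSTM2D(32, kernel_size=(3, 3), return_sequences=True),
    layers.MaxPooling3D(pool_size=(1, 2, 2)),   # downsample space, keep time
    layers.ConvLSTM2D(16, kernel_size=(3, 3)),  # collapses the time dimension
    layers.MaxPooling2D(pool_size=(2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(1, activation='sigmoid'),      # violence probability
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])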
1.4 RGB Frames
RGB frames refer to individual images captured by a camera or generated in a digital environment,
representing the Red, Green, and Blue color channels. These frames are the building blocks of visual
content, forming the basis for videos, animations, and graphical representations. Each RGB frame contains
pixel information that, when combined, creates a vibrant and detailed visual experience. In applications
like video processing, RGB frames are analyzed sequentially to extract information, detect objects, or track
motion. Each RGB frame contains valuable visual information beyond just colors; it includes texture
details, contrast variations, and object shapes crucial for computer vision tasks. Advances in image
processing techniques allow for enhancing and analyzing RGB frames to extract meaningful insights, such
as scene understanding, object recognition, and image synthesis. Moreover, the evolution of high dynamic
range imaging techniques adds depth and realism to RGB frames, enhancing visual storytelling and
artistic expression.
Fig. 1.4 RGB Frames
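
A short OpenCV sketch shows how a single RGB frame is obtained and split into its colour channels; the video file name is a placeholder.

import cv2

cap = cv2.VideoCapture('sample.mp4')  # placeholder file name
ok, frame_bgr = cap.read()
if ok:
    # OpenCV delivers frames in BGR order; convert to RGB first
    frame_rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
    r, g, b = cv2.split(frame_rgb)    # one 2-D array per colour channel
cap.release()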
1.5 Mobile Cameras
Mobile cameras, found in smartphones and portable devices, have evolved into powerful imaging tools
capable of capturing high-resolution photos and videos. These cameras typically include sensors, lenses,
and image processing algorithms to deliver sharp images and vibrant colors. Mobile cameras have become
integral in everyday life, enabling people to capture memorable moments, conduct video calls, and even
engage in augmented reality experiences. Their compact size and convenience make them a ubiquitous
feature in modern digital devices. The continuous innovation in mobile camera technology has led to
features like optical image stabilization, computational photography, and augmented reality capabilities.
These advancements empower users to capture stunning photos and videos even in challenging conditions,
such as low light or fast motion. Furthermore, mobile cameras have become integral in fields like remote
sensing, environmental monitoring, and healthcare diagnostics, extending their utility beyond personal
photography and videography.
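
In this project, a mobile camera doubles as the robot's video source. A phone running an IP-webcam app exposes its stream over HTTP, and OpenCV can read it like any other capture device; the address below is illustrative (the appendix uses a similar URL).

import cv2

# Read frames from a phone's IP-webcam stream (address is illustrative)
cap = cv2.VideoCapture('http://192.168.0.100:8080/video')
ok, frame = cap.read()
if ok:
    print('Received frame of size', frame.shape)
cap.release()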
1.6 Existing System
In surveillance systems designed for violence detection, thorough analysis of recorded footage plays a crucial
role. Security teams meticulously review past incidents to identify patterns, trends, and potential risk
factors for violence. This retrospective approach allows for a detailed examination of behaviors,
interactions, and environmental cues captured in the footage. By reconstructing the sequence of events and
analyzing contextual information, investigators gain valuable insights into the dynamics leading up to
violent incidents. This deep understanding aids in developing targeted prevention strategies, refining
security protocols, and fostering a safer environment for all. Additionally, the knowledge gleaned from
historical surveillance data informs training programs, policy development, and collaborative efforts with
law enforcement, contributing to a comprehensive approach to violence prevention and security
enhancement. In non-real-time surveillance systems focused on violence detection, there are certain
limitations and disadvantages worth considering. One significant drawback is the delay in response and
intervention. Since these systems rely on reviewing recorded footage after an incident has occurred, there
may be a gap between the detection of violence and the initiation of appropriate action. This delay can
impact the effectiveness of security measures and potentially result in prolonged exposure to risks or
threats. Furthermore, non-real-time surveillance systems may face challenges in capturing nuanced
behavioral cues and subtle indicators of potential violence. Unlike real-time monitoring where immediate
actions can be taken based on live observations, analyzing recorded footage may require more extensive
interpretation and analysis, leading to the possibility of overlooking critical details or misinterpreting
events. Another disadvantage is the reliance on human operators for thorough analysis and
decision-making. While human intelligence is invaluable, it is susceptible to fatigue, biases, and
limitations in attention span. This can affect the accuracy and consistency of violence detection efforts,
potentially resulting in missed opportunities for intervention or misidentification of non-threatening
behaviors as violent incidents. Additionally, the storage and management of vast amounts of recorded
surveillance data pose logistical challenges. Storing and organizing archived footage requires significant
resources, including storage space, data management systems, and retrieval processes. This can lead to
increased operational costs and complexities in accessing and analyzing historical data effectively.
1.7 Motivation
The motivation for integrating surveillance cameras with machine learning algorithms on a robot car for
violence detection stems from the urgent need to create proactive and intelligent security solutions. This
innovative approach combines the mobility and flexibility of a robot car with the analytical power of
machine learning to enhance violence detection capabilities in various settings. One key motivation is the
ability to deploy surveillance cameras in dynamic and challenging environments where traditional fixed
cameras may not be practical or effective. The robot car's mobility allows it to navigate through crowded
spaces, traverse uneven terrain, and adapt to changing conditions, ensuring comprehensive surveillance
coverage. By incorporating machine learning algorithms, the surveillance system can analyze vast amounts
of data collected by the cameras in real time. These algorithms can learn from patterns and behaviors
observed in the data, enabling the system to identify potential indicators of violence, such as aggressive
gestures, altercations, or suspicious movements, with a high degree of accuracy. Moreover, the integration
of machine learning enables the system to continuously improve and adapt its detection capabilities over
time. As the algorithms learn from feedback and new data, they become more adept at distinguishing
between normal and potentially violent behavior, reducing false alarms and improving overall efficiency.
Another motivation is the proactive nature of the system enabled by machine learning. By leveraging
predictive analytics, the system can anticipate potential violent incidents before they escalate, allowing for
timely intervention and preventive measures. This proactive approach is crucial for enhancing public
safety and mitigating security risks. Furthermore, the presence of a robot car equipped with surveillance
cameras and machine learning capabilities can act as a powerful deterrent to potential perpetrators.
CHAPTER 2
Objectives
The primary objective of implementing real-time surveillance cameras, including those integrated
into robot cars and utilizing machine learning, alongside mobile cameras, is to enhance situational
awareness, improve response times, and increase the effectiveness of violence detection and
prevention measures. One key objective is to enable continuous monitoring of dynamic
environments in real time.
By deploying surveillance cameras on robot cars and mobile devices, security personnel can
access live video feeds from various locations simultaneously. This real-time visibility allows for
immediate detection and assessment of potential threats or violent incidents as they unfold.
Additionally, the integration of machine learning algorithms into the surveillance system aims to
enhance the system's analytical capabilities.
This proactive approach enables quick decision-making and intervention to prevent or mitigate
security risks. Another objective is to leverage the mobility and flexibility of robot cars and mobile
cameras to cover a wide range of areas and address security gaps. Robot cars can navigate through
complex environments, patrol designated areas, and provide surveillance in areas where fixed
cameras may be impractical. Mobile cameras, on the other hand, offer the flexibility to capture
video footage from different vantage points and respond to evolving security needs.
Moreover, real-time surveillance with machine learning enables predictive analytics capabilities.
By analyzing historical data and learning from past incidents, the system can anticipate potential
threats, detect suspicious activities, and trigger alerts or automated responses. This proactive and
predictive approach enhances the system's ability to prevent violent incidents before they escalate.
CHAPTER 3
Experimental Work / Methodology
3.1 Proposed System
The proposed system begins by starting the robot, which is equipped with surveillance cameras and
machine learning capabilities for violence detection. The user navigates the robot using an intuitive app
interface. This interface allows the user to control the robot's movements, such as directing it to specific
locations or patrolling designated areas. The robot's surveillance cameras capture live video feeds of its
surroundings. These video feeds are then relayed to the user's system in real time, providing a live view of
the monitored area. The system's machine learning algorithms continuously analyze the live video feeds
for signs of violence or aggressive behavior. If violence is detected based on predefined criteria and
behavioral patterns learned by the algorithms, the system flags the detection as "true." Upon detecting
violence, the system immediately notifies the user. This notification
alerts the user to the potential threat and prompts them to take appropriate action. In cases where violence
is detected (true detection), the system automatically captures a snapshot of the incident. This snapshot,
along with relevant details such as timestamp and location, is then sent to authorized persons and related
institutions via email.
Fig. 3.1 Workflow
3.2 Data Collection
The primary source of data collection is the live video feeds captured by the surveillance cameras mounted
on the robot. These cameras continuously record the surroundings as the robot navigates through its
designated areas. The video feeds include visual information such as people, objects, and activities in real
time. Data related to the robot's navigation is also collected during the operation of the system. This
includes information about the robot's movements, such as its path, speed, and direction. Navigation data
helps track the robot's location and trajectory, providing context to the recorded video feeds. Interaction
data from the app interface is collected when the user navigates the robot and accesses the live video feeds.
This includes user commands, control inputs, and interface interactions. The app interface allows the user
to control the robot's movements, view live video feeds, and receive notifications. As the algorithms analyze
the live video feeds for violence detection, data related to the analysis results is collected. This includes
detected patterns, identified behaviors, and the outcomes of violence detection. The machine learning
analysis results contribute to the decision-making process and trigger notifications based on detected
events. When violence is detected, a snapshot of the incident is automatically captured by the system. This
snapshot data includes the image of the detected event, along with metadata such as timestamp, location,
and relevant contextual information. The snapshot data is used for further analysis, reporting, and sharing
with authorized recipients via email.
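
As a sketch of what a single navigation-log record might look like, the snippet below appends one timestamped entry to a CSV file; the field names and values are illustrative, not the project's actual schema.

import csv
import time

record = {
    'timestamp': time.time(),   # when the sample was taken
    'latitude': 10.7302,        # illustrative GPS coordinates
    'longitude': 79.0196,
    'speed': 0.4,               # metres per second
    'direction': 'N',
}
with open('nav_log.csv', 'a', newline='') as f:
    writer = csv.DictWriter(f, fieldnames=record.keys())
    if f.tell() == 0:
        writer.writeheader()    # write the header once, on first use
    writer.writerow(record)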
3.3 Data Preprocessing
The first step in data preprocessing is the processing of live video feeds captured by the surveillance
cameras on the robot. This includes tasks such as frame extraction, image enhancement, and resolution
normalization. Frame extraction involves extracting individual frames from the video stream, which are
then used for analysis. Image enhancement techniques may be applied to improve the clarity and quality of
the images, ensuring better visibility of details. Resolution normalization ensures that all frames have
consistent resolution for uniform analysis. Navigation data collected during the robot's operation is
formatted and structured for analysis. This involves organizing navigation data into a structured format,
including attributes such as timestamp, GPS coordinates, speed, direction, and path history. Formatting the
navigation data allows for easier correlation with other data elements and provides context to the recorded
video feeds. Interaction data from the app interface, including user commands and control inputs, is logged
and recorded for analysis. Each interaction is timestamped and categorized based on the type of action
performed, such as robot movement commands, camera control, or notification settings. Logging app
interface interactions helps track user behavior and preferences, providing insights into user engagement
with the system. The results of machine learning analysis, including detected patterns, behaviors, and
violence detection outcomes, are integrated into the data preprocessing pipeline. This involves
categorizing analysis results into relevant classes and associating them with corresponding video frames
and timestamps. Integrating machine learning analysis results with other data elements enables
comprehensive event logging and correlation. Metadata extraction is performed on the captured snapshots
of detected events. Metadata such as timestamp, location coordinates, incident type, and severity level is
extracted and associated with the corresponding snapshots. This metadata provides contextual information
about the detected events, facilitating subsequent reporting and sharing with authorized recipients.
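
A minimal sketch of the per-frame preprocessing described above, assuming 128 x 128 is the target resolution used by the model (the function and file names are illustrative):

import cv2
import numpy as np

def preprocess_frame(frame_bgr, size=128):
    """Convert to RGB, resize to a uniform resolution, and scale to [0, 1]."""
    frame = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
    frame = cv2.resize(frame, (size, size))
    return frame.astype('float32') / 255.0

def extract_frames(video_path, size=128):
    """Extract and preprocess every frame of a video file."""
    cap = cv2.VideoCapture(video_path)
    frames = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(preprocess_frame(frame, size))
    cap.release()
    return np.array(frames)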
3.4 Training Model and Violence Detection
The training of the violence detection model begins with data preprocessing, where the collected live video
feeds, navigation data, app interface interactions, and machine learning analysis results are prepared for
training. The processed data is then fed into the convolutional LSTM (Long Short-Term Memory) model
for training.The convolutional LSTM model is a deep learning architecture that combines the strengths of
convolutional neural networks (CNNs) for spatial feature extraction and LSTMs for temporal sequence
modeling. This combination enables the model to learn spatial and temporal patterns in the video data,
making it well-suited for violence detection tasks where both spatial and temporal contexts are
important. During training, the convolutional LSTM model learns to recognize patterns and behaviors
indicative of violence, such as aggressive gestures, physical altercations, or sudden movements. The model
is trained on labeled data, where instances of violence are labeled as positive examples and non-violent
activities as negative examples. Once the convolutional LSTM model is trained, it is deployed for
real-time violence detection in the surveillance system. The live video feeds captured by the surveillance
cameras on the robot are continuously processed by the trained model. The violence detection process
involves feeding video frames from the live feeds into the convolutional LSTM model. The model
analyzes each frame for features and patterns associated with violence, leveraging its learned spatial and
temporal representations. If the model detects patterns indicative of violence in a video frame, it triggers a
violence detection alert. This alert is then processed by the system, which may include notifying the user,
capturing a snapshot of the detected event, and sending notifications to authorized persons or institutions
via email. The convolutional LSTM model's ability to learn complex spatial and temporal patterns allows it
to accurately detect violent activities while minimizing false alarms. Its adaptive learning capabilities
enable it to continuously improve and adapt to evolving threats and environmental conditions, enhancing
the overall effectiveness of violence detection in the system.
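
Under those assumptions, the training step reduces to a standard supervised fit. The sketch below uses random stand-in data so it runs end to end; `model` is the compiled ConvLSTM classifier sketched in Section 1.3, and the epoch, batch, and file-name settings are illustrative, not the project's actual configuration.

import numpy as np

# Illustrative stand-in data: 8 clips of 16 frames at 128 x 128 RGB,
# labelled 1 for violent and 0 for non-violent
X_train = np.random.rand(8, 16, 128, 128, 3).astype('float32')
y_train = np.array([1, 0, 1, 0, 1, 0, 1, 0])

history = model.fit(X_train, y_train, epochs=2, batch_size=4)  # settings are illustrative
model.save('violence_model.h5')  # reloaded later for real-time detection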
CHAPTER 4
Results and Discussion
The integration of the violence detection system with a robot car equipped with surveillance
capabilities and a convolutional LSTM model yields promising results in enhancing security
measures. The system demonstrates a high level of accuracy and reliability in detecting violent
activities within the monitored environment. During real-world testing scenarios, the robot car
patrols designated areas and streams live video feeds to the convolutional LSTM model for
analysis. The model successfully identifies instances of violence, such as aggressive behaviors,
altercations, and sudden movements, with a high degree of precision. This accurate detection
enables timely intervention and response to potential threats, contributing to improved security
outcomes. The robot car's mobility and surveillance capabilities enhance the system's coverage and
flexibility, allowing it to monitor diverse environments and respond effectively to evolving
security challenges. The seamless integration of the convolutional LSTM model with live video
feeds from the robot car ensures continuous monitoring and proactive detection of violent
activities. This enhanced coverage improves situational awareness and enables proactive response
to potential threats. The real-time streaming of live video feeds from the robot car to the
convolutional LSTM model enables continuous monitoring and immediate detection of violent
activities. This real-time analysis facilitates timely intervention and response, reducing response
times and enhancing security effectiveness. The convolutional LSTM model's adaptive learning
capabilities enable continuous improvement and optimization of violence detection algorithms. As
the model learns from new data and feedback, it becomes more adept at identifying nuanced
patterns of violence and minimizing false alarms, enhancing overall system performance. The
integration of the robot car with the violence detection system streamlines security operations and
optimizes resource utilization. The system's automation of surveillance, analysis, and response
tasks improves operational efficiency, allowing security personnel to focus on critical
interventions and strategic decision-making.
Fig. 4.1 Violence and non-violence prediction and its accuracy
Result Snapshots
Fig. 4.2 App interface
Fig. 4.3 Violence detection
Fig. 4.4 Snapshot of the detected incident sent as an intrusion alert to authorized personnel via email
CHAPTER 5
Conclusion and Future Plans
5.1 Conclusion
The integration of the violence detection system with a robot car equipped with surveillance capabilities and a
convolutional LSTM model represents a significant advancement in enhancing security measures. Through
real-world testing and analysis, the system has demonstrated high accuracy in detecting violent activities,
such as aggressive behaviors and altercations, within monitored environments. The seamless integration of
the robot car's mobility, live video feeds, and machine learning-based violence detection algorithms has
enabled continuous monitoring, real-time analysis, and proactive response to potential threats, contributing
to improved safety and security outcomes.
5.2 Future Plans
Moving forward, several avenues can be explored to further enhance the capabilities and effectiveness of
the violence detection system with the robot car and convolutional LSTM model. These include:
integrating additional sensors, such as audio or environmental sensors, into the robot car to capture
multi-modal data for more comprehensive threat detection and situational awareness; incorporating
advanced behavioral analysis techniques, such as deep learning-based behavior recognition, to identify
subtle cues and patterns associated with pre-violent behaviors, enabling early intervention and
prevention; developing intelligent response mechanisms, such as autonomous navigation that brings the
robot car to specific locations in response to detected threats, or automated communication with law
enforcement agencies for immediate assistance; optimizing the system for scalability and deployment in
diverse environments, including public spaces, transportation hubs, and critical infrastructure, to
address a wide range of security challenges and ensure comprehensive coverage; and enhancing the user
interface of the system's control and monitoring platform to provide intuitive controls, real-time
analytics, and actionable insights for security personnel and operators.
CHAPTER 6
REFERENCES
Nidhi Dubagunta, Ahad Karedia, Sinan Modi, Rajan Vyas, Sean Springer. Using Machine Learning for
Real-Time Detection of Violence in Video Footage. Medium.
https://medium.com/@nidhi.dubagunta/using-machine-learning-for-real-time-detection-of-violence-in-video-footage-82feeede3150
M. Tech, Y. A. (2020). Human violence detection using machine learning techniques. International Journal
of Recent Advances in Science, Engineering & Technology, 7(8), 14093-14096. Retrieved February 19,
2024, from https://www.ijraset.com/research-paper/human-violence-detection-using-ml-techniques
Pin Wang, Peng Wang, En Fan. Violence detection and face recognition based on deep learning. Pattern
Recognition Letters, Volume 142, 2021, pp. 20-24, ISSN 0167-8655.
https://doi.org/10.1016/j.patrec.2020.11.018
Jacoff, A., Messina, E., Weiss, B., Tadokoro, S. and Nakagawa, Y. (2003). Test Arenas and Performance
Metrics for Urban Search and Rescue Robots. Proceedings of the 2003 IEEE/RSJ International Conference
on Intelligent Robots and Systems, 27-31 October 2003, pp. 3396-3403.
Su, Minglan, Zhang, Chaoying, Tong, Ying, Liang, Baolin, Ma, Sicong and Wang, Jianxiu (2021).
Deep Learning in Video Violence Detection. pp. 268-272. doi: 10.1109/CTMCD53128.2021.00064
CHAPTER 7
APPENDIX
SAMPLE CODE
// The Blynk calls below need these library headers (assumption: an ESP32
// board with the legacy Blynk library installed)
#include <WiFi.h>
#include <BlynkSimpleEsp32.h>

// Motor PINs (D0-D5 are board-specific aliases; substitute GPIO numbers if
// your ESP32 board definition does not provide them)
#define ENA D0
#define IN1 D1
#define IN2 D2
#define IN3 D3
#define IN4 D4
#define ENB D5
bool forward = 0;
bool backward = 0;
bool left = 0;
bool right = 0;
int Speed;
char auth[] = "02g3q8Ko6yxRqvceiou3RORHyGzjlIGs";
char ssid[] = "200gb";
char pass[] = "12345678";
void setup() {
  Serial.begin(9600);
  pinMode(ENA, OUTPUT);
  pinMode(IN1, OUTPUT);
  pinMode(IN2, OUTPUT);
  pinMode(IN3, OUTPUT);
  pinMode(IN4, OUTPUT);
  pinMode(ENB, OUTPUT);
  Blynk.begin(auth, ssid, pass, "blynk.cloud", 80);
}
BLYNK_WRITE(V0) {  // forward button in the Blynk app
  forward = param.asInt();
}
BLYNK_WRITE(V1) {  // backward button
  backward = param.asInt();
}
BLYNK_WRITE(V2) {  // left button
  left = param.asInt();
}
BLYNK_WRITE(V3) {  // right button
  right = param.asInt();
}
BLYNK_WRITE(V5) {  // speed slider
  Speed = param.asInt();
}
void smartcar() {
  if (forward == 1) {
    carforward();
    Serial.println("carforward");
  } else if (backward == 1) {
    carbackward();
    Serial.println("carbackward");
  } else if (left == 1) {
    carturnleft();
    Serial.println("carleft");
  } else if (right == 1) {
    carturnright();
    Serial.println("carright");
  } else {
    carStop();
    Serial.println("carstop");
  }
}
void loop() {
  Blynk.run();   // keep the Blynk connection alive and process app events
  smartcar();    // translate the latest button states into motor commands
}
void carforward() {
  analogWrite(ENA, Speed);   // set motor A speed
  analogWrite(ENB, Speed);   // set motor B speed
  digitalWrite(IN1, HIGH);   // motor A forward (IN1/IN2)
  digitalWrite(IN2, LOW);
  digitalWrite(IN3, HIGH);   // motor B forward (IN3/IN4)
  digitalWrite(IN4, LOW);
}
void carbackward() {
  analogWrite(ENA, Speed);
  analogWrite(ENB, Speed);
  digitalWrite(IN1, LOW);    // motor A reverse
  digitalWrite(IN2, HIGH);
  digitalWrite(IN3, LOW);    // motor B reverse
  digitalWrite(IN4, HIGH);
}
void carturnleft() {
  analogWrite(ENA, Speed);
  analogWrite(ENB, Speed);
  digitalWrite(IN1, LOW);    // motor A reverse
  digitalWrite(IN2, HIGH);
  digitalWrite(IN3, HIGH);   // motor B forward: pivot left
  digitalWrite(IN4, LOW);
}
void carturnright() {
  analogWrite(ENA, Speed);
  analogWrite(ENB, Speed);
  digitalWrite(IN1, HIGH);   // motor A forward
  digitalWrite(IN2, LOW);
  digitalWrite(IN3, LOW);    // motor B reverse: pivot right
  digitalWrite(IN4, HIGH);
}
void carStop() {
  digitalWrite(IN1, LOW);
  digitalWrite(IN2, LOW);
  digitalWrite(IN3, LOW);
  digitalWrite(IN4, LOW);
}
from collections import deque

import numpy as np
import argparse
import pickle
import cv2
import os
import matplotlib.pyplot as plt
import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.mime.image import MIMEImage
import yagmail
from tensorflow.keras.models import load_model

# Rolling window of recent predictions for temporal smoothing
Q = deque(maxlen=128)
IMG_SIZE = 128
ColorChannels = 3

# Load the trained ConvLSTM violence-detection model
# (the file name is a placeholder; the original listing omitted this step)
model = load_model('violence_model.h5')

# video_cap = cv2.VideoCapture(0)  # local webcam alternative
video_cap = cv2.VideoCapture('http://192.168.140.100:8080/video')  # IP-webcam stream
# grab the width, height, and fps of the frames in the video stream.
frame_width = int(video_cap.get(cv2.CAP_PROP_FRAME_WIDTH))
frame_height = int(video_cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = int(video_cap.get(cv2.CAP_PROP_FPS))
email_from = '[email protected]'
email_to = '[email protected]'
smtp_server = 'smtp.gmail.com'
smtp_port = 587
smtp_username = '[email protected]'
smtp_password = 'Qwerty.1@ke23'
temp = 0       # consecutive violent-frame counter
imgsaved = 0   # 1 once a snapshot has been saved
sendalert = 0  # 1 once an alert e-mail has been sent
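
# NOTE: send_email() is called in the loop below but was not defined in the
# original listing. The following is a minimal sketch using the smtplib and
# email modules imported above, reusing the SMTP placeholders defined earlier.
def send_email(subject, body, image_path):
    msg = MIMEMultipart()
    msg['From'] = email_from
    msg['To'] = email_to
    msg['Subject'] = subject
    msg.attach(MIMEText(body, 'plain'))
    with open(image_path, 'rb') as f:
        msg.attach(MIMEImage(f.read(), name=os.path.basename(image_path)))
    with smtplib.SMTP(smtp_server, smtp_port) as server:
        server.starttls()  # upgrade to an encrypted connection
        server.login(smtp_username, smtp_password)
        server.send_message(msg)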
while True:
    success, frame11 = video_cap.read()
    if not success:
        break

    # frame11 = cv2.flip(frame11, 1)
    frame = cv2.cvtColor(frame11, cv2.COLOR_BGR2RGB)
    # frame1 = cv2.resize(frame, (512, 360)).copy()
    frame = cv2.resize(frame, (128, 128)).astype("float32")

    # NOTE: the original listing feeds raw pixel values; normalize here if
    # the model was trained on scaled inputs
    preds = model.predict(np.expand_dims(frame, axis=0))[0]
    Q.append(preds)
    results = np.array(Q).mean(axis=0)  # smoothed score over the window
    label = bool((preds > 0.06)[0])     # True when the score crosses the threshold

    if label:
        color = (255, 0, 0)
        temp = temp + 1                 # count consecutive violent frames
    else:
        color = (0, 255, 0)
        temp = 0

    # Alert only after violence persists for five consecutive frames
    if temp == 5:
        if imgsaved == 0:
            snapshot_path = 'snapshot.png'
            cv2.imwrite(snapshot_path, frame11)  # save the frame as a snapshot
            imgsaved = 1
        if sendalert == 0:
            sendalert = 1
            send_email('Violence Detected',
                       'Violence has been detected in the video stream.',
                       snapshot_path)
        # reset so a prolonged incident raises periodic alerts
        temp = 0
        imgsaved = 0
        sendalert = 0

    text = 'Violence: {}'.format(label)
    cv2.putText(frame11, text, (35, 50), cv2.FONT_HERSHEY_SIMPLEX, 1, color, 3)
    cv2.imshow("frame", frame11)

    key = cv2.waitKey(1) & 0xFF
    if key == ord("q"):
        break

video_cap.release()
cv2.destroyAllWindows()