
Volume 6, Issue 6, June – 2021 International Journal of Innovative Science and Research Technology

ISSN No:-2456-2165

Abnormal Event Detection Using CCTV Camera


Andrew Joemon¹, Ashwin Prakash V², Awin Chris Joy³, Bibin Thomas⁴
Student, Computer Science and Engineering Department
Sahrdaya College of Engineering & Technology
Thrissur, India

Divya R⁵
Asst. Professor, Computer Science and Engineering Department
Sahrdaya College of Engineering & Technology
Thrissur, India

Abstract:- Abnormal event detection, human behavior detection, and object recognition play a vital role in the creation of a smart CCTV system. These systems make it possible to detect abnormal events in an environment, abnormal behaviors by humans, and the state of alert in the environment. Machine vision along with machine learning is used in these systems to detect and identify the particular anomalies that arise in the video feed from the CCTV. Frame-by-frame processing is commonly used, and supervised learning is the commonly used training method for these systems. However, since anomalies are of many different kinds and it is not feasible to anticipate and train on all types of anomalies, supervised learning is being replaced by unsupervised and semi-supervised learning for training the system. This system provides a means of minimising or removing the human workload required to manually detect an abnormality in the live CCTV feed and raise an alert. The system also increases storage efficiency by storing only the abnormal events in original quality and storing the normal scenarios in low quality for archiving. In addition, the system can be extended into a distributed abnormality classification system, in which only the abnormal events are sent on to different dedicated systems that classify the abnormality.

Keywords:- Convolutional Neural Network; Anomaly detection; Long Short-Term Memory.

I. INTRODUCTION

In the present-day world, CCTV cameras are seen in every nook and corner of our surroundings. The primary use of a CCTV camera is post-event analysis: the CCTV records everything, and the recordings are used only after an event has occurred in order to determine its details. As the world today demands systems that are active rather than passive, technologies such as machine vision along with sophisticated machine learning algorithms are being incorporated to develop new systems that send alerts to the respective authorities as soon as anomalies are detected. The analysis of crowd behavior and object detection can be deployed in many applications, such as theft detection in crowded environments. As people are likely to be positioned at varying locations in the crowd and may move in diverse directions, it is challenging to find effective features of the crowd, and as a result the higher-level analysis of crowd behavior becomes a tedious task.

Anomaly detection is of considerable significance for video surveillance systems. Most proposed systems use methods like Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks to train the system to detect anomalies in both supervised and unsupervised manners. The supervised learning method emphasizes using existing knowledge about a particular anomaly to train a system, while the unsupervised learning method, on the other hand, tries to learn normality rather than abnormality. This implies that a large deviation from normal behavior indicates an abnormality. A highlight of our system is that it stores original-quality video snippets of the abnormal events, while the normal recordings are stored in low quality for archiving. The system can also be extended into a distributed abnormality classification system, in which only the abnormal events are sent on to different dedicated systems that classify the abnormality.

IJISRT21JUN477 www.ijisrt.com 608

II. RELATED WORK

A. Practical Automated Video Analytics for Crowd Monitoring and Counting
Kang Hao Cheong, Sandra Poeschmann, Joel Weijia Lai, Jin Ming Koh, U. Rajendra Acharya, Simon Ching Man Yu and Kenneth Jian Wei Tang [1], “Practical automated video analytics for crowd monitoring and counting”. In this paper, video surveillance is integrated with computational analytics, which greatly expands its functionality. The system comprises a video processing back-end that encompasses recognition and tracking of human subjects, as well as a front-end graphical interface for operators, and it uses classical and CNN-based object recognition techniques. One highlight of this system is its high counting accuracy for idealized single-subject scenarios, and it is also appropriate for the more realistic multiple-subject scenarios. The system does not include facial identification on top of the current object recognition and tracking, which would give more enhanced surveillance capabilities. Both controlled and non-controlled tests were used to carry out the real-world validation of the solution, and the results strongly indicated considerable accuracy, even in outdoor conditions. The system is useful for reducing human workload and can accept multiple video streams from a centralized storage location. It also made it possible to collect data on crowd density and movement with better consistency and accuracy.

B. Anomaly Detection in CCTV Using Optical Flow and Convolutional Autoencoder
Elvan Duman, Osman Ayhan Erdem [2], “Anomaly Detection in Videos Using Optical Flow and Convolutional Autoencoder”. In this study, a convolutional autoencoder method is used to learn the pattern of normal activities in videos. The main idea of the framework is that frames containing an abnormal event give significantly different motion patterns than normal frames. Dense optical flow maps are used as input to the encoder. The network is then trained with videos in which no abnormal event is included. After training, the autoencoder can model the distribution of the pattern of normal motion changes. If an input video has an abnormal event, the model is expected to give a higher reconstruction error. Besides, the model was able to reconstruct optical flow maps for corresponding normal video volumes.

This framework consists of three main stages. The first stage, called preprocessing, extracts dense optical flow maps of each frame. In the second stage, the convolutional autoencoder is used to obtain the spatial structure of each of the dense optical flow map volumes. The last stage includes a convolutional long short-term memory network to learn the temporal patterns of the encoded optical flow maps.

C. Unsupervised Anomaly Detection and Localization Based on Deep Spatio Temporal Translation Network
Thittaporn Ganokratanaa, Supavadee Aramvith, Nicu Sebe [3], “Unsupervised Anomaly Detection and Localization Based on Deep Spatiotemporal Translation Network”. This paper introduces a Deep Spatiotemporal Translation Network (DSTN), an unsupervised anomaly detection and localization method. The performance of anomaly localization in the pixel-level evaluation is enhanced by the proposed Edge Wrapping, which reduces noise and suppresses non-related edges of abnormal objects. Pixel-level accuracy is high for all kinds of anomalies, but the computational resource requirement for pixel-level detection also remains high. The method outperforms other state-of-the-art algorithms in both frame-level and pixel-level evaluation, and in the time complexity of abnormal object detection and localization.

D. Abnormal Crowd Behavior Detection Using Motion Information Images and Convolutional Neural Networks
Cem Direkoglu [5], “Abnormal Crowd Behavior Detection Using Motion Information Images and Convolutional Neural Networks”. In this study, a novel method for abnormal crowd event detection in surveillance videos is used. The proposed approach is based on a new Motion Information Image (MII) model that is formulated using optical flow. Optical flow vectors are used to generate the Motion Information Images, on which a Convolutional Neural Network is then trained and tested. The method has high accuracy on abnormal motion detection. Evaluations are conducted on the publicly available UMN and PETS2009 datasets. The computational resource usage and computation time are comparatively low in this approach. However, it requires huge datasets to train the system well, and identifying anomalies other than motion anomalies seems difficult.

III. PROPOSED SYSTEM

With our system, we can automate the process of detecting abnormal events from CCTV camera feeds. CNN and LSTM technologies are used to detect anomalies in both supervised and unsupervised manners. Alert messages can be sent to the authorities when events are detected. Video snippets of the abnormal event are stored in original quality, while low-quality recordings of events considered normal are stored separately. By implementing this system, we reduce the time taken and the human workload for detecting anomalies, and the system also becomes more storage efficient.

The design of this system consists of various modules that have to be integrated together to complete the system. This involves the creation of the Video Compressor, Anomaly Detector, Storage Management and Alert Management modules.

Video Compressor is the first module in the system. It compresses the original video so that it can be stored as a low-quality, low-resolution video for archiving.
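As a rough illustration of the resolution reduction performed by the Video Compressor, the following Python sketch shrinks a colour frame by averaging blocks of pixels. The function name downscale, the block-averaging method and the factor of 2 are assumptions made for illustration only; the paper does not state which compression method is used.

```python
import numpy as np


def downscale(frame, factor=2):
    """Hypothetical helper: shrink a colour frame (H, W, C) by block averaging."""
    h, w = frame.shape[:2]
    # Crop so that both dimensions are multiples of the factor.
    h, w = h - h % factor, w - w % factor
    # Group pixels into factor x factor blocks, one block per output pixel.
    blocks = frame[:h, :w].reshape(h // factor, factor, w // factor, factor, -1)
    # Average each block and return to the original pixel type.
    return blocks.mean(axis=(1, 3)).astype(frame.dtype)
```

Applying such a function to every frame before writing the archive video reduces the stored resolution by the chosen factor in each dimension.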

Anomaly Detector detects the presence of any kind of anomaly in the live feed from the CCTV system. This module is trained by an unsupervised learning method so that the system can detect all kinds of anomalies (both pre-determined and undetermined). The video snippet in which the anomaly occurs is also stored in the original video quality and resolution.

Storage Management module manages the storage of both the low-quality archived video and the original-quality anomalous video snippets. These are stored separately for future reference.

Alert Management module manages and sends the alert to the respective personnel on identification of an anomaly.

IV. METHODOLOGY

The method is based on the reconstruction error. We use an autoencoder to learn the regularity in video sequences. The intuition is that the trained autoencoder will reconstruct regular video sequences with low error but will not accurately reconstruct motions in irregular video sequences.

Fig.1 Architecture Diagram

We use the UCSD anomaly detection dataset and the Avenue dataset. The UCSD dataset contains videos acquired with a camera mounted at an elevation, overlooking a pedestrian walkway. These videos mainly contain pedestrians.

Abnormal events are mainly non-pedestrian entities in the walkway, such as bikers, skaters, and small carts, and also include unusual pedestrian motion patterns, like people walking across a walkway or on the grass surrounding it. The UCSD dataset has two parts, Ped1 and Ped2. We use Ped1, Ped2 and the Avenue dataset for training and testing.

A. Preprocessing

The training set consists of sequences of normal video frames. The model is trained to reconstruct these sequences. Initially we take only every 5th frame from the video sequence. This reduces the processing time and memory usage. The data is made ready to feed the model in three steps:
1. Divide the training video frames into temporal sequences, each of size 4, using the sliding window technique.
2. Resize each frame to 256 × 256 to ensure that input images have the same resolution.
3. Scale the pixel values between 0 and 1 by dividing each pixel by 256.

The number of parameters in this model is huge, so a large amount of training data is required. We therefore perform data augmentation in the temporal dimension: to generate more training sequences, we concatenate frames with various skipping strides. For example, the first stride-1 sequence is made up of frames (1, 2, 3, 4), whereas the first stride-2 sequence consists of frames (1, 3, 5, 7). We use 2 strides to extend our training data. Since we use only every 5th frame, our first sequences, in original frame numbers, are (1, 6, 11, 16) and (1, 11, 21, 31).

Along with the above processing, we reduce the resolution and quality of every frame and then store the frames as a video. This is the low-resolution video that is stored for archiving. Since it does not contain any abnormality, it is normally not referenced much in the future.

B. Building And Training The Model

Keras is used to build the convolutional LSTM autoencoder. The figure below shows the training process. The model is trained to reconstruct the regular events, which is the starting point for exploring the model settings and architecture.

Fig.2 Training Mode Flowchart

To build the autoencoder, the encoder and the decoder should be defined. The encoder accepts as input a sequence of frames in chronological order, and it consists of two parts: the spatial encoder and the temporal encoder. The encoded features of the sequence that come out of the spatial encoder are given to the temporal encoder for motion encoding.

The decoder mirrors the encoder to reconstruct the video sequence.

Fig.3 CNN Layers

C. Initialization and Optimization

We use the Adam optimizer with the learning rate set to 0.0001. The learning rate is reduced when the training loss stops decreasing, using a decay of 0.00001, and the epsilon value is set to 0.000001. For initialization, the Xavier algorithm is employed, which prevents the signal from becoming too tiny or too massive to be useful as it goes through each layer.

D. Testing Phase

Each video is tested individually. The UCSD Ped1 dataset provides 36 testing videos, each containing 200 frames. Since we take only every 5th frame, we get a total of 40 frames from each video in the UCSD Ped1 dataset. In UCSD Ped2, we have 12 testing videos with varying numbers of frames. In the Avenue dataset, we have 21 testing videos of varying duration. Here too we select only every 5th frame, to reduce the processing time and memory usage. Even though we might get a much better result by selecting all the frames, this is not recommended, as it takes a lot of time to produce the required outputs. The sliding window technique is used to get all sequences of 4 consecutive frames (after selecting every 5th frame). This means that for each t between 0 and 36 in the UCSD Ped1 dataset, the regularity score Sr(t) of the sequence that starts at frame (t) and ends at frame (t+3) is calculated.

When the reconstruction error value is greater than the threshold, our system sends the alert signal. The system also starts to store frames in original quality and resolution, from a predefined number of frames before the occurrence of the abnormality to a predefined number of frames after it, in a video format for future reference. We find it acceptable to store from 10 to 20 frames before the occurrence of the abnormality to 100 to 120 frames after it, to get a clear idea of what the abnormality is and how it is occurring.

V. RESULT

These are the test results obtained while detecting abnormal events in the Avenue dataset.

It is observed that values above 31.5 are considered an abnormal or unusual event, and the variation in the graph plots shows the same. Such cases are taken separately and marked as abnormal events.

Fig.4 Plot of Second Test Video

Fig.5 Plot of Second Test Video
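The testing procedure described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation used in this work: the reconstruct argument is a hypothetical stand-in for the trained autoencoder, the error is assumed to be the Euclidean norm of the difference between a sequence and its reconstruction, and the clip lengths take the upper ends of the ranges given in the text (20 frames before, 120 frames after). The value 31.5 is the level reported in the results on the Avenue dataset.

```python
import numpy as np

FRAME_STEP = 5      # keep every 5th frame, as described in the text
WINDOW = 4          # number of frames in each sequence
THRESHOLD = 31.5    # level treated as abnormal in the reported results


def window_errors(frames, reconstruct):
    """Return one reconstruction error per sliding window of WINDOW frames.

    `reconstruct` is a hypothetical callable standing in for the trained model.
    """
    kept = frames[::FRAME_STEP]
    errors = []
    for t in range(len(kept) - WINDOW + 1):
        seq = kept[t:t + WINDOW]
        # Assumed error measure: Euclidean norm of the reconstruction difference.
        errors.append(float(np.linalg.norm(seq - reconstruct(seq))))
    return np.array(errors)


def abnormal_windows(errors):
    """Indices t of the windows whose error exceeds the threshold."""
    return np.flatnonzero(errors > THRESHOLD)


def clip_bounds(t, n_frames, before=20, after=120):
    """Range of original-video frames to store around abnormal window t."""
    start = t * FRAME_STEP  # index of the window's first frame in the original video
    return max(0, start - before), min(n_frames - 1, start + after)
```

For a 200-frame Ped1 test video, keeping every 5th frame leaves 40 frames, so window_errors returns 37 values, one for each t from 0 to 36. An alert would be raised for every index returned by abnormal_windows, and clip_bounds gives the frame range to store in original quality.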

Fig.6 Plot of Third Test Video

Fig.7 Plot of Fourth Test Video

VI. CONCLUSION

Abnormal event detection is a prominent feature in the creation of a smart CCTV system, where it is possible to automatically detect abnormalities and create the necessary alerts. Supervised learning models are commonly used in existing systems to detect various anomalies with reasonable computational resources. However, since anomalies are of various kinds, it is not feasible to train the system to detect all types of anomalies. For this reason, supervised learning is replaced by unsupervised learning to train the system effectively. By implementing this system, we also obtain a storage-efficient system that saves only the abnormal frames in high quality, while the remaining recordings are saved in lower quality. In the future, supervised and unsupervised learning methods can be combined to improve the system. Anomaly identification methods, as well as object detection, could also be added in the future to identify various types of anomalies.

FUTURE WORK

The future scope of our project is to train a system that can distinguish between specific abnormalities. For example, the system will first check whether any abnormality is present; after that, the abnormal snippet will be passed to other models, each of which is a classification model for a specific abnormality.

REFERENCES

[1]. Kang Hao Cheong, Sandra Poeschmann, Joel Weijia Lai, Jin Ming Koh, U. Rajendra Acharya, Simon Ching Man Yu, Kenneth Jian Wei Tang, “Practical Automated Video Analytics for Crowd Monitoring and Counting”, Digital Object Identifier, 2019.
[2]. Elvan Duman, Osman Ayhan Erdem, “Anomaly Detection in Videos Using Optical Flow and Convolutional Autoencoder”, Digital Object Identifier, 2019.
[3]. Thittaporn Ganokratanaa, Supavadee Aramvith, Nicu Sebe, “Unsupervised Anomaly Detection and Localization Based on Deep Spatiotemporal Translation Network”, Digital Object Identifier, 2020.
[4]. Akshara Alex, Ashi Sahu, Avni Tanwar, Nisha Rathi, Kavitha Namdev, “Abnormal Event Detection By Machine Vision Using Deep Learning”, IJEAST, 2020.
[5]. Cem Direkoglu, “Abnormal Crowd Behavior Detection Using Motion Information Images and Convolutional Neural Networks”, Digital Object Identifier, 2020.
[6]. Sandhya Rani Sahoo, Ratnakar Dash, Ramesh Ku. Mahapatra, Baishnabi Sahu, “Unusual Event Detection in Surveillance Video using Transfer Learning”, 2019 International Conference on Information Technology.
[7]. Rui Jiang, Xiaozheng Mou, Shunshun Shi, Yueyin Zhou, Qinyi Wang, Meng Dong, Shoushun Chen, “Object tracking on event cameras with offline–online learning”, CAAI Trans. Intell. Technol., 2020, Vol. 5, Iss. 3, pp. 165–171.
[8]. Thittaporn Ganokratanaa, Supavadee Aramvith, Nicu Sebe, “Anomaly Event Detection Using Generative Adversarial Network for Surveillance Videos”, 978-1-7281-3248-8 ©2019 APSIPA.
[9]. https://www.geeksforgeeks.org/introduction-convolution-neural-network/
