
Exploring Anomaly Detection Techniques for Crime Detection


Ashwin Singh1, Aakanksha Singh1†, Ayush Bajaj1†, Sarang Deb Saha2, Abhishek Sharma3†

1 Communication and Computer Engineering, The LNMIIT, Jamdoli, Jaipur, 302031, Rajasthan, India.
2 Communication Science Engineering, The LNMIIT, Jamdoli, Jaipur, 302031, Rajasthan, India.
3 Department of Electronics and Communication Engineering, The LNMIIT, Jamdoli, Jaipur, 302031, Rajasthan, India.

Contributing authors: [email protected]; [email protected]; [email protected]; [email protected]; [email protected]
† These authors contributed equally to this work.

Abstract
Crime anomaly detection is critical for proactive law enforcement and public
safety measures. This paper emphasizes the identification and detection of
anomalous events harbouring criminal intent using deep learning techniques,
one example being Convolutional Neural Networks (CNNs). Leveraging current
research on neural networks, the study explores multiple approaches using
pre-trained neural network architectures, including VGG19, DenseNet121,
ResNet50, and MobileNetV2, to categorize criminal behaviour into multiple
classes such as Abuse, Arrest, Arson, Assault, Burglary, Explosion, Fighting,
Road Accident, Robbery, Shooting, Shoplifting, Stealing, Vandalism and Normal
Events.

The research systematically analyzes the performance of each model using
various metrics to gauge the models' ability to discern anomalies effectively
in the UCF Crime Dataset. It was observed that the DenseNet121 model achieved
the highest accuracy, at 82.91%. The proposed methodology provides a foundation
for future research in refining crime prediction systems, contributing to
advancements in law enforcement technologies.

Keywords: Neural Networks, Crime, Deep Learning, Anomaly Detection, Transfer
learning, CNN

1 Introduction
CCTV surveillance has been around for almost 70 years and has been a common
choice among law enforcement agencies and the general population to monitor
abnormal activities for public or personal safety. The global prominence of CCTV
systems is evident in the exponential growth of the market, with projections
soaring from a substantial $35.47 billion in 2022 to a staggering $105.20 billion
by 2029, reflecting a robust CAGR of 16.8% during the forecast period, 2022-2029
[1]. However, over the years, even with the widespread deployment of CCTV
cameras, the increasing population and rapid urbanization have led to an alarming
surge in criminal activities.
Given the increasing abundance of data collected from surveillance feeds and the
climbing crime rates, it becomes increasingly overwhelming and expensive to rely
on human monitoring of surveillance cameras, since humans are limited by
constraints such as fatigue and other unaccounted errors [2][3].
Hence, in the face of these evolving trends, the need arises for more
sophisticated and intelligent systems for automated prediction and detection of
crimes and for monitoring of video surveillance. The use of Artificial
Intelligence (AI) in video surveillance has become a hot topic for research in
recent years. The amount of research on the use of neural networks for real-time
crime detection has also seen significant growth over the past few years with the
developments in machine learning practices in the 21st century. Machine learning
has become a popular choice among researchers owing to how well ML techniques
scale to large amounts of data. Neural networks (NNs) are a form of machine
learning technique that attempts to model and learn based on the functionality of
the human brain, and they are considered the most powerful clustering technology
available for unstructured data, which includes grid-like data such as images and
video [4][5][6].
Therefore, this paper aims to propose, develop and compare deep-learning networks
to identify crimes using surveillance footage. The different models are compared
using metrics covering accuracy and computational cost. The problem at hand can
be divided into two parts: Step 1 - detection of a crime; Step 2 - identifying
and classifying the type of crime taking place. To accomplish this, the study
focuses on applying transfer learning to well-studied, up-to-date CNN models -
DenseNet121, VGG19, ResNet50 and MobileNetV2 - and comparing their performance on
test images. The models are re-trained on a large dataset containing millions of
labelled images featuring occurrences of crimes from various CCTV footage.
Convolutional Neural Networks (CNNs), or ConvNets, are among the most widely used
deep learning techniques for object detection, recognition and image
classification problems. Over the years, owing to the state-of-the-art accuracy
they achieve on such tasks, CNNs have provided an effective class of models for
better understanding the content present in an image, resulting in better image
recognition, segmentation, detection, and retrieval [7].
The architecture of a CNN contains the following layers:
1. Input Layer: The starting point of the model; accepts images to be passed
further down the network.
2. Convolution Layer: The building block of CNNs; responsible for extracting
features by applying filters through the convolution operation.
3. ReLU layer: Applies the ReLU activation function to the output of the
convolution layer, converting negative values to 0 for faster training.
4. Pooling layer: Performs dimensionality reduction on the feature maps, thereby
reducing the computational load.
5. Fully Connected layer: Connects the information extracted in the previous
steps to the output layer and eventually classifies the input into the desired
label.
6. Softmax: Placed just before the output layer; gives the probabilities of each
class.
The first four layers listed above are called the feature extraction layers and
the remaining two are called the classification layers. A minimal code sketch of
this layer ordering is shown after Fig. 1.

Fig. 1: Basic CNN architecture
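To make the layer roles concrete, the following is a minimal, illustrative Keras
sketch of such an architecture. The filter counts are arbitrary choices for
illustration, and the 64x64 input and 14-class softmax output are assumptions
chosen to mirror the classification task described later; this is not one of the
models used in this study.

```python
from tensorflow.keras import layers, models

model = models.Sequential([
    # 1. Input layer: 64x64 RGB images (the image size used later in this paper)
    # 2-3. Convolution layer with ReLU activation
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(64, 64, 3)),
    layers.MaxPooling2D((2, 2)),             # 4. Pooling: dimensionality reduction
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),    # 5. Fully connected layer
    layers.Dense(14, activation="softmax"),  # 6. Softmax over the 14 classes
])
model.summary()
```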

However, traditional CNNs' effectiveness is notably contingent on large-scale
datasets and substantial computational resources. Training a CNN demands an
extensive amount of labelled data and computational power, making it a
resource-intensive and time-consuming process. The paradigm of transfer learning
emerges to address this problem. Transfer learning is an ML technique whereby a
model trained and developed for one task is re-used for similar tasks with
minimal modifications to the output or some of the hidden layers [8]. Transfer
learning offers a significant benefit by alleviating the need for an extensive
dataset when training a deep CNN: by fine-tuning a portion of the parameters of a
model pre-trained in the source domain using limited labelled data from the
target domain, transfer learning can yield better performance on the target
dataset [3]. This fine-tuning is commonly done by freezing the feature extraction
layers, using the pre-trained weights as is (as seen in Figure 1), and modifying
the classification layers to fit the target needs. However, there are instances
where practitioners also modify the feature extraction layers to better align
with the specific requirements of the target application. In our study, we
utilize the former approach.

Fig. 2: Basic intuition behind transfer learning
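As a hedged sketch of the former approach (frozen feature extractor, new
classifier), assuming TensorFlow Keras, a 64x64 input and a 14-class output;
the exact heads used in this study are detailed in Section 4.2.

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import DenseNet121

# Load the feature extraction layers with their pre-trained ImageNet weights
# and freeze them, so only the new classification layers will be trained.
base = DenseNet121(weights="imagenet", include_top=False, input_shape=(64, 64, 3))
base.trainable = False

model = models.Sequential([
    base,                                    # frozen feature extraction layers
    layers.GlobalAveragePooling2D(),
    layers.Dense(14, activation="softmax"),  # new task-specific classifier
])
```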

2 Literature Review
According to Sami Ansari et al.'s 2015 report [9], criminal cases in India have
exhibited a contrasting pattern compared to the global crime trend. Traditional
damage control strategies typically rely on the presence of law enforcement
officials to carefully review Closed Circuit Television (CCTV) recordings. The
utilization of CCTV for surveillance in public areas has proven to be a valuable
instrument in both crime resolution and crime prevention [10]. Often, the
presence of a visible CCTV system deters criminals from carrying out their
illegal activities, or leads to their subsequent capture. Nevertheless, the task
of personally monitoring each video sample to detect suspicious activities
becomes increasingly tedious, intricate, and time-consuming. It requires labour
and round-the-clock, constant attention.
Breakthroughs in deep learning methods in recent years have assisted the
automation of tasks such as anomaly detection. Anomaly detection has a long
history in statistics and artificial intelligence and is a well-studied problem.
The integration of anomaly detection with Convolutional Neural Networks has
witnessed a considerable surge in recent years. Kowshik et al.'s study on
real-time crime detection proposes YOLOv5 as an effective object detection
technique, employing just a single convolutional neural network. In that
publication, YOLOv5 was compared with its predecessors using a proprietary
real-time facial recognition dataset. This lays the framework for the arrival of
deep learning approaches in tackling age-old problems such as anomaly
identification in sensitive surveillance scenarios [11].

Vipin Shukla et al.'s 2015 research on automatic alerts of security threats
offered an approach based on background subtraction coupled with human outline
detection using edge estimator algorithms. The result is then used to examine the
human position in succeeding frames, thereby identifying activities as suspicious
or benign. Although their research proposes recognizing whether an abnormal
behaviour is happening, it does not address how to characterize the nature of
this behaviour [12].

Nandhini T J et al.'s 2023 study explored the problem of criminal objects not
being apparent in places with deficient lighting. Automatic night-time monitoring
sensors are vital to identify crime objects since most of them can be missed if
evaluated by the naked eye. The authors examine the accuracy of recognizing 7
object classes, namely knife, smartphone, car, animals, gun, blood, and currency,
by deploying an object detection model on IR (infrared) photographs. The authors
provide a CNN architecture and train the model on 147 photos, with the accuracy
of recognizing knives being the greatest at 99.8% [13].

The following table summarizes the studies that were reviewed to create a
comprehensive view of the existing literature on real-time crime analysis using
deep learning techniques. By building upon this existing body of literature, this
research paper aims to compile and compare the various methodologies used for
real-time crime detection.

Table 1: Review of existing related studies

3 Methodology
This section outlines the general flow of the development of the models, from
choosing the appropriate dataset to testing the models, all of which is discussed
in detail here.

Fig. 3: Processing steps for crime detection

3.1 Dataset
The data used for training the NN models is a modified and sized-down version of
the open-source UCF-Crime Dataset, obtained from Kaggle [14]. The UCF-Crime
Dataset consists of long untrimmed surveillance videos which cover 13 real-world
anomalies, including Abuse, Arrest, Arson, Assault, Road Accident, Burglary,
Explosion, Fighting, Robbery, Shooting, Stealing, Shoplifting, and Vandalism.
These anomalies were selected because they have a significant impact on public
safety [14]. The Kaggle dataset contains images (64×64 px) extracted from every
video in the UCF Crime Dataset: every 10th frame is extracted from each
full-length video and the frames are combined per class. This is done so that the
size of the larger UCF dataset can be reduced while preserving the spatial and
temporal information between the images.
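The frame-sampling step can be illustrated with a short OpenCV sketch. This is
only an assumption about how such a reduced image set could be produced (the
function name, output layout and resizing step are illustrative), not the exact
script used to build the Kaggle dataset.

```python
import cv2

def sample_frames(video_path, out_dir, step=10, size=(64, 64)):
    """Keep every `step`-th frame of a video, resized to `size`."""
    cap = cv2.VideoCapture(video_path)
    idx, kept = 0, 0
    while True:
        ok, frame = cap.read()
        if not ok:                      # end of video
            break
        if idx % step == 0:             # every 10th frame by default
            frame = cv2.resize(frame, size)
            cv2.imwrite(f"{out_dir}/frame_{kept:06d}.jpg", frame)
            kept += 1
        idx += 1
    cap.release()
    return kept
```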
Fig. 4 presents samples of crimes from the different categories present in the
UCF crime dataset. Fig. 4 (a) shows a man abusing a stray animal, Fig. 4 (b)
shows several policemen attempting to arrest someone after a car crash, Fig. 4
(c) depicts a man pouring gasoline outside the victim's house, Fig. 4 (d) is a
snapshot of an assault in progress with two men trying to hit the victim from
behind, Fig. 4 (e) depicts an ongoing burglary, Fig. 4 (f) shows a large-scale
explosion occurring, Fig. 4 (g) is an instance of someone harassing a couple
followed by a physical altercation, Fig. 4 (h) shows an incident of a road
accident with a vehicle flipped over a person, Fig. 4 (i) shows an instance of
someone robbing the victim at gunpoint, Fig. 4 (j) is footage of a person lying
unconscious after getting shot, Fig. 4 (k) is an instance of shoplifting, Fig. 4
(l) shows two people stealing car parts, Fig. 4 (m) shows someone attempting to
flee the scene after breaking a glass pane and Fig. 4 (n) shows an instance of a
normal occurrence.

3.2 Data Preprocessing
After acquiring the dataset, the data was partitioned into test and training
sets. The next preprocessing steps use the TensorFlow Keras pipeline to resize
the image data to the standard 64×64 dimensions, apply the respective
preprocessing for each of the four models, and generate new training data by
applying data augmentation techniques to the training data using the
'ImageDataGenerator' from the Keras library. Data augmentation improves the
accuracy of the model and helps reduce overfitting by improving the
generalization ability of the model. Generalizability refers to the performance
difference of a model when evaluated on previously seen data (training data)
versus data it has never seen before (testing data); models with poor
generalizability have overfitted the training data [15]. The techniques used here
include horizontal flipping, random width shifts (up to 10%), and random height
shifts (up to 5%). The image pixels were subsequently normalized to the range
[0,1] by dividing each pixel value by 255 to reduce the computational complexity
and speed up network training.
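A minimal sketch of this preprocessing pipeline, assuming the dataset is laid out
with one folder per class; the directory names 'Train/' and 'Test/' are
placeholders, while the augmentation ranges follow the values stated above.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmentation and normalization as described above: horizontal flips,
# width shifts up to 10%, height shifts up to 5%, and pixel scaling to [0, 1].
train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,
    horizontal_flip=True,
    width_shift_range=0.10,
    height_shift_range=0.05,
)
test_datagen = ImageDataGenerator(rescale=1.0 / 255)  # no augmentation for testing

train_gen = train_datagen.flow_from_directory(
    "Train/", target_size=(64, 64), batch_size=64, class_mode="categorical")
test_gen = test_datagen.flow_from_directory(
    "Test/", target_size=(64, 64), batch_size=64, class_mode="categorical")
```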

Fig. 4: Samples from the different crimes in the UCF crime dataset: (a) Abuse,
(b) Arrest, (c) Arson, (d) Assault, (e) Burglary, (f) Explosion, (g) Fighting,
(h) Road Accident, (i) Robbery, (j) Shooting, (k) Shoplifting, (l) Stealing,
(m) Vandalism, (n) Normal

3.3 Selected deep learning models for crime detection
Transfer learning has been used for training the following models:

3.3.1 DenseNet121
One of the key problems with traditional CNNs is that as the number of layers
increases, i.e. as they get "deeper", the gradient of the loss function starts to
diminish, a phenomenon known as the "vanishing gradient problem". DenseNets
resolve this problem by modifying the standard CNN architecture and simplifying
the connectivity pattern between layers. In a DenseNet architecture, each layer
is connected directly with every other layer [16]. The number of direct
connections for a network with L layers is therefore given by

    Connections = L(L + 1) / 2    (1)

This allows for feature reuse and requires fewer parameters than a traditional
CNN, and it helps reduce overfitting [16]. DenseNet121 is a variant of DenseNet
that contains 121 layers and has been trained on large datasets such as CIFAR-100
and ImageNet. In terms of architecture, each dense block consists of a varying
number of layers featuring two convolutions each: a 1×1 kernel as the bottleneck
layer and a 3×3 kernel to perform the convolution operation, followed by a
transition layer containing a 1×1 convolutional layer and a 2×2 average pooling
layer with a stride of 2 [16]. DenseNet121 has previously been studied for crime
prediction, achieving an AUC score of 82.91% [17].

Fig. 5: DenseNet architecture with two dense blocks

3.3.2 VGG19
VGG19, an acronym for "Visual Geometry Group 19", is one of the most frequently
used image recognition architectures in the present day. The "19" denotes 19
weight layers (16 convolution layers and 3 fully connected layers), accompanied
by 5 max-pooling layers and a softmax layer. VGGNet takes an input image of size
224×224 RGB; the first two layers are convolution layers with a kernel size of
3×3 and stride 1, and these layers use 64 filters each, resulting in a volume of
224×224×64 with same padding. The small size of the convolution filters allows
VGG to have a larger number of weight layers, leading to improved accuracy. This
is followed by a max-pooling layer of size 2×2 with stride 2, reducing the volume
from 224×224×64 to 112×112×64, and so on. The VGG convolution layers are followed
by a ReLU unit, a piecewise linear function that outputs the input if it is
positive and zero otherwise. VGGNet has three fully connected layers, the first
two having 4096 channels each and the third having 1000 channels. VGGNet achieved
92.7% top-5 test accuracy on ImageNet, a dataset consisting of 14 million images
belonging to more than 1000 classes.

3.3.3 ResNet50
Residual Networks are another class of neural networks that address the problems
of vanishing gradients and high training error by introducing residual learning.
In residual learning, instead of trying to learn features directly, the network
tries to learn a residual. The residual can be understood simply as the
difference between the features learned and the input of a layer, which the
network achieves by introducing shortcut or residual connections that allow the
input to bypass one or two layers [18]. The skip connections perform identity
mapping, and their outputs are added to the outputs of the stacked layers.
ResNet-50 is a 50-layer deep convolutional neural network, trained on more than a
million images from the ImageNet database, with an input image size of 224×224.
The architecture is similar to VGGNet, consisting mostly of 3×3 filters; the
shortcut connections described above are inserted into this plain network to form
the residual network. ResNet achieved a top-5 accuracy of 92%.

Fig. 6: Residual Connection Network Block Diagram
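The shortcut connection can be written as a small Keras block. This is a minimal
sketch of the idea, not the exact ResNet-50 bottleneck implementation; it assumes
the input tensor already has `filters` channels so that the addition is valid.

```python
from tensorflow.keras import layers

def residual_block(x, filters):
    """Two stacked 3x3 convolutions bypassed by an identity skip connection."""
    shortcut = x
    y = layers.Conv2D(filters, (3, 3), padding="same", activation="relu")(x)
    y = layers.Conv2D(filters, (3, 3), padding="same")(y)
    y = layers.Add()([shortcut, y])   # identity mapping added to stacked output
    return layers.Activation("relu")(y)
```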

3.3.4 MobileNetV2
As can be inferred from its name, MobileNetV2 is a CNN architecture designed to
perform well on mobile devices. Its predecessor, MobileNetV1, focused on reducing
the computational cost and model size of the network by utilizing depthwise
separable convolution. The basic idea is to replace a full convolution layer with
two separate layers: the first, called a depthwise convolution, performs
lightweight filtering by applying a single convolutional filter per input
channel; the second is a 1×1 convolution, called a pointwise convolution, which
is responsible for building new features. MobileNetV2 also employs "inverted
residuals", building upon the intuition that the bottleneck layer, despite its
lower dimensionality, contains all the necessary information. Using this insight,
MobileNetV2 establishes shortcut connections between the bottleneck layers,
thereby being more memory efficient than its counterparts [19].

Fig. 7: MobileNetV2 Block Diagram
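The depthwise separable convolution described above can be sketched in Keras as a
per-channel depthwise filter followed by a 1×1 pointwise convolution. This is an
illustrative block only, not MobileNetV2's exact inverted-residual layer.

```python
from tensorflow.keras import layers

def depthwise_separable_block(x, out_channels, stride=1):
    """Depthwise (per-channel) filtering followed by a 1x1 pointwise convolution."""
    x = layers.DepthwiseConv2D((3, 3), strides=stride, padding="same")(x)
    x = layers.ReLU()(x)
    x = layers.Conv2D(out_channels, (1, 1), padding="same")(x)  # pointwise: builds new features
    return layers.ReLU()(x)
```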

4 Analysis and Report

4.1 Evaluation Metrics
The evaluation metrics used to compare the effectiveness of the models are
discussed here. The metrics include precision, recall, the F1-score and the
ROC-AUC score. Precision is given by Equation 2:
    Precision = TP / (TP + FP)    (2)

True Positives (TP) is the total number of occurrences where the crime/anomaly
was correctly detected, whereas False Positives (FP) is the number of occurrences
where a crime was falsely detected. A low precision means that the model predicts
some false positives and labels some normal occurrences as crimes. This type of
error is unwanted but still allows a human analyser to review and correct the
false alarm. The other useful metric is recall, given by Equation 3:

    Recall = TP / (TP + FN)    (3)

False Negatives (FN) is the number of occurrences where the model was unable to
detect criminal activity. This type of error can be life-threatening since it
would lead to a late response by law enforcement authorities. The F1-score acts
as a binding metric that unifies precision and recall, giving a single score with
which to judge the model's accuracy against a baseline. The F1-score is given by
Equation 4:

    F1 = 2 × Precision × Recall / (Precision + Recall)    (4)

The other powerful metric used is the area under the ROC curve (AUC), which has
gained much popularity for multiclass classification problems. The ROC curve
illustrates the diagnostic ability of a binary classifier system as its
discrimination threshold is varied. It is created by plotting the True Positive
Rate (TPR) against the False Positive Rate (FPR) at various threshold settings.
The AUC score offers a quick summary of the ROC curve, shedding light on how well
a classifier can distinguish between different classes: the greater the AUC
score, the better the model.
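These metrics can be computed with scikit-learn. The sketch below is a hedged
illustration assuming one-hot ground-truth labels and per-class predicted
probabilities; the averaging choices are assumptions, not values stated in the
paper.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score, f1_score, roc_auc_score

def evaluate(y_true_onehot, y_prob):
    """Precision, recall, F1 and one-vs-rest ROC-AUC for a multiclass classifier."""
    y_true = np.argmax(y_true_onehot, axis=1)   # integer class labels
    y_pred = np.argmax(y_prob, axis=1)          # predicted class per sample
    return {
        "precision": precision_score(y_true, y_pred, average="weighted", zero_division=0),
        "recall": recall_score(y_true, y_pred, average="weighted", zero_division=0),
        "f1": f1_score(y_true, y_pred, average="weighted", zero_division=0),
        # One-vs-rest AUC computed over the per-class probability scores.
        "auc": roc_auc_score(y_true, y_prob, multi_class="ovr", average="macro"),
    }
```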

4.2 Results and discussion
The models were implemented using the TensorFlow Keras module. The selection of
hyperparameters was driven by the dataset's relatively large size, aiming to
strike an optimal balance between model convergence, stability, and computational
efficiency. The models were trained for 1 epoch (to avoid prolonged training
times) with a batch size of 64, a common choice for deep learning models [20].
The models were compiled using the SGD (for DenseNet121) and Adam (for VGG19,
ResNet50 and MobileNetV2) optimizers with a learning rate of 0.00003 and the
'categorical crossentropy' loss function. Each model's feature extraction layers
were frozen and initialized with the pre-trained weights, and the fully connected
layer at the top of the network was excluded and replaced to suit our
classification needs. After extracting the features, Global Average Pooling (GAP)
was applied to each model to reduce the feature map. Following GAP, dense layers
with ReLU activation were added to the models (three fully connected dense layers
with 256, 1024, and 512 units in the case of DenseNet121, and a single dense
layer with 512 units for VGG19, ResNet50 and MobileNetV2), each dense layer being
followed by a dropout layer to mitigate overfitting. Finally, a dense output
layer with a softmax activation function was added. This approach of freezing the
feature extraction part of a pre-trained model and modifying the classification
part is efficient, as it saves significant computational resources and time, and
effective, as it can improve model performance. A sketch of this setup is given
below.
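The following is a hedged sketch of this setup for one of the backbones (VGG19 is
used here as an example), with the optimizer, learning rate, loss, epoch count
and batch size taken from the values stated above; the dropout rate of 0.5 is an
assumption, since the text does not specify it.

```python
from tensorflow.keras import layers, models, optimizers
from tensorflow.keras.applications import VGG19

# Frozen backbone, GAP, a 512-unit dense layer with dropout, softmax classifier.
base = VGG19(weights="imagenet", include_top=False, input_shape=(64, 64, 3))
base.trainable = False  # freeze the feature extraction layers

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(512, activation="relu"),
    layers.Dropout(0.5),                     # assumed dropout rate
    layers.Dense(14, activation="softmax"),  # 13 anomaly classes + Normal
])
model.compile(
    optimizer=optimizers.Adam(learning_rate=3e-5),   # SGD was used for DenseNet121
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)
# model.fit(train_gen, epochs=1, validation_data=test_gen)  # 1 epoch, batch size 64 via the generators
```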

Table 2: Metrics Table

Model         Precision   F1-Score   AUC Score   Train Time (s)
DenseNet121   0.5601      0.6407     0.8361      5167
VGG19         0.5659      0.5680     0.5464      4296
ResNet50      0.5659      0.5660     0.6082      4421
MobileNetV2   0.5659      0.5670     0.6525      4589

Table 3: AUC scores for different classes across the models

Class           DenseNet121   VGG19   ResNet50   MobileNetV2
Abuse           0.67          0.23    0.87       0.61
Arrest          0.49          0.53    0.46       0.51
Arson           0.82          0.77    0.66       0.82
Assault         0.65          0.58    0.69       0.46
Burglary        0.79          0.58    0.71       0.72
Explosion       0.77          0.61    0.69       0.73
Fighting        0.39          0.55    0.38       0.47
Normal          0.75          0.65    0.76       0.76
Road Accident   0.66          0.62    0.76       0.80
Robbery         0.58          0.39    0.65       0.60
Shooting        0.65          0.55    0.66       0.68
Shoplifting     0.53          0.56    0.82       0.76
Stealing        0.58          0.55    0.57       0.71
Vandalism       0.57          0.48    0.56       0.51

Fig. 8: Comparison of ROC curves for different crime classes: (a) DenseNet121,
(b) VGG19, (c) ResNet50, (d) MobileNetV2

An AUC score of 1 indicates a perfect classifier, while a score of 0.5 implies
the model has not learnt anything and is instead making random guesses. The
closer the curve is to the top-left corner, the higher the true positive rate
(sensitivity) for a given false positive rate (1 - specificity), which indicates
a better-performing model; conversely, a point along the diagonal line from the
bottom left to the top right indicates that the true positive rate equals the
false positive rate, representing a classifier that performs no better than
random chance.

Fig. 9: Combined ROC curves for the 'Normal' class
Judging by the metrics in Table 2 and the ROC curves in Fig. 8, we can safely
conclude that DenseNet121 performs considerably better than the other models,
followed by MobileNetV2. However, when training times are taken into account,
MobileNetV2 exhibits a 12% reduction in training time compared to DenseNet121,
achieving comparable but slightly lower accuracy. This trade-off needs to be
taken into account when picking a model for a specific use case. Another key
observation is that different classes of anomalies perform better or worse on
different models. 'Arson' performs better than all the other classes across all
models except ResNet50, with the top performers being DenseNet121 and MobileNetV2
at an AUC score of 0.82, whereas the classes 'Fighting' and 'Arrest' turned out
to be the worst performers, with AUC scores close to or below 0.5 across the
models. The findings of [21] point out that models struggle with instances of
explosion and shooting because smoke is a common element in both.
While analysing the performance of the models, it is also useful to compare how
well each model distinguishes clips of normal incidents from any anomaly. To
gauge this, we investigate the ROC curves of the different models for the
'Normal' class specifically. This measure is all the more important given that
the 'Normal' class makes up 75.25% of the total data present in the training set;
since this is a large portion of the dataset, the ability to distinguish between
normal events and the rest of the anomaly classes serves as a cardinal metric.
The results of this analysis are exhibited in Figure 9. The evaluation results
indicate that VGG19 and DenseNet121 outperform the other architectures, achieving
AUC scores of 0.76 and 0.77, respectively. This suggests their efficacy in
effectively discriminating between normal and anomalous occurrences.

5 Conclusion
The main motive of this study was to utilize and compare different frameworks for
real-time anomaly detection on surveillance camera footage snippets from the UCF
crime dataset. We employed four models, namely DenseNet121, VGG19, ResNet50, and
MobileNetV2, and observed that, training time aside, DenseNet121 achieves a much
better performance on the metrics used to gauge all the models. The principal aim
of these findings is to contribute to the existing body of literature on the
application of deep-learning techniques to real-life situations.

The general structure of our framework follows the principles of Convolutional
Neural Networks, thereby proposing a model that works upon weakly-labelled
training videos. Since DenseNet121 has the highest AUC score, it is deemed the
best model at correctly identifying positive and negative classes. Inferring from
the ROC curves, we can conclude that DenseNet121 performs the best, followed by
MobileNetV2. ResNet50 and VGG19 do not clearly outperform each other; rather,
each outshines the others on selected anomalies, for instance ResNet50 on the
'Abuse' category.

Building on the aforementioned point, different models perform better or worse on
different classes of anomalies. 'Arson' performs well across all models, with an
AUC score of 0.82 in DenseNet121 and MobileNetV2, whereas 'Fighting' and 'Arrest'
have below-average AUC scores across all models.

6 Scope For Future Work
The research done in this paper is limited to the UCF crime dataset. As noted in
[21], the UCF crime dataset should focus more on the crime scene frames rather
than having weakly labelled anomaly and no-anomaly clippings; this creates
limitations for the accurate training of our models. UCF-Crime's testing set
comprises 92.4% normal frames and 7.6% abnormal ones [22], which reiterates the
need for evaluation metrics better suited to unbalanced datasets. In addition,
the UCF dataset only accounts for 13 anomaly classes and does not train the model
on how to detect normality patterns in the clippings.
Additionally, due to resource constraints, the models could only be trained for a
single epoch, which may have resulted in erratic accuracy across the different
classes and models; this can be mitigated by training for multiple epochs to
reduce the errors. The next stepping stone to further our study would be to
refine our dataset. This can be achieved by integrating our visual data with
sensory data such as audio and infrared to improve evaluation accuracy. Issues
pertaining to object detection in dimly lit areas are another avenue to be
considered. Furthermore, considering the significant variations in performance
across different classes on specific models, there are openings for further
research in crafting more tailored models. Drawing inspiration from the discussed
architectures, there is room to explore the incorporation of additional elements
such as LSTMs to capture and leverage the temporal information present in CCTV
footage. This approach aims to enhance the adaptability of models to diverse
scenarios and improve overall predictive accuracy.

References
[1] Business Insights, F.: CCTV camera market size, growth: Global report
[2022-2029] (2023). https://www.fortunebusinessinsights.com/cctv-camera-market-107115

[2] Malik, A.A.: Urbanization and crime: A relational analysis (2016).
https://api.semanticscholar.org/CorpusID:22424407

[3] Ansari, S., Verma, A., Dadkhah, K.: Crime rates in India. International
Criminal Justice Review 25 (2015). https://doi.org/10.1177/1057567715596047

[4] Mandalapu, V., Elluri, L., Vyas, P., Roy, N.: Crime prediction using machine
learning and deep learning: A systematic review and future directions. IEEE
Access (2023)

[5] Mena, J.: Machine learning forensics for law enforcement, security, and
intelligence (2011). https://api.semanticscholar.org/CorpusID:113740871

[6] Nguyen, M.T., Truong, L.H., Tran, T.T., Chien, C.-F.: Artificial intelligence
based data processing algorithm for video surveillance to empower industry 3.5.
Computers & Industrial Engineering 148, 106671 (2020)

[7] Sharma, N., Jain, V., Mishra, A.: An analysis of convolutional neural
networks for image classification. Procedia Computer Science 132, 377-384 (2018).
https://doi.org/10.1016/j.procs.2018.05.198. International Conference on
Computational Intelligence and Data Science

[8] Hussain, M., Bird, J., Faria, D.: A study on CNN transfer learning for image
classification (2018)

[9] Ansari, S., Verma, A., Dadkhah, K.M.: Crime rates in India: A trend analysis.
International Criminal Justice Review 25(4), 318-336 (2015).
https://doi.org/10.1177/1057567715596047

[10] Rohit Malpan, M.C.: Impact of CCTV surveillance on crime (2021)

[11] Kowshik, D.Y.R.D. Shoeb: Real time crime detection using deep learning (2023)

[12] Shukla, V., Singh, G., Shah, P.: Automatic alert of security threat through
video surveillance system (2013)

[13] J, N.T., Thinakaran, K.: Detection of crime scene objects using deep
learning techniques. In: 2023 International Conference on Intelligent Data
Communication Technologies and Internet of Things (IDCIoT), pp. 357-361 (2023).
https://doi.org/10.1109/IDCIoT56793.2023.10053440

[14] Real-world Anomaly Detection in Surveillance Videos.
https://www.crcv.ucf.edu/projects/real-world/

[15] Shorten, C., Khoshgoftaar, T.M.: A survey on image data augmentation for
deep learning. J. Big Data 6(1) (2019)

[16] Huang, G., Liu, Z., Maaten, L., Weinberger, K.: Densely connected
convolutional networks (2017). https://doi.org/10.1109/CVPR.2017.243

[17] Hasija, S., Peddaputha, A., Hemanth, M.B., Sharma, S.: Video anomaly
classification using DenseNet feature extractor. In: Tiwari, R., Pavone, M.F.,
Ravindranathan Nair, R. (eds.) Proceedings of International Conference on
Computational Intelligence, pp. 347-357. Springer, Singapore (2023)

[18] He, K., Zhang, X., Ren, S., Sun, J.: Deep Residual Learning for Image
Recognition (2015)

[19] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: MobileNetV2:
Inverted Residuals and Linear Bottlenecks (2019)

[20] Bengio, Y.: Practical recommendations for gradient-based training of deep
architectures (2012)

[21] Dua, A., Kalra, B., Bhatia, A., Madan, M., Dhull, A., Gigras, Y.: Crime
alert through smart surveillance using deep learning techniques. In: Proceedings
of the 4th International Conference on Information Management & Machine
Intelligence, pp. 1-8 (2022)

[22] Caetano, F., Carvalho, P., Cardoso, J.S.: Unveiling the performance of video
anomaly detection models - a benchmark-based review. Intelligent Systems with
Applications 18, 200236 (2023). https://doi.org/10.1016/j.iswa.2023.200236
