A R T I C L E  I N F O

Editor: Mustafa Matalgah

Keywords: Surveillance system; Detecting crime scene; Deep learning; ST; DRNN

A B S T R A C T

Surveillance system research is now experiencing great expansion. Surveillance cameras installed in public locations such as offices, hospitals, schools, roads, and other locations can be utilised to capture important activities and movements for event prediction, online monitoring, goal-driven analysis, and intrusion detection. This research proposes a novel technique for detecting crime scene violence in real time from video surveillance footage using deep learning architectures. The aim is to collect real-time crime scene video from the surveillance system, extract features using a spatio-temporal (ST) technique, and classify them with a Deep Reinforcement Neural Network (DRNN). The input video is processed and converted into video frames, from which the features are extracted and classified. The purpose is to detect signals of hostility and violence in real time, allowing abnormalities to be distinguished from typical patterns. To validate the system's performance, it is trained and tested on the large-scale UCF Crime anomaly dataset. The experimental results reveal that the suggested technique performs well on real-time datasets, with accuracy of 98%, precision of 96%, recall of 80%, and F-1 score of 78%.
1. Introduction
Applications in various areas, including crime prevention, automatic smart visual monitoring and road safety, demand considerable attention to anomaly detection in video surveillance. In recent decades, an enormous number of surveillance cameras have been installed in both private and public locations for effective real-time monitoring to prevent malfunctions and protect public safety [1]. Most cameras, however, offer just passive logging services and are not capable of active monitoring. The volume of this footage grows every minute, making it difficult for human specialists to comprehend and analyse it. Similarly, monitoring analysts have to wait hours for abnormal occurrences to be captured or seen before immediate reports can be made [2]. Because there are few anomalous events in the real
world, video anomaly detection is studied as a one-class problem, in which the model is trained on typical footage and a video is tagged as anomalous when odd patterns appear. All typical real-world monitoring events cannot be accumulated in one dataset. Different typical actions may thus deviate from the regular training events and may ultimately produce false alarms [3]. In contemporary human action recognition research, notably in video surveillance, violence detection has been a hot area. The classification of human activity in real time, nearly instantaneously after the action has occurred, is one of the challenges of human action recognition in general. This difficulty escalates when dealing with surveillance video for a number of reasons: the quality of surveillance footage is diminished, adequate lighting is not always guaranteed, and there is generally no contextual information that can be used to ease the detection of actions and the classification of violent versus non-violent behaviour. Furthermore, for violent scene detection to be helpful in real-world surveillance applications, the identification of violence must be fast in order to allow for prompt intervention and resolution. In addition to poor video quality, violence can occur in any setting at any time of day; therefore, a solution has to be robust enough to detect violence regardless of the conditions. Settings where violence detection can be applied to video surveillance include the interior and exterior of buildings, in traffic, or on police body cameras [4].
A major purpose of video surveillance is the detection of unusual situations such as traffic accidents, robberies, or illicit activity. Human operators and manual examination, prone to distraction and fatigue, are still required by most existing monitoring systems. As a result, effective computer vision techniques for automatically detecting video anomalies and violence are becoming increasingly relevant. Building algorithms that detect specific anomalous occurrences, such as violence detectors, fight action detectors, and traffic accident detectors, is a small step toward resolving anomaly detection. In recent years, video action recognition has received a lot of attention after achieving very promising results by leveraging the robustness of CNNs [5]. In most businesses and sectors, installing CCTVs for ongoing surveillance of people and their interactions is a widespread practice. Every day, nearly every person in a developed country with a population of millions is captured by a camera. Constant monitoring of these surveillance recordings by police to determine whether or not the occurrences are suspicious is practically impossible, as it necessitates a large workforce and their undivided attention. As a result, there is a growing demand for high-precision automation of this process. It is also vital to show which frame, and which parts of it, include unexpected activity, as this aids in determining whether the unusual activity is abnormal or suspicious. This will aid the concerned authorities in finding the underlying cause of anomalies while also saving the time and effort that would otherwise be spent manually searching the records. ARS is a real-time monitoring system that recognises and records evidence of offensive or disruptive behaviour in real time. Using a variety of deep learning models, this study seeks to detect and characterise high movement levels in a frame. Videos are divided into portions in this project, and a detection alert is raised in the event of a threat, displaying the suspicious behaviour at a specific point in time. The videos in this project are divided into two classes: threat and safe. Burglary, Abuse, Explosion, Fighting, Shooting, Shoplifting, Arson, Road Traffic Accidents, Robbery, Assault, Stealing and Vandalism are among the 12 uncommon actions we recognise. Detecting these irregularities would make people feel safer [6].
The contribution of this paper is as follows:
• To collect a real-time crime scene video dataset and process it into video frames for detecting abnormal activities.
• The converted video frames are processed with spatio-temporal analysis, which uses forward, backward, and bidirectional predictions to extract the features of video-based motion. The prediction errors are combined into a single image that depicts the motion of the sequence (a sketch of this step is given after this list).
• The extracted features are then classified using a Deep Reinforcement Neural Network (DRNN).
• The experimental results report accuracy, precision, recall and F-1 score for various real-time video surveillance datasets.
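As a rough illustration of the frame-conversion and prediction-error fusion step, the following Python sketch (OpenCV/NumPy) reads a clip, forms forward and backward frame-prediction errors, and averages them into a single motion image. The file name, frame size, and the simple difference-based "prediction" are illustrative assumptions, not the exact pipeline of this work.

```python
import cv2
import numpy as np

def motion_image(video_path, size=(224, 224)):
    """Fuse forward/backward prediction errors of a clip into one motion image."""
    cap = cv2.VideoCapture(video_path)
    frames = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        frames.append(cv2.resize(gray, size).astype(np.float32))
    cap.release()

    fused = np.zeros((size[1], size[0]), dtype=np.float32)
    for t in range(1, len(frames) - 1):
        fwd_err = np.abs(frames[t] - frames[t - 1])   # forward prediction error
        bwd_err = np.abs(frames[t] - frames[t + 1])   # backward prediction error
        fused += 0.5 * (fwd_err + bwd_err)            # bidirectional combination
    if len(frames) > 2:
        fused /= (len(frames) - 2)                    # average over the sequence
    return fused

# motion = motion_image("crime_clip.mp4")  # hypothetical file name
```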
The paper is organized as follows: related works are described in Section 2, the proposed methodology is detailed in Section 3, Section 4 presents the experimental analysis, and Section 5 concludes the paper.
2. Related works
In the sphere of public safety, the automated video surveillance system has become a hot focus of research. There has been a lot of research done on object movement identification and tracking. Artificial intelligence has also aided in the reduction of labour and the enhancement of surveillance efficiency. Several attempts have been made to partially or completely automate this labour with applications such as human activity recognition, event detection and behaviour analysis. The work in [7] utilised a Harris detector to extract important points and SIFT as a descriptor, then a BoVW to extract mid-level features, solved using the same method as visual categorization. The works [8,9] employed the Space-Time Interest Point (STIP) to distinguish facial emotions, human activities, and mouse activity with 83%, 80%, and 72% accuracy. To categorise video sequences, [10] combines the Difference of Gaussians [11] with PCA-SIFT (Principal Component Analysis SIFT) [12] and BoVW, concluding that the vocabulary size employed in BoVW is highly influenced by the complexity of the scenes classified. The majority of studies employ BoVW; however, [13] reported a comparison of BoVW with other descriptors. [14] compares the performance of descriptors such as HOF and HOG with variations in optical flow, using Lucas-Kanade, Horn-Schunck, and Farnebäck as optical flow methods. [15,16], one of the first attempts to use audio to detect violence, defined violence as events including gunfire, explosions, fighting, and yelling, whereas non-violent content was represented by audio segments with music and talking. Descriptors included energy entropy, short-time energy, ZCR, spectrum flux and roll-off, with a polynomial SVM classifier reaching an accuracy of 85.5%. Using Bag of Audio Words (BoAW), [17] utilized MFCC (Mel-Frequency Cepstral Coefficients) as an audio descriptor and dynamic Bayesian networks to obtain mid-level features. The main contribution of that research is the removal of video segmentation noise using BoAW. Based on Laptev's study [18], they provided BoVW utilising STIP (Space-Time Interest Point) as a descriptor and compared STIP-based BoVW with SIFT-based BoVW performance.
In this case, STIP outperformed the competition. Ullah et al. [19] proposed HueSTIP (Hue Space-Time Interest Points), a variant of STIP that counts pixel colours and recognises general activities for detecting conflicts. HueSTIP outperforms STIP, albeit at a larger computational cost. [20] used MoSIFT to distinguish fights and compared MoSIFT and STIP as descriptors with BoVW and SVM classifiers. In the experiments, two datasets were used: movies and hockey games. STIP beat MoSIFT in the hockey dataset, with 91.7% accuracy versus 90.9% for MoSIFT, but MoSIFT surpassed STIP in the movie dataset, with 89.5% accuracy compared to 44.5% for STIP [21]. The authors of [22] and [23], in contrast to [24], use localised detection rather than full-frame video processing. The study in [25] proposes, in particular, utilising the intrinsic location of anomalies and examining whether the use of spatiotemporal data might aid in the detection of abnormalities. They combine the model with a tube extraction module to narrow the scope of the investigation to a specified set of spatiotemporal coordinates. The authors favour human, in-hand annotation over computer vision-driven localization, which is one downside of this technique, as it is a time-consuming and tedious effort. In contrast, [26] extracts numerous activation boxes from a motion activation map that assesses the intensity of activity at each location to automatically determine all potential attention zones where fighting actions may occur. The authors cluster all localised proposals around the retrieved attention zones based on the geographical link between each pair of human proposals and activation boxes. It is worth noting that the study [27] primarily focuses on pinpointing the location of a fight in a public space; consequently, this approach is not suitable for a unified ADS. In fact, in untrimmed public video footage, occlusions, motion blur, lighting changes and other environmental alterations [28] are still difficult to deal with. As a result, we present an automatic yet effective attention region localization strategy based on background subtraction in this study. To begin, a robust background subtraction method is used to find attention/moving zones. Attention regions are then supplied to a 3D CNN action recognition system [29].
3. Proposed design for real-time violence detection in crime scene intelligent video surveillance systems
This section discusses the proposed model for implementing real-time violence detection in crime scenes for intelligent video surveillance systems. Feature extraction and classification are carried out with deep learning architectures: the video frame features are first extracted and then classified to detect abnormality in the crime scene surveillance footage. The overall proposed architecture is represented in Fig. 1; a minimal sketch of the classification stage is given below.
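Since the models in this work are built with the Keras API on TensorFlow (see Section 4), the following is a minimal, hypothetical sketch of a frame-level classification stage (normal vs. violent). The layer sizes and the simple CNN topology are illustrative assumptions, not the exact DRNN configuration of the paper.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_frame_classifier(input_shape=(224, 224, 3), num_classes=2):
    """Toy CNN classifier for individual video frames (illustrative only)."""
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.GlobalAveragePooling2D(),
        layers.Dense(128, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```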
Let $y_t$ represent a video image frame at time t, and hence a spatial entity, and let y be a 3-D volume consisting of spatio-temporal image frames. Each pixel in $y_t$ corresponds to a site s, which is indicated by $y_{st}$. Let $Y_t$ denote a random field and $y_t$ its realisation at time t. Thus, $y_{st}$ denotes a spatio-temporal coordinate of the grid (s, t). Let x stand for the segmentation of the video sequence y and $x_t$ for the segmented form of $y_t$. In the same way, $x_{st}$ indicates the label of a site in a frame. Assume that $X_t$ denotes the MRF from which $x_t$ is derived. The label field can be estimated for any feature extraction issue by maximising the posterior probability distributions in Eqs. (1) and (2):

$$\hat{x}_t = \arg\max_{x_t} P(X_t = x_t \mid Y_t = y_t) \tag{1}$$
where $\hat{x}_t$ indicates the calculated labels. The prior probability $P(Y_t = y_t)$ is constant, and hence the estimate reduces to Eq. (3):

$$\hat{x}_t = \arg\max_{x_t} P(Y_t = y_t \mid X_t = x_t, \theta)\, P(X_t = x_t, \theta) \tag{3}$$

where $\theta$ is the parameter vector for the clique potential function of $x_t$. Here, $\hat{x}_t$ is the MAP estimate. Eq. (3) consists of two parts: the prior probability $P(X_t = x_t, \theta)$ and the likelihood function $P(Y_t = y_t \mid X_t = x_t, \theta)$.

The prior probability $P(X_t = x_t, \theta)$ can be expressed as

$$P(X_t = x_t, \theta) = \frac{1}{z}\, e^{-\frac{U(x_t)}{T}} = \frac{1}{z}\, e^{-\frac{1}{T}\sum_{c\in C} V_c(x_t)} \tag{4}$$
where z is the partition function, given as $z = \sum_{x_t} e^{-\frac{U(x_t)}{T}}$, and $U(x_t)$ is the energy function. $V_c(x_t)$ denotes the clique potential function in the spatial realm; Eqs. (5) and (6) express it in terms of the MRF model bonding parameter $\alpha$:

$$V_c(x_t) = \begin{cases} +\alpha & \text{if } x_{st} = x_{qt} \\ -\alpha & \text{if } x_{st} \ne x_{qt} \end{cases} \tag{5}$$

$$V_{sc}(x_{st}, x_{qt}) = \begin{cases} +\alpha & \text{if } x_{st} \ne x_{qt} \text{ and } (s,t),(q,t)\in S \\ -\alpha & \text{if } x_{st} = x_{qt} \text{ and } (s,t),(q,t)\in S \end{cases} \tag{6}$$
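The clique potentials of Eqs. (5) and (6) and the Gibbs prior of Eq. (4) can be illustrated with a small NumPy sketch; the 4-neighbourhood, the binary label field, and the sign convention below follow Eq. (5) as printed and are assumptions for illustration.

```python
import numpy as np

def spatial_energy(labels, alpha=1.0):
    """Sum of pairwise clique potentials over horizontal/vertical neighbours."""
    same_h = labels[:, :-1] == labels[:, 1:]     # horizontally adjacent sites
    same_v = labels[:-1, :] == labels[1:, :]     # vertically adjacent sites
    energy = alpha * (same_h.sum() + same_v.sum())          # +alpha for equal labels
    energy -= alpha * ((~same_h).sum() + (~same_v).sum())   # -alpha for unequal labels
    return float(energy)

def gibbs_prior_unnormalised(labels, alpha=1.0, T=1.0):
    """exp(-U(x)/T), i.e. Eq. (4) up to the partition function z."""
    return np.exp(-spatial_energy(labels, alpha) / T)

# x = np.random.randint(0, 2, size=(8, 8))
# print(gibbs_prior_unnormalised(x))
```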
It is well known that if $X_t$ is an MRF, it satisfies the Markovianity property [20] in the spatial direction, as shown by Eq. (7):

$$P\big(X_{st} = x_{st} \mid X_{qt} = x_{qt},\ \forall q \in S,\ s \ne q\big) = P\big(X_{st} = x_{st} \mid X_{qt} = x_{qt},\ (q,t)\in \eta_{st}\big) \tag{7}$$

$$\begin{aligned}
\rho_i^{S,g}(t+1) &= \rho_i^{S,g}(t)\big(1 - \Pi_i^{g}(t)\big),\\
\rho_i^{E,g}(t+1) &= \rho_i^{S,g}(t)\,\Pi_i^{g}(t) + (1-\eta^{g})\,\rho_i^{E,g}(t),\\
\rho_i^{A,g}(t+1) &= \eta^{g}\,\rho_i^{E,g}(t) + (1-\alpha^{g})\,\rho_i^{A,g}(t),\\
\rho_i^{I,g}(t+1) &= \alpha^{g}\,\rho_i^{A,g}(t) + (1-\mu^{g})\,\rho_i^{I,g}(t),
\end{aligned}$$

$$V_{tec}(x_{st}, x_{er}) = \begin{cases} +\gamma & \text{if } x_{st} \ne x_{er},\ (s,t),(e,r)\in S,\ t \ne r,\ r \in \{(t-1),(t-2)\} \\ -\gamma & \text{if } x_{st} = x_{er},\ (s,t),(e,r)\in S,\ t \ne r,\ r \in \{(t-1),(t-2)\} \end{cases} \tag{8}$$
$$m_k = \frac{\sum_{i=k}^{k+l-1}\psi_i}{l}, \quad k = 1, \ldots, q - l + 1 \tag{9}$$

The one-point correlation function $k_t^{(1)}(x)$ calculates the predicted number of agents within a region A by Eq. (10):

$$E(|\gamma_t \cap A|) = \int_A k_t^{(1)}(x)\,dx \tag{10}$$

$$E(|\gamma_t \cap A_1|\,|\gamma_t \cap A_2|) = \int_{A_1}\!\int_{A_2} k_t^{(2)}(x_1, x_2)\,dx_2\,dx_1 + \int_{A_1\cap A_2} k_t^{(1)}(x)\,dx$$

The predicted product of the number of agents in area $A_1$ at time t and the number of agents in area $A_2$ at time $t+\Delta t$ is related to the spatio-temporal correlation function by Eq. (11):

$$E(|\gamma_t \cap A_1|\,|\gamma_{t+\Delta t}\cap A_2|) = \int_{A_1}\!\int_{A_2} k_{t,\Delta t}(x_1, x_2)\,dx_2\,dx_1 + \int_{A_1\cap A_2} k_{t,\Delta t}(x)\,dx \tag{11}$$
$$\frac{d}{dt}k_t(\eta) = (L^{\Delta} k_t)(\eta)$$

$$\frac{d}{dt}k_t^{(1)}(x) = -m\,k_t^{(1)}(x) - \int_{R_i} a^{-}(x-y)\,k_t^{(2)}(x,y)\,dy + \int_{R_i} a^{+}(x-y)\,k_t^{(1)}(y)\,dy \tag{12}$$

$$k_{t,\Delta t}^{OO}(x,y) = k_{\Delta t}^{OO}(x,y) + k_{\Delta t}^{O+}(x,y) + k_{\Delta t}^{-O}(x,y) + k_{\Delta t}^{-+}(x,y), \qquad k_{t,\Delta t}^{O}(x) = k_{\Delta t}^{O}(x) \tag{13}$$

$$n_i^{\text{eff}} = \sum_{g=1}^{N_G} (n_i^{g})^{\text{eff}}, \qquad (n_i^{g})^{\text{eff}} = \sum_{j}\big[(1-p^{g})\delta_{ij} + p^{g} R_{ji}^{g}\big]\, n_j^{g}$$

$$f(x) = 1 + \big(1 - e^{-\xi x}\big)$$

$$n_{j\to i}^{m,h}(t) = n_j^{h}\,\rho_j^{m,h}(t)\big[(1-p^{h})\delta_{ij} + p^{h} R_{ji}^{h}\big], \quad m \in \{A, I\} \tag{16}$$
The prior probability $P(X_t = x_t, \theta)$ follows the Gibbs distribution and is of the form of Eq. (17). The likelihood function $P(Y_t = y_t \mid X_t = x_t)$ can be expressed as Eq. (19):

$$P(Y_t = y_t \mid X_t = x_t) = P(y_t = x_t + n \mid X_t = x_t, \theta) = P(N = y_t - x_t \mid X_t = x_t, \theta) \tag{19}$$

where n is a realization of the Gaussian degradation process $N(\mu, \sigma)$. Thus, $P(Y_t = y_t \mid X_t = x_t)$ can be expressed as Eq. (20):

$$P(N = y_t - x_t \mid X_t, \theta) = \frac{1}{\sqrt{(2\pi)^{f}\det[k]}}\; e^{-\frac{1}{2}(y_t - x_t)^{T} k^{-1}(y_t - x_t)} \tag{20}$$

where k denotes the variance-covariance matrix, $\det[k]$ the determinant of matrix k, and f the number of features; the variance-covariance matrix k is given by Eq. (21):

$$k = \big[k_{ij}\big]_{f\times f} \tag{21}$$
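A minimal sketch of evaluating the Gaussian likelihood of Eq. (20) for an f-dimensional feature difference; the covariance matrix and feature vectors in the usage comment are illustrative assumptions.

```python
import numpy as np

def gaussian_likelihood(y_t, x_t, k):
    """Evaluate the multivariate Gaussian of Eq. (20) for diff = y_t - x_t."""
    diff = np.asarray(y_t, dtype=float) - np.asarray(x_t, dtype=float)
    f = diff.shape[0]
    k = np.asarray(k, dtype=float)
    norm = np.sqrt(((2 * np.pi) ** f) * np.linalg.det(k))   # normalising constant
    expo = -0.5 * diff @ np.linalg.inv(k) @ diff            # quadratic form
    return np.exp(expo) / norm

# Example with 3 features and an identity covariance matrix:
# print(gaussian_likelihood([1.0, 0.5, 0.2], [0.9, 0.4, 0.1], np.eye(3)))
```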
In the space-homogeneous case, $u_{t,\Delta}(x, y) = u_{t,\Delta}(x - y)$. The primary model's spatio-temporal cumulant can be expressed in terms of the auxiliary model's spatial cumulants as Eq. (23):

$$u_{t,\Delta t}(x) = u_{\Delta t}^{\infty}(x) + u_{\Delta t}^{O+}(x) + u_{\Delta t}^{-O}(x) + u_{\Delta t}^{-+}(x) \tag{23}$$

By applying this to the auxiliary model, we may obtain the perturbation expansion for the spatial correlation functions:

$$g_{t,\Delta t}(x) = g_{\Delta t}^{\infty}(x) + g_{\Delta t}^{O+}(x) + g_{\Delta t}^{-o}(x) + g_{\Delta t}^{-+}(x), \qquad \frac{d}{d\Delta t}\, h_{t,\Delta t} = H_{h,\Delta t}\big(q_{t+t}, h_{t,st}\big)$$

$$V_0(y) = \frac{1}{v_1}\big[r\Gamma A + v_1 Y\big] + \frac{v_2}{\beta\alpha\varphi_1}\big[\alpha\varphi_1 Q + v_3 U\big] + \frac{v_2 v_3 v_4}{a\beta\alpha\varphi_1}\, W \tag{25}$$

It is obvious that $V_0(0) = 0$ and $V_0(y) > 0$ for all $y > 0$. Moreover, from Eq. (26),

$$\dot{V}_0(y) = \frac{1}{v_1}\big[r\Gamma \dot{A} + v_1 \dot{Y}\big] + \frac{v_2}{\beta\alpha_1}\big[\alpha\varphi_1 \dot{Q} + v_3 \dot{U}\big] + \frac{v_2 v_1 v_4}{\alpha\beta\alpha_1}\,\dot{W}$$

$$= \frac{1}{v_1}\big[r\Gamma B(W)W - v_1 v_2 Y - r\Gamma\mu_2 A^2\big] + \frac{v_2}{\beta\alpha\varphi_1}\big[\alpha\varphi_1\beta Y + b_1\alpha\varphi_1 W - v_3 v_4 U\big] + \frac{v_2 v_1 v_4}{a\beta a\varphi_1}\big[aU - v_5 W\big]$$

$$= \left[\frac{\Gamma_1}{v_1}B(W) + \frac{v_1 b_1 v_2}{\beta} - \frac{v_2 v_2 v_5}{\alpha\beta\alpha_1}\right] W - \frac{r\mu_2}{v_1} A^2$$
$$Y^{*} = \frac{r\Gamma}{v_2}A^{*}, \quad M^{*} = \frac{(1-r)\Gamma}{\mu_M}A^{*}, \quad Q^{*} = \frac{v_1 v_4 v_5 R_0^{ode}}{a N_{egg}\,\alpha\varphi_1}A^{*}, \quad U^{*} = \frac{v_1 v_5 R_0^{ode}}{a N_{egg}}A^{*}, \quad W^{*} = \frac{v_1 R_0^{ode}}{N_{egg}}A^{*} \tag{26}$$

$$q^{*} = \frac{A^{+} - m}{A^{-}}, \qquad g^{*}(\xi) = q^{*}\,\frac{a^{+}(\xi) - q^{*} a^{-}(\xi)}{A^{+} - a^{+}(\xi) + q^{*} a^{-}(\xi)} \tag{27}$$
Using SVD, eigen-images and associated eigen-time courses can split the video sequence frames into two factors by Eq. (28):

$$X = U D V^{T} = \big(U D^{1/2}\big)\big(V D^{1/2}\big)^{T} = \tilde{U}\tilde{V}^{T} \tag{28}$$
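The SVD factorisation of Eq. (28) can be sketched in NumPy by stacking the frames as columns of X and absorbing $D^{1/2}$ into each factor; the frame shapes below are assumptions for illustration.

```python
import numpy as np

def eigen_images(frames):
    """frames: array of shape (num_frames, height, width)."""
    n, h, w = frames.shape
    X = frames.reshape(n, h * w).T               # pixels x frames data matrix
    U, d, Vt = np.linalg.svd(X, full_matrices=False)
    U_tilde = U * np.sqrt(d)                     # eigen-images, one per column
    V_tilde = Vt.T * np.sqrt(d)                  # eigen-time courses
    return U_tilde, V_tilde

# X is recovered as U_tilde @ V_tilde.T, so each frame is a weighted
# combination of the eigen-images with weights given by its time course.
```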
By applying this to our discrete-time Markov chain, we can convert it to the continuous-time differential Eq. (31):

$$P_i^{g} = \sum_{h=1}^{N_G}\sum_{j=1}^{N} \frac{n_j^{h}\big[(1-p^{h})\delta_{ij} + p^{h} R_{ji}^{h}\big]}{(n_i^{h})^{\text{eff}}}\; z^{g} k^{g} f_i\, C^{gh}\big(b^{A}\rho_j^{A,h} + b^{I}\rho_j^{I,h}\big) + O(\epsilon^{2}) \tag{31}$$
$$b^{m} = \ln\big[(1-\beta^{m})^{-1}\big], \quad m \in \{A, I\}, \qquad f_i = f\!\left(\frac{n_i^{\text{eff}}}{s_i}\right) \tag{32}$$

We then insert the above expression into $\Pi_i^{g}$, leading to Eq. (33):

$$\Pi_i^{g} = \sum_{h=1}^{N_G}\sum_{j=1}^{N}\Big[(M_1)_{ij}^{gh} + (M_2)_{ij}^{gh} + (M_3)_{ij}^{gh} + (M_4)_{ij}^{gh}\Big]\big(b^{A}\rho_j^{A,h} + b^{I}\rho_j^{I,h}\big) + O(\epsilon^{2}),$$

$$(M_1)_{ij}^{gh} = \delta_{ij}\,(1-p^{g})\, z^{g} k^{g} f_i\, C^{gh}\,\frac{(1-p^{h})\,n_j^{h}}{(n_i^{h})^{\text{eff}}}$$

$$(M_2)_{ij}^{gh} = (1-p^{g})\, z^{g} k^{g} f_i\, C^{gh}\,\frac{R_{ji}^{h}\, p^{h}\, n_j^{h}}{(n_i^{h})^{\text{eff}}}$$

$$(M_3)_{ij}^{gh} = p^{g} R_{ij}^{g}\, z^{g} k^{g} f_j\, C^{gh}\,\frac{(1-p^{h})\,n_j^{h}}{(n_j^{h})^{\text{eff}}}$$

$$(M_4)_{ij}^{gh} = \sum_{k=1}^{N} p^{g} R_{ik}^{g}\, z^{g} k^{g} f_k\, C^{gh}\,\frac{R_{jk}^{h}\, p^{h}\, n_j^{h}}{(n_k^{h})^{\text{eff}}} \tag{33}$$
Using the above definitions, the associated differential equations assume the form of Eq. (34):

$$\dot{\rho}_i^{E,g} = -\eta^{g}\rho_i^{E,g} + \sum_{h=1}^{N_G}\sum_{j=1}^{N} M_{ij}^{gh}\big(b^{A}\rho_j^{A,h} + b^{I}\rho_j^{I,h}\big)$$

$$\dot{\rho}_i^{A,g} = \eta^{g}\rho_i^{E,g} - \alpha^{g}\rho_i^{A,g}$$

$$\dot{\rho}_i^{I,g} = \alpha^{g}\rho_i^{A,g} - \mu^{g}\rho_i^{I,g} \tag{34}$$

where the tensor M is given by $M = \sum_{\ell=1}^{4} M_{\ell}$. Defining the vector $(\rho^{g})^{T} = (\rho^{E,g}, \rho^{A,g}, \rho^{I,g})$, the above system of differential equations can be rewritten as Eq. (35):

$$\dot{\rho}^{g} = \sum_{h=1}^{N_g}\big(F^{gh} - V^{gh}\big)\,\rho^{h} \tag{35}$$
Pixels corresponding to the $FM_t$ component of the original frame $y_t$ create the VOP, while regions composing the foreground part of the temporal segmentation are recognised as moving object regions.

Time steps are denoted by t = 1, 2, 3, etc. $S_t \in \mathcal{S}$ denotes the Markov state at time t, where $\mathcal{S}$ is the state space (a countable set). The action at time t is denoted by $A_t \in \mathcal{A}$, where $\mathcal{A}$ refers to the action space (a countable set). $R_t \in \mathcal{D}$ is the reward at time t, where $\mathcal{D}$ is a countable subset of $\mathbb{R}$ (representing the numerical feedback served by the environment, along with the state, at each time step t), by Eq. (37):

$$p(s', r \mid s, a) = P\big[(S_{t+1} = s', R_{t+1} = r) \mid S_t = s, A_t = a\big] \tag{37}$$
$\gamma \in [0, 1]$ is known as the discount factor, utilized to discount rewards when accumulating them, as in Eq. (38). In RL, we usually deal with a time-homogeneous Markov chain, where the transition probability is independent of t, by Eqs. (40)–(42):
$$P\big[R_{t+1} = r' \mid R_t = r\big] = P\big[R_t = r' \mid R_{t-1} = r\big] \tag{40}$$

$$P = p_1 p_2 \ldots p_n, \text{ where } p_i \in \{H, P\},\ \forall\, 1 \le i \le n$$

$$\mathcal{B} = \{P = p_1 p_2 \ldots p_n \mid p_i \in \{H, P\},\ \forall\, 1 \le i \le n,\ n \in \mathbb{N}\}$$

$$\mathcal{G} = \{G = (x_i, y_i) \mid x_i, y_i \in \Re,\ 1 \le i \le n\}$$

$$C : \mathcal{B} \to \mathcal{G}$$
The Bellman equation for Q-learning is defined with Q(s, a) denoting the value of taking action a in state s, r(s, a) denoting the reward obtained in state s after performing action a, and s′ denoting the environment state attained by the agent after performing action a in state s: $Q(s, a) = r(s, a) + \gamma\cdot\max_{a'} Q(s', a')$. The update rule is given by Eq. (45):

$$Q(s, a) = (1 - \alpha)\cdot Q(s, a) + \alpha\cdot\big(r(s, a) + \gamma\cdot\max_{a'} Q(s', a')\big) \tag{45}$$
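A minimal tabular Q-learning sketch following the update rule of Eq. (45); the environment interface (reset/step returning next state, reward and a done flag) and the epsilon-greedy exploration are placeholder assumptions, not the DRNN used in this work.

```python
import numpy as np

def q_learning(env, n_states, n_actions, episodes=500,
               alpha=0.1, gamma=0.9, epsilon=0.1):
    """Tabular Q-learning with an assumed env.reset()/env.step(a) interface."""
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            # epsilon-greedy action selection
            if np.random.rand() < epsilon:
                a = np.random.randint(n_actions)
            else:
                a = int(np.argmax(Q[s]))
            s_next, r, done = env.step(a)
            # Eq. (45): Q(s,a) <- (1-alpha) Q(s,a) + alpha (r + gamma max_a' Q(s',a'))
            Q[s, a] = (1 - alpha) * Q[s, a] + alpha * (r + gamma * np.max(Q[s_next]))
            s = s_next
    return Q
```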
The state space $\mathcal{S}$ will contain $\frac{4^{n}-1}{3}$ states, i.e. $\mathcal{S} = \{s_1, s_2, \ldots, s_{\frac{4^{n}-1}{3}}\}$. A state $s_{i_k} \in \mathcal{S}$ ($i_k \in [1, \frac{4^{n}-1}{3}]$) reached by the agent at a given moment, after it has visited states $s_1, s_{i_1}, s_{i_2}, \ldots, s_{i_{k-1}}$, is a terminal state if the length of the current sequence is n − 1, i.e. k = n − 2, by Eq. (46):

$$\delta(s_j, a_k) = s_{4j-3+k}, \quad \forall k \in [1, 4],\ \forall j,\ 1 \le j \le \frac{4^{n-1}-1}{3} \tag{46}$$
The reward earned immediately after completing action a from state s, plus the value of following the optimum policy thereafter, equals the value of Q*, as given by Eq. (47):

$$Q^{*}(s, a) = r(s, a) + \gamma\cdot\max_{a'} Q^{*}(\delta(s, a), a') \tag{47}$$

Let $Q_n(s, a)$ be the agent's evaluation of $Q^{*}(s, a)$ at the n-th training episode. We demonstrate that $\lim_{n\to\infty} Q_n(s, a) = Q^{*}(s, a)$, $\forall s \in \mathcal{S},\ a \in \delta(s, a)$, given by Eqs. (48) and (49):

$$0 \le r(s, a) \le \frac{(n-1)\cdot(n-2)}{2}, \quad \forall s \in \mathcal{S},\ a \in \delta(s, a) \tag{48}$$

$$0 \ge E \ge \sum_{i=1}^{n-2}\sum_{j=i+2}^{n}(-1) = -\frac{(n-1)\cdot(n-2)}{2} \tag{49}$$
During the training process, the estimates Q(s, a) for each state-action pair ($\forall s \in \mathcal{S},\ a \in \delta(s, a)$) increase, i.e., from Eq. (50):

$$Q_{n+1}(s, a) \ge Q_n(s, a), \quad \forall n \in \mathbb{N}^{*} \tag{50}$$

To prove that the inequalities of Eqs. (51) and (52) hold for n + 1 also, i.e.

$$Q_{n+1}(s, a) \ge Q_n(s, a) \tag{51}$$

$$Q_{n+1}(s, a) - Q_n(s, a) = \gamma\cdot\big(\max_{a'} Q_n(s', a') - \max_{a'} Q_{n-1}(s', a')\big)$$

$$Q_{n+1}(s, a) - Q_n(s, a) \ge \gamma\cdot\big(\max_{a'} Q_{n-1}(s', a') - \ldots$$
Because all rewards are positive, $Q^{*}(s, a) \ge 0$. Since $Q_0(s, a) = 0$ and $Q^{*}(s, a) \ge 0$, we obtain that $Q_0(s, a) \le Q^{*}(s, a)$, as given by Eq. (53):

$$Q_n(s, a) \le Q^{*}(s, a), \qquad Q_{n+1}(s, a) - Q^{*}(s, a) \ge \gamma\cdot\big(\max_{a'} Q^{*}(s', a') - \max_{a'} Q^{*}(s', a')\big) = 0 \tag{53}$$

The expected return starting from state s, taking action a, and then following policy $\pi$ is the action-value function $q_\pi(s, a)$, given by Eq. (54):

$$q_\pi(s, a) = E_\pi\big[G_t \mid S_t = s, A_t = a\big] = E_\pi\big[R_{t+1} + \gamma\, q_\pi(S_{t+1}, A_{t+1}) \mid S_t = s, A_t = a\big] \tag{54}$$
To simplify notation, we define $R_s^{a} = E[R_{t+1} \mid S_t = s, A_t = a]$. The relationship between $v_\pi(s)$ and $q_\pi(s, a)$ is shown by Eqs. (55) and (56):

$$v_\pi(s) = \sum_{a\in\mathcal{A}} \pi(a\mid s)\, q_\pi(s, a) \tag{55}$$

$$q_\pi(s, a) = R_s^{a} + \gamma\sum_{s'\in\mathcal{S}} P_{ss'}^{a}\, v_\pi(s') \tag{56}$$

Expressing $q_\pi(s, a)$ in terms of $v_\pi(s)$ in the expression of $v_\pi(s)$, the Bellman equation for $v_\pi$ is given by Eq. (57):

$$v_\pi(s) = \sum_{a\in\mathcal{A}}\pi(a\mid s)\left(R_s^{a} + \gamma\sum_{s'\in\mathcal{S}} P_{ss'}^{a}\, v_\pi(s')\right) \tag{57}$$

The Bellman equation relates the state-value function of one state to that of other states. The Bellman equation for $q_\pi(s, a)$ is shown in Eq. (58):

$$q_\pi(s, a) = R_s^{a} + \gamma\sum_{s'\in\mathcal{S}} P_{ss'}^{a}\sum_{a'\in\mathcal{A}}\pi(a'\mid s')\, q_\pi(s', a') \tag{58}$$
By maximising $q^{*}(s, a)$ over all actions, we can discover an optimal policy immediately using the theorem in Eq. (59):

$$v^{*}(s) = \max_\pi v_\pi(s), \qquad \pi' \ge \pi \iff v_{\pi'}(s) \ge v_\pi(s),\ \forall s$$

$$\pi^{*}(a\mid s) = \begin{cases} 1 & \text{if } a = \arg\max_{a\in\mathcal{A}} q^{*}(s, a) \\ 0 & \text{otherwise} \end{cases} \tag{59}$$
The remaining issue is determining how to obtain the optimal value function. The Bellman optimality equation is used to solve this question. The optimal state-value and action-value functions are linked as shown by Eq. (60):

$$v^{*}(s) = \max_{a} q^{*}(s, a), \qquad q^{*}(s, a) = R_s^{a} + \gamma\sum_{s'\in\mathcal{S}} P_{ss'}^{a}\, v^{*}(s') \tag{60}$$

A Bellman optimality equation for $v^{*}$ and $q^{*}$ is generated by expressing $q^{*}(s, a)$ in terms of $v^{*}(s)$ in the expression of $v^{*}(s)$, by Eqs. (61) and (62):

$$v^{*}(s) = \max_{a}\left(R_s^{a} + \gamma\sum_{s'\in\mathcal{S}} P_{ss'}^{a}\, v^{*}(s')\right) \tag{61}$$

$$q^{*}(s, a) = R_s^{a} + \gamma\sum_{s'\in\mathcal{S}} P_{ss'}^{a}\,\max_{a'} q^{*}(s', a')$$

$$v_{k+1}(s) = \sum_{a\in\mathcal{A}}\pi(a\mid s)\left(R_s^{a} + \gamma\sum_{s'\in\mathcal{S}} P_{ss'}^{a}\, v_k(s')\right) \tag{62}$$
Using synchronous updates, for each state s, let Eqs. (63) and (64):
$$\pi'(s) := \arg\max_{a\in\mathcal{A}} q_\pi(s, a) = \arg\max_{a\in\mathcal{A}}\left(R_s^{a} + \gamma\sum_{s'\in\mathcal{S}} P_{ss'}^{a}\, v_\pi(s')\right) \tag{63}$$

$$q_\pi(s, \pi'(s)) = \max_{a\in\mathcal{A}} q_\pi(s, a) \ge q_\pi(s, \pi(s)) = v_\pi(s), \qquad v_{\pi'}(s) \ge v_\pi(s) \tag{64}$$

The iteration also improves the value function, $v_{\pi'}(s) \ge v_\pi(s)$, because, by Eqs. (65) and (66),

$$\begin{aligned}
v_\pi(s) &\le q_\pi(s, \pi'(s)) = E_{\pi'}\big[R_{t+1} + \gamma\, v_\pi(S_{t+1}) \mid S_t = s\big]\\
&\le E_{\pi'}\big[R_{t+1} + \gamma\, q_\pi(S_{t+1}, \pi'(S_{t+1})) \mid S_t = s\big]\\
&\le E_{\pi'}\big[R_{t+1} + \gamma R_{t+2} + \gamma^{2} q_\pi(S_{t+2}, \pi'(S_{t+2})) \mid S_t = s\big]
\end{aligned} \tag{65}$$

$$q_\pi(s, \pi'(s)) = \max_{a\in\mathcal{A}} q_\pi(s, a) = q_\pi(s, \pi(s)) = v_\pi(s) \tag{66}$$

The ∞-norm, i.e. the largest difference between state values, is utilized to evaluate the distance between two state-value functions u and v, given by Eq. (67):

$$\| u - v\|_\infty = \max_{s\in\mathcal{S}}\big|u(s) - v(s)\big| \tag{67}$$
where $v_\pi$ is a column vector with one entry per state, from Eq. (69):

$$\begin{bmatrix} v_\pi(1) \\ \vdots \\ v_\pi(n) \end{bmatrix} = \begin{bmatrix} R_1^{\pi} \\ \vdots \\ R_n^{\pi} \end{bmatrix} + \gamma \begin{bmatrix} P_{11}^{\pi} & \cdots & P_{1n}^{\pi} \\ \vdots & \ddots & \vdots \\ P_{n1}^{\pi} & \cdots & P_{nn}^{\pi} \end{bmatrix} \begin{bmatrix} v_\pi(1) \\ \vdots \\ v_\pi(n) \end{bmatrix} \tag{69}$$
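Because Eq. (69) is linear in $v_\pi$, policy evaluation can be sketched as a direct linear solve, $(I - \gamma P^{\pi})v_\pi = R^{\pi}$; the reward vector and transition matrix in the usage comment are made-up examples.

```python
import numpy as np

def evaluate_policy(R_pi, P_pi, gamma=0.9):
    """Solve (I - gamma * P_pi) v_pi = R_pi for the state values of Eq. (69)."""
    n = len(R_pi)
    return np.linalg.solve(np.eye(n) - gamma * P_pi, R_pi)

# R_pi = np.array([1.0, 0.0, -1.0])
# P_pi = np.array([[0.5, 0.5, 0.0],
#                  [0.1, 0.8, 0.1],
#                  [0.0, 0.3, 0.7]])
# print(evaluate_policy(R_pi, P_pi))
```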
Define the Bellman equation backup operator by Eqs. (70) and (71):

$$T^{\pi}(v) = R^{\pi} + \gamma P^{\pi} v$$

$$\| T^{\pi}(u) - T^{\pi}(v)\|_\infty = \|(R^{\pi} + \gamma P^{\pi} u) - (R^{\pi} + \gamma P^{\pi} v)\|_\infty = \|\gamma P^{\pi}(u - v)\|_\infty \le \gamma\,\| u - v\|_\infty \tag{70}$$

$$v^{*} = \max_{a\in\mathcal{A}}\big(R^{a} + \gamma P^{a} v^{*}\big), \qquad T^{*}(v) = \max_{a\in\mathcal{A}}\big(R^{a} + \gamma P^{a} v\big) \tag{71}$$
Value iteration will converge because this operator is likewise a γ-contraction map.
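A value-iteration sketch applying the Bellman optimality backup $T^{*}$ of Eq. (71) until the ∞-norm change falls below a tolerance; the per-action reward and transition arrays are illustrative assumptions.

```python
import numpy as np

def value_iteration(R, P, gamma=0.9, tol=1e-6):
    """R: (n_actions, n_states) rewards, P: (n_actions, n_states, n_states) transitions."""
    n_states = R.shape[1]
    v = np.zeros(n_states)
    while True:
        q = R + gamma * P @ v          # one Bellman backup per action, shape (n_actions, n_states)
        v_new = q.max(axis=0)          # T*(v): maximise over actions
        if np.max(np.abs(v_new - v)) < tol:   # stop when the infinity-norm change is small
            return v_new
        v = v_new
```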
4. Performance analysis
We go over the experimental methodology and outcomes in detail in this section. All of the model training and implementation tests were carried out on a setup comprising an Intel Core-i7 8700K processor running at 3.70 GHz and an Nvidia GeForce GTX 1080 Ti (11 GB memory) GPU. The necessary Python code is built, and the neural network models are constructed using the Keras API with Tensorflow-GPU as the backend.
Table 1
Processing of real-time violence detection in crime scenes for various datasets. For each dataset (Crowd Violence Dataset, UCSD dataset, Violence flow dataset), the table shows the real-time crime scene video, the processed video frames, the spatio-temporal feature extraction of the video frames, and the classified video frames using DRNN (the cell contents are image frames in the original).
Table 2
Comparative analysis of real-time violence detection in crime scenes for various datasets, reporting Accuracy, Precision, Recall and F1-score for each technique.
UCSD dataset: Abnormal events arise from the circulation of non-pedestrian entities on sidewalks or from abnormal pedestrian motion patterns. Bikers, skaters, small carts, and individuals walking on a sidewalk or in the grass that surrounds it are all common occurrences, and there were a few cases of people in wheelchairs as well. The anomalies were not staged for the purposes of constructing the dataset; they occurred naturally. The footage was separated into two groups, each representing a different setting, and the video for each scene was broken into multiple portions of about 200 frames apiece.
Violence flow dataset: A database containing real-world video footage of mob violence, as well as common benchmark standards
for determining violent/non-violent classification and detecting violence outbreaks. There are 246 videos in the data collection.
Table 1 above shows real-time violence detection in crime scenes for the various datasets: the processed video frames, the frames after the proposed feature extraction, and the classified video frames.
4.1.2.1. Accuracy. It is defined as the ratio of correctly predicted values to the total number of predictions, as in Eq. (72):

$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN} \tag{72}$$

4.1.2.2. Recall. It is defined as the ratio of correctly predicted positive values to all actual positive values, as in Eq. (73):

$$Recall = \frac{TP}{TP + FN} \tag{73}$$

4.1.2.3. Precision. It provides the ratio of true positive values to the total predicted positive values, as stated in Eq. (74):

$$Precision = \frac{TP}{TP + FP} \tag{74}$$

4.1.2.4. F1-Score. It provides the harmonic mean of precision and recall, as stated in Eq. (75):

$$F1\text{-}Score = 2 \times \frac{Precision \times Recall}{Precision + Recall} \tag{75}$$
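The metrics of Eqs. (72)–(75) can be computed directly from confusion-matrix counts, as in the short sketch below; the counts in the usage comment are placeholders, not the reported results.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and F1-score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# print(classification_metrics(tp=80, tn=90, fp=5, fn=10))
```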
Table 2 above shows the comparative analysis for the proposed real-time crime scene datasets. The analysis has been carried out for the various datasets in terms of accuracy, precision, recall and F-1 score, comparing the proposed approach with 3D CRNN and MIL-based techniques.
Fig. 2. Comparative analysis for Crowd Violence Dataset in terms of (a) accuracy, (b) precision, (c) recall and (d) F-1 score.
Fig. 3. Comparative analysis for UCSD dataset in terms of (a) accuracy, (b) precision, (c) recall and (d) F-1 score.
Figs. 2, 3 and 4 above show the comparative analysis for the crowd violence dataset, UCSD dataset and violence flow dataset in terms of (a) accuracy, (b) precision, (c) recall and (d) F-1 score. From this comparative analysis, the proposed ST_DRNN technique obtains accuracy of 98%, precision of 96%, recall of 80% and F-1 score of 78% for the crowd violence dataset; for the UCSD dataset, accuracy is 97%, precision is 95%, recall is 79% and F-1 score is 76%; the violence flow dataset obtained accuracy of 95%, precision of 96%, recall of 80% and F-1 score of 77%. Hence, the proposed technique obtained optimal results in detecting violence.
Fig. 4. Comparative analysis for Violence flow dataset in terms of (a) accuracy, (b) precision, (c) recall and (d) F-1 score.
5. Conclusion
In this research, the proposed framework provides a novel technique for crime scene violence detection. The real-time crime scene dataset has been collected and converted into video frames, from which abnormal activities are detected. The features of video-based motion are recovered by backward, forward and bidirectional predictions over the converted video frames, based on spatio-temporal analysis. The prediction errors are thresholded and compiled into a single image that depicts the sequence's motion. The collected features are then categorised with the help of a Deep Reinforcement Neural Network (DRNN). For various real-time video surveillance datasets, the experimental results report accuracy, precision, recall, and F-1 score. A limitation of the proposed video surveillance approach is its reliance on the intrinsic location of anomalies and on spatiotemporal data for detecting abnormalities: the model is combined with a tube extraction module that narrows the scope of the investigation to a specified set of spatiotemporal coordinates.
Ethical approval
This article does not contain any studies with animals performed by any of the authors.
Funding
No Funding.
Informed consent
Informed consent was obtained from all individual participants included in the study.
Code availability
Data availability
References
[1] Mabrouk AB, Zagrouba E. Abnormal behavior recognition for intelligent video surveillance systems: a review. Expert Syst Appl 2018;91:480–91.
[2] Kardas K, Cicekli NK. SVAS: surveillance video analysis system. Expert Syst Appl 2017;89:343–61.
[3] Wang Y, Shuai Y, Zhu Y, Zhang J, An P. Jointly learning perceptually heterogeneous features for blind 3D video quality assessment. Neurocomputing 2019;332:
298–304 (ISSN 0925-2312).
[4] Tzelepis C, Galanopoulos D, Mezaris V, Patras I. Learning to detect video events from zero or very few video examples. Image Vis Comput 2016;53:35–44 (ISSN
0262-8856).
[5] Fakhar B, Kanan HR, Behrad A. Learning an event-oriented and discriminative dictionary based on an adaptive label-consistent K-SVD method for event
detection in soccer videos. J Vis Commun Image Represent 2018;55:489–503 (ISSN 1047-3203).
[6] Luo X, Li H, Cao D, Yu Y, Yang X, Huang T. Towards efficient and objective work sampling: recognizing workers' activities in site surveillance videos with two-stream convolutional networks. Autom Constr 2018;94:360–70 (ISSN 0926-5805).
[7] Shao L, Cai Z, Liu L, Lu K. Performance evaluation of deep feature learning for RGB-D image/video classification. Inf Sci 2017;385:266–83 (ISSN 0020-0255).
[8] Wang D, Tang J, Zhu W, Li H, Xin J, He D. Dairy goat detection based on Faster R-CNN from surveillance video. Comput Electron Agric 2018;154:443–9 (ISSN
0168-1699).
[9] Ahmed SA, Dogra DP, Kar S, Roy PP. Surveillance scene representation and trajectory abnormality detection using aggregation of multiple concepts. Expert Syst
Appl 2018;101:43–55 (ISSN 0957-4174).
[10] Arunnehru J, Chamundeeswari G, Prasanna Bharathi S. Human action recognition using 3D convolutional neural networks with 3D motion cuboids in
surveillance videos. Procedia Comput Sci 2018;133:471–7.
[11] Karri JB. Classification of crime scene images using the computer vision and deep learning techniques. Int J Mod Trends Sci Technol 2022;8(02):01–5.
[12] Ovaskainen O, Somervuo P, Finkelshtein D. A general mathematical method for predicting spatio-temporal correlations emerging from agent-based models. J R
Soc Interface 2020;17(171):20200655.
[13] Zhang XP, Chen Z. An automated video object extraction system based on spatiotemporal independent component analysis and multiscale segmentation.
EURASIP J Adv Signal Process 2006;2006:1–22.
[14] Arenas A, Cota W, Gómez-Gardenes J, Gómez S, Granell C, Matamalas JT, et al. A mathematical model for the spatiotemporal epidemic spreading of COVID19.
MedRxiv 2020.
[15] Mann Manyombe ML, Mbang J, Tsanou B, Bowong S, Lubuma J. Mathematical analysis of a spatio-temporal model for the population ecology of anopheles
mosquito. Math Methods Appl Sci 2020;43(6):3524–55.
[16] Mudgal M, Punj D, Pillai A. Suspicious action detection in intelligent surveillance system using action attribute modelling. J Web Eng 2021:129–46.
[17] Hidayat F. Intelligent video analytic for suspicious object detection: a systematic review. In: Proceedings of the international conference on ICT for smart society (ICISS). IEEE; 2020. p. 1–8.
[18] Vosta S, Yow KC. A cnn-rnn combined structure for real-world violence detection in surveillance cameras. Appl Sci 2022;12(3):1021.
[19] Ullah W, Ullah A, Haq IU, Muhammad K, Sajjad M, Baik SW. CNN features with bi-directional LSTM for real-time anomaly detection in surveillance networks.
Multimed Tools Appl 2021;80(11):16979–95.
[20] Mathur R, Chintala T, Rajeswari D. Detecting criminal activities and promoting safety using deep learning. In: Proceedings of the international conference on advances in computing, communication and applied informatics (ACCAI). IEEE; 2022. p. 1–8.
[21] Saad K, El-Ghandour M, Raafat A, Ahmed R, Amer E. A Markov model-based approach for predicting violence scenes from movies. In: Proceedings of the 2nd international mobile, intelligent, and ubiquitous computing conference (MIUCC). IEEE; 2022. p. 21–6.
[22] Feng JC, Hong FT, Zheng WS. Mist: Multiple instance self-training framework for video anomaly detection. In: Proceedings of the IEEE/CVF conference on
computer vision and pattern recognition. IEEE; 2021. p. 14009–18.
[23] Wu P, Liu J, Shi Y, Sun Y, Shao F, Wu Z, et al. Not only Look, but also Listen: learning multimodal violence detection under weak supervision. In: Proceedings of
the European conference on computer vision. Springer; 2020. p. 322–39.
[24] Zhong JX, Li N, Kong W, Liu S, Li TH, Li G. Graph convolutional label noise cleaner: train a plug-and-play action classifier for anomaly detection. In: Proceedings
of the IEEE/CVF conference on computer vision and pattern recognition. IEEE; 2019. p. 1237–46.
[25] Tian Y., Pang G., Chen Y., Singh R., Verjans J.W., Carneiro G., Weakly-supervised video anomaly detection with robust temporal feature magnitude learning.
arXiv 2021, arXiv:2101.10030.
[26] Dubey S., Boragule A., Jeon, M., 3D ResNet with ranking loss function for abnormal activity detection in videos. Proceedings of the 2019 international
conference on control, automation and information sciences (ICCAIS), Chengdu, China, 24–27 2019; 1–6.
[27] Ji H., Zeng X., Li H., Ding W., Nie X., Zhang Y., Xiao Z., Human abnormal behavior detection method based on T-TINY-YOLO. Proceedings of the 5th
international conference on multimedia and image processing, Nanjing, China, 10–12 2020. 1–5.
[28] Hu X, Dai J, Huang Y, Yang H, Zhang L, Chen W, et al. A weakly supervised framework for abnormal behavior detection and localization in crowded scenes.
Neurocomputing 2020;383:270–81.
[29] Mojarad R, Attal F, Chibani A, Amirat Y. A hybrid context-aware framework to detect abnormal human daily living behavior 2020;19–24:1–8.
Kishan Bhushan Sahay holds an M.Tech in Power Systems. His areas of interest are power system restructuring, optimization techniques in power systems, and the application of artificial intelligence in power systems and other fields.
Dr. Bhuvaneswari Balachander received her Ph.D. in Medical Image Processing in the year 2020. Since 2012, she has been working as an Associate Professor at Saveetha School of Engineering, Chennai. She has around 11 years of teaching experience. Her research areas include Medical Image Processing, Image Fusion, Microwave Antenna Design and Resonator Design. She has received a number of awards, including the Best Women Faculty award, Best Researcher award and Best Paper award.
Dr. B. Jagadeesh is working as an Associate Professor in the Department of Electronics and Communication Engineering at Gayatri Vidya Parishad College of Engineering (Autonomous). He received his B.E. degree in Electronics and Communication Engineering with distinction from G.I.T.A.M., his M.E. degree from Andhra University, Visakhapatnam, and his Ph.D. from JNTUA, Anantapuramu. He has been in the teaching profession for more than 21 years.
Dr. G. Anand Kumar is working as an Assistant Professor in the Department of Electronics and Communication Engineering at Gayatri Vidya Parishad College of Engineering (Autonomous). He received his B.Tech degree in Electronics and Communication Engineering with distinction, his M.Tech degree in Digital Electronics and Communication Systems with distinction from J.N.T.U.K. Kakinada, and his Ph.D. from Andhra University. He has been in the teaching profession for more than 17 years.
Dr. Ravi Kumar received his Ph.D. from the Department of Electronics and Communication Engineering, Jaypee University of Engineering & Technology, Guna, in 2013, with a specialization in MIMO Communication Systems and Smart Antennas. He has been serving in the field of teaching at the Jaypee Institute of Engineering since August 2005. He is a Senior Member of the Institute of Electrical and Electronics Engineers (IEEE).
Dr. L. Rama Parvathy is working as a Professor at Saveetha School of Engineering, Chennai. She has 22 years of academic and teaching experience, including 8 years of research. Her research interests are Cloud Computing, Evolutionary Computing, Multi-Objective Optimization, and Data Analytics. She has published in many journals indexed in Scopus and Web of Science.