Results
CHAPTER 4
Size – Constant size and varying size and very small size of
target
For each dataset, the results of the target detection and target tracking
algorithms are obtained and their performance is compared with that of a few
existing techniques. Both image-based and video-based performance measures
are considered for analysis. The proposed Running Gaussian background
subtraction technique for target detection is compared with the Temporal frame
differencing, Running average and Temporal median filtering techniques, and
the proposed Adaptive background mean shift tracking algorithm for target
tracking is compared with the traditional mean shift tracking technique and
the continuously adaptive mean shift (CAMShift) technique. When the CAMShift
and traditional mean shift algorithms are run on the same input data used for
the proposed Adaptive background mean shift tracking, both algorithms lose
the target, as shown in Figure 4.30. This is because they are designed for a
stable background and cannot cope with the frame rate of 30 FPS. To make
these two algorithms work on the dynamic platform of aerial videos, their
input frame rate is reduced to 20 FPS, a pseudo arrangement in which the
background is reasonably stable for detecting and tracking the target. The
input frame rates for the algorithms used in target tracking are therefore as
follows.
[Figure: evaluation methodology. Blocks: Input Data, Ground Truth, Object Detection, Object Tracking, Metrics]
Table 4.2 Target model and size (Target model and Target PDF columns not recoverable)

Video data   Target        Window size    Initial target size   Minimum target size   Maximum target size
Data 2       A car         999 pixels     299 pixels            342 pixels            650 pixels
Data 4       A bus         1008 pixels    636 pixels            490 pixels            1054 pixels
Data 5       A car         5180 pixels    2652 pixels           1221 pixels           2862 pixels
Data 7       Two persons   744 pixels     228 pixels            180 pixels            610 pixels
Data 8       A boat        4029 pixels    324 pixels            255 pixels            399 pixels
Data 9       A person      2520 pixels    784 pixels            644 pixels            936 pixels
Here x is the reference data and y is the result data; $\bar{x}$ and $\bar{y}$
are their means, $\sigma_x^2$ and $\sigma_y^2$ are their variances, and
$\sigma_{xy}$ is the covariance of x and y.
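$$\sigma_x^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2 \tag{4.1}$$

$$\sigma_y^2 = \frac{1}{n-1}\sum_{i=1}^{n}(y_i-\bar{y})^2 \tag{4.2}$$

$$\sigma_{xy} = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y}) \tag{4.3}$$

$$LM = \frac{2\bar{x}\bar{y}}{\bar{x}^2+\bar{y}^2} \tag{4.4}$$

$$CM = \frac{2\sigma_x\sigma_y}{\sigma_x^2+\sigma_y^2} \tag{4.5}$$

$$CM = \frac{\sigma_{xy}}{\sigma_x\sigma_y} \tag{4.6}$$

$$SSIM = \frac{\sigma_{xy}}{\sigma_x\sigma_y}\cdot\frac{2\bar{x}\bar{y}}{\bar{x}^2+\bar{y}^2}\cdot\frac{2\sigma_x\sigma_y}{\sigma_x^2+\sigma_y^2} \tag{4.7}$$

$$MSSIM = \frac{1}{k}\sum_{j=1}^{k} SSIM(x_j, y_j) \tag{4.8}$$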
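The structural measures above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used to produce the reported results: the function names, the non-overlapping 8 × 8 block layout for MSSIM, and the choice to skip blocks where a denominator is zero are all hypothetical.

```python
import numpy as np

def similarity_components(x, y):
    """Return (luminance, contrast, correlation, SSIM) following (4.1)-(4.7)."""
    x = np.asarray(x, dtype=np.float64).ravel()
    y = np.asarray(y, dtype=np.float64).ravel()
    mx, my = x.mean(), y.mean()
    vx = x.var(ddof=1)  # 1/(n-1) normalisation, as in (4.1)
    vy = y.var(ddof=1)  # as in (4.2)
    cxy = ((x - mx) * (y - my)).sum() / (x.size - 1)  # covariance (4.3)
    lm = 2 * mx * my / (mx ** 2 + my ** 2)            # (4.4)
    cm = 2 * np.sqrt(vx * vy) / (vx + vy)             # (4.5)
    corr = cxy / np.sqrt(vx * vy)                     # (4.6)
    return lm, cm, corr, lm * cm * corr               # product form (4.7)

def mssim(ref, res, block=8):
    """Mean SSIM over non-overlapping blocks (4.8); block size is an assumption."""
    ref = np.asarray(ref, dtype=np.float64)
    res = np.asarray(res, dtype=np.float64)
    scores = []
    for r in range(0, ref.shape[0] - block + 1, block):
        for c in range(0, ref.shape[1] - block + 1, block):
            xb = ref[r:r + block, c:c + block]
            yb = res[r:r + block, c:c + block]
            # Assumption: skip blocks where the index is undefined (0/0).
            if xb.std() == 0 or yb.std() == 0:
                continue
            if xb.mean() == 0 and yb.mean() == 0:
                continue
            scores.append(similarity_components(xb, yb)[3])
    return float(np.mean(scores)) if scores else float("nan")
```

Because (4.4) to (4.7) carry no stabilising constants, flat or all-zero regions make the ratios undefined; how such regions were handled in the experiments is not stated, so the skip rule here is only one possible choice.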
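The noise-based metrics, namely peak signal-to-noise ratio (PSNR) and mean
square error (MSE), are calculated by the following equations, with M × N as
the size of the image, (i, j) as the pixel position, and $x_{ij}$, $x'_{ij}$
as the corresponding pixel values of the two images being compared.

$$MSE = \frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}\left(x_{ij}-x'_{ij}\right)^2 \tag{4.9}$$

$$PSNR = 10\log_{10}\frac{255^2}{MSE} \tag{4.10}$$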
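As a quick illustration of (4.9) and (4.10), the following sketch computes both quantities for two equally sized 8-bit grayscale images. The function names are hypothetical, the logarithm is taken as base 10 (the usual decibel convention), and returning infinity for identical images is an assumption, since the text does not say how a zero MSE is treated.

```python
import numpy as np

def mse(a, b):
    """Mean square error between two equally sized images, as in (4.9)."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(np.mean((a - b) ** 2))

def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio as in (4.10); peak=255 assumes 8-bit data."""
    err = mse(a, b)
    if err == 0:
        return float("inf")  # assumption: identical images give unbounded PSNR
    return float(10.0 * np.log10(peak ** 2 / err))
```

For example, two images that differ by 5 grey levels at every pixel have an MSE of 25 and a PSNR of about 34.15.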
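The detection rate (DR) is the ratio of the number of frames in which the
target is detected to the total number of frames in the video.

$$DR = \frac{\text{number of frames in which the target is detected}}{\text{total number of frames in the video}} \tag{4.11}$$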
True Positive (TP): both the ground truth and the result agree that the
target is present, or the bounding boxes of the result and the ground truth
coincide.

$$\text{Completeness (CS)} = \frac{TP}{TP+FN} \tag{4.12}$$

$$\text{False Alarm Rate (FAR)} = \frac{FP}{TP+FP} \tag{4.13}$$

$$\text{Similarity (SS)} = \frac{TP}{TP+FP+FN} \tag{4.14}$$

$$\text{Accuracy (ACC)} = \frac{TP+TN}{TF} \tag{4.15}$$
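The frame-level metrics above reduce to simple ratios of counts, which the sketch below makes explicit. It rests on two assumptions that the text does not spell out: TF in (4.15) is read as the total number of evaluated frames (TP + FP + FN + TN), and a ratio with a zero denominator is reported as NaN. The names `FrameCounts`, `tracking_metrics` and `detection_rate` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class FrameCounts:
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives

def _ratio(num, den):
    # Assumption: an undefined ratio is reported as NaN rather than raising.
    return num / den if den else float("nan")

def tracking_metrics(c):
    """Completeness, false alarm rate, similarity and accuracy, (4.12)-(4.15)."""
    total = c.tp + c.fp + c.fn + c.tn  # assumption: TF = total frames
    return {
        "CS": _ratio(c.tp, c.tp + c.fn),
        "FAR": _ratio(c.fp, c.tp + c.fp),
        "SS": _ratio(c.tp, c.tp + c.fp + c.fn),
        "ACC": _ratio(c.tp + c.tn, total),
    }

def detection_rate(detected_flags):
    """Fraction of frames in which the target was detected."""
    flags = list(detected_flags)
    return _ratio(sum(1 for f in flags if f), len(flags))
```

With 80 true positives, 5 false positives, 10 false negatives and 5 true negatives, for instance, the accuracy under this reading is 85/100 = 0.85.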
Figure 4.2 (a) is the initial frame of the input video of dataset 1.
There are a number of objects (cows) in the scene. The white cow is the
target to be detected and tracked. Figure 4.2 (b) and (c) show a sample
background model and background update, respectively. Sample background
subtraction results are shown in Figure 4.2 (d) and (e).
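The three stages named here (background model, background update, background subtraction) can be illustrated with the textbook running Gaussian average. This is only a generic sketch and is not claimed to be the proposed technique: the learning rate `alpha`, the threshold factor `k`, the initial variance, the selective update of background pixels only, and the class name are all assumptions, and frames are taken to be grayscale arrays.

```python
import numpy as np

class RunningGaussianBackground:
    """Per-pixel Gaussian background model with running mean and variance."""

    def __init__(self, first_frame, alpha=0.05, k=2.5, init_var=25.0):
        frame = np.asarray(first_frame, dtype=np.float64)
        self.mean = frame.copy()                    # background model
        self.var = np.full(frame.shape, init_var)   # per-pixel variance
        self.alpha = alpha                          # learning rate (assumed)
        self.k = k                                  # threshold in std devs (assumed)

    def apply(self, frame):
        """Return the foreground mask for `frame` and update the model."""
        frame = np.asarray(frame, dtype=np.float64)
        diff = frame - self.mean
        # Background subtraction: pixels far from the model are foreground.
        mask = np.abs(diff) > self.k * np.sqrt(self.var)
        # Background update: only pixels classified as background are updated.
        bg = ~mask
        self.mean[bg] += self.alpha * diff[bg]
        self.var[bg] = (1 - self.alpha) * self.var[bg] + self.alpha * diff[bg] ** 2
        return mask
```

A typical loop would construct the model from the first frame and call `apply` on each subsequent frame, using the returned mask as the detection result for that frame.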
Figure 4.2 (a) Initial frame for input video 1 (b) Background model of sample frame 4 (c) Background update of sample frame 4 (d) Background subtraction of sample frame 4 (e) Background subtraction of sample frame 59.
At the initial stage, when the UAV is hovering and the target is also
static, the PSNR value for the detection techniques is high. After a few
frames the target starts moving and the UAV starts following it. The motion
of both sensor and target results in ego noise, and the efficiency therefore
drops to some extent. Although the metric values go down, the value for the
proposed technique remains higher than that of the other techniques. The
PSNR plot in Figure 4.3 shows the variation in the metric.
[Plot: PSNR (%) versus number of frames; legend: TFD, RABS, TMF, AGBS]

Figure 4.4 (a) Initial frame for input video 2 (b) Background model of sample frame 8 (c) Background update of sample frame 9 (d) Background subtraction of sample frame 3 (e) Background subtraction of sample frame 21.
Figure 4.4 (a) is the initial frame of the input video of dataset 2. A car
on the flyover is the target to be tracked. The car is in motion from the
initial frame to the end, and the motion is curvilinear. Figure 4.4 (b) and
(c) show a sample background model and background update, respectively.
Sample background subtraction results are shown in Figure 4.4 (d) and (e).
Right from the initial frame, the UAV is in regular motion, tracking the
car, which moves along a curvilinear path at a regular speed. Since the
regular motion pattern of both sensor and target introduces no appreciable
noise, the PSNR value for the detection techniques is high.
[Plot: PSNR (%) versus number of frames; legend: TFD, RABS, TMF, AGBS]
moves on a highway at a regular speed with no other intrusions. The video is
a high-altitude angular video taken by a UAV at a military base in Orissa,
India, for experimental purposes. The result metrics of the video are listed
in Table 4.5.
Figure 4.6 (a) Initial frame for input video 3 (b) Background model of sample frame 6 (c) Background update of sample frame 6 (d) Background subtraction of sample frame 3 (e) Background subtraction of sample frame 49.
Figure 4.6 (a) is the initial frame of the input video of dataset 3. A car
on a highway is the target to be tracked. The car is in motion from the
initial frame to the end, and the motion is linear and regular. Figure 4.6
(b) and (c) show a sample background model and background update,
respectively. Sample background subtraction results are shown in Figure 4.6
(d) and (e).
[Plot: PSNR (%) versus number of frames; legend: TFD, RABS, TMF, RGBG]
Figure 4.8 (a) Initial frame for input video 4 (b) Background model of sample frame 6 (c) Background update of sample frame 9 (d) Background subtraction of sample frame 3 (e) Background subtraction of sample frame 21.
Figure 4.8 (a) is the initial frame of the input video of dataset 4. A bus
on a bridge with a cluttered background is the target to be detected. The
bus is in motion from the initial frame to the end. The motion is linear and
regular, with some interference that partially hides the target in a few
frames. Figure 4.8 (b) and (c) show a sample background model and background
update, respectively. Sample background subtraction results are shown in
Figure 4.8 (d) and (e).
From the initial frame, the UAV is hovering and then moves slowly, tracking
the bus, which moves linearly at a regular speed. Even though both sensor
and target follow a regular motion pattern, the target is occluded because
the background is cluttered. The PSNR value for the detection techniques is
therefore lower. Nevertheless, the proposed technique outperforms the other
techniques by a notable margin in noise rejection.
[Plot: PSNR (%) versus number of frames; legend: TFD, RABS, TMF, RGBS]
Figure 4.10 (a) is the initial frame of the input video of dataset 5. A car
against a relatively smooth background is the target to be detected. The car
is in motion from the initial frame to the end, and the motion is linear and
regular. Figure 4.10 (b) and (c) show a sample background model and
background update, respectively. Sample background subtraction results are
shown in Figure 4.10 (d) and (e).
Figure 4.10 (a) Initial frame for input video 5 (b) Background model of sample frame 8 (c) Background update of sample frame 9 (d) Background subtraction of sample frame 3 (e) Background subtraction of sample frame 21.
From the initial frame, the UAV is in motion, tracking the car, which moves
linearly at a regular speed. The movement of the sensor is irregular,
producing ego noise, so the PSNR value for the detection techniques is
lower. Nevertheless, the proposed technique outperforms the other techniques
by a notable margin in noise rejection. Figure 4.11 shows the variation in
PSNR.
[Plot: PSNR (%) versus number of frames; legend: TFD, RABS, TMF, RGBS]
Figure 4.12 (a) Initial frame for input video 6 (b) Background model of sample frame 8 (c) Background update of sample frame 6 (d) Background subtraction of sample frame 4 (e) Background subtraction of sample frame 26.
Figure 4.12 (a) is the initial frame of the input video of dataset 6. A car
is the target to be detected from a high altitude. The car is in motion from
the initial frame to the end, and the motion is linear and regular. Figure
4.12 (b) and (c) show a sample background model and background update,
respectively. Sample background subtraction results are shown in Figure 4.12
(d) and (e). From the initial frame, the UAV is in motion, tracking the car,
which moves linearly at a regular speed. The movement of the sensor is
regular but the altitude is very high. The background is cluttered and
multiple targets are present in the scene, so the PSNR value for the
detection techniques is relatively lower. Figure 4.13 shows the variation of
the PSNR metric for the different algorithms.
[Plot: PSNR (%) versus number of frames; legend: TFD, RABS, TMF, RGBS]
Figure 4.14 (a) Initial frame for input video 7 (b) Background model of sample frame 48 (c) Background update of sample frame 39 (d) Background subtraction of sample frame 46 (e) Background subtraction of sample frame 63.
Figure 4.14 (a) is the initial frame of the input video of dataset 7. Two
persons walking on the seashore are the targets to be detected. The targets
are in motion from the initial frame to the end, and the motion is linear
and regular. Figure 4.14 (b) and (c) show a sample background model and
background update, respectively. Sample background subtraction results are
shown in Figure 4.14 (d) and (e).
From the initial frame, the UAV is in motion, tracking the target, which
moves linearly at a regular speed. The movement of the sensor is regular but
the background is changing. The size of the target varies and there is
variation in altitude. The waves are also detected as objects because of
their distinct intensity. Nevertheless, the PSNR value for the detection
techniques is relatively good because the target is nearer. Figure 4.15
shows the variation of PSNR for the different algorithms.
[Plot: PSNR (%) versus number of frames; legend: TFD, RABS, TMF, RGBS]
Video data 8 consists of a boat sailing in the sea as the target to be
tracked. The target is stationary for a few frames and later moves at a
regular speed in a defined direction. The background is smooth and regular.
Figure 4.16 (a) Initial frame for input video 8 (b) Background model of sample frame 4 (c) Background update of sample frame 3 (d) Background subtraction of sample frame 4 (e) Background subtraction of sample frame 63.
Figure 4.16 (a) is the initial frame of the input video of dataset 8. A boat
in the sea is the target to be detected. The motion is linear and regular.
Figure 4.16 (b) and (c) show a sample background model and background
update, respectively. Sample background subtraction results are shown in
Figure 4.16 (d) and (e).
At the initial frame, the UAV is hovering and the boat is also static, so
the PSNR is high in the initial stages. After a few frames, as the boat
moves, noise due to motion is introduced. However, because of the even and
smooth background, the boat is detected efficiently with a higher PSNR.
Figure 4.17 shows the variation of PSNR for the different algorithms.
[Plot: PSNR (%) versus number of frames; legend: TFD, RABS, TMF, RGBG]
Figure 4.18 (a) Initial frame for input video 9 (b) Background model of sample frame 48 (c) Background update of sample frame 38 (d) Background subtraction of sample frame 16 (e) Background subtraction of sample frame 143.
Figure 4.18 (a) is the initial frame of the input video of dataset 9. A
walking person is the target to be detected. The motion is nonlinear and
irregular. Figure 4.18 (b) and (c) show a sample background model and
background update, respectively. Sample background subtraction results are
shown in Figure 4.18 (d) and (e).
[Plot: PSNR (%) versus number of frames; legend: TFD, RABS, TMF, RGBS]
Figure 4.20 (a) Initial frame for input video 10 (b) Background model of sample frame 16 (c) Background update of sample frame 6 (d) Background subtraction of sample frame 3 (e) Background subtraction of sample frame 49.
Figure 4.20 (a) is the initial frame of the input video of dataset 10. An
isolated bike on the bridge is the target to be detected. The motion is
nonlinear and irregular. Figure 4.20 (b) and (c) show a sample background
model and background update, respectively. Sample background subtraction
results are shown in Figure 4.20 (d) and (e).
From the starting frame, the target is moving and the UAV is following it.
Thus PSNR is high at the initial stages. The target is very small, which
complicates its detection. Figure 4.21 shows the variation of the PSNR
metric for the different algorithms.
[Plot: PSNR (%) versus number of frames; legend: TFD, RABS, TMF, RGBS]
[Bar chart: LM per video (Video 1 to Video 10); legend: TFD, RABS, TMF, RGBS]

[Bar chart: CS per video (Video 1 to Video 10); legend: TFD, RABS, TMF, RGBS]

[Bar chart: SS per video; legend: TFD, RABS, TMF, RGBS]

[Bar chart: MSSIM per video (Video 1 to Video 10); legend: TFD, RABS, TMF, RGBS]

[Bar chart: MSE per video (Video 1 to Video 10); legend: TFD, RABS, TMF, RGBS]
Figure 4.26 Mean square error for target detection techniques

[Bar chart: PSNR (%) per video (Video 1 to Video 10); legend: TFD, RABS, TMF, RGBS]
Figure 4.27 Peak signal to noise ratio for target detection techniques

[Bar chart: DR (%) per video (Video 1 to Video 10); legend: TFD, RABS, TMF, RGBS]
Figure 4.28 Detection rate for target detection techniques
Figure 4.29 (a) Transition model between two sample frames (b) ABMST result for video 1 sample frame 5 (c) ABMST result for video 1 sample frame 16 (d) ABMST result for video 1 sample frame 82 (e) ABMST result for video 1 sample frame 170
In Figure 4.29 (b), the target can be seen facing the camera, and the UAV is
hovering at that time. Later the target changes its direction (and thus its
shape) and starts moving. The UAV also starts following the target, causing
a transition in every element of the frame (Figure 4.29 (a)). Nevertheless,
the tracker is able to track the target despite the changes in orientation,
speed and size.
Figure 4.31 (a) Transition model between two sample frames (b) ABMST result for video 2 sample frame 6 (c) ABMST result for video 2 sample frame 92 (d) ABMST result for video 2 sample frame 115 (e) ABMST result for video 2 sample frame 149

Figure 4.32 (a) Transition model between two sample frames (b) ABMST result for video 3 sample frame 12 (c) ABMST result for video 3 sample frame 116 (d) ABMST result for video 3 sample frame 200 (e) ABMST result for video 3 sample frame 270

Figure 4.33 (a) Transition model between two sample frames (b) ABMST result for video 4 sample frame 2 (c) ABMST result for video 4 sample frame 46 (d) ABMST result for video 4 sample frame 102 (e) ABMST result for video 4 sample frame 129

Figure 4.34 (a) Transition model between two sample frames (b) ABMST result for video 5 sample frame 6 (c) ABMST result for video 5 sample frame 66 (d) ABMST result for video 5 sample frame 146 (e) ABMST result for video 5 sample frame 209
A car on the bridge is the target to be tracked. There are many cars similar
to the target. Initially the UAV captures the scene from a higher altitude;
after 170 frames it lowers its altitude, as can be observed in Figure 4.35
(e). The tracking algorithm still manages to track the target of interest
amidst the multiple objects and the changing altitude. These changes disturb
the color model of the target kernel, yet the algorithm tracks the target
accurately. This is done by adaptively compensating the background pixels
based on the histogram density spectrum. The results of the performance
metrics for the three target tracking algorithms (TMST, CAMST and ABMST) for
video data 6 are shown in Table 4.18. These values are obtained for video 6
at 20 FPS for CAMST and TMST and 30 FPS for ABMST. The regular motion and
the clear, distinguishing background of the target help in achieving higher
DOST and TR.
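To give a feel for what compensating background pixels in a color histogram can look like, the sketch below implements the background-weighted histogram known from the kernel-based tracking literature. It is offered only as a related, well-known technique and is not claimed to be the ABMST compensation step, whose details are not given here. The function name, the 16-bin grayscale histogram, and the use of a surrounding region as the background sample are assumptions.

```python
import numpy as np

def background_weighted_histogram(target_pixels, background_pixels, bins=16):
    """Down-weight target histogram bins that are prominent in the background."""
    edges = np.linspace(0, 256, bins + 1)  # assumes 8-bit intensity values
    q, _ = np.histogram(target_pixels, bins=edges)
    o, _ = np.histogram(background_pixels, bins=edges)
    q = q / max(q.sum(), 1)  # normalised target model
    o = o / max(o.sum(), 1)  # normalised background model
    nonzero = o[o > 0]
    if nonzero.size == 0:
        return q  # no background sample: leave the target model unchanged
    o_star = nonzero.min()  # smallest non-zero background bin
    weights = np.ones(bins)
    weights[o > 0] = np.minimum(o_star / o[o > 0], 1.0)
    weighted = q * weights
    total = weighted.sum()
    return weighted / total if total > 0 else q
```

Bins that are common in the surrounding background contribute less to the target model, so a mean shift step driven by this histogram is pulled less strongly toward background-coloured pixels.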
Figure 4.35 (a) Transition model between two sample frames (b) ABMST result for video 6 sample frame 6 (c) ABMST result for video 6 sample frame 105 (d) ABMST result for video 6 sample frame 170 (e) ABMST result for video 6 sample frame 201

Figure 4.36 (a) Transition model between two sample frames (b) ABMST result for video 7 sample frame 35 (c) ABMST result for video 7 sample frame 86 (d) ABMST result for video 7 sample frame 192 (e) ABMST result for video 7 sample frame 327
The target is a boat sailing in the sea. The background is smooth, so the
detection rate is higher. The results of the performance metrics for TMST,
CAMST and ABMST for video data 8 are shown in Table 4.20. These values are
obtained for video 8 at 20 FPS for CAMST and TMST and 30 FPS for ABMST.
Figure 4.37 (a) Transition model between two sample frames (b) ABMST result for video 8 sample frame 21 (c) ABMST result for video 8 sample frame 71 (d) ABMST result for video 8 sample frame 91 (e) ABMST result for video 8 sample frame 102

Figure 4.38 (a) Transition model between two sample frames (b) ABMST result for video 9 sample frame 35 (c) ABMST result for video 9 sample frame 135 (d) ABMST result for video 9 sample frame 361 (e) ABMST result for video 9 sample frame 382

Figure 4.39 (a) Transition model between two sample frames (b) ABMST result for video 10 sample frame 16 (c) ABMST result for video 10 sample frame 113 (d) ABMST result for video 10 sample frame 233 (e) ABMST result for video 10 sample frame 400
[Bar chart: CS (%) per video (Video 1 to Video 10); legend: TMST, CAMST, ABMST]

[Bar chart: FAR (%) per video (Video 1 to Video 10); legend: TMST, CAMST, ABMST]
Figure 4.41 False alarm rates for target tracking techniques

[Bar chart: SS (%) per video (Video 1 to Video 10); legend: CAMST, TMST, ABMST]

[Bar chart: ACC (%) per video (Video 1 to Video 10); legend: TMST, CAMST, ABMST]
Figure 4.43 Accuracy measure for target tracking techniques

[Bar chart: DOST (%) per video (Video 1 to Video 10); legend: TMST, CAMST, ABMST]

[Bar chart: TR (%) per video (Video 1 to Video 10); legend: TMST, CAMST, ABMST]
Figure 4.45 Tracking rate of target tracking techniques