stationary_objects_detection
Filip Leiding
1 Introduction  4
1.1 Purpose  4
1.2 Related work  4
3 Implementation  20
3.1 System overview  20
3.1.1 Graphical user interface  21
3.1.2 Background subtraction  25
3.1.3 Foreground sampling  26
3.1.4 Detection and notification  26
3.1.5 Saving to file  27
3.2 Program and libraries  27
3.2.1 Python  28
3.2.2 Tkinter  28
3.2.3 OpenCV  28
3.2.4 MKVmerge  29
4 Results  30
4.1 Test Results  30
4.2 Comparison Between Languages  30
5 Conclusions  34
5.1 Conclusion  34
5.2 Future Work  35
Chapter 1
Introduction
1.1 Purpose
This thesis is a project executed together with CodeMill AB [16], an IT consulting firm located in Umeå in northern Sweden. CodeMill specialises in the field of media and broadcast. The thesis was formed out of a demand for automatic notification when certain objects have been left in improper places. Such situations can occur, for example, at loading docks where newly delivered cargo needs to be taken care of, or at emergency exits where objects block the way and must be removed immediately for safety reasons. The last and most critical example is abandoned bags at airports, which can potentially be very dangerous. The questions to be answered in this thesis are as follows:
3. How can one filter out static data from non-static data?
over time. Now the focus lies on improving the existing models to fit the modern needs of automated surveillance. Many researchers, such as S.C. Cheung et al. [2], Medha Bhargava et al. [11] and Sen-Ching S. Cheung et al. [7], are trying to improve the responsiveness and accuracy of background models in crowded areas such as train stations or rush-hour traffic sites. Another modern need is the ability to distinguish objects from their shadows, to reduce the number of faulty detections when the environment lighting changes or when an object's shadow enters the scene but the object itself does not. KaewTraKulPong et al. [12] and Thanarat Horprasert et al. [13] have been working on this kind of improvement. Other researchers have tried to improve the overall function of background subtraction with new and different approaches. Rubén Heras Evangelio et al. [10] use a dual background model together with a finite-state machine to detect static objects, and Antonio Albiol et al. [5] use spatio-temporal maps to detect stationary objects within predefined areas of a scene. Thi Thi Zin et al. [6] and YingLi Tian et al. [8] have done research on object detection suited for real surveillance applications where security is the main purpose. In the paper Wallflower: Principles and practice of background maintenance, K. Toyama et al. [3] developed a new background subtraction algorithm which they claim was one of the best algorithms to use, at least when the paper was written.
Chapter 2
Video content analysis is the study of visual changes and events in video streams and video files. A video file is a collection of images put together in a specified sequence and shown at a certain speed to create the illusion of motion to the human eye. To analyse video, one must go down to the image level and perform the analysis on one image at a time. Video content analysis, and object detection in particular, is done at an even smaller scale than the image, namely the pixel level. To be able to understand the concepts in this paper, an introduction to digital image representation follows.
In the digital world an image is represented by pixels, and inside a computer an image is stored as a 2-dimensional matrix where each element is one pixel. The colourspace of the image determines the dimensionality of the representation. If the image is in grayscale, each pixel contains a single value from 0 to 255 (in an 8-bit colourspace per channel) (see Figure 2.2 on the facing page). In RGB (Red, Green, Blue) colourspace, each pixel has three values, one per channel, which makes the image representation 3-dimensional. As in the grayscale representation, each channel in RGB colourspace can take a value between 0 and 255 (see Figure 2.1 on the next page).
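These representations can be sketched with NumPy arrays; OpenCV, used later in this thesis, stores its images in exactly this layout (though with the channels in BGR rather than RGB order):

```python
import numpy as np

# A grayscale image is a 2-D matrix: one intensity value (0-255) per pixel.
gray = np.zeros((240, 320), dtype=np.uint8)
gray[100, 50] = 255            # set a single pixel to white

# A colour image adds a third dimension: three channel values per pixel.
rgb = np.zeros((240, 320, 3), dtype=np.uint8)
rgb[100, 50] = (255, 0, 0)     # full intensity in the first channel

print(gray.ndim, rgb.ndim)     # 2 3
```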
Figure 2.1: RGB colour space
Figure 2.2: grayscale colour space
2.3 Non-recursive background modelling
Non-recursive modelling is based on a buffer of size N that stores previous frames of the scene. The stored images are used to estimate the background image from the variation of the temporal values of each pixel across the frames in the buffer. These techniques vary in adaptiveness depending on the size of the buffer: a large buffer means that adaptation takes longer and that the storage requirements of the equipment increase, and vice versa for a smaller buffer.
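A minimal sketch of such buffer-based modelling, here using the per-pixel temporal median as the background estimator (the buffer size and pixel values are illustrative):

```python
import numpy as np
from collections import deque

# Non-recursive (buffer-based) modelling: keep the last N frames and
# estimate the background as the per-pixel temporal median.
N = 5
buffer = deque(maxlen=N)   # frames older than N fall out automatically

def update_background(frame):
    buffer.append(frame)
    return np.median(np.stack(buffer), axis=0).astype(frame.dtype)

# Toy 2x2 grayscale frames: a transient "object" (value 200) passes
# through one pixel; the median suppresses it.
frames = [np.full((2, 2), 50, dtype=np.uint8) for _ in range(5)]
frames[2][0, 0] = 200
for f in frames:
    bg = update_background(f)
print(bg[0, 0])   # 50: the transient value was filtered out
```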
2.3.3 Non-parametric modelling
Non-parametric modelling is similar to the other non-recursive methods, but instead of having a fixed buffer of frames for estimating the background update, it uses all of the previous frames to make the estimate parameter-independent. This method was constructed by Elgammal et al. [14], who use the previous frames to estimate the pixel density function f(P_t = u):
f(P_t = u) = (1/N) Σ_{i=t−N}^{t−1} K(u − P_i)    (2.2)
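Equation (2.2) can be sketched for a single pixel. A Gaussian kernel is assumed here for K, and the bandwidth sigma is an illustrative value; a pixel whose current value has low estimated density under f would be classified as foreground:

```python
import numpy as np

def pixel_density(u, history, sigma=10.0):
    """Kernel density estimate f(P_t = u) over a pixel's previous
    intensity values, using a Gaussian kernel K as in equation (2.2)."""
    d = u - np.asarray(history, dtype=float)
    K = np.exp(-0.5 * (d / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    return K.mean()            # (1/N) * sum of the kernel values

history = [100, 102, 99, 101, 100]   # recent intensities of one pixel
# A value close to the history has a much higher density than an outlier:
print(pixel_density(100, history) > pixel_density(200, history))   # True
```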
every pixel in the frame, so the same frame can fall on either side of the line depending on which pixel is considered.
where the values a_{1,2} = a_{2,2} = 0.7 are used, as in [4]. H(t_i) is called the measurement matrix and is also a constant matrix.
H = [1 0]    (2.6)
z(t_i) is the input matrix, the current frame from the camera, and K(t_i) is the gain matrix. The gain matrix is derived from the error covariance matrix: if the gain is high, the noise of the input is low, and vice versa. The Kalman filter procedure thus obtains its estimated matrix by weighing the difference between the prediction matrix and the current input matrix. Values with a large difference from the input matrix get a lower weight, which means that errors do not linger for a long time. See a graphical explanation of the process in Figures 2.3 on the facing page, 2.4 on the next page and 2.5 on page 12.
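The update can be sketched as a scalar simplification of the idea above: the background estimate is corrected by a gain-weighted difference against the current frame, and pixels flagged as foreground get a smaller gain so their errors do not linger. The gain values below are illustrative, not the constants from [4], and a full Kalman filter would also propagate the error covariance:

```python
import numpy as np

ALPHA_BG, ALPHA_FG = 0.1, 0.01   # illustrative gains (assumed values)

def kalman_style_update(background, frame, threshold=30):
    """Move the background estimate toward the frame; foreground pixels
    (large difference) receive a much smaller gain."""
    fg_mask = np.abs(frame - background) > threshold
    gain = np.where(fg_mask, ALPHA_FG, ALPHA_BG)
    return background + gain * (frame - background)

bg = np.full((2, 2), 100.0)
frame = np.array([[100.0, 100.0], [100.0, 250.0]])  # one foreground pixel
bg = kalman_style_update(bg, frame)
print(bg[1, 1])   # 101.5: moved only slightly toward the outlier
```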
Figure 2.3
Figure 2.4
Figure 2.5
a set of distributions, mostly ranging from 3 to 5. More than one distribution is used to be able to ignore objects that belong to the background but are not stationary, such as swinging leaves, snow and rain, and even to reduce the detection of shadows. Several distributions make a multi-modal background possible, which means that a pixel can take several different colour values and still not be classified as a foreground object. Every pixel in each frame is compared to the set of mean values
µ(K) = {µ_1, µ_2, ..., µ_K}    (2.7)
and a set of variances
σ(K) = {σ_1², σ_2², ..., σ_K²}    (2.8)
where K is the number of Gaussian distributions used. The distribution created from the K Gaussian distributions can be described as
Z ~ N( Σ_{i=1}^{K} µ_i , Σ_{i=1}^{K} σ_i² )    (2.9)
This new distribution can look similar to the illustration below (see Figure 2.6).
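The per-pixel test can be sketched as follows. A pixel value is considered background if it lies within a few standard deviations of any of the K Gaussians; the 2.5-sigma rule is a common choice in the literature, and the means and variances below are illustrative:

```python
import numpy as np

# K = 3 background Gaussians for one pixel, as in eqs. (2.7) and (2.8).
mu    = np.array([ 80.0, 120.0, 200.0])   # mean values (illustrative)
sigma = np.array([  5.0,   8.0,  10.0])   # standard deviations

def is_background(u, k_sigma=2.5):
    """Match the pixel value against each Gaussian; one match suffices.
    This multi-modality lets several colour values count as background."""
    return bool(np.any(np.abs(u - mu) < k_sigma * sigma))

print(is_background(118))   # True: matches the second Gaussian
print(is_background(160))   # False: matches none of them
```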
the stationary object in the scene. This method is very basic and therefore has some restrictions. It cannot distinguish permanently stationary objects from temporarily stationary objects, for example a person who stops to tie a shoe. It would alert the system for every object that stands still for a moderate period of time. To solve this problem, a similar but slightly more advanced model was invented, described in the next section.
2.6 2-model background subtraction
In this method, the subtraction of the background occurs at two different rates. One of the background images is updated every frame and the other one every L frames (see Figure 2.7 on the previous page). Masks for both backgrounds are created at their respective rates. The short-term background SB_t(x, y) is compared to the current frame CF_t(x, y), and every pixel either increases or decreases in intensity depending on the result of the comparison.
|CF_t(x, y) − SB_t(x, y)| > T    (2.13)
Equality between pixels leaves the pixel unchanged. This enables SB_t(x, y) to change quickly in scenes where the lighting conditions change rapidly. The long-term background LB_t(x, y) goes through the same process every L frames and is compared to the current SB_t(x, y) to gradually adapt to the environment of the scene. By increasing or decreasing the intensity of the pixels, stationary objects slowly become part of the background while moving objects remain part of the moving foreground. By having two backgrounds updating at different intervals, it is possible to detect temporarily stationary objects as well as objects that were part of the background but have been moved. This method can only be applied to frames in the grayscale colourspace.
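The dual-rate update can be sketched as follows. Every frame, each short-term background pixel moves one intensity step toward the current frame (equal pixels stay unchanged); every L frames the long-term background moves one step toward the short-term one. L, T and the pixel values are illustrative:

```python
import numpy as np

L = 10   # long-term update interval (illustrative)

def nudge(reference, target):
    """Move each grayscale pixel of `reference` one intensity step toward
    `target`; pixels that are already equal stay unchanged."""
    step = np.sign(target.astype(int) - reference.astype(int))
    return (reference.astype(int) + step).clip(0, 255).astype(np.uint8)

sb = np.full((2, 2), 100, dtype=np.uint8)      # short-term background
lb = sb.copy()                                 # long-term background
frame = np.full((2, 2), 160, dtype=np.uint8)   # a stationary object appears
for t in range(1, 31):
    sb = nudge(sb, frame)
    if t % L == 0:
        lb = nudge(lb, sb)

# Foreground test of eq. (2.13) with T = 20:
mask = np.abs(frame.astype(int) - sb.astype(int)) > 20
print(int(sb[0, 0]), int(lb[0, 0]), bool(mask[0, 0]))   # 130 103 True
```

The short-term background has closed much of the gap after 30 frames while the long-term one lags far behind, which is exactly what allows temporarily stationary objects to be told apart from permanent ones.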
2.7 Comparison
In this section a comparison between the different background modelling methods is made and summarised in a table describing each method's important features (see Table 2.1 on page 19). To start the comparison, I will first separate recursive and non-recursive methods, compare the methods within these two categories, and then compare the two groups.
The non-recursive models are very similar to each other, since both median filtering and non-parametric modelling are based on the frame differencing technique. The frame differencing model and the median filtering model both compare the current frame with the previous frame or a buffer of frames, while the non-parametric modelling technique uses all of the previous frames to make an estimate and then compares the estimate to the current frame.
The base function is, as said, very similar between these three, and the biggest difference is that only the non-parametric model works on colour images while the other two only work on grayscale images (with a small exception for median filtering, where the medoid can be used for colour images). The non-parametric model is the most sophisticated, but it requires a large buffer for storing all of the previous frames, which can be quite many if the recording goes on for a long period of time.
Non-recursive modelling
Pros:
Cons:
• Only works for grayscale images (except for median filtering using
medoid).
The recursive models are not as similar to each other as the non-recursive models are. These techniques all take different approaches, which makes them harder to compare. Approximated median filtering has the same base as median filtering, but no storage at all is used; instead, a threshold determines whether the background pixel value is increased or decreased by one. This unfortunately makes the model useless for colour images. Kalman filtering uses a prediction matrix and the current matrix to update the background reference matrix, and thus does not need to store frames for the estimate. To get this technique to work, some initialisation and predefined variables are required. The Gaussian mixture model is the most sophisticated model presented in this paper because of its multi-modal property and its ability to work on both grayscale and colour images. It uses a statistical approach to determine the similarity of the current pixel and the background pixel. This model also requires some predefined variables, which must be set to determine the sensitivity of the model. The Kalman and Gaussian models have the colour image capability in common, while approximated median filtering only works for grayscale. The only thing all three models have in common is that they only compare the background frame with the current frame and that the background image is updated after each frame.
Recursive modelling
Pros:
Cons:
Technique                  Multi-modal  Shadow Det.  Adaptive Rate  Category
Frame Differencing         No           No           Fast           Non-recursive
Median filtering           No           No           Fast           Non-recursive
Non-parametric modelling   No           Yes          Slow           Non-recursive
Approx. median filtering   No           No           Fast           Recursive
Kalman filtering           No           No           Slow           Recursive
Gaussian mixtures          Yes          Yes          Slow           Recursive
Chapter 3
Implementation
3.1.1 Graphical user interface
The implemented GUI is very simple, with only some basic functionality. Upon startup the user can choose to save the captured frames; this feature can be switched on and off during runtime (see Figure 3.2). Another setting which can be changed before or during runtime is the learning rate of the background, where a high number represents a slow background update. The last setting of the startup window is the pixel area detection, which controls the threshold on the area of objects that should be detected as stationary; with a lower number, smaller objects can be detected. This unit is measured in square pixels and is calculated after the contours of the detected object have been found. When the record button is pressed, the program starts to capture from the source the user has entered in the video input source setting (see Figure 3.3 on the next page), which can be found under File in the menu bar.
The program can take any type of video file, or a number representing the source of a connected camera; 0 is the default number for a connected or built-in camera. When a valid source is entered, the options to pause, exit or watch the capturing live (see Figure 3.2) become available to the user. Pressing the pause button stops the capturing and also the
saving of the captured frames, but the program keeps running. The background reference is reset to the last frame before the pause button was pressed. This enables the user to pause the capturing, ignore the changes made during the pause, and then continue as if the pause never happened. The exit button exits the program, as does the exit button in the menu bar. The view button enlarges the window and displays the captured images in real time (see Figure 3.4 on the next page).
When live view is enabled, another option appears for the user, namely to extend the window further into an extended view mode (see Figure 3.5 on page 24). This mode shows not only the current frame but also the applied binary mask and the sampled images when an object has been stationary for a period of time.
The last setting the user can make is to type in the destination for the notification mails (see Figure 3.6 on page 24). These mails are sent to the user when a new detection occurs. The user can specify which mail address it should be sent to and which mail address the sender should have. Below that, the subject and the actual message can be specified according to the user's preferences. If the mail server used requires login credentials, the optional input fields for username and password can be used. The last input field is the address of the mail server with a target port.
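Such a notification mail can be sketched with Python's standard smtplib and email modules. All addresses, the host and the port below are placeholders for the values entered in the GUI fields, and login is only attempted when credentials were given:

```python
import smtplib
from email.message import EmailMessage

def build_notification(sender, recipient, subject, body):
    """Assemble the notification mail from the user's GUI settings."""
    msg = EmailMessage()
    msg["From"], msg["To"], msg["Subject"] = sender, recipient, subject
    msg.set_content(body)
    return msg

def send_notification(msg, host, port, username=None, password=None):
    """Deliver the mail; credentials are optional, as in the GUI."""
    with smtplib.SMTP(host, port) as server:
        if username and password:
            server.starttls()
            server.login(username, password)
        server.send_message(msg)

# Placeholder addresses; send_notification would need a reachable server.
msg = build_notification("detector@example.com", "guard@example.com",
                         "Stationary object detected",
                         "A new stationary object was detected.")
print(msg["Subject"])
```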
Figure 3.4: Live view of the capturing
Figure 3.5: Extended live view with binary mask applied
3.1.2 Background subtraction
The purpose of the program is to read the next frame from the input source and perform a background subtraction against the background reference image. This is done by simply taking the current frame and subtracting the background reference image. Since the images are stored as matrices, this is a simple binary operation performed for every pixel in the matrices. A plain subtraction between the matrices results in a matrix with a lot of noise in it, since the subtraction is absolute (see Figure 3.7): even if two pixels differ only by decimals, the difference will be visible.
This program uses a predefined method for subtracting the images. The method uses morphological filters, which filter out pixels with very small differences (see Figure 3.8 on the next page). These filters provide improved accuracy and fewer faulty detections when the binary masks are applied.
When the subtraction has been made, the resulting matrix is used to make a binary mask, which is then used for the foreground separation and the detection of objects not belonging to the background.
Figure 3.8: Background subtraction of a colour image with morphological
filter
Figure 3.9: Foreground sampling compared with logical AND
When the contours have been found, another built-in function calculates the area within each contour. This area is compared with the threshold specified by the user in the GUI. If the area is larger than the threshold, a rectangle around the object is drawn on the current frame. When the rectangle has been drawn, a notification is sent to the specified email address with the user-specified message and subject (see Figure 3.6 on page 24).
3.2.1 Python
Python is a very diverse and multifunctional language that is supported almost everywhere, on every platform. It can be used as a scripting language, as an object-oriented language [19] (which is how this program uses it), or as a mixture of both. The possibility to embed other languages inside Python makes its usage almost limitless. The language was chosen for its diversity and its support on both major and minor platforms, to make the program as universal as possible without compromising too much.
3.2.2 Tkinter
Tkinter is a graphical user interface (GUI) toolkit which comes embedded in the Python environment when installed. This framework for creating GUIs is very similar to the Swing package in Java; it is very easy to use, and creating something small but functional takes very little time. The toolkit is mostly built for small GUIs, and its functions are therefore limited to the most basic needs; for creating large-scale and advanced GUIs, another library is recommended. This toolkit was chosen for its simplicity, since the GUI only consists of a few buttons and input lines for the user to change some variables and settings during runtime. The toolkit is well documented, with examples of every button and item it supports [20], as well as the input and output of every method, for easy understanding and usage.
3.2.3 OpenCV
OpenCV (Open Source Computer Vision) is a set of libraries written in optimised C and C++ with the intention of being computationally efficient for real-time programs and applications. The libraries are released under the BSD licence, which gives the user the right to use them freely for commercial as well as academic purposes. OpenCV has many well-designed methods for VCA, constructed after well-known research papers with robust techniques. The methods used in this paper are mostly for the background subtraction and the background update. Documentation of all the methods and their attributes can be found on the OpenCV webpage [21].
3.2.4 MKVmerge
An MKV file, or Matroska file, is a media container which can hold video, audio and subtitles in a single file [17]. These files are not a video or audio format in themselves, just a container for streams. To create this container, a program called MKVmerge [18] is used, which takes a video file and, optionally, a text file containing chapter marks, and creates a new MKV file. This program is used to mark new detections from the implemented program, making it possible to search the file for detection events instead of having to go through the whole video.
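This workflow can be sketched as follows. The simple OGM-style chapter syntax shown is one of the formats mkvmerge accepts; the timestamps and file names are illustrative, and mkvmerge itself must be installed separately:

```python
import subprocess

def chapters_text(detection_times):
    """Format detection timestamps ("HH:MM:SS.mmm") in the simple
    chapter syntax accepted by mkvmerge's --chapters option."""
    lines = []
    for i, t in enumerate(detection_times, start=1):
        lines.append(f"CHAPTER{i:02d}={t}")
        lines.append(f"CHAPTER{i:02d}NAME=Detection {i}")
    return "\n".join(lines) + "\n"

text = chapters_text(["00:01:12.000", "00:07:45.500"])
print(text.splitlines()[0])   # CHAPTER01=00:01:12.000

# Merging the recording with the chapter file (illustrative file names):
# with open("chapters.txt", "w") as f:
#     f.write(text)
# subprocess.run(["mkvmerge", "-o", "out.mkv",
#                 "--chapters", "chapters.txt", "capture.avi"])
```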
Chapter 4
Results
Figure 4.1: Tests made with CAVIAR test files; the left side shows the scene without objects, the right side marks the abandoned objects.
Language  Resolution (pixels)  Average frame rate (FPS)  Live View
Python    320x240              30.02 / 14.95             No / Yes
Python    640x480              30.0 / 10.77              No / Yes
Python    1280x720             18.9 / 6.22               No / Yes
C++       320x240              30.3 / 15.87              No / Yes
C++       640x480              30.3 / 12.20              No / Yes
C++       1280x720             20.41 / 6.49              No / Yes
Figure 4.2: Performance difference between Python and C++ with OpenCL
faster than 30 frames per second, so the only significant difference appeared when the frame rate reached the maximum of the camera's capability at 1280x720 pixels. Here the C++ program performs 8% better than the Python program. When the live view was enabled, differences could be seen at all resolutions; here the difference was between 4.3 and 13.3%.
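These percentages can be checked directly against the frame rates in Figure 4.2:

```python
# Relative C++ advantage, computed from the frame rates in Figure 4.2.
no_view_720p = (20.41 - 18.9) / 18.9 * 100    # without live view, 1280x720

# Live-view pairs (Python FPS, C++ FPS) per resolution:
live = [(14.95, 15.87), (10.77, 12.20), (6.22, 6.49)]
live_diffs = [round((cpp - py) / py * 100, 1) for py, cpp in live]

print(round(no_view_720p), live_diffs)   # 8 [6.2, 13.3, 4.3]
```

The live-view differences span 4.3% to 13.3%, matching the range stated above.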
Chapter 5
Conclusions
5.1 Conclusion
The intention of this thesis was to answer the three questions stated in the beginning of the report:
3. How can one filter out static data from non-static data?
34
begin is created. This text file can be merged with the video file using the external program MKVmerge to create a video container. When the new video file is played, the user is able to search the video via the chapters, which represent the times of detection. This is just one of many ways to make a video file searchable for times of detection.
Bibliography
[2] S.C. Cheung and C. Kamath, ”Robust techniques for background subtraction in urban traffic video”, in Proc. Video Communications and Image Processing, SPIE Electronic Imaging, San Jose, Calif., USA, January 2004.
[5] Antonio Albiol, Laura Sanchis, Alberto Albiol, and Jose M Mossi ”De-
tection of Parked vehicles using SpatioTemporal Maps” vol.12, pp.1277-
1291, December 2011.
[6] Thi Thi Zin, Member, IAENG, Pyke Tin, Takashi Toriu, and Hiromitsu
Hama ”A Novel Probabilistic Video Analysis for Stationary Object De-
tection in Video Surveillance Systems” IAENG International Journal of
Computer Science, 39:3, IJCS 39 3 09
entific Computing Lawrence Livermore National Laboratory 7000 East
Avenue, Livermore, CA 94550
[8] YingLi Tian, Rogerio Feris, Haowei Liu, Arun Humpapur, and Ming-
Ting Sun ”Robust Detection of Abandoned and Removed Objects in Com-
plex Surveillance Videos”
[9] Thi Thi Zin, Pyke Tin, Takashi Toriu and Hiromitsu Hama ”A
Probability-based Model for Detecting Abandoned Objects in Video
Surveillance Systems” Proceedings of the World Congress on Engineer-
ing 2012 Vol II WCE 2012, July 4 - 6, 2012, London, U.K.
[10] Rubén Heras Evangelio and Thomas Sikora ”Static Object Detec-
tion Based on a Dual Background Model and a Finite-State Ma-
chine” Hindawi Publishing Corporation EURASIP Journal on Im-
age and Video Processing Volume 2011, Article ID 858502, 11 pages
doi:10.1155/2011/858502
[12] P. Kaew Tra Kul Pong and R. Bowden ”An Improved Adaptive Back-
ground Mixture Model for Real- time Tracking with Shadow Detection”
In Proc. 2nd European Workshop on Advanced Video Based Surveil-
lance Systems, AVBS01. Sept 2001. VIDEO BASED SURVEILLANCE
SYSTEMS: Computer Vision and Distributed Processing, Kluwer Aca-
demic Publishers
[16] http://www.codemill.se
[17] http://www.matroska.org/technical/whatis/index.html
[18] http://www.matroska.org/node/50
[19] https://www.python.org
[21] http://www.opencv.org
[22] http://groups.inf.ed.ac.uk/vision/CAVIAR/CAVIARDATA1/
Image reference
[23] Figure 2.2 on page 7 borrowed from
http://inperc.com/wiki/index.php?title=Images_as_functions_of_two_variables