Mandellos 2011
Abstract

An innovative system for detecting and extracting vehicles in traffic surveillance scenes is presented. This system involves locating moving objects present in complex road scenes by implementing an advanced background subtraction methodology. The innovation concerns a histogram-based filtering procedure, which collects scattered background information carried in a series of frames, at pixel level, generating reliable instances of the actual background. The proposed algorithm reconstructs a background instance on demand under any traffic conditions. The background reconstruction algorithm demonstrated a rather robust performance in various operating conditions including unstable lighting, different view angles and congestion.

Keywords: Computer vision; Background subtraction; Background reconstruction; Background maintenance; Background update; Vehicle detection; Traffic surveillance; Tracking
realistic traffic circumstances where vehicles might remain still for a long time. Finally, background subtraction detects the actual background and extracts objects that do not belong to it. The concept of this method is described below.

In a typical background model a prototype of the image background (an initialization of the background) is considered first, and then each pixel of the prototype is compared with the actual image color map. If the color difference exceeds a predefined threshold it is assumed that this pixel belongs to the foreground. Consequently, raw foreground information is derived. This information is grouped into compact pixel sets (blobs).
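As a concrete illustration of this pixel-wise test, the sketch below thresholds the color difference against a background prototype and groups the resulting pixels into blobs with a connected-component labelling step. This is a minimal sketch, not the authors' implementation; the threshold value and the use of SciPy's labelling are our own assumptions.

```python
import numpy as np
from scipy import ndimage

def foreground_blobs(frame, background, threshold=30.0):
    """Raw foreground mask by per-pixel color difference, grouped into blobs."""
    # Per-pixel color distance between the current frame and the background prototype.
    diff = np.linalg.norm(frame.astype(float) - background.astype(float), axis=-1)
    mask = diff > threshold                    # pixels that deviate from the prototype
    labels, n_blobs = ndimage.label(mask)      # group connected pixels into blobs
    return mask, labels, n_blobs
```

In practice, very small blobs would typically be discarded before any further clustering or convex-hull processing.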
In the case of outdoor scenes, when the background is not completely static, lighting fluctuations, shadows or slight movements (i.e. leaves and branches waving) can degrade the effectiveness of the foreground extraction. To overcome this, a number of algorithms have modeled the aforementioned nuisances. More specifically, mixture models use statistical filters to eliminate continuous slight movements in the background by grouping time-evolving pixel characteristics into clusters or color prototypes and characterizing as background the most populated one (Kim, Chalidabhongse, Harwood, & Davis, 2005; Stauffer & Grimson, 1999; Zivkovic & Van der Hijden, 2006), while parametric models, such as the ones proposed by Haritaoglu, Harwood, and Davis (2000), Horprasert, Harwood, and Davis (1999) and Pless (2005), simulate the background by taking into account color characteristics. In the work of Horprasert et al. (1999) each pixel is classified into one of four classes, namely 'Foreground', 'Shaded background', 'Highlighted background' and 'Background'. Thus, the system can 'recognize' background discontinuities due to lighting and shadows and consequently register them as background.
This methodology has the great advantage of separating objects by using background information, even in images that comprise shadows or glares (Senior, Tian, Brown, Pankanti, & Bolle, 2001). The main drawback of the background subtraction algorithm is the complexity of defining the background. A common practice is to initialize the algorithm by employing an 'empty scene'. Another important issue in this methodology is the difficulty of maintaining the background instance through time in outdoor captures.

The creation of a reliable initial instance is a critical issue for the quality of the overall process. A general solution for this problem does not exist, and the common practice is to average a sequence of frames presenting a scene without moving objects, which in fact is too difficult to acquire on a crowded highway. Despite the importance of this issue, there has only been limited research published focusing on the reconstruction of a starting background instance. The methods of Colombari, Cristani, Murino, and Fusiello (2005) and Gutchess et al. (2001), for instance, are significantly complicated to implement. On top of that, they are based on several restrictive assumptions. The latter work in particular refers to an inpainting technique (Criminisi, Perez, & Toyama, 2004) where background parts are reconstructed exploiting color and texture information.
In outdoor captures, the background prototype often fails to reflect the actual background due to lighting condition changes, shadow casting with respect to the sun position and background alterations with permanent effect. Moreover, the insertion of new objects in the road scene can induce permanent or temporary changes of the background (e.g. a vehicle that has been pulled over for a long time or an object on the road deck). Common practice in such cases is to use adaptive update models, such as those of Toyama, Krumm, Brumitt, and Meyers (1999), Gupte, Masoud, Martin, and Papanikolopoulos (2002) and Wren, Azarbayejani, Darrell, and Pentland (1997), that keep the background template recursively updated, so that the background template is adapted to forthcoming image changes. Nevertheless, in most cases, after some time the noise pollution of the background results in the degradation of the overall process quality.

In this study we present an innovative algorithm, the background reconstruction algorithm, as part of a system for locating and tracking vehicles through traffic video captures. The purpose of the present work is to overcome the two main weaknesses of the background subtraction algorithm, namely initialization and background update, and to build a robust methodology capable of detecting vehicles under realistic traffic circumstances.

The background reconstruction algorithm is a heuristic that provides a periodically updated background and enhances the efficiency of the well-known background subtraction methodology in the case of outdoor captures. Indeed, it is a key process for a typical background subtraction algorithm, because it supports its weakest part, which is the initialization step. This methodology guarantees a fresh instance of the actual background periodically, which is achieved by collecting scattered color information through a series of sequential images and assembling it to reconstruct the actual background. This process is applied to each pixel separately and the result is a color map of the actual image background.

Our algorithm is presented as part of an integrated surveillance system that can be set up on existing traffic surveillance infrastructure. This system locates, counts and tracks vehicles in a variety of lighting conditions such as cloudiness and glares. Moreover, it adapts quickly to any changes of the background, such as transitions between different lighting conditions (i.e. from cloudiness to direct sunlight and vice versa), various traffic conditions including stop-and-go traffic flow, as well as permanent changes to the background (for instance, when a vehicle has pulled over). This overcomes the weaknesses of the previous systems described above.

A typical surveillance system consists of a traffic camera network, which processes captured traffic video on-site and transmits the extracted parameters in real time. In this study we focus on the algorithmic part of such a system.

The innovation of this study lies in the ability of the proposed algorithm to reconstruct the actual background color map without the need of any human intervention, even in harsh traffic conditions such as stop-and-go traffic flow, stopped vehicles (i.e. after an accident) and rain or snow. In our approach a new background prototype is constructed every 1 or 2 min, restricting the problem of background pollution to the interval between two consecutive updates. Each newly recreated background instance is assumed to be steady within the update period. Thus, the background instance is used as a prototype in order to separate the foreground from the image for each image frame within the update period.

This paper is structured as follows: In Section 2 a description of the system together with its specifications and the testing arrangement is given. Section 3 presents the Vehicle Detection Unit. Emphasis is given to the background reconstruction algorithm, which is analyzed in detail. In Section 4 the Tracking Unit is presented. In Section 5 we present our experiments, which aim to support the basic assumptions of this work and to evaluate the developed background reconstruction algorithm. Finally, in Section 6 we summarize our results and present our conclusions.

2. System conception

The innovative algorithm of background reconstruction is part of a contemporary and realistic surveillance system. The integrated system locates, tracks and extracts traffic parameters in real time. Furthermore, the system can utilize any existing traffic surveillance infrastructure without further modification or tuning (except for the camera calibration that calculates image metrics).

A typical road traffic surveillance infrastructure consists of a camera network that has the ability to transmit images in real time to a central operational center. The processing of the images can be
carried out on-site, saving valuable network bandwidth, as only the outcome of the calculations is transmitted. Alternatively, the whole process can be performed at an operational center, either on real-time video streams or on already stored video material.

In such a network installation, the cameras must be sited approximately 10–15 m or more above road level to minimize the effect of occlusion. The system must be adaptive to a series of perturbations that may affect the clarity of the captured video, such as vibrations of the camera and slow changes in the background due to lighting conditions (Mimbela & Klein, 2000).

In order to simulate the algorithmic part of an integrated road traffic surveillance system, we used the following arrangement: a commercial CSS DV camcorder was installed about 10 m above road level, sited above the central lane of the road and facing the traffic at an angle of 65°. The characteristics of this camera are as follows:

- 48 mm focal length (35 mm camera equivalent), which defines a 27° vertical and a 40° horizontal angle of view;
- 25 fps at 720 × 576 pixels in PAL video format.

Some predefined spots on each test scene were chosen in order to calibrate the camera according to DLT (direct linear transformation), a method originally reported in Abdel-Aziz and Karara (1971). The calibration of the camera defines the relationship between the 'real world' and the pixel matrix of the digital image.

The architecture of the proposed system is described in Fig. 1. The system consists of two units, namely the Vehicle Detection Unit and the Tracking Unit, the latter being indicated in gray color. Fig. 1 shows that, first, a series of frames (raw traffic capture) enters the Vehicle Detection Unit (presuming that an initial background template has been created). Subsequently, the stream of frames feeds the background reconstruction algorithm in order to create the next background template, which replaces the current one after a predefined number of frames. While a new background template is being created, the background in use is maintained using the simple adaptive filter, applied at pixel level, proposed by Toyama et al. (1999):

B_t = (1 − a) B_{t−1} + a I_t    (1)

where B_t is the color vector of the background model at the given pixel in frame t, I_t is the actual color vector of the same pixel in frame t, and a is the coefficient that declares the rate of adaptation, with values in the range 0–1.

In the main flow of the detection unit, the raw foreground information is derived by a background subtraction procedure (Fig. 1). The result of this step is a set of partly connected pixels, which must be further processed in order to form compact objects (clustering/convex hull, Fig. 1). If those pixels lie along the "Entrance Zone" (Fig. 2), a region growing algorithm (Davies, 2005) merges all the pixels that potentially belong to a common vehicle domain. Otherwise, if the pixels from the background subtraction procedure lie within the Main Area, they are further processed to form blobs (connected pixels that form a shape) using a convex hull procedure (Section 3.3). The detected blobs from this process are merged to form objects. Merging occurs for blobs that lie partly or totally within the frontiers of a recognized vehicle shape from the previous frame t − 1, whose position has been appropriately corrected for frame t (vehicle matching – diagram of Fig. 1). Candidate vehicles are recognized by a cognitive clustering procedure (classifier – diagram of Fig. 1). This cognitive clustering process has the following concept: a candidate object is considered to be a vehicle only if its location is consistent with its previously calculated trajectory and the detected object dimensions remain unchanged through frames. A vehicle that does not match the previous frame, or seems to have an irregular trajectory, must be rejected (i.e. a backward vehicle movement that cannot be explained by its trajectory).

Occlusion can be handled by simple rules of merging and splitting vehicle domains with regard to their trajectories. Each detected vehicle belongs to one of the following classes: 'vehicle', 'large vehicle' or 'non-vehicle object'. 'Non-vehicle' objects are not further tracked and are ignored by the system. Finally, if a previously detected ...
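The background maintenance step of Eq. (1) above can be sketched directly; a minimal illustration applied to a whole frame, where the adaptation rate a = 0.05 is an arbitrary illustrative value, not the authors' setting:

```python
import numpy as np

def maintain_background(background, frame, a=0.05):
    """Recursive per-pixel update B_t = (1 - a) * B_{t-1} + a * I_t (Eq. (1))."""
    return (1.0 - a) * background.astype(float) + a * frame.astype(float)
```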
The most popular algorithms for background extraction found in the literature, and used for comparison purposes in the current work, are the Mixture of Gaussians model (MOG; Stauffer & Grimson, 1999; Zivkovic & Van der Hijden, 2006) and the Codebook model (Kim et al., 2005). The MOG methodology models each pixel history as a cluster of Gaussian-type distributions and uses an on-line approximation to update its parameters. According to this, the background is found as the expected value of the distribution corresponding to the most populated cluster (Stauffer & Grimson, 1999). This methodology is greatly improved in terms of performance by considering recursive equations to adaptively update the parameters of the Gaussian model (Zivkovic & Van der Hijden, 2006). According to the Codebook model (Kim et al., 2005), sample background values at each pixel are quantized into codebooks that represent a compressed form of the background model for a long image sequence. The codebook is enriched with new codewords in the presence of a new color that cannot be assigned to the existing groups.

At the (i, j) pixel of the frame at time t, I^L_ij(t), I^u_ij(t) and I^v_ij(t) denote the L*, u*, v* elements of I_ij(t) respectively, and B = {B_ij} denotes the background color map. The pixel (i, j) color variation with respect to time is estimated by a sampling procedure, where the color values I_ij(t) obtained over T consecutive frames, starting from t_0, are collected. Thus, the temporal sample S_ij(t_0) = (I_ij(t_0), I_ij(t_0 + 1), ..., I_ij(t_0 + T − 1)) of pixel (i, j) defines the frequency f̂_ij(l, u, v) of the examined pixel having a color value belonging to the b_{l,u,v} bin:

f̂_ij(l, u, v) = Σ_{t = t_0}^{t_0 + T − 1} δ(l − [I^L_ij(t)/h]) δ(u − [I^u_ij(t)/h]) δ(v − [I^v_ij(t)/h])    (2)

where l, u, v ∈ N, δ(·) is the Kronecker delta function, [·] denotes rounding to an integer bin index and h is the discretization step of each color parameter.

The frequency f̂_ij(l_m, u_m, v_m) within the mode bin b_{l_m,u_m,v_m} corresponds to the most persistent color I_m = (l_m, u_m, v_m) in a sequence of T frames for pixel (i, j). For this reason, our ...
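The counting of Eq. (2) and the selection of the most persistent color can be sketched per pixel as follows. This is an illustration under the assumption of 8-bit L*, u*, v* values and a discretization step h; the function and variable names are ours, not the authors'.

```python
import numpy as np

def persistent_color(samples, h=8, levels=256):
    """Most persistent color of one pixel over T frames.

    samples: (T, 3) array of (L, u, v) values for a single pixel.
    Builds the 3D histogram of Eq. (2) with bin width h and returns the
    center of the most populated bin (the mode color)."""
    bins = levels // h
    idx = (samples // h).astype(int)                  # quantized bin index per frame
    hist = np.zeros((bins, bins, bins), dtype=int)
    for l, u, v in idx:
        hist[l, u, v] += 1                            # Eq. (2): count hits per bin
    l_m, u_m, v_m = np.unravel_index(np.argmax(hist), hist.shape)
    return np.array([l_m, u_m, v_m]) * h + h // 2     # representative color of the mode bin
```

With n = levels/h bins per channel, the histogram alone needs O(n³) memory per pixel, which is exactly the limitation discussed in the following paragraphs.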
The methodology described is of O(n³) complexity in terms of memory and O(sn³) in terms of the calculations involved (where s denotes the total number of frames in the sample and n represents the magnitude of discretization for each color parameter). In normal traffic conditions a 100–200-frame sample, corresponding to 4–8 s of traffic observation, is adequate for the identification of the actual background. However, in general conditions where the vehicle flow is dense, having low speed and/or involving stop-and-go behavior, the required sample size is expected to be higher. Our tests in such conditions showed that an average of 1250 frames, corresponding to 1 min of capture length, may be required to reliably reconstruct the actual background. In these cases the sample includes a vast volume of information and therefore demands an increased memory capacity, which may be prohibitive for the design and operation of the system.

The limitations posed by the hardware motivated us to seek a more efficient way to solve the problem while keeping the memory usage and the required amount of calculations within acceptable limits. Towards this goal, our research focused on a different approach to managing the discrete temporal chromatic information l, u, v of the chromatic space U. Thus, we calculated the frequencies f̂^(l)_ij(l), f̂^(u)_ij(u), f̂^(v)_ij(v) for each l, u, v parameter separately, through the following summations:

f̂^(l)_ij(l) = Σ_{u=u_min}^{u_max} Σ_{v=v_min}^{v_max} f̂_ij(l, u, v),   f̂^(u)_ij(u) = Σ_{l=l_min}^{l_max} Σ_{v=v_min}^{v_max} f̂_ij(l, u, v),   f̂^(v)_ij(v) = Σ_{l=l_min}^{l_max} Σ_{u=u_min}^{u_max} f̂_ij(l, u, v)    (4)

where l_max, l_min, u_max, u_min, v_max and v_min are the maximum and minimum values of the discrete l, u, v parameters respectively.

The calculation of the frequencies in Eq. (4) provides the information needed for the reconstruction of the background. According to the proposed methodology, the most persistent color value in a sequence of frames, for a specific pixel, is the one that is most likely to represent the actual background, and it can be calculated by maximizing Eq. (2). Alternatively, it can be approximated by composing an artificial color from the frequency modes (l_mode, u_mode, v_mode) maximizing Eq. (4):

B_ij = ( arg max_l f̂^(l)_ij(l), arg max_u f̂^(u)_ij(u), arg max_v f̂^(v)_ij(v) ) = (l_mode, u_mode, v_mode)    (5)

The whole idea is implemented in the reconstruction of the actual background, based on the clustering of the temporal color values of each pixel into two basic classes: 'Background' and 'non-background'.

Fig. 3. Illustrative example of the principles of the background reconstruction methodology: in a 2D distribution (v–u plane) of image color values, the majority is concentrated around a value (mode) that represents the background color.
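For comparison with the previous snippet, a sketch of the reduced per-channel formulation of Eqs. (4) and (5), under the same assumptions (8-bit values, bin width h); again our own illustrative code, not the authors' implementation:

```python
import numpy as np

def persistent_color_marginal(samples, h=8, levels=256):
    """Approximate the mode color from three 1D histograms (Eqs. (4)-(5)).

    samples: (T, 3) array of (L, u, v) values for a single pixel.
    Needs only O(n) memory per pixel instead of O(n^3)."""
    bins = levels // h
    idx = (samples // h).astype(int)
    modes = []
    for c in range(3):                                   # one marginal histogram per channel
        hist = np.bincount(idx[:, c], minlength=bins)    # f^(l), f^(u), f^(v) of Eq. (4)
        modes.append(np.argmax(hist) * h + h // 2)       # arg max of Eq. (5)
    return np.array(modes)                               # (l_mode, u_mode, v_mode)
```

Only three length-n histograms are kept per pixel, which is the O(n)-memory behaviour quoted in the conclusion.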
One of the most efficient methodologies for clustering color information is the popular mean-shift methodology (Comaniciu & Meer, 1997). However, this methodology involves a vast amount of calculations for the set of color values that correspond to a single pixel, making a solution for the whole image unrealistic. Additionally, the memory required is of the same complexity as that required for Eq. (2).

To overcome this obstacle we propose to use Eq. (4) to carry out this clustering in a more flexible manner, which is alleviated by the main characteristics of the problem: (i) the distribution of sampled ...

In our work the chromatic difference between the current frame and the background model B_ij is defined by a norm that combines the difference in lightness L* with the chromatic difference of the u*, v* parameters in the L*u*v* color space. The foreground mask M_ij is then calculated by the following relation:

M_ij = 1,  if |I^L_ij − B^L_ij| > threshold ∧ ||I^{u,v}_ij − B^{u,v}_ij|| > threshold;  M_ij = 0, otherwise
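A sketch of this mask computation follows. The frame and the background prototype are assumed to be already converted to the L*u*v* space; two separate threshold values are used here for clarity, whereas the text uses a single symbol, and the numeric values are illustrative only.

```python
import numpy as np

def foreground_mask(frame_luv, background_luv, thr_L=15.0, thr_uv=10.0):
    """M_ij = 1 where both the lightness and the chromatic differences exceed their thresholds."""
    diff = frame_luv.astype(float) - background_luv.astype(float)
    d_L = np.abs(diff[..., 0])                       # |I^L - B^L|
    d_uv = np.linalg.norm(diff[..., 1:], axis=-1)    # ||I^{u,v} - B^{u,v}||
    return ((d_L > thr_L) & (d_uv > thr_uv)).astype(np.uint8)
```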
... a simple matching methodology was found to better meet the needs of this problem.

The matching procedure adopted in this study is similar to Criminisi et al. (2004) and is based on the assumption that the next position of a vehicle can be estimated from its motion. According to this, we estimate the positions of the previous-frame vehicles and draw their traces in the current frame. Then a vehicle V1 having mask M1 matches with vehicle V′1 having mask M′1 from the previous frame only if M1 ∩ M′1 ≠ ∅. If there is a conflict between two vehicles, then the matching vehicle is the one that maximizes the common surface.

Even with the most accurate algorithm for locating templates in an image, small drifts and miscalculations, due to the conversion of distances in the discrete image space into real-world conditions, result in
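The overlap rule can be written compactly as follows; an illustrative sketch in which the previous-frame masks are assumed to have already been shifted to their predicted (trajectory-corrected) positions:

```python
import numpy as np

def match_vehicle(current_mask, previous_masks):
    """Index of the previous-frame vehicle whose mask shares the largest
    common surface with current_mask, or None if no mask overlaps."""
    overlaps = [np.logical_and(current_mask, m).sum() for m in previous_masks]
    if not overlaps:
        return None
    best = int(np.argmax(overlaps))
    return best if overlaps[best] > 0 else None
```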
Fig. 5. The snapshots above were taken from the four captures used for evaluation. In each snapshot, five pixel positions (1–5) have been chosen (three pixels at the front of the image and two at the back). Note that some of the pixels are sited over the white stripe of the road in order to study the behavior of the system on a color different from the asphalt. Evaluation results are presented in Tables 1 and 2 as well as in Fig. 6.
Fig. 6. The 2D topology of the pixel series (PC_k, k = 0, 1, ...) of Scene I at pixel 1 of Fig. 5: the top, middle and bottom rows present the 'background' elements, the 'foreground' elements and all elements, respectively. The side bar graphs in each topology correspond to the color parameter histograms.
small errors in measurements. In order to form a more accurate and smooth trajectory for each vehicle, a Kalman filter algorithm is employed.

The Kalman filter employs a procedure in which a state variable is repetitively predicted according to a theoretical model and is subsequently corrected by an actual measurement. The state variable of the described system is a vector of vehicle location, speed and length. In our approach we assumed simple constant straight motion along the direction of the road, a rational approach for traffic on avenues and national roads. In addition, we assume a constant vehicle length for our kinematic model along its trajectory.
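A minimal predict/correct cycle for such a constant-velocity model, reduced to the along-road position and speed, is sketched below; the noise covariances and the measurement model are illustrative placeholders, not the authors' tuning.

```python
import numpy as np

def kalman_step(x, P, z, dt=0.04, q=1.0, r=4.0):
    """One predict/correct cycle for a constant-velocity model.

    x: state [position, speed], P: 2x2 covariance, z: measured position,
    dt: frame interval (0.04 s at 25 fps)."""
    F = np.array([[1.0, dt], [0.0, 1.0]])      # constant straight motion along the road
    H = np.array([[1.0, 0.0]])                 # only the position is measured
    Q = q * np.eye(2)                          # process noise (placeholder)
    R = np.array([[r]])                        # measurement noise (placeholder)
    # Predict from the theoretical model
    x = F @ x
    P = F @ P @ F.T + Q
    # Correct with the actual measurement
    y = z - H @ x                              # innovation
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)             # Kalman gain
    x = x + (K @ y).ravel()
    P = (np.eye(2) - K @ H) @ P
    return x, P
```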
5. Experimental results – evaluation

In order to validate the effectiveness of the background reconstruction algorithm we created an evaluation process which aims to provide evidence supporting that the color which is more frequent at a specific pixel in a series of frames is more likely to belong to the background rather than the foreground.

For that reason, we created a testing group of four video captures (Scene I: 79,000 frames, Scene II: 29,000 frames, Scene III: 28,000 frames and Scene IV: 7,000 frames, see Fig. 5). In each scene five pixel positions have been chosen, two at the back of the image and three at the front. For each pixel position of a specific scene two arrays were constructed, as described below.

The first array is a collection of color values PC^p_{k,s} (PC = [pixel color, class], p = testing pixel, k = array index = 1, 2, ..., s = scene) appropriately classified into one of the following classes: 'Foreground' or 'Background'. This array comprises a sampling of the color values collected at the pre-selected pixels for each testing scene, taken at a 500-frame interval. According to this, the 1st, 500th, 1000th, ... frames (of each testing scene and each pre-selected pixel position) were manually extracted and classified to construct the PC^p_{k,s} array.

The second array consists of the background color values at the pre-selected pixels of the testing scenes, BG^p_{k,s} (BG = background color, p = testing pixel, k = array index = 1, 2, ..., s = scene). As in the first array, the background color values of the 1st, 500th, 1000th, ... frame (of each testing scene and each pre-selected pixel) were recorded in order to form the BG^p_{k,s} array whenever this was possible (when the testing pixel was not obstructed by foreground objects). If the testing pixel was obstructed by a foreground object, we sought the nearest frame where the testing pixel could be clearly defined.
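A sketch of how such a sampling series could be assembled for one pre-selected pixel is given below; frame access is abstracted behind a hypothetical get_frame callable, and the manual 'Foreground'/'Background' labelling of the PC array is, as stated above, a human step that is not reproduced in code.

```python
def sample_pixel_series(get_frame, n_frames, pixel, step=500):
    """Collect the color of one pre-selected pixel every `step` frames (1st, 500th, 1000th, ...)."""
    i, j = pixel
    return [get_frame(t)[i, j] for t in range(0, n_frames, step)]
```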
Table 1
Comparison with other models (percentage of correctly detected foreground (FG) and background (BG) pixels).

                                                                   MOG (Stauffer & Grimson, 1999;     Codebook               This work
                                                                   Zivkovic & Van der Hijden, 2006)   (Kim et al., 2005)
Scenes                                                             FG (%)   BG (%)                    FG (%)   BG (%)        FG (%)   BG (%)
Scene I:   E94 – direction Elefsina, stop-and-go traffic
           conditions (79,000 frames capture)                      74.3     99.6                      92.5     93.6          97.1     98.3
Scene II:  E94 – direction El. Venizelos, normal traffic
           conditions (29,000 frames capture)                      82.4     98.7                      94.9     92.2          95.0     99.0
Scene III: E94 – direction El. Venizelos, normal and dense
           flow traffic conditions (28,000 frames capture)         80.7     98.7                      93.4     91.1          94.2     98.7
Scene IV:  E75 – direction Lamia, normal and dense flow
           traffic conditions (7,000 frames capture)               77.0     96.6                      88.1     93.8          91.2     97.1
Table 2
Background reconstruction process outcome.
The graphical representation of the first array PC^p_{k,s} (testing Scene I, pixel 1) is presented in Fig. 6. The topology is analyzed into the three combinations of planes, v–u, u–L and v–L, and for each parameter L*, u*, v* the corresponding histogram HL, Hu, Hv is generated. In each diagram the darkest areas represent a high concentration of values, which also corresponds to high values in the side histograms.

The distributions of the PC^p_{k,s} array elements that have been classified as 'Background' and 'Foreground' are presented in the first and second row respectively. In the last row the PC^p_{k,s} series are illustrated independently of their classification.

It can be clearly seen that in the first row the distribution of the background color values is densely populated around a central value, which is the mode of this distribution. The mode can also be located from the side bar graphs, where the bars are steeply
increased around a narrow interval. On the contrary, in the second row (foreground values distribution), the side bar graphs tend to be flat, with multiple modes dispersed uniformly in the 2D parameter space. When the two distributions are mixed, the color that is present in the majority of frames (so that its color value is distributed in a narrow zone around a central value) is the background color. Moreover, the side 1D histograms HL, Hu, Hv can precisely locate the background color (using Eq. (5)), providing almost the same results as the 2D distributions.
In order to further test our methodology we carried out the following test: we processed the data of Scenes I–IV by applying the background reconstruction algorithm to implement the 3D and the 1D problem – Eqs. (3) and (5), respectively – for a specific test frame chosen for each scene (I–IV, Fig. 5). Moreover, the proposed background subtraction algorithm was tested against two of the most popular algorithms found in the literature, that is to say MOG and Codebook. For the MOG model the Mahalanobis distance was used to account for cases where the standard deviation of the Gaussian distribution is high. The results are presented in Table 1. For all scenes, the percentage of successfully detected foreground and background pixels is given for all methods tested. The methodology proposed outperforms all previous algorithms. A visual presentation of the results is given in Fig. 7 (Scene III was left out, since the actual scene data were similar to Scene II – same location). In addition, the performance of the suggested algorithm was faster, since it did not involve the computational burden of adapting the MOG cluster parameters or of enriching the codebook with new codewords.
Background reconstruction is a statistical methodology, therefore the identification of the sample size is important. In general, the sample size should be large enough to carry enough information for extracting the background color at each image pixel. To achieve this goal the sample size, in terms of time, should exceed the average time that a passing vehicle occupies any pixel in the image. In our test we chose a 500-frame and a 2500-frame sample, translated in terms of time to a 20 and a 100 s exposure respectively.
The 500-frame sample is quite satisfactory for a highway where the vehicles' average speed is about 80 km h⁻¹ (22.22 m s⁻¹) and the occupation of a specific pixel close to the camera is expected to last less than a second. We chose the 2500-frame sample in order to examine whether overexposure can improve the reconstruction procedure. We observe that the choice of a larger sample size tends to decrease the accuracy (Tables 1 and 2) for two reasons: first, as the sample size increases, the background color changes following the diminutive change of lighting (this also explains the differences in the measured values for different sample sizes at the same pixel) and, second, the larger the sample size, the more noise is inserted, with a subsequent degradation of the background reconstruction performance.
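The rule of thumb can be checked with a short calculation; the 25 fps frame rate and the quoted speed come from the text, whereas the 5 m vehicle length is an assumed illustrative value:

```python
fps = 25.0
speed = 22.22           # m/s  (about 80 km/h, as quoted in the text)
vehicle_length = 5.0    # m    (assumed illustrative value)

occupancy = vehicle_length / speed   # time a vehicle covers a near-camera pixel (about 0.23 s)
sample_seconds = 500 / fps           # a 500-frame sample corresponds to 20 s
print(occupancy, sample_seconds)     # the sample comfortably exceeds the occupancy time
```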
The proposed system was implemented and tested as shown in Fig. 8, where the result of the Scene I experiment is presented. This result is also published on the corresponding author's personal web page, http://www.users.ntua.gr/nmand/BGReconstruction.htm. Moreover, the background reconstruction procedure for the test scenes I–IV is also included on the same web page.
Overall, the system was found to work satisfactorily and the background reconstruction algorithm added robustness to the process. In normal traffic conditions the system responded well and the resulting vehicle speeds and trajectories were accurate enough. The maximum number of vehicles detected and tracked simultaneously, for the heavy traffic instances of Scene I, was 10.
was 10. pp. 233–261). San Francisco, US: Elsevier Inc.
Graham, R., & Yao, F. (1983). Finding the convex hull of a simple polygon. Journal of
Algorithms, 4, 324–331.
Gupte, S., Masoud, O., Martin, R., & Papanikolopoulos, N. (2002). Detection and
6. Conclusion classification of vehicles. IEEE Transactions on Intelligent Transportation Systems,
3, 37–47.
In this study we presented a system that implements a classical Gutchess, D., Trajkovic, M., Kohen-Solal, E., Lyons, D., Jain, A. 2001. A background
model initialization algorithm for video surveillance. In Proceedings of the eighth
computer vision algorithm, the background subtraction, appropri- international conference on computer vision (Vol. 12, pp. 733–740). Vancouvier,
ately modified for the purposes of a traffic surveillance system. The Canada.
References

Abdel-Aziz, Y. I., & Karara, H. M. (1971). Direct linear transformation from comparator coordinates into object space coordinates in close-range photogrammetry. In Proceedings of the symposium on close-range photogrammetry (pp. 1–18). VA: American Society of Photogrammetry.

Colombari, A., Cristani, M., Murino, V., & Fusiello, A. (2005). Exemplar-based background model initialization. In ACM workshop on video surveillance and sensor networks (VSSN) (pp. 29–36).

Comaniciu, D., & Meer, P. (1997). Robust analysis of feature spaces: Color image segmentation. In IEEE conference on computer vision and pattern recognition (pp. 750–755).

Comaniciu, D., Ramesh, V., & Meer, P. (2000). Real-time tracking of non-rigid objects using mean shift. In Proceedings of the IEEE conference on computer vision and pattern recognition, Hilton Head, SC (Vol. 1, pp. 142–149).

Criminisi, A., Perez, P., & Toyama, K. (2004). Region filling and object removal by exemplar-based image inpainting. IEEE Transactions on Image Processing, 13, 1200–1212.

Cucchiara, R., & Piccardi, M. (1999). Vehicle detection under day and night illumination. In Proceedings of the 3rd international ICSC symposium on intelligent industrial automation.

Davies, E. R. (2005). Machine vision (3rd ed.). San Francisco, US: Elsevier Inc. (p. 104).

Davies, E. R. (2005). Mathematical morphology. In Machine vision (3rd ed., pp. 233–261). San Francisco, US: Elsevier Inc.

Graham, R., & Yao, F. (1983). Finding the convex hull of a simple polygon. Journal of Algorithms, 4, 324–331.

Gupte, S., Masoud, O., Martin, R., & Papanikolopoulos, N. (2002). Detection and classification of vehicles. IEEE Transactions on Intelligent Transportation Systems, 3, 37–47.

Gutchess, D., Trajkovic, M., Kohen-Solal, E., Lyons, D., & Jain, A. (2001). A background model initialization algorithm for video surveillance. In Proceedings of the eighth international conference on computer vision (Vol. 12, pp. 733–740). Vancouver, Canada.

Haritaoglu, I., Harwood, D., & Davis, L. (2000). W4: Real-time surveillance of people and their activities. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8), 809–830.

Horprasert, T., Harwood, D., & Davis, L. (1999). A statistical approach for real-time robust background subtraction and shadow detection. In IEEE ICCV FRAME-RATE workshop.

Kastrinaki, V., Zervakis, M., & Kalaitzakis, K. (2003). A survey of video processing techniques for traffic applications. Image and Vision Computing, 21, 359–381.

Kim, K., Chalidabhongse, T., Harwood, D., & Davis, L. (2005). Real-time foreground–background segmentation using codebook model. Real-Time Imaging, 11(3), 172–185.

Koller, D., Weber, J., & Malik, J. (1994). Robust multiple car tracking with occlusion reasoning. In Proceedings of the third European conference on computer vision (pp. 189–196). Stockholm, Sweden, May 2–6.

Lindeberg, T. (1996). Scale-space: A framework for handling image structures at multiple scales. CERN School of Computing, 695–702.

Mahmassani, H., Haas, C., Zhou, S., & Peterman, J. (2001). Evaluation of incident detection methodologies. CTR report: http://www.utexas.edu/research/ctr/pdf_reports.

Mimbela, L., & Klein, L. (2000). Non-intrusive technologies. In A summary of vehicle detection and surveillance technologies used in intelligent transportation systems (1st ed., pp. 5.1–5.27). Washington, DC, US: Federal Highway Administration (FHWA) Intelligent Transportation Systems Joint Program Office.

Pless, R. (2005). Spatio-temporal background models for outdoor surveillance. EURASIP Journal on Applied Signal Processing, 14, 2281–2291.

Senior, A., Tian, Y., Brown, L., Pankanti, S., & Bolle, R. (2001). Appearance models for occlusion handling. In 2nd IEEE workshop on performance evaluation of tracking and surveillance (PETS).

Stauffer, C., & Grimson, W. (1999). Adaptive background mixture models for real-time tracking. In IEEE conference on computer vision and pattern recognition (pp. 246–252).

Toyama, K., Krumm, J., Brumitt, B., & Meyers, B. (1999). Wallflower: Principles and practice of background maintenance. In ICCV '99 (pp. 255–261).

Wren, C., Azarbayejani, A., Darrell, T., & Pentland, A. (1997). Pfinder: Real-time tracking of the human body. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19, 780–785.

Zivkovic, Z., & Van der Hijden, F. (2006). Efficient adaptive density estimation per image pixel for the task of background subtraction. Pattern Recognition Letters, 55(5), 773–780.