A People Counting System Based On Face-Detection
Tsong-Yi Chen (1), Chao-Ho Chen (1), Da-Jinn Wang (2), and Yi-Li Kuo (1)
(1) Department of Electronic Engineering
National Kaohsiung University of Applied Sciences
Kaohsiung, Taiwan, R.O.C.
Email: [email protected]
(2) Department of Information Management
National Kaohsiung Marine University
Kaohsiung, Taiwan, R.O.C.
Abstract: This paper presents an automatic people
counting system based on face detection, in which a video
camera counts the number of people passing through a gate
or door. The basic idea is to first use the frame difference
to detect the rough edges of moving people and then use
chromatic features to locate each person's face. Based on
the NCC (Normalized Color Coordinates) color space, the
initial face candidates are obtained by detecting skin-color
regions, and the facial features of each candidate are then
analyzed to determine whether it is a real face. After face
detection, a person is tracked by following the detected
face and is counted when the face touches the counting
line. Experimental results show that the proposed people
counting algorithm achieves an average counting accuracy
of 80% for crowded pedestrians.
Keywords: people counting; face detection; skin color;
people tracking
I. INTRODUCTION
Owing to the rapid development of the domestic economy, many
large stores, hypermarkets, department stores, and public
transport stations are frequently crowded. Large-scale
concerts, fairs, gardens, etc. need many staff members to
provide customers with better service. Therefore, the human
resources department must have a strategy for effective
personnel management.
The number of customers is very important information
for making business decisions. It is not only used as an
index of the shopping popularity of department stores or
hypermarkets, but also helps managers arrange the
deployment of personnel. The yearly number of customers
is often used to predict peak crowd periods. In an
exhibition hall or stadium, the number of visitors passing
through the entrance can be counted and provided to the
security personnel. Effectively controlling the number of
visitors is very important. Therefore, the number of people
passing through an entrance has very high reference value.
Although traditional manual counting is highly accurate,
it incurs high labor costs. Workers may also tire or become
inattentive and miscount. To solve this problem, several
automatic people counting devices have been proposed
recently, such as the mechanical shaft device. It saves
labor costs, and its counting accuracy is satisfactory. Its
disadvantage is that only one person can pass through the
device at a time, which is inconvenient for pedestrians.
Automatic passenger counting is mainly dominated by
infrared sensing devices, which count people by the
interruption of infrared beams. From the difference between
the interruption times of two infrared sensors, the movement
direction of passengers can be determined. However, the
accuracy is poor when many passengers pass through the
gateway at the same time, and these devices cannot collect
any other information. Recently, as computer processing
speed has increased, images captured by surveillance cameras
can be analyzed with computer vision and video processing
techniques to achieve automatic control. Therefore, more and
more researchers are interested in this domain. Camera-based
approaches have the following advantages:
1). They provide higher-quality images and a
higher-performance, more accurate, and more reliable
way to carry out monitoring and control than
infrared sensing devices.
2). Surveillance cameras are convenient and economical
to install, and they are easy to maintain.
3). Information about the crowd is obtained from the
camera and sent to remote computers over the
Internet, which helps achieve real-time remote
management.
4). They satisfy real-time, reliability, and security
requirements.
Many methods based on image processing have been
proposed. Albiol et al. [1] and Chen et al. [2] mounted
cameras overhead, pointing vertically down at a specific
gateway, and captured the objects moving through a
designated area of the image sequence to achieve real-time
crowd counting.
The major problems of people counting with video
processing include lighting changes, shadows, and counting
in crowds. This paper presents a people counting system
based on face detection to address these problems. The
proposed system is divided into four parts: moving crowd
segmentation, skin color detection, face detection and
tracking, and counting. The paper is organized as follows.
The next section introduces the flow chart of the people
counting system for pedestrians passing through a gate.
Section 3 describes the detailed procedures of the method.
Experimental results obtained with the proposed method are
given in Section 4. Finally, conclusions are drawn in
Section 5.
2010 Fourth International Conference on Genetic and Evolutionary Computing
978-0-7695-4281-2/10 $26.00 2010 IEEE
DOI 10.1109/ICGEC.2010.178
II. WORK FLOW OF PROPOSED METHOD
Figure 1. The proposed method
The work flow of the people counting system is shown in
Fig. 1. There are four main components: moving object
segmentation, skin color detection, face detection, and the
counter. We use the frame-difference method to extract the
edge pixels of the moving crowd and fill in the crowd region
using the dilation operation of morphological processing
described by Gonzalez et al. [3]. After crowd segmentation,
the NCC color space is applied to detect skin-color regions,
and the facial features of each extracted skin-color region
are analyzed. Then, we mark each face region with an
exclusive label and calculate the center of the face region,
which is used to track the moving direction of each person.
According to this direction, we count the people crossing
the counting line separately for each direction of movement.
III. THE PROPOSED METHOD
A. Moving Object Segmentation
Among the many methods proposed for detecting general
moving objects, the background subtraction method [4] and
the frame-difference method [5] are widely used. The
background subtraction method constructs a background model
and retrieves the foreground image from the difference
between the current frame and the background model.
Different environments pose different detection
difficulties. For example, the background subtraction method
cannot remove the background completely when the lighting
changes, so it is usually not suitable for segmenting and
counting moving objects. To reduce the effect of lighting
changes, we use the frame-difference method to segment the
moving objects. Because the frame-difference method cannot
segment a moving object completely, the dilation operation
of morphological processing is used to fill in the moving
object. Equation (1) shows the calculation of the
frame-difference method.
$$BI(x, y, t) = \begin{cases} 255, & \text{if } |frame(x, y, t) - frame(x, y, t-1)| > T \\ 0, & \text{if } |frame(x, y, t) - frame(x, y, t-1)| \le T \end{cases} \qquad (1)$$
where BI(x, y, t) denotes the binary image at time t,
frame(x, y, t) and frame(x, y, t-1) denote the frames at
times t and t-1, respectively, and T is the threshold.
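The thresholding in Equation (1) can be illustrated with a minimal sketch. It assumes gray-level frames stored as nested Python lists; the function name `frame_difference` and the threshold value used in the example are illustrative assumptions, since the paper does not state the value of T.

```python
def frame_difference(curr, prev, threshold):
    """Binarize the absolute difference of two gray frames, as in Eq. (1).

    curr, prev: equally sized 2-D lists of gray values (0..255).
    threshold:  the value T; not given in the paper, so it is a parameter.
    """
    return [
        [255 if abs(c - p) > threshold else 0 for c, p in zip(row_c, row_p)]
        for row_c, row_p in zip(curr, prev)
    ]


# Toy 2x3 frames: only the middle column changes between t-1 and t.
prev_frame = [[10, 10, 10],
              [10, 10, 10]]
curr_frame = [[10, 90, 12],
              [10, 95, 10]]
mask = frame_difference(curr_frame, prev_frame, threshold=30)
print(mask)  # [[0, 255, 0], [0, 255, 0]]
```

Small changes such as the 2-level flicker in the top-right pixel stay below T and are suppressed, which is why the method is less sensitive to gradual lighting changes than background subtraction.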
After extracting the edge pixels of the moving object with
the frame-difference method, a median filter and dilation
are applied to remove noise and fill in the moving object.
Equation (2) gives the dilation operation.
$$D_{image} = B \oplus S = \{\, i \mid S_i \cap B \neq \emptyset \,\} \qquad (2)$$
where B and S denote the binary image and the structuring
element, respectively, and $S_i$ denotes S translated to
position i. The result of moving object segmentation is
shown in Fig. 2.
Figure 2. The result of moving object segmentation. (a) Original image (b) the
result after applying the frame-difference method (c) the dilation result (d)
the result of object segmentation
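The filling step can be sketched as a plain binary dilation. This is only an illustration: the square structuring element and its radius are assumptions (the paper does not specify the element), and the median filtering step is omitted.

```python
def dilate(binary, radius=1):
    """Dilate a binary image (values 0 or 255) with a square structuring
    element of side 2*radius + 1. The element shape is an assumption."""
    h, w = len(binary), len(binary[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if binary[y][x] == 0:
                continue
            # Stamp the structuring element around every foreground pixel,
            # clipping it at the image border.
            for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                    out[yy][xx] = 255
    return out


edges = [[0, 0, 0, 0, 0],
         [0, 0, 255, 0, 0],
         [0, 0, 0, 0, 0]]
filled = dilate(edges)
print(filled[1])  # [0, 255, 255, 255, 0]
```

A single edge pixel grows into a 3x3 block, so nearby edge fragments from the frame difference merge into one connected region.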
B. Skin Color Detection
To improve the accuracy of face detection, we apply
reference white together with the NCC (Normalized Color
Coordinates) [6] color space to detect skin-color regions
before face detection. This method was proposed by
M. Soriano et al. [7]. It balances the white color
automatically and accounts for the changes of skin color in
images caused by differences in light intensity.
The algorithm first converts the R and G components to the
normalized (r, g) chromaticity space as follows:
$$r = \frac{R}{R + G + B} \qquad (3)$$
$$g = \frac{G}{R + G + B} \qquad (4)$$
where r and g are obtained by normalizing R and G.
After the R and G are normalized, two quadratic formulas
are used to define the upper and lower bounds of the skin
locus:
$$Skin_{upper} = -1.376\,r^2 + 1.0743\,r + 0.1452 \qquad (5)$$
$$Skin_{lower} = -0.776\,r^2 + 0.5601\,r + 0.1766 \qquad (6)$$
To prevent white pixels from being classified as skin,
M. Soriano et al. [7] used the following condition to
remove white pixels:
$$w = (r - 0.33)^2 + (g - 0.33)^2 \qquad (7)$$
Combining the above conditions, the skin-color range SR is
determined as:
$$SR = \begin{cases} 1, & \text{if } (g < Skin_{upper}) \;\&\; (g > Skin_{lower}) \;\&\; (w > 0.0007) \\ 0, & \text{otherwise} \end{cases} \qquad (8)$$
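As a rough illustration, Equations (3) to (8) can be combined into a per-pixel test. The sketch assumes that the leading coefficients of both quadratics are negative, as in the skin locus of Soriano et al. [7], and that the white-pixel test applies to w as defined in Equation (7). The function name `is_skin` and the handling of black pixels are illustrative choices, not part of the paper.

```python
def is_skin(R, G, B, w_min=0.0007):
    """Classify one RGB pixel with the NCC skin-locus test (Eqs. 3-8)."""
    total = R + G + B
    if total == 0:
        return False  # black pixel: chromaticity is undefined (assumption)
    r = R / total                                       # Eq. (3)
    g = G / total                                       # Eq. (4)
    upper = -1.376 * r * r + 1.0743 * r + 0.1452        # Eq. (5)
    lower = -0.776 * r * r + 0.5601 * r + 0.1766        # Eq. (6)
    w = (r - 0.33) ** 2 + (g - 0.33) ** 2               # Eq. (7)
    return lower < g < upper and w > w_min              # Eq. (8)


print(is_skin(200, 150, 120))  # True: a typical skin tone
print(is_skin(255, 255, 255))  # False: white is rejected by the w test
print(is_skin(0, 0, 255))      # False: g lies below the lower bound
```

Because r and g are ratios, scaling all three channels by the same factor leaves the decision unchanged, which is what makes the test tolerant of intensity differences.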
C. Face Detection
After skin-color detection, the remaining regions are
defined as face candidate regions, and the features of each
candidate are analyzed. The aspect ratio of a human face
falls within an approximately fixed range. First, we give
each face candidate region an MER (minimum enclosing
rectangle) to define the ratio $R_{Face}$ of the height to
the width of the region:
$$R_{Face} = \frac{H_{Face}}{W_{Face}} \qquad (9)$$
where $H_{Face}$ and $W_{Face}$ denote the height and width
of the MER of the face candidate region, respectively. The
range of $R_{Face}$ is 0.8 to 1.3. After calculating the
aspect ratio from the MER, some non-face regions can be
removed.
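The aspect-ratio filter can be sketched as follows. The sketch assumes the MER is an axis-aligned bounding box and that a candidate region is given as a list of (x, y) pixel coordinates; the helper names `bounding_box` and `passes_aspect_ratio` are hypothetical.

```python
def bounding_box(pixels):
    """Axis-aligned enclosing rectangle of a list of (x, y) pixels."""
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    return min(xs), min(ys), max(xs), max(ys)


def passes_aspect_ratio(pixels, low=0.8, high=1.3):
    """Keep a candidate only if height / width lies in [low, high] (Eq. 9)."""
    x0, y0, x1, y1 = bounding_box(pixels)
    width = x1 - x0 + 1
    height = y1 - y0 + 1
    return low <= height / width <= high


# A 10-wide, 12-tall blob (ratio 1.2) and a 30-wide, 10-tall blob (ratio 0.33).
face_like = [(x, y) for x in range(10) for y in range(12)]
arm_like = [(x, y) for x in range(30) for y in range(10)]
print(passes_aspect_ratio(face_like))  # True
print(passes_aspect_ratio(arm_like))   # False
```

Elongated skin regions such as arms are rejected at this stage, before the more expensive eye analysis.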
We propose a new method to detect the eyes in the face
candidate region. First, we transform the RGB color image
into a gray image. Using Equations (10) and (11), we
calculate the standard deviation of the face candidate region.
$$E_{avg} = \frac{1}{W \times H} \sum_{x=0}^{W-1} \sum_{y=0}^{H-1} C_{x,y} \qquad (10)$$
$$V_{avg} = \frac{1}{W \times H} \sum_{x=0}^{W-1} \sum_{y=0}^{H-1} \left( C_{x,y} - E_{avg} \right)^2 \qquad (11)$$
where $E_{avg}$ denotes the average energy of the face
candidate region, $C_{x,y}$ denotes the value of the pixel
at $(x, y)$, $W$ and $H$ are the width and height of the
region, and $V_{avg}$ denotes the standard deviation between
$C_{x,y}$ and $E_{avg}$.
Because the eyes form contours within the face region,
the standard deviation of the eye region is larger than that
of the skin-color region. If the upper half of a face
candidate region contains an eye region, the candidate is
accepted as a face region.
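A minimal sketch of Equations (10) and (11) follows, assuming the candidate region is a 2-D list of gray values. The decision rule in `upper_half_has_eyes` is a simplification introduced here for illustration: it compares the deviation of the upper half against a hypothetical threshold `min_deviation`, which the paper does not specify.

```python
def mean_and_deviation(gray):
    """Return (E_avg, V_avg) of a gray region, following Eqs. (10)-(11).

    V_avg is computed as the mean squared difference from E_avg.
    """
    h, w = len(gray), len(gray[0])
    n = w * h
    e_avg = sum(sum(row) for row in gray) / n
    v_avg = sum((c - e_avg) ** 2 for row in gray for c in row) / n
    return e_avg, v_avg


def upper_half_has_eyes(gray, min_deviation):
    """Assumed rule: the upper half contains eyes if its deviation is large.

    The region needs at least two rows so that the upper half is not empty.
    """
    upper = gray[: len(gray) // 2]
    _, v_avg = mean_and_deviation(upper)
    return v_avg > min_deviation


flat_skin = [[120, 120], [120, 120], [120, 120], [120, 120]]
with_eyes = [[120, 30], [30, 120], [120, 120], [120, 120]]
print(mean_and_deviation(with_eyes[:2]))                # (75.0, 2025.0)
print(upper_half_has_eyes(flat_skin, min_deviation=100))  # False
print(upper_half_has_eyes(with_eyes, min_deviation=100))  # True
```

Uniform skin yields a deviation near zero, while dark eye pixels against lighter skin raise it sharply, which is the contrast the method relies on.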
D. Face tracking and counting
After determining the face regions, we give each face region
an exclusive label and calculate the center of the face
region from its MER. According to the face center and the
movement direction, the proposed method tracks and counts
the objects moving in each direction.
We set up a counting line in the image, as shown in Fig. 3.
When a moving object (a human face) enters the tracking
area, we analyze the face center to obtain the moving
direction of the tracked object. Then, we count the faces
passing through the tracking area.
Figure 3. The counting line.
When a face moves, the coordinates of its center change
over time. However, the displacement of the center between
two adjacent images is limited. We measure this displacement
with the Euclidean distance:
$$Dist = \sqrt{(x_t - x_{t-1})^2 + (y_t - y_{t-1})^2} \qquad (12)$$
where t denotes the time and (x, y) denotes the coordinates
of the center.
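The distance in Equation (12) can be used to associate a face center in frame t-1 with one in frame t. The sketch below is an assumption about how such matching might be done (nearest center within a maximum displacement); the names `center_distance`, `match_center`, and `max_dist` are illustrative and do not come from the paper.

```python
import math


def center_distance(prev_center, curr_center):
    """Euclidean distance between two face centers (Eq. 12)."""
    (x0, y0), (x1, y1) = prev_center, curr_center
    return math.sqrt((x1 - x0) ** 2 + (y1 - y0) ** 2)


def match_center(prev_center, candidates, max_dist):
    """Return the candidate nearest to prev_center, or None if no candidate
    lies within max_dist (a hypothetical displacement limit)."""
    best = min(candidates,
               key=lambda c: center_distance(prev_center, c),
               default=None)
    if best is None or center_distance(prev_center, best) > max_dist:
        return None
    return best


print(center_distance((0, 0), (3, 4)))                         # 5.0
print(match_center((10, 10), [(13, 14), (40, 40)], max_dist=10))  # (13, 14)
print(match_center((10, 10), [(40, 40)], max_dist=10))            # None
```

Limiting the match to a maximum displacement reflects the observation that a face center moves only a short distance between two adjacent images.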
Because the method counts the people crossing the gateway,
the following two conditions must be satisfied:
1).
1
( )
t t
y y