Age Estimation
Abstract—This paper investigates the use of the SVM-SMO algorithm in estimating a person's age through the evaluation of facial features in both front- and side-view face orientations. The Harris-Stephens, SURF, and Minimum Eigenvalue feature detection algorithms were used for feature extraction. During the experiments, training sets composed of 44 front-view images and 44 side-view images were used to train the classifier. Testing was performed on 140 front-view images and 44 side-view images. Results of the experiment show an age recognition rate of 53.85% for front-view images and 14.3% for side-view images.

Index Terms—age determination, image processing, neural networks

I. INTRODUCTION

The availability of robust face recognition algorithms has enabled many studies in image processing, such as expression recognition, smile detection, and identity recognition. One of the less explored areas in face analysis is age determination. Little work has been done on determining a person's age through image processing, and the existing studies tested age determination only on front-view images. To judge someone's age, people usually look at the wrinkling of the face, similar to the study of [1], but this poses challenges for people who have wrinkles due to frequent smoking, drinking, overexposure to the sun, and sleep deprivation, among others. Another method of determining age is the analysis of a selected set of facial feature points, for instance the structure of the facial bones, similar to the study of [2] on how the mandible continues to enlarge over the course of life.

Support Vector Machine-Sequential Minimal Optimization (SVM-SMO) has been used in age classification and has shown good accuracy, but only when applied to a wide range of age groups. In the study of [3], the authors used what they determined to be the optimized facial feature points for the facial measurements used in classifying age.

The purpose of this study is to determine a person's age by analyzing a set of facial feature points from an image. The method is divided into two parts: (1) face detection and facial feature identification, and (2) age determination. For the first part, the Viola-Jones object detection framework was used to detect whether an image contains a face and to identify the required facial feature points. The second part measures the distances and angles between the selected set of facial feature points, and the results are fed to an SVM-SMO classifier for age determination.

II. SCOPE AND LIMITATION

The study covers multiple human faces in an image regardless of size, distance, or orientation (front or side view). The image can be either colored or black and white. The system should also be able to estimate age regardless of expression, and even when the subject has wrinkles caused by smoking, drinking, sleep deprivation, overexposure to the sun, or other external factors that contribute to "looking old." For ease, images of Filipino faces were used in the study.

The study did not cover faces obscured by sunglasses or eyeglasses, masks, tattoos, or any foreign objects covering the feature points used by the application. The images were not blurry or fuzzy, and luminance levels were normal, i.e., the face was recognizable. The study also did not cover images in which the subject's face had been altered by cosmetic surgery, injury, illness, or scars.

III. METHODS

The overall process used in this study is shown in Fig. 1. Training images were initially fed to the model before testing of other images was performed. In both scenarios, all images passed through the pre-processing, feature extraction, and classification components.

Figure 1. System block diagram

Manuscript received June 25, 2014; revised November 19, 2014.
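As background on the detection stage: the Viola-Jones framework owes its speed to integral images, which make the sum over any rectangular (Haar-like) feature region a constant-time lookup. The following is an illustrative sketch in Python; the study itself used MATLAB's Vision toolbox, not this code.

```python
def integral_image(img):
    """Summed-area table with an extra zero row and column, so that
    rect_sum needs no boundary checks."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            ii[y + 1][x + 1] = (img[y][x] + ii[y][x + 1]
                                + ii[y + 1][x] - ii[y][x])
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the w-by-h rectangle with top-left corner (x, y),
    computed in constant time from the integral image."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]
```

A Haar-like feature is then just the difference of two or more such rectangle sums, which is why thousands of features can be evaluated per window.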
A. Image Acquisition
All images used in the training and testing phases of this research were captured with a typical digital camera at a resolution of 1280×760 px. Image acquisition was done in a controlled environment, that is, with proper illumination and lighting and a light-colored background. Several images were converted to grayscale using the grayscale option in Photoshop and in Photoscape.
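The grayscaling above was done in Photoshop/Photoscape; as an illustration of what that step computes, here is a common luminance approximation (the ITU-R BT.601 weights — an assumption, since Photoshop's exact conversion is not specified in the paper):

```python
def to_grayscale(rgb_image):
    """Convert a nested-list RGB image [[(r, g, b), ...], ...] into
    0-255 grayscale intensities using BT.601 luma weights."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b)
             for (r, g, b) in row]
            for row in rgb_image]
```

For example, a pure-red pixel (255, 0, 0) maps to intensity 76 under these weights.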
B. Face Detection for Front and Side View

Detection of the face region for both front- and side-view faces is described below.

1) Front view: Viola-Jones
In this study, the Vision toolbox in MATLAB was used to detect whether an image contains front-view faces. The toolbox uses the Viola-Jones object detection framework as its detection method.

2) Side view: training using the Cascade Training GUI
The Cascade Training GUI is an interactive GUI for managing the selection and positioning of rectangular ROIs in a list of images and for specifying the ground truth for training algorithms. In this study, it was used to train the detectors needed for side-view face detection. Each side and each feature (ears, chin, and eyes) was trained separately for a higher detection rate. Each training set was trained differently in order to meet the program's requirements. The GUI consists of stages that have to be adjusted manually to minimize false detections, and each set had its feature type set to LBP (local binary patterns).

a) Side view
Training of the side views had to cover all the features used for feature extraction. The ROIs start just above the eyebrows, below the chin, and behind the ears. The false alarm rate was set to 0.1, the number of stages was set to eight, and the negative samples factor was set to 5. Manually setting the object training size to 32×34 increased the detection rate. Fig. 2 shows the selection for side-view face detection training.

b) Ear

Figure 3. Ears region of interest

c) Chin
The training of the chin was quite an ordeal, as with the eye and the ear. A procedure similar to that used for the ears was applied to the chin: the facial feature was selected in the Cascade Training GUI by manually inputting the ROI and drawing a rectangle over the chin, starting from its base and extending toward the lips while forming the smallest possible rectangle on both sides, as shown in Fig. 4.

Figure 4. Chin region of interest

During training, up to 200+ images for each side were used in order to increase detection. Numerous changes to the settings were made to approach 100% accuracy, most of them to the per-stage false alarm rate and the per-stage true positive rate. The number of cascade stages was lowered to 7 to increase accuracy, the negative samples factor was set to 4 so that several hundred negative samples would be compared, and the object training size was set to 23×26, since specifying a small size increases precision.

d) Eye
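The cascade detectors above use LBP (local binary patterns) as their feature type. As a minimal sketch of what a basic 3×3 LBP code is (illustrative only — MATLAB's trainCascadeObjectDetector computes these internally):

```python
def lbp_code(patch):
    """8-bit local binary pattern code for a 3x3 patch: each neighbour
    that is >= the centre pixel contributes a 1-bit, read clockwise
    starting from the top-left corner."""
    centre = patch[1][1]
    # clockwise neighbour coordinates, starting at the top-left
    coords = [(0, 0), (0, 1), (0, 2), (1, 2),
              (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for (y, x) in coords:
        code = (code << 1) | (1 if patch[y][x] >= centre else 0)
    return code
```

Because the code depends only on intensity ordering, LBP features are robust to monotonic lighting changes, which suits the varied side-view images used here.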
A local binary pattern (LBP) facial feature extractor was used with an image size of 1280 by 720 pixels. The accuracy is 95% for both sides, with a false detection rate of less than 3%.

C. Feature Detection

1) Front-view: Viola-Jones
In a similar fashion to the studies of [4] and [5], the researchers needed to detect the left and right eyeballs and the mouth of a front-view face. The researchers also incorporated elements from the study of [4] into this research.

The Vision Toolbox was used to detect the different facial feature points on the front-view face. As with face detection, it uses the Viola-Jones object detection framework as its detection method.

First, the 'EyePairBig' model was used to check whether the face region contains both the left and right eyes. If this detection fails, image processing for the region is aborted.

Second, the 'LeftEye' model was used to search the face region for the left eye. To eliminate the various false detections this model may produce, each detected 'LeftEye' region's point coordinates are compared with the coordinate of the 'EyePairBig' region. The nearest region is selected as the 'LeftEye' region, and the rest are discarded.

Third, the 'RightEye' model was used to search the face region for the right eye. It followed a process similar to that of the 'LeftEye', except that each detected 'RightEye' region's point coordinate plus the length of the region was compared with the 'EyePairBig' region's point coordinate plus the length of that region.

Then, the 'Mouth' model was used to search the face region for the mouth.

2) Side-view: SURF features and the Harris-Stephens algorithm

a) Ear
The corner detector used to extract the feature from the ear is SURF (Speeded-Up Robust Features). After the ear was detected, the region where the ear is located was cropped and converted into grayscale, as needed for all corner detectors. The corners were then detected using the detectSURFFeatures() method, and the features were extracted using the extractFeatures() method of the toolbox. The strongest points were located by computing the mean, which is used to plot the location of the ear, as shown in Fig. 6.

Figure 6. Strongest points using SURF

b) Eye
The corner detector used to extract the feature from the eye was the Harris-Stephens algorithm. After the eye was detected, the region where the eye is located was cropped and converted into grayscale. The corners were then detected using the detectHarrisFeatures() method, and the strongest features were extracted using the corners.selectStrongest() method of the toolbox. The strongest points were located by computing the mean, which is used to plot the location of the eye, as shown in Fig. 7.

Figure 7. Strongest points of eye

c) Chin
The corner detector used to extract the feature from the chin was the Harris-Stephens algorithm. The procedure is similar to the eye detection.

D. Training of SVM-SMO

To implement the SVM-SMO, the WEKA (Waikato Environment for Knowledge Analysis) data mining software was used. WEKA requires the creation of an ARFF file in order to perform its calculations. The ARFF file contains the data acquired from the previous steps, specifically the measurements and angles of the facial feature points. Before the output model was created, a kernel function was selected. The researchers used the default kernel function in WEKA, the polynomial kernel with an exponent of 1.

IV. EXPERIMENTS

A. Front-View Age Classification
Table I shows the number of subjects gathered for testing. During testing, 98.57% (138) of the subjects passed the face detection stage. In the feature extraction stage, only 78 of the 138 (56.52%) passed detection, and most of these incurred slight errors. In the age classification stage, only 42 of the 78 (53.85%) were assigned the correct age category. Overall, the system correctly classified only 42 of the 140 subjects, an accuracy of 30%.
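Section III-D describes measuring distances and angles between the facial feature points and writing them to an ARFF file for WEKA. A minimal Python sketch of that measurement-and-export step follows; the point names and attribute labels are hypothetical, not the optimized feature set of [3]:

```python
import math

def distance(p, q):
    """Euclidean distance between two (x, y) feature points."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def angle_at(vertex, a, b):
    """Interior angle in degrees at `vertex`, formed by rays to a and b."""
    ang = abs(math.degrees(
        math.atan2(a[1] - vertex[1], a[0] - vertex[0])
        - math.atan2(b[1] - vertex[1], b[0] - vertex[0])))
    return 360.0 - ang if ang > 180.0 else ang

def to_arff(relation, attributes, classes, rows):
    """Serialize numeric feature rows plus a nominal age class into
    WEKA's ARFF text format."""
    lines = ["@relation " + relation]
    lines += ["@attribute %s numeric" % name for name in attributes]
    lines.append("@attribute age_class {%s}" % ",".join(classes))
    lines.append("@data")
    lines += [",".join(str(v) for v in feats) + "," + label
              for feats, label in rows]
    return "\n".join(lines)
```

Hypothetical usage: build one row per subject, e.g. `[distance(left_eye, right_eye), angle_at(chin, left_eye, right_eye)]`, then pass all rows to `to_arff("age", ["eye_dist", "chin_angle"], ["-10", "11-15"], rows)` and load the result in WEKA.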
TABLE I. AGE TEST DATA FRONT-VIEW

Age Category | Subjects | Face Detection | Feature Extraction | Age Classification
-10          | 5        | 5/5            | 5/5                | 1/5
11-15        | 13       | 13/13          | 7/13               | 1/7
16-20        | 59       | 59/59          | 44/59              | 39/44
21-25        | 24       | 24/24          | 11/24              | 0/11
26-30        | 6        | 6/6            | 0/6                | 0/0
31-35        | 2        | 1/2            | 0/1                | 0/0
36-40        | 7        | 7/7            | 2/7                | 0/2
41-45        | 10       | 10/10          | 3/10               | 0/3
46-50        | 6        | 5/6            | 2/5                | 0/2
51+          | 8        | 8/8            | 2/8                | 0/2
Total        | 140      | 138/140        | 78/138             | 42/78
Percentage   |          | 98.57%         | 56.52%             | 53.85%

Fig. 8 shows the different types of detection that the system performs on colored images. Fig. 8(1) shows a perfect detection and feature extraction, with the points exactly where they should be.

B. Side-View Age Classification
Table II shows the results for side-view testing. Of the 14 subjects that passed feature extraction, 2 (14.3%) were correctly classified. Although the age classifier had an accuracy of 84.1% on its training set, the testing set managed only 2.27% overall.

TABLE II. AGE DATA SET SIDE-VIEW
Testing Control Summary

Age Category | No. of Subjects | Face Detection | Feature Extraction | Correctly Classified
-10          | 10              | 9/10           | 1/9                | 0/1
11-15        | 10              | 10/10          | 0/10               | 0/0
16-20        | 10              | 10/10          | 8/10               | 2/8
21-25        | 10              | 10/10          | 5/10               | 0/5
26-30        | 6               | 6/6            | 0/6                | 0/0
31-35        | 5               | 3/5            | 0/3                | 0/0
36-40        | 9               | 9/9            | 0/9                | 0/0
41-45        | 10              | 10/10          | 0/10               | 0/0
46-50        | 8               | 8/8            | 0/8                | 0/0
51+          | 10              | 10/10          | 0/10               | 0/0
Total        | 88              | 85/88          | 14/85              | 2/14
Percentage   |                 | 96.6%          | 16.5%              | 14.3%
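The overall figures reported above follow from the per-stage counts in Tables I and II; a small Python check of that arithmetic:

```python
# Stage counts copied from Tables I and II: (passed, attempted) for
# face detection, feature extraction, and age classification.

def overall_accuracy(stages):
    """Final correct count over the initial number of subjects."""
    return stages[-1][0] / stages[0][1]

front_view = [(138, 140), (78, 138), (42, 78)]
side_view = [(85, 88), (14, 85), (2, 14)]

print(round(overall_accuracy(front_view) * 100, 1))  # 30.0, as reported
print(round(overall_accuracy(side_view) * 100, 2))   # 2.27, as reported
```

This makes the weak link visible: the side-view pipeline loses most subjects at feature extraction (14/85), not at classification.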
Kim Lopena is a Filipino born in Dumaguete City in 1989. He attained his B.S. in Computer Science degree from Silliman University, Dumaguete City, in 2014. He is currently interning for Rentah Inc., Brooklyn, New York, as an IT/Computer Systems (back-end) developer.

Mehdi Salemiseresht was born in Tehran, Iran, in 1982. He attained his B.S. in Computer Science degree from Silliman University, Dumaguete City, in 2014. He is currently working for NetworkLabs Nokia, Manila, Philippines.

Arby Moay is a Filipino born in Dapitan City in 1993. He was a scholar at Philippine Science High School - CMC and attained a B.S. in Computer Science degree from Silliman University, Dumaguete City, in 2014. He is currently working as an R&D Engineer I for NetworkLabs Nokia, TechnoHub, Quezon City, and aspires to become a Software Architect for NWL.

Chuchi Montenegro was born in Dapitan City, Philippines, in 1971. She received her B.S. in Computer Engineering degree from Cebu Institute of Technology – University, Cebu City, in 1992, and her Master in Computer Science from the same university in 2009. She is an assistant professor in the College of Computer Studies, Silliman University, Dumaguete City. Her research interests are in the fields of neural networks, signal processing, and speech recognition.