Naotoshi Seo
Tutorial: OpenCV haartraining (Rapid Object Detection With A
Cascade of Boosted Classifiers Based on Haar-like Features)
Table of Contents
Objective
Data Preparation
Create Samples (Increase Images)
createsamples
Create Description File
EXTRA: random seed
Create Training Samples
Create Testing Samples
Training
haartraining
Generate xml
Testing
performance
facedetect
Discussion
Download
References
Objective
The OpenCV library provides an interesting face detection demo. It also ships the programs (or functions)
that were used to train the classifiers for that face detection system (called HaarTraining), so we can
create our own object classifiers with them.
However, I could not reproduce exactly how the OpenCV developers performed the haartraining for their face
detection system, because they did not publish some information, such as which images they used for training.
The objective of this report is to provide a step-by-step procedure for people who want to do the same.
My working environment is Visual Studio + cygwin on Windows XP, or Linux. cygwin is required
because I use several Linux commands.
1 di 8 11/07/2008 12.54
Tutorial: OpenCV haartraining (Rapid Object Detection With A Casca... http://note.sonots.com/SciSoftware/haartraining.html
Data Preparation
Positive (Face) Images
FYI: Lists of databases are available at Face Recognition Homepage - Databases and at Computer Vision Test Images.
Kuranov et al. [3] state that they used 3000 negative images.
The collection is available in the Download section (but it may take a very long time to download).
createsamples
In this section, I describe the functionalities of the createsamples software, because the Tutorial [1] does not
explain them clearly (but please also see the Tutorial [1] for further options).
The createsamples software has four functionalities:
1. Create training samples (a vec file) from one image applying distortions
2. Create training samples (a vec file) from some images without applying distortions
3. Create test samples (images) and their ground truth (a description file) from a single image applying
distortions
4. Show images within a vec file
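The functionality that runs depends on which options are given on the command line. The following is a minimal Python sketch of that selection, not the actual createsamples source: the function name `select_mode` and the order of the checks are assumptions made for illustration, and only the option names themselves come from createsamples.

```python
def select_mode(options):
    """Return the number (1-4) of the createsamples functionality
    that a set of option names would trigger (illustrative only)."""
    opts = set(options)
    if {"-img", "-bg", "-vec"} <= opts:
        return 1  # training samples (vec file) from one image, with distortions
    if {"-img", "-bg", "-info"} <= opts:
        return 3  # test images plus a ground-truth description file
    if {"-info", "-vec"} <= opts:
        return 2  # training samples (vec file) from several images, no distortions
    if "-vec" in opts:
        return 4  # show the images stored in a vec file
    raise ValueError("no createsamples functionality matches these options")


print(select_mode(["-img", "-bg", "-vec", "-num"]))
print(select_mode(["-vec"]))
```

Extra options such as -num, -w or -h do not change which functionality is selected; they only tune it.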
1. Create training samples from one image applying distortions
This function is launched when the options -img, -bg, and -vec are specified.
For example,
$ createsamples -img face.png -num 10 -bg negatives.dat -vec samples.vec -maxxangle 0.6 -ma
Only the first <num> negative images in the <description file of negatives> are used.
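2. Create training samples from some images without applying distortions
This is triggered when the options -info and -vec are specified.
In this case, the option -num only limits the number of samples to generate; it does not generate
many samples from one image by applying distortions.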
For example,
3. Create test samples and their ground truth from a single image applying distortions
This is triggered when the options -img, -bg, and -info are specified.
In this case, -w and -h determine the minimal size of the positives to be embedded in the test images.
$ createsamples -img face.png -num 10 -bg negatives.dat -info test.dat -maxxangle 0.6 -maxy
The output files are named <number>_<x>_<y>_<width>_<height>.jpg, where x, y, width and height are
the coordinates of the bounding rectangle of the placed object.
Only the first <num> negative images in the <description file of negatives> are used.
4. Show images within a vec file
This is triggered when only the -vec option is specified (no -info, -img, or -bg). For example,
Create Description File
The first description file format lists one image file name per line:
[filename]
[filename]
[filename]
.
.
such as
img/img1.jpg
img/img2.jpg
The first format can be easily created with the find command.
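Where the find command is not available, the same one-file-name-per-line listing can be produced with a few lines of Python. This is only a sketch: the function names, the list of image extensions and the output file name are assumptions chosen for illustration.

```python
import os


def list_images(root, extensions=(".jpg", ".png", ".bmp")):
    """Collect image paths under root, sorted so the output is stable."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(extensions):
                # Use forward slashes so the file also works under cygwin.
                paths.append(os.path.join(dirpath, name).replace(os.sep, "/"))
    return sorted(paths)


def write_description_file(root, out_path):
    """Write one image file name per line and return how many were written."""
    paths = list_images(root)
    with open(out_path, "w") as f:
        for path in paths:
            f.write(path + "\n")
    return len(paths)
```

For example, `write_description_file("img", "negatives.dat")` would list every image below img/ in negatives.dat.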
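EXTRA: random seed
The createsamples software applies the same sequence of distortions to each image. We may want to apply a
different sequence of distortions to each image because, otherwise, the resulting detector may work only for
those specific distortions. A time-based random seed avoids this, for example by adding the following to the
createsamples source code: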
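#include <stdlib.h>                /* srand() */
#include <time.h>                  /* time() */
srand((unsigned int)time(NULL));   /* seed rand() with the current time so each run differs */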
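The effect of the seed can be illustrated outside of createsamples. The sketch below is a Python analogue, not the createsamples code: the function `distortion_angles` and its parameters are hypothetical, with 0.6 borrowed from the -maxxangle value used earlier. A fixed seed reproduces the same "distortions" on every run, while a time-based seed does not.

```python
import random
import time


def distortion_angles(seed, count=3, max_angle=0.6):
    """Draw `count` pseudo-random rotation angles in [-max_angle, max_angle]."""
    rng = random.Random(seed)
    return [rng.uniform(-max_angle, max_angle) for _ in range(count)]


# A fixed seed gives the same sequence every time the program runs.
fixed_a = distortion_angles(1)
fixed_b = distortion_angles(1)
print(fixed_a == fixed_b)

# A time-based seed (the analogue of srand(time(NULL))) changes between runs.
varying = distortion_angles(int(time.time()))
print(varying)
```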
Create Training Samples
Kuranov et al. [3] mention that they used 5000 positive frontal face patterns and 3000 negatives for training,
and that the 5000 positive frontal face patterns were derived from 1000 original faces.
However, you may have noticed that none of the 4 functionalities of the createsamples software can
generate 5000 positive images from 1000 images in one run. We have to use the 1st functionality of
createsamples to generate 5 (or some) positives from 1 image, repeat the procedure 1000 (or some) times,
and finally merge the generated output vec files.
I also wrote a script, createtrainsamples.pl, to repeat the procedure 1000 (or some) times. Please modify the path
to createsamples and its option parameters, which are written directly in the file. I specified 7000 instead of
5000 because the Tutorial [1] states that "the reasonable number of positive samples is 7000."
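The repetition itself is simple bookkeeping. The sketch below shows one way to do it in Python; it is not a port of createtrainsamples.pl. The function names, the file names and the output layout are assumptions, and only options that appear in this tutorial (-img, -num, -bg, -vec) are used. It spreads the requested total over the positive images and builds one createsamples command line per image without running it.

```python
def split_total(total, parts):
    """Distribute `total` samples over `parts` images as evenly as possible."""
    base, extra = divmod(total, parts)
    return [base + 1 if i < extra else base for i in range(parts)]


def build_commands(positives, bg_file, total, out_dir, exe="createsamples"):
    """Return one createsamples argument list per positive image."""
    counts = split_total(total, len(positives))
    commands = []
    for image, num in zip(positives, counts):
        if num == 0:
            continue  # nothing to generate from this image
        name = image.rsplit("/", 1)[-1]
        commands.append([exe, "-img", image, "-num", str(num),
                         "-bg", bg_file, "-vec", f"{out_dir}/{name}.vec"])
    return commands


cmds = build_commands(["faces/a.png", "faces/b.png", "faces/c.png"],
                      "negatives.dat", 7, "vecs")
for cmd in cmds:
    print(" ".join(cmd))
```

Each argument list could then be passed to `subprocess.run`, and the resulting vec files still have to be merged afterwards.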
Example)
Create Testing Samples
Testing samples are negative images with positives embedded in them at known locations. We can use the 3rd
functionality of createsamples to create them. However, it accepts only one positive image at a time, so a
script that repeats the procedure helps. The script is available at svn:createtestsamples.pl.
Please modify the path to createsamples and its option parameters directly in the file.
This generates lots of jpg files and info.dat. The jpg files are named
<number>_<x>_<y>_<width>_<height>.jpg, where x, y, width and height are the coordinates of the bounding
rectangle of the placed object.
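Because the ground truth is encoded in the file name, it can be recovered without opening info.dat. The following Python sketch parses a name of the form <number>_<x>_<y>_<width>_<height>.jpg; the function name and the example file name are made up for illustration.

```python
import os
import re

# <number>_<x>_<y>_<width>_<height>.jpg
_NAME = re.compile(r"^(\d+)_(\d+)_(\d+)_(\d+)_(\d+)\.jpg$")


def parse_test_sample_name(path):
    """Split a test sample file name into its integer fields."""
    match = _NAME.match(os.path.basename(path))
    if match is None:
        raise ValueError(f"unexpected test sample name: {path}")
    number, x, y, width, height = (int(g) for g in match.groups())
    return {"number": number, "x": x, "y": y, "width": width, "height": height}


print(parse_test_sample_name("tests/0001_0052_0030_0040_0040.jpg"))
```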
Example)
Training
Training is done by the haartraining software.
haartraining
Kuranov et al. [3] state that a sample size of 20x20 achieved the highest hit rate. Furthermore, they state that
"For 18x18 four split nodes performed best, while for 20x20 two nodes were slightly better. The difference between
weak tree classifiers with 2, 3 or 4 split nodes is smaller than their superiority with respect to stumps."
Furthermore, there was a description that 20 stages were trained and that, assuming the test set is
representative for the learning task, a certain false alarm rate and hit rate can be expected from the whole cascade.
Therefore, I used a sample size of 20x20 with nsplits = 2, nstages = 20, minhitrate = 0.9999, maxfalsealarm = 0.5,
such as
$ haartraining -data trainout -vec samples.vec -bg negatives.dat -nstages 20 -nsplits 2 -mi
The "-mode ALL" uses Extended Sets of Haar-like Features [2]. Default is BASIC and it uses only upright
features, while ALL uses the full set of upright and 45 degree rotated feature set [1].
The "-mem 512" is the available memory in MB for precalculation [1]. The default is 200MB, so increase it if
more memory is available. We should not use all of the system RAM, because that would slow down training
considerably.
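The expected rates of a whole cascade follow from the per-stage settings: a window must pass every stage, so the per-stage rates multiply. The sketch below works this out for the parameters chosen above (nstages = 20, minhitrate = 0.9999, maxfalsealarm = 0.5). It assumes that every stage meets its targets exactly and that the stages act independently, which is an idealization; the function name is made up for illustration.

```python
def cascade_rates(nstages, minhitrate, maxfalsealarm):
    """Overall (hit rate, false alarm rate) of a cascade under the
    idealized assumption that the per-stage rates simply multiply."""
    hit_rate = minhitrate ** nstages
    false_alarm = maxfalsealarm ** nstages
    return hit_rate, false_alarm


hit_rate, false_alarm = cascade_rates(20, 0.9999, 0.5)
print(f"hit rate about {hit_rate:.4f}, false alarm rate about {false_alarm:.2e}")
# prints: hit rate about 0.9980, false alarm rate about 9.54e-07
```

This also shows why minhitrate has to be very close to 1: with a per-stage hit rate of 0.99, for example, 20 stages would keep only about 82% of the true objects.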
# If you use a compiler that supports OpenMP (multi-processing), such as the Intel C++ compiler or MS Visual
Studio 8.0, haartraining is automatically built with OpenMP. This is effective when your computer has
multiple CPUs.
Generate xml
The haartraining software does not generate the cascade.xml used by the facedetect software.
Although the Tutorial [1] does not mention how to generate a cascade.xml, OpenCV includes a program that
converts a haartraining output directory tree into an xml file: OpenCV/samples/c/convert_cascade.c (that is, in
your installation directory). Compile it.
Example)
Testing
under construction. I am not achieving sufficient results....
performance
We will evaluate the performance of the generated classifier using the performance software.
such as
+================================+======+======+======+
| File Name | Hits |Missed| False|
+================================+======+======+======+
'Hits' shows the number of correctly found objects, 'Missed' shows the number of missed objects (objects that
exist but were not found, also known as false negatives), and 'False' shows the number of false alarms
(detections where no object exists, also known as false positives).
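These three counts are enough to derive the usual summary figures. The Python sketch below is not part of the performance software; the function name and the example counts are hypothetical. It computes the hit rate and the fraction of detections that were correct.

```python
def detection_summary(hits, missed, false_alarms):
    """Summarize detector quality from the Hits / Missed / False counts."""
    objects = hits + missed            # objects that really exist
    detections = hits + false_alarms   # everything the detector reported
    return {
        "objects": objects,
        "hit_rate": hits / objects if objects else 0.0,
        "precision": hits / detections if detections else 0.0,
        "false_alarms": false_alarms,
    }


# Hypothetical counts, for illustration only.
summary = detection_summary(hits=90, missed=10, false_alarms=30)
print(summary)
```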
facedetect
Discussion
The generated classifier did not work well. Specializing the classifier for frontal faces only may achieve a better
result (the UMIST images include 90 degree faces for each person).
If the trained face detection system is to be used only indoors, using simpler negative images (pictures
of plain walls, PCs, chairs, desks, etc. only) would provide better accuracy in that constrained environment.
Download
The files are available at http://tutorial-haartraining.googlecode.com/svn/trunk/ (mirror:
http://opensvn.csie.org/sonots/SciSoftware/haartraining/).
This is a svn repository, so you can download all the files at once if you have a svn client. (Sorry, but I will
not explain how to install or use the svn client here because it is general knowledge.) For example,
svn co http://tutorial-haartraining.googlecode.com/svn/trunk/HaarTraining/ .
Sorry, but downloading (checking out) the image datasets may take a very long time. I created a zip file once,
but the google code repository did not allow me to upload such a big file (100MB).
The following additional utilities can be obtained from OpenCV/samples/c in your OpenCV install directory (I
also put them in HaarTraining/ directory).
convert_cascade.c
facedetect.c
References
[1] Rapid Object Detection With A Cascade of Boosted Classifiers Based on Haar-like Features - OpenCV
haartraining Tutorial (This can be obtained from OpenCV/apps/HaarTraining/doc on your OpenCV install
directory).
[2] Rainer Lienhart and Jochen Maydt. An Extended Set of Haar-like Features for Rapid Object Detection.
Submitted to ICIP2002. ( http://www.lienhart.de/ICIP2002.pdf )
[3] Alexander Kuranov, Rainer Lienhart, and Vadim Pisarevsky. An Empirical Analysis of Boosting
Algorithms for Rapid Objects With an Extended Set of Haar-like Features. Intel Technical Report
MRL-TR-July02-01, 2002. ( http://www.lienhart.de/Publications/DAGM2003.pdf )
OpenCV haartraining (Rapid Object Detection With A Cascade of Boosted Classifiers Based on Haar-like
Features) - Naotoshi Seo
Learning-Based Computer Vision with Intel's Open Source Computer Vision Library
How-to build a cascade of boosted classifiers based on haar like features [pdf]
Empirical Analysis of Detection Cascades of Boosted Classifiers for Rapid Object Detection [pdf]
5000 positive frontal face patterns and 3000 negative. 5000 positive frontal face patterns were derived from
1000 original face patterns by random rotation, scaling, mirroring, shifting. 20x20 of sample size with
nsplit = 2. nstages = 20
OpenCV for Linux: Rapid object detection using Haar-like features
OpenCV: Object detector functions
http://ai-www.naist.jp/people/kohei-t/OpenCV/reference/ref/OpenCVRef_Experimental.htm (Japanese translation)
An Extended Set of Haar-like Features for Rapid Object Detection [pdf]
MERL – TR2003-096 – Fast Multi-view Face Detection
Rapid object detection using a cascade of boosted classifiers based on Haar-like features
haartraining Tutorial (Japanese translation)
Object Detection Using Haar-like Features with Cascade of Boosted Classifiers
Manual to create cascade xml file. Same with the file located in OpenCV directory.
OpenCV: Object Detection Functions
FaceDetection - OpenCV Library Wiki
Boosting Haar-like features. The basic classifiers are decision-tree classifiers with at least 2 leaves.