521H0290 DoMinhQuan Assignment3
Đỗ Minh Quân
Faculty: Information Technology
1. Face detection
What is Face Detection?
Face detection involves identifying a person’s face in an image or video. This is done by
analyzing the visual input to determine whether a person’s facial features are present.
Since human faces are so diverse, face detection models typically need to be trained on
large amounts of input data for them to be accurate. The training dataset must contain a sufficient
representation of people who come from different backgrounds, genders, and cultures.
These algorithms also need to be fed many training samples comprising different
lighting, angles, and orientations to make correct predictions in real-world scenarios.
These nuances make face detection a non-trivial, time-consuming task that requires hours
of model training and millions of data samples.
How Does Face Detection Work?
Haar-feature selection: A Haar-like feature consists of dark regions and light regions. It
produces a single value by taking the difference between the sum of the intensities of the dark
regions and the sum of the intensities of the light regions. This value is used to extract the
elements necessary for identifying an object. Viola and Jones proposed two-, three-, and
four-rectangle features.
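As a minimal sketch of how one such value could be computed, the example below evaluates a two-rectangle (vertical edge) feature on a small grayscale window with NumPy. The function name two_rectangle_feature, the left/right layout of the light and dark regions, and the sign convention (dark minus light) are assumptions made for illustration, not part of the original description.

```python
import numpy as np

def two_rectangle_feature(window: np.ndarray) -> float:
    """Two-rectangle Haar-like feature on a grayscale window.

    Assumed layout: the left half of the window is the 'light' region
    and the right half is the 'dark' region of the feature template.
    """
    _, width = window.shape
    half = width // 2
    light_sum = window[:, :half].sum()          # intensities under the light region
    dark_sum = window[:, half:2 * half].sum()   # intensities under the dark region
    # Assumed sign convention: dark sum minus light sum.
    return float(dark_sum - light_sum)

# Toy window: bright pixels on the left, dark pixels on the right.
window = np.array([[200, 200, 10, 10],
                   [200, 200, 10, 10]], dtype=np.float64)
value = two_rectangle_feature(window)
print(value)
```

A value with a large magnitude suggests a strong vertical edge inside the window, while a value near zero suggests a roughly uniform region.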
Creation of Integral Images: Each pixel in the integral image holds the sum of all the
pixels above it and to its left in the original image, including the pixel itself. Since
extracting Haar-like features involves calculating the difference between dark and light
rectangular regions, the integral image lets the sum over any rectangle be computed from
just four lookups, which significantly reduces the time needed to complete this task.
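The idea can be sketched with NumPy as follows. This is an illustrative version, not a reference implementation: the helper names integral_image and rect_sum are hypothetical, and the table is padded with a leading row and column of zeros (a common convention) so that ii[y, x] holds the sum of img[:y, :x].

```python
import numpy as np

def integral_image(img: np.ndarray) -> np.ndarray:
    """Zero-padded integral image: ii[y, x] is the sum of img[:y, :x]."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii: np.ndarray, top: int, left: int, height: int, width: int) -> int:
    """Sum of the pixels in a rectangle, using four lookups in the integral image."""
    bottom, right = top + height, left + width
    return int(ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left])

img = np.arange(1, 13).reshape(3, 4)   # 3x4 toy image with values 1..12
ii = integral_image(img)
total = rect_sum(ii, top=1, left=1, height=2, width=2)
print(total, img[1:3, 1:3].sum())      # both give the same rectangle sum
```

Because every rectangle sum costs four lookups regardless of its size, a Haar-like feature built from two or three rectangles can be evaluated in constant time once the integral image exists.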
AdaBoost Training: This algorithm selects the best features from all candidate features. It
combines multiple “weak classifiers” (each built on one of the best features) into one
“strong classifier”. The resulting “strong classifier” is a weighted linear combination of
the “weak classifiers”.
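To make the combination step concrete, here is a small sketch of discrete AdaBoost with threshold "stumps" as weak classifiers. It is a textbook-style illustration under stated assumptions: labels are in {-1, +1}, each weak classifier thresholds a single feature, and the names adaboost_train and strong_classify are hypothetical. The exact weighting used in the Viola-Jones paper differs in detail.

```python
import numpy as np

def adaboost_train(X: np.ndarray, y: np.ndarray, n_rounds: int):
    """X: (n_samples, n_features) feature values; y: labels in {-1, +1}."""
    n_samples, n_features = X.shape
    weights = np.full(n_samples, 1.0 / n_samples)
    stumps = []
    for _ in range(n_rounds):
        best = None
        # Pick the single-feature threshold with the lowest weighted error.
        for j in range(n_features):
            for thr in np.unique(X[:, j]):
                for polarity in (1, -1):
                    pred = np.where(polarity * X[:, j] < polarity * thr, -1, 1)
                    err = weights[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, j, thr, polarity, pred)
        err, j, thr, polarity, pred = best
        err = np.clip(err, 1e-10, 1 - 1e-10)       # avoid division by zero
        alpha = 0.5 * np.log((1 - err) / err)      # weight of this weak classifier
        weights *= np.exp(-alpha * y * pred)       # emphasize misclassified samples
        weights /= weights.sum()
        stumps.append((alpha, j, thr, polarity))
    return stumps

def strong_classify(stumps, X: np.ndarray) -> np.ndarray:
    """Sign of the weighted sum of the weak classifiers' votes."""
    score = np.zeros(X.shape[0])
    for alpha, j, thr, polarity in stumps:
        score += alpha * np.where(polarity * X[:, j] < polarity * thr, -1, 1)
    return np.where(score >= 0, 1, -1)

# Toy data: feature 0 separates the classes, feature 1 is noise.
X = np.array([[1, 5], [2, 1], [3, 4], [4, 2]], dtype=float)
y = np.array([-1, -1, 1, 1])
stumps = adaboost_train(X, y, n_rounds=3)
preds = strong_classify(stumps, X)
print(preds)
```

In a face detector the columns of X would be Haar-like feature values, so choosing the best stump in each round also acts as feature selection.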
Cascade Classifier: This method chains increasingly complex classifiers, each trained with
AdaBoost, into a cascade. Negative inputs (non-faces) are discarded quickly by the early
stages, while more computation is spent on promising, face-like regions. This
significantly reduces the computation time and makes the process more efficient.
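The early-rejection logic can be sketched as below. The stages here are hypothetical stand-ins (simple lambda checks on a list of numbers) rather than trained classifiers, and the function name cascade_classify is an assumption; the point is only to show that a window stops being processed as soon as any stage rejects it.

```python
from typing import Callable, List, Sequence, Tuple

Stage = Callable[[Sequence[float]], bool]

def cascade_classify(window: Sequence[float], stages: List[Stage]) -> Tuple[bool, int]:
    """Run the stages in order; return (accepted, number_of_stages_evaluated)."""
    for index, stage in enumerate(stages):
        if not stage(window):
            # Rejected: later, more expensive stages are never run.
            return False, index + 1
    return True, len(stages)

# Hypothetical stages, ordered from cheapest to most selective.
stages: List[Stage] = [
    lambda w: sum(w) > 10,           # stage 1: enough overall intensity
    lambda w: max(w) - min(w) > 3,   # stage 2: enough contrast
    lambda w: w[0] < w[-1],          # stage 3: a specific intensity pattern
]

rejected = cascade_classify([0, 1, 2], stages)   # fails stage 1 immediately
accepted = cascade_classify([1, 5, 9], stages)   # passes every stage
print(rejected, accepted)
```

Since most windows in a typical image do not contain a face, most of them exit after the first one or two stages, which is where the time saving comes from.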
Normalize Blocks
Purpose: Enhance robustness by normalizing gradient information within blocks.
Details:
Divide the cells into overlapping blocks (e.g., 2x2 cells with 50% overlap).
Normalize the histograms within each block to handle variations in illumination.
Normalization can be done by dividing the histogram values by the block's L2-norm.
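The block normalization described above can be sketched with NumPy as follows. This is a minimal illustration under assumptions: the per-cell gradient histograms are already available as an array of shape (cells_y, cells_x, bins), blocks are 2x2 cells with a stride of one cell (which gives 50% overlap), and a small eps is added to avoid division by zero. The function name normalize_blocks and its parameters are hypothetical.

```python
import numpy as np

def normalize_blocks(cell_hists: np.ndarray, block: int = 2,
                     stride: int = 1, eps: float = 1e-6) -> np.ndarray:
    """L2-normalize overlapping blocks of cell histograms.

    cell_hists has shape (cells_y, cells_x, bins). Each output vector is the
    concatenation of the histograms in one block, divided by its L2 norm.
    """
    cells_y, cells_x, _ = cell_hists.shape
    rows = []
    for y in range(0, cells_y - block + 1, stride):
        row = []
        for x in range(0, cells_x - block + 1, stride):
            v = cell_hists[y:y + block, x:x + block].ravel()
            row.append(v / np.sqrt(np.sum(v ** 2) + eps ** 2))
        rows.append(row)
    return np.array(rows)

# Toy input: a 3x3 grid of cells with 9 orientation bins each.
rng = np.random.default_rng(0)
cell_hists = rng.random((3, 3, 9))
blocks = normalize_blocks(cell_hists)
print(blocks.shape)   # (2, 2, 36): 2x2 blocks, each 2*2*9 values
```

Scaling all histogram values by a constant, as a uniform change in illumination contrast roughly would, leaves the normalized block vectors almost unchanged, which is the robustness the step aims for.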