
521H0290 DoMinhQuan Assignment3

The document discusses face detection and human (pedestrian) detection techniques in computer vision. Face detection involves identifying facial features using algorithms like Haar-feature selection and AdaBoost training, while human detection utilizes the Histogram of Oriented Gradients (HOG) method to describe object shapes and appearances. Key steps in HOG include image normalization, gradient computation, and histogram accumulation to create a robust feature descriptor for object detection.

Uploaded by

521h0290

Student ID (MSSV): 521H0290

Đỗ Minh Quân
Faculty: Information Technology

Google Colab link: https://colab.research.google.com/drive/1yLKu2hX0gJkasMz8EuzylO4Gu9eSAGtV?usp=sharing

1. Face detection
What is Face Detection?
Face detection involves identifying a person’s face in an image or video. This is done by
analyzing the visual input to determine whether a person’s facial features are present.
Since human faces are so diverse, face detection models typically need to be trained on
large amounts of input data for them to be accurate. The training dataset must contain a sufficient
representation of people who come from different backgrounds, genders, and cultures.
These algorithms also need to be fed many training samples comprising different
lighting, angles, and orientations to make correct predictions in real-world scenarios.
These nuances make face detection a non-trivial, time-consuming task that can require hours
of model training and a very large number of training samples.
How does face detection work?
The steps below follow the detection framework proposed by Viola and Jones.

• Haar-feature selection: A Haar-like feature consists of adjacent dark and light rectangular
regions. It produces a single value by taking the difference between the sum of the pixel
intensities in the dark regions and the sum of the pixel intensities in the light regions. This
value captures local contrast patterns that are useful for identifying an object. The features
proposed by Viola and Jones are two-rectangle (edge), three-rectangle (line), and
four-rectangle features.
• Creation of Integral Images: Each pixel in the integral image holds the sum of all the
pixels above it and to its left, including the pixel itself. With this table, the sum over any
rectangle can be obtained from four lookups, regardless of the rectangle's size. Since
extracting Haar-like features involves calculating the difference between sums over dark
and light rectangular regions, integral images significantly reduce the time needed to
complete this task.
• AdaBoost Training: This algorithm selects the most discriminative features from the full
set of features. Each selected feature acts as a “weak classifier”, and AdaBoost combines
them into one “strong classifier”, which is a weighted linear combination of the
“weak classifiers”.
• Cascade Classifier: This method chains increasingly complex classifiers (each trained
with AdaBoost) into a cascade, so that negative inputs (non-faces) are discarded quickly
by the early stages while more computation is spent on promising, face-like regions. It
significantly reduces the computation time and makes the process more efficient.
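
The integral-image and Haar-feature steps above can be sketched as follows. This is a minimal illustration in plain Python (no OpenCV or NumPy), not the Viola-Jones implementation itself. The function names and the choice of treating the left half of the window as the dark region are assumptions made for this example.

```python
def integral_image(img):
    """Build a summed-area table with one extra zero row and column.

    ii[y][x] holds the sum of img[0..y-1][0..x-1].
    """
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle, using only four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def haar_edge_feature(ii, top, left, height, width):
    """Two-rectangle (edge) feature: sum(left half) - sum(right half).

    Assumption for this sketch: the left half plays the role of the dark
    region and the right half the light region. `width` must be even.
    """
    half = width // 2
    dark = rect_sum(ii, top, left, height, half)
    light = rect_sum(ii, top, left + half, height, half)
    return dark - light


# Toy 4x4 image: two dark columns (value 1) next to two bright columns (value 9).
image = [[1, 1, 9, 9] for _ in range(4)]
table = integral_image(image)
total = rect_sum(table, 0, 0, 4, 4)             # sum of the whole image
feature = haar_edge_feature(table, 0, 0, 4, 4)  # strongly negative on this edge
```

Because every rectangle sum costs four lookups, the cost of one feature does not depend on the size of the window it covers.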

2. Human (pedestrian) detection


HOG (Histogram of Oriented Gradients) is a feature descriptor used in computer vision
and image processing for object detection. The concepts behind HOG were first described in
1986, but the method was not widely adopted until 2005, when Navneet Dalal and Bill Triggs
presented their work on HOG descriptors for pedestrian detection. HOG is similar to other
feature descriptors such as edge orientation histograms, scale-invariant feature descriptors
(such as SIFT and SURF), and shape contexts. Unlike them, HOG is computed on a dense grid of
uniformly spaced cells and normalizes contrast across overlapping blocks to improve accuracy.
HOG is primarily used to describe the shape and appearance of an object in an image. The
typical steps in computing HOG are:
Normalize the Image Before Processing
Purpose: Ensure that the image has consistent lighting conditions and contrast across
different regions.
Details:
• Common normalization techniques include converting the image to grayscale and
performing histogram equalization.
• This step makes the HOG descriptor more robust to variations in lighting and
improves its performance.
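
As an illustration of the normalization step, here is a small sketch of grayscale conversion followed by histogram equalization, written in plain Python on nested lists. The function names are hypothetical, the luma weights are the common ITU-R BT.601 values, and the input is assumed to be an 8-bit image; a real pipeline would more likely call an image library.

```python
def to_grayscale(rgb):
    """Convert rows of (R, G, B) tuples to 8-bit gray values (BT.601 weights)."""
    return [
        [int(round(0.299 * r + 0.587 * g + 0.114 * b)) for (r, g, b) in row]
        for row in rgb
    ]


def equalize_histogram(gray):
    """Histogram-equalize an 8-bit grayscale image given as a list of rows."""
    hist = [0] * 256
    for row in gray:
        for value in row:
            hist[value] += 1

    # Cumulative distribution of the gray levels.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    n_pixels = running
    cdf_min = next(c for c in cdf if c > 0)
    if n_pixels == cdf_min:
        # Constant image: nothing to stretch, return a copy.
        return [row[:] for row in gray]

    # Map each gray level so that the output levels spread over 0..255.
    lut = [max(0, round((c - cdf_min) / (n_pixels - cdf_min) * 255)) for c in cdf]
    return [[lut[value] for value in row] for row in gray]


# Toy example: a low-contrast 2x2 color image.
rgb_image = [
    [(100, 100, 100), (100, 100, 100)],
    [(110, 110, 110), (120, 120, 120)],
]
gray_image = to_grayscale(rgb_image)
equalized = equalize_histogram(gray_image)
```

After equalization the darkest level in the toy image maps to 0 and the brightest to 255, which is the contrast stretch this step relies on.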

Compute Gradients in Both X and Y Directions
Purpose: Capture information about the changes in intensity along both the horizontal (x) and
vertical (y) directions.
Details:
• Calculate the gradient at each pixel by convolving the image with derivative filters
such as the Sobel operators.
• The Sobel operators approximate the derivatives in the x and y directions,
highlighting edges and intensity variations.
• From the two derivatives Gx and Gy, the gradient magnitude is sqrt(Gx^2 + Gy^2) and
the gradient orientation is atan2(Gy, Gx).
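
The gradient step can be sketched as below. This is a minimal plain-Python version under a few assumptions: the function names are hypothetical, the kernels are applied as a correlation (no kernel flip, a common convention in image libraries), border pixels are simply left at zero, and orientations are folded into the unsigned range [0, 180) degrees.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_gradients(gray):
    """Return (gx, gy) for a grayscale image; border pixels stay 0."""
    h, w = len(gray), len(gray[0])
    gx = [[0] * w for _ in range(h)]
    gy = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            sx = sy = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    pixel = gray[y + dy][x + dx]
                    sx += SOBEL_X[dy + 1][dx + 1] * pixel
                    sy += SOBEL_Y[dy + 1][dx + 1] * pixel
            gx[y][x], gy[y][x] = sx, sy
    return gx, gy


def magnitude_and_orientation(gx, gy):
    """Per-pixel gradient magnitude and unsigned orientation in degrees."""
    mag = [[math.hypot(a, b) for a, b in zip(rx, ry)] for rx, ry in zip(gx, gy)]
    ang = [
        [math.degrees(math.atan2(b, a)) % 180.0 for a, b in zip(rx, ry)]
        for rx, ry in zip(gx, gy)
    ]
    return mag, ang


# Toy image with a vertical edge between the second and third columns.
edge_image = [[0, 0, 10, 10] for _ in range(4)]
gx, gy = sobel_gradients(edge_image)
mag, ang = magnitude_and_orientation(gx, gy)
```

On the toy edge image the interior pixels have a purely horizontal gradient, so Gy is zero and the orientation is 0 degrees.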

Vote with Weighted Gradients in Cells
Purpose: Accumulate votes for gradient orientations within predefined cells to form
histograms.
Details:
• Divide the image into small cells (e.g., 8x8 pixels).
• For each pixel in a cell, determine its gradient magnitude and orientation.
• Build one histogram per cell, where each bin corresponds to a range of gradient
orientations (e.g., 9 bins covering 0 to 180 degrees). Each pixel votes for the bin
that contains its orientation.
• Weight each vote by the pixel's gradient magnitude, so that strong edges contribute
more than weak ones.
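
The voting step can be sketched as follows. This simplified example assumes 9 unsigned orientation bins over [0, 180) degrees and assigns each pixel's magnitude to a single bin. Dalal and Triggs additionally interpolate votes between neighboring bins, which is omitted here for brevity. The function names and default parameters are assumptions for illustration.

```python
def cell_histogram(mag, ang, n_bins=9):
    """Magnitude-weighted orientation histogram for one cell.

    `mag` and `ang` are equally sized lists of rows; angles are in degrees.
    """
    bin_width = 180.0 / n_bins
    hist = [0.0] * n_bins
    for mag_row, ang_row in zip(mag, ang):
        for m, a in zip(mag_row, ang_row):
            bin_index = int((a % 180.0) // bin_width) % n_bins
            hist[bin_index] += m  # the vote is weighted by the magnitude
    return hist


def cell_histograms(mag, ang, cell_size=8, n_bins=9):
    """Split the image into non-overlapping cells and histogram each one."""
    h, w = len(mag), len(mag[0])
    grid = []
    for cy in range(0, h - cell_size + 1, cell_size):
        row = []
        for cx in range(0, w - cell_size + 1, cell_size):
            cell_mag = [r[cx:cx + cell_size] for r in mag[cy:cy + cell_size]]
            cell_ang = [r[cx:cx + cell_size] for r in ang[cy:cy + cell_size]]
            row.append(cell_histogram(cell_mag, cell_ang, n_bins))
        grid.append(row)
    return grid


# Toy 2x2 cell: magnitudes and orientations (degrees) chosen by hand.
toy_mag = [[1.0, 2.0], [3.0, 4.0]]
toy_ang = [[0.0, 10.0], [95.0, 170.0]]
toy_hist = cell_histogram(toy_mag, toy_ang)
toy_grid = cell_histograms(toy_mag, toy_ang, cell_size=2)
```

With 20-degree bins, the orientations 0 and 10 fall into bin 0, 95 into bin 4, and 170 into bin 8, so the histogram entries are the summed magnitudes 3, 3, and 4.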

Normalize Blocks
Purpose: Enhance robustness by normalizing gradient information within blocks.
Details:
• Group the cells into overlapping blocks (e.g., blocks of 2x2 cells, with adjacent
blocks overlapping by 50%).
• Normalize the histograms within each block to handle local variations in
illumination and contrast.
• Normalization can be done by dividing the block's histogram values by their
L2-norm (with a small constant added to avoid division by zero).
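
A minimal sketch of L2 block normalization is shown below. It assumes the cell histograms are already available as a grid (rows by columns) of lists. The function name, the `eps` constant, and the default of 2x2-cell blocks moved one cell at a time (which gives the 50% overlap) are choices made for this example.

```python
import math


def normalize_blocks(cells, block_size=2, stride=1, eps=1e-6):
    """L2-normalize overlapping blocks of cell histograms.

    Returns one normalized vector per block, in row-major block order.
    """
    rows, cols = len(cells), len(cells[0])
    blocks = []
    for by in range(0, rows - block_size + 1, stride):
        for bx in range(0, cols - block_size + 1, stride):
            # Concatenate the histograms of all cells in this block.
            vector = []
            for cy in range(by, by + block_size):
                for cx in range(bx, bx + block_size):
                    vector.extend(cells[cy][cx])
            # eps keeps the division defined for all-zero blocks.
            norm = math.sqrt(sum(v * v for v in vector) + eps * eps)
            blocks.append([v / norm for v in vector])
    return blocks


# Toy grid of 2x3 cells, each with a 2-bin histogram.
toy_cells = [
    [[3.0, 4.0], [0.0, 0.0], [1.0, 0.0]],
    [[0.0, 0.0], [0.0, 0.0], [0.0, 1.0]],
]
toy_blocks = normalize_blocks(toy_cells)
```

Each cell that is not on the border contributes to several blocks, normalized differently each time, which is what makes the descriptor tolerant to local contrast changes.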

Collect All Oriented Gradient Histograms
Purpose: Aggregate histograms from all blocks to create the final feature vector.
Details:
• Concatenate the normalized histograms from all blocks to form a
comprehensive feature vector.
• The feature vector contains information about the distribution of gradient
orientations across the entire detection window.
• This vector is used as a feature descriptor for object detection and
recognition tasks, typically as the input to a classifier.
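
To make the size of the final feature vector concrete, the sketch below concatenates block vectors and computes the descriptor length for a given window. The helper names are hypothetical. The example parameters (64x128 window, 8x8 cells, 2x2-cell blocks with a one-cell stride, 9 bins) are the commonly used pedestrian-detection settings and are assumed here, not taken from the assignment.

```python
def concatenate_blocks(blocks):
    """Flatten a list of block vectors into one feature vector."""
    return [value for block in blocks for value in block]


def hog_descriptor_length(win_w, win_h, cell_size=8, block_size=2,
                          stride=1, n_bins=9):
    """Number of values in the HOG descriptor of one detection window.

    `block_size` and `stride` are measured in cells.
    """
    cells_x = win_w // cell_size
    cells_y = win_h // cell_size
    blocks_x = (cells_x - block_size) // stride + 1
    blocks_y = (cells_y - block_size) // stride + 1
    values_per_block = block_size * block_size * n_bins
    return blocks_x * blocks_y * values_per_block


# 64x128 window: 8x16 cells, 7x15 blocks, 36 values per block.
length = hog_descriptor_length(64, 128)
flat = concatenate_blocks([[0.6, 0.8], [1.0, 0.0]])
```

For the 64x128 window this gives 7 x 15 = 105 blocks of 36 values each, or 3780 values in total.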
