
UNIT 4

FEATURE EXTRACTION

In image analytics, feature extraction is the process of capturing important characteristics of an object or
region in an image. These features help in recognition, classification, and matching tasks.

Two main types of descriptors:
Boundary-based descriptors: Describe the shape using object contours.
Region-based descriptors: Use properties inside the object.

Feature descriptors must often be invariant to scale, rotation, translation, and noise.

Representation
Two primary ways to represent objects in an image:
Boundary Representation: Only the outer boundary is used. Good for shape analysis.
Region Representation: Entire object region is considered, including texture and structure.

Example:
A leaf outline would be a boundary representation; the full leaf with veins and texture is a region
representation.

Boundary Preprocessing
Before boundary-based descriptors (like Fourier Descriptors or Shape Numbers) can be applied, the
boundary of an object in an image must be cleaned, extracted, and standardized. This involves several
key preprocessing steps:

Noise Reduction
Real-world images often contain noise due to lighting, sensor errors, or environmental conditions.
Noise can cause false or broken edges during edge detection.
Apply smoothing filters like:
Gaussian Blur: Most common; applies a weighted average to nearby pixels.
This reduces sharp intensity changes caused by noise while preserving important structure.
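A minimal sketch of this step in Python with OpenCV (the file name and the 5×5 kernel size are illustrative choices, not fixed requirements):

import cv2

# Load the image in grayscale (file name is illustrative)
img = cv2.imread("coin.png", cv2.IMREAD_GRAYSCALE)

# 5x5 Gaussian kernel; sigma = 0 lets OpenCV derive it from the kernel size
smoothed = cv2.GaussianBlur(img, (5, 5), 0)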

Edge Detection
Edges are the most important indicators of object boundaries.
Edge detection highlights the locations where pixel intensity changes abruptly, typically indicating the
boundary of an object.

Contour Tracing (or Contour Extraction)


Once edges are identified, we need to trace continuous curves around objects to define their shapes.
Use algorithms like:
Moore-Neighbor Tracing
Suzuki's Algorithm (used in OpenCV's findContours() function)
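A short sketch of edge detection followed by contour extraction in OpenCV, continuing from the smoothed image above (the Canny thresholds 50/150 are illustrative):

import cv2

# Canny highlights pixels where intensity changes abruptly
edges = cv2.Canny(smoothed, 50, 150)

# findContours implements Suzuki's border-following algorithm;
# CHAIN_APPROX_NONE keeps every boundary point
contours, hierarchy = cv2.findContours(edges, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_NONE)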

Normalization
Objects in different images may appear at different scales, positions, or orientations.
To make feature descriptors comparable, we normalize the extracted boundary.
Steps:
Translation to Origin
Scaling to Unit Size
Rotation to Standard Angle
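A minimal NumPy sketch of the first two steps for a boundary stored as an N×2 array of (x, y) points (rotation normalization can reuse the PCA orientation described later in this unit):

import numpy as np

def normalize_boundary(points):
    pts = np.asarray(points, dtype=float)
    pts = pts - pts.mean(axis=0)               # translate centroid to origin
    scale = np.linalg.norm(pts, axis=1).max()  # farthest point from origin
    return pts / scale                         # scale to unit size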
Boundary Feature Descriptors
Boundary feature descriptors are numerical values or encoded patterns that capture the shape characteristics
of an object by analyzing only its outline (boundary) — not its interior (region).
They help in recognizing and comparing objects, especially when interior details are irrelevant or unavailable.

1. Basic Boundary Descriptors


These are geometric measurements derived directly from the boundary.
a. Perimeter (P)
Total length of the object's boundary.
Calculated by summing the distance between consecutive boundary points.
b. Compactness (Shape Factor)
Compactness = P² / (4πA)
P = perimeter, A = area enclosed.
The minimum value, 1, is achieved by a circle.
The more irregular the shape, the higher the value.
c. Eccentricity
Ratio of major axis to minor axis of the shape (like fitting an ellipse to the shape).
Indicates how elongated the shape is.
d. Convexity & Solidity
Convex Hull: The smallest convex shape enclosing the object.
Solidity = Area of shape / Area of convex hull
Helps differentiate between convex and concave shapes.
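These measurements map directly onto OpenCV contour functions; a hedged sketch, assuming contour is one element of the contours list returned by cv2.findContours above:

import cv2
import numpy as np

P = cv2.arcLength(contour, True)          # perimeter of a closed contour
A = cv2.contourArea(contour)              # enclosed area
compactness = P ** 2 / (4 * np.pi * A)    # 1 for a circle, larger otherwise

hull = cv2.convexHull(contour)
solidity = A / cv2.contourArea(hull)      # 1 for convex shapes

# elongation via a fitted ellipse (contour needs at least 5 points)
(cx, cy), axes, angle = cv2.fitEllipse(contour)
eccentricity = max(axes) / min(axes)      # major axis / minor axis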

Shape Numbers and Chain Codes


These are structural descriptors based on how the boundary progresses pixel by pixel.
Chain Codes
Represents the boundary as a sequence of directions.
For example, in 4-connectivity: Right = 0, Up = 1, Left = 2, Down = 3
In 8-connectivity:
Directions are numbered 0 through 7 for all eight compass directions.
Example:
If the boundary goes right → up → left → down, the chain code = [0, 1, 2, 3]

Differential Chain Codes (Shape Numbers)


Take the difference between consecutive directions (modulo 4 or 8).
This makes the representation invariant to rotation.
The first difference, circularly shifted to its smallest magnitude, is called the shape number.
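A small sketch of the first difference for the 4-connectivity example above:

chain = [0, 1, 2, 3]   # right, up, left, down

# counter-clockwise steps between consecutive directions,
# modulo 4 (use modulo 8 for 8-connectivity)
n = len(chain)
diff = [(chain[(i + 1) % n] - chain[i]) % 4 for i in range(n)]
# diff == [1, 1, 1, 1] for a square, whatever its starting rotation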

Applications:
Good for recognizing digits, letters, or geometric shapes.

Fourier Descriptors
These are frequency domain representations of shape boundaries.
Treat the boundary as a complex signal:
s(n) = x(n) + j·y(n)
Apply Discrete Fourier Transform (DFT) to get frequency components.
Benefits:
Low-frequency components capture the general shape.
High-frequency components capture fine details and noise.
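A NumPy sketch, assuming xs and ys hold the boundary coordinates as arrays; keeping only the lowest-frequency coefficients yields a compact, smoothed shape signature:

import numpy as np

s = xs + 1j * ys                # boundary as a complex signal s(n)
F = np.fft.fft(s)               # Fourier descriptors

K = 10                          # number of low-frequency pairs to keep
F_low = np.zeros_like(F)
F_low[:K], F_low[-K:] = F[:K], F[-K:]
approx = np.fft.ifft(F_low)     # smoothed boundary (real part = x, imag = y)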
Statistical Moments
These describe the distribution of the boundary points mathematically.
Moments describe the shape's spatial distribution.
Examples:
Mean: Center of mass of the boundary.
Variance: Spread of boundary pixels.
Skewness and Kurtosis: Measure asymmetry and peakedness.
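One common convention (an assumption here, not the only option) is to compute these moments over the boundary points' radial distances from the centroid; a SciPy sketch:

import numpy as np
from scipy.stats import skew, kurtosis

pts = np.asarray(boundary, dtype=float)    # N x 2 boundary points (assumed)
center = pts.mean(axis=0)                  # mean: center of mass
r = np.linalg.norm(pts - center, axis=1)   # radial distance signature

variance = r.var()                         # spread of boundary pixels
asymmetry = skew(r)                        # skewness
peakedness = kurtosis(r)                   # kurtosis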

Regional Feature Descriptors


Regional feature descriptors are numerical or statistical measures that describe the internal properties of a
shape or object by analyzing the pixels within the region enclosed by the boundary.
Regional descriptors use all pixels inside the object — making them useful for analyzing texture, structure,
symmetry, and intensity distributions.

Types of Regional Feature Descriptors

Basic Regional Descriptors


These are simple, geometric or statistical properties computed from the object’s interior.
a. Area
Total number of pixels inside the region.
Used for size comparison and object classification.
b. Centroid
The average x and y position of all the pixels in the object.
c. Orientation
Angle of the major axis of the object.
Helps understand object’s dominant direction (used in alignment tasks).
d. Bounding Box
The smallest rectangle that completely contains the region.
e. Extent, Solidity
Extent = Area / Bounding box area.
Solidity = Area / Convex hull area.
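Most of these come straight out of scikit-image's regionprops; a sketch assuming mask is a binary image of the object:

import numpy as np
from skimage.measure import label, regionprops

props = regionprops(label(mask))[0]   # properties of the first region

area = props.area                 # pixel count inside the region
centroid = props.centroid         # mean (row, col) of region pixels
orientation = props.orientation   # major-axis angle, in radians
bbox = props.bbox                 # (min_row, min_col, max_row, max_col)
extent = props.extent             # area / bounding-box area
solidity = props.solidity         # area / convex-hull area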

Topological Descriptors

These capture connectivity and structure inside the region.


a. Euler Number
Defined as:
E=C−H
where C = number of connected components, H = number of holes.
Useful for distinguishing between solid objects and those with internal gaps.
b. Connectivity
4-connected or 8-connected regions.
Determines how pixels are connected — affects object segmentation and labeling.
c. Number of Holes
Helpful in identifying objects like rings, loops, or letters like "O", "A", "B".
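regionprops also reports the Euler number; a small sketch on a synthetic ring (one connected component enclosing one hole, so E = 1 - 1 = 0):

import numpy as np
from skimage.measure import label, regionprops

ring = np.ones((5, 5), dtype=int)
ring[1:4, 1:4] = 0                 # hollow out the interior: one hole

props = regionprops(label(ring))[0]
print(props.euler_number)          # 0, i.e. E = C - H = 1 - 1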

Texture Descriptors
These quantify patterns or variations in intensity inside the region — especially useful in gray-scale or color
images.

a. Gray Level Co-occurrence Matrix (GLCM)


Measures how often pairs of pixel intensities (e.g., 50 and 100) occur together at a given spatial offset.
From GLCM, compute:
Contrast: Measures intensity variation.
Homogeneity: Measures smoothness.
Energy: Repetition of pixel pairs.
Correlation: Linear dependency of gray levels.
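A sketch with scikit-image (the functions are named greycomatrix/greycoprops in versions before 0.19):

import numpy as np
from skimage.feature import graycomatrix, graycoprops

# img: 8-bit grayscale array (assumed); pairs at an offset of 1 pixel right
glcm = graycomatrix(img, distances=[1], angles=[0], levels=256,
                    symmetric=True, normed=True)

contrast = graycoprops(glcm, "contrast")[0, 0]
homogeneity = graycoprops(glcm, "homogeneity")[0, 0]
energy = graycoprops(glcm, "energy")[0, 0]
correlation = graycoprops(glcm, "correlation")[0, 0]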

b. Local Binary Patterns (LBP)


Captures texture by thresholding neighborhood pixels against the center pixel.
Converts pixel neighborhoods into binary numbers — robust to illumination changes.
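A sketch with scikit-image; the histogram of LBP codes is the actual texture descriptor:

import numpy as np
from skimage.feature import local_binary_pattern

# img: grayscale array (assumed); 8 neighbors on a circle of radius 1
lbp = local_binary_pattern(img, P=8, R=1, method="uniform")

# "uniform" LBP with P=8 yields codes 0..9, so 10 histogram bins
hist, _ = np.histogram(lbp, bins=np.arange(11), density=True)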

c. Gabor Filters / Wavelet Features


Analyze texture at multiple scales and orientations.
Effective for medical imaging, face recognition, etc.

Moment Invariants (Hu Moments)


These are derived from the region’s statistical moments and are invariant to translation, scale, and rotation.
Hu’s Seven Invariant Moments:
Complex combinations of moments that do not change even if the object is rotated, resized, or shifted.
Application:
Shape recognition, digit recognition, object classification.
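OpenCV computes these directly; a sketch assuming mask is a binary object image (the log transform is a common trick for taming their wide dynamic range):

import cv2
import numpy as np

# mask: binary (0/1) uint8 image of the object (assumed)
m = cv2.moments(mask, binaryImage=True)   # raw, central, normalized moments
hu = cv2.HuMoments(m).flatten()           # the seven invariant moments

# log-scale with sign preserved, since the moments span many magnitudes
hu_log = -np.sign(hu) * np.log10(np.abs(hu))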

Principal Component Analysis (PCA)


Used to:
Find the direction of maximum variance in the region.
Reduce dimensionality of feature vectors.
Align the region for rotation invariance.

PCA Descriptors Include:

Eigenvalues and eigenvectors of the covariance matrix of the region.


Orientation angle: direction of the principal eigenvector (the one with the largest eigenvalue).
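A NumPy sketch of PCA over the region's pixel coordinates (mask again assumed binary); the principal eigenvector gives the orientation angle:

import numpy as np

ys, xs = np.nonzero(mask)                     # coordinates of region pixels
coords = np.column_stack([xs, ys]).astype(float)
coords -= coords.mean(axis=0)                 # center on the centroid

cov = np.cov(coords.T)                        # 2x2 covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)        # eigenvalues in ascending order

major = eigvecs[:, -1]                        # principal (major) eigenvector
orientation = np.arctan2(major[1], major[0])  # orientation angle, radians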

Whole Image Features


These are descriptors that represent the entire image or a large region as a single entity, rather than analyzing
individual parts like boundaries or segmented objects.

These features help in tasks such as:


Scene classification (e.g., forest vs. beach)
Image retrieval (finding similar images)
Texture-based classification
Content-based image indexing
They capture global properties like color, texture, structure, and overall layout.
Histograms:
Intensity or color histogram to represent pixel distribution.
Texture Features: Used when images are distinguished more by pattern than by shape.
GLCM, LBP over the full image.

Edge/Gradient-Based Features
Describe how intensity changes across the entire image.
Edge Histogram Descriptor (EHD)
Measures edge distribution in horizontal, vertical, and diagonal directions.
Color Moments (in color images):
Mean, Standard Deviation, and Skewness of R, G, B channels.

Statistical Features
Describe global intensity distributions and relationships.
a. Mean, Variance, Skewness
Basic statistical moments of pixel intensities.
b. Entropy
Measures randomness or texture complexity.
High entropy = more complex image.
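A sketch of these global statistics computed from the normalized intensity histogram (img assumed 8-bit grayscale):

import numpy as np

hist, _ = np.histogram(img, bins=256, range=(0, 256))
p = hist / hist.sum()               # probability of each gray level
p = p[p > 0]                        # drop empty bins to avoid log(0)

mean = img.mean()
variance = img.var()
entropy = -np.sum(p * np.log2(p))   # higher entropy = more complex image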

Scale-Invariant Feature Transform (SIFT)


It is a computer vision algorithm used to:
Detect distinctive keypoints in an image.
Extract descriptors from these keypoints.
Match features between different images.
In short, SIFT detects and describes local features in images.


Steps:
Scale-space extrema detection
Images can appear differently depending on scale (zoom in or out).
SIFT constructs a scale space by gradually blurring the image using a Gaussian filter and comparing differences of Gaussians (DoG).

Keypoint localization

Once potential keypoints are detected, SIFT refines them:

Eliminates low-contrast points.
Discards keypoints lying along edges, since edge responses are poorly localized and unstable.

Orientation Assignment
Each keypoint is assigned a dominant orientation based on the gradient directions around it.
This makes SIFT rotation invariant.

Keypoint descriptor creation

A 128-dimensional descriptor vector is created for each keypoint from the gradient orientations in its neighborhood (a 4×4 grid of 8-bin histograms: 4 × 4 × 8 = 128).

Application
Widely used in image stitching, object recognition, 3D reconstruction, and medical imaging.

Example: Matching Two Images


Image 1: Contains a logo.
Image 2: Same logo but resized, rotated, and partially hidden.
SIFT detects and matches corresponding points regardless of scale or orientation, enabling recognition of
the logo.
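A sketch of this matching with OpenCV (SIFT is exposed as cv2.SIFT_create in opencv-python 4.4+; file names are illustrative):

import cv2

img1 = cv2.imread("logo.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)  # keypoints + 128-D descriptors
kp2, des2 = sift.detectAndCompute(img2, None)

# brute-force matching with Lowe's ratio test to drop ambiguous matches
bf = cv2.BFMatcher()
matches = bf.knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]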

Key Properties of SIFT

Scale-invariant: Detects features at multiple zoom levels.
Rotation-invariant: Accounts for orientation change.
Robust: Works well with noise, blur, and partial occlusion.
Distinctive: Each feature is highly unique for matching.
Local: Based on patches, not the whole image.
