
SRI RAMAKRISHNA ENGINEERING COLLEGE

[Educational Service : SNR Sons Charitable Trust]


[Autonomous Institution, Reaccredited by NAAC with ‘A+’ Grade]
[Approved by AICTE and Permanently Affiliated to Anna University, Chennai]
[ISO 9001:2015 Certified and all Eligible Programmes Accredited by NBA]
VATTAMALAIPALAYAM, N.G.G.O. COLONY POST, COIMBATORE – 641 022.

Department of Artificial Intelligence and Data Science

20AD206 – COMPUTER VISION

Presentation by
Mrs. R. Rampriya
Assistant Professor (OG) / AI&DS



Module 4
OBJECT RECOGNITION AND IMAGE
UNDERSTANDING
Hough transforms and other simple object
recognition methods, Shape correspondence
and shape matching, Principal component
analysis and Shape priors for recognition.
Pattern recognition methods, HMM, GMM and
EM.


Object recognition and image understanding
Object recognition and image understanding are two closely related fields within computer vision, a subfield of artificial intelligence and computer science.
These fields focus on the development of algorithms and techniques to enable computers to understand and interpret visual information from images or videos.



Object recognition
Object recognition refers to the process of identifying and
classifying objects or specific patterns within an image or video
stream.
It involves determining what objects are present in an image
and, in some cases, their precise location and orientation.



Major Tasks - Object Recognition
Object Detection
Object Classification
Object Tracking
Pose Estimation



Contd…
1.Object Detection: Locating and identifying multiple objects in an image,
often by drawing bounding boxes around them and associating each with a
class label
(e.g., identifying cars, pedestrians, or traffic signs in a scene).
2.Object Classification: Assigning a label or category to a detected object
(e.g., labeling an object as a "cat" or "dog").
3.Object Tracking: Following the movement of objects across frames in a
video, maintaining their identities.
4.Pose Estimation: Determining the spatial orientation or pose of objects
within the scene.
Applications- Object recognition is used in various applications, including
autonomous vehicles, surveillance, robotics, augmented reality, and
healthcare.
Image Understanding
Image understanding is a broader concept that encompasses the interpretation of visual data from images or videos.
It goes beyond object recognition and involves extracting meaningful information, relationships, and context from visual data.


4 Key Tasks - Image Understanding
Scene Understanding
Semantic Segmentation
Image Captioning
Visual Question Answering (VQA)



Contd…
Scene Understanding: Analyzing an entire scene to understand the
relationships between objects, their context, and their significance (e.g.,
recognizing that a person is holding an umbrella because it's raining).
Semantic Segmentation: Assigning pixel-level labels to distinguish
different object classes or regions within an image.
Image Captioning: Generating natural language descriptions of the
content of an image or video.
Visual Question Answering (VQA): Answering questions related to the
content of an image or video.
Applications: Image understanding has applications in content-based
image retrieval, human-computer interaction, medical image analysis, and
more.



Hough Transform
The Hough Transform is a popular image processing technique used for detecting shapes or patterns within an image, particularly lines, circles, and ellipses.
It was initially developed by Paul Hough in the 1960s and has since been extended and adapted for various applications in computer vision and image analysis.
The Hough Transform is particularly useful for tasks like line detection in edge images or circle detection, and it remains a powerful technique for shape detection in general.



How the Hough Transform works:

Parameter Space Representation


Parameterization
Accumulator Space
Accumulation
Peak Detection
Shape Extraction



Contd…
1. Parameter Space Representation
The Hough Transform operates on an image where an edge detector or other feature extraction method has already identified points or pixels that lie on a particular shape, such as a straight line or a circle.
Each point or pixel in the image corresponds to a particular feature of the shape.
2. Parameterization
For each feature, the Hough Transform represents it in a parameterized form. For example, in line detection, each point is represented by a pair of parameters (ρ, θ), where ρ is the distance from the origin to the closest point on the line, and θ is the angle that the line's normal makes with the x-axis.



Contd…
3. Accumulator Space
The Hough Transform then creates an accumulator space, which is
essentially a two-dimensional array or grid where each cell corresponds to
a particular combination of parameter values (ρ, θ).
For each point in the image, it votes for all possible parameter
combinations that could explain the observed feature.
4. Accumulation
As each point votes for parameter combinations, the corresponding
cells in the accumulator space are incremented.
This process continues for all points in the image.



Contd…
5. Peak Detection:
After all votes have been counted, the accumulator space is examined
to find peaks. Peaks correspond to parameter values that received a high
number of votes.
These peaks represent the parameters of the shapes that were
detected in the original image.
6. Shape Extraction:
Finally, the detected parameters are converted back into their
geometric form (e.g., lines or circles), and the identified shapes are
overlaid on the original image.



Circle/Line Detection using Hough
Transform



Hough Transform Algorithm

1. The main idea is the following: the bundle of lines passing through
a point (xi, yi) in the image space is defined as yi = a * xi + b.

2. This bundle corresponds to exactly one line in the parameter space,
where b = -xi * a + yi.
Here (a, b) are the slope and intercept, which act as the coordinates of the parameter space, and (xi, yi) is a point in the image space.


Contd…
3. One line passing through 2 points in the image space, y = a' * x + b', can be represented as 2 lines that intersect at one point (a', b') in the parameter space.

4. If we have n points on the same line, we will have n lines intersecting in the parameter space at the same point.

The procedure will therefore be the following:
Transform the image into the parameter space.
Look for the points where a high number of lines intersect: they correspond to a large number of points aligned along the same line.



Pseudocode
1. Compute the edge detection to get the edge points, e.g. Canny edge detection.
2. The parameter space is quantized in cells, and a counter is associated with each cell. Theta is in the range [-pi/2, pi/2] and rho in the range [-D, D], where D is the length of the diagonal of the image.
3. For each edge point (x, y):
   a. Let theta assume all the values in the quantized range [-pi/2, pi/2] and compute rho = x * cos(theta) + y * sin(theta).
   b. For each cell crossed, increment by 1 the counter in position (theta, rho).
4. The counter of each cell contains the number of pixels collinear on that line. Pick all the cells that have a number of counts higher than some user-defined threshold.
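To make the voting procedure concrete, here is a minimal NumPy sketch of the accumulator, assuming a binary edge map (e.g. from cv2.Canny) as input; the bin sizes and the threshold in the usage comment are illustrative choices, not values from the slides.

import cv2
import numpy as np

def hough_lines_accumulator(edges, n_thetas=180):
    # edges: binary edge map, e.g. edges = cv2.Canny(gray, 50, 150)
    h, w = edges.shape
    diag = int(np.ceil(np.hypot(h, w)))            # D = length of the image diagonal
    thetas = np.linspace(-np.pi / 2, np.pi / 2, n_thetas)
    rhos = np.arange(-diag, diag + 1)              # rho quantized in 1-pixel bins over [-D, D]
    acc = np.zeros((len(rhos), n_thetas), dtype=np.int32)

    ys, xs = np.nonzero(edges)                     # coordinates of the edge points
    for x, y in zip(xs, ys):
        for t_idx, theta in enumerate(thetas):
            rho = x * np.cos(theta) + y * np.sin(theta)
            r_idx = int(round(rho)) + diag         # shift so the index is non-negative
            acc[r_idx, t_idx] += 1                 # one vote for the (rho, theta) cell
    return acc, thetas, rhos

# Peaks above a user-defined threshold correspond to detected lines:
# acc, thetas, rhos = hough_lines_accumulator(cv2.Canny(gray, 50, 150))
# peaks = np.argwhere(acc > 100)                   # threshold value is illustrative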
Line Detection Using Hough Transform

# Cleaned-up version of the slide's Colab example. The slide calls helper
# functions hough_lines() and draw_lines() that are not shown; this sketch uses
# OpenCV's built-in HoughLinesP instead, with illustrative parameter values.
import cv2
import numpy as np
from google.colab.patches import cv2_imshow   # Colab replacement for cv2.imshow

# Read the image (path as used on the slide)
img = cv2.imread('/content/drive/MyDrive/Colab Notebooks/circle_hough.jpg',
                 cv2.IMREAD_COLOR)
print("Original Image")
cv2_imshow(img)

# Convert to gray-scale and blur the image to reduce noise
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
img_blur = cv2.medianBlur(gray, 5)

# Edge detection followed by the probabilistic Hough transform
edges = cv2.Canny(img_blur, 50, 150)
lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=100,
                        minLineLength=50, maxLineGap=10)

# Draw the lines found on a copy of the original image
lines_img = img.copy()
if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        cv2.line(lines_img, (x1, y1), (x2, y2), (0, 0, 255), 2)

print("Line Detection using Hough Transform")
cv2_imshow(lines_img)
Contd… Plotting the results
OUTPUT: the original image with the detected lines overlaid.


Applications of the Hough
Transform
Line Detection: Detecting straight lines in images, often
used in tasks like road lane detection in autonomous
vehicles or barcode recognition.
Circle Detection: Identifying circular shapes in images,
useful in tasks like detecting coins or pupils in eye tracking
systems.
Generalized Hough Transform: Extending the technique
to detect arbitrary shapes, not just lines and circles.
Edge Linking: Applied after an edge detector, the Hough Transform can group edge pixels into linear features in images.



Other simple object recognition methods
1. Template Matching
2. Color Histograms
3. Edge Detection and Contour Analysis
4. Feature Matching with Descriptors
5. Color-Based Segmentation
6. Geometric Features
7. Principal Component Analysis (PCA)
8. Histogram of Oriented Gradients (HOG)



Template Matching
Template matching involves
comparing a small template
image with subregions of a
larger image to find regions that
closely match the template.
Use Case:
It is useful for detecting objects
when the object's appearance is
relatively consistent and the
template is well-defined. For
example, finding a specific logo
or symbol in an image.
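As a rough illustration of this idea, the sketch below uses OpenCV's matchTemplate; the file names and the 0.8 score threshold are placeholders, not values from the slides.

import cv2

scene = cv2.imread('scene.jpg', cv2.IMREAD_GRAYSCALE)      # placeholder file names
template = cv2.imread('logo.jpg', cv2.IMREAD_GRAYSCALE)
h, w = template.shape

# Slide the template over the scene and score every location
scores = cv2.matchTemplate(scene, template, cv2.TM_CCOEFF_NORMED)
_, max_val, _, max_loc = cv2.minMaxLoc(scores)              # best score and its position
if max_val > 0.8:                                           # illustrative acceptance threshold
    top_left = max_loc
    bottom_right = (top_left[0] + w, top_left[1] + h)
    print("Template found at", top_left, "with score", round(max_val, 3))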
Color Histograms
Color histograms capture the
distribution of colors in an
image. Objects can be
recognized by comparing the
color histograms of the target
object and the image regions.
Use Case
It is suitable for recognizing
objects based on their
distinctive color patterns, such
as traffic signs.
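A possible OpenCV sketch of histogram-based comparison is shown below; the file names, the hue/saturation binning, and the correlation metric are illustrative assumptions.

import cv2

obj = cv2.imread('sign_template.jpg')        # placeholder file names
region = cv2.imread('candidate_region.jpg')

def hue_sat_hist(bgr):
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    # 2-D histogram over hue (32 bins) and saturation (32 bins)
    hist = cv2.calcHist([hsv], [0, 1], None, [32, 32], [0, 180, 0, 256])
    return cv2.normalize(hist, hist).flatten()

# Correlation close to 1.0 means the colour distributions are similar
similarity = cv2.compareHist(hue_sat_hist(obj), hue_sat_hist(region), cv2.HISTCMP_CORREL)
print("Histogram correlation:", similarity)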
Edge Detection and Contour
Analysis
Edges and contours in an
image can be detected using
techniques like Canny edge
detection. The shapes and
structures formed by these
edges can be analyzed to
identify objects.
Use Case:
Useful for detecting simple
geometric shapes or irregular
objects with distinct contours.
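A brief sketch of this pipeline with OpenCV; the Canny thresholds and file name are illustrative, and the OpenCV 4.x return convention for findContours is assumed.

import cv2

img = cv2.imread('parts.jpg')                               # placeholder file name
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 50, 150)                            # illustrative thresholds

# OpenCV 4.x: findContours returns (contours, hierarchy)
contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for cnt in contours:
    # Approximate the contour by a polygon and inspect its vertex count
    approx = cv2.approxPolyDP(cnt, 0.02 * cv2.arcLength(cnt, True), True)
    print("vertices:", len(approx), "area:", cv2.contourArea(cnt))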
Feature Matching with
Descriptors
Feature descriptors like Scale-
Invariant Feature Transform
(SIFT) or Speeded-Up Robust
Features (SURF) can be used to
extract distinctive local features
from objects. These features are
matched between the object and
the image.
Use Case:
Effective for recognizing objects
with unique texture or pattern
features.
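The sketch below illustrates the same idea with ORB, a freely available alternative to SIFT/SURF in OpenCV; the file names and the distance cutoff are illustrative.

import cv2

obj = cv2.imread('object.jpg', cv2.IMREAD_GRAYSCALE)        # placeholder file names
scene = cv2.imread('scene.jpg', cv2.IMREAD_GRAYSCALE)

# Detect keypoints and compute binary descriptors for both images
orb = cv2.ORB_create(nfeatures=500)
kp1, des1 = orb.detectAndCompute(obj, None)
kp2, des2 = orb.detectAndCompute(scene, None)

# Brute-force Hamming matcher with cross-checking for more reliable matches
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(bf.match(des1, des2), key=lambda m: m.distance)
good = [m for m in matches if m.distance < 40]              # illustrative cutoff
print("Good matches:", len(good))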
Color-Based Segmentation
Objects can be segmented
based on color differences.
Thresholding or clustering
techniques can be used to
separate objects from the
background.
Use Case:
Suitable for identifying objects
with distinct colors, such as
fruits or objects on a colored
background.
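A minimal OpenCV sketch of colour-based segmentation, assuming an HSV threshold range chosen for the target colour (the values below are only an example).

import cv2
import numpy as np

img = cv2.imread('fruit.jpg')                               # placeholder file name
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# Keep pixels whose hue/saturation/value fall inside the chosen range
lower = np.array([0, 120, 70])                              # illustrative "red-ish" band
upper = np.array([10, 255, 255])
mask = cv2.inRange(hsv, lower, upper)
segmented = cv2.bitwise_and(img, img, mask=mask)            # object pixels only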
Geometric Features
Simple geometric features like
the aspect ratio, area, or
centroid of regions can be
computed and compared to
identify objects.
Use Case:
Applicable when objects have
distinct geometric properties,
such as distinguishing between
squares and circles.
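A small sketch of computing such geometric features with OpenCV; the file name and the threshold value are illustrative.

import cv2

gray = cv2.imread('shapes.jpg', cv2.IMREAD_GRAYSCALE)       # placeholder file name
_, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

for cnt in contours:
    area = cv2.contourArea(cnt)
    x, y, w, h = cv2.boundingRect(cnt)
    aspect_ratio = w / float(h)
    # Circularity: close to 1.0 for a circle, lower for elongated or angular shapes
    perimeter = cv2.arcLength(cnt, True)
    circularity = 4 * 3.14159 * area / (perimeter ** 2 + 1e-6)
    print("area:", area, "aspect ratio:", round(aspect_ratio, 2),
          "circularity:", round(circularity, 2))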



Principal Component Analysis
(PCA)
PCA can be used to reduce the
dimensionality of image data
and capture the most
significant features. Objects
can be recognized based on
the reduced feature space.
Use Case:
Useful when objects can be
distinguished by their principal
components.



Histogram of Oriented Gradients (HOG)
HOG descriptors capture
information about the
distribution of gradient
orientations in an image.
They are often used for
object detection.
Use Case:
Effective for detecting
objects with distinct edge
patterns, like pedestrians in
images.
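A short sketch using OpenCV's built-in HOG descriptor with its pre-trained pedestrian detector; the file name and detection parameters are illustrative.

import cv2

img = cv2.imread('street.jpg')                              # placeholder file name

# HOG descriptor bundled with a pre-trained pedestrian (people) detector
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

# Detection parameters below are illustrative
rects, weights = hog.detectMultiScale(img, winStride=(8, 8), padding=(8, 8), scale=1.05)
for (x, y, w, h) in rects:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)   # draw each detection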
Shape correspondence
Shape correspondence is a fundamental concept in computer
vision, image processing, and geometry.
It involves establishing a correspondence or mapping between
points, features, or regions in two or more shapes or objects.
The primary goal of shape correspondence is to identify how
different parts of one shape correspond to equivalent parts of
another shape.
This is especially challenging when shapes are subjected to transformations like scaling, rotation, translation, or deformation.
Shape correspondence aims to find a relationship or mapping between elements (such as vertices, landmarks, or features) of two or more shapes.
Contd…
Shape correspondence is a crucial concept in various
fields where understanding and matching shapes are
essential.
It allows computers and algorithms to recognize,
compare, and work with shapes in images and 3D
spaces, enabling applications ranging from object
recognition to medical imaging and animation.



Shape Correspondence



Techniques
Several techniques are used for shape correspondence,
including:
Geometric Transformations: Applying transformations like
scaling, rotation, and translation to align shapes.
Feature Matching: Identifying and matching distinctive
features or key points in shapes.
Graph Matching: Representing shapes as graphs and
finding correspondences between nodes and edges.



Contd…
Distance Metrics: Using distance measures (e.g.,
Euclidean distance, Hausdorff distance) to quantify the
dissimilarity between shapes.
Optimization Methods: Formulating shape
correspondence as an optimization problem to find the
best matching correspondence.



Applications

Object Recognition: In computer vision, shape correspondence is


crucial for recognizing and matching objects in images or scenes.
Medical Imaging: In medical image analysis, it is used to establish
correspondences between anatomical structures in different images,
which is valuable for diagnosis and treatment planning.
Computer Graphics: Shape correspondence plays a vital role in
character animation, where it helps map a character's skeleton to a 3D
model or mesh.
Robotics: Robots often use shape correspondence to navigate and
interact with objects in their environment.



Challenges
Shape correspondence can be challenging due to
variations in shape, pose, scale, and deformation.
Matching shapes that have undergone these
transformations requires robust algorithms that can
handle such variations.



Shape Matching
Shape matching refers to the process of comparing two or more shapes to determine their similarity or dissimilarity.
The goal is to quantify the degree of similarity between shapes, which can be useful in various applications.



Shape Matching



Techniques
1. Point Correspondence: Finding corresponding
points or landmarks in shapes.
2. Geometric Transformations: Applying
transformations like translation, rotation, scaling,
and deformation to align shapes.
3. Feature Extraction: Extracting meaningful shape
descriptors, such as contour features or key points,
to facilitate matching.



Contd…
4. Distance Metrics: Using distance measures (e.g., Euclidean distance, Hausdorff distance) to quantify the dissimilarity between shapes (a sketch follows this list).
5.Graph Matching: Representing shapes as graphs and
matching nodes and edges to establish
correspondence.
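The distance-metric technique referenced above (item 4) can be sketched as follows, using OpenCV's Hu-moment comparison and SciPy's Hausdorff distance; the contours are assumed to come from cv2.findContours.

import cv2
import numpy as np
from scipy.spatial.distance import directed_hausdorff

def shape_dissimilarity(contour_a, contour_b):
    # Hu-moment comparison: 0 means identical shapes, larger means more dissimilar
    hu_score = cv2.matchShapes(contour_a, contour_b, cv2.CONTOURS_MATCH_I1, 0.0)

    # Symmetric Hausdorff distance between the two contour point sets
    pts_a = contour_a.reshape(-1, 2).astype(float)
    pts_b = contour_b.reshape(-1, 2).astype(float)
    hausdorff = max(directed_hausdorff(pts_a, pts_b)[0],
                    directed_hausdorff(pts_b, pts_a)[0])
    return hu_score, hausdorff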



Applications
1.Object recognition: In computer vision, shape matching
can be used to identify objects in an image by comparing
their shapes to a database of known shapes.
2.Image retrieval: Shape matching is used to find similar
images in a database based on the shapes present in the
query image.
3. Signature verification: Handwriting recognition systems
use shape matching to verify signatures by comparing the
shape of a signature to a stored template.



Principal Component Analysis
(PCA)
Principal Components Analysis is an unsupervised learning technique: a class of statistical methods used to explain data in high dimension using a smaller number of variables called the principal components.

We compute the principal components and use them to explain the data.

Principal Component Analysis (PCA) is a dimensionality reduction and data analysis technique widely used in various fields, including statistics, machine learning, and data science.
Contd…
PCA aims to simplify complex data by transforming it
into a new coordinate system, where the data's
variability is maximized along a set of orthogonal axes
called principal components.
It is particularly useful for reducing the dimensionality
of data while retaining the most significant information.



How PCA works (4 steps)
Data Transformation:
1. PCA starts with a dataset containing high-dimensional data
points. Each data point is represented by multiple features
or attributes.
2. The first step is to center the data by subtracting the mean
of each feature from the data points. This centers the data
around the origin.
3. PCA then calculates the covariance matrix, which
measures the relationships between different features in
the dataset.



Contd…
Eigenvalue Decomposition:
1. The next step is to perform eigenvalue
decomposition on the covariance matrix. This
decomposition yields a set of eigenvalues and
corresponding eigenvectors.
2. Eigenvectors are unit vectors that point in the
direction of maximum variance (the principal
components), and eigenvalues represent the amount
of variance explained by each eigenvector.



Contd…
Selecting Principal Components:
1. The principal components are ranked in descending
order of their associated eigenvalues. The first
principal component (PC1) explains the most
variance in the data, the second principal component
(PC2) explains the second most variance, and so on.
2. Typically, you can choose a subset of the principal
components that capture a high percentage (e.g.,
95%) of the total variance while reducing
dimensionality.



Contd…
Data Reconstruction:
1. You can project the original data onto the selected
principal components to obtain a lower-dimensional
representation of the data. This is achieved by taking a
dot product between the data and the chosen principal
components.
2. The resulting lower-dimensional data retains most of
the original data's variability while reducing the
number of features.
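The four steps above can be condensed into a small NumPy sketch; the 95% variance cut-off is an illustrative choice.

import numpy as np

def pca(X, variance_to_keep=0.95):
    # 1. Center the data: subtract the mean of each feature
    X_centered = X - X.mean(axis=0)

    # 2. Covariance matrix and its eigen-decomposition (eigh: the matrix is symmetric)
    cov = np.cov(X_centered, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(cov)

    # 3. Rank components by explained variance and keep enough to reach the cut-off
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]
    explained = np.cumsum(eigenvalues) / eigenvalues.sum()
    k = int(np.searchsorted(explained, variance_to_keep)) + 1

    # 4. Project the centered data onto the first k principal components
    return X_centered @ eigenvectors[:, :k], eigenvectors[:, :k]

# Usage: X_reduced, components = pca(np.random.rand(100, 10))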



Applications of PCA:
1.Dimensionality Reduction: PCA is commonly used
to reduce the number of features in high-dimensional
datasets. It helps simplify data while preserving
essential information, making it easier to visualize,
analyze, and model.
2.Data Visualization: PCA is a powerful tool for
visualizing high-dimensional data in two or three
dimensions. It can be used to generate scatter plots or
3D plots that highlight data patterns and clusters.



Contd…
3.Noise Reduction: By focusing on the principal
components that explain the most variance, PCA can help
remove noise or irrelevant information from data.
4.Feature Engineering: PCA can be used as a feature
engineering technique in machine learning to create new
features that capture the most relevant information in the
data, potentially improving model performance.
5.Compression: PCA can be used for data compression, such
as image compression, where it reduces the size of the data
while retaining its essential features.



Shape priors for Recognition
Shape priors are a crucial concept in computer vision,
particularly in the context of object recognition.
They provide valuable information about the expected shape or structure of objects in an image or scene.
By incorporating shape priors into recognition algorithms, you can improve the accuracy and robustness of object recognition systems.



Contd…
Definition of Shape Priors:
◦ Shape priors represent prior knowledge or assumptions about the
shapes of objects of interest. These priors are often based on statistical
models or templates of shapes learned from a training dataset or
constructed based on domain-specific knowledge.
Role in Object Recognition:
◦ Shape priors serve as a form of regularization or constraint in object
recognition tasks. They guide the recognition process by providing
expectations about the shapes that objects are likely to take.
◦ By incorporating shape priors, object recognition systems can be more
robust to variations in object appearance due to changes in viewpoint,
lighting, occlusion, or other factors.



Shape priors for recognition



Types of Shape Priors
Statistical Shape Models
Template Matching
Geometric Constraints
Graphical Models



Contd…
Statistical Shape Models: These models represent shapes as statistical
distributions. For example, Active Shape Models (ASMs) and Active
Appearance Models (AAMs) use statistical shape priors to guide the fitting
of shapes to image data.
Template Matching: Shape priors can take the form of templates or
exemplar shapes. During recognition, objects are matched to these
templates to find the closest match.
Geometric Constraints: Shape priors can also be expressed as geometric
constraints, specifying relationships between different parts of an object or
the expected proportions of object components.
Graphical Models: Bayesian networks and graphical models can capture
dependencies between object parts and incorporate shape priors into
probabilistic recognition frameworks.
Benefits of Shape Priors
Improved Robustness: Shape priors help object recognition
systems handle variations in object appearance, such as
changes in pose, size, and illumination.
Reduced Ambiguity: Priors can resolve ambiguity in object
recognition by favoring shapes that are more likely based on
prior knowledge.
Enhanced Localization: Shape priors can aid in precise
localization of object boundaries or keypoints within an image.



Challenges
Constructing Accurate Priors: Building reliable
shape priors often requires extensive training data and
careful modeling.
Handling Variability: Objects can exhibit significant
shape variations, and shape priors must be flexible
enough to accommodate this variability.



Applications
Object Detection and Localization: Shape priors are
commonly used in object detection tasks to improve
localization accuracy.
Medical Imaging: In medical image analysis, shape
priors are used to segment and recognize anatomical
structures in images.
Robotics: Shape priors guide robots in recognizing and
interacting with objects in their environment.



Pattern recognition methods
Pattern recognition methods are a set of techniques and approaches used to identify and classify patterns within data.
These methods find applications in various fields, including computer science, machine learning, artificial intelligence, image processing, and data analysis.
Pattern recognition methods (11 methods):
1. Statistical Pattern Recognition: This approach involves using
statistical methods to analyze data patterns. Common techniques
include clustering, principal component analysis (PCA) and
discriminant analysis.
2. Machine Learning: Machine learning algorithms, such as support
vector machines, decision trees, neural networks, and k-nearest
neighbors, can be used for pattern recognition. These algorithms learn
from data and can make predictions or classifications based on patterns
they discover.
Pattern recognition method
contd…
3. Deep Learning: Deep learning, a subset of machine learning, employs
artificial neural networks with multiple layers (deep neural networks).
Convolutional Neural Networks (CNNs) are widely used for image
recognition, while Recurrent Neural Networks (RNNs) are used for
sequential data.
4. Image Processing: In computer vision and image processing,
techniques like edge detection, feature extraction and template matching
are used for recognizing patterns in images.
5. Natural Language Processing (NLP): NLP techniques are used for
recognizing patterns in text data, including sentiment analysis, named
entity recognition and topic modeling.
6. Speech Recognition: This involves recognizing patterns in audio data
to convert spoken language into text. Hidden Markov Models (HMMs) and
deep learning methods are commonly used.
Pattern recognition method
contd…
7. Biometric Pattern Recognition: Biometric systems use unique
physical or behavioral characteristics, such as fingerprints, facial features
and voice, to identify individuals.
8. Time Series Analysis: These methods are used for recognizing patterns
in time-ordered data, such as stock prices, weather data, or physiological
signals. Autoregressive models, moving averages, and Fourier analysis are
commonly applied.
9. Pattern Matching Algorithms: These algorithms search for specific
patterns within data, such as regular expressions, string matching and
sequence alignment.
11. Feature Extraction: Identifying and selecting relevant features from
the data is a crucial step in pattern recognition. Dimensionality reduction
techniques like Principal Component Analysis (PCA) and t-Distributed
Stochastic Neighbor Embedding (t-SNE) are used for this purpose.
Techniques for pattern recognition
HMM, GMM and EM are common techniques and models used in the
fields of machine learning, statistics, and pattern recognition.
Hidden Markov Model (HMM)
A Hidden Markov Model is a statistical model that represents a system with unobservable (hidden) states and observable outputs, and it is used for modeling sequences of data.
It is especially suitable for problems involving temporal data and has applications in speech recognition, natural language processing, bioinformatics, financial modeling, image and video analysis, gesture recognition, and time-series analysis.


Contd…
Key Components
1. Hidden States (Q): Unobservable states of the system.
2. Observations (O): Observable outputs of the system.
3. Transition Probabilities (A): Probabilities of transitioning
between hidden states.
4. Emission Probabilities (B): Probabilities of observing outputs
given hidden states.
5. Initial State Distribution (π): Probability distribution of the
initial hidden state.



Contd…
HMM Parameters
1. N: Number of hidden states.
2. M: Number of observable outputs.
3. A (N x N): Transition probability matrix.
4. B (N x M): Emission probability matrix.
5. π (1 x N): Initial state distribution.

HMM Algorithms
1. Forward Algorithm: Computes the probability of observing a sequence.
2. Viterbi Algorithm: Finds the most likely hidden state sequence.
3. Baum-Welch Algorithm: Estimates HMM parameters from data.



Contd..
Advantages
1. Handles sequential data
2. Models uncertainty
3. Flexible and adaptable
4. Efficient computation

Disadvantages
1. Assumes the Markov property
2. Requires careful parameter tuning
3. Sensitive to initial conditions

Tools and Libraries for HMM
1. Python: hmmlearn, PyHMM
2. R: HMM, depmixS4
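A brief sketch with the hmmlearn library mentioned above; the toy data, the Gaussian-emission model, and the two-state choice are illustrative assumptions.

import numpy as np
from hmmlearn import hmm

# Toy 1-D observation sequence drawn from two regimes
X = np.concatenate([np.random.normal(0, 1, 50),
                    np.random.normal(5, 1, 50)]).reshape(-1, 1)

# Gaussian-emission HMM with N = 2 hidden states; Baum-Welch runs inside fit()
model = hmm.GaussianHMM(n_components=2, covariance_type="diag", n_iter=100)
model.fit(X)

hidden_states = model.predict(X)   # Viterbi: most likely hidden state sequence
log_prob = model.score(X)          # Forward algorithm: log-likelihood of the sequence
print(model.transmat_)             # learned transition probability matrix A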
Gaussian Mixture Model (GMM)
A Gaussian Mixture Model is a probabilistic model used for
representing the distribution of data as a mixture of multiple Gaussian
distributions.
Each Gaussian component represents a different mode in the data.
Components
1. K Gaussian distributions (clusters)
2. Mixing coefficients (weights)
3. Mean vectors (centroids)
4. Covariance matrices



How GMM Works
1. Data is divided into K clusters
2. Each cluster is modeled by a Gaussian distribution
3. Parameters are estimated using Expectation-Maximization
(EM) algorithm.
Advantages
1. Flexible modeling of complex data distributions
2. Handles overlapping clusters
3. Provides soft clustering (probability of belonging to each
cluster)



Contd…
Applications
1. Clustering and segmentation
2. Density estimation
3. Anomaly detection
4. Image and speech recognition
5. Time-series analysis

GMM techniques
1. K-means initialization
2. EM algorithm for parameter estimation
3. Bayesian Information Criterion (BIC) for model selection
4. Cross-validation for hyperparameter tuning

GMM libraries
1. scikit-learn (Python)
2. TensorFlow (Python)
3. PyTorch (Python)
4. MATLAB
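A short scikit-learn sketch of fitting a GMM; the synthetic data and the choice of K = 3 components are illustrative.

import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic 2-D data drawn from three clusters
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc, 0.5, size=(100, 2))
               for loc in ([0, 0], [3, 3], [0, 4])])

gmm = GaussianMixture(n_components=3, covariance_type="full", random_state=0)
gmm.fit(X)                               # parameters estimated with the EM algorithm

labels = gmm.predict(X)                  # hard cluster assignments
responsibilities = gmm.predict_proba(X)  # soft clustering: P(component k | x)
print(gmm.means_)                        # estimated mean vectors (centroids)
print(gmm.bic(X))                        # BIC, useful for selecting the number of components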
Expectation-Maximization
(EM)
Expectation-Maximization (EM) is an iterative algorithm for finding
maximum likelihood estimates of parameters in probabilistic models,
particularly when data is incomplete or has missing values.
How EM works
1. E-Step (Expectation): Calculate the expected value of the log-
likelihood function, given the current estimate of the parameters and the
observed data.
2. M-Step (Maximization): Update the parameters to maximize the
expected log-likelihood function obtained in the E-Step.
Key Features
1. Handles missing or incomplete data.
2. Can be used for parameter estimation, clustering, and density estimation.
3. Converges to a local maximum likelihood estimate.
EM Algorithm Steps (4 steps)
1. Initialize parameters:
   - π (mixing coefficients)
   - μ (mean vectors)
   - Σ (covariance matrices)
2. E-Step: Calculate the responsibilities and the expected log-likelihood.
   Responsibility matrix (γ):
   - γ(i, k) = P(k | x(i), θ), the probability that x(i) belongs to the k-th component
   - γ(i, k) = (π(k) * N(x(i) | μ(k), Σ(k))) / (∑_j π(j) * N(x(i) | μ(j), Σ(j)))
   Expected complete-data log-likelihood:
   - Q = ∑_i ∑_k γ(i, k) * (log π(k) + log N(x(i) | μ(k), Σ(k)))
3. M-Step: Update parameters.
   i. Mixing coefficients: π(k) = (∑_i γ(i, k)) / N
   ii. Mean vectors: μ(k) = (∑_i γ(i, k) * x(i)) / (∑_i γ(i, k))
   iii. Covariance matrices: Σ(k) = (∑_i γ(i, k) * (x(i) - μ(k)) * (x(i) - μ(k))^T) / (∑_i γ(i, k))
4. Repeat steps 2-3 until convergence.
   - Check for convergence of the parameters (π, μ, Σ).
   - Repeat the EM steps until convergence or a maximum number of iterations is reached.
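The steps above can be written as a compact NumPy/SciPy sketch for a Gaussian mixture; K, the initialization, and the fixed iteration count are illustrative choices.

import numpy as np
from scipy.stats import multivariate_normal

def em_gmm(X, K=2, n_iter=50):
    n, d = X.shape
    # 1. Initialize: equal weights, random data points as means, identity covariances
    pi = np.full(K, 1.0 / K)
    mu = X[np.random.choice(n, K, replace=False)]
    sigma = np.array([np.eye(d) for _ in range(K)])

    for _ in range(n_iter):
        # 2. E-step: gamma(i, k) = pi_k * N(x_i | mu_k, Sigma_k) / sum_j pi_j * N(x_i | mu_j, Sigma_j)
        dens = np.column_stack([pi[k] * multivariate_normal.pdf(X, mu[k], sigma[k])
                                for k in range(K)])
        gamma = dens / dens.sum(axis=1, keepdims=True)

        # 3. M-step: re-estimate pi, mu, Sigma from the responsibilities
        Nk = gamma.sum(axis=0)
        pi = Nk / n
        mu = (gamma.T @ X) / Nk[:, None]
        for k in range(K):
            diff = X - mu[k]
            sigma[k] = (gamma[:, k, None] * diff).T @ diff / Nk[k]
    return pi, mu, sigma, gamma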


Contd…
Advantages
1. Robust to missing data
2. Flexible modelling
3. Efficient computation

Disadvantages
1. Convergence to local optima
2. Sensitive to initialization
3. Computationally expensive for large datasets


Contd…
Applications
1. Gaussian Mixture Models (GMM)
2. Hidden Markov Models (HMM)
3. Mixture of Experts (MoE)
4. Image segmentation
5. Clustering
6. Density estimation

EM libraries
1. scikit-learn (Python)
2. TensorFlow (Python)
3. PyTorch (Python)
4. MATLAB
5. R