Unit 1 CV
Image processing is the process of transforming an image into a digital form and
performing certain operations to get some useful information from it. The image
processing system usually treats all images as 2D signals when applying certain
predetermined signal processing methods.
➔ What Is an Image?
Before we jump into image processing, we need to first understand what exactly
constitutes an image. An image is represented by its dimensions (height and width)
based on the number of pixels. For example, if the dimensions of an image are 500 x
400 (width x height), the total number of pixels in the image is 200000.
This pixel is a point on the image that takes on a specific shade, opacity or color. It is
usually represented in one of the following:
● Grayscale - A pixel is an integer with a value between 0 to 255 (0 is completely
black and 255 is completely white).
● RGB - A pixel is made up of 3 integers between 0 to 255 (the integers
represent the intensity of red, green, and blue).
● RGBA - It is an extension of RGB with an added alpha field, which represents
the opacity of the image.
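As a minimal illustration (assuming NumPy is available; the array sizes and values below are arbitrary), the same representations can be expressed with array shapes: a grayscale image is a 2D array of 0-255 values, while RGB and RGBA images add a channel dimension.

import numpy as np

# A hypothetical 400 x 500 (height x width) grayscale image: one 0-255 value per pixel.
gray = np.zeros((400, 500), dtype=np.uint8)
gray[0, 0] = 255                  # top-left pixel set to completely white

# The same image in RGB: three 0-255 values (red, green, blue) per pixel.
rgb = np.zeros((400, 500, 3), dtype=np.uint8)
rgb[0, 0] = (255, 0, 0)           # top-left pixel set to pure red

# RGBA adds an alpha (opacity) value as a fourth channel.
rgba = np.zeros((400, 500, 4), dtype=np.uint8)
rgba[0, 0] = (255, 0, 0, 128)     # semi-transparent red

print(gray.size)                  # 200000 pixels, matching 500 x 400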
Image processing requires fixed sequences of operations that are performed at each
pixel of an image. The image processor performs the first sequence of operations on
the image, pixel by pixel. Once this is fully done, it will begin to perform the second
operation, and so on. The output value of these operations can be computed at any pixel
of the image.
❖ Components of an Image Processing System
➔ Computer
It comprises the digitizer and the hardware that carry out basic operations, including an Arithmetic Logic Unit (ALU) that can perform arithmetic and logical operations on whole images simultaneously.
➔ Mass Storage
In image processing applications, storage capability is essential. The three main types of digital storage used are: (1) short-term storage for use during processing, (2) online storage for relatively fast recall, and (3) archival storage, which is characterized by infrequent access.
➔ Camera Sensors
It refers to sensing. The image sensor's primary function is to collect incoming light,
transform it into an electrical signal, measure that signal, and then output it to
supporting electronics. It consists of a two-dimensional array of light-sensitive
components that convert photons into electrons. Images are captured by equipment
like digital cameras using image sensors like CCD and CMOS. Two components are
often needed on image sensors to collect digital pictures. The first is an actual tool
(sensor) that can detect the energy emitted by the object we want to turn into an image.
The second is a digitizer, which transforms a physical sensing device's output into
digital form.
➔ Software
The image processing software comprises specialized modules that carry out particular
functions.
➔ Hardcopy Equipment
Laser printers, film cameras, heat-sensitive equipment, inkjet printers, and digital
equipment like optical and CD-ROM discs are just a few examples of the instruments
used to record pictures.
➔ Networking
Networking links the components of an image processing system and supports the transmission of images to remote sites; the available bandwidth is the key consideration.
❖ Fundamental Steps in Image Processing
1. Image Acquisition
Image acquisition is the first step in image processing. This step is also
known as preprocessing in image processing. It involves retrieving the
image from a source, usually a hardware-based source.
2. Image Enhancement
3. Image Restoration
6. Compression
7. Morphological Processing
8. Segmentation
Segmentation is one of the most difficult steps of image processing. It
involves partitioning an image into its constituent parts or objects.
10. Recognition
❖ Classical Filtering
➔ Introduction
In classical (pixel-wise) filtering, the same operation is applied in a similar way for all pixels. For example, multiplying the intensity of each pixel by a constant scales the whole image uniformly. Note that such element-wise operations differ from matrix operations: squaring each pixel is not the same as squaring the image matrix.
➔ Linear Filtering
In MATLAB, linear filtering of images is implemented through two-dimensional correlation or convolution, the only difference being that the convolution kernel is flipped (rotated 180 degrees) before it is applied. Two matrices are involved: one of these matrices represents the image itself, while the other is the filter kernel. The filtering function takes an image and a filter and returns an output image.
Some neighborhood operations are nonlinear; for example, an output pixel can be set equal to the standard deviation of the values of the pixels in the input pixel's neighborhood.
You can use the nlfilter function to implement a variety of sliding neighborhood operations. nlfilter takes an image, a neighborhood size, and a function that returns a scalar, and it returns an image of the same size as the input image. The value of each pixel in the output image is computed by applying the function to the corresponding input pixel's neighborhood. For example, nlfilter can compute each output pixel by taking the standard deviation of the values of the input pixel's 3-by-3 neighborhood (that is, the pixel itself and its eight contiguous neighbors).
You can write an M-file to implement a specific function and then use this function with nlfilter. You can also use an inline function; in this case, the function name appears in the nlfilter call without quotation marks.
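Since the original MATLAB listings were lost from these notes, here is a rough Python analogue (an assumption: it uses SciPy's generic_filter rather than MATLAB's nlfilter, and boundary handling and normalization details may differ) of the sliding 3-by-3 standard-deviation operation described above.

import numpy as np
from scipy.ndimage import generic_filter

# Small synthetic grayscale image (values are arbitrary test data).
img = np.random.randint(0, 256, size=(8, 8)).astype(np.float64)

# For every pixel, take the standard deviation of its 3x3 neighborhood
# (the pixel itself plus its eight contiguous neighbors).
out = generic_filter(img, np.std, size=3, mode='nearest')

print(out.shape)   # same size as the input image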
❖ Thresholding
Thresholding converts a grayscale image into a binary image by comparing each pixel's intensity against a threshold value T.
1. Simple Thresholding
● If the pixel intensity I(x,y) is less than or equal to the threshold T, the output pixel is set to 0 (black).
● If the pixel intensity I(x,y) is greater than the threshold T, the output pixel is set to the maximum value (e.g., 255).
Key Points:
● Result: Pixels above the threshold are set to white (255), and pixels below are set to black (0).
● Computationally efficient.
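A minimal OpenCV sketch of simple (binary) thresholding, assuming a grayscale input image; the threshold of 127 and the file name are placeholder choices.

import cv2

gray = cv2.imread('input.jpg', cv2.IMREAD_GRAYSCALE)      # placeholder path

# Pixels with intensity > 127 become 255 (white); the rest become 0 (black).
_, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)

cv2.imwrite('binary.png', binary)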
2. Adaptive Thresholding
Adaptive thresholding computes a separate threshold for small regions of the image, which allows for better handling of varying lighting conditions.
Pros of Adaptive Thresholding
● Handles non-uniform illumination across the image without requiring manually tuned global parameters.
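A brief OpenCV sketch of adaptive thresholding, assuming a grayscale input; the 11-pixel block size and the constant 2 are illustrative values, not prescribed ones.

import cv2

gray = cv2.imread('input.jpg', cv2.IMREAD_GRAYSCALE)      # placeholder path

# The threshold for each pixel is the mean of its 11x11 neighborhood minus 2.
adaptive = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                 cv2.THRESH_BINARY, 11, 2)

cv2.imwrite('adaptive.png', adaptive)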
3. Otsu's Thresholding
Otsu's method is an automatic thresholding technique that calculates the optimal threshold by minimizing the intra-class variance (equivalently, maximizing the between-class variance) of the resulting foreground and background pixels, rather than relying on manually chosen threshold values. It works best on images whose intensity histogram is bimodal.
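In OpenCV, Otsu's method can be requested through the same cv2.threshold call by adding the THRESH_OTSU flag; a short sketch (the file name is a placeholder):

import cv2

gray = cv2.imread('input.jpg', cv2.IMREAD_GRAYSCALE)      # placeholder path

# The 0 passed as the threshold is ignored; Otsu's method computes it automatically.
t, otsu = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
print('Otsu threshold:', t)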
4. Multi-level Thresholding
Multi-level thresholding uses several threshold values to segment the image into more than two regions. This is useful for images that contain multiple distinct intensity levels.
5. Color Thresholding
In color images, thresholding can be applied to each color channel (e.g., RGB or HSV) separately or in combination in order to segment colored objects.
Approaches of Color Thresholding
● Selecting the threshold range for each channel manually.
● Applying an automatic technique (such as Otsu's method) to each channel.
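A hedged sketch of the manual, per-channel approach using HSV bounds with OpenCV; the bounds below are arbitrary and would need tuning for a real object, and the file name is a placeholder.

import cv2
import numpy as np

img = cv2.imread('input.jpg')                    # placeholder path, BGR image
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# Keep pixels whose hue/saturation/value fall inside the chosen ranges.
lower = np.array([35, 50, 50])                   # illustrative lower bound
upper = np.array([85, 255, 255])                 # illustrative upper bound
mask = cv2.inRange(hsv, lower, upper)            # 255 inside the range, 0 elsewhere

segmented = cv2.bitwise_and(img, img, mask=mask)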
6. Local Thresholding
Local thresholding calculates a different threshold for each pixel based on its local neighborhood.
1. Niblack's Method
● The threshold is calculated from the mean and standard deviation of the local neighborhood:
● T(x,y) = μ(x,y) + k·σ(x,y)
● Here,
○ μ(x,y) and σ(x,y) are the mean and standard deviation of the pixel values in the neighborhood of (x,y).
○ k is a constant.
2. Sauvola's Method
● A refinement of Niblack's method that rescales the standard deviation term, making it more robust for document images with varying background contrast.
● Computationally intensive.
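The Niblack rule above can be sketched directly with box filters in OpenCV; the 15x15 window and k = -0.2 are typical but arbitrary choices, and the file name is a placeholder.

import cv2
import numpy as np

gray = cv2.imread('input.jpg', cv2.IMREAD_GRAYSCALE).astype(np.float64)  # placeholder path

win, k = (15, 15), -0.2
mean = cv2.boxFilter(gray, cv2.CV_64F, win)             # local mean mu(x, y)
sqmean = cv2.boxFilter(gray * gray, cv2.CV_64F, win)    # local mean of squared values
std = np.sqrt(np.maximum(sqmean - mean ** 2, 0))        # local standard deviation sigma(x, y)

T = mean + k * std                                      # Niblack threshold per pixel
binary = (gray > T).astype(np.uint8) * 255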
7. Global Thresholding
Global thresholding uses a single threshold value for the entire image. This technique is suitable for images with uniform lighting and a clear contrast between foreground and background.
● Computationally efficient.
8. Iterative Thresholding
Iterative thresholding starts with an initial guess for the threshold value and
iteratively refines it based on the mean intensity of the pixels above and
below the threshold. The process continues until the threshold value
converges.
The basic iterative algorithm is:
1. Choose an initial threshold T0 (for example, the mean intensity of the image).
2. Using the current threshold Tk, segment the pixels into two classes C1 and C2 (pixels above and below Tk).
3. Compute the mean intensities μ1 and μ2 of C1 and C2.
4. Update the threshold: Tk+1 = (μ1 + μ2) / 2.
5. Repeat steps 2-4 until |Tk+1 − Tk| < ϵ.
The final threshold separates the foreground from the background.
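A small NumPy sketch of this iterative procedure; the stopping tolerance eps = 0.5 is an arbitrary choice.

import numpy as np

def iterative_threshold(gray, eps=0.5):
    # Start from the global mean intensity as T0.
    t = gray.mean()
    while True:
        c1 = gray[gray > t]           # class above the current threshold
        c2 = gray[gray <= t]          # class below (or equal to) the threshold
        mu1 = c1.mean() if c1.size else t
        mu2 = c2.mean() if c2.size else t
        t_new = (mu1 + mu2) / 2.0     # Tk+1 = (mu1 + mu2) / 2
        if abs(t_new - t) < eps:      # stop when |Tk+1 - Tk| < eps
            return t_new
        t = t_new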
Applications of Thresholding
Thresholding techniques are used in various applications, including image segmentation, document binarization, object detection, and industrial inspection and automation.
Conclusion
Each thresholding technique suits different imaging conditions: simple thresholding and global thresholding are suitable for images with uniform lighting, while adaptive and local methods cope with varying illumination. Understanding these trade-offs lets practitioners choose the most appropriate method for their specific needs, leading to more accurate and efficient image processing workflows.
❖ Edge detection
Edge detection is a fundamental image processing technique for identifying and locating
the boundaries or edges of objects in an image. It is used to identify and detect the
discontinuities in the image intensity and extract the outlines of objects present in an
image. The edges of any object in an image (e.g. flower) are typically defined as the
regions in an image where there is a sudden change in intensity. There are various types of edge detection techniques, including the Sobel, Canny, Laplacian, Prewitt, Roberts Cross, and Scharr operators, which are described below.
The goal of edge detection algorithms is to identify the most significant edges within an
image or scene. These detected edges should then be connected to form meaningful
lines and boundaries, resulting in a segmented image that contains two or more distinct
regions. The segmented results are subsequently used in various stages of a machine
vision system for tasks such as object counting, measuring, feature extraction, and
classification.
Edge Models
Edge models are theoretical constructs used to describe and understand the different
types of edges that can occur in an image. These models help in developing algorithms
for edge detection by categorizing the types of intensity changes that signify edges. The
basic edge models are Step, Ramp and Roof. A step edge represents an abrupt
change in intensity, where the image intensity transitions from one value to another in a
single step. A ramp edge describes a gradual transition in intensity over a certain
distance, rather than an abrupt change. A roof edge represents a peak or ridge in the
intensity profile, where the intensity increases to a maximum and then decreases.
These intensity changes are described by the image intensity function f(x, y) of a grayscale image. In a color image, the intensity function can be extended to include each color channel.
The first derivative of an image measures the rate of change of pixel intensity. It is
useful for detecting edges because edges are locations in the image where the intensity
changes rapidly. It detects edges by identifying significant changes in intensity. The first
derivative can be approximated using gradient operators like the Sobel, Prewitt, or
Scharr operators.
The second derivative measures the rate of change of the first derivative. It is useful for detecting edges because zero-crossings (points where the second derivative changes sign) correspond to locations of maximum rate of change of intensity. The second derivative can be approximated using the Laplacian operator.
Sobel edge detection is a popular technique used in image processing and computer vision. It performs convolution operations with specific kernels to calculate the gradient magnitude and direction at each pixel in the image. Here's a detailed explanation of Sobel edge detection.
The Sobel operator uses two 3x3 convolution kernels (filters), one for detecting changes in the x-direction (which respond to vertical edges) and one for detecting changes in the y-direction (which respond to horizontal edges). These kernels are used to compute the gradient of the image intensity at each point, which helps in detecting the edges. Here are the Sobel kernels:
𝐺𝑥 = [ -1 0 +1 ; -2 0 +2 ; -1 0 +1 ]
This kernel emphasizes the gradient in the x-direction. The 𝐺𝑥 kernel responds to changes in intensity in the horizontal direction: the positive values (+1 and +2) on the right side will highlight bright areas, while the negative values (-1 and -2) on the left side will highlight dark areas, effectively detecting vertical edges (edges across which the intensity changes horizontally).
𝐺𝑦 = [ -1 -2 -1 ; 0 0 0 ; +1 +2 +1 ]
This kernel emphasizes the gradient in the y-direction. The 𝐺𝑦 kernel responds to changes in intensity in the vertical direction: similarly, the positive values (+1 and +2) at the bottom will highlight bright areas, while the negative values (-1 and -2) at the top will highlight dark areas, effectively detecting horizontal edges (edges across which the intensity changes vertically).
Let's walk through an example of Sobel edge detection using Python and the OpenCV library. The main steps are to load the image and convert it to grayscale, apply the Sobel operator to calculate the gradients in the x and y directions, combine them into a gradient magnitude, and finally normalize and display the resulting edge image.
Here, in the following code for the Sobel operator, cv2.CV_64F specifies the desired depth of the output image. Using a higher depth helps in capturing precise gradient values, especially when dealing with small or fine details. For 𝐺𝑥, the values (1, 0) mean taking the first derivative in the x-direction and the zeroth derivative in the y-direction. For 𝐺𝑦, the values (0, 1) mean taking the first derivative in the y-direction and the zeroth derivative in the x-direction. ksize=3 specifies the size of the extended 3x3 Sobel kernel.
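Since the original listing was lost from these notes, here is a minimal sketch of the walkthrough above; the file name is a placeholder.

import cv2
import numpy as np

gray = cv2.imread('input.jpg', cv2.IMREAD_GRAYSCALE)       # placeholder path

# Gradients in the x and y directions, kept at 64-bit float depth.
gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)

# Combine into a gradient magnitude and rescale to 0-255 for display.
mag = np.sqrt(gx ** 2 + gy ** 2)
edges = cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)

cv2.imwrite('sobel_edges.png', edges)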
Canny edge detection is a multi-stage algorithm for detecting a wide range of edges in images. It was developed by John F. Canny in 1986 and is known for its optimal edge detection capabilities. The algorithm follows a series of steps to reduce noise, compute gradients, thin the edges, and keep only the significant ones.
1. Noise Reduction using Gaussian Blurring: The first step in the Canny edge detection algorithm is to smooth the image using a Gaussian filter. This helps in reducing noise and unwanted details in the image. The Gaussian filter is applied by convolving the image with a Gaussian kernel (a bell-shaped weighting mask). This step helps to remove high-frequency noise, which can cause spurious edge detection.
2. Gradient Calculation:
After noise reduction, the Sobel operator is used to calculate the gradient intensity and
direction of the image. This involves calculating the intensity gradients in the x and y
directions (𝐺𝑥 and 𝐺𝑦). The gradient magnitude and direction are then computed using
these gradients.
3. Non-Maximum Suppression: To thin out the edges and get rid of spurious responses, non-maximum suppression retains only the local maxima in the gradient direction. The idea is to traverse the gradient image and suppress any pixel value that is not considered to be an edge, i.e., any pixel that is not a local maximum along the gradient direction.
For example, consider a point A located on an edge in the vertical direction. The gradient direction is perpendicular to the edge, and points B and C lie along the gradient direction on either side of A. Point A is checked against points B and C to see whether it is a local maximum. If it is, point A proceeds to the next stage; otherwise, it is suppressed and set to zero.
4. Double Thresholding and Hysteresis: After non-maximum suppression, the candidate edge pixels are marked using double thresholding. This step classifies the edges into strong, weak, and non-edges based on two thresholds: high and low. Strong edges are those pixels with gradient values above the high threshold, while weak edges are those with gradient values between the low and high thresholds. Given the gradient magnitude 𝑀 and two thresholds 𝑇high and 𝑇low, the classification can be summarised as: strong if 𝑀 ≥ 𝑇high, weak if 𝑇low ≤ 𝑀 < 𝑇high, and non-edge if 𝑀 < 𝑇low. Weak edges are kept only if they are connected to strong edges (edge tracking by hysteresis).
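OpenCV bundles all of these stages into a single cv2.Canny call; a minimal sketch (the thresholds 50 and 150, the blur settings, and the file name are illustrative placeholders):

import cv2

gray = cv2.imread('input.jpg', cv2.IMREAD_GRAYSCALE)   # placeholder path

blurred = cv2.GaussianBlur(gray, (5, 5), 1.4)          # step 1: Gaussian noise reduction
edges = cv2.Canny(blurred, 50, 150)                    # steps 2-4: gradients, NMS, hysteresis

cv2.imwrite('canny_edges.png', edges)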
Laplacian edge detection highlights regions of rapid intensity change, which are often associated with edges in an image. Unlike gradient-based methods such as Sobel and Canny, which use directional gradients, Laplacian edge detection relies on the second derivative of the image intensity.
Following are the key concepts of Laplacian edge detection:
The Laplacian operator is used to detect edges by calculating the second derivative of the image intensity. Mathematically, the Laplacian of an image 𝑓(𝑥, 𝑦) can be represented as ∇²𝑓 = ∂²𝑓/∂𝑥² + ∂²𝑓/∂𝑦². This can be implemented using convolution with a Laplacian kernel; a common 3x3 Laplacian kernel is [ 0 1 0 ; 1 -4 1 ; 0 1 0 ]. OpenCV provides the cv2.Laplacian function to perform Laplacian edge detection on images. This function applies the Laplacian operator to the input image to compute the second derivative of the image intensity. Following are the main steps:
1. Load the Image: Read the input image and convert it to grayscale.
2. Apply Gaussian Blur (Optional): Smoothing the image with a Gaussian blur can reduce noise and prevent the Laplacian from responding to spurious details.
3. Apply the Laplacian Operator: Convolve the image with a Laplacian kernel to compute the second derivative and highlight the edges.
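A brief sketch of those steps; the file name and the blur kernel size are placeholder choices.

import cv2

gray = cv2.imread('input.jpg', cv2.IMREAD_GRAYSCALE)   # step 1: load as grayscale
blurred = cv2.GaussianBlur(gray, (3, 3), 0)            # step 2: optional smoothing

lap = cv2.Laplacian(blurred, cv2.CV_64F, ksize=3)      # step 3: second derivative
edges = cv2.convertScaleAbs(lap)                       # absolute value scaled to 8 bits

cv2.imwrite('laplacian_edges.png', edges)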
Prewitt edge detection is a technique used for detecting edges in digital images. It
works by computing the gradient magnitude of the image intensity using convolution
with Prewitt kernels. The gradients are then used to identify significant changes in intensity, which correspond to edges.
Prewitt edge detection uses two kernels, one for detecting edges in the horizontal
direction and the other for the vertical direction. These kernels are applied to the image
using convolution.
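OpenCV does not provide a dedicated Prewitt function, so a sketch can apply the standard Prewitt masks with cv2.filter2D (the file name is a placeholder):

import cv2
import numpy as np

gray = cv2.imread('input.jpg', cv2.IMREAD_GRAYSCALE).astype(np.float64)  # placeholder path

kx = np.array([[-1, 0, 1],
               [-1, 0, 1],
               [-1, 0, 1]], dtype=np.float64)      # Prewitt kernel for the x-direction
ky = kx.T                                          # Prewitt kernel for the y-direction

gx = cv2.filter2D(gray, -1, kx)
gy = cv2.filter2D(gray, -1, ky)
mag = np.sqrt(gx ** 2 + gy ** 2)                   # gradient magnitude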
Roberts Cross edge detection is a simple technique used for detecting edges in digital
images. It works by computing the gradient magnitude of the image intensity using
convolution with Roberts Cross kernels. These kernels are small, simple, and efficient
for detecting edges, especially when the edges are thin and prominent. The operator was proposed by Lawrence Roberts.
Roberts Cross edge detection uses two kernels, one for detecting edges in the horizontal direction and the other for the vertical direction. These kernels are applied to the image using convolution. The main steps are:
1. Load the Image: Read the input image and convert it to grayscale.
2. Apply the Horizontal and Vertical Roberts Cross Kernels: Convolve the image with the horizontal Roberts Cross kernel (Gx) to detect horizontal edges and with the vertical Roberts Cross kernel (Gy) to detect vertical edges.
3. Compute Gradient Magnitude: Combine the horizontal and vertical edge maps to compute the gradient magnitude of the image intensity at each pixel. The magnitude is typically computed as √(Gx² + Gy²) or approximated by |Gx| + |Gy|.
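A comparable sketch for the 2x2 Roberts Cross kernels, again using cv2.filter2D (placeholder file name):

import cv2
import numpy as np

gray = cv2.imread('input.jpg', cv2.IMREAD_GRAYSCALE).astype(np.float64)  # placeholder path

gx_kernel = np.array([[1, 0],
                      [0, -1]], dtype=np.float64)   # first Roberts Cross kernel (Gx)
gy_kernel = np.array([[0, 1],
                      [-1, 0]], dtype=np.float64)   # second Roberts Cross kernel (Gy)

gx = cv2.filter2D(gray, -1, gx_kernel)
gy = cv2.filter2D(gray, -1, gy_kernel)
mag = np.sqrt(gx ** 2 + gy ** 2)                    # combined gradient magnitude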
Scharr edge detection is another method used to detect edges in digital images. It is an
improvement over the Sobel operator. The Scharr operator consists of two 3x3
convolution kernels, one for approximating the horizontal gradient and the other for
approximating the vertical gradient. These kernels are applied to the image to compute
the gradient at each pixel, which highlights areas of rapid intensity change or edges.
The horizontal gradient kernel (Gx) is designed to approximate the rate of change of
intensity in the horizontal direction, while the vertical gradient kernel (Gy) approximates
the rate of change of intensity in the vertical direction. The Scharr kernels are as follows:
Gx = [ -3 0 +3 ; -10 0 +10 ; -3 0 +3 ]
Gy = [ -3 -10 -3 ; 0 0 0 ; +3 +10 +3 ]
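OpenCV exposes these kernels directly through cv2.Scharr; a minimal sketch (placeholder file name):

import cv2

gray = cv2.imread('input.jpg', cv2.IMREAD_GRAYSCALE)   # placeholder path

gx = cv2.Scharr(gray, cv2.CV_64F, 1, 0)                # gradient in the x-direction
gy = cv2.Scharr(gray, cv2.CV_64F, 0, 1)                # gradient in the y-direction

mag = cv2.magnitude(gx, gy)                            # gradient magnitude per pixel
edges = cv2.convertScaleAbs(mag)                       # scaled to 8 bits for display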
❖ Corner detection
A common task is to match two sets of interest points extracted from two views of a scene and, by virtue of the indexing of the two sets, establish correspondences between them. Doing this by hand is tedious, and matching becomes more difficult as the baseline between the views gets wider (for example, the left and right views of a wide-baseline stereo pair). We would therefore like to automate the process, and that is the purpose of interest point detection. Good interest points should be informative and reproducible, so that the same physical points can be detected again in other views.
Interest point detection is often called corner detection, even though not all detected features correspond to a corner in the physical world; a window in the image can respond like a corner simply because it covers surfaces at different depths, without being a real corner.
Properties of Corners: if we think of each point on the image plane as a height on a surface, with light points being high and dark points being low, this enables us to use basic calculus on the surface. Corner features occur where there is a sharp change in the angle of the boundary: at an ideal corner, the tangent lines to the boundary intersect exactly at the corner point. Flat regions and regions of uniform texture lack this structure and do not make good interest points.
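One widely used detector built on these ideas is the Harris corner detector (not named above; it is offered here as an illustrative assumption). A hedged OpenCV sketch, where the block size, aperture, k, and the 1% response cut-off are typical but arbitrary values:

import cv2
import numpy as np

gray = cv2.imread('input.jpg', cv2.IMREAD_GRAYSCALE)    # placeholder path

# Harris response: large where the intensity surface bends sharply in two directions.
response = cv2.cornerHarris(np.float32(gray), blockSize=2, ksize=3, k=0.04)

# Keep the strongest responses as corner candidates.
corners = response > 0.01 * response.max()
print('corner pixels found:', int(corners.sum()))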
➔ Mathematical Morphology
Mathematical morphology provides tools for extracting image components that are useful for representation and description. The technique was originally developed by Matheron and Serra [3] at the Ecole des Mines in Paris. It can be used to extract region boundaries, their skeletons, and their convex hulls, and it is also useful for many pre- and post-processing tasks such as morphological filtering, thinning, and pruning.
Morphological operations are usually applied to binary images, but they can also be applied to grey-level images and even range images (a range image is one where grey levels represent the distance from the sensor to the objects in the scene rather than the intensity of light reflected from them).
Set operations
Morphological operations such as dilation and erosion are defined through set operations between the image and a structuring element. The structuring element can be any shape. The image and structuring element sets need not be restricted to two dimensions; the same definitions hold in three (or higher) dimensions.
Let A and B be subsets of Z2. The translation of A by x is denoted (A)x and is defined as
(A)x = { c | c = a + x, for some a in A }.
The reflection of B about its origin is denoted B̂, and the difference of two sets A and B is denoted A - B.
Dilation
The dilation of A by the structuring element B is the set of all displacements x such that B and A overlap by at least one element; it is obtained by taking the reflection of B about its origin and then shifting this reflection by x:
A ⊕ B = { x | (B̂)x ∩ A ≠ ∅ }.
Consider the example where A is a rectangle and B is a disc centred on the origin: the dilation grows A outward by the radius of the disc. (Note that if B is not centred on the origin we will get a shifted result.) Dilation is useful for bridging gaps: it can smooth sections of contours and fuse narrow breaks and long thin gulfs.
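A small OpenCV sketch of dilation (and its dual, erosion) on a binary image, using a 3x3 rectangular structuring element; the threshold, element size, and file name are arbitrary choices.

import cv2

gray = cv2.imread('input.jpg', cv2.IMREAD_GRAYSCALE)          # placeholder path
_, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)  # A: the binary image

# B: a 3x3 rectangular structuring element centred on its origin.
B = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))

dilated = cv2.dilate(binary, B, iterations=1)   # grows foreground regions, fills small gaps
eroded = cv2.erode(binary, B, iterations=1)     # shrinks foreground regions, removes specks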
➔ Shape Representation:
Binary shapes can be represented using various
techniques, such as boundary-based representations (e.g., contour tracing
algorithms), region-based representations (e.g., connected component
labeling), or skeletal representations (e.g., skeletonization algorithms).
➔ Shape Descriptors:
Shape descriptors are numerical or symbolic features that
capture specific characteristics of binary shapes. These descriptors are calculated
based on shape properties, such as area, perimeter, compactness, moments, or
orientation. They provide quantitative measures that can be used for shape
comparison, classification, or recognition.
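A sketch of computing a few such descriptors with OpenCV contours (area, perimeter, and a compactness measure; the compactness formula used here is one common convention, and the OpenCV 4 return signature of findContours is assumed):

import cv2
import numpy as np

gray = cv2.imread('input.jpg', cv2.IMREAD_GRAYSCALE)          # placeholder path
_, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)

contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

for c in contours:
    area = cv2.contourArea(c)                  # shape area
    perimeter = cv2.arcLength(c, True)         # closed-contour perimeter
    m = cv2.moments(c)                         # spatial moments (centroid, orientation, ...)
    if perimeter > 0:
        compactness = 4 * np.pi * area / (perimeter ** 2)   # 1.0 for a perfect circle
        print(area, perimeter, compactness)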
➔ CONNECTEDNESS
Connectedness, in the context of binary shape analysis, refers to the
property of a set of pixels or objects being connected to form a
coherent shape or region. It determines the relationships and
connectivity between the pixels in a binary image, where each pixel
can be either "on" (foreground) or "off" (background).
❖ OBJECT LABELING AND COUNTING
Object labeling and counting refer to the process of identifying and
categorizing objects within an image or a visual scene and determining the
number of instances of each object category present. It is a common task in
computer vision and image processing, often used in various applications
such as object detection, image recognition, and scene understanding.
The general steps involved in object labeling and counting are as follows:
1. Image Acquisition: Obtain the image or visual data that you want to
analyze.
This could be a photograph, a video frame, or even a live video feed.
2. Preprocessing: Preprocess the image to enhance its quality and remove
any noise or unwanted elements. Common preprocessing techniques
include resizing, noise reduction, and image enhancement.
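The later steps (segmenting the foreground, labeling each connected region, and counting them) can be sketched with OpenCV's connected-component analysis; Otsu thresholding stands in here for whatever segmentation a real application needs, and the file name is a placeholder.

import cv2

gray = cv2.imread('input.jpg', cv2.IMREAD_GRAYSCALE)            # placeholder path

# Segment foreground objects (Otsu's threshold is just one possible choice here).
_, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Label connected foreground regions; label 0 is the background.
num_labels, labels, stats, centroids = cv2.connectedComponentsWithStats(binary)

print('objects counted:', num_labels - 1)
for i in range(1, num_labels):
    area = stats[i, cv2.CC_STAT_AREA]           # pixel count of object i
    print('object', i, 'area', area)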
❖ ACTIVE CONTOURS
Image segmentation means partitioning the input image by clustering pixel
values. It mainly identifies various surfaces or living or nonliving objects in
an image. For example, given a photograph of a tiger in its habitat as input, you could obtain the tiger, green grass, blue water, and land as the various surfaces in your output image.
Various image segmentation techniques exist; one of them is Active Contours.
Active contours are deformable models that use energy forces and constraints to separate the pixels of interest from the rest of the image during the segmentation process. Contours are the boundaries that define the region of interest in an image; they are also used to define smooth shapes in images and construct closed contours. The contour deforms under the influence of external and internal forces, and the energy function is always related to the image's curve.
➔ Snake Model
The snake model identifies and outlines the target object for segmentation; it requires some prior knowledge of the target object's shape, especially for complicated things. The active snake model deforms under the influence of forces derived from the image.
➔ Equation
The snake's energy function is the total of its exterior and internal energy: Esnake = EInternal + EExternal. The internal elastic energy term EInternal regulates the snake's deformations, while the external edge-based energy term EExternal controls how the contour fits onto the image. The external energy is the sum of the forces caused by the picture, Eimage, and the constraint forces imposed by the user, Econ: EExternal = Eimage + Econ.
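A hedged sketch of snake-based segmentation using scikit-image's active_contour (an assumption: scikit-image rather than OpenCV, and the circular initial contour, smoothing, and alpha/beta/gamma weights are arbitrary starting values):

import numpy as np
from skimage import io, color, filters
from skimage.segmentation import active_contour

img = color.rgb2gray(io.imread('input.jpg'))           # placeholder path

# Initial snake: a circle of radius 100 centred at (row=150, col=200).
s = np.linspace(0, 2 * np.pi, 400)
init = np.column_stack([150 + 100 * np.sin(s), 200 + 100 * np.cos(s)])

# The snake deforms to minimise internal (elasticity/smoothness) plus image energy.
snake = active_contour(filters.gaussian(img, sigma=3),
                       init, alpha=0.015, beta=10, gamma=0.001)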
Advantage
The applications of the active snake model are expanding rapidly, particularly in medical imaging, where contours can be fitted to organs and lesions in order to detect disorders or anomalies.
Disadvantage
The classical snake model is sensitive to how the initial contour is placed and can get stuck in local minima of the energy function.
❖ Shape models and shape recognition
Shape models and shape recognition are critical components in computer vision, and understanding these concepts can help you grasp how computers interpret and analyze shapes in images.
1. Shape Models
Shape models describe the geometry of an object class. Common shape models include:
● Rigid or template models, which represent a fixed shape (possibly allowing for translation, scaling, and rotation).
● Deformable models, such as Active Shape Models, where the shape model is actively fitted to the data by iterating to find the pose and deformation that best match the image.
2. Shape Recognition
Shape recognition techniques extract and match shape features, for example:
● Edge Detection: Finding object boundaries using operators such as the Sobel, Canny, or Laplacian operators.
● Corner Detection: Corners are high-information points that help in matching and classification.
● Hough Transform: Maps points from the spatial domain into a parameter space. It's especially useful for detecting lines, circles, and other parametric shapes even when they are partially occluded or deformed.
● Moment Invariants: Describe shape properties (like area and orientation) that are invariant to translation, scaling, and rotation (useful for recognizing shapes in non-standard orientations).
Further topic to refer: boundary descriptors.