IA Unit 01
Image analytics can bring out that information for you, letting you see who is using
your products, where, and how. It also lets you deep-dive into the sentiment of
people while they are using your products.
Image
Digital Image
A 16-bit format is actually divided into three further channels: Red, Green, and
Blue. That is the famous RGB format.
Image as a Matrix
As we know, images are represented in rows and columns. We have the following
syntax in which images are represented:

f(x,y) = [ f(0,0)      f(0,1)      ...  f(0,N-1)
           f(1,0)      f(1,1)      ...  f(1,N-1)
           ...         ...         ...  ...
           f(M-1,0)    f(M-1,1)    ...  f(M-1,N-1) ]

The right side of this equation is a digital image by definition. Every element of this
matrix is called an image element, picture element, or pixel.
Array versus Matrix Operation: Images are viewed as matrices, but in this series of
DIP we use array operations. There is a difference between matrix and array operations.
In an array operation, the operation is carried out pixel by pixel on the image
(element-wise). In a matrix operation, the operation follows the rules of matrix
algebra, e.g. matrix multiplication. A contrast of the two is sketched below.
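To make the distinction concrete, here is a minimal sketch, assuming NumPy as the
array library (the 2x2 arrays are toy stand-ins for images):

import numpy as np

f = np.array([[1, 2],
              [3, 4]])
g = np.array([[5, 6],
              [7, 8]])

# Array (element-wise) operation: each output pixel is f[i, j] * g[i, j]
elementwise = f * g   # [[ 5, 12], [21, 32]]

# Matrix operation: rows of f combine with columns of g (matrix algebra)
matmul = f @ g        # [[19, 22], [43, 50]]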
ARITHMETIC OPERATIONS:
1. Addition Operation: h(x,y) = f(x,y) + g(x,y), where f(x,y) is image 1, g(x,y) is
image 2, and h(x,y) is the new image formed. Let s(x,y) be the new corrupted image
obtained by adding noise, i.e. s(x,y) = f(x,y) + n(x,y), where n(x,y) is the noise;
averaging many such noisy copies is the practical application of addition, since it
reduces the noise.
2. Subtraction Operation: h(x,y) = f(x,y) - g(x,y), where f(x,y) and g(x,y) are two
images and h(x,y) is the new image formed; it highlights the differences between
the two images.
3. Multiplication Operation: We can also multiply an image by a constant, i.e.
h(x,y) = f(x,y) * constant. The multiplication operation is used in shading correction.
4. Division Operation: We can also divide an image by a constant, i.e.
h(x,y) = f(x,y) / constant. Division is likewise used in shading correction.
All four operations are sketched below.
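A short NumPy sketch of these operations follows; the random arrays are placeholders,
and the widening cast plus the final clip guard against silent 8-bit overflow:

import numpy as np

f = np.random.randint(0, 256, (4, 4), dtype=np.uint8)   # image 1
g = np.random.randint(0, 256, (4, 4), dtype=np.uint8)   # image 2

work = f.astype(np.int32)   # widen so sums and differences do not wrap around

added      = np.clip(work + g, 0, 255).astype(np.uint8)    # h = f + g
subtracted = np.clip(work - g, 0, 255).astype(np.uint8)    # h = f - g
scaled     = np.clip(work * 1.5, 0, 255).astype(np.uint8)  # h = f * constant
divided    = np.clip(work / 2, 0, 255).astype(np.uint8)    # h = f / constant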
LOGICAL OPERATION:
Logical operations such as AND, OR, and NOT (complement) are also carried out pixel
by pixel, and are most often applied to binary images, for example to mask out a
region of interest, as sketched below.
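A minimal sketch of such logical operations on binary masks, assuming NumPy boolean
arrays (the masks are invented for illustration):

import numpy as np

a = np.array([[1, 1, 0],
              [0, 1, 0]], dtype=bool)
b = np.array([[1, 0, 0],
              [1, 1, 0]], dtype=bool)

both    = a & b   # AND: pixels set in both masks
either  = a | b   # OR:  pixels set in at least one mask
negated = ~a      # NOT: complement of a mask

# Typical use: keep only the image pixels inside a mask (region of interest)
img = np.arange(6, dtype=np.uint8).reshape(2, 3)
roi = np.where(a, img, 0)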
BASIC INTENSITY TRANSFORMATIONS:
In the realm of image processing, basic intensity transformations play a crucial role in
tailoring images to meet specific needs and applications. They are particularly useful
when images have a limited intensity range, and they help to stretch or compress this
range to bring out finer details. Intensity transformations are basic operations used in
image processing to alter the pixel intensities of an image. These include the following.
Negative Transformation
The negative transformation is achieved by taking the complement of the original pixel
intensity values. If the original pixel intensity is denoted by “r,” the corresponding
negative intensity “s” is given by:
s = (L - 1) - r
Here, “L” represents the number of intensity levels in the image, 256 in the case of
8-bit images, so L - 1 = 255. This equation subtracts the original intensity from the
maximum possible intensity, effectively inverting the pixel values. For example, if the
original intensity is 100 in an 8-bit image, the negative intensity will be
255 - 100 = 155.
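As a quick sketch, assuming NumPy and an 8-bit image:

import numpy as np

r = np.array([[100, 0],
              [255, 40]], dtype=np.uint8)
s = 255 - r   # s = (L - 1) - r with L = 256 levels
# 100 -> 155, 0 -> 255, 255 -> 0, 40 -> 215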
Identity Transformation
The identity transformation leaves the image unchanged; it does not change the pixel
intensities of the image. If “r” represents the original pixel intensity and “s” the
output intensity, then:
s = r
Log Transformation
Log transformations are non-linear adjustments aimed at enhancing the visibility of
details in images. They are particularly useful for images with a wide range of pixel
values, such as those captured in low-light or high-dynamic-range conditions.
The log transformation involves applying the logarithm function to each pixel value in an
image. This operation spreads out the darker pixel values, making fine details in
dark regions more visible:
S = c · log(1 + R)
where R is the input pixel value and c is a scaling constant chosen so that the output
spans the full range of intensity values.
The inverse-log transformation is the reverse operation, aimed at expanding the range of
brighter pixels. This is especially useful for images with overexposed regions. The
transformation is:
S = e^(R/c) - 1
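A brief sketch of both transformations, assuming NumPy; c is chosen so that an 8-bit
input range maps back onto 0-255:

import numpy as np

r = np.linspace(0, 255, 256)   # all possible 8-bit input values

c = 255 / np.log(1 + 255)      # scale so r = 255 maps to s = 255
s_log = c * np.log(1 + r)      # spreads out the dark values

s_invlog = np.exp(r / c) - 1   # reverse: expands the bright values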
Power-law transformations
Power-law (gamma) transformations are a family of transformations used to adjust the
tonal and brightness characteristics of an image. These transformations are applied to
enhance the visual quality of images or correct for issues related to illumination and
contrast.
The basic idea behind power-law transformations is to raise the pixel values of an image
to a certain power (exponent) in order to adjust the image’s overall brightness and
contrast:
O = k · I^γ
Where:
k is a constant that scales the result to fit within the desired intensity range,
I is the input pixel intensity, and
γ (gamma) is the exponent that controls the shape of the curve.
Gamma Correction: When γ < 1, the transformation brightens the image (dark values are
expanded); when γ > 1, it darkens the image (dark values are compressed). The term
gamma correction refers to choosing γ to compensate for the non-linear response of a
display device. Adjusting Contrast: Values of γ greater than 1 increase contrast in the
brighter regions by making dark areas darker, while values between 0 and 1 increase
contrast in the darker regions by making them brighter. No Change: When γ = 1, the
transformation has no effect on the image; the output is the same as the input.
Power-law transformations are widely used in image enhancement, gamma correction for
display devices, and improving the visibility of detail in dark or washed-out images.
A sketch follows.
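A short power-law sketch, assuming NumPy; intensities are normalized to [0, 1] before
the exponent is applied, which corresponds to k = 1 in the formula above:

import numpy as np

def gamma_transform(img, gamma, k=1.0):
    # Apply O = k * I**gamma to an 8-bit image
    i = img.astype(np.float64) / 255.0                 # normalize to [0, 1]
    o = k * np.power(i, gamma)
    return np.clip(o * 255.0, 0, 255).astype(np.uint8)

img = np.arange(0, 256, dtype=np.uint8).reshape(16, 16)
brighter = gamma_transform(img, 0.5)   # gamma < 1 brightens
darker   = gamma_transform(img, 2.2)   # gamma > 1 darkens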
Histogram
In the graph of a histogram, the horizontal axis represents the tonal variations,
whereas the vertical axis represents the number of pixels with each particular tone.
Black and dark areas are represented on the left side of the horizontal axis, medium
grey tones in the middle, and bright and white areas on the right; the height of each
bar gives the number of pixels in that tonal range.
Applications of Histograms
Histogram Sliding: In histogram sliding, the complete histogram is shifted towards the
right or the left by adding or subtracting a constant to every pixel value, which
brightens or darkens the image.
Histogram Equalization: Histogram equalization is used for equalizing all the pixel
values of an image. The transformation is done in such a way that a uniform, flattened
histogram is produced.
Histogram equalization increases the dynamic range of pixel values and aims for an
approximately equal count of pixels at each level, which produces a flat histogram and a
high-contrast image.
While stretching a histogram, the shape of the histogram remains the same, whereas in
histogram equalization the shape of the histogram changes, and it generates only one
image.
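The sketch below implements histogram equalization from its definition, via the
normalized cumulative histogram (CDF); it assumes NumPy and an 8-bit grayscale image:

import numpy as np

def equalize(img):
    # Histogram-equalize an 8-bit grayscale image via its CDF
    hist = np.bincount(img.ravel(), minlength=256)   # histogram of levels
    cdf = hist.cumsum()
    cdf = cdf / cdf[-1]                              # normalize to [0, 1]
    lut = np.round(255 * cdf).astype(np.uint8)       # old level -> new level
    return lut[img]

img = np.random.randint(80, 140, (64, 64), dtype=np.uint8)   # low contrast
out = equalize(img)   # values now spread over nearly the full 0-255 range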
Smoothing Spatial Filter: Smoothing filters are used for blurring and noise
reduction. Types of smoothing filter:
1. Mean Filter: A linear spatial filter whose response is simply the average of the
pixels contained in the neighborhood of the filter mask. The idea is to replace the
value of every pixel in an image by the average of the grey levels in the
neighbourhood defined by the filter mask. Types of Mean filter:
(i) Averaging filter: It is used to reduce the detail in an image. All
coefficients are equal.
(ii) Weighted averaging filter: In this, pixels are multiplied by different
coefficients, with the center pixel multiplied by a higher value than in the
plain averaging filter.
2. Order Statistics Filter: It is based on ordering (ranking) the pixels contained in
the image area encompassed by the filter. It replaces the value of the center pixel
with the value determined by the ranking result. Edges are better preserved by this
kind of filtering. Types of Order statistics filter:
(i) Minimum filter: 0th percentile filter is the minimum filter. The value of the
center is replaced by the smallest value in the window.
(ii) Maximum filter: 100th percentile filter is the maximum filter. The value of
the center is replaced by the largest value in the window.
(iii) Median filter: Each pixel in the image is considered in turn. First, its
neighboring pixels are sorted, and the original value of the pixel is replaced
by the median of the sorted list. Both filter families are sketched below.
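Both families can be sketched with SciPy's ndimage module (uniform_filter,
median_filter, minimum_filter, and maximum_filter are existing SciPy functions; the
random image is a placeholder):

import numpy as np
from scipy import ndimage

img = np.random.randint(0, 256, (64, 64)).astype(np.float64)

mean_3x3   = ndimage.uniform_filter(img, size=3)   # averaging (mean) filter
median_3x3 = ndimage.median_filter(img, size=3)    # order-statistics: median
min_3x3    = ndimage.minimum_filter(img, size=3)   # 0th percentile
max_3x3    = ndimage.maximum_filter(img, size=3)   # 100th percentile

On the same noisy image, the median output keeps edges noticeably crisper than the
mean output, which is exactly the edge-preservation property noted above.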
Sharpening Spatial Filter: It is also known as a derivative filter. The purpose of the
sharpening spatial filter is just the opposite of the smoothing spatial filter: its main
focus is on removing blurring and highlighting edges. It is based on first- and
second-order derivatives. First order derivative:
Must be zero in flat segments.
Must be non zero at the onset of a grey level step.
Must be non zero along ramps.
First order derivative in 1-D is given by:
f' = f(x+1) - f(x)
Second order derivative:
Must be zero in flat areas.
Must be non zero at the onset and end of a grey level step or ramp.
Must be zero along ramps of constant slope.
Second order derivative in 1-D is given by:
f'' = f(x+1) + f(x-1) - 2f(x)
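Building on the 1-D second derivative above, the sketch below forms the standard 2-D
Laplacian and subtracts it from the image to sharpen it; it assumes NumPy, and np.roll
wraps at the borders, so real code would pad instead:

import numpy as np

def laplacian_sharpen(img, strength=1.0):
    # Sharpen via g = f - strength * laplacian(f)
    f = img.astype(np.float64)
    # 2-D Laplacian: f(x+1,y) + f(x-1,y) + f(x,y+1) + f(x,y-1) - 4 f(x,y)
    lap = (np.roll(f, 1, axis=0) + np.roll(f, -1, axis=0)
           + np.roll(f, 1, axis=1) + np.roll(f, -1, axis=1) - 4 * f)
    return np.clip(f - strength * lap, 0, 255).astype(np.uint8)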
Color Fundamentals
Colors are seen as variable combinations of the primary colors of light: red (R),
green (G), and blue (B). The primary colors can be mixed to produce the secondary
colors: magenta (red+blue), cyan (green+blue), and yellow (red+green). Mixing the
three primaries, or a secondary with its opposite primary color, produces white light.
RGB colors are used for color TV, monitors, and video cameras.
However, the primary colors of pigments are cyan (C), magenta (M), and yellow (Y),
and the secondary colors are red, green, and blue. A proper combination of the three
pigment primaries, or a secondary with its opposite primary, produces black.
Color characteristics
The characteristics generally used to distinguish one color from another are brightness
(the intensity of the light), hue (the dominant wavelength, i.e. the dominant color
perceived), and saturation (the relative purity of the color, or the amount of white
light mixed with the hue).
Color Models
The RGB Color Model
In this model, each color appears in its primary colors red, green, and blue. This
model is based on a Cartesian coordinate system. The color subspace is the cube
shown in the figure below. The different colors in this model are points on or inside
the cube, and are defined by vectors extending from the origin.
The total number of bits used to represent each pixel in RGB image is called pixel
depth. For example, in an RGB image if each of the red, green, and blue images
is an 8-bit image, the pixel depth of the RGB image is 24-bits. The figure below
shows the component images of an RGB image.
Figure 15.5: A full-color image and its RGB component images (red, green, and blue).
The CMY and CMYK Color Models
Cyan, magenta, and yellow are the primary colors of pigments. Most printing devices
such as color printers and copiers require CMY data input or perform an RGB to
CMY conversion internally. This conversion is performed using the equation
[C]   [1]   [R]
[M] = [1] - [G]
[Y]   [1]   [B]

where all color values have been normalized to the range [0, 1].
In printing, combining equal amounts of cyan, magenta, and yellow produces a
muddy-looking black. In order to produce true black, a fourth color, black (K), is
added, giving rise to the CMYK color model; both conversions are sketched after the
figure below.
Figure: The CMYK component images of an RGB image (cyan, magenta, yellow, and black).
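A small sketch of both conversions, assuming NumPy and RGB values already normalized
to [0, 1]; the CMYK step uses the common convention K = min(C, M, Y), which the text
above describes only informally:

import numpy as np

def rgb_to_cmy(rgb):
    # CMY = 1 - RGB for values normalized to [0, 1]
    return 1.0 - rgb

def cmy_to_cmyk(cmy):
    # Pull the shared gray component out into a separate black channel
    k = cmy.min(axis=-1, keepdims=True)
    denom = np.where(k < 1.0, 1.0 - k, 1.0)   # avoid division by zero
    cmy_adj = (cmy - k) / denom
    return np.concatenate([cmy_adj, k], axis=-1)

rgb = np.array([[[1.0, 0.0, 0.0]]])   # one pure-red pixel
print(rgb_to_cmy(rgb))                # [[[0. 1. 1.]]]: magenta + yellow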
The HSI Color Model
The RGB and CMY color models are not suited for describing colors in terms of
human interpretation. When we view a color object, we describe it by its hue,
saturation, and brightness (intensity). Hence the HSI color model was developed.
The HSI model decouples the intensity component from the color-carrying
information (hue and saturation) in a color image. As a result, this model is an ideal
tool for developing color image processing algorithms.
The hue, saturation, and intensity values can be obtained from the RGB color cube.
That is, we can convert any RGB point to a corresponding point in the HSI color
model by working out the geometrical formulas. Conversely, given HSI values (with H
measured in degrees), the corresponding RGB values are obtained sector by sector:
If 0° ≤ H < 120° (RG sector):
B = I(1 - S)
R = I[1 + (S cos H) / cos(60° - H)]
G = 3I - (R + B)

If 120° ≤ H < 240° (GB sector), first let H = H - 120°, then:
R = I(1 - S)
G = I[1 + (S cos H) / cos(60° - H)]
B = 3I - (R + G)

If 240° ≤ H ≤ 360° (BR sector), first let H = H - 240°, then:
G = I(1 - S)
B = I[1 + (S cos H) / cos(60° - H)]
R = 3I - (G + B)
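These sector formulas transcribe directly into Python; the sketch below assumes NumPy,
H in degrees, and S and I in [0, 1]:

import numpy as np

def hsi_to_rgb(h, s, i):
    # Convert one HSI triple (h in degrees, s and i in [0, 1]) to RGB
    h = h % 360.0
    if h < 120.0:                     # RG sector
        b = i * (1 - s)
        r = i * (1 + s * np.cos(np.radians(h)) / np.cos(np.radians(60 - h)))
        g = 3 * i - (r + b)
    elif h < 240.0:                   # GB sector
        h -= 120.0
        r = i * (1 - s)
        g = i * (1 + s * np.cos(np.radians(h)) / np.cos(np.radians(60 - h)))
        b = 3 * i - (r + g)
    else:                             # BR sector
        h -= 240.0
        g = i * (1 - s)
        b = i * (1 + s * np.cos(np.radians(h)) / np.cos(np.radians(60 - h)))
        r = 3 * i - (g + b)
    return r, g, b

print(hsi_to_rgb(0.0, 1.0, 1.0 / 3.0))   # pure red: (1.0, 0.0, 0.0)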
The next figure shows the HSI component images of an RGB image.
Full-Color Image Processing
Full-color image processing approaches fall into two major categories:
1. Approaches that process each component image individually and then form a
composite processed color image from the individually processed components.
2. Approaches that work with color pixels directly.
In full-color images, color pixels really are vectors. For example, in the RGB
system, each color pixel can be expressed as
          [cR(x, y)]   [R(x, y)]
c(x, y) = [cG(x, y)] = [G(x, y)]
          [cB(x, y)]   [B(x, y)]
For an image of size M×N, there are MN such vectors, c(x, y), for x = 0,1,2,...,M-1; y
= 0,1,2,...,N-1.
Color Transformation
Figure 15.8: (a) Original image. (b) Result of decreasing its intensity.
Some Basic Relationships between Pixels
A pixel p at coordinates (x,y) has four horizontal and vertical neighbors, with
coordinates (x+1,y), (x-1,y), (x,y+1), (x,y-1); these are called the 4-neighbors of p
and are denoted by N4(p). The four diagonal neighbors of p have coordinates
(x+1,y+1), (x+1,y-1), (x-1,y+1), (x-1,y-1) and are denoted by ND(p). These points,
together with the 4-neighbors, are called the 8-neighbors of p, denoted by N8(p). As
before, some of the neighbor locations in ND(p) and N8(p) fall outside the image if
(x,y) is on the border of the image.
(x-1, y-1)  (x-1, y)  (x-1, y+1)
(x,   y-1)  (x,   y)  (x,   y+1)
(x+1, y-1)  (x+1, y)  (x+1, y+1)
Let V be the set of intensity values used to define adjacency. Three types of
adjacency are considered:
1. 4-adjacency: two pixels p and q with values from V are 4-adjacent if q is in the
set N4(p).
2. 8-adjacency: two pixels p and q with values from V are 8-adjacent if q is in the
set N8(p).
3. m-adjacency (mixed adjacency): two pixels p and q with values from V are
m-adjacent if:
q is in N4(p), or
q is in ND(p) and the set N4(p) ∩ N4(q) has no pixels whose values are from V.
Example:
Consider the pixel arrangement shown in fig.(a) for V={1}. The three pixels at
the top of fig.(b) show multiple (ambiguous) 8-adjacency, as indicated by the lines.
This ambiguity is removed by using m-adjacency, as shown in fig.(c). An m-adjacency
test is sketched below.
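The m-adjacency definition can be tested mechanically; the sketch below assumes NumPy,
coordinates as (row, col) tuples, and V as a Python set (the sample image is invented
for illustration):

import numpy as np

def n4(p):
    r, c = p
    return {(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)}

def nd(p):
    r, c = p
    return {(r - 1, c - 1), (r - 1, c + 1), (r + 1, c - 1), (r + 1, c + 1)}

def m_adjacent(img, p, q, v):
    # True if p and q (with values in v) are m-adjacent
    if img[p] not in v or img[q] not in v:
        return False
    if q in n4(p):
        return True
    if q in nd(p):
        # blocked if N4(p) and N4(q) share an in-bounds pixel with value in v
        shared = n4(p) & n4(q)
        return all(img[pt] not in v for pt in shared
                   if 0 <= pt[0] < img.shape[0] and 0 <= pt[1] < img.shape[1])
    return False

img = np.array([[0, 1, 1],
                [0, 1, 0],
                [0, 0, 1]])
print(m_adjacent(img, (0, 1), (1, 1), {1}))   # 4-neighbors, so True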
A (digital) path (or curve) from pixel p with coordinates (x,y) to pixel q with
coordinates (s,t) is a sequence of distinct pixels with coordinates:
(x0, y0), (x1, y1), ..., (xn, yn)
where (x0, y0) = (x,y), (xn, yn) = (s,t), and pixels (xi, yi) and (xi-1, yi-1) are
adjacent for 1 ≤ i ≤ n.
In this case, n is the length of the path. If (x0, y0) = (xn, yn), the path is a
closed path. We define 4-, 8-, or m-paths depending on the type of adjacency
specified. For example, the paths shown in fig.(b) between the top right and
bottom right pixels are 8-paths, and the path in fig.(c) is an m-path.
Let S represent a subset of pixels in an image. Two pixels p and q are said
to be connected in S if there exists a path between them consisting entirely
of pixels in S. For any pixel p in S, the set of pixels that are connected to it in
S is called a connected component of S. If it only has one connected
component, then set S is called a connected set.
Example:
The two regions are adjacent only if 8-adjacency is used. A 4-path between
the two regions does not exist, so their union is not a connected set.
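The connected-component definition translates directly into a flood fill; the sketch
below labels the 4-connected components of a binary image with a breadth-first search
over N4 neighbors (assuming NumPy; the sample image is invented):

import numpy as np
from collections import deque

def label_components(img):
    # Label the 4-connected components of a binary image
    labels = np.zeros_like(img, dtype=int)
    current = 0
    for r, c in zip(*np.nonzero(img)):
        if labels[r, c]:
            continue                       # already assigned to a component
        current += 1
        labels[r, c] = current
        queue = deque([(r, c)])
        while queue:
            y, x = queue.popleft()
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if (0 <= ny < img.shape[0] and 0 <= nx < img.shape[1]
                        and img[ny, nx] and not labels[ny, nx]):
                    labels[ny, nx] = current
                    queue.append((ny, nx))
    return labels

img = np.array([[1, 1, 0],
                [0, 0, 1],
                [0, 0, 1]])
print(label_components(img))   # two components under 4-connectivity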
The boundary (also called the border or contour) of a region R is the set of
points that are adjacent to points in the complement of R. Said another way,
the border of a region is the set of pixels in the region that have at least one
background neighbor.
Example:
The point circled in the figure below is not a member of the border of the 1-
valued region if 4-connectivity is used between the region and its background.
As a rule, adjacency between points in a region and its background is defined
in terms of 8-connectivity to handle situations like this.
0 0 0 0 0
0 1 1 0 0
0 1 1 0 0
0 1 1 1 0
0 1 1 1 0
0 0 0 0 0
Distance Measures
For pixels p, q, and z, with coordinates (x,y),(s,t),and(v,w), respectively, D is a
distance function or metric if:
1. D(p,q) ≥ 0, with D(p,q) = 0 if and only if p = q,
2. D(p,q) = D(q,p), and
3. D(p,z) ≤ D(p,q) + D(q,z).
1) The Euclidean distance between p and q is defined as:
De(p,q) = [(x - s)² + (y - t)²]^(1/2)
For this distance measure, the pixels having a distance less than or equal to
some value r from (x,y) are the points contained in the disk of radius r
centered at (x,y).
2) The D4 distance (called the city-block distance) between p and q is
defined as:
D4(p,q) = |x - s| + |y - t|
In this case, the pixels having a D4 distance from (x,y) less than or equal to
some value r form a diamond centered at (x,y).
Example
The pixels with D4 distance ≤ 2 from (x,y) (the center point) form the
following contours of constant distance:

        2
    2   1   2
2   1   0   1   2
    2   1   2
        2

The pixels with D4 = 1 are the 4-neighbors of (x,y).
3) The D8 distance (called the chessboard distance) between p and q is
defined as:
D8(p,q) = max(|x - s|, |y - t|)
In this case, the pixels having a D8 distance from (x,y) less than or equal to
some value r form a square centered at (x,y).
Example
The pixels with D8 distance ≤ 2 from (x,y) (the center point) form the
following contours of constant distance:
2 2 2 2 2
2 1 1 1 2
2 1 0 1 2
2 1 1 1 2
2 2 2 2 2
The pixels with D8 = 1 are the 8-neighbors of (x,y).
Note
The D4 and D8 distances between p and q are independent of any paths that might
exist between the points, because these distances involve only the coordinates
of the points. A short sketch of these coordinate-based distances follows.
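Each of the three coordinate-based distances is a one-liner, as sketched below with
NumPy:

import numpy as np

p = np.array([0, 0])
q = np.array([3, 4])

d_e = np.sqrt(((p - q) ** 2).sum())   # Euclidean: 5.0
d_4 = np.abs(p - q).sum()             # D4, city-block: 7
d_8 = np.abs(p - q).max()             # D8, chessboard: 4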
4) The Dm distance between two points is defined as the shortest m-path
between the points.
In this case, the distance between two pixels will depend on the values of the
pixels along the path, as well as the values of their neighbors.
Example
Consider the following arrangement of pixels and assume that p, p2, and p4
have value 1 and that p1 and p3 can have a value of 0 or 1:

    p3  p4
p1  p2
p

Suppose we consider adjacency of pixels with value 1, i.e. V = {1}:
1) If p1 = 0 and p3 = 0, the Dm distance between p and p4 is 2 (the m-path
p, p2, p4).
2) If p1 = 1 (and p3 = 0), the Dm distance between p and p4 is 3 (the m-path
p, p1, p2, p4).
3) If p1 = 0 and p3 = 1, the Dm distance between p and p4 is 3 (the m-path
p, p2, p3, p4).
4) If p1 = 1 and p3 = 1, the Dm distance between p and p4 is 4 (the m-path
p, p1, p2, p3, p4).