Ipmv Notes
Unit 1
Image: An image may be defined as a two-dimensional function f(x, y), where x and y are spatial
coordinates and the amplitude of f at any pair of coordinates (x, y) is called the intensity or grey level of the
image at that point.
Digital Image: When x, y and the amplitude values of f are all finite, discrete quantities, we call the image a
digital image.
A digital image is composed of a finite number of elements.
These elements are known as pixels (picture elements).
Each pixel has a particular location and value.
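$$f(x,y) = \begin{bmatrix} f(0,0) & f(0,1) & \cdots & f(0,N-1) \\ \vdots & \vdots & \ddots & \vdots \\ f(M-1,0) & f(M-1,1) & \cdots & f(M-1,N-1) \end{bmatrix}$$
Example of a 3 X 3 image: $\begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 1 & 1 & 1 \end{bmatrix}$, here 0 – black & 1 – white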
- Computer: The computer in an image processing system is a general-purpose computer and can range from
a PC to a supercomputer. In dedicated applications, specially designed computers
are sometimes used to achieve a required level of performance.
- Software: Software for image processing consists of specialized modules that perform specific tasks.
- Mass storage: Mass storage capability is a must in image processing applications. Digital storage for image
processing applications falls into three principal categories: (1) short-term storage, (2) on-line
storage for relatively fast recall, (3) archival storage, characterized by infrequent access.
- Hardcopy devices: Devices used for recording images include laser printers, film cameras, heat-
sensitive devices, inkjet units and digital units such as optical and CD-ROM disks.
1. Image Acquisition:
- This is the first of the fundamental steps of digital image processing. Image
acquisition could be as simple as being given an image that is already in digital form. Generally,
the image acquisition stage involves pre-processing, such as scaling.
2. Image Enhancement:
- Image enhancement is among the simplest and most appealing areas of digital image
processing. Basically, the idea behind enhancement techniques is to bring out detail that is
obscured, or simply to highlight certain features of interest in an image, for example by changing
brightness and contrast.
3. Image Restoration:
- Image restoration is an area that also deals with improving the appearance of an image.
However, unlike enhancement, which is subjective, image restoration is objective, in the sense
that restoration techniques tend to be based on mathematical or probabilistic models of
image degradation.
7. Morphological Processing:
- Morphological processing deals with tools for extracting image components that are useful in
the representation and description of shape.
8. Segmentation:
- Segmentation procedures partition an image into its constituent parts or objects. In general,
autonomous segmentation is one of the most difficult tasks in digital image processing. A
rugged segmentation procedure brings the process a long way toward successful solution of
imaging problems that require objects to be identified individually.
- The use of a filter in front of a sensor improves selectivity. For example, a green (pass)
filter in front of a light sensor favours light in the green band of the color spectrum.
- As a consequence, the sensor output will be stronger for green light than for other
components in the visible spectrum.
- 2D image generated by displacement in x- and y directions between the sensor and
the area to be imaged
Fig. An example of the digital image acquisition process. (a)Energy (“illumination”) source.
(b) An element of a scene. (c) Imaging system. (d) Projection of the scene onto the image
plane. (e) Digitized image.
- This type of arrangement is found in digital cameras. A typical sensor for these
cameras is a CCD array, which can be manufactured with a broad range of sensing
properties and can be packaged in rugged arrays of 4000 * 4000 elements or more.
- The response of each sensor is proportional to the integral of the light energy
projected on to the surface of the sensor, a property that is used in astronomical and
other applications requiring low noise images.
- The first function performed by the imaging system in Fig.(c) is to collect the incoming
energy and focus it onto an image plane.
- If the illumination is light, the front end of the imaging system is a lens, which projects
the viewed scene onto the lens focal plane as Fig.(d) shows.
- The sensor array, which is coincident with the focal plane, produces outputs
proportional to the integral of the light received at each sensor.
- The output is a digital image, as shown diagrammatically in Fig.(e)
- Fig. (a) shows a continuous image f(x, y) that we want to convert into a digital image.
- It is continuous in the x and y coordinates and also in amplitude.
- We have to sample the function in both coordinates and in amplitude.
- Digitizing the coordinate values is called sampling; digitizing the amplitude values is called quantization.
- Consider the line segment AB in Fig. (b); it is a plot of the grey levels (amplitude) of the continuous
image along the line segment AB.
- Quantization: The sample values are represented by a finite set of integer values. This is
known as quantization.
Question: Justify “Quality of picture depends on the number of pixels & grey levels”.
- Every image seen on a screen is actually in matrix form. Each element of the matrix is called a pixel;
if the matrix is N X M, the total number of pixels is N X M.
$$f(x,y) = \begin{bmatrix} f(0,0) & f(0,1) & \cdots & f(0,N-1) \\ \vdots & \vdots & \ddots & \vdots \\ f(M-1,0) & f(M-1,1) & \cdots & f(M-1,N-1) \end{bmatrix}$$
- If N X M is large, the number of pixels is larger and the sampling rate is higher,
therefore we get better resolution (quality). The value of each pixel is known as its grey level.
- A computer understands only 0's and 1's. Hence these grey levels need to be represented in
terms of 0's and 1's.
- If we have two bits to represent the grey levels, only 4 different grey levels are available: 00, 01, 10, 11.
Here 00 is black, 11 is white and the remaining values are shades of grey. Similarly, if 8 bits are used for 1
pixel, 2^8 = 256 grey levels are available.
- So more bits give more grey levels & better resolution. The total size of the image is N X M X m bits, where m
is the number of bits used for 1 pixel.
- So we can say the quality of an image depends on the number of pixels & grey levels.
Question: Explain image sampling & quantization. A medical image has a size of 8 X 8
inches. The sampling resolution is 5 cycles/mm. How many pixels are required? Will an
image of size 256 X 256 be enough?
- 1 cycle/mm = 1 line pair /mm
- 1 line pair means 1 line white and 1 line black
- For 1 line pair at least we require 2 pixels/mm
- So 5 cycle/mm = 10 pixels/mm
- Size is 8 inch X 8 inch
- 1 inch = 25.4 mm
- 8 X 25.4= 203.2 mm
- 203.2 mm X 203.2 mm
- each mm there are 10 pixels
- total pixels= (2032 X 2032 )
- We require 2032 X 2032 pixels to represent image so 256 X256 pixels will not be enough to
represent the given image.
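The arithmetic above can be checked with a short sketch. The function name `pixels_per_side` is hypothetical (not from the notes), and the sketch assumes, as the notes do, that one cycle (line pair) needs at least 2 pixels.

```python
MM_PER_INCH = 25.4

def pixels_per_side(size_inches, cycles_per_mm):
    """Pixels needed along one side, assuming 2 pixels per cycle (line pair)."""
    pixels_per_mm = 2 * cycles_per_mm
    return round(size_inches * MM_PER_INCH * pixels_per_mm)

side = pixels_per_side(8, 5)
print(side, "x", side)                    # pixels required per side
print("256 x 256 enough?", 256 >= side)   # compare with the proposed size
```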
7. Isopreference Curves:
- We have seen the effect of reducing N & m in the previous topic. We still do not know the ideal
values of N & m for an image.
- T. S. Huang performed a number of experiments, varying the values of N & m simultaneously. Fig. a)
has a woman's face, b) a cameraman and c) a crowd of people.
- The results were drawn on a graph. Each curve on the graph represents one image. The
values on the x axis represent the spatial resolution N and the values on the y axis
represent bits per pixel (k). Such a curve is known as an isopreference curve.
- We conclude that, for more detailed images, the isopreference curves become more and
more vertical. It also means that for an image with a large amount of detail, very few grey
levels are needed.
8. Image types
1) Binary image / monochrome image:
A binary image, as its name states, contains only two pixel values: 0 and 1. Here 0 refers to black
and 1 refers to white. It is also known as a monochrome image.
2) Grey scale image:
It has 256 different shades of grey. It is commonly known as a grayscale image. The range of
values in 8 bits is 0-255, where 0 stands for black, 255 stands for white and 127
stands for mid grey.
3) Colour image (24 bit):
The 24 bit colour format is also known as the true colour format. In a 24 bit colour format, the 24 bits are
distributed among three channels: Red, Green and Blue.
- Since 24 divides equally into three groups of 8, the bits are distributed equally between the three
colour channels.
- The distribution is 8 bits for R, 8 bits for G and 8 bits for B. A 24 bit colour image
supports 2^24 = 16777216 different combinations of colours. A colour image can be converted to a grey scale
image using this equation
- X= 0.30 R + 0.59G + 0.11B.
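As a minimal sketch of this conversion, the weighted sum can be applied to every pixel of an RGB array. The function name `rgb_to_grey` and the 2 x 2 test image are illustrative assumptions; only the weights 0.30, 0.59 and 0.11 come from the equation above.

```python
import numpy as np

def rgb_to_grey(rgb):
    """Apply X = 0.30 R + 0.59 G + 0.11 B to an (H, W, 3) array."""
    weights = np.array([0.30, 0.59, 0.11])
    return rgb.astype(float) @ weights

# Hypothetical 2 x 2 image: pure red, pure green, pure blue and white pixels.
pixels = np.array([[[255, 0, 0], [0, 255, 0]],
                   [[0, 0, 255], [255, 255, 255]]], dtype=np.uint8)
grey = rgb_to_grey(pixels)
print(grey)
```

Green contributes most to the grey value and blue least, which mirrors the weights in the equation.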
This is an additive model, i.e. the colours present in the light add to form new colours, and it is
appropriate for the mixing of coloured light, for example. Red, green and blue are the primary
colours; they combine to form the three secondary colours yellow (red + green), cyan (blue + green) and
magenta (red + blue), and white (red + green + blue).
2. CMY model:
The CMYK color model (process color, four color) is a subtractive color model, used in color
printing, and is also used to describe the printing process itself. CMYK refers to the four inks
used in some color printing: cyan, magenta, yellow, and key (black).
C= 1-R
3. HSI Model:
Hue: Dominant colour as perceived by an observer.
Saturation: Amount of white light mixed with the hue.
Intensity: Brightness of the reflected light.
$$I = \frac{R+G+B}{3}$$
$$S = 1 - \frac{\min(R,G,B)}{I} = 1 - \frac{3}{R+G+B}\,\min(R,G,B)$$
The advantage of this model is that more bandwidth can be assigned to the Y-component
(luminance), to which the human eye is more sensitive than to colour information.
- A pixel p at coordinates (x, y) has four horizontal and vertical neighbors whose coordinates
are given by (x+1, y), (x-1, y), (x, y+1), (x, y-1). This set of pixels, called the 4-neighbors of
p, is denoted by N4(p). Each pixel is a unit distance from (x, y), and some of the neighbors
of p lie outside the digital image if (x, y) is on the border of the image.
- The four diagonal neighbors of p have coordinates (x+1, y+1), (x+1, y-1), (x-1, y+1), (x-1, y-1)
and are denoted by ND(p). These points, together with the 4-neighbors, are called the
8-neighbors of p, denoted by N8(p). As before, some of the points in ND(p) and N8(p)
fall outside the image if (x, y) is on the border of the image.
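The neighbourhood definitions translate directly into code. In this sketch the helper names `n4`, `nd`, `n8` and `inside` are hypothetical; `inside` illustrates how neighbours falling outside the image are discarded for border pixels.

```python
def n4(x, y):
    """4-neighbors of pixel (x, y)."""
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]

def nd(x, y):
    """Diagonal neighbors of pixel (x, y)."""
    return [(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)]

def n8(x, y):
    """8-neighbors: the 4-neighbors together with the diagonal neighbors."""
    return n4(x, y) + nd(x, y)

def inside(points, rows, cols):
    """Keep only the coordinates that lie inside a rows x cols image."""
    return [(i, j) for i, j in points if 0 <= i < rows and 0 <= j < cols]

print(inside(n4(0, 0), 3, 3))   # corner pixel keeps only two 4-neighbors
print(len(inside(n8(1, 1), 3, 3)))
```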
- Two pixels are connected if they are neighbours and their grey levels satisfy some
specified criterion of similarity.
- For example, in a binary image two pixels are connected if they are 4-neighbors and have
the same value (0/1).
4-adjacency: Two pixels p and q with values from V are 4- adjacent if q is in the set N4(p)
8-adjacency: Two pixels p and q with values from V are 8- adjacent if q is in the set N8(p).
m-adjacency: Two pixels p and q with values from V are m- adjacent if,
i)q is in N4(p)
- To determine whether two pixels are adjacent in some sense, let V be the set of grey level
values used to define connectivity; then two pixels p & q that have values from the set V are:
m-connected: if
i) q is in set of N4(p)
Here V= {1,2}
$D_E(p,q) = \left[(x_1 - x_2)^2 + (y_1 - y_2)^2\right]^{1/2}$
2) City block distance (D4 distance): If p and q are the two pixels with coordinates(x1,y1) and
(x2,y2) then
If p and q are the two pixels with coordinates(x1,y1) and (x2,y2) then
4) Dm Distance: This distance is measured based on m adjacency. Pixel p and q are m adjacent
if i) q is in set of N4(p)
Example : Let V = {0,1}. Compute DE , D4 ,D8, Dm distance between two pixels p and q let the
pixel coordinates of p and q be (3,0) and (2,3) respectively for the image shown. Find distance
Solution: V = {0,1} implies that the distance traversed can pass through 0 and 1.
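i) Euclidean Distance:
$D_E(p,q) = \sqrt{(3-2)^2 + (0-3)^2} = \sqrt{10} \approx 3.16$
ii) D4 Distance:
$D_4(p,q) = |3-2| + |0-3| = 4$
iii) D8 Distance:
$D_8(p,q) = \max(|3-2|, |0-3|) = \max(1,3) = 3$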
iv) Dm Distance: This distance is measured based on m adjacency. Pixel p and q are m adjacent if
i) q is in set of N4(p)
Dm (p,q) = 1+1+1+1= 4
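The three coordinate-based distances can be sketched as below for the pixel coordinates used in this example. The function names are hypothetical, and the straight-line formula is used for the Euclidean distance. The Dm distance is not included because it depends on the pixel values of the image, which are not reproduced here.

```python
import math

def euclidean(p, q):
    """Straight-line (Euclidean) distance between pixels p and q."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def city_block(p, q):
    """D4 (city block) distance."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def chessboard(p, q):
    """D8 (chessboard) distance."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

p, q = (3, 0), (2, 3)
print(euclidean(p, q), city_block(p, q), chessboard(p, q))
```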
Example: Let V = {2,4}. Compute D4, D8, Dm distance between two pixels p and q.
= Max(3,2)
iii) Dm Distance: This distance is measured based on m adjacency. Pixel p and q are m
adjacent if i) q is in set of N4(p)
Dm = 1.4+1+1+1= 4.4
Here ω= Ω T
sampling interval T = 1/fs
but we know f/ fs = k
so ω=2πk
ω/2π =k/N
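- The DFT of a finite-duration sequence x(n) is defined as
$X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j2\pi nk/N}$
k = 0, 1, 2, 3, ..., N-1
- With the twiddle factor $W_N = e^{-j2\pi/N}$ this is written as $X(k) = \sum_{n=0}^{N-1} x(n)\, W_N^{nk}$
- We know $X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j2\pi nk/N}$
- For N = 4 the twiddle factors $W_4^{nk}$ (rows k = 0, 1, 2, 3; columns n = 0, 1, 2, 3) form the matrix
$$\begin{bmatrix} W_4^0 & W_4^0 & W_4^0 & W_4^0 \\ W_4^0 & W_4^1 & W_4^2 & W_4^3 \\ W_4^0 & W_4^2 & W_4^4 & W_4^6 \\ W_4^0 & W_4^3 & W_4^6 & W_4^9 \end{bmatrix}$$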
1 1 1 1 0
1 −𝑗 −1 𝑗 1
So X(k) = [ ][ ]
1 −1 1 −1 2
1 𝑗 −1 −𝑗 3
X(k) = [4 −2 0 −2]
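Proof: we know
$F(k,l) = \sum_{m=0}^{N-1}\sum_{n=0}^{N-1} f(m,n)\, e^{-j2\pi mk/N}\, e^{-j2\pi nl/N}$
$= \sum_{m=0}^{N-1}\left(\sum_{n=0}^{N-1} f(m,n)\, e^{-j2\pi nl/N}\right) e^{-j2\pi mk/N}$
$= \sum_{m=0}^{N-1} F(m,l)\, e^{-j2\pi mk/N}$, where F(m,l) is the 1D DFT of row m
$= F(k,l)$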
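Proof: we know
$F(k+pN, l+qN) = \sum_{m=0}^{N-1}\sum_{n=0}^{N-1} f(m,n)\, e^{-j2\pi m(k+pN)/N}\, e^{-j2\pi n(l+qN)/N}$
$= \sum_{m=0}^{N-1}\sum_{n=0}^{N-1} f(m,n)\, e^{-j2\pi mk/N}\, e^{-j2\pi mpN/N}\, e^{-j2\pi nl/N}\, e^{-j2\pi nqN/N}$
Since m, n, p and q are integers, $e^{-j2\pi mp} = e^{-j2\pi nq} = 1$
So F(k+pN, l+qN) = F(k,l)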
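4) Convolution Property:
- Convolution in the spatial domain is equal to multiplication in the frequency domain.
$f(m,n) * g(m,n) \leftrightarrow F(k,l)\, G(k,l)$
Proof: we know the convolution definition
$f(m,n) * g(m,n) = \sum_{a=0}^{N-1}\sum_{b=0}^{N-1} f(a,b)\, g(m-a, n-b)$
LHS $= \mathcal{F}\{f(m,n) * g(m,n)\} = \sum_{m=0}^{N-1}\sum_{n=0}^{N-1}\left[\sum_{a=0}^{N-1}\sum_{b=0}^{N-1} f(a,b)\, g(m-a,n-b)\right] e^{-j2\pi mk/N}\, e^{-j2\pi nl/N}$
$= \sum_{m=0}^{N-1}\sum_{n=0}^{N-1}\sum_{a=0}^{N-1}\sum_{b=0}^{N-1} f(a,b)\, g(m-a,n-b)\, e^{-j2\pi (m-a+a)k/N}\, e^{-j2\pi (n-b+b)l/N}$
$= \left[\sum_{a=0}^{N-1}\sum_{b=0}^{N-1} f(a,b)\, e^{-j2\pi ak/N}\, e^{-j2\pi bl/N}\right]\left[\sum_{m=0}^{N-1}\sum_{n=0}^{N-1} g(m-a,n-b)\, e^{-j2\pi (m-a)k/N}\, e^{-j2\pi (n-b)l/N}\right]$
$= F(k,l)\, G(k,l)$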
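The property can be checked numerically with a small sketch. It assumes the convolution is circular, i.e. the indices m-a and n-b wrap around modulo N, which is what the DFT form of the property requires; the helper name `circular_convolve2d` and the random 4 x 4 test images are illustrative only.

```python
import numpy as np

def circular_convolve2d(f, g):
    """Direct 2D circular convolution: sum over a, b of f(a,b) g(m-a, n-b)."""
    rows, cols = f.shape
    out = np.zeros((rows, cols))
    for m in range(rows):
        for n in range(cols):
            for a in range(rows):
                for b in range(cols):
                    # Indices wrap modulo the image size (circular assumption).
                    out[m, n] += f[a, b] * g[(m - a) % rows, (n - b) % cols]
    return out

rng = np.random.default_rng(0)
f = rng.integers(0, 5, size=(4, 4)).astype(float)
g = rng.integers(0, 5, size=(4, 4)).astype(float)

lhs = np.fft.fft2(circular_convolve2d(f, g))   # DFT of the convolution
rhs = np.fft.fft2(f) * np.fft.fft2(g)          # product of the DFTs
print(np.allclose(lhs, rhs))
```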
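5) Correlation: Correlation gives the similarity between two signals. The DFT of the correlation of two
sequences x(n) & h(n) is X(-k)H(k).
$\mathrm{DFT}\left\{\sum_{n=0}^{N-1} x(n)\, h(n+m)\right\} = \sum_{m=0}^{N-1}\left\{\sum_{n=0}^{N-1} x(n)\, h(n+m)\right\} e^{-j2\pi mk/N}$
$= \sum_{n=0}^{N-1} x(n)\, e^{-j2\pi (-n)k/N} \sum_{m=0}^{N-1} h(n+m)\, e^{-j2\pi (n+m)k/N}$
$= X(-k)\,H(k)$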
6) Scaling property: It is basically used to increase & decrease the size of an image.
DFT{f(am,bn)} = F(k/a, l/b)
$\mathcal{F}[f(am,bn)] = \sum_{m=0}^{N-1}\sum_{n=0}^{N-1} f(am,bn)\, e^{-j2\pi mk/N}\, e^{-j2\pi nl/N}$
$= \sum_{m=0}^{N-1}\sum_{n=0}^{N-1} f(am,bn)\, e^{-j2\pi m(a/a)k/N}\, e^{-j2\pi n(b/b)l/N}$
$= \sum_{m=0}^{N-1}\sum_{n=0}^{N-1} f(am,bn)\, e^{-j2\pi (am)(k/a)/N}\, e^{-j2\pi (bn)(l/b)/N}$
$= F(k/a, l/b)$
7) Conjugate symmetry:
$f(m,n) \leftrightarrow F(k,l)$
$f^*(m,n) \leftrightarrow F^*(-k,-l)$
$F(k,l) = \sum_{m=0}^{N-1}\sum_{n=0}^{N-1} f(m,n)\, e^{-j2\pi mk/N}\, e^{-j2\pi nl/N}$
F*(-k,-l) =f*(k,l)
8) Orthogonality:
$\sum_{m}\sum_{n} a_{k,l}(m,n)\, a^*_{k',l'}(m,n) = \delta(k-k', l-l')$
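9) Multiplication by exponential:
$e^{j2\pi mk_0/N}\, e^{j2\pi nl_0/N} f(m,n) \leftrightarrow F(k-k_0, l-l_0)$
Proof: We know
$\mathrm{DFT}\{e^{j2\pi mk_0/N}\, e^{j2\pi nl_0/N} f(m,n)\} = \sum_{m=0}^{N-1}\sum_{n=0}^{N-1} f(m,n)\, e^{-j2\pi mk/N}\, e^{j2\pi mk_0/N}\, e^{-j2\pi nl/N}\, e^{j2\pi nl_0/N}$
$= \sum_{m=0}^{N-1}\sum_{n=0}^{N-1} f(m,n)\, e^{-j2\pi m(k-k_0)/N}\, e^{-j2\pi n(l-l_0)/N}$
$= F(k-k_0, l-l_0)$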
Example: Compute the 2D DFT of the given 3 * 3 image.
1 −1 1
[−1 1 1]
1 1 1
Now apply F= T f T
1 1 1 1 −1 1
- First find Tf =[1 −0.5 − 0.886𝑗 −0.5 + 0.886𝑗] [−1 1 1]
1 −0.5 + 0.886𝑗 −0.5 − 0.886𝑗 1 1 1
3 6 9
= [ 0 0 0]
0 0 0
3 6 9 1 1 1
Now TfT =[0 0 0] [1 −0.5 − 0.886𝑗 −0.5 + 0.886𝑗]
0 0 0 1 −0.5 + 0.886𝑗 −0.5 − 0.886𝑗
$$f = \begin{bmatrix} 1&1&1&1 \\ 1&1&1&1 \\ 1&1&1&1 \\ 1&1&1&1 \end{bmatrix}, \quad T = \begin{bmatrix} 1&1&1&1 \\ 1&-j&-1&j \\ 1&-1&1&-1 \\ 1&j&-1&-j \end{bmatrix}$$
$$F = T f T = \begin{bmatrix} 16&0&0&0 \\ 0&0&0&0 \\ 0&0&0&0 \\ 0&0&0&0 \end{bmatrix}$$
- The image is recovered with 1/16 [T][F][T]:
$$\frac{1}{16}\, T \begin{bmatrix} 16&0&0&0 \\ 0&0&0&0 \\ 0&0&0&0 \\ 0&0&0&0 \end{bmatrix} T = \begin{bmatrix} 1&1&1&1 \\ 1&1&1&1 \\ 1&1&1&1 \\ 1&1&1&1 \end{bmatrix}$$
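Example: Find the 2D DFT of the following image.
$$f(x,y) = \begin{bmatrix} 0&1&2&1 \\ 1&2&3&2 \\ 2&3&4&3 \\ 1&2&3&2 \end{bmatrix}$$
Solution: F = T f T
$$Tf = \begin{bmatrix} 1&1&1&1 \\ 1&-j&-1&j \\ 1&-1&1&-1 \\ 1&j&-1&-j \end{bmatrix} \begin{bmatrix} 0&1&2&1 \\ 1&2&3&2 \\ 2&3&4&3 \\ 1&2&3&2 \end{bmatrix} = \begin{bmatrix} 4&8&12&8 \\ -2&-2&-2&-2 \\ 0&0&0&0 \\ -2&-2&-2&-2 \end{bmatrix}$$
$$TfT = \begin{bmatrix} 4&8&12&8 \\ -2&-2&-2&-2 \\ 0&0&0&0 \\ -2&-2&-2&-2 \end{bmatrix} \begin{bmatrix} 1&1&1&1 \\ 1&-j&-1&j \\ 1&-1&1&-1 \\ 1&j&-1&-j \end{bmatrix} = \begin{bmatrix} 32&-8&0&-8 \\ -8&0&0&0 \\ 0&0&0&0 \\ -8&0&0&0 \end{bmatrix}$$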
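The matrix form F = T f T can be reproduced with a few lines of NumPy. This is only a sketch: the helper name `dft_matrix` is an assumption, and `np.fft.fft2` is used purely as a cross-check of the result.

```python
import numpy as np

def dft_matrix(N):
    """N x N DFT kernel with entries exp(-j 2 pi n k / N)."""
    n = np.arange(N)
    return np.exp(-2j * np.pi * np.outer(n, n) / N)

f = np.array([[0, 1, 2, 1],
              [1, 2, 3, 2],
              [2, 3, 4, 3],
              [1, 2, 3, 2]], dtype=float)

T = dft_matrix(4)
F = T @ f @ T                       # 2D DFT in matrix form
print(np.round(F.real).astype(int))
```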
$$f(x,y) = \begin{bmatrix} 0&1&2&1 \\ 1&2&3&2 \\ 2&3&4&3 \\ 1&2&3&2 \end{bmatrix}$$
Solution: we know $T = \begin{bmatrix} 1&1&1&1 \\ 1&-j&-1&j \\ 1&-1&1&-1 \\ 1&j&-1&-j \end{bmatrix}$
We shall use the DFT along the rows and then along the columns.
DFT of first row: $T\,[0\ 1\ 2\ 1]^T = [4\ {-2}\ 0\ {-2}]^T$
DFT of second row: $T\,[1\ 2\ 3\ 2]^T = [8\ {-2}\ 0\ {-2}]^T$
DFT of third row: $T\,[2\ 3\ 4\ 3]^T = [12\ {-2}\ 0\ {-2}]^T$
DFT of fourth row: $T\,[1\ 2\ 3\ 2]^T = [8\ {-2}\ 0\ {-2}]^T$
Hence we have an intermediate stage
$$\begin{bmatrix} 4&-2&0&-2 \\ 8&-2&0&-2 \\ 12&-2&0&-2 \\ 8&-2&0&-2 \end{bmatrix}$$
Now using the 1D DFT along the columns of this intermediate image we get
DFT of first column: $T\,[4\ 8\ 12\ 8]^T = [32\ {-8}\ 0\ {-8}]^T$
DFT of second column: $T\,[-2\ {-2}\ {-2}\ {-2}]^T = [-8\ 0\ 0\ 0]^T$
DFT of third column: $T\,[0\ 0\ 0\ 0]^T = [0\ 0\ 0\ 0]^T$
DFT of fourth column: $T\,[-2\ {-2}\ {-2}\ {-2}]^T = [-8\ 0\ 0\ 0]^T$
The final DFT of the entire image is
$$\begin{bmatrix} 32&-8&0&-8 \\ -8&0&0&0 \\ 0&0&0&0 \\ -8&0&0&0 \end{bmatrix}$$
32 1024 160
(4 point DIT-FFT)
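Example: Find the DIT-FFT of the given input image
$$\begin{bmatrix} 0&1&2&1 \\ 1&2&3&2 \\ 2&3&4&3 \\ 1&2&3&2 \end{bmatrix}$$
- The image is 4 * 4, so we need a 4-point butterfly diagram. First take the DFT along the rows, then along the columns.
(Fig. 4-point DIT-FFT butterfly diagram)
- Applying the butterfly to each row gives the matrix X
$$X = \begin{bmatrix} 4&-2&0&-2 \\ 8&-2&0&-2 \\ 12&-2&0&-2 \\ 8&-2&0&-2 \end{bmatrix}$$
- Repeat this process with each individual column of the above matrix X; the final output matrix is
$$\begin{bmatrix} 32&-8&0&-8 \\ -8&0&0&0 \\ 0&0&0&0 \\ -8&0&0&0 \end{bmatrix}$$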
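The row-then-column procedure can be sketched with a recursive radix-2 decimation-in-time FFT. The function name `fft_dit` is hypothetical, and the sketch assumes the length of each row and column is a power of 2, as it is for this 4 x 4 image.

```python
import numpy as np

def fft_dit(x):
    """Recursive radix-2 DIT FFT; len(x) is assumed to be a power of 2."""
    x = np.asarray(x, dtype=complex)
    N = len(x)
    if N == 1:
        return x
    even = fft_dit(x[0::2])          # FFT of the even-indexed samples
    odd = fft_dit(x[1::2])           # FFT of the odd-indexed samples
    twiddle = np.exp(-2j * np.pi * np.arange(N // 2) / N)
    # Butterfly: combine the two half-length transforms.
    return np.concatenate([even + twiddle * odd, even - twiddle * odd])

image = np.array([[0, 1, 2, 1],
                  [1, 2, 3, 2],
                  [2, 3, 4, 3],
                  [1, 2, 3, 2]])

rows = np.array([fft_dit(r) for r in image])      # FFT of every row
F = np.array([fft_dit(c) for c in rows.T]).T      # then FFT of every column
print(np.round(F.real).astype(int))
```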
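- $\alpha(u) = \sqrt{\frac{2}{N}}$, 1 ≤ u ≤ N-1 (and $\alpha(0) = \sqrt{\frac{1}{N}}$, as the first row of the table below shows)
- $c(u,v) = \sqrt{\frac{2}{N}} \cos\left[\frac{\pi(2v+1)u}{2N}\right]$, 1 ≤ u ≤ N-1 & 0 ≤ v ≤ N-1
- For N = 4 (rows u = 0, 1, 2, 3; columns v = 0, 1, 2, 3):
$$c(u,v) = \begin{bmatrix} 0.5&0.5&0.5&0.5 \\ 0.653&0.270&-0.270&-0.653 \\ 0.5&-0.5&-0.5&0.5 \\ 0.270&-0.653&0.653&-0.270 \end{bmatrix}$$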
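The kernel table can be regenerated with the sketch below. The function name `dct_matrix` is hypothetical. The scale factor sqrt(1/N) for the row u = 0 is an assumption taken from the standard DCT definition; it is consistent with the 0.5 entries in the first row of the table.

```python
import numpy as np

def dct_matrix(N):
    """N x N DCT kernel c(u, v); row u = 0 is assumed to use sqrt(1/N)."""
    c = np.zeros((N, N))
    for u in range(N):
        scale = np.sqrt(1.0 / N) if u == 0 else np.sqrt(2.0 / N)
        for v in range(N):
            c[u, v] = scale * np.cos(np.pi * (2 * v + 1) * u / (2 * N))
    return c

C = dct_matrix(4)
print(np.round(C, 3))
print(np.allclose(C @ C.T, np.eye(4)))   # the kernel is orthogonal
```

Because the kernel is orthogonal, the 2D transform of an image f can be written as C f C' and inverted as C' F C.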
- Now F= cf
0.5 0.5 0.5 0.5 1
0.653 0.270 -0.270 -0.653 2
0.5 -0.5 -0.5 0.5 [ ]
0.270 -0.653 0.653 -0.270 7
F= [ ]
2 4 4 2
4 6 8 3
2 8 10 4
3 8 6 2
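Example: Write the expression for a two-dimensional DCT. Also find the DCT of the given 4 * 4 image
$$\begin{bmatrix} 1&2&2&1 \\ 2&1&2&1 \\ 1&2&2&1 \\ 2&1&2&1 \end{bmatrix}$$
- $F(u,v) = \alpha(u)\,\alpha(v) \sum_{x=0}^{N-1}\sum_{y=0}^{N-1} f(x,y) \cos\left[\frac{\pi(2x+1)u}{2N}\right] \cos\left[\frac{\pi(2y+1)v}{2N}\right]$ ; x, y = 0, 1, 2, ..., N-1
- The given input matrix is not symmetric, so we have to use F = c f c'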
0 0.3827 −1 0.9239
0 −0.1464 −0.3827 −0.3536
F= [ ]
0 0 0 0
0 −0.3536 −0.9239 −0.8536
2.1.6 KL Transform
Question: write short note on KL transform.
Question: Find the KL transform of the following image $\begin{bmatrix} 4 & -2 \\ -1 & 3 \end{bmatrix}$
- In an image, neighbouring pixels are highly correlated with the centre pixel. A compression
algorithm applied directly to all pixel values therefore affects the picture quality directly.
- With the KL transform, compression is applied to decorrelated data, so the image quality is better
compared with other compression algorithms.
- The KL transform is computed in the steps given below.
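- Step 1: Formation of vectors from the given matrix.
- Suppose matrix $X = \begin{bmatrix} 4 & -2 \\ -1 & 3 \end{bmatrix}$ so $X_0 = \begin{bmatrix} 4 \\ -1 \end{bmatrix}$ & $X_1 = \begin{bmatrix} -2 \\ 3 \end{bmatrix}$
- Step 2: Determination of covariance matrix.
- For covariance $cov(x) = E[XX^T] - \bar{X}\bar{X}^T$
- Here $\bar{X}$ is the mean vector.
- $\bar{X} = \frac{1}{M}\sum_{k=0}^{M-1} X_k$, M is the number of vectors in X
- So $\bar{X} = \frac{1}{2}\sum_{k=0}^{1} X_k = \frac{1}{2}\{X_0 + X_1\} = \frac{1}{2}\left\{\begin{bmatrix} 4 \\ -1 \end{bmatrix} + \begin{bmatrix} -2 \\ 3 \end{bmatrix}\right\} = \frac{1}{2}\begin{bmatrix} 2 \\ 2 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$
- Now $\bar{X}\bar{X}^T = \begin{bmatrix} 1 \\ 1 \end{bmatrix}\begin{bmatrix} 1 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$
- $E[XX^T] = \frac{1}{M}\sum_{k=0}^{M-1} X_k X_k^T = \frac{1}{2}\sum_{k=0}^{1} X_k X_k^T = \frac{1}{2}\left[(X_0 X_0^T) + (X_1 X_1^T)\right]$
$= \frac{1}{2}\left\{\begin{bmatrix} 4 \\ -1 \end{bmatrix}\begin{bmatrix} 4 & -1 \end{bmatrix} + \begin{bmatrix} -2 \\ 3 \end{bmatrix}\begin{bmatrix} -2 & 3 \end{bmatrix}\right\}$
$= \frac{1}{2}\left\{\begin{bmatrix} 16 & -4 \\ -4 & 1 \end{bmatrix} + \begin{bmatrix} 4 & -6 \\ -6 & 9 \end{bmatrix}\right\}$
$= \frac{1}{2}\begin{bmatrix} 20 & -10 \\ -10 & 10 \end{bmatrix} = \begin{bmatrix} 10 & -5 \\ -5 & 5 \end{bmatrix}$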
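- Step 3: Determination of eigenvalues of the covariance matrix $cov(x) = \begin{bmatrix} 10 & -5 \\ -5 & 5 \end{bmatrix} - \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 9 & -6 \\ -6 & 4 \end{bmatrix}$
- $\left|\begin{bmatrix} 9 & -6 \\ -6 & 4 \end{bmatrix} - \begin{bmatrix} \lambda & 0 \\ 0 & \lambda \end{bmatrix}\right| = 0$
- $\begin{vmatrix} 9-\lambda & -6 \\ -6 & 4-\lambda \end{vmatrix} = 0$
- $(9-\lambda)(4-\lambda) - 36 = 0$
- $36 - 4\lambda - 9\lambda + \lambda^2 - 36 = 0$
- $\lambda^2 - 13\lambda = 0$
- $\lambda(\lambda - 13) = 0$
- So $\lambda_0 = 0$ & $\lambda_1 = 13$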
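- Step 4: Determination of eigenvectors of the covariance matrix.
- First eigenvector $\phi_0$
- $[cov(x) - \lambda_0 I]\,\phi_0 = 0$
- $\left(\begin{bmatrix} 9 & -6 \\ -6 & 4 \end{bmatrix} - (0)\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\right)\begin{bmatrix} \phi_{00} \\ \phi_{01} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$
- $\begin{bmatrix} 9 & -6 \\ -6 & 4 \end{bmatrix}\begin{bmatrix} \phi_{00} \\ \phi_{01} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$
- Take $\phi_{01} = 1$
- $9\phi_{00} - 6\phi_{01} = 0$
- $9\phi_{00} - 6(1) = 0$
- $\phi_{00} = 6/9 = 0.66$
- So eigenvector $\phi_0 = \begin{bmatrix} 0.66 \\ 1 \end{bmatrix}$
- Similarly find the eigenvector $\phi_1$
- $[cov(x) - \lambda_1 I]\,\phi_1 = 0$
- $\left(\begin{bmatrix} 9 & -6 \\ -6 & 4 \end{bmatrix} - (13)\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\right)\begin{bmatrix} \phi_{10} \\ \phi_{11} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$
- $\left(\begin{bmatrix} 9 & -6 \\ -6 & 4 \end{bmatrix} - \begin{bmatrix} 13 & 0 \\ 0 & 13 \end{bmatrix}\right)\begin{bmatrix} \phi_{10} \\ \phi_{11} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$
- $\begin{bmatrix} -4 & -6 \\ -6 & -9 \end{bmatrix}\begin{bmatrix} \phi_{10} \\ \phi_{11} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$
- Take $\phi_{11} = 1$
- $-4\phi_{10} - 6\phi_{11} = 0$
- $-4\phi_{10} - 6(1) = 0$
- $\phi_{10} = -6/4 = -1.5$
- So eigenvector $\phi_1 = \begin{bmatrix} -1.5 \\ 1 \end{bmatrix}$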
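- Step 5: Normalization of the eigenvectors.
- $\frac{\phi_0}{\|\phi_0\|} = \frac{1}{\sqrt{(0.66)^2 + (1)^2}}\begin{bmatrix} 0.66 \\ 1 \end{bmatrix} = (0.83)\begin{bmatrix} 0.66 \\ 1 \end{bmatrix} = \begin{bmatrix} 0.55 \\ 0.83 \end{bmatrix}$
- Similarly, normalization of the second eigenvector:
- $\frac{\phi_1}{\|\phi_1\|} = \frac{1}{\sqrt{\phi_{10}^2 + \phi_{11}^2}}\begin{bmatrix} \phi_{10} \\ \phi_{11} \end{bmatrix} = \frac{1}{\sqrt{(-1.5)^2 + (1)^2}}\begin{bmatrix} -1.5 \\ 1 \end{bmatrix} = (0.55)\begin{bmatrix} -1.5 \\ 1 \end{bmatrix} = \begin{bmatrix} -0.83 \\ 0.55 \end{bmatrix}$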
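- Step 6: KL transform matrix from the eigenvectors of the covariance matrix
$T = \begin{bmatrix} 0.55 & -0.83 \\ 0.83 & 0.55 \end{bmatrix}$
- We have to check whether this matrix is unitary:
$TT^T = \begin{bmatrix} 0.55 & -0.83 \\ 0.83 & 0.55 \end{bmatrix}\begin{bmatrix} 0.55 & 0.83 \\ -0.83 & 0.55 \end{bmatrix} = \begin{bmatrix} 0.99 & 0 \\ 0 & 0.99 \end{bmatrix} \approx \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$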
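- Step 7: KL transformation of the input matrix
- Y = T[X]
- $Y_0 = T[X_0] = \begin{bmatrix} 0.55 & -0.83 \\ 0.83 & 0.55 \end{bmatrix}\begin{bmatrix} 4 \\ -1 \end{bmatrix} = \begin{bmatrix} 2.2 + 0.83 \\ 3.32 - 0.55 \end{bmatrix} = \begin{bmatrix} 3.03 \\ 2.77 \end{bmatrix}$
- $Y_1 = T[X_1] = \begin{bmatrix} 0.55 & -0.83 \\ 0.83 & 0.55 \end{bmatrix}\begin{bmatrix} -2 \\ 3 \end{bmatrix} = \begin{bmatrix} -1.10 - 2.49 \\ -1.66 + 1.65 \end{bmatrix} = \begin{bmatrix} -3.59 \\ -0.01 \end{bmatrix}$
$Y = \begin{bmatrix} 3.03 & -3.59 \\ 2.77 & -0.01 \end{bmatrix}$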
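The covariance and eigen-analysis steps can be checked with NumPy. This is a sketch under stated assumptions: `np.linalg.eigh` returns unit-length eigenvectors whose sign is arbitrary, and the transform matrix here places the eigenvectors in its rows (a common convention, which may differ from the arrangement of T used in the worked example), so the individual entries of Y need not match the hand calculation. With this arrangement the transformed vectors are decorrelated, which the last lines verify.

```python
import numpy as np

X = np.array([[4.0, -2.0],
              [-1.0, 3.0]])            # columns are the vectors X0 and X1
M = X.shape[1]

mean = X.mean(axis=1, keepdims=True)                   # mean vector
cov = (X @ X.T) / M - mean @ mean.T                    # E[XX^T] - mean mean^T
eigvals, eigvecs = np.linalg.eigh(cov)                 # ascending eigenvalues

T = eigvecs.T            # assumption: eigenvectors placed in the rows of T
Y = T @ X                # KL transform of both vectors at once

# Covariance of the transformed data should be diagonal (decorrelated).
mean_y = Y.mean(axis=1, keepdims=True)
cov_y = (Y @ Y.T) / M - mean_y @ mean_y.T
print(cov)
print(np.round(eigvals, 6))
print(np.round(cov_y, 6))
```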
- The KL transform is based on the statistical properties of the image and has several important
properties that make it useful for image processing, particularly image compression.
- Since data from neighbouring pixels in an image is highly correlated, image compression without
ruining the subjective quality of the image becomes quite challenging.
- By decorrelating this data, more data compression can be achieved. It is advantageous to remove
redundancies from a decorrelated data sequence. The KL transform performs this task of
decorrelating the data.
- The KL transform is used in clustering analysis to determine a new coordinate system for sample
data where the largest variance of a projection of the data lies on the first axis & so on.
- Because these axes are orthogonal, this approach allows the dimensionality of the data set to be reduced
by eliminating the coordinate axes with small variances.
- This data reduction technique is known as Principal Component Analysis (PCA).
- The Hadamard transform is based on the Hadamard matrix, which is a square array having entries
of +1 & -1.
- The Hadamard matrix of order 2 is given by $H(2) = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$
- It is an orthogonal matrix.
- For normalization we have to multiply the matrix by a constant factor.
- Hadamard matrices of order $2^n$ can be recursively generated through the Kronecker product.
$H(2^n) = H(2) \otimes H(2^{n-1})$
Suppose n = 1
$H(2) = H(2) \otimes H(2^0)$
$H(2) = H(2)$
Suppose n = 2
$H(2^2) = H(2) \otimes H(2^1)$
From the Kronecker product we get
$$H(4) = \begin{bmatrix} 1\cdot H(2) & 1\cdot H(2) \\ 1\cdot H(2) & -1\cdot H(2) \end{bmatrix} = \begin{bmatrix} 1&1&1&1 \\ 1&-1&1&-1 \\ 1&1&-1&-1 \\ 1&-1&-1&1 \end{bmatrix}$$
- If x(n) is an N-point sequence of finite-valued real numbers arranged in a column, then the
Hadamard transformed sequence is given by
X = T x
X(n) = H(N) x(n)
- H(N) is an N x N matrix & x(n) is the data sequence.
- The inverse Hadamard transform is given by x(n) = 1/N [H(N) X(n)]
- If f is an N x N image & F is the transformed image, the Hadamard transform is given by
F = T f T
F = H(N) f H(N)
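Example: Compute the Hadamard transform of the data sequence {1, 2, 0, 3}.
$$f = \begin{bmatrix} 2&1&2&1 \\ 1&2&3&2 \\ 2&3&4&3 \\ 1&2&3&2 \end{bmatrix}$$
F = H(N) f H(N)
$$F = \begin{bmatrix} 1&1&1&1 \\ 1&-1&1&-1 \\ 1&1&-1&-1 \\ 1&-1&-1&1 \end{bmatrix} \begin{bmatrix} 2&1&2&1 \\ 1&2&3&2 \\ 2&3&4&3 \\ 1&2&3&2 \end{bmatrix} \begin{bmatrix} 1&1&1&1 \\ 1&-1&1&-1 \\ 1&1&-1&-1 \\ 1&-1&-1&1 \end{bmatrix}$$
$$F = \begin{bmatrix} 6&8&12&8 \\ 2&0&0&0 \\ 0&-2&-2&-2 \\ 0&-2&-2&-2 \end{bmatrix} \begin{bmatrix} 1&1&1&1 \\ 1&-1&1&-1 \\ 1&1&-1&-1 \\ 1&-1&-1&1 \end{bmatrix}$$
$$F = \begin{bmatrix} 34&2&-6&-6 \\ 2&2&2&2 \\ -6&2&2&2 \\ -6&2&2&2 \end{bmatrix}$$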
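A short sketch can generate H(4) recursively with the Kronecker product and apply it to both the data sequence {1, 2, 0, 3} and the 4 x 4 image. The function name `hadamard` is hypothetical and no normalization factor is applied, matching the unnormalized matrices used above.

```python
import numpy as np

def hadamard(n):
    """Unnormalized Hadamard matrix of order 2^n via H(2^n) = H(2) kron H(2^(n-1))."""
    H2 = np.array([[1, 1], [1, -1]])
    H = np.array([[1]])
    for _ in range(n):
        H = np.kron(H2, H)
    return H

H4 = hadamard(2)

x = np.array([1, 2, 0, 3])
X = H4 @ x                       # 1D transform of the data sequence

f = np.array([[2, 1, 2, 1],
              [1, 2, 3, 2],
              [2, 3, 4, 3],
              [1, 2, 3, 2]])
F = H4 @ f @ H4                  # 2D transform F = H f H

print(X)
print(F)
print((H4 @ X) // 4)             # inverse transform recovers the sequence
```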
2.1.8 The Haar Transform
- The Haar transform is based on a class of orthogonal matrices whose elements are either 1, -1 or 0,
multiplied by powers of √2.
- Algorithm to generate the Haar basis
- Step 1: Determine the order N of the Haar basis.
- Step 2: Determine n, where n = log2 N.
- Step 3: Determine p & q.
- Step 4: Determine k.
$k = 2^p + q - 1$
- Step 5: Determine z.
$z \in [0,1] \Rightarrow \{0/N, 1/N, \ldots, (N-1)/N\}$
- Step 6: If k = 0 then $H(z) = 1/\sqrt{N}$, otherwise
$$H_k(z) = \frac{1}{\sqrt{N}} \begin{cases} +2^{p/2}, & \frac{q-1}{2^p} \le z < \frac{q - 1/2}{2^p} \\ -2^{p/2}, & \frac{q - 1/2}{2^p} \le z < \frac{q}{2^p} \\ 0, & \text{otherwise} \end{cases}$$
- Suppose N = 2
- Step 1: N = 2
- Step 2: n = log2 2 = 1.
- Step 3: i) Since n = 1, the only value of p is 0.
ii) So q takes the values 0 & 1.
Step 4: Determine the value of k using the formula $k = 2^p + q - 1$
p q k
0 0 0
0 1 1
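$$f = \begin{bmatrix} 2&1&2&1 \\ 1&2&3&2 \\ 2&3&4&3 \\ 1&2&3&2 \end{bmatrix}$$
$$F = \frac{1}{\sqrt{4}} \begin{bmatrix} 1&1&1&1 \\ 1&1&-1&-1 \\ \sqrt{2}&-\sqrt{2}&0&0 \\ 0&0&\sqrt{2}&-\sqrt{2} \end{bmatrix} \begin{bmatrix} 2&1&2&1 \\ 1&2&3&2 \\ 2&3&4&3 \\ 1&2&3&2 \end{bmatrix} \frac{1}{\sqrt{4}} \begin{bmatrix} 1&1&\sqrt{2}&0 \\ 1&1&-\sqrt{2}&0 \\ 1&-1&0&\sqrt{2} \\ 1&-1&0&-\sqrt{2} \end{bmatrix}$$
$$F = \frac{1}{\sqrt{4}} \begin{bmatrix} 6&8&12&8 \\ 0&-2&-2&-2 \\ \sqrt{2}&-\sqrt{2}&-\sqrt{2}&-\sqrt{2} \\ \sqrt{2}&\sqrt{2}&\sqrt{2}&\sqrt{2} \end{bmatrix} \frac{1}{\sqrt{4}} \begin{bmatrix} 1&1&\sqrt{2}&0 \\ 1&1&-\sqrt{2}&0 \\ 1&-1&0&\sqrt{2} \\ 1&-1&0&-\sqrt{2} \end{bmatrix}$$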
- Change the order of lyrics
- The Fourier transform gives a similar output for both because it does not give time information, so this
problem is overcome by using the STFT (short-time Fourier transform).
- The drawback of the STFT is that once we choose a particular window size, it remains the same for all frequencies.
- Many signals need a more flexible approach in which the window size can vary.
- This is known as multi-resolution analysis, which is given by the wavelet transform.
- Wave: a wave is an oscillating function of time or space that is periodic, i.e. an infinite-length
continuous function.
- Wavelet: a wavelet is a waveform of effectively limited duration that has an average value of zero.
- $\psi(x)$ is called a wavelet if it has the following properties
- I) $\int_{-\infty}^{\infty} \psi(x)\,dx = 0$
- II) $\int_{-\infty}^{\infty} |\psi(x)|^2\,dx < \infty$
- III) $C = \int_{-\infty}^{\infty} \frac{|\Psi(\omega)|^2}{|\omega|}\,d\omega < \infty$, where $\Psi(\omega)$ is the Fourier transform of $\psi(x)$
- There are classified into 2 categories.
- I) Discrete wavelet Transform(DWT)
- II) Continuous wavelet Transform(CWT)
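- The CWT is given by
- $W_f(a,b) = \frac{1}{\sqrt{|a|}} \int_{-\infty}^{\infty} x(t)\, \psi^*\!\left[\frac{t-b}{a}\right] dt$
- a is the scaling parameter; it gives the frequency information in the wavelet transform.
- b is the shifting parameter; it gives the time information, as it indicates the location of the window which
is shifted through the signal.
- The expression for the 2D CWT of an image f(x, y) is given by
- $\frac{1}{\sqrt{|ab|}} \int\!\!\int_{-\infty}^{\infty} f(x,y)\, \psi^*\!\left[\frac{x-m}{a}, \frac{y-n}{b}\right] dx\, dy$
- Where m, n are the shifting parameters & a, b are the scaling parameters.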
(Fig. One-level wavelet decomposition: high pass branch followed by downsampling by 2, 500 samples per branch)
- The size of the image is N * N.
- At the first stage we convolve the rows of the image with h(n) & g(n) & discard alternate columns
(downsample by 2).
- The columns of each of the N/2 * N data sets are then convolved with h(n) & g(n) & alternate rows are discarded.
- The result of the entire operation gives N/2 * N/2 samples.
- The upper-left square represents the smooth information (a blurred version of the image).
- The other square represents detailed information (edges) in different directions & at different
- We can also reconstruct original image using reverse process.
Unit 3 Image Enhancement
- Image enhancement is one of the first steps in image processing.
- In this technique the original image is processed so that the resultant image is more
suitable for specific application.
- It is Subjective processing technique. Subjective means result may vary from person
to person.
- It does not add any extra information to the original image.
- It can be done in two domains:
1. The Spatial domain
2. The Frequency domain
- Suppose f(x,y) be original image where f can take values from 0-255.
- The modified image can be expressed as:
s = T(r)
r = input pixel Intensity
s = output pixel Intensity
T is a function
1. Digital Negative
𝑺 = (𝑳 − 𝟏) − 𝒓
L is number of Grey levels
- In this case, L=256, thus, S = (256 – 1) – r
- So we can write, S = 255- r
- Thus, S = 255 for r=0
2. Contrast Stretching
- Contrast stretching increases the contrast of an image by making the dark portions
darker and the bright portions brighter.
$$s = \begin{cases} \alpha r, & 0 \le r < r_1 \\ \beta (r - r_1) + s_1, & r_1 \le r < r_2 \\ \gamma (r - r_2) + s_2, & r_2 \le r \le L-1 \end{cases}$$
- We make dark areas darker by assigning a slope less than 1 & make bright areas brighter
by assigning a slope greater than 1.
3. Thresholding
Extreme contrast is known as Thresholding.
𝑠 = 0, 𝑖𝑓 𝑟 ≤ 𝑎
𝑠 = 𝐿 − 1, 𝑖𝑓 𝑟 > 𝑎
- Thresholding has only 2 values: black or white. (Threshold image has maximum
- When we have to highlight a specific range of grey values, like enhancing the flaws in an X-ray
or CT image, we use a transformation known as Grey Level Slicing.
$s = \begin{cases} L-1, & a \le r \le b \\ 0, & \text{otherwise} \end{cases}$ (Without background)
$s = \begin{cases} L-1, & a \le r \le b \\ r, & \text{otherwise} \end{cases}$ (With background)
- Observing the images we come to the conclusion that the higher order bits contain the
majority of the visually significant data, while the lower bits contain the subtle details in
the image.
- Bit plane slicing is used for image compression: we can remove the lower order bits and
transmit only the higher order bits.
- Application: Steganography
- Steganography is the art of hiding information. It is a technique in which secret data is hidden
in a carrier signal.
Stego image
- The dynamic range of the image exceeds the capability of the display device.
- Some pixels in an image have a high value (intensity) and some have a low value.
- So we cannot see the low value pixels in the image. For example, in the daytime we cannot
see stars because the sun has a very high intensity compared with the stars, so the eye cannot
adjust to such a large dynamic range.
- In image processing, a classic example of such large differences in grey levels is the
Fourier spectrum. In it only some of the values are large while most of the values
are too small. The dynamic range of the pixel values is of the order of 10^6. Hence, when we plot the
spectrum, we see only the small dots which represent the large values.
- Sometimes we need to be able to see the small values as well. The technique used to
compress the dynamic range of pixels is known as Dynamic Range Compression.
- For this technique we can use the LOG operator.
$s = c \log(1 + r)$
$s = c\, r^{\gamma}$
$\begin{bmatrix} w_1 & w_2 & w_3 \\ w_4 & w_5 & w_6 \\ w_7 & w_8 & w_9 \end{bmatrix}$ (3 x 3 Mask)
- In most images the background is considered to be a low frequency region and edges are
considered to be high frequency regions.
- A low pass filter removes noise and blurs edges.
10 10 10 10 10 10 10 10
10 10 10 10 10 10 10 10
10 10 10 10 10 10 10 10
10 10 10 10 10 10 10 10
50 50 50 50 50 50 50 50
50 50 50 50 50 50 50 50
50 50 50 50 50 50 50 50
50 50 50 50 50 50 50 50
- Multiply each pixel value of image with corresponding pixel value of mask.
- Centre value of this 3x3 matrix is replaced by average value of 3 X 3 matrix. Move this
given mask matrix from left corner to bottom right corner of input image and replace
centre value with average value.
- Final Matrix
10 ⋯ ⋯ ⋯ ⋯ ⋯ ⋯ ⋯ 10
10 ⋯ ⋯ ⋯ ⋯ ⋯ ⋯ ⋯ 10
10 ⋯ ⋯ ⋯ ⋯ ⋯ ⋯ ⋯ 10
23.3 ⋯ ⋯ ⋯ ⋯ ⋯ ⋯ ⋯ 23.3
36.6 ⋯ ⋯ ⋯ ⋯ ⋯ ⋯ ⋯ 36.6
50 ⋯ ⋯ ⋯ ⋯ ⋯ ⋯ ⋯ 50
50 ⋯ ⋯ ⋯ ⋯ ⋯ ⋯ ⋯ 50
[ 50 ⋯ ⋯ ⋯ ⋯ ⋯ ⋯ ⋯ 50 ]
- From final matrix we can conclude that the edges (where pixel values are changed
from 10 to 50 in input image) are blurred due to this type of filtering.
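The averaging example can be reproduced with the sketch below. The function name `mean_filter_3x3` is hypothetical, and border pixels are handled by repeating the edge values, which is an assumption chosen because it reproduces the border entries shown in the final matrix.

```python
import numpy as np

def mean_filter_3x3(img):
    """Replace every pixel by the average of its 3 x 3 neighbourhood."""
    # Assumption: borders are handled by repeating the edge pixels.
    padded = np.pad(img.astype(float), 1, mode="edge")
    out = np.empty(img.shape, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = padded[i:i + 3, j:j + 3].mean()
    return out

# 8 x 8 input: four rows of 10 followed by four rows of 50.
image = np.vstack([np.full((4, 8), 10), np.full((4, 8), 50)])
smoothed = mean_filter_3x3(image)
print(np.round(smoothed[:, 0], 1))   # first column, top to bottom
```

The sharp step from 10 to 50 is spread over two rows, which is the blurring of the edge described above.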
1 5 7
[2 3 6]
3 2 1
Example: Apply median filter on given input matrix using 3 x 3 matrix
18 22 33 25 32 24
[34 128 24 172 26 23]
22 19 32 31 28 26
- First consider left top 3 x 3 matrix of input matrix
- Arrange all value of this 3 x 3 matrix in ascending order 18 19 22 22 24 32 33 34 128
- Middle value of above ascending order is 24. So centre value of taken 3 x 3 input matrix
128 is replaced by 24.
- Now consider next 3 x 3 matrix of input matrix
- Arrange all value of this 3 x 3 matrix in ascending order 19 22 24 25 31 32 33 128 172
- Middle value of above ascending order is 31. So centre value of taken 3 x 3 input
matrix 24 is replaced by 31.
- Repeat this process from top to bottom row and left to right column.
- So Final Matrix is
18 22 33 25 32 24
[34 24 31 31 26 23]
22 19 32 31 28 26
- From the result we can conclude that if value of pixel is very different from
neighbouring pixels in input image then this pixel value is replaced by correlated value.
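The median filtering steps can be sketched as follows. The function name `median_filter_3x3` is hypothetical; border pixels are left unchanged, as in the final matrix above, and every window is taken from the original input rather than from already filtered values.

```python
import numpy as np

def median_filter_3x3(img):
    """Replace each interior pixel by the median of its 3 x 3 neighbourhood."""
    out = img.copy()                      # border pixels stay unchanged
    rows, cols = img.shape
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            # Windows are read from the original image, not from `out`.
            out[i, j] = np.median(img[i - 1:i + 2, j - 1:j + 2])
    return out

image = np.array([[18, 22, 33, 25, 32, 24],
                  [34, 128, 24, 172, 26, 23],
                  [22, 19, 32, 31, 28, 26]])
print(median_filter_3x3(image))
```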
10 10 10 10 10 10 10 10
10 10 10 10 10 10 10 10
10 10 10 10 10 10 10 10
10 10 10 10 10 10 10 10
100 100 100 100 100 100 100 100
100 100 100 100 100 100 100 100
100 100 100 100 100 100 100 100
100 100 100 100 100 100 100 100
$\frac{1}{9} \begin{bmatrix} -1 & -1 & -1 \\ -1 & 8 & -1 \\ -1 & -1 & -1 \end{bmatrix}$
- Apply mask on input image and add up all co-efficient and take average value and
replace it with centre value.
- Repeat this process from top to bottom row and left to right column
- Negative value in output image should be considered zero.
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
30 30 30 30 30 30 30 30
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
- From the output image we can say High Pass filter removes background detail by
placing zero values and highlight only edges.
𝑖𝑓 𝐴 = 1, 𝑡ℎ𝑒𝑛
𝐻𝑖𝑔ℎ 𝐵𝑜𝑜𝑠𝑡 = 𝐻𝑖𝑔ℎ 𝑃𝑎𝑠𝑠
-1 -1 -1
-1 x -1
-1 -1 -1
- Suppose x= 9A-1
- If A = 1 then x= 8
- So mask matrix becomes
-1 -1 -1
-1 8 -1
-1 -1 -1
- If A = 1.1 then x= 8.9
- So mask matrix becomes
-1 -1 -1
-1 8.9 -1
-1 -1 -1
- For different value of A we can make different mask for high boost filter.
0 0 0 0 0
0 0 2 1 0
0 1 100 2 0
0 2 0 1 0
0 0 0 0 0
0 1 0
0 1 1
0 1 0
$\frac{1}{9} \begin{bmatrix} -1 & -1 & -1 \\ -1 & 8 & -1 \\ -1 & -1 & -1 \end{bmatrix}$
3. Compare with low pass averaging and high pass filter, the median filter gives
more correlated data.
$\frac{1}{9} \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}$
So resultant matrix is
14.67 15.67 14.89
15.44 17.22 16.00
14.89 16.22 15.00
2. Median filter
- Use similar 3 x 3 mask matrix and repeat process of median filter
0 4 0
3 5 4
0 4 0
3. High Pass Filter
$\frac{1}{9} \begin{bmatrix} -1 & -1 & -1 \\ -1 & 8 & -1 \\ -1 & -1 & -1 \end{bmatrix}$
0 -5 -4 0 0 0
1 = 0 102.71 0
-7 960 -5
9 -4 -3 -7 0 0 0
4. Comparing result of 1 & 2 we can say using median filter we can get more
correlated value of pixels.
Example: Obtain the digital negative of following 8 bits per pixel image
Here r(x,y) is input image so first pixel value is 255-121=134 & so on.
134 50 38 99 104
116 128 98 138 130
3 138 19 117 113
28 73 77 58 13
54 149 136 4 15
Example: For a given image find: 1) Digital Negative of an image. 2) Bit Plane Slicing.
4 3 2 1
3 1 2 4
5 1 6 2
2 3 5 6
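Digital Negative
With 3 bits per pixel, L = 8, so the negative is s = (L − 1) − r = 7 − r:
3 4 5 6
4 6 5 3
2 6 1 5
5 4 2 1
Bit Plane Slicing
The maximum pixel value is 6, so we need 3 bits for the binary representation of each pixel value. The three bit planes of the negative image are:
MSB plane    Middle plane    LSB plane
0 1 1 1      1 0 0 1         1 0 1 0
1 1 1 0      0 1 0 1         0 0 1 1
0 1 0 1      1 1 0 0         0 0 1 1
1 1 0 0      0 0 1 0         1 0 0 1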
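The bit planes listed above are those of the 3-bit negative image (s = 7 − r). A minimal sketch that reproduces them; the function names are illustrative, not from the notes.

```python
import numpy as np

def digital_negative(image, bits):
    """Return (L - 1) - r for an image with the given number of bits per pixel."""
    return (2 ** bits - 1) - image

def bit_planes(image, bits):
    """Return the bit planes from the most significant to the least significant."""
    return [(image >> b) & 1 for b in range(bits - 1, -1, -1)]

image = np.array([[4, 3, 2, 1],
                  [3, 1, 2, 4],
                  [5, 1, 6, 2],
                  [2, 3, 5, 6]])

negative = digital_negative(image, bits=3)
msb, middle, lsb = bit_planes(negative, bits=3)
print(negative)
print(msb, middle, lsb, sep="\n\n")
```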
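r2 = 5, r1 = 3, s2 = 6, s1 = 2
4 3 2 1
3 1 2 4
5 1 6 2
2 3 5 6
$$\alpha = \frac{s_1}{r_1} = \frac{2}{3} = 0.66$$
$$\beta = \frac{y_2 - y_1}{x_2 - x_1} = \frac{6 - 2}{5 - 3} = 2$$
$$\gamma = \frac{y_2 - y_1}{x_2 - x_1} = \frac{7 - 6}{7 - 5} = 0.5$$
$$s = \begin{cases} \alpha r, & 0 \le r < 3 \\ \beta (r - r_1) + s_1, & 3 \le r < 5 \\ \gamma (r - r_2) + s_2, & 5 \le r \le 7 \end{cases}$$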
r    s
0    s = αr = 0.66 × 0 = 0
1    s = αr = 0.66 × 1 = 0.66
2    s = αr = 0.66 × 2 = 1.32
3    s = β(r − r1) + s1 = 2(3 − 3) + 2 = 2
4    s = β(r − r1) + s1 = 2(4 − 3) + 2 = 4
5    s = γ(r − r2) + s2 = 0.5(5 − 5) + 6 = 6
6    s = γ(r − r2) + s2 = 0.5(6 − 5) + 6 = 6.5
7    s = γ(r − r2) + s2 = 0.5(7 − 5) + 6 = 7
S(x,y) =
4     2     1.32  0.66
2     0.66  1.32  4
6     0.66  6.5   1.32
1.32  2     6     6.5
=> (after rounding)
4 2 1 1
2 1 1 4
6 1 7 1
1 2 6 7
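The piecewise-linear stretching used in this example can be sketched as a small function. The default arguments are the breakpoints of this example; the sketch uses the exact slope α = 2/3 rather than the rounded 0.66, so the values for r = 1 and r = 2 differ slightly in the second decimal.

```python
def contrast_stretch(r, r1=3, s1=2, r2=5, s2=6, r_max=7, s_max=7):
    """Piecewise-linear contrast stretching through (r1, s1) and (r2, s2)."""
    alpha = s1 / r1                          # slope of the first segment
    beta = (s2 - s1) / (r2 - r1)             # slope of the middle segment
    gamma = (s_max - s2) / (r_max - r2)      # slope of the last segment
    if r < r1:
        return alpha * r
    if r < r2:
        return beta * (r - r1) + s1
    return gamma * (r - r2) + s2

table = {r: contrast_stretch(r) for r in range(8)}
for r, s in table.items():
    print(r, round(s, 2))
```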
1 2 3 0
2 4 6 7
5 2 4 3
3 2 6 1
6 5 4 7
5 3 1 0
2 5 3 4
4 5 1 6
$$s = \begin{cases} 0, & r \le 4 \\ L - 1, & r > 4 \end{cases}$$
0 0 0 0
0 7 7 7
7 0 7 0
0 0 7 0
0 7 7 0
7 7 0 0
7 7 7 0
7 7 0 0
𝐿 − 1, 2 ≤ 𝑟 ≤ 5
𝑠={ (Without background)
0, 𝑜𝑡ℎ𝑒𝑟𝑤𝑖𝑠𝑒
0 7 7 0
7 7 0 0
7 7 7 0
7 7 0 0
𝐿 − 1, 2 ≤ 𝑟 ≤ 5
𝑠={ (With background)
𝑟, 𝑜𝑡ℎ𝑒𝑟𝑤𝑖𝑠𝑒
1 7 7 0
7 7 6 7
7 7 7 3
7 7 6 1
- When we apply the LPF to the image, the centre pixel Z5 changes to
- 1/9[Z1 + Z2 + Z3 + Z4 + Z5 + Z6 + Z7 + Z8 + Z9]
- Original − low pass = Z5 − 1/9[Z1 + Z2 + Z3 + Z4 + Z5 + Z6 + Z7 + Z8 + Z9]
= Z5 − Z1/9 − Z2/9 − Z3/9 − Z4/9 − Z5/9 − Z6/9 − Z7/9 − Z8/9 − Z9/9
= 8Z5/9 − 1/9[Z1 + Z2 + Z3 + Z4 + Z6 + Z7 + Z8 + Z9]
- This is nothing but the high pass filter mask (with the common factor 1/9)
-1 -1 -1
-1 8 -1
-1 -1 -1
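The identity "original − low pass = high pass" can be verified directly on the 3x3 masks. This is only a numerical check of the derivation above; the variable names are illustrative.

```python
import numpy as np

# Mask that leaves the image unchanged (picks the centre pixel Z5).
identity = np.zeros((3, 3))
identity[1, 1] = 1.0

# 3x3 averaging (low pass) mask.
low_pass = np.ones((3, 3)) / 9.0

# Original minus low pass gives the high pass mask, including its 1/9 factor.
high_pass = identity - low_pass
scaled = high_pass * 9
print(np.round(scaled))
```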
- In the frequency domain, the frequency axis runs from −N/2 through 0 to N/2.
- Here 0 represents the d.c. term. As we move to the right, the frequency goes on increasing, the maximum being N/2. Using the translation property, the origin is shifted to N/2, so on an axis running from 0 to N the low frequencies lie around the centre (N/2) and the high frequencies lie towards the ends.
- Hence we conclude that in the Fourier spectrum, the centre is where the low frequencies lie, and as we go away from the centre we encounter the high frequencies.
- The centre part of the spectrum image is considered low frequency and the edges of the spectrum image are high frequency.
$$H(u,v) = \begin{cases} 1, & D(u,v) \le D_0 \\ 0, & D(u,v) > D_0 \end{cases}$$
- D(u, v) is the distance from the point (u, v) to the origin of the frequency rectangle
for an MxN image.
- $D(u,v) = \left[(u - M/2)^2 + (v - N/2)^2\right]^{1/2}$
- For an image if u = M/2, v= N/2,
Then D(u,v) = 0
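A minimal sketch of the ideal low pass transfer function built from the distance D(u,v) defined above. The 8x8 size and D0 = 2 are arbitrary illustrative values, and the function names are assumptions.

```python
import numpy as np

def distance_grid(M, N):
    """D(u, v): distance of each frequency sample from the centre (M/2, N/2)."""
    u = np.arange(M).reshape(-1, 1)
    v = np.arange(N).reshape(1, -1)
    return np.sqrt((u - M / 2) ** 2 + (v - N / 2) ** 2)

def ideal_low_pass(M, N, D0):
    """H(u, v) = 1 inside the circle of radius D0, 0 outside."""
    return (distance_grid(M, N) <= D0).astype(float)

H = ideal_low_pass(8, 8, D0=2)
print(H)
```

To filter an image with it, the centred spectrum (for example `np.fft.fftshift(np.fft.fft2(image))`) would be multiplied element by element with `H` before taking the inverse transform.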
- How can we decide the value of D0 (the one that gives the best output)?
- One way is to compute circles that enclose specified amounts of the total image power Ptotal.
$$P(u,v) = |F(u,v)|^2 = R^2(u,v) + I^2(u,v)$$
$$P_{total} = \sum_{u=0}^{M-1} \sum_{v=0}^{N-1} P(u,v)$$
- For low order values of n, the Butterworth low pass filter approaches the Gaussian low pass filter.
- For high order values of n, the Butterworth low pass filter approaches the Ideal low pass filter.
$$H_{LP,BW}(u,v) = \frac{1}{1 + [D(u,v)/D_0]^{2n}}$$
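The effect of the order n can be seen by evaluating the standard Butterworth low pass form H = 1 / (1 + [D/D0]^(2n)) at a few distances. The distances and D0 = 10 below are illustrative values only.

```python
import numpy as np

def butterworth_low_pass(D, D0, n):
    """Butterworth low pass transfer function of order n with cut-off D0."""
    return 1.0 / (1.0 + (D / D0) ** (2 * n))

def butterworth_high_pass(D, D0, n):
    """High pass counterpart, H_HP = 1 - H_LP."""
    return 1.0 - butterworth_low_pass(D, D0, n)

D = np.array([0.0, 5.0, 10.0, 20.0])                 # sample distances from the centre
gentle = butterworth_low_pass(D, D0=10.0, n=1)       # smooth roll-off
sharp = butterworth_low_pass(D, D0=10.0, n=20)       # close to the ideal filter
print(gentle)
print(sharp)
```

With n = 1 the response falls off gradually, while with n = 20 it is almost 1 inside the cut-off and almost 0 outside, which is the behaviour of the ideal filter.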
FHP(u,v) = [1 – HLP(u,v)]F(u,v)
HHP(u,v) = 1 - HLP(u,v)
- In similar manner,
HHB(u,v) = (A-1)+ HHP(u,v) ; A>1
- To come back to the space domain, we have to take the inverse Fourier transform
$$f'(x,y) = \mathcal{F}^{-1}[F(u,v)\,H(u,v)]$$
$$f'(x,y) = \mathcal{F}^{-1}[F_I(u,v)\,H(u,v)] + \mathcal{F}^{-1}[F_R(u,v)\,H(u,v)]$$
- The desired enhanced image is then obtained by taking the exponential.
$$g(x,y) = e^{f'(x,y)}$$
3.3 Histogram
3.3.1. Histogram
- It is a plot of the number of occurrences of each grey level in the image against the grey level values.
- Histogram provides more information about brightness & contrast of image.
- Histogram of dark image will be clustered towards lower grey levels.
- Histogram of bright image will be clustered towards higher grey levels.
- For low contrast image the histogram will not be spread equally, that is, the histogram
will be narrow.
- For high contrast image the histogram will have an equal spread in the grey level.
- Image brightness may be improved by modifying the histogram of the image.
- Histogram can be plotted in two different ways
Method 1:
Method 2:
Instead of plotting the number of pixels, we directly plot the probability values.
Pr(k) = nk/n
$$s = T(r) = \frac{s_{max} - s_{min}}{r_{max} - r_{min}} (r - r_{min}) + s_{min}$$
Example: Perform histogram stretching on the following image so that the new image has dynamic range [0,7].
Gray Levels   0 1 2  3  4  5  6  7
No. of Pixels 0 0 50 60 50 20 10 0
$$s = T(r) = \frac{s_{max} - s_{min}}{r_{max} - r_{min}} (r - r_{min}) + s_{min}$$
rmin = 2, rmax = 6, smin = 0, smax = 7
r    s
2    0
3    1.75 ≈ 2
4    3.5 ≈ 4
5    5.25 ≈ 5
6    7
Modified Histogram:
Gray Levels(s) 0 1 2 3 4 5 6 7
No.of Pixels 50 0 60 0 50 20 0 10
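The stretching example above can be reproduced with a few lines. Rounding half up is an assumption chosen to match the table (3.5 is rounded to 4); the helper names are illustrative.

```python
import math

def stretch(r, r_min, r_max, s_min, s_max):
    """Linear histogram stretching from [r_min, r_max] to [s_min, s_max]."""
    return (s_max - s_min) / (r_max - r_min) * (r - r_min) + s_min

def round_half_up(x):
    """Round to the nearest integer, with .5 rounded upwards (assumed convention)."""
    return math.floor(x + 0.5)

counts = [0, 0, 50, 60, 50, 20, 10, 0]               # pixels per grey level 0..7
occupied = [r for r, n in enumerate(counts) if n > 0]
r_min, r_max = min(occupied), max(occupied)

mapping = {r: round_half_up(stretch(r, r_min, r_max, 0, 7)) for r in occupied}
new_counts = [0] * 8
for r, s in mapping.items():
    new_counts[s] += counts[r]

print(mapping)
print(new_counts)
```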
Example: Perform histogram equalization on the following image histogram. Plot the original
and equalized histogram.
Gray 0 1 2 3 4 5 6 7
No.of 790 1023 850 656 329 245 122 81
Grey level (r)  nk  Pr(k) = nk/n  Sk = ΣPr(k)  (L-1)Sk  Rounded off  Grey level (s)
0 790 0.19 0.19 1.33 1 1
1 1023 0.25 0.44 3.08 3 3
2 850 0.21 0.65 4.55 5 5
3 656 0.16 0.81 5.67 6 6
4 329 0.08 0.89 6.23 6 6
5 245 0.06 0.95 6.65 7 7
6 122 0.03 0.98 6.86 7 7
7 81 0.02 1 7 7 7
gray 0 1 2 3 4 5 6 7
No.of 0 790 0 1023 0 850 656+329 245+122+81
Pixels =985 =448
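The equalization table above can be reproduced with a short sketch. It works from the cumulative pixel counts (avoiding intermediate rounding of the probabilities) and rounds half up; both are implementation choices, and `equalize` is an illustrative name.

```python
import math

def equalize(counts):
    """Histogram equalization on a list of pixel counts per grey level."""
    L, n = len(counts), sum(counts)
    mapping, cumulative = [], 0
    for nk in counts:
        cumulative += nk
        # s_k = round((L - 1) * cumulative probability)
        mapping.append(math.floor((L - 1) * cumulative / n + 0.5))
    new_counts = [0] * L
    for r, s in enumerate(mapping):
        new_counts[s] += counts[r]
    return mapping, new_counts

mapping, new_counts = equalize([790, 1023, 850, 656, 329, 245, 122, 81])
print(mapping)
print(new_counts)
```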
4 4 4 4 4
3 4 5 4 3
F(x,y)= 3 5 5 5 3
3 4 5 4 3
4 4 4 4 4
Gray Levels 0 1 2 3 4 5 6 7
No.of Pixels 0 0 0 6 14 5 0 0
Gray 0 1 2 3 4 5 6 7
No.of 0 0 6 0 0 0 14 5+0+0=5
$$H = -\sum_{i=0}^{L-1} p_i \log p_i$$
$$H = -\sum_{i=0}^{255} \frac{1}{256} \log\left(\frac{1}{256}\right)$$
Example: what effect would setting to i) zero the higher order bits plane ii) zero the lower order
bits plane. Image is given below.
0 1 2 3
4 5 6 7
8 9 10 11
12 13 14 15
Solution: Maximum number is 15 so we require 4 bits for binary representation
Binary representation:
- In this histogram the variability is reduced, number of grey levels are reduced.
0 1 2 3
0 1 2 3
0 1 2 3
0 1 2 3
- In this case grey levels are also reduced but important thing is image becomes much
Example: Given histogram (a) and (b) modify histogram (a) as given (b).
Histogram (a):
Gray 0 1 2 3 4 5 6 7
No.of 790 1023 850 656 329 245 122 81
Histogram (b):
Gray 0 1 2 3 4 5 6 7
No.of 0 0 0 614 819 1230 819 614
gray 0 1 2 3 4 5 6 7
No.of 0 790 0 1023 0 850 656+329 245+122+81
Pixels =985 =448
Grey level  nk    Pr(k)  Sk     (L-1)Sk  Rounded off
0           0     0      0      0        0
1           0     0      0      0        0
2           0     0      0      0        0
3           614   0.149  0.149  1.05     1
4           819   0.20   0.35   2.45     2
5           1230  0.30   0.65   4.55     5
6           819   0.20   0.85   5.95     6
7           614   0.15   1      7        7
n = 4096
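Applying the inverse transform: each equalized level of histogram (a) is mapped to the grey level of histogram (b) whose equalized value is closest (1 → 3, 3 → 4, 5 → 5, 6 → 6, 7 → 7).
Gray level     0 1 2 3   4    5   6   7
No. of pixels  0 0 0 790 1023 850 985 448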
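A sketch of histogram specification for this example, assuming the usual rule that each equalized level of histogram (a) goes to the level of histogram (b) with the nearest equalized value (the first such level on ties). Function names are illustrative.

```python
import math

def equalization_levels(counts):
    """Rounded (L - 1) * cumulative probability for every grey level."""
    L, n = len(counts), sum(counts)
    levels, cumulative = [], 0
    for nk in counts:
        cumulative += nk
        levels.append(math.floor((L - 1) * cumulative / n + 0.5))
    return levels

def match_histogram(source_counts, target_counts):
    """Map the source histogram onto the shape of the target histogram."""
    s = equalization_levels(source_counts)
    g = equalization_levels(target_counts)
    # For each source level pick the target level with the closest equalized value.
    mapping = [min(range(len(g)), key=lambda z: abs(g[z] - sk)) for sk in s]
    new_counts = [0] * len(source_counts)
    for r, z in enumerate(mapping):
        new_counts[z] += source_counts[r]
    return mapping, new_counts

hist_a = [790, 1023, 850, 656, 329, 245, 122, 81]
hist_b = [0, 0, 0, 614, 819, 1230, 819, 614]
mapping, new_counts = match_histogram(hist_a, hist_b)
print(mapping)
print(new_counts)
```

The total number of pixels is preserved (4096), which is a quick sanity check on any specified histogram.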
10 2 13 7
11 14 6 9
4 7 3 2
0 5 10 7
Solution: 1)
$$\alpha = \frac{s_1}{r_1} = \frac{2}{5} = 0.4$$
$$\beta = \frac{y_2 - y_1}{x_2 - x_1} = \frac{12 - 2}{10 - 5} = 2$$
$$\gamma = \frac{y_2 - y_1}{x_2 - x_1} = \frac{15 - 12}{15 - 10} = 0.6$$
r S
0 S = 𝛼𝑟 = 0.4*0 =0
1 S = 𝛼𝑟 = 0.4*1 = 0.4
2 S = 𝛼𝑟 = 0.4*2 =0.8
3 S = 𝛼𝑟 = 0.4*3 = 1.2
4 S = 𝛼𝑟 = 0.4*4 = 1.6
5 S = 𝛽(𝑟 − 𝑟1 ) + 𝑆1 = 2(5-5)+2 =2
6 S = 𝛽(𝑟 − 𝑟1 ) + 𝑆1 = 2(6-5)+2 =4
7 S = 𝛽(𝑟 − 𝑟1 ) + 𝑆1 = 2(7-5)+2 =6
8 S = 𝛽(𝑟 − 𝑟1 ) + 𝑆1 = 2(8-5)+2 =8
9 S = 𝛽(𝑟 − 𝑟1 ) + 𝑆1 = 2(9-5)+2 = 10
10 S = 𝛾(𝑟 − 𝑟2 ) +𝑆2 = 0.6(10-10)+12 = 12
11 S = 𝛾(𝑟 − 𝑟2 ) +𝑆2= 0.6(11-10)+12 = 12.6
12 S = 𝛾(𝑟 − 𝑟2 ) +𝑆2= 0.6(12-10)+12 = 13.2
13 S = 𝛾(𝑟 − 𝑟2 ) +𝑆2= 0.6(13-10)+12 = 13.8
14 S = 𝛾(𝑟 − 𝑟2 ) +𝑆2= 0.6(14-10)+12 = 14.4
15 s = 𝛾(𝑟 − 𝑟2 ) +𝑆2= 0.6(15-10)+12 = 15
S(x,y) =
12    0.8   13.8  6
12.6  14.4  4     10
1.6   6     1.2   0.8
0     2     12    6
=> (after rounding)
12 1  14 6
13 14 4  10
2  6  1  1
0  2  12 6
grey 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
No of 0 1 1 1 1 1 1 0 0 3 1 0 2 1 1 1
- It depends on the probability or frequency of grey value. So NO matter how the grey
values are distributed over the image, if the frequency of occurrence of grey value is
not changed, the histogram will not change.
Question: A continuous image histogram can be perfectly equalized, but it may not be so for a digital image.
- The cumulative density function ensures that we get a flat histogram in the continuous domain.
- In the discrete domain, values such as 1.1, 1.2 and 1.3 are all grouped together and placed in value 1. Due to this, perfectly flat histograms are never obtained in the discrete domain.
Histogram Equalization vs Contrast Stretching
1. Histogram equalization: It is about modifying the intensity values of all pixels in the image equally.
   Contrast stretching: It is all about increasing the difference between the minimum and maximum intensity values in the image.
2. Histogram equalization: The transformation function is selected automatically from the PDF of the image.
   Contrast stretching: The transformation function is selected manually based on the requirement of the application.
3. Histogram equalization: It is reliable.
   Contrast stretching: It is unreliable.
4. Histogram equalization: It is non-linear normalization.
   Contrast stretching: It is linear normalization.
5. Histogram equalization: The original image cannot be restored from the equalized image.
   Contrast stretching: The original image can be restored from the contrast stretched image.
6. Histogram equalization: It is obtained using the cumulative distribution function.
   Contrast stretching: It is obtained by changing the slopes of the various sections.
Unit 4 (PART – 1)
- The main objective of image segmentation is to extract various features of the image
which can be merged or split in order to build objects of interest on which analysis
and interpretation can be performed.
- Segmentation forms a part of computer vision. We use segmentation when we want the computer to make decisions.
- Segmentation algorithms are divided into two categories.
1) Segmentation based on discontinuities in intensity
2) Segmentation based on similarities in intensity
-1 -1 -1
-1 8 -1
-1 -1 -1
│R │≥ T
- Where R is derive from
R = W1Z1 + W2Z2 + W3Z3 +………………..+ W9Z9
R = ∑9𝑖=1 𝑤𝑖 𝑧𝑖
- We take │R │ because we want to detect both the kinds of points i.e. white points on
black background as well as black points on a white background.
- T is non negative threshold which is defined by the user.
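The point detection rule |R| ≥ T can be sketched as follows. The 5x5 test images and the threshold T = 500 are hypothetical values chosen only for illustration; border pixels are left undetected for simplicity.

```python
import numpy as np

def detect_points(image, T):
    """Mark pixels where the magnitude of the mask response R reaches the threshold T."""
    mask = np.array([[-1, -1, -1],
                     [-1,  8, -1],
                     [-1, -1, -1]])
    rows, cols = image.shape
    detected = np.zeros((rows, cols), dtype=bool)
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            R = np.sum(image[i - 1:i + 2, j - 1:j + 2] * mask)   # R = sum of w_i * z_i
            detected[i, j] = abs(R) >= T
    return detected

# Hypothetical test images: a bright point on a dark background and the reverse.
bright_on_dark = np.full((5, 5), 10)
bright_on_dark[2, 2] = 100
dark_on_bright = np.full((5, 5), 100)
dark_on_bright[2, 2] = 10

points_a = detect_points(bright_on_dark, T=500)
points_b = detect_points(dark_on_bright, T=500)
print(points_a.astype(int))
print(points_b.astype(int))
```

Taking the absolute value is what lets the same threshold catch both kinds of isolated points.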
(Horizontal)      (+45°)
-1 -1 -1          -1 -1  2
 2  2  2          -1  2 -1
-1 -1 -1           2 -1 -1

(Vertical)        (-45°)
-1  2 -1           2 -1 -1
-1  2 -1          -1  2 -1
-1  2 -1          -1 -1  2
- The coefficients of each of these masks sum to zero, and hence all of them are high pass masks.
- The first mask detects a horizontal line, the second mask a line at +45°, the third mask a vertical line and the fourth mask a line at −45°.
- Consider this example
0 0 0 10 0 0 0 0
0 0 0 10 0 0 0 0
0 0 0 10 0 0 0 0
0 0 0 10 0 0 0 0
0 10 10 10 10 10 10 0
0 0 0 10 0 0 0 0
0 0 0 10 0 0 0 0
0 0 0 10 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 -20 -20 -20 -20 -30 -20 0
0 40 40 40 40 60 40 0
0 -20 -20 -20 -20 -30 -20 0
0 0 0 10 0 0 0 0
0 0 0 10 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 40 40 40 40 60 40 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
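The horizontal-line example above can be reproduced with the sketch below. It computes the mask response for interior pixels only (border pixels are left at zero, an assumption) and then sets negative responses to zero.

```python
import numpy as np

HORIZONTAL = np.array([[-1, -1, -1],
                       [ 2,  2,  2],
                       [-1, -1, -1]])

def mask_response(image, mask):
    """Response of a 3x3 mask at every interior pixel (borders left at zero)."""
    rows, cols = image.shape
    out = np.zeros((rows, cols), dtype=int)
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            out[i, j] = np.sum(image[i - 1:i + 2, j - 1:j + 2] * mask)
    return out

# Input image of the example: a vertical line in column 3 and a horizontal line in row 4.
image = np.zeros((8, 8), dtype=int)
image[:, 3] = 10
image[4, 1:7] = 10

response = mask_response(image, HORIZONTAL)
lines = np.clip(response, 0, None)      # negative responses are set to zero
print(response)
print(lines)
```

Only the horizontal line survives; the vertical line gives no positive response to this mask.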
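- Practically, a perfectly sharp slope is not possible. For a digital image we follow the steps below.
$$\frac{dy}{dx} = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}$$
For a function of two variables,
$$\frac{\partial f}{\partial x} = \lim_{h \to 0} \frac{f(x+h, y) - f(x, y)}{h}$$
$$\frac{\partial f}{\partial y} = \lim_{k \to 0} \frac{f(x, y+k) - f(x, y)}{k}$$
Hence the final gradient is
$$\nabla f = \hat{i}\,\frac{\partial f}{\partial x} + \hat{j}\,\frac{\partial f}{\partial y}$$
For a function of three variables,
$$\frac{\partial f}{\partial x} = \lim_{h \to 0} \frac{f(x+h, y, z) - f(x, y, z)}{h}$$
$$\frac{\partial f}{\partial y} = \lim_{k \to 0} \frac{f(x, y+k, z) - f(x, y, z)}{k}$$
$$\frac{\partial f}{\partial z} = \lim_{l \to 0} \frac{f(x, y, z+l) - f(x, y, z)}{l}$$
So the gradient is
$$\nabla f = \hat{i}\,\frac{\partial f}{\partial x} + \hat{j}\,\frac{\partial f}{\partial y} + \hat{k}\,\frac{\partial f}{\partial z}$$
The magnitude of the two dimensional gradient is
$$|\nabla f| = \left[\left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2\right]^{1/2}$$
Finding gradients using Masks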
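- Consider a 3x3 neighbourhood matrix with Z5 as the origin
$$\frac{\partial f}{\partial x} = \lim_{h \to 0} \frac{f(x+h, y) - f(x, y)}{h}$$
$$\frac{\partial f}{\partial y} = \lim_{k \to 0} \frac{f(x, y+k) - f(x, y)}{k}$$
With h = k = 1 (the smallest step in a digital image), ∂f/∂x = z8 − z5 and ∂f/∂y = z6 − z5
Mask 1, |z5 − z8| =
 1  0
-1  0
Mask 2, |z5 − z6| =
 1 -1
 0  0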
Roberts operator:
- It states that better results can be obtained if cross differences are taken instead of straight differences.
Mask 1, |z5 − z9| =
 1  0
 0 -1
Mask 2, |z6 − z8| =
 0  1
-1  0
Resultant Mask = Mask 1 + Mask 2 =
 1  1
-1 -1
Prewitt operator:
|∇f| ≈ |(z7 + z8 + z9) − (z1 + z2 + z3)| + |(z3 + z6 + z9) − (z1 + z4 + z7)|
Mask 1 =
-1 -1 -1
 0  0  0
 1  1  1
Mask 2 =
-1  0  1
-1  0  1
-1  0  1
Sobel operator: In the 3x3 masks, higher weights are assigned to the pixels which are close to the centre pixel z5.
Mask 1 =
-1 -2 -1
 0  0  0
 1  2  1
Mask 2 =
-1  0  1
-2  0  2
-1  0  1
Resultant Mask = Mask 1 + Mask 2 =
-2 -2  0
-2  0  2
 0  2  2
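A sketch of edge detection with the two Sobel masks. It approximates the gradient magnitude by |g1| + |g2|, a common simplification that is assumed here, and uses a hypothetical image with a single vertical step edge.

```python
import numpy as np

SOBEL_1 = np.array([[-1, -2, -1],
                    [ 0,  0,  0],
                    [ 1,  2,  1]])     # Mask 1: responds to horizontal edges
SOBEL_2 = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]])       # Mask 2: responds to vertical edges

def gradient_magnitude(image):
    """Approximate |grad f| as |g1| + |g2| at interior pixels (borders left at zero)."""
    rows, cols = image.shape
    out = np.zeros((rows, cols), dtype=int)
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            window = image[i - 1:i + 2, j - 1:j + 2]
            g1 = np.sum(window * SOBEL_1)
            g2 = np.sum(window * SOBEL_2)
            out[i, j] = abs(g1) + abs(g2)
    return out

# Hypothetical test image: dark left half, bright right half.
image = np.zeros((5, 6), dtype=int)
image[:, 3:] = 10

edges = gradient_magnitude(image)
print(edges)
```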
Compass operator: It is seen that edges in the horizontal as well as in the vertical direction are enhanced when Prewitt's or Sobel's operator is used.
- There are applications in which we need edges in all directions.
- A simple method is to rotate the Prewitt or Sobel mask through all the possible directions.
- Consider the Prewitt operator
-1 -1 -1
0 0 0
1 1 1
-1 -1  0      -1  0  1       0  1  1
-1  0  1      -1  0  1      -1  0  1
 0  1  1      -1  0  1      -1 -1  0

 1  1  1       1  1  0       1  0 -1
 0  0  0       1  0 -1       1  0 -1
-1 -1 -1       0 -1 -1       1  0 -1

-1 -1 -1       0 -1 -1
 0  0  0       1  0 -1
 1  1  1       1  1  0
- This operator is known as compass operator and is very useful for detecting weak
edges. Compass operator can also be implemented using the sobel operator.
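- We know,
$$\nabla f = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}$$
$$\frac{\partial f}{\partial x} = f(x+1,y) - f(x,y)$$
$$\frac{\partial^2 f}{\partial x^2} = f(x+1,y) - f(x,y) + f(x-1,y) - f(x,y)$$
$$\frac{\partial^2 f}{\partial x^2} = f(x+1,y) + f(x-1,y) - 2f(x,y)$$
$$\frac{\partial^2 f}{\partial y^2} = f(x,y+1) + f(x,y-1) - 2f(x,y)$$
$$\nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}$$
$$|\nabla^2 f| = |z_8 + z_2 + z_6 + z_4 - 4z_5|$$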
0 1 0
1 -4 1
0 1 0
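A short check of how the Laplacian mask behaves, using two hypothetical images: a linear ramp (constant slope, so the second derivative is zero) and a single bright spot. Border pixels are left at zero for simplicity.

```python
import numpy as np

LAPLACIAN = np.array([[0,  1, 0],
                      [1, -4, 1],
                      [0,  1, 0]])

def laplacian(image):
    """Laplacian response z2 + z4 + z6 + z8 - 4*z5 at interior pixels."""
    rows, cols = image.shape
    out = np.zeros((rows, cols), dtype=int)
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            out[i, j] = np.sum(image[i - 1:i + 2, j - 1:j + 2] * LAPLACIAN)
    return out

ramp = np.tile(np.arange(5) * 10, (5, 1))    # grey level grows linearly along each row
spot = np.zeros((5, 5), dtype=int)
spot[2, 2] = 10                              # isolated bright pixel

ramp_response = laplacian(ramp)
spot_response = laplacian(spot)
print(ramp_response)
print(spot_response)
```

The ramp gives no response, while the isolated pixel gives a strong one, which is why the Laplacian is sensitive to noise.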
Solution: It is isotropic filter, means its response is independent of the direction of the
discontinuities in image.
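- For removing this unwanted effect we have to use the Laplacian of Gaussian algorithm
- We know the Gaussian function
$$h(x,y) = e^{-\frac{x^2 + y^2}{2\sigma^2}}$$
Take $x^2 + y^2 = r^2$
$$h(r) = e^{-\frac{r^2}{2\sigma^2}}$$
$$\frac{\partial h(r)}{\partial r} = \frac{\partial}{\partial r}\left(e^{-\frac{r^2}{2\sigma^2}}\right) = e^{-\frac{r^2}{2\sigma^2}}\,\frac{\partial}{\partial r}\left(-\frac{r^2}{2\sigma^2}\right) = e^{-\frac{r^2}{2\sigma^2}}\left(-\frac{2r}{2\sigma^2}\right)$$
$$\frac{\partial h(r)}{\partial r} = -\frac{r}{\sigma^2}\,e^{-\frac{r^2}{2\sigma^2}}$$
$$\frac{\partial^2 h(r)}{\partial r^2} = \frac{\partial}{\partial r}\left[-\frac{r}{\sigma^2}\,e^{-\frac{r^2}{2\sigma^2}}\right] = -\frac{1}{\sigma^2}\,\frac{\partial}{\partial r}\left[r\,e^{-\frac{r^2}{2\sigma^2}}\right]$$
$$= -\frac{1}{\sigma^2}\left[\left(-\frac{r}{\sigma^2}\right) r\,e^{-\frac{r^2}{2\sigma^2}} + e^{-\frac{r^2}{2\sigma^2}}\right]$$
$$= \frac{1}{\sigma^2}\left[\frac{r^2\,e^{-\frac{r^2}{2\sigma^2}}}{\sigma^2} - e^{-\frac{r^2}{2\sigma^2}}\right]$$
$$= \frac{1}{\sigma^2}\,e^{-\frac{r^2}{2\sigma^2}}\left(\frac{r^2}{\sigma^2} - 1\right)$$
$$\nabla^2 h = \frac{r^2 - \sigma^2}{\sigma^4}\,e^{-\frac{r^2}{2\sigma^2}}$$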
Consider 5 x 5 mask
0 0 -1 0 0
0 -1 -2 -1 0
-1 -2 16 -2 -1
0 -1 -2 -1 0
0 0 -1 0 0
- Using this equation and varying the values of a and b, infinite number of lines pass
through this point (x1,y1).
- However if we write this equation as b = -ax1+y1
- Consider ab plane instead of xy plane, we get a single line for a point (x1,y1).
- This entire line in the ab plane is due to a single point in the xy plane and different
values of a and b.
- Now consider another point (x2,y2) in the xy plane.
- Slope intercept equation of this line is y2 = ax2 + b.
- Writing this equation in ab plane b = -ax2+y2.
- This is another line in the ab plane. These two line will intersect each other
somewhere in ab plane only if they are a part of straight line in xy plane.
- The point of intersection in ab plane is noted as (a’,b’).
- Similar process we have to repeat for all given points.
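The voting idea behind this procedure can be sketched with a small accumulator over the ab plane. The four points and the range of candidate slopes are hypothetical; three of the points lie on y = 2x + 1, so their lines in the ab plane should meet at (a', b') = (2, 1).

```python
from collections import Counter

def hough_votes(points, slopes):
    """For every point, vote for each (a, b) with b = -a*x + y over the candidate slopes."""
    votes = Counter()
    for x, y in points:
        for a in slopes:
            votes[(a, y - a * x)] += 1
    return votes

# Hypothetical points: three on the line y = 2x + 1 and one outlier.
points = [(0, 1), (1, 3), (2, 5), (3, 0)]
votes = hough_votes(points, slopes=range(-3, 4))

(best_a, best_b), count = votes.most_common(1)[0]
print(best_a, best_b, count)
```

The cell with the most votes corresponds to the intersection point in the ab plane, i.e. the slope and intercept shared by the collinear points.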
Example 2:
- By using this we have to find out low cost path would eventual correspond to the
most significant edge
Example: Using graph theoretical approach, find the edge corresponding to minimum cost path
5 6 1
I= 6 7 0
7 1 3
- For path A
Example: Consider an 8 x 8 image, the grey level range from 0 to 7. Segment this image
using the region growing technique.
Solution: seed pixels are 6 and 0 & Threshold value is 3 & connectivity N4(p)
- Check condition Max {g(x,y)} - min {g(x,y)} ≤ Th
Example: Consider an 8 x 8 image, the grey level range from 0 to 7. Segment this image
using the region splitting technique. consider Th ≤ 3. Also draw quad tree.
5 6 6 6 7 7 6 6
6 7 6 7 5 5 4 7
6 6 4 4 3 2 5 6
5 4 5 4 2 3 4 6
0 3 2 3 3 2 4 7
0 0 0 0 2 2 5 6
1 1 0 1 0 3 4 4
1 0 1 0 2 3 5 4
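The splitting step for this image can be sketched recursively, assuming the homogeneity test is max − min ≤ Th with Th = 3 as stated above. Each leaf is recorded as (top row, left column, block size); the merge step and the drawing of the quad tree are not covered by this sketch.

```python
import numpy as np

def split(image, top, left, size, threshold, leaves):
    """Recursively split a square block into four quadrants until it is homogeneous."""
    block = image[top:top + size, left:left + size]
    if size == 1 or block.max() - block.min() <= threshold:
        leaves.append((top, left, size))      # homogeneous block: a leaf of the quad tree
        return
    half = size // 2
    for dt in (0, half):
        for dl in (0, half):
            split(image, top + dt, left + dl, half, threshold, leaves)

image = np.array([[5, 6, 6, 6, 7, 7, 6, 6],
                  [6, 7, 6, 7, 5, 5, 4, 7],
                  [6, 6, 4, 4, 3, 2, 5, 6],
                  [5, 4, 5, 4, 2, 3, 4, 6],
                  [0, 3, 2, 3, 3, 2, 4, 7],
                  [0, 0, 0, 0, 2, 2, 5, 6],
                  [1, 1, 0, 1, 0, 3, 4, 4],
                  [1, 0, 1, 0, 2, 3, 5, 4]])

leaves = []
split(image, 0, 0, 8, 3, leaves)
print(leaves)
```

With this predicate the two left quadrants stay whole, while the two right quadrants are each split once into four 2x2 blocks.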
Example: Segment the following image using split and merge technique. Draw
quad tree representation for the corresponding segmentation.
Example: Segment the following image using split and merge technique. Draw quad
tree representation for the corresponding segmentation.
Solution: Add one row & column to given image. Then apply split & merge technique.
4.5 Thresholding
- It produces segments having pixels with similar intensities.
- It is useful technique for establishing boundaries in images that contain solid objects
reflecting on a contrasting background.
- This technique requires object has homogenous intensity & background with
different intensity levels.
4.5.1 Global Thresholding
$$g(x,y) = \begin{cases} 1, & f(x,y) \ge T \\ 0, & \text{otherwise} \end{cases}$$
- Steps for global thresholding
1) Read the given image
2) Plot the histogram of image
3) based on histogram, choose the value T
4) Using this value of T segment the image into objects & background.
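The four steps above can be sketched as follows. The 3x3 image and the threshold T = 5 (picked by inspecting the gap in its histogram) are hypothetical values used only for illustration.

```python
import numpy as np

def global_threshold(image, T):
    """Return 1 where f(x, y) >= T (object) and 0 otherwise (background)."""
    return (image >= T).astype(int)

# Step 1: the given image (hypothetical values with two clear groups of grey levels).
image = np.array([[1, 2, 8],
                  [9, 7, 1],
                  [2, 8, 9]])

# Step 2: histogram of the image for grey levels 0..9.
counts = np.bincount(image.ravel(), minlength=10)

# Step 3: T is chosen in the empty valley between the two groups (assumed choice).
T = 5

# Step 4: segment the image into object and background.
segmented = global_threshold(image, T)
print(counts)
print(segmented)
```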
2) Line Edge: if a segment of image is very narrow, it necessarily has two edges in
close proximity. This arrangement is called a line.
4) Roof edge: two nearby ramp edges result in line structure called a roof. basically
there are two types of roof edges
i) Convex roof edges
- Logical operators:
- A or B
- A and B
- Not A
- A xor B
- A nand B
4.2 Standard binary morphological operations
4.1.1 Dilation:
- It is a process in which the binary image is expanded from its original shape.
- The expanded binary image is determined by the structuring element.
- This structuring element is small compared to the image itself; normally the size used for the structuring element is 3 x 3.
- The dilation is similar to the convolution process.
- The dilation process moves the structuring element over the image from left to right and top to bottom.
- The process checks whether at least one dark value of the structuring element overlaps with the input image.
- If there is no overlap at all, the pixel of the input image under the origin (centre) of the structuring element is set to 0 (white).
- If there is at least one overlap, the pixel of the input image under the origin (centre) of the structuring element is set to 1 (black).
- Let X be the reference image and B the structuring element; the dilation operation is given by
- $X \oplus B = \{ z \mid (\hat{B})_z \cap X \neq \emptyset \}$
Example: For given input image A apply dilation technique using given structuring element B.
4.1.2 Erosion
- It is the counterpart of dilation. If dilation enlarges the image, then erosion shrinks the image.
- The erosion process moves the structuring element over the image from left to right and top to bottom.
- The process checks whether all the dark values of the structuring element overlap with the input image.
- If the overlap is not complete, the pixel of the input image under the origin (centre) of the structuring element is set to 0 (white).
- If the overlap is complete, the pixel of the input image under the origin (centre) of the structuring element is set to 1 (black).
- Let X be the reference image and B the structuring element; the erosion operation is given by
- $X \ominus B = \{ z \mid (B)_z \subseteq X \}$
Example: For given input image A apply erosion technique using given structuring element B.
- Similarly we have move structuring element on input image and we will get final erode
image which shown below.
Example: A={(1,0),(1,1),(1,2),(0,3),(1,3),(2,3),(3,3),(1,4)} & B= {(0,0),(1,0)} Apply dilation and
erosion on given input image.
0 1 2 3 4
A ⊕ B=
S.E = 1 1 1
A =
1 0 0 0 0
0 1 0 0 0
0 0 1 0 0
0 0 0 1 0
0 0 0 0 1
1 1 0 0 0
Dilation= 0 1 1 0 0
0 1 1 1 0
0 0 1 1 1
0 0 0 1 1
1 0 0 0 0
Erosion= 0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 1
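For images given as coordinate sets, such as A = {(1,0), (1,1), ...} and B = {(0,0), (1,0)} in the earlier example, dilation and erosion can be sketched directly from their set definitions. This is an illustrative sketch that treats dilation as the set of all sums a + b and erosion as the set of shifts z for which every z + b lies in A.

```python
def dilate(A, B):
    """Dilation: every point of A shifted by every point of B."""
    return {(ax + bx, ay + by) for ax, ay in A for bx, by in B}

def erode(A, B):
    """Erosion: shifts z for which the whole translated structuring element fits inside A."""
    bx0, by0 = next(iter(B))
    # Any valid shift z must satisfy z + b0 in A, so these candidates are sufficient.
    candidates = {(ax - bx0, ay - by0) for ax, ay in A}
    return {(zx, zy) for zx, zy in candidates
            if all((zx + bx, zy + by) in A for bx, by in B)}

A = {(1, 0), (1, 1), (1, 2), (0, 3), (1, 3), (2, 3), (3, 3), (1, 4)}
B = {(0, 0), (1, 0)}

dilated = dilate(A, B)
eroded = erode(A, B)
print(sorted(dilated))
print(sorted(eroded))
```

Because B contains the origin, the dilation contains A and the erosion is contained in A.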
Example: Suppose the given image is a grey image. Apply dilation and erosion on the given input image.
S.E = 1 1 1
A =
16 14 14 17 19
53 57 61 62 64
132 130 133 132 131
138 142 137 132 138
16 16 17 19 19
Dilation= 53 61 62 64 64
132 133 133 133 131
138 142 142 139 138
Erosion=
16 14 14 14 14
53 53 57 61 64
132 130 130 131 131
138 137 137 137 138
4.1.3 Opening & closing operation
- Opening is basically erosion followed by dilation using structuring element.
- A ο B = (AѲB) ⊕ B
- Closing is basically dilation followed by erosion using structuring element.
- A • B = (A⊕B) Ѳ B
Example: perform opening and closing operation on given input image using structuring element
S.E = 1 1 1
A =
1 0 0 0 0
0 1 0 0 0
0 0 1 0 0
0 0 0 1 0
0 0 0 0 1
I) Opening:
A ο B = (AѲB) ⊕ B
1 0 0 0 0
0 0 0 0 0
AѲB = 0 0 0 0 0
0 0 0 0 0
0 0 0 0 1
1 1 0 0 0
0 0 0 0 0
(AѲB) ⊕ B= 0 0 0 0 0
0 0 0 0 0
0 0 0 1 1
2) Closing:
A • B = (A⊕B) Ѳ B
1 1 0 0 0
A⊕B = 0 1 1 0 0
0 1 1 1 0
0 0 1 1 1
0 0 0 1 1
1 0 0 0 0
0 0 1 0 0
0 0 1 0 0
(A⊕B) Ѳ B = 0 0 0 1 1
0 0 0 0 1
4.1.4 Boundary Detection
- Morphology operations are very effective in the detection of boundaries in binary
image. The following boundaries detection are widely used.
- G(x,y )= f(x,y) –( f(x,y)ѲSE)
- G(x,y)= (f(x,y)⊕SE) – f(x,y)
- Y=(f(x,y)⊕SE) – (f(x,y)ѲSE)
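The three boundary formulas above can be tried on a simple binary image. The sketch assumes a 3x3 structuring element of ones and zero padding outside the image; the 7x7 test image with a 5x5 square is a hypothetical example.

```python
import numpy as np

def morph(image, reducer):
    """Apply a 3x3 neighbourhood operation (np.min for erosion, np.max for dilation)."""
    padded = np.pad(image, 1, constant_values=0)     # assumption: background outside the image
    out = np.zeros_like(image)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = reducer(padded[i:i + 3, j:j + 3])
    return out

f = np.zeros((7, 7), dtype=int)
f[1:6, 1:6] = 1                                      # 5x5 square object

eroded = morph(f, np.min)
dilated = morph(f, np.max)

inner = f - eroded           # boundary made of object pixels
outer = dilated - f          # boundary made of background pixels next to the object
gradient = dilated - eroded  # both boundaries together
print(inner)
```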
Solution: X0⊕B =
- Now (X0⊕B) ⋂ Ā = X1
- Now X1⊕B
- Now (X1⊕B) ⋂ Ā = X2
- Similar process we have to repeat till you will get Xk = X k-1.
- For this example X6= X 5 so final output image = (X 5⋃ A).
A =
0 0 0 0 0 0 0
0 1 1 1 1 1 0
0 1 1 1 1 1 0
0 1 1 1 1 1 0
0 1 1 1 1 1 0
0 1 1 1 1 1 0
0 0 0 0 0 0 0
B1 =
0 0 0
0 1 0
0 1 0
B2 =
0 1 0
0 0 0
0 0 0
Solution: consider
0 0 0 0 0 0 0
AθB1 = 0 1 1 1 1 1 0
0 1 1 1 1 1 0
0 1 1 1 1 1 0
0 1 1 1 1 1 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
1 1 1 1 1 1 1
1 0 0 0 0 0 1
1 0 0 0 0 0 1
Ac= 1 0 0 0 0 0 1
1 0 0 0 0 0 1
1 0 0 0 0 0 1
1 1 1 1 1 1 1
1 1 1 1 1 1 1
1 1 1 1 1 1 1
1 0 0 0 0 0 1
AcθB2 = 1 0 0 0 0 0 1
1 0 0 0 0 0 1
1 0 0 0 0 0 1
1 0 0 0 0 0 1
0 0 0 0 0 0 0
0 1 1 1 1 1 0
(AθB1)⋂( AcθB2) =
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
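The hit-or-miss result above can be reproduced with the sketch below. It assumes B1 = [0 0 0; 0 1 0; 0 1 0] and B2 = [0 1 0; 0 0 0; 0 0 0] with the origin at the centre, and treats everything outside the image as background (0 for A, 1 for its complement).

```python
import numpy as np

def erode(image, se, pad_value):
    """Binary erosion: 1 where every 1 of the structuring element lies on a 1 of the image."""
    pr, pc = se.shape[0] // 2, se.shape[1] // 2
    padded = np.pad(image, ((pr, pr), (pc, pc)), constant_values=pad_value)
    out = np.zeros_like(image)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            window = padded[i:i + se.shape[0], j:j + se.shape[1]]
            out[i, j] = int(np.all(window[se == 1] == 1))
    return out

def hit_or_miss(A, B1, B2):
    """(A eroded by B1) intersected with (complement of A eroded by B2)."""
    Ac = 1 - A
    return erode(A, B1, pad_value=0) & erode(Ac, B2, pad_value=1)

A = np.zeros((7, 7), dtype=int)
A[1:6, 1:6] = 1
B1 = np.array([[0, 0, 0], [0, 1, 0], [0, 1, 0]])
B2 = np.array([[0, 1, 0], [0, 0, 0], [0, 0, 0]])

result = hit_or_miss(A, B1, B2)
print(result)
```

With these two structuring elements the transform picks out the top edge of the square: object pixels that have an object pixel below and a background pixel above.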
Example: For given image, use hit & miss transform find out output image.
B1 B2
AθB1 =
AcθB2 =
(AθB1)⋂( AcθB2) =
4.1.7 Thinning
- Thinning a binary image down to a unit width skeleton is useful not only to reduce the
amount of pixels, but also to simplify the computational procedure, required for shape
- It is based on the hit or miss transformation as
X⊗B = X- HM(X,B) OR X⊗B = X⋂HM(X,B)c
- There are different possibilities of structuring element are used for thinning.
- Consider origin at center of structuring element.
4.1.8 Thickening
- Thickening is the morphological operation which is used to grow selected region of
foreground pixels in binary images.
- It is defined as X⨀B = X ⋃ HM(X,B)
- The thickened image consists of the original image plus any additional foreground
pixels switched on by hit or miss transform.
- This process is normally applied repeatedly until it causes no further changes in the image.
- The different structuring element that can be used in thickening process.
- Consider origin at centre of structuring element.
Dilation vs Erosion
1. Dilation: It is a non-linear operation related to the shape of the image.
   Erosion: It is a non-linear operation related to the shape of the image.
2. Dilation: It adds pixels to the boundaries of objects in the image.
   Erosion: It removes pixels from the boundaries of objects in the image.
3. Dilation: It grows or thickens objects in a binary image.
   Erosion: It shrinks or thins objects in a binary image.
4. Dilation: It is given by $X \oplus B = \{ z \mid (\hat{B})_z \cap X \neq \emptyset \}$
   Erosion: It is given by $X \ominus B = \{ z \mid (B)_z \subseteq X \}$
5. Dilation: It replaces the pixel of the input image under the origin of the structuring element with the maximum-valued pixel of its neighbourhood.
   Erosion: It replaces the pixel of the input image under the origin of the structuring element with the minimum-valued pixel of its neighbourhood.
6. Dilation: Dilation of an image is equivalent to the complement of the erosion of the complemented image.
   Erosion: Erosion of an image is equivalent to the complement of the dilation of the complemented image.
Unit 5
- Image restoration can be defined as the process of removal or reduction of degradation
in an image through linear & nonlinear filtering.
- It is objective process.
- Degradation can be due to
- I) image sensor noise
- II) Blur due to miss-focus
- III) Blur due to motion
- IV) noise from transmission
- V) Blur due to transmission channel.
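$$F(u,v) = \frac{G(u,v)}{H(u,v)}$$
- In the presence of noise,
$$G(u,v) = H(u,v)\,F(u,v) + N(u,v)$$
$$F(u,v) = \frac{G(u,v) - N(u,v)}{H(u,v)}$$
$$F(u,v) = \frac{G(u,v)}{H(u,v)} - \frac{N(u,v)}{H(u,v)}$$
$$\hat{f}(x,y) = \sum_{k=-\infty}^{\infty} \sum_{l=-\infty}^{\infty} g(x-k, y-l)\,v(k,l)$$
- So $E\left[\left(f(x,y) - \sum_{k=-\infty}^{\infty} \sum_{l=-\infty}^{\infty} g(x-k, y-l)\,v(k,l)\right) v(k',l')\right] = 0$
- $E\{f(x,y)\,v(k',l')\} = E\left\{\sum_{k=-\infty}^{\infty} \sum_{l=-\infty}^{\infty} g(x-k, y-l)\,v(k,l)\,v(k',l')\right\}$
- $r_{fv}(x,y) = E\left\{\sum_{k=-\infty}^{\infty} \sum_{l=-\infty}^{\infty} g(x-k, y-l)\,v(k,l)\,v(k',l')\right\}$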
$$G(u,v) = \frac{H^*(u,v)\,S_{ff}(u,v)}{|H(u,v)|^2\,S_{ff}(u,v) + S_{nn}(u,v)}$$
$$G(u,v) = \frac{H^*(u,v)}{|H(u,v)|^2 + \dfrac{S_{nn}(u,v)}{S_{ff}(u,v)}}$$
- If H(u,v) = 1 then
$$G(u,v) = \frac{1}{(1)^2 + \dfrac{S_{nn}(u,v)}{S_{ff}(u,v)}} = \frac{S_{ff}(u,v)}{S_{ff}(u,v) + S_{nn}(u,v)} = \frac{\dfrac{S_{ff}(u,v)}{S_{nn}(u,v)}}{\dfrac{S_{ff}(u,v)}{S_{nn}(u,v)} + 1}$$
- $S_{SNR} = \dfrac{S_{ff}(u,v)}{S_{nn}(u,v)}$ is the signal to noise ratio at frequencies (u,v)
- So $G(u,v) = \dfrac{S_{SNR}}{S_{SNR} + 1}$
- From the equation we realized that when the SNR is large, G(u,v)=1 and when SNR is
small, G(u,v) = SSNR
- Hence G(u,v) acts as low pass filter. It is called the wiener smoothing filter.
- An important property of this filter is that signal attenuation is in proportional to the
signal to noise ratio.
- In the absence of noise, $S_{nn}(u,v) = 0$, so
$$G(u,v) = \frac{H^*(u,v)}{|H(u,v)|^2} = \frac{1}{H(u,v)}$$
- This is inverse filter. Since the blurring is usually a low pass operation, the wiener filter
in the absence of noise acts as high pass filter.
- When both noise and blur are present wiener filter act as bandpass filter.
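The two limiting cases discussed above can be checked numerically with the standard Wiener transfer function G = H* Sff / (|H|² Sff + Snn). The spectra below are hypothetical sample values at three frequencies, used only to illustrate the behaviour.

```python
import numpy as np

def wiener_filter(H, Sff, Snn):
    """Wiener filter transfer function for blur H, signal spectrum Sff and noise spectrum Snn."""
    return np.conj(H) * Sff / (np.abs(H) ** 2 * Sff + Snn)

# Hypothetical power spectra at three frequencies (signal power falls with frequency).
Sff = np.array([100.0, 10.0, 1.0])
Snn = np.array([1.0, 1.0, 1.0])
snr = Sff / Snn

# Case 1: no blur (H = 1) gives the smoothing filter SNR / (SNR + 1).
smoothing = wiener_filter(np.ones(3), Sff, Snn)

# Case 2: no noise (Snn = 0) gives the inverse filter 1 / H.
H = np.array([1.0, 0.5, 0.25])
inverse = wiener_filter(H, Sff, np.zeros(3))

print(smoothing)
print(inverse)
```

The smoothing filter stays near 1 where the SNR is high and drops where it is low, while the noise-free case boosts the frequencies that the blur attenuated.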
1) Gaussian Noise:
- It provides a good model of noise. It is very popular and is at times used when all other noise models fail.
- PDF is defined as $p(z) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{(z-\mu)^2}{2\sigma^2}}$
z = grey level
σ = standard deviation
μ = mean
2) Rayleigh Noise:
- PDF is defined as $p(z) = \frac{2}{b}(z - a)\,e^{-(z-a)^2/b}$ for $z \ge a$
= 0 for z < a
3) Gamma Noise:
- PDF is defined as $p(z) = \frac{a^b z^{b-1}}{(b-1)!}\,e^{-az}$ for $z \ge 0$
= 0 for z < 0
4) Exponential Noise:
- PDF is defined as $p(z) = a\,e^{-az}$ for $z \ge 0$
= 0 for z < 0
5) Uniform Noise:
- PDF is defined as $p(z) = \frac{1}{b-a}$ for $a \le z \le b$
= 0 otherwise
Question: What are the different types of order statistics filters? Discuss their
- Median, max and min filters are all order statistics filters.
- Max-Min filter:
- These are actually two separate filters. Like the median filter, they work within a neighbourhood. The max filter is given by,
f^(x,y) = max {g(m,n)}.
- This simply means that we take the maximum value from the neighbourhood and
replace it at the centre.
- The min filter is given by,
f^(x,y) = min{g(m,n)}.
- This simply means that we take the minimum value from the neighbourhood and
replace it at the centre.
- These filters help us identify the brightest and darkest points within the image.
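The max, min and median filters differ only in which value of the sorted neighbourhood they keep, so one helper can cover all three. The sketch assumes a 3x3 neighbourhood with replicated borders; the 4x4 image with one bright and one dark pixel is a hypothetical example.

```python
import numpy as np

def order_filter(image, reducer):
    """Replace every pixel by reducer(3x3 neighbourhood), e.g. np.max, np.min or np.median."""
    padded = np.pad(image, 1, mode="edge")           # assumption: replicate the border pixels
    out = np.empty_like(image)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = reducer(padded[i:i + 3, j:j + 3])
    return out

# Hypothetical image: uniform background with one bright (9) and one dark (1) pixel.
image = np.array([[5, 5, 5, 5],
                  [5, 9, 5, 5],
                  [5, 5, 1, 5],
                  [5, 5, 5, 5]])

brightest = order_filter(image, np.max)
darkest = order_filter(image, np.min)
middle = order_filter(image, np.median)
print(brightest)
print(darkest)
print(middle)
```

The max filter spreads the bright pixel over its neighbourhood, the min filter does the same for the dark pixel, and the median filter removes both outliers.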