0% found this document useful (0 votes)
20 views302 pages

CV Lab Journal

Uploaded by

aryankriti28
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
20 views302 pages

CV Lab Journal

Uploaded by

aryankriti28
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 302

Department of E& TC Engineering

Vishwakarma Institute of Technology, Pune

LAB JOURNAL
ET-2293 COMPUTER VISION

BATCH ET-A-1

Submitted by

Name: Aryan Mundra

Roll No: 8

GRNo: 12211487

Batch guide
Prof. Jyoti Madake
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

ET3221: Computer Vision

Name of the student:AAshish Rodi Roll No. 9


Div: A Batch: 1
Date of performance: 09-07-2024

Experiment No.1
Problem Statement:

Image and Video Access using Python

AIM:

Write a Python Code for following Operations:


1. Reading an Image from a Specified Path
2. Writing an Image to a Specified Path
3. Reading a Video without Audio from a Specified Path
4. Reading a Video with Audio from a Specified Path
5. Record a Video using your Web Camera and Write an Image to a Specified Path

Objective(s) of Experiment:
1. Read and write the image from filesystem
2. Read and write the video path using specific path as well as webcam for further
processing.

Introduction:

To process the image the Image reading and writing is important. Image processing is a
method to perform some operations on an image, in order to get an enhanced image or to
extract some useful information from it. It is a type of signal processing in which input is an
image and output may be image or characteristics/features associated with that image.
Nowadays, image processing is among rapidly growing technologies. There are two types of
methods used for image processing namely, analogue and digital image processing. Analogue
image processing can be used for the hard copies like printouts and photographs. Image
analysts use various fundamentals of interpretation while using these visual techniques. Digital
image processing techniques help in manipulation of the digital images by using computers.
The three general phases that all types of data have to undergo while using digital technique
are pre-processing, enhancement, and display, information extraction.

2
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Flowchart:

3
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Code and Results:

Reading an image

Displaying an image

Trying different values of flag

4
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Getting image shape

Output:

Output:

Read Image

5
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Writing an image

Input:

Output:

Reading a video (without audio)

6
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Reading a video (with audio)

Input:

Output:

7
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Record WebCam

Output:

Conclusions:
The image processing is supreme approach to improve the appearance of an image to a human
observer, to extract from an image quantitative information that is not readily apparent to the
eye, and to calibrate an image in photometric or geometric terms.
In this experiment, I have performed operations like reading an image, writing an image,
recording a video with and without audio and recording through web cam. The experiments
help me gain an understanding of the basic commands that are used to perform operations on
image. I also understood the different types of images like REDUCED_GRAYSCALE_2,
REDUCED_COLOR_2 etc. I also understood that the system treats the image as BGR which is
later converted to RGB for visualization.
8
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

ET3221: Computer Vision

Name of the student: Ashish Rodi Roll No. 9


Div: A Batch: 1
Date of performance: 16-07-2024

Experiment No. 2

Problem Statement:

Arithmetic Operations using Python

AIM:

Write a Python Code for following Arithmetic Operation:


1. Addition
2. Subtraction
3. Multiplication
4. Division
For the following Logical Bitwise Operation:
1. AND
2. OR
3. XOR
4. NOT

Objective(s) of Experiment:

Perform the athematic operation to increase/decrease the brightness and contrast

Introduction:

Arithmetic Operations like Addition, Subtraction, and Bitwise Operations (AND, OR, NOT,
XOR) can be applied to the input images. These operations can be helpful in enhancing the
properties of the input images. The Image arithmetic’s are important for analyzing the input
image properties. The operated images can be further used as an enhanced input image, and
many more operations can be applied for clarifying, thresholding, dilating etc of the image.

Adding or subtracting a constant value to/from each image pixel value can be used to
increases/decrease its brightness. Blending Adding images together produces a composite
image of both input images. This can be used to produce blending effects using weighted
addition. Multiplication and division can be used as a simple means of contrast adjustment and
extension to addition/subtraction (e.g., reduce contrast to 25% = division by 4; increase
contrast by 50% = multiplication by 1.5).

9
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Flowchart:

10
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Flowchart for Logical operations:

Code and Outputs:


import cv2
import numpy as np
import matplotlib.pyplot as plt
Addition Operation
path = "C:/Users/Ashish/Desktop/CVFolder/image.jfif"
img = cv2.imread(path, 1)
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
print(img.shape)
print(img) # elements
width = 10
height = 10
dim = (width, height)
# resize image

11
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

resized = cv2.resize(img, dim)


print(resized.shape)
print(resized)
plt.imshow(resized,cmap='gray')
(2035, 2048)
[[254 254 254 ... 255 255 255]
[254 254 254 ... 255 255 255]
[254 254 254 ... 255 255 255]
...
[255 255 255 ... 255 255 255]
[255 255 255 ... 255 255 255]
[255 255 255 ... 255 255 255]]
(10, 10)
[[255 255 255 255 255 255 255 255 255 255]
[255 255 255 255 248 220 255 255 255 255]
[255 255 130 133 219 220 242 247 255 255]
[255 255 66 76 131 131 71 107 255 255]
[255 255 61 104 199 100 232 71 255 255]
[255 255 227 81 64 90 103 200 255 255]
[255 255 142 58 103 199 65 96 255 255]
[255 255 255 86 87 78 138 255 255 255]
[255 255 255 255 255 255 255 255 255 255]
[255 255 255 255 255 255 255 255 255 255]]

<matplotlib.image.AxesImage at 0x267c30e7fb0>

12
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

k = 100
M = np.ones(resized.shape, dtype="uint8") * k
print(resized)
print(M)
add1 = cv2.add(resized, M)
add2 = resized + M
print("The addition with CV2", add1)
print("The addition with simple operator", add2)
print("Shape of input image", resized.shape)
#plt.imshow(resized,cmap='gray')
#plt.show()
plt.imshow(resized, cmap='gray')
plt.show()
plt.imshow(add1, cmap='gray')
plt.show()
plt.imshow(add2,cmap='gray')
plt.show()

[[255 255 255 255 255 255 255 255 255 255]
[255 255 255 255 248 220 255 255 255 255]
[255 255 130 133 219 220 242 247 255 255]
[255 255 66 76 131 131 71 107 255 255]
[255 255 61 104 199 100 232 71 255 255]
[255 255 227 81 64 90 103 200 255 255]
[255 255 142 58 103 199 65 96 255 255]
[255 255 255 86 87 78 138 255 255 255]
[255 255 255 255 255 255 255 255 255 255]
[255 255 255 255 255 255 255 255 255 255]]
[[100 100 100 100 100 100 100 100 100 100]
[100 100 100 100 100 100 100 100 100 100]
[100 100 100 100 100 100 100 100 100 100]
[100 100 100 100 100 100 100 100 100 100]
[100 100 100 100 100 100 100 100 100 100]
[100 100 100 100 100 100 100 100 100 100]
[100 100 100 100 100 100 100 100 100 100]
[100 100 100 100 100 100 100 100 100 100]
[100 100 100 100 100 100 100 100 100 100]
[100 100 100 100 100 100 100 100 100 100]]
The addition with CV2 [[255 255 255 255 255 255 255 255 255 255]
[255 255 255 255 255 255 255 255 255 255]
[255 255 230 233 255 255 255 255 255 255]
[255 255 166 176 231 231 171 207 255 255]
[255 255 161 204 255 200 255 171 255 255]
[255 255 255 181 164 190 203 255 255 255]
[255 255 242 158 203 255 165 196 255 255]
[255 255 255 186 187 178 238 255 255 255]
[255 255 255 255 255 255 255 255 255 255]
[255 255 255 255 255 255 255 255 255 255]]
The addition with simple operator [[ 99 99 99 99 99 99 99 99 99 99]
[ 99 99 99 99 92 64 99 99 99 99]
[ 99 99 230 233 63 64 86 91 99 99]
13
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

[ 99 99 166 176 231 231 171 207 99 99]


[ 99 99 161 204 43 200 76 171 99 99]
[ 99 99 71 181 164 190 203 44 99 99]
[ 99 99 242 158 203 43 165 196 99 99]
[ 99 99 99 186 187 178 238 99 99 99]
[ 99 99 99 99 99 99 99 99 99 99]
[ 99 99 99 99 99 99 99 99 99 99]]
Shape of input image (10, 10)

14
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

path = "C:/Users/Ashish/Desktop/CVFolder/image.jfif"
img = cv2.imread(path, 1)
#img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
print(img.shape)
print(img)

15
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

width = 10
height = 10
dim = (width, height)
# resize image
resized = cv2.resize(img, dim)
print(resized.shape)
print(resized)
plt.imshow(resized)
(2035, 2048, 3)
[[[254 254 254]
[254 254 254]
[254 254 254]
...
[255 255 255]
[255 255 255]
[255 255 255]]

[[254 254 254]


[254 254 254]
[254 254 254]
...
[255 255 255]
[255 255 255]
[255 255 255]]

[[254 254 254]


[254 254 254]
[254 254 254]
...
[255 255 255]
[255 255 255]
[255 255 255]]
...
<matplotlib.image.AxesImage at 0x267c3147140>

16
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

k = 150
M = np.ones(resized.shape, dtype="uint8") * k
print(resized)
print(M)
add1 = cv2.add(resized, M)
add2 = resized + M
print("The addition with CV2", add1)
print("The addition with simple operator", add2)
print("Shape of input image", resized.shape)
#plt.imshow(resized,cmap='gray')
#plt.show()
plt.imshow(resized)
plt.show()
plt.imshow(add1)
plt.show()
plt.imshow(add2)
plt.show()

The addition with CV2 [[[255 255 255]


The addition with simple operator [[[149 149 149]
Shape of input image (10, 10, 3)

17
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

18
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Subtraction Operation
path = "C:/Users/Ashish/Desktop/CVFolder/MM9.jpg"
img = cv2.imread(path, 1)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
width = 10
height = 10
dim = (width, height)
# resize image
resized = cv2.resize(img, dim)
k = 100
M = np.ones(resized.shape, dtype="uint8") * k
sub = cv2.subtract(resized, M)
plt.imshow(resized)
plt.show()
plt.imshow(sub)
plt.show()

19
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Multiplication Operation
path = "C:/Users/Ashish/Desktop/CVFolder/MM9.jpg"
img = cv2.imread(path, 1)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
width = 10
height = 10
20
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

dim = (width, height)


# resize image
resized = cv2.resize(img, dim)
k = 50
M = np.ones(resized.shape, dtype="uint8") * k
mul = cv2.multiply(resized, M)
plt.imshow(resized)
plt.show()
plt.imshow(mul)
plt.show()

21
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Division Operation
path = "C:/Users/Ashish/Desktop/CVFolder/MM9.jpg"
img = cv2.imread(path, 1)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
k = int(input("Enter intensity of 2nd img pixels: "))
width = 10
height = 10
dim = (width, height)
# resize image
resized = cv2.resize(img, dim)
print(resized)
M = np.ones(resized.shape, dtype="uint8") * k
div = cv2.divide(resized, M)
print("division")
print(div)
plt.imshow(resized)
plt.show()
plt.imshow(div)
plt.show()
[[[ 5 3 25]
[ 63 54 114]
[103 97 165]
[ 22 11 19]
[ 38 29 37]
[ 13 0 11]
[ 10 2 9]
[ 19 5 20]

22
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

[ 2 0 3]
[ 0 0 4]]

[[ 21 16 23]
[ 56 36 37]
[ 46 29 32]
[ 16 3 3]
[ 0 0 0]
[ 4 0 1]
[ 3 0 2]
[ 37 26 30]
[ 55 38 46]
[ 72 53 58]]

division
[[[ 2 2 12]
[ 32 27 57]
[ 52 48 82]
[ 11 6 10]
[ 19 14 18]
[ 6 0 6]
[ 5 1 4]
[ 10 2 10]
[ 1 0 2]
[ 0 0 2]]

[[ 10 8 12]
[ 28 18 18]
[ 23 14 16]
[ 8 2 2]
[ 0 0 0]
[ 2 0 0]
[ 2 0 1]
[ 18 13 15]
[ 28 19 23]
[ 36 26 29]]

23
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Logical Operations:
import cv2
import numpy as np
import matplotlib.pyplot as plt

24
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

AND Operator
img1 = cv2.imread(r"C:\Users\Ashish\Desktop\CVFolder\square2.jpg")
img1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)
img2 = cv2.imread(r"C:\Users\Ashish\Desktop\CVFolder\circle.jpg")
img2 = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)
print(img1.shape)
print(img2.shape)
# resixe 10 x10
width = 10
height = 10
dim = (width, height)
# resize image
imgR1 = cv2.resize(img1, dim)
imgR2 = cv2.resize(img2, dim)
AND = cv2.bitwise_and(imgR1, imgR2)
plt.imshow(imgR1, cmap='gray')
plt.show()
print(imgR1)
plt.imshow(imgR2, cmap='gray')
plt.show()
print(imgR2)
plt.imshow(AND, cmap='gray')
plt.show()
print(AND)
(350, 350)
(350, 350)

25
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

[[248 25 25 25 24 248 248 248 248 25]


[248 24 24 24 24 248 248 248 248 24]
[248 24 25 24 24 248 248 248 248 24]
[ 25 248 248 248 248 25 24 25 25 248]
[ 25 248 248 248 248 25 25 24 24 248]
[ 24 248 248 248 248 24 24 25 25 248]
[ 24 248 248 248 248 24 24 24 24 248]
[248 25 25 25 25 248 248 248 248 24]
[248 25 24 24 24 248 248 248 248 25]
[248 24 24 25 24 248 248 248 248 24]]

26
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

[[243 243 243 255 255 255 6 243 243 243]


[243 86 255 255 255 255 255 1 182 243]
[243 255 255 255 0 0 255 0 1 243]
[255 255 255 255 255 255 255 1 1 1]
[255 255 255 255 255 255 0 1 1 1]
[255 255 255 206 1 1 1 1 1 1]
[255 255 255 1 1 1 1 1 1 1]
[245 255 255 1 255 253 1 1 1 245]
[243 19 255 1 1 1 1 1 59 243]
[243 243 243 247 1 1 1 246 243 243]]

27
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

[[240 17 17 25 24 248 0 240 240 17]


[240 16 24 24 24 248 248 0 176 16]
[240 24 25 24 0 0 248 0 0 16]
[ 25 248 248 248 248 25 24 1 1 0]
[ 25 248 248 248 248 25 0 0 0 0]
[ 24 248 248 200 0 0 0 1 1 0]
[ 24 248 248 0 0 0 0 0 0 0]
[240 25 25 1 25 248 0 0 0 16]
[240 17 24 0 0 0 0 0 56 17]
[240 16 16 17 0 0 0 240 240 16]]
OR Operator
img1 = cv2.imread(r"C:\Users\Ashish\Desktop\CVFolder\square.jpg")
img1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)
img2 = cv2.imread(r"C:\Users\Ashish\Desktop\CVFolder\circle.jpg")
img2 = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)
# resixe 10 x10
width = 10
height = 10
dim = (width, height)
# resize image
imgR1 = cv2.resize(img1, dim)
imgR2 = cv2.resize(img2, dim)
OR = cv2.bitwise_or(imgR1, imgR2)
plt.imshow(imgR1, cmap='gray')
plt.show()
print(imgR1)
plt.imshow(imgR2, cmap='gray')

28
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

plt.show()
print(imgR2)
plt.imshow(OR, cmap='gray')
plt.show()
print(OR)

[[248 25 25 25 24 248 248 248 248 25]


[248 25 24 25 24 248 248 248 248 25]
[248 24 25 24 24 248 248 248 248 24]
[ 25 248 248 248 248 25 24 25 25 248]
[ 25 248 248 248 248 25 25 24 24 248]
[ 24 248 248 248 248 24 24 25 25 248]
[ 24 248 248 248 248 24 25 25 25 248]
[248 25 25 25 25 248 248 248 248 24]
[248 25 24 24 24 248 248 248 248 25]
[248 24 25 25 25 248 248 248 248 25]]

29
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

[[243 243 243 255 255 255 6 243 243 243]


[243 86 255 255 255 255 255 1 182 243]
[243 255 255 255 0 0 255 0 1 243]
[255 255 255 255 255 255 255 1 1 1]
[255 255 255 255 255 255 0 1 1 1]
[255 255 255 206 1 1 1 1 1 1]
[255 255 255 1 1 1 1 1 1 1]
[245 255 255 1 255 253 1 1 1 245]
[243 19 255 1 1 1 1 1 59 243]
[243 243 243 247 1 1 1 246 243 243]]

30
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

[[251 251 251 255 255 255 254 251 251 251]
[251 95 255 255 255 255 255 249 254 251]
[251 255 255 255 24 248 255 248 249 251]
[255 255 255 255 255 255 255 25 25 249]
[255 255 255 255 255 255 25 25 25 249]
[255 255 255 254 249 25 25 25 25 249]
[255 255 255 249 249 25 25 25 25 249]
[253 255 255 25 255 253 249 249 249 253]
[251 27 255 25 25 249 249 249 251 251]
[251 251 251 255 25 249 249 254 251 251]]
XOR Image
img1 = cv2.imread(r"C:\Users\Ashish\Desktop\CVFolder\square.jpg")
img1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)
img2 = cv2.imread(r"C:\Users\Ashish\Desktop\CVFolder\circle.jpg")
img2 = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)
print(img1.shape)
print(img2.shape)
# resixe 10 x10
width = 10
height = 10
dim = (width, height)
# resize image
imgR1 = cv2.resize(img1, dim)
imgR2 = cv2.resize(img2, dim)
XOR = cv2.bitwise_xor(imgR1, imgR2)
plt.imshow(imgR1, cmap='gray')
plt.show()

31
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

print(imgR1)
plt.imshow(imgR2, cmap='gray')
plt.show()
print(imgR2)
plt.imshow(OR, cmap='gray')
plt.show()
print(OR)
(350, 350)
(350, 350)

[[248 25 25 25 24 248 248 248 248 25]


[248 25 24 25 24 248 248 248 248 25]
[248 24 25 24 24 248 248 248 248 24]
[ 25 248 248 248 248 25 24 25 25 248]
[ 25 248 248 248 248 25 25 24 24 248]
[ 24 248 248 248 248 24 24 25 25 248]
[ 24 248 248 248 248 24 25 25 25 248]
[248 25 25 25 25 248 248 248 248 24]
[248 25 24 24 24 248 248 248 248 25]
[248 24 25 25 25 248 248 248 248 25]]

32
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

[[243 243 243 255 255 255 6 243 243 243]


[243 86 255 255 255 255 255 1 182 243]
[243 255 255 255 0 0 255 0 1 243]
[255 255 255 255 255 255 255 1 1 1]
[255 255 255 255 255 255 0 1 1 1]
[255 255 255 206 1 1 1 1 1 1]
[255 255 255 1 1 1 1 1 1 1]
[245 255 255 1 255 253 1 1 1 245]
[243 19 255 1 1 1 1 1 59 243]
[243 243 243 247 1 1 1 246 243 243]]

33
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

[[251 251 251 255 255 255 254 251 251 251]
[251 95 255 255 255 255 255 249 254 251]
[251 255 255 255 24 248 255 248 249 251]
[255 255 255 255 255 255 255 25 25 249]
[255 255 255 255 255 255 25 25 25 249]
[255 255 255 254 249 25 25 25 25 249]
[255 255 255 249 249 25 25 25 25 249]
[253 255 255 25 255 253 249 249 249 253]
[251 27 255 25 25 249 249 249 251 251]
[251 251 251 255 25 249 249 254 251 251]]
NOT Operator
img1 = cv2.imread("square.jpg")
img1 = cv2.cvtColor(img1,cv2.COLOR_BGR2RGB)
NOT = cv2.bitwise_not(img1)
width = 10
height = 10
dim = (width, height)
# resize image
imgR1 = cv2.resize(img1, dim)
imgR2 = cv2.resize(NOT, dim)
XOR = cv2.bitwise_xor(imgR1, imgR2)
plt.imshow(imgR1)
plt.show()
plt.imshow(imgR2)
plt.show()

34
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

[[[251 248 241]


[ 26 23 30]
[ 26 23 30]
[ 26 23 30]
[ 25 22 29]
[251 248 241]
[251 248 241]
[251 248 241]
[251 248 241]
[ 26 23 30]]
[[251 248 241]
[ 26 23 30]
[ 25 22 29]
[ 26 23 30]
[ 25 22 29]
[251 248 241]
[251 248 241]
[251 248 241]
[251 248 241]
[ 26 23 30]]
[[251 248 241]
[ 25 22 29]
[ 26 23 30]
[ 25 22 29]
[ 25 22 29]
[251 248 241]
[251 248 241]
[251 248 241]
[251 248 241]
[ 25 22 29]]
[[ 26 23 30]
[251 248 241]
[251 248 241]
[251 248 241]
[251 248 241]
[ 26 23 30]
[ 25 22 29]
[ 26 23 30]
[ 26 23 30]
[251 248 241]]

35
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Conclusion:
The addition of two images of the same size results in a new image of the same size whose
pixels are to the sum of the pixels in the original images. The subtraction of two images detects
changes between input images. Multiplication and division are useful in contrast adjustment.
The division of two images correct non- homogeneous illumination and shadow removal. Also,
whenever division operation takes place there is a significant darkening of the images, so for a
large factor the image may become complete black. Also, using the individual pixel values, the
arithmatic operations and logical operations can be easily verified.

36
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

ET3221: Computer Vision

Name of the student: Ashish Rodi Roll No. 9


Div: A Batch: 1
Date of performance: 23-07-2024

Experiment No. 3

Problem Statement:

Geometric Transformations

AIM:

Write a Python Code for following Geometric Transformations:


1. Translation
2. Rotation
3. Scaling
4. Cropping
5. Shearing
6. Flipping

Objective(s) of Experiment:
To perform the geometric transformations of images.

Introduction:

Geometric transformations are widely used for image registration and the removal of
geometric distortion. Scaling is just resizing of the image. Rotation is a concept in
mathematics that is a motion of a certain space that preserves at least one point. Image rotation
is a common image processing routine with applications in matching, alignment, and other
image-based algorithms, it is also extensively in data augmentation, especially when it comes
to image classification.
Image transformation is a coordinate changing function, it maps some (x, y) points in one
coordinate system to points (x', y') in another coordinate system. Shear mapping is a linear
map that displaces each point in a fixed direction, it substitutes every point horizontally or
vertically by a specific value in proportion to its x or y coordinates, there are two types of
shearing effects. Image cropping is the removal of unwanted outer areas from an image, a lot
of the above examples introduced black pixels, you can easily remove them using cropping.
Common applications include construction of mosaics, geographical mapping, stereo and
video.

37
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Flowchart:

Code and results:


import cv2
import numpy as np
import matplotlib.pyplot as plt
Translation Operation
38
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

# reads an input image


img = cv2.imread(r"Mario.jpg")
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) #BGR 2 RGB for plotting using
matplotlib
h, w = img.shape[:2]
x = int(input("Enter x-axis shift: "))
y = int(input("Enter y-axis shift: "))
t_matrix = np.float32([[1,0,x], [0,1,y]]) #Creating translation matrix
n_img = cv2.warpAffine(img, t_matrix, (w, h))
plt.imshow(img)
plt.show()
plt.imshow(n_img)
plt.show()

39
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Rotation Operation
# reads an input image
img = cv2.imread(r"Mario.jpg")
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) #BGR 2 RGB for plotting using
matplotlib
h, w = img.shape[:2]
center = (w//2, h//2)
ang = int(input("Enter degree of rotation: ")) #Positive value for anti-clockwise &
negative for clockwise
r_matrix = cv2.getRotationMatrix2D(center, ang, 1.0) #Generating rotational matrix
r = cv2.warpAffine(img, r_matrix, (w, h))
plt.imshow(img)
plt.show()
plt.imshow(r)
plt.show()

40
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Scaling Operation
# reads an input image
img = cv2.imread(r"Mario.jpg")
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) #BGR 2 RGB for plotting using
matplotlib

41
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

h, w = img.shape[:2]
scale = float(input("Enter scaling factor: ")) #scale>1:Img is scaled-up , scale<1:Img
is scaled-down
h = int(img.shape[0] * scale) #Scaling in y-axis
w = int(img.shape[1] * scale) #Scaling in x-axis
re = cv2.resize(img, (w, h)) #Resizing img with scaled values
plt.imshow(img)
plt.show()
plt.imshow(re)
plt.show()

42
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Cropping Operation
# reads an input image
img = cv2.imread(r"Mario.jpg")
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) #BGR 2 RGB for plotting using
matplotlib
h, w = img.shape[:2]
print(img.shape)
cxlow = int(input("Enter lower-x limit: ")) #Enter cropping limits
cxup = int(input("Enter upper-x limit: "))
cylow = int(input("Enter lower-y limit: "))
cyup = int(input("Enter upper-y limit: "))
crop = img[cylow:cyup, cxlow:cxup] #Only displaying the selected part of
img
plt.imshow(img)
plt.show()
plt.imshow(crop)
plt.show()
(1400, 1400, 3)

43
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Flipping Operation
# reads an input image
img = cv2.imread(r"Mario.jpg")
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) #BGR 2 RGB for plotting using
matplotlib

44
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

h, w = img.shape[:2]
flip1 = cv2.flip(img, 1) # left right flip
flip2 = cv2.flip(img, 0) # upside down
flip3 = cv2.flip(img, -1) #left right flip + upside down
plt.imshow(img)
plt.show()
plt.imshow(flip1)
plt.show()
plt.imshow(flip2)
plt.show()
plt.imshow(flip3)
plt.show()

45
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

46
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

# reads an input image


img = cv2.imread(r"Mario.jpg")
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) #BGR 2 RGB for plotting using
matplotlib
h, w = img.shape[:2]
m = int(input("Enter 1 for horizontal, 2 for vertical or 3 for both axes flipping: "))
if m==1:
matrix = np.float32([[-1,0,w], [0,1,0], [0,0,1]]) #Creating matrix for horizontal flipping
flip = cv2.warpPerspective(img, matrix, (w,h))
elif m==2:
matrix = np.float32([[1,0,0], [0,-1,h], [0,0,1]]) #Creating matrix for vertical flipping
flip = cv2.warpPerspective(img, matrix, (w,h))
elif m==3:
matrix = np.float32([[-1,0,w], [0,-1,h], [0,0,1]]) #Creating matrix for flipping both axes
flip = cv2.warpPerspective(img, matrix, (w,h))
plt.imshow(img)
plt.show()
plt.imshow(flip)
plt.show()

47
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Shearing Operation
# reads an input image
img = cv2.imread(r"Mario.jpg")
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) #BGR 2 RGB for plotting using
matplotlib

48
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

h, w = img.shape[:2]
m = int(input("Enter 1 for horizontal and 2 for vertical shear: "))
if m==1:
k = float(input("Enter shearing factor: "))
sh_matrix = np.float32([[1,k,0], [0,1,0], [0,0,1]]) #Creating shearing matrix for
given factor
sheared = cv2.warpPerspective(img, sh_matrix, (int(w*(k+1)), h)) #Increasing width by
the same factor
if m==2:
k = float(input("Enter shearing factor: "))
sh_matrix = np.float32([[1,0,0], [k,1,0], [0,0,1]])
sheared = cv2.warpPerspective(img, sh_matrix, (w, int(h*(k+1)))) #Increasing height by
the same factor
plt.imshow(img)
plt.show()
plt.imshow(sheared)
plt.show()

Conclusions:
In these experiments, we performed the basics of image processing and transformation,
which are image translation, scaling, shearing, reflection, rotation, and cropping. Translation
and rotation is shifting image in horizontal direction and rotation. Scaling means adjusting
size. Shearing means skewing. Cropping means extracting a portion of image. Reflection
means flipping the image.
49
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

ET3221: Computer Vision

Name of the student: Ashish Rodi Roll No. 9


Div: A Batch: 1
Date of performance: 23-07-2024

Experiment No. 4

Problem Statement:

Intensity Transformations using python

AIM:

Write a Python Code for following Intensity Transformations:


1. Intensity Level Slicing Without Background
2. Intensity Level Slicing with Background
3. Log
4. Power Law

Objective(s) of Experiment:
To enhance the image intensity using level slicing, log and power law transformation

Introduction:

Intensity Level Slicing highlights the particular range of gray levels in an image. It handles a
group of intensity levels in an image up to a specific range by reducing rest or by leaving them
alone.

Log transformation of an image means replacing all pixel values, present in the image, with
its logarithmic values. Log transformation is used for image enhancement as it expands dark
pixels of the image as compared to higher pixel values.

Power – Law transformations is used for enhancing images for different type of display
devices. The gamma of different display devices is different. For example, Gamma of CRT
lies in between of 1.8 to 2.5, that means the image displayed on CRT is dark.

Flowchart:

50
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

51
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Code and Results:


import cv2
import numpy as np
import matplotlib.pyplot as plt
import imageio
import matplotlib.pyplot as plt
image = cv2.imread(r"Bridge.jpg")
plt.imshow(image, 'gray')
plt.figure(figsize = (6,6))
plt.imshow(255 - image)
<matplotlib.image.AxesImage at 0x196cacda270>

52
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Intensity Level Slicing Without Background


image = cv2.imread(r"Bridge.jpg")
#image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
x,y = image.shape[:2]
plt.imshow(gray,'gray')
plt.title('Original Image')
plt.show()

z = np.zeros((x,y))
for i in range(0,x):
for j in range(0,y):
if(gray[i][j]>50 and gray[i][j]<150):
z[i][j]=255
else:
z[i][j]=0

plt.imshow(z, 'gray')
plt.title('Sliced Original Image')
plt.show()

53
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Intensity Level Slicing With Background


image = cv2.imread(r"Bridge.jpg")
#image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
x,y = image.shape[:2]
plt.imshow(gray,'gray')
plt.title('Original Image')

54
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

plt.show()
z = np.zeros((x,y))
for i in range(0,x):
for j in range(0,y):
if(gray[i][j]>50 and gray[i][j]<150):
z[i][j]=255
else:
z[i][j]=gray[i][j]
plt.imshow(z, 'gray')
plt.title('Sliced with background')
plt.show()

55
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Log Transform
image = cv2.imread(r"Bridge.jpg")
#image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
x,y = image.shape[:2]
plt.imshow(gray,'gray')
plt.title('Original Image')
plt.show()
c = 255/(np.log(1+np.max(gray)))
log = c*np.log(1+gray)
log = np.array(log,dtype=np.uint8)
plt.imshow(log, 'gray')
plt.show()

56
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

C:\Users\Ashish\AppData\Local\Temp\ipykernel_19928\2741588362.py:9: RuntimeWarning:
divide by zero encountered in log
log = c*np.log(1+gray)
C:\Users\Ashish\AppData\Local\Temp\ipykernel_19928\2741588362.py:10: RuntimeWarning:
invalid value encountered in cast
log = np.array(log,dtype=np.uint8)

Power Law

57
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

image = cv2.imread(r"Bridge.jpg")
#image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
x,y = image.shape[:2]
plt.imshow(gray,'gray')
plt.title('Original Image')
plt.show()
z = np.zeros((x,y))
gamma = float(input("Enter gamma value: "))
gamma1 = float(input("Enter input value"))
gamma2 = float(input("Enter value"))
c = 255/(np.log(1+np.max(gray)))
z = np.array(c*(gray**gamma), dtype = 'uint8')
z1 = np.array(c*(gray**gamma1), dtype= 'uint8')
z2 = np.array(c*(gray**gamma2), dtype = 'uint8')
print("gamma = ",gamma)
plt.imshow(z, 'gray')
plt.show()
plt.imshow(z1,'gray')
plt.show()
plt.imshow(z2, 'gray')
plt.show()
print(z)
print(z1)
print(z2)

C:\Users\Ashish\AppData\Local\Temp\ipykernel_19928\2293666078.py:15: RuntimeWarning:
divide by zero encountered in power
z2 = np.array(c*(gray**gamma2), dtype = 'uint8')
C:\Users\Ashish\AppData\Local\Temp\ipykernel_19928\2293666078.py:15: RuntimeWarning:
58
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

invalid value encountered in cast


z2 = np.array(c*(gray**gamma2), dtype = 'uint8')

gamma = 0.2

gamma = 0.6

gamma = -0.3

59
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

[[128 128 129 ... 138 138 138]


[128 128 128 ... 138 138 138]
[128 128 128 ... 138 138 138]
...
[128 131 130 ... 131 131 132]
[129 130 131 ... 128 129 128]
[129 128 127 ... 131 131 131]]
[[226 237 251 ... 235 235 235]
[237 241 244 ... 235 232 232]
[244 241 241 ... 232 232 232]
...
[237 47 30 ... 43 57 64]
[251 30 57 ... 244 251 230]
[ 13 237 219 ... 47 57 40]]
[[9 9 9 ... 8 8 8]
[9 9 9 ... 8 8 8]
[9 9 9 ... 8 8 8]
...
[9 9 9 ... 9 9 9]
[9 9 9 ... 9 9 9]
[9 9 9 ... 9 9 9]]
Conclusions:
Thus, the given operations have been performed successfully. The experiment gives an idea
about how different transforms change the characteristics of the image. Slicing refers to
highlighting a particular grayscale range and keep it as it is. It can be done with and without
background. If done without background then background is changed to level 0 or else kept as it
is in case of with background. The Log transform brightens up the image by performing a
logarithmic operation. Power transform uses an entity called gamma which is an exponential
term. The value of gamma decides how the image would look. The effect of gamma value
would be different for different images as it may get washed out beyond a certain value. But
60
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

usually, as the value increases the image becomes darker.

ET3221: Computer Vision

Name of the student: Ashish Rodi Roll No. 9


Div: A Batch: 1
Date of performance: 30-07-2024

Experiment No. 5

Problem Statement:

Histogram Equalization using python

AIM:

Write a Python Code for following Enhancement Operations:


1. Plotting a Histogram for an Image
2. Linear Stretching
3. Histogram Equalization

Objective(s) of Experiment:

To perform the histogram of an image, linear stretching, and histogram equalization.

Introduction:

An image histogram is a type of histogram that acts as a graphical representation of the tonal
distribution in a digital image. It plots the number of pixels for each tonal value. By looking
at the histogram for a specific image a viewer will be able to judge the entire tonal distribution
at a glance.

The horizontal axis of the graph represents the tonal variations, while the vertical axis
represents the number of pixels in that particular tone. The left side of the horizontal axis
represents the black and dark areas, the middle represents medium grey and the right-hand side
represents light and pure white areas. The vertical axis represents the size of the area that is
captured in each one of these zones.

61
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Flowchart:

62
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Code and Results:

Plot Histogram of Image

import cv2
import matplotlib.pyplot as plt
import numpy as np
from skimage import exposure
from skimage.exposure import match_histograms
from matplotlib.colors import NoNorm
# reads an input image
img = cv2.imread(r"Apple.jpg")
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
plt.imshow(gray,'gray')
plt.title('Original Image')
plt.show()
# find frequency of pixels in range 0-255, with 256 bins
histr = cv2.calcHist([gray],[0],None,[256],[0,255])
histr1 = cv2.calcHist([gray],[0],None,[16],[0,255])
# show the plotting graph of an image
plt.plot(histr)
plt.xlabel('number of bins')
plt.ylabel('Number of pixels')
plt.title('Histogram')
plt.show()
plt.plot(histr1)
plt.xlabel('number of bins')
plt.ylabel('Number of pixels')
plt.title('Histogram with reduced bins')
plt.show()

63
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

64
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

# reads an input image


img = cv2.imread(r"Apple.jpg")
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
plt.imshow(gray,'gray')
plt.title('Original Image')
plt.show()
# find frequency of pixels in range 0-255
histr = cv2.calcHist([gray],[0],None,[255],[0,256])
# show the plotting graph of an image
plt.plot(histr)
plt.xlabel('Intensity value')
plt.ylabel('Number of pixels')
plt.title('Histogram')
plt.show()

65
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Color Histogram
import cv2
import numpy as np
from matplotlib import pyplot as plt

66
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

# Read the image


img = cv2.imread(r"Apple.jpg")
# Define color channels
color = ('b', 'g', 'r')
for i, col in enumerate(color):
# Calculate the histogram for each color channel
histr = cv2.calcHist([img], [i], None, [256], [0, 256])
# Plot the histogram
plt.plot(histr, color=col)
plt.xlabel('Intensity value')
plt.ylabel('Number of pixels')
plt.title(f'Histogram for {col.upper()} channel')
plt.show()

67
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

68
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Linear Stretching
# reads an input image
img = cv2.imread(r"woods.jpg")

img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)


gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
plt.imshow(gray,'gray')
plt.title('Original Image')

plt.imshow(gray,cmap ='gray', norm=NoNorm())


plt.title('Original Image without normalization')

69
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

gray.shape
(382, 612)
plt.hist(gray.flatten()*255)
plt.title('Original Image Histogram')

gray.min()
2
70
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

gray.max()
255
def pixelVal(pix, r1, s1, r2, s2):
if (0 <= pix and pix <= r1):
return (s1 / r1)*pix
elif (r1 < pix and pix <= r2):
return ((s2 - s1)/(r2 - r1)) * (pix - r1) + s1
else:
return ((255 - s2)/(255 - r2)) * (pix - r2) + s2

plt.imshow(img.astype('uint8')) # function is used to cast a pandas object to a specified data


type
plt.title('Original Image')
plt.show()
# Define parameters.
r1 = 18
s1 = 0
r2 = 213
s2 = 255
# Vectorize the function to apply it to each value in the Numpy array.
pixelVal_vec = np.vectorize(pixelVal)

# Apply contrast stretching.


linear_stretched = pixelVal_vec(img, r1, s1, r2, s2)
plt.imshow(linear_stretched.astype('uint8'))
plt.title('Result')
plt.show()
plt.hist(linear_stretched.flatten())
plt.title('Original Image Histogram')

71
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

72
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

73
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Histogram Equalization
#Apply histogram equalization
equl = cv2.equalizeHist(gray)
plt.imshow(equl,'gray')
plt.title('Equalized Image')
plt.show()

histr2 = cv2.calcHist([equl],[0],None,[256],[0,256])
plt.plot(histr2)
plt.xlabel('Intensity value')
plt.ylabel('Number of pixels')
plt.title('Equalised histogram')
plt.show()

74
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Histogram Matching (Example 1)


#Match the histogram of image woods with coffee
# reading main image
img1 = cv2.imread("woods.jpg")
img1 = cv2.cvtColor(img1, cv2.COLOR_BGR2RGB)

# checking the number of channels


print('No of Channel is: ' + str(img1.ndim))

# reading reference image


img2 = cv2.imread("coffee.jpg")
img2 = cv2.cvtColor(img2, cv2.COLOR_BGR2RGB)
# checking the number of channels
print('No of Channel is: ' + str(img2.ndim))
image = img1
reference = img2
matched = match_histograms(image, reference)
fig, (ax1, ax2, ax3) = plt.subplots(nrows=1, ncols=3,figsize=(8, 3),sharex=True, sharey=True)
for aa in (ax1, ax2, ax3):
aa.set_axis_off()
ax1.imshow(image)
ax1.set_title('Source')
ax2.imshow(reference)
ax2.set_title('Reference')
ax3.imshow(matched)
ax3.set_title('Matched')
75
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

plt.tight_layout()
plt.show()
fig, axes = plt.subplots(nrows=3, ncols=3, figsize=(8, 8))
for i, img in enumerate((image, reference, matched)):
for c, c_color in enumerate(('red', 'green', 'blue')):
img_hist, bins = exposure.histogram(img[..., c],source_range='dtype)
axes[c, i].plot(bins, img_hist / img_hist.max())
img_cdf, bins = exposure.cumulative_distribution(img[..., c])
axes[c, i].plot(bins, img_cdf)
axes[c, 0].set_ylabel(c_color)
axes[0, 0].set_title('Source')
axes[0, 1].set_title('Reference')
axes[0, 2].set_title('Matched')

plt.tight_layout()
plt.show()
No of Channel is: 3
No of Channel is: 3

76
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Histogram Matching (Example 2)


#Match the histogram of image kolkata with coffee
# reading main image
img1 = cv2.imread("kolkata.webp")
img1 = cv2.cvtColor(img1, cv2.COLOR_BGR2RGB)
# checking the number of channels
print('No of Channel is: ' + str(img1.ndim))
# reading reference image
img2 = cv2.imread("coffee.jpg")
img2 = cv2.cvtColor(img2, cv2.COLOR_BGR2RGB)
# checking the number of channels
print('No of Channel is: ' + str(img2.ndim))
image = img1
reference = img2

matched = match_histograms(image, reference)

fig, (ax1, ax2, ax3) = plt.subplots(nrows=1, ncols=3, figsize=(8, 3),sharex=True,

77
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

sharey=True)
for aa in (ax1, ax2, ax3):
aa.set_axis_off()
ax1.imshow(image)
ax1.set_title('Source')
ax2.imshow(reference)
ax2.set_title('Reference')
ax3.imshow(matched)
ax3.set_title('Matched')
plt.tight_layout()
plt.show()
fig, axes = plt.subplots(nrows=3, ncols=3, figsize=(8, 8))
for i, img in enumerate((image, reference, matched)):
for c, c_color in enumerate(('red', 'green', 'blue')):
img_hist, bins = exposure.histogram(img[..., c],
source_range='dtype')
axes[c, i].plot(bins, img_hist / img_hist.max())
img_cdf, bins = exposure.cumulative_distribution(img[..., c])
axes[c, i].plot(bins, img_cdf)
axes[c, 0].set_ylabel(c_color)
axes[0, 0].set_title('Source')
axes[0, 1].set_title('Reference')
axes[0, 2].set_title('Matched')
plt.tight_layout()
plt.show()

No of Channel is: 3
No of Channel is: 3

78
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Plotting Gray-Scale Normalized Histogram


from matplotlib import pyplot as plt
import cv2

# load the input image and convert it to grayscale


image = cv2.imread("woods.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# compute a grayscale histogram


hist = cv2.calcHist([image], [0], None, [256], [0, 256])

# matplotlib expects RGB images so convert and then display the image
# with matplotlib
plt.figure()
plt.axis("off")
plt.imshow(cv2.cvtColor(image, cv2.COLOR_GRAY2RGB))

# plot the histogram


plt.figure()

79
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

plt.title("Grayscale Histogram")
plt.xlabel("Bins")
plt.ylabel("# of Pixels")
plt.plot(hist)
plt.xlim([0, 256])
# normalize the histogram
hist /= hist.sum()
# plot the normalized histogram
plt.figure()
plt.title("Grayscale Histogram (Normalized)")
plt.xlabel("Bins")
plt.ylabel("% of Pixels")
plt.plot(hist)
plt.xlim([0, 256])
plt.show()

80
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

81
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

def pixelVal(pix, r1, s1, r2, s2):


if (0 <= pix and pix <= r1):
return (s1 / r1)*pix
elif (r1 < pix and pix <= r2):
return ((s2 - s1)/(r2 - r1)) * (pix - r1) + s1
else:
return ((255 - s2)/(255 - r2)) * (pix - r2) + s2

# reads an input image


img = cv2.imread(r"woods.jpg")

img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)


gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
plt.imshow(img.astype('uint8'))
plt.title('Original Image')
plt.show()

# find frequency of pixels in range 0-255


histr1 = cv2.calcHist([gray],[0],None,[256],[0,255])

# show the plotting graph of an image


plt.plot(histr1)
plt.xlabel('Intensity value')
plt.ylabel('Number of pixels')
plt.title('Histogram')
plt.show()

# Define parameters.
r1 = 60
r2 = 190

s1 = 0
s2 = 255

# Vectorize the function to apply it to each value in the Numpy array.


pixelVal_vec = np.vectorize(pixelVal)
# Apply contrast stretching.
linear_stretched = pixelVal_vec(img, r1, s1, r2, s2)
plt.imshow(linear_stretched.astype('uint8'))
plt.title('Result')
plt.show()
# find frequency of pixels in range 0-255
histr2 = cv2.calcHist([linear_stretched.astype(np.float32)],[0],None,[256],[0,255])
# show the plotting graph of an image
plt.plot(histr2)
plt.xlabel('Intensity value')
plt.ylabel('Number of pixels')
plt.title('Histogram')
plt.show()

82
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

83
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Conclusions:
Histogram image contrast has been enhanced and its histogram has also been equalized. There
is also one important thing to be noted here that during histogram equalization the overall shape
of the histogram changes, whereas in histogram stretching the overall shape of histogram
remains same.

84
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

ET3221: Computer Vision

Name of the student: Ashish Rodi Roll No. 9


Div: A Batch: 1
Date of performance: 06-08-2024

Experiment No. 6

Problem Statement:
Spatial Domain Filters and Frequency Domain Filtering

AIM:

Write a Python Code for following Enhancement Operations:


Spatial domain filtering
1. Low Pass Average Filter
2. Low Pass Median Filter
3. High Pass Filter
4. High Boost Filter
Frequency domain filtering:
1. Gaussian blur filter
2. High Pass Filter
3. Low Pass Filter
4. Band pass filter
5. Band stop filter

Objective(s) of Experiment:
Enhancement operation of image using spatial domain filters

Introduction:
Spatial Filtering technique is used directly on pixels of an image. Mask is usually considered
to be added in size so that it has specific center pixel. This mask is moved on the image such
that the center of the mask traverses all image pixels.
● Low Pass Average Filter: A low-pass filter is a filter that passes signals with a
frequency lower than a selected cutoff frequency and attenuates signals with
frequencies higher than the cutoff frequency. The exact frequency response of
the filter depends on the filter design.
● Low Pass Median Filter: Low pass filtering (aka smoothing) is employed to
remove high spatial frequency noise from a digital image. The low-pass filters
usually employ moving window operator which affects one pixel of the image at
a time, changing its value by some function of a local region (window) of pixels
● High Pass Filter: A high pass filter tends to retain the high frequency information
within an image while reducing the low frequency information. The kernel of the
high pass filter is designed to increase the brightness of the center pixel relative
to neighboring pixels.

85
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

● High Boost Filter: The high boost filter can be used to enhance the high
frequency components. We can sharpen edges of a image through the
amplification and obtain a more clear image. The high boost filter is simply the
sharpening operator in image processing.
Frequency domain filtering is a powerful technique in image processing that involves
transforming an image from the spatial domain to the frequency domain, applying filters,
and then converting it back. This approach allows for advanced manipulation of image
features based on their frequency components. Its techniques involve:
1. Transforming the Image: Converting the image from the spatial domain to the frequency
domain using the Fast Fourier Transform (FFT).
2. Applying Filters: Utilizing various filters in the frequency domain to manipulate specific
frequency components of the image:
- Gaussian Blur Filter: Smooths the image by reducing high-frequency noise and details,
acting as a low-pass filter.
- High Pass Filter: Enhances edges and fine details by suppressing low-frequency
components.
- Low Pass Filter: Removes high-frequency noise and blurs the image by allowing only
low-frequency components to pass.
- Band Pass Filter: Isolates and passes a specific range of frequencies while blocking
others. - Band Stop Filter: Suppresses a specific range of frequencies to remove periodic
noise, allowing other frequencies to pass.
3. Transforming Back: Converting the filtered image back to the spatial domain using the
inverse Fast Fourier Transform (IFFT). This process allows for effective noise reduction,
feature enhancement, and frequency-specific image modifications

86
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Flowchart:

87
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Code and outputs: (Spatial domain filtering)


import cv2
import numpy as np
import matplotlib.pyplot as plt
import cv2
import numpy as np
image = cv2.imread("Rose.jpeg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
x,y = gray.shape[:2]

Low Pass Mean Filter


kernel = np.ones([3, 3], dtype = int)
out1 = cv2.filter2D(src=image, ddepth=-1, kernel=kernel/9)
out2 = cv2.blur(src=image, ksize=(3,3)) # 5x5, 7x7, 11x11
out3 = cv2.blur(src=image, ksize=(5,5))
out4 = cv2.blur(src=image, ksize=(11,11))
out5 = cv2.blur(src=image, ksize=(12,12))
plt.imshow(image, 'gray')
plt.title('Input Image')
plt.show()
plt.imshow(out1, 'gray')
plt.title('Low Pass Average Filter')
plt.show()
plt.imshow(out2, 'gray')
plt.title('Low Pass Average Filter - In built function')
plt.show()
plt.imshow(out3, 'gray')
plt.title('Low Pass Average Filter - In built function')
88
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

plt.show()
plt.imshow(out4, 'gray')
plt.title('Low Pass Average Filter - In built function')
plt.show()
plt.imshow(out5, 'gray')
plt.title('Low Pass Average Filter - In built function')
plt.show()

89
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

90
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

91
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Low Pass Median Filter


out1 = cv2.medianBlur(src=image, ksize = 3) # 5, 7, 9, 11
out2 = cv2.medianBlur(src=image, ksize = 5) # 5, 7, 9, 11
out3 = cv2.medianBlur(src=image, ksize = 7) # 5, 7, 9, 11
out4 = cv2.medianBlur(src=image, ksize = 9) # 5, 7, 9, 11
plt.imshow(image, 'gray')
plt.title('Input Image')
plt.show()
plt.imshow(out1, 'gray')
plt.title('Low Pass Median Filter')
plt.show()
plt.imshow(out2, 'gray')
plt.title('Low Pass Median Filter')
plt.show()
plt.imshow(out3, 'gray')
plt.title('Low Pass Median Filter')
plt.show()
plt.imshow(out4, 'gray')
plt.title('Low Pass Median Filter')
plt.show()

92
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

93
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

94
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

High Pass Filter


kernel = np.array([[-1/9, -1/9, -1/9],
[-1/9, 8/9, -1/9],
[-1/9, -1/9, -1/9]])

kernel2 = np.array([[-1, -1, -1],


[-1, 8, -1],
[-1, -1, -1]])
# another kernal without 1/9
out = cv2.filter2D(src=image, ddepth=-1, kernel=kernel)
out2 = cv2.filter2D(src=image, ddepth=-1, kernel=kernel2)
plt.imshow(image, 'gray')
plt.title('Input Image')
plt.show()
plt.imshow(out, 'gray')
plt.title('High Pass Filter')
plt.show()
plt.imshow(out2, 'gray')
plt.title('High Pass Filter')
plt.show()

95
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

96
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

High Boost Filter


kernel = np.ones([3, 3], dtype = int)
kernel2 = kernel
kernel = kernel / 9
blur_image = cv2.filter2D(src=image, ddepth=-1, kernel=kernel)
blur_image2 = cv2.filter2D(src=image, ddepth=-1, kernel=kernel2)
out = cv2.addWeighted(image, 2, blur_image, -1, 0) #
out2 = cv2.addWeighted(image, 2, blur_image2, -1, 0) #
plt.imshow(image, 'gray')
plt.title('Input Image')
plt.show()
plt.imshow(out, 'gray')
plt.title('High Boost Filter')
plt.show()
plt.imshow(out2, 'gray')
plt.title('High Boost Filter2')
plt.show()

97
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

98
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Adding Random Noise to the Image


# Read Image
img = cv2.imread('kolkata.webp') # Color image

# Convert the image to grayscale


img_gray = img[:,:,1]
plt.imshow(img_gray,'gray')
plt.title('Original Image')
plt.show()
# Genearte noise with same shape as that of the image
noise = np.random.normal(0, 50, img_gray.shape)
# Add the noise to the image
img_noised = img_gray + noise
# Clip the pixel values to be between 0 and 255.
img_noised = np.clip(img_noised, 0, 255).astype(np.uint8)
plt.imshow(img_noised,'gray')
plt.title('Noise Added Image')
plt.show()

99
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

100
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Removing Random Noise using Mean LPF


kernel = np.ones([3, 3], dtype = int)
kernel = kernel / 9
out1 = cv2.filter2D(src=img, ddepth=-1, kernel=kernel)
out2 = cv2.blur(src=img_noised, ksize=(3,3)) # 5x5, 7x7, 11x11
out3 = cv2.blur(src=img_noised, ksize=(5,5))
out4 = cv2.blur(src=img_noised, ksize=(11,11))
out5 = cv2.blur(src=img_noised, ksize=(7,7))
plt.imshow(out2, 'gray')
plt.title('Low Pass Average Filter - In built function')
plt.show()
plt.imshow(out3, 'gray')
plt.title('Low Pass Average Filter - In built function')
plt.show()
plt.imshow(out4, 'gray')
plt.title('Low Pass Average Filter - In built function')
plt.show()
plt.imshow(out5, 'gray')
plt.title('Low Pass Average Filter - In built function')
plt.show()

101
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

102
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Removing Random Noise using Median LPF


out1 = cv2.medianBlur(src=img_noised, ksize = 3) # 5, 7, 9, 11
out2 = cv2.medianBlur(src=img_noised, ksize = 5) # 5, 7, 9, 11
out3 = cv2.medianBlur(src=img_noised, ksize = 7) # 5, 7, 9, 11
out4 = cv2.medianBlur(src=img_noised, ksize = 9) # 5, 7, 9, 11
plt.imshow(out1, 'gray')
plt.title('Low Pass Median Filter')
plt.show()
plt.imshow(out2, 'gray')
plt.title('Low Pass Median Filter')
plt.show()
plt.imshow(out3, 'gray')
plt.title('Low Pass Median Filter')
plt.show()
plt.imshow(out4, 'gray')
plt.title('Low Pass Median Filter')
plt.show()

103
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

104
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

105
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Adding Salt and Pepper Noise


# Read Image
img = cv2.imread('kolkata.webp') # Color image
# Convert the image to grayscale
img_gray = img[:,:,1]
plt.imshow(img_gray,'gray')
plt.title('Original Image')
plt.show()
# Get the image size (number of pixels in the image).
img_size = img_gray.size
# Set the percentage of pixels that should contain noise
noise_percentage = 0.1 # Setting to 10%
# Determine the size of the noise based on the noise precentage
noise_size = int(noise_percentage*img_size)
# Randomly select indices for adding noise.
random_indices = np.random.choice(img_size, noise_size)
# Create a copy of the original image that serves as a template for the noised image.
img_noised = img_gray.copy()
# Create a noise list with random placements of min and max values of the image pixels.
noise = np.random.choice([img_gray.min(), img_gray.max()], noise_size)
# Replace the values of the templated noised image at random indices with the noise, to obtain
the final noised image.
img_noised.flat[random_indices] = noise
plt.imshow(img_noised,'gray')
plt.title('Noise Added Image')
plt.show()

106
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

107
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Removing Salt and Pepper Noise using Median LPF


out1 = cv2.medianBlur(src=img_noised, ksize = 3) # 5, 7, 9, 11
out2 = cv2.medianBlur(src=img_noised, ksize = 5) # 5, 7, 9, 11
out3 = cv2.medianBlur(src=img_noised, ksize = 7) # 5, 7, 9, 11
out4 = cv2.medianBlur(src=img_noised, ksize = 9) # 5, 7, 9, 11
plt.imshow(out1, 'gray')
plt.title('Low Pass Median Filter')
plt.show()
plt.imshow(out2, 'gray')
plt.title('Low Pass Median Filter')
plt.show()
plt.imshow(out3, 'gray')
plt.title('Low Pass Median Filter')
plt.show()
plt.imshow(out4, 'gray')
plt.title('Low Pass Median Filter')
plt.show()

108
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

109
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Removing Salt and Pepper Noise using Mean LPF


kernel = np.ones([3, 3], dtype = int)
kernel = kernel / 9
out1 = cv2.filter2D(src=img, ddepth=-1, kernel=kernel)
out2 = cv2.blur(src=img_noised, ksize=(3,3)) # 5x5, 7x7, 11x11
out3 = cv2.blur(src=img_noised, ksize=(7,7))
out4 = cv2.blur(src=img_noised, ksize=(11,11))
out5 = cv2.blur(src=img_noised, ksize=(15,15))
plt.imshow(img_noised, 'gray')
plt.show()
plt.imshow(out2, 'gray')
plt.title('Low Pass Average Filter - In built function')
plt.show()
plt.imshow(out3, 'gray')
plt.title('Low Pass Average Filter - In built function')
plt.show()
plt.imshow(out4, 'gray')
plt.title('Low Pass Average Filter - In built function')
plt.show()
plt.imshow(out5, 'gray')
plt.title('Low Pass Average Filter - In built function')
plt.show()

110
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

111
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

112
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Adding Gaussian Noise


# Load the image
image = cv2.imread('Rose.jpeg')
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
x,y = gray.shape[:2]
plt.imshow(gray,'gray')
plt.title('Original Image')
plt.show()
# Generate random Gaussian noise
mean = 0
stddev = 180
noise = np.zeros(gray.shape, np.uint8)
cv2.randn(noise, mean, stddev)
# Add noise to image
noisy_img = cv2.add(gray, noise)
# Save noisy image
plt.imshow(noisy_img,'gray')
plt.title('Noise Added Image')
plt.show()

113
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Removing Gaussian Noise using Median LPF


out1 = cv2.medianBlur(src=noisy_img, ksize = 3) # 5, 7, 9, 11
out2 = cv2.medianBlur(src=noisy_img, ksize = 5) # 5, 7, 9, 11
out3 = cv2.medianBlur(src=noisy_img, ksize = 7) # 5, 7, 9, 11

114
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

out4 = cv2.medianBlur(src=noisy_img, ksize = 9) # 5, 7, 9, 11


plt.imshow(out1, 'gray')
plt.title('Low Pass Median Filter')
plt.show()
plt.imshow(out2, 'gray')
plt.title('Low Pass Median Filter')
plt.show()
plt.imshow(out3, 'gray')
plt.title('Low Pass Median Filter')
plt.show()
plt.imshow(out4, 'gray')
plt.title('Low Pass Median Filter')
plt.show()

115
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

116
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Removing Gaussian Noise using Mean LPF


kernel = np.ones([3, 3], dtype = int)
kernel = kernel / 9

out1 = cv2.filter2D(src=img, ddepth=-1, kernel=kernel)


out2 = cv2.blur(src=noisy_img, ksize=(3,3)) # 5x5, 7x7, 11x11
out3 = cv2.blur(src=noisy_img, ksize=(7,7))
out4 = cv2.blur(src=noisy_img, ksize=(11,11))
out5 = cv2.blur(src=noisy_img, ksize=(15,15))
plt.imshow(noisy_img, 'gray')
plt.show()
plt.imshow(out2, 'gray')
plt.title('Low Pass Average Filter - In built function')
plt.show()
plt.imshow(out3, 'gray')
plt.title('Low Pass Average Filter - In built function')
plt.show()
plt.imshow(out4, 'gray')
plt.title('Low Pass Average Filter - In built function')
plt.show()
plt.imshow(out5, 'gray')
plt.title('Low Pass Average Filter - In built function')
plt.show()

117
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

118
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

119
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Adding Impulse Noise


# Load the image
image = cv2.imread('Rose.jpeg')
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
x,y = gray.shape[:2]
plt.imshow(gray,'gray')
plt.title('Original Image')
plt.show()
imp_noise=np.zeros((x,y),dtype=np.uint8)
cv2.randu(imp_noise,0,255)
imp_noise=cv2.threshold(imp_noise,245,255,cv2.THRESH_BINARY)[1]
in_img=cv2.add(gray,imp_noise)
fig=plt.figure(dpi=200)
fig.add_subplot(1,3,1)
plt.imshow(image,cmap='gray')
plt.axis("off")
plt.title("Original")
fig.add_subplot(1,3,2)
plt.imshow(imp_noise,cmap='gray')
plt.axis("off")
plt.title("Impulse Noise")
fig.add_subplot(1,3,3)
plt.imshow(in_img,cmap='gray')
plt.axis("off")
plt.title("Combined")

120
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Text(0.5, 1.0, 'Combined')

Removing Impulse Noise using Median LPF


out1 = cv2.medianBlur(src=in_img, ksize = 3) # 5, 7, 9, 11
out2 = cv2.medianBlur(src=in_img, ksize = 5) # 5, 7, 9, 11
out3 = cv2.medianBlur(src=in_img, ksize = 7) # 5, 7, 9, 11
out4 = cv2.medianBlur(src=in_img, ksize = 9) # 5, 7, 9, 11
plt.imshow(in_img,'gray')
plt.show()
plt.imshow(out1, 'gray')
plt.title('Low Pass Median Filter')
plt.show()
plt.imshow(out2, 'gray')
plt.title('Low Pass Median Filter')
plt.show()
plt.imshow(out3, 'gray')
plt.title('Low Pass Median Filter')
plt.show()

121
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

plt.imshow(out4, 'gray')
plt.title('Low Pass Median Filter')
plt.show()

122
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

123
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Removing Impulse Noise using Mean LPF


#Mean filter
kernel = np.ones([3, 3], dtype = int)
kernel = kernel / 9
out1 = cv2.filter2D(src=img, ddepth=-1, kernel=kernel)
out2 = cv2.blur(src=in_img, ksize=(3,3)) # 5x5, 7x7, 11x11
out3 = cv2.blur(src=in_img, ksize=(5,5))
out4 = cv2.blur(src=in_img, ksize=(11,11))
out5 = cv2.blur(src=in_img, ksize=(7,7))
plt.imshow(in_img, 'gray')
plt.show()
plt.imshow(out2, 'gray')
plt.title('Low Pass Average Filter - In built function')
plt.show()
plt.imshow(out3, 'gray')
plt.title('Low Pass Average Filter - In built function')
plt.show()
plt.imshow(out4, 'gray')
plt.title('Low Pass Average Filter - In built function')
plt.show()
plt.imshow(out5, 'gray')
plt.title('Low Pass Average Filter - In built function')
plt.show()

124
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

125
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

126
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Frequency Domain Analysis


Code and Outputs ( frequency domain filtering)
import cv2
from PIL import Image
from skimage.io import imread, imshow, show
import scipy.fftpack as fp
from scipy import ndimage, misc, signal
#from scipy.stats import signaltonoise
from skimage import data, img_as_float
from skimage.color import rgb2gray
from skimage.transform import rescale
import matplotlib.pylab as pylab
import numpy as np
import numpy.fft
import timeit

Frequency Domain Gaussian Blur (numpy FFT)


pylab.figure(figsize=(20,15))
pylab.gray() # show the filtered result in grayscale
im = np.mean(imread(r"C:\Users\Ashish\Desktop\CVFolder\Bridge.jpg"), axis=2)
gauss_kernel = np.outer(signal.gaussian(im.shape[0], 5), signal.gaussian(im.shape[1], 5))
freq = fp.fft2(im)
assert(freq.shape == gauss_kernel.shape)
freq_kernel = fp.fft2(fp.ifftshift(gauss_kernel))
convolved = freq*freq_kernel # by the convolution theorem, simply multiply in the frequency
domain
im1 = fp.ifft2(convolved).real

pylab.subplot(2,3,1), pylab.imshow(im), pylab.title('Original Image',


127
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

size=20), pylab.axis('off')
pylab.subplot(2,3,2), pylab.imshow(gauss_kernel), pylab.title('Gaussian Kernel', size=20)
pylab.subplot(2,3,3), pylab.imshow(im1) # the imaginary part is an artifact
pylab.title('Output Image', size=20), pylab.axis('off')
pylab.subplot(2,3,4), pylab.imshow( (20*np.log10( 0.1 + fp.fftshift(freq))).astype(int))
pylab.title('Original Image Spectrum', size=20), pylab.axis('off')
pylab.subplot(2,3,5), pylab.imshow( (20*np.log10( 0.1 +
fp.fftshift(freq_kernel))).astype(int))
pylab.title('Gaussian Kernel Spectrum', size=20), pylab.subplot(2,3,6)
pylab.imshow( (20*np.log10( 0.1 + fp.fftshift(convolved))).astype(int))
pylab.title('Output Image Spectrum', size=20), pylab.axis('off')
pylab.subplots_adjust(wspace=0.2, hspace=0)
pylab.show()

Gaussian Kernel In Frequency Domain


im = rgb2gray(imread('Bridge.jpg'))
gauss_kernel = np.outer(signal.gaussian(im.shape[0], 1),
signal.gaussian(im.shape[1], 1))
freq = fp.fft2(im)
freq_kernel = fp.fft2(fp.ifftshift(gauss_kernel))
pylab.imshow( (20*np.log10( 0.01 +
fp.fftshift(freq_kernel))).real.astype(int), cmap='coolwarm') # 0.01 is added to keep the
argument to log function always positive
pylab.colorbar()
pylab.show()

128
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Frequency Domain Gaussian Blur (scipy FFTConvolve())


im = cv2.imread('Bridge.jpg')
im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
im = cv2.cvtColor(im, cv2.COLOR_RGB2GRAY)
print(im.shape)
# (224, 225)
gauss_kernel = np.outer(signal.gaussian(11, 3), signal.gaussian(11, 3)) # 2D Gaussian kernel
of size 11x11 with σ = 3
im_blurred = signal.fftconvolve(im, gauss_kernel, mode='same')
fig, (ax_original, ax_kernel, ax_blurred) = pylab.subplots(1, 3, figsize=(20,8))
ax_original.imshow(im, cmap='gray')
ax_original.set_title('Original', size=20)
ax_original.set_axis_off()
ax_kernel.imshow(gauss_kernel)
ax_kernel.set_title('Gaussian kernel', size=15)
ax_kernel.set_axis_off()
ax_blurred.imshow(im_blurred, cmap='gray')
ax_blurred.set_title('Blurred', size=20)
ax_blurred.set_axis_off()
fig.show()
(574, 860)

129
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

import scipy.fftpack as fftpack


F1 = fftpack.fft2((im).astype(float))
F2 = fftpack.fftshift( F1 )
pylab.figure(figsize=(15,8))
pylab.subplot(1,2,1), pylab.imshow( (20*np.log10( 0.1 + F2)).astype(int),
cmap=pylab.cm.gray)
pylab.title('Original Image Spectrum', size=20)
F1 = fftpack.fft2((im_blurred).astype(float))
F2 = fftpack.fftshift( F1 )
pylab.subplot(1,2,2), pylab.imshow( (20*np.log10( 0.1 + F2)).astype(int),
cmap=pylab.cm.gray)
pylab.title('Blurred Image Spectrum', size=20)
pylab.show()

im = cv2.imread('Bridge.jpg')
im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
im = cv2.cvtColor(im, cv2.COLOR_RGB2GRAY)
print(im.shape)
# (224, 225)
gauss_kernel = np.outer(signal.gaussian(11, 3), signal.gaussian(11, 3)) # 2D Gaussian kernel
of size 11x11 with σ = 3
im_blurred1 = signal.convolve(im, gauss_kernel, mode="same")
im_blurred2 = signal.fftconvolve(im, gauss_kernel, mode='same')
def wrapper_convolve(func):
def wrapped_convolve():
return func(im, gauss_kernel, mode="same")
return wrapped_convolve

130
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

wrapped_convolve = wrapper_convolve(signal.convolve)
wrapped_fftconvolve = wrapper_convolve(signal.fftconvolve)
times1 = timeit.repeat(wrapped_convolve, number=1, repeat=100)
times2 = timeit.repeat(wrapped_fftconvolve, number=1, repeat=100)
pylab.figure(figsize=(15,5))
pylab.gray()
pylab.subplot(131), pylab.imshow(im), pylab.title('Original Image',size=15), pylab.axis('off')
pylab.subplot(132), pylab.imshow(im_blurred1), pylab.title('convolve Output', size=15),
pylab.axis('off')
pylab.subplot(133), pylab.imshow(im_blurred2), pylab.title('ffconvolve Output',
size=15),pylab.axis('off')
(574, 860)

Time Difference between 2 methods


data = [times1, times2]
pylab.figure(figsize=(8,6))
box = pylab.boxplot(data, patch_artist=True) #notch=True,
colors = ['cyan', 'pink']
for patch, color in zip(box['boxes'], colors):
patch.set_facecolor(color)
pylab.xticks(np.arange(3), ('', 'convolve', 'fftconvolve'), size=15)
pylab.yticks(fontsize=15)
pylab.xlabel('scipy.signal convolution methods', size=15)
pylab.ylabel('time taken to run', size = 15)
pylab.show()

131
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

High Pass Filter


im = cv2.imread('coffee.jpg')
im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
im = cv2.cvtColor(im, cv2.COLOR_RGB2GRAY)
pylab.figure(figsize=(10,10)), pylab.imshow(im, cmap=pylab.cm.gray), pylab.axis('off'),
pylab.show()

132
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

(<Figure size 1000x1000 with 1 Axes>,


<matplotlib.image.AxesImage at 0x1313fe560c0>,
(-0.5, 605.5, 605.5, -0.5),
None)
freq = fp.fft2(im)
(w, h) = freq.shape
half_w, half_h = int(w/2), int(h/2)
freq1 = np.copy(freq)
freq2 = fp.fftshift(freq1)
pylab.figure(figsize=(10,10)), pylab.imshow( (20*np.log10( 0.1 + freq2)).astype(int)),
pylab.show()

133
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

(<Figure size 1000x1000 with 1 Axes>,


<matplotlib.image.AxesImage at 0x1313fecc4d0>,
None)
# apply HPF
freq2[half_w-20:half_w+21,half_h-20:half_h+21] = 0 # select all but the first 20x20 (low)
frequencies
pylab.figure(figsize=(10,10))
pylab.imshow( (20*np.log10( 0.1 + freq2)).astype(int))
pylab.show()

134
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

im1 = np.clip(fp.ifft2(fftpack.ifftshift(freq2)).real,0,255) # clip pixel values after IFFT


#print(signaltonoise(im1, axis=None))
# 0.5901647786775175
pylab.imshow(im1, cmap='gray'), pylab.axis('off'), pylab.show()

135
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

(<matplotlib.image.AxesImage at 0x1313ff25640>,
(-0.5, 605.5, 605.5, -0.5),
None)
from scipy import fftpack
im = cv2.imread('Rose.jpeg')
im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
im = cv2.cvtColor(im, cv2.COLOR_RGB2GRAY)
freq = fp.fft2(im)
(w, h) = freq.shape
half_w, half_h = int(w/2), int(h/2)
snrs_hp = []
lbs = list(range(1,25))
pylab.figure(figsize=(12,20))
for l in lbs:
freq1 = np.copy(freq)
freq2 = fftpack.fftshift(freq1)
freq2[half_w-l:half_w+l+1,half_h-l:half_h+l+1] = 0 # select all but the first lxl (low)
frequencies
im1 = np.clip(fp.ifft2(fftpack.ifftshift(freq2)).real,0,255) # clip pixel values after IFFT
#snrs_hp.append(signaltonoise(im1, axis=None))
pylab.subplot(6,4,l), pylab.imshow(im1, cmap='gray'), pylab.axis('off')
pylab.title('F = ' + str(l+1), size=20)
pylab.subplots_adjust(wspace=0.1, hspace=0)
pylab.show()

136
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

137
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Low Pass Filter


import numpy.fft as fp
fig, (axes1, axes2) = pylab.subplots(1, 2, figsize=(20,10))
pylab.gray() # show the result in grayscale
im = rgb2gray(imread('Bridge.jpg'))
freq = fp.fft2(im)
freq_gaussian = ndimage.fourier_gaussian(freq, sigma=4)
im1 = fp.ifft2(freq_gaussian)
axes1.imshow(im), axes1.axis('off'), axes2.imshow(im1.real) # the imaginary part is an artifact
axes2.axis('off')
pylab.show()

pylab.figure(figsize=(10,10))
pylab.imshow( (20*np.log10( 0.1 +
numpy.fft.fftshift(freq_gaussian))).astype(int))
pylab.show()

from scipy import fftpack


im = np.array(Image.open('coffee.jpg').convert('L'))
138
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

# low pass filter


freq = fp.fft2(im)
(w, h) = freq.shape
half_w, half_h = int(w/2), int(h/2)
freq1 = np.copy(freq)
freq2 = fftpack.fftshift(freq1)
freq2_low = np.copy(freq2)
freq2_low[half_w-10:half_w+11,half_h-10:half_h+11] = 0 # block the lowfrequencies
freq2 -= freq2_low # select only the first 20x20 (low) frequencies, block the high frequencies
im1 = fp.ifft2(fftpack.ifftshift(freq2)).real
#print(signaltonoise(im1, axis=None))
# 2.389151856495427
pylab.imshow(im1, cmap='gray'), pylab.axis('off')
pylab.show()
pylab.figure(figsize=(10,10))
pylab.imshow( (20*np.log10( 0.1 + freq2)).astype(int))
pylab.show()

139
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

im = np.array(Image.open('Rose.jpeg').convert('L'))
freq = fp.fft2(im)
(w, h) = freq.shape
half_w, half_h = int(w/2), int(h/2)
snrs_lp = []
ubs = list(range(1,25))
pylab.figure(figsize=(12,20))
for u in ubs:
freq1 = np.copy(freq)
freq2 = fftpack.fftshift(freq1)
freq2_low = np.copy(freq2)
freq2_low[half_w-u:half_w+u+1,half_h-u:half_h+u+1] = 0
freq2 -= freq2_low # select only the first 20x20 (low) frequencies
im1 = fp.ifft2(fftpack.ifftshift(freq2)).real
#snrs_lp.append(signaltonoise(im1, axis=None))
pylab.subplot(6,4,u), pylab.imshow(im1, cmap='gray'), pylab.axis('off')
pylab.title('F = ' + str(u), size=20)
pylab.subplots_adjust(wspace=0.1, hspace=0)
pylab.show()
140
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

141
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Band Pass Filter


from skimage import img_as_float
im = img_as_float(pylab.imread('Mario.jpg'))
pylab.figure(), pylab.imshow(im), pylab.axis('off'), pylab.show()
x = np.linspace(-10, 10, 15)
kernel_1d = np.exp(-0.005*x**2)
kernel_1d /= np.trapz(kernel_1d) # normalize the sum to 1
gauss_kernel1 = kernel_1d[:, np.newaxis] * kernel_1d[np.newaxis, :]
kernel_1d = np.exp(-5*x**2)
kernel_1d /= np.trapz(kernel_1d) # normalize the sum to 1
gauss_kernel2 = kernel_1d[:, np.newaxis] * kernel_1d[np.newaxis, :]
DoGKernel = gauss_kernel1[:, :, np.newaxis] - gauss_kernel2[:, :, np.newaxis]
im = signal.fftconvolve(im, DoGKernel, mode='same')
pylab.figure(), pylab.imshow(np.clip(im, 0, 1)), print(np.max(im)),
pylab.show()

0.7513583129834916

142
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Band Stop Filter


from scipy import fftpack
pylab.figure(figsize=(15,10))
im = np.mean(imread("MM9.jpg"), axis=2) / 255
print(im.shape)
pylab.subplot(2,2,1), pylab.imshow(im, cmap='gray'), pylab.axis('off')
pylab.title('Original Image')
F1 = fftpack.fft2((im).astype(float))
F2 = fftpack.fftshift( F1 )
pylab.subplot(2,2,2), pylab.imshow( (20*np.log10( 0.1 + F2)).astype(int),
cmap=pylab.cm.gray)
pylab.xticks(np.arange(0, im.shape[1], 25))
pylab.yticks(np.arange(0, im.shape[0], 25))
pylab.title('Original Image Spectrum')
# add periodic noise to the image
for n in range(im.shape[1]):
im[:, n] += np.cos(0.1*np.pi*n)
pylab.subplot(2,2,3), pylab.imshow(im, cmap='gray'), pylab.axis('off')
pylab.title('Image after adding Sinusoidal Noise')
F1 = fftpack.fft2((im).astype(float)) # noisy spectrum
F2 = fftpack.fftshift( F1 )
pylab.subplot(2,2,4), pylab.imshow( (20*np.log10( 0.1 + F2)).astype(int),
cmap=pylab.cm.gray)
pylab.xticks(np.arange(0, im.shape[1], 25))
pylab.yticks(np.arange(0, im.shape[0], 25))
pylab.title('Noisy Image Spectrum')

143
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

pylab.tight_layout()
pylab.show()
(496, 512)

F2[170:176,:220] = F2[176:255,120:] = 0 # eliminate the frequencies most likely responsible


for noise (keep some low frequency components)
im1 = fftpack.ifft2(fftpack.ifftshift( F2 )).real
pylab.axis('off'), pylab.imshow(im1, cmap='gray'), pylab.show()

144
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

((0.0, 1.0, 0.0, 1.0), <matplotlib.image.AxesImage at 0x131415280e0>, None)

Conclusions:
An image filter is a technique through which size, colors, shading and other characteristics of
an image are altered. In these experiments we used different filters and observed on different
image.
In this experiment, different noises are added to the image to analyze which filter is most
suitable for a type of a noise. Based on the results obtained, for the random noise or salt and
pepper noise (a type of random noise) median filter appears to be more suitable as the middle
value of the matrix is replaced with the median of all the values in the matrix.
For uniform noise, mean filter is the appropriate one as it gradually reduces the image whilst
keeping the image clear. For impulse noise, median filter is the right choice, even with a
smaller matrix the noise is reduced or suppressed to a great extent. Moreover, as the size of the
matrix increases, the noise reduces furthermore. However, it should be noted that increasing
the size of matrix too much may result in complete blurring of the image.
For the high pass filters, the large frequency changes are allowed and hence helps to capture
the coarse details like the shape. For these filters, normal kernel without dividing it by 9
(normalization) provides a better result as it makes the difference between the intensities more
prominent hence normalization in this case is not preferred. High boost simply extracts the
high pass image and then merges it with the original image to obtain a sharp image. In this
145
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

case kernel with normalization is preferred to control the intensity. If the result is required to
be extremely sharp normalization is avoided, but for better control over intensities,
normalization is used. The main point of interest in the frequency domain spectrum is the
center. If the center is brightly illuminated (high intensity), it indicates the presence of low-
frequency components, suggesting the use of a low-pass filter. This can be visualized as a bell-
shaped curve in the magnitude spectrum. Conversely, for high-frequency components, the
center region appears dark, while the surrounding areas are illuminated based on intensity,
indicating the use of a high-pass filter. This can be visualized as an inverted bell shape in the
magnitude spectrum.

146
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

ET3221: Computer Vision

Name of the student: Ashish Rodi Roll No. 9


Div: A Batch: 1
Date of performance: 13-08-2024

Experiment No. 7

Problem Statement:
Edge Detectors using Python

AIM:
Write a Python Code for the following Edge Detection Operators:
1. Roberts Operator
2. Prewitt’s Operator
3. Sobel Operator
4. Prewitt’s Compass Operators
5. Sobel Compass Operators
6. Canny Edge Detector

Objective(s) of Experiment:
To perform and study edge detection operators for image data.

Introduction:
Edge detection includes a variety of mathematical methods that aim at identifying points in
a digital image at which the image brightness changes sharply or, more formally, has
discontinuities. The points at which image brightness changes sharply are typically
organized into a set of curved line segments termed edges. The same problem of finding
discontinuities in 1D signals is known as step detection and the problem of finding signal
discontinuities over time is known as change detection. Edge detection is a fundamental tool
in image processing, machine vision and computer vision, particularly in the areas of feature
detection and feature extraction.

147
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Flowchart:

148
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Code and Results:


import cv2
import numpy as np
import matplotlib.pyplot as plt

Prewitt Edge Detection


# Prewitt Edge Detector Operator

image = cv2.imread("Temple.jpeg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
img2 = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
x,y = gray.shape[:2]
plt.imshow(gray, 'gray')
plt.title('Input Image')
plt.show()
kernelx = np.array([[1,1,1],[0,0,0],[-1,-1,-1]])
kernely = np.array([[-1,0,1],[-1,0,1],[-1,0,1]])

# img2= cv2.GaussianBlur(gray,(5,5),0)#gaussian Image

img_prewittx = cv2.filter2D(img2, -1, kernelx)#Horizontal


img_prewitty = cv2.filter2D(img2, -1, kernely)#Vertical
img_prewitt = img_prewittx + img_prewitty#Horizontal & Vertical

plt.imshow(img_prewittx, 'gray')
plt.title('Prewitt Horizontal Edge Kernel')
plt.show()

plt.imshow(img_prewitty, 'gray')
plt.title('Prewitt Vertical Edge Kernel')
plt.show()

plt.imshow(img_prewitt, 'gray')
plt.title('Prewitt Both Edges Kernel')
plt.show()

149
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

150
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Prewitt Edge detection (with Gaussian blur)


# Prewitt Edge Detector Operator (with gaussian blur)

image = cv2.imread(r"Temple.jpeg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
x,y = gray.shape[:2]
plt.imshow(gray, 'gray')
plt.title('Input Image')
plt.show()
151
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

kernelx = np.array([[1,1,1],[0,0,0],[-1,-1,-1]])
kernely = np.array([[-1,0,1],[-1,0,1],[-1,0,1]])

img2= cv2.GaussianBlur(gray,(5,5),0)#gaussian Image

img_prewittx = cv2.filter2D(img2, -1, kernelx)#Horizontal


img_prewitty = cv2.filter2D(img2, -1, kernely)#Vertical
img_prewitt = img_prewittx + img_prewitty#Horizontal & Vertical

plt.imshow(img_prewittx, 'gray')
plt.title('Prewitt Horizontal Edge Kernel')
plt.show()

plt.imshow(img_prewitty, 'gray')
plt.title('Prewitt Vertical Edge Kernel')
plt.show()

plt.imshow(img_prewitt, 'gray')
plt.title('Prewitt Both Edges Kernel')
plt.show()

152
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

153
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Sobel Edge detection


# Sobel Edge Detector Operator

image = cv2.imread(r"Temple.jpeg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
x,y = gray.shape[:2]
plt.imshow(gray, 'gray')
plt.title('Input Image')
plt.show()

img2= cv2.GaussianBlur(gray,(5,5),0)#gaussian Image


img_sobelx = cv2.Sobel(img2,cv2.CV_8U,0,1,ksize=3)
img_sobely = cv2.Sobel(img2,cv2.CV_8U,1,0,ksize=3)
img_sobel = img_sobelx + img_sobely

plt.imshow(img_sobelx, 'gray')
plt.title('Sobel Horizontal Edge Kernel')
plt.show()

plt.imshow(img_sobely, 'gray')
plt.title('Sobel Vertical Edge Kernel')
plt.show()

plt.imshow(img_sobel, 'gray')
plt.title('Sobel Both Edges Kernel')
plt.show()

154
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

155
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Sobel edge detection with weight operator


# Sobel Edge Detector with higher weight Operator

image = cv2.imread(r"Temple.jpeg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
x,y = gray.shape[:2]
plt.imshow(gray, 'gray')
plt.title('Input Image')
plt.show()
156
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

kernelx = np.array([[1,8,1],[0,0,0],[-1,-8,-1]])
kernely = np.array([[-1,0,1],[-8,0,8],[-1,0,1]])

img2= cv2.GaussianBlur(gray,(5,5),0)#gaussian Image

img_sobelx = cv2.filter2D(img2, -1, kernelx)#Horizontal


img_sobely = cv2.filter2D(img2, -1, kernely)#Vertical
img_sobel = img_sobelx + img_sobely#Horizontal & Vertical

plt.imshow(img_sobelx, 'gray')
plt.title('Sobel Horizontal Edge Kernel')
plt.show()

plt.imshow(img_sobely, 'gray')
plt.title('Sobel Vertical Edge Kernel')
plt.show()

plt.imshow(img_sobel, 'gray')
plt.title('Sobel Both Edges Kernel')
plt.show()

157
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

158
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Robert Edge Detector


# robert Operator

image = cv2.imread(r"Temple.jpeg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
x,y = gray.shape[:2]
plt.imshow(gray, 'gray')
plt.title('Input Image')
plt.show()

kernel_Roberts_x = np.array([[1, 0],[0, -1]])


kernel_Roberts_y = np.array([[0, -1],[1, 0]])

img2= cv2.GaussianBlur(gray,(5,5),0)#gaussian Image


x = cv2.filter2D(img2, cv2.CV_16S, kernel_Roberts_x)
y = cv2.filter2D(img2, cv2.CV_16S, kernel_Roberts_y)
absX = cv2.convertScaleAbs(x)
absY = cv2.convertScaleAbs(y)
roberts = cv2.addWeighted(absX, 0.5, absY, 0.5, 0)

plt.imshow(roberts, 'gray')
plt.title('roberts Kernel')
plt.show()

159
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Prewitt Compass Matrix


# prewittcompass():

image = cv2.imread(r"Temple.jpeg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
x,y = gray.shape[:2]
plt.imshow(gray, 'gray')
plt.title('Input Image')
plt.show()
160
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

#Masks
prewitt1 = np.array ([[-1,-1,-1],[1,-2,1],[1,1,1]])
prewitt2 = np.array ([[-1,-1,1],[-1,-2,1],[1,1,1]])
prewitt3 = np.array ([[-1,1,1],[-1,-2,1],[-1,1,1]])
prewitt4 = np.array ([[1,1,1],[-1,-2,1],[-1,-1,1]])
prewitt5 = np.array ([[1,1,1],[1,-2,1],[-1,-1,-1]])
prewitt6 = np.array ([[1,1,1],[1,-2,-1],[1,-1,-1]])
prewitt7 = np.array ([[1,1,-1],[1,-2,-1],[1,1,-1]])
prewitt8 = np.array ([[1,-1,-1],[1,-2,-1],[1,1,1]])
img2= cv2.GaussianBlur(gray,(5,5),0)#gaussian Image
img_prewitt1 = cv2.filter2D(img2, -1, prewitt1)
img_prewitt2 = cv2.filter2D(img2, -1, prewitt2)
img_prewitt3 = cv2.filter2D(img2, -1, prewitt3)
img_prewitt4 = cv2.filter2D(img2, -1, prewitt4)
img_prewitt5 = cv2.filter2D(img2, -1, prewitt5)
img_prewitt6 = cv2.filter2D(img2, -1, prewitt6)
img_prewitt7 = cv2.filter2D(img2, -1, prewitt7)
img_prewitt8 = cv2.filter2D(img2, -1, prewitt8)

fig=plt.figure(dpi=200)

fig.add_subplot(3,3,1)
plt.imshow(img_prewitt1, 'gray')
plt.title('Prewitt 1')
plt.axis('off')

fig.add_subplot(3,3,2)
plt.imshow(img_prewitt2, 'gray')
plt.title('Prewitt 2')
plt.axis('off')

fig.add_subplot(3,3,3)
plt.imshow(img_prewitt3, 'gray')
plt.title('Prewitt 3')
plt.axis('off')

fig.add_subplot(3,3,4)
plt.imshow(img_prewitt4, 'gray')
plt.title('Prewitt 4')
plt.axis('off')

fig.add_subplot(3,3,5)
plt.imshow(img_prewitt5, 'gray')
plt.title('Prewitt 5')
plt.axis('off')

fig.add_subplot(3,3,6)
plt.imshow(img_prewitt6, 'gray')
plt.title('Prewitt 6')

161
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

plt.axis('off')

fig.add_subplot(3,3,7)
plt.imshow(img_prewitt7, 'gray')
plt.title('Prewitt 7')
plt.axis('off')

fig.add_subplot(3,3,8)
plt.imshow(img_prewitt8, 'gray')
plt.title('Prewitt 8')
plt.axis('off')

plt.show()
#Adding all the elements together to form an image
prewitt_compass = img_prewitt1 + img_prewitt2 + img_prewitt3 + img_prewitt4 +
img_prewitt5 + img_prewitt6 + img_prewitt7 + img_prewitt8

plt.imshow(prewitt_compass, 'gray')
plt.title('prewitt_compass Kernel Filter')
plt.show()

162
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Sobel Compass Matrix


import cv2
import numpy as np
import matplotlib.pyplot as plt

image = cv2.imread(r"Temple.jpeg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
163
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

x,y = gray.shape[:2]
plt.imshow(gray, 'gray')
plt.title('Input Image')
plt.show()

sobel1 = np.array([[-1,-2,-1],[0,0,0],[1,2,1]])
sobel1 = np.array([[-1,-2,-1],[0,0,0],[1,2,1]])
sobel2 = np.array([[-2,-1,0],[-1,0,1],[0,1,2]])
sobel3 = np.array([[-1,0,1],[-2,0,2],[-1,0,1]])
sobel4 = np.array([[0,1,2],[-1,0,1],[-2,-1,0]])
sobel5 = np.array([[1,2,1],[0,0,0],[-1,-2,-1]])
sobel6 = np.array([[2,1,0],[1,0,-1],[0,-1,-2]])
sobel7 = np.array([[1,0,-1],[2,0,-2],[1,0,-1]])
sobel8 = np.array([[0,-1,-2],[1,0,-1],[2,1,0]])

img2= cv2.GaussianBlur(gray,(5,5),0)#gaussian Image

img_sobel1 = cv2.filter2D(img2, -1, sobel1)


img_sobel2 = cv2.filter2D(img2, -1, sobel2)
img_sobel3 = cv2.filter2D(img2, -1, sobel3)
img_sobel4 = cv2.filter2D(img2, -1, sobel4)
img_sobel5 = cv2.filter2D(img2, -1, sobel5)
img_sobel6 = cv2.filter2D(img2, -1, sobel6)
img_sobel7 = cv2.filter2D(img2, -1, sobel7)
img_sobel8 = cv2.filter2D(img2, -1, sobel8)

#img_sobel1 = cv2.Sobel(img2,cv2.CV_8U,1,0,ksize=3)
#img_sobel2 = cv2.Sobel(img2,cv2.CV_8U,0,1,ksize=3)
#img_sobel3 = cv2.Sobel(img2,cv2.CV_8U,1,0,ksize=3)
#img_sobel4 = cv2.Sobel(img2,cv2.CV_8U,0,1,ksize=3)
#img_sobel5 = cv2.Sobel(img2,cv2.CV_8U,1,0,ksize=3)
#img_sobel6 = cv2.Sobel(img2,cv2.CV_8U,0,1,ksize=3)
#img_sobel7 = cv2.Sobel(img2,cv2.CV_8U,1,0,ksize=3)
#img_sobel8 = cv2.Sobel(img2,cv2.CV_8U,0,1,ksize=3)

#Add all the elements to form an image


sobel_compass = img_sobel1 + img_sobel2 + img_sobel3 + img_sobel4 + img_sobel5 +
img_sobel6 + img_sobel7 + img_sobel8

plt.imshow(sobel_compass, 'gray')
plt.title('sobel_compass Kernel Filter')
plt.show()

164
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Gaussian Blur
# reads an input image
from skimage import feature, exposure
img = cv2.imread(r"Temple.jpeg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) #BGR 2 RGB for plotting
using matplotlib

print("input image dimensions", gray.shape)

width = 100
165
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

height = 100
dim = (width, height)

# resize image
gray = cv2.resize(gray, dim)

plt.imshow(gray, cmap = 'gray')


plt.xlabel('X axis')
plt.ylabel('Y axis')
plt.title('Input Image')
plt.show()

blurred = cv2.GaussianBlur(gray, (5, 5), 0)

plt.imshow(blurred, cmap = 'gray')


plt.xlabel('X axis')
plt.ylabel('Y axis')
plt.title('Blurred Image')
plt.show()
input image dimensions (359, 640)

166
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Canny with apperture size


image = cv2.imread(r"Temple.jpeg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
x,y = gray.shape[:2]
plt.imshow(gray, 'gray')
plt.title('Input Image')
plt.show()

# Setting All parameters


t_lower = 100 # Lower Threshold
t_upper = 200 # Upper threshold
aperture_size = 5 # Aperture size
aperture_size1 = 3 # Aperture size

# Applying the Canny Edge filter


# with Custom Aperture Size
edge = cv2.Canny(gray, t_lower, t_upper, apertureSize=aperture_size)
edge2 = cv2.Canny(gray, t_lower, t_upper, apertureSize=aperture_size1)

# Convert the image data to a floating-point format


#edge = edge.astype(np.int8)

167
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

plt.imshow(edge, 'gray')
plt.title('aperture = 5 Image')
plt.show()
plt.imshow(edge2, 'gray')
plt.title('aperture = 3Image')
plt.show()

168
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Canny with L2 Gradient


image = cv2.imread(r"Temple.jpeg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
x,y = gray.shape[:2]
plt.imshow(gray, 'gray')
plt.title('Input Image')
plt.show()

t_lower = 100 # Lower Threshold


t_upper = 200 # Upper threshold
aperture_size = 5 # Aperture size
L2Gradient = True # Boolean

t_lower1 = 20 # Lower Threshold


t_upper1 = 250 # Upper threshold

# Applying the Canny Edge filter with L2Gradient = True


edge = cv2.Canny(gray, t_lower, t_upper, L2gradient = L2Gradient)
edge2 = cv2.Canny(gray, t_lower1, t_upper1, L2gradient = L2Gradient )

plt.imshow(edge, 'gray')
plt.title('Input Image')
plt.show()

plt.imshow(edge2, 'gray')
plt.title('Input Image')
plt.show()

169
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

170
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Canny with apperture size and L2 gradient

image = cv2.imread(r"Temple.jpeg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
x,y = gray.shape[:2]
plt.imshow(gray, 'gray')
plt.title('Input Image')
plt.show()

# Defining all the parameters


t_lower = 100 # Lower Threshold
t_upper = 200 # Upper threshold
aperture_size = 5 # Aperture size
aperture_size1 = 3
L2Gradient = True # Boolean

# Applying the Canny Edge filter


# with Aperture Size and L2Gradient
edge = cv2.Canny(gray, t_lower, t_upper,
apertureSize = aperture_size,
L2gradient = L2Gradient )

edge2 = cv2.Canny(gray, t_lower, t_upper,


apertureSize = aperture_size1,
L2gradient = L2Gradient )

plt.imshow(edge, 'gray')
plt.title('aperture = 5 Image')
171
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

plt.show()

plt.imshow(edge2, 'gray')
plt.title('aperture =3 Image')
plt.show()

172
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Threshold Analysis
# compute a "wide", "mid-range", and "tight" threshold for the edges
# using the Canny edge detector
# Syntax: cv2.Canny(image, T_lower, T_upper, aperture_size, L2Gradient)
wide = cv2.Canny(blurred, 10, 200)
mid = cv2.Canny(blurred, 30, 150)
tight = cv2.Canny(blurred, 240, 250)

(fig, axs) = plt.subplots(nrows=1, ncols=3, figsize=(8, 4))


# plot each of the images
axs[0].imshow(wide, cmap="gray")
axs[1].imshow(mid, cmap="gray")
axs[2].imshow(tight, cmap="gray")

# set the titles of each axes


axs[0].set_title("Wide")
axs[1].set_title("Mid-Range")
axs[2].set_title("Tight Threshold")

# show the plots


plt.tight_layout()
plt.show()

173
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Conclusions:

In this experiment, we have performed and studied the edge detection of an image using
different operators, we can conclude that edge detectors are useful for finding boundaries of
object within images. It works by detecting sharp discontinuities in brightness. Edge detection
can be used for Image segmentation and data extraction. Sobel inherently uses noise reduction
with the help of gaussian blur, while for Prewitt, it needs to be used explicitly. When used
with weighted edges, the edges get more emphasized thus providing a better output. L2
gradient is used for magnitude of the edge. L2 provides a better output compared to the normal
L1 gradient. Aperture decides the amounts of details or edges to be highlighted in the image.
For a high aperture, more edges are observed. Threshold analysis can also be performed to
highlight the edges of a range of intensity.

174
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

ET3221: Computer Vision

Name of the student: Ashish Rodi Roll No. 9


Div: A Batch: 1
Date of performance: 27-08-2024

Experiment No. 8

Problem Statement:
Blob and corner Detectors using Python

AIM:
Write a Python Code for the following Blob and corner Detection Operators:
1.Laplacian of Gaussian(LoG)
2.Difference of Gaussian(DoG)
3.Determinant of Hessian(DoH)
4.Harris corner detectors

Objective(s) of Experiment:
To perform and study blob and corner detection operators for image data.

Introduction:
Blob and corner detection are critical techniques in image processing, used to identify and
extract important features from images. Blob detection targets regions within an image that
differ in intensity from their surroundings, making it useful for tasks like object recognition
and segmentation. Common methods include the Laplacian of Gaussian (LoG), Difference
of Gaussian (DoG), and Determinant of Hessian (DoH).Corner detection, meanwhile,
focuses on identifying points where edges meet, such as the corners of objects, which are
key for image matching and motion tracking. Techniques like the Harris Corner Detector are
widely employed for this purpose. Together, these methods provide a robust foundation for
various computer vision applications, enabling machines to interpret and analyze visual data
effectively.

175
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Flowchart:

176
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Code and Results:


import cv2
import numpy as np
from skimage.io import imshow, imread
from skimage.color import rgb2gray
import matplotlib.pyplot as plt
sample = imread('Lotus3.jpg')
sample = cv2.resize(sample, (400,400))
#sample_g = rgb2gray(sample)
sample_g = cv2.cvtColor(sample, cv2.COLOR_BGR2GRAY) #BGR 2 RGB for
plotting using matplotlib
fig, ax = plt.subplots(1,2,figsize=(10,5))
ax[0].imshow(sample)
ax[1].imshow(sample_g,cmap='gray')
ax[0].set_title('Colored Image',fontsize=15)
ax[1].set_title('Grayscale Image',fontsize=15)
plt.show()

sample_g.shape
(400, 400)

Binarization Of Image
fig, ax = plt.subplots(1,3,figsize=(15,5))
sample_b = sample_g > 25
ax[0].set_title('Grayscale Image',fontsize=20)
ax[0].imshow(sample_g,cmap='gray')
ax[1].plot(sample_g[300])
ax[1].set_ylabel('Pixel Value')
ax[1].set_xlabel('Width of Picture')
ax[1].set_title('Plot of 1 Line',fontsize=15)
ax[2].set_title('Binarized Image',fontsize=15)
ax[2].imshow(sample_b,cmap='gray')

177
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

<matplotlib.image.AxesImage at 0x2c5d691c140>

Laplacian of Gaussian (LoG)


from skimage.feature import blob_dog, blob_log, blob_doh
fig, ax = plt.subplots(1,2,figsize=(10,5))
ax[0].set_title('Binarized Image',fontsize=15)
ax[0].imshow(sample_g,cmap='gray')
blobs = blob_log(sample_b, max_sigma=30, threshold=0.01)
ax[1].imshow(sample_b, cmap='gray')
for blob in blobs:
y, x, area = blob
ax[1].add_patch(plt.Circle((x, y), area*np.sqrt(2), color='r', fill=False))
ax[1].set_title('Using LOG',fontsize=15)
plt.tight_layout()
plt.show()

Difference Of Gaussian (DoG)


from skimage.feature import blob_dog, blob_log, blob_doh
fig, ax = plt.subplots(1,2,figsize=(10,5))
ax[0].set_title('Binarized Image',fontsize=15)
178
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

ax[0].imshow(sample_g,cmap='gray')

blobs = blob_dog(sample_b, max_sigma=30, threshold=0.01)

ax[1].imshow(sample_b, cmap='gray')
for blob in blobs:
y, x, area = blob
ax[1].add_patch(plt.Circle((x, y), area*np.sqrt(2), color='r',
fill=False))
ax[1].set_title('Using DOG',fontsize=15)
plt.tight_layout()
plt.show()

Determinant of Hessian (DoH)


blobs = blob_doh(sample_b, max_sigma=30, threshold=0.01)
from skimage.feature import blob_dog, blob_log, blob_doh
fig, ax = plt.subplots(1,2,figsize=(10,5))
ax[0].set_title('Binarized Image',fontsize=15)
ax[0].imshow(sample_g,cmap='gray')
blobs = blob_doh(sample_b, max_sigma=30, threshold=0.01)
ax[1].imshow(sample_b, cmap='gray')
for blob in blobs:
y, x, area = blob
ax[1].add_patch(plt.Circle((x, y), area*np.sqrt(2), color='r',
fill=False))
ax[1].set_title('Using DOG',fontsize=15)
plt.tight_layout()
plt.show()

179
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

DoH

import cv2
import numpy as np
from skimage.io import imshow, imread
from skimage.color import rgb2gray
import matplotlib.pyplot as plt

# Read and resize the image


sample = imread('Sun_flower.jpg')
sample = cv2.resize(sample, (400, 400))

# Convert the image to grayscale


sample_g = cv2.cvtColor(sample, cv2.COLOR_BGR2GRAY)

# Display the colored image and grayscale image


fig, ax = plt.subplots(1, 2, figsize=(10, 5))
ax[0].imshow(sample)
ax[0].set_title('Colored Image', fontsize=15)
ax[1].imshow(sample_g, cmap='gray')
ax[1].set_title('Grayscale Image', fontsize=15)
plt.show()

180
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

sample.shape
print(np.isnan(sample).any(), np.isinf(sample).any())
print(np.iscomplex(sample).any())
False False
False
sample_b = sample_g > 25
Binarization of Image

# Plot the original image and binary image


fig, ax = plt.subplots(1, 3, figsize=(15, 5))
ax[0].imshow(sample_g, cmap='gray')
ax[0].set_title('Grayscale Image', fontsize=20)
ax[1].plot(sample_g[200]) # Plotting a row from the grayscale image
ax[1].set_ylabel('Pixel Value')
ax[1].set_xlabel('Width of Picture')
ax[1].set_title('Plot of 1 Line', fontsize=15)
ax[2].imshow(sample_b, cmap='gray')
ax[2].set_title('Binarized Image', fontsize=15)
plt.show()

181
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

# Convert images to uint8 if necessary


sample_b = sample_g > 25

Laplacian of Gaussian (LoG)


from skimage.feature import blob_dog, blob_log, blob_doh
fig, ax = plt.subplots(1,2,figsize=(10,5))
ax[0].set_title('Binarized Image',fontsize=15)
ax[0].imshow(sample_g,cmap='gray')
blobs = blob_log(sample_b, max_sigma=30, threshold=0.01)
ax[1].imshow(sample_b, cmap='gray')
for blob in blobs:
y, x, area = blob
ax[1].add_patch(plt.Circle((x, y), area*np.sqrt(2), color='r', fill=False))
ax[1].set_title('Using LOG',fontsize=15)
plt.tight_layout()
plt.show()

Difference Of Gaussian (DoG)


from skimage.feature import blob_dog, blob_log, blob_doh
fig, ax = plt.subplots(1,2,figsize=(10,5))
ax[0].set_title('Binarized Image',fontsize=15)
ax[0].imshow(sample_g,cmap='gray')
blobs = blob_dog(sample_b, max_sigma=30, threshold=0.01)
ax[1].imshow(sample_b, cmap='gray')
for blob in blobs:
y, x, area = blob
ax[1].add_patch(plt.Circle((x, y), area*np.sqrt(2), color='r', fill=False))
ax[1].set_title('Using DOG',fontsize=15)
plt.tight_layout()
plt.show()

182
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Determinant of Hessian (DoH)


blobs = blob_doh(sample_b, max_sigma=30, threshold=0.01)
from skimage.feature import blob_dog, blob_log, blob_doh
fig, ax = plt.subplots(1,2,figsize=(10,5))
ax[0].set_title('Binarized Image',fontsize=15)
ax[0].imshow(sample_g,cmap='gray')
blobs = blob_doh(sample_b, max_sigma=30, threshold=0.01)
ax[1].imshow(sample_b, cmap='gray')
for blob in blobs:
y, x, area = blob
ax[1].add_patch(plt.Circle((x, y), area*np.sqrt(2), color='r', fill=False))
ax[1].set_title('Using DOG',fontsize=15)
plt.tight_layout()
plt.show()
DOH
DoH

183
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Histogram of oriented Gradients (HoG)


import cv2
import numpy as np;
ori = cv2.imread('sunflower.jpg')
im = cv2.imread("sunflower.jpg", cv2.IMREAD_GRAYSCALE)
detector = cv2.SimpleBlobDetector_create()
keypoints = detector.detect(im)
im_with_keypoints = cv2.drawKeypoints(im, keypoints, np.array([]), (0,0,255),
cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
cv2.imshow('Original',ori)
cv2.imshow('BLOB',im_with_keypoints)
if cv2.waitKey(0) & 0xff == 27:
cv2.destroyAllWindows()

from skimage.feature import hog


from skimage import exposure
import cv2
import numpy as np

# Load the image and convert it to RGB


img = cv2.imread('sunflower.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

# Compute HOG features and the visualization image


hog_features, hog_image = hog(img, orientations=8, pixels_per_cell=(16, 16),
cells_per_block=(1, 1), visualize=True,
channel_axis=-1) # Specify the channel axis

# Rescale the HOG image to the range [0, 255] for display
hog_image_rescaled = exposure.rescale_intensity(hog_image, in_range=(0, 10))
184
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

# Convert the HOG image to uint8


hog_image_rescaled = (hog_image_rescaled * 255).astype("uint8")

# Display the original and HOG images


cv2.imshow('Original', cv2.cvtColor(img, cv2.COLOR_RGB2BGR)) # Convert back to BGR
for display
cv2.imshow('HoG', hog_image_rescaled)
cv2.waitKey(0)
cv2.destroyAllWindows()

Harris Corner Detection


import cv2
import matplotlib.pyplot as plt
import os
import sys
import numpy as np
import scipy.ndimage.filters as filters

def h_fun(img, kernel_size=3):


"""Calculates Harris operator array for every pixel"""
Ix = cv2.Sobel(img, cv2.CV_32F, 1, 0, ksize=kernel_size)
Iy = cv2.Sobel(img, cv2.CV_32F, 0, 1, ksize=kernel_size)
Ix_square = Ix * Ix
Iy_square = Iy * Iy
Ixy = Ix * Iy
Ix_square_blur = cv2.GaussianBlur(Ix_square, (kernel_size, kernel_size), 0)
Iy_square_blur = cv2.GaussianBlur(Iy_square, (kernel_size, kernel_size), 0)
Ixy_blur = cv2.GaussianBlur(Ixy, (kernel_size, kernel_size), 0)
det = Ix_square_blur * Iy_square_blur - Ixy_blur * Ixy_blur
trace = Ix_square_blur + Iy_square_blur
k = 0.05
h = det - k*trace*trace
h = h / np.max(h)
185
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

return h

def find_max(image, size, threshold):


"""Finds maximum of array"""
data_max = filters.maximum_filter(image, size)
maxima = (image == data_max)
diff = (image > threshold)
maxima[diff == 0] = 0
return np.nonzero(maxima)

def draw_points(img, points):


plt.figure()
plt.imshow(img)
plt.plot(points[1], points[0], '*', color='r')
# plt.show()
IMG_NAME1 = 'Temple.jpeg'
img1_color = cv2.imread(IMG_NAME1)
img1_color = cv2.cvtColor(img1_color, cv2.COLOR_BGR2RGB)
img1 = cv2.imread(IMG_NAME1, cv2.IMREAD_GRAYSCALE)
KERNEL_SIZE = 3
THRESHOLD = 0.1
# Find maximums
h1 = h_fun(img1, KERNEL_SIZE)
m1 = find_max(h1, KERNEL_SIZE, THRESHOLD)
draw_points(img1_color, m1)
plt.show()
C:\Users\Ashish\AppData\Local\Temp\ipykernel_24548\2343967864.py:27:
DeprecationWarning: Please import `maximum_filter` from the `scipy.ndimage` namespace;
the `scipy.ndimage.filters` namespace is deprecated and will be removed in SciPy 2.0.0.
data_max = filters.maximum_filter(image, size)

186
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

KERNEL_SIZE = 3
THRESHOLD = 0.2
# Find maximums
h1 = h_fun(img1, KERNEL_SIZE)
m1 = find_max(h1, KERNEL_SIZE, THRESHOLD)
draw_points(img1_color, m1)
plt.show()
C:\Users\Ashish\AppData\Local\Temp\ipykernel_24548\2343967864.py:27:
DeprecationWarning: Please import `maximum_filter` from the `scipy.ndimage` namespace;
the `scipy.ndimage.filters` namespace is deprecated and will be removed in SciPy 2.0.0.
data_max = filters.maximum_filter(image, size)

187
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

KERNEL_SIZE = 3
THRESHOLD = 0.3
# Find maximums
h1 = h_fun(img1, KERNEL_SIZE)
m1 = find_max(h1, KERNEL_SIZE, THRESHOLD)
draw_points(img1_color, m1)
plt.show()
C:\Users\Ashish\AppData\Local\Temp\ipykernel_24548\2343967864.py:27:
DeprecationWarning: Please import `maximum_filter` from the `scipy.ndimage` namespace;
the `scipy.ndimage.filters` namespace is deprecated and will be removed in SciPy 2.0.0.
data_max = filters.maximum_filter(image, size)

KERNEL_SIZE = 3
THRESHOLD = 0.4
# Find maximums
h1 = h_fun(img1, KERNEL_SIZE)
m1 = find_max(h1, KERNEL_SIZE, THRESHOLD)
draw_points(img1_color, m1)
plt.show()
C:\Users\Ashish\AppData\Local\Temp\ipykernel_24548\2343967864.py:27:
DeprecationWarning: Please import `maximum_filter` from the `scipy.ndimage` namespace;
the `scipy.ndimage.filters` namespace is deprecated and will be removed in SciPy 2.0.0.
data_max = filters.maximum_filter(image, size)

188
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

KERNEL_SIZE = 5
THRESHOLD = 0.1
# Find maximums
h1 = h_fun(img1, KERNEL_SIZE)
m1 = find_max(h1, KERNEL_SIZE, THRESHOLD)
draw_points(img1_color, m1)
plt.show()
C:\Users\Ashish\AppData\Local\Temp\ipykernel_24548\2343967864.py:27:
DeprecationWarning: Please import `maximum_filter` from the `scipy.ndimage` namespace;
the `scipy.ndimage.filters` namespace is deprecated and will be removed in SciPy 2.0.0.
data_max = filters.maximum_filter(image, size)

189
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

KERNEL_SIZE = 5
THRESHOLD = 0.2
# Find maximums
h1 = h_fun(img1, KERNEL_SIZE)
m1 = find_max(h1, KERNEL_SIZE, THRESHOLD)
draw_points(img1_color, m1)
plt.show()
C:\Users\Ashish\AppData\Local\Temp\ipykernel_24548\2343967864.py:27:
DeprecationWarning: Please import `maximum_filter` from the `scipy.ndimage` namespace;
the `scipy.ndimage.filters` namespace is deprecated and will be removed in SciPy 2.0.0.
data_max = filters.maximum_filter(image, size)

KERNEL_SIZE = 5
THRESHOLD = 0.3
# Find maximums
h1 = h_fun(img1, KERNEL_SIZE)
m1 = find_max(h1, KERNEL_SIZE, THRESHOLD)
draw_points(img1_color, m1)
plt.show()
C:\Users\Ashish\AppData\Local\Temp\ipykernel_24548\2343967864.py:27:
DeprecationWarning: Please import `maximum_filter` from the `scipy.ndimage` namespace;
the `scipy.ndimage.filters` namespace is deprecated and will be removed in SciPy 2.0.0.
data_max = filters.maximum_filter(image, size)

190
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

KERNEL_SIZE = 7
THRESHOLD = 0.1
# Find maximums
h1 = h_fun(img1, KERNEL_SIZE)
m1 = find_max(h1, KERNEL_SIZE, THRESHOLD)
draw_points(img1_color, m1)
plt.show()
C:\Users\Ashish\AppData\Local\Temp\ipykernel_24548\2343967864.py:27:
DeprecationWarning: Please import `maximum_filter` from the `scipy.ndimage` namespace;
the `scipy.ndimage.filters` namespace is deprecated and will be removed in SciPy 2.0.0.
data_max = filters.maximum_filter(image, size)

191
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

KERNEL_SIZE = 7
THRESHOLD = 0.2
# Find maximums
h1 = h_fun(img1, KERNEL_SIZE)
m1 = find_max(h1, KERNEL_SIZE, THRESHOLD)
draw_points(img1_color, m1)
plt.show()
C:\Users\Ashish\AppData\Local\Temp\ipykernel_24548\2343967864.py:27:
DeprecationWarning: Please import `maximum_filter` from the `scipy.ndimage` namespace;
the `scipy.ndimage.filters` namespace is deprecated and will be removed in SciPy 2.0.0.
data_max = filters.maximum_filter(image, size)

KERNEL_SIZE = 7
THRESHOLD = 0.3
# Find maximums
h1 = h_fun(img1, KERNEL_SIZE)
m1 = find_max(h1, KERNEL_SIZE, THRESHOLD)
draw_points(img1_color, m1)
plt.show()
C:\Users\Ashish\AppData\Local\Temp\ipykernel_24548\2343967864.py:27:
DeprecationWarning: Please import `maximum_filter` from the `scipy.ndimage` namespace;
the `scipy.ndimage.filters` namespace is deprecated and will be removed in SciPy 2.0.0.
data_max = filters.maximum_filter(image, size)

192
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Conclusions:
In this experiment, we explored several key techniques for blob and corner detection have been
implemented. The Laplacian of Gaussian (LOG) was used to detect blobs by highlighting
regions of rapid intensity change, combining the smoothing (Gaussian) and edge detection
(Laplacian) depicting the edges and round regions within the image. Difference of Gaussian
(DoG), a faster approximation of LoG, successfully identified blobs by subtracting Gaussian-
blurred images, emphasizing differences at varying scales. The Determinant of Hessian (DoH)
method, although typically more computationally intensive, detected blob-like structures by
analyzing intensity changes across multiple directions.
Histogram of oriented gradients (HoG) technique was used object detection by capturing the
gradient
direction and magnitude within localized regions of an image. It represents the shape and
structure of objects by summarizing the distribution of gradient orientations, making it effective
for image classification.
For corner detection, the Harris Detection method was applied which highlights the significant
intensity variations, which is important for feature matching and object recognition. There are
two important parameters, kernel size and threshold. As the kernel size is increased, the detector
becomes more sensitive to the larger features but may smooth out smaller and fine details. On
the other hand, as the threshold is increased, the number of corners detected is reduced, because
it makes the detector more selective and less sensitive to the weak corners.

193
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

ET3221: Computer Vision

Name of the student: Ashish Rodi Roll No. 9

Div: A Batch: 1
Date of performance: 03-09-2024

Experiment No. 9
Problem Statement:
SIFT - based Cats n Dogs classifier

AIM:
1.To develop a SIFT-based feature extraction process for classifying images of cats and dogs.

2.To use K-Nearest Neighbours (KNN) for classifying the extracted SIFT features.

3.To implement and evaluate clustering techniques like K-means for organizing SIFT features.

Objective(s) of Experiment:
To classify images of cats and dogs by extracting SIFT features, clustering them, and using
KNN classification.

Introduction:
SIFT (Scale-Invariant Feature Transform) is a popular algorithm in computer vision used for
detecting and describing local features in images. It is particularly useful because it is scale-
invariant, meaning it can detect the same key points even when an image is resized, rotated, or
transformed. SIFT works by identifying key points (features) in an image that remain consistent
under different viewing conditions. These key points are then described using a feature
descriptor, which provides a rich set of information that can be used for matching similar
objects or classifying images. Once SIFT features are extracted, they can be used for tasks like
object recognition and image classification. In classification tasks, algorithms like K-Nearest
Neighbours (KNN) are commonly used. KNN is a simple, non-parametric algorithm that
classifies new data points based on the majority class of their nearest neighbours in feature
space. It relies on the assumption that similar data points are close to each other, making it a
natural fit for organizing and classifying image features extracted through methods like SIFT.
By combining feature extraction with classification techniques like KNN, we can create robust
systems capable of distinguishing between different objects or categories, such as identifying
cats and dogs in images.

Flowchart:

194
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Code and results:


import numpy as np
import pandas as pd
import os
import csv
import cv2
from matplotlib import pyplot as plt
import joblib
from sklearn import preprocessing
from skimage.filters import sobel
from skimage.measure import shannon_entropy
import warnings
warnings.filterwarnings('ignore')
from sklearn.cluster import KMeans
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from sklearn.tree import DecisionTreeClassifier
import sklearn.metrics as metrics
from sklearn.metrics import accuracy_score
input0 = 'C:/Users/Ashish/Desktop/CVFolder/'
temp = ['Cats', 'Dog']
195
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

#cleanup.clean('SIFT')
for i in temp:
count = 0
for filename in os.listdir(input0 + i):
img = cv2.imread(input0 + i + '/' + filename)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# pre processing - resize, contrast enhancement - Histogram equliazation
# Image blur- Gaussian
# Prewit / Sobel - structural
#initialise sift descriptor
sift = cv2.SIFT_create(nfeatures = 1000)
keypoints, descriptors = sift.detectAndCompute(gray, None)
sift_image = cv2.drawKeypoints(gray, keypoints, img)
# display image with keypoints
#convert the descriptor array into a dataframe format
out=pd.DataFrame(descriptors)
print("descriptor shape ",i, count, " : ", out.shape)
#append to the csv file
csv_data=out.to_csv('./Dataset/SIFT/SIFT_' + i + '.csv', mode='a', index=False)
count += 1
if count == 50:
break
print(i + ": " + str(count))
descriptor shape Cats 0 : (382, 128)
descriptor shape Cats 1 : (359, 128)
descriptor shape Cats 2 : (180, 128)
descriptor shape Cats 3 : (273, 128)
descriptor shape Cats 4 : (506, 128)
descriptor shape Cats 5 : (486, 128)
descriptor shape Cats 6 : (418, 128)
descriptor shape Cats 7 : (88, 128)
descriptor shape Cats 8 : (345, 128)
descriptor shape Cats 9 : (413, 128)
descriptor shape Cats 10 : (168, 128)
descriptor shape Cats 11 : (154, 128)
descriptor shape Cats 12 : (178, 128)
descriptor shape Cats 13 : (165, 128)
descriptor shape Cats 14 : (417, 128)
descriptor shape Cats 15 : (123, 128)
descriptor shape Cats 16 : (213, 128)
descriptor shape Cats 17 : (187, 128)
descriptor shape Cats 18 : (473, 128)
descriptor shape Cats 19 : (491, 128)
descriptor shape Cats 20 : (361, 128)
descriptor shape Cats 21 : (299, 128)
descriptor shape Cats 22 : (138, 128)
descriptor shape Cats 23 : (155, 128)
descriptor shape Cats 24 : (165, 128)
descriptor shape Cats 25 : (158, 128)
descriptor shape Cats 26 : (491, 128)
descriptor shape Cats 27 : (173, 128)
196
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

descriptor shape Cats 28 : (213, 128)


descriptor shape Cats 29 : (424, 128)
descriptor shape Cats 30 : (131, 128)
descriptor shape Cats 31 : (130, 128)
descriptor shape Cats 32 : (142, 128)
descriptor shape Cats 33 : (550, 128)
descriptor shape Cats 34 : (322, 128)
descriptor shape Cats 35 : (196, 128)
descriptor shape Cats 36 : (184, 128)
descriptor shape Cats 37 : (141, 128)
descriptor shape Cats 38 : (174, 128)
descriptor shape Cats 39 : (312, 128)
descriptor shape Cats 40 : (182, 128)
descriptor shape Cats 41 : (506, 128)
descriptor shape Cats 42 : (344, 128)
descriptor shape Cats 43 : (130, 128)
descriptor shape Cats 44 : (368, 128)
descriptor shape Cats 45 : (354, 128)
descriptor shape Cats 46 : (478, 128)
descriptor shape Cats 47 : (264, 128)
descriptor shape Cats 48 : (416, 128)
descriptor shape Cats 49 : (226, 128)
Cats: 50
descriptor shape Dog 0 : (207, 128)
descriptor shape Dog 1 : (374, 128)
descriptor shape Dog 2 : (271, 128)
descriptor shape Dog 3 : (216, 128)
descriptor shape Dog 4 : (185, 128)
descriptor shape Dog 5 : (290, 128)
descriptor shape Dog 6 : (379, 128)
descriptor shape Dog 7 : (161, 128)
descriptor shape Dog 8 : (382, 128)
descriptor shape Dog 9 : (316, 128)
descriptor shape Dog 10 : (413, 128)
descriptor shape Dog 11 : (411, 128)
descriptor shape Dog 12 : (530, 128)
descriptor shape Dog 13 : (407, 128)
descriptor shape Dog 14 : (258, 128)
descriptor shape Dog 15 : (186, 128)
descriptor shape Dog 16 : (254, 128)
descriptor shape Dog 17 : (411, 128)
descriptor shape Dog 18 : (247, 128)
descriptor shape Dog 19 : (239, 128)
descriptor shape Dog 20 : (323, 128)
descriptor shape Dog 21 : (250, 128)
descriptor shape Dog 22 : (254, 128)
descriptor shape Dog 23 : (503, 128)
descriptor shape Dog 24 : (292, 128)
descriptor shape Dog 25 : (113, 128)
descriptor shape Dog 26 : (327, 128)
descriptor shape Dog 27 : (162, 128)

197
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

descriptor shape Dog 28 : (375, 128)


descriptor shape Dog 29 : (158, 128)
descriptor shape Dog 30 : (384, 128)
descriptor shape Dog 31 : (199, 128)
descriptor shape Dog 32 : (166, 128)
descriptor shape Dog 33 : (237, 128)
descriptor shape Dog 34 : (135, 128)
descriptor shape Dog 35 : (151, 128)
descriptor shape Dog 36 : (385, 128)
descriptor shape Dog 37 : (249, 128)
descriptor shape Dog 38 : (369, 128)
descriptor shape Dog 39 : (434, 128)
descriptor shape Dog 40 : (334, 128)
descriptor shape Dog 41 : (383, 128)
descriptor shape Dog 42 : (310, 128)
descriptor shape Dog 43 : (316, 128)
descriptor shape Dog 44 : (390, 128)
descriptor shape Dog 45 : (256, 128)
descriptor shape Dog 46 : (401, 128)
descriptor shape Dog 47 : (323, 128)
descriptor shape Dog 48 : (301, 128)
descriptor shape Dog 49 : (293, 128)
Dog: 50
data1 = pd.read_csv('./Dataset/SIFT/SIFT_Cats.csv', dtype='uint8')
data2 = pd.read_csv('./Dataset/SIFT/SIFT_Dog.csv', dtype='uint8')
data1 = data1.astype('uint8')
data2 = data2.astype('uint8')
data1
0
1
2
3
4
5
6
7
8
9

118
119
198
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

120
121
122
123
124
125
126
127
0
3
25
5
2
9
97
21
1
29
45

0
0
0
0
0
0
0
0
0
0
1

199
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

100
45
0
0
1
2
3
16
45
23

2
94
15
0
0
22
119
24
2
9
2
65
29
0
1
7
8
9
17
143

200
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

42

5
73
90
0
0
3
14
9
4
90
3
42
3
1
37
79
0
0
12
49
7

0
27
22
102
15
3
0

201
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

0
0
0
4
42
3
1
35
55
1
1
17
22
0

27
31
2
85
143
3
2
2
1
0





202
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

















14190
5
111
110
0
0
0
0
0
89
166

1
16
18

203
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

28
1
0
0
0
11
17
14191
49
0
0
0
0
0
0
12
32
0

0
21
119
10
0
0
0
0
1
47
14192
44

204
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

86
22
7
0
0
2
27
16
28

0
0
2
2
0
0
0
0
0
0
14193
0
4
5
0
0
1
1
0
76
121

205
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering


5
15
62
62
8
13
5
0
0
1
14194
3
0
0
0
0
0
0
0
42
2

9
20
33
61
26
4
0
0

206
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

8
20
14195 rows × 128 columns
data2
0
1
2
3
4
5
6
7
8
9

118
119
120
121
122
123
124
125
126
127
0
0
0
0
0
0

207
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

0
0
0
0
0

0
0
52
14
0
2
7
0
0
0
1
0
0
0
0
0
0
0
0
36
12

5
23
73

208
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

34
1
9
15
0
0
0
2
38
0
0
0
0
0
12
146
55
2

0
1
22
2
0
2
7
1
1
1
3
1

209
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

0
0
0
24
146
12
1
119
1

13
13
7
1
0
1
146
13
1
2
4
0
0
2
150
95
0
0
0
0
2

210
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering


22
69
21
9
2
0
0
0
5
20



















211
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering



14954
17
0
0
1
13
2
3
58
10
2

22
33
40
3
2
9
93
56
1
4
14955
22
78
17
11
69
13

212
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

4
5
149
20

0
0
0
0
8
49
3
0
0
0
14956
109
40
3
0
0
0
0
23
110
10

6
25
35
67

213
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

2
1
87
125
14
6
14957
25
6
3
37
94
38
4
1
165
57

0
3
0
0
0
3
79
8
0
0
14958
53
43

214
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

6
9
8
9
6
12
46
100

6
38
5
0
0
0
0
1
20
20
14959 rows × 128 columns
data_final = pd.concat([data1,data2])
inertias = []
for i in range(1,35):
kmeans = KMeans(n_clusters=i)
kmeans.fit(data_final)
inertias.append(kmeans.inertia_)

plt.plot(range(1,35), inertias, marker='o')


plt.title('Elbow method')
plt.xlabel('Number of clusters')
plt.ylabel('Inertia')
plt.show()

215
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

kmeans3 = KMeans(n_clusters=10)
kmeans3.fit(pd.concat([data1, data2]))
joblib.dump(kmeans3, './Dataset/SIFT/Trained_Models/Kmeans_A')
['./Dataset/SIFT/Trained_Models/Kmeans_A']
kmeans3
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the
notebook. On GitHub, the HTML representation is unable to render, please try loading this page
with nbviewer.org.
KMeans?Documentation for KMeansiFitted
c=0
for i in temp:
data = []
path_to_folder = input0+i
print(path_to_folder)
for fname in os.listdir(path_to_folder):
path_to_file = path_to_folder+'/'+fname
img = cv2.imread(path_to_file)
sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(img, None)
out1 = pd.DataFrame(descriptors)
array_double = np.array(out1, dtype=np.double)
a=kmeans3.predict(array_double)
216
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

# BVOW, BOW
hist=np.histogram(a,bins=10) #the range of 10 cluster assignments is divided into 6
intervals.
data.append(hist[0])
# csv_data = out1.to_csv(r'SIFT\SIFT_{}.csv'.format(i), mode='a', index=False)
Output = pd.DataFrame(data)
Output["Class"] = c
csv_data=Output.to_csv(r'./Dataset/SIFT/SIFT_Final_{}.csv'.format(i), mode='a',
index=False)
c += 1
C:/Users/Ashish/Desktop/CVFolder/Cats
C:/Users/Ashish/Desktop/CVFolder/Dog
Output_Initial= pd.concat([data1, data2])
len(a)
247
a # kmeans predicted cluster labels between 0 to 9
array([1, 2, 3, 7, 2, 2, 4, 8, 4, 2, 8, 2, 6, 3, 2, 6, 3, 2, 8, 4, 2, 1,
4, 3, 2, 3, 2, 2, 8, 7, 1, 7, 2, 0, 4, 0, 9, 9, 7, 2, 1, 9, 8, 7,
2, 9, 9, 2, 4, 4, 0, 6, 7, 9, 0, 2, 5, 7, 9, 5, 2, 2, 2, 3, 7, 9,
8, 2, 6, 2, 0, 0, 3, 2, 8, 1, 5, 0, 2, 0, 2, 8, 8, 2, 0, 5, 6, 3,
6, 8, 8, 9, 0, 4, 2, 4, 4, 6, 9, 0, 5, 0, 2, 2, 8, 2, 7, 2, 7, 6,
4, 2, 5, 5, 4, 6, 0, 1, 7, 0, 0, 9, 6, 4, 2, 3, 6, 0, 3, 8, 3, 6,
3, 0, 5, 7, 4, 2, 2, 5, 5, 7, 6, 3, 3, 3, 2, 5, 3, 9, 3, 9, 1, 1,
0, 7, 1, 2, 9, 9, 0, 5, 4, 4, 2, 7, 3, 7, 2, 7, 4, 2, 2, 9, 0, 3,
3, 3, 4, 9, 4, 9, 1, 9, 3, 3, 2, 2, 1, 8, 6, 3, 2, 8, 6, 7, 6, 4,
9, 8, 4, 0, 4, 4, 5, 0, 7, 2, 0, 5, 9, 8, 1, 4, 1, 6, 6, 0, 9, 3,
4, 9, 8, 6, 0, 6, 1, 5, 3, 3, 6, 6, 5, 2, 9, 9, 9, 2, 2, 2, 4, 4,
4, 7, 7, 1, 6])
hist # Clusters distributed in 10 bins
(array([25, 15, 48, 27, 28, 16, 23, 21, 18, 26], dtype=int64),
array([0. , 0.9, 1.8, 2.7, 3.6, 4.5, 5.4, 6.3, 7.2, 8.1, 9. ]))
len(data)
54
data # All features histograms are appended in data
[array([18, 19, 39, 21, 32, 15, 19, 18, 14, 12], dtype=int64),
array([36, 33, 73, 32, 39, 31, 41, 40, 22, 27], dtype=int64),
array([18, 23, 51, 25, 50, 17, 22, 23, 21, 21], dtype=int64),
array([25, 7, 46, 23, 29, 19, 20, 11, 21, 15], dtype=int64),
array([18, 9, 34, 19, 33, 19, 13, 12, 12, 16], dtype=int64),
array([39, 18, 42, 28, 35, 31, 28, 12, 28, 29], dtype=int64),
array([34, 17, 68, 35, 50, 34, 30, 21, 57, 33], dtype=int64),
array([18, 7, 11, 16, 23, 17, 16, 6, 28, 19], dtype=int64),
217
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

array([27, 29, 54, 35, 36, 39, 33, 22, 59, 48], dtype=int64),
array([34, 19, 24, 33, 41, 25, 30, 13, 54, 43], dtype=int64),
array([28, 30, 43, 47, 48, 30, 33, 21, 69, 64], dtype=int64),
array([49, 27, 36, 48, 43, 58, 48, 26, 37, 39], dtype=int64),
array([53, 81, 52, 40, 59, 53, 37, 33, 61, 61], dtype=int64),
array([52, 34, 49, 48, 43, 51, 58, 16, 27, 29], dtype=int64),
array([25, 12, 42, 34, 26, 28, 33, 12, 14, 32], dtype=int64),
array([17, 10, 33, 18, 23, 21, 25, 15, 16, 8], dtype=int64),
array([28, 19, 48, 25, 36, 25, 29, 18, 15, 11], dtype=int64),
array([46, 30, 54, 49, 56, 51, 44, 27, 32, 22], dtype=int64),
array([18, 13, 42, 24, 32, 25, 24, 12, 32, 25], dtype=int64),
array([33, 17, 38, 28, 10, 27, 27, 16, 26, 17], dtype=int64),
array([38, 23, 42, 32, 26, 34, 34, 14, 41, 39], dtype=int64),
array([22, 21, 24, 22, 40, 14, 18, 12, 54, 23], dtype=int64),
array([20, 5, 49, 29, 43, 21, 43, 18, 12, 14], dtype=int64),
array([40, 61, 40, 55, 72, 49, 42, 44, 55, 45], dtype=int64),
array([37, 12, 26, 49, 54, 32, 48, 5, 14, 15], dtype=int64),
array([13, 24, 6, 7, 8, 16, 7, 12, 11, 9], dtype=int64),
array([40, 26, 47, 32, 35, 33, 42, 26, 28, 18], dtype=int64),
array([17, 26, 17, 17, 13, 18, 14, 10, 13, 17], dtype=int64),
array([45, 28, 51, 45, 44, 39, 39, 30, 32, 22], dtype=int64),
array([18, 15, 34, 12, 20, 19, 16, 6, 7, 11], dtype=int64),
array([39, 17, 66, 43, 44, 43, 36, 25, 42, 29], dtype=int64),
array([11, 6, 51, 20, 21, 26, 16, 3, 27, 18], dtype=int64),
array([18, 6, 36, 17, 21, 18, 14, 6, 10, 20], dtype=int64),
array([22, 16, 41, 22, 29, 20, 19, 28, 23, 17], dtype=int64),
array([12, 15, 18, 17, 20, 15, 11, 20, 6, 1], dtype=int64),
array([13, 11, 20, 9, 32, 14, 19, 12, 9, 12], dtype=int64),
array([49, 18, 35, 60, 53, 40, 41, 13, 40, 36], dtype=int64),
array([25, 8, 27, 18, 32, 26, 27, 15, 40, 31], dtype=int64),
array([29, 33, 48, 52, 56, 37, 52, 22, 25, 15], dtype=int64),
array([53, 25, 64, 47, 51, 45, 49, 24, 45, 31], dtype=int64),
array([32, 18, 43, 29, 49, 36, 32, 23, 35, 37], dtype=int64),
array([45, 23, 61, 45, 46, 27, 34, 19, 35, 48], dtype=int64),
array([25, 17, 69, 28, 41, 23, 31, 14, 31, 31], dtype=int64),
array([43, 40, 27, 28, 22, 38, 34, 20, 34, 30], dtype=int64),
array([46, 25, 70, 38, 59, 25, 40, 36, 25, 26], dtype=int64),
array([24, 10, 34, 26, 36, 15, 34, 14, 40, 23], dtype=int64),
array([43, 25, 54, 42, 54, 41, 51, 27, 33, 31], dtype=int64),
array([12, 26, 73, 16, 57, 24, 23, 22, 41, 29], dtype=int64),
array([32, 27, 37, 22, 54, 37, 24, 20, 36, 12], dtype=int64),
array([21, 28, 51, 21, 51, 20, 21, 21, 42, 17], dtype=int64),
array([38, 6, 39, 26, 15, 33, 27, 5, 12, 20], dtype=int64),
array([26, 18, 46, 34, 38, 23, 36, 11, 54, 44], dtype=int64),
array([12, 12, 38, 17, 29, 7, 28, 13, 4, 9], dtype=int64),
array([25, 15, 48, 27, 28, 16, 23, 21, 18, 26], dtype=int64)]
1
dfp = pd.read_csv('./Dataset/SIFT/SIFT_Final_Cats.csv')
dfn = pd.read_csv('./Dataset/SIFT/SIFT_Final_Dog.csv')

218
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

df = pd.concat([dfp, dfn])
csv_data = df.to_csv('./Dataset/SIFT/SIFT_Final.csv')
df.shape
(104, 11)
X = df.drop(df.columns[[0, -1]], axis=1)
X.shape
(104, 9)
df = pd.read_csv('./Dataset/SIFT/SIFT_Final.csv')
X = df.drop(df.columns[[0, -1]], axis=1) # selects all rows (:) and all but the last column (:-1)
of the DataFrame df.
Y = df.iloc[:, -1] # selects all rows (:) and the last column (-1) of the DataFrame df.

# train test split


x_train, x_test, y_train, y_test = train_test_split(X, Y, train_size=0.8, random_state=5)
y_train
37 0
23 0
59 1
32 0
64 1
..
73 1
16 0
61 1
78 1
99 1
Name: Class, Length: 83, dtype: int64
x_test.shape
(21, 10)
y_test
74 1
10 0
20 0
22 0
83 1
52 1
46 0
102 1
63 1
84 1
67 1
40 0
72 1

219
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

6 0
66 1
17 0
56 1
28 0
48 0
98 1
35 0
Name: Class, dtype: int64
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()

train_X = sc.fit_transform(x_train)
test_X = sc.transform(x_test)
train_X
array([[-0.78219695, -1.12840664, -1.32705455, -1.04375126, -0.82826934,
-0.58287821, -1.09410278, -0.86113728, -1.06198976, -1.02395088],
[-0.98550283, -1.00106775, -1.26468449, -0.90870628, -0.68910668,
-1.1381596 , -0.6765831 , -0.79574442, -0.52386308, -0.95617742],
[ 0.30210112, -0.30070388, -0.70335394, 0.10413107, 0.70251995,
-0.23582734, -0.11989019, -0.66495871, 1.14432962, 1.28034688],
[-1.52765186, -1.0647372 , 0.98063771, -1.44888619, -0.75868801,
-1.20756977, -1.233276 , -1.18810157, -1.16961509, -1.22727127],
[-0.30781654, -0.74638998, 0.41930716, 0.17165356, -0.34120002,
-0.02759682, 0.08886965, -0.73035156, -1.00817709, 0.53483878],
[ 1.04755603, -0.0460261 , 1.60433832, 0.91440094, 1.0504266 ,
-0.09700699, 0.15845626, -0.27260155, 0.12188893, 1.61921419],
[-0.78219695, -0.55538165, -0.07965333, -1.31384121, -0.75868801,
-0.65228838, -1.09410278, -1.12270872, -1.38486576, -0.88840396],
[-0.10451066, 0.39966 , 0.48167722, 1.04944592, 1.18958927,
0.11122353, 0.08886965, -0.14181584, 1.95151964, 2.70358961],
[ 0.70871289, 0.59066833, -1.20231443, 1.1844909 , -0.13245603,
0.59709475, 0.08886965, 0.25054131, 1.30576762, -0.07512239],
[-1.12104009, -1.19207608, -0.765724 , -0.63861632, -0.54994402,
-0.99933925, -0.60699648, -1.12270872, -0.25479974, -0.21066932],
[-0.30781654, -0.55538165, 0.79352752, -0.30100387, -0.20203736,
-0.8605189 , -0.60699648, -0.14181584, -0.79292642, 0.128198 ],
[ 0.505407 , 0.33599056, -1.01520425, 0.5767885 , -1.03701334,
-0.09700699, 1.06308224, 0.1197556 , 0.71382828, 0.128198 ],
[-1.18880872, -0.68272054, -0.70335394, -1.04375126, -0.61952535,
-1.1381596 , -1.16368939, -0.46878013, -0.36242508, -1.02395088],
[-0.64665969, -0.61905109, -0.57861382, -0.57109383, -1.10659467,
-0.92992908, -0.60699648, -0.73035156, -1.16961509, -1.22727127],
[ 0.09879523, -0.0460261 , -0.01728327, 0.5767885 , -0.68910668,
0.25004388, 1.20225546, 0.64289846, 0.22951427, -0.21066932],
[-1.12104009, -0.81005942, -0.95283418, -1.51640868, 0.07628797,
-0.99933925, -0.88534294, -0.73035156, -1.27724043, -0.82063049],
[ 0.64094426, -0.36437332, 0.41930716, -0.23348138, 0.28503196,
0.18063371, -0.25906342, -0.73035156, -0.25479974, 0.33151839],
220
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

[ 1.58970507, 0.08131278, 1.7914485 , 1.04944592, 1.39833326,


1.15237614, 1.20225546, 0.05436274, 0.66001561, 0.46706531],
[-0.78219695, -0.68272054, 0.41930716, -0.50357134, 0.07628797,
-0.23582734, -0.53740987, -0.73035156, -0.03954907, 0.06042453],
[ 1.79301095, 0.90901554, -1.20231443, 1.79219331, -0.13245603,
1.77706771, 0.71514917, 0.38132703, 3.56589967, 3.11023039],
[-0.78219695, -0.30070388, 0.23219697, -0.70613881, 0.07628797,
-0.92992908, -0.88534294, -0.33799441, -1.00817709, -0.82063049],
[-0.78219695, -1.0647372 , -1.51416473, -1.04375126, -0.54994402,
-0.79110873, -1.09410278, -1.12270872, -0.25479974, -0.34621625],
[-0.30781654, -0.42804276, 2.1032988 , -0.23348138, 0.70251995,
-0.37464769, -0.05030358, -0.59956585, -0.09336174, 0.46706531],
[-0.24004791, -0.36437332, 0.6687874 , 0.17165356, 0.49377595,
-0.37464769, 0.29762949, -0.79574442, 1.14432962, 1.34812034],
[-0.17227929, 0.27232111, -0.39150364, 0.5767885 , 0.07628797,
-0.09700699, 0.43680272, -0.07642298, 0.33713961, -0.48176317],
[-0.78219695, -0.10969555, -1.13994437, -0.84118379, -1.31533866,
-0.92992908, -1.44203584, -0.33799441, -0.90055175, -0.95617742],
[ 1.52193644, 2.43708218, 0.54404728, 1.99476078, 0.70251995,
1.43001684, 1.34142869, 2.73546994, 2.48964631, 1.48366727],
[ 0.91201878, 2.11873497, -0.14202339, 1.31953588, -0.27161869,
2.3323491 , 0.71514917, 2.34311279, 0.28332694, -0.82063049],
[ 0.57317563, -0.0460261 , 0.41930716, 0.03660858, -0.34120002,
0.38886423, 0.15845626, -0.59956585, 0.44476494, 1.00925302],
[ 0.77648152, 1.16369332, 0.29456703, 1.31953588, -0.48036268,
0.66650492, 1.27184208, 2.67007708, 0.33713961, 0.26374492],
[ 0.23433249, -0.42804276, 0.16982691, -0.23348138, -1.45450132,
-0.09700699, -0.32865003, -0.46878013, -0.36242508, -0.48176317],
[-0.78219695, -0.93739831, -0.07965333, -0.84118379, 0.1458693 ,
-0.65228838, -1.30286262, -0.73035156, -1.11580242, -0.54953664],
[ 1.31863055, -0.36437332, -0.01728327, 1.92723829, 1.53749592,
0.80532527, 0.64556256, -0.66495871, 0.39095227, 0.80593263],
[ 0.23433249, -0.30070388, -1.01520425, 0.23917605, -0.27161869,
0.18063371, -0.05030358, -0.14181584, 0.28332694, -0.34621625],
[-1.18880872, -0.36437332, -1.57653479, -0.97622877, -0.967432 ,
-1.27697995, -1.09410278, -0.66495871, -1.22342776, -1.15949781],
[-0.51112243, -0.30070388, -1.38942461, -1.44888619, -1.38491999,
-0.8605189 , -1.16368939, 0.25054131, -1.06198976, -1.09172435],
[-0.78219695, -1.12840664, 0.04508679, -0.97622877, -0.68910668,
-0.72169856, -1.233276 , -1.12270872, -1.22342776, -0.27844278],
[ 0.16656386, 1.80038775, -0.07965333, 0.5767885 , -0.89785067,
0.11122353, 1.13266885, 1.62379134, -0.20098707, -0.07512239],
[-0.30781654, -1.00106775, -0.51624376, -0.90870628, 0.07628797,
-0.16641716, -0.32865003, -0.53417299, 0.39095227, 0.46706531],
[-0.51112243, -0.17336499, -0.70335394, -0.63861632, 0.63293862,
-0.99933925, -0.95492955, -0.73035156, 1.14432962, -0.07512239],
[ 0.91201878, 1.03635443, -0.51624376, -0.23348138, -0.61952535,
0.66650492, 0.15845626, -0.2072087 , 0.06807627, 0.39929185],
[ 0.16656386, -0.36437332, 0.48167722, -0.16595889, 1.2591706 ,
0.52768458, 0.01928304, -0.01103012, 0.12188893, 0.8737061 ],
[-1.05327146, -1.38308441, -0.4538737 , -0.03091391, -0.48036268,

221
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

-0.23582734, 0.29762949, -0.86113728, 0.12188893, -0.21066932],


[-1.25657735, -1.12840664, 0.98063771, -0.7736613 , -0.68910668,
-0.16641716, -1.09410278, -1.31888729, -0.30861241, -0.41398971],
[ 2.19962273, 1.48204053, -0.51624376, 1.31953588, 0.56335728,
1.49942701, 1.48060192, 1.29682705, 2.32820831, 1.28034688],
[-0.57889106, -1.12840664, -0.39150364, -0.70613881, -0.75868801,
-0.99933925, -0.25906342, -1.05731586, -0.68530108, -1.09172435],
[-0.51112243, -1.19207608, -1.26468449, -0.30100387, -1.10659467,
-1.1381596 , -0.39823664, -1.18810157, -0.09336174, -0.14289586],
[-1.86649501, -0.42804276, 0.54404728, -1.51640868, -1.80240798,
-1.27697995, -1.233276 , -0.27260155, -1.38486576, -1.43059166],
[-0.03674203, 0.59066833, 0.79352752, 1.38705837, 1.74623992,
0.59709475, 1.4110153 , -0.07642298, -0.41623774, -0.6173101 ],
[ 0.43763837, 0.59066833, 2.35277905, 0.03660858, 0.56335728,
0.18063371, 0.64556256, 1.10064847, -0.57767575, 0.19597146],
[-0.78219695, 0.01764334, -1.20231443, -0.97622877, -1.80240798,
0.25004388, -0.74616971, -0.07642298, -0.41623774, -1.3628182 ],
[ 2.6062345 , 1.54570998, -0.26676351, 1.58962584, -1.10659467,
2.12411858, 2.4548145 , 2.86625565, 1.19814229, 1.14479995],
[-0.4433538 , 0.20865167, -0.01728327, 0.50926601, 0.56335728,
-0.44405786, 0.8543224 , 2.4738985 , 0.87526628, 1.28034688],
[-0.17227929, 0.33599056, 1.16774789, 0.23917605, 0.35461329,
0.7359151 , 0.08886965, -0.07642298, 1.41339296, 1.61921419],
[ 1.72524232, 2.43708218, -0.89046412, 1.99476078, 0.49377595,
2.74881014, 1.06308224, 1.10064847, 0.82145361, 1.48366727],
[ 0.57317563, -1.12840664, 0.23219697, -0.36852636, -1.10659467,
0.31945405, -0.32865003, -1.18810157, -1.11580242, -0.27844278],
[ 0.505407 , 0.39966 , 0.04508679, -0.43604885, 0.49377595,
0.18063371, -0.1894768 , 0.1197556 , 1.46720563, 1.61921419],
[-1.18880872, 0.14498223, 2.35277905, -1.04375126, 1.81582125,
-0.30523751, -0.60699648, -0.07642298, 0.44476494, 0.33151839],
[ 1.92854821, 1.60937942, 1.16774789, 0.91440094, 1.67665859,
1.08296597, 2.73316096, 1.62379134, 0.92907895, 0.53483878],
[ 0.77648152, 2.18240441, -1.32705455, 0.84687845, 0.70251995,
0.87473545, 1.48060192, 2.08154135, 0.39095227, 0.128198 ],
[ 2.06408547, 0.8453461 , -0.89046412, 2.66998568, 0.77210128,
2.3323491 , 2.87233418, 1.49300563, 0.66001561, -0.54953664],
[-0.37558517, -0.87372887, -0.07965333, -0.36852636, 0.35461329,
-0.92992908, 0.15845626, -0.59956585, 0.39095227, -0.07512239],
[-0.17227929, 0.08131278, 0.48167722, -0.23348138, 0.42419462,
-0.23582734, -0.39823664, 0.70829132, -1.06198976, -1.22727127],
[-0.84996557, -0.87372887, -0.14202339, -0.90870628, -0.54994402,
-0.51346803, -0.46782326, -0.53417299, -0.90055175, -1.09172435],
[-1.12104009, 0.01764334, -1.82601504, -1.65145366, -1.59366399,
-0.8605189 , -1.72038229, -0.73035156, -1.16961509, -1.02395088],
[-0.84996557, 0.14498223, -1.13994437, -0.97622877, -1.24575733,
-0.72169856, -1.233276 , -0.86113728, -1.06198976, -0.48176317],
[ 0.30210112, -0.42804276, 0.6687874 , 1.25201339, 0.49377595,
1.43001684, 1.48060192, -0.33799441, 0.55239028, -0.27844278],
[ 0.91201878, 0.08131278, 1.16774789, 0.71183348, 1.60707725,
0.87473545, 1.34142869, 0.25054131, 0.0142636 , 0.46706531],

222
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

[ 1.11532466, 0.08131278, 2.16566886, 0.44174352, 1.95498391,


-0.23582734, 0.57597594, 0.83907704, -0.41623774, 0.128198 ],
[-0.30781654, -1.0647372 , 0.6687874 , -0.57109383, -0.13245603,
-0.65228838, -0.81575632, -0.79574442, -0.63148841, -0.6173101 ],
[-1.32434598, -0.49171221, -1.51416473, -1.04375126, -1.176176 ,
-1.69344099, -0.95492955, -0.33799441, -1.16961509, -1.29504474],
[ 0.70871289, 0.14498223, 0.73115746, 0.03660858, 0.28503196,
0.31945405, 0.71514917, 0.18514845, -0.25479974, -0.41398971],
[-1.73095775, -0.81005942, -1.63890486, -0.84118379, -1.45450132,
-1.62403082, -1.16368939, -0.92653014, -1.54630377, -1.49836513],
[ 0.64094426, -0.42804276, 1.91618862, 0.77935596, 0.91126394,
1.01355579, 0.29762949, 0.1197556 , 0.49857761, 0.33151839],
[-0.98550283, -0.93739831, -1.07757431, -1.24631872, -0.89785067,
-1.1381596 , -0.95492955, -0.66495871, -1.22342776, -1.29504474],
[-1.45988323, -0.68272054, 0.10745685, -1.04375126, -1.10659467,
-1.1381596 , -1.02451616, -0.27260155, -0.79292642, -0.41398971],
[ 1.58970507, 3.6468016 , 1.04300777, 0.5767885 , 1.95498391,
1.70765753, 0.3672161 , 0.64289846, 1.5210183 , 2.50026922],
[-0.03674203, 0.08131278, 0.79352752, 0.77935596, 0.49377595,
-0.72169856, 0.71514917, 0.96986276, 0.44476494, -0.07512239],
[ 0.70871289, 2.37341274, 0.29456703, 1.58962584, 2.85954122,
1.43001684, 0.71514917, 1.36221991, 1.19814229, 1.4158938 ],
[-0.17227929, -0.49171221, -0.01728327, -0.7736613 , -0.61952535,
-0.58287821, -1.02451616, 1.03525562, -1.11580242, -1.29504474],
[ 1.31863055, 0.20865167, 0.04508679, 1.11696841, 0.84168261,
2.0547084 , 1.13266885, 0.18514845, 0.22951427, 1.00925302],
[ 1.04755603, 0.27232111, 0.98063771, 0.91440094, 0.91126394,
0.7359151 , 0.50638933, 0.44671989, -0.03954907, -0.14289586],
[-0.57889106, 0.27232111, 0.98063771, -0.70613881, 1.39833326,
-0.58287821, -0.74616971, -0.14181584, 0.49857761, -0.48176317]])
test_X
array([[ 0.505407 , -0.74638998, -0.57861382, 1.1844909 , 1.60707725,
0.25004388, 1.13266885, -1.18810157, -1.00817709, -0.6173101 ],
[-1.12104009, -1.5104233 , -0.32913357, -0.57109383, -0.61952535,
-1.06874943, -0.60699648, -0.79574442, -0.68530108, -0.75285703],
[ 0.70871289, 0.39966 , -0.01728327, 0.77935596, 1.12000794,
0.25004388, 0.43680272, 0.64289846, 0.12188893, 0.26374492],
[-1.45988323, -1.00106775, -0.4538737 , -1.24631872, -0.54994402,
-1.48521047, -1.233276 , -0.59956585, -1.00817709, -1.02395088],
[-0.51112243, -0.49171221, 0.3569371 , -0.63861632, -0.13245603,
-0.58287821, -0.88534294, 0.31593417, -0.52386308, -0.48176317],
[-0.78219695, -0.0460261 , 0.98063771, -0.43604885, 1.32875193,
-0.79110873, -0.6765831 , -0.01103012, -0.63148841, -0.21066932],
[ 1.79301095, 0.71800722, 0.04508679, 2.33237323, 0.63293862,
3.58173223, 0.99349562, 0.83907704, 1.09051695, 0.39929185],
[-1.18880872, -0.74638998, 0.16982691, -0.97622877, -0.13245603,
-1.48521047, -0.25906342, -0.66495871, -1.54630377, -1.02395088],
[ 1.52193644, 0.65433777, 0.85589758, 1.11696841, 0.84168261,
1.56883719, 1.82853498, -0.46878013, -0.30861241, 0.33151839],
[-1.18880872, -0.55538165, -1.07757431, -0.97622877, -0.75868801,
223
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

-0.92992908, -1.44203584, -0.2072087 , -1.43867843, -1.56613859],


[ 1.11532466, 0.39966 , 1.16774789, 1.1844909 , 1.74623992,
1.56883719, 0.8543224 , 0.25054131, -0.03954907, -0.14289586],
[-1.12104009, -0.49171221, -0.39150364, -0.63861632, -0.48036268,
-0.99933925, -1.44203584, -0.33799441, -0.68530108, -0.6173101 ],
[-0.64665969, -1.19207608, 0.85589758, -0.16595889, 0.84168261,
-0.51346803, 0.78473578, -0.33799441, -1.11580242, -0.68508356],
[ 0.43763837, 1.48204053, 0.04508679, 1.04944592, 0.98084527,
1.15237614, 1.61977514, 0.31593417, 0.66001561, 0.67038571],
[-0.10451066, -0.30070388, 0.79352752, -0.43604885, 0.35461329,
-0.23582734, -0.1894768 , -0.33799441, -0.95436442, -0.88840396],
[-0.9177342 , -0.93739831, -0.01728327, -0.16595889, -1.10659467,
-0.92992908, 0.22804288, -0.86113728, -1.06198976, -0.95617742],
[ 0.30210112, -0.42804276, 2.04092874, 0.23917605, 1.32875193,
0.38886423, -0.11989019, -0.14181584, 1.30576762, 0.60261224],
[-0.51112243, -0.68272054, -0.07965333, -0.30100387, -0.0628747 ,
-0.16641716, -0.25906342, -0.53417299, -0.84673908, -1.56613859],
[ 1.45416781, 1.54570998, -0.82809406, 0.64431099, -0.13245603,
0.7359151 , 1.34142869, 3.19321995, 0.22951427, 0.128198 ],
[ 0.16656386, 0.20865167, 0.10745685, -0.63861632, 1.60707725,
0.59709475, -0.53740987, -0.2072087 , 0.1757016 , -0.82063049],
[-0.57889106, -0.17336499, -1.32705455, -0.57109383, -1.45450132,
-0.23582734, -0.32865003, 0.18514845, -0.63148841, -1.09172435]])
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import confusion_matrix, precision_score, recall_score, f1_score,
RocCurveDisplay
import joblib

# Train KNN Classifier


model_knn = KNeighborsClassifier(n_neighbors=13)
model_knn.fit(x_train, y_train)

# Save model
joblib.dump(model_knn, "./Dataset/SIFT/Trained_Models/modelA_knn")

# Predict
y_pred3 = model_knn.predict(x_test)

# Print metrics
print("KNN Classifier")
print("Train Accuracy:", model_knn.score(x_train, y_train))
print("Test Accuracy:", model_knn.score(x_test, y_test))
print("Precision Score:", precision_score(y_test, y_pred3, average='micro'))
print("Recall Score:", recall_score(y_test, y_pred3, average='micro'))
print("F1 Score:", f1_score(y_test, y_pred3, average='micro'))

# Confusion Matrix
print("Confusion Matrix:")
print(confusion_matrix(y_test, y_pred3))

224
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

# ROC Curve
RocCurveDisplay.from_estimator(model_knn, x_test, y_test)
KNN Classifier
Train Accuracy: 0.8554216867469879
Test Accuracy: 0.7619047619047619
Precision Score: 0.7619047619047619
Recall Score: 0.7619047619047619
F1 Score: 0.7619047619047619
Confusion Matrix:
[[ 6 4]
[ 1 10]]

<sklearn.metrics._plot.roc_curve.RocCurveDisplay at 0x1bbd5b68a40>

confusion_matrix(y_test, y_pred3)
array([[ 6, 4],
[ 1, 10]], dtype=int64)
DC=accuracy_score(y_test, y_pred3)*100
DC
76.19047619047619
from tabulate import tabulate
#from prettytable import PrettyTable
mydata = confusion_matrix(y_test, y_pred3)
225
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

# create header
head = ["Predicted No", "Predicted Yes"]
row=["Actual No","Actual Yes"]

# display table
print(tabulate(mydata, headers=head,showindex=row, tablefmt="grid"))
+------------+----------------+-----------------+
| | Predicted No | Predicted Yes |
+============+================+=================+
| Actual No | 6| 4|
+------------+----------------+-----------------+
| Actual Yes | 1| 10 |
+------------+----------------+-----------------+
dfc = pd.DataFrame(mydata)
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
cm = confusion_matrix(y_test, y_pred3, normalize='all')
cmd = ConfusionMatrixDisplay(cm, display_labels=['Cats','Dogs'])
cmd.plot()
<sklearn.metrics._plot.confusion_matrix.ConfusionMatrixDisplay at 0x1bbd5b410a0>

from sklearn.neighbors import KNeighborsClassifier


from sklearn.model_selection import GridSearchCV
from sklearn.metrics import confusion_matrix, precision_score, recall_score, f1_score,
RocCurveDisplay
226
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

import joblib

# Define the parameter grid to search over


param_grid = {
'n_neighbors': [5, 7, 13, 17, 23],
'weights': ['uniform', 'distance'],
'metric': ['euclidean', 'manhattan']
}

# Create a KNeighborsClassifier object


knn = KNeighborsClassifier()

# Perform a grid search over the parameter grid using 5-fold cross-validation
grid_search_knn = GridSearchCV(knn, param_grid, cv=5)

# Fit the grid search object to the training data


grid_search_knn.fit(x_train, y_train)

# Print the best hyperparameters and their corresponding score


print("Best hyperparameters: ", grid_search_knn.best_params_)
print("Best cross-validated score: ", grid_search_knn.best_score_)

# Save the best model


joblib.dump(grid_search_knn, "./Dataset/SIFT/Trained_Models/modelA_knn_tuned")

# Predict using the best estimator found by GridSearchCV


y_pred3 = grid_search_knn.predict(x_test)

# Print performance metrics


print("KNN with GridSearchCV")
print("Train Accuracy:", grid_search_knn.score(x_train, y_train))
print("Test Accuracy:", grid_search_knn.score(x_test, y_test))
print("Precision Score:", precision_score(y_test, y_pred3, average='micro'))
print("Recall Score:", recall_score(y_test, y_pred3, average='micro'))
print("F1 Score:", f1_score(y_test, y_pred3, average='micro'))

# Confusion Matrix
print("Confusion Matrix:")
print(confusion_matrix(y_test, y_pred3))

# ROC Curve
RocCurveDisplay.from_estimator(grid_search_knn, x_test, y_test)
Best hyperparameters: {'metric': 'euclidean', 'n_neighbors': 5, 'weights': 'uniform'}
Best cross-validated score: 0.8286764705882353
KNN with GridSearchCV
Train Accuracy: 0.8433734939759037
Test Accuracy: 0.8095238095238095
Precision Score: 0.8095238095238095
Recall Score: 0.8095238095238095
F1 Score: 0.8095238095238095
Confusion Matrix:
227
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

[[ 7 3]
[ 1 10]]

<sklearn.metrics._plot.roc_curve.RocCurveDisplay at 0x1bbd4f96d80>

def siftFeatures(path):
img = cv2.imread(path)
gray = cv2.resize(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY), (128, 128))

#initialise sift descriptor


sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(gray, None)

#sift_image = cv2.drawKeypoints(gray, keypoints, img)


#convert the descriptor array into a dataframe format
return pd.DataFrame(descriptors).astype('uint8')
def featureReduction(features):
modelKmeansA = joblib.load('./Dataset/SIFT/Trained_Models/Kmeans_A')
#pickle.load(file)

data = modelKmeansA.predict(features)
hist = np.histogram(data,bins=10)

return pd.DataFrame([hist[0]])

228
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

import joblib
import os
import cv2
import matplotlib.pyplot as plt

# Ensure the model file path is correct


model_path = './Dataset/SIFT/Trained_Models/modelA_knn_tuned'

if os.path.exists(model_path):
# Load the model
modelA = joblib.load(model_path)
else:
raise FileNotFoundError(f"Model file not found at {model_path}. Please check the path.")

# Predict using the model


img_path = './Cats/cat1.jpeg' # Ensure this path is also correct
modelA_pred = modelA.predict(featureReduction(siftFeatures(img_path)))
modelA_proba = modelA.predict_proba(featureReduction(siftFeatures(img_path)))

# Output the prediction


print("CAT" if modelA_pred else "DOG")
# Read and display the image
img = cv2.imread(img_path)
if img is None:
raise FileNotFoundError(f"Image file not found at {img_path}. Please check the path.")
# Convert from BGR to RGB for correct plotting
img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

# Display the image


plt.imshow(img_rgb)
plt.axis('off') # Hide axis
plt.show()
DOG

229
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

modelA_proba
array([[1., 0.]])import joblib
import cv2
import matplotlib.pyplot as plt
import os
# Define paths for model and image
img_path = './Dog/dog10.jpeg'
model_path = './Dataset/SIFT/Trained_Models/modelA_knn_tuned'
# Check if the model file exists
if os.path.exists(model_path):
# Load the model
modelA = joblib.load(model_path)
else:
raise FileNotFoundError(f"Model file not found at {model_path}. Please check the path.")
# Check if the image file exists
if os.path.exists(img_path):
# Extract features and predict using the loaded model
features = featureReduction(siftFeatures(img_path))
modelA_pred = modelA.predict(features)
modelA_proba = modelA.predict_proba(features)
else:
raise FileNotFoundError(f"Image file not found at {img_path}. Please check the path.")
# Output the prediction
print("CAT" if modelA_pred else "DOG")
# Read and display the image
img = cv2.imread(img_path)
if img is None:
raise FileNotFoundError(f"Image file not found at {img_path}. Please check the path.")
230
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

# Convert from BGR to RGB for correct plotting


img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
# Display the image
plt.imshow(img_rgb)
plt.axis('off') # Hide axis
plt.show()
DOG

modelA_proba
array([[0.8, 0.2]])
import joblib
import cv2
import matplotlib.pyplot as plt
import os
# Define paths for model and image
img_path = './Dog/dog30.jpeg'
model_path = './Dataset/SIFT/Trained_Models/modelA_knn_tuned' # Corrected the path
# Check if the model file exists
if os.path.exists(model_path):
# Load the model
modelA = joblib.load(model_path)
else:
raise FileNotFoundError(f"Model file not found at {model_path}. Please check the path.")
# Check if the image file exists
if os.path.exists(img_path):
# Extract features and predict using the loaded model
features = featureReduction(siftFeatures(img_path))
modelA_pred = modelA.predict(features)
modelA_proba = modelA.predict_proba(features)
231
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

else:
raise FileNotFoundError(f"Image file not found at {img_path}. Please check the path.")
# Output the prediction
print("CAT" if modelA_pred else "DOG")
# Read and display the image
img = cv2.imread(img_path)
if img is None:
raise FileNotFoundError(f"Image file not found at {img_path}. Please check the path.")
# Convert from BGR to RGB for correct plotting
img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
# Display the image
plt.imshow(img_rgb)
plt.axis('off') # Hide axis
plt.show()
DOG

modelA_proba
array([[0.6, 0.4]])
import joblib
import cv2
import matplotlib.pyplot as plt
import os
# Define paths for model and image
img_path = './Cats/cat15.jpeg'
model_path = './Dataset/SIFT/Trained_Models/modelA_knn_tuned'
# Check if the model file exists
if os.path.exists(model_path):
# Load the model
modelA = joblib.load(model_path)
else:
232
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

raise FileNotFoundError(f"Model file not found at {model_path}. Please check the path.")
# Check if the image file exists
if os.path.exists(img_path):
# Extract features and predict using the loaded model
features = featureReduction(siftFeatures(img_path))
modelA_pred = modelA.predict(features)
modelA_proba = modelA.predict_proba(features)
else:
raise FileNotFoundError(f"Image file not found at {img_path}. Please check the path.")
# Output the prediction
print("CAT" if modelA_pred else "DOG")

# Read and display the image


img = cv2.imread(img_path)
if img is None:
raise FileNotFoundError(f"Image file not found at {img_path}. Please check the path.")
# Convert from BGR to RGB for correct plotting
img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
# Display the image
plt.imshow(img_rgb)
plt.axis('off') # Hide axis
plt.show()
DOG

233
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

modelA_proba
array([[0.6, 0.4]])
import joblib
import cv2
import matplotlib.pyplot as plt
import os
# Define paths for model and image
img_path = './Cats/cat24.jpeg'
model_path = './Dataset/SIFT/Trained_Models/modelA_knn_tuned'
# Check if the model file exists
if os.path.exists(model_path):
# Load the model
modelA = joblib.load(model_path)
else:
raise FileNotFoundError(f"Model file not found at {model_path}. Please check the path.")
# Check if the image file exists
if os.path.exists(img_path):
# Extract features and predict using the loaded model
features = featureReduction(siftFeatures(img_path))
modelA_pred = modelA.predict(features)
modelA_proba = modelA.predict_proba(features)
else:
raise FileNotFoundError(f"Image file not found at {img_path}. Please check the path.")
# Output the prediction
print("CAT" if modelA_pred else "DOG")
# Read and display the image
img = cv2.imread(img_path)
if img is None:
raise FileNotFoundError(f"Image file not found at {img_path}. Please check the path.")
# Convert from BGR to RGB for correct plotting
img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
234
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

# Display the image


plt.imshow(img_rgb)
plt.axis('off') # Hide axis
plt.show()
DOG

modelA_proba
array([[0.8, 0.2]])
Conclusion:
In this experiment, we successfully implemented an image classification system based on the
Scale-Invariant Feature Transform (SIFT) to differentiate between images of cats and dogs.
SIFT was chosen for feature extraction due to its effectiveness in identifying key points that are
stable under various transformations, such as scaling, rotation, and changes in lighting
conditions. This robustness makes SIFT particularly suitable for real-world scenarios where
image conditions can vary significantly. After extracting SIFT features, we applied K-means
clustering to group these features into distinct clusters. This clustering process not only reduced
the dimensionality of the feature space but also helped uncover patterns within the data. By
creating a "visual vocabulary" from these feature clusters, each image could be represented as a
histogram of cluster occurrences, effectively converting it into a feature vector that could be
used for classification with K-Nearest Neighbors (KNN). 10 clusters were used.
KNN, a simple yet powerful classifier, was then used to classify the images into two categories:
cats and dogs. The combined approach of SIFT for feature extraction, K-means for clustering,
and KNN for classification yielded promising results in distinguishing between cats and dogs,
demonstrating a high level of accuracy. This highlights the effectiveness of SIFT in feature
extraction and KNN in classification tasks within computer vision. However, there are some
limitations to this approach. For instance, the performance of K-means clustering depends on
the chosen number of clusters, and KNN requires careful tuning of the parameters to avoid
overfitting or underfitting. Moreover, the computational time required for SIFT feature
235
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

extraction and K-means clustering can be considerable, especially when dealing with large
ET3221: Computer
datasets. This Vision
indicates that while the SIFT-K-means-KNN pipeline is effective, it may not be
ideal for real-time applications unless optimizations or parallel processing techniques are
Name
applied.of the student: Ashish Rodi Roll No. 9

Div: A Batch: 1
Date of performance: 10-09-2024

Experiment No. 10
Problem Statement:
Human detection using HOG
AIM:
1. Resize the image to 64x128, calculate HOG features with smaller 8x8 cells, create the
feature vector, and use KNN for classification.
2. Resize the image to 64x128, compute HOG features using larger 32x32 cells, generate
the feature vector, and classify with KNN.
3. Blur the 64x128 image with Gaussian filtering, apply Prewitt X and Y edge detection,
extract HOG features, and classify using KNN.
4. Detect blobs in the 64x128 image using DoG, compute HOG features, create the feature
vector, and classify using KNN.

Objective(s) of Experiment:
To implement human detection using HOG feature descriptors and understand how HOG helps
in object detection by capturing shape and appearance information.
Introduction:
Human detection is a fundamental task in computer vision, useful in a variety of applications
such as surveillance, autonomous driving, and human-computer interaction. One of the most
effective methods for human detection is the Histogram of Oriented Gradients (HOG)
descriptor, which extracts features based on the gradients of an image's pixel intensities.
HOG works by dividing an image into small regions called cells, calculating the gradient
orientation for each pixel, and constructing histograms of gradient directions within each cell.
These histograms capture the shape and appearance of objects by emphasizing the edges and
contours in the image. The HOG feature descriptor, when combined with a classifier such as a
support vector machine (SVM), can be used to effectively detect objects, particularly humans,
in various images.
HOG is especially useful in human detection because it is robust to variations in lighting and
pose, which are common challenges in real-world environments. By detecting edges and
gradients, HOG helps in capturing the distinctive outline of humans, making it ideal for object
detection tasks like pedestrian recognition.
Flowchart:

236
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Code & Results:


import numpy as np
import pandas as pd
import os
import csv
import cv2
from matplotlib import pyplot as plt
import joblib
from sklearn import preprocessing
from skimage import exposure
from skimage.filters import sobel
from skimage.measure import shannon_entropy
from skimage.feature import hog
from sklearn.neighbors import KNeighborsClassifier
import warnings

237
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

warnings.filterwarnings('ignore')
from sklearn.cluster import KMeans
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from sklearn.tree import DecisionTreeClassifier
import sklearn.metrics as metrics
from sklearn.metrics import accuracy_score
input0 = 'C:/Users/Ashish/Desktop/CVFolder/'
temp = ['1', '0']

Case 1: Cell Size = 8x8


# Check if the base directory exists
if not os.path.exists(input0):
print(f"Error: The specified directory {input0} does not exist.")
else:
# Loop through each class/folder
for i in temp:
folder_path = os.path.join(input0, i)
if not os.path.exists(folder_path):
print(f"Warning: The subfolder {folder_path} does not exist.")
continue

count = 0
# Check if the output directory exists; if not, create it
output_dir = './Datasets/HOG/'
if not os.path.exists(output_dir):
os.makedirs(output_dir)

# Prepare the CSV file path


csv_file_path = output_dir + 'HOG_' + i + '.csv'

# Loop through files in the folder


for filename in os.listdir(folder_path):
img_path = os.path.join(folder_path, filename)

# Check if the file is an image by extension


if not filename.lower().endswith(('. ', '.jpg', '.jpeg', '.bmp')):
print(f"Skipping non-image file: {filename}")
continue

# Read the image


img = cv2.imread(img_path)

# Verify if image is read correctly


if img is None:
print(f"Error: Could not read image {img_path}. Skipping this file.")
continue

# Convert to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

238
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

# Preprocessing: Resize to 64x128 for HOG descriptor


gray = cv2.resize(gray, (64, 128)) # Resize to 64x128 for consistency
# Perform HOG feature extraction with visualization
fd, hog_image = hog(
gray,
orientations=9,
pixels_per_cell=(8, 8),
cells_per_block=(2, 2),
visualize=True,
block_norm='L2-Hys'
)

# Normalize the HOG image for visualization properly


hog_image_rescaled = exposure.rescale_intensity(hog_image, out_range=(0, 255))
hog_image_uint8 = np.array(hog_image_rescaled, dtype=np.uint8) # Convert to uint8

# Convert HOG features into a DataFrame and transpose to have one row of features
out = pd.DataFrame(fd).T
print("Descriptor shape ", i, count, " : ", out.shape)

# Append to the CSV file


# Write header only if the file does not exist (first write)
out.to_csv(csv_file_path, mode='a', header=not os.path.isfile(csv_file_path),
index=False)

count += 1
if count == 50: # Limit to first 50 images
break

print(i + ": " + str(count))


Skipping non-image file: .ipynb_checkpoints
Descriptor shape 1 0 : (1, 3780)
Descriptor shape 1 1 : (1, 3780)
Descriptor shape 1 2 : (1, 3780)
Descriptor shape 1 3 : (1, 3780)
Descriptor shape 1 4 : (1, 3780)
Descriptor shape 1 5 : (1, 3780)
Descriptor shape 1 6 : (1, 3780)
Descriptor shape 1 7 : (1, 3780)
Descriptor shape 1 8 : (1, 3780)
1: 50
Descriptor shape 0 0 : (1, 3780)
Descriptor shape 0 1 : (1, 3780)
Descriptor shape 0 2 : (1, 3780)
Descriptor shape 0 3 : (1, 3780)
Descriptor shape 0 4 : (1, 3780)
Descriptor shape 0 5 : (1, 3780)
Descriptor shape 0 6 : (1, 3780)
Descriptor shape 0 7 : (1, 3780)
Descriptor shape 0 8 : (1, 3780)
239
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Descriptor shape 0 9 : (1, 3780)


Descriptor shape 0 10 : (1, 3780)

0: 50
data1 = pd.read_csv('./Datasets/HOG/HOG_1.csv', dtype='float64')
data2 = pd.read_csv('./Datasets/HOG/HOG_0.csv', dtype='float64')
print(data1.head())
print(data2.head())
print(data1.dtypes)
print(data2.dtypes)
0 1 2 3 4 5 6 \
0 0.258659 0.002152 0.025606 0.041955 0.124544 0.101784 0.186634
1 0.340736 0.103812 0.197419 0.000000 0.105524 0.033571 0.024925
2 0.028492 0.021237 0.037175 0.137302 0.331552 0.220111 0.026863
3 0.210885 0.038038 0.034926 0.114505 0.155678 0.135960 0.264799
4 0.375628 0.041870 0.065953 0.043860 0.270782 0.099272 0.139203

7 8 9 ... 3770 3771 3772 3773 \


0 0.258659 0.258659 0.095286 ... 0.093786 0.240636 0.105009 0.011209
1 0.011392 0.205713 0.340736 ... 0.000000 0.017954 0.019092 0.317712
2 0.000000 0.000000 0.071231 ... 0.060913 0.084592 0.017056 0.048156
3 0.264799 0.264799 0.074146 ... 0.012341 0.320903 0.101567 0.123079
4 0.252840 0.375628 0.026235 ... 0.004584 0.006029 0.000000 0.002132

3774 3775 3776 3777 3778 3779


0 0.099954 0.216010 0.120552 0.252413 0.040891 0.252413
1 0.118116 0.317712 0.008766 0.006641 0.000000 0.012533
2 0.253353 0.262293 0.262293 0.206730 0.169373 0.139292
3 0.166233 0.101823 0.178816 0.269375 0.002068 0.195890
4 0.004767 0.692840 0.004069 0.000000 0.003371 0.002383

[5 rows x 3780 columns]


0 1 2 3 4 5 6 \
0 0.252501 0.138965 0.000000 0.010890 0.251280 0.001420 0.036374
1 0.209961 0.182977 0.184402 0.168116 0.246609 0.152641 0.108318
2 0.280912 0.280912 0.131114 0.255403 0.280912 0.273070 0.208281
3 0.343019 0.000000 0.017820 0.000000 0.016481 0.001681 0.000466
4 0.234167 0.000000 0.161417 0.170622 0.163207 0.228206 0.243317

7 8 9 ... 3770 3771 3772 3773 \


0 0.102355 0.045981 0.252501 ... 0.127708 0.236611 0.236611 0.069334
1 0.044560 0.079332 0.246609 ... 0.014996 0.218139 0.355656 0.355656
2 0.201418 0.010968 0.238570 ... 0.001921 0.079906 0.050814 0.431057
3 0.000000 0.144196 0.343019 ... 0.120578 0.256088 0.151402 0.137115
4 0.187824 0.051697 0.243317 ... 0.244654 0.301669 0.106215 0.080612

3774 3775 3776 3777 3778 3779


0 0.236611 0.136389 0.042045 0.157576 0.030355 0.197035
1 0.219246 0.327471 0.055668 0.026826 0.144523 0.101532
240
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

2 0.265800 0.431057 0.034328 0.431057 0.243252 0.100712


3 0.189604 0.171058 0.202539 0.059614 0.074129 0.161142
4 0.072528 0.114002 0.030042 0.181748 0.110993 0.166096

[5 rows x 3780 columns]


0 float64
1 float64
2 float64
3 float64
4 float64
...
3775 float64
3776 float64
3777 float64
3778 float64
3779 float64
Length: 3780, dtype: object
0 float64
1 float64
2 float64
3 float64
4 float64
...
3775 float64
3776 float64
3777 float64
3778 float64
3779 float64
Length: 3780, dtype: object
50 rows × 3780 columns
# Combine the datasets (assuming they are properly formatted)
X = pd.concat([data1, data2], axis=0).values
y = [0] * len(data1) + [1] * len(data2)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
X_train = np.array(X_train, dtype=np.float64)
y_train = np.array(y_train, dtype=np.int64)
dfp = pd.read_csv('./Datasets/HOG/HOG_0.csv')
dfn = pd.read_csv('./Datasets/HOG/HOG_1.csv')

df = pd.concat([dfp, dfn])
csv_data = df.to_csv('./Datasets/HOG/HOG_Final.csv')
X_train
array([[0.26877325, 0.26877325, 0.26877325, ..., 0.00161489, 0.08702827,
0.27348235],
[0.01933716, 0. , 0. , ..., 0.15141459, 0.17664369,
0.12057064],
[0.45790657, 0. , 0. , ..., 0.03672928, 0.03871606,
241
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

0. ],
...,
[0.24940433, 0.00123831, 0.00656413, ..., 0.07061544, 0.02864682,
0.05245014],
[0.29962926, 0.12220875, 0. , ..., 0.00584205, 0.01448974,
0.29104826],
[0.20996119, 0.18297707, 0.18440202, ..., 0.02682564, 0.14452345,
0.10153207]])
X_test
array([[0.28068686, 0. , 0.00928159, ..., 0.27659436, 0.10303338,
0.00557759],
[0.34301887, 0. , 0.01781956, ..., 0.05961435, 0.07412865,
0.16114186],
[0.26669595, 0.05213381, 0.03408786, ..., 0.10750376, 0.20026886,
0.23614985],
...,
[0.29126872, 0.07604202, 0.03359546, ..., 0.2470542 , 0.2470542 ,
0.22815049],
[0.2688807 , 0.0853415 , 0.13952609, ..., 0.03989037, 0.17349667,
0.13803452],
[0.14365397, 0.00127804, 0.0762536 , ..., 0.17424213, 0.10331287,
0.04870216]])
y_train
array([1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0,
0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 0,
1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1,
0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1], dtype=int64)
y_test
[1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0]
#KNN classifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import confusion_matrix, precision_score, recall_score, f1_score,
RocCurveDisplay
import joblib

model_knn = KNeighborsClassifier(n_neighbors = 13)


model_knn.fit(X_train, y_train)

joblib.dump(model_knn,"./Datasets/HOG/Trained_Models/modelA_knn")
y_pred3 = model_knn.predict(X_test)
print("KNN Classifier")
print("Train Accuracy:",model_knn.score(X_train, y_train))
print("Test Accuracy:",model_knn.score(X_test, y_test))
print("Precision Score: ",metrics.precision_score(y_test, y_pred3, pos_label='positive',
average='micro'))

242
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

print("Recall Score: ",metrics.recall_score(y_test, y_pred3, pos_label='positive',


average='micro'))
print("F1 Score: ",metrics.f1_score(y_test, y_pred3, pos_label='positive', average='micro'))
print("Confusion Matrix: ")
print(confusion_matrix(y_test, y_pred3,))

RocCurveDisplay.from_estimator(model_knn, X_test, y_test)


KNN Classifier
Train Accuracy: 0.4875
Test Accuracy: 0.6
Precision Score: 0.6
Recall Score: 0.6
F1 Score: 0.6
Confusion Matrix:
[[12 0]
[ 8 0]]

<sklearn.metrics._plot.roc_curve.RocCurveDisplay at 0x1e21d94f9b0>

Case 2: Cell Size = 32x32


# Check if the base directory exists
if not os.path.exists(input0):
print(f"Error: The specified directory {input0} does not exist.")
else:
# Loop through each class/folder
243
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

for i in temp:
folder_path = os.path.join(input0, i)
if not os.path.exists(folder_path):
print(f"Warning: The subfolder {folder_path} does not exist.")
continue

count = 0
# Check if the output directory exists; if not, create it
output_dir = './Datasets/HOG2/'
if not os.path.exists(output_dir):
os.makedirs(output_dir)

# Prepare the CSV file path


csv_file_path = output_dir + 'HOG_' + i + '.csv'

# Loop through files in the folder


for filename in os.listdir(folder_path):
img_path = os.path.join(folder_path, filename)

# Check if the file is an image by extension


if not filename.lower().endswith(('. ', '.jpg', '.jpeg', '.bmp')):
print(f"Skipping non-image file: {filename}")
continue

# Read the image


img = cv2.imread(img_path)

# Verify if image is read correctly


if img is None:
print(f"Error: Could not read image {img_path}. Skipping this file.")
continue

# Convert to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Preprocessing: Resize to 64x128 for HOG descriptor


gray = cv2.resize(gray, (64, 128)) # Resize to 64x128 for consistency

# Perform HOG feature extraction with visualization


fd, hog_image = hog(
gray,
orientations=9,
pixels_per_cell=(32, 32),
cells_per_block=(2, 2),
visualize=True,
block_norm='L2-Hys'
)

# Normalize the HOG image for visualization properly


hog_image_rescaled = exposure.rescale_intensity(hog_image, out_range=(0, 255))
hog_image_uint8 = np.array(hog_image_rescaled, dtype=np.uint8) # Convert to uint8

244
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

# Convert HOG features into a DataFrame and transpose to have one row of features
out = pd.DataFrame(fd).T
print("Descriptor shape ", i, count, " : ", out.shape)

# Append to the CSV file


# Write header only if the file does not exist (first write)
out.to_csv(csv_file_path, mode='a', header=not os.path.isfile(csv_file_path),
index=False)

count += 1
if count == 50: # Limit to first 50 images
break

print(i + ": " + str(count))


Skipping non-image file: .ipynb_checkpoints
Descriptor shape 1 0 : (1, 108)
Descriptor shape 1 1 : (1, 108)
Descriptor shape 1 2 : (1, 108)
Descriptor shape 1 3 : (1, 108)
Descriptor shape 1 4 : (1, 108)
Descriptor shape 1 5 : (1, 108)
Descriptor shape 1 6 : (1, 108)
Descriptor shape 1 7 : (1, 108)
Descriptor shape 1 8 : (1, 108)
Descriptor shape 1 9 : (1, 108)
Descriptor shape 1 10 : (1, 108)
Descriptor shape 1 11 : (1, 108)
Descriptor shape 1 12 : (1, 108)
Descriptor shape 1 13 : (1, 108)
Descriptor shape 1 14 : (1, 108)
Descriptor shape 1 15 : (1, 108)
Descriptor shape 1 16 : (1, 108)
Descriptor shape 1 17 : (1, 108)
Descriptor shape 1 18 : (1, 108)
1: 50
Descriptor shape 0 0 : (1, 108)
Descriptor shape 0 1 : (1, 108)
Descriptor shape 0 2 : (1, 108)
Descriptor shape 0 3 : (1, 108)
Descriptor shape 0 4 : (1, 108)
Descriptor shape 0 5 : (1, 108)
Descriptor shape 0 6 : (1, 108)
Descriptor shape 0 7 : (1, 108)
Descriptor shape 0 8 : (1, 108)
Descriptor shape 0 9 : (1, 108)
Descriptor shape 0 10 : (1, 108)
Descriptor shape 0 11 : (1, 108)
Descriptor shape 0 12 : (1, 108)
Descriptor shape 0 13 : (1, 108)
Descriptor shape 0 14 : (1, 108)
245
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Descriptor shape 0 15 : (1, 108)


Descriptor shape 0 16 : (1, 108)
Descriptor shape 0 17 : (1, 108)
...
Descriptor shape 0 49 : (1, 108)
0: 50
data1 = pd.read_csv('./Datasets/HOG2/HOG_1.csv', dtype='float64')
data2 = pd.read_csv('./Datasets/HOG2/HOG_0.csv', dtype='float64')
print(data1.head())
print(data2.head())
print(data1.dtypes)
print(data2.dtypes)
0 1 2 3 4 5 6 \
0 0.261789 0.086879 0.089688 0.083896 0.261789 0.183295 0.261789
1 0.253006 0.106585 0.130788 0.082970 0.107974 0.071530 0.154536
2 0.060853 0.036663 0.094146 0.148572 0.210548 0.098569 0.054269
3 0.237195 0.168937 0.125035 0.150397 0.198345 0.097061 0.102116
4 0.242801 0.210303 0.224686 0.076658 0.134743 0.092444 0.110020

7 8 9 ... 98 99 100 101 \


0 0.196347 0.261789 0.261789 ... 0.102470 0.271426 0.220798 0.214553
1 0.208535 0.253006 0.162248 ... 0.189783 0.156165 0.104905 0.236390
2 0.029844 0.025693 0.249427 ... 0.149893 0.121074 0.088459 0.088207
3 0.127950 0.237195 0.236508 ... 0.157376 0.116861 0.057845 0.049849
4 0.164330 0.242801 0.242801 ... 0.091345 0.078030 0.050401 0.065961

102 103 104 105 106 107


0 0.210691 0.230548 0.196161 0.271426 0.271426 0.271426
1 0.162929 0.236390 0.069595 0.082581 0.072553 0.137251
2 0.146579 0.223288 0.223288 0.210612 0.174146 0.111551
3 0.089246 0.269907 0.269907 0.105867 0.043717 0.069548
4 0.024955 0.347574 0.055364 0.043591 0.070767 0.106507

[5 rows x 108 columns]


0 1 2 3 4 5 6 \
0 0.203414 0.203414 0.171193 0.186965 0.172065 0.163954 0.202227
1 0.174619 0.206166 0.216175 0.211236 0.216175 0.142780 0.127156
2 0.065729 0.042476 0.041669 0.085543 0.243799 0.243799 0.125732
3 0.319126 0.089987 0.057685 0.063811 0.211358 0.101371 0.119560
4 0.244639 0.164047 0.224451 0.153590 0.126145 0.189402 0.127086

7 8 9 ... 98 99 100 101 \


0 0.168057 0.172076 0.194258 ... 0.183392 0.214220 0.214220 0.214220
1 0.136052 0.123100 0.185853 ... 0.093432 0.158521 0.090197 0.133443
2 0.074314 0.026329 0.078983 ... 0.160301 0.069578 0.017817 0.124033
3 0.099848 0.319126 0.319126 ... 0.220521 0.234298 0.153939 0.104708
4 0.104805 0.195488 0.244639 ... 0.260203 0.260203 0.119793 0.090066

102 103 104 105 106 107


246
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

0 0.214220 0.182113 0.133149 0.200685 0.169505 0.214220


1 0.064120 0.111002 0.088001 0.074641 0.079581 0.054050
2 0.178924 0.170715 0.080887 0.289237 0.074725 0.072655
3 0.106098 0.160073 0.181163 0.234298 0.234298 0.219193
4 0.034725 0.077695 0.047642 0.081363 0.088862 0.109379

[5 rows x 108 columns]


0 float64
1 float64
2 float64
3 float64
4 float64
...
103 float64
104 float64
105 float64
106 float64
107 float64
Length: 108, dtype: object
0 float64
1 float64
2 float64
3 float64
4 float64
...
103 float64
104 float64
105 float64
106 float64
107 float64
Length: 108, dtype: object
50 rows × 108 columns
# Combine the datasets (assuming they are properly formatted)
X = pd.concat([data1, data2], axis=0).values
y = [0] * len(data1) + [1] * len(data2)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
X_train = np.array(X_train, dtype=np.float64)
y_train = np.array(y_train, dtype=np.int64)
dfp = pd.read_csv('./Datasets/HOG2/HOG_0.csv')
dfn = pd.read_csv('./Datasets/HOG2/HOG_1.csv')

df = pd.concat([dfp, dfn])
csv_data = df.to_csv('./Datasets/HOG2/HOG_Final.csv')
df
100 rows × 108 columns
X_train

247
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

array([[0.2597646 , 0.12610688, 0.17333325, ..., 0.19854511, 0.26364033,


0.26385359],
[0.27435213, 0.07704692, 0.10510892, ..., 0.17228776, 0.19467387,
0.16776281],
[0.25736548, 0.06954309, 0.08722384, ..., 0.09945842, 0.10022852,
0.13009917],
...,
[0.24918604, 0.20480833, 0.14726205, ..., 0.25001668, 0.25001668,
0.25001668],
[0.26600369, 0.11450869, 0.06222764, ..., 0.08116553, 0.11651163,
0.24801353],
[0.17461871, 0.20616637, 0.21617461, ..., 0.07464117, 0.07958089,
0.05405022]])
X_test
array([[0.26528478, 0.16284352, 0.09045281, ..., 0.09872744, 0.09557058,
0.14624095],
[0.31912629, 0.08998673, 0.05768453, ..., 0.23429839, 0.23429839,
0.21919343],
[0.27108815, 0.18836408, 0.15725114, ..., 0.11503052, 0.06936545,
0.05088422],
...,
[0.08070902, 0.04778797, 0.03697618, ..., 0.1149623 , 0.10414497,
0.14371557],
[0.25021987, 0.13441359, 0.13345396, ..., 0.13802847, 0.18110194,
0.26175356],
[0.32206669, 0.03379257, 0.05101648, ..., 0.01392593, 0.02076163,
0.03532835]])
y_train
array([1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0,
0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 0,
1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1,
0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1], dtype=int64)
y_test
[1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0]
#KNN classifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import confusion_matrix, precision_score, recall_score, f1_score,
RocCurveDisplay
import joblib
model_knn = KNeighborsClassifier(n_neighbors = 13)
model_knn.fit(X_train, y_train)
joblib.dump(model_knn,"./Datasets/HOG2/Trained_Models/modelA_knn")
y_pred3 = model_knn.predict(X_test)
print("KNN Classifier")
print("Train Accuracy:",model_knn.score(X_train, y_train))
print("Test Accuracy:",model_knn.score(X_test, y_test))

248
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

print("Precision Score: ",metrics.precision_score(y_test, y_pred3, pos_label='positive',


average='micro'))
print("Recall Score: ",metrics.recall_score(y_test, y_pred3, pos_label='positive',
average='micro'))
print("F1 Score: ",metrics.f1_score(y_test, y_pred3, pos_label='positive', average='micro'))
print("Confusion Matrix: ")
print(confusion_matrix(y_test, y_pred3,))
RocCurveDisplay.from_estimator(model_knn, X_test, y_test)
KNN Classifier
Train Accuracy: 0.6875
Test Accuracy: 0.7
Precision Score: 0.7
Recall Score: 0.7
F1 Score: 0.7
Confusion Matrix:
[[9 3]
[3 5]]
<sklearn.metrics._plot.roc_curve.RocCurveDisplay at 0x1e27eeb7c50>

Case 3: Using Gaussian Blur and X and Y Prewitt Edge Detectors


import cv2
import numpy as np
import pandas as pd
import os
from skimage.feature import hog
249
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

from skimage import exposure


from scipy.ndimage import convolve

# Prewitt kernels
prewitt_x = np.array([
[-1, 0, 1],
[-1, 0, 1],
[-1, 0, 1]
])

prewitt_y = np.array([
[-1, -1, -1],
[0, 0, 0],
[1, 1, 1]
])

def apply_prewitt_edge_detection(image):
# Apply Prewitt operator
edges_x = convolve(image, prewitt_x)
edges_y = convolve(image, prewitt_y)
# Compute the magnitude of gradients
edges = np.sqrt(edges_x**2 + edges_y**2)
return edges

# Check if the base directory exists


if not os.path.exists(input0):
print(f"Error: The specified directory {input0} does not exist.")
else:
# Loop through each class/folder
for i in temp:
folder_path = os.path.join(input0, i)
if not os.path.exists(folder_path):
print(f"Warning: The subfolder {folder_path} does not exist.")
continue

count = 0
# Check if the output directory exists; if not, create it
output_dir = './Datasets/HOG3/'
if not os.path.exists(output_dir):
os.makedirs(output_dir)

# Prepare the CSV file path


csv_file_path = os.path.join(output_dir, f'HOG_{i}.csv')

# Loop through files in the folder


for filename in os.listdir(folder_path):
img_path = os.path.join(folder_path, filename)

# Check if the file is an image by extension


if not filename.lower().endswith(('. ', '.jpg', '.jpeg', '.bmp')):
print(f"Skipping non-image file: {filename}")

250
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

continue

# Read the image


img = cv2.imread(img_path)

# Verify if the image is read correctly


if img is None:
print(f"Error: Could not read image {img_path}. Skipping this file.")
continue

# Resize image to a standard size (64x128) to ensure consistent feature vector length
resized_img = cv2.resize(img, (64, 128))

# Convert to grayscale
gray = cv2.cvtColor(resized_img, cv2.COLOR_BGR2GRAY)

# Apply Gaussian blur


gray_blurred = cv2.GaussianBlur(gray, (5, 5), 0)

# Apply Prewitt edge detection


edges = apply_prewitt_edge_detection(gray_blurred)

# Perform HOG feature extraction with visualization


fd, hog_image = hog(
edges,
orientations=9,
pixels_per_cell=(8, 8),
cells_per_block=(2, 2),
visualize=True,
block_norm='L2-Hys'
)

# Normalize the HOG image for visualization


hog_image_rescaled = exposure.rescale_intensity(hog_image, out_range=(0, 255))
hog_image_uint8 = np.array(hog_image_rescaled, dtype=np.uint8) # Convert to uint8

# Convert HOG features into a DataFrame and transpose to have one row of features
out = pd.DataFrame(fd).T
print("Descriptor shape ", i, count, " : ", out.shape)

# Append to the CSV file


# Write header only if the file does not exist (first write)
out.to_csv(csv_file_path, mode='a', header=not os.path.isfile(csv_file_path),
index=False)

count += 1
if count == 50: # Limit to the first 50 images
break

print(f"{i}: {count}")

251
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Skipping non-image file: .ipynb_checkpoints


Descriptor shape 1 0 : (1, 3780)
Descriptor shape 1 1 : (1, 3780)
Descriptor shape 1 2 : (1, 3780)
Descriptor shape 1 3 : (1, 3780)
Descriptor shape 1 4 : (1, 3780)
Descriptor shape 1 5 : (1, 3780)
Descriptor shape 1 6 : (1, 3780)
Descriptor shape 1 7 : (1, 3780)
Descriptor shape 1 8 : (1, 3780)
1: 50
Descriptor shape 0 0 : (1, 3780)
Descriptor shape 0 1 : (1, 3780)
Descriptor shape 0 2 : (1, 3780)
Descriptor shape 0 3 : (1, 3780)
Descriptor shape 0 4 : (1, 3780)
Descriptor shape 0 5 : (1, 3780)
Descriptor shape 0 6 : (1, 3780)
Descriptor shape 0 7 : (1, 3780)
Descriptor shape 0 8 : (1, 3780)
Descriptor shape 0 9 : (1, 3780)
0: 50
data1 = pd.read_csv('./Datasets/HOG3/HOG_1.csv', dtype='float64')
data2 = pd.read_csv('./Datasets/HOG3/HOG_0.csv', dtype='float64')
print(data1.head())
print(data2.head())
print(data1.dtypes)
print(data2.dtypes)
0 1 2 3 4 5 6 \
0 0.231616 0.095326 0.030806 0.103815 0.195565 0.155590 0.134392
1 0.162942 0.160536 0.064792 0.222772 0.222772 0.093122 0.146576
2 0.205167 0.074233 0.128818 0.254769 0.254769 0.254769 0.115891
3 0.167163 0.225783 0.225783 0.111253 0.225783 0.000000 0.148787
4 0.248809 0.082480 0.066494 0.088787 0.140748 0.248809 0.248809

7 8 9 ... 3770 3771 3772 3773 \


0 0.231616 0.152020 0.231616 ... 0.228724 0.228724 0.084816 0.075686
1 0.111356 0.136510 0.222772 ... 0.107157 0.116197 0.138453 0.100130
2 0.088535 0.085044 0.060698 ... 0.049840 0.226648 0.116185 0.088748
3 0.105444 0.048986 0.210269 ... 0.086137 0.225647 0.176621 0.172361
4 0.232369 0.204149 0.159177 ... 0.095046 0.104550 0.115427 0.067599

3774 3775 3776 3777 3778 3779


0 0.140292 0.186345 0.216384 0.130097 0.191955 0.058225
1 0.059555 0.223747 0.124569 0.223747 0.182767 0.214685
2 0.018123 0.226648 0.086881 0.160649 0.226648 0.108644
3 0.225647 0.152547 0.104929 0.078454 0.081155 0.092829
4 0.142219 0.281541 0.144872 0.048355 0.037222 0.009070

252
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

[5 rows x 3780 columns]


0 1 2 3 4 5 6 \
0 0.225119 0.197257 0.103810 0.031438 0.225119 0.159402 0.225119
1 0.230570 0.119273 0.192716 0.230570 0.173409 0.062007 0.126305
2 0.247630 0.114578 0.119759 0.063616 0.247630 0.208943 0.206512
3 0.220631 0.129425 0.220631 0.053698 0.220631 0.000000 0.197854
4 0.251042 0.100214 0.169067 0.052826 0.251042 0.116574 0.202769

7 8 9 ... 3770 3771 3772 3773 \


0 0.183192 0.041323 0.225119 ... 0.029877 0.232452 0.152329 0.082750
1 0.132307 0.051724 0.230570 ... 0.075147 0.256334 0.114076 0.256334
2 0.139873 0.127275 0.247630 ... 0.154142 0.220352 0.171297 0.185029
3 0.087694 0.220631 0.220631 ... 0.012577 0.214519 0.129710 0.076503
4 0.029328 0.119856 0.215476 ... 0.214605 0.248754 0.055264 0.054280

3774 3775 3776 3777 3778 3779


0 0.175260 0.232452 0.151420 0.072181 0.168276 0.084597
1 0.256334 0.256334 0.124170 0.176283 0.166529 0.110201
2 0.202425 0.220352 0.122677 0.035593 0.197168 0.051948
3 0.161413 0.214519 0.103177 0.130720 0.170248 0.130972
4 0.031071 0.150776 0.074881 0.070272 0.248754 0.157586

[5 rows x 3780 columns]


0 float64
1 float64
2 float64
3 float64
4 float64
...
3775 float64
3776 float64
3777 float64
3778 float64
3779 float64
Length: 3780, dtype: object
0 float64
1 float64
2 float64
3 float64
4 float64
...
3775 float64
3776 float64
3777 float64
3778 float64
3779 float64
Length: 3780, dtype: object
50 rows × 3780 columns
# Combine the datasets (assuming they are properly formatted)
X = pd.concat([data1, data2], axis=0).values
253
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

y = [0] * len(data1) + [1] * len(data2)


X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
y_train = np.array(y_train, dtype=np.int64)
X_train = np.array(X_train, dtype=np.float64)
y_train = np.array(y_train, dtype=np.int64)
dfp = pd.read_csv('./Datasets/HOG3/HOG_0.csv')
dfn = pd.read_csv('./Datasets/HOG3/HOG_1.csv')

df = pd.concat([dfp, dfn])
csv_data = df.to_csv('./Datasets/HOG3/HOG_Final.csv')
df
X_train
array([[0.2209127 , 0.13149631, 0.1310974 , ..., 0.26443285, 0.26443285,
0.26443285],
[0.24531284, 0.12817822, 0.13990779, ..., 0.05215404, 0.13892704,
0.07834952],
[0.22650987, 0.15092507, 0.13507013, ..., 0.04238259, 0.04488483,
0.02453033],
...,
[0.2155882 , 0.1684096 , 0.08829851, ..., 0.10465179, 0.03230634,
0.14772919],
[0.23780212, 0.03834988, 0.18296075, ..., 0.14817041, 0.16469105,
0.1732593 ],
[0.23056962, 0.1192733 , 0.19271609, ..., 0.1762829 , 0.16652858,
0.11020062]])
X_test
array([[0.22460075, 0.20807512, 0.12964375, ..., 0.17852044, 0.18853596,
0.07264616],
[0.22063051, 0.12942505, 0.22063051, ..., 0.13071956, 0.17024796,
0.13097218],
[0.20563139, 0.12902862, 0.06734444, ..., 0.21048011, 0.16722286,
0.09023312],
...,
[0.23037201, 0.1716012 , 0.22422652, ..., 0.15069234, 0.18747883,
0.08020848],
[0.21217868, 0.07423544, 0.18174443, ..., 0.18857197, 0.09719393,
0.08561356],
[0.1990332 , 0.19094804, 0.15076064, ..., 0.09255979, 0.06207166,
0.13051169]])
y_train
array([1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0,
0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 0,
1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1,
0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1], dtype=int64)

254
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

y_test
[1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0]
#KNN classifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import confusion_matrix, precision_score, recall_score, f1_score,
RocCurveDisplay
import joblib

model_knn = KNeighborsClassifier(n_neighbors = 13)


model_knn.fit(X_train, y_train)

joblib.dump(model_knn,"./Datasets/HOG3/Trained_Models/modelA_knn")
y_pred3 = model_knn.predict(X_test)
print("KNN Classifier")
print("Train Accuracy:",model_knn.score(X_train, y_train))
print("Test Accuracy:",model_knn.score(X_test, y_test))
print("Precision Score: ",metrics.precision_score(y_test, y_pred3, pos_label='positive',
average='micro'))
print("Recall Score: ",metrics.recall_score(y_test, y_pred3, pos_label='positive',
average='micro'))
print("F1 Score: ",metrics.f1_score(y_test, y_pred3, pos_label='positive', average='micro'))
print("Confusion Matrix: ")
print(confusion_matrix(y_test, y_pred3,))

RocCurveDisplay.from_estimator(model_knn, X_test, y_test)


KNN Classifier
Train Accuracy: 0.5375
Test Accuracy: 0.6
Precision Score: 0.6
Recall Score: 0.6
F1 Score: 0.6
Confusion Matrix:
[[10 2]
[ 6 2]]

<sklearn.metrics._plot.roc_curve.RocCurveDisplay at 0x1911ba16870>

255
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Case 4: Blob Detection Using DOG


import cv2
import numpy as np
import pandas as pd
import os
from skimage.feature import blob_dog, hog
from skimage import exposure
import matplotlib.pyplot as plt

# Function to apply blob detection


def apply_blob_detection(image):
blobs = blob_dog(image, max_sigma=30, threshold=0.01)
return blobs

# Check if the base directory exists


if not os.path.exists(input0):
print(f"Error: The specified directory {input0} does not exist.")
else:
# Loop through each class/folder
for i in temp:
folder_path = os.path.join(input0, i)
if not os.path.exists(folder_path):
print(f"Warning: The subfolder {folder_path} does not exist.")
continue

256
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

count = 0
# Check if the output directory exists; if not, create it
output_dir = './Datasets/HOG4/'

if not os.path.exists(output_dir):
os.makedirs(output_dir)

# Prepare the CSV file path


csv_file_path = output_dir + 'HOG_' + i + '.csv'

# Loop through files in the folder


for filename in os.listdir(folder_path):
img_path = os.path.join(folder_path, filename)

# Check if the file is an image by extension


if not filename.lower().endswith(('. ', '.jpg', '.jpeg', '.bmp')):
print(f"Skipping non-image file: {filename}")
continue

# Read the image


img = cv2.imread(img_path)

# Verify if image is read correctly


if img is None:
print(f"Error: Could not read image {img_path}. Skipping this file.")
continue

# Convert to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Preprocessing: Resize to 64x128


gray = cv2.resize(gray, (64, 128))

# Apply blob detection


blobs = apply_blob_detection(gray)

# Create a mask for the detected blobs


mask = np.zeros_like(gray, dtype=np.uint8)
for blob in blobs:
y, x, area = blob
radius = int(area * np.sqrt(2)) # radius for the circle
cv2.circle(mask, (int(x), int(y)), radius, (255), thickness=-1)

# Apply mask to the grayscale image


masked_image = cv2.bitwise_and(gray, gray, mask=mask)

# Perform HOG feature extraction


fd, hog_image = hog(
masked_image,
orientations=9,
pixels_per_cell=(8, 8),

257
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

cells_per_block=(2, 2),
visualize=True,
block_norm='L2-Hys'
)

# Normalize the HOG image for visualization


hog_image_rescaled = exposure.rescale_intensity(hog_image, out_range=(0, 255))
hog_image_uint8 = np.array(hog_image_rescaled, dtype=np.uint8) # Convert to uint8

# Convert HOG features into a DataFrame and transpose to have one row of features
out = pd.DataFrame(fd).T
print("Descriptor shape ", i, count, " : ", out.shape)

# Append to the CSV file


out.to_csv(csv_file_path, mode='a', header=not os.path.isfile(csv_file_path),
index=False)

count += 1
if count == 50: # Limit to first 50 images
break

print(i + ": " + str(count))


Skipping non-image file: .ipynb_checkpoints
Descriptor shape 1 0 : (1, 3780)
Descriptor shape 1 1 : (1, 3780)
Descriptor shape 1 2 : (1, 3780)
Descriptor shape 1 3 : (1, 3780)
Descriptor shape 1 4 : (1, 3780)
Descriptor shape 1 5 : (1, 3780)
Descriptor shape 1 6 : (1, 3780)
Descriptor shape 1 7 : (1, 3780)
Descriptor shape 1 8 : (1, 3780)
Descriptor shape 1 9 : (1, 3780)
Descriptor shape 1 10 : (1, 3780)
Descriptor shape 1 11 : (1, 3780)
Descriptor shape 1 12 : (1, 3780)
Descriptor shape 1 13 : (1, 3780)
Descriptor shape 1 14 : (1, 3780)
Descriptor shape 1 15 : (1, 3780)
Descriptor shape 1 16 : (1, 3780)
Descriptor shape 1 17 : (1, 3780)
Descriptor shape 1 18 : (1, 3780)
Descriptor shape 1 19 : (1, 3780)
Descriptor shape 1 20 : (1, 3780)
Descriptor shape 1 21 : (1, 3780)
Descriptor shape 1 22 : (1, 3780)
Descriptor shape 1 23 : (1, 3780)
Descriptor shape 1 24 : (1, 3780)
Descriptor shape 1 25 : (1, 3780)
Descriptor shape 1 26 : (1, 3780)
Descriptor shape 1 27 : (1, 3780)
258
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Descriptor shape 1 28 : (1, 3780)


Descriptor shape 1 29 : (1, 3780)
Descriptor shape 1 30 : (1, 3780)
Descriptor shape 1 31 : (1, 3780)
Descriptor shape 1 32 : (1, 3780)
Descriptor shape 1 33 : (1, 3780)
Descriptor shape 1 34 : (1, 3780)
Descriptor shape 1 35 : (1, 3780)
Descriptor shape 1 36 : (1, 3780)
Descriptor shape 1 37 : (1, 3780)
Descriptor shape 1 38 : (1, 3780)
Descriptor shape 1 39 : (1, 3780)
Descriptor shape 1 40 : (1, 3780)
Descriptor shape 1 41 : (1, 3780)
Descriptor shape 1 42 : (1, 3780)
Descriptor shape 1 43 : (1, 3780)
Descriptor shape 1 44 : (1, 3780)
Descriptor shape 1 45 : (1, 3780)
Descriptor shape 1 46 : (1, 3780)
Descriptor shape 1 47 : (1, 3780)
Descriptor shape 1 48 : (1, 3780)
Descriptor shape 1 49 : (1, 3780)
1: 50
Descriptor shape 0 0 : (1, 3780)
Descriptor shape 0 1 : (1, 3780)
Descriptor shape 0 2 : (1, 3780)
Descriptor shape 0 3 : (1, 3780)
Descriptor shape 0 4 : (1, 3780)
Descriptor shape 0 5 : (1, 3780)
Descriptor shape 0 6 : (1, 3780)
Descriptor shape 0 7 : (1, 3780)
Descriptor shape 0 8 : (1, 3780)
Descriptor shape 0 9 : (1, 3780)
Descriptor shape 0 10 : (1, 3780)
Descriptor shape 0 11 : (1, 3780)
Descriptor shape 0 12 : (1, 3780)
Descriptor shape 0 13 : (1, 3780)
Descriptor shape 0 14 : (1, 3780)
Descriptor shape 0 15 : (1, 3780)
Descriptor shape 0 16 : (1, 3780)
Descriptor shape 0 17 : (1, 3780)
Descriptor shape 0 18 : (1, 3780)
Descriptor shape 0 19 : (1, 3780)
Descriptor shape 0 20 : (1, 3780)
Descriptor shape 0 21 : (1, 3780)
Descriptor shape 0 22 : (1, 3780)
Descriptor shape 0 23 : (1, 3780)
Descriptor shape 0 24 : (1, 3780)
Descriptor shape 0 25 : (1, 3780)
Descriptor shape 0 26 : (1, 3780)
Descriptor shape 0 27 : (1, 3780)

259
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Descriptor shape 0 28 : (1, 3780)


Descriptor shape 0 29 : (1, 3780)
Descriptor shape 0 30 : (1, 3780)
Descriptor shape 0 31 : (1, 3780)
Descriptor shape 0 32 : (1, 3780)
Descriptor shape 0 33 : (1, 3780)
Descriptor shape 0 34 : (1, 3780)
Descriptor shape 0 35 : (1, 3780)
Descriptor shape 0 36 : (1, 3780)
Descriptor shape 0 37 : (1, 3780)
Descriptor shape 0 38 : (1, 3780)
Descriptor shape 0 39 : (1, 3780)
Descriptor shape 0 40 : (1, 3780)
Descriptor shape 0 41 : (1, 3780)
Descriptor shape 0 42 : (1, 3780)
Descriptor shape 0 43 : (1, 3780)
Descriptor shape 0 44 : (1, 3780)
Descriptor shape 0 45 : (1, 3780)
Descriptor shape 0 46 : (1, 3780)
Descriptor shape 0 47 : (1, 3780)
Descriptor shape 0 48 : (1, 3780)
Descriptor shape 0 49 : (1, 3780)
0: 50
data1 = pd.read_csv('./Datasets/HOG4/HOG_1.csv', dtype='float64')
data2 = pd.read_csv('./Datasets/HOG4/HOG_0.csv', dtype='float64')
print(data1.head())
print(data2.head())
print(data1.dtypes)
print(data2.dtypes)
0 1 2 3 4 5 6 \
0 0.295319 0.029208 0.181941 0.000000 0.148984 0.003137 0.172733
1 0.340736 0.103812 0.197419 0.000000 0.105524 0.033571 0.024925
2 0.028492 0.021237 0.037175 0.137302 0.331552 0.220111 0.026863
3 0.302692 0.000000 0.008208 0.007298 0.057953 0.008212 0.302692
4 0.375628 0.041870 0.065953 0.043860 0.270782 0.099272 0.139203

7 8 9 ... 3770 3771 3772 3773 \


0 0.000000 0.063651 0.295319 ... 0.050393 0.257320 0.068155 0.257320
1 0.011392 0.205713 0.340736 ... 0.092410 0.312364 0.015723 0.312364
2 0.000000 0.000000 0.071231 ... 0.116837 0.173610 0.000000 0.210875
3 0.004227 0.275092 0.030781 ... 0.169827 0.295333 0.000000 0.203723
4 0.252840 0.375628 0.026235 ... 0.000000 0.000000 0.000000 0.000000

3774 3775 3776 3777 3778 3779


0 0.000000 0.257320 0.000000 0.183671 0.254292 0.000000
1 0.119291 0.293841 0.101617 0.312364 0.000000 0.184091
2 0.000000 0.303419 0.000000 0.150702 0.068591 0.000000
3 0.000000 0.124789 0.000000 0.108132 0.000000 0.000000
4 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
260
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

[5 rows x 3780 columns]


0 1 2 3 4 5 6 \
0 0.279837 0.052656 0.121025 0.000000 0.279837 0.000000 0.122085
1 0.271961 0.000000 0.271961 0.040502 0.252791 0.009549 0.005112
2 0.318966 0.000000 0.318966 0.000000 0.318966 0.003163 0.318966
3 0.343019 0.000000 0.017820 0.000000 0.016481 0.001681 0.000466
4 0.234167 0.000000 0.161417 0.170622 0.163207 0.228206 0.243317

7 8 9 ... 3770 3771 3772 3773 \


0 0.052656 0.000000 0.279837 ... 0.074357 0.278810 0.087550 0.212101
1 0.090859 0.058156 0.271961 ... 0.000000 0.337637 0.000000 0.337637
2 0.000000 0.000000 0.318966 ... 0.001921 0.079906 0.050814 0.431057
3 0.000000 0.144196 0.343019 ... 0.081566 0.284757 0.210855 0.232515
4 0.187824 0.051697 0.243317 ... 0.000000 0.000000 0.000000 0.000000

3774 3775 3776 3777 3778 3779


0 0.036638 0.278810 0.000000 0.097890 0.200632 0.000000
1 0.000000 0.337637 0.000000 0.337637 0.000000 0.000000
2 0.265800 0.431057 0.034328 0.431057 0.243252 0.100712
3 0.000000 0.284757 0.000000 0.078510 0.155984 0.025535
4 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000

[5 rows x 3780 columns]


0 float64
1 float64
2 float64
3 float64
4 float64
...
3775 float64
3776 float64
3777 float64
3778 float64
3779 float64
Length: 3780, dtype: object
0 float64
1 float64
2 float64
3 float64
4 float64
...
3775 float64
3776 float64
3777 float64
3778 float64
3779 float64
Length: 3780, dtype: object
data1
100 rows × 3780 columns
261
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

X_train
array([[0.31940152, 0. , 0.0461378 , ..., 0.24152244, 0. ,
0.13396017],
[0.01210291, 0. , 0. , ..., 0.27773573, 0.06211159,
0.09996591],
[0. , 0. , 0. , ..., 0.03672928, 0.03871606,
0. ],
...,
[0.28572148, 0. , 0.16910751, ..., 0.20302401, 0.00693289,
0. ],
[0.18326481, 0.05943829, 0. , ..., 0.23846514, 0. ,
0.04399283],
[0.27196109, 0. , 0.27196109, ..., 0.33763659, 0. ,
0. ]])
X_test
array([[0.26926665, 0.10292101, 0.12367615, ..., 0.05209534, 0. ,
0. ],
[0.34301887, 0. , 0.01781956, ..., 0.07851 , 0.15598416,
0.02553455],
[0.26669595, 0.05213381, 0.03408786, ..., 0.23440643, 0. ,
0. ],
...,
[0.299901 , 0. , 0.19373461, ..., 0.19731137, 0.2214648 ,
0.25376244],
[0.26524393, 0. , 0.26524393, ..., 0.07753482, 0.11683825,
0.06237846],
[0.14365397, 0.00127804, 0.0762536 , ..., 0. , 0. ,
0. ]])
y_train
array([1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0,
0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 0,
1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1,
0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1], dtype=int64)
y_test
[1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0]
#KNN classifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import confusion_matrix, precision_score, recall_score, f1_score,
RocCurveDisplay
import joblib

model_knn = KNeighborsClassifier(n_neighbors = 13)


model_knn.fit(X_train, y_train)

joblib.dump(model_knn,"./Datasets/HOG4/Trained_Models/modelA_knn")
y_pred3 = model_knn.predict(X_test)
262
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

print("KNN Classifier")
print("Train Accuracy:",model_knn.score(X_train, y_train))
print("Test Accuracy:",model_knn.score(X_test, y_test))
print("Precision Score: ",metrics.precision_score(y_test, y_pred3, pos_label='positive',
average='micro'))
print("Recall Score: ",metrics.recall_score(y_test, y_pred3, pos_label='positive',
average='micro'))
print("F1 Score: ",metrics.f1_score(y_test, y_pred3, pos_label='positive', average='micro'))
print("Confusion Matrix: ")
print(confusion_matrix(y_test, y_pred3,))

RocCurveDisplay.from_estimator(model_knn, X_test, y_test)


KNN Classifier
Train Accuracy: 0.625
Test Accuracy: 0.5
Precision Score: 0.5
Recall Score: 0.5
F1 Score: 0.5
Confusion Matrix:
[[8 4]
[6 2]]

<sklearn.metrics._plot.roc_curve.RocCurveDisplay at 0x1913feb92e0>

263
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Result
# Define metrics for each approach
metrics = {
'Approach 1': {'Accuracy': 0.6, 'Precision': 0.6, 'Recall': 0.6, 'F1-Score': 0.6},
'Approach 2': {'Accuracy': 0.7, 'Precision': 0.7, 'Recall': 0.7, 'F1-Score': 0.7},
'Approach 3': {'Accuracy': 0.6, 'Precision': 0.6, 'Recall': 0.6, 'F1-Score': 0.6},
'Approach 4': {'Accuracy': 0.5, 'Precision': 0.5, 'Recall': 0.5, 'F1-Score': 0.5}
}
# Convert to DataFrame
df_metrics = pd.DataFrame(metrics).T
print(df_metrics)

# Plot the metrics for comparison


df_metrics.plot(kind='bar', figsize=(10, 6))
plt.title('Comparison of Classification Metrics for Different Approaches')
plt.ylabel('Score')
plt.xlabel('Approach')
plt.legend(loc='best')
plt.show()
Accuracy Precision Recall F1-Score
Approach 1 0.6 0.6 0.6 0.6
Approach 2 0.7 0.7 0.7 0.7
Approach 3 0.6 0.6 0.6 0.6
Approach 4 0.5 0.5 0.5 0.5

264
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Testing
# Define paths for model and image
img_path = './0/17. '
model_path = './Datasets/HOG2/Trained_Models/modelA_knn'

# Function to extract HOG features from an image


def extract_hog_features(image_path):
# Read the image
img = cv2.imread(image_path)
if img is None:
raise FileNotFoundError(f"Image file not found at {image_path}. Please check the path.")

# Convert to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Resize to 64x128 for HOG descriptor consistency


gray_resized = cv2.resize(gray, (64, 128))

# Compute HOG features


fd, hog_image = hog(
gray_resized,
orientations=9,
pixels_per_cell=(32, 32),
cells_per_block=(2, 2),
visualize=True,
block_norm='L2-Hys'
)

# Normalize the HOG image for visualization


hog_image_rescaled = exposure.rescale_intensity(hog_image, out_range=(0, 255))
hog_image_uint8 = np.array(hog_image_rescaled, dtype=np.uint8)

return fd, hog_image_uint8

# Check if the model file exists


if os.path.exists(model_path):
# Load the model
modelA = joblib.load(model_path)
else:
raise FileNotFoundError(f"Model file not found at {model_path}. Please check the path.")

# Check if the image file exists


if os.path.exists(img_path):
# Extract HOG features
features, hog_image = extract_hog_features(img_path)

# Predict using the loaded model


modelA_pred = modelA.predict([features]) # Note: features need to be passed as a list
modelA_proba = modelA.predict_proba([features])

# Output the prediction


265
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

print("Human" if modelA_pred[0] == 1 else "Not Human")


else:
raise FileNotFoundError(f"Image file not found at {img_path}. Please check the path.")

# Read and display the image


img = cv2.imread(img_path)
if img is None:
raise FileNotFoundError(f"Image file not found at {img_path}. Please check the path.")

# Convert from BGR to RGB for correct plotting


img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

# Display the image and HOG visualization


plt.figure(figsize=(10, 5))

# Display original image


plt.subplot(1, 2, 1)
plt.imshow(img_rgb)
plt.title('Original Image')
plt.axis('off')

# Display HOG image


plt.subplot(1, 2, 2)
plt.imshow(hog_image, cmap='gray')
plt.title('HOG Visualization')
plt.axis('off')

plt.tight_layout()
plt.show()
Not Human

266
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

# Define paths for model and image


img_path = './0/18. '
model_path = './Datasets/HOG2/Trained_Models/modelA_knn'

# Function to extract HOG features from an image


def extract_hog_features(image_path):
# Read the image
img = cv2.imread(image_path)
if img is None:
raise FileNotFoundError(f"Image file not found at {image_path}. Please check the path.")

# Convert to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Resize to 64x128 for HOG descriptor consistency


gray_resized = cv2.resize(gray, (64, 128))

# Compute HOG features


fd, hog_image = hog(
gray_resized,
orientations=9,
pixels_per_cell=(32, 32),
cells_per_block=(2, 2),
visualize=True,
block_norm='L2-Hys'
)

# Normalize the HOG image for visualization


hog_image_rescaled = exposure.rescale_intensity(hog_image, out_range=(0, 255))
hog_image_uint8 = np.array(hog_image_rescaled, dtype=np.uint8)

return fd, hog_image_uint8

# Check if the model file exists


if os.path.exists(model_path):
# Load the model
modelA = joblib.load(model_path)
else:
raise FileNotFoundError(f"Model file not found at {model_path}. Please check the path.")

# Check if the image file exists


if os.path.exists(img_path):
# Extract HOG features
features, hog_image = extract_hog_features(img_path)

# Predict using the loaded model


modelA_pred = modelA.predict([features]) # Note: features need to be passed as a list
modelA_proba = modelA.predict_proba([features])

# Output the prediction


print("Human" if modelA_pred[0] == 1 else "Not Human")

267
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

else:
raise FileNotFoundError(f"Image file not found at {img_path}. Please check the path.")

# Read and display the image


img = cv2.imread(img_path)
if img is None:
raise FileNotFoundError(f"Image file not found at {img_path}. Please check the path.")

# Convert from BGR to RGB for correct plotting


img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

# Display the image and HOG visualization


plt.figure(figsize=(10, 5))

# Display original image


plt.subplot(1, 2, 1)
plt.imshow(img_rgb)
plt.title('Original Image')
plt.axis('off')

# Display HOG image


plt.subplot(1, 2, 2)
plt.imshow(hog_image, cmap='gray')
plt.title('HOG Visualization')
plt.axis('off')

plt.tight_layout()
plt.show()
Not Human

268
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

# Define paths for model and image


img_path = './1/0. '
model_path = './Datasets/HOG2/Trained_Models/modelA_knn'

# Function to extract HOG features from an image


def extract_hog_features(image_path):
# Read the image
img = cv2.imread(image_path)
if img is None:
raise FileNotFoundError(f"Image file not found at {image_path}. Please check the path.")

# Convert to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Resize to 64x128 for HOG descriptor consistency


gray_resized = cv2.resize(gray, (64, 128))

# Compute HOG features


fd, hog_image = hog(
gray_resized,
orientations=9,
pixels_per_cell=(32, 32),
cells_per_block=(2, 2),
visualize=True,
block_norm='L2-Hys'
)

# Normalize the HOG image for visualization


hog_image_rescaled = exposure.rescale_intensity(hog_image, out_range=(0, 255))
hog_image_uint8 = np.array(hog_image_rescaled, dtype=np.uint8)

return fd, hog_image_uint8

# Check if the model file exists


if os.path.exists(model_path):
# Load the model
modelA = joblib.load(model_path)
else:
raise FileNotFoundError(f"Model file not found at {model_path}. Please check the path.")

# Check if the image file exists


if os.path.exists(img_path):
# Extract HOG features
features, hog_image = extract_hog_features(img_path)

# Predict using the loaded model


modelA_pred = modelA.predict([features]) # Note: features need to be passed as a list
modelA_proba = modelA.predict_proba([features])

# Output the prediction


print("Human" if modelA_pred[0] == 1 else "Not Human")

269
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

else:
raise FileNotFoundError(f"Image file not found at {img_path}. Please check the path.")

# Read and display the image


img = cv2.imread(img_path)
if img is None:
raise FileNotFoundError(f"Image file not found at {img_path}. Please check the path.")

# Convert from BGR to RGB for correct plotting


img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

# Display the image and HOG visualization


plt.figure(figsize=(10, 5))

# Display original image


plt.subplot(1, 2, 1)
plt.imshow(img_rgb)
plt.title('Original Image')
plt.axis('off')

# Display HOG image


plt.subplot(1, 2, 2)
plt.imshow(hog_image, cmap='gray')
plt.title('HOG Visualization')
plt.axis('off')

plt.tight_layout()
plt.show()
Not Human

270
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

# Define paths for model and image


img_path = './1/7.png '
model_path = './Datasets/HOG3/Trained_Models/modelA_knn'

# Function to extract HOG features from an image


def extract_hog_features(image_path):
# Read the image
img = cv2.imread(image_path)
if img is None:
raise FileNotFoundError(f"Image file not found at {image_path}. Please check the path.")

# Convert to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Resize to 64x128 for HOG descriptor consistency


gray_resized = cv2.resize(gray, (64, 128))

# Compute HOG features


fd, hog_image = hog(
gray_resized,
orientations=9,
pixels_per_cell=(8, 8),
cells_per_block=(2, 2),
visualize=True,
block_norm='L2-Hys'
)

# Normalize the HOG image for visualization


hog_image_rescaled = exposure.rescale_intensity(hog_image, out_range=(0, 255))
hog_image_uint8 = np.array(hog_image_rescaled, dtype=np.uint8)

return fd, hog_image_uint8

# Check if the model file exists


if os.path.exists(model_path):
# Load the model
modelA = joblib.load(model_path)
else:
raise FileNotFoundError(f"Model file not found at {model_path}. Please check the path.")

# Check if the image file exists


if os.path.exists(img_path):
# Extract HOG features
features, hog_image = extract_hog_features(img_path)

# Predict using the loaded model


modelA_pred = modelA.predict([features]) # Note: features need to be passed as a list
modelA_proba = modelA.predict_proba([features])

# Output the prediction


print("Human" if modelA_pred[0] == 1 else "Not Human")

271
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

else:
raise FileNotFoundError(f"Image file not found at {img_path}. Please check the path.")

# Read and display the image


img = cv2.imread(img_path)
if img is None:
raise FileNotFoundError(f"Image file not found at {img_path}. Please check the path.")

# Convert from BGR to RGB for correct plotting


img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

# Display the image and HOG visualization


plt.figure(figsize=(10, 5))

# Display original image


plt.subplot(1, 2, 1)
plt.imshow(img_rgb)
plt.title('Original Image')
plt.axis('off')

# Display HOG image


plt.subplot(1, 2, 2)
plt.imshow(hog_image, cmap='gray')
plt.title('HOG Visualization')
plt.axis('off')

plt.tight_layout()
plt.show()
Human

Conclusion:

272
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

We explored various methods for human detection using Histogram of Oriented Gradients
(HOG) features combined with K-Nearest Neighbors (KNN) classification. Each approach
provided unique insights into how different feature extraction and preprocessing techniques
affect classification performance. In the first approach, we used HOG features with smaller 8x8
cells, resulting in a detailed representation of the image's gradient orientations. This fine-
grained method captures intricate edge information, potentially enhancing the classifier's ability
to differentiate between human figures and other objects. However, the increased feature vector
dimensionality can lead to higher computational costs and necessitates careful parameter tuning
to prevent overfitting. The second approach involved using larger 32x32 cells for HOG feature
extraction. This coarser representation reduces the dimensionality of the feature vector, thereby
lowering computational load and potentially speeding up processing. However, this may result
in the loss of finer details critical for accurate human detection. Therefore, finding the right
balance between cell size and the need for detailed information is key to optimizing HOG
performance. In the third approach, we applied Gaussian blurring followed by Prewitt edge
detection before extracting HOG features. Blurring reduces noise and smooths variations in the
image, which can enhance edge detection accuracy. Prewitt edge detection emphasizes
gradients in both the horizontal and vertical directions, complementing the HOG features by
improving edge clarity. This combined method could potentially offer greater robustness against
noise and variations in the images, thereby improving human detection reliability. The fourth
approach involved using Difference of Gaussians (DoG) for blob detection prior to HOG
feature extraction. Blob detection helps identify regions of interest, focusing the feature
extraction on areas more likely to contain relevant information, such as human figures. This
method may enhance classification accuracy by providing more relevant features for KNN.
Based on the accuracies, the different images were also analyzed to verify human detection.
Also, out of the 4 models implemented, the HOG2 meaning the one which uses the 32x32 cell
size showed the highest accuracy.

273
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

ET3221: Computer Vision

Name of the student: Ashish Rodi Roll No. 9

Div: A Batch: 1
Date of performance: 10-09-2024

Experiment No. 11

Problem Statement: To classify images of different types of terrain (dirt, paved, and stone-
paved) by extracting texture features using the Gray-Level Co-occurrence Matrix (GLCM)
and Haralick features, and training a classifier based on these features.

AIM: Write a Python Code to perform the following:

● To implement feature extraction using GLCM and Haralick features.


● To extract multiple texture properties from images.
● To classify images based on extracted features using K-Nearest Neighbors (KNN).

Objective(s) of Experiment: To perform and study texture classification using GLCM and
Haralick features.

Introduction:

In computer vision, texture analysis is crucial in identifying and classifying objects based on
their surface characteristics. Methods like the Gray-Level Co-occurrence Matrix (GLCM)
are widely used for detecting and describing distinctive texture features in images. GLCM
provides a statistical method for examining the spatial relationship between pixels, making it
an excellent choice for texture-based classification.

Distinguishing between different types of terrain (dirt, paved, stone-paved) involves


extracting important texture features from each image and using these features to build a
classifier. The GLCM-based texture descriptors extracted from each image form high-
dimensional data, which are difficult to process directly. Therefore, a classification
algorithm such as K-Nearest Neighbors (KNN) is applied to classify these descriptors,
simplifying the problem.

GLCM Feature Extraction:


274
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

GLCM is used to extract texture features from each image. It computes how often pairs of
pixels with specific values and spatial relationships occur in a given direction and distance
in an image. From the GLCM, texture descriptors such as energy, contrast, correlation,
dissimilarity, and homogeneity are calculated. These descriptors capture the local structure
and pattern of the image and are invariant to changes in lighting and scaling.

K-Nearest Neighbors (KNN) Classification:


Once the GLCM descriptors are computed for each image, they are used to train a KNN
classifier. The classifier assigns labels to each image based on the features extracted from its
texture. The KNN algorithm relies on the proximity of feature vectors to make classification
decisions. Choosing the right number of neighbors (K) is crucial and can be optimized to
balance the model's complexity and performance.

Flowchart

275
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Code and Results:

import numpy as np
import pandas as pd
import os
import csv
import cv2
from matplotlib import pyplot as plt
import joblib
from sklearn import preprocessing
from skimage.filters import sobel
from skimage.measure import shannon_entropy
import warnings
warnings.filterwarnings('ignore')
from sklearn.cluster import KMeans
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from sklearn.tree import DecisionTreeClassifier
import sklearn.metrics as metrics
from sklearn.metrics import accuracy_score

from skimage.feature import graycomatrix, graycoprops


input0 = 'C:/Users/Ashish/Desktop/CVFolder/'
temp = ['Paved', 'Asphalt', 'Dirt']
def feature_extractor(dataset, cls):
image_dataset = pd.DataFrame()
for filename in os.listdir(input0+dataset): #iterate through each file
# print(image)
#break
df = pd.DataFrame() #Temporary data frame to capture information for each loop.
#Reset dataframe to blank after each loop.
img = cv2.imread(input0 +dataset + '/' + filename)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = cv2.resize(gray, (100, 100)) # Resize to 64x128 for consistency

################################################################
#START ADDING DATA TO THE DATAFRAME
#Full image
#GLCM = graycomatrix(img, [1], [0, np.pi/4, np.pi/2, 3*np.pi/4])
GLCM = graycomatrix(gray, [1], [0])
GLCM_Energy = graycoprops(GLCM, 'energy')[0]
df['Energy'] = GLCM_Energy

GLCM_corr = graycoprops(GLCM, 'correlation')[0]

276
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

df['Corr'] = GLCM_corr
GLCM_diss = graycoprops(GLCM, 'dissimilarity')[0]
df['Diss_sim'] = GLCM_diss
GLCM_hom = graycoprops(GLCM, 'homogeneity')[0]
df['Homogen'] = GLCM_hom
GLCM_contr = graycoprops(GLCM, 'contrast')[0]
df['Contrast'] = GLCM_contr

GLCM2 = graycomatrix(gray, [2], [0])


GLCM_Energy2 = graycoprops(GLCM2, 'energy')[0]
df['Energy2'] = GLCM_Energy2
GLCM_corr2 = graycoprops(GLCM2, 'correlation')[0]
df['Corr2'] = GLCM_corr2
GLCM_diss2 = graycoprops(GLCM2, 'dissimilarity')[0]
df['Diss_sim2'] = GLCM_diss2
GLCM_hom2 = graycoprops(GLCM2, 'homogeneity')[0]
df['Homogen2'] = GLCM_hom2
GLCM_contr2 = graycoprops(GLCM2, 'contrast')[0]
df['Contrast2'] = GLCM_contr2

GLCM3 = graycomatrix(gray, [5], [0])


GLCM_Energy3 = graycoprops(GLCM3, 'energy')[0]
df['Energy3'] = GLCM_Energy3
GLCM_corr3 = graycoprops(GLCM3, 'correlation')[0]
df['Corr3'] = GLCM_corr3
GLCM_diss3 = graycoprops(GLCM3, 'dissimilarity')[0]
df['Diss_sim3'] = GLCM_diss3
GLCM_hom3 = graycoprops(GLCM3, 'homogeneity')[0]
df['Homogen3'] = GLCM_hom3
GLCM_contr3 = graycoprops(GLCM3, 'contrast')[0]
df['Contrast3'] = GLCM_contr3

GLCM4 = graycomatrix(gray, [0], [np.pi/4])


GLCM_Energy4 = graycoprops(GLCM4, 'energy')[0]
df['Energy4'] = GLCM_Energy4
GLCM_corr4 = graycoprops(GLCM4, 'correlation')[0]
df['Corr4'] = GLCM_corr4
GLCM_diss4 = graycoprops(GLCM4, 'dissimilarity')[0]
df['Diss_sim4'] = GLCM_diss4

GLCM_hom4 = graycoprops(GLCM4, 'homogeneity')[0]

277
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

df['Homogen4'] = GLCM_hom4
GLCM_contr4 = graycoprops(GLCM4, 'contrast')[0]
df['Contrast4'] = GLCM_contr4

GLCM5 = graycomatrix(gray, [0], [np.pi/2])


GLCM_Energy5 = graycoprops(GLCM5, 'energy')[0]
df['Energy5'] = GLCM_Energy5
GLCM_corr5 = graycoprops(GLCM5, 'correlation')[0]
df['Corr5'] = GLCM_corr5
GLCM_diss5 = graycoprops(GLCM5, 'dissimilarity')[0]
df['Diss_sim5'] = GLCM_diss5
GLCM_hom5 = graycoprops(GLCM5, 'homogeneity')[0]
df['Homogen5'] = GLCM_hom5
GLCM_contr5 = graycoprops(GLCM5, 'contrast')[0]
df['Contrast5'] = GLCM_contr5

GLCM6= graycomatrix(gray, [5], [0])


GLCM_Energy6 = graycoprops(GLCM6, 'energy')[0]
df['Energy6'] = GLCM_Energy6
GLCM_corr6 = graycoprops(GLCM6, 'correlation')[0]
df['Corr6'] = GLCM_corr6
GLCM_diss6= graycoprops(GLCM6, 'dissimilarity')[0]
df['Diss_sim6'] = GLCM_diss6
GLCM_hom6 = graycoprops(GLCM6, 'homogeneity')[0]
df['Homogen6'] = GLCM_hom6
GLCM_contr6 = graycoprops(GLCM6, 'contrast')[0]
df['Contrast6'] = GLCM_contr6

GLCM7 = graycomatrix(gray, [0], [3*np.pi/4])


GLCM_Energy7 = graycoprops(GLCM7, 'energy')[0]
df['Energy7'] = GLCM_Energy7
GLCM_corr7 = graycoprops(GLCM7, 'correlation')[0]
df['Corr7'] = GLCM_corr7
GLCM_diss7 = graycoprops(GLCM7, 'dissimilarity')[0]
df['Diss_sim7'] = GLCM_diss7
GLCM_hom7 = graycoprops(GLCM7, 'homogeneity')[0]
df['Homogen7'] = GLCM_hom7
GLCM_contr7 = graycoprops(GLCM7, 'contrast')[0]
df['Contrast7'] = GLCM_contr7

GLCM8 = graycomatrix(gray, [3], [np.pi/2])


GLCM_Energy8 = graycoprops(GLCM8, 'energy')[0]
df['Energy8'] = GLCM_Energy8
GLCM_corr8 = graycoprops(GLCM8, 'correlation')[0]

df['Corr8'] = GLCM_corr8
GLCM_diss8 = graycoprops(GLCM8, 'dissimilarity')[0]
df['Diss_sim8'] = GLCM_diss8
GLCM_hom8 = graycoprops(GLCM8, 'homogeneity')[0]
df['Homogen8'] = GLCM_hom8

278
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

GLCM_contr8 = graycoprops(GLCM8, 'contrast')[0]


df['Contrast8'] = GLCM_contr8

#Add more filters as needed


entropy = shannon_entropy(gray)
df['Entropy'] = entropy
df['cls'] = cls
# print(GLCM_Energy,GLCM_corr,GLCM_diss,GLCM_hom,GLCM_contr,entropy)

#Append features from current image to the dataset


image_dataset = pd.concat([df, image_dataset], ignore_index=True)

return image_dataset
count = 1
for i in temp:
out = feature_extractor(i,count)
count+=1
#append to the csv file
csv_data=out.to_csv('./Data/GLCM/GLCM_' + '.csv', mode='a', index=False,
header=(count==2))
data = pd.read_csv('./Data/GLCM/GLCM_.csv')
X = data.drop(columns=['cls']) # Features (drop the 'cls' column)
y = data['cls'] # Labels (classes)
# Split the data into 80% training and 20% testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=38)
Results
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.metrics import confusion_matrix, precision_score, recall_score, f1_score,
RocCurveDisplay
import seaborn as sns
model_knn = KNeighborsClassifier(n_neighbors=3)

# Train the KNN classifier


model_knn.fit(X_train, y_train)

# Make predictions on the test set


y_pred = model_knn.predict(X_test)

# Calculate and print train and test accuracy


print("Train Accuracy:", model_knn.score(X_train, y_train))
print("Test Accuracy:", model_knn.score(X_test, y_test))

# Calculate precision, recall, and F1 scores (using 'micro' average for multi-class)
print("Precision Score: ", precision_score(y_test, y_pred, average='micro'))
print("Recall Score: ", recall_score(y_test, y_pred, average='micro'))
279
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

print("F1 Score: ", f1_score(y_test, y_pred, average='micro'))


print("Confusion Matrix: ")
print(confusion_matrix(y_test, y_pred))

Train Accuracy: 0.7083333333333334


Test Accuracy: 0.6666666666666666
Precision Score: 0.6666666666666666
Recall Score: 0.6666666666666666
F1 Score: 0.6666666666666666
Confusion Matrix:
[[1 1 0]
[0 1 0]
[0 1 2]]
def extract_features_from_image(image_path):
# Read the image
img = cv2.imread(image_path)
# Convert to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Resize the image to match the training size (100x100 as per your code)
gray = cv2.resize(gray, (100, 100))
# Extract GLCM features
df = pd.DataFrame()
GLCM = graycomatrix(gray, [1], [0])
df['Energy'] = graycoprops(GLCM, 'energy')[0]
df['Corr'] = graycoprops(GLCM, 'correlation')[0]
df['Diss_sim'] = graycoprops(GLCM, 'dissimilarity')[0]
df['Homogen'] = graycoprops(GLCM, 'homogeneity')[0]
df['Contrast'] = graycoprops(GLCM, 'contrast')[0]
GLCM2 = graycomatrix(gray, [2], [0])
GLCM_Energy2 = graycoprops(GLCM2, 'energy')[0]
df['Energy2'] = GLCM_Energy2
GLCM_corr2 = graycoprops(GLCM2, 'correlation')[0]
df['Corr2'] = GLCM_corr2
GLCM_diss2 = graycoprops(GLCM2, 'dissimilarity')[0]
df['Diss_sim2'] = GLCM_diss2
GLCM_hom2 = graycoprops(GLCM2, 'homogeneity')[0]
df['Homogen2'] = GLCM_hom2
GLCM_contr2 = graycoprops(GLCM2, 'contrast')[0]

df['Contrast2'] = GLCM_contr2

GLCM3 = graycomatrix(gray, [5], [0])


GLCM_Energy3 = graycoprops(GLCM3, 'energy')[0]
df['Energy3'] = GLCM_Energy3
GLCM_corr3 = graycoprops(GLCM3, 'correlation')[0]
df['Corr3'] = GLCM_corr3
GLCM_diss3 = graycoprops(GLCM3, 'dissimilarity')[0]
df['Diss_sim3'] = GLCM_diss3
280
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

GLCM_hom3 = graycoprops(GLCM3, 'homogeneity')[0]


df['Homogen3'] = GLCM_hom3
GLCM_contr3 = graycoprops(GLCM3, 'contrast')[0]
df['Contrast3'] = GLCM_contr3

GLCM4 = graycomatrix(gray, [0], [np.pi/4])


GLCM_Energy4 = graycoprops(GLCM4, 'energy')[0]
df['Energy4'] = GLCM_Energy4
GLCM_corr4 = graycoprops(GLCM4, 'correlation')[0]
df['Corr4'] = GLCM_corr4
GLCM_diss4 = graycoprops(GLCM4, 'dissimilarity')[0]
df['Diss_sim4'] = GLCM_diss4
GLCM_hom4 = graycoprops(GLCM4, 'homogeneity')[0]
df['Homogen4'] = GLCM_hom4
GLCM_contr4 = graycoprops(GLCM4, 'contrast')[0]
df['Contrast4'] = GLCM_contr4

GLCM5 = graycomatrix(gray, [0], [np.pi/2])


GLCM_Energy5 = graycoprops(GLCM5, 'energy')[0]
df['Energy5'] = GLCM_Energy5
GLCM_corr5 = graycoprops(GLCM5, 'correlation')[0]
df['Corr5'] = GLCM_corr5
GLCM_diss5 = graycoprops(GLCM5, 'dissimilarity')[0]
df['Diss_sim5'] = GLCM_diss5
GLCM_hom5 = graycoprops(GLCM5, 'homogeneity')[0]
df['Homogen5'] = GLCM_hom5
GLCM_contr5 = graycoprops(GLCM5, 'contrast')[0]
df['Contrast5'] = GLCM_contr5

GLCM6= graycomatrix(gray, [5], [0])


GLCM_Energy6 = graycoprops(GLCM6, 'energy')[0]
df['Energy6'] = GLCM_Energy6
GLCM_corr6 = graycoprops(GLCM6, 'correlation')[0]
df['Corr6'] = GLCM_corr6
GLCM_diss6= graycoprops(GLCM6, 'dissimilarity')[0]

df['Diss_sim6'] = GLCM_diss6
GLCM_hom6 = graycoprops(GLCM6, 'homogeneity')[0]
df['Homogen6'] = GLCM_hom6
GLCM_contr6 = graycoprops(GLCM6, 'contrast')[0]
df['Contrast6'] = GLCM_contr6

GLCM7 = graycomatrix(gray, [0], [3*np.pi/4])


GLCM_Energy7 = graycoprops(GLCM7, 'energy')[0]
df['Energy7'] = GLCM_Energy7
GLCM_corr7 = graycoprops(GLCM7, 'correlation')[0]
df['Corr7'] = GLCM_corr7

281
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

GLCM_diss7 = graycoprops(GLCM7, 'dissimilarity')[0]


df['Diss_sim7'] = GLCM_diss7
GLCM_hom7 = graycoprops(GLCM7, 'homogeneity')[0]

df['Homogen7'] = GLCM_hom7
GLCM_contr7 = graycoprops(GLCM7, 'contrast')[0]
df['Contrast7'] = GLCM_contr7

GLCM8 = graycomatrix(gray, [3], [np.pi/2])


GLCM_Energy8 = graycoprops(GLCM8, 'energy')[0]
df['Energy8'] = GLCM_Energy8
GLCM_corr8 = graycoprops(GLCM8, 'correlation')[0]
df['Corr8'] = GLCM_corr8
GLCM_diss8 = graycoprops(GLCM8, 'dissimilarity')[0]
df['Diss_sim8'] = GLCM_diss8
GLCM_hom8 = graycoprops(GLCM8, 'homogeneity')[0]
df['Homogen8'] = GLCM_hom8
GLCM_contr8 = graycoprops(GLCM8, 'contrast')[0]
df['Contrast8'] = GLCM_contr8

# Calculate entropy (if needed)


entropy = shannon_entropy(gray)
df['Entropy'] = entropy

return df.values.flatten()
Testing
random_image_path = 'C:/Users/Ashish/Desktop/CVFolder/Paved/img1.jpg' # Change to the
actual image path
# 1: Paved 2: Asphalt 3: Dirt
# Extract features from the random image
image_features = extract_features_from_image(random_image_path)

# Reshape to match input format for KNN (1 sample, n features)


image_features = image_features.reshape(1, -1)

# Scale the image features using the same scaler used for training

# Predict the class using the trained KNN model


predicted_class = model_knn.predict(image_features)

# Print the predicted class


print(f'The predicted class for the image is: {predicted_class[0]}')
img = cv2.imread(random_image_path)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) # Convert BGR to RGB for correct display
plt.imshow(img)
plt.axis('off') # Turn off axis numbers and ticks
282
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

plt.title(f'Input Image - Predicted Class: {predicted_class[0]}')


plt.show()

The predicted class for the image is: 1

random_image_path = 'C:/Users/Ashish/Desktop/CVFolder/Paved/img9.jpg' # Change to the


actual image path
# 1: Paved 2: Asphalt 3: Dirt
# Extract features from the random image
image_features = extract_features_from_image(random_image_path)

# Reshape to match input format for KNN (1 sample, n features)

image_features = image_features.reshape(1, -1)

# Scale the image features using the same scaler used for training

# Predict the class using the trained KNN model


predicted_class = model_knn.predict(image_features)

# Print the predicted class


print(f'The predicted class for the image is: {predicted_class[0]}')
img = cv2.imread(random_image_path)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) # Convert BGR to RGB for correct display
plt.imshow(img)
plt.axis('off') # Turn off axis numbers and ticks
plt.title(f'Input Image - Predicted Class: {predicted_class[0]}')
plt.show()
The predicted class for the image is: 2

283
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

png
random_image_path = 'C:/Users/Ashish/Desktop/CVFolder/Asphalt/im9.jpg' # Change to the
actual image path
# 1: Paved 2: Asphalt 3: Dirt

# Extract features from the random image


image_features = extract_features_from_image(random_image_path)

# Reshape to match input format for KNN (1 sample, n features)


image_features = image_features.reshape(1, -1)

# Scale the image features using the same scaler used for training

# Predict the class using the trained KNN model


predicted_class = model_knn.predict(image_features)

# Print the predicted class


print(f'The predicted class for the image is: {predicted_class[0]}')
img = cv2.imread(random_image_path)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) # Convert BGR to RGB for correct display
plt.imshow(img)
plt.axis('off') # Turn off axis numbers and ticks
plt.title(f'Input Image - Predicted Class: {predicted_class[0]}')
plt.show()
The predicted class for the image is: 2

284
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

random_image_path = 'C:/Users/Ashish/Desktop/CVFolder/Dirt/im6.jpg' # Change to the


actual image path

# 1: Paved 2: Asphalt 3: Dirt


# Extract features from the random image
image_features = extract_features_from_image(random_image_path)

# Reshape to match input format for KNN (1 sample, n features)


image_features = image_features.reshape(1, -1)

# Scale the image features using the same scaler used for training

# Predict the class using the trained KNN model


predicted_class = model_knn.predict(image_features)

# Print the predicted class


print(f'The predicted class for the image is: {predicted_class[0]}')
img = cv2.imread(random_image_path)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) # Convert BGR to RGB for correct display
plt.imshow(img)
plt.axis('off') # Turn off axis numbers and ticks
plt.title(f'Input Image - Predicted Class: {predicted_class[0]}')
plt.show()
The predicted class for the image is: 1

285
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

png

random_image_path = 'C:/Users/Ashish/Desktop/CVFolder/Dirt/im1.jpg' # Change to the


actual image path
# 1: Paved 2: Asphalt 3: Dirt
# Extract features from the random image
image_features = extract_features_from_image(random_image_path)

# Reshape to match input format for KNN (1 sample, n features)


image_features = image_features.reshape(1, -1)

# Scale the image features using the same scaler used for training

# Predict the class using the trained KNN model


predicted_class = model_knn.predict(image_features)

# Print the predicted class


print(f'The predicted class for the image is: {predicted_class[0]}')
img = cv2.imread(random_image_path)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) # Convert BGR to RGB for correct display
plt.imshow(img)
plt.axis('off') # Turn off axis numbers and ticks
plt.title(f'Input Image - Predicted Class: {predicted_class[0]}')
plt.show()
The predicted class for the image is: 3

286
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Conclusion:
The experiment explained and gave insights on how GLCM technique (Gray level
Co-occurrence matrix) along with the Haralicks features can help in texture detection.
Different parameters like Correlation, entropy, energy etc were analyzed. This experiment also
utilized the concepts of KNN classifier for predicting the correct road type (asphalt,
paved or dirt). The model was trained on 80% of the images and then random
images were used to check if the model was able to predict the correct road type.

287
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

ET3221: Computer Vision

Name of the student: Ashish Rodi Roll No. 9


Div: A Batch: 1
Date of performance: 01-10-2024

Experiment No. 12

Problem Statement:
Face Expression Recognition using Gabor Filter and Local Binary Pattern (LBP).
AIM:
Write a Python Code for identifying the following face expressions
1. Angry
2. Fear
3. Happy
4. Sad

Objective of Experiment:
To apply Gabor Filter and LBP to identify different facial expressions.
Introduction:
A Gabor filter is a linear filter used in image processing, especially for texture analysis and
feature extraction. It is named after the scientist Dennis Gabor. A Gabor filter captures both
spatial frequency (how quickly the intensity values change in an image) and orientation (the
direction of edges) and is sensitive to specific frequencies and orientations in an image. By
applying Gabor filters at different scales and orientations, it is possible to capture patterns at
various levels of detail in different directions.

Local Binary Pattern (LBP) is a simple and effective texture descriptor used for image analysis.
It works by converting an image into a binary pattern that represents local texture information
by comparing the intensity of a central pixel with its surrounding neighbors.
LBP is rotation-invariant and computationally efficient.

288
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

Flowchart:

Code and Results


import scipy
scipy .__version__
'1.12.0'
from sklearn.svm import LinearSVC
import matplotlib.pyplot as plt
import argparse
import cv2
import os
289
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

# Local Binary Pattern function


from skimage.feature import local_binary_pattern
# To calculate a normalized histogram
from sklearn.preprocessing import normalize
import numpy as np
import csv
import pandas as pd
import joblib
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn import svm
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from xgboost import XGBClassifier
from sklearn.metrics import confusion_matrix, precision_score, recall_score, f1_score
import sklearn.metrics as metrics
GABOR FILTER

Gabor Vertical kernel


folder_name = 'C:/Users/Ashish/Desktop/CVFolder/images/train/'
temp = ['angry' , 'fear' , 'happy' , 'sad']
import os
# GABOR VERTICAL KERNEL
def gabor_vertical(folder_name):
for x in temp:
i=0
currentframe =1
for filename in os.listdir(folder_name + x):
#path
print(filename)
path= folder_name + x + '/'+ filename
print(path)
a=cv2.imread(path)

a = cv2.cvtColor(a, cv2.COLOR_BGR2GRAY)

ksize = 3 #Use size that makes sense to the image and fetaure size. Large may not be
good.
#On the synthetic image it is clear how ksize affects imgae (try 5 and 50)
sigma = 2 #Large sigma on small features will fully miss the features.
theta = 2*np.pi/4 #/4 shows horizontal 3/4 shows other horizontal. Try other
contributions
lamda = 1*np.pi /4 #1/4 works best for angled.
gamma=0.4 #Value of 1 defines spherical. Calue close to 0 has high aspect ratio
#Value of 1, spherical may not be ideal as it picks up features from other regions.
phi = 0 #Phase offset. I leave it to 0.

kernel = cv2.getGaborKernel((ksize, ksize), sigma, theta, lamda, gamma, phi,


290
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

ktype=cv2.CV_32F)
fimg = cv2.filter2D(a, cv2.CV_8UC3, kernel)

name = 'C:/Users/Ashish/Desktop/CVFolder/images/gabor_v/' + x
+'/'+str(currentframe) + '.png'
print ('Creating...' + name)

cv2.imwrite(name, fimg)

currentframe=currentframe+1
i=i+1
if i==1000:
break
gabor_vertical(folder_name)
C:/Users/Ashish/Desktop/CVFolder/images/train/happy/11869.jpg
v/happy/404.png
1187.jpg
C:/Users/Ashish/Desktop/CVFolder/images/train/happy/1187.jpg
v/happy/405.png
11870.jpg
C:/Users/Ashish/Desktop/CVFolder/images/train/happy/11870.jpg
v/happy/406.png
11871.jpg
C:/Users/Ashish/Desktop/CVFolder/images/train/happy/11871.jpg
v/happy/407.png

# DEFINING CSV FILE TO STORE FEATURE DESCRIPTOR PROVIDED FOR LBP


csv1 = "C:/Users/Ashish/Desktop/CVFolder/images/gabor_v.csv"
csv2 = "C:/Users/Ashish/Desktop/CVFolder/images/gabor_h.csv"

APPLYING LBP ON IMAGES OBTAINED BY APPLYING VERTICAL KENREL


def lbp_v(folder_name):
label = 1
for y in temp:
i=0
for filename in os.listdir(folder_name + '/' + y):
#path
path= folder_name + y + '/' + filename
print(path)
a=cv2.imread(path)

a = cv2.cvtColor(a, cv2.COLOR_BGR2GRAY)
radius = 3
# Number of points to be considered as neighbourers
no_points = 8 * radius

# Uniform LBP is used


lbp = local_binary_pattern(a, no_points, radius, method='uniform')

291
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

# Calculate the histogram


hist, _ = np.histogram(lbp.ravel(), bins=18, range=(0, 18), density=True)

# Ensure hist is a 1D array


hist = hist.reshape(-1)

# Take only the first 18 bins (adjust this as per your needs)
j = hist[:18].tolist() # This will ensure you get only the first 18 bins
j.append(label)
# taking only first 18 bins
with open(r'C:/Users/Ashish/Desktop/CVFolder/images/gabor_v.csv', 'a',newline='') as
f:
writer = csv.writer(f)
writer.writerow(j)
i=i+1
if i==1000:
break
label = label + 1
gabor_path_v = 'C:/Users/Ashish/Desktop/CVFolder/images/gabor_v/'
lbp_v(gabor_path_v)
data_v = pd.read_csv('C:/Users/Ashish/Desktop/CVFolder/images/gabor_v.csv' , dtype =
'float64')

APPLYING LBP ON IMAGES OBTAINED BY APPLYING HORIZONTAL KENREL


def lbp_h(folder_name):
label = 1
for y in temp:
i=0
for filename in os.listdir(folder_name + '/' + y):
#path
path= folder_name + y + '/' + filename
print(path)
a=cv2.imread(path)

a = cv2.cvtColor(a, cv2.COLOR_BGR2GRAY)
radius = 3
# Number of points to be considered as neighbourers
no_points = 8 * radius

# Uniform LBP is used


lbp = local_binary_pattern(a, no_points, radius, method='uniform')

# Calculate the histogram


hist, _ = np.histogram(lbp.ravel(), bins=18, range=(0, 18), density=True)

# Ensure hist is a 1D array


hist = hist.reshape(-1)

# Take only the first 18 bins (adjust this as per your needs)
292
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

j = hist[:18].tolist() # This will ensure you get only the first 18 bins
j.append(label)
# taking only first 18 bins
with open(r'C:/Users/Ashish/Desktop/CVFolder/images/gabor_h.csv', 'a',newline='') as
f:
writer = csv.writer(f)
writer.writerow(j)

i=i+1
if i==1000:
break
label = label + 1
gabor_path_h = 'C:/Users/Ashish/Desktop/CVFolder/images/gabor_h/'
lbp_h(gabor_path_h)
combined_csv_path = 'C:/Users/Ashish/Desktop/CVFolder/images/combined_gabor.csv'

# Creating a list to store the 3 csv files and combine them


csv_list = ['C:/Users/Ashish/Desktop/CVFolder/images/gabor_v.csv',
'C:/Users/Ashish/Desktop/CVFolder/images/gabor_h.csv']

first_file = True

for file in csv_list:


if first_file:
df = pd.read_csv(file)
out_master = df.to_csv(combined_csv_path , mode = 'w' , index = False ,header = True)
first_file = False
else:
df = pd.read_csv(file)
out_master = df.to_csv(combined_csv_path , mode = 'a' , index = False , header = True)
data = pd.read_csv('C:/Users/Ashish/Desktop/CVFolder/images/combined_gabor.csv')
data
Y = data.iloc[:,-1]
Y_1 = Y - 1
Y
0 1
1 1
2 1
3 1
4 1
..
7995 4
7996 4
7997 4
7998 4

293
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

7999 4
Name: Class, Length: 8000, dtype: int64
Y_1
0 0
1 0
2 0
3 0
4 0
..
7995 3
7996 3
7997 3
7998 3
7999 3
Name: Class, Length: 8000, dtype: int64
X_train,X_test,y_train,y_test = train_test_split(X , Y , train_size = 0.8 , random_state = 0)
X_train
y_train
1001 2
7360 4
5234 2
7390 4
6841 3
..
4931 1
3264 4
1653 2
2607 3
2732 3
Name: Class, Length: 6400, dtype: int64
X_test
y_test
3069 4
1675 2
6385 3
543 1
3213 4
..
7716 4
4766 1
4096 1
1595 2
5023 2
Name: Class, Length: 1600, dtype: int64

294
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

KNN Classifier
#KNN classifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import confusion_matrix, precision_score, recall_score, f1_score,
RocCurveDisplay
model_knn = KNeighborsClassifier(n_neighbors = 3)
model_knn.fit(X_train, y_train)
joblib.dump(model_knn,"C:/Users/Ashish/Desktop/CVFolder/images/Trained_Models/
modelGabor_LBP_knn")
y_pred3 = model_knn.predict(X_test)
print("KNN Classifier")
print("Train Accuracy:",model_knn.score(X_train, y_train))
print("Test Accuracy:",model_knn.score(X_test, y_test))
print("Precision Score: ",metrics.precision_score(y_test, y_pred3, pos_label='positive',
average='micro'))
print("Recall Score: ",metrics.recall_score(y_test, y_pred3, pos_label='positive',
average='micro'))
print("F1 Score: ",metrics.f1_score(y_test, y_pred3, pos_label='positive', average='micro'))
KNN Classifier
Train Accuracy: 0.60953125
Test Accuracy: 0.3125
Precision Score: 0.3125
Recall Score: 0.3125
F1 Score: 0.3125

KNN Hyperparameter Tuning


#KNN classifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import GridSearchCV

# Define the parameter grid to search over


param_grid = {'n_neighbors': [5, 7, 13, 17, 23],
'weights': ['uniform', 'distance'],
'metric': ['euclidean', 'manhattan']}

# Create a KNeighborsClassifier object


knn = KNeighborsClassifier()

# Perform a grid search over the parameter grid using 5-fold cross-validation
grid_search_knn = GridSearchCV(knn, param_grid, cv=5)

# Fit the grid search object to the training data


#model_knn.fit(x_train, y_train)
grid_search_knn.fit(X_train, y_train)

# Print the best hyperparameters and their corresponding score


295
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

print("Best hyperparameters: ", grid_search_knn.best_params_)


print("Best score: ", grid_search_knn.best_score_)

joblib.dump(grid_search_knn,"C:/Users/Ashish/Desktop/CVFolder/images/Trained_Models/
modelA_Gabor_LBP_tunned")

y_pred3 = grid_search_knn.predict(X_test)

print("KNN")
print("Train Accuracy:",grid_search_knn.score(X_train, y_train))
print("Test Accuracy:",grid_search_knn.score(X_test, y_test))
print("Precision Score: ",metrics.precision_score(y_test, y_pred3, pos_label='positive',
average='micro'))
print("Recall Score: ",metrics.recall_score(y_test, y_pred3, pos_label='positive',
average='micro'))
print("F1 Score: ",metrics.f1_score(y_test, y_pred3, pos_label='positive', average='micro'))
Best hyperparameters: {'metric': 'euclidean', 'n_neighbors': 17, 'weights': 'distance'}
Best score: 0.32171875
KNN
Train Accuracy: 0.99859375
Test Accuracy: 0.31875
Precision Score: 0.31875
Recall Score: 0.31875
F1 Score: 0.31875

XGBoost Classifier
X_train,X_test,y_train,y_test = train_test_split(X , Y_1 , train_size = 0.8 , random_state = 0)
from sklearn.metrics import accuracy_score, classification_report
model_XGBoost = XGBClassifier()
model_XGBoost.fit(X_train , y_train)
joblib.dump(model_XGBoost,'C:/Users/Ashish/Desktop/CVFolder/images/Trained_Models/
modelGabor_LBP_XGBoost')
predictions = model_XGBoost.predict(X_test)

print("Train Accuracy:",model_XGBoost.score(X_train, y_train))


print("Test Accuracy:",model_XGBoost.score(X_test, y_test))
print("Precision Score: ",metrics.precision_score(y_test, y_pred3, pos_label='positive',
average='micro'))
print("Recall Score: ",metrics.recall_score(y_test, y_pred3, pos_label='positive',
average='micro'))
print("F1 Score: ",metrics.f1_score(y_test, y_pred3, pos_label='positive', average='micro'))
Train Accuracy: 0.99140625
Test Accuracy: 0.31875
Precision Score: 0.17375
Recall Score: 0.17375
F1 Score: 0.17375

296
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

SVM classifier
svm_clf = SVC(kernel = 'linear')
svm_clf.fit(X_train, y_train)
joblib.dump(svm_clf,'C:/Users/Ashish/Desktop/CVFolder/images/Trained_Models/
modelGabor_LBP_SVM')

svm_predictions = svm_clf.predict(X_test)

print("Train Accuracy:",svm_clf.score(X_train, y_train))


print("Test Accuracy:",svm_clf.score(X_test, y_test))
print("Precision Score: ",metrics.precision_score(y_test, y_pred3, pos_label='positive',
average='micro'))
print("Recall Score: ",metrics.recall_score(y_test, y_pred3, pos_label='positive',
average='micro'))
print("F1 Score: ",metrics.f1_score(y_test, y_pred3, pos_label='positive', average='micro'))
Train Accuracy: 0.28921875
Test Accuracy: 0.28625
Precision Score: 0.17375
Recall Score: 0.17375
F1 Score: 0.17375

Comparing all 3 classifiers, it is clear that hyper-tunned KNN performs best. then XGBoost
classifier and then SVM classifier

Testing
def gabor_v_test(img_path):
path= img_path
print(path)
a=cv2.imread(path)

a = cv2.cvtColor(a, cv2.COLOR_BGR2GRAY)

ksize = 3 #Use size that makes sense to the image and fetaure size. Large may not be good.
#On the synthetic image it is clear how ksize affects imgae (try 5 and 50)
sigma = 2 #Large sigma on small features will fully miss the features.
theta = 2*np.pi/4 #/4 shows horizontal 3/4 shows other horizontal. Try other contributions
lamda = 1*np.pi /4 #1/4 works best for angled.
gamma=0.4 #Value of 1 defines spherical. Calue close to 0 has high aspect ratio
#Value of 1, spherical may not be ideal as it picks up features from other regions.
phi = 0 #Phase offset. I leave it to 0.

kernel = cv2.getGaborKernel((ksize, ksize), sigma, theta, lamda, gamma, phi,


ktype=cv2.CV_32F)
fimg = cv2.filter2D(a, cv2.CV_8UC3, kernel)

name = 'C:/Users/Ashish/Desktop/CVFolder/images/Gabor_LBP_test/v/0.png'
print ('Creating...' + name)

cv2.imwrite(name, fimg)
297
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

def gabor_h_test(img_path):
path= img_path
print(path)
a=cv2.imread(path)

a = cv2.cvtColor(a, cv2.COLOR_BGR2GRAY)

ksize = 3 #Use size that makes sense to the image and fetaure size. Large may not be good.
#On the synthetic image it is clear how ksize affects imgae (try 5 and 50)
sigma = 2 #Large sigma on small features will fully miss the features.
theta = 8*np.pi/4 #/4 shows horizontal 3/4 shows other horizontal. Try other contributions
lamda = 1*np.pi /4 #1/4 works best for angled.
gamma=0.4 #Value of 1 defines spherical. Calue close to 0 has high aspect ratio
#Value of 1, spherical may not be ideal as it picks up features from other regions.
phi = 0 #Phase offset. I leave it to 0.

kernel = cv2.getGaborKernel((ksize, ksize), sigma, theta, lamda, gamma, phi,


ktype=cv2.CV_32F)
fimg = cv2.filter2D(a, cv2.CV_8UC3, kernel)

name = 'C:/Users/Ashish/Desktop/CVFolder/images/Gabor_LBP_test/h/0.png'
print ('Creating...' + name)

cv2.imwrite(name, fimg)
def lbp_v_test(gabor_file_path):
path= gabor_file_path
print(path)
a=cv2.imread(path)

a = cv2.cvtColor(a, cv2.COLOR_BGR2GRAY)
radius = 3
# Number of points to be considered as neighbourers
no_points = 8 * radius

# Uniform LBP is used


lbp = local_binary_pattern(a, no_points, radius, method='uniform')

# Calculate the histogram


hist, _ = np.histogram(lbp.ravel(), bins=18, range=(0, 18), density=True)

# Ensure hist is a 1D array


hist = hist.reshape(-1)

# Take only the first 18 bins (adjust this as per your needs)
j = hist[:18].tolist() # This will ensure you get only the first 18 bins
# taking only first 18 bins
with open(r'C:/Users/Ashish/Desktop/CVFolder/images/Gabor_LBP_test/gabor_v_test.csv',
'a',newline='') as f:
writer = csv.writer(f)
writer.writerow(j)

298
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

def lbp_h_test(gabor_file_path):
path= gabor_file_path
print(path)
a=cv2.imread(path)

a = cv2.cvtColor(a, cv2.COLOR_BGR2GRAY)
radius = 3
# Number of points to be considered as neighbourers
no_points = 8 * radius

# Uniform LBP is used


lbp = local_binary_pattern(a, no_points, radius, method='uniform')

# Calculate the histogram


hist, _ = np.histogram(lbp.ravel(), bins=18, range=(0, 18), density=True)

# Ensure hist is a 1D array


hist = hist.reshape(-1)

# Take only the first 18 bins (adjust this as per your needs)
j = hist[:18].tolist() # This will ensure you get only the first 18 bins
# taking only first 18 bins
with open(r'C:/Users/Ashish/Desktop/CVFolder/images/Gabor_LBP_test/gabor_v_test.csv',
'a',newline='') as f:
writer = csv.writer(f)
writer.writerow(j)
def test_csv_combine(file1,file2):
combined_csv_path = './combined_gabor_test.csv'

# Creating a list to store the 3 csv files and combine them


csv_list = [file1, file2]

first_file = True

for file in csv_list:


if first_file:
df = pd.read_csv(file)
out_master = df.to_csv(combined_csv_path , mode = 'w' , index = False ,header =
True)
first_file = False
else:
df = pd.read_csv(file)
out_master = df.to_csv(combined_csv_path , mode = 'a' , index = False , header =
True)
img_path = 'C:/Users/Ashish/Desktop/CVFolder/images/validation/angry/65.jpg'
gabor_v_test(img_path)
gabor_h_test(img_path)
lbp_v_test('C:/Users/Ashish/Desktop/CVFolder/images/Gabor_LBP_test/v/0.png')
lbp_h_test('C:/Users/Ashish/Desktop/CVFolder/images/Gabor_LBP_test/h/0.png')

299
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

img = cv2.imread(img_path)
img = cv2.cvtColor(img , cv2.COLOR_BGR2GRAY)
data_test = pd.read_csv('C:/Users/Ashish/Desktop/CVFolder/images/Gabor_LBP_test/
gabor_v_test.csv')
data_test
model_KNN_tunned =
joblib.load('C:/Users/Ashish/Desktop/CVFolder/images/Trained_Models/
modelA_Gabor_LBP_tunned')
predictions = model_KNN_tunned.predict(data_test)
class_mapping = {0: "angry", 1: "fear", 2: "happy", 3: "sad"}
predicted_class_name = class_mapping.get(predictions[0], "Unknown")
print(f"The predicted class for the new image is: {predicted_class_name}")
plt.imshow(img , 'gray')
plt.show()
The predicted class for the new image is: fear

png

The actual class being angry it predicted fear.


img_path = 'C:/Users/Ashish/Desktop/CVFolder/images/validation/happy/80.jpg'
gabor_v_test(img_path)
gabor_h_test(img_path)
lbp_v_test('C:/Users/Ashish/Desktop/CVFolder/images/Gabor_LBP_test/v/0.png')
lbp_h_test('C:/Users/Ashish/Desktop/CVFolder/images/Gabor_LBP_test/h/0.png')
img = cv2.imread(img_path)
img = cv2.cvtColor(img , cv2.COLOR_BGR2GRAY)
300
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

data_test = pd.read_csv('C:/Users/Ashish/Desktop/CVFolder/images/Gabor_LBP_test/
gabor_v_test.csv')
data_test
model_KNN_tunned =
joblib.load('C:/Users/Ashish/Desktop/CVFolder/images/Trained_Models/
modelA_Gabor_LBP_tunned')
predictions = model_KNN_tunned.predict(data_test)
class_mapping = {0: "angry", 1: "fear", 2: "happy", 3: "sad"}
predicted_class_name = class_mapping.get(predictions[0], "Unknown")
print(f"The predicted class for the new image is: {predicted_class_name}")
plt.imshow(img , 'gray')
plt.show()
The predicted class for the new image is: fear

Conclusion:
In this experiment, I explored two new techniques: the Gabor filter and the Local Binary Pattern
(LBP) feature descriptor. The Gabor filter is useful for extracting spatial frequency and
orientation information from images, while LBP is a texture descriptor known for its ability to
be rotation-invariant. However, I found that the classifiers using these techniques didn’t perform
well, with test accuracies around 30% across different classifiers. This could be due to the poor
quality of the dataset, where the texture variations might not be distinct enough to differentiate
between different emotions. I believe that increasing the image resolution could potentially
improve the results by providing more pixels for the classifiers to better capture subtle texture
differences.

301
Bansilal Ramnath Agarwal Charitable Trust’s
Vishwakarma Institute of Technology, Pune-37
(An Autonomous Institute Affiliated to Savitribai Pune University)

Department of Electronics & Telecommunication Engineering

302

You might also like