Task 9 Implementation of Object Detection and Localization

Uploaded by Mani Maran

Task_9_Implementation_of_object_detection_and_localization

March 22, 2024

#Object detection using deep learning with OpenCV and Python


[1]: #Get the data
from google.colab import files
files.upload()

! mkdir ~/.kaggle
! cp kaggle.json ~/.kaggle/
! chmod 600 ~/.kaggle/kaggle.json

Saving kaggle.json to kaggle.json

[2]: !kaggle datasets download -d drmanimaran/object-detection-using-open-cv

Downloading object-detection-using-open-cv.zip to /content


83% 10.0M/12.0M [00:01<00:00, 12.7MB/s]
100% 12.0M/12.0M [00:01<00:00, 10.2MB/s]

[3]: !unzip /content/object-detection-using-open-cv.zip -d Data

Archive: /content/object-detection-using-open-cv.zip
inflating: Data/coco_names.txt
inflating: Data/frozen_inference_graph.pb
inflating: Data/horses.jpg
inflating: Data/image1.jpg
inflating: Data/image3.jpg
inflating: Data/ssd_mobilenet_v3_large_coco_2020_01_14.pbtxt
#First Image
[4]: import cv2

image = cv2.imread('Data/image1.jpg')
image = cv2.resize(image, (640, 480))
h = image.shape[0]
w = image.shape[1]

# path to the weights and model files

weights = "Data/frozen_inference_graph.pb"
model = "Data/ssd_mobilenet_v3_large_coco_2020_01_14.pbtxt"
# load the MobileNet SSD model trained on the COCO dataset
net = cv2.dnn.readNetFromTensorflow(weights, model)

First, we load our image from disk, resize it, and grab the height and width of the image.
Then we load our model using OpenCV’s dnn module, providing the paths to the frozen graph
(.pb) and the model configuration (.pbtxt) files.
These sample files, along with some test images, were extracted into the Data folder from the
Kaggle dataset downloaded above.
Next, let’s load our class labels:
[5]: # load the class labels the model was trained on
class_names = []
with open("Data/coco_names.txt", "r") as f:
class_names = f.read().strip().split("\n")

We first create the list class_names that will hold the names of the classes from the file.
Then we open the file using a context manager and store each line, which corresponds to one class
name, in our list.
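The strip/split pattern can be checked in isolation. The sketch below uses an in-memory file with a few made-up COCO-style labels (the real Data/coco_names.txt holds the full 80-class list):

```python
from io import StringIO

# Simulate a small coco_names.txt-style file: one class name per line,
# with a trailing newline like a real text file would have.
fake_file = StringIO("person\nbicycle\ncar\n")

# strip() drops the trailing newline so we don't get an empty last entry
class_names = fake_file.read().strip().split("\n")
print(class_names)  # ['person', 'bicycle', 'car']
```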
Let’s now apply object detection:
[6]: # create a blob from the image
blob = cv2.dnn.blobFromImage(
    image, 1.0/127.5, (320, 320), [127.5, 127.5, 127.5])
# pass the blob through our network and get the output predictions
net.setInput(blob)
output = net.forward()  # shape: (1, 1, 100, 7)

We create a blob using the cv2.dnn.blobFromImage function. This function preprocesses our
image and prepares it for the network by applying mean subtraction and scaling to the input
image.
The first argument to this function is the input image. The second is the scale factor applied to
the pixel values after the mean is subtracted.
The third argument is the spatial size of the output image, and the last argument is the per-channel
mean values that are subtracted from the image.
The values for these parameters are provided in the model’s documentation, so don’t worry too
much about them.
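Since blobFromImage subtracts the mean 127.5 first and then multiplies by the scale factor 1/127.5, the preprocessing maps pixel values from [0, 255] into roughly [-1, 1]. A quick arithmetic check (plain NumPy, no OpenCV required):

```python
import numpy as np

# Representative pixel values: black, mid-gray, white
pixels = np.array([0.0, 127.5, 255.0])
scale = 1.0 / 127.5
mean = 127.5

# blobFromImage computes scale * (pixel - mean) for each channel
normalized = scale * (pixels - mean)
print(normalized)  # [-1.  0.  1.]
```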
Next, we set the blob as input for the network and detect the objects on the image using our Single
Shot Detector.
The result is stored in the output variable. If you print out the shape of this variable, you’ll get a
shape of (1, 1, 100, 7).
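Each of the 100 rows in that output holds 7 values: the image id, the class id, the confidence score, and the normalized box corners (x_min, y_min, x_max, y_max). A small sketch with a made-up detection row shows how these decode into pixel coordinates (the 640x480 size matches the resize above):

```python
import numpy as np

# A hypothetical detection row, in the layout the SSD output uses:
# [image_id, class_id, confidence, x_min, y_min, x_max, y_max]
# (box coordinates are normalized to [0, 1])
detection = np.array([0.0, 3.0, 0.92, 0.10, 0.20, 0.55, 0.80])

w, h = 640, 480  # resized image dimensions
class_id = int(detection[1])
probability = float(detection[2])

# scale normalized corners up to pixel coordinates
box = [int(a * b) for a, b in zip(detection[3:7], [w, h, w, h])]
print(class_id, box)  # 3 [64, 96, 352, 384]
```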

So we can loop over the detections to filter the results and get the bounding boxes of the detected
objects:
[7]: from google.colab.patches import cv2_imshow

# loop over the number of detected objects
# output[0, 0, :, :] has a shape of (100, 7)
for detection in output[0, 0, :, :]:
    # the confidence of the model regarding the detected object
    probability = detection[2]

    # if the confidence of the model is lower than 50%,
    # we do nothing (continue looping)
    if probability < 0.5:
        continue

    # perform element-wise multiplication to get
    # the (x, y) coordinates of the bounding box
    box = [int(a * b) for a, b in zip(detection[3:7], [w, h, w, h])]
    box = tuple(box)
    # draw the bounding box of the object
    cv2.rectangle(image, box[:2], box[2:], (0, 255, 0), thickness=2)

    # extract the ID of the detected object to get its name
    class_id = int(detection[1])
    # draw the name of the predicted object along with the probability
    label = f"{class_names[class_id - 1].upper()} {probability * 100:.2f}%"
    cv2.putText(image, label, (box[0], box[1] + 15),
                cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

cv2_imshow(image)
cv2.waitKey()

[7]: -1

#Second Image
[8]: image1 = cv2.imread('Data/image3.jpg')
image1 = cv2.resize(image1, (640, 480))
h = image1.shape[0]
w = image1.shape[1]

[9]: # create a blob from the image
blob = cv2.dnn.blobFromImage(
    image1, 1.0/127.5, (320, 320), [127.5, 127.5, 127.5])
# pass the blob through our network and get the output predictions
net.setInput(blob)
output = net.forward()  # shape: (1, 1, 100, 7)

[10]: from google.colab.patches import cv2_imshow

# loop over the number of detected objects
# output[0, 0, :, :] has a shape of (100, 7)
for detection in output[0, 0, :, :]:
    # the confidence of the model regarding the detected object
    probability = detection[2]

    # if the confidence of the model is lower than 50%,
    # we do nothing (continue looping)
    if probability < 0.5:
        continue

    # perform element-wise multiplication to get
    # the (x, y) coordinates of the bounding box
    box = [int(a * b) for a, b in zip(detection[3:7], [w, h, w, h])]
    box = tuple(box)
    # draw the bounding box of the object
    cv2.rectangle(image1, box[:2], box[2:], (0, 255, 0), thickness=2)

    # extract the ID of the detected object to get its name
    class_id = int(detection[1])
    # draw the name of the predicted object along with the probability
    label = f"{class_names[class_id - 1].upper()} {probability * 100:.2f}%"
    cv2.putText(image1, label, (box[0], box[1] + 15),
                cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

cv2_imshow(image1)
cv2.waitKey()

[10]: -1

#Third Image
[11]: image2 = cv2.imread('Data/horses.jpg')
image2 = cv2.resize(image2, (640, 480))
h = image2.shape[0]
w = image2.shape[1]

[12]: # create a blob from the image
blob = cv2.dnn.blobFromImage(
    image2, 1.0/127.5, (320, 320), [127.5, 127.5, 127.5])
# pass the blob through our network and get the output predictions
net.setInput(blob)
output = net.forward()  # shape: (1, 1, 100, 7)

[13]: from google.colab.patches import cv2_imshow

# loop over the number of detected objects
# output[0, 0, :, :] has a shape of (100, 7)
for detection in output[0, 0, :, :]:
    # the confidence of the model regarding the detected object
    probability = detection[2]

    # if the confidence of the model is lower than 50%,
    # we do nothing (continue looping)
    if probability < 0.5:
        continue

    # perform element-wise multiplication to get
    # the (x, y) coordinates of the bounding box
    box = [int(a * b) for a, b in zip(detection[3:7], [w, h, w, h])]
    box = tuple(box)
    # draw the bounding box of the object
    cv2.rectangle(image2, box[:2], box[2:], (0, 255, 0), thickness=2)

    # extract the ID of the detected object to get its name
    class_id = int(detection[1])
    # draw the name of the predicted object along with the probability
    label = f"{class_names[class_id - 1].upper()} {probability * 100:.2f}%"
    cv2.putText(image2, label, (box[0], box[1] + 15),
                cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

cv2_imshow(image2)
cv2.waitKey()

[13]: -1
