AI MasterClass Day11Intern
Day 11
Object Recognition
Day-11 Agenda.
04. Deploying MobileNetSSD Pre-trained Model
05. Real-time Object Recognition with a Pre-trained Model
Object Recognition.
Object recognition is a computer vision technique
for identifying objects in images or videos. Object
recognition is a key output of deep learning and
machine learning algorithms. When humans look at
a photograph or watch a video, we can readily spot
people, objects, scenes, and visual details.
Implementing Object
Recognition.
01 PRE-TRAINED MODEL
02 TRANSFER LEARNING
03 BUILDING FROM SCRATCH
Deep Neural Network -
DNN.
• Solves complex tasks.
• When new information enters the system, it learns how to act in the new
situation.
• Learning becomes deeper as the tasks you solve get harder.
• The OpenCV DNN module helps to load pre-trained models from DL
frameworks such as:
TensorFlow
Caffe
Darknet
Torch
Speed Comparison on Image
Classification.
Pre-trained Model for Object
recognition.
• MobileNet-SSD
• GoogLeNet
• SqueezeNet
• Faster R-CNN
• ResNet
• Inception
• YOLO
• VGGNet
MobileNet SSD (Single Shot MultiBox Detector).
• The MobileNet model is based on depthwise separable convolutions which are a
form of factorized convolutions. These factorize a standard convolution into a
depthwise convolution and a 1 × 1 convolution called a pointwise convolution.
• For MobileNets, the depthwise convolution applies a single filter to each input
channel. The pointwise convolution then applies a 1 × 1 convolution to combine
the outputs of the depthwise convolution.
• A standard convolution both filters and combines inputs into a new set of outputs
in one step. The depthwise separable convolution splits this into two layers
– a separate layer for filtering and a separate layer for combining. This
factorization has the effect of drastically reducing computation and model size.
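The saving described above can be checked with a little arithmetic. The sketch below counts multiply-add operations for a standard convolution and for its depthwise separable equivalent, assuming stride 1 and an output the same size as the input. The layer sizes and the function names are hypothetical, chosen only for illustration.

```python
def standard_conv_cost(k, c_in, c_out, h, w):
    """Multiply-adds for a k x k standard convolution over an h x w feature map."""
    return k * k * c_in * c_out * h * w


def depthwise_separable_cost(k, c_in, c_out, h, w):
    """Multiply-adds for a depthwise convolution followed by a 1 x 1 pointwise one."""
    depthwise = k * k * c_in * h * w   # one k x k filter per input channel
    pointwise = c_in * c_out * h * w   # 1 x 1 convolution that combines channels
    return depthwise + pointwise


# Hypothetical layer: 3 x 3 kernel, 64 -> 128 channels, 56 x 56 feature map
k, c_in, c_out, h, w = 3, 64, 128, 56, 56
std = standard_conv_cost(k, c_in, c_out, h, w)
sep = depthwise_separable_cost(k, c_in, c_out, h, w)
ratio = sep / std  # algebraically equal to 1/c_out + 1/k**2

print("standard:", std)
print("depthwise separable:", sep)
print("cost ratio: {:.3f}".format(ratio))
```

With a 3 × 3 kernel the ratio approaches 1/9 as the number of output channels grows, which is where most of the reduction comes from.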
cv2.dnn.readNetFromCaffe
cv2.dnn.readNetFromTensorFlow
cv2.dnn.readNetFromTorch
cv2.dnn.readTorchBlob
blob = cv2.dnn.blobFromImage(imResizeBlob, 0.007843, (300, 300), 127.5)
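To see what the arguments in this call do, here is a rough NumPy-only sketch of the same preprocessing. It assumes the image is already 300 × 300, and it skips the resizing, cropping and channel-swapping options of the real function. The helper name blob_from_image_sketch is hypothetical and is not part of OpenCV. Note that 0.007843 is approximately 1/127.5, so subtracting 127.5 and then scaling maps pixel values from 0..255 to roughly -1..1.

```python
import numpy as np


def blob_from_image_sketch(image, scalefactor=0.007843, mean=127.5):
    """Approximate cv2.dnn.blobFromImage for an already-resized H x W x C image."""
    blob = (image.astype(np.float32) - mean) * scalefactor  # subtract mean, then scale
    blob = blob.transpose(2, 0, 1)                          # H x W x C -> C x H x W
    return blob[np.newaxis, ...]                            # add batch axis -> N x C x H x W


# Hypothetical all-white test image standing in for a resized camera frame
image = np.full((300, 300, 3), 255, dtype=np.uint8)
blob = blob_from_image_sketch(image)

print(blob.shape)
print(float(blob.min()), float(blob.max()))
```

The resulting array has the batch, channel, height, width layout that net.setInput expects.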
Block Diagram – Workflow of DNN in
OpenCV.
1. Load Model
2. Select Backend
3. Select target
4. Reading frame from camera
5. Convert to Blob
6. Forward
7. Post Process
Numpy Basic Syntax.
Numpy.array
arr = np.array([1, 2, 3, 4, 5])
Numpy.arange
for i in np.arange(0,detShape):
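The two calls above can be tried on their own. This is a minimal sketch in which det_shape is a hypothetical stand-in for the number of detections returned by the network.

```python
import numpy as np

# np.array builds an array from a Python list
arr = np.array([1, 2, 3, 4, 5])
print(arr.shape, arr.dtype)

# np.arange(start, stop) yields start, start + 1, ..., stop - 1
det_shape = 3  # hypothetical number of detections
indices = []
for i in np.arange(0, det_shape):
    indices.append(int(i))
print(indices)
```

For a plain counting loop like this one, Python's built-in range(det_shape) would work equally well.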
Practical session using Object Recognition Pre-trained Model with DNN in OpenCV
import numpy as np
import imutils
import time
import cv2
prototxt = "MobileNetSSD_deploy.prototxt.txt"
model = "MobileNetSSD_deploy.caffemodel"
confThresh = 0.2
CLASSES = ["background", "aeroplane", "bicycle", "bird", "boat",
"bottle", "bus", "car", "cat", "chair", "cow", "diningtable",
"dog", "horse", "motorbike", "person", "pottedplant", "sheep",
"sofa", "train", "tvmonitor"]
COLORS = np.random.uniform(0, 255, size=(len(CLASSES), 3))
print("Loading model...")
net = cv2.dnn.readNetFromCaffe(prototxt, model)
print("Model Loaded")
print("Starting Camera Feed...")
vs = cv2.VideoCapture(0)
time.sleep(2.0)
while True:
    ret, frame = vs.read()
    if not ret:  # stop if the camera returned no frame
        break
    frame = imutils.resize(frame, width=500)
    (h, w) = frame.shape[:2]
    # Resize to the 300x300 network input and scale pixels to roughly [-1, 1]
    imResizeBlob = cv2.resize(frame, (300, 300))
    blob = cv2.dnn.blobFromImage(imResizeBlob, 0.007843, (300, 300), 127.5)
    net.setInput(blob)
    detections = net.forward()
    detShape = detections.shape[2]
    for i in np.arange(0, detShape):
        confidence = detections[0, 0, i, 2]
        if confidence > confThresh:
            idx = int(detections[0, 0, i, 1])
            # Scale the normalised box back to frame coordinates
            box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
            (startX, startY, endX, endY) = box.astype("int")
            label = "{}: {:.2f}%".format(CLASSES[idx], confidence * 100)
            cv2.rectangle(frame, (startX, startY), (endX, endY), COLORS[idx], 2)
            # Draw the label above the box, or just inside it near the top of the frame
            if startY - 15 > 15:
                y = startY - 15
            else:
                y = startY + 15
            cv2.putText(frame, label, (startX, y),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.5, COLORS[idx], 2)
    cv2.imshow("Frame", frame)
    key = cv2.waitKey(1)
    if key == 27:  # Esc key
        break
vs.release()
cv2.destroyAllWindows()
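The post-processing step of the loop can be tried without a camera or model files. The sketch below assumes the same output layout the script indexes, an array of shape (1, 1, N, 7) whose rows hold the class id at index 1, the confidence at index 2 and the normalised box at indices 3 to 6. The function name parse_detections and the fake detection values are hypothetical.

```python
import numpy as np


def parse_detections(detections, frame_w, frame_h, conf_thresh=0.2):
    """Return (class_id, confidence, box) tuples for detections above the threshold."""
    results = []
    for i in range(detections.shape[2]):
        confidence = float(detections[0, 0, i, 2])
        if confidence <= conf_thresh:
            continue  # skip weak detections
        class_id = int(detections[0, 0, i, 1])
        # Scale the normalised box to pixel coordinates of the frame
        box = detections[0, 0, i, 3:7] * np.array([frame_w, frame_h, frame_w, frame_h])
        start_x, start_y, end_x, end_y = (int(v) for v in box.astype("int"))
        results.append((class_id, confidence, (start_x, start_y, end_x, end_y)))
    return results


# Hypothetical network output: one confident detection and one weak one
fake = np.zeros((1, 1, 2, 7), dtype=np.float32)
fake[0, 0, 0] = [0, 15, 0.9, 0.1, 0.2, 0.5, 0.8]
fake[0, 0, 1] = [0, 7, 0.1, 0.0, 0.0, 1.0, 1.0]

results = parse_detections(fake, frame_w=500, frame_h=400)
for class_id, confidence, box in results:
    print(class_id, round(confidence, 2), box)
```

Keeping this step in its own function makes it possible to check the thresholding and box scaling separately from the camera loop.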
AI News – Day 11.
Yesterday - 2020
www.pantechsolutions.net