
OpenCV CheatSheet & Comprehensive Guide
OpenCV (Open Source Computer Vision Library) is an open-source computer vision and machine
learning software library. OpenCV was built to provide a common infrastructure for computer vision
applications and to accelerate the use of machine perception in commercial products. The library contains
more than 2500 optimized algorithms, including a comprehensive set of both classic and state-of-the-art
computer vision and machine learning algorithms. These algorithms can be used for various purposes
such as detecting and recognizing faces, identifying objects, classifying human actions in videos, tracking
camera movements, tracking moving objects, extracting 3D models of objects, stitching images together
to produce a high-resolution image of an entire scene, finding similar images from an image database,
removing red-eye, following eye movements, and much more.
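Before working through the examples, it helps to confirm that OpenCV is installed and importable. A minimal check, assuming OpenCV was installed via pip install opencv-python (several sections below additionally need the contrib build, opencv-contrib-python, for modules such as cv2.face, cv2.bgsegm, and cv2.xfeatures2d):

import cv2

# Print the installed OpenCV version to confirm the binding loads
print(cv2.__version__)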

Cheat Sheet Table

cv2.imread: Reads an image from a file.
cv2.imwrite: Writes an image to a file.
cv2.imshow: Displays an image in a window.
cv2.cvtColor: Converts an image from one color space to another.
cv2.GaussianBlur: Applies a Gaussian blur to an image.
cv2.Canny: Applies the Canny edge detector to an image.
cv2.HoughLines: Detects lines in a binary image using the Hough Transform.
cv2.HoughCircles: Detects circles in a grayscale image using the Hough Transform.
cv2.findContours: Finds contours in a binary image.
cv2.drawContours: Draws contours on an image.
cv2.rectangle: Draws a rectangle on an image.
cv2.circle: Draws a circle on an image.
cv2.line: Draws a line on an image.
cv2.putText: Puts text on an image.
cv2.resize: Resizes an image to the specified dimensions.
cv2.warpAffine: Applies an affine transformation to an image.
cv2.warpPerspective: Applies a perspective transformation to an image.
cv2.getRotationMatrix2D: Computes the affine matrix for rotating an image by a specified angle.
cv2.getAffineTransform: Computes the affine transformation matrix from three pairs of points.
cv2.getPerspectiveTransform: Computes the perspective transformation matrix from four pairs of points.
cv2.dilate: Applies the dilation operation to an image.
cv2.erode: Applies the erosion operation to an image.
cv2.morphologyEx: Applies advanced morphological transformations to an image.
cv2.threshold: Applies a fixed-level threshold to an image.
cv2.adaptiveThreshold: Applies an adaptive threshold to an image.
cv2.equalizeHist: Equalizes the histogram of a grayscale image.
cv2.calcHist: Computes the histogram of an image.
cv2.compareHist: Compares two histograms.
cv2.matchTemplate: Compares a template image with a source image using a specific method.
cv2.VideoCapture: Opens a video file or a capturing device.
cv2.VideoWriter: Writes video frames to a video file.
cv2.calcOpticalFlowPyrLK: Computes sparse optical flow for a set of feature points using the iterative Lucas-Kanade method.
cv2.calcOpticalFlowFarneback: Computes dense optical flow using the Farneback method.
cv2.dnn.readNet: Reads a deep learning network model from a file.
cv2.dnn.blobFromImage: Converts an image to a blob for input into a deep learning network.
cv2.dnn_Net.forward: Runs a forward pass of the deep learning network.
cv2.CascadeClassifier: Detects objects using a cascade classifier.
cv2.face.LBPHFaceRecognizer_create: Creates an LBPH face recognizer.
cv2.face.FisherFaceRecognizer_create: Creates a Fisher face recognizer.
cv2.face.EigenFaceRecognizer_create: Creates an Eigen face recognizer.
cv2.bgsegm.createBackgroundSubtractorMOG: Creates a MOG background subtractor.
cv2.createBackgroundSubtractorMOG2: Creates a MOG2 background subtractor.
cv2.createBackgroundSubtractorKNN: Creates a KNN background subtractor.
cv2.bgsegm.createBackgroundSubtractorGMG: Creates a GMG background subtractor.
cv2.FastFeatureDetector_create: Creates a FAST feature detector.
cv2.ORB_create: Creates an ORB feature detector and descriptor extractor.
cv2.SIFT_create: Creates a SIFT feature detector and descriptor extractor.
cv2.xfeatures2d.SURF_create: Creates a SURF feature detector and descriptor extractor (contrib build only).
cv2.BRISK_create: Creates a BRISK feature detector and descriptor extractor.
cv2.drawKeypoints: Draws keypoints on an image.

Explanation and Usage

1. cv2.imread
Reads an image from a file.

Usage:

import cv2

# Read an image
image = cv2.imread('path/to/image.jpg')

# Display the image


cv2.imshow('Image', image)
cv2.waitKey(0)
cv2.destroyAllWindows()

2. cv2.imwrite
Writes an image to a file.

Usage:

import cv2

# Read an image
image = cv2.imread('path/to/image.jpg')

# Write the image to a new file


cv2.imwrite('path/to/new_image.jpg', image)

3. cv2.imshow
Displays an image in a window.

Usage:
import cv2

# Read an image
image = cv2.imread('path/to/image.jpg')

# Display the image


cv2.imshow('Image', image)
cv2.waitKey(0)
cv2.destroyAllWindows()

4. cv2.cvtColor
Converts an image from one color space to another.

Usage:

import cv2

# Read an image
image = cv2.imread('path/to/image.jpg')

# Convert the image to grayscale


gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Display the grayscale image


cv2.imshow('Gray Image', gray_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

5. cv2.GaussianBlur
Applies a Gaussian blur to an image.

Usage:
import cv2

# Read an image
image = cv2.imread('path/to/image.jpg')

# Apply Gaussian blur


blurred_image = cv2.GaussianBlur(image, (15, 15), 0)

# Display the blurred image


cv2.imshow('Blurred Image', blurred_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

6. cv2.Canny
Applies the Canny edge detector to an image.

Usage:

import cv2

# Read an image
image = cv2.imread('path/to/image.jpg', cv2.IMREAD_GRAYSCALE)

# Apply Canny edge detector


edges = cv2.Canny(image, 100, 200)

# Display the edges


cv2.imshow('Edges', edges)
cv2.waitKey(0)
cv2.destroyAllWindows()

7. cv2.HoughLines
Detects lines in a binary image using the Hough Transform.

Usage:
import cv2
import numpy as np

# Read an image
image = cv2.imread('path/to/image.jpg', cv2.IMREAD_GRAYSCALE)

# Apply Canny edge detector


edges = cv2.Canny(image, 50, 150, apertureSize=3)

# Detect lines using Hough Transform


lines = cv2.HoughLines(edges, 1, np.pi/180, 200)

# Draw the lines on the image
for line in lines:
    rho, theta = line[0]
    a = np.cos(theta)
    b = np.sin(theta)
    x0 = a * rho
    y0 = b * rho
    x1 = int(x0 + 1000 * (-b))
    y1 = int(y0 + 1000 * (a))
    x2 = int(x0 - 1000 * (-b))
    y2 = int(y0 - 1000 * (a))
    cv2.line(image, (x1, y1), (x2, y2), (0, 0, 255), 2)

# Display the result


cv2.imshow('Hough Lines', image)
cv2.waitKey(0)
cv2.destroyAllWindows()
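Note: the probabilistic variant, cv2.HoughLinesP, returns finite line segments as (x1, y1, x2, y2) pairs, which are simpler to draw; Use Case 8 below uses it for lane detection.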

8. cv2.HoughCircles
Detects circles in a binary image using the Hough Transform.

Usage:
import cv2
import numpy as np

# Read an image
image = cv2.imread('path/to/image.jpg', cv2.IMREAD_GRAYSCALE)

# Apply Gaussian blur


blurred_image = cv2.GaussianBlur(image, (9, 9), 2)

# Detect circles using Hough Transform


circles = cv2.HoughCircles(blurred_image, cv2.HOUGH_GRADIENT, 1, 20,
param1=50, param2=30, minRadius=0, maxRadius=0)

# Convert the circles to integers


circles = np.uint16(np.around(circles))

# Draw the circles on the image
for i in circles[0, :]:
    cv2.circle(image, (i[0], i[1]), i[2], (0, 255, 0), 2)
    cv2.circle(image, (i[0], i[1]), 2, (0, 0, 255), 3)

# Display the result


cv2.imshow('Hough Circles', image)
cv2.waitKey(0)
cv2.destroyAllWindows()

9. cv2.findContours
Finds contours in a binary image.

Usage:
import cv2

# Read an image
image = cv2.imread('path/to/image.jpg', cv2.IMREAD_GRAYSCALE)

# Apply threshold to get a binary image


_, binary_image = cv2.threshold(image, 127, 255, cv2.THRESH_BINARY)

# Find contours
contours, hierarchy = cv2.findContours(binary_image, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

# Draw the contours on the image


cv2.drawContours(image, contours, -1, (0, 255, 0), 3)

# Display the result


cv2.imshow('Contours', image)
cv2.waitKey(0)
cv2.destroyAllWindows()

10. cv2.drawContours
Draws contours on an image.

Usage:

import cv2

# Read an image
image = cv2.imread('path/to/image.jpg', cv2.IMREAD_GRAYSCALE)

# Apply threshold to get a binary image


_, binary_image = cv2.threshold(image, 127, 255, cv2.THRESH_BINARY)

# Find contours
contours, hierarchy = cv2.findContours(binary_image, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

# Draw the contours on the image


cv2.drawContours(image, contours, -1, (0, 255, 0), 3)

# Display the result


cv2.imshow('Drawn Contours', image)
cv2.waitKey(0)
cv2.destroyAllWindows()
11. cv2.rectangle
Draws a rectangle on an image.

Usage:

import cv2

# Read an image
image = cv2.imread('path/to/image.jpg')

# Draw a rectangle on the image


cv2.rectangle(image, (50, 50), (200, 200), (255, 0, 0), 3)

# Display the result


cv2.imshow('Rectangle', image)
cv2.waitKey(0)
cv2.destroyAllWindows()

12. cv2.circle
Draws a circle on an image.

Usage:

import cv2

# Read an image
image = cv2.imread('path/to/image.jpg')

# Draw a circle on the image


cv2.circle(image, (150, 150), 50, (0, 255, 0), 3)

# Display the result


cv2.imshow('Circle', image)
cv2.waitKey(0)
cv2.destroyAllWindows()

13. cv2.line
Draws a line on an image.

Usage:
import cv2

# Read an image
image = cv2.imread('path/to/image.jpg')

# Draw a line on the image


cv2.line(image, (100, 100), (300, 300), (0, 0, 255), 3)

# Display the result


cv2.imshow('Line', image)
cv2.waitKey(0)
cv2.destroyAllWindows()

14. cv2.putText
Puts text on an image.

Usage:

import cv2

# Read an image
image = cv2.imread('path/to/image.jpg')

# Put text on the image


cv2.putText(image, 'OpenCV', (50, 50), cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2, cv2.LINE_AA)

# Display the result


cv2.imshow('Text', image)
cv2.waitKey(0)
cv2.destroyAllWindows()

15. cv2.resize
Resizes an image to the specified dimensions.

Usage:
import cv2

# Read an image
image = cv2.imread('path/to/image.jpg')

# Resize the image


resized_image = cv2.resize(image, (400, 300))

# Display the result


cv2.imshow('Resized Image', resized_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

16. cv2.warpAffine
Applies an affine transformation to an image.

Usage:

import cv2
import numpy as np

# Read an image
image = cv2.imread('path/to/image.jpg')

# Define the transformation matrix


M = np.float32([[1, 0, 100], [0, 1, 50]])

# Apply affine transformation


transformed_image = cv2.warpAffine(image, M, (image.shape[1], image.shape[0]))

# Display the result


cv2.imshow('Affine Transformation', transformed_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

17. cv2.warpPerspective
Applies a perspective transformation to an image.

Usage:
import cv2
import numpy as np

# Read an image
image = cv2.imread('path/to/image.jpg')

# Define the points for perspective transformation


pts1 = np.float32([[50, 50], [200, 50], [50, 200], [200, 200]])
pts2 = np.float32([[10, 100], [200, 50], [100, 250], [300, 200]])

# Compute the perspective transformation matrix


M = cv2.getPerspectiveTransform(pts1, pts2)

# Apply perspective transformation


transformed_image = cv2.warpPerspective(image, M, (image.shape[1], image.shape[0]))

# Display the result


cv2.imshow('Perspective Transformation', transformed_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

18. cv2.getRotationMatrix2D
Computes the affine matrix for rotating an image by a specified angle.

Usage:

import cv2

# Read an image
image = cv2.imread('path/to/image.jpg')

# Compute the rotation matrix


M = cv2.getRotationMatrix2D((image.shape[1]/2, image.shape[0]/2), 45, 1)

# Apply rotation
rotated_image = cv2.warpAffine(image, M, (image.shape[1], image.shape[0]))

# Display the result


cv2.imshow('Rotated Image', rotated_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
19. cv2.getAffineTransform
Computes the affine transformation matrix from three pairs of points.

Usage:

import cv2
import numpy as np

# Read an image
image = cv2.imread('path/to/image.jpg')

# Define the points for affine transformation


pts1 = np.float32([[50, 50], [200, 50], [50, 200]])
pts2 = np.float32([[10, 100], [200, 50], [100, 250]])

# Compute the affine transformation matrix


M = cv2.getAffineTransform(pts1, pts2)

# Apply affine transformation


transformed_image = cv2.warpAffine(image, M, (image.shape[1], image.shape[0]))

# Display the result


cv2.imshow('Affine Transformation', transformed_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

20. cv2.getPerspectiveTransform
Computes the perspective transformation matrix from four pairs of points.

Usage:
import cv2
import numpy as np

# Read an image
image = cv2.imread('path/to/image.jpg')

# Define the points for perspective transformation


pts1 = np.float32([[50, 50], [200, 50], [50, 200], [200, 200]])
pts2 = np.float32([[10, 100], [200, 50], [100, 250], [300, 200]])

# Compute the perspective transformation matrix


M = cv2.getPerspectiveTransform(pts1, pts2)

# Apply perspective transformation


transformed_image = cv2.warpPerspective(image, M, (image.shape[1], image.shape[0]))

# Display the result


cv2.imshow('Perspective Transformation', transformed_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

21. cv2.dilate
Applies the dilation operation to an image.

Usage:

import cv2
import numpy as np

# Read an image
image = cv2.imread('path/to/image.jpg', cv2.IMREAD_GRAYSCALE)

# Define the kernel


kernel = np.ones((5, 5), np.uint8)

# Apply dilation
dilated_image = cv2.dilate(image, kernel, iterations=1)

# Display the result


cv2.imshow('Dilated Image', dilated_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
22. cv2.erode
Applies the erosion operation to an image.

Usage:

import cv2
import numpy as np

# Read an image
image = cv2.imread('path/to/image.jpg', cv2.IMREAD_GRAYSCALE)

# Define the kernel


kernel = np.ones((5, 5), np.uint8)

# Apply erosion
eroded_image = cv2.erode(image, kernel, iterations=1)

# Display the result


cv2.imshow('Eroded Image', eroded_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

23. cv2.morphologyEx
Applies advanced morphological transformations to an image.

Usage:

import cv2
import numpy as np

# Read an image
image = cv2.imread('path/to/image.jpg', cv2.IMREAD_GRAYSCALE)

# Define the kernel


kernel = np.ones((5, 5), np.uint8)

# Apply morphological transformations


morph_image = cv2.morphologyEx(image, cv2.MORPH_OPEN, kernel)

# Display the result


cv2.imshow('Morphological Transformations', morph_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
24. cv2.threshold
Applies a fixed-level threshold to an image.

Usage:

import cv2

# Read an image
image = cv2.imread('path/to/image.jpg', cv2.IMREAD_GRAYSCALE)

# Apply threshold
_, thresh_image = cv2.threshold(image, 127, 255, cv2.THRESH_BINARY)

# Display the result


cv2.imshow('Threshold Image', thresh_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
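cv2.threshold can also choose the threshold automatically using Otsu's method, which works well for bimodal images; a minimal sketch combining the standard flags:

import cv2

# Read a grayscale image and let Otsu's method pick the threshold
image = cv2.imread('path/to/image.jpg', cv2.IMREAD_GRAYSCALE)
ret, otsu_image = cv2.threshold(image, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
print('Otsu threshold:', ret)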

25. cv2.adaptiveThreshold
Applies an adaptive threshold to an image.

Usage:

import cv2

# Read an image
image = cv2.imread('path/to/image.jpg', cv2.IMREAD_GRAYSCALE)

# Apply adaptive threshold


adaptive_thresh_image = cv2.adaptiveThreshold(image, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
cv2.THRESH_BINARY, 11, 2)

# Display the result


cv2.imshow('Adaptive Threshold Image', adaptive_thresh_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

26. cv2.equalizeHist
Equalizes the histogram of a grayscale image.

Usage:
import cv2

# Read an image
image = cv2.imread('path/to/image.jpg', cv2.IMREAD_GRAYSCALE)

# Equalize histogram
equalized_image = cv2.equalizeHist(image)

# Display the result


cv2.imshow('Equalized Image', equalized_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

27. cv2.calcHist
Computes the histogram of an image.

Usage:

import cv2
import matplotlib.pyplot as plt

# Read an image
image = cv2.imread('path/to/image.jpg', cv2.IMREAD_GRAYSCALE)

# Compute histogram
hist = cv2.calcHist([image], [0], None, [256], [0, 256])

# Plot histogram
plt.plot(hist)
plt.show()
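cv2.calcHist also works per channel on color images; a small sketch plotting one histogram for each of the B, G, and R channels:

import cv2
import matplotlib.pyplot as plt

# Read a color (BGR) image and plot a histogram per channel
image = cv2.imread('path/to/image.jpg')
for i, col in enumerate(('b', 'g', 'r')):
    hist = cv2.calcHist([image], [i], None, [256], [0, 256])
    plt.plot(hist, color=col)
plt.show()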

28. cv2.compareHist
Compares two histograms.

Usage:
import cv2

# Read two images


image1 = cv2.imread('path/to/image1.jpg', cv2.IMREAD_GRAYSCALE)
image2 = cv2.imread('path/to/image2.jpg', cv2.IMREAD_GRAYSCALE)

# Compute histograms
hist1 = cv2.calcHist([image1], [0], None, [256], [0, 256])
hist2 = cv2.calcHist([image2], [0], None, [256], [0, 256])

# Compare histograms
comparison = cv2.compareHist(hist1, hist2, cv2.HISTCMP_CORREL)

print('Histogram Comparison Result:', comparison)

29. cv2.matchTemplate
Compares a template image with a source image using a specific method.

Usage:

import cv2

# Read the source image


source_image = cv2.imread('path/to/source_image.jpg')

# Read the template image


template_image = cv2.imread('path/to/template_image.jpg')

# Perform template matching


result = cv2.matchTemplate(source_image, template_image, cv2.TM_CCOEFF_NORMED)

# Get the location of the best match


min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(result)

# Draw a rectangle around the matched region


top_left = max_loc
bottom_right = (top_left[0] + template_image.shape[1], top_left[1] + template_image.shape[0])
cv2.rectangle(source_image, top_left, bottom_right, (0, 255, 0), 2)

# Display the result


cv2.imshow('Matched Template', source_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
30. cv2.VideoCapture
Opens a video file or a capturing device.

Usage:

import cv2

# Open a video file


cap = cv2.VideoCapture('path/to/video.mp4')

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    cv2.imshow('Video', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
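To capture from a camera instead of a file, pass a device index rather than a path, for example cap = cv2.VideoCapture(0) for the default webcam; the real-time use cases later in this guide rely on this.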

31. cv2.VideoWriter
Writes video frames to a video file.

Usage:
import cv2

# Define the codec and create VideoWriter object


fourcc = cv2.VideoWriter_fourcc(*'XVID')
out = cv2.VideoWriter('output.avi', fourcc, 20.0, (640, 480))

# Open a video file


cap = cv2.VideoCapture('path/to/video.mp4')

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    out.write(frame)
    cv2.imshow('Video', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
out.release()
cv2.destroyAllWindows()

32. cv2.calcOpticalFlowPyrLK
Computes sparse optical flow for a set of feature points using the iterative Lucas-Kanade method with pyramids.

Usage:
import cv2
import numpy as np

# Read the first frame


cap = cv2.VideoCapture('path/to/video.mp4')
ret, old_frame = cap.read()
old_gray = cv2.cvtColor(old_frame, cv2.COLOR_BGR2GRAY)

# Parameters for Lucas-Kanade optical flow


lk_params = dict(winSize=(15, 15), maxLevel=2,
criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 0.03))

# Detect corners to track


feature_params = dict(maxCorners=100, qualityLevel=0.3, minDistance=7, blockSize=7)
p0 = cv2.goodFeaturesToTrack(old_gray, mask=None, **feature_params)

# Create a mask image for drawing purposes


mask = np.zeros_like(old_frame)

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    frame_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Calculate optical flow
    p1, st, err = cv2.calcOpticalFlowPyrLK(old_gray, frame_gray, p0, None, **lk_params)

    # Select good points
    good_new = p1[st == 1]
    good_old = p0[st == 1]

    # Draw the tracks (coordinates are floats, so cast to int for drawing)
    for i, (new, old) in enumerate(zip(good_new, good_old)):
        a, b = new.ravel()
        c, d = old.ravel()
        mask = cv2.line(mask, (int(a), int(b)), (int(c), int(d)), (0, 255, 0), 2)
        frame = cv2.circle(frame, (int(a), int(b)), 5, (0, 255, 0), -1)
    img = cv2.add(frame, mask)

    cv2.imshow('Optical Flow', img)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

    # Update the previous frame and previous points
    old_gray = frame_gray.copy()
    p0 = good_new.reshape(-1, 1, 2)

cap.release()
cv2.destroyAllWindows()

33. cv2.calcOpticalFlowFarneback
Computes dense optical flow using the Farneback method.

Usage:

import cv2
import numpy as np

# Open a video file


cap = cv2.VideoCapture('path/to/video.mp4')

ret, frame1 = cap.read()


prvs = cv2.cvtColor(frame1, cv2.COLOR_BGR2GRAY)

while cap.isOpened():
    ret, frame2 = cap.read()
    if not ret:
        break
    next = cv2.cvtColor(frame2, cv2.COLOR_BGR2GRAY)

    # Calculate optical flow
    flow = cv2.calcOpticalFlowFarneback(prvs, next, None, 0.5, 3, 15, 3, 5, 1.2, 0)

    # Convert the flow to HSV color space for visualization
    hsv = np.zeros_like(frame1)
    hsv[..., 1] = 255
    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    hsv[..., 0] = ang * 180 / np.pi / 2
    hsv[..., 2] = cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX)
    rgb = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)

    cv2.imshow('Optical Flow', rgb)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

    prvs = next

cap.release()
cv2.destroyAllWindows()
34. cv2.dnn.readNet
Reads a deep learning network model from a file.

Usage:

import cv2

# Load a pre-trained model


net = cv2.dnn.readNet('path/to/model.caffemodel', 'path/to/model.prototxt')

# Read an image
image = cv2.imread('path/to/image.jpg')

# Prepare the image for the model


blob = cv2.dnn.blobFromImage(image, scalefactor=1.0, size=(224, 224), mean=(104, 117, 123))

# Set the input to the model


net.setInput(blob)

# Perform forward pass


output = net.forward()

print(output)

35. cv2.dnn.blobFromImage
Converts an image to a blob for input into a deep learning network.

Usage:

import cv2

# Read an image
image = cv2.imread('path/to/image.jpg')

# Convert the image to a blob


blob = cv2.dnn.blobFromImage(image, scalefactor=1.0, size=(224, 224), mean=(104, 117, 123))

print(blob.shape)

36. cv2.dnn_Net.forward
Runs a forward pass of the deep learning network.

Usage:
import cv2

# Load a pre-trained model


net = cv2.dnn.readNet('path/to/model.caffemodel', 'path/to/model.prototxt')

# Read an image
image = cv2.imread('path/to/image.jpg')

# Prepare the image for the model


blob = cv2.dnn.blobFromImage(image, scalefactor=1.0, size=(224, 224), mean=(104, 117, 123))

# Set the input to the model


net.setInput(blob)

# Perform forward pass


output = net.forward()

print(output)

37. cv2.CascadeClassifier
Detects objects using a cascade classifier.

Usage:

import cv2

# Load the cascade classifier


face_cascade = cv2.CascadeClassifier('path/to/haarcascade_frontalface_default.xml')

# Read an image
image = cv2.imread('path/to/image.jpg')
gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Detect faces
faces = face_cascade.detectMultiScale(gray_image, scaleFactor=1.1, minNeighbors=5)

# Draw rectangles around the faces
for (x, y, w, h) in faces:
    cv2.rectangle(image, (x, y), (x+w, y+h), (255, 0, 0), 2)

# Display the result


cv2.imshow('Detected Faces', image)
cv2.waitKey(0)
cv2.destroyAllWindows()
38. cv2.face.LBPHFaceRecognizer_create
Creates an LBPH face recognizer.

Usage:

import cv2

# Create an LBPH face recognizer


recognizer = cv2.face.LBPHFaceRecognizer_create()

# Train the recognizer with training data


# recognizer.train(training_images, training_labels)

# Save the trained model


# recognizer.save('lbph_model.yml')

# Load the trained model


recognizer.read('lbph_model.yml')

# Recognize faces
# label, confidence = recognizer.predict(test_image)
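The train and save calls above are commented out because they need labeled data. A minimal sketch of preparing that data, assuming a folder named faces/ containing grayscale face crops whose filenames start with an integer label (e.g. 3_front.jpg); the folder layout and naming scheme are illustrative, not part of OpenCV:

import os

import cv2
import numpy as np

training_images = []
training_labels = []

# Collect grayscale face crops; the integer label is encoded in the filename
for filename in os.listdir('faces'):
    if not filename.endswith('.jpg'):
        continue
    label = int(filename.split('_')[0])
    img = cv2.imread(os.path.join('faces', filename), cv2.IMREAD_GRAYSCALE)
    training_images.append(img)
    training_labels.append(label)

recognizer = cv2.face.LBPHFaceRecognizer_create()
recognizer.train(training_images, np.array(training_labels))
recognizer.save('lbph_model.yml')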

39. cv2.face.FisherFaceRecognizer_create
Creates a Fisher face recognizer.

Usage:

import cv2

# Create a Fisher face recognizer


recognizer = cv2.face.FisherFaceRecognizer_create()

# Train the recognizer with training data


# recognizer.train(training_images, training_labels)

# Save the trained model


# recognizer.save('fisher_model.yml')

# Load the trained model


recognizer.read('fisher_model.yml')

# Recognize faces
# label, confidence = recognizer.predict(test_image)
40. cv2.face.EigenFaceRecognizer_create
Creates an Eigen face recognizer.

Usage:

import cv2

# Create an Eigen face recognizer


recognizer = cv2.face.EigenFaceRecognizer_create()

# Train the recognizer with training data


# recognizer.train(training_images, training_labels)

# Save the trained model


# recognizer.save('eigen_model.yml')

# Load the trained model


recognizer.read('eigen_model.yml')

# Recognize faces
# label, confidence = recognizer.predict(test_image)

41. cv2.bgsegm.createBackgroundSubtractorMOG
Creates a MOG background subtractor. The cv2.bgsegm module ships with the opencv-contrib build.

Usage:
import cv2

# Create a MOG background subtractor


background_subtractor = cv2.bgsegm.createBackgroundSubtractorMOG()

# Open a video file


cap = cv2.VideoCapture('path/to/video.mp4')

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    fg_mask = background_subtractor.apply(frame)

    cv2.imshow('Foreground Mask', fg_mask)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

42. cv2.createBackgroundSubtractorMOG2
Creates a MOG2 background subtractor.

Usage:
import cv2

# Create a MOG2 background subtractor


background_subtractor = cv2.createBackgroundSubtractorMOG2()

# Open a video file


cap = cv2.VideoCapture('path/to/video.mp4')

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    fg_mask = background_subtractor.apply(frame)

    cv2.imshow('Foreground Mask', fg_mask)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

43. cv2.createBackgroundSubtractorKNN
Creates a KNN background subtractor.

Usage:
import cv2

# Create a KNN background subtractor


background_subtractor = cv2.createBackgroundSubtractorKNN()

# Open a video file


cap = cv2.VideoCapture('path/to/video.mp4')

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    fg_mask = background_subtractor.apply(frame)

    cv2.imshow('Foreground Mask', fg_mask)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

44. cv2.bgsegm.createBackgroundSubtractorGMG
Creates a GMG background subtractor. The cv2.bgsegm module ships with the opencv-contrib build.

Usage:
import cv2

# Create a GMG background subtractor


background_subtractor = cv2.bgsegm.createBackgroundSubtractorGMG()

# Open a video file


cap = cv2.VideoCapture('path/to/video.mp4')

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    fg_mask = background_subtractor.apply(frame)

    cv2.imshow('Foreground Mask', fg_mask)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

45. cv2.FastFeatureDetector_create
Creates a FAST feature detector.

Usage:
import cv2

# Create a FAST feature detector


fast = cv2.FastFeatureDetector_create()

# Read an image
image = cv2.imread('path/to/image.jpg', cv2.IMREAD_GRAYSCALE)

# Detect keypoints
keypoints = fast.detect(image, None)

# Draw keypoints
image_with_keypoints = cv2.drawKeypoints(image, keypoints, None, color=(255, 0, 0))

# Display the result


cv2.imshow('FAST Keypoints', image_with_keypoints)
cv2.waitKey(0)
cv2.destroyAllWindows()

46. cv2.ORB_create
Creates an ORB feature detector and descriptor extractor.

Usage:

import cv2

# Create an ORB detector


orb = cv2.ORB_create()

# Read an image
image = cv2.imread('path/to/image.jpg', cv2.IMREAD_GRAYSCALE)

# Detect keypoints and descriptors


keypoints, descriptors = orb.detectAndCompute(image, None)

# Draw keypoints
image_with_keypoints = cv2.drawKeypoints(image, keypoints, None, color=(255, 0, 0))

# Display the result


cv2.imshow('ORB Keypoints', image_with_keypoints)
cv2.waitKey(0)
cv2.destroyAllWindows()
47. cv2.SIFT_create
Creates a SIFT feature detector and descriptor extractor.

Usage:

import cv2

# Create a SIFT detector


sift = cv2.SIFT_create()

# Read an image
image = cv2.imread('path/to/image.jpg', cv2.IMREAD_GRAYSCALE)

# Detect keypoints and descriptors


keypoints, descriptors = sift.detectAndCompute(image, None)

# Draw keypoints
image_with_keypoints = cv2.drawKeypoints(image, keypoints, None, color=(255, 0, 0))

# Display the result


cv2.imshow('SIFT Keypoints', image_with_keypoints)
cv2.waitKey(0)
cv2.destroyAllWindows()

48. cv2.xfeatures2d.SURF_create
Creates a SURF feature detector and descriptor extractor. SURF is patented and is only available in the opencv-contrib build.

Usage:
import cv2

# Create a SURF detector (requires opencv-contrib-python with nonfree modules enabled)
surf = cv2.xfeatures2d.SURF_create()

# Read an image
image = cv2.imread('path/to/image.jpg', cv2.IMREAD_GRAYSCALE)

# Detect keypoints and descriptors


keypoints, descriptors = surf.detectAndCompute(image, None)

# Draw keypoints
image_with_keypoints = cv2.drawKeypoints(image, keypoints, None, color=(255, 0, 0))

# Display the result


cv2.imshow('SURF Keypoints', image_with_keypoints)
cv2.waitKey(0)
cv2.destroyAllWindows()

49. cv2.BRISK_create
Creates a BRISK feature detector and descriptor extractor.

Usage:

import cv2

# Create a BRISK detector


brisk = cv2.BRISK_create()

# Read an image
image = cv2.imread('path/to/image.jpg', cv2.IMREAD_GRAYSCALE)

# Detect keypoints and descriptors


keypoints, descriptors = brisk.detectAndCompute(image, None)

# Draw keypoints
image_with_keypoints = cv2.drawKeypoints(image, keypoints, None, color=(255, 0, 0))

# Display the result


cv2.imshow('BRISK Keypoints', image_with_keypoints)
cv2.waitKey(0)
cv2.destroyAllWindows()
50. cv2.drawKeypoints
Draws keypoints on an image.

Usage:

import cv2

# Read an image
image = cv2.imread('path/to/image.jpg', cv2.IMREAD_GRAYSCALE)

# Create a feature detector


orb = cv2.ORB_create()

# Detect keypoints
keypoints = orb.detect(image, None)

# Draw keypoints
image_with_keypoints = cv2.drawKeypoints(image, keypoints, None, color=(255, 0, 0))

# Display the result


cv2.imshow('Keypoints', image_with_keypoints)
cv2.waitKey(0)
cv2.destroyAllWindows()
Mini Computer Vision Projects

Use Case 1: Object Detection using YOLO

import cv2
import numpy as np

# Load YOLO
net = cv2.dnn.readNet("yolov3.weights", "yolov3.cfg")
layer_names = net.getLayerNames()
# getUnconnectedOutLayers returns indices; flatten() keeps this working across
# OpenCV versions that return either Nx1 arrays or flat arrays
output_layers = [layer_names[i - 1] for i in net.getUnconnectedOutLayers().flatten()]

# Load class names and pick a color per class (assumes the standard coco.names
# file that accompanies the YOLOv3 weights)
with open("coco.names") as f:
    classes = [line.strip() for line in f]
colors = np.random.uniform(0, 255, size=(len(classes), 3))

# Load image
img = cv2.imread("image.jpg")
height, width, channels = img.shape

# Detecting objects
blob = cv2.dnn.blobFromImage(img, 0.00392, (416, 416), (0, 0, 0), True, crop=False)
net.setInput(blob)
outs = net.forward(output_layers)

# Collect detections above a confidence threshold
class_ids = []
confidences = []
boxes = []
for out in outs:
    for detection in out:
        scores = detection[5:]
        class_id = np.argmax(scores)
        confidence = scores[class_id]
        if confidence > 0.5:
            # Object detected; coordinates are normalized to the image size
            center_x = int(detection[0] * width)
            center_y = int(detection[1] * height)
            w = int(detection[2] * width)
            h = int(detection[3] * height)

            # Rectangle coordinates
            x = int(center_x - w / 2)
            y = int(center_y - h / 2)

            boxes.append([x, y, w, h])
            confidences.append(float(confidence))
            class_ids.append(class_id)

# Apply non-maximum suppression to remove overlapping boxes
indexes = cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4)

font = cv2.FONT_HERSHEY_PLAIN
for i in range(len(boxes)):
    if i in indexes:
        x, y, w, h = boxes[i]
        label = str(classes[class_ids[i]])
        color = colors[class_ids[i]]
        cv2.rectangle(img, (x, y), (x + w, y + h), color, 2)
        cv2.putText(img, label, (x, y + 30), font, 3, color, 3)

cv2.imshow("Image", img)
cv2.waitKey(0)
cv2.destroyAllWindows()

Use Case 2: Semantic Segmentation using DeepLabV3

import cv2
import numpy as np

# Load DeepLabV3 model


net = cv2.dnn.readNetFromTensorflow("deeplabv3.pb")

# Load image
image = cv2.imread("image.jpg")

# Prepare the image


blob = cv2.dnn.blobFromImage(image, scalefactor=1.0, size=(513, 513), mean=(104.00698793, 116.66876762, 122.67891434))  # standard Caffe BGR channel means

# Set the input


net.setInput(blob)

# Perform forward pass


output = net.forward()

# Process the output: per-pixel class index with the highest score
output = output.squeeze().argmax(axis=0).astype(np.uint8)
output = cv2.resize(output, (image.shape[1], image.shape[0]), interpolation=cv2.INTER_NEAREST)

# Apply color map
output_colored = cv2.applyColorMap(output, cv2.COLORMAP_JET)

# Display the result


cv2.imshow("Semantic Segmentation", output_colored)
cv2.waitKey(0)
cv2.destroyAllWindows()
Use Case 3: Gesture Recognition using MediaPipe

import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands
mp_drawing = mp.solutions.drawing_utils

# Initialize MediaPipe Hands


hands = mp_hands.Hands(static_image_mode=False, max_num_hands=2, min_detection_confidence=0.5, min_tracking_confidence=0.5)

# Open video capture


cap = cv2.VideoCapture(0)

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break

    # Convert the frame to RGB
    rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    # Process the frame
    result = hands.process(rgb_frame)

    # Draw hand landmarks
    if result.multi_hand_landmarks:
        for hand_landmarks in result.multi_hand_landmarks:
            mp_drawing.draw_landmarks(frame, hand_landmarks, mp_hands.HAND_CONNECTIONS)

    # Display the result
    cv2.imshow('Hand Gesture Recognition', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
Use Case 4: Image Recognition using InceptionV3

import cv2

# Load InceptionV3 model


net = cv2.dnn.readNetFromTensorflow("inceptionv3.pb")

# Load image
image = cv2.imread("image.jpg")

# Prepare the image


blob = cv2.dnn.blobFromImage(image, scalefactor=1.0, size=(299, 299), mean=(104, 117, 123))

# Set the input


net.setInput(blob)

# Perform forward pass


output = net.forward()

# Get the predicted class


class_id = output.argmax()

# Display the result


print("Predicted Class ID:", class_id)
Use Case 5: Face Detection using Haar Cascades

import cv2

# Load the cascade classifier


face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')

# Open video capture


cap = cv2.VideoCapture(0)

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break

    # Convert the frame to grayscale
    gray_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Detect faces
    faces = face_cascade.detectMultiScale(gray_frame, scaleFactor=1.1, minNeighbors=5)

    # Draw rectangles around the faces
    for (x, y, w, h) in faces:
        cv2.rectangle(frame, (x, y), (x+w, y+h), (255, 0, 0), 2)

    # Display the result
    cv2.imshow('Face Detection', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
Use Case 6: Real-time Object Tracking using GOTURN Tracker

import cv2

# Load the GOTURN tracker (requires goturn.prototxt and goturn.caffemodel in the working directory)
tracker = cv2.TrackerGOTURN_create()

# Open video capture


cap = cv2.VideoCapture(0)
ret, frame = cap.read()

# Define the initial bounding box


bbox = cv2.selectROI(frame, False)

# Initialize the tracker


tracker.init(frame, bbox)

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break

    # Update the tracker
    success, bbox = tracker.update(frame)

    # Draw the bounding box
    if success:
        p1 = (int(bbox[0]), int(bbox[1]))
        p2 = (int(bbox[0] + bbox[2]), int(bbox[1] + bbox[3]))
        cv2.rectangle(frame, p1, p2, (0, 255, 0), 2, 1)
    else:
        cv2.putText(frame, "Tracking failure detected", (100, 80), cv2.FONT_HERSHEY_SIMPLEX, 0.75, (0, 0, 255), 2)

    # Display the result
    cv2.imshow('Object Tracking', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
Use Case 7: Background Subtraction using KNN

import cv2

# Create a KNN background subtractor


background_subtractor = cv2.createBackgroundSubtractorKNN()

# Open video capture


cap = cv2.VideoCapture(0)

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break

    # Apply background subtraction
    fg_mask = background_subtractor.apply(frame)

    # Display the result
    cv2.imshow('Background Subtraction', fg_mask)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
Use Case 8: Lane Detection in a Video

import cv2
import numpy as np

def region_of_interest(img, vertices):
    mask = np.zeros_like(img)
    cv2.fillPoly(mask, vertices, 255)
    masked = cv2.bitwise_and(img, mask)
    return masked

def draw_lines(img, lines):
    for line in lines:
        for x1, y1, x2, y2 in line:
            cv2.line(img, (x1, y1), (x2, y2), (0, 255, 0), 3)

# Open video capture


cap = cv2.VideoCapture('path/to/video.mp4')

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break

    # Convert the frame to grayscale
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Apply Gaussian blur
    blur = cv2.GaussianBlur(gray, (5, 5), 0)

    # Apply Canny edge detector
    edges = cv2.Canny(blur, 50, 150)

    # Define a triangular region of interest covering the road ahead
    height, width = frame.shape[:2]
    roi_vertices = [(0, height), (width / 2, height / 2), (width, height)]
    roi = region_of_interest(edges, np.array([roi_vertices], np.int32))

    # Detect line segments using the probabilistic Hough Transform
    lines = cv2.HoughLinesP(roi, 1, np.pi / 180, 100, minLineLength=40, maxLineGap=5)

    # Draw the lines on the frame
    if lines is not None:
        draw_lines(frame, lines)

    # Display the result
    cv2.imshow('Lane Detection', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
Use Case 9: Real-time Face Recognition using LBPH

import cv2

# Load the LBPH face recognizer


recognizer = cv2.face.LBPHFaceRecognizer_create()
recognizer.read('lbph_model.yml')

# Load the cascade classifier


face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')

# Open video capture


cap = cv2.VideoCapture(0)

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break

    # Convert the frame to grayscale
    gray_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Detect faces
    faces = face_cascade.detectMultiScale(gray_frame, scaleFactor=1.1, minNeighbors=5)

    for (x, y, w, h) in faces:
        # Recognize the face
        label, confidence = recognizer.predict(gray_frame[y:y+h, x:x+w])

        # Draw a rectangle around the face and annotate it
        cv2.rectangle(frame, (x, y), (x+w, y+h), (255, 0, 0), 2)
        cv2.putText(frame, f'ID: {label}, Confidence: {confidence}', (x, y-10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 0, 0), 2)

    # Display the result
    cv2.imshow('Face Recognition', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
Use Case 10: Real-time Emotion Detection using FER2013

import cv2
import numpy as np
from keras.models import load_model

# Load the pre-trained emotion detection model


emotion_model = load_model('fer2013_model.h5')
emotion_labels = ['Angry', 'Disgust', 'Fear', 'Happy', 'Sad', 'Surprise', 'Neutral']

# Load the cascade classifier


face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')

# Open video capture


cap = cv2.VideoCapture(0)

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break

    # Convert the frame to grayscale
    gray_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Detect faces
    faces = face_cascade.detectMultiScale(gray_frame, scaleFactor=1.1, minNeighbors=5)

    for (x, y, w, h) in faces:
        # Extract the face region of interest and prepare it for the model
        roi_gray = gray_frame[y:y+h, x:x+w]
        roi_gray = cv2.resize(roi_gray, (48, 48))
        roi_gray = roi_gray / 255.0
        roi_gray = np.expand_dims(roi_gray, axis=0)
        roi_gray = np.expand_dims(roi_gray, axis=-1)

        # Predict the emotion
        emotion_prediction = emotion_model.predict(roi_gray)
        max_index = np.argmax(emotion_prediction)
        emotion = emotion_labels[max_index]

        # Draw a rectangle around the face and label it with the emotion
        cv2.rectangle(frame, (x, y), (x+w, y+h), (255, 0, 0), 2)
        cv2.putText(frame, emotion, (x, y-10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 0, 0), 2)

    # Display the result
    cv2.imshow('Emotion Detection', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

Use Case 11: Road Sign Detection using HOG and SVM

import cv2

# Load pre-trained HOG + SVM model for road sign detection


hog = cv2.HOGDescriptor()
svm = cv2.ml.SVM_load('road_sign_svm_model.yml')

# Load image
image = cv2.imread('road_sign.jpg')

# Convert image to grayscale


gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Detect road signs using HOG + SVM (hog.compute returns the feature vector directly)
hog_features = hog.compute(gray)
result = svm.predict(hog_features.reshape(1, -1))

# Display result
if result[1][0] == 1:
    cv2.putText(image, 'Road Sign Detected', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)

cv2.imshow('Road Sign Detection', image)


cv2.waitKey(0)
cv2.destroyAllWindows()
Use Case 12: Person Re-identification using Deep Learning

import cv2
import numpy as np

# Load pre-trained deep learning model for person re-identification


net = cv2.dnn.readNet('person_reid_model.onnx')

# Load images of the person to be re-identified


image1 = cv2.imread('person1.jpg')
image2 = cv2.imread('person2.jpg')

# Prepare images for the model


blob1 = cv2.dnn.blobFromImage(image1, scalefactor=1.0, size=(128, 256), mean=(0, 0, 0), swapRB=True, crop=False)
blob2 = cv2.dnn.blobFromImage(image2, scalefactor=1.0, size=(128, 256), mean=(0, 0, 0), swapRB=True, crop=False)

# Set the inputs


net.setInput(blob1)
output1 = net.forward()

net.setInput(blob2)
output2 = net.forward()

# Compute the cosine similarity between the two feature vectors


similarity = np.dot(output1, output2.T) / (np.linalg.norm(output1) * np.linalg.norm(output2))

# Display the result


print('Similarity:', similarity)
Use Case 13: Scene Text Detection using EAST Detector

import cv2
import numpy as np
from imutils.object_detection import non_max_suppression  # pip install imutils

# Load the pre-trained EAST text detector
net = cv2.dnn.readNet('frozen_east_text_detection.pb')

# Load image (EAST expects input width and height to be multiples of 32)
image = cv2.imread('scene_text.jpg')
orig = image.copy()
(H, W) = image.shape[:2]

# Prepare the image


blob = cv2.dnn.blobFromImage(image, 1.0, (W, H), (123.68, 116.78, 103.94), swapRB=True, crop=False)
net.setInput(blob)
(scores, geometry) = net.forward(['feature_fusion/Conv_7/Sigmoid', 'feature_fusion/concat_3'])

# Decode the results


(num_rows, num_cols) = scores.shape[2:4]
rects = []
confidences = []

for y in range(0, num_rows):
    scores_data = scores[0, 0, y]
    x_data0 = geometry[0, 0, y]
    x_data1 = geometry[0, 1, y]
    x_data2 = geometry[0, 2, y]
    x_data3 = geometry[0, 3, y]
    angles_data = geometry[0, 4, y]

    for x in range(0, num_cols):
        if scores_data[x] < 0.5:
            continue

        # Each score-map cell corresponds to 4 pixels in the input image
        (offset_x, offset_y) = (x * 4.0, y * 4.0)

        angle = angles_data[x]
        cos = np.cos(angle)
        sin = np.sin(angle)
        h = x_data0[x] + x_data2[x]
        w = x_data1[x] + x_data3[x]
        end_x = int(offset_x + (cos * x_data1[x]) + (sin * x_data2[x]))
        end_y = int(offset_y - (sin * x_data1[x]) + (cos * x_data2[x]))
        start_x = int(end_x - w)
        start_y = int(end_y - h)

        rects.append((start_x, start_y, end_x, end_y))
        confidences.append(scores_data[x])

# Apply non-maxima suppression to suppress weak, overlapping bounding boxes
boxes = non_max_suppression(np.array(rects), probs=confidences)

# Draw the bounding boxes


for (start_x, start_y, end_x, end_y) in boxes:
cv2.rectangle(orig, (start_x, start_y), (end_x, end_y), (0, 255, 0), 2)

# Display the result


cv2.imshow('Text Detection', orig)
cv2.waitKey(0)
cv2.destroyAllWindows()
Use Case 14: Real-time Head Pose Estimation

import cv2
import numpy as np

# Load the pre-trained face detection model


face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')

# Load the pre-trained head pose estimation model


net = cv2.dnn.readNet('head_pose_estimation.onnx')

# Open video capture


cap = cv2.VideoCapture(0)

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break

    # Convert the frame to grayscale
    gray_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Detect faces
    faces = face_cascade.detectMultiScale(gray_frame, scaleFactor=1.1, minNeighbors=5)

    for (x, y, w, h) in faces:
        # Extract the face region of interest
        roi = frame[y:y+h, x:x+w]
        blob = cv2.dnn.blobFromImage(roi, 1.0, (64, 64), (0, 0, 0), swapRB=True, crop=False)

        # Perform head pose estimation
        net.setInput(blob)
        output = net.forward()
        yaw, pitch, roll = output[0]

        # Draw the head pose on the frame
        cv2.putText(frame, f'Yaw: {yaw:.2f}, Pitch: {pitch:.2f}, Roll: {roll:.2f}', (x, y-10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 0, 0), 2)
        cv2.rectangle(frame, (x, y), (x+w, y+h), (255, 0, 0), 2)

    # Display the result
    cv2.imshow('Head Pose Estimation', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
Use Case 15: Image Inpainting using Deep Learning

import cv2
import numpy as np

# Load the pre-trained inpainting model


net = cv2.dnn.readNet('image_inpainting.onnx')

# Load image with damaged areas


image = cv2.imread('damaged_image.jpg')
mask = cv2.imread('mask.jpg', 0)

# Prepare the image and mask for the model


blob_image = cv2.dnn.blobFromImage(image, 1.0, (512, 512), (0, 0, 0), swapRB=True, crop=False)
blob_mask = cv2.dnn.blobFromImage(mask, 1.0, (512, 512), (0, 0, 0), swapRB=True, crop=False)

# Set the inputs


net.setInput(blob_image, 'input_image')
net.setInput(blob_mask, 'input_mask')

# Perform inpainting
output = net.forward()

# Post-process the output


output = output.squeeze().transpose(1, 2, 0)
output = np.clip(output, 0, 255).astype(np.uint8)

# Display the result


cv2.imshow('Inpainted Image', output)
cv2.waitKey(0)
cv2.destroyAllWindows()
Use Case 16: Optical Character Recognition (OCR) using Tesseract

import cv2
import pytesseract

# Load image
image = cv2.imread('document.jpg')

# Convert image to grayscale


gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Apply thresholding
_, thresh = cv2.threshold(gray, 150, 255, cv2.THRESH_BINARY_INV)

# Apply dilation
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
dilated = cv2.dilate(thresh, kernel, iterations=1)

# Extract text using Tesseract OCR


text = pytesseract.image_to_string(dilated)

print('Extracted Text:', text)


Use Case 17: Real-time Drowsiness Detection using Eye Aspect Ratio

import cv2
import dlib
from scipy.spatial import distance

def eye_aspect_ratio(eye):
    A = distance.euclidean(eye[1], eye[5])
    B = distance.euclidean(eye[2], eye[4])
    C = distance.euclidean(eye[0], eye[3])
    ear = (A + B) / (2.0 * C)
    return ear
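The eye aspect ratio compares the two vertical eye-landmark distances with the horizontal one: EAR = (|p2 - p6| + |p3 - p5|) / (2 * |p1 - p4|). It stays roughly constant while the eye is open and drops toward zero during a blink, so an EAR that stays below the threshold for many consecutive frames signals drowsiness.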

# Load pre-trained models


detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor('shape_predictor_68_face_landmarks.dat')

# Define thresholds
EYE_AR_THRESH = 0.3
EYE_AR_CONSEC_FRAMES = 48

# Initialize counters
counter = 0

# Open video capture


cap = cv2.VideoCapture(0)

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break

    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = detector(gray)

    for face in faces:
        landmarks = predictor(gray, face)
        left_eye = []
        right_eye = []
        for i in range(36, 42):
            left_eye.append((landmarks.part(i).x, landmarks.part(i).y))
        for i in range(42, 48):
            right_eye.append((landmarks.part(i).x, landmarks.part(i).y))

        left_ear = eye_aspect_ratio(left_eye)
        right_ear = eye_aspect_ratio(right_eye)
        ear = (left_ear + right_ear) / 2.0

        if ear < EYE_AR_THRESH:
            counter += 1
            if counter >= EYE_AR_CONSEC_FRAMES:
                cv2.putText(frame, "DROWSINESS DETECTED", (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 2)
        else:
            counter = 0

    cv2.imshow('Drowsiness Detection', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
Use Case 18: Real-time Fire Detection using Color Thresholding

import cv2
import numpy as np

# Open video capture


cap = cv2.VideoCapture('fire_video.mp4')

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break

    # Convert the frame to HSV color space
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)

    # Define fire-like color range in HSV
    lower_fire = np.array([18, 50, 50])
    upper_fire = np.array([35, 255, 255])

    # Apply color thresholding
    mask = cv2.inRange(hsv, lower_fire, upper_fire)

    # Display the result
    cv2.imshow('Fire Detection', mask)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
Use Case 19: Real-time Smoke Detection using Color and Motion

import cv2
import numpy as np

# Open video capture


cap = cv2.VideoCapture('smoke_video.mp4')

# Initialize background subtractor


background_subtractor = cv2.createBackgroundSubtractorMOG2()

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break

    # Convert the frame to grayscale
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Apply background subtraction
    fg_mask = background_subtractor.apply(gray)

    # Define smoke intensity range in the foreground mask
    lower_smoke = np.array([100])
    upper_smoke = np.array([255])

    # Apply thresholding
    mask = cv2.inRange(fg_mask, lower_smoke, upper_smoke)

    # Display the result
    cv2.imshow('Smoke Detection', mask)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
Use Case 20: Real-time Vehicle Detection using HOG and SVM

import cv2

# Load pre-trained HOG + SVM model for vehicle detection


hog = cv2.HOGDescriptor()
svm = cv2.ml.SVM_load('vehicle_svm_model.yml')

# Open video capture


cap = cv2.VideoCapture('vehicle_video.mp4')

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break

    # Convert the frame to grayscale
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Detect vehicles using HOG + SVM (hog.compute returns the feature vector directly)
    hog_features = hog.compute(gray)
    result = svm.predict(hog_features.reshape(1, -1))

    # Annotate the frame when a vehicle is detected
    if result[1][0] == 1:
        cv2.putText(frame, 'Vehicle Detected', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)

    # Display the result
    cv2.imshow('Vehicle Detection', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
