Object Detection Using Python OpenCV
We started by learning the basics of OpenCV and then performed some basic image
processing and manipulation on images, followed by image segmentation
and many other operations, using OpenCV and the Python language. Here, in this
section, we will perform some simple object detection techniques using
template matching. We will find an object in an image and then we will
describe its features. Features are the common attributes of an image, such
as corners, edges etc. We will also take a look at some common and popular
object detection algorithms such as SIFT, SURF, FAST, BRIEF & ORB.
Object detection and recognition form the most important use cases for
computer vision; they are used to do powerful things such as:
• Labelling scenes
• Robot Navigation
• Self-driving cars
• Body recognition (Microsoft Kinect)
• Disease and cancer detection
• Facial recognition
• Handwriting recognition
• Identifying objects in satellite images
import cv2
import numpy as np
image=cv2.imread('WaldoBeach.jpg')
cv2.imshow('people',image)
cv2.waitKey(0)
gray=cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
template=cv2.imread('waldo.jpg',0)
#perform template matching of the template over the grayscale image
result = cv2.matchTemplate(gray, template, cv2.TM_CCOEFF)
#minMaxLoc gives the location of the best (maximum) matching score
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(result)
top_left = max_loc
#draw a 50x50 pixel bounding box starting at the top-left match location
bottom_right = (top_left[0] + 50, top_left[1] + 50)
cv2.rectangle(image, top_left, bottom_right, (0,255,0),5)
cv2.imshow('object found',image)
cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.matchTemplate returns an array, stored in result, containing the
matching score for every position of the template over the image.
There are a variety of methods to perform template matching, and in this case
we are using cv2.TM_CCOEFF, which stands for correlation coefficient.
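To make the idea concrete, here is a small NumPy-only sketch (not the actual OpenCV implementation) of what correlation-coefficient matching computes: every window of the image is scored against the template, and the location of the highest score is taken as the match. The helper name and the synthetic arrays are ours for illustration.

```python
import numpy as np

def match_template_ccoeff(image, template):
    """Score every template position with the normalised correlation
    coefficient -- the idea behind cv2.TM_CCOEFF / TM_CCOEFF_NORMED."""
    ih, iw = image.shape
    th, tw = template.shape
    t = template - template.mean()
    scores = np.zeros((ih - th + 1, iw - tw + 1))
    for y in range(scores.shape[0]):
        for x in range(scores.shape[1]):
            # mean-centre the window, then correlate it with the template
            w = image[y:y + th, x:x + tw] - image[y:y + th, x:x + tw].mean()
            denom = np.sqrt((w ** 2).sum() * (t ** 2).sum())
            scores[y, x] = (w * t).sum() / denom if denom > 0 else 0.0
    return scores

# synthetic 8x8 "image" with a known 3x3 patch at row 2, column 4
img = np.zeros((8, 8))
img[2:5, 4:7] = np.arange(1, 10).reshape(3, 3)
scores = match_template_ccoeff(img, img[2:5, 4:7].copy())
top_left = tuple(int(v) for v in np.unravel_index(scores.argmax(), scores.shape))
print(top_left)  # (2, 4): the maximum score marks the match location
```

In the real code above, cv2.minMaxLoc plays the role of the final argmax: it returns the position of the best score in the result array.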
The following factors make template matching a bad choice for object
detection: it compares raw pixel values against a fixed template, so it is
sensitive to changes in scale, rotation and illumination.
Corners are identified when shifting a window in any direction over that
point gives a large change in intensity.
Corners are not always the best features for identifying images, but they
certainly have good use cases that make them handy.
When we move the window over a flat region, there is no change of intensity
in any direction. When we move it along an edge, the intensity changes in
one direction only, hence it is an edge and not a corner. But when we move
the window over a corner, there is a large change of intensity no matter in
which direction we move it.
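This intuition can be checked numerically. The sketch below (a simplified illustration, not the actual Harris computation) slides a small window by one pixel in all eight directions and reports the smallest squared intensity change; it is zero for flat regions and edges, and large only at a corner. The helper name and the synthetic images are ours.

```python
import numpy as np

def min_shift_change(img, cy, cx, h=1):
    """Minimum, over the 8 one-pixel shifts, of the squared intensity
    change of a (2h+1)x(2h+1) window centred at (cy, cx).
    Zero for flat regions AND edges; large only at corners."""
    win = img[cy - h:cy + h + 1, cx - h:cx + h + 1]
    changes = []
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if (dy, dx) == (0, 0):
                continue
            shifted = img[cy - h + dy:cy + h + 1 + dy, cx - h + dx:cx + h + 1 + dx]
            changes.append(float(((shifted - win) ** 2).sum()))
    return min(changes)

flat = np.zeros((9, 9))
edge = np.zeros((9, 9)); edge[:, 5:] = 10.0       # vertical edge at column 5
corner = np.zeros((9, 9)); corner[4:, 4:] = 10.0  # L-shaped corner at (4, 4)

print(min_shift_change(flat, 4, 4))    # 0.0   -> flat: no change in any direction
print(min_shift_change(edge, 4, 5))    # 0.0   -> edge: no change along the edge
print(min_shift_change(corner, 4, 4))  # 200.0 -> corner: change in every direction
```

Taking the minimum over directions is the key point: an edge still has one direction of zero change, so only a genuine corner scores high.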
So let’s identify corners with the help of the Harris corner detection
algorithm, developed in 1988, which still works fairly well.
The following OpenCV function is used for the detection of the corners.
import cv2
import numpy as np
image = cv2.imread('chess.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = np.float32(gray)
# cv2.cornerHarris(input image, block size, ksize, k)
harris_corners = cv2.cornerHarris(gray, 3, 3, 0.05)
# dilate the corner response to make the detected points more visible
kernel = np.ones((7,7), np.uint8)
harris_corners = cv2.dilate(harris_corners, kernel, iterations = 2)
# mark the strongest responses in red on the original image and display it
image[harris_corners > 0.025 * harris_corners.max()] = [0, 0, 255]
cv2.imshow('Harris Corners', image)
cv2.waitKey(0)
cv2.destroyAllWindows()
The cv2.goodFeaturesToTrack function (the Shi-Tomasi corner detector, an
improvement on Harris) can be used for the same purpose, with the parameters
mentioned in the comment below.
import cv2
import numpy as np
img = cv2.imread('chess.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# cv2.goodFeaturesToTrack(image, max corners, quality level, min distance)
corners = cv2.goodFeaturesToTrack(gray, 50, 0.01, 15)
for x, y in corners.reshape(-1, 2).astype(int):
    cv2.rectangle(img, (x - 10, y - 10), (x + 10, y + 10), (0, 255, 0), 2)
Like the previous method, it returns an array of corner locations, so we
iterate through each corner position and plot a rectangle over it.
Corner detectors like the Harris corner detection algorithm are rotation
invariant, which means that even if the image is rotated we can still detect
the same corners. This is expected, as corners remain corners in a rotated
image. But when we scale the image, a corner may no longer be detected as a
corner, as shown in the image above.
SIFT (Scale-Invariant Feature Transform) solves this by creating a vector
descriptor for these interesting areas. Scale invariance is achieved via the
following process:
i. Interesting points are scanned at several different scales.
ii. The scale at which a specific stability criterion is met is then
selected and encoded by the vector descriptor. Therefore, regardless of the
initial size, the most stable scale is found, which makes the features scale
invariant.
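The scale-selection step can be illustrated with a toy 1-D example. The sketch below is not SIFT itself: it uses a simple difference-of-boxes response as a stand-in for SIFT's scale-space response and selects the scale at which that response is strongest. Doubling the blob doubles the selected scale, which is the essence of scale invariance. All names and signals are ours for illustration.

```python
import numpy as np

def blob_response(signal, center, scale):
    """Difference-of-boxes response: inner-window mean minus a 3x wider
    outer-window mean. It peaks when the window size matches the blob size."""
    inner = signal[center - scale // 2 : center + scale // 2]
    outer = signal[center - (3 * scale) // 2 : center + (3 * scale) // 2]
    return inner.mean() - outer.mean()

def best_scale(signal, center, scales):
    """Step ii above: keep the scale at which the response is strongest."""
    responses = [blob_response(signal, center, s) for s in scales]
    return scales[int(np.argmax(responses))]

scales = [4, 8, 12, 16, 20, 24, 28]

small = np.zeros(100); small[40:52] = 1.0   # blob of width 12 around x = 46
large = np.zeros(100); large[34:58] = 1.0   # same blob, twice the size

print(best_scale(small, 46, scales))  # 12
print(best_scale(large, 46, scales))  # 24: the selected scale tracks the blob
```

Because the selected scale grows with the blob, a descriptor computed at that scale describes the same structure regardless of the image's initial size.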
The original paper by David Lowe is available at
http://www.cs.ubc.ca/~lowe/papers/ijcv04.pdf,
and you can also find a tutorial on the official OpenCV site.
As SIFT and SURF are patented, they are not freely available for
commercial use; however, there are alternatives to these algorithms, which
are explained briefly here.
FAST (Features from Accelerated Segment Test):
• Key point detection only (no descriptor; we can use SIFT or SURF to
compute that)
• Used in real-time applications
The original papers on FAST, BRIEF and ORB, respectively:
https://www.edwardrosten.com/work/rosten_2006_machine.pdf
http://cvlabwww.epfl.ch/~lepetit/papers/calonder_pami11.pdf
http://www.willowgarage.com/sites/default/files/orb_final.pdf
The SIFT & SURF algorithms are patented by their respective creators, and
while they are free to use in academic and research settings, you should
technically be obtaining a license/permission from the creators if you are
using them in a commercial (i.e. for-profit) application.
SIFT
import cv2
import numpy as np
image = cv2.imread('paris.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
sift = cv2.xfeatures2d.SIFT_create()
keypoints = sift.detect(gray, None)
image = cv2.drawKeypoints(image, keypoints, None)
print("Number of keypoints Detected: ", len(keypoints))
Console Output:
Here the keypoints are (x, y) coordinates extracted using the SIFT detector
and drawn over the image using the cv2.drawKeypoints function.
SURF
import cv2
import numpy as np
image = cv2.imread('paris.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
#you can increase the value of the Hessian threshold to decrease the number of keypoints
surf = cv2.xfeatures2d.SURF_create(500)
keypoints, descriptors = surf.detectAndCompute(gray, None)
image = cv2.drawKeypoints(image, keypoints, None)
print("Number of keypoints Detected: ", len(keypoints))
Console Output:
FAST
import cv2
import numpy as np
image = cv2.imread('paris.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
fast = cv2.FastFeatureDetector_create()
# Obtain key points; by default non-max suppression is on
# to turn it off use fast.setNonmaxSuppression(False)
keypoints = fast.detect(gray, None)
print("Number of keypoints Detected: ", len(keypoints))
Console Output:
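Under the hood, FAST applies a "segment test": it examines 16 pixels on a circle of radius 3 around each candidate pixel and declares a corner if enough contiguous circle pixels are all brighter, or all darker, than the centre by some threshold. Below is a minimal NumPy sketch of that test (using the run length of 9 from the FAST-9 variant; the helper names and the synthetic image are ours, not OpenCV's API).

```python
import numpy as np

# offsets of the 16 pixels on the radius-3 circle used by FAST
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]

def is_fast_corner(img, y, x, t=10, n=9):
    """Segment test: corner if n contiguous circle pixels are all brighter
    than centre + t or all darker than centre - t (n=9 is FAST-9)."""
    c = float(img[y, x])
    ring = [float(img[y + dy, x + dx]) for dx, dy in CIRCLE]
    ring = ring + ring          # duplicate so contiguous runs may wrap around
    run_bright = run_dark = 0
    for p in ring:
        run_bright = run_bright + 1 if p > c + t else 0
        run_dark = run_dark + 1 if p < c - t else 0
        if run_bright >= n or run_dark >= n:
            return True
    return False

img = np.zeros((15, 15), dtype=np.uint8)
img[7:, 7:] = 200                    # bright square whose tip is a corner at (7, 7)
print(is_fast_corner(img, 7, 7))     # True: a long dark arc surrounds the tip
print(is_fast_corner(img, 7, 11))    # False: on an edge the dark arc is too short
print(is_fast_corner(img, 11, 11))   # False: flat region, no arc at all
```

Because the test only compares pixel intensities, it is extremely cheap, which is why FAST is the detector of choice for real-time applications.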
BRIEF
import cv2
import numpy as np
image = cv2.imread('paris.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# BRIEF only computes descriptors, so first detect keypoints with FAST
fast = cv2.FastFeatureDetector_create()
keypoints = fast.detect(gray, None)
brief = cv2.xfeatures2d.BriefDescriptorExtractor_create()
keypoints, descriptors = brief.compute(gray, keypoints)
print("Number of keypoints Detected: ", len(keypoints))
Console Output:
ORB
import cv2
import numpy as np
image = cv2.imread('paris.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# create an ORB object and detect keypoints together with their descriptors
orb = cv2.ORB_create()
keypoints, descriptors = orb.detectAndCompute(gray, None)
print("Number of keypoints Detected: ", len(keypoints))
Console Output:
We can specify the maximum number of keypoints to retain, e.g.
cv2.ORB_create(5000); the default value is 500, i.e. ORB automatically
detects the best 500 keypoints if no value is specified.
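Unlike SIFT's floating-point descriptors, BRIEF and ORB produce binary descriptors that are compared with the Hamming distance (number of differing bits), which is what makes them so fast to match. A small NumPy sketch of that matching step follows; the helper names and the random descriptors are ours, while cv2.BFMatcher with NORM_HAMMING performs this for real descriptors.

```python
import numpy as np

def hamming(d1, d2):
    """Hamming distance between two binary descriptors stored as uint8
    arrays (32 bytes = 256 bits, the size ORB uses by default)."""
    return int(np.unpackbits(np.bitwise_xor(d1, d2)).sum())

def match_descriptors(query, train):
    """Brute-force matching: for each query descriptor, return the index
    of the nearest train descriptor."""
    return [int(np.argmin([hamming(q, t) for t in train])) for q in query]

rng = np.random.default_rng(0)
train = rng.integers(0, 256, size=(5, 32), dtype=np.uint8)

# queries: noisy copies of train descriptors 3 and 1 (a few bits flipped)
q0 = train[3].copy(); q0[0] ^= 0b00000101
q1 = train[1].copy(); q1[7] ^= 0b10000000
print(match_descriptors([q0, q1], train))  # [3, 1]
```

Bitwise XOR plus a popcount is far cheaper than the Euclidean distances needed for SIFT or SURF descriptors, which is why binary descriptors dominate in real-time matching.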
So this is how object detection takes place in OpenCV, the same programs