Laboratory 4. Image Features and Transforms: 4.1 Hough Transform for Line Detection
Image features are interesting points in an image, also called interest points or key points. They can
be useful in multiple computer vision applications, such as image alignment or image matching.
In the Hough transform, a line is represented in its polar (normal) form:

$$\rho = x\cos\theta + y\sin\theta \qquad (1)$$

where 𝜌 represents the perpendicular distance of the line from the origin in pixels, and 𝜃 is the angle
(measured in radians) that the line's normal makes with the horizontal axis, as shown in Figure 1. A line in
2D space is therefore parameterized by 𝜌 and 𝜃: any pair (𝜌, 𝜃) corresponds to one line.
Let us imagine a 2D array where the x-axis has all possible 𝜃 values and the y-axis has all possible
𝜌 values. Any bin in this 2D array corresponds to one line. This 2D array is called an accumulator because
the bins of this array are used to collect evidence about the lines that exist in the image. The top left cell
corresponds to (−𝑅, 0) and the bottom right corresponds to (𝑅, 𝜋).
The value inside the bin (𝜌, 𝜃) will increase as more evidence is gathered about the presence of a line with
parameters 𝜌 and 𝜃. The following steps are performed to detect lines in an image:
Step 1: Initialize the Accumulator - create an accumulator array. The number of cells is a design decision;
let us assume we choose a 10×10 accumulator. This means that 𝜌 can take only 10 distinct values and 𝜃 can
take only 10 distinct values, so we will be able to detect 100 different kinds of lines. The size of the
accumulator also depends on the resolution of the image.
Step 2: Detect Edges - if there is a visible line in the image, an edge detector should fire at the boundaries
of that line. These edge pixels provide evidence for the presence of a line. The output of edge detection is
an array of edge pixels [(𝑥₁, 𝑦₁), (𝑥₂, 𝑦₂), …, (𝑥ₙ, 𝑦ₙ)].
Step 3: Voting by Edge Pixels - for every edge pixel (𝑥ᵢ, 𝑦ᵢ) in the above array, we vary 𝜃 over its discrete
values from 0 to π and plug each value into equation (1) to obtain a value for 𝜌. Let us suppose an accumulator
of size 20×20. There are then 20 distinct values of 𝜃, so for every edge pixel (𝑥ᵢ, 𝑦ᵢ) we can compute 20 (𝜌, 𝜃)
pairs using equation (1), and the accumulator bin corresponding to each of these pairs is incremented. A bin
through which many of these curves pass (i.e., where they intersect) collects votes from many edge pixels,
indicating a line shared by them. As an example, in Figure 3 we vary the parameter 𝜃 for 3 pixels (represented
by the 3 colored curves) and obtain
the values for 𝜌 using equation (1). These curves intersect at a point, indicating that a line with parameters
𝜃 = 1 and 𝜌 = 9.5 passes through all three pixels (so this bin should be incremented).
Doing this for every edge pixel results in an accumulator that has all the evidence about all possible lines
in the image. We can simply select the bins in the accumulator above a certain threshold to find the lines in
the image. If the threshold is higher, fewer strong lines will be detected, and if it is lower, a large number
of lines will be detected, including some weak ones.
Figure 3. The curves of 3 edge pixels intersecting at the point 𝜃 = 1, 𝜌 = 9.5 indicate the line to which they belong.
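To make the voting procedure concrete, the following minimal sketch builds the accumulator directly from a Canny edge map using equation (1). It is illustrative only; the file name 'highway.jpg', the two Canny thresholds and the 180×180 accumulator size are assumptions, not values prescribed by the laboratory.

import cv2
import numpy as np

img = cv2.imread('highway.jpg', cv2.IMREAD_GRAYSCALE)      # assumed input image
edges = cv2.Canny(img, 100, 200)                            # Step 2: edge map

h, w = edges.shape
diag = int(np.ceil(np.hypot(h, w)))                         # largest possible |rho|
thetas = np.linspace(0, np.pi, 180, endpoint=False)         # 180 theta bins
rhos = np.linspace(-diag, diag, 180)                        # 180 rho bins
accumulator = np.zeros((len(rhos), len(thetas)), dtype=np.int32)   # Step 1

ys, xs = np.nonzero(edges)                                  # edge pixel coordinates
for x, y in zip(xs, ys):                                    # Step 3: voting
    for t_idx, theta in enumerate(thetas):
        rho = x * np.cos(theta) + y * np.sin(theta)         # equation (1)
        r_idx = np.argmin(np.abs(rhos - rho))               # nearest rho bin (clear but slow)
        accumulator[r_idx, t_idx] += 1

# Bins that collected enough votes correspond to detected lines
r_idxs, t_idxs = np.where(accumulator > 0.5 * accumulator.max())
detected = [(rhos[r], thetas[t]) for r, t in zip(r_idxs, t_idxs)]
print(len(detected), 'candidate lines')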
In OpenCV, line detection using the Hough Transform is implemented in the functions HoughLines
and HoughLinesP (Probabilistic Hough Transform), with the following parameters:
srn - For the multi-scale Hough transform, it is a divisor for the distance resolution rho. If both srn=0 and
stn=0, the classical Hough transform is used. Otherwise, both these parameters should be positive.
stn - For the multi-scale Hough transform, it is a divisor for the angle resolution theta.
min_theta - For the standard and multi-scale Hough transform, the minimum angle to check for lines. Must fall
between 0 and max_theta.
max_theta - For the standard and multi-scale Hough transform, the maximum angle to check for lines. Must fall
between min_theta and CV_PI.
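For reference, a typical call of the standard transform might look like the sketch below; the image name and all numeric values are only illustrative assumptions, not required settings.

import cv2
import numpy as np

gray = cv2.imread('highway.jpg', cv2.IMREAD_GRAYSCALE)      # assumed input image
edges = cv2.Canny(gray, 100, 200)                            # binary edge map

# rho resolution of 1 pixel, theta resolution of 1 degree, accumulator threshold of 150
lines = cv2.HoughLines(edges, 1, np.pi / 180, 150)

if lines is not None:
    for rho, theta in lines[:, 0]:                           # each entry is [[rho, theta]]
        print(rho, theta)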
The overall quality of detected lines depends heavily on the quality of the edge map. Therefore, in
real world situations, the Hough transform is used when the environment can be controlled and therefore
can lead to consistent edge maps, or when an edge detector can be trained for the specific kind of edges that
are searched for.
• Ex. 4.1 Read the image 'highway.jpg' and transform it to grayscale. Apply the Canny edge detector
and experiment with multiple threshold values, in order to extract mainly the road boundaries (fewer
boundaries from the vegetation). Use the function cv2.HoughLinesP to identify the lines in the
image. Draw the lines on the original image using a blue color (cv2.line) and display the results.
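One possible way to approach this exercise is sketched below; the Canny thresholds and the HoughLinesP parameters are only starting values to experiment with, not a reference solution.

import cv2
import numpy as np

img = cv2.imread('highway.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Experiment with these two thresholds so that mainly the road boundaries remain
edges = cv2.Canny(gray, 150, 250)

# Probabilistic Hough transform: returns line segments as [x1, y1, x2, y2]
lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=50,
                        minLineLength=50, maxLineGap=10)

if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        cv2.line(img, (x1, y1), (x2, y2), (255, 0, 0), 2)    # blue in BGR order

cv2.imshow('Detected lines', img)
cv2.waitKey(0)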
In OpenCV, circle detection using the Hough Transform is implemented in the function HoughCircles, with parameters:
image - 8-bit, single-channel grayscale input image.
circles - Output vector of found circles. Each circle is encoded as a 3- or 4-element floating-point vector
(x, y, radius) or (x, y, radius, votes).
method - Detection method. Currently, the only implemented method is HOUGH_GRADIENT
dp - Inverse ratio of the accumulator resolution to the image resolution. For example, if dp=1, the
accumulator has the same resolution as the input image. If dp=2, the accumulator has half the width
and height.
minDist - Minimum distance between the centers of the detected circles. If the parameter is too small,
multiple neighbor circles may be falsely detected in addition to a true one. If it is too large, some
circles may be missed.
param1 - First method-specific parameter. In case of HOUGH_GRADIENT, it is the higher threshold of the
two passed to the Canny edge detector (the lower one is half of it).
param2 - Second method-specific parameter. In case of HOUGH_GRADIENT, it is the accumulator threshold
for the circle centers at the detection stage. The smaller it is, the more false circles may be detected.
Circles corresponding to larger accumulator values are returned first.
minRadius - Minimum circle radius.
maxRadius - Maximum circle radius. If <= 0, uses the maximum image dimension. If < 0, returns centers
without finding the radius.
❖ Ex. 4.2 Read the image 'circles.jpg' and transform it to grayscale. Apply a median blur filter on the
gray image and then detect Hough circles.
Draw the detected circles in green. Experiment with other parameter values.
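A minimal sketch of this exercise is given below; the median kernel size and the HoughCircles parameter values (dp, minDist, param1, param2) are assumptions to be tuned, not prescribed values.

import cv2
import numpy as np

img = cv2.imread('circles.jpg')
gray = cv2.medianBlur(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY), 5)   # median blur on the gray image

circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, dp=1, minDist=20,
                           param1=100, param2=30, minRadius=0, maxRadius=0)

if circles is not None:
    circles = np.uint16(np.around(circles))
    for x, y, r in circles[0, :]:
        cv2.circle(img, (int(x), int(y)), int(r), (0, 255, 0), 2)  # draw the circle in green

cv2.imshow('Detected circles', img)
cv2.waitKey(0)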
The basic idea in Harris corner detection is to identify the 3 possible situations displayed in Figure
5.
Figure 5. The Harris algorithm identifies each case mathematically: uniform (flat) area – no change in all
directions; edge – no change along the edge direction; corner – significant change in all directions.
The algorithm starts by analyzing the change in intensity for a shift (displacement) of (𝑢, 𝑣):

$$E(u,v) = \sum_{x,y} w(x,y)\,\big[I(x+u,\,y+v) - I(x,y)\big]^2 \qquad (2)$$
The window function 𝑤(𝑥, 𝑦) can be a simple 2D gate (1 inside the window, 0 outside) or a Gaussian
window (higher weight in the middle of the window, lower weights towards the boundaries). The sum of
squared differences (SSD) of intensities in equation 2 will be near 0 for a uniform area, and larger for a
distinctive patch (edge or corner). The search for corners then becomes a search for large 𝐸(𝑢, 𝑣) values.
Using a 1st order approximation from Taylor series, equation (2) becomes
$$E(u,v) \approx \sum_{x,y} w(x,y)\,\big[I(x,y) + u I_x + v I_y - I(x,y)\big]^2$$

$$E(u,v) \cong \begin{bmatrix} u & v \end{bmatrix}
\left( \sum_{x,y} w(x,y) \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix} \right)
\begin{bmatrix} u \\ v \end{bmatrix}
= \begin{bmatrix} u & v \end{bmatrix} \boldsymbol{M} \begin{bmatrix} u \\ v \end{bmatrix}$$

where the matrix 𝑴 is computed from the image derivatives

$$\boldsymbol{M} = \sum_{x,y} w(x,y) \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix} \qquad (3)$$
The points in the image will then be classified using the eigenvalues 𝜆1 , 𝜆2 of matrix 𝑴:
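The sketch below, which is only illustrative (the image name and the patch location/size are assumptions), builds the matrix 𝑴 for a single patch and inspects its eigenvalues:

import cv2
import numpy as np

img = cv2.imread('table.jpg', cv2.IMREAD_GRAYSCALE).astype(np.float32)   # example image

Ix = cv2.Sobel(img, cv2.CV_32F, 1, 0, ksize=3)              # derivative along x
Iy = cv2.Sobel(img, cv2.CV_32F, 0, 1, ksize=3)              # derivative along y

y0, x0, size = 100, 100, 5                                  # hypothetical patch location and size
px = Ix[y0:y0 + size, x0:x0 + size]
py = Iy[y0:y0 + size, x0:x0 + size]

# Matrix M of equation (3), with a simple gate window w(x, y) = 1 inside the patch
M = np.array([[np.sum(px * px), np.sum(px * py)],
              [np.sum(px * py), np.sum(py * py)]])

lam1, lam2 = np.linalg.eigvalsh(M)                          # eigenvalues, in ascending order
print(lam1, lam2)
# Both small -> flat area; one large, one small -> edge; both large -> corner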
In OpenCV, the Harris detector is implemented in the function cv2.cornerHarris, with parameters:
src - Input single-channel 8-bit or floating-point image.
blockSize - Neighborhood size considered for corner detection.
ksize - Aperture parameter for the Sobel derivative operator.
k - Harris detector free parameter.
❖ Ex. 4.3 Read the input image table.jpg and convert it to grayscale. The gray image must be converted to
float32. Apply the Harris corner detector and dilate the output image to make the corners more visible.
Analyze the detected corners depending on the selected threshold value.
import cv2
import numpy as np
from matplotlib import pyplot as plt

# Read the image, convert it to grayscale and to float32 (required by cornerHarris)
img = cv2.imread('table.jpg')
gray = np.float32(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY))

# Harris response (blockSize=2, ksize=3, k=0.04 are example values), then dilation
dst = cv2.dilate(cv2.cornerHarris(gray, 2, 3, 0.04), None)
corner_image = img.copy()

# This value varies depending on the image and how many corners you want to detect
# Try changing this free parameter, 0.1, to be larger or smaller and see what happens
thresh = 0.1 * dst.max()

# Iterate through all the corners and draw them on the image (if they pass the threshold)
for j in range(0, dst.shape[0]):
    for i in range(0, dst.shape[1]):
        if dst[j, i] > thresh:
            # image, center pt, radius, color, thickness
            cv2.circle(corner_image, (i, j), 1, (0, 255, 0), 1)

plt.figure()
plt.imshow(corner_image)
In OpenCV, the Shi-Tomasi corner detector is implemented in the function cv2.goodFeaturesToTrack, with parameters:
maxCorners Maximum number of corners to return. If more corners are found than this number, the
strongest of them are returned. maxCorners <= 0 implies that no limit on the maximum is set and
all detected corners are returned.
qualityLevel Parameter characterizing the minimal accepted quality of image corners. The
parameter value is multiplied by the best corner quality measure, which is the minimal eigenvalue
(see cornerMinEigenVal ) or the Harris function response (see cornerHarris ). The corners
with the quality measure less than the product are rejected. For example, if the best corner has the
quality measure = 1500, and the qualityLevel=0.01 , then all the corners with the quality
measure less than 15 are rejected.
minDistance Minimum possible Euclidean distance between the returned corners. The function throws
away each corner for which there is a stronger corner at a distance less than minDistance.
mask Optional region of interest. If the image is not empty (it needs to have the type CV_8UC1 and the
same size as image ), it specifies the region in which the corners are detected.
blockSize Size of an average block for computing a derivative covariation matrix over each pixel
neighborhood.
useHarrisDetector Parameter indicating whether to use a Harris detector (see cornerHarris) or
cornerMinEigenVal.
k Free parameter of the Harris detector.
❖ Ex. 4.4 Redo exercise 4.3 using the Shi-Tomasi corner detector. Compare the results.
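A minimal Shi-Tomasi sketch for this exercise is shown below, assuming the same table.jpg image as in Ex. 4.3; the maxCorners, qualityLevel and minDistance values are only starting points.

import cv2
import numpy as np

img = cv2.imread('table.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Shi-Tomasi: keep at most 100 corners, quality >= 1% of the best corner, 10 px apart
corners = cv2.goodFeaturesToTrack(gray, maxCorners=100, qualityLevel=0.01, minDistance=10)

if corners is not None:
    for x, y in corners[:, 0]:
        cv2.circle(img, (int(x), int(y)), 3, (0, 255, 0), -1)   # filled green dot per corner

cv2.imshow('Shi-Tomasi corners', img)
cv2.waitKey(0)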
The OpenCV function needed to create a SIFT detector object is cv2.SIFT_create, with the following
parameters. Generally, their default values work best, so the function is normally called without input arguments.
nfeatures The number of best features to retain. The features are ranked by their scores (measured in
the SIFT algorithm as the local contrast).
nOctaveLayers The number of layers in each octave. 3 is the value used in D. Lowe paper. The
number of octaves is computed automatically from the image resolution.
contrastThreshold The contrast threshold used to filter out weak features in semi-uniform (low-
contrast) regions. The larger the threshold, the fewer features are produced by the detector.
edgeThreshold The threshold used to filter out edge-like features. Note that its meaning is
different from the contrastThreshold, i.e. the larger the edgeThreshold, the fewer features are filtered
out (more features are retained).
sigma The sigma of the Gaussian applied to the input image at octave #0. If your image is captured
with a weak camera with soft lenses, you might want to reduce this value.
❖ siftObject.detect is a function that finds the keypoints in an image. Each keypoint is a special
structure with many attributes, such as its (x,y) coordinates, the size of the meaningful neighborhood, the
angle which specifies its orientation, the response that specifies the strength of the keypoint, etc.
❖ cv2.drawKeypoints – function which draws small circles at the locations of the keypoints.
❖ siftObject.compute – computes descriptors from the keypoints found by siftObject.detect.
❖ siftObject.detectAndCompute – directly find keypoints and compute descriptors in a single step.
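The typical usage of these functions is sketched below; the image name mac3.jpg is taken from Ex. 4.5 and the drawing color is an arbitrary choice.

import cv2

img = cv2.imread('mac3.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

siftObject = cv2.SIFT_create()                                 # default parameters

keypoints = siftObject.detect(gray, None)                      # find the keypoints
keypoints, descriptors = siftObject.compute(gray, keypoints)   # 128-dimensional descriptors
# or, in a single step:
# keypoints, descriptors = siftObject.detectAndCompute(gray, None)

print(len(keypoints), 'keypoints, descriptor array shape:', descriptors.shape)

out = cv2.drawKeypoints(img, keypoints, None, color=(0, 255, 0))
cv2.imshow('SIFT keypoints', out)
cv2.waitKey(0)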
• Ex. 4.5 Read the image mac3.jpg, convert it to grayscale and display both images. Create a SIFT
detector object. Find the keypoints with SIFT and compute the descriptors. Draw the keypoints'
locations on the original color image. Analyze the first 70, 80, … features generated by the SIFT
detector and observe their locations. How many features are detected by default for this image?
• Ex. 4.6 Experiment also with the images mac1.jpg and mac2.jpeg, allowing the detector to find
as many features as possible.
Step 1. Detect keypoints with the FAST detector and keep the strongest ones according to the Harris
corner measure.
Step 2. Calculate the feature orientation using the intensity centroid. It computes the intensity-weighted
centroid of the patch, with the detected corner at its center. The direction of the vector from this corner point
to the centroid gives the orientation. To improve the rotation invariance, the moments are computed with x and y
restricted to a circular region of radius r, where r is the size of the patch.
Step 3. Use BRIEF to compute a descriptor. The points used to compute BRIEF are rotated based
on the orientation of the patch containing the feature. This makes the descriptor invariant to rotation.
The OpenCV function needed to create an ORB detector is cv2.ORB_create, with a similar set
of parameters as cv2.SIFT_create. Generally, the parameters' default values work best, so the function
is normally called without input arguments.
orbObject = cv2.ORB_create()
The newly created object, orbObject, has the following important functions:
where
image – input image.
keypoints - The detected keypoints.
mask - Mask specifying where to look for keypoints (optional). It must be an 8-bit integer matrix with non-
zero values in the region of interest.
keypoints - Input collection of keypoints. Keypoints for which a descriptor cannot be computed are
removed. Sometimes new keypoints can be added; for example, SIFT duplicates a keypoint that has
several dominant orientations (one copy for each orientation).
descriptors - Computed descriptors.
flags - Flags controlling the drawing of keypoints. Possible flag bit values are defined by DrawMatchesFlags.
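A possible usage sketch is given below; the image name is taken from Ex. 4.7, and the DRAW_RICH_KEYPOINTS flag is one way to make the size and orientation of each keypoint visible.

import cv2

img = cv2.imread('cvbook1.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

orbObject = cv2.ORB_create()

keypoints = orbObject.detect(gray, None)                      # find the keypoints
keypoints, descriptors = orbObject.compute(gray, keypoints)   # 32-byte binary descriptors

# Draw location, size and orientation of each keypoint
out = cv2.drawKeypoints(img, keypoints, None, color=(0, 255, 0),
                        flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
cv2.imshow('ORB keypoints', out)
cv2.waitKey(0)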
• Ex. 4.7 Read the images cvbook1.jpg and cvbook2.jpg, then convert both to grayscale and display
them. Create 2 ORB detector objects. Find the keypoints with ORB and compute the descriptors. Draw the
keypoints' location, size and orientation on the original color images. Compare both output images and
comment on the feature locations. Compare also the first 100 detected features from both images.
• Ex. 4.8 For the same input images from exercise 4.7, add a SIFT detector object, find the keypoints
with SIFT and compute the descriptors. Draw the keypoints' locations on the original color images and
compare the SIFT features with the ORB features. How many features are detected by each algorithm? What are
the descriptor sizes? How long does it take for each algorithm to detect and compute? Use time:
import time
t0 = time.time()
# processing
t1 = time.time()
total = t1-t0
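For example, the comparison asked for in Ex. 4.8 could be organized as in the sketch below (the image name is an assumption; the detector objects are created as shown earlier):

import time
import cv2

gray = cv2.cvtColor(cv2.imread('cvbook1.jpg'), cv2.COLOR_BGR2GRAY)
sift = cv2.SIFT_create()
orb = cv2.ORB_create()

t0 = time.time()
kp_sift, des_sift = sift.detectAndCompute(gray, None)        # SIFT: 128-float descriptors
t_sift = time.time() - t0

t0 = time.time()
kp_orb, des_orb = orb.detectAndCompute(gray, None)           # ORB: 32-byte descriptors
t_orb = time.time() - t0

print('SIFT:', len(kp_sift), 'features,', des_sift.shape, ',', round(t_sift, 3), 's')
print('ORB :', len(kp_orb), 'features,', des_orb.shape, ',', round(t_orb, 3), 's')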