EE125 Lab 4: Image Manipulation & Camera Calibration
1 Introduction
In this lab, we will learn how to interface with a simple webcam in ROS, read in images, convert them to OpenCV images, perform some basic image processing, compensate for the perspective change between the floor of the lab and the camera (thereby calibrating it), and, lastly, use the camera calibration to compute distances on the floor just by taking an image.
The node usb_cam_node pulls a video stream from the webcam /dev/video0 and publishes it to the topic
/usb_cam/image_raw. You will need to make sure that your webcam is mapped to the device /dev/video0.
In order to do this, type
ls /dev/video*
This will list all the video devices in your /dev folder. If there are multiple video devices in the folder, unplug your
webcam and re-list them to figure out which device is your webcam. You may also need to check that the size of your image is 640x480.
For this lab, we will want the ability to request a single image from the webcam. This can be done in ROS using something called a Service/Client architecture (http://wiki.ros.org/ROS/Tutorials/WritingServiceClient(python));
however, usb_cam does not provide this functionality. Thus we will need to create a node that subscribes to the /usb_cam/image_raw topic and then provides a single image as a service when requested. We will also want to create a node that requests an image from the image service node and does the image processing that we want to do.
Create a package in a folder contained in your ROS_PACKAGE_PATH that depends on rospy and sensor_msgs. We will refer to this package as lab3_cam. You will need to create subfolders /src to contain source code for the nodes and /srv to contain a new service type inside the package. You will then need to create an image service node and an image processing node.
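With the rosbuild toolchain this lab uses, creating the package might look like the following sketch (the exact commands depend on your setup):

roscreate-pkg lab3_cam rospy sensor_msgs
mkdir lab3_cam/src lab3_cam/srv

Based on how the service is used later in this handout (an empty request, and a response whose last_image field holds the image), the new service definition srv/ImageSrv.srv would contain:

---
sensor_msgs/Image last_image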
Your image service node (camera_srv.py below) should begin with the following imports and define methods along these lines.

#!/usr/bin/env python
import roslib; roslib.load_manifest('lab3_cam')
import rospy
from sensor_msgs.msg import Image
from lab3_cam.srv import ImageSrv, ImageSrvResponse
import cv, cv2, time, sys
from cv_bridge import CvBridge, CvBridgeError
import numpy as np
from numpy.linalg import *

def __init__(self):
    #Create an instance variable to store the last image received
    self.lastImage = None

#When another node calls the service, return the last image
def getLastImage(self, request):
    #Print an alert to the console
    #print("Image requested!")

    #Return the last image
    return ImageSrvResponse(self.lastImage)

def run(self):
    rospy.spin()
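These methods belong inside a class that subscribes to the camera topic and advertises the image service; the handout leaves that glue to you. A minimal sketch of what it might look like (the class, node, and service names here are illustrative; the service name must match the one your image processing node requests):

class ImgService:
    def __init__(self):
        #Create an instance variable to store the last image received
        self.lastImage = None
        rospy.init_node('cam_listener')
        #Cache every image published by usb_cam
        rospy.Subscriber("/usb_cam/image_raw", Image, self.imgReceived)
        #Advertise a service that returns the cached image on request
        rospy.Service('last_image', ImageSrv, self.getLastImage)

    def imgReceived(self, message):
        #Save the most recently received image
        self.lastImage = message

    #getLastImage and run as given above

if __name__ == '__main__':
    node = ImgService()
    node.run()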
The main part of the image processing node should contain the following code. The comments tell you what each part
does.
if __name__ == '__main__':
    rospy.init_node('image_processing')

    # Set up a function used to call the image service.
    # (The service name 'last_image' must match the name your
    # image service node advertises.)
    rospy.wait_for_service('last_image')
    last_image = rospy.ServiceProxy('last_image', ImageSrv)

    while not rospy.is_shutdown():
        # Waits for a key input to continue
        raw_input('Press enter to capture an image: ')
        try:
            #Request the last image from the image service
            request = last_image()

            # Extracts the ROS Image from the ImageSrv service
            # Remember that ImageSrv.last_image was
            # defined to be of type sensor_msgs.msg.Image
            imgmsg = request.last_image
        except rospy.ServiceException as e:
            print "Service call failed: %s" % e
imgmsg is now an image stored in the ROS Image format. This node is complete for now. (We will modify it in the
next section to do more image processing.) Don’t forget to change the permissions so that camera_srv.py and
image_process.py are executable.
chmod +x camera_srv.py
chmod +x image_process.py
You will also need to edit the CMakeLists.txt so that rosbuild_gensrv() is uncommented. Build the package and add the usb_cam node to your launch file, as sketched below.
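A minimal launch file for the camera might look like the following (node names and parameter values are illustrative; the usb_cam parameters should match your camera):

<launch>
  <node name="usb_cam" pkg="usb_cam" type="usb_cam_node" output="screen">
    <param name="video_device" value="/dev/video0" />
    <param name="image_width" value="640" />
    <param name="image_height" value="480" />
    <param name="pixel_format" value="yuyv" />
  </node>
</launch>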
To convert the ROS Image message into an OpenCV image, create a CvBridge object and use it as follows.

bridge = CvBridge()
cvimage = bridge.imgmsg_to_cv(imgmsg, 'rgb8')
Note that 'rgb8' refers to the order of the red, green, and blue channels in the image. rgb8 may actually be the wrong ordering for the color channels; if so, the image will appear too red, green, or blue when we display it. If the image doesn't look right, change the order of 'r', 'g', and 'b' until it does. cvimage is now an image in the OpenCV format iplimage. OpenCV can also store images as matrices or arrays, which we will use later. To show an image in the iplimage format, we can use the commands
cv.NamedWindow("CV Image",cv.CV_WINDOW_AUTOSIZE)
cv.ShowImage("CV Image",cvimage)
cv.WaitKey()
The last command is required to make the image continually display. (If the process is dying after a short amount of
time, try changing cv.WaitKey() to cv.WaitKey(1000).) For the next part of the lab, we will want to store
the image as a numpy array. In order to transform the image from the iplimage format to an array, we can use the
command
cvimage_array = np.asarray(cvimage, dtype=np.uint8)
If we want to display the image when it is in the array format, we can use the command
cv2.imshow('CV Image (array)', cvimage_array)
cv2.waitKey()
The array format for an image is a 3-dimensional array with dimensions h × w × 3, where h is the number of pixels in the vertical direction, w is the number of pixels in the horizontal direction, and the third dimension holds the red, green, and blue color values. By convention, the (0, 0) pixel location is defined to be the upper left hand corner, and pixel indices increase as you move to the right and down. Usually, color values range from 0 to 255, where 0 is totally dark and 255 is totally light. Thus [0 0 0] would be a black pixel and [255 255 255] would be a white pixel. By indexing into the image array, we can select a part of the image or change values in the array in order to change the image.
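For example, a few illustrative operations on cvimage_array from above:

# Read the [R, G, B] values of the pixel at row 10, column 20
pixel = cvimage_array[10, 20, :]

# Crop out the 100x100 patch in the upper left corner
corner = cvimage_array[0:100, 0:100, :]

# Paint a 50x50 square in the upper left corner white
cvimage_array[0:50, 0:50, :] = 255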
Add the following section to your code that will allow you to click on points in the image and store their pixel
locations.
#Define two global variables
global n_clicks, points
n_clicks = 0
tot_clicks = 4

# Preallocate a list to hold the locations of the points we select
points = (np.nan*np.zeros(tot_clicks)).tolist()

while n_clicks <= tot_clicks-1:
    # Display the image
    cv.ShowImage("CV Image", cvimage)

    # Call the callback function 'on_mouse_click'
    # when the mouse is clicked in the window "CV Image"
    cv.SetMouseCallback("CV Image", on_mouse_click, param=1)
    cv.WaitKey()
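The callback on_mouse_click is not defined in the snippet above. A minimal sketch of what it could look like with the old cv API (the exact version you write may differ):

def on_mouse_click(event, x, y, flags, param):
    # Record the pixel location of each left click in the global list
    global n_clicks, points
    if event == cv.CV_EVENT_LBUTTONDOWN:
        print "Point captured: (%d, %d)" % (x, y)
        points[n_clicks] = (x, y)
        n_clicks += 1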
Figure 1: Sample image of floor plane.
Let us consider a point on the lab floor given by $\tilde{X} = [x, y]^T \in \mathbb{R}^2$. In order to keep the transformations linear (instead of dealing with affine projections), we will use homogeneous coordinates, where we append a "1" to the coordinates of $\tilde{X}$ as,
$$X = \begin{bmatrix} \tilde{X} \\ 1 \end{bmatrix} = \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \in \mathbb{R}^3.$$
We are trying to determine the linear transform $H \in \mathbb{R}^{3 \times 3}$ that maps the point $X$ to the pixel $Y = [u, v, 1]^T \in \mathbb{R}^3$. This transform is called a homography and takes the form,
$$H := \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & 1 \end{bmatrix}. \tag{1}$$
Notice that the last element of the homography is 1. This means that only 8 parameters of the matrix need to be estimated. Once the matrix is estimated, the pixel coordinates $\tilde{Y} = (u, v)$ can be determined using the following equations,
$$u = \frac{h_{11}x + h_{12}y + h_{13}}{h_{31}x + h_{32}y + 1}, \qquad v = \frac{h_{21}x + h_{22}y + h_{23}}{h_{31}x + h_{32}y + 1}. \tag{2}$$
These two equations can be rewritten in linear form as,
$$\begin{bmatrix} x & y & 1 & 0 & 0 & 0 & -ux & -uy \\ 0 & 0 & 0 & x & y & 1 & -vx & -vy \end{bmatrix} \begin{bmatrix} h_{11} \\ h_{12} \\ h_{13} \\ h_{21} \\ h_{22} \\ h_{23} \\ h_{31} \\ h_{32} \end{bmatrix} = \begin{bmatrix} u \\ v \end{bmatrix}. \tag{3}$$
Since Eqn. 3 has 8 unknowns, in order to uniquely determine the unknowns we will need $N \geq 4$ floor point ↔ image pixel pairs. With these $N$ points, the equation becomes,
$$Ax = b, \tag{4}$$
where,
$$A = \begin{bmatrix}
x_1 & y_1 & 1 & 0 & 0 & 0 & -u_1 x_1 & -u_1 y_1 \\
0 & 0 & 0 & x_1 & y_1 & 1 & -v_1 x_1 & -v_1 y_1 \\
x_2 & y_2 & 1 & 0 & 0 & 0 & -u_2 x_2 & -u_2 y_2 \\
0 & 0 & 0 & x_2 & y_2 & 1 & -v_2 x_2 & -v_2 y_2 \\
x_3 & y_3 & 1 & 0 & 0 & 0 & -u_3 x_3 & -u_3 y_3 \\
0 & 0 & 0 & x_3 & y_3 & 1 & -v_3 x_3 & -v_3 y_3 \\
x_4 & y_4 & 1 & 0 & 0 & 0 & -u_4 x_4 & -u_4 y_4 \\
0 & 0 & 0 & x_4 & y_4 & 1 & -v_4 x_4 & -v_4 y_4 \\
\vdots & & & & & & & \vdots
\end{bmatrix} \in \mathbb{R}^{2N \times 8}, \quad
x = \begin{bmatrix} h_{11} \\ h_{12} \\ h_{13} \\ h_{21} \\ h_{22} \\ h_{23} \\ h_{31} \\ h_{32} \end{bmatrix} \in \mathbb{R}^{8}, \quad
b = \begin{bmatrix} u_1 \\ v_1 \\ u_2 \\ v_2 \\ u_3 \\ v_3 \\ u_4 \\ v_4 \\ \vdots \end{bmatrix} \in \mathbb{R}^{2N}. \tag{5}$$
Note: $[u, v]^T$ are pixel coordinates and $[x, y]^T$ are ground coordinates with respect to the origin that you have defined.
Modify your code to compute the homography matrix between the floor plane and the camera plane. Define the $[x, y]^T$ floor coordinates of the first point you click on to be $[0, 0]^T$. Calculate the $[x, y]^T$ coordinates of the other 3 points you click on using the fact that the ground tiles are 30.48 cm (1 ft) per side. (See Figure 2 for reference.) Modify your code to create and solve the linear equations above for the values of $H$; you can use the inv function from the np.linalg module, as in the sketch below. You can then check that you calculated the homography correctly by using the check_homography function that follows the sketch. (If you calculated the homography correctly, running this function should draw black dots on the intersections of the tiles.)
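A minimal sketch of the homography computation (here ground holds the floor coordinates of the four clicked points in click order, with the first click at the origin; variable names are illustrative):

# Floor coordinates (in cm) of the four clicked points, in click
# order; here we assume the four corners of one 30.48 cm tile.
ground = [(0.0, 0.0), (30.48, 0.0), (30.48, 30.48), (0.0, 30.48)]

A = np.zeros((8, 8))
b = np.zeros((8, 1))
for i in range(4):
    u, v = points[i]
    x, y = ground[i]
    # One pair of rows of Eqn. 5 per clicked point
    A[2*i, :] = [x, y, 1, 0, 0, 0, -u*x, -u*y]
    A[2*i+1, :] = [0, 0, 0, x, y, 1, -v*x, -v*y]
    b[2*i] = u
    b[2*i+1] = v

# Solve A x = b for the 8 unknowns, then append h33 = 1
h = np.dot(inv(A), b)
H = np.append(h, 1).reshape((3, 3))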
def check_homography(H, nx, ny, length, image):
    # H should be a 3x3 numpy.array
    # nx is the number of tiles in the x direction
    # ny is the number of tiles in the y direction
    # length is the length of one side of a tile
    # image is an image array
    for i in range(nx):
        for j in range(ny):
            xbar = np.array([[i*length], [j*length], [1]])
            ubar = np.dot(H, xbar).T[0]
            u = ubar[0]/ubar[2]
            v = ubar[1]/ubar[2]
            # cv2 expects integer pixel coordinates
            cv2.circle(image, (int(u), int(v)), 5, 0, -1)
    cv2.imshow('Check Homography', image)
    cv2.waitKey()
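For example, if your image shows roughly a 7 × 7 patch of tiles (the counts here are illustrative):

check_homography(H, 7, 7, 30.48, cvimage_array)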
Lab Report Assignment: In your lab report, include a picture of the floor with black dots at the intersections of the tiles and give an explanation of what the check_homography function is doing. You can save the image in OpenCV by changing it back from an array to an iplimage and saving it using the cv.SaveImage function.

image_to_save = cv.fromarray(image)
cv.SaveImage('/absolute/path/to/folder/image_name.jpg', image_to_save)
You should also include any new code that you wrote to compute the homography. (You don’t need to include the
code that was given to you.)
Figure 2: (a) Corner selection for ground plane. (b) Calibrated image with projected grid.
Going from pixel coordinates back to floor coordinates can be done by using the inverse of the homography matrix, $H^{-1}$. Let this matrix be,
$$H^{-1} := Q = \begin{bmatrix} q_{11} & q_{12} & q_{13} \\ q_{21} & q_{22} & q_{23} \\ q_{31} & q_{32} & q_{33} \end{bmatrix}. \tag{6}$$
With this inverse transform, the floor coordinates of any pixel can be computed as,
$$x = \frac{q_{11}u + q_{12}v + q_{13}}{q_{31}u + q_{32}v + q_{33}}, \qquad y = \frac{q_{21}u + q_{22}v + q_{23}}{q_{31}u + q_{32}v + q_{33}}. \tag{7}$$
Modify your code to compute the distance between two points on the floor when you click on them in the image; a sketch of the conversion is given below. Test your code by placing an object of known length on the floor and using your code to measure it. Then try measuring the length of an object that does not lie in the floor plane. Are the measurements still accurate? Why or why not?
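A minimal sketch of the pixel-to-floor conversion and distance computation (assuming H from above, with p1 and p2 the two clicked pixel locations; names are illustrative):

def pixel_to_floor(Q, pixel):
    # Map a (u, v) pixel through the inverse homography Q = inv(H)
    u, v = pixel
    w = np.dot(Q, np.array([[u], [v], [1.0]]))
    # Divide by the third homogeneous coordinate to get (x, y)
    return np.array([w[0, 0]/w[2, 0], w[1, 0]/w[2, 0]])

Q = inv(H)
X1 = pixel_to_floor(Q, p1)
X2 = pixel_to_floor(Q, p2)
# Euclidean distance on the floor, in the same units as the tiles (cm)
print "Distance: %.2f cm" % norm(X1 - X2)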
Lab Report Assignment: Place an object of known length on the floor and measure it by clicking on the ends of the object in the picture from the webcam, recording the pixel points, and using the inverse homography to determine the length of the object. Save a picture of the object with the points you clicked shown as black dots. (Look at the
check_homography function to see how to draw the dots.) Compare the known length of the object with the
length you measured using the inverse homography. Are they the same? (If the object is lying directly on the floor,
they should be very close.) Now try measuring an object of known length that doesn’t lie in the plane of the floor.
Again, include a picture with the points you clicked shown as black dots. Compare the known length of this object
with the length you measured using the inverse homography. Are they the same? Why or why not? Again, make sure
you include any new code you wrote for this section in your lab report.
Figure 3: Length between selected points.