
C18 Machine Vision 1: Solutions

David Murray, MT 2015
Bugs/queries to [email protected]
For answers, hints, corrections see the course page at www.robots.ox.ac.uk/~dwm/Courses/4CV

1. Lecture 1: The camera as a geometric device


[Figure: the camera frame (x_C, y_C, z_C) with its image plane; an aligned frame (x_A, y_A, z_A) at the world origin with axes parallel to the camera's; and the world frame (x_W, y_W, z_W). The labels h and 4h mark the offsets between the camera and the world origin.]

(a) By deriving a succession of Euclidean transformations, find the camera's extrinsic calibration matrix [R|t].
First find the rotation from the world frame to the aligned frame A (either in steps, or in one go here as it is obvious). Second, handle the translation.
Using inhomogeneous 3 × 1 scene vectors
\[
\mathbf{X}_A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix} \mathbf{X}_W
\qquad
\mathbf{X}_C = \mathbf{X}_A + \begin{pmatrix} 0 \\ -h \\ 4h \end{pmatrix}
\]
Now using 4 × 1 scene vectors
\[
\Rightarrow\; \mathbf{X}_C = \begin{bmatrix} R & \mathbf{t} \\ \mathbf{0}^\top & 1 \end{bmatrix} \mathbf{X}_W
= \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & -h \\ 1 & 0 & 0 & 4h \\ 0 & 0 & 0 & 1 \end{bmatrix} \mathbf{X}_W
\]

(b) The “ideal” image plane is placed at $z_C = 1$. Derive the image coordinates of the vanishing point of the family of lines parallel to $(X_W, Y_W, Z_W) = (2 + 4t,\ 3 + 2t,\ 4 + 3t)$.
 
Points on the lines are
\[
\mathbf{X}_W \stackrel{P}{=} \begin{pmatrix} 2+4t \\ 3+2t \\ 4+3t \\ 1 \end{pmatrix}.
\]
Let $t \to \infty$, and the point at infinity is
\[
\mathbf{X}_W^{\infty} \stackrel{P}{=} \begin{pmatrix} 4 \\ 2 \\ 3 \\ 0 \end{pmatrix}
\]

Projection gives the vanishing point:
\[
\mathbf{x} \stackrel{P}{=} [I|\mathbf{0}]
\begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & -h \\ 1 & 0 & 0 & 4h \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{pmatrix} 4 \\ 2 \\ 3 \\ 0 \end{pmatrix}
= \begin{pmatrix} 2 \\ 3 \\ 4 \end{pmatrix}
\quad\Rightarrow\quad
\mathbf{x} \stackrel{P}{=} \begin{pmatrix} 1/2 \\ 3/4 \\ 1 \end{pmatrix}
\]

(c) The intrinsic calibration matrix maps ideal image positions onto actual positions in
pixels. Explain ... (Lecture notes.)
(d) The actual camera has $f = 800$ pixels, $\gamma = 0.9$, $s = 0$, and $(u_0, v_0) = (350, 250)$ pixels. Derive the coordinates in the actual image plane of the vanishing point of part (b).
Using an explicit per-point scale $\lambda$ to convert the up-to-scale $\stackrel{P}{=}$ into a true $=$ ...
\[
\lambda \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}
= \begin{bmatrix} 800 & 0 & 350 \\ 0 & 720 & 250 \\ 0 & 0 & 1 \end{bmatrix}
\begin{pmatrix} 1/2 \\ 3/4 \\ 1 \end{pmatrix}
\]
\[
\Rightarrow\quad \lambda x = 800(1/2) + 350(1), \qquad \lambda y = 720(3/4) + 250(1), \qquad \lambda = 1
\]
\[
\Rightarrow\quad [x, y] = [750, 790]
\]
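As a quick check, the same computation in Matlab (the variable names are mine):

K = [800   0 350;    % f = 800, s = 0, u0 = 350
       0 720 250;    % gamma*f = 0.9*800 = 720, v0 = 250
       0   0   1];
xideal = [1/2; 3/4; 1];     % vanishing point on the ideal plane
xpix = K * xideal;          % lambda = 1 since K(3,:) = [0 0 1] and xideal(3) = 1
xpix = xpix / xpix(3)       % gives [750; 790; 1]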

2. Lecture 1: Camera calibration

(a) Step 1: recover the projection matrix up to scale. Write
\[
\lambda_i \begin{pmatrix} x_i \\ y_i \\ 1 \end{pmatrix} = P\mathbf{X}_i
= \begin{bmatrix} p_{11} & p_{12} & p_{13} & p_{14} \\ p_{21} & p_{22} & p_{23} & p_{24} \\ p_{31} & p_{32} & p_{33} & p_{34} \end{bmatrix} \mathbf{X}_i
\]
Hence
\[
(p_{31}X_i + p_{32}Y_i + p_{33}Z_i + p_{34})\,x_i = p_{11}X_i + p_{12}Y_i + p_{13}Z_i + p_{14}
\]
\[
(p_{31}X_i + p_{32}Y_i + p_{33}Z_i + p_{34})\,y_i = p_{21}X_i + p_{22}Y_i + p_{23}Z_i + p_{24}
\]

Re-arrange to give for point $i$
\[
\begin{bmatrix}
X_i & Y_i & Z_i & 1 & 0 & 0 & 0 & 0 & -X_i x_i & -Y_i x_i & -Z_i x_i & -x_i \\
0 & 0 & 0 & 0 & X_i & Y_i & Z_i & 1 & -X_i y_i & -Y_i y_i & -Z_i y_i & -y_i
\end{bmatrix} \mathbf{p} = \mathbf{0}
\]
where vector $\mathbf{p} = (p_{11}, p_{12}, \ldots, p_{33}, p_{34})^\top$ contains the unknown elements. For 6 known points $i = 1 \ldots 6$, stack the rows to build $A\mathbf{p} = \mathbf{0}$ and find $\mathbf{p}$ as the null-space of $A$:
\[
\mathbf{p} = \ker(A)
\]

and refill the projection matrix P.


Step 2. Construct $P_{\rm LEFT}$ from the leftmost $3\times3$ block of $P$. Invert it to obtain $P_{\rm LEFT}^{-1}$. (It equals $R^{-1}K^{-1}$.)

Step 3. Apply the QR-decomposition to $P_{\rm LEFT}^{-1}$:
\[
QU \leftarrow P_{\rm LEFT}^{-1}
\]
Here $Q$ is orthonormal and $U$ is upper triangular. ($U$ is conventionally also written $R$, but that symbol is already taken by the rotation.) Comparing with $P_{\rm LEFT}^{-1} = R^{-1}K^{-1}$, $Q$ must be the inverse rotation and $U$ must be $K^{-1}$ up to scale.

Step 4. Recover the rotation:
\[
R \leftarrow Q^{-1}
\]

Step 5. Recover the intrinsic matrix, but not as $K \leftarrow U^{-1}$ directly, because $K$ must be normalized so that $K_{33} = 1$. Instead
\[
\mathrm{Temp} \leftarrow U^{-1}, \qquad
\forall i,j:\ K_{ij} \leftarrow \mathrm{Temp}_{ij}/\mathrm{Temp}_{33}, \qquad
\forall i,j:\ P_{ij}^{\rm new} \leftarrow P_{ij}^{\rm old}/\mathrm{Temp}_{33}
\]

Step 6. Recover the translation. The scale of $P$ is now consistent with the scale of $K$. Recall $P = K[R|\mathbf{t}]$, hence
\[
\mathbf{t} \leftarrow K^{-1}[P_{14}\ P_{24}\ P_{34}]^\top
\]
or you could use the pre-scaled quantities: $\mathbf{t} \leftarrow U[p_4\ p_8\ p_{12}]^\top$.
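A minimal Matlab sketch of Steps 1-6 (this is not the course's calibration.m; the sign fixes of part (b) are omitted, and the function name is mine):

function [K, R, t] = calib_sketch(X, x)
% X: 3xN world points, x: 2xN image points, N >= 6.
N = size(X, 2);
A = zeros(2*N, 12);
for i = 1:N
    Xi = [X(:,i); 1]';
    A(2*i-1, :) = [Xi, zeros(1,4), -x(1,i)*Xi];    % x-row for point i
    A(2*i,   :) = [zeros(1,4), Xi, -x(2,i)*Xi];    % y-row for point i
end
[~, ~, V] = svd(A);              % Step 1: null-vector of A
P = reshape(V(:,end), 4, 3)';    % refill P row by row
Pleft = P(:, 1:3);               % Step 2
[Q, U] = qr(inv(Pleft));         % Step 3: Q orthonormal, U upper triangular
R = Q';                          % Step 4: Q^-1 = Q' for orthonormal Q
Temp = inv(U);                   % Step 5: normalize so that K(3,3) = 1
K = Temp / Temp(3,3);
P = P / Temp(3,3);
t = K \ P(:, 4);                 % Step 6
end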

(b) Code in calibration.m also fixes some sign ambiguities which were mentioned but not explained
in the lecture (they would not be asked for). They arise because you can change the sign of
a row or column of a rotation matrix and it remains a rotation matrix. To understand them,
write the product KR in terms of symbols:
\[
KR = \begin{bmatrix}
(fR_{11} + sR_{21} + u_0R_{31}) & (fR_{12} + sR_{22} + u_0R_{32}) & (fR_{13} + sR_{23} + u_0R_{33}) \\
(\gamma f R_{21} + v_0R_{31}) & (\gamma f R_{22} + v_0R_{32}) & (\gamma f R_{23} + v_0R_{33}) \\
K_{33}R_{31} & K_{33}R_{32} & K_{33}R_{33}
\end{bmatrix}
\]
and argue that $f$, $\gamma f$ and $K_{33}$ must be positive. So
• if $(f < 0)$, change the signs of $f$, $R_{11}$, $R_{12}$, $R_{13}$;
• if $(\gamma f < 0)$, change the signs of $\gamma f$, $s$, $R_{21}$, $R_{22}$, $R_{23}$;
• if $(K_{33} < 0)$, change the signs of $K_{33}$, $u_0$, $v_0$, $R_{31}$, $R_{32}$, $R_{33}$.
(c) Projection is
\[
(\lambda_1 \cdots \lambda_6)\,\mathbf{x} =
\begin{bmatrix} 1 & 0 & 2 \\ 0 & 2 & 1 \\ 0 & 0 & 1 \end{bmatrix}
[I|\mathbf{0}]
\begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & -1 \\ 1 & 0 & 0 & 4 \\ 0 & 0 & 0 & 1 \end{bmatrix}
(\mathbf{X}_1 \cdots \mathbf{X}_6)
\]
\[
= \begin{bmatrix} 2 & 1 & 0 & 8 \\ 1 & 0 & 2 & 2 \\ 1 & 0 & 0 & 4 \end{bmatrix}
\begin{bmatrix} 0 & 0 & 0 & 0 & 1 & 1 \\ 0 & 1 & 1 & 0 & 0 & 1 \\ 0 & 0 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 & 1 \end{bmatrix}
= \begin{bmatrix} 8 & 9 & 9 & 8 & 10 & 11 \\ 2 & 2 & 4 & 4 & 5 & 5 \\ 4 & 4 & 4 & 4 & 5 & 5 \end{bmatrix}
\]

So the measured inhomogeneous points are
\[
\begin{pmatrix} x \\ y \end{pmatrix} =
\begin{pmatrix} 2 \\ 1/2 \end{pmatrix},
\begin{pmatrix} 9/4 \\ 1/2 \end{pmatrix},
\begin{pmatrix} 9/4 \\ 1 \end{pmatrix},
\begin{pmatrix} 2 \\ 1 \end{pmatrix},
\begin{pmatrix} 2 \\ 1 \end{pmatrix},
\begin{pmatrix} 11/5 \\ 1 \end{pmatrix}
\]
Yes, two image points are the same. No matter.
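This forward projection is quick to reproduce in Matlab (a sketch, with my variable names):

K = [1 0 2; 0 2 1; 0 0 1];
E = [0 1 0 0; 0 0 1 -1; 1 0 0 4; 0 0 0 1];
P = K * [eye(3), zeros(3,1)] * E;   % = [2 1 0 8; 1 0 2 2; 1 0 0 4]
Xw = [0 0 0 0 1 1;
      0 1 1 0 0 1;
      0 0 1 1 1 1;
      1 1 1 1 1 1];                 % homogeneous world points, one per column
xh = P * Xw;                        % homogeneous image points
x  = xh(1:2,:) ./ xh(3,:)           % = [2 9/4 9/4 2 2 11/5; 1/2 1/2 1 1 1 1]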


(d) Results from Matlab using the above data ...
intrinsic =
    1.0000    0.0000    2.0000
         0    2.0000    1.0000
         0         0    1.0000

Rot =
    0.0000    1.0000    0.0000
   -0.0000   -0.0000    1.0000
    1.0000   -0.0000    0.0000

translation =
    0.0000
   -1.0000
    4.0000
Hurrah, it works.

3. Lecture 2: Edges, Radial Distortion & Corners

(a) i) An undistorted image point at radius $r$ is moved in the radial direction to $r_d = r/\sqrt{1 - 2\kappa r^2}$. By considering the distortion of the straight line in (b), determine whether $\kappa < 0$ gives rise to barrel or pincushion distortion.
[Figure: radial distortion about the image centre. For $\kappa < 0$, $r_d < r$ (barrel); for $\kappa > 0$, $r_d > r$ (pincushion).]
From the formula, for $\kappa > 0$ the larger $r$ gets, the smaller the denominator becomes. The distorted point gets pushed ever further from the centre. Hence this is pincushion distortion. Obviously, vice versa for barrel distortion.
ii) Find the inverse ... No need to expand ... just rearrange:
\[
r_d = \frac{r}{\sqrt{1 - 2\kappa r^2}}
\;\Rightarrow\; \frac{1}{r_d^2} = \frac{1}{r^2} - 2\kappa
\;\Rightarrow\; \frac{1}{r^2} = \frac{1}{r_d^2} + 2\kappa
\;\Rightarrow\; r = \frac{r_d}{\sqrt{1 + 2\kappa r_d^2}}
\]

iii) ... skeleton of an algorithm to determine $\kappa$ ... (a minimal sketch of the loop follows the list)
• Choose a scene with many straight lines. Capture an image.
• Detect edgels, and join into strings. Break strings where the curvature is high.
• Set $\kappa = 0$.
• “Undistort” the edgel positions using the current $\kappa$: $r = r_d/\sqrt{1 + 2\kappa r_d^2}$.
• Fit straight lines to the strings.
• Sum up the squared orthogonal deviations $E_{iL}$ of edgels from their lines to generate an objective function $C = \sum_L \sum_{i\in L} E_{iL}^2$ to minimize.
• Select a new $\kappa$ to reduce $C$, and repeat from the undistortion step.
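A minimal Matlab sketch of that loop, assuming the edgel strings have already been detected; fminbnd, the search bracket, and the total-least-squares line fit are my choices:

function kappa = estimate_kappa_sketch(strings)
% strings: cell array; each cell is a 2xM array of distorted edgel
% positions measured relative to the image centre.
kappa = fminbnd(@(k) straightness_cost(k, strings), -1e-6, 1e-6);
end

function C = straightness_cost(kappa, strings)
C = 0;
for s = 1:numel(strings)
    xd = strings{s};
    rd = sqrt(sum(xd.^2, 1));
    r  = rd ./ sqrt(1 + 2*kappa*rd.^2);   % undistort radially
    xu = xd .* (r ./ rd);                 % rescale each edgel towards centre
    mu = mean(xu, 2);                     % fit a line by total least squares:
    d  = xu - mu;                         % the smallest eigenvalue of the 2x2
    C  = C + min(eig(d * d'));            % scatter is the sum of squared
end                                       % orthogonal deviations
end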

(b) Aperture problem ...


(c) From the notes.

4. Lecture 3: Epipolar geometry

(a) i) From the notes. Practice drawing.


ii) Different points x in general give rise to different epipolar planes, which pierce the images at different epipolar lines. However, all the planes share the baseline, and so all epipolar lines pass through the epipole. Add diagram showing “cat's whiskers”.
(b) Derive the point at infinity Q along the ray through x, and determine its projection q′ in the image of the second camera C′. Also determine the epipole e′, and hence show that a homogeneous expression for the epipolar line in C′ is ...
\[
\mathbf{Q} = \begin{pmatrix} K^{-1}\mathbf{x} \\ 0 \end{pmatrix}
\;\Rightarrow\;
\mathbf{q}' = K'[R|\mathbf{t}]\begin{pmatrix} K^{-1}\mathbf{x} \\ 0 \end{pmatrix} = K'RK^{-1}\mathbf{x}
\]
\[
\mathbf{C} = \begin{pmatrix} \mathbf{0} \\ 1 \end{pmatrix}
\;\Rightarrow\;
\mathbf{e}' = K'[R|\mathbf{t}]\begin{pmatrix} \mathbf{0} \\ 1 \end{pmatrix} = K'\mathbf{t}
\]
\[
\Rightarrow\;
\mathbf{l}' = \mathbf{e}'\times\mathbf{q}' = K'\mathbf{t} \times K'RK^{-1}\mathbf{x}
= [K']^{-\top}(\mathbf{t} \times RK^{-1}\mathbf{x})
= [K']^{-\top}[\mathbf{t}]_\times RK^{-1}\mathbf{x} = F\mathbf{x}
\]
(The third step uses the identity $M\mathbf{a} \times M\mathbf{b} = \det(M)\,M^{-\top}(\mathbf{a}\times\mathbf{b})$; the determinant is absorbed into the overall homogeneous scale.)
(c) i) ... derive a compact expression for the Fundamental Matrix F ... done already.
ii) Show that $\mathbf{x}'^\top F\mathbf{x} = 0$.
Point $\mathbf{x}'$ is on the epipolar line, hence $\mathbf{x}'^\top\mathbf{l}' = 0$. Hence $\mathbf{x}'^\top F\mathbf{x} = 0$.
(d) A single camera with K = I captures an image, and then translates along its optic axis before capturing a second image, so that $\mathbf{t} = [0, 0, t_z]^\top$.
i) Use $\mathbf{x}'^\top F\mathbf{x} = 0$ to derive an explicit relationship relating $x'$, $y'$, $x$, and $y$. First find F:
\[
F = [K']^{-\top}[\mathbf{t}]_\times RK^{-1} = [\mathbf{t}]_\times
= \begin{bmatrix} 0 & -t_z & 0 \\ t_z & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}
\]

Hence
\[
\mathbf{x}'^\top F\mathbf{x} = t_z(-x'y + y'x) = 0, \quad\text{hence}\quad x'/y' = x/y.
\]

ii) Briefly relate your result to the expected epipolar geometry for this camera motion.
The camera has moved along the optic axis, so we expect the origin to be the Focus of Expansion of a flow field. This is exactly what $x'/y' = x/y$ expresses. [Figure: image points $(x, y)$ and $(x', y')$ lie on a common ray through the origin.]

5. Lecture 3: Epipolar geometry

(a) The figure shows a pair of cameras, each with focal length unity, whose principal axes meet at a point. The $y$-axes of both cameras are parallel and point out of the page.
i) C has K = I and P = [I|0]. Find P′, find F.
[Figure: cameras C and C′, each a distance $d$ from the point where their principal axes meet at $60^\circ$.]
Camera C′ has P′ = K′[R|t] with K′ = I, so this is all about finding the extrinsic matrix for the second camera, such that
\[
\mathbf{X}' = \begin{bmatrix} R & \mathbf{t} \\ \mathbf{0}^\top & 1 \end{bmatrix} \mathbf{X}
\]
Build in stages.
\[
\mathbf{X}_1 = \begin{bmatrix} I & \begin{matrix} 0 \\ 0 \\ -d \end{matrix} \\ \mathbf{0}^\top & 1 \end{bmatrix}\mathbf{X}; \qquad
\mathbf{X}_2 = \begin{bmatrix} \cos 60^\circ & 0 & -\sin 60^\circ & 0 \\ 0 & 1 & 0 & 0 \\ \sin 60^\circ & 0 & \cos 60^\circ & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}\mathbf{X}_1; \qquad
\mathbf{X}' = \begin{bmatrix} I & \begin{matrix} 0 \\ 0 \\ +d \end{matrix} \\ \mathbf{0}^\top & 1 \end{bmatrix}\mathbf{X}_2
\]
Combine:
\[
\mathbf{X}' = \begin{bmatrix} R & \mathbf{t} \\ \mathbf{0}^\top & 1 \end{bmatrix}\mathbf{X}
= \begin{bmatrix} 1/2 & 0 & -\sqrt{3}/2 & +(\sqrt{3}/2)d \\ 0 & 1 & 0 & 0 \\ \sqrt{3}/2 & 0 & 1/2 & +(1/2)d \\ 0 & 0 & 0 & 1 \end{bmatrix}\mathbf{X}
\]

\[
F \stackrel{P}{=} [K']^{-\top}[\mathbf{t}]_\times RK^{-1}
= I^\top \begin{bmatrix} 0 & -(1/2)d & 0 \\ +(1/2)d & 0 & -(\sqrt{3}/2)d \\ 0 & +(\sqrt{3}/2)d & 0 \end{bmatrix}
\begin{bmatrix} 1/2 & 0 & -\sqrt{3}/2 \\ 0 & 1 & 0 \\ \sqrt{3}/2 & 0 & 1/2 \end{bmatrix} I^{-1}
= \begin{bmatrix} 0 & -(1/2)d & 0 \\ -(1/2)d & 0 & -(\sqrt{3}/2)d \\ 0 & +(\sqrt{3}/2)d & 0 \end{bmatrix}
\]
ii) Use the relationship $\mathbf{l}' = F\mathbf{x}$ to compute the epipolar line in the right image corresponding to the homogeneous point $\mathbf{x} = [1, 1, 1]^\top$ in the left image.
\[
\mathbf{l}' \stackrel{P}{=} F\mathbf{x}
= \begin{bmatrix} 0 & -(1/2)d & 0 \\ -(1/2)d & 0 & -(\sqrt{3}/2)d \\ 0 & +(\sqrt{3}/2)d & 0 \end{bmatrix}
\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}
\stackrel{P}{=} \begin{pmatrix} +1 \\ +(1+\sqrt{3}) \\ -\sqrt{3} \end{pmatrix}
\]
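A quick numerical check of (i) and (ii), taking $d = 1$ (any $d > 0$ gives the same line up to scale; variable names are mine):

d = 1; c = cosd(60); s = sind(60);
R = [c 0 -s; 0 1 0; s 0 c];
t = [s*d; 0; c*d];                          % = [sqrt(3)/2; 0; 1/2]*d
tx = [0 -t(3) t(2); t(3) 0 -t(1); -t(2) t(1) 0];
F = tx * R                                  % K = K' = I
lp = F * [1; 1; 1];
lp = lp / lp(1)                             % = [1; 1+sqrt(3); -sqrt(3)]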

(b) i) Derive the 4-vectors which represent the optical centres of the two cameras.
The world frame is coincident with the frame of the first camera, so that the optical centre of C is at the world origin $\begin{pmatrix}\mathbf{0}\\1\end{pmatrix}$. The optical centre of C′ is at $\begin{pmatrix}\mathbf{C}'\\1\end{pmatrix}$ such that
\[
\begin{bmatrix} R & \mathbf{t} \\ \mathbf{0}^\top & 1 \end{bmatrix}
\begin{pmatrix} \mathbf{C}' \\ 1 \end{pmatrix}
= \begin{pmatrix} \mathbf{0} \\ 1 \end{pmatrix}
\;\Rightarrow\;
\begin{pmatrix} \mathbf{C}' \\ 1 \end{pmatrix}
= \begin{pmatrix} -R^\top\mathbf{t} \\ 1 \end{pmatrix}
\]

ii) Use the above results to derive expressions for the epipoles of the two cameras.
Epipole $\mathbf{e}$ is the projection of the optic centre of C′ into camera C:
\[
\mathbf{e} \stackrel{P}{=} K[I|\mathbf{0}]\begin{pmatrix} -R^\top\mathbf{t} \\ 1 \end{pmatrix}
\stackrel{P}{=} KR^\top\mathbf{t}
\]
(the minus sign is absorbed into the homogeneous scale). Epipole $\mathbf{e}'$ is the projection of the optic centre of C into camera C′:
\[
\mathbf{e}' \stackrel{P}{=} K'[R|\mathbf{t}]\begin{pmatrix} \mathbf{0} \\ 1 \end{pmatrix} = K'\mathbf{t}.
\]

iii) Prove that the epipoles e and e′ are the right and left null-spaces respectively of the fundamental matrix F.
We have to show that $F\mathbf{e} = \mathbf{0}$ and $F^\top\mathbf{e}' = \mathbf{0}$, respectively.
\[
F\mathbf{e} \stackrel{P}{=} [K']^{-\top}[\mathbf{t}]_\times RK^{-1}\mathbf{e}
= [K']^{-\top}[\mathbf{t}]_\times RK^{-1}KR^\top\mathbf{t}
= [K']^{-\top}[\mathbf{t}]_\times\mathbf{t} = \mathbf{0}
\]
\[
F^\top\mathbf{e}' \stackrel{P}{=} K^{-\top}R^\top[\mathbf{t}]_\times^\top[K']^{-1}K'\mathbf{t}
= K^{-\top}R^\top[\mathbf{t}]_\times^\top\mathbf{t}
\]
But $[\mathbf{t}]_\times^\top = -[\mathbf{t}]_\times$ as it is skew-symmetric. So
\[
F^\top\mathbf{e}' = -K^{-\top}R^\top[\mathbf{t}]_\times\mathbf{t} = \mathbf{0}
\]

(c) A camera rotates about its optical centre and changes its intrinsics, so that the camera matrices before and after are P = K[I|0] and P′ = K′[R|0].
i) Derive an expression for the homography H which relates the images of points before and after the motion as x′ = Hx.
From part (a),
\[
\mathbf{x} \stackrel{P}{=} K\mathbf{X}^W_{3\times1}
\quad\text{and}\quad
\mathbf{x}' \stackrel{P}{=} K'R\,\mathbf{X}^W_{3\times1}
\]
Eliminate $\mathbf{X}^W_{3\times1}$:
\[
\mathbf{x}' \stackrel{P}{=} K'RK^{-1}\mathbf{x} = H\mathbf{x}
\]
The matrix H is a $3 \times 3$ matrix with 8 d.o.f.


ii) How might this formula be applied to rectify images from a stereo camera rig?
You want to work out the pixel values for square pixels in the rectified image. The trick is to perform the inverse of the above. (A minimal code sketch follows the figure.)
• For each pixel $(i, j)$ in the rectified image ...
• Consider its four corners at $(i, j)$, $(i, j+1)$, $(i+1, j+1)$, $(i+1, j)$.
• Derive their positions in the original image:
\[
\mathbf{x}_{1,2,3,4} = H^{-1}\begin{pmatrix} i \\ j \\ 1 \end{pmatrix},\;
H^{-1}\begin{pmatrix} i \\ j+1 \\ 1 \end{pmatrix},\;
H^{-1}\begin{pmatrix} i+1 \\ j+1 \\ 1 \end{pmatrix},\;
H^{-1}\begin{pmatrix} i+1 \\ j \\ 1 \end{pmatrix}
\]
• Dehomogenize into $(x, y)_{1,2,3,4}$.
• Find a weighted sum of the pixel values, as illustrated.

[Figure: the four mapped corners 1-4 form a quadrilateral over the original image mesh; the output pixel is a weighted sum of the original pixels it overlaps.]
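A minimal Matlab sketch of the inverse warp, using bilinear interpolation at the mapped pixel centre (interp2) as a simple stand-in for the four-corner area weighting illustrated above; the function name is mine:

function Irect = rectify_sketch(I, H, outsize)
% I: greyscale image (double); H: 3x3 homography with x' = H*x;
% outsize: [rows cols] of the rectified image.
[jj, ii] = meshgrid(1:outsize(2), 1:outsize(1));
xr = [jj(:)'; ii(:)'; ones(1, numel(ii))];   % rectified pixel centres (x = col)
xo = H \ xr;                                 % map back into the original image
xo = xo(1:2,:) ./ xo(3,:);                   % dehomogenize
Irect = reshape(interp2(I, xo(1,:), xo(2,:), 'linear', 0), outsize);
end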

6. (a) Find F, books closed. Already done.


(b) Using 3-vectors, and for pure rotation,
\[
\lambda\mathbf{x} = K\mathbf{X} \quad\text{and}\quad \lambda'\mathbf{x}' = K'(R\mathbf{X} + \mathbf{t})
\]
becomes $\lambda'\mathbf{x}' = K'R\mathbf{X}$, and hence, inverting the first expression in order to replace $\mathbf{X}$,
\[
\lambda'\mathbf{x}' = \lambda K'RK^{-1}\mathbf{x}
\;\Rightarrow\;
\mathbf{x}' \stackrel{P}{=} H\mathbf{x} \quad\text{with}\quad H \stackrel{P}{=} K'RK^{-1}
\]
(c) The camera's motion and zoom change are again as in part (a), but the scene viewed is now a planar surface $\hat{\mathbf{n}}^\top\mathbf{X} = d$, where $\hat{\mathbf{n}}$ is the plane's unit normal.
i) Using the projection equation for the first position, show that
\[
\frac{\mathbf{X}}{\lambda} = K^{-1}\mathbf{x}
\quad\text{and}\quad
\frac{\mathbf{t}}{\lambda} = \frac{\mathbf{t}\,\hat{\mathbf{n}}^\top K^{-1}\mathbf{x}}{d}.
\]
We haven't seen this before, so just obey the instructions ... Use the first projection equation:
\[
\lambda\mathbf{x} = K\mathbf{X} \;\Rightarrow\; \lambda K^{-1}\mathbf{x} = \mathbf{X} \;\Rightarrow\; \frac{\mathbf{X}}{\lambda} = K^{-1}\mathbf{x}
\]
Now introduce the equation of the plane ...
\[
\frac{\hat{\mathbf{n}}^\top\mathbf{X}}{\lambda} = \frac{d}{\lambda} = \hat{\mathbf{n}}^\top K^{-1}\mathbf{x}
\;\Rightarrow\;
\frac{1}{\lambda} = \frac{\hat{\mathbf{n}}^\top K^{-1}\mathbf{x}}{d}
\;\Rightarrow\;
\frac{\mathbf{t}}{\lambda} = \frac{\mathbf{t}\,\hat{\mathbf{n}}^\top K^{-1}\mathbf{x}}{d}
\]
ii) Using that for the second position, develop an expression for $\dfrac{\lambda'\mathbf{x}'}{\lambda}$. Again just follow the instructions ...
\[
\lambda'\mathbf{x}' = K'(R\mathbf{X} + \mathbf{t})
\;\Rightarrow\;
\frac{\lambda'\mathbf{x}'}{\lambda} = K'\left(R\frac{\mathbf{X}}{\lambda} + \frac{\mathbf{t}}{\lambda}\right)
\]
Suddenly you see why we were asked the previous part ...
\[
\frac{\lambda'\mathbf{x}'}{\lambda} = K'\left(RK^{-1}\mathbf{x} + \frac{\mathbf{t}\,\hat{\mathbf{n}}^\top K^{-1}\mathbf{x}}{d}\right)
\]
and show that the corresponding points are again related by a homography:
\[
\mathbf{x}' \stackrel{P}{=} K'\left(RK^{-1} + \frac{\mathbf{t}\,\hat{\mathbf{n}}^\top K^{-1}}{d}\right)\mathbf{x}
\;\Rightarrow\;
J \stackrel{P}{=} K'\left(RK^{-1} + \frac{\mathbf{t}\,\hat{\mathbf{n}}^\top K^{-1}}{d}\right)
\]
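A quick numerical check that $J$ maps points on the plane consistently (all values here are my own test numbers):

% Check x' = J*x (up to scale) for a point on the plane nhat'*X = dplane.
K  = eye(3); Kp = diag([2 2 1]);               % test intrinsics
R  = [cosd(10) 0 -sind(10); 0 1 0; sind(10) 0 cosd(10)];
t  = [0.3; 0.1; 0.2];
nhat = [0; 0; 1]; dplane = 5;                  % the plane Z = 5
J  = Kp * (R/K + t*nhat'/K/dplane);            % R/K means R*inv(K), etc.
X  = [1; -2; 5];                               % satisfies nhat'*X = 5
x  = K*X;          x  = x/x(3);
xp = Kp*(R*X + t); xp = xp/xp(3);              % direct projection
xJ = J*x;          xJ = xJ/xJ(3);              % via the homography
disp(norm(xp - xJ))                            % ~0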
(d) Show that if $\mathbf{x}' = H\mathbf{x}$, then $H^\top F$ must be an antisymmetric matrix.
\[
\mathbf{x}'^\top F\mathbf{x} = 0 \;\Rightarrow\; \mathbf{x}^\top H^\top F\mathbf{x} = 0
\]
Now consider the scalar quantity $\mathbf{x}^\top A\mathbf{x}$, where $A$ is antisymmetric. The transpose of a scalar is the same scalar, so
\[
\mathbf{x}^\top A\mathbf{x} = [\mathbf{x}^\top A\mathbf{x}]^\top = \mathbf{x}^\top A^\top\mathbf{x}
\]
But $A^\top = -A$, so
\[
\mathbf{x}^\top A\mathbf{x} = -\mathbf{x}^\top A\mathbf{x} \;\Rightarrow\; \mathbf{x}^\top A\mathbf{x} = 0.
\]
Conversely, write $H^\top F = S + A$ with $S$ symmetric and $A$ antisymmetric; then $\mathbf{x}^\top H^\top F\mathbf{x} = \mathbf{x}^\top S\mathbf{x}$, and since this vanishes for all $\mathbf{x}$, the symmetric part $S$ must be zero. Hence $H^\top F$ is antisymmetric.

7. Lecture 4: Stereo correspondence

(a) The ordering constraint is often used in stereo correspondence algorithms to disambiguate point matches on corresponding epipolar lines. Sketch a configuration of surfaces and cameras for which the ordering constraint is valid, and a configuration for which it is not valid.
As per notes. Invalid for (i) discontinuous surfaces, (ii) transparent surfaces.
(b) Prove that ZNCC is invariant under $I' = \alpha I + \beta$.
Assume the original intensity, and let the source patch be $A$. The patch mean is subtracted from each pixel, $\tilde{A}_{ij} = A_{ij} - \mu_A$, and the values used in
\[
\mathrm{ZNCC} = \frac{\sum_i\sum_j \tilde{A}_{ij}\tilde{B}_{ij}}
{\sqrt{\sum_i\sum_j \tilde{A}_{ij}^2}\,\sqrt{\sum_i\sum_j \tilde{B}_{ij}^2}}
\]

Now consider altering the intensities of each pixel as in $I' = \alpha I + \beta$. The patch mean is changed to
\[
\mu'_A = \alpha\mu_A + \beta
\]
so that subtracting the patch mean gives
\[
\tilde{A}'_{ij} = A'_{ij} - \mu'_A
= \alpha A_{ij} + \beta - (\alpha\mu_A + \beta)
= \alpha(A_{ij} - \mu_A)
= \alpha\tilde{A}_{ij}
\]

Now
\[
\mathrm{ZNCC}' = \frac{\sum_i\sum_j \tilde{A}'_{ij}\tilde{B}_{ij}}
{\sqrt{\sum_i\sum_j \tilde{A}'^2_{ij}}\,\sqrt{\sum_i\sum_j \tilde{B}_{ij}^2}}
= \frac{\alpha\sum_i\sum_j \tilde{A}_{ij}\tilde{B}_{ij}}
{\sqrt{\alpha^2\sum_i\sum_j \tilde{A}_{ij}^2}\,\sqrt{\sum_i\sum_j \tilde{B}_{ij}^2}}
= \mathrm{ZNCC}
\]
as the $\alpha$'s cancel top and bottom. (Strictly this requires $\alpha > 0$, since $\sqrt{\alpha^2} = |\alpha|$; a negative $\alpha$ would flip the sign.)
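The invariance is easy to confirm numerically; a minimal sketch (the function name is mine):

function z = zncc_sketch(A, B)
% Zero-mean normalized cross-correlation of two equal-size patches.
At = A - mean(A(:));
Bt = B - mean(B(:));
z  = sum(At(:) .* Bt(:)) / (norm(At(:)) * norm(Bt(:)));
end

For random patches A and B, zncc_sketch(0.7*A + 12, B) returns the same value as zncc_sketch(A, B).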
(c) Explain why using normalized cross-correlation is likely to be more important for stereo
matching than for matching between images in a monocular sequence.
In a monocular sequence, images are taken a short time apart (typically 30 ms) by the same camera. Any change in gain and offset over that time is likely to be negligible.
Stereo involves different cameras, which could quite easily have different gains and offsets.

8. Lecture 4: Triangulation

[Figure: two parallel cameras, each with focal length $f$, separated by a baseline of 4 units.]

(a) When the cameras are separated by $t_x = 4$ units, find the coordinates of the 3D point for the correspondences: (i) $[-1, 0] \leftrightarrow [1, 0]$; (ii) $[0, 0] \leftrightarrow [0, 0]$.
Easy to see that
\[
\mathbf{X}_1 = \begin{pmatrix} -2 \\ 0 \\ 2f \\ 1 \end{pmatrix}
\qquad
\mathbf{X}_2 = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix}
\]
The latter is a point at infinity.


(b) i) Assuming the camera matrices are $K[I|\mathbf{0}]$ and $K'[R|\mathbf{t}]$, derive a vector expression for the position $\mathbf{X}$ of the 3D scene point given a correspondence $\mathbf{x} \leftrightarrow \mathbf{x}'$, assuming no noise.
We are assuming no noise in the image, so that
\[
\mathbf{X}_{4\times1} = \begin{pmatrix} \mathbf{0} \\ 1 \end{pmatrix} + \alpha\begin{pmatrix} K^{-1}\mathbf{x} \\ 0 \end{pmatrix}
= \begin{pmatrix} -R^\top\mathbf{t} \\ 1 \end{pmatrix} + \beta\begin{pmatrix} R^\top K'^{-1}\mathbf{x}' \\ 0 \end{pmatrix}
\]
Now just relax to 3-vectors, so
\[
\mathbf{X}_{3\times1} = \alpha K^{-1}\mathbf{x} = -R^\top\mathbf{t} + \beta R^\top K'^{-1}\mathbf{x}'
\]
We want to find either $\alpha$ or $\beta$. To find $\alpha$, it saves work to bring the rotation to the LHS:
\[
\alpha RK^{-1}\mathbf{x} = -\mathbf{t} + \beta K'^{-1}\mathbf{x}'
\quad\text{or}\quad
\alpha\mathbf{p} = -\mathbf{t} + \beta\mathbf{q}
\]
Then dot this equation first with $\mathbf{p}$, then with $\mathbf{q}$, to get two simultaneous equations in $\alpha$ and $\beta$, and solve for
\[
\alpha = \frac{(\mathbf{p}^\top\mathbf{q})(\mathbf{q}^\top\mathbf{t}) - (\mathbf{q}^\top\mathbf{q})(\mathbf{p}^\top\mathbf{t})}
{(\mathbf{p}^\top\mathbf{p})(\mathbf{q}^\top\mathbf{q}) - (\mathbf{p}^\top\mathbf{q})^2}
\]

ii) Verify this expression using the parallel camera configuration earlier in this question:
\[
R = I, \qquad \mathbf{t} = [4, 0, 0]^\top, \qquad K^{-1} = K'^{-1} = \mathrm{Diag}(1/f, 1/f, 1)
\]
Using the corresponding points
\[
\mathbf{x} = [-1, 0, 1]^\top \qquad \mathbf{x}' = [1, 0, 1]^\top
\]
you should find
\[
\mathbf{p} = RK^{-1}\mathbf{x} = [-1/f, 0, 1]^\top \qquad \mathbf{q} = K'^{-1}\mathbf{x}' = [1/f, 0, 1]^\top
\]
Using our formula for $\alpha$ we find $\alpha = 2f$.
So for the parallel cameras, $\mathbf{X} = \alpha K^{-1}\mathbf{x} = [-2, 0, 2f]^\top$ — just as before. Which is nice.
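The check in a few lines of Matlab (variable names mine; a sample focal length stands in for the symbolic $f$):

f = 3;                                   % any f > 0 will do; alpha should be 2f
R = eye(3); t = [4; 0; 0];
Kinv = diag([1/f, 1/f, 1]); Kpinv = Kinv;
x  = [-1; 0; 1]; xp = [1; 0; 1];
p = R * Kinv * x;                        % = [-1/f; 0; 1]
q = Kpinv * xp;                          % = [ 1/f; 0; 1]
alpha = ((p'*q)*(q'*t) - (q'*q)*(p'*t)) / ((p'*p)*(q'*q) - (p'*q)^2);
X = alpha * Kinv * x                     % alpha = 6 = 2f, X = [-2; 0; 6] = [-2; 0; 2f]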
(c) For a parallel camera stereo configuration the “horizontal” disparity $d$ is given by $d = x' - x = f t_x/Z$. Hence show that the error $\delta Z$ in depth $Z$ corresponding to an error $\delta d$ in disparity is given by $\delta Z = -(\delta d/f t_x)Z^2$.
We are told $Z = f t_x \frac{1}{d}$, so
\[
\delta Z = -f t_x\frac{1}{d^2}\,\delta d = -f t_x\frac{Z^2}{(f t_x)^2}\,\delta d = -\frac{Z^2}{f t_x}\,\delta d
\]
What are the consequences of this relation when computing a stereo reconstruction?
The error in disparity $\delta d$ is likely to be constant across the image — it depends principally on uncertainties in the location of image features, which are independent of depth. As $f$ and $t_x$ are constant, the uncertainty in depth is proportional to the depth squared.
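To put illustrative numbers on this (the values are mine): with $f = 800$ pixels, $t_x = 0.1$ m and $\delta d = 0.5$ pixel, $|\delta Z| = (0.5/80)Z^2$, so a point at $Z = 2$ m is uncertain by about 0.025 m, while a point at $Z = 10$ m is uncertain by about 0.63 m.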

9. For interest only.
