
Camera calibration by Zhang

Siniša Kolarić
<http://www.inf.puc-rio.br/~skolaric>

September 2006

Abstract
In this presentation, I describe how to calibrate a camera using Zhang's method.
NOTE. This is accompanying material for my course assignments in INF2064 "Tópicos de Computação Gráfica III - Realidade Aumentada e Cooperativa", taught by Prof. Marcelo Gattass during the 2006.2 semester.

The problem

Given a set of photos (either real photos, taken with a real camera, or
virtual photos, produced by a "virtual"¹ camera), determine the camera's:

• Intrinsic parameters

• Extrinsic parameters

¹ For example, a camera implemented with the perspective transformation in OpenGL, or one implemented in a ray tracer.

Camera’s intrinsic parameters

• Scaling factors — sx, sy

• Image center (principal point) — (ox, oy)

• Focal length(s) — f (fx = f /sx, fy = f /sy )

• Skewness of the image axes — sh

• Radial lens distortion (barrel / pin-cushion effect) — k1, k2



Camera’s intrinsic matrix (Trucco & Verri)

 
\[
K = \begin{bmatrix} -\frac{f}{s_x} & s_h & o_x \\ 0 & -\frac{f}{s_y} & o_y \\ 0 & 0 & 1 \end{bmatrix}
  = \begin{bmatrix} -f_x & s_h & o_x \\ 0 & -f_y & o_y \\ 0 & 0 & 1 \end{bmatrix}
\]

• f — focal length in [m]

• sx, sy — scale factors along the image's u and v axes. They can be interpreted as the horizontal and
vertical size (in meters) of the pixels; in other words, the dimension of sx, sy is [m/pixel].
• fx, fy — focal lengths in [pixel].
• sh — skewness of the two image axes (dimensionless). We have sh = tan δ ≈ 0 (since generally
δ ≈ 0), where δ is the angle by which the image y axis deviates from the perpendicular to the x axis.
• (ox, oy) — coordinate pair of the principal point (intersection of the optical axis with the image
plane), expressed in [pixel]. Also called the image center.
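As a small numeric sketch of how these quantities fit together (all values below are invented for the example), K in the Trucco & Verri convention can be assembled as follows:

```python
import numpy as np

# Illustrative values only: f in metres, pixel sizes sx, sy in metres/pixel.
f, sx, sy = 0.008, 1.0e-5, 1.0e-5        # 8 mm lens, 10 um square pixels
sh, ox, oy = 0.0, 320.0, 240.0           # zero skew, principal point in pixels

fx, fy = f / sx, f / sy                  # focal lengths expressed in pixels
K = np.array([[-fx,   sh,  ox],
              [0.0,  -fy,  oy],
              [0.0,  0.0, 1.0]])         # Trucco & Verri sign convention
```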

Camera’s intrinsic matrix (Faugeras)

 
\[
K = \begin{bmatrix} -f k_u & 0 & u_0 \\ 0 & -f k_v & v_0 \\ 0 & 0 & 1 \end{bmatrix}
\]
Remarks:

• no skewness factor
• ku = 1/sx, kv = 1/sy

Camera’s intrinsic matrix (IMPA folks)

 
\[
K = \begin{bmatrix} f s_x & f\tau & u_c \\ 0 & f s_y & v_c \\ 0 & 0 & 1 \end{bmatrix}
\]

Compared with the Trucco/Verri and Faugeras notations, the IMPA notation introduces the following changes:

• the sign of the diagonal elements k11, k22 is flipped

• sx, sy are defined as the reciprocals of sx, sy in the Trucco/Verri notation

Camera’s intrinsic matrix

• From images alone, it is not possible to estimate the individual values of f, sx, sy;
only fx and fy can be estimated

• However, if the manufacturer supplies sx, sy for the camera, it is possible to
derive f

• If we estimate fy, it will be expressed in [pixel]. So, if we also know the height H
of the image (also expressed in [pixel]), we can calculate fovy.

Camera’s extrinsic parameters

• Placement of the camera (translation vector t)

• Orientation of the camera (rotation matrix R)



Complete chain of coordinate transforms

\[
\text{pixels} \;\leftarrow\;
\begin{bmatrix} \frac{1}{s_x} & s_h & o_x \\ 0 & \frac{1}{s_y} & o_y \\ 0 & 0 & 1 \end{bmatrix}
\;\leftarrow\; \text{image} \;\leftarrow\;
\begin{bmatrix} -f & 0 & 0 & 0 \\ 0 & -f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}
\;\leftarrow\; \text{camera} \;\leftarrow\;
\begin{bmatrix} R & t \\ 0 & 1 \end{bmatrix}
\;\leftarrow\; \text{world}
\]

Combining the first two matrices, we get:

\[
\text{pixels} \;\leftarrow\;
\begin{bmatrix} -\frac{f}{s_x} & s_h & o_x \\ 0 & -\frac{f}{s_y} & o_y \\ 0 & 0 & 1 \end{bmatrix}
\;\leftarrow\; \text{camera} \;\leftarrow\;
\begin{bmatrix} R & t \\ 0 & 1 \end{bmatrix}
\;\leftarrow\; \text{world}
\]
\[
\text{pixels} \;\leftarrow\;
\begin{bmatrix} -f_x & s_h & o_x \\ 0 & -f_y & o_y \\ 0 & 0 & 1 \end{bmatrix}
\;\leftarrow\; \text{camera} \;\leftarrow\;
\begin{bmatrix} R & t \\ 0 & 1 \end{bmatrix}
\;\leftarrow\; \text{world}
\]
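A minimal NumPy sketch of this chain, projecting a single world point to pixel coordinates; every numeric value (intrinsics, pose, point) is an illustrative assumption:

```python
import numpy as np

K = np.array([[-800.0,    0.0, 320.0],   # -fx, sh, ox  (combined intrinsic matrix)
              [   0.0, -800.0, 240.0],   #   0, -fy, oy
              [   0.0,    0.0,   1.0]])
R = np.eye(3)                            # camera axes aligned with the world axes
t = np.array([0.0, 0.0, 5.0])            # world origin 5 units in front of the camera

P_world = np.array([0.1, -0.2, 0.0, 1.0])        # homogeneous world point
p = K @ np.column_stack([R, t]) @ P_world        # chain: world -> camera -> pixels
u, v = p[:2] / p[2]                              # divide by the third homogeneous coordinate
print(u, v)
```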

Zhang’s method

ZHANG()
1 take several (n ≥ 3) photos of the printed planar model
2 detect features in the photos using LoG, jvInterpret(), etc.
3 compute the camera's extrinsic and intrinsic parameters using the closed-form solution
4 compute the radial-distortion coefficients by solving a linear least-squares problem
5 fine-tune the calculated parameters using Levenberg-Marquardt
6 output the calculated parameters

There can be fewer than 3 photos, but only under the assumption that some
intrinsic parameters are already known; see below.
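For comparison, here is a minimal sketch of the same pipeline using OpenCV, whose calibrateCamera routine is based on Zhang's planar-target method; the checkerboard size (9×6 inner corners) and the calib_*.png file pattern are assumptions made for the example, not part of the slides.

```python
import glob
import cv2
import numpy as np

pattern = (9, 6)                                          # inner corners of the printed target
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)   # planar model, Z = 0

obj_points, img_points, size = [], [], None
for fname in glob.glob("calib_*.png"):
    gray = cv2.imread(fname, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_points.append(objp)
        img_points.append(corners)
        size = gray.shape[::-1]                           # (width, height)

rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(obj_points, img_points, size, None, None)
print("RMS reprojection error:", rms)
print("intrinsic matrix K:\n", K)
print("distortion coefficients:", dist.ravel())
```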

Zhang’s method

• First, the standard pinhole camera model is considered

• Then, radial distortion is estimated on top of it



Zhang uses planar 3-D models

"Planar" means that in Zhang's method we can flatten the Z coordinate of every
point of the model, that is, consider Z to be 0.

An example of a planar 3-D model is a pattern of black rectangles with known
dimensions, printed on paper, glued to a hard book cover, and photographed by
the camera.

Therefore [X Y Z 1]^τ (a 3-D point of the model) can be treated as [X Y 1]^τ
in all subsequent calculations, since Z = 0 for all points.

General projective transformation can be simplified

Because of the simplification [X Y Z 1]^τ → [X Y 1]^τ, we can simplify the
general projective transformation

\[ [X\ Y\ Z\ 1]^\tau \;\longmapsto\; K [R\ t]\, [X\ Y\ Z\ 1]^\tau \]

as

\[ [X\ Y\ 1]^\tau \;\longmapsto\; K [r_1\ r_2\ t]\, [X\ Y\ 1]^\tau \]

where r1 and r2 are the first two columns of the rotation matrix R, t is the translation
vector, and K is the intrinsic matrix. With this reduction, we can work with a simpler
plane-to-plane projective transformation (P^2 → P^2) instead of the
more general and more complex P^3 → P^2 transformation.

Homography

Because it uses planar 3-D models, Zhang's method makes use of a
homography (a map from the projective plane P^2 onto itself):

\[ [X\ Y\ 1]^\tau \;\longmapsto\; \frac{1}{\lambda}\, K [r_1\ r_2\ t]\, [X\ Y\ 1]^\tau = H\, [X\ Y\ 1]^\tau = [u\ v\ 1]^\tau \]

where

• H is the homography from the model plane to the image plane (P^2 → P^2),
defined up to the scale factor λ as H = (1/λ) K[r1 r2 t]

• K is the camera's intrinsic matrix, and R, t are the extrinsic parameters



Homography

• There is a factor λ in the definition of H because any homography is defined only up to a factor.

The idea behind the Zhang method

Let M designate the set of 2-D model points, and M'_i the set of 2-D points
detected in image i. In a nutshell, the idea is first to extract n homographies Hi
(3×3 matrices) from the n pairs {M, M'_i}, i = 1, ..., n:

\[
\{M, M'_1\} \longrightarrow H_1 =
\begin{bmatrix} h_{11}^{1} & h_{12}^{1} & h_{13}^{1} \\ h_{21}^{1} & h_{22}^{1} & h_{23}^{1} \\ h_{31}^{1} & h_{32}^{1} & h_{33}^{1} \end{bmatrix}
\]
\[
\{M, M'_2\} \longrightarrow H_2 =
\begin{bmatrix} h_{11}^{2} & h_{12}^{2} & h_{13}^{2} \\ h_{21}^{2} & h_{22}^{2} & h_{23}^{2} \\ h_{31}^{2} & h_{32}^{2} & h_{33}^{2} \end{bmatrix}
\]
\[
\cdots
\]
\[
\{M, M'_n\} \longrightarrow H_n =
\begin{bmatrix} h_{11}^{n} & h_{12}^{n} & h_{13}^{n} \\ h_{21}^{n} & h_{22}^{n} & h_{23}^{n} \\ h_{31}^{n} & h_{32}^{n} & h_{33}^{n} \end{bmatrix}
\]
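As an illustration, one homography Hi can be obtained from the correspondences {M, M'_i} with the direct linear transform (DLT) sketched below (NumPy assumed). Zhang's report actually estimates each homography with a maximum-likelihood criterion, so this unnormalized DLT should be read only as a simple first estimate.

```python
import numpy as np

def homography_dlt(model_pts, image_pts):
    """Estimate H (up to scale) from N >= 4 point correspondences.

    model_pts, image_pts: (N, 2) arrays of (X, Y) model and (u, v) image coordinates.
    """
    A = []
    for (X, Y), (u, v) in zip(model_pts, image_pts):
        A.append([X, Y, 1, 0, 0, 0, -u * X, -u * Y, -u])
        A.append([0, 0, 0, X, Y, 1, -v * X, -v * Y, -v])
    # h is the right singular vector associated with the smallest singular value of A
    h = np.linalg.svd(np.asarray(A, dtype=float))[2][-1]
    return h.reshape(3, 3)
```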

The idea behind the Zhang method

Then we use the newly-found coefficients of the Hi (eight coefficients for each
Hi, because every homography has 8 DOF, that is, is determined up to a factor)
to set up a linear system of 2n equations (n = number of images) in the five intrinsic
parameters (unknowns) sx, sy, γ, u0, v0, the elements of K. In this way, we
end up finding (estimating) the intrinsic matrix K.

With K in hand, we find the extrinsics R = [r1 r2 r3] and t for each image i:

\[ r_1 = \lambda K^{-1} h_1, \qquad r_2 = \lambda K^{-1} h_2, \qquad r_3 = r_1 \times r_2, \qquad t = \lambda K^{-1} h_3 \]
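These formulas translate almost directly into NumPy; in the sketch below, λ is taken as 1/||K^{-1}h1||, as in Zhang's report, so that r1 comes out with unit norm.

```python
import numpy as np

def extrinsics_from_homography(K, H):
    """Recover R = [r1 r2 r3] and t for one image from its homography H."""
    K_inv = np.linalg.inv(K)
    lam = 1.0 / np.linalg.norm(K_inv @ H[:, 0])   # scale factor lambda
    r1 = lam * (K_inv @ H[:, 0])
    r2 = lam * (K_inv @ H[:, 1])
    r3 = np.cross(r1, r2)
    t = lam * (K_inv @ H[:, 2])
    return np.column_stack([r1, r2, r3]), t
```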

The idea behind the Zhang method

Please note that the n matrices Ri calculated this way do not, in the general case,
satisfy the properties of a rotation matrix (that is, the columns and rows of Ri are not
orthonormal), due to the inherent noise in the data. There is, however, a method
that lets us find the rotation matrix most similar to Ri; see the article by Zhang
and the sketch below.

Of course, noise also affects the other extrinsic parameter, ti.
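A minimal sketch of the standard SVD-based fix, which (as far as I recall) is also the solution given in the appendix of Zhang's report: replace a noisy Ri by the rotation matrix closest to it in the Frobenius norm.

```python
import numpy as np

def nearest_rotation(Q):
    """Rotation matrix closest to Q in the Frobenius norm."""
    U, _, Vt = np.linalg.svd(Q)
    R = U @ Vt
    if np.linalg.det(R) < 0:        # guard against an improper rotation (reflection)
        R = U @ np.diag([1.0, 1.0, -1.0]) @ Vt
    return R
```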



The linear system

A couple of remarks about the aforementioned linear system for sx, sy , γ, u0, v0.
For each image i (and thus each homography Hi = K[r1i r2i ti]), we have a
pair of constraining equations:

\[ h_{1i}^{\tau} K^{-\tau} K^{-1} h_{2i} = 0 \]
\[ h_{1i}^{\tau} K^{-\tau} K^{-1} h_{1i} = h_{2i}^{\tau} K^{-\tau} K^{-1} h_{2i} \]

where Hi = [h1i h2i h3i] and h1i, h2i, h3i are the columns of Hi.
Since h1i, h2i, h3i are known, the unknowns are the six (6) coefficients of B :=
K^{-τ} K^{-1}. Every such pair of equations constrains two (2 = 8 − 6)
intrinsics, since a homography has 8 DOF and there are 6 extrinsic parameters.
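A sketch of this closed-form step in Python with NumPy: each homography contributes the two constraints above as rows of a matrix V, the six coefficients of B are obtained as the null vector of V, and the intrinsics are then read off from B (here alpha and beta play the role of the pixel focal lengths the slides call sx, sy). The recovery formulas follow Zhang's report; treat this as a sketch rather than a reference implementation.

```python
import numpy as np

def _v(H, p, q):
    """Zhang's v_pq vector, built from columns p and q of H (0-based indices)."""
    return np.array([
        H[0, p] * H[0, q],
        H[0, p] * H[1, q] + H[1, p] * H[0, q],
        H[1, p] * H[1, q],
        H[2, p] * H[0, q] + H[0, p] * H[2, q],
        H[2, p] * H[1, q] + H[1, p] * H[2, q],
        H[2, p] * H[2, q],
    ])

def intrinsics_from_homographies(Hs):
    """Closed-form estimate of K from n >= 3 homographies."""
    V = []
    for H in Hs:
        V.append(_v(H, 0, 1))                 # h1^T B h2 = 0
        V.append(_v(H, 0, 0) - _v(H, 1, 1))   # h1^T B h1 = h2^T B h2
    # b = (B11, B12, B22, B13, B23, B33): null vector of V, via SVD
    B11, B12, B22, B13, B23, B33 = np.linalg.svd(np.asarray(V))[2][-1]
    v0 = (B12 * B13 - B11 * B23) / (B11 * B22 - B12 ** 2)
    lam = B33 - (B13 ** 2 + v0 * (B12 * B13 - B11 * B23)) / B11
    alpha = np.sqrt(lam / B11)
    beta = np.sqrt(lam * B11 / (B11 * B22 - B12 ** 2))
    gamma = -B12 * alpha ** 2 * beta / lam
    u0 = gamma * v0 / beta - B13 * alpha ** 2 / lam
    return np.array([[alpha, gamma, u0], [0.0, beta, v0], [0.0, 0.0, 1.0]])
```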

The linear system

If:

• n = 1 — then we can solve for only two intrinsic parameters, for example sx and
sy. In this case, we set u0 = W/2, v0 = H/2, γ = 0, that is, (u0, v0) is placed at the image
center and the skewness is zero (W and H are the image width and height in pixels).

• n = 2 — then we can solve for four intrinsic parameters, for example sx, sy, u0, v0.
In this case, we set γ = 0.

• n ≥ 3 — the linear system becomes overdetermined, and we obtain a unique
solution (up to a factor) for all five intrinsics sx, sy, γ, u0, v0.

This was just an estimate

Finally, the matrix K, the matrices Ri, and the vectors ti calculated so far are just an
estimate of the ground truth.

To improve the accuracy of the results, K, the Ri, and the ti are fed into a
nonlinear-minimization solver; that is, they are treated as an initial guess for further refinement.

Minimizing a functional

In other words, it is now necessary to minimize the following functional:

\[ \sum_{i=1}^{n} \sum_{j=1}^{m} \big\| m_{ij} - \hat{m}(K, R_i, t_i, M_j) \big\|^2 \]

where

• m_ij is the observed (detected) point j in image i, and

• m̂(K, Ri, ti, Mj) is the (re)projection of the model point Mj onto image i using the
estimated K, Ri, ti

Minimizing a functional

Therefore, m̂ is an estimate:

\[ \hat{m}(K, R_i, t_i, M_j) = H\, [X\ Y\ 1]^{\tau}_j = K [r_1\ r_2\ t]\, [X\ Y\ 1]^{\tau}_j \]

The nonlinear-minimization solver usually used for this problem
is the Levenberg-Marquardt solver.
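A compact sketch of this refinement using SciPy's Levenberg-Marquardt solver; the parameter packing (five intrinsics followed by a rotation vector and translation per image) is my own choice for the example and not prescribed by the slides.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def residuals(params, model_pts, image_pts_per_view):
    """Stacked reprojection errors m_ij - m_hat(K, R_i, t_i, M_j)."""
    fx, fy, sh, u0, v0 = params[:5]
    K = np.array([[fx, sh, u0], [0.0, fy, v0], [0.0, 0.0, 1.0]])
    M = np.column_stack([model_pts, np.ones(len(model_pts))])    # planar model, [X Y 1]
    errs = []
    for i, image_pts in enumerate(image_pts_per_view):
        rvec = params[5 + 6 * i: 8 + 6 * i]
        t = params[8 + 6 * i: 11 + 6 * i]
        R = Rotation.from_rotvec(rvec).as_matrix()
        H = K @ np.column_stack([R[:, 0], R[:, 1], t])           # K [r1 r2 t]
        proj = (H @ M.T).T
        proj = proj[:, :2] / proj[:, 2:3]                        # normalize homogeneous coords
        errs.append((image_pts - proj).ravel())
    return np.hstack(errs)

# x0 packs the initial guess (closed-form K, R_i, t_i); method="lm" is Levenberg-Marquardt.
# result = least_squares(residuals, x0, args=(model_pts, image_pts_per_view), method="lm")
```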

Camera’s optical center in planar model’s space

One of the tasks was to find the position of the camera's optical centre C
expressed in the coordinate system defined by image i (that is, the coordinate system
defined by the pose of the calibration pattern at the moment image i was taken).
getCoordsOfCamerasOpticalCentreInImageSystem()
1 get the extrinsic matrix [R t] for image i, where t = [t1, t2, t3]^τ
2 return R^τ(−t)
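In NumPy the pseudocode above reduces to one line; a small sketch:

```python
import numpy as np

def optical_centre_in_model_frame(R, t):
    """Camera optical centre expressed in the calibration-pattern frame of image i.

    From P_cam = R @ P_im + t, setting P_cam = (0, 0, 0) gives C = R^T (-t).
    """
    return R.T @ (-np.asarray(t, dtype=float).reshape(3))
```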

Camera’s optical center in planar model’s space

EXPLANATION. Let Oim and Ocam be the origins of the image system and
the camera system, respectively, in an absolute system. Let t be the vector that
gives the position of Oim in the camera system. Let r1, r2, r3 define the unit
basis of the image system (with respect to the unit basis of the camera system).
Consequently, the matrix R that maps the image system's canonical basis {i', j', k'} into
the camera's canonical basis {i, j, k} has columns r1, r2, r3 and is orthogonal,
RR^τ = R^τR = I. In other words, R rotates image points, and the translation t
then carries them to their corresponding camera points. Therefore, given a point
with image coordinates Pim = (X, Y, Z = 0), its coordinates Pcam in the camera
system are equal to the coordinates of the vector

\[ \overrightarrow{O_{cam}P_{im}} = \overrightarrow{O_{cam}O_{im}} + \overrightarrow{O_{im}P_{im}} \;\Longrightarrow\; P_{cam} = t + R\, P_{im} \]

Camera’s optical center in planar model’s space

Now, using the fact that RR^τ = R^τR = I, and multiplying Pcam = t + RPim by
R^τ from the left:

\[ R^{\tau} P_{cam} = R^{\tau} t + R^{\tau} R\, P_{im} = R^{\tau} t + P_{im} \;\Longrightarrow\; P_{im} = R^{\tau} P_{cam} - R^{\tau} t \]

Therefore, given a point expressed in camera coordinates (Pcam), we can calculate
its image coordinates (Pim).
Finally, in the special case Pcam = Ocam = (0, 0, 0)cam, the formula above
gives

\[ P_{im} = R^{\tau} P_{cam} - R^{\tau} t = -R^{\tau} t = R^{\tau} (-t), \]

and in this way we obtain the optical centre Ocam expressed in image coordinates.

Camera’s optical center in planar model’s space

Using homogeneous coordinates, the matrix [R t] transforms image coordinates into
camera coordinates. More precisely,

\[
P = \begin{bmatrix} R & t \\ 0_3^{\tau} & 1 \end{bmatrix} P' =
\begin{bmatrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ r_{31} & r_{32} & r_{33} & t_3 \\ 0 & 0 & 0 & 1 \end{bmatrix} P'
\]

where P = (X, Y, Z, 1) holds the camera coordinates and P' = (X', Y', Z', 1) the image coordinates.



Calculating fovy

How do we calculate fovy for image i from the intrinsic matrix Ki = {kij}? We have

\[ \tan\frac{fov_y}{2} = \frac{H/2}{f_y} \]

where H is the height of the image expressed in pixels, and fy = k22 is the focal
length in the y direction, also expressed in pixels. Hence

\[ fov_y = 2 \arctan\frac{H/2}{f_y} \ \text{[rad]} \]

Finally, multiplying fovy [rad] by 360°/(2π) gives the field of view expressed in degrees.
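The same computation as a short Python helper; the absolute value guards against the negative focal-length sign conventions used earlier in these slides.

```python
import numpy as np

def fov_y_degrees(K, image_height):
    """Vertical field of view, in degrees, from the intrinsic matrix K."""
    fy = abs(K[1, 1])                                  # focal length in pixels
    fov_rad = 2.0 * np.arctan((image_height / 2.0) / fy)
    return fov_rad * 360.0 / (2.0 * np.pi)             # convert radians to degrees
```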

References

• Zhengyou Zhang, A flexible new technique for camera calibration, Technical Report MSR-TR-98-71, Microsoft Research (report last updated on March 25, 1999)
