
Straight lines have to be straight: automatic calibration and removal of distortion from scenes of structured environments

Frédéric Devernay, Olivier Faugeras

To cite this version:
Frédéric Devernay, Olivier Faugeras. Straight lines have to be straight: automatic calibration and removal of distortion from scenes of structured environments. Machine Vision and Applications, Springer Verlag, 2001, 13 (1), pp. 14-24. <10.1007/PL00013269>. <inria-00267247>

HAL Id: inria-00267247
https://hal.inria.fr/inria-00267247
Submitted on 26 Mar 2008


Straight Lines Have to Be Straight


Automatic Calibration and Removal of Distortion from Scenes of Structured Environments
Frédéric Devernay, Olivier Faugeras
INRIA, BP93, 06902 Sophia Antipolis Cedex, e-mail: {devernay,faugeras}@sophia.inria.fr


Abstract. Most algorithms in 3-D Computer Vision rely on the pinhole camera model because of its simplicity, whereas video optics, especially low-cost wide-angle or fish-eye lenses, generate a lot of non-linear distortion which can be critical. To find the distortion parameters of a camera, we use the following fundamental property: a camera follows the pinhole model if and only if the projection of every line in space onto the camera is a line. Consequently, if we find the transformation on the video image so that every line in space is viewed in the transformed image as a line, then we know how to remove the distortion from the image.

The algorithm consists of first doing edge extraction on a possibly distorted video sequence, then doing polygonal approximation with a large tolerance on these edges to extract possible lines from the sequence, and then finding the parameters of our distortion model that best transform these edges to segments. Results are presented on real video images, compared with distortion calibration obtained by a full camera calibration method which uses a calibration grid.

1 Introduction

1.1 External, internal, and distortion calibration

In the context of 3-D computer vision, camera calibration consists of finding the mapping between the 3-D space and the camera plane. This mapping can be separated into two different transformations: first, the displacement between the origin of 3-D space and the camera coordinate system, which forms the external calibration parameters (3-D rotation and translation), and second the mapping between 3-D points in space and 2-D points on the camera plane in the camera coordinate system, which forms the internal camera calibration parameters.

The internal camera calibration parameters depend on the camera. In the case of an orthographic or affine camera model, optic rays are all parallel and there are only 3 parameters corresponding to the spatial sampling of the image plane. The perspective (or projective) camera model involves two more camera parameters, corresponding to the position of the principal point in the image (which is the intersection of the optical axis with the image plane). For many applications which require high accuracy, or in cases where low-cost or wide-angle lenses are used, the perspective model is not sufficient, and more internal calibration parameters must be added to take into account camera lens distortion.

The distortion parameters are most often coupled with internal camera parameters, but we can also use a camera model in which they are decoupled. Decoupling the distortion parameters from the others can be equivalent to adding more degrees of freedom to the camera model.

1.2 Brief summary of existing related work

Here is an overview of the different kinds of calibration methods available. The goal of this section is not to do an extensive review, and the reader can find more information in [3,18,22].

The first kind of calibration method is the one that uses a calibration grid with feature points whose world 3-D coordinates are known. These feature points, often called control points, can be corners, dots, or any features that can be easily extracted from computer images. Once the control points are identified in the image, the calibration method finds the best camera external (rotation and translation) and internal (image aspect ratio, focal length, and possibly others) parameters that correspond to the position of these points in the image. The simplest form of camera internal parameters is the standard pinhole camera [13], but in many cases the distortion due to wide-angle or low-quality lenses has to be taken into account [26,3]. When the lens has a non-negligible distortion, using a calibration method with a pinhole camera model may result in high calibration errors.

The problem with these methods that compute the external and internal parameters at the same time arises from the fact that there is some kind of coupling between internal and external parameters that results in high errors on the camera internal parameters [27].
Another family of methods is those that use geometric invariants of the image features rather than their world coordinates, like parallel lines [6,2] or the image of a sphere [19].

The last kind of calibration techniques is those that do not need any kind of known calibration points. These are also called self-calibration methods, and the problem with these methods is that if all the parameters of the camera are unknown, they are still very unstable [12]. Known camera motion helps in getting more stable and accurate results [24,15], but it is not always easy to get "pure camera rotation". A few other calibration methods deal only with distortion calibration, like the plumb line method [5]. Another method, presented in [4], uses a calibration grid to find a generic distortion function, represented as a 2-D vector field.

1.3 Overview of our method

Since many self-calibration [12] or weak calibration [29] techniques rely on a pinhole (i.e. perspective) camera model, our main idea was to calibrate only the image distortion, so that any camera could be considered as a pinhole camera after application of the inverse of the distortion function to image features. We also do not want to rely on a particular camera motion [24], in order to be able to work on any kind of video recordings or snapshots (e.g. surveillance video recordings) for which there can be only little knowledge on self-motion, or for which some observed objects may be moving.

The only constraint is that the world seen through the camera must contain 3-D lines and segments. It can be city scenes, interior scenes, or aerial views containing buildings and man-made structures. Edge extraction and polygonal approximation are performed on these images in order to detect possible 3-D edges present in the scene, then we look for the distortion parameters that minimize the curvature of the 3-D segments projected to the image.

After we find a first estimate of the distortion parameters, we perform another polygonal approximation on the corrected (un-distorted) edges; this way, straight line segments that were broken into several line segments because of distortion become one single line segment, and outliers (curves that were detected as line segments because of their small curvature) are implicitly eliminated. We continue this iterative process until we fall into a stable minimum of the distortion error after the polygonal approximation step.

In section 2, we review the different nonlinear distortion models available, including polynomial and fish-eye models, and the whole calibration process is fully described in section 3.

2 The nonlinear distortion model

The mapping between 3-D points and 2-D image points can be decomposed into a perspective projection and a function that models the deviations from the ideal pinhole camera. A perspective projection associated with the focal length f maps a 3-D point M, whose coordinates in the camera-centered coordinate system are (X, Y, Z), to an "undistorted" image point mu = (xu, yu) on the image plane:

    xu = f X/Z
    yu = f Y/Z        (1)

Then, the image distortion transforms mu to a distorted image point md. The image distortion model [22] is usually given as a mapping from the distorted image coordinates, which are observable in the acquired images, to the undistorted image coordinates, which are needed for further calculations. The image distortion function can be decomposed into two terms: radial and tangential distortion. Radial distortion is a deformation of the image along the direction from a point called the center of distortion to the considered image point, and tangential distortion is a deformation perpendicular to this direction. The center of distortion is invariant under both transformations.

It was found that for many machine vision applications, the tangential distortion need not be considered [26]. Let R be the radial distortion function, which is invertible over the image:

    R : ru → rd = R(ru),  with ∂R/∂ru (0) = 1        (2)

The distortion model can be written as:

    xu = xd R⁻¹(rd)/rd,   yu = yd R⁻¹(rd)/rd        (3)

where rd = √(xd² + yd²), and similarly the inverse distortion model is:

    xd = xu R(ru)/ru,   yd = yu R(ru)/ru        (4)

where ru = √(xu² + yu²).

Finally, distorted image plane coordinates are converted to frame buffer coordinates, which can be expressed either in pixels or in normalized coordinates (i.e. pixels divided by image dimensions), depending on the unit of f:

    xi = Sx xd + Cx
    yi = yd + Cy        (5)

where (Cx, Cy) are the image coordinates of the principal point and Sx is the image aspect ratio.

In our case we want to decouple the effect of distortion from the projection on the image plane, because what we want to calibrate is the distortion, without knowing anything about the internal camera parameters. Consequently, in our model, the center of distortion (cx, cy) will be different from the principal point (Cx, Cy). It was shown [23] that this is mainly equivalent to adding decentering distortion terms to the distortion model of equation 6.
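A minimal illustration of how equations 1-5 chain together may help. This is our own Python sketch, not the authors' code; the function names are ours, and the identity radial function stands in for a real lens model such as the polynomial model of section 2.1:

    import numpy as np

    def project(X, Y, Z, f):
        """Perspective projection (eq. 1): camera coordinates -> undistorted image point."""
        return f * X / Z, f * Y / Z

    def distort(xu, yu, R):
        """Inverse distortion model (eq. 4): undistorted -> distorted coordinates.
        R is the radial distortion function ru -> rd of eq. 2, with R'(0) = 1."""
        ru = np.hypot(xu, yu)
        if ru == 0.0:
            return xu, yu          # the center of distortion is invariant
        s = R(ru) / ru
        return xu * s, yu * s

    def to_frame_buffer(xd, yd, Sx, Cx, Cy):
        """Frame buffer conversion (eq. 5): principal point (Cx, Cy), aspect ratio Sx."""
        return Sx * xd + Cx, yd + Cy

    # identity radial function = pinhole camera; a real lens would supply
    # the inverse of eq. 8 here (see section 2.1)
    R_pinhole = lambda ru: ru

    xu, yu = project(1.0, 0.5, 4.0, f=1.0)
    xd, yd = distort(xu, yu, R_pinhole)
    xi, yi = to_frame_buffer(xd, yd, Sx=0.75, Cx=0.5, Cy=0.5)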
A higher order effect of this decoupling is to apply a (very small) affine transformation to the image, but the affine transform of a pinhole camera is also a pinhole camera (i.e. this is a linear distortion effect).

Moreover, the image aspect ratio sx that we use in the distortion model may not be the same as the real camera aspect ratio Sx. The difference between these two aspect ratios will result in another term of tangential distortion. To summarize: the difference between the coordinates of the center of distortion (cx, cy) and those of the principal point (Cx, Cy) corresponds to decentering distortion, and the difference between the distortion aspect ratio sx and the camera aspect ratio Sx corresponds to a term of tangential distortion.

In the following, all coordinates are frame buffer coordinates, either expressed in pixels or normalized (by dividing x by the image width and y by the image height) to be unit-less.

2.1 Polynomial distortion models

The lens distortion model (equation 3) can be written as an infinite series:

    xu = xd (1 + κ1 rd² + κ2 rd⁴ + ···)
    yu = yd (1 + κ1 rd² + κ2 rd⁴ + ···)        (6)

Several tests [3,26] showed that using only the first order radial symmetric distortion parameter κ1, one could achieve an accuracy of about 0.1 pixels in image space using lenses exhibiting large distortion, together with the other parameters of the perspective camera [13].

The undistorted coordinates are given by the formula:

    xu = xd (1 + κ1 rd²)
    yu = yd (1 + κ1 rd²)        (7)

where rd = √(xd² + yd²) is the distorted radius.

The inverse distortion model is obtained by solving the following equation for rd, given ru:

    ru = rd (1 + κ1 rd²)        (8)

where ru = √(xu² + yu²) is the undistorted radius and rd is the distorted radius.

This is a polynomial of degree three in rd, of the form rd³ + c rd + d = 0 with c = 1/κ1 and d = −c ru, which can be solved using the Cardan method, a direct method for solving polynomials of degree three. It has either one or three real solutions, depending on the sign of the discriminant:

    Δ = Q³ + R²

where Q = c/3 and R = −d/2. If Δ > 0 there is only one real solution:

    rd = ∛(R + √Δ) − Q / ∛(R + √Δ)        (9)

and if Δ < 0 there are three real solutions, but only one is valid, because when ru is fixed, rd must be a continuous function of κ1. The continuity at κ1 = 0 gives the solution:

    rd = −S cos T + S √3 sin T        (10)

where S = ∛(√(R² − Δ)) and T = (1/3) arctan(√(−Δ)/R).

Combining equations 7 and 8, the distorted coordinates are given by:

    xd = xu rd/ru
    yd = yu rd/ru        (11)

With high-distortion lenses, it may be necessary to include higher order terms of Equation 6 in the distortion model [17]. In this case, the transformation from undistorted to distorted coordinates has no closed-form solution, and a numerical solver has to be used (a simple Newton method is enough).

In the case of fish-eye lenses and some other high-distortion lenses, nonlinear distortion was built in on purpose, in order to correct deficiencies of wide-angle distortion-free lenses, such as the fact that objects near the border of the field-of-view have an exaggerated size on the image. To model the distortion of these lenses, it may be necessary to take into account many terms of Equation 6: in our experiments, distortion models of order at least 3 (which correspond to a seventh order polynomial for radial distortion) had to be used to compensate for the nonlinear distortion of fish-eye lenses. For this reason, we looked for distortion models which are more suitable to this kind of lens.

2.2 Fish-eye models

Fish-eye lenses are designed from the ground up to include some kind of nonlinear distortion. For this reason, it is better to use a distortion model that tries to mimic this effect, rather than to use a high number of terms in the series of Equation 6. Shah and Aggarwal [21] showed that when calibrating a fish-eye lens using a 7th order odd powered polynomial for radial distortion (which corresponds to a third order distortion model), distortion still remains, so that they have to use a model with even more degrees of freedom.

Basu and Licardie [1] use a logarithmic distortion model (FET, or Fish-Eye Transform) or a polynomial distortion model (PFET) to model fish-eye lenses, and the PFET model seems to perform better than the FET. The FET model is based on the observation that fish-eye lenses have a high resolution at the fovea, and a non-linearly decreasing resolution towards the periphery. The corresponding radial distortion function is:

    rd = R(ru) = s log(1 + λ ru)        (12)

We propose here another distortion model for fish-eye lenses, which is based on the way fish-eye lenses are designed: the distance between an image point and the principal point is usually roughly proportional to the angle between the corresponding 3-D point, the optical center and the optical axis (Figure 1), so that the angular resolution is roughly proportional to the image resolution along an image radius.
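Going back to the first order polynomial model for a moment: the Cardan inversion of equations 8-10 is easy to get wrong in the Δ < 0 branch, where the arctangent defining T must be taken in the quadrant given by the sign of R. A Python sketch of ours (not the authors' code), using atan2 for that purpose:

    import math

    def _cbrt(v):
        """Real cube root (works for negative arguments)."""
        return math.copysign(abs(v) ** (1.0 / 3.0), v)

    def undistorted_to_distorted_radius(ru, kappa1):
        """Solve ru = rd * (1 + kappa1 * rd**2) for rd (eqs. 8-10)."""
        if kappa1 == 0.0 or ru == 0.0:
            return ru
        c = 1.0 / kappa1
        d = -c * ru                      # rd**3 + c*rd + d = 0
        Q, R = c / 3.0, -d / 2.0
        delta = Q ** 3 + R ** 2          # discriminant
        if delta >= 0.0:                 # one real root (eq. 9)
            w = _cbrt(R + math.sqrt(delta))
            return w - Q / w
        # three real roots (eq. 10): keep the one continuous at kappa1 = 0;
        # atan2 puts T in the correct quadrant even when R < 0
        S = (R ** 2 - delta) ** (1.0 / 6.0)
        T = math.atan2(math.sqrt(-delta), R) / 3.0
        return -S * math.cos(T) + S * math.sqrt(3.0) * math.sin(T)

    # round trip: distorting the result with eq. 8 must give ru back
    rd = undistorted_to_distorted_radius(1.0, -0.1)
    assert abs(rd * (1.0 - 0.1 * rd ** 2) - 1.0) < 1e-9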
This model has only one parameter, which is the field-of-view ω of the corresponding ideal fish-eye lens, so we call it the FOV model. This angle may not correspond to the real camera field-of-view, since the fish-eye optics may not follow exactly this model. The corresponding distortion function and its inverse are:

    rd = (1/ω) arctan(2 ru tan(ω/2))        (13)

    ru = tan(rd ω) / (2 tan(ω/2))        (14)

If this one-parameter model is not sufficient to model the complex distortion of fish-eye lenses, the previous distortion model (Equation 6) can be applied before Equation 14, with κ1 = 0 (ω, as a first order distortion parameter, would be redundant with κ1). A second order FOV model will have κ2 ≠ 0, and a third order FOV model will have κ3 ≠ 0.

Fig. 1. In the FOV distortion model, the distance cm in the image is proportional to the angle between (CM) and the optical axis (Cz).

2.3 Inverse models

Using the models described before, the cheapest transformation in terms of calculation is from the distorted coordinates to the undistorted coordinates. This also means that it is cheaper to detect features in the distorted image and to undistort them, than to undistort the whole image and to extract the features from the undistorted image: in fact, undistorting a whole image consists of computing the distorted coordinates of every point in the undistorted image (which requires solving a third degree polynomial -for the first-order model- or more complicated equations), and then computing its intensity value by bilinear interpolation in the original distorted image.

For some algorithms or feature detection methods which depend on linear perspective projection images, one must nevertheless undistort the whole image. A typical example is stereo by correlation, which requires an accurate rectification of the images. In these cases, where calibration time may not be crucial but images need to be undistorted quickly (i.e. the transform function from undistorted to distorted coordinates is used more often than its inverse in a program's main loop), a good solution is to switch the distortion function and its inverse. For the first order distortion model, Equation 7 would become the distortion function and equation 11 its inverse. This is what we call an order −1 polynomial model in this paper. That way, the automatic distortion calibration step is costly (because, as we will see later, it requires undistorting edge features), but once the camera is calibrated, the un-distortion of whole intensity images is a lot faster.

Inverse models can be derived from polynomial models, fish-eye models, the FOV model, or any distortion model. Though they have the same number of parameters as their direct counterparts, we will see in section 5.4 that they do not represent the same kind of distortion, and may not be able to deal with a given lens distortion.

3 Distortion calibration

3.1 Principle of distortion calibration

The goal of the distortion calibration is to find the transformation (or un-distortion) that maps the actual camera image plane onto an image following the perspective camera model. To find the distortion parameters described in section 2, we use the following fundamental property: a camera follows the perspective camera model if and only if the projection of every 3-D line in space onto the camera plane is a line. Consequently, all we need is a way to find projections of 3-D lines in the image (they are not lines anymore in the images, since they are distorted, but curves), and a way to measure how much each 3-D line is distorted in the image. Then we just have to let the distortion parameters vary, and try to minimize the distortion of edges transformed using these parameters.

3.2 Edge detection with sub-pixel accuracy

The first step of the calibration consists of extracting edges from the images. Since image distortion is sometimes less than a pixel at image boundaries, there was definitely a need for an edge detection method with sub-pixel accuracy. We developed an edge detection method [11] which is a sub-pixel refinement of the classical Non-Maxima Suppression (NMS) of the gradient norm in the direction of the gradient. It was shown to give edge positions with a precision varying from 0.05 pixel RMS for a noise-free synthetic image, to 0.3 pixel RMS for an image Signal to Noise Ratio (SNR) of 18 dB (which is actually a lot of noise; the SNR of VHS videotapes is about 50 dB). In practice, any edge detection method with sub-pixel accuracy can be used.
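Before turning to the calibration itself, the FOV model of equations 13 and 14 above is simple enough to transcribe directly. A Python sketch of ours (ω in radians, radii in normalized image coordinates):

    import math

    def fov_distort_radius(ru, omega):
        """FOV model (eq. 13): undistorted radius -> distorted radius."""
        return math.atan(2.0 * ru * math.tan(omega / 2.0)) / omega

    def fov_undistort_radius(rd, omega):
        """Inverse FOV model (eq. 14): distorted radius -> undistorted radius."""
        return math.tan(rd * omega) / (2.0 * math.tan(omega / 2.0))

    omega = math.radians(90.0)   # illustrative field of view, not a calibrated value
    rd = fov_distort_radius(0.5, omega)
    assert abs(fov_undistort_radius(rd, omega) - 0.5) < 1e-12

Note that, unlike higher order polynomial models, the first order FOV model has a closed-form expression in both directions.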
3.3 Finding 3-D segments in a distorted image

In order to calibrate distortion, we must find edges in the image which are most probably images of 3-D segments. The goal is not to get all segments, but to find the most probable ones. For this reason, we do not care if a long segment, because of its distortion, is broken into smaller segments.

Therefore, and because we are using a subpixel edge detection method, we use a very small tolerance for the polygonal approximation: the maximum distance between the edge points and the segment joining both ends of the edge must typically be less than 0.4 pixels. We also put a threshold on segment length of about 60 pixels for a 640 × 480 image, because small segments may contain more noise than useful information about distortion. Moreover, because of the corner rounding effect [8,9] due to edge detection, we throw out a few edgels (between 3 and 5, depending on the amount of smoothing performed on the image before edge detection) at both ends of each detected edge segment.

3.4 Measuring distortion of a 3-D segment in the image

In order to find the distortion parameters, we use a measure of how much each detected segment is distorted. This distortion measure will then be minimized to find the best calibration parameters. One could use, for example, the mean curvature of the edges, or any distance function on the edge space that would be zero if the edge is a perfect segment, and the bigger the more the segment is distorted.

We chose a simple measure of distortion, which consists of approximating by a line, in the least squares sense, each edge which should be the projection of a 3-D segment [10], and taking for the distortion error the sum of squared distances from the edge points to the line (i.e. the χ² of the least squares approximation, Figure 2). That way, the error is zero if the edge lies exactly on a line, and the bigger the curvature of the edge, the bigger the distortion error.

Fig. 2. The distortion error is the sum of squares of the distances from the edgels of an edge segment to the least squares fit of a line to these edgels.

This leads to the following expression for the distortion error of each edge segment [10]:

    χ² = a sin²φ − 2 |b| |sin φ| cos φ + c cos²φ        (15)

where (all sums are taken over the n edgels j = 1, ..., n):

    a = Σ xj² − (1/n)(Σ xj)²        (16)

    b = Σ xj yj − (1/n)(Σ xj)(Σ yj)        (17)

    c = Σ yj² − (1/n)(Σ yj)²        (18)

    α = a − c;   β = α / (2 √(α² + 4b²))        (19)

    |sin φ| = √(1/2 − β);   cos φ = √(1/2 + β)        (20)

φ is the angle of the line in the image, and sin φ should have the same sign as b. φ can also be computed as φ = (1/2) arctan2(2b, a − c), but only sin φ and cos φ are needed to compute χ².

3.5 Putting it all together: The whole calibration process

The whole distortion calibration process is not done in a single step (edge detection, polygonal approximation, and optimization), because there may be outliers in the segments detected by the polygonal approximation, i.e. segment edges which do not really correspond to 3-D line segments. Moreover, some images of 3-D line segments may be broken into smaller edges, because the first polygonal approximation is done on distorted edges. By doing another polygonal approximation after the optimization, on undistorted edges, we can eliminate many outliers easily and sometimes get longer segments which contain more information about distortion. This way, we get even more accurate calibration parameters.
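The distortion error of equations 15-20, which the process below minimizes, can be transcribed directly. A Python sketch of ours (edgels is a list of (x, y) sub-pixel edge points; the value equals the sum of squared distances of the edgels to their least squares line):

    import math

    def segment_distortion_error(edgels):
        """Distortion error chi^2 of one edge segment (eqs. 15-20)."""
        n = len(edgels)
        sx = sum(x for x, _ in edgels)
        sy = sum(y for _, y in edgels)
        a = sum(x * x for x, _ in edgels) - sx * sx / n        # eq. 16
        b = sum(x * y for x, y in edgels) - sx * sy / n        # eq. 17
        c = sum(y * y for _, y in edgels) - sy * sy / n        # eq. 18
        alpha = a - c
        denom = 2.0 * math.hypot(alpha, 2.0 * b)
        if denom == 0.0:
            return a            # isotropic scatter: chi^2 = a = c for any angle phi
        beta = alpha / denom                                   # eq. 19
        sin_phi = math.sqrt(0.5 - beta)                        # |sin phi|, eq. 20
        cos_phi = math.sqrt(0.5 + beta)
        return a * sin_phi ** 2 - 2.0 * abs(b) * sin_phi * cos_phi + c * cos_phi ** 2

    # a perfectly straight segment has zero distortion error
    assert abs(segment_distortion_error([(t, 2.0 * t + 1.0) for t in range(10)])) < 1e-9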
A first version of the distortion calibration process is:

1. Load or acquire a set of images.
2. Do subpixel edge detection and linking on all the images in the collection. The result is the set of linked edges of all images.
3. Initialize the distortion parameters with reasonable values.
4. Do polygonal approximation on the undistorted edges to extract segment candidates.
5. Compute the distortion error E0 = Σ χ² (the sum is taken over all the detected segments).
6. Optimize the distortion parameters κ1, cx, cy, sx to minimize the total distortion error. The total distortion error is taken as the sum of the distortion errors (eq. 15) of all detected line segments, and is optimized using a nonlinear least-squares minimization method (e.g. Levenberg-Marquardt).
7. Compute the distortion error E1 for the optimized parameters.
8. If the relative change of error (E0 − E1)/E1 is less than a threshold, stop here.
9. Update the distortion parameters with the optimized values.
10. Go to step 4.
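As a minimal sketch of this loop (ours, in Python with SciPy, under stated assumptions: extract_segments and undistort_edges are hypothetical helpers standing in for the polygonal approximation and un-distortion steps, and segment_distortion_error is the function of equations 15-20 sketched above; SciPy's "lm" method wraps the MINPACK routines that the authors report using in section 5.1):

    import numpy as np
    from scipy.optimize import least_squares

    def calibrate_distortion(edges, params0, undistort_edges, extract_segments,
                             segment_distortion_error, threshold=1e-4):
        """Iterative distortion calibration (steps 1-10 above), one set of linked edges."""
        params = np.asarray(params0, dtype=float)
        while True:
            # step 4: polygonal approximation on edges undistorted with current parameters
            segments = extract_segments(undistort_edges(edges, params))

            def residuals(p):
                # one residual per segment: sqrt of its chi^2 (eq. 15)
                return np.sqrt([segment_distortion_error(undistort_edges(s, p))
                                for s in segments])

            e0 = float(np.sum(residuals(params) ** 2))               # step 5
            fit = least_squares(residuals, params, method="lm")      # steps 6-7
            e1 = float(np.sum(fit.fun ** 2))
            if (e0 - e1) / e1 < threshold:                           # step 8
                return fit.x
            params = fit.x                                           # steps 9-10

In practice one would freeze all parameters but κ1 (or ω) in the first rounds, as explained next.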
By minimizing on all the parameters when the data still contains many outliers, there is a risk of getting farther from the optimal parameters. For this reason, steps 3 to 9 are first done with optimization only on the first radial distortion parameter (κ1 for polynomial models, ω for FOV models) until the termination condition of step 8 is verified; then cx and cy are added, and finally full optimization on the distortion parameters (including sx) is performed. During the process, the polygonal approximation (step 4) progressively eliminates most outliers.

Of course, the success of the whole process depends on the number, length and accuracy of the line segments detected in the images. Moreover, the segments should have various positions and orientations in the image, in order to avoid singular or almost singular situations. For example, one cannot compute radial distortion if all straight lines supporting the detected segments go through a single point in the image. Fortunately, data is cheap in our case, since getting more line segments usually involves only moving the camera and taking more pictures. Instead of analyzing how the number, length and accuracy of the detected segments influence the stability and accuracy of the algorithm, we judged that there was enough data if adding more data (i.e. more pictures) wouldn't change the results significantly. A more in-depth study of what minimum data is necessary for the calibration would be useful, especially in situations where "getting more data" is a problem.

4 Model selection

We have shown that several models can be used to describe a lens' nonlinear distortion: polynomial distortion models (eq. 6) with different orders (first order uses only κ1, second order κ1 and κ2, etc.), and fish-eye models such as the FET model (eq. 12) or the FOV model (eq. 14) with different orders (first order uses only ω, second order is the application of a first order polynomial model before eq. 14, etc.). But then arises the problem of choosing the right model for a given lens.

4.1 Probabilistic approach

The easiest way of choosing the model that best describes some data, based on probability theory, is to take the one that gives the lowest residuals. This usually leads to picking the model with the biggest number of parameters, since increasing the number of parameters usually lowers the residuals (an extreme case is when there are as many parameters as residuals, and the residuals can all be zero). In the experimental setup we used, the models have a reduced number of parameters (at most 6 for order 3 models), and we can get as much data as we want (data is edges in our case), simply by acquiring more images with the same lens (the scene need not be different for each image; moving the camera around is enough). For a given kind of model (e.g. polynomial), this method will almost always pick the model with the highest number of parameters, but we will still be able to say, between two different kinds of models with a given number of parameters, for example a third order polynomial model and a third order FOV model, which one is best. We will also be able to state how much more accuracy we get by adding one order to a given model.

4.2 MDL et al.

When the number of images is limited, or the camera is fixed (e.g. a surveillance camera), a smarter selection method should be used. A proper model selection method would be based on the fact that the model that best describes the data leads to the shortest encoding (or description, in the information theory sense) of the model and the data. This principle is called Minimum Description Length [20,14], and is now widely used in computer vision. The MDL principle, and other model selection methods based on information theory [25,16], require a fine analysis of the properties of the data and the model, which is beyond the scope of this paper.

When the amount of data (edges in our case) increases, these approaches become asymptotically equivalent to the probabilistic method, because almost all information is contained in the data. A different way of understanding this is that in the ideal case, where we have an infinite amount of data with unbiased noise, the best model will always be the one that gives the lowest residuals, whereas with only a few data, the model that gives the lowest residuals may fit the noise instead of the data itself. For this reason, we used the simpler probabilistic method in our experiments, because we can get as much data as we want, just by using edges extracted from additional images taken with the same lens.

4.3 Conversion between distortion models

Suppose we have selected the distortion model that best fits our lens; one may then want to know how much accuracy is lost, in terms of pixel displacement, when using another model instead of this one. Similarly, if two models seem to perform equally well with respect to our distortion calibration method, one may want to be able to measure how geometrically different these models are. To answer these questions, we developed a conversion method that picks, within a distortion model family, the one that most resembles a given model from another family, and also measures how different these models are.
One way to measure how close distortion model A with parameters pA is to distortion model B is to try to convert the parameter set describing the first model to a parameter set pB describing the second model. Because the two models belong to different families of transformations, this conversion is generally not possible, but we propose the following method to get the parameter set pB of model B which gives the best approximation of model A(pA) in the least squares sense.

The parameter set pB is chosen so that the distorted image is undistorted "the same way" by A(pA) and B(pB). "The same way" means that there is at most a non-distorting transformation between the set of points undistorted by A(pA) and the set of points undistorted by B(pB). We define a non-distorting transformation as one which is linear in projective coordinates. The most general non-distorting transformation is a homography, so we are looking for parameters pB and a homography H such that the image undistorted by B(pB) and transformed by H is as close as possible to the image undistorted by A(pA).

Let us consider an infinite set of points {mi} uniformly distributed on the distorted image, let {mAi} be these points undistorted using A(pA), and let {mBi} be the same points undistorted by B(pB). We measure the closeness¹ from A(pA) to B(pB) as:

    C(A(pA), B(pB)) = inf_H lim_{n→∞} √( (1/n) Σi ||mAi − H(mBi)||² )        (21)

The conversion from A(pA) to model B is simply achieved by computing pB (and H) that minimize (21), i.e. pB = arg inf_pB C(A(pA), B(pB)). In practice, of course, we use a finite number of uniformly distributed points (e.g. 100 × 100).

C(A(pA), B(pB)) is the mean residual error in image coordinate units, and measures how well model B fits A(pA). This can be used, if B has fewer parameters than A, to check whether B(pB) is good enough to represent the distortion yielded by A(pA). An example is shown on a fish-eye lens in section 5.4.

5 Results and comparison with a full calibration method

5.1 Experimental setup

We used various hardware setups to test the accuracy of the distortion calibration, from low-cost video-conference video hardware to high-quality cameras and frame grabbers.

The lowest quality hardware is a very simple video acquisition system included with every Silicon Graphics Indy workstation. This system is not designed for accuracy or quality, and consists of an IndyCam camera coupled with the standard Vino frame grabber. The acquired image is 640×480 pixels interlaced, and contains a lot of distortion and blur caused by the cheap wide-angle lens. The use of an on-line camera allows very fast image transfer between the frame grabber and the program memory using Direct Memory Access (DMA), so that we are able to do fast distortion calibration. The quality of the whole system seems comparable to that of a VHS videotape.

Other images were acquired using an Imaging Technologies acquisition board together with several different camera setups: a Sony XC75CE camera with 8mm, 12.5mm, and 16mm lenses (the smaller the focal length, the more important the distortion), and an old Pulnix TM-46 camera with an 8mm lens. The fish-eye images come from a custom underwater camera².

The distortion calibration software is a stand-alone program that can work either on images acquired on-line using a camera and a frame grabber, or on images acquired off-line and saved to disk. The image gradient was computed using a recursive Gaussian filter [7], and subsequent edge detection was done by NMS. The optimization step was performed using the subroutine lmdif from MINPACK or the subroutine dnls1 from SLATEC, both packages being available from Netlib³.

5.2 The full calibration method

In order to evaluate the validity of the distortion parameters obtained by our method, we compared them to those obtained by a method for full calibration (both external and internal) that incorporates comparable distortion parameters. The software we used to do full calibration implements the Tsai calibration method [26] and is freely available. This software implements calibration of external (rotation and translation) and internal camera parameters at the same time. The internal parameter set is composed of the pinhole camera parameters, except for the shear parameter (which is very close to zero on CCD cameras anyway [3]), and of the first radial distortion parameter. From the result of this calibration mechanism, we can extract the position of the principal point, the image aspect ratio, and the first radial distortion parameter.

As seen in section 2, though, these are not exactly the same parameters as those that we can compute using our method, since we allow more degrees of freedom for the distortion function: two more parameters of decentering distortion (the coordinates of the principal point and of the center of distortion may differ) and one parameter of tangential distortion (the image aspect ratio and the distortion aspect ratio may differ).

¹ This measure is not a distance, since C(A(pA), B(pB)) ≠ C(B(pB), A(pA)). A distance derived from C is C′(A(pA), B(pB)) = √(C²(A(pA), B(pB)) + C²(B(pB), A(pA))), but our measure reflects the fact that "finding the best parameters pB for model B to fit model A(pA)" is a non-symmetric process.

² Thanks go to J. Ménière and C. Migliorini from Poseidon, Paris, who use these cameras for swimming-pool monitoring, for letting me use these images.

³ http://www.netlib.org/
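Since the comparisons that follow rely on the closeness of equation 21, here is a sketch of the model conversion (ours, in Python with NumPy and SciPy; undistort_A and undistort_B are hypothetical stand-ins for the two models' un-distortion functions, each mapping an n×2 array of distorted points to undistorted points):

    import numpy as np
    from scipy.optimize import least_squares

    def _residuals(params, pts_d, undistort_A, pA, undistort_B, n_b):
        """Residuals mA_i - H(mB_i) of eq. 21 for candidate pB and homography H."""
        pB, h = params[:n_b], params[n_b:]
        H = np.append(h, 1.0).reshape(3, 3)        # H has 8 degrees of freedom
        mA = undistort_A(pts_d, pA)
        mB = undistort_B(pts_d, pB)
        mBh = np.column_stack([mB, np.ones(len(mB))]) @ H.T
        return (mA - mBh[:, :2] / mBh[:, 2:3]).ravel()

    def convert_model(pA, undistort_A, undistort_B, pB0):
        """Find pB (and H) minimizing eq. 21 over a 100x100 grid of distorted points."""
        g = np.linspace(0.0, 1.0, 100)             # normalized image coordinates
        pts_d = np.array([(x, y) for x in g for y in g])
        x0 = np.concatenate([pB0, [1, 0, 0, 0, 1, 0, 0, 0]])   # start from H = identity
        fit = least_squares(_residuals, x0,
                            args=(pts_d, undistort_A, pA, undistort_B, len(pB0)))
        closeness = np.sqrt(np.mean(np.sum(fit.fun.reshape(-1, 2) ** 2, axis=1)))
        return fit.x[:len(pB0)], closeness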
There are two ways of comparing the results of the two methods: one is to compute the closeness (defined in section 4.3) between the two sets of parameters, by computing the best homography between two sets of undistorted points; the other is to convert the radial distortion parameter found by Tsai calibration using the distortion center and aspect ratio found by our method, and vice-versa.

5.3 Results

We calibrated a set of cameras using the Tsai method and a calibration grid (Figure 3) with 128 points, and we computed the distortion parameters from the result of this full calibration method (Table 2). (Camera E could not be calibrated this way, because the automatic feature extraction used before Tsai calibration didn't work on these images.) The distortion calibration method was also applied to sets of about 30 images (see Figure 4) for each camera/lens combination, and the results for the four parameters of distortion are shown in Table 1. For each set of images, the edges extracted from all the images are used for calibration. The initial values for the distortion parameters before the optimization were set to "reasonable" values, i.e. the center of distortion was set to the center of the image, κ1 was set to zero, and sx to the image aspect ratio computed from the camera specifications. For the IndyCam, this gave cx = cy = 1/2, κ1 = 0 and sx = 3/4.

Fig. 3. The calibration grid used for Tsai calibration: original distorted image (top) and image undistorted using the parameters computed by our method (bottom).

All the parameters and results are given in normalized coordinates and are dimensionless: x is divided by the image width and y by the image height, thus (x, y) ∈ [0, 1]². This way, we are able to measure and compare the effect of the lens, and we are as independent as possible of the frame grabber (the results presented here were obtained using various frame grabbers).

As explained in section 2, these parameters have not exactly the same meaning as the distortion parameters obtained from Tsai calibration, mainly because in the Tsai model the distortion center is the same as the optical center, and also because we introduced a few more degrees of freedom in the distortion function, allowing decentering and tangential distortion. This explains why the distortion centers found on low-distortion cameras, such as the Sony with the 16mm lens, are so far away from the principal point.

The four bottom lines of Table 1 may need some more explanation. C0 is the closeness between the computed distortion model and a zero-distortion camera model (i.e. κ1 = 0). It is a good way to measure how much distortion this model represents: for example, for camera A, C0 is 6.385·10⁻³ in normalized coordinates, which corresponds to about 4 pixels of "mean distortion" over the image (not only in the corners!). The measure Ct says that there is about 0.5 pixels RMS between the distortion model computed with our method and the Tsai model, for all camera/lens combinations, which means that the quality of our method is intrinsically acceptable, but there is still no way to tell which of the two methods, Tsai or automatic, gives the best results. For cameras with high distortion, like the IndyCam and the cameras with 8mm lenses, the center of distortion and the distortion aspect ratio are close to the principal point and the image aspect ratio computed by Tsai calibration.

The seventh line of Table 1 gives the result of the conversion from the Tsai set of parameters (c′x, c′y, s′x, κ′1) (Table 2) to a model where the center and aspect ratio of distortion are fixed to the values cx, cy, sx. The resulting set of parameters is (cx, cy, sx, ψ(κ′1)), and the RMS residual error of the conversion, i.e. the closeness

    Cf = C((cx, cy, sx, ψ(κ′1)), (c′x, c′y, s′x, κ′1)),

is always below one third of a pixel (last line of the table), which allows us to compare ψ(κ′1) with κ1. In fact, for all camera configurations, the parameter ψ(κ′1) is close to the κ1 obtained by our automatic calibration method, meaning that, once again, our results are very close to those given by Tsai calibration, though they look different (especially for low-distortion lenses).

Figure 5 shows a sample image, before and after the correction. This image was affected by pin-cushion distortion, corresponding to a positive value of κ1. Barrel distortion corresponds to negative values of κ1.
camera/lens      A        B        C        D        E
cx              0.493    0.635    0.518    0.408    0.496
cy              0.503    0.405    0.122    0.205    0.490
sx              0.738    0.619    0.689    0.663    0.590
κ1              0.154    0.041    0.016    0.012   -0.041
C0 · 10³        6.385    2.449    1.044    0.770    2.651
Ct · 10³        0.751    0.923    0.811    0.626    N/A
ψ(κ′1)          0.137    0.028    0.004    0.002    N/A
Cf · 10³        0.217    0.516    0.263    0.107    N/A

Table 1. The distortion parameters obtained on various camera/lens setups using our method, in normalized image coordinates: first radial distortion parameter κ1, position of the center of distortion (cx, cy), and distortion aspect ratio sx (not necessarily the same as Sx, the image aspect ratio). C0 is the closeness with a zero-distortion model, Ct is the closeness between these parameters and the ones found by Tsai calibration, ψ(κ′1) is the first order radial distortion converted from the results of Tsai calibration (Table 2) using the distortion center (cx, cy) and aspect ratio sx from our method, and Cf is the RMS residual error of that conversion. All parameters are dimensionless. A is the IndyCam, B is the Sony XC-75E camera with 8mm lens, C is the Sony XC-75E camera with 12.5mm lens, D is the Sony XC-75E with 16mm lens, E is the Pulnix camera with 8mm lens.

camera/lens      A        B        C        D
c′x             0.475    0.514    0.498    0.484
c′y             0.503    0.476    0.501    0.487
s′x             0.732    0.678    0.679    0.678
κ′1             0.135    0.0358   0.00772  0.00375

Table 2. The distortion parameters obtained using the Tsai calibration method, in normalized image coordinates: position of the principal point (c′x, c′y), image aspect ratio s′x, and first radial distortion parameter κ′1. See Table 1 for details on the camera/lens configurations.

Fig. 4. Some of the images that were used for distortion calibration. Even blurred or fuzzy pictures can be used.

Fig. 5. A distorted image with the detected segments (left) and the same image at the end of the distortion calibration, with segments extracted from undistorted edges (right): some outliers (wrong segments on the plant) were removed and longer segments are detected. This image represents the worst case, where some curves may be mistaken for lines.

5.4 Choice of the distortion model

In this experiment, we use an underwater fish-eye lens (Figure 6), and we want to find which distortion model best fits this lens. Besides, we will try to evaluate which order of radial distortion is necessary to get a given accuracy. The distortion models tested on this lens are FOV1, FOV2, FOV3 (first, second, and third order FOV models), P1, P2, P3 (first, second and third order polynomial models), and P-1, P-2, P-3 (first, second and third order inverse polynomial models).

Results (Table 3) show, for each model, the total number of segments detected at the end of the calibration stage, the number of edgels forming these segments, and the mean edgel distortion error (Equation 15) in normalized image coordinates. The image size is 384 × 288, and we notice that the mean edgel distortion error is almost the same for all models, and comparable to the theoretical accuracy of the edge detection method (0.1 pixel) [11].
We can judge the quality of the distortion models from the number of detected edgels which were classified as belonging to segments, and from the mean segment length. From these, we can see that model FOV3 gives the best results, and that all versions of the FOV model (FOV1, FOV2, FOV3) perform better than the polynomial models (P1, P2, P3) for this lens. We also notice that the inverse polynomial models (P-1, P-2, P-3) perform poorly, compared with their direct counterparts. From these measurements, we clearly see that, though they have the same number of degrees of freedom, different distortion models (e.g. FOV3, P3 and P-3) describe the real distortion transformation more or less accurately. Therefore, the distortion model must be chosen carefully.

model   nb. seg.   nb. edgels   seg. len.   dist. err. · 10³
FOV1      591        78646        133.1        0.312
FOV2      589        77685        131.9        0.298
FOV3      585        79718        136.3        0.308
P1        670        71113        106.1        0.304
P2        585        77161        131.9        0.318
P3        588        77154        131.2        0.318
P-1       410        48352        117.9        0.300
P-2       534        68286        127.9        0.308
P-3       549        71249        129.8        0.312

Table 3. The results of calibration on the same set of images using different distortion models: number of segments detected, number of edgels forming these segments, mean segment length, and mean edgel distortion error.

Once we have chosen the distortion model (FOV in this case), we still have to determine what order is necessary to get a given accuracy. For this, we use the residual error of the conversion from the highest-order model to a lower order model (section 4.3). These residuals, computed for conversions from the FOV3 and P3 models, are shown in Table 4.

from \ to    FOV1   FOV2   FOV3   P1     P2     P3
FOV3         0.69   0.10   N/A    2.00   0.21   0.03
P3           1.00   0.04   0.02   2.08   0.03   N/A

from \ to    P-1    P-2    P-3
FOV3         7.29   1.32   1.16

Table 4. Residual errors, in 10⁻³ normalized coordinates, after converting from models FOV3 and P3 to other models.

From these results, we immediately notice that the inverse polynomial models (P-1, P-2, P-3) are completely inadequate for this lens, since they can lead to mean distortion errors from one half to several pixels, depending on the order; but we already noticed that these models were not suitable from the calibration results (Table 3).

The most important result is that by using FOV2 instead of FOV3, we will get a mean distortion error of about 0.2 pixels (for a 512×512 image); if we use P2 this error will be 0.4 pixels, and if we use FOV1 it will be 1.4 pixels. Consequently, if we need the best accuracy, we have to use the FOV3 model, but FOV2 and P2 represent a good compromise between performance and accuracy. FOV2 is especially interesting, since a closed form inverse function is available for this model.

This investigation was made on our underwater camera, but the same investigation could be made on other lenses. This could, of course, lead to a different optimal distortion model than FOV2, but the method would be the same.

Fig. 6. An image taken with the underwater fish-eye lens, and the same image undistorted using model FOV3. The parallel lines help checking the result of distortion calibration.

6 Discussion

With computer vision applications demanding more and more accuracy in the camera model and the calibration of its parameters, there is definitely a need for calibration methods that don't rely on the simple projective linear pinhole camera model. Camera optics still have lots of distortion, and zero-distortion wide-angle lenses exist but remain very expensive.
The automatic distortion calibration method presented here has many advantages over other existing calibration methods that use a camera model with distortion [3,4,24,26]. First, it makes very few assumptions about the observed world: there is no need for a calibration grid [3,4,26]; all it needs is images of scenes containing 3-D segments, like interior scenes or city scenes. Second, it is completely automatic, and the camera motion need not be known [23,24]. It can even be applied to images acquired off-line, which could come from a surveillance videotape or a portable camcorder. Results of distortion calibration, and a comparison with a grid-based calibration method [18], were shown for several lenses and cameras.

If we decide to calibrate distortion, there is not a unique solution for the choice of the kind of distortion model [1] and the order of this distortion model. For example, fish-eye lenses may not be well represented by the traditional polynomial distortion model. We presented an alternative fish-eye model, called the FOV model, together with methods to determine which model is best for a given lens, and at which order. This study was made in the case of an underwater fish-eye camera, and the results showed that the highest-order model may not always be necessary, depending on the required accuracy, and that different models with the same number of parameters don't necessarily give the same accuracy.

Once the distortion is calibrated, any computer vision algorithm that relies on the pinhole camera model can be used, simply by applying the inverse of the distortion either to image features (edges, corners, etc.) or to the whole image. This method could also be used together with self-calibration or weak calibration methods that would take the distortion parameters into account. The distortion calibration could be done before self-calibration, so that the latter would use un-distorted features and images, or during self-calibration [28], the distortion error being taken into account in the self-calibration process.

References

1. Anup Basu and Sergio Licardie. Alternative models for fish-eye lenses. Pattern Recognition Letters, 16:433-441, April 1995.
2. P. Beardsley, D. Murray, and A. Zissermann. Camera calibration using multiple images. In G. Sandini, editor, Proceedings of the 2nd European Conference on Computer Vision, pages 312-320, Santa Margherita, Italy, May 1992. Springer-Verlag.
3. Horst A. Beyer. Accurate calibration of CCD-cameras. In Proceedings of the International Conference on Computer Vision and Pattern Recognition, Urbana Champaign, IL, June 1992. IEEE.
4. P. Brand, R. Mohr, and P. Bobet. Distorsions optiques : correction dans un modèle projectif. Technical Report 1933, LIFIA-INRIA Rhône-Alpes, 1993.
5. Duane C. Brown. Close-range camera calibration. Photogrammetric Engineering, 37(8):855-866, 1971.
6. Bruno Caprile and Vincent Torre. Using vanishing points for camera calibration. The International Journal of Computer Vision, 4(2):127-140, March 1990.
7. R. Deriche. Recursively implementing the Gaussian and its derivatives. Technical Report 1893, INRIA, Unité de Recherche Sophia-Antipolis, 1993.
8. R. Deriche and G. Giraudon. Accurate corner detection: An analytical study. In Proceedings of the 3rd International Conference on Computer Vision, pages 66-70, Osaka, Japan, December 1990. IEEE Computer Society Press.
9. R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. The International Journal of Computer Vision, 10(2):101-124, 1993.
10. Rachid Deriche, Régis Vaillant, and Olivier Faugeras. From noisy edge points to 3D reconstruction of a scene: A robust approach and its uncertainty analysis. In Proceedings of the 7th Scandinavian Conference on Image Analysis, pages 225-232, Aalborg, Denmark, August 1991.
11. Frédéric Devernay. A non-maxima suppression method for edge detection with sub-pixel accuracy. Research Report 2724, INRIA, November 1995.
12. Olivier Faugeras, Tuan Luong, and Steven Maybank. Camera self-calibration: theory and experiments. In G. Sandini, editor, Proceedings of the 2nd European Conference on Computer Vision, pages 321-334, Santa Margherita, Italy, May 1992. Springer-Verlag.
13. Olivier Faugeras and Giorgio Toscani. Structure from motion using the reconstruction and reprojection technique. In IEEE Workshop on Computer Vision, pages 345-348, Miami Beach, November-December 1987. IEEE Computer Society.
14. P. Fua and A.J. Hanson. An optimization framework for feature extraction. Machine Vision and Applications, (4):59-87, 1991.
15. Richard Hartley. Projective reconstruction and invariants from multiple images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 16(10):1036-1040, 1994.
16. K. Kanatani. Automatic singularity test for motion analysis by an information criterion. In Bernard Buxton, editor, Proceedings of the 4th European Conference on Computer Vision, pages 697-708, Cambridge, UK, April 1996.
17. J.M. Lavest, M. Viala, and M. Dhome. Do we really need an accurate calibration pattern to achieve a reliable camera calibration? In Hans Burkhardt and Bernd Neumann, editors, Proceedings of the 5th European Conference on Computer Vision, volume 1 of Lecture Notes in Computer Science, pages 158-174, Freiburg, Germany, June 1998. Springer-Verlag.
18. R. K. Lenz and R. Y. Tsai. Techniques for calibration of the scale factor and image center for high accuracy 3-D machine vision metrology. IEEE Transactions on Pattern Analysis and Machine Intelligence, 10:713-720, 1988.
19. M. A. Penna. Camera calibration: A quick and easy way to determine the scale factor. IEEE Transactions on Pattern Analysis and Machine Intelligence, 13:1240-1245, 1991.
20. J. Rissanen. Minimum description length principle. Encyclopedia of Statistical Sciences, 5:523-527, 1987.
21. Shishir Shah and J.K. Aggarwal. Intrinsic parameter calibration procedure for a (high-distortion) fish-eye lens camera with distortion model and accuracy estimation. Pattern Recognition, 29(11):1775-1788, 1996.
22. C. C. Slama, editor. Manual of Photogrammetry. American Society of Photogrammetry, fourth edition, 1980.
23. Gideon P. Stein. Internal camera calibration using rotation and geometric shapes. Master's thesis, Massachusetts Institute of Technology, June 1993. AITR-1426.
24. Gideon P. Stein. Accurate internal camera calibration using rotation with analysis of sources of error. In Proceedings of the 5th International Conference on Computer Vision, Boston, MA, June 1995. IEEE Computer Society Press.
25. P.H.S. Torr and D.W. Murray. Statistical detection of independent movement from a moving camera. Image and Vision Computing, 11(4):180-187, 1993.

26. Roger Y. Tsai. A versatile camera calibration technique for high-accuracy 3D machine vision metrology using off-the-shelf TV cameras and lenses. IEEE Journal of Robotics and Automation, 3(4):323-344, August 1987.
27. J. Weng, P. Cohen, and M. Herniou. Camera calibration with distortion models and accuracy evaluation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(10):965-980, October 1992.
28. Z. Zhang. On the epipolar geometry between two images with lens distortion. In International Conference on Pattern Recognition, volume I, pages 407-411, Vienna, Austria, August 1996.
29. Z. Zhang, R. Deriche, O. Faugeras, and Q.-T. Luong. A robust technique for matching two uncalibrated images through the recovery of the unknown epipolar geometry. Artificial Intelligence Journal, 78:87-119, October 1995.
