
COMPUTER VISION
CAMERA & IMAGE FORMATION

Ngo Quoc Viet-2024


CONTENTS

• Camera models
– Pinhole camera model
– Perspective projection

• Pinhole Simulation for Image Formation

• Camera calibration

• Lights and Colors



LECTURE OUTCOMES

• Understanding the pinhole camera, perspective projection and calibration matrices.

• Implementing calibration to determine the intrinsic and extrinsic matrices:

      α  0  c_x                 r11 r12 r13 t_x
  K = 0  β  c_y       [R | t] = r21 r22 r23 t_y
      0  0  1                   r31 r32 r33 t_z

• Implementing a small augmented-reality application to embed 3D objects into a real scene.


SOME CAMERA MODELS

• Pinhole Camera Model: the simplest mathematical model, applicable to many real photography devices.

• Thin Lens Model: incorporates a thin lens to simulate the effects of lenses on image formation.

• Fish-Eye Camera Model: fish-eye lenses capture a very wide field of view.

• Spherical Camera Model: represents cameras with spherical or panoramic lenses.

• Omni-Directional Camera Model: omni-directional cameras capture images from all directions simultaneously.


PINHOLE CAMERA MODEL

• The pinhole camera model is the simplest mathematical model that can be applied to many real photography devices.

• It is a simplified representation of how light travels through a small aperture to form an image on a photosensitive surface.

• It provides a straightforward and intuitive way to understand the fundamental principles of image formation.

• In computer vision, the pinhole camera model is used for camera calibration: finding the intrinsic parameters (focal length and optical center) and the extrinsic parameters (position and orientation of the camera). Calibration is crucial for tasks such as 3D reconstruction and object tracking.
PINHOLE CAMERA MODEL

[Figure: image plane (2D image), pinhole, virtual image plane, 3D object. f = focal length, c = center of the camera]

What is the transformation between images on the two image planes?
PINHOLE CAMERA MODEL

• In this model, the image plane (the film or medium that captures the light rays) is placed in front of the pinhole. In the real world it is located behind the pinhole; placing it in front makes the projection easier to model, because we do not have to worry about inverting the image.

• All the light rays from different points converge at the pinhole, which is also called the center of projection or the camera center.

• The idea is that the image of a point is the projection of that point on the image plane, i.e., where the line from the camera center to the point intersects the image plane.

• What we want is a one-to-one correspondence between points in the world and pixels on the film.


PINHOLE CAMERA MODEL

• The pinhole camera model defines the geometric relationship between a 3D point and its corresponding 2D projection onto the image plane → perspective projection.

• A 3D point P = (X, Y, Z)^T is projected to a 2D image point p = (x, y)^T.


PINHOLE CAMERA MODEL

• Simplest form of perspective projection (by similar triangles):

  x/f = X/Z ⇒ x = f·X/Z;   y/f = Y/Z ⇒ y = f·Y/Z

  In homogeneous coordinates: (x, y, 1)^T ~ (fX, fY, Z)^T

• Change of units, from physical measurements to pixels:
  – k, l: scale parameters (pixels/mm)
  – f: focal length (mm). Denote α = kf, β = lf

  x = kf·X/Z,  y = lf·Y/Z  ⇒  x = α·X/Z,  y = β·Y/Z

The coordinate system [x y z] centered at the pinhole, with the z axis perpendicular to the image plane, is the camera (reference) coordinate system. The line through the pinhole perpendicular to the image plane is called the principal/optical axis of the camera system.
PINHOLE CAMERA MODEL

• Change of coordinate system:
  – Image plane coordinates have their origin at the image center
  – Digital image coordinates have their origin at the top-left corner (0, 0)
  – Image center (principal point): (c_x, c_y)

  x = α·X/Z + c_x,  y = β·Y/Z + c_y
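As a quick sketch, the pixel mapping x = α·X/Z + c_x, y = β·Y/Z + c_y can be coded directly. The focal scales (α = β = 800 pixels) and principal point (320, 240) below are made-up example values, not parameters of any real camera.

```python
import numpy as np

def project_point(P, alpha, beta, cx, cy):
    """Project a 3D point P = (X, Y, Z), given in camera coordinates,
    to pixel coordinates: x = alpha*X/Z + cx, y = beta*Y/Z + cy."""
    X, Y, Z = P
    return np.array([alpha * X / Z + cx, beta * Y / Z + cy])

# Assumed example intrinsics: alpha = beta = 800 px, principal point (320, 240)
p = project_point((0.5, -0.25, 2.0), 800, 800, 320, 240)
print(p)  # [520. 140.]
```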
HOMOGENEOUS COORDINATES

• Homogeneous coordinates are a mathematical tool used in computer graphics, computer vision, and other fields to represent points, vectors, and transformations in projective spaces. The term "homogeneous" implies that these coordinates are defined up to a scale factor.

• In homogeneous coordinates, a 2D point is represented as (x, y, w), where (x, y) are the Cartesian coordinates and w is a non-zero scale factor. A point in Cartesian space is a ray in homogeneous space.

• Conversion between Cartesian and homogeneous coordinates:

  (x, y) → (x, y, 1),    (x, y, z) → (x, y, z, 1)
  (x, y, w) → (x/w, y/w),    (x, y, z, w) → (x/w, y/w, z/w)
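The two conversions above can be sketched in a few lines of NumPy (helper names are illustrative):

```python
import numpy as np

def to_homogeneous(x):
    """Append w = 1: (x, y) -> (x, y, 1), (x, y, z) -> (x, y, z, 1)."""
    return np.append(np.asarray(x, dtype=float), 1.0)

def from_homogeneous(xh):
    """Divide by the last (non-zero) coordinate and drop it."""
    xh = np.asarray(xh, dtype=float)
    return xh[:-1] / xh[-1]

print(to_homogeneous([2, 3]))       # [2. 3. 1.]
print(from_homogeneous([4, 6, 2]))  # [2. 3.]
# Defined up to scale: (4, 6, 2) and (2, 3, 1) are the same projective point.
```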


HOMOGENEOUS COORDINATES

• Projective Space: homogeneous coordinates are particularly useful in projective geometry and projective spaces. Projective geometry deals with the properties of geometric objects that remain invariant under projective transformations.

• Affine and Projective Transformations: affine transformations (translation, rotation, scaling, and shearing) can be represented using 3x3 (2D transformations) or 4x4 (3D transformations) matrices in homogeneous coordinates.

• Points at Infinity: in homogeneous coordinates, points at infinity can be represented naturally. For example, the point (x, y, 0) represents the point at infinity in the direction (x, y).


BASIC GEOMETRY IN HOMOGENEOUS COORDINATES

• 2D lines can also be represented using homogeneous coordinates l = (a, b, c)^T. The corresponding line equation is x̃ · l = ax + by + c = 0.

• We can normalize the line equation vector so that l = (n_x, n_y, d) = (n̂, d), with ||n̂|| = 1.

• We can also express n̂ as a function of a rotation angle θ: n̂ = (n_x, n_y) = (cos θ, sin θ).

[Figure: (a) 2D line equation and (b) 3D plane equation, expressed in terms of the normal n̂ and the distance to the origin d]


BASIC GEOMETRY IN HOMOGENEOUS COORDINATES

• The line through two points (x_i, y_i) and (x_j, y_j) is given by the cross product of their homogeneous coordinates:

  (x_i, y_i, 1)^T × (x_j, y_j, 1)^T = (y_i − y_j,  x_j − x_i,  x_i·y_j − y_i·x_j)^T

• The intersection point of two lines is given by the cross product of the lines:

  q_ij = line_i × line_j
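Both constructions above reduce to `np.cross` on homogeneous vectors; a minimal sketch (function names are illustrative):

```python
import numpy as np

def line_through(p1, p2):
    """Homogeneous line through two 2D points given in Cartesian form."""
    return np.cross([p1[0], p1[1], 1.0], [p2[0], p2[1], 1.0])

def intersect(l1, l2):
    """Intersection point (homogeneous) of two homogeneous lines."""
    return np.cross(l1, l2)

# x-axis (through (0,0) and (1,0)) meets the vertical line x = 2
l1 = line_through((0, 0), (1, 0))
l2 = line_through((2, 0), (2, 1))
q = intersect(l1, l2)
print(q / q[2])  # [2. 0. 1.] -> Cartesian intersection (2, 0)
```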


PINHOLE CAMERA MODEL

• Using homogeneous coordinates, we can express x = α·X/Z + c_x, y = β·Y/Z + c_y in matrix form: the Cartesian point (αX/Z + c_x, βY/Z + c_y) corresponds to the homogeneous point (αX + c_x·Z, βY + c_y·Z, Z), so

      x     αX + c_x·Z     α  0  c_x  0     X
  p = y  =  βY + c_y·Z  =  0  β  c_y  0  ·  Y  = M·P
      Z         Z          0  0  1    0     Z
                                            1

• We can represent the relationship between a point in 3D space and its image coordinates by a matrix-vector relationship.


PINHOLE CAMERA MODEL

• We can decompose this transformation a bit further into

             α  0  c_x     1 0 0 0
  p = MP  =  0  β  c_y  ·  0 1 0 0  · P  =  K [I 0] P
             0  0  1       0 0 1 0

• The matrix K is often referred to as the camera matrix. It is also called the intrinsic matrix.
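The factored form p = K [I 0] P can be sketched directly in NumPy; the intrinsic values below are assumed example numbers:

```python
import numpy as np

# Intrinsic matrix K (alpha, beta, cx, cy are assumed example values)
alpha, beta, cx, cy = 800.0, 800.0, 320.0, 240.0
K = np.array([[alpha, 0.0,  cx],
              [0.0,   beta, cy],
              [0.0,   0.0,  1.0]])

# M = K [I 0] is the 3x4 projection for a camera at the world origin
M = K @ np.hstack([np.eye(3), np.zeros((3, 1))])

P = np.array([0.5, -0.25, 2.0, 1.0])  # homogeneous 3D point
p = M @ P
p = p / p[2]                          # divide by Z to get pixel coordinates
print(p[:2])  # [520. 140.]
```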


PINHOLE COMPLETE CAMERA MATRIX

• The image frame may not be exactly rectangular due to sensor manufacturing errors. Two parameters are currently missing: skewness and distortion (distortion is ignored here).
  – θ: skew angle between the x- and y-axis (normally θ = 90 degrees).

       α  α·cot θ   c_x
  K =  0  β/sin θ   c_y
       0  0         1

• Most cameras have zero skew (θ = 90 degrees).

• The camera matrix K has 5 degrees of freedom: α, β for the focal-length scales, (c_x, c_y) for the principal-point offset, and θ for skew.


WORLD COORDINATES

• We have described a mapping between a point P in the 3D camera reference system and a point p in the 2D image plane.

• But what if the information about the 3D world is available in a different coordinate system?

• We need to include an additional transformation that relates points in the world reference system to the camera reference system.

• This transformation is captured by a rotation matrix R and a translation vector T.


WORLD COORDINATES

• A 3D world coordinate is projected to the image plane through two transformations:
  – Transform the 3D world coordinate system into the camera coordinate system (rotation and translation)
  – Project the 3D camera point P onto the image plane

• The first transformation is captured by a rotation matrix R and a translation vector T.


CAMERA TRANSLATION & ROTATION

• Translation:
                    α  0  c_x     1 0 0 t_x
  p = K [I t] P  =  0  β  c_y  ·  0 1 0 t_y  · P
                    0  0  1       0 0 1 t_z

• Rotation:
                    α  0  c_x     r11 r12 r13 t_x
  p = K [R t] P  =  0  β  c_y  ·  r21 r22 r23 t_y  · P
                    0  0  1       r31 r32 r33 t_z

• Elementary rotations about each axis:

           1   0       0              cos β  0  sin β            cos γ  −sin γ  0
  R_x(α) = 0  cos α  −sin α   R_y(β) =  0    1    0     R_z(γ) = sin γ   cos γ  0
           0  sin α   cos α          −sin β  0  cos β              0      0     1

• The parameters R and T are known as the extrinsic parameters because they are external to and do not depend on the camera.
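The elementary rotation matrices above can be sketched and composed as follows; the angles used in the example are arbitrary:

```python
import numpy as np

def Rx(a):
    """Rotation about the x-axis by angle a (radians)."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def Ry(b):
    """Rotation about the y-axis by angle b (radians)."""
    c, s = np.cos(b), np.sin(b)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def Rz(g):
    """Rotation about the z-axis by angle g (radians)."""
    c, s = np.cos(g), np.sin(g)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

# Compose a full rotation and check it is orthonormal with det = +1
R = Rz(0.3) @ Ry(-0.1) @ Rx(0.2)
print(np.allclose(R @ R.T, np.eye(3)), np.isclose(np.linalg.det(R), 1.0))
# True True
```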
THE EXTRINSIC PARAMETERS

  p = K [R t] P = MP

• The full projection matrix M consists of the two types of parameters introduced above: intrinsic and extrinsic parameters.

• All parameters contained in the camera matrix K are the intrinsic parameters, which change as the type of camera changes.

• The extrinsic parameters include the rotation and translation, which do not depend on the camera's build.

• Overall, the 3 × 4 projection matrix M has 11 degrees of freedom: 5 intrinsic, 3 for extrinsic rotation and 3 for extrinsic translation.


MAPPING COORDINATES FROM 3D TO 2D

• The most important intrinsic parameter is the focal length, which determines how much the camera can zoom in or out. The principal point coordinates define the center of the image, where the optical axis intersects the image plane. The image sensor size determines the field of view of the camera.

• The extrinsic parameters of a camera describe its position and orientation in 3D space. These parameters include the rotation and translation vectors, which determine the camera's position and orientation relative to the 3D scene being imaged.

• When we perform a perspective projection of 3D points onto a 2D plane, we need to take into account both the intrinsic and extrinsic parameters of the camera.
PINHOLE CAMERA MODEL

• When the camera matrix is equal to the identity matrix, it means that there is no
distortion or scaling in the image. This is a special case known as the pinhole
camera model, where light passes through a single point (the pinhole) to form an
inverted image on the opposite side of the camera.
• Distortion coefficients are used to correct lens distortion, which can cause images
to appear warped or curved. When the distortion coefficients are set to zero, it
means that there is no distortion to correct for.
• In some cases, it may be appropriate to use an identity camera matrix and zero
distortion coefficients. This is typically done when the camera has already been
calibrated and the distortion is minimal, or when a pinhole camera model is
appropriate for the task at hand.



3D TO 2D PROJECTIONS

• Orthography: drop the z component:

       1 0 0 0
  p =  0 1 0 0  · P
       0 0 0 1

• Scaled orthography: p = [s·I_(2×2) 0] P

• Para-perspective

• Perspective: the most commonly used projection in computer graphics and computer vision.

• Object-centered


PERSPECTIVE PROJECTION

• The most commonly used projection in computer graphics and computer vision is true 3D perspective.

• Points are projected onto the image plane by dividing them by their z component: p = (X/Z, Y/Z, 1)^T. In homogeneous coordinates:

       1 0 0 0
  p =  0 1 0 0  · P
       0 0 1 0

• After projection, it is not possible to recover the distance of the 3D point from the image, which makes sense for a 2D imaging sensor.


PERSPECTIVE PROJECTION

• A form often seen in computer graphics systems is a two-step projection that
  – projects 3D coordinates into normalized device coordinates (X, Y, Z) ∈ [−1, 1] × [−1, 1] × [0, 1], and
  – rescales these coordinates to integer pixel coordinates using a viewport transformation.

• The (initial) perspective projection is then represented using a 4x4 matrix:

  p = [[1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, −Z_far/Z_range, Z_near·Z_far/Z_range],
       [0, 0, 1, 0]] · P,    Z_range = Z_far − Z_near

  where Z_near and Z_far are the near and far z clipping planes.


PERSPECTIVE PROJECTION

• Transform the world point (X_w, Y_w, Z_w), expressed in the world frame with origin W, to a new coordinate system with the camera's optical center C as the (0, 0, 0) origin. This is done using a rigid-body transformation, consisting of a rotation (R) and a translation (t).

• Project the camera point (X_c, Y_c, Z_c) onto the optical sensor, creating new coordinates (x, y, z) in the same coordinate system. This is achieved using the camera's intrinsic parameter matrix K. The optical sensor is often referred to as the "image plane" or "image frame".

• Normalize (x, y, z) to pixel coordinates (u, v) by dividing by z and adjusting the origin of the image.

• Python Code: Perspective Projection, Mapping coordinates from 3D to 2D in https://colab.research.google.com/
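The three steps above can be sketched end to end; K, R, and t below are assumed example values, not calibrated parameters:

```python
import numpy as np

def world_to_pixel(Pw, K, R, t):
    """Three-step projection: world -> camera (rigid-body transform),
    camera -> sensor (intrinsics K), then divide by z to get (u, v)."""
    Pc = R @ np.asarray(Pw, dtype=float) + t   # step 1: rigid-body transform
    xyz = K @ Pc                               # step 2: apply intrinsics
    return xyz[:2] / xyz[2]                    # step 3: normalize by z

K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])  # assumed K
R = np.eye(3)                  # camera axes aligned with the world frame
t = np.array([0.0, 0.0, 2.0])  # world origin 2 m in front of the camera
print(world_to_pixel([0.5, -0.25, 0.0], K, R, t))  # [520. 140.]
```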


PERSPECTIVE TRANSFORMATION

• Map the player positions, which are in camera space, to actual soccer-field coordinates, and generate a graph with player positions relative to the soccer field.

• What we need is an output similar to the bottom-left one in the figure.

Picture from: https://naadispeaks.blog/2021/08/31/perspective-transformation-of-coordinate-points-on-polygons/

• Mathematics of Two- and Three-Point Perspective: https://people.eecs.berkeley.edu/~barsky/perspective.html
PERSPECTIVE TRANSFORMATION

• If we know the frame-space coordinates of the 4 corners of the field, we can transform any point inside that polygon into a given coordinate space. This is a perspective transformation.

• With a perspective transformation, we can change the perspective of a given image or video to get better insight into the required information.

• In OpenCV, we use cv2.getPerspectiveTransform and then cv2.warpPerspective.

• Perspective transformation is a broader term (a wider range of operations beyond just projection) that encompasses various techniques for transforming or mapping objects from one perspective to another.
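As a NumPy sketch of what cv2.getPerspectiveTransform computes, the 3x3 homography for 4 corner correspondences can be solved with a small DLT system; the corner coordinates below are illustrative, not from any real field.

```python
import numpy as np

def perspective_transform(src, dst):
    """Solve for the 3x3 homography H mapping the 4 src corners to dst
    (a sketch of what cv2.getPerspectiveTransform computes)."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence gives two linear equations in the 9 entries of H
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    # Null space of A via SVD: the last right-singular vector
    _, _, Vt = np.linalg.svd(np.array(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

def apply_h(H, pt):
    """Apply homography H to a Cartesian 2D point."""
    p = H @ np.array([pt[0], pt[1], 1.0])
    return p[:2] / p[2]

# Map a skewed quadrilateral (e.g. a camera view of a field) to a rectangle
src = [(0, 0), (1, 0), (1.2, 1), (-0.2, 1)]
dst = [(0, 0), (100, 0), (100, 50), (0, 50)]
H = perspective_transform(src, dst)
print(np.round(apply_h(H, (1.2, 1))))  # [100.  50.]
```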


IMAGE FORMATION WITH PINHOLE

• Light enters through the small pinhole.

• Inverted Image Formation: the limited rays of light passing through the pinhole create an inverted image on the opposite side of the pinhole.

• Image Projection: the inverted image is projected onto the photosensitive surface (film or image sensor).

• Please refer to the Python code to simulate a pinhole camera.
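A minimal simulation sketch of the inversion: with the real image plane a distance f behind the pinhole, the projection is x = −f·X/Z, y = −f·Y/Z, so the image comes out upside-down. The arrow coordinates and f below are arbitrary example values.

```python
import numpy as np

def pinhole_image(points, f):
    """Image of 3D points on a plane a distance f BEHIND the pinhole:
    x = -f*X/Z, y = -f*Y/Z, so the formed image is inverted."""
    pts = np.asarray(points, dtype=float)
    return -f * pts[:, :2] / pts[:, 2:3]

# An upright arrow (base to tip) 2 m in front of the camera, f = 0.1 m
arrow = [(0.0, 0.0, 2.0), (0.0, 1.0, 2.0)]
img = pinhole_image(arrow, f=0.1)
print(img)  # the tip maps to y = -0.05: the image is upside-down
```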


IMAGE FORMATION WITH PINHOLE

[Figures: pinhole image-formation examples]


WHY CAMERA CALIBRATION

• Camera calibration is a crucial process in computer vision and image processing because it helps correct distortions in images caused by the camera's optical system.

• Radial Distortion Correction: many camera lenses introduce radial distortion, causing straight lines to appear curved. Calibration helps correct this distortion, ensuring that straight lines in the real world remain straight in the image.

• Tangential Distortion Correction: this occurs when the lens is not perfectly aligned with the image sensor. Calibration corrects tangential distortion, ensuring that points away from the center of the image are accurately represented.


WHY CAMERA CALIBRATION

• Computer Vision Applications: in computer vision tasks like object recognition, tracking, and 3D reconstruction, accurate knowledge of the camera's intrinsic and extrinsic parameters is crucial. Calibration provides these parameters, allowing algorithms to map image coordinates to real-world coordinates accurately.

• Augmented Reality: in augmented reality applications, where virtual objects are overlaid onto the real world, accurate camera calibration ensures proper alignment of virtual and real-world elements.

• If given an arbitrary camera, we may or may not have access to these parameters. Therefore, can we find a way to deduce them from images?


WHY CAMERA CALIBRATION

• 3D Reconstruction: camera calibration is fundamental for reconstructing a three-dimensional scene from multiple images. It helps establish the relationship between the 3D world and the 2D image, allowing for accurate reconstruction.

• Robotics: robots often use cameras for navigation and perception. Accurate calibration is necessary for robot systems to interpret visual information correctly and make informed decisions.

• Quality Control: in industrial applications, where cameras are used for quality control, calibration ensures that defects or features are accurately detected and measured.


WHAT IS CAMERA CALIBRATION

• Camera calibration is the process of determining the intrinsic and extrinsic parameters of a camera. These parameters define how a camera projects a three-dimensional (3D) scene onto a two-dimensional (2D) image.

• The calibration process typically involves capturing images of a known calibration pattern, such as a checkerboard, from various orientations. By analyzing the correspondences between the 3D coordinates of calibration points in the world and their 2D image coordinates, the calibration software can estimate the camera's intrinsic and extrinsic parameters.

• The calibration results enable the transformation from pixel coordinates in the image to real-world coordinates in the 3D scene.


HOW TO CALIBRATE

• We need to find the intrinsic camera matrix K and the extrinsic parameters R, T by solving

  p = K [R t] P = MP

• M has 11 degrees of freedom: 5 intrinsic, 3 for extrinsic rotation and 3 for extrinsic translation.

• This means that we need at least 6 correspondences to solve this.
HOW TO CALIBRATE

• From the rig's known pattern, we have known points in the world reference frame P_1, P_2, ..., P_n. Finding these points in the image we take from the camera gives corresponding image points p_1, p_2, ..., p_n.

• We set up a linear system of equations from n correspondences such that, for each correspondence (P_i, p_i) and camera matrix M whose rows are m_1, m_2, m_3:

  p_i = (u_i, v_i)^T,  with  u_i = (m_1 P_i)/(m_3 P_i),  v_i = (m_2 P_i)/(m_3 P_i)

• Given n of these corresponding points, the entire linear system of equations becomes

  u_1 (m_3 P_1) − m_1 P_1 = 0
  v_1 (m_3 P_1) − m_2 P_1 = 0
  ...
  u_n (m_3 P_n) − m_1 P_n = 0
  v_n (m_3 P_n) − m_2 P_n = 0

  or, in matrix form,

  P_1^T   0^T    −u_1·P_1^T
  0^T     P_1^T  −v_1·P_1^T     m_1^T
  ...                        ·  m_2^T   =  P·m = 0
  P_n^T   0^T    −u_n·P_n^T     m_3^T
  0^T     P_n^T  −v_n·P_n^T
HOW TO CALIBRATE

• When 2n ≥ 11 (at least 6 correspondences, since each correspondence gives 2 equations and M has 11 degrees of freedom), our homogeneous linear system is overdetermined.

• For such a system, m = 0 is always a trivial solution. Furthermore, even if some nonzero m were a solution, then km would also be a solution for all k ∈ R. Therefore, to constrain our solution, we solve the minimization

  min_m ||Pm||^2,  subject to ||m||^2 = 1

• To solve this minimization problem, we simply use the singular value decomposition (SVD). The derivation is outside the scope of this class; please refer to Section 5.3 of Hartley & Zisserman, Multiple View Geometry in Computer Vision, and Section 1.3.1 of the Forsyth & Ponce textbook.
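The constrained minimization above can be sketched with NumPy's SVD: the minimizer is the right-singular vector associated with the smallest singular value of P. The synthetic camera matrix and points below are made up only to verify the recovery.

```python
import numpy as np

def calibrate_dlt(world_pts, image_pts):
    """Estimate the 3x4 projection matrix M from n >= 6 correspondences
    by solving min ||P m|| subject to ||m|| = 1 via the SVD (DLT)."""
    rows = []
    for (X, Y, Z), (u, v) in zip(world_pts, image_pts):
        Ph = [X, Y, Z, 1.0]
        rows.append(Ph + [0, 0, 0, 0] + [-u * c for c in Ph])
        rows.append([0, 0, 0, 0] + Ph + [-v * c for c in Ph])
    _, _, Vt = np.linalg.svd(np.array(rows))
    return Vt[-1].reshape(3, 4)  # solution: last right-singular vector

# Synthetic check: project known non-coplanar 3D points with a known M,
# then recover M from the correspondences (exact in the noiseless case)
M_true = np.array([[800, 0, 320, 100], [0, 800, 240, 50], [0, 0, 1, 2.0]])
world = [(0, 0, 1), (1, 0, 2), (0, 1, 3), (1, 1, 1),
         (2, 1, 2), (1, 2, 3), (2, 2, 1)]
image = []
for P in world:
    p = M_true @ np.append(P, 1.0)
    image.append((p[0] / p[2], p[1] / p[2]))
M_est = calibrate_dlt(world, image)
M_est *= M_true[2, 3] / M_est[2, 3]  # fix the arbitrary scale and sign
print(np.allclose(M_est, M_true))    # True
```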


Source: https://learnopencv.com/camera-calibration-using-opencv/
CAMERA CALIBRATION ALGORITHM

1. Chessboard Setup: use a chessboard pattern with known square dimensions; capture multiple images of the chessboard from different angles and distances.

2. Image Preprocessing: detect chessboard corners in each image using corner-detection algorithms (e.g., the Harris corner detector).

3. Corner Detection: find the image coordinates of the detected chessboard corners.

4. World Coordinates: define the world coordinates of the chessboard corners in 3D space, using the known square dimensions of the chessboard.


CAMERA CALIBRATION ALGORITHM

5. Calibration Object Points: create a list of 3D points representing the real-world coordinates of the chessboard corners.

6. Calibration Parameter Estimation:
   – Use the correspondences between image points and object points to estimate the camera parameters.
   – Parameters include the intrinsic matrix (focal length, principal point), distortion coefficients, and extrinsic parameters (rotation and translation vectors).

7. Evaluate Calibration Quality: reproject the 3D world points into 2D image points using the obtained calibration parameters; compare the reprojected points with the detected image points to assess the calibration accuracy.
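Step 7 can be sketched as a reprojection root-mean-square error; M, the world points, and the "detected" pixels below are synthetic example values (here the detections equal the exact projections, so the error is zero).

```python
import numpy as np

def reprojection_rmse(M, world_pts, detected_px):
    """Reproject 3D points with the estimated 3x4 matrix M and compare
    with the detected image points (RMS error in pixels)."""
    err2 = []
    for P, (u, v) in zip(world_pts, detected_px):
        p = M @ np.append(np.asarray(P, dtype=float), 1.0)
        err2.append((p[0] / p[2] - u) ** 2 + (p[1] / p[2] - v) ** 2)
    return np.sqrt(np.mean(err2))

# M = K [I t] with assumed K = [[800,0,320],[0,800,240],[0,0,1]], t = (0,0,2)
M = np.array([[800, 0, 320, 640], [0, 800, 240, 480], [0, 0, 1, 2.0]])
world = [(0, 0, 0), (1, 0, 0), (0, 1, 0)]
detected = [(320, 240), (720, 240), (320, 640)]  # exact projections here
print(reprojection_rmse(M, world, detected))  # 0.0
```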


CAMERA CALIBRATION CODE

• https://docs.opencv.org/4.x/dc/dbb/tutorial_py_calibration.html

• https://learnopencv.com/camera-calibration-using-opencv/

• https://opencv24-python-tutorials.readthedocs.io/en/latest/py_tutorials/py_calib3d/py_calibration/py_calibration.html


COLOR SPACE: RGB

[Figure: RGB channels R (G=0, B=0), G (R=0, B=0), B (R=0, G=0); RGB color cube from http://en.wikipedia.org/wiki/File:RGB_color_solid_cube.png]

Some drawbacks:
• Strongly correlated channels
• Non-perceptual
COLOR SPACES: HSV

[Figure: HSV channels H (S=1, V=1), S (H=1, V=1), V (H=1, S=0)]

• Intuitive color space
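The RGB-to-HSV conversion is available in Python's standard library; a small sketch (all values in [0, 1], sample colors chosen for illustration):

```python
# RGB -> HSV with the standard-library colorsys module (values in [0, 1]).
import colorsys

for name, rgb in [("red", (1.0, 0.0, 0.0)),
                  ("yellow", (1.0, 1.0, 0.0)),
                  ("gray", (0.5, 0.5, 0.5))]:
    h, s, v = colorsys.rgb_to_hsv(*rgb)
    print(f"{name}: H={h:.3f} S={s:.3f} V={v:.3f}")
# red: H=0.000 S=1.000 V=1.000    (hue 0 = red)
# yellow: H=0.167 S=1.000 V=1.000 (hue 1/6 = 60 degrees)
# gray: H=0.000 S=0.000 V=0.500   (zero saturation: hue is meaningless)
```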
COLOR SPACES: YCBCR

[Figure: YCbCr channels Y (Cb=0.5, Cr=0.5), Cb (Y=0.5, Cr=0.5), Cr (Y=0.5, Cb=0.5), shown at Y=0, Y=0.5, Y=1]

• Fast to compute, good for compression, used by TV
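A sketch of the conversion, assuming the BT.601 luma weights with all values kept in [0, 1] (real broadcast pipelines typically add offsets and headroom on top of this):

```python
def rgb_to_ycbcr(r, g, b):
    """RGB -> YCbCr sketch using BT.601 weights, all values in [0, 1]."""
    y = 0.299 * r + 0.587 * g + 0.114 * b  # luma: weighted sum of R, G, B
    cb = 0.5 + (b - y) / 1.772             # blue-difference chroma
    cr = 0.5 + (r - y) / 1.402             # red-difference chroma
    return y, cb, cr

# Gray has no chroma: both Cb and Cr sit at the midpoint 0.5
print(rgb_to_ycbcr(0.5, 0.5, 0.5))  # approximately (0.5, 0.5, 0.5)
```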


COLOR SPACES: L*A*B*

[Figure: L*a*b* channels L (a=0, b=0), a (L=65, b=0), b (L=65, a=0)]

• "Perceptually uniform" color space
SUMMARY

• The pinhole camera model is the simplest mathematical model that can be applied to many real photography devices.

• Homogeneous coordinates are a mathematical tool used in computer graphics and computer vision.

• Perspective projection and perspective transformation are necessary tools in computer vision.

• Calibration is crucial for tasks such as 3D reconstruction and object tracking. Camera calibration involves estimating the intrinsic and extrinsic parameters of a camera, which are necessary for accurate image analysis.

• Color spaces: RGB, HSV, YCbCr, L*a*b*.
