CV Unit-1 New
Learning Outcomes:
After completion of this unit, the student will be able to:
1. Understand the working of the camera model and radiometry (L1)
2. Understand image formation (L1)
3. Apply transformations to an image (L3)
4. Understand 3D image formation from a multi-camera perspective (L1)
Pinhole cameras
• This camera system can be designed by placing a barrier with a small aperture
between the 3D object and a photographic film or sensor.
• As Figure 1 shows, each point on the 3D object emits multiple rays of light
outwards.
• Without a barrier in place, every point on the film will be influenced by light rays
emitted from every point on the 3D object.
• Due to the barrier, only one (or a few) of these rays of light passes through the
aperture and hits the film.
• Therefore, we can establish a one-to-one mapping between points on the 3D object
and points on the film. The result is that the film is exposed by an “image” of the 3D object.
Pinhole cameras
• In this second pinhole camera model, the film is commonly called the image or
retinal plane.
• The aperture is referred to as the pinhole O or center of the camera.
• The distance between the image plane and the pinhole O is the focal length f.
• Sometimes, the retinal plane is placed between O and the 3D object at a distance
f from O.
• In this case, it is called the virtual image or virtual retinal plane.
• Note that the projection of the object in the image plane and the image of the
object in the virtual image plane are identical up to a scale (similarity)
transformation.
Now, how do we use pinhole cameras? Let P = [x, y, z]^T be a point on some 3D object
visible to the pinhole camera. P will be mapped or projected onto the image plane
Π’, resulting in the point P’ = [x’, y’]^T. Similarly, the pinhole itself can be projected onto
the image plane, giving a new point C’.
Recall that point P’ is derived from the projection of 3D point P on the image plane Π’.
Therefore, if we derive the relationship between the 3D point P and the image plane point P’,
we can understand how the 3D world imprints itself upon the image taken by a
pinhole camera. Notice that triangle P’ C’ O is similar to the triangle formed by P, O
and (0, 0, z). Therefore, using the law of similar triangles we find that:

P’ = [x’, y’]^T = [f x / z, f y / z]^T
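As a quick check of this relation, here is a minimal Python sketch (not from the notes; the function and variable names are ours) that projects camera-frame 3D points with the similar-triangles rule:

```python
import numpy as np

def pinhole_project(points, f):
    """Project Nx3 camera-frame points onto the image plane of an ideal
    pinhole camera with focal length f, using x' = f*x/z, y' = f*y/z."""
    points = np.asarray(points, dtype=float)
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    return np.stack([f * x / z, f * y / z], axis=1)

# Two points on the same ray through the pinhole project to the same image point.
print(pinhole_project([[1.0, 2.0, 4.0], [2.0, 4.0, 8.0]], f=0.05))
```

Note that the second point lies on the same ray through the pinhole but twice as far away, so it projects to the same image point as the first.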
As the aperture size increases, the number of light rays that pass through the
barrier also increases. With more light rays passing through, each
point on the film may be affected by light rays from multiple points in 3D space,
blurring the image. Although we may be inclined to make the aperture as
small as possible, recall that a smaller aperture lets fewer light rays pass
through, resulting in crisper but darker images.
The Pinhole Perspective projection equations
Let P denote a scene point with coordinates (X, Y, Z) and p denote its image, with
coordinates (x, y, z). (Throughout this chapter, we will use uppercase letters to
denote points in space, and lowercase letters to denote their image projections.)
Since p lies in the image plane, we have z = d, where d is the distance from the
pinhole O to the image plane. Since the three points P, O, and p are collinear,
we have vector Op = λ · vector OP for some number λ.
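Making the omitted step explicit (a standard derivation, consistent with the definitions above):

\vec{Op} = \lambda\,\vec{OP} \;\iff\; (x,\, y,\, d) = \lambda\,(X,\, Y,\, Z) \;\implies\; \lambda = \frac{d}{Z}, \qquad x = d\,\frac{X}{Z}, \qquad y = d\,\frac{Y}{Z}

These are the pinhole perspective projection equations; with the focal length f in place of d, they match the similar-triangles result derived earlier.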
Radiometry – Measuring Light in Space
The measurement of light is a field in itself, known as radiometry. We need a series of
units that describe how energy is transferred from light sources to surface patches,
and what happens to the energy when it arrives at a surface. The distribution of light
in space is a function of position and direction.
Radiometry is the part of image formation concerned with the relation among the
amounts of light energy emitted from light sources, reflected from surfaces, and
registered by sensors.
The interaction between light and
matter can take many forms:
● Reflection
● Refraction
● Diffraction
● Absorption
● Scattering
● Emission
Light at surfaces
When light strikes a surface, it may be
• absorbed
• transmitted (e.g., skin)
• scattered (e.g., milk)
• reflected (e.g., mirror)
Usually a combination of these effects occurs; light arriving at skin, for example, undergoes several of them at once.
Geometric Transformations
2D Translation
x' = x + t_x, \qquad y' = y + t_y
In vector form, with P = \begin{bmatrix} x \\ y \end{bmatrix}, \; P' = \begin{bmatrix} x' \\ y' \end{bmatrix}, \; T = \begin{bmatrix} t_x \\ t_y \end{bmatrix}:
P' = P + T
2D Rotation
Rotation by an angle \theta about a pivot point P_r = (x_r, y_r):
P' = P_r + R \,(P - P_r), \qquad R = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}
2D Scaling
x' = x \, s_x, \qquad y' = y \, s_y
\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} s_x & 0 \\ 0 & s_y \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}, \qquad P' = S \, P
Scaling about a fixed point (x_f, y_f):
x' = x \, s_x + x_f (1 - s_x), \qquad y' = y \, s_y + y_f (1 - s_y)
P' = S \, P + (I - S) \, P_f
Homogeneous Coordinates
2D Translation:
\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}, \qquad P' = T(t_x, t_y) \, P
2D Scaling:
\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} s_x & 0 & 0 \\ 0 & s_y & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}, \qquad P' = S(s_x, s_y) \, P
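These matrices are easy to experiment with; a small Python/NumPy sketch (the helper names T and S are ours, mirroring the notation above):

```python
import numpy as np

def T(tx, ty):
    """3x3 homogeneous 2D translation matrix."""
    return np.array([[1, 0, tx],
                     [0, 1, ty],
                     [0, 0, 1]], dtype=float)

def S(sx, sy):
    """3x3 homogeneous 2D scaling matrix (about the origin)."""
    return np.array([[sx, 0, 0],
                     [0, sy, 0],
                     [0,  0, 1]], dtype=float)

p = np.array([2.0, 3.0, 1.0])   # the point (2, 3) in homogeneous coordinates
print(T(1, -1) @ p)             # -> [3. 2. 1.], the translated point (3, 2)
print(S(2, 0.5) @ p)            # -> [4. 1.5 1.], the scaled point (4, 1.5)
```

Because every transform is now a 3x3 matrix acting on homogeneous points, transforms can be chained simply by multiplying their matrices, which is what the inverse and composition rules below rely on.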
Inverse transformations:
T^{-1} = \begin{bmatrix} 1 & 0 & -t_x \\ 0 & 1 & -t_y \\ 0 & 0 & 1 \end{bmatrix}, \qquad
R^{-1} = \begin{bmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}, \qquad
S^{-1} = \begin{bmatrix} 1/s_x & 0 & 0 \\ 0 & 1/s_y & 0 \\ 0 & 0 & 1 \end{bmatrix}
Composite translations:
P' = T(t_{2x}, t_{2y}) \,\{ T(t_{1x}, t_{1y}) \, P \} = \{ T(t_{2x}, t_{2y}) \, T(t_{1x}, t_{1y}) \} \, P
\begin{bmatrix} 1 & 0 & t_{2x} \\ 0 & 1 & t_{2y} \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & t_{1x} \\ 0 & 1 & t_{1y} \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & t_{1x} + t_{2x} \\ 0 & 1 & t_{1y} + t_{2y} \\ 0 & 0 & 1 \end{bmatrix}
Composite rotations:
P' = R(\theta_2) \,\{ R(\theta_1) \, P \} = \{ R(\theta_2) \, R(\theta_1) \} \, P = R(\theta_1 + \theta_2) \, P
Composite scalings:
\begin{bmatrix} s_{2x} & 0 & 0 \\ 0 & s_{2y} & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} s_{1x} & 0 & 0 \\ 0 & s_{1y} & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} s_{1x} s_{2x} & 0 & 0 \\ 0 & s_{1y} s_{2y} & 0 \\ 0 & 0 & 1 \end{bmatrix}, \qquad S(s_{2x}, s_{2y}) \, S(s_{1x}, s_{1y}) = S(s_{1x} s_{2x}, s_{1y} s_{2y})
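A quick numerical check of the rotation composition rule, assuming NumPy (the R helper below is ours):

```python
import numpy as np

def R(theta):
    """3x3 homogeneous rotation matrix about the origin."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

t1, t2 = np.deg2rad(30), np.deg2rad(45)
# Two successive rotations are equivalent to one rotation by the summed angle.
print(np.allclose(R(t2) @ R(t1), R(t1 + t2)))   # True
```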
General 2D Rotation
Rotation by \theta about a pivot (x_r, y_r) is built in three steps: move the pivot to the origin, rotate, move back:
\begin{bmatrix} 1 & 0 & x_r \\ 0 & 1 & y_r \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & -x_r \\ 0 & 1 & -y_r \\ 0 & 0 & 1 \end{bmatrix}
General 2D Scaling
Scaling by (s_x, s_y) about a fixed point (x_f, y_f): move the fixed point to the origin, scale, move back:
\begin{bmatrix} 1 & 0 & x_f \\ 0 & 1 & y_f \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} s_x & 0 & 0 \\ 0 & s_y & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & -x_f \\ 0 & 1 & -y_f \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} s_x & 0 & x_f (1 - s_x) \\ 0 & s_y & y_f (1 - s_y) \\ 0 & 0 & 1 \end{bmatrix}
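The same pivot-based composition in a short Python/NumPy sketch (helper names are ours); note that the pivot itself is left unchanged, as expected:

```python
import numpy as np

def T(tx, ty):
    """3x3 homogeneous translation matrix."""
    return np.array([[1, 0, tx], [0, 1, ty], [0, 0, 1]], dtype=float)

def R(theta):
    """3x3 homogeneous rotation matrix about the origin."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def rotate_about(theta, xr, yr):
    """Rotation by theta about pivot (xr, yr): move pivot to origin, rotate, move back."""
    return T(xr, yr) @ R(theta) @ T(-xr, -yr)

pivot = np.array([2.0, 1.0, 1.0])                 # the pivot in homogeneous form
print(rotate_about(np.pi / 4, 2.0, 1.0) @ pivot)  # -> [2. 1. 1.], pivot is unchanged
```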
Geometric Image Formation
QUALITATIVE RADIOMETRY
• We would like to know how “bright” surfaces are going to be under various
lighting conditions and how this “brightness” depends on local surface
properties, on surface shape, and on illumination.
• A surface patch sees the world through a hemisphere of directions at that
patch.
• First consider an infinite plane under a uniform sky, with an infinitely tall,
infinitely thin black wall that is infinitely long in both directions. In this case,
every point on the plane must see the same hemisphere — half of its viewing sphere
is cut off by the wall, and the other half contains the sky, which is uniform — and
the plane is uniform, so every point must have the same “brightness”.
Now suppose instead that the infinitely thin black wall is infinitely long in only one
direction on the infinite plane (Figure).
A qualitative description would be to find what the curves of equal “brightness” look like.
It is fairly easy to see that all points on any line passing through the point p in Figure see
the same input hemisphere and so must have the same “brightness”. Furthermore, the
distribution of “brightness” on the plane must have a symmetry about the line of the wall—
we expect the brightest points to be along the extension of the line of the wall and the
darkest to be at the base of the wall.
SOURCES AND THEIR EFFECTS
Radiometric Properties of Light Sources
We define a light source to be anything that emits light that is internally
generated (i.e., not just reflected).
To describe a source, we need a description of the radiance it emits in each
direction. Typically, internally generated radiance is dealt with separately from
reflected radiance.
This is because, although a source may reflect light, the light it reflects depends on
the environment, whereas the light it generates internally usually does not.
Exitance is defined as the internally generated energy radiated per unit time
and per unit area on the radiating surface.
Exitance is similar to radiosity, and can be computed by integrating the internally
generated radiance leaving the surface over the exit hemisphere:
E(x) = \int_{\Omega} L_e(x, \theta, \phi) \cos\theta \, d\omega
Point Sources
A common approximation is to assume that the light source is an extremely small
sphere, in fact, a point; such a source is known as a point source. It is a natural
model to use because many sources are physically small compared with the
environment in which they stand.
The radiosity due to the source is obtained by integrating the pattern generated
by the source, times cos θ_i, over the solid angle the source subtends at the patch.
As this solid angle tends to zero, the patch shrinks and cos θ_i is close to constant.
If ρ is the surface albedo, all this means that the radiosity due to the point source
is proportional to ρ cos θ_i / r², where r is the distance from the surface point to the source.
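As an illustration of this model (not from the notes; the function name and the source-strength scaling are our assumptions), a minimal Python sketch:

```python
import numpy as np

def point_source_radiosity(x, n, source_pos, source_strength, albedo):
    """Radiosity at surface point x with unit normal n due to a point source:
    proportional to albedo * cos(theta_i) / r**2, clamped to zero when the
    source is behind the surface."""
    d = np.asarray(source_pos, dtype=float) - np.asarray(x, dtype=float)
    r = np.linalg.norm(d)
    cos_theta_i = max(0.0, float(np.dot(n, d) / r))
    return albedo * source_strength * cos_theta_i / r**2

# A source 2 units directly above a horizontal patch with albedo 0.5:
print(point_source_radiosity([0, 0, 0], [0, 0, 1], [0, 0, 2], 10.0, 0.5))  # 1.25
```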
• Stereopsis (from Ancient Greek stereós 'solid' and ópsis 'appearance, sight') is
the component of depth perception retrieved through binocular vision.
• Binocular vision is a type of vision in which an animal has two eyes capable of
facing the same direction to perceive a single three-dimensional image of its
surroundings.
• Images of the same world scene taken from slightly displaced viewpoints are
called stereo images.
• Stereopsis is not the only contributor to depth perception, but it is a major one.
• Binocular vision works because each eye receives a different image, since the two
eyes sit at slightly different positions on one's head (left and right eyes).
• These positional differences are referred to as "horizontal disparities" or, more
generally, "binocular disparities".
Depth from Stereo
• Goal: recover depth by finding image coordinate x’ that corresponds to x
• Problems
– Calibration: How do we recover the relation of the cameras (if not already
known)?
– Correspondence: How do we search for the matching point x’?
Important Points about the Model
• The cameras are identical.
• The coordinate systems of both cameras are perfectly aligned.
• Once the camera and world coordinate systems are aligned, the xy plane of the
image is aligned with the XY plane of the world coordinate system; then the Z
coordinate of a world point W is the same for both camera coordinate systems.
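The notes state the goal but not the formula; for the parallel, identical-camera setup described above, the standard relation is Z = f·B / d, where B is the baseline between the cameras and d = x - x' is the disparity. A minimal Python sketch (names are ours):

```python
def depth_from_disparity(f, baseline, x_left, x_right):
    """Depth from the standard rectified-stereo relation Z = f * baseline / d,
    with disparity d = x_left - x_right (f and x in pixels, baseline in meters)."""
    d = x_left - x_right
    if d <= 0:
        raise ValueError("disparity must be positive for a point in front of both cameras")
    return f * baseline / d

# f = 700 px, baseline = 0.1 m, disparity = 14 px  ->  Z = 5 m
print(depth_from_disparity(700.0, 0.1, 320.0, 306.0))
```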
We have two images taken from cameras at different positions
• How do we match a point in the first image to a point in the second? What
constraints do we have?
Issues with Template Matching
Epipolar geometry is the geometry of stereo vision. When two cameras view a
3D scene from two distinct positions, there are a number of geometric relations
between the 3D points and their projections onto the 2D images that lead to
constraints between the image points.
Image Rectification
The calculations associated with stereo algorithms are often considerably simplified
when the images of interest have been rectified—that is, replaced by two equivalent
pictures with a common image plane parallel to the baseline joining the two optical
centers.
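As a practical illustration (not described in the text; it assumes OpenCV and that the two cameras' intrinsics K1, K2, distortion coefficients d1, d2, and relative pose R, t are already known from calibration), a sketch of rectifying a stereo pair:

```python
import cv2

def rectify_pair(img1, img2, K1, d1, K2, d2, R, t):
    """Warp two calibrated stereo images onto a common image plane parallel
    to the baseline, so corresponding points lie on the same image row."""
    size = (img1.shape[1], img1.shape[0])  # (width, height)
    R1, R2, P1, P2, Q, roi1, roi2 = cv2.stereoRectify(K1, d1, K2, d2, size, R, t)
    map1x, map1y = cv2.initUndistortRectifyMap(K1, d1, R1, P1, size, cv2.CV_32FC1)
    map2x, map2y = cv2.initUndistortRectifyMap(K2, d2, R2, P2, size, cv2.CV_32FC1)
    rect1 = cv2.remap(img1, map1x, map1y, cv2.INTER_LINEAR)
    rect2 = cv2.remap(img2, map2x, map2y, cv2.INTER_LINEAR)
    return rect1, rect2
```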
Image Digitization
• Acquisition is done by a physical device that is sensitive to the light from the
object we wish to image.
In a digital camera, the sensors produce an electrical output proportional to
light intensity.
• Reflected responses from all the spatial positions of the object are caught by a
sensor, such as a CCD or a Vidicon camera, and transformed into equivalent analog
electrical signals by a photoelectric detector in the imaging system.
Digital Image Representation
Quantization
• Quantization = spacing of discrete values in the range of a signal.
• Usually thought of as the number of bits per sample of the signal, e.g., 1 bit per
pixel (b/w images), 16-bit audio, 24-bit color images, etc.
• The digital samples resulting from both sampling and quantization produce
the 2D digital image as a matrix of numbers.
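A small NumPy sketch (our own helper) showing requantization of an 8-bit grayscale image to fewer bits per pixel:

```python
import numpy as np

def quantize(image, bits):
    """Requantize an 8-bit grayscale image to `bits` bits per pixel by mapping
    each pixel to the lower bound of its quantization bin."""
    step = 256 // (2 ** bits)
    return ((image // step) * step).astype(np.uint8)

img = np.arange(256, dtype=np.uint8).reshape(16, 16)   # toy gradient image
print(np.unique(quantize(img, 2)))                      # -> [  0  64 128 192]
```

With 2 bits per pixel only four gray levels remain, which is exactly the regime where false contouring (illustrated later in this unit) becomes visible.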
Figure: image acquisition pipeline, from the light source through the optical sensor and photoelectric detector (producing analog electrical signals) to the sampler and quantizer.
• The amount of light incident on the scene: illumination, i(x, y), where 0 < i(x, y) < ∞,
determined by the nature of the light source.
• The amount of light reflected by the objects in the scene: reflectance, r(x, y), where
0 < r(x, y) < 1, determined by the nature of the objects in the scene and bounded
between total absorption and total reflectance.
• The image function is their product:
f(x, y) = i(x, y) r(x, y)
Resolution
• Resolution (how much detail you can see in the image) depends on sampling and on
the number of gray levels.
• The higher the sampling rate (n) and the number of gray levels (g), the better the
digitized image approximates the original.
• However, the higher the sampling rate, the larger the size of the digitized image.
Checkerboard effect
Figure: the same image sampled at (a) 1024x1024, (b) 512x512, (c) 256x256, (d) 128x128, (e) 64x64, and (f) 32x32 pixels.
If the resolution is decreased too much, the checkerboard effect can occur.
False contouring
Figure: the same image quantized to (a) 16, (b) 8, (c) 4, and (d) 2 gray levels.