
Computer Vision

Dr. N Lakshmipathi Anantha, Ph.D.,


Assistant Professor,
Department of Computer Science & Engineering,
Gitam (Deemed to be University)
Module Contents
Module I: Camera Models. Cameras: Pinhole Cameras. Radiometry – Measuring Light: Light in Space, Light at Surfaces, Sources. Shadows and Shading: Qualitative Radiometry, Sources and Their Effects. Image Formation, Orthographic & Perspective Projection. Depth Estimation and Multi-Camera Views, Perspective, Binocular Stereopsis: Camera and Epipolar Geometry, Homography, Rectification, 3-D Reconstruction Framework; Auto-calibration.

Learning Outcomes:
After completion of this unit, the student will be able to:
1. Understand the working of the camera model and radiometry (L1)
2. Understand image formation (L1)
3. Apply transformations to images (L3)
4. Understand 3D image formation from multiple camera perspectives (L1)
Pinhole cameras

• This camera system can be designed by placing a barrier with a small aperture
between the 3D object and a photographic film or sensor.
• As Figure 1 shows, each point on the 3D object emits multiple rays of light
outwards.
• Without a barrier in place, every point on the film will be influenced by light rays
emitted from every point on the 3D object.
• Due to the barrier, only one (or a few) of these rays of light passes through the
aperture and hits the film.
• Therefore, we can establish a one-to-one mapping between points on the 3D object
and the film. The result is that the film gets exposed by an “image” of the 3D object.
Pinhole cameras

• In this second pinhole camera model, the film is commonly called the image plane or
retinal plane.
• The aperture is referred to as the pinhole O or center of the camera.
• The distance between the image plane and the pinhole O is the focal length f.
• Sometimes, the retinal plane is placed between O and the 3D object at a distance
f from O.
• In this case, it is called the virtual image or virtual retinal plane.
• Note that the projection of the object in the image plane and the image of the
object in the virtual image plane are identical up to a scale (similarity)
transformation.
Now, how do we use pinhole cameras? Let P = [x y z]^T be a point on some 3D object
visible to the pinhole camera. P will be mapped or projected onto the image plane
Π’, resulting in the point P’ = [x’ y’]^T. Similarly, the pinhole itself can be projected onto
the image plane, giving a new point C’.

Recall that point P’ is derived from the projection of the 3D point P onto the image plane Π’.
Therefore, if we derive the relationship between the 3D point P and the image-plane point P’,
we can understand how the 3D world imprints itself upon the image taken by a
pinhole camera. Notice that triangle P’C’O is similar to the triangle formed by P, O,
and (0, 0, z). Therefore, using the law of similar triangles we find that:
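A minimal statement of the result, assuming the virtual-image-plane convention (so there is no sign flip):

$x' = f\,\dfrac{x}{z}, \qquad y' = f\,\dfrac{y}{z}$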
As the aperture size increases, the number of light rays that pass through the
barrier increases. With more light rays passing through, each
point on the film may be affected by light rays from multiple points in 3D space,
blurring the image. Although we may be inclined to make the aperture as
small as possible, recall that a smaller aperture lets fewer light rays pass
through, resulting in crisper but darker images.
The Pinhole Perspective projection equations

Let P denote a scene point with coordinates (X, Y, Z) and p denote its image, with
coordinates (x, y, z). (Throughout this chapter, we use uppercase letters to
denote points in space, and lowercase letters to denote their image projections.)
Since p lies in the image plane, we have z = d. Since the three points P, O, and p are
collinear, we have $\vec{Op} = \lambda\,\vec{OP}$ for some number λ.
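Writing this out componentwise gives x = λX, y = λY, and d = λZ, so λ = d/Z and the perspective projection equations are x = d·X/Z and y = d·Y/Z. A minimal NumPy sketch of this mapping (the function name and sample values are illustrative, not from the slides):

```python
import numpy as np

def pinhole_project(P, d):
    """Perspective projection of a scene point P = (X, Y, Z)
    onto an image plane at distance d from the pinhole O."""
    X, Y, Z = P
    lam = d / Z                           # since d = lambda * Z
    return np.array([lam * X, lam * Y])   # (x, y) = (d*X/Z, d*Y/Z)

# Example: a point 4 units in front of the camera, image plane at d = 1
print(pinhole_project(np.array([2.0, 1.0, 4.0]), d=1.0))  # -> [0.5  0.25]
```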
Radiometry – Measuring Light: Light in Space
The measurement of light is a field in itself, known as radiometry. We need a series of
units that describe how energy is transferred from light sources to surface patches,
and what happens to the energy when it arrives at a surface. The distribution of light
in space is a function of position and direction.
Radiometry is the part of image formation concerned with the relation among the
amounts of light energy emitted from light sources, reflected from surfaces, and
registered by sensors.
The interaction between light and
matter can take many forms:
● Reflection
● Refraction
● Diffraction
● Absorption
● Scattering
● Emission
Light at surfaces
When light strikes a surface, it may be
absorbed
transmitted (skin)
scattered (milk)
reflected (mirror)
Usually a combination of these effects occurs.
e.g., light arriving at skin can be
scattered at various depths into tissue and reflected from blood or from melanin,
absorbed, or
scattered tangentially to the skin within a film of oil and then escape at some distant point.
Reflection is the change in direction of a
wavefront at an interface between two
different media so that the wavefront returns
into the medium from which it originated.
Common examples include the reflection of
light, sound and water waves
Refraction is the bending of light (it also
happens with sound, water and other waves)
as it passes from one transparent substance
into another. This bending by refraction
makes it possible for us to have lenses,
magnifying glasses, prisms and rainbows.
Even our eyes depend upon this bending of
light.
Diffraction is the spreading out of waves as
they pass through an aperture or around
objects. It occurs when the size of the
aperture or obstacle is of the same order of
magnitude as the wavelength of the incident
wave.
Factors in Image Formation
■ Geometry
● concerned with the relationship between points in the three-dimensional world and their images
■ Radiometry
● concerned with the relationship between the amount of light radiating from
a surface and the amount incident at its image
■ Photometry
● concerned with ways of measuring the intensity of light
■ Digitization
● concerned with ways of converting continuous signals (in both space and time) to digital approximations
Brightness: informal notion used to describe both scene and image brightness.
■ Image brightness: related to energy flux incident on the image plane:
IRRADIANCE
■ Scene brightness: brightness related to energy flux emitted (radiated) from a
surface.
RADIANCE
Factors in Image Formation
Radiometry – measuring light
• Relationship between light source, surface
geometry, surface properties, and receiving end
(camera)
• Inferring shape from surface reflectance
– Photometric stereo
– Shape from shading
Light at surfaces
• Many effects when light strikes a surface -- could be:
– absorbed
– transmitted (e.g., skin)
– reflected (e.g., mirror)
– scattered (e.g., milk)
– travel along the surface and leave at some other point (e.g., sweaty skin)
• Fluorescence: some surfaces absorb light at one wavelength and radiate light at a different wavelength
• Assume that
– all the light leaving a point is due to that arriving at that point
– surfaces don’t fluoresce (light leaving a surface at a given wavelength is due to light arriving at that wavelength)
– surfaces don’t emit light (i.e., are cool)
Radiometry is the detection and measurement of light waves in the optical portion
of the electromagnetic spectrum which is further divided into ultraviolet, visible, and
infrared light.

All light measurement is considered radiometry, with photometry being a special
subset of radiometry weighted for a typical human eye response.
Radiant Intensity (I): the amount of flux emitted through a known solid
angle. It is measured in watts/steradian. To measure radiant intensity, start with the solid
angle subtended by the detector at a given distance from the source (see Figure 4).
Then divide the amount of flux by that solid angle. Radiant intensity is a property of the
light source and may not be relevant if the spatial distribution of radiation from the
source is nonuniform. It is appropriate for point sources (and close approximations,
such as LED intensity measurements), but not for collimated sources.
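In standard radiometric notation (the symbol is added here for reference, not taken from the slide), radiant intensity is the flux emitted per unit solid angle:

$I = \dfrac{d\Phi}{d\Omega} \qquad [\mathrm{W/sr}]$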
BRDF
• Bidirectional Reflectance Distribution
Function (BRDF): the most general
model of local reflection
• Bidirectional:
– illumination and viewing directions
A surface illuminated by radiance $L_i(\theta_i, \phi_i)$ coming in from a region of solid angle $d\omega$ at angle $(\theta_i, \phi_i)$ emits radiance $L_o(\theta_o, \phi_o)$.
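For reference, the BRDF is usually defined as the ratio of outgoing radiance to incoming irradiance (a standard form; the notation here is assumed, not taken from the slide):

$\rho_{bd}(\theta_o, \phi_o, \theta_i, \phi_i) = \dfrac{L_o(\theta_o, \phi_o)}{L_i(\theta_i, \phi_i)\,\cos\theta_i\, d\omega}$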
Geometric Image Formation

Geometric Transformations
2D Translation

$x' = x + t_x, \qquad y' = y + t_y$

$P = \begin{bmatrix} x \\ y \end{bmatrix}, \quad P' = \begin{bmatrix} x' \\ y' \end{bmatrix}, \quad T = \begin{bmatrix} t_x \\ t_y \end{bmatrix}, \qquad P' = P + T$
2D Rotation

Rotation by angle $\theta$ about a pivot (rotation) point $(x_r, y_r)$:

$x' = x_r + (x - x_r)\cos\theta - (y - y_r)\sin\theta$
$y' = y_r + (x - x_r)\sin\theta + (y - y_r)\cos\theta$

$P' = P_r + R\,(P - P_r), \qquad R = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$
2D Scaling

$x' = x\, s_x, \qquad y' = y\, s_y$

$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} s_x & 0 \\ 0 & s_y \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}, \qquad P' = S\,P$

Scaling about a fixed point $(x_f, y_f)$:

$x' = x\, s_x + x_f (1 - s_x)$
$y' = y\, s_y + y_f (1 - s_y)$

$P' = P\,S + P_f\,(1 - S)$
Homogeneous Coordinates

Rotate and then displace a point $P$: $P' = M_1 P + M_2$

$M_1$: $2 \times 2$ rotation matrix. $M_2$: $2 \times 1$ displacement vector.

Displacement is unfortunately a non-linear operation.

Make displacement linear with homogeneous coordinates:

$(x, y) \rightarrow (x, y, 1)$. Transformations turn into $3 \times 3$ matrices.

Very big advantage: all transformations are concatenated by matrix multiplication.

2D Translation:
$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}, \qquad P' = T(t_x, t_y)\,P$

2D Rotation:
$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}, \qquad P' = R(\theta)\,P$

2D Scaling:
$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} s_x & 0 & 0 \\ 0 & s_y & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}, \qquad P' = S(s_x, s_y)\,P$
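Because every transformation is now a 3 × 3 matrix, chaining them is just matrix multiplication. A minimal NumPy sketch of the three matrices above (the helper names and test point are my own, for illustration):

```python
import numpy as np

def T(tx, ty):            # 2D translation in homogeneous coordinates
    return np.array([[1, 0, tx], [0, 1, ty], [0, 0, 1]], dtype=float)

def R(theta):             # 2D rotation about the origin
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def S(sx, sy):            # 2D scaling about the origin
    return np.array([[sx, 0, 0], [0, sy, 0], [0, 0, 1]], dtype=float)

p = np.array([2.0, 1.0, 1.0])       # point (2, 1) in homogeneous coordinates
p2 = T(5, -3) @ R(np.pi / 2) @ p    # rotate 90 degrees, then translate
print(p2)                           # -> approximately [ 4. -1.  1.]
```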
Inverse transformations:

$T^{-1} = \begin{bmatrix} 1 & 0 & -t_x \\ 0 & 1 & -t_y \\ 0 & 0 & 1 \end{bmatrix}, \quad R^{-1} = \begin{bmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad S^{-1} = \begin{bmatrix} 1/s_x & 0 & 0 \\ 0 & 1/s_y & 0 \\ 0 & 0 & 1 \end{bmatrix}$

Composite transformations: $P' = M_2\, M_1\, P = (M_2 M_1)\,P = M\,P$

Composite translations:

$P' = T(t_{2x}, t_{2y})\,\{T(t_{1x}, t_{1y})\,P\} = \{T(t_{2x}, t_{2y})\,T(t_{1x}, t_{1y})\}\,P$

$\begin{bmatrix} 1 & 0 & t_{2x} \\ 0 & 1 & t_{2y} \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & t_{1x} \\ 0 & 1 & t_{1y} \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & t_{1x} + t_{2x} \\ 0 & 1 & t_{1y} + t_{2y} \\ 0 & 0 & 1 \end{bmatrix}$

$T(t_{2x}, t_{2y})\,T(t_{1x}, t_{1y}) = T(t_{1x} + t_{2x},\ t_{1y} + t_{2y})$

Composite rotations:

$P' = R(\theta_2)\,\{R(\theta_1)\,P\} = \{R(\theta_2)\,R(\theta_1)\}\,P$

$R(\theta_2)\,R(\theta_1) = R(\theta_1 + \theta_2)$

$P' = R(\theta_1 + \theta_2)\,P$

Composite scaling:

$\begin{bmatrix} s_{2x} & 0 & 0 \\ 0 & s_{2y} & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} s_{1x} & 0 & 0 \\ 0 & s_{1y} & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} s_{1x} s_{2x} & 0 & 0 \\ 0 & s_{1y} s_{2y} & 0 \\ 0 & 0 & 1 \end{bmatrix}$

$S(s_{2x}, s_{2y})\,S(s_{1x}, s_{1y}) = S(s_{1x} s_{2x},\ s_{1y} s_{2y})$
General 2D Rotation

Rotation about an arbitrary pivot $(x_r, y_r)$: move the pivot to the origin, rotate, move back.

$\begin{bmatrix} 1 & 0 & x_r \\ 0 & 1 & y_r \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & -x_r \\ 0 & 1 & -y_r \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta & x_r(1 - \cos\theta) + y_r\sin\theta \\ \sin\theta & \cos\theta & y_r(1 - \cos\theta) - x_r\sin\theta \\ 0 & 0 & 1 \end{bmatrix}$
General 2D Scaling

Scaling about an arbitrary fixed point $(x_f, y_f)$: move the fixed point to the origin, scale, move back.

$\begin{bmatrix} 1 & 0 & x_f \\ 0 & 1 & y_f \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} s_x & 0 & 0 \\ 0 & s_y & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & -x_f \\ 0 & 1 & -y_f \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} s_x & 0 & x_f(1 - s_x) \\ 0 & s_y & y_f(1 - s_y) \\ 0 & 0 & 1 \end{bmatrix}$
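Putting the pieces together, a transformation about an arbitrary pivot is the composite T(pivot) · M · T(-pivot). A short NumPy check for rotation about a pivot (helper names and values are illustrative, reusing T and R as sketched earlier):

```python
import numpy as np

def T(tx, ty):
    return np.array([[1, 0, tx], [0, 1, ty], [0, 0, 1]], dtype=float)

def R(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

xr, yr = 3.0, 2.0                        # pivot point
M = T(xr, yr) @ R(np.pi) @ T(-xr, -yr)   # move to origin, rotate 180 deg, move back

p = np.array([4.0, 2.0, 1.0])            # point (4, 2)
print(M @ p)                             # -> approximately [2. 2. 1.]
```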
Geometric Image Formation
QUALITATIVE RADIOMETRY

• We would like to know how “bright” surfaces are going to be under various
lighting conditions and how this “brightness” depends on local surface
properties, on surface shape, and on illumination.
• A surface patch sees the world through a hemisphere of directions at that
patch.

•The radiation arriving at the surface along a particular


direction passes through a point in the hemisphere.
•If two surface patches have equivalent incoming
hemispheres, they must have the same incoming
radiation, whatever the outside world looks like.
• This means that any difference in “brightness” between patches with the same
incoming hemisphere is a result of different surface properties.
• Lambert determined the distribution of “brightness” on a uniform plane at the
base of an infinitely high black wall illuminated by an overcast sky.

•In this case, every point on the plane must see the
same hemisphere — half of its viewing sphere is cut
off by the wall, and the other half contains the sky,
which is uniform—and the plane is uniform, so every
point must have the same “brightness”.
We now have an infinitely thin black wall that is infinitely long in only one direction and on
an infinite plane (Figure).
A qualitative description would be to find what the curves of equal “brightness” look like.
It is fairly easy to see that all points on any line passing through the point p in Figure see
the same input hemisphere and so must have the same “brightness”. Furthermore, the
distribution of “brightness” on the plane must have a symmetry about the line of the wall—
we expect the brightest points to be along the extension of the line of the wall and the
darkest to be at the base of the wall.
SOURCES AND THEIR EFFECTS
Radiometric Properties of Light Sources
We define a light source to be anything that emits light that is internally
generated (i.e., not just reflected).
To describe a source, we need a description of the radiance it emits in each
direction. Typically, internally generated radiance is dealt with separately from
reflected radiance.
This is because, although a source may reflect light, the light it reflects depends on
the environment, whereas the light it generates internally usually does not.

Exitance is defined as the internally generated energy radiated per unit time
and per unit area of the radiating surface.
Exitance is similar to radiosity, and can be computed as
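A standard form (the symbols here follow common radiometry texts and are assumptions, not from the slide) integrates the internally generated radiance $L_e$ over the exit hemisphere $\Omega$:

$E(x) = \int_{\Omega} L_e(x, \theta, \phi)\,\cos\theta\; d\omega$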
Point Sources
A common approximation is to assume that the light source is an extremely small
sphere, in fact, a point; such a source is known as a point source. It is a natural
model to use because many sources are physically small compared with the
environment in which they stand.
The radiosity due to the source is obtained by integrating the pattern generated
by the source, times cos θi, over the patch of solid angle. As the source radius tends to zero, the
patch shrinks and cos θi is close to constant. If ρ is the surface albedo, the
expression for the radiosity due to the point source is

where E is a term in the exitance of the source, integrated over the small patch.
Line Sources
A line source has the geometry of a line—a
good example is a single fluorescent light
bulb. Their main interest is as an example
for radiometric problems; in particular, the
radiosity of patches reasonably close to a
line source changes as the reciprocal of
distance to the source (rather than the
square of the distance). We model a line
source as a thin cylinder of small diameter.
Assume for the moment that the line
source is infinitely long and that we are
considering a patch that views the source
frontally.
Area Sources
An area source is an area that radiates light. Area
sources are important for two reasons. First, they
occur quite commonly in natural scenes—an
overcast sky is a good example—and in syn thetic
environments—for example, the fluorescent light
boxes found in many industrial ceilings. Second, a
study of area sources allows us to explain various
shadowing and interreflection effects. Area
sources are normally modeled as surface patches
whose emitted radiance is independent of
position and of direction—they can be described
by their exitance.
LOCAL SHADING MODELS
The Appearance of Shadows. In a local shading model, shadows occur when the
patch cannot see one or more sources.
In this model, point sources produce a series of shadows with crisp boundaries;
shadow regions where no source can be seen are particularly dark. Shadows cast
with a single source can be crisp and black, depending on the size of the source and
the albedo of other nearby surfaces.
Any patch on the plane is in shadow if a ray from the patch to the source passes
through an object. This means that there are two kinds of shadow boundary. At self
shadow boundaries, the surface is turning away from the light, and a ray from the
patch to the source is tangent to the surface. At cast shadow boundaries, from the
perspective of the patch, the source suddenly disappears behind an occluding
object. Shadows cast onto curved surfaces can have extremely complex
geometries.
Area sources generate complex shadows with smooth boundaries, because from the
point of view of a surface patch, the source disappears slowly behind the occluder.
Regions where the source cannot be seen at all are known as the umbra; regions where
some portion of the source is visible are known as the penumbra. A good model is to
imagine lying with your back to the surface looking at the world above. At point 1, you
can see all of the source; at point 2, you can see some of it; and at point 3, you can see
none of it.
Shading models
Local shading model
• Surface has incident radiance due only to sources visible at each point
• Advantages:
– often easy to manipulate, expressions easy
– supports quite simple theories of how shape information can be extracted from shading
• Used in vision & real-time graphics
Global shading model
• Surface radiosity is due to radiance reflected from other surfaces as well as from sources
• Advantages:
– usually very accurate
• Disadvantage:
– extremely difficult to infer anything from shading values
• Rarely used in vision, often in photorealistic graphics
Binocular Stereopsis:

• Stereopsis (from Ancient Greek stereós 'solid' and ópsis 'appearance, sight') is
the component of depth perception retrieved through binocular vision.
• Binocular vision is a type of vision in which an animal has two eyes capable of
facing the same direction to perceive a single three-dimensional image of its
surroundings.
• Images of the same world scene taken from slightly displaced viewpoints are
called stereo images.
• Stereopsis is not the only contributor to depth perception, but it is a major one.
• Binocular vision happens because each eye receives a different image, since the two
eyes are in slightly different positions on one's head (left and right eyes).
• These positional differences are referred to as "horizontal disparities" or, more
generally, "binocular disparities".
Depth from Stereo
• Goal: recover depth by finding image coordinate x’ that corresponds to x
• Problems
– Calibration: How do we recover the relation of the cameras (if not already
known)?
– Correspondence: How do we search for the matching point x’?
Images of the same world scene taken from slightly displaced viewpoints are
called stereo images.
Important Points about the Model
• The cameras are identical.
• The coordinate systems of both cameras are perfectly aligned.
• Once the camera and world coordinate systems are aligned, the xy-plane of the
image is aligned with the XY-plane of the world coordinate system; the Z
coordinate of a world point W is then the same for both camera coordinate systems
(a depth-from-disparity sketch for this setup follows below).
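With this idealized, rectified setup the depth of a matched point follows from its disparity: assuming focal length f and baseline b (symbols not in the slide), Z = f * b / (x_left - x_right). A minimal sketch with illustrative numbers:

```python
def depth_from_disparity(x_left, x_right, focal_length, baseline):
    """Recover depth Z for a matched point, assuming rectified,
    axis-aligned cameras: Z = f * b / (x_left - x_right)."""
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of both cameras")
    return focal_length * baseline / disparity

# Example: f = 700 px, baseline = 0.1 m, disparity = 35 px  ->  Z = 2.0 m
print(depth_from_disparity(400, 365, focal_length=700, baseline=0.1))
```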
We have two images taken from cameras at different positions
• How do we match a point in the first image to a point in the second? What
constraints do we have?
Issues with Template Matching
Epipolar geometry is the geometry of stereo vision. When two cameras view a
3D scene from two distinct positions, there are a number of geometric relations
between the 3D points and their projections onto the 2D images that lead to
constraints between the image points.
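These constraints are commonly summarized by the epipolar constraint (a standard result, stated here for reference): for corresponding homogeneous image points $p$ and $p'$ and the fundamental matrix $F$,

$p'^{\top} F\, p = 0,$

and $F p$ gives the epipolar line in the second image on which the match $p'$ must lie.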
Image Rectification

The calculations associated with stereo algorithms are often considerably simplified
when the images of interest have been rectified—that is, replaced by two equivalent
pictures with a common image plane parallel to the baseline joining the two optical
centers.
Image Digitization

• Image acquisition is the process of obtaining a natural image.

• Acquisition is done by a physical device that is sensitive to the light from the
object we wish to image.
In a digital camera, the sensors produce an electrical output proportional to the
light intensity.

• The object of interest to be captured and processed is illuminated by white light


or infrared or ultraviolet or X-ray.

• Reflected responses from all the spatial positions of the object are caught by a
sensor, a CCD or a Vidicon camera and transformed into the equivalent analog
electrical signals by a photoelectric detector in the imaging system.
Digital Image Representation

• A digital image is an image f(x, y) that has been digitized both in spatial coordinates and in brightness.
• The value of f at any point (x, y) is proportional to the brightness (or gray level) of the image at that point.
• A digital image can be considered a matrix whose row and column indices identify a point in the image and whose corresponding matrix element value identifies the gray level at that point.
Image acquisition [10]
• The Image Digitization is performed by a device for converting the analog output
of the physical sensing device into digital form.
A digitizer has two parts: Sampler and Quantizer.
• The process of digitizing the spatial coordinates of the image is sampling.
• The process of digitizing the amplitude value of the image is quantization.

Digitization example [10]


Image Digitization
Sampling
• sampling = the spacing of discrete values in the domain of a signal.
• sampling-rate = how many samples are taken per unit of each
dimension. e.g., samples per second, frames per second, etc.

Quantization
•Quantization = spacing of discrete values in the range of a signal.
•usually thought of as the number of bits per sample of the signal. e.g., 1 bit per
pixel (b/w images), 16-bit audio, 24-bit color images, etc.
• The digital samples resulting from both sampling and quantization produce
the 2D digital image as a matrix of real numbers.
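A tiny NumPy sketch of the two steps: sampling evaluates a continuous image on a discrete grid, and quantization maps each sample to one of 2^k integer gray levels (the toy continuous image, grid size, and k are arbitrary choices for illustration):

```python
import numpy as np

def sample(analog_image, rows, cols):
    """Sampling: evaluate a continuous image f(x, y) on a discrete grid."""
    ys = np.linspace(0.0, 1.0, rows)
    xs = np.linspace(0.0, 1.0, cols)
    return np.array([[analog_image(x, y) for x in xs] for y in ys])

def quantize(samples, k):
    """Quantization: map samples in [0, 1] to 2**k integer gray levels."""
    levels = 2 ** k
    return np.clip((samples * (levels - 1)).round(), 0, levels - 1).astype(np.uint8)

f = lambda x, y: 0.5 * (np.sin(6 * x) + 1) * y   # a made-up continuous "scene"
digital = quantize(sample(f, rows=64, cols=64), k=8)
print(digital.shape, digital.dtype, digital.max())   # (64, 64) uint8, values <= 255
```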

Digital image representation [10]


Formation of Digital Image

[Block diagram of digital image formation: light source → light reflected from the object (image, f) → optical sensor & photoelectric detector → analog electrical signals → sampler & quantizer → digital image.]


Digital Image Representation (cont.)

[Figure: camera and digitizer producing a set of numbers in a 2D grid; the digitizer samples the analog data and digitizes it; pixel values shown for a highlighted region.]

Example of Digital Image

[Figure: continuous image projected onto a sensor array; result of image sampling and quantization.]
Light-intensity function

• image refers to a 2D light-intensity function, f(x,y)

• the amplitude of f at spatial coordinates (x,y) gives the intensity (brightness)


of the image at that point.

• light is a form of energy, thus f(x, y) must be nonzero and finite:

$0 < f(x, y) < \infty$
Illumination and Reflectance

• the basic nature of f(x, y) may be characterized by 2 components:

• the amount of source light incident on the scene being viewed → illumination, i(x, y)

• the amount of light reflected by the objects in the scene → reflectance, r(x, y)

$f(x, y) = i(x, y)\, r(x, y)$
$0 < i(x, y) < \infty$ (determined by the nature of the light source)
$0 < r(x, y) < 1$ (determined by the nature of the objects in a scene: bounded from total
absorption to total reflectance)
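A quick worked example with made-up numbers: if the illumination at a point is $i(x, y) = 100$ and the surface reflects 20% of the incident light, then

$f(x, y) = i(x, y)\, r(x, y) = 100 \times 0.2 = 20.$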
Gray level

• We call the intensity of a monochrome image f at coordinates (x, y) the gray level l of the image at that point.

• Thus l lies in the range $L_{min} \le l \le L_{max}$.

• $L_{min}$ is positive and $L_{max}$ is finite.

• Gray scale = $[L_{min}, L_{max}]$

• Common practice: shift the interval to $[0, L]$, where 0 = black and L = white.
Number of bits
• The number of gray levels is typically an integer power of 2: $L = 2^k$
• Number of bits required to store a digitized $M \times N$ image: $b = M \times N \times k$
Resolution
• Resolution (how much detail you can see in the image) depends on the sampling rate
and the number of gray levels.

• The higher the sampling rate (n) and the gray scale (g), the better the approximation
of the digitized image to the original.

• The higher the sampling rate, the larger the size of the digitized image.
Checkerboard effect

(a) 1024x1024
(b) 512x512
(c) 256x256
(d) 128x128
(e) 64x64
(f) 32x32
if the resolution is decreased too much, the
checkerboard effect can occur.

False contouring
(a) Gray level = 16
(b) Gray level = 8
(c) Gray level = 4
(d) Gray level = 2

If the gray scale is not fine enough, smooth areas will be affected:
false contouring can occur in smooth areas that have fine gray-level gradations.
Questions?
Thank You
