Image Acquisition
The first stage of any vision system is the image acquisition stage.
After the image has been obtained, various methods of processing can be applied to the
image to perform the many different vision tasks required today.
However, if the image has not been acquired satisfactorily then the intended tasks may not be achievable, even with the aid of some form of image enhancement.
2D Image Input
The basic two-dimensional image is a monochrome (greyscale) image which has been
digitised.
We describe an image as a two-dimensional light intensity function f(x, y), where x and y are spatial coordinates and the value of f at any point (x, y) is proportional to the brightness or grey value of the image at that point.
A digitised image is one where the spatial coordinates x, y and the brightness f(x, y) have all been made discrete: the image is sampled at a finite grid of points, and the brightness at each point is quantised to one of a finite set of grey levels.
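To make this concrete, here is a minimal sketch (in Python with NumPy; the grey values are invented for illustration) of a digitised image as a sampled, quantised array:

```python
import numpy as np

# A hypothetical 4x4 digitised greyscale image: f(x, y) sampled on a
# uniform grid and quantised to 8-bit grey levels (0 = black, 255 = white).
f = np.array([
    [ 12,  30,  45,  50],
    [ 28,  60,  90, 100],
    [ 40,  95, 180, 200],
    [ 55, 110, 210, 255],
], dtype=np.uint8)

print(f.shape)   # spatial sampling: a 4 x 4 grid of pixels
print(f[2, 1])   # quantised brightness at row 2, column 1 -> 95
```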
2D Input Devices
TV Camera or Vidicon Tube
A first choice for a two-dimensional image input device may be a television camera -- its output is a video signal.
By far the most popular two-dimensional imaging device is the charge-coupled device
(CCD) camera.
Single IC device.
Consists of an array of photosensitive cells; each cell produces an electric current dependent on the incident light falling on it.
Video signal output.
Less geometric distortion than a vidicon tube.
More linear video output.
Frame Stores
The video signal must be digitised. A device known as a frame store or frame grabber usually performs this task.
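On modern hardware the frame grabber's role is typically played by a capture card or camera driver. As an illustrative sketch only (assuming OpenCV is installed and a camera is present at device index 0), grabbing and digitising a single frame looks like:

```python
import cv2

cap = cv2.VideoCapture(0)   # open the default camera (index 0 is an assumption)
ok, frame = cap.read()      # grab one frame, already digitised into a pixel array
if ok:
    grey = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)  # monochrome (greyscale) array
    print(grey.shape, grey.dtype)                   # e.g. (480, 640) uint8
cap.release()
```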
3D Imaging
We will consider a set-up using two cameras in stereo -- other methods that involve stereo are similar.
Let's consider a simplified optical set-up: two cameras with parallel optical axes, their lens centres separated by a distance d, both viewing a scene point P, which projects to image coordinates x_l and x_r in the left and right images respectively.
Let f be the focal length of both cameras, the perpendicular distance between the lens centre and the image plane. Then by similar triangles the depth Z of P satisfies

$$ Z = \frac{f\,d}{x_l - x_r}. $$

The quantity x_l - x_r is known as the disparity.
Near objects can be ranged accurately, but far-away objects cannot: normally d and f are fixed, and distance is inversely proportional to disparity, so distant points yield disparities too small to measure.
Disparity can only be measured in pixel differences.
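A minimal sketch of the depth calculation in Python, assuming the focal length has been expressed in pixels so the disparity can be used directly as a pixel difference (all values are illustrative):

```python
def depth_from_disparity(f_pixels: float, d: float, disparity_pixels: float) -> float:
    """Depth Z = f * d / (x_l - x_r) for a parallel-axis stereo pair.
    f_pixels: focal length in pixels; d: camera separation in metres;
    disparity_pixels: measured disparity in pixels."""
    if disparity_pixels <= 0:
        raise ValueError("zero disparity corresponds to a point at infinity")
    return f_pixels * d / disparity_pixels

# Example: f = 800 pixels, d = 0.1 m, disparity = 4 pixels -> Z = 20 m
print(depth_from_disparity(800.0, 0.1, 4.0))
```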
Disparity is proportional to the camera separation d. This implies that if we have a
fixed error in determining the disparity then the accuracy of depth determination
will increase with d.
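To see why, differentiate $Z = fd/(x_l - x_r)$ with respect to the disparity: a disparity error $\Delta$ produces a depth error

$$ \Delta Z \approx -\frac{fd}{(x_l - x_r)^2}\,\Delta = -\frac{Z^2}{fd}\,\Delta, $$

so for a fixed disparity error the depth error shrinks as d (or f) grows, and grows quadratically with depth.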
However, as the camera separation becomes large, difficulties arise in correlating the two camera images.
In order to measure the depth of a point it must be visible to both cameras and we must
also be able to identify this point in both images.
As the camera separation increases so do the differences in the scene as recorded by each
camera.
Thus it becomes increasingly difficult to match corresponding points in the images.
This problem is known as the stereo correspondence problem.
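As a toy illustration of why this is hard, here is a brute-force sum-of-squared-differences (SSD) block match along a single scanline; the window size and disparity range are arbitrary assumptions, and real systems add many constraints to tame the ambiguity:

```python
import numpy as np

def match_along_scanline(left, right, row, x, half=2, max_disp=20):
    """Estimate the disparity of pixel (row, x) in the left image by sliding
    a (2*half+1) x (2*half+1) window along the same row of the right image
    and keeping the shift with the smallest sum of squared differences."""
    patch = left[row-half:row+half+1, x-half:x+half+1].astype(float)
    best_d, best_ssd = 0, np.inf
    for d in range(max_disp + 1):
        if x - d - half < 0:              # window would fall off the image
            break
        cand = right[row-half:row+half+1, x-d-half:x-d+half+1].astype(float)
        ssd = ((patch - cand) ** 2).sum()
        if ssd < best_ssd:
            best_d, best_ssd = d, ssd
    return best_d
```

As the separation d grows, the disparity search range must grow with it and the two windows see increasingly different views of the scene, so matches become ambiguous.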
Methods of Acquisition
Laser Ranging Systems
Laser ranging works on the principle that the surface of the object reflects laser light back
towards a receiver which then measures the time (or phase difference) between
transmission and reception in order to calculate the depth.
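For the time-of-flight case the calculation is direct: a pulse returning after time $t$ has travelled to the surface and back at the speed of light $c$, so the depth is

$$ z = \frac{c\,t}{2}. $$

A timing error of 1 ns therefore corresponds to roughly 15 cm of depth, which is why phase-difference measurement is often preferred where higher precision is needed.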
Most laser rangefinders:
Project patterns of light (grids, stripes, elliptical patterns etc.) onto an object.
Deduce surface shapes from the distortions of the patterns produced on the object's surface.
Infer depth by triangulation, knowing the relevant camera and projector geometry.
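As an illustration of the triangulation step for a single projected spot: if the camera and projector lens centres lie a baseline $b$ apart, and the rays to the spot make angles $\alpha$ and $\beta$ with the baseline, then the horizontal offsets $Z/\tan\alpha$ and $Z/\tan\beta$ of the two rays must sum to $b$, giving

$$ Z = \frac{b}{\cot\alpha + \cot\beta}. $$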
Many methods have been developed using this approach.
Major advantage -- simple to use.
Major drawback -- low spatial resolution: the patterns become sparser with distance.
Some close-range (4 cm) sensors exist with good depth resolution (around 0.05 mm), but they have a very narrow field of view and a short range of operation.
Moiré fringe methods are capable of producing very accurate depth data (resolution to within about 10 microns), but the methods have certain drawbacks.
Passive Stereo Methods
Passive stereo uses a two-camera set-up like the one described above, and involves:
Reliable extraction of certain features (such as edges or points) from both images.
Matching of corresponding features between images.
Both of these tasks are non-trivial and computationally complex.
Passive stereo may not produce depth maps within a reasonable time.
The depth data produced is typically sparse, since high-level features, such as edges, are used rather than points.
NOTE:
Finding and accurately locating features in each image can be hard.
Care is needed not to introduce errors.
Depth measurements are accurate to a few millimetres.
One such passive stereo vision system is TINA developed at Sheffield University.
The stereo correspondence problem can be eased by projecting a laser stripe onto the scene, since points lit by the stripe are easily identified in both images. Such a system works by:
Moving the laser stripe across the scene to obtain a series of vertical columns of pixels.
Triangulating the pixels to give the required dense depth map. The depth of a point is measured as the distance from one of the cameras, chosen as the master camera.
Knowing the relevant geometry and optical properties of the cameras, the depth map is constructed using the following method. For each lit point, project a line l from the lens centre of the master camera through the point's pixel in the master image, and locate the corresponding point Q in the secondary camera's image. The position of the point is then easily found by projecting a line l' from the lens centre of the secondary camera passing through Q: the intersection of the lines l and l' gives the coordinates of the point.
The depth map is formed by using a world coordinate system fixed on the master camera, with its origin at the master camera's lens centre.
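A sketch of that intersection step in Python. In practice the two lines rarely meet exactly because of noise, so a common choice (assumed here; the exact method of the original system is not specified) is the midpoint of the shortest segment between them:

```python
import numpy as np

def triangulate(c1, d1, c2, d2):
    """Ray 1 runs from the master camera's lens centre c1 along direction d1
    (the line l); ray 2 runs from the secondary camera's lens centre c2
    through Q along direction d2 (the line l'). Solve [d1 -d2][s t]^T ~ (c2 - c1)
    by least squares and return the midpoint of the closest points on the rays."""
    A = np.stack([d1, -d2], axis=1)                      # 3x2 system matrix
    (s, t), *_ = np.linalg.lstsq(A, c2 - c1, rcond=None)
    return (c1 + s * d1 + c2 + t * d2) / 2

# Illustrative numbers only: cameras 0.5 m apart, point at (0, 0, 2).
c1 = np.array([0.0, 0.0, 0.0]); d1 = np.array([0.0, 0.0, 1.0])
c2 = np.array([0.5, 0.0, 0.0]); d2 = np.array([-0.5, 0.0, 2.0])
print(triangulate(c1, d1, c2, d2))   # -> approximately [0. 0. 2.]
```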
Image Processing
Image processing is in many cases concerned with taking one array of pixels as input and
producing another array of pixels as output which in some way represents an
improvement to the original array.
For example, this processing might reduce noise, sharpen detail, or enhance contrast.
Many such techniques are dealt with in Professor Batchelor's companion course.
Many books, such as Gonzalez and Woods, are devoted to this subject.
Image processing methods may be broadly divided into:
Real space methods -- which work by directly processing the input pixel array.
Fourier space methods -- which work by first deriving a new representation of the input data by performing a Fourier transform; this representation is then processed, and finally an inverse Fourier transform is performed on the resulting data to give the final output image.
Fourier Methods
Let's consider a 1D Fourier transform example:
Consider a complicated sound such as the noise of a car horn. We can describe this sound
in two related ways:
sample the amplitude of the sound many times a second, which gives an
approximation to the sound as a function of time.
analyse the sound in terms of the pitches of the notes, or frequencies, which make up the sound, recording the amplitude of each frequency.
Similarly, the brightness along a line of an image can be recorded as a set of values measured at equally spaced points or, equivalently, as a set of amplitudes at a set of spatial frequency values.
Each of these frequency values is referred to as a frequency component.
An image is a two-dimensional array of pixel measurements on a uniform grid.
This information can be described in terms of a two-dimensional grid of spatial frequencies.
A given frequency component now specifies what contribution is made by data which is changing with specified spatial frequencies in the x and y directions.
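A small NumPy sketch of this two-dimensional frequency description; the test image is a synthetic pattern invented for the example:

```python
import numpy as np

# A 64x64 image whose brightness varies sinusoidally along x (8 cycles)
# and is constant along y.
x = np.arange(64)
img = np.sin(2 * np.pi * 8 * x / 64)[np.newaxis, :].repeat(64, axis=0)

F = np.fft.fft2(img)                 # 2D grid of frequency components
mag = np.abs(np.fft.fftshift(F))     # shift zero frequency to the centre

# All the energy sits at x-frequency = +/-8, y-frequency = 0.
peaks = np.argwhere(mag > mag.max() / 2) - 32
print(peaks)                         # [[0 -8] [0 8]] as (y, x) frequency offsets
```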
Smoothing Noise
The idea with noise smoothing is to reduce various spurious effects of a local nature in the image, caused perhaps by noise in the sensor, the digitisation process, or transmission.
The smoothing can be done either by considering the real space image, or its Fourier
transform.
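An illustrative sketch of both routes in NumPy: a 3x3 mean filter in real space, and an ideal low-pass filter in Fourier space (the cut-off radius of 10 is an arbitrary assumption):

```python
import numpy as np

img = np.random.default_rng(0).normal(128, 20, (64, 64))   # noisy test image

# Real space: replace each pixel by the mean of its 3x3 neighbourhood.
padded = np.pad(img, 1, mode="edge")
smooth_real = sum(padded[i:i+64, j:j+64] for i in range(3) for j in range(3)) / 9.0

# Fourier space: transform, zero all components beyond a cut-off radius,
# then inverse transform.
F = np.fft.fftshift(np.fft.fft2(img))
ky, kx = np.indices(F.shape)
mask = (kx - 32) ** 2 + (ky - 32) ** 2 <= 10 ** 2
smooth_fourier = np.fft.ifft2(np.fft.ifftshift(F * mask)).real
```

The ideal (sharp cut-off) low-pass filter introduces ringing, so in practice smoother filters such as Gaussians are usually preferred in either space.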
Edge Detection
Edge detection involves:
Finding pixels in the image where edges are likely to occur, by looking for discontinuities in gradients. Candidate points for edges in the image are usually referred to as edge points, edge pixels, or edgels.
Linking these edge points in some way to produce descriptions of edges in terms of lines, curves etc.
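A minimal sketch of the first stage (central-difference gradients and a threshold; the threshold value is an arbitrary assumption, and practical detectors such as Sobel or Canny refine both stages):

```python
import numpy as np

def find_edgels(img, threshold=30.0):
    """Mark pixels whose gradient magnitude exceeds a threshold as edgels.
    Gradients are estimated with simple central differences."""
    gy, gx = np.gradient(img.astype(float))   # brightness discontinuities
    return np.hypot(gx, gy) > threshold       # boolean map of edge points

# A vertical step edge: dark left half, bright right half.
img = np.zeros((8, 8)); img[:, 4:] = 255
print(np.argwhere(find_edgels(img))[:4])      # edgels along columns 3 and 4
```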