Digital Image Fundamentals


2.1 ELEMENTS OF VISUAL PERCEPTION
1. STRUCTURE OF HUMAN EYE:
• The eye is nearly a sphere, with an average diameter of approximately 20
mm.
• Three membranes enclose the eye:
the cornea and sclera outer cover; the choroid; and the retina.

• The cornea is a tough, transparent tissue that covers the anterior (front) surface of the eye.
• Continuous with the cornea, the sclera is an opaque membrane (not
transparent) that encloses the remainder of the optic globe.

• The choroid lies directly below the sclera. This membrane contains a
network of blood vessels that serve as the major source of nutrition to
the eye.
• Even superficial injury to the choroid, often not deemed serious, can lead
to severe eye damage as a result of inflammation that restricts blood
flow.
• The choroid coat is heavily pigmented and hence helps to reduce the
amount of extraneous light entering the eye and the backscatter within
the optic globe.
• At its anterior extreme, the choroid is divided into the ciliary body and the
iris.
• The iris contracts or expands to control the amount of light that enters
the eye, and the ciliary body controls the shape of the lens.
• The central opening of the iris (the pupil) varies in diameter from
approximately 2 to 8 mm.
• The front of the iris contains the visible pigment of the eye, whereas the
back contains a black pigment.
• The lens is made up of concentric layers of fibrous cells and is suspended
by fibers that attach to the ciliary body.
• It contains 60 to 70% water, about 6% fat, and more protein than any
other tissue in the eye. The lens is colored by a slightly yellow
pigmentation that increases with age.
• The innermost membrane of the eye is the retina, which lines the inside of
the wall’s entire posterior portion. When the eye is properly focused, light
from an object outside the eye is imaged on the retina.
• Pattern vision is afforded by the distribution of discrete light receptors
over the surface of the retina.

• There are two classes of receptors: cones and rods.

• Cones are located in the fovea and are sensitive to color.


Each one is connected to its own nerve end.
Cone vision is called photopic (or bright-light vision).

• Rods give a general, overall picture of the field of view and are not
involved in color vision.
Several rods are connected to a single nerve ending, and they are sensitive to
low levels of illumination (scotopic or dim-light vision).
2. Image Formation in the Eye:

• In an ordinary photographic camera, the lens has a fixed focal length, and
focusing at various distances is achieved by varying the distance
between the lens and the imaging plane, where the film is located.

• In the human eye, the converse is true: the distance between the lens and the
imaging region (the retina) is fixed, and the focal length needed to achieve
proper focus is obtained by varying the shape of the lens.
• The eye lens (if compared to an optical lens) is flexible.
• Distance between the center of the lens and the retina (focal length):
– varies from 17 mm to 14 mm (refractive power of lens goes from
minimum to maximum).
• Objects farther than 3 m use minimum refractive lens powers (and vice
versa).
• Example:
– Calculation of the size of the retinal image of an object.
For an object 15 m high viewed from a distance of 100 m, with the
lens-to-retina distance taken as 17 mm and the retinal image height
denoted by x:
15/100 = x/17, so x ≈ 2.55 mm
3. Brightness Adaptation & Discrimination:

• Digital images are displayed as a discrete set of intensities

• Range of light intensity levels to which the HVS (human visual system) can
adapt: on the order of 10^10.

• For any given set of conditions, the current sensitivity level of HVS is
called the brightness adaptation level.
• The eye also discriminates between changes in brightness at any specific
adaptation level.
ΔIc / I → WEBER RATIO
• Where: ΔIc is the increment of illumination
discriminable 50% of the time and
I is the background illumination.
• A small value of ΔIc/I means that a small percentage change in intensity is
discriminable. This represents “good” brightness discrimination.
• Conversely, a large value of ΔIc/I means that a large percentage change in
intensity is required. This represents “poor” brightness discrimination.
• At low levels of illumination brightness discrimination is poor (rods) and it
improves significantly as background illumination increases (cones).
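As a small illustration of the Weber ratio, the sketch below computes ΔIc/I for a few (background, increment) pairs; the numbers and the 0.05 "good discrimination" threshold are made up purely for illustration, not measured data.

# Weber ratio: delta_Ic / I, where delta_Ic is the increment discriminable 50% of the
# time against background illumination I (values below are illustrative only).
measurements = [
    (0.1, 0.02),     # dim background: large relative increment needed (rod vision)
    (10.0, 0.2),     # moderate background
    (1000.0, 10.0),  # bright background
]

for I, delta_Ic in measurements:
    weber = delta_Ic / I
    quality = "good" if weber < 0.05 else "poor"   # arbitrary threshold for illustration
    print(f"I = {I:8.1f}   Weber ratio = {weber:.3f}   ({quality} discrimination)")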
(Figures: perceived brightness and simultaneous contrast.)
2.2 LIGHT AND THE ELECTROMAGNETIC SPECTRUM

• In 1666, Sir Isaac Newton discovered that when a beam of sunlight is passed
through a glass prism, the emerging beam of light is not white but consists
instead of a continuous spectrum of colors ranging from violet at one end to
red at the other.
• The range of colors we perceive in visible light represents a very small
portion of the electromagnetic spectrum.
• On one end of the spectrum are radio waves with wavelengths billions of
times longer than those of visible light.
• On the other end of the spectrum are gamma rays with wavelengths millions
of times smaller than those of visible light.
• The electromagnetic spectrum can be expressed in terms of wavelength,
frequency, or energy. Wavelength (λ) and frequency (ν) are related by
λ = c/ν, where c is the speed of light.
• The energy of a photon is E = hν, where h is Planck’s constant.
• The units of wavelength are meters, with the terms microns and
nanometers being used just as frequently.
• Frequency is measured in Hertz (Hz), with one Hertz being equal to one
cycle of a sinusoidal wave per second.
• The unit of energy is the electron-volt.
(Figure: the electromagnetic spectrum. The visible spectrum is shown zoomed to
facilitate explanation, but note that it is a rather narrow portion of the
EM spectrum.)
• Electromagnetic waves can be visualized as propagating sinusoidal waves
with wavelength, or they can be thought of as a stream of mass-less
particles, each traveling in a wavelike pattern and moving at the speed of
light.
• Each mass-less particle contains a certain amount (or bundle) of energy.
Each bundle of energy is called a photon.
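To relate wavelength, frequency, and photon energy numerically, here is a minimal Python sketch using the relations ν = c/λ and E = hν; the two wavelengths are the approximate ends of the visible band discussed in this section.

h = 6.626e-34    # Planck's constant, J*s
c = 2.998e8      # speed of light, m/s
joules_per_eV = 1.602e-19

for name, lam in [("violet, 0.43 um", 0.43e-6), ("red, 0.79 um", 0.79e-6)]:
    nu = c / lam                 # frequency in Hz
    E = h * nu                   # photon energy in joules
    print(f"{name}: frequency = {nu:.2e} Hz, photon energy = {E / joules_per_eV:.2f} eV")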

• Photons at the high-energy (short-wavelength) end of the spectrum carry the most energy per photon, which is why gamma rays are so dangerous to living organisms.


• Light is a particular type of electromagnetic radiation that can be seen and
sensed by the human eye.
• The visible band of the electromagnetic spectrum spans the range from
approximately 0.43 µm (violet) to about 0.79 µm (red).
• For convenience, the color spectrum is divided into six broad regions:
violet, blue, green, yellow, orange, and red.
• Light that is void of color is called achromatic (no color) or
monochromatic (single color) light.
• The term gray level generally is used to describe monochromatic intensity
because it ranges from black, to grays, and finally to white.

• Three basic quantities are used to describe the quality of a chromatic light
source:
Radiance is the total amount of energy that flows from the light source,
and it is usually measured in watts (W).

Luminance, measured in lumens (lm), gives a measure of the amount of
energy an observer perceives from a light source.
For example, light emitted from a source operating in the far infrared
region of the spectrum could have significant energy (radiance), but
an observer would hardly perceive it; its luminance would be almost
zero.

Brightness is a subjective descriptor of light perception that is practically
impossible to measure. It embodies the achromatic notion of intensity
and is one of the key factors in describing color sensation.
2.3 IMAGE SENSING AND ACQUISITION
• Most of the images in which we are interested are generated by the
combination of an “illumination” source (3-D) and the reflection or
absorption of energy from that source by the elements of the “scene”
being imaged.
• Figure 2.12 shows the three principal sensor arrangements used to transform
illumination energy into digital images.
• The idea is simple: Incoming energy is transformed into a voltage by the
combination of input electrical power and sensor material that is
responsive to the particular type of energy being detected.
• The output voltage waveform is the response of the sensor(s), and a
digital quantity is obtained from each sensor by digitizing its response.
• In this section, we look at the principal modalities for image sensing and
generation.
1. Image Acquisition Using a Single Sensor:

• The most familiar sensor of this type is the photodiode, which is
constructed of silicon materials and whose output voltage waveform is
proportional to light.
• The use of a filter in front of a sensor improves selectivity.
• For example, a green (pass) filter in front of a light sensor favors light in
the green band of the color spectrum. As a consequence, the sensor
output will be stronger for green light than for other components in the
visible spectrum.
• In order to generate a 2-D image using a single sensor, there has to be
relative displacements in both the x- and y-directions between the sensor
and the area to be imaged.
• Figure 2.13 shows an arrangement used in high-precision scanning, where
a film negative is mounted onto a drum whose mechanical rotation
provides displacement in one dimension.
• The single sensor is mounted on a lead screw that provides motion in
the perpendicular direction.
• Because mechanical motion can be controlled with high precision, this
method is an inexpensive (but slow) way to obtain high-resolution images.
• Other similar mechanical arrangements use a flat bed, with the sensor
moving in two linear directions. These types of mechanical digitizers
are sometimes referred to as microdensitometers.
2. Image Acquisition Using Sensor Strips:
• A geometry that is used much more frequently than single sensors
consists of an in-line arrangement of sensors in the form of a sensor strip,
as Fig. 2.12(b) shows.
• The strip provides imaging elements in one direction.
• Motion perpendicular to the strip provides imaging in the other direction,
as shown in Fig. 2.14(a). This is the type of arrangement used mostly in
flatbed scanners. Sensing devices with 4000 or more in-line sensors are
possible.
• In-line sensors are used routinely in airborne imaging applications, in
which the imaging system is mounted on an aircraft that flies at a constant
altitude and speed over the geographical area to be imaged.
• One-dimensional imaging sensor strips that respond to various bands of
the electromagnetic spectrum are mounted perpendicular to the
direction of flight.
• The imaging strip gives one line of an image at a time, and the motion of
the strip completes the other dimension of a two-dimensional image.
• Lenses or other focusing schemes are used to project the area to be
scanned onto the sensors.
• Sensor strips mounted in a ring configuration are used in medical and
industrial imaging to obtain cross-sectional (“slice”) images of 3-D objects,
as Fig. 2.14(b) shows. A rotating X-ray source provides illumination and
the sensors opposite the source collect the X-ray energy that passes
through the object (the sensors obviously have to be sensitive to X-ray
energy).
• Note that images are not obtained directly from the sensors by
motion alone; the sensed data require extensive processing.
• A 3-D digital volume consisting of stacked images is generated as the
object is moved in a direction perpendicular to the sensor ring. Other
modalities of imaging based on the computerized axial tomography (CAT)
principle include magnetic resonance imaging (MRI) and positron emission
tomography (PET). The illumination sources, sensors, and types of images
are different, but conceptually they are very similar to the basic imaging
approach shown in Fig. 2.14(b).
3. Image Acquisition Using Sensor Arrays:
• Figure 2.12(c) shows individual sensors arranged in the form of a 2-D array.
• Numerous electromagnetic and some ultrasonic sensing devices
frequently are arranged in an array format. A typical sensor for these
cameras is a CCD array, which can be manufactured with a broad range of
sensing properties and can be packaged in rugged arrays of 4000 × 4000
elements or more. CCD sensors are used widely in digital cameras and
other light sensing instruments.
• The response of each sensor is proportional to the integral of the light
energy projected onto the surface of the sensor, a property that is used in
astronomical and other applications requiring low noise images.
• Noise reduction is achieved by letting the sensor integrate the input light
signal over minutes.
• Because the sensor array in Fig. 2.12(c) is two-dimensional, its key
advantage is that a complete image can be obtained by focusing the energy
pattern onto the surface of the array.
• Motion obviously is not necessary, as is the case with the sensor
arrangements discussed in the preceding two sections.
• The principal manner in which array sensors are used is shown in Fig. 2.15.
• This figure shows the energy from an illumination source being reflected
from a scene element.
• The first function performed by the imaging system in Fig. 2.15(c) is to
collect the incoming energy and focus it onto an image plane.
• If the illumination is light, the front end of the imaging system is an optical
lens that projects the viewed scene onto the lens focal plane, as Fig.
2.15(d) shows.
• The sensor array, which is coincident with the focal plane, produces
outputs proportional to the integral of the light received at each sensor.
• Digital and analog circuitry sweep these outputs and convert them to an
analog signal, which is then digitized by another section of the imaging
system.
• The output is a digital image, as shown diagrammatically in Fig. 2.15(e).
Conversion of an image into digital form is the topic of Section 2.4.
4. A Simple Image Formation Model:

• As introduced in Section 1.1, we denote images by two-dimensional
functions of the form f(x, y).
• The value or amplitude of f at spatial coordinates (x, y) is a positive scalar
quantity whose physical meaning is determined by the source of the image.
• When an image is generated from a physical process, its intensity values
are proportional to energy radiated by a physical source (e.g.,
electromagnetic waves).
• As a consequence, f(x, y) must be nonzero and finite; that is,
0 < f(x, y) < ∞
• The function f(x,y) may be characterized by two components:
– (1) the amount of source illumination incident on the scene being
viewed, and
– (2) the amount of illumination reflected by the objects in the scene.
• Appropriately, these are called the illumination and reflectance
components and are denoted by i(x,y) and r(x,y), respectively. The two
functions combine as a product to form f(x,y)
f(x,y) = i(x,y) · r(x,y)
0 < i(x,y) < ∞
0 < r(x,y) < 1
• Equation indicates that reflectance is bounded by 0 (total absorption) and
1 (total reflectance).
• The nature of i(x,y) is determined by the illumination source, and r(x,y) is
determined by the characteristics of the imaged objects.

• It is noted that these expressions also are applicable to images formed via
transmission of the illumination through a medium, such as a chest X-ray.
• In this case, we would deal with a transmissivity instead of a reflectivity
function, but the limits would be the same and the image function formed
would be modeled as the same product, with the transmissivity taking the
place of r(x,y).
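A minimal numpy sketch of this image formation model; the particular illumination gradient and reflectance pattern below are invented purely for illustration.

import numpy as np

M, N = 64, 64

# Illumination i(x,y): positive and finite; here a smooth horizontal gradient.
i = 100.0 * (1.0 - 0.5 * np.arange(N) / N)   # shape (N,)
i = np.tile(i, (M, 1))                       # shape (M, N)

# Reflectance r(x,y): bounded by 0 (total absorption) and 1 (total reflectance);
# here a bright square on a darker background.
r = np.full((M, N), 0.2)
r[16:48, 16:48] = 0.9

f = i * r   # the model: f(x,y) = i(x,y) * r(x,y)
print(f.min() > 0, np.isfinite(f).all())     # intensities are positive and finite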
2.4 Image Sampling and Quantization
• Our objective is to generate digital images from sensed data.

• The output of most sensors is a continuous voltage waveform whose


amplitude and spatial behavior are related to the physical phenomenon
being sensed.

• To create a digital image, we need to convert the continuous sensed data


into digital form. This involves two processes: sampling and quantization.
1. Basic Concepts in Sampling and Quantization:

• The basic idea behind sampling and quantization is illustrated in Fig. 2.16.

• Figure 2.16(a) shows a continuous image f that we want to convert to


digital form. An image may be continuous with respect to the x- and y-
coordinates, and also in amplitude.
• To convert it to digital form, we have to sample the function in both
coordinates and in amplitude.

• Digitizing the coordinate values is called sampling.

• Digitizing the amplitude values is called quantization.


• The one-dimensional function in Fig. 2.16(b) is a plot of amplitude values
of the continuous image along the line segment AB in Fig. 2.16(a).
• The random variations are due to image noise.
• To sample this function, we take equally spaced samples along line AB, as
shown in Fig. 2.16(c).
• The spatial location of each sample is indicated by a vertical tick mark in
the bottom part of the figure. The samples are shown as small white
squares superimposed on the function.
• The set of these discrete locations gives the sampled function. However,
the values of the samples still span (vertically) a continuous range of
intensity values.
• In order to form a digital function, the intensity values also must be
converted (quantized) into discrete quantities.
• The right side of Fig. 2.16(c) shows the intensity scale divided into eight
discrete intervals, ranging from black to white.
• The vertical tick marks indicate the specific value assigned to each of the
eight intensity intervals.
• The continuous intensity levels are quantized by assigning one of the eight
values to each sample.
• The assignment is made depending on the vertical proximity of a sample
to a vertical tick mark.
• The digital samples resulting from both sampling and quantization are
shown in Fig. 2.16(d).
• Starting at the top of the image and carrying out this procedure line by
line produces a two-dimensional digital image.
• It is implied in Fig. 2.16 that, in addition to the number of discrete levels
used, the accuracy achieved in quantization is highly dependent on the
noise content of the sampled signal.
• Sampling in the manner just described assumes that we have a
continuous image in both coordinate directions as well as in amplitude.
• Figure 2.17(a) shows a continuous image projected onto the plane of an
array sensor. Figure 2.17(b) shows the image after sampling and
quantization.
• Clearly, the quality of a digital image is determined to a large degree by
the number of samples and discrete intensity levels used in sampling
and quantization.
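A minimal numpy sketch of sampling and uniform quantization applied to a synthetic "continuous" scene; the sampling step and the eight intensity levels mirror the discussion above, but the scene itself is made up.

import numpy as np

# "Continuous" scene approximated on a fine grid, with values in [0, 1].
t = np.linspace(0.0, 1.0, 1024)
scene = np.outer(0.5 + 0.5 * np.sin(2 * np.pi * 3 * t),
                 0.5 + 0.5 * np.cos(2 * np.pi * 2 * t))

# Sampling: keep every k-th value along both coordinate directions.
k = 16
sampled = scene[::k, ::k]

# Quantization: map the continuous amplitudes to 8 discrete intensity levels (0..7).
levels = 8
quantized = np.round(sampled * (levels - 1)).astype(np.uint8)

print(sampled.shape, quantized.min(), quantized.max())   # (64, 64) 0 7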
2. Representing Digital Images:
• Let f(s,t) represent a continuous image function of two continuous
variables, s and t.
• We convert this function into a digital image by sampling and
quantization, as explained in the previous section.
• Suppose that we sample the continuous image into a 2-D array, f(x,y) ,
containing M rows and N columns, where (x,y) are discrete coordinates.
For notational clarity and convenience, we use integer values for these
discrete coordinates: x = 0, 1, 2, …, M−1 and y = 0, 1, 2, …, N−1.
• Thus, for example, the value of the digital image at the origin is f(0,0) ,
and the next coordinate value along the first row is f(0,1).
• Here, the notation (0, 1) is used to signify the second sample along the
first row. It does not mean that these are the values of the physical
coordinates when the image was sampled.
• The section of the real plane spanned by the coordinates of an image is
called the SPATIAL DOMAIN, with x and y being referred to as spatial
variables or spatial coordinates.
• There are three basic ways to represent f(x,y). Figure 2.18(a) is a plot of the
function, with two axes determining spatial location and the third axis
being the values of f (intensities) as a function of the two spatial variables
x and y.
• This representation is useful when working with gray-scale sets whose
elements are expressed as triplets of the form (x, y, z), where x and y are
spatial coordinates and z is the value of f at coordinates (x, y).
• The representation in Fig. 2.18(b) is much more common. It shows f(x,y)
as it would appear on a monitor or photograph. Here, the intensity of
each point is proportional to the value of f at that point. In this figure,
there are only three equally spaced intensity values. If the intensity is
normalized to the interval [0, 1], then each point in the image has the
value 0, 0.5, or 1.
• A monitor or printer simply converts these three values to black, gray, or
white, respectively, as Fig. 2.18(b) shows. The third representation is
simply to display the numerical values of f(x,y) as an array (matrix).
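A small sketch of the indexing convention, assuming numpy: the digital image is an M × N array, f[0, 0] is the value at the origin, and f[0, 1] is the second sample along the first row.

import numpy as np

M, N = 4, 5                          # M rows, N columns
f = np.arange(M * N).reshape(M, N)   # toy intensity values, for illustration only

print(f[0, 0])    # value at the origin
print(f[0, 1])    # second sample along the first row
print(f.shape)    # (M, N) -> (4, 5)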
3. Spatial and Intensity Resolution:
• Spatial resolution is a measure of the smallest discernible detail in an
image.
• Quantitatively, spatial resolution can be stated in a number of ways, with
line pairs per unit distance, and dots (pixels) per unit distance being
among the most common measures.
• Dots per unit distance is a measure of image resolution used commonly in
the printing and publishing industry. In the U.S., this measure usually is
expressed as dots per inch (dpi).
• To give you an idea of quality,
newspapers are printed with a resolution of 75 dpi,
magazines at 133 dpi,
glossy brochures at 175 dpi, and the
book page at which you are presently looking is printed at 2400
dpi.
• Intensity resolution similarly refers to the smallest discernible change in
intensity level.
• We have considerable discretion regarding the number of samples used to
generate a digital image, but this is not true regarding the number of
intensity levels.
• The most common number is 8 bits, with 16 bits being used in some
applications in which enhancement of specific intensity ranges is
necessary.
• Intensity quantization using 32 bits is rare.

• Unlike spatial resolution, which must be stated on a per-unit-of-distance
basis to be meaningful, it is common practice to refer to the number of
bits used to quantize intensity as the intensity resolution.
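A minimal sketch of intensity resolution in practice: requantizing an 8-bit image to k bits (numpy assumed; the helper function and gradient test image are illustrative, not a standard routine).

import numpy as np

def requantize(img_u8, k):
    """Map an 8-bit image to 2**k intensity levels (k = 1..8), keeping the 0-255 range."""
    step = 256 // (2 ** k)
    return (img_u8 // step) * step + step // 2   # centre of each quantization bin

gradient = np.tile(np.arange(256, dtype=np.uint8), (32, 1))   # synthetic test image
for k in (8, 4, 2, 1):
    print(k, "bits ->", len(np.unique(requantize(gradient, k))), "distinct levels")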
4. Image Interpolation:
• Interpolation is a basic tool used extensively in tasks such as zooming,
shrinking, rotating, and geometric corrections.
• Our principal objective in this section is to introduce interpolation and
apply it to image resizing (shrinking and zooming), which are basically
image re-sampling methods.
• Fundamentally, interpolation is the process of using known data to
estimate values at unknown locations.
• The simplest approach is nearest neighbor interpolation, which assigns to
each new location the intensity of its nearest neighbor in the original image.
• This approach is simple, but it has the tendency to produce undesirable
artifacts, such as severe distortion of straight edges. For this reason, it is
used infrequently in practice.
• A more suitable approach is bilinear interpolation, in which we use the
four nearest neighbors to estimate the intensity at a given location.
• The next level of complexity is bicubic interpolation, which involves the
sixteen nearest neighbors of a point.
• Generally, bicubic interpolation does a better job of preserving fine detail
than its bilinear counterpart.
• Bicubic interpolation is the standard used in commercial image editing
programs, such as Adobe Photoshop and Corel Photopaint.
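A minimal sketch of nearest neighbor interpolation for zooming, written directly in numpy so that no particular imaging library's resize API is assumed; bilinear and bicubic interpolation would instead combine the 4 and 16 nearest input pixels.

import numpy as np

def zoom_nearest(img, factor):
    """Resize a 2-D image by giving each output pixel the value of its nearest input pixel."""
    M, N = img.shape
    out_M, out_N = int(M * factor), int(N * factor)
    rows = np.minimum((np.arange(out_M) / factor).round().astype(int), M - 1)
    cols = np.minimum((np.arange(out_N) / factor).round().astype(int), N - 1)
    return img[rows[:, None], cols[None, :]]

small = np.array([[0, 100], [200, 255]], dtype=np.uint8)
print(zoom_nearest(small, 2))   # each original pixel becomes (roughly) a 2 x 2 block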
2.5 Some Basic Relationships between Pixels
1. Neighbors of a Pixel :

• A pixel p at (x, y) has 2 horizontal and 2 vertical neighbors:
(x+1, y), (x−1, y), (x, y+1), (x, y−1)
This set of pixels is called the 4-neighbors of p: N4(p)
• The 4 diagonal neighbors of p are: ND(p)
(x+1, y+1), (x+1, y−1), (x−1, y+1), (x−1, y−1)
• N4(p) ∪ ND(p) = N8(p): the 8-neighbors of p
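A small Python sketch of these neighborhoods; the boundary check is included because pixels on the border of an M × N image have fewer neighbors inside the image.

def neighbors(x, y, M, N):
    """Return N4(p), ND(p) and N8(p) for the pixel p = (x, y) inside an M x N image."""
    n4 = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]                  # 4-neighbors
    nd = [(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)]  # diagonal neighbors
    inside = lambda p: 0 <= p[0] < M and 0 <= p[1] < N
    n4 = [p for p in n4 if inside(p)]
    nd = [p for p in nd if inside(p)]
    return n4, nd, n4 + nd                                                 # N8(p) = N4(p) U ND(p)

print(neighbors(0, 0, 5, 5))   # a corner pixel has only 2 + 1 = 3 neighbors inside the image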


2. Adjacency, Connectivity, Regions, and Boundaries:
• Let V be the set of intensity values used to define adjacency. In a binary
image, V = {1} if we are referring to adjacency of pixels with value 1.
• In a gray-scale image, the idea is the same, but set V typically contains
more elements.
• For example, in the adjacency of pixels with a range of possible intensity
values 0 to 255, set V could be any subset of these 256 values.
• Mixed adjacency (m-adjacency) is a modification of 8-adjacency.
• It is introduced to eliminate the ambiguities that often arise when 8-
adjacency is used.
• For example, consider the pixel arrangement shown in Fig. 2.25(a) for
V={1}
• The three pixels at the top of Fig. 2.25(b) show multiple (ambiguous) 8-
adjacency, as indicated by the dashed lines.
• A path (curve) from pixel p with coordinates (x,y) to pixel q with
coordinates (s,t) is a sequence of distinct pixels:
(x0,y0), (x1,y1), …, (xn,yn)
where (x0,y0) = (x,y), (xn,yn) = (s,t), and consecutive pixels in the
sequence are adjacent.
• Let S represent a subset of pixels in an image.
• Two pixels p and q are said to be connected in S if there exists a path
between them consisting entirely of pixels in S.
• For any pixel p in S, the set of pixels that are connected to it in S is called a
connected component of S. If it only has one connected component, then
set S is called a connected set.
• Let R be a subset of pixels in an image. We call R a REGION of the image if
R is a connected set.
• Two regions Ri and Rj are said to be adjacent if their union forms a
connected set.
• Regions that are not adjacent are said to be disjoint.
• We consider 4- and 8-adjacency when referring to regions.
• The boundary of a region R is the set of points that are adjacent to points
in the complement of R. Said another way, the border of a region is the
set of pixels in the region that have at least one background neighbor.
• Connectivity between pixels is important: Because it is used in
establishing boundaries of objects and components of regions in an image
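As a concrete illustration of connectivity, below is a minimal sketch that labels the 4-connected components of a binary image (V = {1}) with a breadth-first search; numpy is assumed and the test image is made up.

import numpy as np
from collections import deque

def label_4_connected(binary):
    """Label the 4-connected components of a binary image (pixels with value 1)."""
    M, N = binary.shape
    labels = np.zeros((M, N), dtype=int)
    current = 0
    for x in range(M):
        for y in range(N):
            if binary[x, y] == 1 and labels[x, y] == 0:
                current += 1                       # start a new connected component
                labels[x, y] = current
                queue = deque([(x, y)])
                while queue:
                    i, j = queue.popleft()
                    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):   # 4-neighbors
                        a, b = i + di, j + dj
                        if 0 <= a < M and 0 <= b < N and binary[a, b] == 1 and labels[a, b] == 0:
                            labels[a, b] = current
                            queue.append((a, b))
    return labels, current

img = np.array([[1, 1, 0, 0],
                [0, 1, 0, 1],
                [0, 0, 0, 1]])
labels, n = label_4_connected(img)
print(n)        # 2 connected components under 4-adjacency
print(labels)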
3. Distance measures:
For pixels p,q,z with coordinates (x,y), (s,t), (u,v), D is a distance function or metric if:
D(p,q) ≥ 0 (D(p,q)=0 iff p=q)
D(p,q) = D(q,p)
D(p,z) ≤ D(p,q) + D(q,z)
Euclidean distance:
De(p,q) = [(x−s)^2 + (y−t)^2]^(1/2)
Points (pixels) having a distance less than or equal to r from (x,y) are contained in
a disk of radius r centered at (x,y).
D4 distance (City-Block Distance):
D4(p,q) = |x-s| + |y-t|
forms a diamond centered at (x,y)
e.g. pixels with D4≤2 from p
D8 distance (Chessboard Distance):
D8(p,q) = max(|x-s|,|y-t|)
forms a square centered at p
e.g. pixels with D8≤2 from p
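A small Python sketch computing the three distance measures for two pixels p = (x, y) and q = (s, t):

import math

def distances(p, q):
    """Return the Euclidean, city-block (D4) and chessboard (D8) distances between p and q."""
    (x, y), (s, t) = p, q
    de = math.sqrt((x - s) ** 2 + (y - t) ** 2)   # Euclidean distance
    d4 = abs(x - s) + abs(y - t)                  # D4: contours of equal distance are diamonds
    d8 = max(abs(x - s), abs(y - t))              # D8: contours of equal distance are squares
    return de, d4, d8

print(distances((0, 0), (3, 4)))   # -> (5.0, 7, 4)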
2.6 An Introduction to the Mathematical Tools Used
in Digital Image Processing
• Two principal objectives:
(1) to introduce you to the various mathematical tools we use
throughout the book; and
(2) to help you begin developing a “feel” for how these tools are
used by applying them to a variety of basic image-processing tasks, some
of which will be used numerous times in subsequent discussions
1. Array versus Matrix Operations:

• An array operation involving one or more images is carried out on a pixel-by-pixel basis.
• In fact, there are many situations in which operations between images are
carried out using matrix theory. It is for this reason that a clear distinction
must be made between array and matrix operations.

• Consider, for example, two 2 × 2 images (arrays) A and B. Their array product multiplies corresponding pixels.
• On the other hand, the matrix product of A and B follows the usual rules of matrix multiplication (rows of A times columns of B), as sketched below.
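A minimal numpy sketch of the distinction, using two arbitrary 2 × 2 arrays chosen only for illustration: the operator * performs the array (elementwise) product, while @ performs the matrix product.

import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

print(A * B)   # array product: corresponding pixels multiplied -> [[ 5 12] [21 32]]
print(A @ B)   # matrix product: rows times columns             -> [[19 22] [43 50]]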


2. Linear versus Nonlinear Operations:
• One of the most important classifications of an image-processing method is
whether it is linear or nonlinear. Consider a general operator, H, that produces
an output image, g(x, y), for a given input image, f(x, y):
g(x, y) = H[ f(x, y) ]
• H is said to be a linear operator if
H[ a1 f1(x, y) + a2 f2(x, y) ] = a1 H[ f1(x, y) ] + a2 H[ f2(x, y) ] = a1 g1(x, y) + a2 g2(x, y)
• Indicates that the output of a linear operation due to the sum of two inputs is
the same as performing the operation on the inputs individually and then
summing the results.
• The output of a linear operation to a constant times an input is the same as the
output of the operation due to the original input multiplied by that constant.
• The first property is called the property of additivity and the second is called the
property of homogeneity.
• Suppose, for example, that H is the sum operator, Σ, whose function is simply to sum its
inputs. To test for linearity, we start with the left side and attempt to prove that it is
equal to the right side:
Σ[ a1 f1(x, y) + a2 f2(x, y) ] = Σ a1 f1(x, y) + Σ a2 f2(x, y) = a1 Σ f1(x, y) + a2 Σ f2(x, y) = a1 g1(x, y) + a2 g2(x, y)
• where the first step follows from the fact that summation is distributive. So, an expansion
of the left side is equal to the right side and we conclude that the sum operator is linear.
• On the other hand, consider the max operation, whose function is to find the maximum
value of the pixels in an image. For our purposes here, the simplest way to prove that this
operator is nonlinear is to find an example that fails the linearity test.



• Suppose that we choose two small images f1 and f2 and two constants a1 and a2. To test
for linearity, we again start with the left side, max[ a1 f1 + a2 f2 ].
• Working next with the right side, we obtain a1 max[ f1 ] + a2 max[ f2 ].
• For suitable choices of f1, f2, a1, and a2 the left and right sides are not equal, so we
have proved that, in general, the max operator is nonlinear; a numeric check is sketched below.
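A small numeric check of both tests; the images f1, f2 and the constants a1 = 1, a2 = −1 below are illustrative choices, picked so that the max operator visibly fails the linearity test.

import numpy as np

f1 = np.array([[0, 2], [2, 3]])
f2 = np.array([[6, 4], [5, 7]])
a1, a2 = 1, -1

# Sum operator: left and right sides agree, so it passes the linearity test.
left_sum = np.sum(a1 * f1 + a2 * f2)
right_sum = a1 * np.sum(f1) + a2 * np.sum(f2)
print(left_sum, right_sum)     # -15 -15

# Max operator: left and right sides disagree, so it is nonlinear.
left_max = np.max(a1 * f1 + a2 * f2)
right_max = a1 * np.max(f1) + a2 * np.max(f2)
print(left_max, right_max)     # -2 -4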
3. Arithmetic operations:

• Arithmetic operations between images are array operations, which means that they
are carried out between corresponding pixel pairs.
• The four arithmetic operations are denoted as
s(x,y) = f(x,y) + g(x,y)
d(x,y) = f(x,y) − g(x,y)
p(x,y) = f(x,y) × g(x,y)
v(x,y) = f(x,y) ÷ g(x,y)
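A minimal numpy sketch of the four elementwise operations between two same-size images; the pixel values are synthetic, and the arrays use floating point so that subtraction and division behave sensibly.

import numpy as np

f = np.array([[10, 200], [50, 120]], dtype=np.float64)
g = np.array([[40, 100], [25,  60]], dtype=np.float64)

s = f + g   # addition:        s(x,y) = f(x,y) + g(x,y)
d = f - g   # subtraction:     d(x,y) = f(x,y) - g(x,y)
p = f * g   # multiplication:  p(x,y) = f(x,y) * g(x,y)
v = f / g   # division:        v(x,y) = f(x,y) / g(x,y)

print(s, d, p, v, sep="\n")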
4. Set and Logical Operations:
5. Spatial Operations:
6. Vector and Matrix Operations:
