Image Processing
iitk
Kamlesh Tiwari
2.1 CRT

A beam of electrons (cathode rays), emitted by an electron gun, passes through focusing and deflection systems that direct the beam toward specified positions on the phosphor-coated screen. The phosphor then emits a small spot of light at each position contacted by the electron beam. Because the light emitted by the phosphor fades very rapidly, some method is needed for maintaining the screen picture. One way to keep the phosphor glowing is to redraw the picture repeatedly by quickly directing the electron beam back over the same points. This type of display is called a refresh CRT.

2.1.1 Persistence

For the phosphors coated on the screen, persistence represents the duration they continue to emit light after the CRT beam is removed. Persistence is defined as the time it takes the emitted light from the screen to decay to one-tenth of its original intensity. Lower-persistence phosphors

This number gives the ratio of vertical points to horizontal points necessary to produce equal-length lines in both directions on the screen. An aspect ratio of 3/4 means that a vertical line plotted with three points has the same length as a horizontal line plotted with four points.

2.1.4 Raster scan

In this system, the electron beam is swept across the screen, one row at a time from top to bottom. As the electron beam moves across each row, the beam intensity is turned on and off to create a pattern of illuminated spots. Picture definition is stored in a memory area called the refresh buffer or frame buffer. Intensity range for pixel positions depends on the capability of the raster system. Refreshing on raster-scan displays is carried out at the rate of 60 to 80 frames per second, although some systems are designed for higher refresh rates. At the end of each scan line, the electron beam returns to the left side of the screen to begin displaying the next scan line. The return to the left of the screen, after refreshing each
scan line, is called the horizontal retrace of the electron beam. At the end of each frame, the electron beam returns (vertical retrace) to the top left corner of the screen to begin the next frame.

2.1.5 Random scan

In this type of display system the CRT has the electron beam directed only to the parts of the screen where a picture is to be drawn. Random-scan monitors draw a picture one line at a time and for this reason are also referred to as vector displays (or stroke-writing or calligraphic displays). The component lines of a picture can be drawn and refreshed by a random-scan system in any specified order. A pen plotter operates in a similar way and is an example of a random-scan, hard-copy device.

Refresh rate on a random-scan system depends on the number of lines to be displayed. Picture definition is now stored as a set of line-drawing commands in an area of memory referred to as the refresh display file. Sometimes the refresh display file is called the display list, display program, or simply the refresh buffer. To display a specified picture, the system cycles through the set of commands in the display file, drawing each component line in turn. After all line-drawing commands have been processed, the system cycles back to the first line command in the list. Random-scan displays are designed to draw all the component lines of a picture 30 to 60 times each second.

Random-scan systems are designed for line-drawing applications and cannot display realistic shaded scenes. Since picture definition is stored as a set of line-drawing instructions and not as a set of intensity values for all screen points, vector displays generally have higher resolution than raster systems. Also, vector displays produce smooth line drawings because the CRT beam directly follows the line path. A raster system, in contrast, produces jagged lines that are plotted as discrete point sets.

2.1.6 Interlacing

2.2 Color monitors

2.2.1 Beam-penetration

The beam-penetration method for displaying color pictures has been used with random-scan monitors. Two layers of phosphor, usually red and green, are coated onto the inside of the CRT screen, and the displayed color depends on how far the electron beam penetrates into the phosphor layers. A beam of slow electrons excites only the outer red layer. A beam of very fast electrons penetrates through the red layer and excites the inner green layer. At intermediate beam speeds, combinations of red and green light are emitted to show two additional colors, orange and yellow. The speed of the electrons, and hence the screen color at any point, is controlled by the beam-acceleration voltage. Beam penetration has been an inexpensive way to produce color in random-scan monitors, but only four colors are possible, and the quality of pictures is not as good as with other methods.

2.2.2 Shadow-masking

Shadow-mask methods are commonly used in raster-scan systems because they produce a much wider range of colors than the beam-penetration method. A shadow-mask CRT has three phosphor color dots at each pixel position. One phosphor dot emits a red light, another emits a green light, and the third emits a blue light. This type of CRT has three electron guns, one for each color dot, and a shadow-mask grid just behind the phosphor-coated screen. The three electron beams are deflected and focused as a group onto the shadow mask, which contains a series of holes aligned with the phosphor-dot patterns. When the three beams pass through a hole in the shadow mask, they activate a dot triangle, which appears as a small color spot on the screen. The phosphor dots in the triangles are arranged so that each electron beam can activate only its corresponding color dot when it passes through the shadow mask.

We obtain color variations in a shadow-mask CRT by varying the intensity levels of the three electron beams. A sophisticated system can set intermediate intensity levels for the electron beams, allowing several million different colors to be generated.
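As a rough illustration of how intensity levels per electron gun translate into a color count on a shadow-mask CRT, the sketch below assumes that all three guns offer the same number of levels. The function name `shadow_mask_colors` and the figure of 256 levels per gun are assumptions made for this example, not values taken from the notes.

```python
def shadow_mask_colors(levels_per_gun: int) -> int:
    """Count the colors available when each of the three guns
    (red, green, blue) can be set to `levels_per_gun` intensity levels."""
    if levels_per_gun < 1:
        raise ValueError("each gun needs at least one intensity level")
    # Every combination of the three beam intensities gives one color.
    return levels_per_gun ** 3


# Beams that are only on or off: 2 levels per gun.
print(shadow_mask_colors(2))    # 8
# Assumed 8-bit control per gun: 256 levels per gun.
print(shadow_mask_colors(256))  # 16777216
```

With on/off beams only eight colors are available, while intermediate levels push the count into the millions, which is consistent with the "several million" figure quoted above.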
2 of ?? κtiwari [at] cse.iitk.ac.in
3.1 Digital Image Definitions 3 CONCEPT OF VISUAL INFORMATION
plays use optical effects to convert sunlight or light from some other source into graphics patterns. The most important example of a nonemissive flat-panel display is a liquid-crystal device.

In this form the data is most suitable for several applications such as transmission via digital communications facilities, storage within digital memory media or processing by computer.
3.4 Characteristics of Image Operations 3 CONCEPT OF VISUAL INFORMATION
A reasonable question to ask at this point is how many samples and gray levels are required for a good approximation. This brings up the question of resolution. The resolution (i.e. the degree of discernible detail) of an image is strongly dependent on both N and m. The more these parameters are increased, the closer the digitized array will approximate the original image.

Unfortunately this leads to large storage, and consequently processing requirements increase rapidly as a function of N and m.

3.3 Spatial and Gray level resolution

Sampling is the principal factor determining the spatial resolution of an image. Basically spatial resolution is the

Neighborhood operations play a key role in modern digital image processing. It is therefore important to understand how images can be sampled and how that relates to the various neighborhoods that can be used to process an image.

Rectangular sampling - In most cases, images are sampled by laying a rectangular grid over an image as illustrated in Figure (1.1). This results in the type of sampling shown in Figure (1.3ab).

Hexagonal sampling - An alternative sampling scheme is shown in Figure (1.3c) and is termed hexagonal sampling.

Both sampling schemes have been studied extensively and both represent a possible periodic tiling of the continuous image space. However, rectangular sampling, due to hardware and software considerations, remains
3.6 Convolution 3 CONCEPT OF VISUAL INFORMATION
In an interlaced image the odd numbered lines (1, 3, 5, ...) are scanned in half of the allotted time (e.g. 20 ms in PAL) and the even numbered lines (2, 4, 6, ...) are scanned in the remaining half. The image display must be coordinated with this scanning format. The reason for interlacing the

$$c = a \otimes b = a * b$$

Convolution is commutative.

$$c = a \otimes b = b \otimes a$$

Convolution is associative.

$$a \otimes (b \otimes c) = (a \otimes b) \otimes c = a \otimes b \otimes c$$
3.9 Importance of phase and magnitude 3 CONCEPT OF VISUAL INFORMATION
An arbitrary 2D signal $a(x, y)$ can always be written in a polar coordinate system as $a(r, \theta)$. When the 2D signal exhibits a circular symmetry this means that:

$$a(x, y) = a(r, \theta) = a(r)$$

where $r^2 = x^2 + y^2$ and $\tan\theta = y/x$. As a number of physical systems such as lenses exhibit circular symmetry, it is useful to be able to compute an appropriate Fourier representation.

The Fourier transform $A(u, v)$ can be written in polar coordinates $A(\omega_r, \zeta)$ and then, for a circularly symmetric signal, rewritten as a Hankel transform:

$$A(u, v) = F\{a(x, y)\} = 2\pi \int_0^{\infty} a(r)\, J_0(\omega_r r)\, r\, dr = A(\omega_r) \qquad (1)$$

where $\omega_r^2 = u^2 + v^2$ and $\tan\zeta = v/u$ and $J_0(\cdot)$ is a Bessel function of the first kind of order zero.

The inverse Hankel transform is given by:

$$a(r) = \frac{1}{2\pi} \int_0^{\infty} A(\omega_r)\, J_0(\omega_r r)\, \omega_r\, d\omega_r$$

The Fourier transform of a circularly symmetric 2D signal is a function of only the radial frequency $\omega_r$. The dependence on the angular frequency $\zeta$ has vanished. Further, if $a(x, y) = a(r)$ is real, then it is automatically even due to the circular symmetry. According to equ (??), $A(\omega_r)$ will then be real and even.

3.10 Statistics

In image processing it is quite common to use simple statistical descriptions of images and sub-images. The notion of a statistic is intimately connected to the concept

The probability that a brightness in a region falls between $a$ and $a + \Delta a$, given the probability distribution function $P(a)$, can be expressed as $p(a)\Delta a$, where $p(a)$ is the probability density function:

$$p(a)\,\Delta a = \left(\frac{dP(a)}{da}\right) \Delta a$$

Because of the monotonic, non-decreasing character of $P(a)$ we have $p(a) \ge 0$ and $\int_{-\infty}^{+\infty} p(a)\, da = 1$. For an image with quantized (integer) brightness amplitudes, the interpretation of $\Delta a$ is the width of a brightness interval. We assume constant width intervals. The brightness probability density function is frequently estimated by counting the number of times that each brightness occurs in the region to generate a histogram, $h[a]$. The histogram can then be normalized so that the total area under the histogram is 1. Said another way, the $p[a]$ for a region is the normalized count of the number of pixels, $N$, in a region that have quantized brightness $a$:

$$p[a] = \frac{1}{N}\, h[a] \quad \text{with} \quad N = \sum_{a} h[a]$$

The brightness probability distribution function for the image is shown in Figure (1.6a). The (unnormalized) brightness histogram, which is proportional to the estimated brightness probability density function, is shown in Figure (??). The height in this histogram corresponds to the number of pixels with a given brightness.

Both the distribution function and the histogram as measured from a region are a statistical description of that region. It must be emphasized that both $P(a)$ and $p(a)$ should be viewed as estimates of true distributions when they are computed from a specific region. That is, we view an image and a specific region as one realization of the various random processes involved in the formation of that image and that region. In the same context, the statistics defined below must be viewed as estimates of the underlying parameters.
3.12 Coefficient-of-variation 3 CONCEPT OF VISUAL INFORMATION
Percentiles

$$\int_{-\infty}^{a} p(\alpha)\, d\alpha = p\%$$
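For quantized brightness values the integral becomes a cumulative sum, so the p-th percentile can be read off as the smallest brightness whose cumulative fraction of pixels reaches p%. The sketch below uses that discrete convention, which is an assumption of this example (other interpolation conventions exist); the name `percentile_brightness` is hypothetical.

```python
def percentile_brightness(region, percent):
    """Smallest brightness a such that at least `percent` % of the
    pixels in `region` have brightness <= a."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must lie in [0, 100]")
    values = sorted(region)
    if not values:
        raise ValueError("region must contain at least one pixel")
    n = len(values)
    for count, a in enumerate(values, start=1):
        # count / n is the cumulative fraction of pixels seen so far.
        if count * 100 >= percent * n:
            return a
    return values[-1]


region = [0, 0, 1, 2, 2, 2, 3, 3]
print(percentile_brightness(region, 50))   # 2 (the median)
print(percentile_brightness(region, 100))  # 3 (the maximum)
```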
4.1 Model of the Human Eye 4 PERCEPTION
Contrast sensitivity
Figure 8: The Weber ratio with background
Light is a form of electromagnetic (em) energy that can be completely specified at a point in the image plane by its wavelength distribution. Not all electromagnetic radiation is visible to the human eye. In fact, the entire visible portion of the radiation is only within the narrow wavelength band of 380 to 780 nm. Till now, we were concerned mostly with light intensity, i.e. the sensation of brightness produced by the aggregate of wavelengths. However, light of many wavelengths also produces another important visual sensation called "color". Different spectral distributions generally, but not necessarily, have different perceived color. Thus color is that aspect of visible radiant energy by which an observer may distinguish between different spectral compositions.

A color stimulus is therefore specified by visible radiant energy of a given intensity and spectral composition. Color is generally characterised by attaching names to the different stimuli, e.g. white, gray, black, red, green, blue. Color stimuli are generally more pleasing to the eye than "black and white stimuli". Consequently pictures with color are widespread in TV, photography and printing.

Color is also used in computer graphics to add "spice" to the synthesized pictures. Coloring of black and white pictures by transforming intensities into colors (called pseudo colors) has been extensively used by artists working in pattern recognition. In this module we will be concerned with questions of how to specify color and how to reproduce it. Color specification consists of 3 parts:

1. Color matching
2. Color differences
3. Color appearance or perceived color

We will discuss the first of these questions in this module.

4.3 Representation of color for human vision

Let S(λ) denote the spectral power distribution (in watts/m²/unit wavelength) of the light emanating from a pixel

The counterpart to trichromacy of vision is the Trichromacy of Color Mixture. This important principle states that light of any color can be synthesized by an appropriate mixture of 3 properly chosen primary colors.

Maxwell in 1855 showed this using a 3-color projecting system. Several developments have taken place since that time, creating a large body of knowledge referred to as colorimetry.

Although trichromacy of color is based on subjective and physiological findings, there are precise measurements that can be made to examine color matches.

4.4 Color matching

Consider a bipartite field subtending an angle (∠) of 2° at a viewer's eye. The entire field is viewed against a dark, neutral surround. The field contains the test color on the left and an adjustable mixture of 3 suitably chosen primary colors on the right as shown in Figure (2.7).

It is found that most test colors can be matched by a proper mixture of 3 primary colors as long as the primary colors are independent. The primary colors are usually chosen as red, green and blue or red, green and violet. The "tristimulus values" of a test color are the amounts of the 3 primary colors required to give a match by additive mixture. They are unique within the accuracy of the experiment. Much of colorimetry is based on experimental results as well as rules attributed to Grassman. Two important rules that are valid over a large range of observing
4.5 Color-Coordinate Systems. 5 SAMPLING
conditions are "linearity" and "additivity". They state that,

1. The color match between any two color stimuli holds even if the intensities of the stimuli are increased or decreased by the same multiplying factor, as long as their relative spectral distributions remain unchanged. As an example, if stimuli $s_1(\lambda)$ and $s_2(\lambda)$ match, and stimuli $s_3(\lambda)$ and $s_4(\lambda)$ also match, then additive mixtures $s_1(\lambda) + s_3(\lambda)$ and $s_2(\lambda) + s_4(\lambda)$ will also match.

Figure 12: The color-matching functions for the 2° Standard Observer, using primaries of wavelengths 700 (red), 546.1 (green), and 435.8 nm (blue), with units such that equal quantities of the three primaries are needed to match the equal energy white, E

in which the left side of the split field shown in Fig (2.7) is allowed to emit light of unit intensity whose spectral distribution is constant with respect to λ (equal energy white E). Then the amount of each primary required for a match is taken by definition as one "unit".

The amount of primaries for matching other test colors is then expressed in terms of this unit. In practice equal energy white 'E' is matched with positive amounts of each primary.

4.7 Chromaticity coordinates in CIE-XYZ system.

4.8 Color Mixtures
5.1 Evaluation 5 SAMPLING
5.0.1 Two dimensional rectangular sampling

We discuss 2D rectangular sampling of a still image $x_a(t_1, t_2)$ in two spatial coordinates. In spatial rectangular sampling, we sample at the locations

$$t_1 = n_1 T_1, \qquad t_2 = n_2 T_2 \qquad (3)$$

where $T_1$ and $T_2$ are sampling distances in the $t_1$ and $t_2$ directions, respectively. The 2D rectangular sampling grid is depicted in figure ?? below. The sampled signal can be expressed as

$$x(n_1, n_2) = x_a(n_1 T_1, n_2 T_2) \quad \forall (n_1, n_2) \in \mathbb{Z}^2$$

In some cases it is convenient to define an intermediate sampled signal in terms of continuous coordinate variables given by,

$$\begin{aligned}
x_p(t_1, t_2) &= x_a(t_1, t_2) \sum_{n_1} \sum_{n_2} \delta(t_1 - n_1 T_1,\, t_2 - n_2 T_2) \\
&= \sum_{n_1} \sum_{n_2} x_a(n_1 T_1, n_2 T_2)\, \delta(t_1 - n_1 T_1,\, t_2 - n_2 T_2) \\
&= \sum_{n_1} \sum_{n_2} x(n_1, n_2)\, \delta(t_1 - n_1 T_1,\, t_2 - n_2 T_2)
\end{aligned}$$

Note $x_p(t_1, t_2)$ is indeed a sampled signal because of the presence of 2D Dirac delta functions.

5.0.2 Spectrum of the sampled signal

We now relate the Fourier transform $X_p(\Omega_1, \Omega_2)$, or $X(\omega_1, \omega_2)$, of the sampled signal to that of the continuous signal $x_a(t_1, t_2)$. As given earlier, the 2D continuous space Fourier transform $X_a(\Omega_1, \Omega_2)$ of a signal $x_a(t_1, t_2)$ with continuous variables $(t_1, t_2)$ is given by,

$$X_a(\Omega_1, \Omega_2) = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} x_a(t_1, t_2)\, e^{-j2\pi(t_1 \Omega_1 + t_2 \Omega_2)}\, dt_1\, dt_2$$

where $(\Omega_1, \Omega_2) \in \mathbb{R}^2$, and the inverse Fourier transform is given by,

$$x_a(t_1, t_2) = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} X_a(\Omega_1, \Omega_2)\, e^{+j2\pi(t_1 \Omega_1 + t_2 \Omega_2)}\, d\Omega_1\, d\Omega_2$$

Here the spatial frequency variables $\Omega_1, \Omega_2$ have units of cycles/mm and are related to radian frequencies by a scale factor of $2\pi$. To evaluate the 2D FT $X_p(\Omega_1, \Omega_2)$ of $x_p(t_1, t_2)$, we substitute the expression for $x_p$ into the transform definition and exchange the order of integration and summation to obtain,

$$X_p(\Omega_1, \Omega_2) = \sum_{n_1} \sum_{n_2} x_a(n_1 T_1, n_2 T_2) \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} \delta(t_1 - n_1 T_1,\, t_2 - n_2 T_2)\, e^{-j2\pi(\Omega_1 t_1 + \Omega_2 t_2)}\, dt_1\, dt_2$$

which simplifies as,

$$X_p(\Omega_1, \Omega_2) = \sum_{n_1} \sum_{n_2} x_a(n_1 T_1, n_2 T_2)\, e^{-j2\pi(\Omega_1 n_1 T_1 + \Omega_2 n_2 T_2)}$$

Term Paper
Understand the paper, fill in the gaps, demonstration/seminar. Will be done in groups (of two).

EndSem Exam

5.2 Questions

Q1. Why do we process images?
– Picture digitization and coding - for transmission, storage, and printing
– Picture enhancement and restoration
– Picture segmentation and description - for machine vision, image understanding.

Q2. What is image?
– Panchromatic - gray scale, 2D light intensity function f(x,y)
– Multispectral - color image, f(x,y) is a vector (R,G,B)

Q3. What is digital image?
Discretize both the spatial coordinates and the intensity function.

$$g = \begin{bmatrix} f(1, 1) & \cdots & f(1, N) \\ \vdots & & \vdots \\ f(N, 1) & \cdots & f(N, N) \end{bmatrix}_{N \times N}$$

$0 \le f(x, y) \le G - 1$, where $G$ is the number of gray levels and is bounded by $G = 2^m$.

The number of bits required to represent an image is $N \times N \times m$; typically it is 512*512*8.
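The storage figure quoted for a typical image can be worked out directly from the N × N × m rule. This is a small sketch; the function names `gray_levels` and `image_storage_bits` are hypothetical and introduced only for this example.

```python
def gray_levels(m: int) -> int:
    """Number of gray levels G = 2**m for m bits per pixel."""
    return 2 ** m


def image_storage_bits(n: int, m: int) -> int:
    """Bits needed for an N x N image with m bits per pixel."""
    return n * n * m


bits = image_storage_bits(512, 8)
print(gray_levels(8))  # 256
print(bits)            # 2097152
print(bits // 8)       # 262144 bytes, i.e. 256 KiB
```

Doubling N quadruples the storage, while adding one bit per pixel only adds N × N bits, which is why storage grows so quickly with spatial resolution.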
Q4. What is resolution of an image?
Minute details observable in the image. Reducing N while keeping m constant produces a checkerboard pattern. Reducing m while keeping N constant produces false contouring.

6 Quantization

6.1 Sensitivity

For color images we use three sets of sensors as below.

Picture digitization and coding

In other words, the point spread function $h(x, \alpha, y, \beta)$ expresses how much the input value at position $(x, y)$ influences the value at point $(\alpha, \beta)$.

For a linear operator, $H[a\,\delta(x - \alpha, y - \beta)] = a\,H[\delta(x - \alpha, y - \beta)]$.

Operators are defined in terms of point spread functions. The effect of an operator characterized by $h(x, \alpha, y, \beta)$ on an image $f(x, y)$ can be written as

$$g(\alpha, \beta) = \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x, y)\, h(x, \alpha, y, \beta)$$

For a shift-invariant operator,

$$h(x, \alpha, y, \beta) = h(\alpha - x, \beta - y)$$
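The double sum that applies a point spread function to an image can be sketched directly. In the sketch below the image is a small N × N nested list indexed as `f[x][y]`, and the helper names `apply_operator`, `shift_invariant` and `unit_shift` are hypothetical. The example kernel, which simply moves every pixel one step along the first coordinate, is an assumption chosen to make the result easy to check by hand.

```python
def apply_operator(f, h):
    """g(alpha, beta) = sum over x, y of f(x, y) * h(x, alpha, y, beta)
    for an N x N image f stored as f[x][y]."""
    n = len(f)
    return [
        [
            sum(f[x][y] * h(x, alpha, y, beta)
                for x in range(n) for y in range(n))
            for beta in range(n)
        ]
        for alpha in range(n)
    ]


def shift_invariant(kernel):
    """Build h(x, alpha, y, beta) = kernel(alpha - x, beta - y)."""
    return lambda x, alpha, y, beta: kernel(alpha - x, beta - y)


def unit_shift(dx, dy):
    """Kernel that is 1 only at offset (1, 0): output(alpha, beta)
    picks up the input at (alpha - 1, beta)."""
    return 1 if (dx, dy) == (1, 0) else 0


f = [[1, 2], [3, 4]]
g = apply_operator(f, shift_invariant(unit_shift))
print(g)  # [[0, 0], [1, 2]]
```

Because the point spread function here depends only on the differences α − x and β − y, the same kernel is used at every output position, which is what shift invariance means in practice.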
Where

$$H = h_{\alpha\beta}$$

$$h_{\alpha\beta}^{T} \equiv [\,h(0, \alpha, 0, \beta),\ h(1, \alpha, 0, \beta),\ \ldots,\ h(N-1, \alpha, 0, \beta),\ h(0, \alpha, 1, \beta),\ h(1, \alpha, 1, \beta),\ \ldots,\ h(N-1, \alpha, 1, \beta),\ \ldots,\ h(0, \alpha, N-1, \beta),\ h(1, \alpha, N-1, \beta),\ \ldots,\ h(N-1, \alpha, N-1, \beta)\,]$$

Image

$$f^{T} \equiv [\,f(0, 0),\ f(1, 0),\ \ldots,\ f(N-1, 0),\ f(0, 1),\ f(1, 1),\ \ldots,\ f(N-1, 1),\ \ldots,\ f(0, N-1),\ f(1, N-1),\ \ldots,\ f(N-1, N-1)\,]$$

$V_n$ is an $N \times 1$ column matrix having all elements except the $n$th set as zero.

Image processing refers to processing of a two dimensional picture by a digital computer. Basic classes of image processing applications:

Image representation and modeling
Concerned with characterization of the quantity that each picture-element represents. An image can represent luminance of objects in the scene, absorption characteristics of the body tissue, the radar cross section of a target, temperature profile of a region ... or anything.

Image enhancement
The goal is to accentuate certain image features for subsequent analysis or for image display.

Image restoration
Refers to removal or minimization of known degradation in an image. It includes deblurring, noise filtering, geometric distortion corrections, etc.

Image analysis
Concerned with making quantitative measurements from an image to produce its description.

Shifting Property

$$\int_{-\infty}^{+\infty} f(x')\, \delta(x - x')\, dx' = f(x)$$

Scaling Property

$$\delta(ax) = \frac{\delta(x)}{|a|}$$

Kronecker Delta

$$\delta(n) = \begin{cases} 0 & n \ne 0 \\ 1 & n = 0 \end{cases}$$

Shifting Property of Kronecker Delta

$$\sum_{m=-\infty}^{+\infty} f(m)\, \delta(n - m) = f(n)$$

Rectangle

$$\mathrm{rect}(x) = \begin{cases} 1 & |x| \le \frac{1}{2} \\ 0 & |x| > \frac{1}{2} \end{cases}$$

Signum

$$\mathrm{sgn}(x) = \begin{cases} 1 & x > 0 \\ 0 & x = 0 \\ -1 & x < 0 \end{cases}$$

Sinc

$$\mathrm{sinc}(x) = \frac{\sin \pi x}{\pi x}$$

Comb

$$\mathrm{comb}(x) = \sum_{n=-\infty}^{\infty} \delta(x - n)$$

$$\delta_n(x - a, y - b) \triangleq n^2\, \mathrm{rect}[\,n(x - a),\ n(y - b)\,]$$
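Several of the special functions listed above are easy to express as small helper functions, and the shifting property of the Kronecker delta can be checked numerically over a finite range of m. This is a sketch only: the function names are hypothetical, the value `sinc(0) = 1` is the usual limiting convention rather than something stated in the notes, and the infinite sum is truncated to a finite support for the check.

```python
import math


def kronecker_delta(n: int) -> int:
    """1 when n == 0, otherwise 0."""
    return 1 if n == 0 else 0


def rect(x: float) -> float:
    """1 for |x| <= 1/2, otherwise 0."""
    return 1.0 if abs(x) <= 0.5 else 0.0


def sgn(x: float) -> int:
    """1 for x > 0, 0 for x == 0, -1 for x < 0."""
    return (x > 0) - (x < 0)


def sinc(x: float) -> float:
    """sin(pi x) / (pi x), with the limiting value 1 at x == 0 (assumed)."""
    if x == 0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)


def sift(f, n: int, support) -> float:
    """Finite version of sum over m of f(m) * delta(n - m)."""
    return sum(f(m) * kronecker_delta(n - m) for m in support)


# The sum picks out f(n) as long as n lies inside the support.
print(sift(lambda m: m * m, 3, range(-10, 11)))  # 9
print(rect(0.25), rect(0.75), sgn(-2.5), sinc(0))  # 1.0 0.0 -1 1.0
```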
$$H[a_1 x_1(m, n) + a_2 x_2(m, n)] = a_1 H[x_1(m, n)] + a_2 H[x_2(m, n)]$$

Impulse Response
When the input is a Kronecker delta function at location $(m', n')$, the output at location $(m, n)$ is defined as

$$h(m, n; m', n') = H[\delta(m - m', n - n')]$$

PSF
The impulse response is called the Point Spread Function when input and output represent a positive quantity.

Shift invariant (or spatially invariant) system
A system is shift invariant if a translation of the input causes the same translation of the output.

$$H[\delta(m, n)] = h(m, n; 0, 0)$$

By definition

$$h(m, n; m', n') \triangleq H[\delta(m - m', n - n')]$$

$$h(m, n; m', n') = h(m - m', n - n')$$

Convolution
For a shift invariant system the output becomes

$$y(m, n) = \sum_{m'=-\infty}^{\infty} \sum_{n'=-\infty}^{\infty} h(m - m', n - n')\, x(m', n')$$

7 Fourier Transform

The Fourier transform of a complex function $f(x)$ is defined as

$$F(\xi) \triangleq \mathcal{F}[f(x)] \triangleq \int_{-\infty}^{\infty} f(x) \exp(-j2\pi\xi x)\, dx$$

The inverse Fourier transform of $F(\xi)$ is defined as

$$f(x) \triangleq \mathcal{F}^{-1}[F(\xi)] \triangleq \int_{-\infty}^{\infty} F(\xi) \exp(j2\pi\xi x)\, d\xi$$

Properties of the Fourier Transform.

1. Spatial Frequencies
If $f(x, y)$ is luminance and $x, y$ the spatial coordinates, then $\xi_1, \xi_2$ are the spatial frequencies that represent luminance changes with respect to spatial distances.

2. Uniqueness
For continuous functions, $f(x, y)$ and $F(\xi_1, \xi_2)$ are unique with respect to one another.

8 Image Perception

Light is the electromagnetic radiation that stimulates our visual response. Light received from an object can be written as

$$I(\lambda) = \rho(\lambda) L(\lambda)$$

where $\rho(\lambda)$ represents the reflectivity or transmissivity of the object and $L(\lambda)$ is the incident energy distribution.

Human Photoreceptors

Property        Rods                Cones
No.             100 million         6.5 million
Vision          Scotopic            Photopic
Color           No color            Color vision
Nerves          One per group       One for every cone
Concentration   Outside the fovea   Near the fovea

Luminance or intensity of a spatially distributed object with light distribution $I(x, y, \lambda)$ is defined as

$$f(x, y) = \int_0^{\infty} I(x, y, \lambda)\, V(\lambda)\, d\lambda$$

where $V(\lambda)$ is the relative luminous efficiency function of the visual system. For the human eye, $V(\lambda)$ is a bell-shaped curve.

Brightness is the perceived luminance and depends on the luminance of the surround. Objects with the same luminance can have different brightness.

Simultaneous Contrast
Since our perception is sensitive to luminance contrast rather than the absolute value, two squares of the same luminance value embedded between
9.1 References 9 IMAGE SAMPLING AND QUANTIZATION
$$\mathrm{comb}(x, y, \Delta x, \Delta y) \triangleq \sum_{m,n=-\infty}^{\infty} \delta(x - m\Delta x,\ y - n\Delta y)$$
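The impulses of the 2D comb sit at the grid points (mΔx, nΔy), so sampling a continuous image amounts to reading its value at those points. The sketch below assumes the continuous image is available as a Python function of (x, y) and restricts the infinite grid to a finite window of non-negative indices; the name `sample_on_grid` and the test function are hypothetical.

```python
def sample_on_grid(f, dx, dy, rows, cols):
    """Sample f(x, y) at the comb locations (m * dx, n * dy)
    for m in 0..rows-1 and n in 0..cols-1."""
    return [[f(m * dx, n * dy) for n in range(cols)] for m in range(rows)]


def plane(x, y):
    """Hypothetical continuous image: a simple intensity ramp."""
    return x + 10 * y


samples = sample_on_grid(plane, dx=0.5, dy=2.0, rows=3, cols=2)
print(samples)  # [[0.0, 20.0], [0.5, 20.5], [1.0, 21.0]]
```

Here `samples[m][n]` plays the role of the discrete image obtained from the continuous one with sampling distances Δx and Δy.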
Gonzalez - DIP
NPTEL notes.