Digital Image Processing Notes
What is Digital Image Processing?
Digital Image
Common digital image formats include: binary (single-bit) images, gray-scale images, and color images.
Pixel values can represent gray levels, colours, heights, opacities, etc. Pixels are the elements of a digital image.
An image may be defined as a two-dimensional function, f(x, y), where x and y are spatial (plane) coordinates, and the amplitude of f at any pair of coordinates (x, y) is called the intensity or gray level of the image at that point.
When x, y, and the amplitude values of f are all finite, discrete quantities, we call the image a digital image. Each of these elements has a particular location and value.
These elements are referred to as picture elements, image elements, pels, and pixels. Pixel is the term most widely used to denote the elements of a digital image.
The field of digital image processing refers to processing digital images by means of a
digital computer.
A color image is just three functions pasted together. We can write this as
a vector-valued function:
f(x, y) = [ r(x, y), g(x, y), b(x, y) ]^T
Representing Digital Images:
Image plotted as a surface, Image displayed as a visual intensity array, Numerical array
Dynamic range = ratio of maximum measurable intensity to minimum detectable
intensity level in the system Rule: upper limit determined by saturation, lower limit
determined by noise
Contrast = difference in intensity between the highest and the lowest intensity levels in
an image
High dynamic range => high contrast expected. Low dynamic range => dull, washed-out gray look.
The representation of an M × N numerical array:
[ a(0,0)      a(0,1)      ...  a(0,N-1)
  a(1,0)      a(1,1)      ...  a(1,N-1)
  ...         ...         ...  ...
  a(M-1,0)    a(M-1,1)    ...  a(M-1,N-1) ]
An image with 2^k intensity levels is called a k-bit image (e.g., 256 levels => an 8-bit image)
A Simple Image Formation Model
Images are generated by a physical process: intensity values are proportional to the energy radiated by a physical source, so 0 < f(x,y) < ∞.
f(x, y) is characterized by two components:
(1) The amount of source illumination incident on the scene: illumination i(x,y)
(2) The amount of illumination reflected by the objects of the scene: reflectance r(x,y)
f(x,y) = i(x,y) r(x,y), where 0 < i(x,y) < ∞ and 0 < r(x,y) < 1
Example of typical ranges of illumination i(x,y) for visible light (average values):
Sun on a clear day: ~90,000 lm/m², down to 10,000 lm/m² on a cloudy day
Full moon on a clear evening: ~0.1 lm/m²
Typical illumination level in a commercial office: ~1,000 lm/m²
Typical values of reflectance r(x,y):
0.01 for black velvet, 0.65 for stainless steel
0.8 for flat white wall paint, 0.9 for silver-plated metal, 0.93 for snow
Types or Levels of image processing
Image-to-image transformations (image in, image out):
- Noise removal, image sharpening
- Enhancement (make the image more useful or pleasing)
- Restoration (deblurring, grid line removal)
- Geometry (scaling, sizing, zooming, morphing one object to another)
Image-to-information transformations (image in, information out):
- Image to output attributes (e.g., edges, contours, and the identity of individual objects)
- Image statistics (histograms); the histogram is the fundamental tool for analysis and image processing
- Image compression
- Image analysis (image segmentation, feature extraction, pattern recognition), computer-aided detection and diagnosis (CAD)
Information-to-image transformations (information in, image out):
- Decompression of compressed image data
- Reconstruction of image slices from CT
Fundamental steps in digital image processing (outputs of these processes are generally images):
- Image acquisition: the first process; the image could already be in digital form; generally involves preprocessing such as scaling.
- Image enhancement: brings out obscured detail; highlights features of interest in an image; subjective.
- Image restoration: improves the appearance of an image, like enhancement, but objective.
- Color image processing.
- Wavelets and multiresolution processing: representing images in various degrees of resolution.
- Compression: reduces storage volume; reduces bandwidth for transmission.
- Morphological processing: representation and description of shape.
- Segmentation: partitions an image into its constituent parts or objects.
- Representation and description.
- Object recognition: the process that assigns a label to an object.
All of these stages draw on a knowledge base about the problem domain.
What are the components / elements of an image processing system?
- Network: SAN/NAS for tera/peta bytes of data; communication.
- Image displays: stereo displays, monitors.
- Mass storage: short-time storage (frame buffers) based on fast recall, on-line, or off-line.
- Computer: specially designed or custom-made computers.
- Specialized image processing hardware: front-end hardware with high speed for real-time processing with high data throughputs (e.g., digitizing and averaging video images at 30 frames/s) that the typical computer cannot handle.
- Image processing software: Adobe Photoshop, Corel Draw, Serif PhotoPlus, Matlab, Erdas Imagine, ENVI, PCI Geomatica, etc.
- Hardcopy devices: for recording images; laser printers, film cameras, heat-sensitive devices, inkjet units, and digital units.
Incoming energy is transformed into a voltage by the combination of input electrical power and sensor material.
A digital quantity is obtained from each sensor by digitizing its response.
CCD: Charge-Coupled Device
In order to generate a 2-D image using a single sensor, there has to be relative
displacements in both the x- and y-directions between the sensor and the area to be
imaged.
High-precision scanning scheme is shown in figure, where a film negative is mounted
onto a drum whose mechanical rotation provides displacement in one dimension. The
single sensor is mounted on a lead screw that provides motion in the perpendicular
direction.
The strip provides imaging elements in one direction. Motion perpendicular to the strip
provides imaging in the other direction. This is the type of arrangement used in most flat
bed scanners.
Figure: (a) Image acquisition using a linear sensor strip. (b) Image acquisition using a circular sensor strip.
Airborne and spaceborne applications use sensing devices with 4,000 or more in-line sensors.
The imaging system is mounted on an aircraft or spacecraft that flies at a constant altitude and speed over the geographical area to be imaged.
One-dimensional imaging sensor strips that respond to various bands of the electromagnetic spectrum are mounted perpendicular to the direction of flight.
The imaging strip gives one line of an image at a time, and the motion of the strip relative to the scene completes the other dimension of the image.
Figure shows individual sensors arranged in the form of a 2-D array. This is also the
predominant arrangement found in digital cameras. A typical sensor for these cameras
is a CCD array, which can be manufactured with a broad range of sensing properties
Array sensor
Define spatial and gray level resolution. Explain about iso preference curves.
Spatial and Gray-Level Resolution:
Sampling is the principal factor determining the spatial resolution of an image. Basically,
spatial resolution is the smallest discernible detail in an image. Suppose that we
construct a chart with vertical lines of width W, with the space between the lines also
having width W. A line pair consists of one such line and its adjacent space. Thus, the
width of a line pair is 2W, and there are 1/(2W) line pairs per unit distance.
A widely used definition of resolution is simply the smallest number of discernible line
pairs per unit distance; for example, 100 line pairs per millimeter.
Gray-level resolution similarly refers to the smallest discernible change in gray level.
Due to hardware considerations, the number of gray levels is usually an integer power
of 2. The most common number is 8 bits; 16 bits is used in some applications, and some systems use 10 or 12 bits.
Spatial resolution
A measure of the smallest discernible detail in an image
Stated in line pairs per unit distance, dots (pixels) per unit distance, or dots per inch (dpi)
Intensity resolution
The smallest discernible change in intensity level
Basic Relationships Between Pixels
Neighborhood
Adjacency
Connectivity
Paths
Regions and boundaries
Each element f(x,y) at location (x,y) is called a pixel.
A pixel p at coordinates (x, y) has four horizontal and vertical neighbors, the 4-neighbors of p, denoted N4(p): (x-1, y), (x+1, y), (x, y-1), (x, y+1).
Each of these neighbors is at an equal unit distance from p(x,y)
Adjacency
Two pixels p and q with values from V are m-adjacent if (i) q is in N4(p), or (ii) q is in the set ND(p) and the set N4(p) ∩ N4(q) has no pixels whose values are from V.
Two image subsets Si and Sj are adjacent if some pixel in Si is adjacent to some pixel in Sj.
Adjacency for two image regions is defined as follows: if there are two image subsets Si and Sj, we say that Si and Sj are adjacent if there exists a point p in region Si and a point q in region Sj such that p and q are adjacent.
So, consider image regions Si and Sj; if some point p in Si and some other point q in Sj are adjacent, then region Si is adjacent to region Sj. That means Si and Sj must appear one adjacent to the other. So, this is the adjacency relation.
Figure: regions Si and Sj with adjacent pixels p in Si and q in Sj.
A (digital) path (or curve) from pixel p with coordinates (x0, y0) to pixel q with
coordinates (xn, yn) is a sequence of distinct pixels with coordinates
(x0, y0), (x1, y1), ..., (xn, yn)
where (xi, yi) and (xi-1, yi-1) are adjacent for 1 ≤ i ≤ n.
Here n is the length of the path.
If (x0, y0) = (xn, yn), the path is a closed path.
We can define 4-, 8-, and m-paths based on the type of adjacency used.
Connectivity
Two pixels are said to be connected if they are adjacent in some sense, with values belonging to a set V (p, q ∈ V).
Three types of connectivity are defined:
4-connectivity: p, q ∈ V and p ∈ N4(q)
8-connectivity: p, q ∈ V and p ∈ N8(q)
m-connectivity (mixed connectivity): p, q ∈ V are m-connected if
q ∈ N4(p), or
q ∈ ND(p) and N4(p) ∩ N4(q) has no pixels whose values are from V,
where N4(p) ∩ N4(q) is the set of pixels that are 4-neighbors of both p and q.
Mixed connectivity is a modification of 8-connectivity; it eliminates the multiple path connections that arise with 8-connectivity. m-connectivity has been introduced to avoid these multiple connection paths. Recall the restriction placed on mixed connectivity: two points are m-connected if one is a 4-neighbor of the other, or one is a diagonal neighbor of the other and, at the same time, they do not have any common 4-neighbor whose value is from V.
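To make the neighborhood and adjacency definitions concrete, here is a minimal Python sketch (an illustration, not part of the original notes; the tiny dictionary image and set V are made up for the example):

```python
# Minimal sketch of pixel neighborhoods and adjacency (illustrative only).
def n4(p):
    """4-neighbors of p = (x, y): horizontal and vertical neighbors."""
    x, y = p
    return {(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)}

def nd(p):
    """Diagonal neighbors of p."""
    x, y = p
    return {(x - 1, y - 1), (x - 1, y + 1), (x + 1, y - 1), (x + 1, y + 1)}

def n8(p):
    """8-neighbors: union of 4-neighbors and diagonal neighbors."""
    return n4(p) | nd(p)

def adjacent4(img, p, q, V):
    """p and q are 4-adjacent if both values are in V and q is a 4-neighbor of p."""
    return img[p] in V and img[q] in V and q in n4(p)

def adjacent8(img, p, q, V):
    """p and q are 8-adjacent if both values are in V and q is an 8-neighbor of p."""
    return img[p] in V and img[q] in V and q in n8(p)

# Example: a tiny binary image stored as a dict {(x, y): value}.
img = {(0, 0): 1, (0, 1): 0, (1, 0): 0, (1, 1): 1}
print(adjacent8(img, (0, 0), (1, 1), V={1}))  # True: diagonal neighbors, both in V
print(adjacent4(img, (0, 0), (1, 1), V={1}))  # False: not 4-neighbors
```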
DISTANCE MEASURES
For pixels p, q, and z, D is a valid distance function or metric if:
1. D(p, q) ≥ 0, and D(p, q) = 0 if and only if p = q
2. D(p, q) = D(q, p) (symmetry)
3. D(p, z) ≤ D(p, q) + D(q, z) (triangle inequality)
So for this, let us take 3 points. We take 3 points here; p having a coordinate (x, y), q
having a coordinate (s, t) and I take another point z having the coordinate (u, v). Then D
is called a distance measure is a valid distance measure or valid distance metric. If D
(p, q) is greater than or equal to 0 for any p and q, any 2 points p and q; D (p, q) must
be greater than or equal to 0 and D (p, q) will be 0 only if p is equal to q.
So, that is quite obvious because the distance of the point from the point itself has to be
equal to 0. Then the distance metric distance function should be symmetric that is
if I measure the distance from p to q, that should be same as the distance if I measure
from q to p. That is the second property that must hold true. That is D (p, q) should be
equal to D (q, p).
And, there is a third property which is an inequality. That is if I take a third point z,
then the distance between p and z that is D (p, z) must be less than or equal to the
distance between the p and q plus the distance between q and z and this is quite
obvious; again, from school-level mathematics you know that if I have 3 points p, q and another point z, and I measure the distance between p and z, this must be no more than the distance between p and q plus the distance between q and z.
EUCLIDEAN DISTANCE
The Euclidean distance between p = (x, y) and q = (s, t) is De(p, q) = sqrt[(x - s)² + (y - t)²].
In the digital domain, there are various other distance measures, such as the city block distance and the chessboard distance.
D4(p, q) = |x - s| + |y - t|
Now, coming to the second distance measure, which is also called the D4 distance or city block distance (also known as the Manhattan distance): it is defined as D4(p, q) = |x - s| + |y - t|.
The pixels having a D4 distance ≤ r from (x, y) form a diamond centered at (x, y). Example: pixels where D4 ≤ 2.
In the case of the chessboard distance, it is the maximum of the distances covered along the x direction and the y direction.
Now, we come to the third distance measure, which is the chessboard distance. As you have seen, in the case of the city block distance the distance between two points was defined as the sum of the distances covered along the x direction and the y direction.
In the case of the chessboard distance, it is the maximum of the distances covered along the x and y directions: D8(p, q) = max(|x - s|, |y - t|).
Similarly, the set of points with a chessboard distance equal to 2 consists of the points just outside those having a chessboard distance equal to 1. Continuing like this, you will find that all the points having a chessboard distance less than or equal to r from a point p form a square with point p at its center. So, these are the different distance measures that can be used in the digital domain.
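As a quick illustration of these three metrics, the following sketch (illustrative only; the example points are made up) computes the Euclidean, city-block (D4), and chessboard (D8) distances between two pixels p = (x, y) and q = (s, t):

```python
import math

def euclidean(p, q):
    """De(p, q) = sqrt((x - s)^2 + (y - t)^2)."""
    (x, y), (s, t) = p, q
    return math.hypot(x - s, y - t)

def city_block(p, q):
    """D4(p, q) = |x - s| + |y - t|; points with D4 <= r form a diamond."""
    (x, y), (s, t) = p, q
    return abs(x - s) + abs(y - t)

def chessboard(p, q):
    """D8(p, q) = max(|x - s|, |y - t|); points with D8 <= r form a square."""
    (x, y), (s, t) = p, q
    return max(abs(x - s), abs(y - t))

p, q = (2, 3), (5, 7)
print(euclidean(p, q), city_block(p, q), chessboard(p, q))  # 5.0 7 4
```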
FROM ANALOG TO DIGITAL
Image processing includes:
- Coding/compression
- Enhancement, restoration, reconstruction
- Analysis, detection, recognition, understanding
- Digital number / raster data matrix representation
- Visualization
Explain about image sampling and quantization process.
Image Sampling and Quantization:
The output of most sensors is a continuous voltage waveform whose amplitude and
spatial behavior are related to the physical phenomenon being sensed.
To create a digital image, we need to convert the continuous sensed data into digital
form.
Sensing strip: the number of sensors in the strip establishes the sampling limitations
in one image direction; in the other: same value taken in practice
Sensing array: the number of sensors in the array establishes the limits of sampling
in both directions
SAMPLING AND QUANTIZATION
Sampling and quantization are the two important processes used to convert continuous analog image
into digital image.
Image sampling refers to discretization of spatial coordinates whereas quantization refers to
discretization of gray level values.
Normally, the sampling and quantization deal with integer values. After the image is sampled with respect to the x and y coordinates, the numbers of samples used along the x and y directions are denoted as N and M, respectively. N and M are usually integer powers of 2, and can be represented as:
M = 2^n, N = 2^k
Similarly, when we discretize the gray levels, we use integer values, and the number of integer values that can be used is denoted as G. The number of integer gray-level values used to represent an image is usually an integer power of 2.
The finer the sampling (i.e., the larger M and N) and quantization (the larger K)
the better the approximation of the continuous image function f(x,y).
Spatial frequency
Digitization chain: continuous image → Sampler → Quantizer → Digital Computer → Digital-to-Analog Converter → Display
Image Sampling
Image sampling is required to represent the image in a digital computer, in a digital form
. For this, instead of considering every possible point in the image space, we will take
some discrete set of points and those discrete set of points are decided by grid.
So, if we have a uniform rectangular grid; then at each of the grid locations, we can take
a particular point and we will consider the intensity at that particular point. So, this is the
process which is known as sampling.
An image when it is digitized will be represented in the form of a matrix like this
So, whatever process we have done during digitization; during visualization or during
display, we must do the reverse process. So, for displaying the images, it has to be first
converted into the analog signal which is then displayed on a normal display.
For an image, when you measure frequency, it has to be cycles per unit length, not cycles per unit time as in the case of a time-varying signal.
Now, in this figure we have shown that, as in the case of a signal x(t) with frequency spectrum X(ω), we say the signal x(t) is band limited if X(ω) = 0 for |ω| > ω0, where ω0 is the bandwidth of the signal x(t).
The sampled signal is
x_s(t) = Σ_{m=-∞}^{∞} x(mΔt) δ(t - mΔt)
Now, let us see what happens in the case of two-dimensional sampling, when we try to sample an image.
The original image is represented by the function f(x, y). As we have seen in the case of a 1-dimensional signal, x(t) is multiplied by comb(t; Δt); for an image, the sampling function is not a 1-D train of delta functions but a 2-D array of delta functions, where the spacing along the x direction is Δx and along the y direction is Δy.
So again, as before, the sampled image can be represented as
f_s(x, y) = Σ_m Σ_n f(mΔx, nΔy) δ(x - mΔx, y - nΔy)
where both m and n vary from minus infinity to infinity.
So, as we have done in case of 1 dimensional signal; if we want to find out the
frequency spectrum of this sampled image, then the frequency spectrum of the sampled
image fS omega x omega y will be same as f omega x omega y which is the frequency
spectrum of the original image f (x, y) which has to be convoluted with comb omega x
omega y where comb omega x omega y is nothing but the Fourier transform of comb x
y delta x delta y.
And if you compute this Fourier transform, you find that comb omega x omega y
will come in the form of omega xs omega ys comb of omega x omega y, 1 upon delta x,
1 upon delta y where this omega xs and this omega ys, omega xs is nothing but 1 upon
delta x which is the sampling frequency along the x direction and omega ys is equal to
omega 1 upon delta y which is nothing but the sampling frequency along the y direction.
IMAGE QUANTIZATION
Quantization is the mapping of a continuous variable u to a discrete variable u′.
If the input signal is u, then after quantization the quantized signal becomes u′, where u′ is one of the discrete values r1, ..., rL. So, we have L discrete values r1 to rL, and u′ takes the value of one of these variables.
These samples are discrete in the time domain. But still, every sample value is an analog value; it is not a discrete value. So, what we have done after sampling is, instead of considering all possible time instants, we have considered the signal values at some discrete time instants, and at each of these discrete time instants we get a sample value.
We have defined a number of transition levels or decision levels, given as t1, t2, t3, t4 up to tL+1, where t1 is the minimum value and tL+1 is the maximum value, and we have also defined a set of reconstruction levels rk. As shown in the previous slide, the reconstructed value u′ takes one of the discrete values rk: the quantized value will be rk if the input signal u lies between the decision levels tk and tk+1. So, this is how you do the quantization.
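A minimal sketch of a uniform quantizer (illustrative; the decision levels t_k and reconstruction levels r_k are chosen uniformly here, not by the Lloyd-Max design discussed later, and the input range is assumed):

```python
import numpy as np

def uniform_quantize(u, u_min, u_max, L):
    """Quantize samples u into L reconstruction levels r_1 .. r_L.

    Decision levels t_k are spaced uniformly over [u_min, u_max];
    each reconstruction level r_k is the midpoint of its interval."""
    step = (u_max - u_min) / L
    # Index of the decision interval each sample falls into (0 .. L-1).
    k = np.clip(np.floor((u - u_min) / step), 0, L - 1)
    # Reconstruction level: midpoint of the k-th interval.
    return u_min + (k + 0.5) * step

u = np.linspace(0.0, 1.0, 11)            # analog sample values
print(uniform_quantize(u, 0.0, 1.0, 4))  # quantized to 4 levels
```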
The basic idea behind sampling and quantization is illustrated in Figure. Figure shows a
continuous image, f(x, y), that we want to convert to digital form.
Similar is the case with an image. So here, in case of an image, the sampling is done in
2 dimensional grids where at each of the grid locations, we have a sample value which
is still analog.
Now, if I want to represent a sample value on a digital computer, then this analog
sample value cannot be represented and the quantization comes into picture.
Image quantization:
This particular figure shows that if the input signal u lies between the transition levels t1 and t2, then the reconstructed (quantized) signal takes the value r1. If the input signal lies between t2 and t3, the reconstructed signal takes the value r2. Similarly, if the input signal lies between tk and tk+1, then the reconstructed signal takes the value rk, and so on.
Figure: quantizer transfer characteristic with decision levels tk, tk+1, ..., tL+1 and reconstruction levels rk, ..., rL.
So, this staircase function shows the quantization function that will
be used and this central linear line which is inclined at an angle of 45 degree with the u
axis, this shows that what should be the ideal input output characteristics.
So, here we have shown the same figure. Here you find that whenever this green line
which is inclined at 45 degree with the u axis crosses the staircase function; at this
point, whatever is your signal value, it is same as the reconstructed value.
So, only at these crossover points will the error in the quantized signal be 0. At all other points, the error in the quantized signal will be non-zero. At the start of an interval the error is maximum and negative, and it keeps on reducing; at the crossover point it becomes 0, and beyond that point it again increases.
So, if I plot this quantization error, you find that the plot of the quantization error
will be something like this between every transition levels. So, between t1 and t2 the error
value is like this, between t2 and t3 the error continuously increases, between t3 and t4 the
error continuously increases and so on. Now, what is the effect of this error on the
reconstructed signal?
So, because from the quantized signal I can never get back the original signal; so we
are always introducing some error in the reconstructed signal which can never be
recovered and this particular error is known as quantization error or quantization noise.
Obviously, the quantization error or quantization noise will be reduced if the
quantizer step size that is the transition intervals say t k to tk plus 1 reduces; similarly, the
reconstruction step size rk to rk plus 1 that interval is also reduced
So, for quantizer design, the aim of the quantizer design will be to minimize this
quantization error. So accordingly, we have to have an optimum quantizer and this
optimum mean square error quantizer known as Lloyd-Max quantizer, this
minimizes the mean square error for a given number of quantization levels and here we
assume that let u be a real scalar random variable with a continuous probability density
function pu of u and it is desired to find the decision levels tk and the reconstruction
levels rk for an L-level quantizer which will reduce or minimize the quantization noise
or quantization error.
The quality of a digital image is determined to a large degree by the number of samples
and discrete intensity levels used in sampling and quantization.
Discrete Fourier transform in image processing ...
In image processing,
The Fourier Transform is used
in a wide range of applications, such as image analysis, image filtering,
image reconstruction and image compression
The output of the transformation represents the image in the Fourier or frequency
domain, while the input image is the spatial domain equivalent.
The number of frequencies corresponds to the number of pixels in the spatial domain
image, i.e. the image in the spatial and Fourier domain are of the same size.
On 2-D images, you are watching the waveform from a top view. You can imagine that the white parts are the tops of the wave and the black parts are the bottoms (see picture on the right). Fourier transform:
In the 2D-image Fourier transforms, there is always a peak at origin The value at origin
is the average intensity/color of the image.
Both T(u,v) and f(x,y) are 2-D periodic.
f(x,y) is the image in the spatial domain and the exponential term is the basis function
corresponding to each point F(u,v) in the Fourier space.
The equation can be interpreted as: the value of each point T(u,v) is obtained by
multiplying the spatial image with the corresponding base function and summing the
result.
T(u, v) = Σ_x Σ_y f(x, y) · (basis function)
The basis functions are sine and cosine waves with increasing frequencies, i.e. F(0,0)
represents the DC-component of the image which corresponds to the average
brightness and F(M-1,N-1) represents the highest frequency.
T(u, v) = Σ_x Σ_y f(x, y) g(x, y, u, v), where g(x, y, u, v) is the forward transformation kernel
f(x, y) = Σ_u Σ_v T(u, v) h(x, y, u, v), where h(x, y, u, v) is the inverse transformation kernel
Let us see what is that class of transformation. You will find that if we define a
transformation of this form say T (u, v) is equal to double summation f (x, y) where f (x,
y) is the 2 dimensional signal into g (x, y, u, v) where both x and y vary from 0 to capital
N minus 1. So, we are assuming that our 2 dimensional signal f (x, y) is an N by N
array, capital N by capital N array and the corresponding inverse transformation is given
by f (x, y) is equal to double summation again, we have this transformation matrix
transform coefficients T (u, v) into h (x, y, u, v) where this g (x, y, u, v) is called the
forward transformation kernel and h (x, y, u, v) is called the inverse transformation
kernel or the basis functions.
We had g(x, y, u, v) of the form e^(-j2π(ux + vy)/N), together with the multiplying term 1/N. So, this is the forward transformation kernel in the case of the 2-dimensional discrete Fourier transform or 2-D DFT.
For an N × N square image:
Forward DFT kernel: g(x, y, u, v) = (1/N) e^(-j2π(ux + vy)/N)
T(u, v) = (1/N) Σ_{x=0}^{N-1} Σ_{y=0}^{N-1} f(x, y) e^(-j2π(ux + vy)/N)
Inverse DFT kernel: h(x, y, u, v) = (1/N) e^(j2π(ux + vy)/N)
f(x, y) = (1/N) Σ_{u=0}^{N-1} Σ_{v=0}^{N-1} T(u, v) e^(j2π(ux + vy)/N)

For an M × N image:
Forward DFT kernel: g(x, y, u, v) = e^(-j2π(ux/M + vy/N))
Inverse DFT kernel: h(x, y, u, v) = (1/MN) e^(j2π(ux/M + vy/N))
Relation between spatial sample spacing and frequency intervals
The relations below express the relation between the spatial sample spacing and the frequency intervals:
Δu = 1 / (M Δx),  Δv = 1 / (N Δy)
There are various forms of the FFT and most of them restrict the size of the input image that may be transformed, often to N = 2^n, where n is an integer.
T (u,v) is the component with frequency u < M/2, v<N/2
M, N are the image sizes; f(x, y) is the pixel intensity at position (x, y).
Figure: a sample image and its Fourier spectrum.
Magnitude and Phase
The Fourier Transform produces a complex number valued output image which can be
displayed with two images, either with the real and imaginary part or with magnitude
and phase.
In image processing, often only the magnitude of the Fourier Transform is displayed, as
it contains most of the information of the geometric structure of the spatial domain
image.
The Fourier image can also be re-transformed into the correct spatial domain after
some processing in the frequency domain...(both magnitude and phase of the image
must be preserved for this). The Fourier domain image has a much greater range than
the image in the spatial domain. Hence, to be sufficiently accurate, its values are usually
calculated and stored in float values.
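The following NumPy sketch (illustrative, assuming NumPy is available; the input image here is just a random array standing in for a real picture) computes the 2-D DFT of an image, shifts the DC component to the center, and prepares the log-compressed magnitude spectrum and the phase spectrum for display:

```python
import numpy as np

# Stand-in grayscale image as a 2-D float array.
img = np.random.rand(256, 256)

F = np.fft.fft2(img)          # 2-D DFT (complex valued)
F = np.fft.fftshift(F)        # move the DC term F(0,0) to the center for display

magnitude = np.abs(F)                 # |F(u,v)|
phase = np.angle(F)                   # phi(u,v)
log_spectrum = np.log1p(magnitude)    # log(1 + |F|) compresses the dynamic range

# Scale the log spectrum to [0, 255] for an 8-bit display.
display = (255 * log_spectrum / log_spectrum.max()).astype(np.uint8)
```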
A complex value is expressed by two real values in
either Cartesian or polar coordinate space.
Cartesian: R(u,v) is the real and I(u, v) the
imaginary component
Polar: |F(u,v)| is the magnitude and phi(u,v) the
phase
Notes on the magnitude spectrum:
Magnitudes are generally referred to as the
spectrum but this should be understood as the
magnitude spectrum.
Typically has an extremely large dynamic range and
it is typical to log-compress those values for
display
For presentation, the DC component, F(0,0), is
placed at the center. Low frequency components
are shown near the center and frequency increases
with distance from center.
The magnitude spectrum contains information about the
shape of objects.
A strong edge in the source will generate a
strong edge in the magnitude spectrum
(rotated 90 degrees)
The phase spectrum contains information about their
actual location in the source.
An image of lots of Q's will have the same magnitude spectra but not the same phase spectra.
Properties of the 2-D Fourier transform
Display
The dynamic range of the Fourier spectrum is generally higher than can be displayed, so it is common to display
s = c log(1 + |F(u, v)|)
where c is a scaling factor and the logarithm function performs a compression of the data. c is usually chosen to scale the data into the range of the display device, typically [0-255] ([1-256] for a 256 gray-level MATLAB image).
Now, these transformations, this class of transformation will be separable if we can write
g (x, y, u, v) in the form g1 (x, u) into g2 (y, v). So, if g (x, y, u, v) can be written in the
form g1 (x, u) into g2 (y, v), then this transformation will be a separable transformation.
So, in that case, these class of transformations will be separable obviously because g
(x, y, u, v) we have written as product of 2 functions : g1 (x, u) into g 2 (y, v). And since
g1 (x, u) and g2 (y, v); so this function g1 and g2, they are functionally same, so this I can
write as g1 (x, u) into g1 (y, v) and in this case, the function will be called as symmetric.
called separable as well as symmetric and the same is also true for the inverse
transformation kernel that is h (x, y, u, v).
g(x, y, u, v) = g1(x, u) g2(y, v)   (separable)
g(x, y, u, v) = g1(x, u) g1(y, v)   (separable and symmetric)
2D DFT can be implemented as a series of 1D DFTs along each column, followed
by 1D DFTs along each row.
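A small NumPy check of this separability property (illustrative only): applying 1-D DFTs along the columns and then along the rows gives the same result as the 2-D DFT.

```python
import numpy as np

f = np.random.rand(8, 8)

# 2-D DFT computed directly.
F_direct = np.fft.fft2(f)

# 2-D DFT computed as 1-D DFTs along columns, then 1-D DFTs along rows.
F_separable = np.fft.fft(np.fft.fft(f, axis=0), axis=1)

print(np.allclose(F_direct, F_separable))  # True
```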
Translation /Modulation/shifting
Translation of the source will cause the phase spectrum to change but leave the
magnitude spectrum unchanged since the phase spectrum encodes location
TU
information while the magnitude spectrum encodes shape information.
Conclusion: the origin of the Fourier transform can be moved to the center of the frequency square by multiplying the original function by (-1)^(x+y).
The Fourier transform (and its inverse) are distributive over addition but not over
multiplication . So,
Rotation of the source corresponds to an identical rotation of the magnitude and phase
spectra
Periodicity of the Fourier transform
The discrete Fourier transform (and its inverse) are periodic with period N.
T(u,v) = T(u+N,v) = T(u,v+N) = T(u+N,v+N)
Although T(u,v) repeats itself infinitely for many values of u and v, only N
values of each variable are required to obtain f(x,y) from T(u,v)
i.e. Only one period of the transform is necessary to specify T(u,v) in the frequency
domain.
Similar comments may be made for f(x,y) in the spatial Domain
T(u + N, v + N) = T(u, v)
f(x + N, y + N) = f(x, y)
Conjugate
If f(x,y) is real (true for all of our cases), the Fourier transform exhibits conjugate symmetry:
F(u,v) = F*(-u,-v), or, the more interesting, |F(u,v)| = |F(-u,-v)|
Scale
The DFT coefficients produced by the 2D DFT equations here, are arranged in
Feature Extraction
-Edge Detection
-Corner detection, etc
The discrete Fourier transformation is a special case of a class of separable transformations.
Forward transformation kernel g(x, y, u, v):
T(u, v) = Σ_x Σ_y f(x, y) g(x, y, u, v)
Inverse transformation kernel h(x, y, u, v):
f(x, y) = Σ_u Σ_v T(u, v) h(x, y, u, v)
Separable: g(x, y, u, v) = g1(x, u) g2(y, v)
Separable and symmetric: g(x, y, u, v) = g1(x, u) g1(y, v)
Forward transformation kernel and Inverse transformation kernel:
2-dimensional discrete Fourier transformation:
Forward transformation kernel: g(x, y, u, v) = (1/N) e^(-j2π(ux + vy)/N)
Inverse transformation kernel: h(x, y, u, v) = (1/N) e^(j2π(ux + vy)/N)
Both kernels are separable and symmetric: g(x, y, u, v) = g1(x, u) g1(y, v).
DISCRETE COSINE TRANSFORM (DCT)
There are several variants of the DCT; they vary in minor details. The most popular is the DCT-II, also known as the even symmetric DCT, or simply as "the DCT". The DCT is not complex but real valued.
g(x, y, u, v) = α(u) α(v) cos[(2x + 1)uπ / 2N] cos[(2y + 1)vπ / 2N]
h(x, y, u, v) = g(x, y, u, v)
α(u) = sqrt(1/N) if u = 0, α(u) = sqrt(2/N) if u = 1, 2, ..., N-1
α(v) = sqrt(1/N) if v = 0, α(v) = sqrt(2/N) if v = 1, 2, ..., N-1
So, you find that in case of discrete cosine transformation, if you analyze this,
you find that both the forward transformation kernel and also the inverse transformation
kernel, they are identical and not only that, these transformation kernels are separable
as well as symmetric because in this ; g1 (x, u) equal to alpha u cosine twice x plus 1
u pi divided by twice N and g1 (y, v) can be alpha times v into cosine 2y plus 1 v pi upon
twice N.
So, this transformation that is discrete cosine transformation is separable as well
as symmetric and the inverse forward inverse transformation kernel and the forward
transformation kernel, they are identical.
Now, we have to see what the values of alpha u and alpha v. Here, alpha u is
given by square root of 1 upon capital N where u is equal to 0 and it is equal to square
root of twice by capital N for values of u equal to 1, 2 to capital N minus 1.
Let us see how the basis functions or basis images look in the case of the discrete cosine transform.
Now, using these kernels, we can write the expressions for the 2-dimensional discrete cosine transformation in the form of C(u, v):
C(u, v) = α(u) α(v) Σ_{x=0}^{N-1} Σ_{y=0}^{N-1} f(x, y) cos[(2x + 1)uπ / 2N] cos[(2y + 1)vπ / 2N]
f(x, y) = Σ_{u=0}^{N-1} Σ_{v=0}^{N-1} α(u) α(v) C(u, v) cos[(2x + 1)uπ / 2N] cos[(2y + 1)vπ / 2N]
Now, you find that there is one difference in case of forward discrete cosine
transformation. The terms alpha (u) and alpha (v) were kept outside the summation,
double summation whereas incase of inverse discrete cosine transformation, the terms
alpha (u) and alpha (v) are kept inside the double summation. The reason being, in
case of forward transformation because the summation is taken over x and y varying
from 0 to capital N minus 1, so alpha (u) and alpha (v), these terms are independent of
this summation operation whereas, in case of inverse discrete cosine transformation,
the double summation is taken over u and v varying from 0 to capital N minus 1. So, this
terms alpha (u) and alpha (v) are kept inside the double summation
operation.
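A short sketch (assuming SciPy is available; the smooth test image is made up for the example) of the orthonormal 2-D DCT-II, which matches the α(u), α(v) scaling above, and of its energy compaction: keeping only the low-frequency corner of the coefficients still reconstructs a smooth image well.

```python
import numpy as np
from scipy.fft import dctn, idctn

# Smooth test image (a gradient), so most energy sits in the low frequencies.
x = np.linspace(0.0, 1.0, 64)
img = np.outer(x, x)

# Orthonormal 2-D DCT-II: same alpha(u), alpha(v) scaling as in the formulas above.
C = dctn(img, norm='ortho')

# Energy compaction: keep only the 16x16 low-frequency block near (0, 0).
C_kept = np.zeros_like(C)
C_kept[:16, :16] = C[:16, :16]

recon = idctn(C_kept, norm='ortho')
print(np.mean((img - recon) ** 2))  # small reconstruction error from few coefficients
```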
Now, if you closely look at this output coefficients, you find that in case of
discrete cosine transformation, the energy of the coefficients are concentrated mostly in
a particular region where the coefficients are near the origin that is (0, 0) which is more
visible in the case of a 3 dimensional plot. So, you find that here in this particular case,
the energy is concentrated in a small region in the coefficients space near about the (0,
0) coefficients. So, this is a very very important property of the discrete cosine
transformation which is called energy compaction property.
DFT vs. DCT comparison:
- DFT: separable as well as symmetric. DCT: separable as well as symmetric.
- DFT: FFT for faster implementation. DCT: fast implementation of the discrete cosine transformation (FDCT).
- DFT: the discrete Fourier transform is periodic with period N, where N is the number of samples. DCT: the magnitudes of the coefficients are periodic with a period of 2N, where N is the number of samples; the periodicity is not the same as in the discrete Fourier transformation. The period in the DCT case is twice that in the DFT case, and we will see later that this particular property helps obtain smoother data compression using the discrete cosine transformation rather than the discrete Fourier transformation.
- DFT: energy is concentrated at the centre. DCT: energy compaction property, because most of the signal or image energy is concentrated in a very few coefficients near the origin, the (0, 0) value in the frequency domain (u, v) plane.
- DFT: complex. DCT: real.
WALSH TRANSFORM
For N = 2^n samples, the Walsh transform kernel can be written as
g(x, u) = (1/N) Π_{i=0}^{n-1} (-1)^(b_i(x) b_{n-1-i}(u))
where N is the number of samples, n is the number of bits for x or u, and b_k(x) is the kth bit in the digital binary representation of x.
The Walsh transform is separable and symmetric. Its energy compaction is not as good as that of the DCT.
Walsh functions are orthogonal and have only +1 and -1 values. In general, the Walsh transform can be
generated by the Hadamard matrix as follows:
Walsh ordering
Very often, instead of the Hadamard transform, we use a reordered form of the Hadamard transform (this reordering is called Walsh ordering, and sometimes the corresponding transform is called the Walsh transform).
Reordering is performed in such a manner that the row of the Hadamard transform matrix with the smallest number of sign changes (from 1 to -1 and from -1 to 1) is put at the top of the transform matrix, and the rows below are ordered in increasing order of sign changes.
The Hadamard and Walsh transforms are equivalent to each other, but the Walsh ordering has an analogy with the sinusoidal transforms in that increasing frequencies are represented by the corresponding coefficients.
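A minimal sketch (assuming SciPy and NumPy are available) that builds a Hadamard matrix and reorders its rows by the number of sign changes, which is the Walsh (sequency) ordering described above:

```python
import numpy as np
from scipy.linalg import hadamard

H = hadamard(8)  # 8x8 Hadamard matrix with +1/-1 entries

def sign_changes(row):
    """Number of transitions from +1 to -1 or -1 to +1 along the row."""
    return int(np.sum(row[:-1] != row[1:]))

# Walsh ordering: sort rows by increasing number of sign changes (sequency).
order = np.argsort([sign_changes(r) for r in H])
W = H[order]

print([sign_changes(r) for r in W])  # [0, 1, 2, 3, 4, 5, 6, 7]
```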
Haar transform
T = H F H^T
The Haar transform matrix is real and orthogonal: H^(-1) = H^T.
For the Haar transform, H contains the Haar basis functions h_k(z). They are defined over the continuous closed interval 0 ≤ z ≤ 1, with the index decomposed as k = 2^p + q - 1 when k > 0, z ∈ [0, 1].
h_0(z) = h_{0,0}(z) = 1/sqrt(N), for z ∈ [0, 1]
h_k(z) = h_{p,q}(z) = (1/sqrt(N)) · 2^(p/2)   for (q - 1)/2^p ≤ z < (q - 0.5)/2^p
h_k(z) = h_{p,q}(z) = -(1/sqrt(N)) · 2^(p/2)  for (q - 0.5)/2^p ≤ z < q/2^p
and 0 otherwise, for z ∈ [0, 1].
From the above equations, one can see that p determines the amplitude and width of the non-zero part of the function, while q determines the position of the non-zero part of the Haar function.
For N = 2, the first row of the 2×2 matrix H2 is computed using h_0(z), which is equal to 1/sqrt(2) for both elements.
For N = 2, the second row of H2 is computed using h_1(z) for z = 0/2, 1/2.
h_1(z) for z = 0/2 is 1/sqrt(2), and for z = 1/2 it is -1/sqrt(2), so:
H2 = (1/sqrt(2)) [ 1   1
                   1  -1 ]
H4 = (1/sqrt(4)) [ 1        1        1        1
                   1        1       -1       -1
                   sqrt(2) -sqrt(2)  0        0
                   0        0        sqrt(2) -sqrt(2) ]
KLT or HOTELLING TRANSFORM (PCA)
In statistics, principal components analysis (PCA) is a technique to simplify a dataset;
more formally it is a transform used for reducing dimensionality in a dataset while
retaining those characteristics of the dataset that contribute most to its variance. These
characteristics may be the 'most important', but this is not necessarily the case,
depending on the application.
PCA (principal component analysis) is also called the Karhunen-Loève transform or the Hotelling transform.
The KLT analyzes a set of vectors or images, into basis functions or images where the
choice of the basis set depends on the statistics of the image set - depends on image
co-variance matrix.
Consider a set of vectors (corresponding for instance to rows of an image)
x = [x1, x2, x3, ...]^T
The mean vector of the population is m_x = E{x}, where E{·} is the expectation or mean operator.
The covariance matrix is C_x = E{(x - m_x)(x - m_x)^T}, which equals the expected outer product of the vector x - m_x with itself. Hence, if x is a length-N vector, C_x is an N × N matrix.
It can be shown that the eigenvectors with the largest eigenvalues correspond to the dimensions that have the strongest correlation in the dataset. The original measurements are finally projected onto the reduced vector space.
Let A be a matrix whose rows are formed from the eigenvectors of C_x.
The first row of A is the eigenvector corresponding to the largest eigenvalue, and the last row is the eigenvector corresponding to the smallest eigenvalue.
Suppose A is the transformation matrix that maps the vectors x into vectors y by using the following transformation:
y = A (x - m_x)
The above transform is called the Karhunen-Loève or Hotelling transform.
Disadvantages:
KLT depends on the covariance of the data and therefore needs to be computed for
every image
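A minimal NumPy sketch (illustrative; the data matrix is random, standing in for rows of an image) of the Hotelling/KL transform: estimate the mean and covariance, take the eigenvectors of C_x as the rows of A ordered by decreasing eigenvalue, and map each x to y = A(x - m_x).

```python
import numpy as np

# Each row of X is one observation vector x (e.g., a row of an image).
X = np.random.rand(100, 8)

mx = X.mean(axis=0)                       # mean vector m_x
Cx = np.cov(X, rowvar=False)              # covariance matrix C_x (8 x 8)

# Eigen-decomposition; sort eigenvectors by decreasing eigenvalue.
eigvals, eigvecs = np.linalg.eigh(Cx)
order = np.argsort(eigvals)[::-1]
A = eigvecs[:, order].T                   # rows of A = eigenvectors of C_x

# Hotelling / KL transform: y = A (x - m_x) for every vector.
Y = (X - mx) @ A.T

# Keeping only the first k rows of A gives the dimensionality-reduced PCA.
k = 3
Y_reduced = (X - mx) @ A[:k].T
```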
DIGITAL IMAGE PROCESSING
UNIT 2: IMAGE ENHANCEMENT - SPATIAL DOMAIN
Process an image so that the result will be more suitable than the
original image for a specific application.
Highlighting interesting detail in images
Removing noise from images
Making images more visually appealing
So, a technique for enhancement of x-ray image may not be the best for
enhancement of microscopic images.
g(x, y) = T[f(x, y)]; in the simplest case, g at (x, y) depends only on the value of f at (x, y).
The operator T applied on f (x, y) may be defined over some neighborhood of (x,y)
(i) A single pixel (x, y) . In this case T is a gray level transformation (or mapping)
function.
(ii) Some neighborhood of (x, y) .
(iii) T may operate on a set of input images instead of a single image.
the neighborhood of a point (x, y) is usually a square sub image which is centered at
point (x, y).
Mask/Filter
The neighborhood of a point (x, y) can be defined by using a square/rectangular (commonly used) or circular subimage area centered at (x, y).
The center of the subimage is moved from pixel to pixel, starting at the top left corner.
Spatial Processing:
> Image thresholding
> Image negatives
> Log transformations
> Power-law transformations
Piecewise-linear transformation functions:
> Contrast stretching
> Gray-level slicing
> Bit-plane slicing
Mask processing / spatial filtering:
> Spatial filters
> Smoothing filters / low-pass filters
> Median filters
In spatial domain operations, the enhancement techniques work directly on the image
pixels and then these spatial domain operations can have 3 different forms.
One is the point processing,
other one is the histogram based processing techniques and(histogram based
processing technique is also a form of point processing technique.)
the third one is mask processing techniques.
Point Processing OR Gray level
transformation
The smallest possible neighbourhood size is 1x1.
The simplest spatial domain operations occur when the neighbourhood is simply the
pixel itself. In this case T is referred to as a gray level transformation function or a
point processing operation
s = T(r), where s refers to the processed image pixel value, r refers to the original image pixel value (the gray level of the pixel), and T is the gray-level (or intensity or mapping) transformation function.
Contrast Stretching
Produce higher contrast than the
original by
darkening the levels below m in the
original image
Brightening the levels above m in
the original image
Thresholding (piece wise linear
transformation)
Produce a two-level (binary) image
From a display point of view, image grey-level values are usually in the [0, 255] range, where 0 is black and 255 is white. These are L levels, where L = 256 = 2^8 = 2^k.
There is no reason why we have to use this range, and grey levels can also be assumed to be in the normalized range [0.0, 1.0].
This transformation would produce a higher contrast image than the original by
darkening the intensity levels below m and brightening the levels above m
Values of r lower than m are darkened and more than m are brightened.
In case of thresholding Operation, T(r) produces a two level image.
Intensity Transformations and Spatial Filtering
Basics
May be linear or nonlinear
Linear filters
Lowpass: Attenuates (or eliminate) high frequency components such
as characterized by edges and sharp details in an image. Removes
noise. Net effect is image blurring
Highpass: Attenuate (or eliminate) low frequency components such as slowly
varying characteristics
Net effect is a sharpening of edges and other details
Bandpass: Attenuate (or eliminate) a given frequency range
Used primarily for image restoration (are of little interest for
image enhancement)
The value of the mask coefficients determine the nature of the process
Used in techniques
Image Sharpening
Image Smoothing
Spatial filtering
Generally involves operations over the entire image
Operations take place involving pixels within a neighborhood of a point of interest
Also involves a predefined operation called a spatial filter
The spatial filter is also commonly referred to as:
Spatial mask
Kernel
Template
Window
Spatial domain: Image Enhancement
Logarithmic transformation (log and inverse log transformations)
Power law transforms (nth power and nth root transformations)
Linear function
Negative and identity
transformations
Logarithm function
Log and inverse-log transformation
Power-law function
nth power and nth root transformations
Image Negatives
Here, we consider that the digital image has L intensity levels, represented from 0 to L - 1 in steps of 1.
The negative of a digital image is obtained by the transformation function s = T(r) = (L - 1) - r, where L is the number of grey levels.
The idea is that the intensity of the output image decreases as the intensity of the
input increases.
This is useful in numerous applications such as displaying medical images.
So, we find that whenever r is equal to 0, then s will be equal to L minus 1 which is the
maximum intensity value within our digital image and when r is equal to capital L
minus 1, that is, the maximum intensity value in the original image; in that case, s will be equal to 0. So, the maximum intensity value in the original image will be converted to the minimum intensity value in the processed image, and the minimum intensity value in the original image will be converted to the maximum intensity value in the processed image.
s = T(r) = (L - 1) - r
Negative images are useful for enhancing white or grey detail embedded in dark regions of an image.
s = intensity_max - r = (L - 1) - r
The log transformation maps a narrow range of low input grey level values into a
wider range of output values. The inverse log transformation performs the opposite
transformation
s = log(1 + r)
s = c log (1+r)
c is a constant and r ≥ 0
Log curve maps a narrow range of low gray-level values in the input
image into a wider range of output levels.
Used to expand the values of dark pixels in an image
while compressing the higher level values.
It compresses the dynamic range of images with large variations in pixel values. Example of an image with a large dynamic range: a Fourier spectrum image. It can have an intensity range from 0 to 10^6 or higher. We can't see the significant degree of detail, as it will be lost in the display.
Identity Function
A cathode ray tube (CRT), for example, converts a video signal to light in a nonlinear way. The light intensity is proportional to a power (γ) of the source voltage V_S.
For a computer CRT, γ is about 2.2.
Viewing images properly on monitors requires gamma correction.
Power-law transformations have the following form:
s = c · r^γ
where c and γ are positive constants (also written s = r^γ when c = 1).
We usually set c to 1. Grey levels must be in the range [0.0, 1.0].
Sometimes it is also written as s = c (r + ε)^γ, and this offset ε is to provide a measurable output even when input values are zero.
When γ is reduced too much, the image begins to lose contrast, to the point where the image starts to have a very slight washed-out look, especially in the background.
Figure: (a) The image has a washed-out appearance; it needs a compression of gray levels, i.e., γ > 1. (b) Result after power-law transformation with γ = 3.0 (suitable). (c) Transformation with γ = 4.0 (suitable). (d) Transformation with γ = 5.0 (high contrast).
Intensity-level slicing
Bit-plane slicing
Contrast Stretching
Low-contrast images occur often due to poor or non-uniform lighting conditions, or due to nonlinearity or the small dynamic range of the imaging sensor.
Purpose of contrast stretching is to process such images so that the dynamic range of
the image will be very high, so that different details in the objects present in the image
will be clearly visible. Contrast stretching process expands dynamic range of intensity
levels in an image so that it spans the full intensity range of the recording medium or
display devices.
Control points (r1, s1) and (r2, s2) control the shape of the transform T(r), which is single-valued and monotonically increasing.
If (r1, s1) = (rmin, 0) and (r2, s2) = (rmax, L-1), where rmin and rmax are the minimum and maximum levels in the image, the transformation stretches the levels linearly from their original range to the full range (0, L-1).
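A sketch of this piecewise-linear contrast stretching (illustrative; the control-point values and the random low-contrast image are made up, and np.interp is used to build the monotonically increasing piecewise-linear T(r)):

```python
import numpy as np

def contrast_stretch(img, r1, s1, r2, s2, L=256):
    """Piecewise-linear stretch through (0,0), (r1,s1), (r2,s2), (L-1,L-1)."""
    r_pts = [0, r1, r2, L - 1]
    s_pts = [0, s1, s2, L - 1]
    return np.interp(img, r_pts, s_pts)

img = np.random.randint(80, 150, size=(4, 4))            # a low-contrast image
stretched = contrast_stretch(img, r1=80, s1=0, r2=150, s2=255)
```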
Gray Level (Intensity level) Slicing
Purpose: Highlight a specific range of gray values
But what difference we are going to have is the difference of intensity values that we
have in the original image and the difference of intensity values we get in the process
image. That is what gives us the enhancement
(a) transformation highlights range [A,B] of gray level and reduces all others to a
constant level
(b) transformation highlights range [A,B] but preserves all other levels
Used to highlight a specific range of intensities in an image that might
be of interest
Set all pixel values within a range of interest to one value (white) and
all others to another value (black)
Produces a binary image
That means, Display high value for range of interest, else low value (discard
background)
Brighten (or darken) pixel values in a range of interest and leave all others
Unchanged. That means , Display high value for range of interest, else original
value (preserve background)
So, in such cases, for enhancement, what we can use is the gray level slicing operation
and the transformation function is shown over here. Here, the transformation function on
the left hand side says that for the intensity level in the range A to B, the image will be
enhanced for all other intensity levels, the pixels will be suppressed.
On the right hand side, the transformation function shows that again within A and B,
the image will be enhanced but outside this range, the original image will be retained
and the results that we get is something like this.
Bit Plane Slicing
Only by isolating particular bits of the pixel values in a image we can highlight
interesting aspects of that image.
High-order bits contain most of the significant visual information.
Pixels are digital values composed of bits
For example, a pixel in a 256-level gray-scale image is comprised of 8 bits
We can highlight the contribution made to total image appearance by specific bits
For example, we can display an image that only shows the contribution of
a specific bit plane
I(i, j) = Σ_{n=1}^{N} 2^(n-1) I_n(i, j)
0 to 127 can be mapped as 0, and 128 to 255 can be mapped as 1.
For an 8 bit image, the above forms a binary image. This occupies less storage space.
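A short bit-plane slicing sketch for an 8-bit image (illustrative; the random test image is made up), extracting plane n and recombining all planes via I(i, j) = Σ 2^(n-1) I_n(i, j):

```python
import numpy as np

img = np.random.randint(0, 256, size=(4, 4), dtype=np.uint8)

# Extract bit plane n (n = 1 is the least significant bit, n = 8 the most significant).
def bit_plane(img, n):
    return (img >> (n - 1)) & 1

planes = [bit_plane(img, n) for n in range(1, 9)]

# Recombine: I(i, j) = sum over n of 2^(n-1) * I_n(i, j).
recombined = sum((p.astype(np.uint16) << (n - 1)) for n, p in enumerate(planes, start=1))
print(np.array_equal(recombined, img))  # True
```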
HISTOGRAM
Spreading out the histogram frequencies in an image (or equalising the image) is a
simple way to improve dark or washed out images
Image statistics
Image compression
Image segmentation
Histogram processing. Definition of the histogram of an image.
By processing (modifying) the histogram of an image we can create a new image with
specific desired properties.
In this particular case, because we are considering discrete images, the histogram function h(r_k) is also discrete: h(r_k) = n_k. Here, r_k is a discrete intensity level, n_k is the number of pixels having intensity level r_k, and h(r_k), which is the same as n_k, also assumes discrete values.
The histogram represents the frequency of occurrence of the various grey levels in
the image. A plot of this function for all values of k provides a global description of the
appearance of the image.
NORMALISED HISTOGRAM
So, instead of taking the simple histogram just defined, we sometimes take a normalized histogram. A normalized histogram is very easily derived from the original histogram and is represented as p(r_k) = n_k / n, where n = N².
As before, n_k is the number of pixels having intensity value r_k and n is the total number of pixels in the digital image. From this expression, p(r_k) tells you the probability of occurrence of a pixel having intensity value equal to r_k, and such histograms give, as we said, a global description of the image.
Suppose we have a digital image of size N × N with grey levels in the range [0, L - 1]. The histogram of the image is defined as the following discrete function:
p_r(r_k) = n_k / N²
where r_k is the kth grey level, k = 0, 1, ..., L - 1, and n_k is the number of pixels in the image with grey level r_k.
One approach is image enhancement using image subtraction operation and the
other approach is image enhancement using image averaging operation. So, first let
us start discussion on histogram processing and before that let us see that what we
mean by the histogram of an image.
Four basic image types and their corresponding histograms
Dark
Light
Low contrast
High contrast
When the dynamic range of the image is concentrated on the lower side of the gray scale, the image will be a dark image.
When the dynamic range of an image is biased towards the high side of the gray scale, the image will be a bright (light) image.
An image with a low contrast has a dynamic range that will be narrow and
concentrated to the middle of the gray scale. The images will have dull or washed out
look.
When the dynamic range of the image is significantly broad, the image will have a
high contrast and the distribution of pixels will be near uniform.
Figure: a dark image and its histogram (n_k versus r_k).
So, here we find that the first image as you see that it is a very very dark image. It is
very difficult to find out what is the content of this particular image and if we plot the
histogram of this particular image, then the histogram is plotted on the right hand side.
You find that this plot says that most of the pixels of this particular image have
intensity values which are near to 0.
Figure: a bright image and its histogram.
Here, you find that this image is very bright and if you look at the histogram of this
particular image; you find that for this image, the histogram shows that most of the
pixels of this image have intensity values which are near to the maximum that is near
value 255 and because of this the image becomes very bright.
Let us come to a third image category where the pixel values cover wide range of
Intensity scale and close to uniformly distributed and has high dynamic range. This
can be achieved through image transformations.
Figure: a low-contrast image and its histogram.
Figure: a high-contrast image and its histogram.
So, when we talk about histogram-based processing, most of the histogram-based image enhancement techniques try to improve the contrast of the image.
Now, when we talk about these histogram-based techniques, the histogram just gives a global description of the image. It does not tell you anything about the content of the image, and that is quite obvious in these cases: just by looking at the histogram, we cannot say what the content of the image is.
Histogram Equalization or Image
equalization or Histogram Linearization
let us see that how these histograms can be processed to enhance the images. So,
the first one that we will talk about is the image equalization or histogram equalization
operation.
spreading the histogram out to be approximately uniformly distributed
The gray levels of an image that has been subjected to histogram equalization are
spread out and always reach white
The increase of dynamic range produces an increase in contrast
For images with low contrast, histogram equalization has the adverse effect of
increasing visual graininess
An output intensity level s is produced for every pixel in the input image having
intensity r
We assume
T(r) is monotonically increasing in the interval 0 ≤ r ≤ L-1
0 ≤ T(r) ≤ L-1 for 0 ≤ r ≤ L-1
Then T(r) should be strictly monotonically increasing, so that the output stays in the same range as the input and the mapping is one-to-one (each s corresponds to a unique r).
Histogram equalization (histogram linearization) requires construction of a transformation function:
s_k = T(r_k) = Σ_{j=0}^{k} n_j / (MN),   k = 0, 1, ..., L-1
where r_k is the kth gray level, n_k is the number of pixels with that gray level, M×N is the number of pixels in the image, and k = 0, 1, ..., L-1.
This yields an s with as many elements as the original image's histogram (normally 256 for 8-bit images). The values of s will be in the range [0, 1]. For constructing a new image, s would be scaled to the range [1, 256].
Thus the processed output image is obtained by mapping each pixel in the input image with intensity r_k into a corresponding pixel with level s_k in the output image.
It is clearly seen that:
Histogram equalization distributes the gray levels so that they reach the maximum gray level (white), because the cumulative distribution function equals 1 at the top of the range 0 ≤ r ≤ L-1.
If the cumulative numbers of gray levels are only slightly different, they will be mapped to slightly different (or the same) gray levels, since the processed gray levels of the output image must be rounded to integers. Thus the discrete transformation function cannot guarantee a one-to-one mapping.
Now, the first condition is very important because it maintains the order of the gray levels in the processed image. That is, a pixel which is darker in the original image should remain darker in the processed image, and a pixel which is brighter in the original image should remain brighter in the processed image.
So, the intensity ordering does not change in the processed image, and that is guaranteed by the first condition, namely that T(r) is single valued and monotonically increasing over the range of values of r (0 to 1 for normalized intensities).
The second condition, that 0 ≤ T(r) ≤ 1 (for normalized intensities), ensures that the processed image does not contain pixel values outside the allowed range, i.e. values beyond black or beyond white.
So, for the discrete formulation, what we have seen earlier is that p_r(r_k) = n_k / n, where n_k is the number of pixels having intensity value r_k and n is the total number of pixels in the image. A plot of p_r(r_k) for all values of r_k gives us the histogram of the image.
So, the technique to obtain histogram equalization, and by that the image enhancement, is: first we find the cumulative distribution function (CDF) of r_k, which gives
s_k = T(r_k) = Σ_{i=0}^{k} p_r(r_i) = Σ_{i=0}^{k} n_i / n.
The inverse of this is r_k = T^(-1)(s_k) for 0 ≤ s_k ≤ 1.
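As a rough illustration of the discrete procedure above, here is a minimal NumPy sketch (the function name histogram_equalize and the assumption of an 8-bit input are ours, not part of the notes):

import numpy as np

def histogram_equalize(img):
    # Histogram equalization of an 8-bit grayscale image (2-D uint8 array)
    L = 256
    hist = np.bincount(img.ravel(), minlength=L)   # n_k: pixels with intensity r_k
    cdf = np.cumsum(hist) / img.size               # T(r_k) = sum_{i<=k} n_i / n, in [0, 1]
    s = np.round((L - 1) * cdf).astype(np.uint8)   # scale s_k to [0, L-1]
    return s[img]                                  # map every pixel r_k -> s_k

Calling histogram_equalize on a dark or low-contrast image spreads its histogram over the full intensity range, as described above.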
Enhancement Using Arithmetic/Logic Operations
Algebraic operations:
Addition
Subtraction
Multiplication
Division
Logical operations:
AND
OR
NOT
XOR
Depending on the hardware and/or software being used, the actual mechanics of implementing arithmetic/logic operations can be done sequentially, one pixel at a time, or in parallel, where all operations are performed simultaneously.
The AND and OR operations are used for masking; that is, for selecting subimages in an image, as above.
Arithmetic operations
Addition, subtraction, multiplication and division:
S(x, y) = f(x, y) + g(x, y)
D(x, y) = f(x, y) - g(x, y)
P(x, y) = f(x, y) x g(x, y)
V(x, y) = f(x, y) / g(x, y)
The images are to be of the same size, with x = 0, 1, 2, ..., M-1 and y = 0, 1, 2, ..., N-1.
Addition:
Image averaging will reduce the noise. Images are to be registered before adding.
An important application of image averaging is in the field of astronomy, where imaging with very low light levels is routine, causing sensor noise frequently to render single images virtually useless for analysis.
As K increases, the variability (noise) of the pixel values at each location (x, y) decreases.
In practice, the images gi(x, y) must be registered (aligned) in order to avoid the introduction of blurring and other artifacts in the output image.
Subtraction
A frequent application of image subtraction is in the enhancement of differences
between images. Black (0) values in the difference image indicate locations where there is no difference between the images.
One of the most commercially successful and beneficial uses of image subtraction is
in the area of medical imaging called mask mode radiography
g(x, y) = f(x, y) - h(x, y)
Take an image and set its least significant bit plane to zero. The difference of these two images is an enhanced image showing the contribution of that bit plane.
Example: digital angiography. A live image (taken with contrast fluid injected) is subtracted from a mask image; the difference is useful to identify blocked fine blood vessels.
The difference of two 8-bit images can range from -255 to 255, and the sum of two images can range from 0 to 510.
Given an image f(x, y), fm = f - min(f) creates an image whose minimum value is zero. Then fs = K [ fm / max(fm) ] is a scaled image whose values range from 0 to K; for an 8-bit image K = 255. (This performs a contrast stretching transformation.)
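A small sketch of the subtraction-plus-scaling step described above (illustrative only; it assumes 8-bit inputs that are not identical everywhere):

import numpy as np

def subtract_and_rescale(f, h, K=255):
    # Difference image mapped back to the displayable range [0, K]
    d = f.astype(np.int16) - h.astype(np.int16)   # raw difference, range -255 .. 255
    fm = d - d.min()                              # f_m = f - min(f): minimum becomes 0
    fs = K * (fm / fm.max())                      # f_s = K * f_m / max(f_m)
    return fs.astype(np.uint8)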
(Figure: mask mode radiography - a live image, taken after injection of a contrast medium (iodine) into the bloodstream, differenced with the mask image.)
Image multiplication and division
Multiplication and division are used in shading correction:
g(x, y) = f(x, y) x h(x, y)
Another use of multiplication is Region Of Interest (ROI) selection: multiplication of a given image by a mask image that has 1s in the ROI and 0s elsewhere. There can be more than one ROI in the mask image.
NOT: the output pixels are the set of elements not in A (all elements in A become zero and the others become 1).
AND: the output is the set of coordinates common to A and B.
OR: the output pixels belong to either A or B or both.
Image averaging
Suppose that we have an image f(x, y) of size M x N pixels corrupted by noise n(x, y), so that we observe the noisy image
g(x, y) = f(x, y) + n(x, y).
For the noise process n(x, y) the following assumptions are made.
(i) The noise process n(x, y) is ergodic.
(ii) It is zero mean, i.e.
E[n(x, y)] = (1/MN) Σ_{x=0}^{M-1} Σ_{y=0}^{N-1} n(x, y) = 0.
(iii) It is white, i.e. the autocorrelation function of the noise process, defined as
R[k, l] = E{ n(x, y) n(x+k, y+l) } = (1 / ((M-k)(N-l))) Σ_{x=0}^{M-1-k} Σ_{y=0}^{N-1-l} n(x, y) n(x+k, y+l),
is zero except for the pair [k, l] = [0, 0].
The noise has zero mean value: at every pair of coordinates z_i = (x_i, y_i), E[n(z_i)] = 0, and the noise is uncorrelated: E[n(z_i) n(z_j)] = 0 for i ≠ j.
The noise effect is reduced by averaging a set of K noisy images. The new image is
g_bar(x, y) = (1/K) Σ_{i=1}^{K} g_i(x, y).
The intensities at each pixel of the new image may be viewed as random variables. The mean value and the standard deviation of the new image show that the effect of noise is reduced.
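The averaging step and the resulting noise reduction can be checked with a short NumPy experiment (a sketch under the assumptions above: zero-mean, uncorrelated noise and registered images):

import numpy as np

def average_noisy_images(noisy_stack):
    # g_bar(x, y) = (1/K) * sum_i g_i(x, y) over a stack of K registered noisy images
    return np.asarray(noisy_stack, dtype=np.float64).mean(axis=0)

rng = np.random.default_rng(0)
f = np.full((64, 64), 100.0)                        # noiseless test image
gs = [f + rng.normal(0, 20, f.shape) for _ in range(25)]
g_bar = average_noisy_images(gs)
print(np.std(gs[0] - f), np.std(g_bar - f))         # noise std drops by about 1/sqrt(K)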
Overview of spatial filters:
Smoothing (low pass / averaging) filters: linear filtering; the median filter is the nonlinear counterpart.
Sharpening filters: derivative filters, high-frequency-emphasis filters, high boost filters and unsharp masking. Principle: subtract an unsharp (lowpass-filtered) image from the original and add the resulting mask back to the original.
Order-statistic (nonlinear) filters such as the median filter. Principal function: force points with distinct intensity levels to be more like their neighbours. Objective: replace the value of the pixel by an order statistic (e.g. the median) of its neighbourhood.
Spatial filtering
Generally involves operations over the entire image.
Operations take place involving pixels within a neighborhood of a point of interest, together with a predefined operation called a spatial filter.
The spatial filter is also commonly referred to as a spatial mask, kernel, template, or window.
Linear filters and nonlinear filters are distinguished by the operation performed on the image. Filtering means accepting (passing) or rejecting some frequencies.
A 3x3 filter mask. The mask coordinates (relative to the centre) and the corresponding coefficients are:

(-1,-1) (-1,0) (-1,1)        w1 w2 w3
( 0,-1) ( 0,0) ( 0,1)        w4 w5 w6
( 1,-1) ( 1,0) ( 1,1)        w7 w8 w9

At any point (x, y) in the image, the response g(x, y) of the filter is the sum of products of the filter coefficients and the image pixels encompassed by the filter. Observe that the filter coefficient w(0,0) aligns with the pixel at location (x, y):
g(x, y) = w(-1,-1) f(x-1, y-1) + w(-1,0) f(x-1, y) + ... + w(0,0) f(x, y) + ... + w(1,1) f(x+1, y+1).
In general, for a mask of size m x n with coefficients w_k and corresponding image intensities z_k, the response is
R = Σ_{k=1}^{mn} w_k z_k = w^T z.
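A direct, if slow, NumPy sketch of this sum-of-products response for an arbitrary odd-sized mask (the replicate-border padding is just one common choice, not prescribed by the notes):

import numpy as np

def filter2d(f, w):
    # g(x, y) = sum_s sum_t w(s, t) f(x+s, y+t): spatial correlation of image f with mask w
    m, n = w.shape
    a, b = m // 2, n // 2
    fp = np.pad(np.asarray(f, dtype=np.float64), ((a, a), (b, b)), mode="edge")
    g = np.zeros(f.shape, dtype=np.float64)
    for s in range(m):
        for t in range(n):
            g += w[s, t] * fp[s:s + f.shape[0], t:t + f.shape[1]]
    return g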
Summary of spatial filter types:

Low pass (smoothing) filter: sums the mask coefficients times the neighbourhood pixels (i = 1 to 9 for a 3x3 mask), i.e. integration; it smooths sharp intensity transitions. Side effect: it blurs sharp edges. Examples: box filter (all coefficients equal) and weighted average filter (mask with different coefficients). Linear.

Order-statistic filters: used for salt-and-pepper (impulse) noise removal. 1. Median filter (50th percentile). 2. Max filter (100th percentile), which finds bright objects. 3. Min filter (0th percentile). Nonlinear.

Sharpening filters: highlight sharp intensity transitions; image sharpening is done by differentiation. First-order derivative (gradient operator): the components are linear but the gradient magnitude is nonlinear. Second derivative (Laplacian filter): linear; the second derivative is better for fine edge detection. Unsharp masking and high boost filtering are also used for image sharpening.
Smoothing filters or low pass filters
The output (response) of a smoothing, linear spatial filter is simply the average of the pixels contained in the neighborhood of the filter mask. These filters are sometimes called averaging filters; they are also referred to as lowpass filters.
The idea behind smoothing filters is straightforward. By replacing the value of every pixel in an image by the average of the gray levels in the neighborhood defined by the filter mask, this process results in an image with reduced sharp transitions in gray levels. Because random noise typically consists of sharp transitions in gray levels, the most obvious application of smoothing is noise reduction. However, edges (which almost always are desirable features of an image) are also characterized by sharp transitions in gray levels, so averaging filters have the undesirable side effect that they blur edges.
Weighted average mask: the central pixel usually has a higher coefficient. The weight is inversely proportional to the distance of the pixel from the centre of the mask.
The general implementation for filtering an M x N image with a weighted averaging filter of size m x n (with m = 2a+1, n = 2b+1) is
g(x, y) = [ Σ_{s=-a}^{a} Σ_{t=-b}^{b} w(s, t) f(x+s, y+t) ] / [ Σ_{s=-a}^{a} Σ_{t=-b}^{b} w(s, t) ].
Uniform filtering
The most popular masks for low pass filtering are masks with all their coefficients positive and equal to each other, for example the mask shown below. Moreover, the coefficients sum to 1 in order to maintain the mean of the image:

(1/9) x | 1 1 1 |
        | 1 1 1 |
        | 1 1 1 |
Gaussian filtering
The two dimensional Gaussian mask has values that approximate the continuous function
G(x, y) = (1 / (2πσ²)) e^( -(x² + y²) / (2σ²) ).
A commonly used 5x5 discrete approximation is

(1/273) x |  1  4  7  4  1 |
          |  4 16 26 16  4 |
          |  7 26 41 26  7 |
          |  4 16 26 16  4 |
          |  1  4  7  4  1 |
Median filtering
Order-statistics (nonlinear) filters
The best-known example in this category is the median filter, which, as its name implies, replaces the value of a pixel by the median of the gray levels in the neighborhood of that pixel (the original value of the pixel is included in the computation of the median).
Objective: replace the value of the pixel by the median of the intensity values in the neighbourhood of that pixel.
Although the median filter is by far the most useful order-statistics filter in image processing, it is by no means the only one. The median represents the 50th percentile of a ranked set of numbers, but ranking lends itself to many other possibilities. For example, using the 100th percentile results in the so-called max filter, which is useful in finding the brightest points in an image. The response of a 3x3 max filter is given by R = max{ z_k | k = 1, 2, ..., 9 }.
The 0th percentile filter is the min filter, used for the opposite purpose.
Example nonlinear spatial filters:
Median filter: computes the median gray-level value of the neighborhood; used for noise reduction.
Max filter: used to find the brightest points in an image, R = max{ z_k | k = 1, 2, ..., 9 }.
Min filter: used to find the dimmest points in an image, R = min{ z_k | k = 1, 2, ..., 9 }.
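A simple order-statistic filter sketch covering the median, max and min cases listed above (parameter names are ours; libraries such as scipy.ndimage provide faster equivalents):

import numpy as np

def order_statistic_filter(f, size=3, rank="median"):
    # rank = 'median' (50th percentile), 'max' (100th) or 'min' (0th percentile)
    a = size // 2
    fp = np.pad(np.asarray(f, dtype=np.float64), a, mode="edge")
    g = np.empty(f.shape, dtype=np.float64)
    for x in range(f.shape[0]):
        for y in range(f.shape[1]):
            window = fp[x:x + size, y:y + size]      # neighbourhood including the pixel itself
            if rank == "median":
                g[x, y] = np.median(window)
            elif rank == "max":
                g[x, y] = window.max()               # brightest point in the neighbourhood
            else:
                g[x, y] = window.min()               # dimmest point in the neighbourhood
    return g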
Directional smoothing (directional averaging filter)
To protect the edges from blurring while smoothing, a directional averaging filter can be useful. Spatial averages g(x, y : θ) are calculated in several selected directions θ (for example horizontal, vertical and the main diagonals):
g(x, y : θ) = (1 / N_θ) Σ_{(k,l) in W_θ} f(x - k, y - l),
and a direction θ* is found such that | f(x, y) - g(x, y : θ*) | is minimum. (Note that W_θ is the neighbourhood along the direction θ and N_θ is the number of pixels within this neighbourhood.)
Image sharpening could be accomplished by spatial differentiation. This, in fact, is the case, and the discussion in this section deals with various ways of defining and implementing operators for sharpening by digital differentiation.
Fundamentally, the strength of the response of a derivative operator is proportional to the degree of discontinuity of the image at the point at which the operator is applied. Thus, image differentiation enhances edges and other discontinuities (such as noise) and deemphasizes areas with slowly varying gray-level values.
Any definition of a first derivative (1) must be zero in flat segments (areas of constant gray-level values); (2) must be nonzero at the onset of a gray-level step or ramp; and (3) must be nonzero along ramps. Similarly, any definition of a second derivative (1) must be zero in flat areas; (2) must be nonzero at the onset and end of a gray-level step or ramp; and (3) must be zero along ramps of constant slope.
Let us consider the properties of the first and second derivatives as we traverse a profile from left to right. First, we note that the first-order derivative is nonzero along the entire ramp, while the second-order derivative is nonzero only at the onset and end of the ramp. Because edges in an image resemble this type of transition, we conclude that first-order derivatives produce thick edges and second-order derivatives much finer ones.
Thus we expect a second-order derivative to enhance fine detail (including noise) much more than a first-order derivative.
We are interested in isotropic filters, whose response is independent of the direction of the discontinuities in the image to which the filter is applied. In other words, isotropic filters are rotation invariant, in the sense that rotating the image and then applying the filter gives the same result as applying the filter to the image first and then rotating the result.
The simplest isotropic derivative operator is the Laplacian, which, for a function (image) f(x, y) of two variables, is defined as
∇²f = ∂²f/∂x² + ∂²f/∂y².
Because derivatives of any order are linear operations, the Laplacian is a linear operator.
In order to be useful for digital image processing, this equation needs to be expressed in discrete form. We use the following notation for the partial second-order derivative in the x-direction:
∂²f/∂x² = f(x+1, y) + f(x-1, y) - 2 f(x, y),
and similarly for the y-direction. If the four diagonal neighbours are also included, then, since each diagonal term also contains a -2 f(x, y) term, the total subtracted from the difference terms would be 8 f(x, y).
Laplacian operator
The Laplacian of a 2-D function f(x, y) is a second order derivative defined as
∇²f(x, y) = ∂²f(x, y)/∂x² + ∂²f(x, y)/∂y².
In practice it can also be implemented using a 3x3 mask (why?), giving
∇²f = 4 z5 - (z2 + z4 + z6 + z8).
The main disadvantage of the Laplacian operator is that it produces double edges
(why?).
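A minimal sketch of Laplacian sharpening with the 4-neighbour mask above (the use of scipy.ndimage.convolve and the clipping to [0, 255] are our illustrative choices):

import numpy as np
from scipy.ndimage import convolve

laplacian = np.array([[ 0, -1,  0],
                      [-1,  4, -1],
                      [ 0, -1,  0]], dtype=np.float64)   # centre positive: 4 z5 - (z2+z4+z6+z8)

def laplacian_sharpen(f):
    # Because the centre coefficient is positive, the sharpened image is g = f + lap(f)
    f = np.asarray(f, dtype=np.float64)
    lap = convolve(f, laplacian, mode="nearest")
    return np.clip(f + lap, 0, 255)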
Unsharp masking and high-boost filtering
A process used for many years in the publishing industry to sharpen images consists of subtracting a blurred version of an image from the image itself. This process, called unsharp masking, is expressed as
fs(x, y) = f(x, y) - b(x, y),
where fs(x, y) denotes the sharpened image obtained by unsharp masking, and b(x, y) is a blurred version of f(x, y).
A slight generalization of unsharp masking, called high-boost filtering, can be written as
fhb(x, y) = A f(x, y) - b(x, y), with A >= 1.
One of the principal applications of boost filtering is when the input image is darker than desired. By varying the boost coefficient A, it generally is possible to obtain an overall increase in the average gray level of the image, thus helping to brighten the final result.
High Boost Filtering
A high pass filtered image may be computed as the difference between the original image and a lowpass filtered version of that image:
(Highpass) = (Original) - (Lowpass).
Multiplying the original image by an amplification factor A yields the so-called high boost filter:
(Highboost) = A (Original) - (Lowpass) = (A - 1)(Original) + (Original) - (Lowpass) = (A - 1)(Original) + (Highpass).
The general process of subtracting a blurred image from an original, as given in the first line, is called unsharp masking. A possible mask that implements the above procedure is illustrated below.
| 0 0 0 |             | -1 -1 -1 |
| 0 A 0 |  -  (1/9) x | -1 -1 -1 |
| 0 0 0 |             | -1 -1 -1 |

            | -1    -1  -1 |
=  (1/9) x  | -1  9A-1  -1 |
            | -1    -1  -1 |
The high-boost filtered image looks more like the original with a degree of edge
enhancement, depending on the value of A .
A determines nature of filtering
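A small sketch of spatial high-boost filtering with the composite mask derived above (A = 1.15 is just an example value):

import numpy as np
from scipy.ndimage import convolve

def high_boost(f, A=1.15):
    # (1/9) [[-1,-1,-1], [-1, 9A-1, -1], [-1,-1,-1]]  =  A*(original) - (3x3 box lowpass)
    w = -np.ones((3, 3)) / 9.0
    w[1, 1] = (9 * A - 1) / 9.0
    g = convolve(np.asarray(f, dtype=np.float64), w, mode="nearest")
    return np.clip(g, 0, 255)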
SHARPENING HIGH PASS FILTER
Highpass filter example
If A > 1, part of the original image is added to the highpass result, partially restoring the low frequency components. The result looks more like the original image, with a relative degree of edge enhancement that depends on the value of A.
This may be implemented with a centre coefficient value w = 9A - 1 (A >= 1).
(Example results are shown for A = 1.1, A = 1.15 and A = 1.2.)
Use of First Derivatives for Enhancement The Gradient
Use of first derivatives for Image Sharpening ( Non linear)
An edge is the boundary between two regions with relatively distinct grey level
properties. The idea underlying most edge detection techniques is the computation of
a local derivative operator.
The magnitude of the first derivative calculated within a neighborhood around the
pixel of interest, can be used to detect the presence of an edge in an image.
First derivatives in image processing are implemented using the magnitude of the gradient.
For a function f(x, y), the gradient of f at coordinates (x, y) is defined as the two-
dimensional column vector.
The components of the gradient vector itself are linear operators, but the magnitude of this vector obviously is not, because of the squaring and square root operations. On the other hand, the partial derivatives are not rotation invariant (isotropic), but the magnitude of the gradient vector is. Although it is not strictly correct, the magnitude of the gradient vector often is referred to as the gradient. In keeping with tradition, we will use this term in the following discussions, explicitly referring to the vector or its magnitude only in cases where confusion is likely.
We use z1, ..., z9 to denote image points in a 3x3 region; for example, the centre point z5 denotes f(x, y), z1 denotes f(x-1, y-1), and so on.
The gradient of an image f(x, y) at location (x, y) is a vector that consists of the partial derivatives:
∇f = [ ∂f/∂x , ∂f/∂y ]^T,
with magnitude
∇f(x, y) = mag(∇f) = [ (∂f/∂x)² + (∂f/∂y)² ]^(1/2).
The magnitude image M(x, y) is the same size as the original image. It is common practice to refer to this image as the gradient image, or simply as the gradient.
Common practice is to approximate the gradient with absolute values, which is simpler to implement:
∇f ≈ | ∂f/∂x | + | ∂f/∂y |.
Consider a pixel of interest f(x, y) = z5 and a rectangular neighborhood of size 3x3 = 9 pixels (including the pixel of interest), as shown below:

z1 z2 z3
z4 z5 z6
z7 z8 z9
Roberts operator
The gradient equation can be approximated at point z5 in a number of ways. The simplest is to use the difference (z5 - z8) in the x direction and (z5 - z6) in the y direction. This approximation is known as the Roberts operator, and is expressed mathematically as
∇f ≈ | z5 - z8 | + | z5 - z6 |.
Another approach is to use the cross differences:
∇f ≈ | z5 - z9 | + | z6 - z8 |.
These equations can be implemented by using the following masks. The original image is convolved with both masks separately and the absolute values of the two outputs of the convolutions are added.

|  1  0 |   |  1 -1 |
| -1  0 |   |  0  0 |      (simple differences)

|  1  0 |   |  0  1 |
|  0 -1 |   | -1  0 |      (Roberts cross-gradient operators)
Prewitt operator
∇f ≈ | ∂f/∂x | + | ∂f/∂y |.
Another approximation of the above equation, now using a 3x3 mask, is
∇f ≈ | (z7 + z8 + z9) - (z1 + z2 + z3) | + | (z3 + z6 + z9) - (z1 + z4 + z7) |.     (4)
This approximation is known as the Prewitt operator. Equation (4) can be implemented by using the following masks. Again, the original image is convolved with both masks separately and the absolute values of the two outputs of the convolutions are added.

| -1  0  1 |      | -1 -1 -1 |
| -1  0  1 |      |  0  0  0 |
| -1  0  1 |      |  1  1  1 |
    (y)               (x)
Prewitt operator
Sobel operator
Definition and comparison with the Prewitt operator (the Sobel operator gives more weight to the centre pixel).
The most popular approximation of equation (1), using a 3x3 mask, is
∇f ≈ | (z7 + 2z8 + z9) - (z1 + 2z2 + z3) | + | (z3 + 2z6 + z9) - (z1 + 2z4 + z7) |.
This approximation is known as the Sobel operator.

| -1  0  1 |      | -1 -2 -1 |
| -2  0  2 |      |  0  0  0 |
| -1  0  1 |      |  1  2  1 |
    (y)               (x)
Sobel operator
If we consider the left mask of the Sobel operator, this causes differentiation along the
y direction.
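A short sketch of the Sobel gradient magnitude using the masks above (the axis convention follows the notes, with x running down the rows):

import numpy as np
from scipy.ndimage import convolve

sobel_x = np.array([[-1, -2, -1],
                    [ 0,  0,  0],
                    [ 1,  2,  1]], dtype=np.float64)   # (z7+2z8+z9) - (z1+2z2+z3)
sobel_y = sobel_x.T                                     # (z3+2z6+z9) - (z1+2z4+z7)

def sobel_gradient(f):
    f = np.asarray(f, dtype=np.float64)
    gx = convolve(f, sobel_x, mode="nearest")
    gy = convolve(f, sobel_y, mode="nearest")
    return np.abs(gx) + np.abs(gy)                      # grad f ~ |Gx| + |Gy|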
Lecture Notes
Image Enhancement (Frequency Domain)
Filtering in the frequency domain
Obtaining frequency domain filters from spatial filters
Generating filters directly in the frequency domain
Low pass (smoothing) and high pass (sharpening) filters in the frequency domain
Filters in the frequency domain can be divided into four groups: low pass, high pass, band pass and band stop filters.
To remove certain frequencies during filtering, set their corresponding F(u, v) coefficients to zero.
Some Basic Properties of the Frequency Domain (with respect to the spatial domain)
Frequency is directly related to the spatial rate of change (of brightness or gray values). Therefore, the slowest varying frequency component (u = v = 0) corresponds to the average intensity level of the image; it corresponds to the origin of the Fourier spectrum.
Higher frequencies correspond to the faster varying intensity level changes in the image. The edges of objects and other components characterized by abrupt changes in intensity level correspond to higher frequencies.
(Figure: corresponding spatial domain and frequency domain representations.)
Frequency domain filters can achieve the same results as spatial filtering by altering the DFT coefficients directly:
G(u, v) = H(u, v) F(u, v).
In the spatial domain, let g(x, y) be a desired image formed by the convolution of an image f(x, y) and a linear, position invariant operator h(x, y):
g(x, y) = h(x, y) * f(x, y).
The goal is that g(x, y) exhibits some highlighted features of f(x, y). For instance, edges in f(x, y) can be accentuated by using a function H(u, v) that emphasizes the high frequency components of F(u, v).
We have access to the Fourier transform magnitude (spectrum) and phase angle; the phase is generally not useful for visual analysis.
Convolution
Convolution in the spatial domain corresponds to multiplication in the Fourier domain.
Note that many implementations of convolution produce a larger output image than this
because they relax the constraint that the kernel can only be moved to positions where it
fits entirely within the image. Instead, these implementations typically slide the kernel to
all positions where just the top left corner of the kernel is within the image. Therefore the
kernel `overlaps' the image on the bottom and right edges. One advantage of this
TU
approach is that the output image is the same size as the input image. Unfortunately, in
order to calculate the output pixel values for the bottom and right edges of the image, it is
necessary to invent input pixel values for places where the kernel extends off the end of
the image. Typically pixel values of zero are chosen for regions outside the true image,
but this can often distort the output image at these places. Therefore in general if you are
using a convolution implementation that does this, it is better to clip the image to remove
JN
these spurious regions. Removing n - 1 pixels from the right hand side and m - 1 pixels
from the bottom will fix things.
If the image size is M x N and the kernel size is m x n, the 'valid' convolved image size will be (M - m + 1) x (N - n + 1).
Flip the kernel or filter horizontally and vertically prior to array multiplication with
ll
the image.
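The three output-size conventions discussed above can be checked directly with SciPy (an illustrative snippet, not part of the notes):

import numpy as np
from scipy.signal import convolve2d

f = np.random.rand(8, 10)            # M x N image
h = np.ones((3, 3)) / 9.0            # m x n kernel (convolve2d flips it internally)

print(convolve2d(f, h, mode="full").shape)    # (M+m-1, N+n-1): kernel may overhang the image
print(convolve2d(f, h, mode="same").shape)    # (M, N): output clipped to the input size
print(convolve2d(f, h, mode="valid").shape)   # (M-m+1, N-n+1): kernel kept fully inside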
Convolution, the mathematical, local operation defined in Section 3.1, is central to modern image processing. The basic idea is that a window of some finite size and shape (the support) is scanned across the image. The output pixel value is the weighted sum of the input pixels within the window, where the weights are the values of the filter assigned to every pixel of the window itself. The window with its weights is called the convolution kernel.
If the filter h[j, k] is zero outside the (rectangular) window {j = 0, 1, ..., J-1; k = 0, 1, ..., K-1}, then the convolution can be written as the following finite sum:
c[m, n] = Σ_{j=0}^{J-1} Σ_{k=0}^{K-1} h[j, k] a[m - j, n - k].
Basic Steps for Filtering in the Frequency Domain:
The filters considered here affect the real and imaginary parts of F(u, v) equally and thus have no effect on the phase; such filters are called zero-phase-shift filters.
Zero pad the input image f(x, y) to size P = 2M, Q = 2N if the arrays are of the same size. If the functions f(x, y) and h(x, y) are of size A x B and C x D respectively, the zero padding must satisfy P >= A + C - 1 and Q >= B + D - 1.
If both filters are of the same size, it is more efficient to do the filtering in the frequency domain; alternatively, a much smaller filter can be designed in the spatial domain that approximates the same performance.
1. Multiply the input padded image by (-1)^(x+y) to center the transform.
2. Compute F(u, v), the DFT of the image from (1).
3. Multiply F(u, v) by a filter function H(u, v).
4. Compute the inverse DFT of the result.
5. Obtain the real part of the result.
6. Multiply the result by (-1)^(x+y) to undo the centering, and crop to the original image size.
Given the filter H(u, v) (the filter transfer function, or simply the filter or filter function) in the frequency domain, the Fourier transform of the output (filtered) image is given by
G(u, v) = H(u, v) F(u, v).
The filtered image g(x, y) is simply the inverse Fourier transform of G(u, v).
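The steps listed above can be put together in a short NumPy sketch (the helper names freq_filter and gaussian_lpf, and the 2M x 2N padding, are our illustrative choices):

import numpy as np

def gaussian_lpf(P, Q, D0):
    # Centred Gaussian lowpass transfer function H(u, v) = exp(-D^2 / (2 D0^2))
    u, v = np.indices((P, Q))
    D2 = (u - P / 2) ** 2 + (v - Q / 2) ** 2
    return np.exp(-D2 / (2.0 * D0 ** 2))

def freq_filter(f, H):
    M, N = f.shape
    P, Q = H.shape                               # padded size, e.g. 2M x 2N
    fp = np.zeros((P, Q)); fp[:M, :N] = f        # zero padding
    x, y = np.indices((P, Q))
    fp = fp * (-1.0) ** (x + y)                  # centre the transform
    G = H * np.fft.fft2(fp)                      # G(u, v) = H(u, v) F(u, v)
    gp = np.real(np.fft.ifft2(G))                # inverse DFT, keep the real part
    gp = gp * (-1.0) ** (x + y)                  # undo the centring
    return gp[:M, :N]                            # crop back to the original size

Usage would look like g = freq_filter(f, gaussian_lpf(2*f.shape[0], 2*f.shape[1], D0=60)).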
A high pass filter, in contrast, attenuates the low frequencies while passing the high frequencies.
Correspondence between filtering in spatial and frequency
domains
Let us find the equivalent of a frequency domain filter H(u, v) in the spatial domain:
h(x, y) ⇔ H(u, v).
Since h(x, y) can be obtained from the response of a frequency domain filter to an impulse, the spatial filter h(x, y) is sometimes referred to as the finite impulse response (FIR) filter of H(u, v).
To obtain H(u, v) from a given spatial filter, pad the spatial filter to the required size, centre it, and compute its 2D DFT to obtain H(u, v).
To find h(x, y) from H(u, v):
1. Centering: multiply H(u, v) by (-1)^(u+v).
2. Compute the inverse Fourier transform.
3. Multiply the real part by (-1)^(x+y).
Properties of h(x, y):
1. It has a central dominant circular component (providing the blurring).
2. It has concentric circular components (rings), giving rise to the ringing effect.
Implementation:
Image smoothing using Low Pass Filters
Ideal LPF (ILPF), Butterworth LPF (BLPF) and Gaussian LPF (GLPF). For higher orders, the BLPF tends towards the ILPF.
Low pass filtering can be naturally accomplished in the frequency domain, since the DFT coefficients correspond to frequency components of the source and their frequency increases with increasing distance from the centre of the shifted spectrum.
Low pass filtering is then accomplished by zeroing the amplitude of the high frequency components. Typically, a threshold radius is chosen such that all DFT coefficients outside of this threshold radius have their magnitude set to zero, while all DFT coefficients falling inside of the threshold are unchanged (passed through).
ILPF:
A 2D low pass filter that passes all frequencies within a circle of radius D0 from the origin without attenuation, and cuts off all frequencies outside the circle, is an ILPF:
H(u, v) = 1 if D(u, v) <= D0, and H(u, v) = 0 if D(u, v) > D0,
where D(u, v) = [ (u - P/2)² + (v - Q/2)² ]^(1/2) is the distance from the centre of the (padded, centred) spectrum.
The ILPF cannot be realized with electronic components, but it can be simulated on a computer.
The point of transition between H(u, v) = 1 and H(u, v) = 0 is the cut off frequency D0, a positive constant.
To find h(x, y): centre H(u, v) by multiplying with (-1)^(u+v), then take the inverse Fourier transform.
As the filter radius increases, less and less power is removed (filtered out), and more and more detail is preserved.
(Figure: effect of ideal low pass filters of increasing radius; the smallest shown has radius 5.)
The percentage of the total image power enclosed in circles of radius 10, 30, 60, 160 and 460 pixels is typically around 87, 93.1, 95.7, 97.8 and 99.2 % respectively.
The total image power P_T is obtained by summing the components of the power spectrum at each point (u, v):
P(u, v) = |F(u, v)|² = R²(u, v) + I²(u, v),
P_T = Σ_u Σ_v P(u, v).
The enclosed power is the power remaining after filtration.
(Figure: representation in the spatial domain of an ILPF of radius 5 and size 1000x1000, together with the intensity profile of a horizontal line passing through the centre of the image.)
Filtering in the spatial domain can be done by convolving h(x, y) with the image f(x, y). Imagine each pixel in the image as a discrete impulse whose strength is proportional to the intensity of the image at that location. Convolving a sinc with an impulse copies the sinc to the location of the impulse. Thus the centre lobe of the sinc causes blurring, while the outer smaller lobes are responsible for ringing.
Blurring and ringing properties of the ILPF: the centre lobe of the corresponding sinc causes blurring, and the outer lobes cause ringing.
Butterworth Lowpass Filter (BLPF)
The transfer function of a Butterworth LPF of order n, with a cut off frequency at a distance D0 from the origin, is given by
H(u, v) = 1 / [ 1 + ( D(u, v) / D0 )^(2n) ].
Unlike the ILPF, the BLPF transfer function does not have a sharp discontinuity that gives a clear cut off between passed and filtered frequencies.
For the BLPF, the cutoff frequency is defined as the frequency at which the transfer function has a value which is half of its maximum.
The BLPF of order 1 does not have any ringing artifact. BLPFs of order 2 or more show increasing ringing effects as the order increases.
The Butterworth filter is a low-pass filter with smooth edges, such that there is no (or minimal) ringing in the spatial domain.
Unlike the ILPF, the smooth transition in blurring is a function of increasing cut off frequency; due to the smooth transition between low and high frequencies, ringing is not visible.
The BLPF has no ringing in the spatial domain for order 1; ringing becomes visible and similar to the ILPF for order n = 20.
Gaussian Lowpass Filter (GLPF)
The GLPF transfer function is
H(u, v) = e^( -D²(u, v) / (2 D0²) )   (equivalently e^( -(u² + v²) / (2σ²) ) about the centre),
so there is no sharp discontinuity and therefore no ringing effect with the GLPF.
Ringing artifacts are not acceptable in fields like medical imaging; hence the Gaussian is used instead of the ILPF/BLPF.
Other well-known frequency domain low pass filters include the Chebyshev and the Gaussian transfer functions.
The Gaussian low pass filter has the very interesting property of having the same form in both the Fourier and spatial domains; in other words, the DFT of a Gaussian function is itself a Gaussian function.
A Gaussian low pass filter therefore introduces no ringing when applied in either the spatial or the frequency domain. The Gaussian transfer function is given as
H(u, v) = e^( -D²(u, v) / (2 D0²) ).
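For reference, the three lowpass transfer functions discussed above can be generated as centred arrays (a sketch; the function names are ours):

import numpy as np

def distance_grid(P, Q):
    u, v = np.indices((P, Q))
    return np.sqrt((u - P / 2) ** 2 + (v - Q / 2) ** 2)          # D(u, v) from the centre

def ilpf(P, Q, D0):
    return (distance_grid(P, Q) <= D0).astype(np.float64)        # 1 inside the circle, 0 outside

def blpf(P, Q, D0, n=2):
    return 1.0 / (1.0 + (distance_grid(P, Q) / D0) ** (2 * n))   # H = 0.5 at D(u, v) = D0

def glpf(P, Q, D0):
    return np.exp(-distance_grid(P, Q) ** 2 / (2.0 * D0 ** 2))   # no ringing in either domain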
Sharpening Frequency-Domain Filters
The high-frequency components correspond to edges and sharp transitions (including noise). Sharpening can be achieved by a high pass filtering process, which attenuates the low frequency components without disturbing the high-frequency information in the frequency domain.
The filter transfer function H_hp(u, v) of a high pass filter is given by
H_hp(u, v) = 1 - H_lp(u, v),
where H_lp(u, v) is the transfer function of the corresponding lowpass filter.
High Pass Filters
Ideal high-pass filter (IHPF)
Butterworth high-pass filter (BHPF)
Gaussian high-pass filter (GHPF)
Difference of Gaussians
Unsharp Masking and High Boost filtering

Sharpening Frequency-Domain Filters
Butterworth Highpass Filter (BHPF):
The transfer function of a BHPF of order n and with a specified cutoff frequency D0 is given by
H(u, v) = 1 / [ 1 + ( D0 / D(u, v) )^(2n) ].
Smoother results are obtained with the BHPF when compared to the IHPF; there are almost no ringing artifacts.
Gaussian Highpass Filter (GHPF):
H(u, v) = 1 - e^( -D²(u, v) / (2 D0²) ).
Smoother results are obtained with the GHPF when compared to the BHPF; there are absolutely no ringing artifacts.
Example: using high pass filtering and thresholding for image enhancement.
HIGH BOOST FILTERING
g_mask(x, y) = f(x, y) - f_LP(x, y)
Highboost filtering (alternative definition):
g(x, y) = (A - 1) f(x, y) + f_HP(x, y),
where f_LP(x, y) = f(x, y) * h_LP(x, y) and F{ f_LP(x, y) } = F(u, v) H_LP(u, v).
Band Filters
Low pass filtering is useful for reducing noise but may produce an image that is overly blurry. High pass filtering is useful for sharpening edges but also accentuates image noise. Band filtering seeks to retain the benefits of these techniques while reducing their undesirable properties.
Band filtering isolates the mid-range frequencies from both the low-range and high-range frequencies.
A band stop (or notch) filter attenuates mid-level frequencies while leaving the high and low frequencies unaltered.
A band pass filter is the inverse of a band stop, leaving the mid-level frequencies unaltered while attenuating the low and high frequencies in the image.
A band of frequencies may be specified by giving the centre frequency and the width of the band; the band width determines the range of frequencies that are included in the band.
A band stop filter is essentially a combination of a low and high pass filter, which implies that ideal, Butterworth, and Gaussian band stop filters can be defined.
A filter that sets F(0, 0) to zero and leaves all the other frequency components unchanged is called a notch filter, since it is a constant function with a hole (notch) at the origin. Note that
F(0, 0) = (1/MN) Σ_{x=0}^{M-1} Σ_{y=0}^{N-1} f(x, y)
is the average intensity of the image.
HOMOMORPHIC FILTERING
Note that the illumination-reflectance model f(x, y) = i(x, y) r(x, y) is a product, so the Fourier transform does not separate over its two factors. To accomplish separability, first map the model to the natural log domain and then take the Fourier transform:
z(x, y) = ln{ f(x, y) } = ln{ i(x, y) } + ln{ r(x, y) }.
Then
Z(u, v) = F{ ln i(x, y) } + F{ ln r(x, y) } = F_i(u, v) + F_r(u, v),
so the illumination and reflectance components can be filtered separately in the frequency domain.
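A compact homomorphic-filtering sketch following the log / DFT / filter / inverse-DFT / exp chain above; the high-frequency-emphasis filter shape and the parameter names (gamma_L, gamma_H, c, D0) are illustrative choices, not fixed by the notes:

import numpy as np

def homomorphic(f, D0=30.0, gamma_L=0.5, gamma_H=2.0, c=1.0):
    z = np.log1p(np.asarray(f, dtype=np.float64))        # z = ln f = ln i + ln r
    Z = np.fft.fftshift(np.fft.fft2(z))                  # centred spectrum of z
    P, Q = Z.shape
    u, v = np.indices((P, Q))
    D2 = (u - P / 2) ** 2 + (v - Q / 2) ** 2
    H = (gamma_H - gamma_L) * (1 - np.exp(-c * D2 / D0 ** 2)) + gamma_L
    s = np.real(np.fft.ifft2(np.fft.ifftshift(H * Z)))   # filtered log-image
    return np.expm1(s)                                   # back to the intensity domain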
UNIT 4:
Image Restoration
Degradation model
Algebraic approach to restoration
Inverse filtering, least mean square filters.
Constrained Least square restoration, Interactive restoration
Restoration techniques involve modeling of degradation and applying the inverse
process in order to recover the image
Degradation: gray values are altered. Distortion: pixels are shifted (handled by geometric restoration, i.e. image registration).
So, in case of image restoration, the image degradation model is very important. We have to find out the phenomenon, or the model, which has degraded the image, and once that degradation model is known, we apply the inverse process to recover or restore the desired image.
In restoration we assume that the image has been degraded by a known degradation model, and that on top of that degraded image some noise has been added.
We will assume that a degradation function exists which, together with additive noise, operates on the input image f(x, y) to produce a degraded image g(x, y):
g(x, y) = f(x, y) * h(x, y) + η(x, y),
where h(x, y) is the system that causes the image distortion and η(x, y) is the additive noise.
The objective of restoration is to obtain an estimate of the original image from its degraded version g(x, y), while having some knowledge about the degradation function H and the noise η(x, y).
The third expression is the matrix form:
g = H f + η,
where g is a column vector obtained by stacking the M x N image (so it has MN elements), f is a column vector of the same dimension, and the degradation matrix H is of dimension MN x MN.
The key is finding an appropriate model of the image degradation that can be inverted
W
Additive noise:
g(x, y) = f(x, y) + η(x, y)
Linear blurring:
g(x, y) = f(x, y) * h(x, y)
Linear blurring and additive noise:
g(x, y) = f(x, y) * h(x, y) + η(x, y)
Linear blurring and complex (signal-dependent) noise:
g(x, y) = [ f(x, y) * h(x, y) ] [ 1 + m(x, y) ] + n(x, y)
Given g(x, y) and some knowledge about the degradation function H and the noise η, obtain an estimate f̂(x, y) of the original image. Here h(x, y) is the spatial representation of the degradation function.
Frequency domain
Inverse filter
Wiener (minimum mean square error) filter
Algebraic approaches
Unconstrained optimization
Constrained optimization
The regularization theory
Typical noise models and their origins:
Gaussian: poor illumination.
Impulse (salt-and-pepper): faulty switching during imaging.
Rayleigh: range imaging.
Gamma / Exponential: laser imaging.
Uniform: least used in practice.
Parameters can be estimated based on the histogram of a small flat area of the image. If the shape of the pdf is Gaussian, the mean and variance completely specify it; otherwise, these estimates are used to derive the parameters a and b. For impulse noise, we need to estimate the actual probabilities Pa and Pb from the histogram.
Sources of Noise - three major sources:
During acquisition (often random), e.g. faulty CCD elements.
During transmission (channel interference).
During image processing (e.g. compression).
What causes the image to blur?
Camera: translation, shake, out-of-focus optics.
Electronics: noise in the circuitry, quantization noise.
Spatial noise-removal filters include the mean filter and the median filter.
The noise terms are unknown, so subtracting them from g(x, y) or G(u, v) is not a realistic option.
Therefore, spatial filtering is a good candidate when only additive random noise is
present.
We already discussed spatial filters before; here we discuss noise rejection
capabilities of different spatial filters.
1. Mean filters (arithmetic mean, geometric mean, harmonic mean, contraharmonic mean)
2. Order-statistic filters (median, max and min, midpoint, alpha-trimmed mean)
3. Adaptive filters (adaptive local noise reduction filter, adaptive median filter)
NOISE
What is the best way to remove noise?
Capture N images of the same scene: g_i(x, y) = f(x, y) + n_i(x, y).
Average them to obtain a new image: g_ave(x, y) = f(x, y) + n_ave(x, y).
Use all the information available:
Derive a noise model for the input image.
Generate a synthetic image with the same noise distribution as the input image.
Perform experiments on the synthetic image to select the noise removal method and parameters that minimize the restoration error.
Apply this method to the input image.
The term noise has the following meanings:
1. An undesired disturbance within the frequency band of interest; the summation of unwanted or disturbing energy introduced into a communications system from man-made and natural sources.
2. A disturbance that affects a signal and that may distort the information carried by the signal.
3. Random variations of one or more characteristics of any entity such as voltage, current, or data.
4. A random signal of known statistical properties of amplitude, distribution, and spectral density.
Noise has long been studied; its properties, types and influence are analyzed mostly with mathematics closely related to probability theory.
Noise is often signal-dependent; examples are speckle and photon noise. Many noise sources can be modeled by a multiplicative model:
g(x, y) = f(x, y) η(x, y).
The major sources of noise during image acquisition from electro-optical devices are photon noise, thermal noise, on-chip electronic noise, amplifier noise, and quantization noise.
Estimation of Noise
Consists of finding an image (or subimage) that contains only noise, and then using its histogram for the noise model.
Noise-only images can be acquired by aiming the imaging device (e.g. a camera) at a blank wall.
In case we cannot find "noise-only" images, a portion of the image is selected that has a known histogram, and that knowledge is used to determine the noise characteristics. After a portion of the image is selected, we subtract the known values from the histogram, and what is left is our noise model.
For a good strategy in removing noise and restoring image quality one needs to determine the noise distribution (check the histogram in a reasonably sized smooth region with visibly small variation in values). Once the noise model is estimated, use an appropriate filter. The histogram in the same region indicates the level of success.
Noise pdfs
1. Gaussian (normal) noise is very attractive from a mathematical point of view, since its DFT is another Gaussian process:
p(z) = (1 / (sqrt(2π) σ)) e^( -(z - μ)² / (2σ²) ),
where z represents intensity, μ is the mean (average) value of z, σ is its standard deviation and σ² is the variance of z.
Sources: electronic circuit noise, sensor noise due to low illumination or high temperature.
2. Rayleigh noise: radar range and velocity images typically contain noise that can be modeled by the Rayleigh distribution.
3. Uniform noise: the gray level values of the noise are evenly distributed across a specific range. Quantization noise has an approximately uniform distribution.
4. Salt-and-pepper noise (also called impulse noise, shot noise or spike noise) is typically caused by malfunctioning pixel elements in the camera sensors, faulty memory locations, or timing errors in the digitization process. There are only two possible values, a and b, and the probability of each is typically less than 0.2; with numbers greater than this, the noise will swamp out the image. For an 8-bit image, the typical values are 255 for salt noise and 0 for pepper noise.
Noise Removal by Frequency Domain Filtering:
An ideal bandreject filter of width W centred at radius D0 is given by
H(u, v) = 1 for D(u, v) < D0 - W/2,
H(u, v) = 0 for D0 - W/2 <= D(u, v) <= D0 + W/2,
H(u, v) = 1 for D(u, v) > D0 + W/2.
Periodic Noise
Periodic noise arises from electrical or electromechanical interference during image acquisition, such as mechanical jitter (vibration) or electrical interference in the system. It appears in the frequency domain as impulses corresponding to the sinusoidal interference, and it can be removed with band reject and notch filters.
Frequency domain filtering for periodic noise: band reject filter, notch filter.
Adaptive Median Filter
Effects of coding/decoding
Image noise as a random variable: for every (x, y), η(x, y) can be considered as a random variable. In general, we assume that the noise η(x, y) is not dependent on the underlying signal f(x, y), and that its value at one coordinate is not correlated with its value at any other coordinate (spatially uncorrelated noise).
Assumptions about the noise:
Independent of spatial location (exception: periodic noise).
Zero mean: E[η(x, y)] = 0.
Uncorrelated with the image.
What we have is a degraded image g(x, y), which we represent as g(x, y) = H[f(x, y)] + η(x, y), where H is the degradation operator which operates on the input image f(x, y) and which, together with the additive noise η(x, y), gives us the degraded image g(x, y).
H is linear if, for any two images f1(x, y), f2(x, y) and constants k1, k2,
H[ k1 f1(x, y) + k2 f2(x, y) ] = k1 H[ f1(x, y) ] + k2 H[ f2(x, y) ].
H is position (space) invariant if, given g(x, y) = H[ f(x, y) ],
H[ f(x - a, y - b) ] = g(x - a, y - b) for any shift (a, b).
Image Deblurring
Based on this model, the fundamental task of deblurring is to deconvolve the blurred image with the PSF that exactly describes the distortion. Deconvolution is the process of reversing the effect of convolution.
Note: the quality of the deblurred image is mainly determined by knowledge of the PSF.
Degradation model: g(x, y) = f(x, y) * h(x, y) + η(x, y); we then use deconvolution to get f(x, y) back from g(x, y). This is why convolution is sometimes also known as the superposition integral, and deconvolution is sometimes also called signal separation.
The process of image restoration by use of an estimated degradation function is sometimes called blind deconvolution:
u_obs = u_orig * k,
Goal: given u_obs, recover both u_orig and the blur kernel k.
There are three principal methods of estimating the degradation function for image restoration (a form of blind deconvolution, because the restored image will be only an estimate): estimation by observation, by experimentation, and by mathematical modeling.
Estimation by Observation
What we have is the degraded image g(x, y), and by looking at this degraded image we have to estimate the degradation function involved.
The degradation function H can be estimated by visually inspecting a small section of the image containing simple structures with strong signal content, such as part of an object and the background.
Given a small sub image gs(x, y), we can manually remove the degradation in that region (by observing the gray levels and filtering) to obtain an estimated sub image f̂s(x, y), assuming that the additive noise is negligible in such an area with strong signal content. To reduce the effects of noise, we look for an area of strong signal (an area of high contrast) and try to process that sub image to un-blur it as much as possible (for instance, by sharpening the sub image with a sharpening filter).
Let the observed sub image be denoted gs(x, y) and its restored version f̂s(x, y). Since the noise term is neglected, we can assume, in the frequency domain,
gs(x, y) ⇔ Gs(u, v),   f̂s(x, y) ⇔ F̂s(u, v),
Hs(u, v) = Gs(u, v) / F̂s(u, v).
Having estimated Hs(u, v) for such a small sub image, the shape of this degradation function can be used to get an estimate of H(u, v) for the entire image. Since the noise term is neglected, the sub image considered should be one with strong signal content.
(Block diagram, estimation by observation: take the observed subimage gs(x, y) → DFT → Gs(u, v); restore the subimage by manual processing → f̂s(x, y) → DFT → F̂s(u, v); the estimated transfer function is Hs(u, v) = Gs(u, v) / F̂s(u, v), from which H(u, v) for the whole image is estimated. This case is used when we know only g(x, y) and cannot repeat the experiment.)
So here, our purpose will be to find out the point spread function, or the impulse response, of this imaging setup. It is the impulse response which fully characterizes any particular (linear, shift-invariant) system; once the impulse response is known, the response of that system to any arbitrary input can be computed from it.
So, the first operation that we have to do is to simulate an impulse.
Now, how do you simulate an impulse? An impulse can be simulated by a very bright spot of light. Because our imaging setup is a camera, we let a bright spot of light, as small as possible, fall on the camera; if this bright spot is very small, then it is equivalent to an impulse, and using this bright spot of light as an input, whatever image we get is the response to that bright spot of light, which in our case is an impulse.
So, that is what has been shown in this particular slide. The left-most image is the simulated impulse; at the centre we have a bright spot of light. Of course, this spot is shown in a magnified form; in reality it will be even smaller. On the right hand side is the image captured by the camera when this impulse falls on the camera lens.
So, this is my simulated impulse and this is my impulse response. Once I have the impulse and the impulse response, from these I can find out the degradation function of this imaging system.
Now, we know from our earlier discussion that for a very narrow impulse, the Fourier transform of the impulse is a constant.
TU
Estimation by Experiment
Used when we have the same equipment set up and can repeat the
experiment.
Response image from
Input impulse image the system
JN
System
H( )
A (x, y) g(x, y)
ll
DFT DFT
DFT A (x, y) A G(u,v)
A
G(u, v)
H (u, v)
A
If we have the acquisition device producing degradation on
images, we can use the same device to obtain an accurate
estimation of the degradation.
So, in this case, G(u, v) is the Fourier transform of the observed image, namely the image we have got as the response to the simulated impulse falling on the camera; A is the Fourier transform of the impulse falling on the lens; and the ratio of these two, G(u, v) divided by the constant A, gives us the degradation model of this particular imaging setup.
An impulse of strength A is used as the input image. The Fourier transform of an impulse is a constant, A; therefore
H(u, v) = G(u, v) / A,
where A is a constant describing the strength of the impulse. Note that the effect of noise on an impulse is negligible.
Simply take the Fourier transform of the degraded image and, after normalization by the constant A, we obtain the degradation function.
So, I get the degradation function, and we assume that the same degradation function is also valid for the actual imaging system. One point should be kept in mind: the intensity of the light which forms the simulated impulse should be very high, so that the effect of noise is reduced.
Estimation by Mathematical Modeling:
Sometimes the environmental conditions that cause the degradation can be modeled by a mathematical formulation. For example, atmospheric turbulence can be modeled by
H(u, v) = e^( -k (u² + v²)^(5/6) ).
This equation is similar to a Gaussian LPF and produces blurring in the image according to the value of k. For example, k = 0.0025 represents severe turbulence, k = 0.001 represents mild turbulence and k = 0.00025 represents low turbulence. If the value of k is large, the turbulence is very strong, whereas if the value of k is very low, the turbulence is not that strong.
Once a reliable mathematical model is formed, the effect of the degradation can be obtained easily.
So, this is the model of the degradation which occurs because of turbulence. There are other approaches in which the mathematical model used to estimate the degradation is obtained from fundamental principles.
Possible classification of restoration methods:
Deterministic: we work with sample-by-sample processing of the observed (degraded) image. Stochastic: we work with the statistics of the images involved in the process.
Non-blind: the degradation process H is known. Blind: the degradation process H is unknown. Semi-blind: the degradation process H can be considered partly known.
Direct, iterative, or recursive.
Inverse Filtering (unconstrained)
In most images, adjacent pixels are highly correlated, while the gray levels of widely separated pixels are only loosely correlated. Therefore, the autocorrelation function of typical images generally decreases away from the origin.
The power spectrum of an image is the Fourier transform of its autocorrelation function; therefore we can argue that the power spectrum of an image generally decreases with frequency.
Typical noise sources have either a flat power spectrum or one that decreases with frequency more slowly than typical image power spectra. Therefore, the expected situation is for the signal to dominate the spectrum at low frequencies while the noise dominates at high frequencies.
Until now our focus was the calculation of the degradation function H(u, v). Having H(u, v) calculated or estimated, the next step is the restoration of the degraded image. There are different types of filtering techniques for restoring the original image from a degraded image; the simplest approach is direct inverse filtering.
Inverse filtering:
The concept of inverse filtering is very simple. G(u, v), the Fourier transform of the degraded image, is given by
G(u, v) = H(u, v) F(u, v) + N(u, v),
where H(u, v) is the degradation function in the frequency domain and F(u, v) is the Fourier transform of the original image. The inverse-filter estimate is
F̂(u, v) = G(u, v) / H(u, v) = F(u, v) + N(u, v) / H(u, v)
(the division is an array, element-by-element, operation).
Noise is enhanced when H(u, v) is small. To avoid this side effect of enhancing noise, we can apply this formulation only to frequency components (u, v) within a radius D0 from the centre of H(u, v).
So, this expression says that even if H(u, v) is known exactly, perfect reconstruction may not be possible, because N(u, v) is not known. Moreover, if H(u, v) is near zero, the term N(u, v)/H(u, v) will dominate the estimate of F(u, v).
Because H(u, v) F(u, v) is a point-by-point multiplication (for every value of u and v, the corresponding F component and H component are multiplied together), this problem can be reduced by limiting the analysis to frequencies near the origin.
The solution is therefore to carry out the restoration process in a limited neighborhood about the origin, where H(u, v) is not very small. This procedure is called pseudoinverse filtering.
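A minimal sketch of pseudoinverse filtering in the frequency domain (here the 'limited neighbourhood' is implemented as a small-|H| cutoff eps; a radius-D0 mask around the origin would be an equivalent alternative):

import numpy as np

def pseudo_inverse_filter(G, H, eps=1e-3):
    # G, H: DFTs of the degraded image and of the degradation function (same shape).
    # Where |H(u, v)| <= eps the estimate is set to 0 instead of dividing by ~0.
    F_hat = np.zeros_like(G, dtype=complex)
    ok = np.abs(H) > eps
    F_hat[ok] = G[ok] / H[ok]
    return F_hat          # the inverse DFT of F_hat gives the restored image f_hat(x, y)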
For a degradation such as atmospheric turbulence, H(u, v) falls off rapidly away from the origin of the frequency spectrum. If we apply a Butterworth lowpass function to the inverse filter around the origin, we will only pass the low frequencies, where H(u, v) has high amplitude.
Wiener filter (constrained)
Direct method (stochastic regularization).
The inverse filter considers the degradation function only and does not consider the noise. Also, up to what extent of (u, v) should we go in pseudoinverse filtering? That is image dependent, so it is not easy to decide how many frequency components to consider for the reconstruction of the original image with direct inverse filtering.
So, there is another approach, the minimum mean square error approach, also called the Wiener filtering approach. Wiener filtering tries to reconstruct the degraded image by minimizing an error function, as follows.
Restoration: Wiener filter
Degradation model: g(x, y) = h(x, y) * f(x, y) + η(x, y).
Wiener filter: a statistical approach that seeks an estimate f̂ minimizing the mean square error
e² = E{ (f - f̂)² }.
Assumptions:
Image and noise are uncorrelated.
Image and/or noise has zero mean.
Gray levels in the estimate are a linear function of the levels in the degraded image.
This is the minimum mean square error, or Wiener filtering, approach for restoration of a degraded image. In the discrete case, the mean square error is
MSE = (1/MN) Σ_{x=0}^{M-1} Σ_{y=0}^{N-1} [ f(x, y) - f̂(x, y) ]².
The Wiener filter estimate is
F̂(u, v) = [ H*(u, v) / ( |H(u, v)|² + S_η(u, v) / S_f(u, v) ) ] G(u, v),
where S_η(u, v) and S_f(u, v) are the power spectra of the noise and of the original image.
Now, you may notice that if the image does not contain any noise, then S_η(u, v), the power spectrum of the noise, is equal to 0, and in that case the Wiener filter becomes identical to the inverse filter. But if the degraded image also contains additive noise in addition to the blurring, then the Wiener filter and the inverse filter are different.
The performance of Wiener filtering depends upon how correctly you can estimate the ratio of the noise power spectrum to the power spectrum of the original, undegraded image (often replaced by a constant R when the spectra are unknown).
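When the two power spectra are unknown, their ratio is commonly replaced by a constant, giving the parametric Wiener filter sketched below (the constant K is tuned interactively):

import numpy as np

def wiener_filter(G, H, K=0.01):
    # F_hat = [ H* / (|H|^2 + K) ] G,  with K approximating S_eta / S_f
    return (np.conj(H) / (np.abs(H) ** 2 + K)) * G   # inverse DFT gives f_hat(x, y)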
The Constrained Least Squares filter method uses the mean of the noise, which we will write as m_η, and the variance of the noise, which we will write as σ_η².
So, we will see how reconstruction using the constrained least squares filter approach makes use of these noise parameters, the mean and the variance of the noise.
Degradation model:
g(x, y) = f(x, y) * h(x, y) + η(x, y), or in matrix form, g = H f + η.
Now, the matrix H is very sensitive to noise. To take care of that, we define an optimality criterion and carry out the reconstruction with respect to it. The optimality criterion we use is image smoothness: from our earlier discussion, the second derivative (Laplacian) operator enhances irregularities or discontinuities in the image, so we minimize the Laplacian of the estimate.
Objective: find the minimum of the criterion function
C = Σ_{x=0}^{M-1} Σ_{y=0}^{N-1} [ ∇² f̂(x, y) ]²
subject to the constraint
|| g - H f̂ ||² = || η ||².
Without going into the details of the mathematical derivation, the frequency domain solution of this constrained least squares estimation is
F̂(u, v) = [ H*(u, v) / ( |H(u, v)|² + γ |P(u, v)|² ) ] G(u, v).
ll
0 1 0
is adaptively adjusted to achieve the best result
Constrained least squares restoration using different values f or . Note that is a
scalar constant, not a ratio of frequency domain functions as with the Wiener case, that
are unlikely to be constant. In this case, is determined manually
Again as before, this H star indicates that it is the complex conjugate of H. Here again,
ld
we have a constant term given as gamma where the gamma is to be a adjusted so that
the specified constant that is g minus Hf hat square is equal to n square this constant is
met.
So, as we said that this gamma has to be adjusted manually for obtaining the optimum
or
result and the purpose is that this adjusted value of gamma, the gamma is adjusted so
that the specified constant is maintained. However, it is also possible to automatically
estimate the value of gamma by an iterative approach.
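A corresponding sketch for the constrained least squares filter, under the same assumptions (known PSF h, manually chosen gamma); the use of the 3x3 Laplacian as p(x, y) follows the text, while the function name is illustrative.

import numpy as np

def cls_restore(g, h, gamma):
    """Constrained least squares restoration.
    gamma is the manually adjusted regularization constant; p is the
    Laplacian smoothness operator used in the constraint."""
    p = np.array([[0, -1, 0],
                  [-1, 4, -1],
                  [0, -1, 0]], dtype=float)
    G = np.fft.fft2(g)
    H = np.fft.fft2(h, s=g.shape)
    P = np.fft.fft2(p, s=g.shape)
    F_hat = np.conj(H) / (np.abs(H) ** 2 + gamma * np.abs(P) ** 2) * G
    return np.real(np.fft.ifft2(F_hat))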
UNIT 5 : IMAGE SEGMENTATION
Input is an image; output is features of the image.
Segmentation is an approach for feature extraction in an image.
Features of an image: points, lines, edges, corner points, regions.
Attributes of features:
Geometrical attributes: orientation, length, curvature, area, diameter, perimeter, etc.
Topological attributes: overlap, adjacency, common end point, parallel, vertical, etc.
Image segmentation refers to the process of partitioning an image into a set of non-overlapping regions whose union is the entire image.
HOUGH TRANSFORM
Point detection:
Isolated Point detection:
So, when it comes to isolated point detection, we can use a mask with the coefficients given below (typically -1 at the eight neighbours and 8 at the centre).
Now, we say that an isolated point at a location (x, y) is detected in the image where the mask is centered if the corresponding modulus of the response |R| is greater than a certain threshold T, where T is a non-negative threshold value:
g(x, y) = 1 if |R(x, y)| >= T
        = 0 otherwise
Detection of Lines:
Apply all the 4 masks (horizontal, vertical, +45 and -45 degree) on the image. There will be four responses R1, R2, R3, R4. Suppose that at a particular point |R1| > |Rj| for j = 2, 3, 4 (j not equal to 1); then that point is said to be more likely associated with a line in the direction of the first mask.
Detection of an edge in an image:
What is an edge?
An ideal edge can be defined as a set of connected pixels, each of which is located at an orthogonal step transition in gray level.
Calculation of Derivatives of Edges:
There are various ways in which the first derivative operators can be implemented, for example the Prewitt edge operator and the Sobel edge operator (in the Sobel operator, noise is taken care of by the heavier weighting of the centre row/column).
Prewitt edge operator: horizontal mask Gx and vertical mask Gy.
Sobel edge operator: horizontal mask Gx and vertical mask Gy.
The direction of the edge, that is, the direction of the gradient vector of f, is
Direction(x, y) = tan^-1 ( Gy / Gx )
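A minimal sketch of the Sobel gradient computation described above; the explicit loops are for clarity rather than speed, and the border handling is an illustrative choice.

import numpy as np

def sobel_gradient(image):
    """Return gradient magnitude and direction using the Sobel masks."""
    Gx_mask = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    Gy_mask = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=float)
    padded = np.pad(image.astype(float), 1, mode="reflect")
    Gx = np.zeros(image.shape)
    Gy = np.zeros(image.shape)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            window = padded[i:i+3, j:j+3]
            Gx[i, j] = np.sum(window * Gx_mask)
            Gy[i, j] = np.sum(window * Gy_mask)
    magnitude = np.hypot(Gx, Gy)          # gradient magnitude
    direction = np.arctan2(Gy, Gx)        # tan^-1(Gy / Gx)
    return magnitude, direction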
We have to detect the position of an edge, and by this what is expected is to get the boundary of a particular segment. For this there are two approaches: one is local processing, and the second approach is global processing (the Hough transform).
EDGE LINKING BY LOCAL PROCESSING
Consider a point (x, y) in the image which has already been operated on by the Sobel edge operator, with T a threshold. In the edge image, take two points (x, y) and (x', y'); to link them, use similarity measures: the first one is the strength (magnitude) of the gradient and the second is the direction of the gradient, both given by the Sobel edge operator.
The Hough transform is a mapping from the spatial domain to a parameter space. For the line y = mx + c, the values of m and c are constant for a particular straight line.
Spatial domain -> Parameter space
Points -> Lines
Collinear points -> Intersecting lines
So, for implementation of the Hough transform, the entire mc space has to be subdivided into a number of accumulator cells. At each point of the parameter space, count how many lines pass through it. Where many lines intersect, there is a bright point in the parameter image, and it can be found by thresholding.
For a vertical line the slope m becomes infinity; to solve this, make use of the normal representation of a straight line:
rho = x cos(theta) + y sin(theta)
A point in image space is now represented as a sinusoid in (rho, theta) space, where rho is the magnitude (the normal distance from the origin to the line).
Instead of the slope m and intercept c, the parameters now become rho and theta. Use the parameter space (rho, theta).
The new space is FINITE:
0 <= rho <= D, where D is the image diagonal = sqrt(M^2 + N^2) for an M x N image
0 <= theta <= pi (or theta = +/- 90 deg)
In (rho, theta) space:
a point in image space == a sinusoid in (rho, theta) space
where sinusoids overlap, the accumulator is maximum
maxima still correspond to lines in image space
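A sketch of the (rho, theta) accumulator described above; the number of theta samples and the rounding of rho to integer cells are illustrative discretization choices.

import numpy as np

def hough_lines(edge_image, n_theta=180):
    """Accumulate votes in (rho, theta) space for a binary edge image."""
    M, N = edge_image.shape
    D = int(np.ceil(np.hypot(M, N)))                   # image diagonal
    thetas = np.deg2rad(np.arange(0, 180, 180 / n_theta))
    accumulator = np.zeros((2 * D + 1, len(thetas)), dtype=int)
    ys, xs = np.nonzero(edge_image)
    for x, y in zip(xs, ys):
        for t_idx, theta in enumerate(thetas):
            # rho can be negative for theta in [0, pi), so offset by D
            rho = int(round(x * np.cos(theta) + y * np.sin(theta))) + D
            accumulator[rho, t_idx] += 1               # one vote along the sinusoid
    return accumulator, thetas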
Global Thresholding: a threshold value is selected that depends only on the pixel intensities in the image.
Dynamic or adaptive thresholding: the threshold depends on the pixel value and the pixel position, so the threshold for different pixels in the image will be different.
Optimal thresholding: estimate how much error is incorporated if we choose a particular threshold, and then choose that value of the threshold for which the average error is minimized.
THRESHOLDING
Region-based segmentation operations: thresholding, region growing, and the region splitting and merging techniques.
For a bimodal histogram you find that there are two peaks. Now, the simplest form of segmentation is to choose a threshold value T in the valley region; if a pixel at location (x, y) has intensity value f(x, y) >= T, then we say the pixel belongs to the object, otherwise to the background.
So, you will find that the basic aim of this thresholding operation is to create a thresholded image g(x, y), a binary image containing pixel values 0 or 1 depending upon whether f(x, y) is above or below the threshold.
Automatic Thresholding
1. Choose an initial value of the threshold T.
2. With this threshold T, segregate the pixels into two groups G1 and G2.
3. Find the mean values of G1 and G2; let the means be mu1 and mu2.
4. Now choose a new threshold as the average of the means:
T_new = (mu1 + mu2)/2
5. With this new threshold, segregate the two groups again and repeat the procedure. If |T_new - T| > deltaT, set T = T_new and go back to step 2; else stop.
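The same iterative procedure written as a short Python function; deltaT is the stopping tolerance, and the image is assumed to yield two non-empty groups at every iteration.

import numpy as np

def iterative_threshold(image, deltaT=0.5):
    """Basic automatic (global) threshold selection by iterating on the
    means of the two groups until T changes by less than deltaT."""
    T = image.mean()                         # step 1: initial threshold
    while True:
        G1 = image[image > T]                # step 2: segregate pixels
        G2 = image[image <= T]
        mu1, mu2 = G1.mean(), G2.mean()      # step 3: group means
        T_new = 0.5 * (mu1 + mu2)            # step 4: new threshold
        if abs(T_new - T) <= deltaT:         # step 5: stop when stable
            return T_new
        T = T_new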
Basic Adaptive Thresholding:
Divide the image into sub-images and use local thresholds.
In the case of non-uniform illumination, getting a global threshold that is applicable over the entire image is very difficult. So, if the scene illumination is non-uniform, a global threshold is not going to give a good result. What we have to do is subdivide the image into a number of sub-regions, find a threshold value for each sub-region, and segment each sub-region using its estimated threshold value. Because the threshold value is position dependent (it depends upon the location of the sub-region), the kind of thresholding we are applying in this case is adaptive thresholding.

Basic Global and Local Thresholding
Simple thresholding schemes compare each pixel's gray level with a single global threshold. This is referred to as Global Thresholding.
Adaptive Thresholding:
Divide the image into sub-images and use local thresholds. Local properties (e.g., statistics) based criteria can be used for adapting the threshold.
Statistically examine the intensity values of the local neighbourhood of each pixel. The statistic which is most appropriate depends largely on the input image; simple and fast functions include the mean of the local intensity distribution.
You can simulate the effect with the following steps (a small sketch follows the list):
1. Convolve the image with a suitable statistical operator, i.e. the mean or median.
2. Subtract the original from the convolved image.
3. Threshold the difference image with a constant C.
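A minimal sketch of the local mean-minus-C rule from the steps above; the block size and the constant C are illustrative values, not prescribed by the notes.

import numpy as np

def adaptive_threshold(image, block=15, C=5):
    """Local (adaptive) thresholding: compare each pixel against the mean
    of its block x block neighbourhood minus a constant C."""
    img = image.astype(float)
    half = block // 2
    padded = np.pad(img, half, mode="reflect")
    out = np.zeros(img.shape, dtype=np.uint8)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            local_mean = padded[i:i+block, j:j+block].mean()
            out[i, j] = 1 if img[i, j] > local_mean - C else 0
    return out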
Now, what is our aim in this particular case? Our aim is to determine a threshold T which will minimize the average segmentation error.
If f(x, y) is greater than T, then (x, y) is assigned to the object; but a pixel with intensity value f(x, y) also has a finite probability of belonging to the background. So, while taking this decision we incorporate some error; the error is the area under the corresponding probability curve on the wrong side of T. Let us say the corresponding error is E1(T). Similarly, if a background pixel is classified as an object pixel, the corresponding error is
E2(T) = Integral from T to infinity of p1(z) dz
where the integral is taken from T to infinity.
Overall probability of error is given by:
E(T) = P2 E1(T) + P1 E2(T)
Now, for minimization of this error,
dE(T)/dT = 0
Assuming Gaussian probability density functions, the value of T is found as the solution of the quadratic equation
A T^2 + B T + C = 0, where
A = sigma1^2 - sigma2^2
B = 2 (mu1 sigma2^2 - mu2 sigma1^2)
C = sigma1^2 mu2^2 - sigma2^2 mu1^2 + 2 sigma1^2 sigma2^2 ln( sigma2 P1 / sigma1 P2 )
If the two variances are equal, sigma^2 = sigma1^2 = sigma2^2, the optimal threshold is obtained by
T = (mu1 + mu2)/2 + [ sigma^2 / (mu1 - mu2) ] ln( P2/P1 )
If, in addition, P1 and P2 are the same, the value of T is simply (mu1 + mu2)/2, that is, the mean of the average intensities of the foreground region and the background region.
Boundary characteristics for Histogram Thresholding
Use of Boundary Characteristics for Histogram Improvement and Local Thresholding
But the boundary between the object and the background itself is not known.
Compute the gradient of intensities (first derivative) and the second-order derivative, the Laplacian (this will be affected by noise). The first derivative is used for identifying the edge position and the Laplacian for identifying on which side of the edge a pixel lies; on the bright side of the edge the Laplacian becomes negative.
So, our approach is to consider, for generation of the histogram, only those pixels which lie on the boundary, on the edge between the object and the background. That information can be obtained from the output of the gradient, because for all pixels which are not on or near the edge the gradient is small; for such points we set s(x, y) = 0. We put s(x, y) = + if the gradient of f is greater than or equal to T and the Laplacian of f is greater than or equal to 0, indicating an edge point (or a point near the edge) on the dark side of the edge, and s(x, y) = - if the gradient is greater than or equal to T and the Laplacian is negative, indicating the bright side of the edge.
Region growing:
Starting from a particular pixel, you try to grow the region based on connectivity or adjacency and similarity. That is the region-growing based approach:
Group pixels from sub-regions into larger regions.
Start from a set of seed pixels and append pixels with similar properties.
Selection of similarity criteria: color, descriptors (gray level + moments).
Stopping rule.
Basic formulation:
Every pixel must be in a region.
Points in a region must be connected.
Regions must be disjoint.
The region growing operation starts from the seed point. Choosing a 3 by 3 neighbourhood around the seed point, grow the region starting from the seed point; all the points included in the same group or partition have to be connected. That means the region grows from points which are connected to the seed point, and at the end what we have is a number of regions grown around the seed points.
So, what does this region growing actually mean? Region growing, as the name implies, is a procedure which groups pixels or sub-regions into a larger region based on some predefined criteria, say, similarity or closeness in intensities.
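A small sketch of seed-based region growing with an intensity-difference similarity criterion; the 8-connectivity and the fixed tolerance tol are illustrative choices, not the only possible ones.

import numpy as np
from collections import deque

def region_grow(image, seed, tol=10):
    """Grow a region from a seed pixel, appending 8-connected neighbours
    whose intensity is within tol of the seed intensity."""
    region = np.zeros(image.shape, dtype=bool)
    seed_val = float(image[seed])
    queue = deque([seed])
    region[seed] = True
    while queue:
        i, j = queue.popleft()
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                ni, nj = i + di, j + dj
                if (0 <= ni < image.shape[0] and 0 <= nj < image.shape[1]
                        and not region[ni, nj]
                        and abs(float(image[ni, nj]) - seed_val) <= tol):
                    region[ni, nj] = True
                    queue.append((ni, nj))
    return region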
Region splitting & merging, Quadtree decomposition
Region splitting and merging: split the image into a number of smaller sub-images or components, then try to merge those sub-images which are adjacent and similar in some sense.
Let R denote the full image. If all the pixels in the image are similar, leave it as it is. If they are not similar, break the image into quadrants, i.e. make 4 partitions of the image.
Then check whether each and every partition is similar (uniform); if not, partition, say, region R1 again, making it R10, R11, R12, R13, and go on doing this partitioning until you reach a partition size which is the smallest permissible, or a situation where the partitions have become uniform and you cannot partition them anymore. In the process of doing this, we obtain a quadtree representation of the image.
So, in the quadtree representation, if the root node is R, the initial partition gives 4 nodes, R0, R1, R2 and R3; R1 then gives R10, R11, R12 and R13. Once such partitioning is completed, you check all the adjacent partitions to see if they are similar. If they are similar, you merge them together to form a bigger segment: say, if R12 and R13 are similar, merge them.
So, this is the concept of the splitting and merging technique for segmentation. At the end, stop splitting when no more partition is possible, i.e. you have reached the minimum partition size or every partition has become uniform; then look for adjacent partitions which can be combined together to give a bigger segment.
Data Redundancy
Data redundancy is the central concept in image compression and can be mathematically defined. Because various amounts of data can be used to represent the same amount of information, representations that contain irrelevant or repeated information are said to contain redundant data.
The relative data redundancy RD of the first data set, n1, is defined by
RD = 1 - 1/CR
where CR refers to the compression ratio (compression may also be quantified in bits per pixel, bpp), defined by
CR = n1 / n2
Code: a list of symbols (letters, numbers, bits, bytes etc.)
Code word: a sequence of symbols used to represent a piece of information or an event (e.g., gray levels).
Code word length: number of symbols in each code word.
Ex: 101 is a binary code for 5; code length 3, symbols 0 and 1.
rk denotes the pixel values defined in the interval [0, 1] and p_r(rk) is the probability of occurrence of rk; L is the number of gray levels and nk is the number of times that the kth gray level appears in the image, so that p_r(rk) = nk/n.
The average number of bits used for a code with word lengths l(rk) is
L_avg = Sum_k l(rk) p_r(rk)
which for a fixed 3-bit code equals 3 bits/pixel.

Inter pixel Redundancy or Spatial Redundancy
The gray level of a given pixel can be predicted from its neighbors and only the difference is used to represent the image; this type of transformation is called mapping.
Run-length coding can also be employed to exploit inter pixel redundancy in image compression.
Removing inter pixel redundancy is lossless.
Irrelevant information
One of the simplest ways to compress a set of data is to remove superfluous data. For images, information that is ignored by the human visual system or is extraneous to the intended use of an image is an obvious candidate for omission.
A gray image that appears as a homogeneous field of gray can be represented by its average intensity alone, a single 8-bit value; therefore considerable compression can be achieved.
Psychovisual Redundancy (the eye can resolve only about 32 gray levels)
The eye does not respond with equal sensitivity to all visual information. The method used to remove this type of redundancy is called quantization, which means the mapping of a broad range of input values onto a limited number of output values.
Fidelity criteria are used to measure information loss and can be divided into two classes:
1) Objective fidelity criteria (a mathematical expression is used): the amount of error in the reconstructed data is measured mathematically.
2) Subjective fidelity criteria: measured by human observation.
Objective fidelity criteria:
When information loss can be expressed as a mathematical function of the input and output of the compression process, it is based on an objective fidelity criterion, for instance a root-mean-square (rms) error between two images. Let f(x, y) be an input image and f^(x, y) be an approximation of f(x, y) resulting from compressing and decompressing the input image; for an M x N image,
e_rms = [ (1/MN) Sum_x Sum_y ( f^(x, y) - f(x, y) )^2 ]^(1/2)
While objective fidelity criteria offer a simple and convenient way to estimate information loss, images are ultimately viewed by humans. Therefore, measuring image quality by subjective evaluations of people is often more appropriate: show two images (original and decompressed) to a number of viewers and average their evaluations.
Subjective fidelity criteria:
A decompressed image is presented to a cross section of viewers and their evaluations are averaged. It can be done using an absolute rating scale or by means of side-by-side comparisons of f(x, y) and f^(x, y). Side-by-side comparison can be done with a scale such as
{-3, -2, -1, 0, 1, 2, 3}
representing the subjective evaluations
{much worse, worse, slightly worse, the same, slightly better, better, much better}.
In a predictive compression system, Quantizer + Mapper = a single block.
Mapper: transforms the image into an array of coefficients, reducing inter pixel redundancies. This is a reversible process which is not lossy; run-length coding is an example of mapping. In video applications, the mapper uses previous (and future) frames to facilitate removal of temporal redundancy.
Quantizer: this process reduces the accuracy, and hence the psychovisual redundancies, of a given image; it is irreversible and therefore lossy.
The entropy of the source is
H = - [ 0.4 log(0.4) + 0.3 log(0.3) + 0.1 log(0.1) + 0.1 log(0.1) + ... ]
In the Huffman code, the shortest codeword is assigned to the symbol/pixel with the highest probability (a2), and the longest codeword (01011) is given to the symbol/pixel with the lowest probability (a5). The average length of the code is given by L_avg = Sum_k l(ak) p(ak).
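A short, illustrative Huffman construction. The first four probabilities (0.4, 0.3, 0.1, 0.1) appear in the entropy expression above; the remaining two (0.06 and 0.04) are assumed here only so that the probabilities sum to 1, and the exact codewords produced depend on tie-breaking inside the heap.

import heapq
from math import log2

def huffman_code(probs):
    """Build Huffman code words for a dict {symbol: probability}."""
    heap = [[p, [sym, ""]] for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        lo = heapq.heappop(heap)
        hi = heapq.heappop(heap)
        for pair in lo[1:]:
            pair[1] = "0" + pair[1]      # prepend a bit to every symbol in this branch
        for pair in hi[1:]:
            pair[1] = "1" + pair[1]
        heapq.heappush(heap, [lo[0] + hi[0]] + lo[1:] + hi[1:])
    return dict(heap[0][1:])

# probabilities: a2 highest, a5 lowest; 0.06 and 0.04 are assumed values
probs = {"a1": 0.1, "a2": 0.4, "a3": 0.06, "a4": 0.1, "a5": 0.04, "a6": 0.3}
codes = huffman_code(probs)
L_avg = sum(probs[s] * len(codes[s]) for s in probs)     # average code length
H = -sum(p * log2(p) for p in probs.values())            # source entropy (bits/symbol)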
Transform Coding (lossy image compression)
In digital images the spatial frequencies are important, as they correspond to important image features; high frequencies are a less important part of the image. This method uses a reversible transform (e.g. Fourier or cosine transform) to map the image into a set of transform coefficients which are then quantized and coded.
Transform Selection: the system is based on discrete 2D transforms. The choice of a transform in a given application depends on the amount of reconstruction error that can be tolerated and the computational resources available.
Consider an N x N sub-image f(x, y); the forward discrete transform T(u, v) is computed for u, v = 0, 1, 2, ..., N-1.
General scheme:
The transform has to decorrelate the pixels, or to compact as much information as possible into the smallest number of transform coefficients.
The quantization selectively eliminates, or more coarsely quantizes, the less informative coefficients.
For the discrete cosine transform the normalization factor is
alpha(u) = sqrt(1/N) if u = 0, and alpha(u) = sqrt(2/N) if u = 1, 2, ..., N-1 (the same holds for alpha(v)).
Inverse DCT:
f(x, y) = Sum_u Sum_v alpha(u) alpha(v) C(u, v) cos[(2x+1)u pi / 2N] cos[(2y+1)v pi / 2N]
JPEG Standard
JPEG exploits spatial redundancy. The objective of image compression standards is to enhance the interoperability and compatibility among compression systems by different vendors. JPEG is a standard for digital compression and coding of continuous-tone still images.
Different modes, such as sequential, progressive and hierarchical modes, and options like lossy and lossless modes of the JPEG standard exist.
JPEG supports the following modes of encoding:
Sequential: the image is encoded in the order in which it is scanned; each image component is encoded in a single left-to-right, top-to-bottom scan.
Applications: color FAX, digital still camera, multimedia computer, internet.
The JPEG Standard consists of the algorithm: DCT + quantization + variable length coding.
Steps in JPEG Compression (a small sketch of the DCT-plus-quantization step follows this list):
Divide the file into 8 x 8 blocks.
Apply the DCT: transform the pixel information from the spatial domain to the frequency domain with the Discrete Cosine Transform.
Quantize the coefficients and reorder them in a zig-zag manner.
Follow by Huffman coding.
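A sketch of the DCT-plus-quantization core for one 8 x 8 block, using SciPy's DCT. The uniform quantization table Q in the usage line is purely illustrative, not the standard JPEG luminance table.

import numpy as np
from scipy.fftpack import dct, idct

def dct2(block):
    return dct(dct(block, axis=0, norm="ortho"), axis=1, norm="ortho")

def idct2(block):
    return idct(idct(block, axis=0, norm="ortho"), axis=1, norm="ortho")

def jpeg_like_block(block, Q):
    """Quantize one 8x8 block of DCT coefficients with table Q and
    reconstruct it, mimicking the lossy part of JPEG."""
    coeffs = dct2(block.astype(float) - 128.0)     # level shift, then 2D DCT
    quantized = np.round(coeffs / Q)               # lossy quantization step
    return idct2(quantized * Q) + 128.0            # dequantize, inverse DCT, shift back

# usage (illustrative): Q = np.full((8, 8), 16.0); rec = jpeg_like_block(image[:8, :8], Q)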
In JPEG 2000, Instead of the DCT transformation, JPEG 2000,
ISO/IEC 15444, uses the Wavelet transformation.
W
The advantage of JPEG 2000 is that the blockiness of JPEG
is removed, but replaced with a more overall fuzzy picture,
TU
H.261/H.263 originally designed for video-conferencing over
telephone lines, i.e. low bandwidth
JN
Allows to extract various sub-images from a single compressed
image code stream, the so called Compress Once, Decompress
Many Ways.
ll
A
JPEG2000 (J2K)
ld
Better efficiency, and more functionality
or
Multiple resolution
Large images
W
Single decompression architecture
Spatial Scalability:
TU
Multi-resolution decoding from one bit-stream
Blockiness of JPEG is removed,
JN
The compression ratio for JPEG 2000 is higher than for JPEG
Discrete Wavelet Transform (DWT)
Embedded Block Coding with Optimized Truncation (EBCOT)
ll
A
Applications of JPEG-2000 and their requirements
ld
Internet
or
Color facsimile
Printing
W
Scanning
Digital photography
Remote Sensing
Mobile
Medical imagery
TU
Digital libraries and archives
JN
E-commerce
ll
A
Each application area has some requirements which the
ld
standard should fulfill.
Improved low bit-rate performance: It should give acceptable
or
quality below 0.25 bpp. Networked image delivery and remote
sensing applications have this requirements.
W
Progressive transmission: The standard should allow progressive
transmission that allows images to be reconstructed with increasing
pixel accuracy and resolution.
TU
Region of Interest Coding: It should preferentially allocate more bits
to the regions of interest (ROIs) as compared to the non-ROI ones.
The JPEG 2000 coding chain is: 2D Discrete Wavelet Transform -> Quantization -> Entropy Coding.
The 2D discrete wavelet transform (the 1D DWT applied alternately in the horizontal and vertical directions, line by line) converts the image into sub-bands: the upper-left sub-band holds the DC (lowest-frequency) coefficients and the lower-right sub-bands hold the higher-frequency coefficients.
Image decomposition, scale 1: 4 subbands LL1, HL1, LH1, HH1.
Image decomposition, scale 2: the LL1 band is decomposed again into LL2, HL2, LH2, HH2 (together with the scale-1 detail bands HL1, LH1, HH1).
Children / Descendants: corresponding coefficients at finer scales. Ancestors: corresponding coefficients at coarser scales.
ld
or
Image Decomposition
W
Feature 1:
Energy distribution similar to other
TC: Concentrated in low frequencies
Feature 2:
TU
Spatial self-similarity
across subbands
JN
ll
A
ld
Post-Compression
Rate-Distortion
or
(PCRD
W
TU
JN
ll
A
Embedded Block Coding with Optimized Truncation of bit-stream
ld
(EBCOT), which can be applied to wavelet packets and which
offers both resolution scalability and SNR scalability.
or
Each sub band is partitioned into small non-overlapping block
W
of samples, known as code blocks. EBCOT generates an
embedded bit-stream for each code block. The bit-stream
associated with each code block may be truncated to any of a
TU
collection of rate-distortion optimized truncation points
JN
ll
A
Steps in JPEG2000
ld
Tiling:
or
Smaller non-overlapping blocks of image are known as tiles
The image is split into tiles, rectangular regions of the
image. Tiles can be any size.
W
Dividing the image into tiles is advantageous in that the decoder will
need less memory to decode the image and it can opt to decode
only selected tiles to achieve a partial decoding of the image.
TU
Wavelet Transform: Either CDF 9/7 or CDF 5/3 bi-orthogonal
wavelet transform.
JN
Quantization: Scalar quantization
or
regions in the wavelet domain. They are selected in a way that
the coefficients within them across the sub-bands form
W
approximately spatial blocks in the image domain. Precincts are
split further into code blocks. Code blocks are located in a single
sub-band and have equal sizes. The encoder has to encode the
bits of all quantized coefficients of a code block, starting with the
TU
most significant bits and progressing to less significant bits by
EBCOT scheme.
JN
ll
A
MPEG1 MOVING PICTURE EXPERT GROUP
ld
MPEG exploits temporal redundancy. Prediction based.
Compare each frame of a sequence with its predecessor and only pixels that
or
have changed are updated,
MPEG-1 standard is for storing and retrieving video information on digital
storage media.
W
MPEG-2 standard is to support digital video broadcasting, HDTV systems.
H.261 standard for telecommunication applications
TU
Temporal compression algorithm: Temporal compression algorithm relies on
similarity between successive pictures using prediction in motion compensation
Spatial compression algorithm: relies upon redundancy with in small areas of a
JN
picture and is based around the DCT transform, quantization an entropy coding
techniques.
MPEG-1 was up to 1.5 Mbit/s. MPEG-2 typically over 4MBit/s but can be up to
80 Mbit/s.
ll
or
DVD, VOD, etc.)
Coding scheme :
W
Spatial redundancy : DCT + Quantization
Temporal redundancy : Motion estimation and
compensation Statistical redundancy : VLC
Applications :
TU
Internet Multimedia, Wireless Multimedia Communication
Multimedia Contents for Computers and Consumer
Electronics Interactive Digital TV
Coding scheme :
JN
Spatial redundancy : DCT + Quantization, Wavelet Transform
Temporal redundancy : Motion estimation and compensation
Statistical redundancy : VLC (Huffman Coding, Arithmetic
Coding) Shape Coding : Context-based Arithmetic Coding
ll
A
Wavelets are functions that wave above and below the x-axis,
ld
(1)varying frequency, (2) limited duration, and (3)
or
an average value of zero.
This is in contrast to sinusoids, used by FT, which have infinite energy.
W
Like sines and cosines in FT, wavelets are used as basis functions k(t)
COMPARISON OF THE
SINE WAVE AND THE
TU WAVELET
A wavelet is a
JN
waveform of
effectively
limited duration
that has an
average value of
ll
zero.
A
Wavelet functions
Wavelet (mother wavelet) and scaling function (father wavelet); translation and dilation.
These two functions generate a family of functions that can be used to break up or reconstruct a signal. The scaling function is given as
phi_{j,m,n}(x, y) = 2^(j/2) phi(2^j x - m, 2^j y - n)
The orthonormal basis or wavelet basis is
psi^i_{j,m,n}(x, y) = 2^(j/2) psi^i(2^j x - m, 2^j y - n), with i in {H, V, D}
where psi is called the wavelet function and j and m, n are integers that scale and dilate the wavelet function. The factor j is known as the scale index, which indicates the wavelet's width; the location indices m, n give its position. The wavelet function is dilated by powers of two and translated by the integers m, n.
Commonly used wavelet families include:
Morlet (Gabor)
Haar (Daubechies 1)
Daubechies 2, 3, 4, 5, 6, and 10
Symlets (symmetric) 2, 3, 4, and 5
Biorthogonal 3.3, 3.5, 3.7, 3.9, 4.4, 5.5, and 6.8
Meyer
Mexican hat
chirplet
Crude wavelets are generated from a mathematical expression. To use them with digital signals, the crude wavelets have to be converted to wavelet filters having a number of discrete, equally spaced points.
Two popular wavelets for the CWT are the Mexican hat and the Morlet.
The Mexican hat wavelet is the second derivative of the Gaussian function, given (up to normalization) as
psi(t) = (1 - t^2) e^(-t^2/2)
The two-dimensional Mexican hat wavelet is well known as the Laplacian operator, widely used for zero-crossing image edge detection.
The Morlet wavelet is given (up to normalization) by
psi(t) = e^(j w0 t) e^(-t^2/2)
Haar Wavelet
The Haar wavelet is a bipolar step function (taking the values +1 and -1), localised in the time domain but poorly localised in the frequency domain.
The wavelet and scaling coefficients are related by the quadrature mirror relationship. The term N is the number of vanishing moments.
Wavelet family     Filter length     Number of vanishing moments, N
Haar               2                 1
Daubechies M       2M                M
Coiflets M         6M                2M - 1
Symlets M          2M                M
Wavelets can come in various shapes and sizes by stretching and shifting. (Figure: amplitude vs. time in the time domain, magnitude vs. frequency in the frequency domain, and amplitude vs. translation and scale in the wavelet domain.)
Multiscale analysis, 1D example: the Haar wavelet as a multiscale filter.
Example: the averages from one level can be filtered again to give the next-level averages c2 and details d2. Multiplication by sqrt(2) is needed to ensure energy conservation.
Energy concerns: the 1-level Haar transform conserves the energy of the signal (the sum of squares of the output coefficients equals that of the input samples).
SHORT TIME FOURIER TRANSFORM (STFT)
ld
To analyze only a small section of the signal at a time -- a technique called
Windowing the Signal is used.
. The Segment of Signal is Assumed Stationary
or
W
A function of time and frequency
TU
Time/Frequency localization depends on window size.
Once you choose a particular window size, it will be the same for all frequencies
Use narrower windows at high frequencies for better time
resolution.
JN
ld
Wavelet transform is capable of providing the time and frequency
information simultaneously,
or
Similar to STFT: signal is multiplied with a function
W
The Forest & the Trees
Notice gross features with a large "window Notice
small features with a small "window
TU
Width of the Window is Changed as the Transform is Computed
for Every Spectral Components
JN
Split Up the Signal into a Bunch of Signals Representing the Same Signal, but
all Corresponding to Different Frequency Bands Only Providing What
Frequency Bands Exists at What Time Intervals
ll
A
What is the wavelet transform?
The wavelet transform decomposes a signal using a set of basis functions (wavelets). Wavelets are obtained from a single prototype wavelet psi(t), called the mother wavelet, by dilations and shifting:
psi_{s,tau}(t) = (1/sqrt(s)) psi( (t - tau)/s )
where s is the scaling parameter and tau is the shifting parameter. Fine details correspond to small scale (high frequency) and coarse details to large scale (low frequency), so the transform is a 2D function of scale/frequency and time localization. For the dyadic (discrete) case, s = 2^j and tau = k 2^j.
ld
Scaling, Stretching, Dilation (opposite is
length.
compression, Shrinking) all refer to the frequency (pseudo) for
or
expanding or shrinking the wavelet in time axis.
In Digital Signal Processing, scaling means changing amplitude.
W
detailsHigh frequency detailed S<1: compress the signal
TU
High scale or large scale a Stretched wavelet Slowly changing, coarse
JN
features Low frequency Non detailed S>1: dilate the signal
ll
A
Shifting, Sliding or Translation: shifting the wavelet in time axis
ld
Shifting a wavelet simply means delaying (or hastening) its onset.
Mathematically, delaying a function f(t) by k is represented by f(t-k)
or
W
TU C = 0.0004
JN
ll
C = 0.0034
A
SCALING AND SHIFTING PROCESS OF THE DWT
DEFINITION OF CONTINUOUS WAVELET TRANSFORM
ld
There are two main differences between the STFT and the CWT:
or
The width of the window is changed as the transform is computed for every
single spectral component, which is probably the most significant characteristic
of the wavelet transform.
W
The Continuous Wavelet Transform (CWT) is
the scalar product of the original signal with a translated and
dilated version of a locally confined function, called wavelet .
TU
Therefore, the CWT of a function depends on two parameters,
ONE for translation , shifts the wavelet for local information and
JN
OTHER for dilation controls the window size in which the signal analysis
must be performed .
ll
A
2D Continuous Wavelet Transform
ld
or
W
The wavelets are generated from a single basic wavelet (t), the so-
called mother wavelet, by scaling and translation
s is the scale factor, is
the translation factor and
CWT can be regarded as the inner product of the signal with a basis function
ll
A
As seen in the above equation, the transformed signal is a
ld
function of two variables, and s, the translation and scale
parameters, respectively. (t) is the transforming function, and it
or
is called the mother wavelet .
In terms of frequency, low frequencies (high scales) correspond to
W
a global information of a signal (that usually spans the entire
signal), whereas high frequencies (low scales) correspond to a
detailed information of a hidden pattern in the signal (that usually
TU
lasts a relatively short time).
Main Steps
1. Take a wavelet and compare it to a section at the start of the original signal.
2. Calculate a number, C, that represents how closely correlated the wavelet is with this section of the signal; the higher C is, the greater the similarity.
3. Shift the wavelet to the right and repeat steps 1 and 2 until the whole signal has been covered.
4. Scale (stretch) the wavelet and repeat steps 1 through 3.
5. Repeat steps 1 through 4 for all scales.
Wavelet analysis produces a time-scale view of the input signal or image.
Inverse CWT:
COMPARISON OF TRANSFORMATIONS
COMPARISON OF CWT And Discrete Wavelet Transform
              CWT                                        DWT
Scale         At any scale                               Dyadic scales
Translation   At any point                               Integer points
Wavelet       Any wavelet that satisfies minimum         Orthogonal, biorthogonal
              criteria
Computation   Large                                      Small
Detection     Easily detects direction, orientation      Cannot detect minute objects if not finely tuned
Application   Pattern recognition, feature extraction,   Compression, de-noising, transmission,
              detection                                  characterisation
ld
DWT is a fast linear operation on a data vector, whose length is an integer power of
2. Behaves like a filter bank: Data in, coefficients out
or
W
TU
JN
ll
ld
When the signal is in vector form (or pixel form), the discrete
wavelet transform may be applied.
or
Decompose the signal into a coarse approximation and detail
information
W
Remember that our discrete wavelets are not time-discrete,
only the translation and the scale step are discrete.
TU
Discrete Wavelet Transform is computed by
successive low pass and high pass filtering of the input image
data.
JN
This is called the Mallat algorithm or
Mallat-tree decomposition.
Mallat was the first to implement this scheme, using a well known filter design
ll
called two channel sub band coder, yielding a Fast Wavelet Transform
A
Mallats Algorithm
ld
The input data is passed through two convolution functions, each of
which creates an output stream that is half the length of the original input.
This procedure is referred to as down sampling . The convolution
or
functions are filters.
The low pass outputs contain most of the information of the input data and
are known as coarse coefficients. The outputs from the high pass filter are
W
known as detail coefficients.
The coefficients obtained from the low pass filter are used as the
original signal for the next set of coefficients.
TU
This procedure is carried out recursively until a trivial number of low pass
filter coefficients are left. The final output contains the remaining low pass
filter outputs and the accumulated high pass filter outputs.
JN
This procedure is termed as decomposition.
ll
A
Effectively, the DWT is nothing but a system of filters. There are two filters involved: one is the wavelet filter (a high-pass filter producing the details) and the other is the scaling filter (a low-pass or averaging filter producing the approximations).
Approximations: high-scale, low-frequency components of the signal.
Details: low-scale, high-frequency components of the signal.
The input signal is passed through the LPF and the HPF. This process produces twice the data it began with, so the filter outputs are downsampled by two, keeping every second coefficient.
Multi-level Wavelet Analysis
or
Multi-level wavelet decomposition tree Reassembling original signal
W
TU
JN
ll
A
2D (Separable) Wavelets and Filter Banks
The 1D wavelet transform can be applied twice to form a 2D wavelet transform. First the rows of the input are transformed, then the columns. This approach only works if the filters are separable, that is, if the filter transfer functions are of the form H(z1, z2) = H1(z1)H2(z2).
Thus in 2D, the wavelet transform has 4 stages for every scale: filtering and decimation along the rows and then along the columns. To extend the transform to color, the image can be decomposed into its RGB components, essentially reducing the image to 3 separate grayscale images.
For example, a 256 x 256 image decomposes at the first scale into four 128 x 128 sub-band images.
Wavelet Transforms in Two Dimensions
W_phi(j0, m, n) = (1/sqrt(MN)) Sum_{x=0}^{M-1} Sum_{y=0}^{N-1} f(x, y) phi_{j0,m,n}(x, y)
W^i_psi(j, m, n) = (1/sqrt(MN)) Sum_{x=0}^{M-1} Sum_{y=0}^{N-1} f(x, y) psi^i_{j,m,n}(x, y),  i in {H, V, D}
with
phi_{j,m,n}(x, y) = 2^(j/2) phi(2^j x - m, 2^j y - n)
psi^i_{j,m,n}(x, y) = 2^(j/2) psi^i(2^j x - m, 2^j y - n)
Inverse Wavelet Transform in Two Dimensions
f(x, y) = (1/sqrt(MN)) Sum_m Sum_n W_phi(j0, m, n) phi_{j0,m,n}(x, y)
        + (1/sqrt(MN)) Sum_{i in {H,V,D}} Sum_{j >= j0} Sum_m Sum_n W^i_psi(j, m, n) psi^i_{j,m,n}(x, y)
2D Wavelet Transform: filter and downsample along the rows, then filter and downsample along the columns, producing the LoLo, LoHi, HiLo and HiHi sub-bands.
Analysis Filter Bank
An analysis filter bank has a common input; a synthesis filter bank has common outputs.
Low-pass (averaging) filter:
y0[n] = (1/2)( x[n] + x[n-1] ),    Y0(z) = H0(z) X(z) = (1/2)[ 1 + z^-1 ] X(z)
High-pass (differencing) filter:
y1[n] = (1/2)( x[n] - x[n-1] ),    Y1(z) = H1(z) X(z) = (1/2)[ 1 - z^-1 ] X(z)
2-D 4-band filter bank: the four outputs are the approximation, the vertical detail, the horizontal detail, and the diagonal detail sub-bands.
SUBBAND CODING ALGORITHM
ld
If we regard the wavelet transform as a filter bank, then we can consider wavelet
transforming a signal as passing the signal through this filter bank. The outputs of the
or
different filter stages are the wavelet- and scaling function transform coefficients
W
Doubles the Frequency Resolution
The spanned frequency band halved
TU
JN
ll
A
ld
or
LP HP
W
TU
Another way is to split the signal spectrum in two (equal) parts, a low-pass and a
high-pass part. The high-pass part contains the smallest details we are interested in
JN
and we could stop here. We now have two bands. However, the low-pass part still
contains some details and therefore we can split it again. And again, until we are
satisfied with the number of bands we have created. In this way we have created an
iterated filter bank. Usually the number of bands is limited by for instance the
amount of data or computation power available.
ll
A
ld
The wavelets give us the band-pass bands with doubling bandwidth and
the scaling function provides us with the low-pass band.
or
From this we can conclude that a wavelet transform is the same thing as
a subband coding scheme using a constant-Q filter bank.
W
Summarizing, if we implement the wavelet transform as an iterated filter bank, we
do not have to specify the wavelets explicitly!
TU
JN
ll
A
Averaging and differencing at multiple scales can be done efficiently via a discrete orthogonal wavelet transform (WT), e.g., the Haar wavelet.
Haar Wavelet
Resolution   Averages                     Detail coefficients
3            [2, 2, 0, 2, 3, 5, 4, 4]     ----
2            [2, 1, 4, 4]                 [0, -1, -1, 0]
1            [1.5, 4]                     [0.5, 0]
0            [2.75]                       [-1.25]
Wavelet decomposition (wavelet transform):
[2.75, -1.25, 0.5, 0, 0, -1, -1, 0]
Haar Wavelet Coefficients
Using the wavelet coefficients one can recover the raw data. For compression, keep only the large wavelet coefficients of
[2.75, -1.25, 0.5, 0, 0, -1, -1, 0]
and set the other coefficients to 0. The elimination of small coefficients introduces only a small error when reconstructing the original data. A small sketch of the averaging-and-differencing decomposition follows.
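The averaging-and-differencing procedure of the table above, written as a short function; the output ordering (overall average first, then coarsest-to-finest details) matches the decomposition shown.

import numpy as np

def haar_decompose(signal):
    """Repeated averaging and differencing, as in the Haar example above.
    Returns [overall average, coarsest detail, ..., finest details]."""
    data = np.asarray(signal, dtype=float)
    output = []
    while len(data) > 1:
        averages = (data[0::2] + data[1::2]) / 2.0
        details = (data[0::2] - data[1::2]) / 2.0
        output = list(details) + output      # coarser details go in front of finer ones
        data = averages
    return [data[0]] + output

# haar_decompose([2, 2, 0, 2, 3, 5, 4, 4]) -> [2.75, -1.25, 0.5, 0.0, 0.0, -1.0, -1.0, 0.0]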
Why wavelets for denoising?
Noise removal or noise reduction can be done on an
ld
image by filtering,
by wavelet analysis, or
or
by multi-fractal analysis.
W
During thresholding, a wavelet coefficient is compared
with a given threshold and is set to zero if its magnitude is
TU
less than the threshold; otherwise, it is retained or
modified depending on the threshold rule.
Types
JN
Universal or Global Thresholding
Hard
Soft
SubBand Adaptive Thresholding
ll
A
ld
The choice of a threshold is an important point of interest.
or
Care should be taken so as to preserve the edges of the
denoised image.
W
There exist various methods for wavelet thresholding, which rely
on the choice of a threshold value.
Some typically used methods for image noise removal include
TU
VisuShrink, Level shrink, SureShrink and BayesShrink
JN
ll
A
The hard thresholding rule is usually referred to simply as wavelet thresholding.
Soft thresholding is defined as
T(x) = x - t,  if x > t
     = x + t,  if x < -t
     = 0,      otherwise
and is referred to as wavelet shrinkage, since it "shrinks" the coefficients with high amplitude towards zero.
hard- thresholding and soft-thresholding
ld
Hard thresholding sets zeros for all wavelet coefficients whose
absolute value is less than the specified threshold limit. It has
or
shown that hard thresholding provides an improved signal to noise
ratio.
W
Soft thresholding is where the coefficients with greater than
the threshold are shrunk towards zero after comparing them to
a threshold value
TU
JN
ll
A
The thresholding of the wavelet coefficients is usually applied only to the detail coefficients d_{j,k} of y rather than to the approximation coefficients a_{j,k}, since the latter represent 'low-frequency' terms that usually contain important components of the signal and are therefore left unmodified.
In practice, it can be seen that the soft method is much better and yields more visually pleasant images. This is because the hard method is discontinuous and yields abrupt artifacts in the recovered images. Also, the soft method yields a smaller minimum mean squared error compared to the hard form of thresholding.
Signal denoising using the DWT consists of three successive procedures, namely signal decomposition, thresholding of the DWT coefficients, and signal reconstruction. It has the following steps (a compact sketch of the pipeline is given after the list):
1. Perform a multi-scale decomposition of the image corrupted by Gaussian noise using the wavelet transform.
2. Estimate the noise variance sigma^2.
3. For each level, compute the scale parameter.
4. Compute the standard deviation.
5. Compute the threshold t.
6. Apply semi-soft thresholding to the noisy coefficients.
7. Denoise the high-frequency coefficients.
8. Merge the low-frequency coefficients with the denoised high-frequency coefficients.
9. Invert the multiscale decomposition to reconstruct the denoised image.
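A compact sketch of this decompose-threshold-reconstruct pipeline. It assumes the PyWavelets package (pywt) is available, uses soft thresholding with the universal (VisuShrink) threshold of the next paragraph, and estimates sigma with the usual median/0.6745 rule; these are illustrative choices rather than the exact semi-soft rule of step 6.

import numpy as np
import pywt   # PyWavelets, assumed available

def wavelet_denoise(noisy, wavelet="db4", level=3):
    """DWT denoising sketch: decompose, soft-threshold the detail
    coefficients with the universal threshold, then reconstruct."""
    coeffs = pywt.wavedec2(noisy, wavelet, level=level)
    # Estimate sigma from the finest diagonal detail band (median/0.6745 rule).
    sigma = np.median(np.abs(coeffs[-1][-1])) / 0.6745
    t = sigma * np.sqrt(2 * np.log(noisy.size))          # universal threshold
    denoised = [coeffs[0]]                                # keep approximation untouched
    for (cH, cV, cD) in coeffs[1:]:
        denoised.append(tuple(pywt.threshold(c, t, mode="soft")
                              for c in (cH, cV, cD)))
    return pywt.waverec2(denoised, wavelet)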
VisuShrink
It uses a threshold value t that is proportional to the standard deviation of the noise. It is the universal threshold and is defined as
t = sigma sqrt(2 log N)
where sigma^2 is the noise variance present in the signal and N represents the signal size or number of samples.
A threshold chooser based on Stein's Unbiased Risk Estimator (SURE) is called SureShrink. It is a combination of the universal threshold and the SURE threshold. This method specifies a threshold value tj for each resolution level j in the wavelet transform, which is referred to as level-dependent thresholding.
Let the wavelet coefficients in the j-th subband be { Xi : i = 1, ..., d }. For the soft threshold estimator X^i = eta_t(Xi),
SURE(t; X) = d - 2 #{ i : |Xi| <= t } + Sum_{i=1}^{d} min(|Xi|, t)^2
and the threshold tS is selected by
tS = argmin_t SURE(t; X)
The quality of the estimate is measured by the mean squared error between z(x, y), the estimate of the signal, and f(x, y), the original signal without noise, over the n samples of the signal. SureShrink suppresses noise by thresholding the empirical wavelet (detail) coefficients.
The SureShrink threshold t* is defined as
t* = min( t, sigma sqrt(2 log n) )
where t denotes the value that minimizes Stein's Unbiased Risk Estimator, sigma is the noise standard deviation computed as above, and n is the size of the image. SureShrink follows the soft thresholding rule, and the thresholding employed here is adaptive.
BayesShrink
ld
The goal of this method is to minimize the Bayesian risk, and
or
hence its name, BayesShrink. It uses soft thresholding and is
subband-dependent, which means that thresholding is done at
each band of resolution in the wavelet decomposition. Like the
W
SureShrink procedure, it is smoothness adaptive.
ld
Morphology is an image processing techniques that discusses about
the structure and shape of objects. It is a tool for extracting image
or
components that are useful for representation and description of
region shape, boundary, skeleton, convex hull etc.
W
In image morphology, the basic assumption is an image can
be presented by a point set
TU
Binary morphology is applied to binary images,
Gray level morphology is applied to gray level images
It can be applied to color images also.
JN
ll
A
X is a set of points belonging to the object; the pixel value of the object (foreground) is 1 and the background is 0 for a binary image.
Reflection of B: B^ = { w | w = -b, for b in B }
Translation operation: (A)z = { c | c = a + z, for a in A }
Morphological Transformation:
X
ld
Morphological transformation gives a relation of the image X with
or
another small point set or another small image say capital B
which is called a structuring element
W
X
TU B B
JN
Structuring Element is a small set to probe the image under study
, for each SE, define origin and shape and size must be adapted
to geometric properties for the objects
ll
A
ld
There are basically four morphological
or
transformations: Dilation,
Erosion,
Opening and
W
Closing
TU
JN
ll
A
Structuring Element (Kernel or Template)
Structuring elements can have varying sizes.
Usually, element values are 0, 1 and none (don't care).
Structuring elements have an origin.
For thinning, other values are possible.
Empty spots in the structuring elements are don't cares.
By applying these structuring elements to the data using different combinations, one performs morphological (algebraic) transformations on the data.
Simple morphological transformations
Dilation: A (+) B = { p in Z^2 | p = a + b, a in A, b in B } (vector addition). The object area gets expanded (grow); holes of a certain shape and size, given by the SE, are filled and internal noise within objects is removed.
Erosion: A (-) B = { p in Z^2 | p + b in A, for every b in B }. The object area gets reduced along its boundary (shrink); structures of a certain shape and size, given by the SE, are removed.
Opening: A o B = (A (-) B) (+) B (erosion followed by dilation).
Closing: A . B = (A (+) B) (-) B (dilation followed by erosion).
Hit and Miss: an operation with two structuring elements, one for the foreground and one for the background.
Dilation transformation on an image: image A, the structuring element B, and the dilation of A by B.
Erosion transformation on an image: image A, the structuring element B, and the erosion of A by B.
(A minimal sketch of both operations is given below.)
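Minimal set-based implementations of dilation and erosion (with opening and closing built from them). They follow the definitions above directly; note that np.roll wraps around at the image border, so a margin of background pixels around the objects is assumed.

import numpy as np

def dilate(A, B):
    """Binary dilation of image A (0/1 array) by structuring element B (0/1 array)."""
    out = np.zeros_like(A)
    bi, bj = np.nonzero(B)
    ci, cj = B.shape[0] // 2, B.shape[1] // 2          # origin at the centre of B
    for di, dj in zip(bi - ci, bj - cj):
        shifted = np.roll(np.roll(A, di, axis=0), dj, axis=1)
        out = np.maximum(out, shifted)                 # union of translates of A
    return out

def erode(A, B):
    """Binary erosion: a pixel survives only if B translated there fits inside A."""
    out = np.ones_like(A)
    bi, bj = np.nonzero(B)
    ci, cj = B.shape[0] // 2, B.shape[1] // 2
    for di, dj in zip(bi - ci, bj - cj):
        shifted = np.roll(np.roll(A, -di, axis=0), -dj, axis=1)
        out = np.minimum(out, shifted)                 # intersection of translates
    return out

def opening(A, B):
    return dilate(erode(A, B), B)                      # erosion followed by dilation

def closing(A, B):
    return erode(dilate(A, B), B)                      # dilation followed by erosion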
Closing operation: A . B = (A (+) B) (-) B
Closing involves one or more dilations followed by an erosion.
Dilation: dilation enlarges the foreground and shrinks the background; pixels within reach of the structuring element become object pixels, internal noise within objects is removed, and objects get expanded. The dilation of an image A by a structuring element B is written A (+) B.
Opening (erosion followed by dilation):
Erosion removes boundary pixels of the bodies, including the external noise, and shrinks the objects.
Dilation then expands the objects back, recovering the removed boundary pixels but not the external noise.
Opening is idempotent: repeated application has no further effect.
Closing and Opening Operation
Take a particular binary image containing two object regions and two noisy regions. One noisy region is internal: a part that should have been object has two object pixels turned into background pixels. The second is external: outside the objects, two background pixels appear as object pixels, indicating the presence of noise.
The closing operation involves dilation followed by erosion. The effect is quite clear: after the morphological closing operation, the internal noise has been removed.
Hit-and-miss Transform
Used to look for particular patterns
or
of foreground and background pixels
Very simple object recognition
W
All other morphological operations can
be derived from it!!
TU
Input: Binary Image
Two Structuring Elements, containing 0s and 1s
and dont cares
JN
ll
18
A
ld
Hit-and-miss Transform
or
Similar to Pattern Matching:
W
If foreground and background pixels in the
structuring element exactly match foreground
and background pixels in the image, then the
TU
pixel underneath the origin of the structuring
element is set to the foreground value.
JN
If it doesn't match, then that pixel is set
to the background value.
ll
19
A
The Hit-or-Miss Transform is normally used to detect or locate an object of a given shape and size in an image; the morphological hit-or-miss transform is a basic tool for shape detection.
The Hit-or-Miss Transform is given by
A (*) X = (A (-) B1) intersect (Ac (-) B2)
where B1 is the shape to detect (foreground), B2 is the set of elements associated with the corresponding background, X is the object, and Ac is the complement of the image A. In the example below B1 = X and B2 = W - X, where W is a small window enclosing X.
And the image in which
ld
the object is to be detected is:
or
W
Perform A X:
the output obtained is:
TU
JN
ll
A
Now, perform Ac (W-X)
ld
We obtain only 1 pixel shown in yellow:
or
We get, the location of the
object in the image
W
TU
Taking the intersection between the two
results: i.e.:
JN
ll
A
We get, the location of the object in the image
ld
or
W
TU
JN
ll
A
Applications of morphology include extracting image components that are useful in the representation and description of shape.
Medical image analysis: tumor detection, measurement of the size and shape of internal organs, regurgitation, etc.
Machine/robot vision: analysis of a scene, motion control and execution through visual feedback.
Boundary extraction: the boundary of a set A, denoted beta(A), can be obtained by first eroding A by B and then performing the set difference between A and its erosion:
beta(A) = A - (A (-) B)
where B is a suitable structuring element.
Corner Detection with Hit-and-miss Transform
Structuring elements representing the four corners are used, usually with 1s at the center. Four structuring elements are used for corner finding in binary images using the hit-and-miss transform; note that they are really all the same element, but rotated by different amounts. (Figure: original image and its HMT output.)
Thinning
1. Used to remove selected foreground pixels from binary images.
2. After edge detection, lines are often thicker than one pixel.
Let K be a kernel and I be an image:
thin(I, K) = I - HitAndMiss(I, K), with 0 - 1 = 0
If foreground and background fit the structuring element exactly, then the pixel at the origin of the SE is set to 0. Note that the value of the SE at the origin is 1 or don't care.
Thickening
Used to grow selected regions of foreground pixels, e.g. in applications like approximation of the convex hull.
Let K be a kernel and I be an image:
thicken(I, K) = I union HitAndMiss(I, K), with 1 + 1 = 1
If foreground and background match the SE exactly, then the pixel at its origin is set to 1. Note that the value of the SE at the origin is 0 or don't care.
Digital Watermarking
A watermark is a pattern of bits inserted or embedded into a digital image, audio or video file that identifies the file's copyright information (author, rights, etc.), which can be later extracted or detected for a variety of purposes including identification and authentication. The watermark is a secret key (string or integer) produced using a random number generator; it is embedded redundantly over the whole image, so that every part of the image is protected.
Watermarked image: fw = (1 - alpha) f + alpha w, where alpha lies between 0 and 1.
If alpha = 1, fw = w: the image becomes opaque, gets completely filled up and is obscured by the watermark.
If alpha = 0, fw = f: no watermarking happens.
Types of watermarking techniques
Spatial Domain Approach:
ld
Embeds the watermark into least significant bits (LSB) of the image
pixels. This technique has relatively low information hiding capacity and
can be easily erased by lossy image compression.
or
Patchwork Approach: This technique uses a random number generator to
select n pairs of pixels and slightly increases or decrease their luminosity
(brightness level). Thus the contrast of this set is increased without any change
W
in the average luminosity of the image. With suitable parameters, Patchwork
even survives compression using JPEG.
Frequency domain: water mark is added to DCT coefficients
Wavelet domain: Water mark is embedded into the high frequency
TU
coefficients Audio : Jitter :
For audio watermarking scheme, jitter is added to the signal. The signal is split
into chunks of 500 samples, either duplicated or deleted a sample at random in
each chunk (resulting in chunks of 499 or 501 samples long) and stuck the chunks
JN
back together.
Fragile Watermark : It breaks very easily on modifying host signal. It is used for
A
ld
watermark. The invisible watermark is used as protection or back up for the
visible watermark.
Video watermarking The basic idea of watermarking for raw video is addition of
or
pseudo-random signal to the video that is below the threshold of perception that
cant be identified and thus removed without knowledge of the parameters of
the watermarking algorithm.
W
An invisible robust private watermarking scheme requires the original or reference
image for watermark detection; whereas the public watermarks do not.
TU
Source-based watermark are desirable for ownership identification. The watermark
could also be destination based where each distributed copy gets a unique watermark
identifying the particular buyer. The destination -based watermark could be used to
JN
trace the buyer in the case of illegal reselling.
ll
A
In order to achieve the copyright protection, the Digital Water
ld
Marking algorithm should meet few basic requirements
i) Imperceptibility: The watermark should not affect the quality of
or
the original signal, thus it should be invisible/ inaudible to human
eyes/ ears.
ii) Robustness: The watermarked data should not be removed or
W
eliminated by unauthorized distributors, thus it should be robust
to resist common signal processing manipulations such as
filtering, compression, filtering with compression.
person. TU
iii) Security: watermark should only be detected by authorized
ld
Copyright protection: Visible watermarking is used for copyright protection which is
the most important watermarking application
or
Finger Printing: Finger printing is similar to giving serial number to any product.
Each distributed multimedia copy is embedded with a different watermark. The
objective is to convey the information about the legal recipients.
W
Content Authentication (integrity protection): Invisible watermark is an evidence of
ownership. The objective of this application is to detect modification in data.
d) Broadcast Monitoring: Watermark is embedded in commercial advertisements.
Automated monitoring system can verify whether the advertisements are
broadcasted as contracted or not. The main use of broadcast monitoring is to
TU
protecting the valuable TV products like news items from illegal transmission.
Indexing: Comments and markers or key information related to the data is inserted
as watermark. This watermark information is used by a search engine for retrieving
the required data quickly and without any ambiguity.
JN
Medical Applications: Patient's information is inserted as watermark in medical
Images. It helps in avoiding ambiguities in searching the medical records.
ll
A
watermarking scheme (algorithm)
ld
General Framework for Watermarking
or
In general, any watermarking scheme (algorithm) consists of three
parts.
W
_ The watermark.
_ The encoder (insertion algorithm).
_ The decoder and comparator (verification or extraction
TU
or detection algorithm).
JN
ll
A
Encoding Process
ld
Let us denote an image by I, a signature by S= s1,s2,. and the
watermarked image by I . E is an encoder function, it takes
or
an image I and a signature S , and it generates a new image
which is called watermarked image I , mathematically,
E (I,S) = I
W
It should be noted that the signatures S may be dependent on
image I
ENCODER
TU
JN
ll
A
Decoding Process
ld
A decoder function D, I is the water marked image, I is the original
image, S is the recovered signature.
or
D (I,I) = S
S is compared with owners signature,
C is a comparator function , is some threshold
W
Extracted water mark S is compared with original water mark
S using similarity function Sim ( S , S)
TU
JN
ll
A
Attacks on Watermarks
The watermark can be inserted in the DCT domain of the image by setting the frequency components vi in the image to
vi' = vi (1 + alpha xi)
or, equivalently, fw = f (1 + alpha x), where alpha is a scale factor (say 0.1) and xi are the watermark values.
Insertion of watermark in the frequency domain (invisible):
1. Compute the DCT of the entire image.
2. Find the perceptually significant coefficients (say 1000 coefficients).
3. Compute the watermark X = x1, x2, x3, ..., xn, where each xi is chosen as per N(0, 1), the normal distribution with mean zero and variance 1.
4. The watermark is inserted in the DCT domain of the image by setting the frequency components vi to vi' = vi (1 + alpha xi).
Extraction of watermark:
1. Compute the DCT of the entire watermarked image.
2. Compute the DCT of the entire original image.
3. The difference of the above two is the extracted watermark X*.
4. The extracted watermark X* is compared with the original watermark X using the similarity function Sim(X, X*).
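A sketch of the insertion procedure in the list above. Treating the n largest-magnitude AC coefficients as the "perceptually significant" ones is an assumption made for illustration, as are the function name and the use of SciPy's DCT.

import numpy as np
from scipy.fftpack import dct, idct

def embed_watermark(image, n=1000, alpha=0.1, seed=0):
    """Invisible watermark in the DCT domain: scale the n largest-magnitude
    AC coefficients by (1 + alpha * x_i), with x_i ~ N(0, 1)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)                            # watermark values ~ N(0, 1)
    V = dct(dct(image.astype(float), axis=0, norm="ortho"), axis=1, norm="ortho")
    flat = V.ravel()
    idx = np.argsort(np.abs(flat))[::-1][1:n + 1]         # skip the DC term
    flat[idx] = flat[idx] * (1 + alpha * x)               # v_i' = v_i (1 + alpha x_i)
    Vw = flat.reshape(V.shape)
    watermarked = idct(idct(Vw, axis=0, norm="ortho"), axis=1, norm="ortho")
    return watermarked, x, idx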