Digital Image Processing_Unit 5

Unit 5

Pixel Coding
Pixel
The full form of "pixel" is "Picture Element." It is also known as "PEL." A pixel is the smallest
element of an image on a computer display, whether it is an LCD or a CRT monitor. A screen is
made up of a matrix of thousands or millions of pixels. A pixel is represented as a dot or a
square on a computer screen.

The good thing is that individual pixels usually cannot be seen, as they are very small, which results
in a smooth and clear image rather than a "pixelated" one. Each pixel has a value, or we can say a
unique logical address. It can have only one color at a time. The color of a pixel is determined by the
number of bits used to represent it. The resolution of a computer screen depends on the graphics card
and display monitor, and on the quantity, size and color combination of the pixels.
As we know, an image is built up of thousands or millions of pixels. If we zoom far enough into an
image, we will be able to see some of the individual pixels.

Relationship with CCD array


When an image is zoomed in far enough, the surface of the CCD looks like a grid of filled dots. These
dots are light receptors called photodiodes.

CCD sizes are described using terms like 2 million pixels (2 megapixels) and 4 million pixels
(4 megapixels). The higher the pixel count, the more detailed the generated image. To get a clear and
smooth image, the CCD and image size are increased.
Calculation of the total number of pixels

Below is the formula to calculate the total number of pixels in an image:

Total number of pixels = number of rows × number of columns

For example: let rows = 300 and columns = 200


Total number of pixels = 300 × 200
= 60,000

Gray level
The value of the minimum gray level is 0. The maximum gray level depends on the bit depth of the image.

For example: in an 8-bit image, the maximum gray level is 255. In a binary image, a pixel can only take
the value 0 or 255 (black or white). In a grayscale or color image, each channel can take any value
between 0 and 255.
A widely used formula for calculating the gray level of a color (RGB) pixel is:

Gray = 0.299 R + 0.587 G + 0.114 B
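As a minimal sketch (assuming NumPy and an 8-bit RGB array; the function name is illustrative), the weighting above can be applied as follows:

import numpy as np

def rgb_to_gray(rgb):
    # Convert an H x W x 3 RGB image (values 0-255) to gray levels
    # using the common 0.299/0.587/0.114 luminance weighting.
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    gray = 0.299 * r + 0.587 * g + 0.114 * b
    return np.clip(np.round(gray), 0, 255).astype(np.uint8)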

Pixel value(0)
As we know, each pixel has a value. The value 0 denotes the absence of light, so 0 is used to represent
dark (black).

Block truncation coding


The basic BTC algorithm is a lossy fixed length compression method that uses a Q level quantizer to
quantize a local region of the image. The quantizer levels are chosen such that a number of the
moments of a local region in the image are preserved in the quantized output. In its simplest form, the
objective of BTC is to preserve the sample mean and sample standard deviation of a grayscale image.
Additional constraints can be added to preserve higher order moments. For this reason BTC is a block
adaptive moment preserving quantizer.

The first step of the algorithm is to divide the image into non-overlapping rectangular regions. For the
sake of simplicity we let the blocks be square regions of size n x n, where n is typically 4. For a two level
(1 bit) quantizer, the idea is to select two luminance values to represent each pixel in the block. These
values are chosen such that the sample mean and standard deviation of the reconstructed block are
identical to those of the original block. An n x n bit map is then used to determine whether a pixel
luminance value is above or below a certain threshold. In order to illustrate how BTC works, we will let
the sample mean of the block be the threshold; a “1” would then indicate if an original pixel value is
above this threshold, and “0” if it is below. Since BTC produces a bitmap to represent a block, it is
classified as a binary pattern image coding method [10]. The thresholding process makes it possible to
reproduce a sharp edge with high fidelity, taking advantage of the human visual system’s capability to
perform local spatial integration and mask errors. Figure 1 illustrates the BTC encoding process for a
block. Observe how the comparison of the block pixel values with a selected threshold produces the
bitmap.

By knowing the bit map for each block, the decompression/reconstruction algorithm knows whether a
pixel is brighter or darker than the average. Thus, for each block two gray scale values, a and b, are
needed to represent the two regions. These are obtained from the sample mean and sample standard
deviation of the block, and are stored together with the bit map. Figure 2 illustrates the decompression
process. An explanation of how a and b are determined will be given below. For the example illustrated
in Figures 1 and 2, the image was compressed from 8 bits per pixel to 2 bits per pixel (bpp). This is due to
the fact that BTC requires 16 bits for the bit map, 8 bits for the sample mean and 8 bits for the sample
standard deviation. Thus, the entire 4x4 block requires 32 bits, and hence the data rate is 2 bpp. From
this example it is easy to understand how a smaller data rate can be achieved by selecting a bigger block
size, or by allocating fewer bits to the sample mean or the sample standard deviation [5, 7]. We will
discuss later how the data rate can be further reduced.

To understand how a and b are obtained, let k be the number of pixels of an n x n block (k = n²) and
x1, x2, …, xk be the intensity values of the pixels in a block of the original image. The first two sample
moments m1 and m2 are given by

m1 = (1/k) Σ_{i=1..k} xi        m2 = (1/k) Σ_{i=1..k} xi²

and the sample standard deviation s is given by

s = √(m2 − m1²)

As the example illustrated, the mean can be selected as the quantizer threshold. Other thresholds could
also be used such as the sample median. Another way to determine the threshold is to perform an
exhaustive search over all possible intensity values to find a threshold that minimizes a distortion
measure relative to the reconstructed image [7].
Once a threshold, xth, is selected, the output levels of the quantizer (a and b) are found such that the
first and second moments are preserved in the output. If we let q be the number of pixels in a block that
are greater than or equal to xth in value, we have:

a = m1 − s √( q / (k − q) )        b = m1 + s √( (k − q) / q )

Rather than selecting the threshold to be the mean, an additional constraint can be added to the
moment-preserving equations above in order to determine the threshold of the quantizer. This is done by
preserving the third sample moment (m3), where m3 is given by

m3 = (1/k) Σ_{i=1..k} xi³
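To make the encode/decode steps concrete, here is a minimal Python sketch of two-level BTC for a single block (assuming NumPy; the block is thresholded at its mean and the first two sample moments are preserved, as described above):

import numpy as np

def btc_encode_block(block):
    # block: 2-D array (e.g., 4x4) of pixel intensities
    x = block.astype(np.float64)
    m1 = x.mean()                        # first sample moment (mean)
    m2 = (x ** 2).mean()                 # second sample moment
    s = np.sqrt(max(m2 - m1 ** 2, 0.0))  # sample standard deviation
    bitmap = x >= m1                     # "1" if the pixel is at/above the threshold
    k = x.size
    q = int(bitmap.sum())                # number of pixels at/above the threshold
    if q == 0 or q == k:                 # flat block: both levels equal the mean
        a = b = m1
    else:
        a = m1 - s * np.sqrt(q / (k - q))       # level for "0" pixels
        b = m1 + s * np.sqrt((k - q) / q)       # level for "1" pixels
    return bitmap, a, b

def btc_decode_block(bitmap, a, b):
    # Reconstruct the block from the bitmap and the two gray levels.
    return np.where(bitmap, b, a)

For a 4x4 block, storing the 16-bit bitmap plus 8 bits each for the mean and standard deviation gives the 2 bpp rate discussed above.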

Wavelet transform coding of images

Different ways to introduce the wavelet transform can be envisaged (Starck et al., 1998). However, the
traditional method to achieve this goal remains the use of the Fourier theory (more precisely, the STFT). The
Fourier theory uses sine and cosine as basis functions to analyse a particular signal. Due to the infinite
expansion of the basis functions, the FT is more appropriate for signals of the same nature, which
generally are assumed to be periodic. Hence, the Fourier theory is purely a frequency-domain approach,
which means that a particular signal f(t) can be represented by the frequency spectrum F(ω), as follows:

F(ω) = ∫ f(t) e^(−jωt) dt        (1)

The original signal can be recovered, under certain conditions, by the inverse Fourier transform as
follows:

f(t) = (1/2π) ∫ F(ω) e^(jωt) dω        (2)

Obviously, discrete-time versions of both direct and inverse forms of the Fourier transform are possible.
Due to the non-locality and the time-independence of the basis functions in the Fourier analysis, as
represented by the exponential factor of equation (1), the FT can only suit signals with “time-
independent” statistical properties. In other words, the FT can only provide global information of a
signal and fails in dealing with local patterns like discontinuities or sharp spikes (Graps, 1995). However,
in many applications, the signal of concern is both time and frequency dependent, and as such, the
Fourier theory is “incapable” of providing a global and complete analysis. The shortcomings of the
Fourier transform, in addition to its failure to deal with non-periodic signals led to the adoption by the
scientific community of a windowed version of this transform known as the STFT. The STFT of a
signal f(t), around a time τ and a frequency ω and through the use of a sliding window w (centred at
time τ), is defined as (Wickerhauser, 1994; Graps, 1995; Burrus et al., 1998; David, 2002; Oppenheim &
Schafer, 2010):

STFT(τ, ω) = ∫ f(t) w(t − τ) e^(−jωt) dt        (3)

As is apparent from equation (3), even if the integral limits are infinite, the analysis is always limited
to a portion of the signal, bounded by the support of the sliding window. The time-frequency plane of
a fixed-window STFT is illustrated in Figure 1.
Fig. 1. Fourier time-frequency plane (Graps, 1995)

Although this approach (using the STFT) succeeds in giving both time and frequency information about
a portion of the signal, it has, like its predecessor, a major drawback: the choice of the window size is
crucial. As stated by Starck et al. (Starck et al., 1998): "The smaller the window size, the better the
time-resolution. However, the smaller the window size also, the more the number of discrete frequencies
which can be represented in the frequency domain will be reduced, and therefore the more weakened will
be the discrimination potential among frequencies". This problem is closely linked to Heisenberg's
uncertainty principle, which states that a signal (e.g. a very short portion of the signal) cannot be
represented as a point in the time-frequency domain. This shortcoming raises the fundamental question:
how should the sliding window be sized? Not surprisingly, the answer to this question leads us, by means
of certain transformations, to the wavelet transform. In fact, the sliding window and the time-dependent
exponential e^(−jωt) within the integral of equation (3) can be considered together as a single analysing
function, as in equation (4).

Replacing the frequency factor ω by a scaling factor a, and the window position τ by a shifting factor b,
leads us to the first step towards the Continuous Wavelet Transform (CWT), as represented in equation (5):

ψ_(a,b)(t) = (1/√a) ψ((t − b)/a)        (5)

The combination of equation (5) with equation (3) leads to the CWT as defined by Morlet and Grossman
(Grossman & Morlet, 1984):

W(a, b) = (1/√a) ∫ f(t) ψ*((t − b)/a) dt        (6)

where f(t) belongs to the space of square-integrable functions, L2(R). In the same way, the inverse CWT
can be defined as (Grossman & Morlet, 1984):

f(t) = (1/Cψ) ∫∫ W(a, b) ψ_(a,b)(t) (da db)/a²        (7)

The Cψ factor is needed for reconstruction purposes. In fact, the reconstruction is only possible if this
factor is defined. This requirement is known as the admissibility condition. In a more general way, ψ(t)
can be replaced by other analysing functions, allowing a variety of choices which can enhance certain
features for some particular applications (Starck et al., 1998; Stromme, 1999; Hankerson et al., 2005).
However, the CWT in the form defined by equation (6) is highly redundant, which makes its direct
implementation of minor interest. The time-frequency plane of a wavelet transformation is illustrated in
Figure 2. The differences with the STFT are visually clear.
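As a bridge from this theory to wavelet transform coding of images, the sketch below (assuming the PyWavelets package; the wavelet choice and the kept fraction are illustrative) performs a single-level 2-D DWT, discards all but the largest detail coefficients, and reconstructs the image:

import numpy as np
import pywt  # PyWavelets

def wavelet_compress(image, wavelet="haar", keep=0.05):
    # Single-level 2-D DWT: approximation cA and detail subbands cH, cV, cD.
    cA, (cH, cV, cD) = pywt.dwt2(image.astype(np.float64), wavelet)
    details = np.concatenate([c.ravel() for c in (cH, cV, cD)])
    # Keep only the largest 'keep' fraction of detail coefficients.
    thresh = np.quantile(np.abs(details), 1.0 - keep)
    cH, cV, cD = [np.where(np.abs(c) >= thresh, c, 0.0) for c in (cH, cV, cD)]
    # Reconstruct from the sparsified coefficients.
    return pywt.idwt2((cA, (cH, cV, cD)), wavelet)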

Color image coding

Color image processing includes the following topics:

Color fundamentals

– Color models

– Pseudocolor image processing

– Color image smoothing and sharpening

– Color edge detection

– Noise in color images

– Color perception models

Color Fundamentals
In 1666, Sir Isaac Newton discovered that when a beam of sunlight passes through a glass prism,
the emerging beam is split into a spectrum of colors.

The colors that humans and most animals perceive in an object are determined by the nature of the
light reflected from the object.

➢ For example, green objects reflect light with wavelengths primarily in the range of 500 – 570 nm
while absorbing most of the energy at other wavelengths.
Three basic quantities are used to describe the quality of a chromatic light source:

– Radiance: the total amount of energy that flows from the light source (measured in watts)

– Luminance: the amount of energy an observer perceives from the light source (measured in lumens)

• Note we can have high radiance but low luminance

– Brightness: a subjective (practically un-measurable) notion that embodies the achromatic notion of
intensity of light

➢ Chromatic light spans the electromagnetic spectrum from approximately 400 to 700 nm.

➢ Human color vision is achieved through 6 to 7 million cones in each eye.

➢ Three principal sensing groups:

–66% of these cones are sensitive to red light

–33% to green light

–2% to blue light.

Absorption curves for the different cones have been determined experimentally.

➢ Strangely, these do not match the CIE standards for red (700 nm), green (546.1 nm) and blue
(435.8 nm) light, as the standards were developed before the experiments!

The primary colors can be added to produce the secondary colors.


➢ Mixing the three primaries produces white.

➢ Mixing a secondary with its opposite primary produces white (e.g. red+cyan).

➢ Primary colors of light (red, green, blue)

➢ Primary colors of pigments (colorants)
– A color that subtracts or absorbs a primary color of light and reflects the other two.
– These are cyan, magenta and yellow (CMY).
– A proper combination of pigment primaries produces black.

How to distinguish one color from another?

➢ Brightness: the achromatic notion of intensity.
➢ Hue: the dominant wavelength in a mixture of light waves. (Note: the dominant color perceived by an
observer, e.g. when we call an object red or orange we refer to its hue.)
➢ Saturation: the amount of white light mixed with a hue. Pure colors are fully saturated; pink
(red + white) is less saturated.
➢ Hue and saturation together are called chromaticity.
➢ Therefore any color is characterized by its brightness and chromaticity.
➢ The amounts of red, green and blue needed to form a particular color are called tristimulus values
and are denoted by X, Y, Z.

A color is then specified by its trichromatic coefficients:

x = X / (X + Y + Z),    y = Y / (X + Y + Z),    z = Z / (X + Y + Z)

so that x + y + z = 1.
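As a minimal sketch (plain Python; the function name is illustrative), the trichromatic coefficients are simply the normalized tristimulus values:

def trichromatic_coefficients(X, Y, Z):
    # Normalize the tristimulus values so that x + y + z = 1.
    total = X + Y + Z
    return X / total, Y / total, Z / total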

Back Projection Operator

What is Back Projection?

– Back Projection is a way of recording how well the pixels of a given image fit the distribution of
pixels in a histogram model.
– To make it simpler: for Back Projection, you calculate the histogram model of a feature and then use
it to find this feature in an image.
– Application example: if you have a histogram of flesh color (say, a Hue-Saturation histogram), then
you can use it to find flesh-color areas in an image.

How does it work?

– We explain this by using the skin example:

Let's say you have obtained a skin histogram (Hue-Saturation) based on the image below. The histogram
beside it is going to be our model histogram (which we know represents a sample of skin tonality). You
applied a mask to capture only the histogram of the skin area:

Now, let's imagine that you get another hand image (Test Image) like the one below: (with its respective
histogram):
What we want to do is to use our model histogram (that we know represents a skin tonality) to detect skin areas
in our Test Image. Here are the steps:

1. For each pixel of our Test Image (i.e. p(i,j)), collect the data and find the corresponding bin location
for that pixel (i.e. (h(i,j), s(i,j))).
2. Look up the model histogram in the corresponding bin (h(i,j), s(i,j)) and read the bin value.
3. Store this bin value in a new image (BackProjection). You may also consider normalizing the model
histogram first, so that the output for the Test Image is visible.
4. Applying the steps above, we get the following BackProjection image for our Test Image:

5. In terms of statistics, the values stored in BackProjection represent the probability that a pixel in the
Test Image belongs to a skin area, based on the model histogram that we use. For instance, in our Test
Image the brighter areas are more likely to be skin (as they actually are), whereas the darker areas are
less likely (notice that these "dark" areas belong to surfaces that have some shadow on them, which in
turn affects the detection).
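The steps above map directly onto OpenCV's histogram functions; here is a minimal sketch (assuming OpenCV for Python; the file names and bin counts are illustrative):

import cv2

model = cv2.imread("skin_sample.jpg")    # image used to build the skin model
test = cv2.imread("hand_test.jpg")       # image in which we look for skin

model_hsv = cv2.cvtColor(model, cv2.COLOR_BGR2HSV)
test_hsv = cv2.cvtColor(test, cv2.COLOR_BGR2HSV)

# Hue-Saturation histogram of the model (the skin-tonality model).
hist = cv2.calcHist([model_hsv], [0, 1], None, [30, 32], [0, 180, 0, 256])
cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)

# Back projection: each test pixel is replaced by the value of the
# model-histogram bin that its (hue, saturation) pair falls into.
backproj = cv2.calcBackProject([test_hsv], [0, 1], hist, [0, 180, 0, 256], scale=1)

cv2.imwrite("backprojection.png", backproj)
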
Vector DPCM:

Vector data model: vector models are used for storing data that has discrete boundaries.
Raster data model: a representation of the surface divided into a regular grid of cells.

Raster models are useful for storing pixel data.

Raster scan vs. vector scan:

1. A raster scan can draw areas filled with colors; a vector scan can only draw lines.
2. In a raster scan, scanning is done one line at a time, from top to bottom and left to right; in a vector
scan, scanning is done between the end points.
3. Even for a complex image, a raster scan display does not flicker; for a complex image, a vector scan
display may flicker.
4. A raster scan is converted to pixels; a vector scan is not converted to pixels.

VECTOR DPCM: If the image is scanned as vectors, it is possible to generalize the DPCM technique
by using a vector recursive predictor.

Hybrid coding is a method of implementing an N×1 vector DPCM coder. Typically, the image is
unitarily transformed along one of its dimensions (for example, along each column).

Each transform coefficient is then sequentially coded in the other direction by one-dimensional DPCM.
This technique provides the robust performance of transform coding. Hybrid coding can be defined as
follows.

Let u(n) (n = 0, 1, 2, …) be the N×1 columns of an image, which are transformed as v(n) = Ψ u(n),
where Ψ is a unitary transform.

The DPCM equations can then be written as

e(n) = v(n) − v̄(n)        (prediction error, with v̄(n) the predicted vector)

v̂(n) = e(n) + v̄(n)        (reconstruction)

The receiver simply reconstructs the transformed vector in the same way and then performs the inverse
transformation u(n) = Ψ⁻¹ v̂(n).
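A minimal sketch of this hybrid idea follows (assuming NumPy and SciPy; the DCT stands in for the unitary transform Ψ, and the previous-column predictor and uniform quantizer step are illustrative choices, not a specific standard):

import numpy as np
from scipy.fft import dct, idct

def hybrid_code(image, step=16.0):
    # Unitary transform of each column (the DCT plays the role of Psi).
    v = dct(image.astype(np.float64), axis=0, norm="ortho")
    recon = np.zeros_like(v)
    prev = np.zeros(v.shape[0])              # predictor state, one per coefficient
    for n in range(v.shape[1]):              # 1-D DPCM along the other direction
        e = v[:, n] - prev                   # prediction error e(n)
        e_q = np.round(e / step) * step      # quantized error (what would be transmitted)
        recon[:, n] = prev + e_q             # decoder-side reconstruction
        prev = recon[:, n]
    # The receiver applies the inverse transform to the reconstructed vectors.
    return idct(recon, axis=0, norm="ortho")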

Back projection algorithm

Background: The back-projection algorithm is the most common method for the
reconstruction of circular-scanning-based photoacoustic tomography (CSPAT) due to its
simplicity, computational efficiency, and robustness. It can usually be implemented with one of two
models: one for an ideal point detector, and the other for a planar transducer with infinite element
size. However, because most transducers in CSPAT are planar with a finite size, off-center targets will
be blurred in the tangential direction with these two reconstruction models.

Methods: Here in this paper, we put forward a new model of the back-projection algorithm for the
reconstruction of CSPAT with a finite-size planar transducer, in which the acoustic spatial-temporal
response of the employed finite-size transducer is approximated with a virtual detector placed at an
optimized distance behind the transducer, and the optimized distance is determined by a phase square
difference minimization scheme. Notably, this proposed method is also suitable for reconstruction with
the ideal point detector and the infinite planar detector, and thus is a generalized form of the
back-projection algorithm.

Results: Compared with the two conventional models of the back-projection method and the modified
back-projection method, the proposed method in this work can significantly improve the tangential
resolution of off-center targets, thus improving the reconstructed image quality. These findings are
validated with both simulations and experiments.

Conclusions: We propose a generalized model of the back-projection algorithm to restore the elongated
tangential resolution in CSPAT in the case of a planar transducer of finite size, which is also applicable
to point and large-size planar transducers. This proposed method may also guide the design of CSPAT
scanning configurations for potential applications such as human breast imaging for cancer detection.
Methods

Reconstruction algorithm

To better understand the advantages of the modified algorithm being proposed, a review of the two
existing back-projection models is necessary. The core of the back-projection method is to first measure
the time delay between the pixel and each transducer; the pixel value is then given by the sum of the
transducer signals at the corresponding time delays. Figure 1A shows the schematic of the situation
where the first expression applies, in which the transducers are regarded as ideal point detectors. This
is also the presupposition of most current CSPAT reconstruction algorithms. In this case, the projection
line (or rather the equal-time-delay line) is a group of concentric curves centered at the detector
position (blue lines as shown in Figure 1A), and, for an arbitrary pixel located at r(x,y) in the imaging
domain, its value can be given by Eq. [1].

Figure 1 Schematics of the back-projection algorithm of different reconstruction models. (A) The model for ideal point
detector; (B) the model for detector with infinite element size; (C) the model for detector with finite element size.

where R is the radius of the transducer scanning trace, N is the number of transducers, Si(t) is the
signal received by the i-th transducer, θi is the angular coordinate of the i-th transducer, and v is the
acoustic velocity in the medium. Eq. [1] can give a uniform resolution for an ideal point transducer, but
for a planar transducer with a finite size, the tangential resolution deteriorates as the imaging point
moves away from the circular scanning center, and becomes equal to the transducer size at the boundary
of the transducer scanning trace (11).
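The ideal-point-detector model described above amounts to a delay-and-sum loop. A minimal sketch follows (assuming NumPy; the array shapes, sampling rate and nearest-sample interpolation are simplifying assumptions, not the paper's exact formulation):

import numpy as np

def backproject_point_detector(signals, thetas, R, v, fs, xs, ys):
    # signals: (N, T) array of time traces, one per transducer
    # thetas:  (N,) angular positions of the transducers on a circle of radius R
    # v: acoustic velocity, fs: sampling rate, xs/ys: pixel coordinates
    X, Y = np.meshgrid(xs, ys)
    img = np.zeros_like(X, dtype=np.float64)
    for sig, th in zip(signals, thetas):
        dx = X - R * np.cos(th)              # pixel-to-transducer geometry
        dy = Y - R * np.sin(th)
        delay = np.sqrt(dx ** 2 + dy ** 2) / v
        idx = np.clip(np.round(delay * fs).astype(int), 0, sig.size - 1)
        img += sig[idx]                      # sum the signal at that time delay
    return img / len(thetas)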

The second expression of the back-projection algorithm is for the case where the transducer is so big
that it can be regarded as a planar detector with infinite size. The schematic for this situation is shown
in Figure 1B, and the projection lines are parallel to the detector plane. The image reconstruction in
this case is very similar to the inverse Radon transform, in which the pixel value at (x,y) is given by
Eq. [2].

However, for a smaller transducer, this equation will also lead to a tangential elongation. Therefore,
here we seek to find a concise form of the back-projection algorithm when a transducer with a finite
element size is employed. Our strategy is to create a virtual detector, which is located farther away,
behind the actual transducer, as illustrated in Figure 1C. If the value of the distance L between the
virtual detector and the actual transducer is correctly set, the signal received by the transducer, Si(t),
can be approximated by the signal received by the virtual detector but with a time delay of L/v. In this
way, because the radius of the virtual detector scanning trace is R + L, the final pixel reconstruction
value given by Eq. [1] becomes Eq. [3].

It is also worth noting that in Eq. [3], if L equals 0, the equation reduces to Eq. [1], which is suitable
for ideal point transducers. On the other hand, under the condition that x² + y² ≪ R², meaning that the
reconstruction region near the rotation center is much smaller than the scanning radius (so that x and y
are negligible compared with R + L), the distance delay term in Eq. [3] simplifies accordingly.

Fan-Beam Projection
The fanbeam function computes projections of an image matrix along specified
directions. A projection of a two-dimensional function f(x,y) is a set of line integrals.
The fanbeam function computes the line integrals along paths that radiate from a single
source, forming a fan shape. To represent an image, the fanbeam function takes
multiple projections of the image from different angles by rotating the source around the
center of the image. The following figure shows a single fan-beam projection at a
specified rotation angle.
Fan-Beam Projection at Rotation Angle Theta

When you compute fan-beam projection data using the fanbeam function, you specify
as arguments an image and the distance between the vertex of the fan-beam
projections and the center of rotation (the center pixel in the image).
The fanbeam function determines the number of beams, based on the size of the image
and the values of specified name-value arguments.
The FanSensorGeometry name-value argument specifies how sensors are
aligned: "arc" or "line".

The FanRotationIncrement parameter specifies the rotation angle increment. By default, fanbeam takes
projections at different angles by rotating the source around the center pixel at 1 degree intervals.
The following figures illustrate both of these geometries. The first figure illustrates the geometry used
by the fanbeam function when FanSensorGeometry is set to "arc" (the default). Note how you specify the
distance between sensors by specifying the angular spacing of the beams.

Fan-Beam Projection with Arc Geometry

The following figure illustrates the geometry used by the fanbeam function
when FanSensorGeometry is set to "line". In this figure, note how you specify the position
of the sensors by specifying the distance between them in pixels along the x´ axis.

Fan-Beam Projection with Line Geometry
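MATLAB's fanbeam is a toolbox function; to make the geometry concrete, here is a rough Python sketch of fan-beam line integrals (not a reimplementation of fanbeam: nearest-neighbour sampling, pixel units, and the argument names are simplifying assumptions):

import numpy as np

def fanbeam_projection(image, d, rotation_angles_deg, fan_angles_deg, n_samples=512):
    # d: distance from the fan vertex (source) to the center of rotation, in pixels.
    ny, nx = image.shape
    cy, cx = (ny - 1) / 2.0, (nx - 1) / 2.0
    sino = np.zeros((len(fan_angles_deg), len(rotation_angles_deg)))
    for j, rot in enumerate(np.deg2rad(rotation_angles_deg)):
        sx, sy = cx + d * np.cos(rot), cy + d * np.sin(rot)   # source position
        for i, fan in enumerate(np.deg2rad(fan_angles_deg)):
            # Ray direction: toward the rotation center, deflected by the fan angle.
            ang = rot + np.pi + fan
            ts = np.linspace(0.0, 2.0 * d, n_samples)
            xs = sx + ts * np.cos(ang)
            ys = sy + ts * np.sin(ang)
            inside = (xs >= 0) & (xs < nx) & (ys >= 0) & (ys < ny)
            # Nearest-neighbour approximation of the line integral along the beam.
            vals = image[ys[inside].astype(int), xs[inside].astype(int)]
            sino[i, j] = vals.sum() * (ts[1] - ts[0])
    return sino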


Algebraic restoration techniques
This is the first part of a small series of articles on various image restoration
methods used in digital image processing applications. We will try to present a
bird's-eye view of the concepts behind different restoration techniques without diving
too deep into the math and theoretical intricacies, although we assume that the reader
has some understanding of discrete mathematics and signal-processing basics.

Basic Concepts
The goal of image restoration is to recover an image that has been blurred in some
way. In computational image processing, blurring is usually modeled by a convolution of
the image matrix with a blur kernel. A blur kernel in this case is a two-dimensional matrix
which describes the response of an imaging system to a point light source or a point
object. Another term for it is the Point Spread Function (PSF).

What does it mean?


Let's suppose we have three two-dimensional matrices: f(x,y) for the original
image, h(m,n) for the blurring kernel and g(x,y) for the blurred image. Then we can write
the convolution:

g(x,y) = Σm Σn h(m,n) f(x − m, y − n)

or, using an asterisk to denote the convolution operation,

g = f * h.

The above equation is written in the spatial domain because we use spatial
coordinates (x,y), but the convolution result can be described in a more convenient
way in the frequency domain using frequency coordinates (u,w). By the convolution
theorem, the Discrete Fourier Transform (DFT) of the blurred image is the point-wise
product of the DFT of the original image and the DFT of the blurring kernel:

G(u,w) = F(u,w) · H(u,w)

The most common types of blur are motion blur, out-of-focus blur and Gaussian blur (the latter is a
good approximation of an image degraded by atmospheric turbulence).

There are several widely used techniques in image restoration, some of which are based
on frequency domain concepts while others attempt to model the degradation and apply
the inverse process. The modeling approach requires determining the criterion of
“goodness” that will yield an “optimal” solution.
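To connect the convolution-theorem view with the idea of applying the inverse process, here is a minimal sketch (assuming NumPy; the epsilon guard is an illustrative workaround for near-zero frequency components, which is exactly why plain inverse filtering is fragile in the presence of noise):

import numpy as np

def blur_and_inverse_filter(f, h, eps=1e-3):
    # Blur via the convolution theorem: G = F . H (point-wise product).
    H = np.fft.fft2(h, s=f.shape)                 # DFT of the zero-padded kernel
    G = np.fft.fft2(f) * H
    g = np.real(np.fft.ifft2(G))                  # blurred image in the spatial domain
    # Naive inverse filter: F_hat = G / H, guarding against division by ~0.
    H_safe = np.where(np.abs(H) < eps, eps, H)
    f_hat = np.real(np.fft.ifft2(G / H_safe))
    return g, f_hat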
