DIP Unit-I

Concept of Visual Information

Introduction

The ability to see is one of the truly remarkable characteristics of living beings. It enables them to
perceive and assimilate in a short span of time an incredible amount of knowledge about the world
around them. The scope and variety of that which can pass through the eye and be interpreted by the
brain is nothing short of astounding.

It is thus with some degree of trepidation that we introduce the concept of visual information, because in the broadest sense the overall significance of the term is overwhelming. Rather than take into account all of the ramifications of visual information, the first restriction we shall impose is that of finite image size. In other words, the viewer receives his or her visual information as if looking through a rectangular window of finite dimensions. This assumption is usually necessary in dealing with real-world systems such as cameras, microscopes and telescopes; they all have finite fields of view and can handle only finite amounts of information.

The second assumption we make is that the viewer is incapable of depth perception on his own. That
is, in the scene being viewed he cannot tell how far away objects are by the normal use of binocular
vision or by changing the focus of his eyes.

This scenario may seem a bit dismal. But in reality, this model describes an overwhelming proportion of systems that handle visual information, including television, photographs, X-rays, etc.

In this setup, the visual information is determined completely by the wavelengths and amplitudes of the light that passes through each point of the window and reaches the viewer's eye. If the world outside were removed and a projector installed that reproduced exactly the light distribution on the window, the viewer inside would not be able to tell the difference.

Thus, the problem of numerically representing visual information is reduced to that of representing the distribution of light energy and wavelengths on the finite area of the window. We assume that the image perceived is "monochromatic" and static. It is determined completely by the perceived light energy (weighted sum of energy at perceivable wavelengths) passing through each point on the window and reaching the viewer's eye. If we impose Cartesian coordinates on the window, we can represent the perceived light energy or "intensity" at point (x, y) by f(x, y). Thus f(x, y) represents the monochromatic visual information or "image" at the instant of time under consideration. As images that occur in real-life situations cannot be exactly specified with a finite amount of numerical data, an approximation of f(x, y) must be made if it is to be dealt with by practical systems. Since number bases can be changed without loss of information, we may assume f(x, y) to be represented by binary digital data. In this form the data is most suitable for several applications such as transmission via digital communications facilities, storage within digital memory media, or processing by computer.

Digital Image Definitions

A digital image a[m, n] described in a 2D discrete space is derived from an analog image a(x, y) in a 2D continuous space through a sampling process that is frequently referred to as digitization. The mathematics of that sampling process will be described in subsequent chapters. For now we will look at some basic definitions associated with the digital image. The effect of digitization is shown in Figure (1.1).

The 2D continuous image a(x, y) is divided into N rows and M columns. The intersection of a row and a column is termed a pixel. The value assigned to the integer coordinates [m, n], with m = 0, 1, ..., M-1 and n = 0, 1, ..., N-1, is a[m, n]. In fact, in most cases a(x, y), which we might consider to be the physical signal that impinges on the face of a 2D sensor, is actually a function of many variables including depth (z), color (λ), and time (t). Unless otherwise stated, we will consider the case of 2D, monochromatic, static images in this module.

Figure (1.1): Digitization of a continuous image.

The pixel at the indicated coordinates [m, n] has the integer brightness value 110.

The image shown in Figure (1.1) has been divided into N rows and M columns. The value assigned to every pixel is the average brightness in the pixel rounded to the nearest integer value. The process of representing the amplitude of the 2D signal at a given coordinate as an integer value with L different gray levels is usually referred to as amplitude quantization or simply quantization.
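As a small illustration of this quantization step, the following Python sketch maps a patch of continuous brightnesses onto L gray levels; the patch values and the [0, 1] brightness range are made-up example data, not taken from the figure:

    import numpy as np

    def quantize(a, L, a_min=0.0, a_max=1.0):
        # Map continuous brightnesses in [a_min, a_max] to integers 0..L-1.
        a = np.clip(a, a_min, a_max)
        return np.round((a - a_min) / (a_max - a_min) * (L - 1)).astype(int)

    # Hypothetical 4x4 patch of continuous brightnesses in [0, 1].
    patch = np.array([[0.10, 0.22, 0.35, 0.41],
                      [0.18, 0.30, 0.47, 0.55],
                      [0.26, 0.44, 0.58, 0.66],
                      [0.39, 0.52, 0.70, 0.83]])

    print(quantize(patch, L=256))   # 8-bit quantization (256 gray levels)
    print(quantize(patch, L=2))     # binary image (2 gray levels)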

Common values

There are standard values for the various parameters encountered in digital image processing. These values can be caused by video standards, by algorithmic requirements, or by the desire to keep digital circuitry simple. Table 1 gives some commonly encountered values.
    Parameter      Symbol    Typical values
    Rows           N         256, 512, 525, 625, 1024, 1035
    Columns        M         256, 512, 768, 1024, 1320
    Gray Levels    L         2, 64, 256, 1024, 4096, 16384

Table 1: Common values of digital image parameters

Quite frequently we see cases of M = N = 2^K where K = 8, 9, 10. This can be motivated by digital circuitry or by the use of certain algorithms such as the (fast) Fourier transform.

The number of distinct gray levels is usually a power of 2, that is, L = 2^B, where B is the number of bits in the binary representation of the brightness levels. When B > 1 we speak of a gray-level image; when B = 1 we speak of a binary image. In a binary image there are just two gray levels, which can be referred to, for example, as "black" and "white" or "0" and "1".
Suppose that a continuous image f(x, y) is approximated by equally spaced samples arranged in the form of an N x N array as:

    f(x, y) \approx
    \begin{bmatrix}
    f(0,0)   & f(0,1)   & \cdots & f(0,N-1)   \\
    f(1,0)   & f(1,1)   & \cdots & f(1,N-1)   \\
    \vdots   & \vdots   &        & \vdots     \\
    f(N-1,0) & f(N-1,1) & \cdots & f(N-1,N-1)
    \end{bmatrix}                                        (1)

Each element of the array, referred to as a "pixel", is a discrete quantity. The array represents a digital image.

The above digitization requires a decision to be made on a value for N as well as on the number of discrete gray levels allowed for each pixel.

It is common practice in digital image processing to let N = 2^n and G = number of gray levels = 2^m. It is assumed that the discrete levels are equally spaced between 0 and L in the gray scale.

Therefore the number of bits, b, required to store a digitized image of size N x N is

    b = N \times N \times m

In other words, an N x N image with 256 gray levels (i.e. 8 bits/pixel) requires a storage of N x N bytes.
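For instance, this bookkeeping is easy to check with a couple of lines of Python; the image dimensions below are only example values:

    def storage_bytes(N, M, bits_per_pixel):
        # b = N * M * m bits; divide by 8 to express the result in bytes.
        return N * M * bits_per_pixel / 8

    # A 512 x 512 image with 256 gray levels (8 bits/pixel):
    print(storage_bytes(512, 512, 8))    # 262144.0 bytes (256 KB)

    # The same image quantized to 4096 gray levels (12 bits/pixel):
    print(storage_bytes(512, 512, 12))   # 393216.0 bytes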

The representation given by equ (1) is an approximation to a continuous image.

A reasonable question to ask at this point is: how many samples and gray levels are required for a good approximation? This brings up the question of resolution. The resolution (i.e. the degree of discernible detail) of an image depends strongly on both N and m. The more these parameters are increased, the closer the digitized array will approximate the original image.

Unfortunately, storage and consequently processing requirements increase rapidly as functions of large N and m.

Characteristics of Image Operations

There is a variety of ways to classify and characterize image operations. The reason for doing so is to
understand what type of results we might expect to achieve with a given type of operation or what
might be the computational burden associated with a given operation.

Type of operations

The types of operations that can be applied to digital images to transform an input image a[m, n] into
an output image b[m, n] (or another representation) can be classified into three categories as shown in
Table 2.

    Operation    Characterization                                          Generic Complexity/Pixel

    Point        The output value at a specific coordinate is dependent    constant
                 only on the input value at that same coordinate.

    Local        The output value at a specific coordinate is dependent    P^2
                 on the input values in the neighborhood of that same
                 coordinate.

    Global       The output value at a specific coordinate is dependent    N^2
                 on all the values in the input image.

Table 2: Types of image operations. Image size = N x N; neighborhood size = P x P. Note that the complexity is specified in operations per pixel.

This is shown graphically in Figure(1.2).

Figure (1.2): Illustration of various types of image operations
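To make the three categories concrete, here is a minimal Python/NumPy sketch with one operation of each type; the 8 x 8 random image and the 3 x 3 neighborhood are arbitrary example choices, not prescribed by the text:

    import numpy as np

    a = np.random.randint(0, 256, size=(8, 8))   # hypothetical 8x8 input image

    # Point operation: each output pixel depends only on the input pixel
    # at the same coordinate (here, a simple brightness inversion).
    b_point = 255 - a

    # Local operation: each output pixel depends on a PxP neighborhood
    # (here, a 3x3 mean filter computed explicitly, ignoring the border).
    P = 3
    b_local = a.astype(float).copy()
    for m in range(1, a.shape[0] - 1):
        for n in range(1, a.shape[1] - 1):
            b_local[m, n] = a[m-1:m+2, n-1:n+2].mean()

    # Global operation: each output pixel depends on all input values
    # (here, subtracting the global mean brightness).
    b_global = a - a.mean()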

Types of neighborhoods

Neighborhood operations play a key role in modern digital image processing. It is therefore important
to understand how images can be sampled and how that relates to the various neighborhoods that can
be used to process an image.

Rectangular sampling - In most cases, images are sampled by laying a rectangular grid over an image as illustrated in Figure (1.1). This results in the type of sampling shown in Figures (1.3a) and (1.3b).

Hexagonal sampling - An alternative sampling scheme is shown in Figure (1.3c) and is termed hexagonal sampling.

Both sampling schemes have been studied extensively and both represent a possible periodic tiling of the continuous image space. However, rectangular sampling remains the method of choice due to hardware and software considerations. Local operations produce an output pixel value b[m = m0, n = n0] based upon the pixel values in the neighborhood of a[m = m0, n = n0]. Some of the most common neighborhoods are the 4-connected neighborhood and the 8-connected neighborhood in the case of rectangular sampling, and the 6-connected neighborhood in the case of hexagonal sampling, all illustrated in Figure (1.3).

Figure (1.3): (a) Rectangular sampling, 4-connected neighborhood. (b) Rectangular sampling, 8-connected neighborhood. (c) Hexagonal sampling, 6-connected neighborhood.
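In code, such neighborhoods are usually represented as lists of coordinate offsets. A small sketch in pure Python (the 8 x 8 image size and the pixel locations are just examples):

    # Offsets defining the 4-connected and 8-connected neighborhoods
    # of a pixel at [m, n] under rectangular sampling.
    N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]

    def neighbors(m, n, offsets, shape):
        # Return the in-bounds neighbor coordinates of pixel [m, n].
        rows, cols = shape
        return [(m + dm, n + dn) for dm, dn in offsets
                if 0 <= m + dm < rows and 0 <= n + dn < cols]

    print(neighbors(0, 0, N4, (8, 8)))   # corner pixel: only 2 of the 4 neighbors exist
    print(neighbors(3, 3, N8, (8, 8)))   # interior pixel: all 8 neighbors exist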


Video Parameters
We do not propose to describe the processing of dynamically changing images in this introduction. It is
appropriate-given that many static images are derived from video cameras and frame grabbers-to
mention the standards that are associated with the three standard video schemes that are currently in
worldwide use- NTSC, PAL, and SECAM. This information is summarized in Table 3.

    Property                        NTSC      PAL       SECAM
    Images / second                 29.97     25        25
    ms / image                      33.37     40.0      40.0
    Lines / image                   525       625       625
    Aspect ratio (horiz./vert.)     4:3       4:3       4:3
    Interlace                       2:1       2:1       2:1
    µs / line                       63.56     64.00     64.00

Table 3: Standard video parameters

In an interlaced image the odd-numbered lines (1, 3, 5, ...) are scanned in half of the allotted time (e.g. 20 ms in PAL) and the even-numbered lines (2, 4, 6, ...) are scanned in the remaining half. The image display must be coordinated with this scanning format. The reason for interlacing the scan lines of a video image is to reduce the perception of flicker in a displayed image. If one is planning to use images that have been scanned from an interlaced video source, it is important to know if the two half-images have been appropriately "shuffled" by the digitization hardware or if that should be implemented in software. Further, the analysis of moving objects requires special care with interlaced video to avoid "zigzag" edges.
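If the shuffling has to be done in software, it amounts to interleaving the two half-images line by line. A hedged NumPy sketch, assuming the odd and even fields arrive as separate arrays (the PAL-like field size below is only an example):

    import numpy as np

    def weave_fields(odd_field, even_field):
        # Interleave two half-images (fields) into one full frame:
        # odd_field holds lines 1, 3, 5, ...; even_field holds lines 2, 4, 6, ...
        rows, cols = odd_field.shape
        frame = np.empty((2 * rows, cols), dtype=odd_field.dtype)
        frame[0::2, :] = odd_field    # odd-numbered lines (0-based rows 0, 2, 4, ...)
        frame[1::2, :] = even_field   # even-numbered lines
        return frame

    # Hypothetical PAL-like fields: 288 lines each -> 576 visible lines per frame.
    odd = np.zeros((288, 720), dtype=np.uint8)
    even = np.ones((288, 720), dtype=np.uint8)
    print(weave_fields(odd, even).shape)   # (576, 720)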

Tools

Certain tools are central to the processing of digital images. These include mathematical tools such
as convolution, Fourier analysis, and statistical descriptions, and manipulative tools such as chain
codes and run codes. We will present these tools without any specific motivation. The motivation will
follow in later sections.

2D Convolution

There are several possible notations to indicate the convolution of two (multi-dimensional) signals to produce an output signal. The most common are:

    c = a \otimes b = a * b

We shall use the first form, c = a \otimes b, with the following formal definitions.

In 2D continuous space:

    c(x, y) = a(x, y) \otimes b(x, y) = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} a(\chi, \zeta)\, b(x - \chi, y - \zeta)\, d\chi\, d\zeta

In 2D discrete space:

    c[m, n] = a[m, n] \otimes b[m, n] = \sum_{j=-\infty}^{+\infty} \sum_{k=-\infty}^{+\infty} a[j, k]\, b[m - j, n - k]
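A direct, unoptimized Python implementation of the 2D discrete convolution sum above; the small test arrays are arbitrary example data:

    import numpy as np

    def conv2d(a, b):
        # Full 2D discrete convolution: c[m,n] = sum_j sum_k a[j,k] * b[m-j, n-k].
        Ma, Na = a.shape
        Mb, Nb = b.shape
        c = np.zeros((Ma + Mb - 1, Na + Nb - 1))
        for j in range(Ma):
            for k in range(Na):
                # Each input sample a[j,k] contributes a shifted, scaled copy of b.
                c[j:j + Mb, k:k + Nb] += a[j, k] * b
        return c

    a = np.arange(9).reshape(3, 3).astype(float)
    b = np.array([[0.0, 1.0], [1.0, 0.0]])     # example 2x2 kernel
    print(conv2d(a, b))

For small test arrays the result can be cross-checked against scipy.signal.convolve2d(a, b), which evaluates the same full convolution sum.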
Properties of Convolution

There are a number of important mathematical properties associated with convolution.

Convolution is commutative:

    c = a \otimes b = b \otimes a

Convolution is associative:

    c = a \otimes (b \otimes d) = (a \otimes b) \otimes d = a \otimes b \otimes d

Convolution is distributive:

    c = a \otimes (b + d) = (a \otimes b) + (a \otimes d)

where a, b, c, and d are all images, either continuous or discrete.

2D Fourier Transforms

The Fourier transform produces another representation of a signal, specifically a representation as a weighted sum of complex exponentials. Because of Euler's formula:

    e^{jq} = \cos(q) + j \sin(q)

where j = \sqrt{-1}, we can say that the Fourier transform produces a representation of a (2D) signal as a weighted sum of sines and cosines. The defining formulas for the forward Fourier and the inverse Fourier transforms are as follows. Given an image a and its Fourier transform A, the forward transform goes from the spatial domain (either continuous or discrete) to the frequency domain, which is always continuous:

    A = F\{a\}


The inverse Fourier transform goes from the frequency domain back to the spatial domain:

    a = F^{-1}\{A\}

The Fourier transform is a unique and invertible operation, so that:

    a = F^{-1}\{F\{a\}\}   and   A = F\{F^{-1}\{A\}\}


Substituting one of these expressions into the other simply recovers the original image or its transform, which is what this invertibility expresses.


The specific formulas for transforming back and forth between the spatial domain and the frequency domain are given below.

In 2D continuous space:

    A(u, v) = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} a(x, y)\, e^{-j(ux + vy)}\, dx\, dy

    a(x, y) = \frac{1}{4\pi^{2}} \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} A(u, v)\, e^{+j(ux + vy)}\, du\, dv

In 2D discrete space:

    A(\Omega, \Psi) = \sum_{m=-\infty}^{+\infty} \sum_{n=-\infty}^{+\infty} a[m, n]\, e^{-j(\Omega m + \Psi n)}

    a[m, n] = \frac{1}{4\pi^{2}} \int_{-\pi}^{+\pi} \int_{-\pi}^{+\pi} A(\Omega, \Psi)\, e^{+j(\Omega m + \Psi n)}\, d\Omega\, d\Psi
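In practice the transform of a finite, sampled image is computed with the FFT, which evaluates the discrete transform on a finite frequency grid. A brief NumPy sketch with a synthetic cosine test image (the 64 x 64 size and the frequency are arbitrary choices):

    import numpy as np

    # Synthetic 64x64 test image: a single cosine along the horizontal axis.
    row = np.cos(2 * np.pi * 8 * np.arange(64) / 64)
    a = np.tile(row, (64, 1))

    # Forward 2D transform: samples of A(Omega, Psi) on a discrete frequency grid.
    A = np.fft.fft2(a)

    # Inverse transform recovers the original image to numerical precision.
    a_rec = np.fft.ifft2(A)
    print(np.allclose(a, a_rec.real))        # True

    # Magnitude and phase of the (complex) transform.
    magnitude = np.abs(A)
    phase = np.angle(A)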

Properties of Fourier Transforms

* Importance of phase and magnitude
* Circularly symmetric signals
* Examples of 2D signals and transforms

There are a variety of properties associated with the Fourier transform and the inverse Fourier
transform. The following are some of the most relevant for digital image processing.

* The Fourier transform is, in general, a complex function of the real frequency variables. As such the transform can be written in terms of its magnitude and phase:

    A(u, v) = |A(u, v)|\, e^{j\varphi(u, v)}

* A 2D signal can also be complex and thus written in terms of its magnitude and phase:

    a(x, y) = |a(x, y)|\, e^{j\vartheta(x, y)}

* If a 2D signal is real, then the Fourier transform has certain symmetries:

    A(u, v) = A^{*}(-u, -v)

The symbol (*) indicates complex conjugation. For real signals this equation leads directly to:

    |A(u, v)| = |A(-u, -v)|   and   \varphi(u, v) = -\varphi(-u, -v)

* If a 2D signal is real and even, then the Fourier transform is real and even:

    A(u, v) = A(-u, -v)

* The Fourier and the inverse Fourier transforms are linear operations:

    F\{w_{1}a + w_{2}b\} = F\{w_{1}a\} + F\{w_{2}b\} = w_{1}A + w_{2}B
    F^{-1}\{w_{1}A + w_{2}B\} = F^{-1}\{w_{1}A\} + F^{-1}\{w_{2}B\} = w_{1}a + w_{2}b

where a and b are 2D signals (images) and w_{1} and w_{2} are arbitrary, complex constants.

* The Fourier transform in discrete space, A(\Omega, \Psi), is periodic in both \Omega and \Psi. Both periods are 2\pi:

    A(\Omega + 2\pi j, \Psi + 2\pi k) = A(\Omega, \Psi)        (j, k integers)

Importance of phase and magnitude-

The definition indicates that the Fourier transform of an image can be complex. This is illustrated below in Figures (1.4a-c).

Figure (1.4a) Figure (1.4b) Figure (1.4c)

Figure (1.4a) shows the original image a[m, n], Figure (1.4b) the magnitude in a scaled form as log(|A(\Omega, \Psi)| + 1), and Figure (1.4c) the phase \varphi(\Omega, \Psi).

Both the magnitude and the phase functions are necessary for the complete reconstruction of an image from its Fourier transform. Figure (1.5a) shows what happens when Figure (1.4a) is restored solely on the basis of the magnitude information, and Figure (1.5b) shows what happens when Figure (1.4a) is restored solely on the basis of the phase information.

Figure (1.5a): restoration from the magnitude |A(\Omega, \Psi)| only. Figure (1.5b): restoration from the phase \varphi(\Omega, \Psi) only, with the magnitude held constant.


Neither the magnitude information nor the phase information is sufficient to restore the image. The magnitude-only
image Figure (1.5a) is unrecognizable and has severe dynamic range problems. The phase-only image Figure (1.5b)
is barely recognizable, that is, severely degraded in quality.
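The experiment of Figure (1.5) is straightforward to reproduce numerically. A sketch in NumPy, using a random array as a stand-in for the test image (any grayscale image could be substituted):

    import numpy as np

    # Stand-in test image (in practice, load any grayscale photograph).
    rng = np.random.default_rng(0)
    a = rng.random((128, 128))

    A = np.fft.fft2(a)
    magnitude, phase = np.abs(A), np.angle(A)

    # Magnitude-only reconstruction: discard the phase (set it to zero).
    mag_only = np.fft.ifft2(magnitude).real

    # Phase-only reconstruction: set the magnitude to a constant (here, 1).
    phase_only = np.fft.ifft2(np.exp(1j * phase)).real

    # Neither reconstruction matches the original image.
    print(np.allclose(a, mag_only), np.allclose(a, phase_only))   # False False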

Circularly symmetric signals

An arbitrary 2D signal a(x, y) can always be written in a polar coordinate system as a(r, \theta). When the 2D signal exhibits a circular symmetry this means that:

    a(x, y) = a(r, \theta) = a(r)

where r^{2} = x^{2} + y^{2} and \tan\theta = y/x. As a number of physical systems such as lenses exhibit circular symmetry, it is useful to be able to compute an appropriate Fourier representation.

The Fourier transform A(u, v) can be written in polar coordinates A(\omega_{r}, \xi) and then, for a circularly symmetric signal, rewritten as a Hankel transform:

    A(u, v) = A(\omega_{r}) = 2\pi \int_{0}^{\infty} a(r)\, J_{0}(\omega_{r} r)\, r\, dr        (1.2)

where \omega_{r}^{2} = u^{2} + v^{2} and J_{0}(\cdot) is a Bessel function of the first kind of order zero.

The inverse Hankel transform is given by:

    a(r) = \frac{1}{2\pi} \int_{0}^{\infty} A(\omega_{r})\, J_{0}(\omega_{r} r)\, \omega_{r}\, d\omega_{r}

The Fourier transform of a circularly symmetric 2D signal is a function of only the radial frequency \omega_{r}. The dependence on the angular frequency \xi has vanished. Further, if a(x, y) and hence a(r) is real, then it is automatically even due to the circular symmetry. According to equ (1.2), A(\omega_{r}) will then be real and even.
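Equation (1.2) can be checked numerically for a simple circularly symmetric signal. The sketch below uses a Gaussian a(r) = exp(-r^2/2), whose 2D Fourier transform is known to be 2π exp(-ω_r^2/2); the integration grid is an arbitrary choice and SciPy supplies the Bessel function:

    import numpy as np
    from scipy.special import j0

    # Radial samples of the circularly symmetric signal a(r) = exp(-r^2 / 2).
    r = np.linspace(0.0, 20.0, 4000)
    dr = r[1] - r[0]
    a_r = np.exp(-r**2 / 2.0)

    def hankel(omega_r):
        # A(omega_r) = 2*pi * integral_0^inf a(r) J0(omega_r * r) r dr  (Riemann sum)
        return 2.0 * np.pi * np.sum(a_r * j0(omega_r * r) * r) * dr

    for w in (0.0, 1.0, 2.0):
        # Numerical Hankel transform vs. the analytic result 2*pi*exp(-w^2/2).
        print(hankel(w), 2.0 * np.pi * np.exp(-w**2 / 2.0))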

Statistics

* Probability distribution function of the brightnesses
* Probability density function of the brightnesses
* Average
* Standard deviation
* Coefficient-of-variation
* Signal-to-noise ratio

In image processing it is quite common to use simple statistical descriptions of images and sub-images. The notion of a statistic is intimately connected to the concept of a probability distribution, generally the distribution of signal amplitudes. For a given region, which could conceivably be an entire image, we can define the probability distribution function of the brightnesses in that region and the probability density function of the brightnesses in that region. We will assume in the discussion that follows that we are dealing with a digitized image a[m, n].

Probability distribution function of the brightnesses

The probability distribution function, P(a), is the probability that a brightness chosen from the region is less than or equal to a given brightness value a. As a increases from -\infty to +\infty, P(a) increases from 0 to 1. P(a) is monotonic, non-decreasing in a and thus dP/da \geq 0.
Probability density function of the brightnesses

The probability that a brightness in a region falls between a and a + \Delta a, given the probability distribution function P(a), can be expressed as p(a)\,\Delta a, where p(a) is the probability density function:

    p(a) = \frac{dP(a)}{da}

Because of the monotonic, non-decreasing character of P(a) we have p(a) \geq 0 and \int_{-\infty}^{+\infty} p(a)\, da = 1. For an image with quantized (integer) brightness amplitudes, the interpretation of \Delta a is the width of a brightness interval. We assume constant-width intervals. The brightness probability density function is frequently estimated by counting the number of times that each brightness occurs in the region to generate a histogram, h[a]. The histogram can then be normalized so that the total area under the histogram is 1. Said another way, the p[a] for a region is the normalized count of the number of pixels, N, in the region that have quantized brightness a:

    p[a] = \frac{1}{N}\, h[a]   \quad\text{with}\quad   N = \sum_{a} h[a]

The brightness probability distribution function for the image of Figure (1.4a) is shown in Figure (1.6a). The (unnormalized) brightness histogram, which is proportional to the estimated brightness probability density function, is shown in Figure (1.6b). The height in this histogram corresponds to the number of pixels with a given brightness.
Figure (1.6a) Figure (1.6b)

Figure (1.6): (a) Brightness distribution function of Figure (1.4a) with minimum, median, and maximum indicated. (b) Brightness histogram of Figure (1.4a).
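A sketch of how h[a], p[a], and P(a) are estimated in practice (NumPy; a random 8-bit array stands in for the image region):

    import numpy as np

    rng = np.random.default_rng(0)
    region = rng.integers(0, 256, size=(256, 256))   # stand-in 8-bit image region

    # Histogram h[a]: number of pixels at each quantized brightness 0..255.
    h = np.bincount(region.ravel(), minlength=256)

    # Estimated probability density p[a]: normalize the counts by the pixel count N.
    N = h.sum()
    p = h / N

    # Estimated distribution function P(a): running sum of the counts, normalized.
    P = np.cumsum(h) / N
    print(P[-1])    # 1.0: P is monotonic, non-decreasing, and ends at 1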

Both the distribution function and the histogram as measured from a region are a statistical description of that region. It must be emphasized that both P[a] and p[a] should be viewed as estimates of true distributions when they are computed from a specific region. That is, we view an image and a specific region as one realization of the various random processes involved in the formation of that image and that region. In the same context, the statistics defined below must be viewed as estimates of the underlying parameters.

Average

The average brightness of a region is defined as the sample mean of the pixel brightnesses within that region. The average, m_{a}, of the brightness over the N pixels within a region R is given by:

    m_{a} = \frac{1}{N} \sum_{(m,n) \in R} a[m, n]

Alternatively, we can use a formulation based upon the (unnormalized) brightness histogram, h[a], with discrete brightness values a. This gives:

    m_{a} = \frac{1}{N} \sum_{a} a \cdot h[a]

The average brightness, m_{a}, is an estimate of the mean brightness, \mu_{a}, of the underlying brightness probability distribution.
Standard deviation

The unbiased estimate of the standard deviation, s_{a}, of the brightnesses within a region R with N pixels is called the sample standard deviation and is given by:

    s_{a} = \sqrt{ \frac{1}{N-1} \sum_{(m,n) \in R} \left( a[m, n] - m_{a} \right)^{2} }

Using the histogram formulation gives:

    s_{a} = \sqrt{ \frac{ \sum_{a} a^{2}\, h[a] - N\, m_{a}^{2} }{ N - 1 } }

The standard deviation, s_{a}, is an estimate of \sigma_{a} of the underlying brightness probability distribution.
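Both formulations of the mean and standard deviation can be verified on stand-in data; they agree up to floating-point rounding:

    import numpy as np

    rng = np.random.default_rng(1)
    region = rng.integers(0, 256, size=(64, 64))
    N = region.size

    # Direct sample mean and unbiased sample standard deviation.
    m_a = region.mean()
    s_a = region.std(ddof=1)

    # Histogram-based formulations.
    h = np.bincount(region.ravel(), minlength=256)
    a = np.arange(256)
    m_hist = (a * h).sum() / N
    s_hist = np.sqrt(((a**2 * h).sum() - N * m_hist**2) / (N - 1))

    print(m_a, m_hist)   # same value
    print(s_a, s_hist)   # same value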

Coefficient-of-variation

The dimensionless coefficient-of-variation, CV, is defined as:

    CV = \frac{s_{a}}{m_{a}} \times 100\%

Percentiles-

The percentile, p%, of an unquantized brightness distribution is defined as that value of the brightness a such that:

    P(a) = p\%

or equivalently:

    \int_{-\infty}^{a} p(\alpha)\, d\alpha = p\%

Three special cases are frequently used in digital image processing.

* 0% the minimum value in the region

* 50% the median value in the region


* 100% the maximum value in the region.

All three of these values can be determined from Figure (1.6a).
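With quantized data these three percentiles are simply the minimum, median, and maximum brightness; for example in NumPy (stand-in data):

    import numpy as np

    rng = np.random.default_rng(2)
    region = rng.integers(0, 256, size=(64, 64))

    # 0%, 50% and 100% percentiles of the brightness distribution.
    print(np.percentile(region, [0, 50, 100]))
    # Equivalently: the minimum, median, and maximum values in the region.
    print(region.min(), np.median(region), region.max())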

Mode-

The mode of the distribution is the most frequent brightness value. There is no guarantee that a mode
exists or that it is unique.

Signal to Noise ratio-

The signal-to-noise ratio, SNR, can have several definitions. The noise is characterized by its standard deviation, s_{n}. The characterization of the signal can differ. If the signal is known to lie between two boundaries, a_{min} \leq a \leq a_{max}, then the SNR is defined as:

Bounded signal:

    SNR = 20 \log_{10}\!\left( \frac{a_{max} - a_{min}}{s_{n}} \right) \; \text{dB}        (1.3)

If the signal is not bounded but has a statistical distribution, then two other definitions are known:

Stochastic signal:

    S and N inter-dependent:    SNR = 20 \log_{10}\!\left( \frac{m_{a}}{s_{n}} \right) \; \text{dB}

    S and N independent:        SNR = 20 \log_{10}\!\left( \frac{s_{a}}{s_{n}} \right) \; \text{dB}

where m_{a} and s_{a} are defined above.

The various statistics are given in Table 5 for the image and the region of interest (ROI) shown in Figure (1.7).

    Statistic              Image    ROI
    Average                137.7    219.3
    Standard Deviation     49.5     4.0
    Minimum                56       202
    Median                 141      220
    Maximum                241      226
    Mode                   62       220
    SNR (dB)               NA       33.3

Table 5: Statistics from Figure (1.7)

Figure (1.7): The region of interest (ROI) is the interior of the circle.

An SNR calculation for the entire image based on equ (1.3) is not directly available. The variations in the image brightnesses that lead to the large value of s (= 49.5) are not, in general, due to noise but to the variation in local information. With the help of the region of interest there is a way to estimate the SNR. We can use the standard deviation measured in the ROI (= 4.0) as s_{n} and the dynamic range of the image, a_{max} - a_{min} (= 241 - 56), to calculate a global SNR (= 33.3 dB). The underlying assumptions are that (1) the signal is approximately constant in that region, so the variation in the region is therefore due to noise, and (2) that the noise is the same over the entire image with a standard deviation given by s_{n} = s_{ROI}.
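The 33.3 dB value can be reproduced from the table entries with equ (1.3); a short check in Python:

    import math

    s_n = 4.0                 # noise standard deviation, estimated from the nearly constant ROI
    a_max, a_min = 241, 56    # maximum and minimum brightness over the whole image

    snr_db = 20 * math.log10((a_max - a_min) / s_n)
    print(round(snr_db, 1))   # 33.3 dB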
