CCR: Clustering and Collaborative Representation for Fast Single Image Super-Resolution
ABSTRACT
Clustering and collaborative representation (CCR) have recently been used in fast single image super-resolution (SR). In this paper, we propose an effective and fast single image SR algorithm by combining clustering and collaborative representation. In particular, we first cluster the feature space of low-resolution (LR) images into multiple LR feature subspaces and group the corresponding high-resolution (HR) feature subspaces. The local geometry property learned from the clustering process is used to collect numerous neighbour LR and HR feature subsets from the whole feature spaces for each cluster center. Multiple projection matrices are then computed via collaborative representation to map LR feature subspaces to HR subspaces. For an arbitrary input LR feature, the desired HR output can be estimated according to the projection matrix whose corresponding LR cluster center is nearest to the input. Moreover, by learning statistical priors from the clustering process, our clustering-based SR algorithm further decreases the computational time in the reconstruction phase. Extensive experimental results on commonly used datasets indicate that our proposed SR algorithm obtains compelling SR images quantitatively and qualitatively compared with many state-of-the-art methods.
This letter addresses the problem of generating a super-resolution (SR) image from a single low-resolution
(LR) input image in the wavelet domain. To achieve a sharper image, an intermediate stage for estimating
the high-frequency (HF) sub-bands has been proposed. This stage includes an edge preservation procedure
and mutual interpolation between the input LR image and the HF sub-band images, as performed via the
discrete wavelet transform (DWT). Sparse mixing weights are calculated over blocks of coefficients in an
image, which provides a sparse signal representation in the LR image. All of the sub-band images are used
to generate the new high-resolution image using the inverse DWT. Experimental results indicated that the proposed approach outperforms existing methods in terms of objective criteria and subjective perception, improving the image resolution.
CHAPTER-1
INTRODUCTION
The images and video sequences that are registered by radar, optical, medical, and other sensors, and that are presented on high-definition television, in electron microscopy, etc., are obtained from electronic devices that use a variety of sensors. Therefore, a pre-processing technique that permits enhancement of the image resolution is required.
This step can be performed by estimating a high-resolution (HR) image x(m, n) from measurements of a low-resolution (LR) image y(m, n) that were obtained through a linear operator V that forms a degraded version of the unknown HR image, additionally contaminated by an additive noise w(m, n), i.e.,

y(m, n) = V x(m, n) + w(m, n).  (1)

In most applications, V is a sub-sampling operator that should be inverted to restore the original image size.
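To make the observation model concrete, the following is a minimal MATLAB sketch of (1) under stated assumptions: V is taken to be a factor-2 decimation operator, w is zero-mean Gaussian noise, and the test image is only an example file shipped with the Image Processing Toolbox.

% Minimal sketch of the observation model (1), assuming V is a
% factor-2 decimation and w(m, n) is zero-mean Gaussian noise.
x = im2double(imread('cameraman.tif'));  % stand-in for the unknown HR image
V = @(img) img(1:2:end, 1:2:end);        % sub-sampling operator V
w = 0.01 * randn(size(V(x)));            % additive noise w(m, n)
y = V(x) + w;                            % observed LR image y(m, n)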
In remote sensing monitoring and navigation missions with small airborne or unmanned flying vehicle platforms, LR sensors with simple and cheap hardware, such as unfocused fractional SAR systems or optical cameras, are commonly employed.
However, such cheap sensors or fractal synthesis modes inevitably sacrifice spatial resolution.
The system could also suffer from uncertainties that are attributed to random signal perturbations, imperfect system calibration, etc. Therefore, SR algorithms that are cost-effective solutions have an important application in the processing of satellite or aerial images obtained by radar or optical sensors.
The wavelet technique as a simple sparse representation also plays a significant role in many image
processing applications, in particular in resolution enhancement, and recently, many novel algorithms have
been proposed.
Prior information on the image sparsity has been widely used for image interpolation. The principal idea behind the restriction of sparse SR algorithms is that the HR results can be improved by using more prior information about the sparse structure of the image.
The predominant challenge of this study is to employ an approach that is similar to that of these wavelet-based algorithms, accounting for both spatial and spectral wavelet pixel information to enhance the resolution of the image.
The principal difference of the novel SR approach in comparison with existing methods consists in the mutual interpolation via Lanczos and nearest neighbour interpolation (NNI) techniques for wavelet transform (WT) high-frequency (HF) subband images and edge-extraction images obtained via the discrete wavelet transform (DWT). The directional LR image interpolation is computed by estimating sparse image mixture models in a DWT image. To obtain robustness of the SR process in the presence of noise, the novel framework uses special denoising filtering, employing the nonlocal means (NLM) technique for the input LR image.
Finally, all of the sub-band images are combined, reconstructing via the inverse DWT (IDWT) the output HR image, which appears to demonstrate the superiority of the designed algorithm in terms of objective criteria and subjective perception (via the human visual system), in comparison with the best existing techniques.
To justify that the novel algorithm, called super-resolution using wavelet domain interpolation with edge extraction and sparse representation (SR-WDIEE-SR), has real advantages, we have compared the proposed SR procedure with other similar techniques, such as the following: Demirel-Anbarjafari Super Resolution, Wavelet Domain Image Resolution Enhancement Using Cycle-Spinning, Image Resolution Enhancement by using Discrete and Stationary Wavelet Decomposition, Discrete Wavelet Transform-Based Satellite Image Resolution Enhancement, and Dual-Tree Complex Wavelet Transform.
To ascertain the effectiveness of the proposed algorithm over other wavelet domain resolution-enhancement techniques, numerous LR images of different nature (video and radar) obtained from Web pages were tested. The first database consists of 38 images, and the second database contains about 20 radar images.
The remainder of this letter is organized as follows. Section II presents a short introduction to the NLM filtering method and to the implementation of the interpolation through the inverse mixing estimator for a single image in WT space. The proposed technique is presented in Section III. Section IV discusses the qualitative and quantitative results of the proposed algorithm in comparison with other conventional techniques, and the conclusions are drawn in Section V.
1.1 PRELIMINARIES
The sub-sampled image x̂(m, n) is decomposed with a one-level DWT into the LL, LH, HL, and HH image sub-bands, which are treated as matrices Ψ whose columns (approximations and details) are the vectors of a wavelet image single scale {ψd,n}, 0 ≤ d ≤ 3, n ∈ G. The decomposition process is performed with a dual frame matrix Ψ̃ whose columns are the dual wavelet frames {ψ̃d,n}, 0 ≤ d ≤ 3, n ∈ G. The wavelet coefficients are written as follows:

a(d, n) = ⟨x̂, ψd,n⟩, 0 ≤ d ≤ 3, n ∈ G.
The WT separates a low-frequency (LF) image zl (an approximation) that is projected over the LF scaling filters {ψ0,n}, n ∈ G, and an HF image zh (details) that is projected over the finest-scale HF wavelets:

x̂ = zl + zh, where zl = Σn∈G a(0, n) ψ̃0,n and zh = Σn∈G Σd=1..3 a(d, n) ψ̃d,n.

zl has little aliasing and can thus be interpolated with a Lanczos interpolator V+. zh is interpolated by selecting directional interpolators Vθ+ for θ ∈ Θ, where Θ is a set of angles uniformly discretized between 0 and π.
For each angle θ, an update is computed over the wavelet coefficients of each block Bθ,q of direction θ, multiplied by their mixing weight ã(Bθ,q):

ẑθ(m, n) = Σq ã(Bθ,q) 1Bθ,q Vθ+ zh(m, n),  (6)

where the weights ã(Bθ,q) are estimated from the difference between the directional and the separable approximations over the block set B. This overall interpolator is calculated with 20 angles, with blocks having a fixed width.
CHAPTER-2
In this letter, one level of DWT that applies different wavelet families is used to decompose an input image. The DWT separates an image into different subband images, namely, LL, LH, HL, and HH, where the last three subbands contain the HF components of the image. The interpolation process should be applied to all subband images. To suppress the noise influence, the novel framework applies a denoising procedure by using
the NLM technique for the input LR image (see step 1 in Fig. 1).
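As a hedged illustration of step 1, the sketch below denoises a hypothetical LR input with non-local means and then applies a one-level DWT. The file name is an assumption; imnlmfilt requires a recent Image Processing Toolbox release and dwt2 the Wavelet Toolbox.

% Sketch of step 1 (an illustration, not the authors' exact code):
% NLM denoising of the LR input followed by a one-level DWT.
lr = im2double(imread('lr_input.png'));   % hypothetical LR input file
lrD = imnlmfilt(lr);                      % non-local means denoising
[LL, LH, HL, HH] = dwt2(lrD, 'bior1.3');  % approximation + three HF subbands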
In the proposed SR procedure, the LR image is used as the input data in the sparse representation
for the resolution-enhancement process in the following way (see step 2a in Fig. 1).
The interpolated LR image is calculated by using a 1-D interpolation in a given direction θ, followed by the computation of the new samples along the oversampled rows, columns, or diagonals. Finally, in this step, the algorithm computes the missing samples along the direction θ from the previously calculated new samples; the entire sparse process is performed with the Lanczos interpolation.
Fig. 1. Block diagram of the proposed resolution-enhancement technique.
Fig. 2. Visual perception results for the Aerial-A image contaminated by Gaussian additive noise (PSNR = 17 dB).
The use of the difference between the interpolated (factor α = 2) LL subband image and the LR input image has been proposed. As shown in step 2b of the algorithm (see Fig. 1), this difference is applied in the HF subbands by interpolating each band via the NNI process (changing the values of pixels in agreement with the closest neighbour value), including additional HF features into the HF images.
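One possible reading of step 2b is sketched below: the difference between the Lanczos-interpolated LL subband and the LR input is added to the NNI-enlarged HF subbands. The variable names continue the step-1 sketch, and the exact arithmetic is an assumption rather than the authors' stated formula.

% Step 2b sketch (one interpretation of the text): inject the difference
% image into the HF subbands enlarged by nearest-neighbour interpolation.
diffImg = imresize(LL, size(lrD), 'lanczos3') - lrD;  % interpolated LL minus LR input
LH2 = imresize(LH, size(lrD), 'nearest') + diffImg;   % NNI-enlarged HF bands
HL2 = imresize(HL, size(lrD), 'nearest') + diffImg;   % carrying extra HF features
HH2 = imresize(HH, size(lrD), 'nearest') + diffImg;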
To preserve more edge information (to obtain a sharper enhanced image), we have proposed an extraction step for the edges using the HF subband images HH, HL, and LH; next, the edge information is injected into the HF subbands employing the NNI process (see step 2c in Fig. 1). The extracted edges are calculated as described in [16].
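The exact edge formula of [16] is not reproduced in the text; as an assumption, a common choice combines the magnitudes of the three HF subbands, as in this sketch.

% Edge-extraction sketch (assumed form; the formula of [16] is not given here).
E = sqrt(LH2.^2 + HL2.^2 + HH2.^2);             % edge map from the HF subbands
LH2 = LH2 + E;  HL2 = HL2 + E;  HH2 = HH2 + E;  % re-inject edge detail (step 2c)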
In the concluding stage, we perform an additional Lanczos interpolation (factor α = 2) to reach the required size for the IDWT process (see step 3 in Fig. 1).
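A sketch of step 3 follows; using the LR input in place of the LL band is a common arrangement in wavelet-domain SR and is assumed here.

% Step 3 sketch: Lanczos interpolation (factor 2) to the size required by
% the IDWT, then reconstruction of the output HR image.
LLf = imresize(lrD, 2, 'lanczos3');           % LR input acts as the LL band
LHf = imresize(LH2, size(LLf), 'lanczos3');
HLf = imresize(HL2, size(LLf), 'lanczos3');
HHf = imresize(HH2, size(LLf), 'lanczos3');
hr = idwt2(LLf, LHf, HLf, HHf, 'bior1.3');    % output HR image via the IDWT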
Note that the intermediate process of adding the difference image and the edge-extraction stage, both of which contain additional HF features, generates a significantly sharper reconstructed SR image. This sharpness is boosted by the fact that interpolating the isolated HF components in HH, HL, and LH appears to preserve more HF components than interpolating from the LR image directly.
This section reports the results of the statistical simulations and the performance evaluation that is conducted via objective metrics (Peak Signal-to-Noise Ratio, Mean Absolute Error, and Structural Similarity Index Measure) [17]. In addition, a subjective visual comparison of the SR images produced by the different algorithms was employed, making it possible to evaluate the performance of the compared frameworks.
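For reference, the three metrics can be computed in MATLAB as sketched below; psnr and ssim are Image Processing Toolbox functions, and hr and x stand for an SR result and a ground-truth HR image of the same size.

% Objective-evaluation sketch for an SR result hr against a reference x.
p = psnr(hr, x);            % Peak Signal-to-Noise Ratio, in dB
m = mean2(abs(hr - x));     % Mean Absolute Error
s = ssim(hr, x);            % Structural Similarity Index Measure
fprintf('PSNR = %.2f dB, MAE = %.4f, SSIM = %.3f\n', p, m, s);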
Numerous aerial optical and radar satellite images, particularly the Aerial-A and SAR-B images of different nature and physical characteristics, were studied by applying the designed and the better existing SR procedures. In the simulations, the pixels of the LR image were obtained by down-sampling the original HR image.
Fig. 3. Visual perception results for the SAR-B image contaminated by Gaussian additive noise (PSNR =
17 dB).
TABLE I
TABLE II
In the denoising stage, the NLM filter from (2) was applied, and the neighborhood Q was determined in the simulation experiments. The best SR results were obtained with the Biorthogonal wavelet (Bior1.3).
For the Aerial-A image (see Fig. 2), it is easy to see the better performance, in accordance with the objective criteria and via subjective visual perception, of the SR obtained when the proposed algorithm SR-WDIEE-SR is employed with the wavelet Bior1.3, demonstrating better preservation of the fine features in the zoomed part of the image. It has better sharpness and less smoothing at the edges, preventing pixel blocking artifacts.
Referring to the SR image SAR-B, one can see in Fig. 3 that the novel SR algorithm appears to perform better in terms of the objective criteria (PSNR and SSIM), as well as in visual subjective perception, particularly when using the wavelet Bior1.3. This can be particularly observed at the well-defined borders, where the designed framework restores rather regular geometrical structures, and the fine details appear to be preserved better.
The presented analysis of many simulation results obtained in the SR of images of different nature using state-of-the-art techniques has shown that the novel SR-WDIEE-SR framework outperforms the other competitor methods, presenting better performance. Given that the textures and chromaticity properties of these images are different, the performance results confirm the robustness of the current proposal.
Because noise presence in the LR image is natural in practice, we compare the two configurations of the novel framework: when the NLM filter is used and when, as in the other algorithms, this filter is not applied to the corrupted image. In Tables I and II, one can observe the superiority of the proposed SR-WDIEE-SR framework in both configurations.
We performed a comparison evaluation of the average objective criteria values (PSNR and SSIM) over all images from the mentioned databases (Aerial and SAR) for the proposed framework and the competitor techniques. The results obtained are as follows: PSNR = 31.12 dB and SSIM = 0.672 in the case of our proposal, and PSNR = 28.62 dB and SSIM = 0.614 for the best competing technique (DASR). This last comparison of criteria values over the image databases, the visual subjective perception (see Figs. 2 and 3), and the results presented in Tables I and II have also confirmed the robustness of the current proposal.
Edge detection is the name for a set of mathematical methods which aim at identifying points in a digital
image at which the image brightness changes sharply or, more formally, has discontinuities. The points at
which image brightness changes sharply are typically organized into a set of curved line segments
termed edges. The same problem of finding discontinuities in 1D signals is known as step detection, and the
problem of finding signal discontinuities over time is known as change detection. Edge detection is a
fundamental tool in image processing, machine vision and computer vision, particularly in the areas
of feature detection and feature extraction.
The purpose of detecting sharp changes in image brightness is to capture important events and changes in
properties of the world. It can be shown that under rather general assumptions for an image formation
model, discontinuities in image brightness are likely to correspond to:[2][3]
discontinuities in depth,
discontinuities in surface orientation,
changes in material properties and
variations in scene illumination.
In the ideal case, the result of applying an edge detector to an image may lead to a set of connected curves
that indicate the boundaries of objects, the boundaries of surface markings as well as curves that
correspond to discontinuities in surface orientation. Thus, applying an edge detection algorithm to an
image may significantly reduce the amount of data to be processed and may therefore filter out
information that may be regarded as less relevant, while preserving the important structural properties of
an image. If the edge detection step is successful, the subsequent task of interpreting the information
contents in the original image may therefore be substantially simplified. However, it is not always possible
to obtain such ideal edges from real life images of moderate complexity.
Edges extracted from non-trivial images are often hampered by fragmentation, meaning that the edge curves are not connected, missing edge segments, as well as false edges not corresponding to interesting phenomena in the image – thus complicating the subsequent task of interpreting the image data.
Edge detection is one of the fundamental steps in image processing, image analysis, image pattern
recognition, and computer vision techniques.
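As an illustration (not part of the original text), MATLAB's edge function implements several classical detectors; the Canny detector produces the thin, connected edge curves described above.

% Edge-detection sketch: a Canny edge map of an example grayscale image.
I = imread('cameraman.tif');   % example image shipped with the toolbox
BW = edge(I, 'canny');         % logical array: 1 on edge curves, 0 elsewhere
imshowpair(I, BW, 'montage');  % compare the image with its edge map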
2.4 INTERPOLATION:
In the mathematical field of numerical analysis, interpolation is a method of constructing new data points
within the range of a discrete set of known data points.
In engineering and science, one often has a number of data points, obtained
by sampling or experimentation, which represent the values of a function for a limited number of values of
the independent variable. It is often required to interpolate (i.e. estimate) the value of that function for an
intermediate value of the independent variable. This may be achieved by curve fitting or regression
analysis.
A different problem which is closely related to interpolation is the approximation of a complicated
function by a simple function. Suppose the formula for some given function is known, but too complex to
evaluate efficiently. A few known data points from the original function can be used to create an
interpolation based on a simpler function. Of course, when a simple function is used to estimate data points
from the original, interpolation errors are usually present; however, depending on the problem domain and
the interpolation method used, the gain in simplicity may be of greater value than the resultant loss in
precision.
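A short MATLAB example of constructing new data points within the range of known samples follows; all values here are illustrative.

% Interpolation sketch: estimate a sampled function at intermediate points.
xs = 0:1:10;                          % sample locations
vs = sin(xs);                         % known function values
xq = 0:0.25:10;                       % intermediate query points
vq = interp1(xs, vs, xq, 'spline');   % spline interpolation of new points
plot(xs, vs, 'o', xq, vq, '-');       % known samples vs. interpolated curve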
In a different setting, if we consider x as a point in a topological space and the function f as taking values in a Banach space, then the problem is treated as "interpolation of operators". The classical results about interpolation of operators are the Riesz–Thorin theorem and the Marcinkiewicz theorem. There are also many other subsequent results.
The wavelet transform is one of the most popular time-frequency transformations.
A wavelet is a wave-like oscillation with an amplitude that begins at zero, increases, and then decreases
back to zero. It can typically be visualized as a "brief oscillation" like one might see recorded by
a seismograph or heart monitor. Generally, wavelets are purposefully crafted to have specific properties
that make them useful for signal processing. Wavelets can be combined, using a "reverse, shift, multiply and integrate" technique called convolution, with portions of a known signal to extract information from the unknown signal.
For example, a wavelet could be created to have a frequency of Middle C and a short duration of roughly
a 32nd note. If this wavelet was to be convolved with a signal created from the recording of a song, then
the resulting signal would be useful for determining when the Middle C note was being played in the song.
Mathematically, the wavelet will correlate with the signal if the unknown signal contains information of similar frequency. This concept of correlation is at the core of many practical applications of wavelet theory.
As a mathematical tool, wavelets can be used to extract information from many different kinds of data,
including – but certainly not limited to – audio signals and images. Sets of wavelets are generally needed
to analyze data fully. A set of "complementary" wavelets will decompose data without gaps or overlap so
that the decomposition process is mathematically reversible. Thus, sets of complementary wavelets are
useful in wavelet based compression/decompression algorithms where it is desirable to recover the original
information with minimal loss.
In formal terms, such a representation is a wavelet series representation of a square-integrable function with respect to either a complete, orthonormal set of basis functions or an overcomplete set (a frame) of a vector space, for the Hilbert space of square-integrable functions.
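The reversibility of a complementary wavelet decomposition can be checked numerically; the sketch below uses Wavelet Toolbox functions and an arbitrary test signal.

% Perfect-reconstruction sketch: decompose a signal and recover it.
t = linspace(0, 1, 1024);
s = sin(2*pi*8*t) + 0.5*randn(1, 1024);  % arbitrary noisy test signal
[c, l] = wavedec(s, 3, 'db4');           % 3-level wavelet decomposition
sr = waverec(c, l, 'db4');               % inverse transform
max(abs(s - sr))                         % error near machine precision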
Super-resolution imaging (SR) is a class of techniques that enhance the resolution of an imaging system.
In some SR techniques—termed optical SR—the diffraction limit of systems is transcended, while in
others—geometrical SR—the resolution of digital imaging sensors is enhanced.
Super-resolution imaging techniques are used in general image processing and in super-resolution
microscopy.
Basic concepts
Because some of the ideas surrounding superresolution raise fundamental issues, there is a need at the outset to examine the relevant physical and information-theoretical principles.
Diffraction Limit The detail of a physical object that an optical instrument can reproduce in an image has
limits that are mandated by laws of physics, whether formulated by the diffraction equations in the wave
theory of light[1] or the Uncertainty Principle for photons in quantum mechanics.[2] Information transfer
can never be increased beyond this boundary, but packets outside the limits can be cleverly swapped for (or multiplexed with) some inside it.[3] One does not so much “break” as “run around” the diffraction limit.
New procedures probing electro-magnetic disturbances at the molecular level (in the so-called near field)[4] remain fully consistent with Maxwell's equations.
A succinct expression of the diffraction limit is given in the spatial-frequency domain. In Fourier
optics light distributions are expressed as superpositions of a series of grating light patterns in a range of
fringe widths, technically spatial frequencies. It is generally taught that diffraction theory stipulates an
upper limit, the cut-off spatial-frequency, beyond which pattern elements fail to be transferred into the
optical image, i.e., are not resolved. But in fact what is set by diffraction theory is the width of the
passband, not a fixed upper limit. No laws of physics are broken when a spatial frequency band beyond the
cut-off spatial frequency is swapped for one inside it: this has long been implemented in dark-field
microscopy. Nor are information-theoretical rules broken when superimposing several bands,[5][6] since disentangling them in the received image needs assumptions of object invariance during multiple exposures, i.e., the substitution of one kind of uncertainty for another.
Information When the term super resolution is used in techniques of inferring object details from statistical
treatment of the image within standard resolution limits, for example, averaging multiple exposures, it
involves an exchange of one kind of information (extracting signal from noise) for another (the assumption
that the target has remained invariant).
Resolution and localization True resolution involves the distinction of whether a target, e.g. a star or a
spectral line, is single or double, ordinarily requiring separable peaks in the image. When a target is known
to be single, its location can be determined with higher precision than the image width by finding the
centroid (center of gravity) of its image light distribution. The word ultra-resolution had been proposed for
this process but it did not catch on, and the high-precision localization procedure is typically referred to as
super resolution.
CHAPTER-3
MATLAB
INTRODUCTION TO MATLAB
What Is MATLAB?
MATLAB is a high-performance language for technical computing. It integrates computation, visualization, and programming in an easy-to-use environment where problems and solutions are expressed in familiar mathematical notation. Typical uses include:
Math and computation
Algorithm development
Data acquisition
Data analysis, exploration, and visualization
MATLAB is an interactive system whose basic data element is an array that does not require
dimensioning.
This allows you to solve many technical computing problems, especially those with matrix and vector
formulations, in a fraction of the time it would take to write a program in a scalar non interactive language
such as C or FORTRAN.
The name MATLAB stands for matrix laboratory. MATLAB was originally written to provide easy
access to matrix software developed by the LINPACK and EISPACK projects. Today, MATLAB engines
incorporate the LAPACK and BLAS libraries, embedding the state of the art in software for matrix
computation.
MATLAB has evolved over a period of years with input from many users. In university environments, it is the standard instructional tool for introductory and advanced courses in mathematics, engineering, and science. In industry, MATLAB is the tool of choice for high-productivity research, development, and analysis. MATLAB features a family of application-specific solutions called toolboxes. Very important to most users of MATLAB, toolboxes allow you to learn and apply specialized technology.
Toolboxes are comprehensive collections of MATLAB functions (M-files) that extend the MATLAB
environment to solve particular classes of problems. Areas in which toolboxes are available include signal
processing, control systems, neural networks, fuzzy logic, wavelets, simulation, and many others.
The MATLAB Development Environment:
This is the set of tools and facilities that help you use MATLAB functions and files. Many of these
tools are graphical user interfaces. It includes the MATLAB desktop and Command Window, a command
history, an editor and debugger, and browsers for viewing help, the workspace, files, and the search path.
The MATLAB Mathematical Function Library:
This is a vast collection of computational algorithms ranging from elementary functions, like sum, sine, cosine, and complex arithmetic, to more sophisticated functions, like matrix inverse, matrix eigenvalues, Bessel functions, and fast Fourier transforms.
The MATLAB Language:
This is a high-level matrix/array language with control flow statements, functions, data structures, input/output, and object-oriented programming features. It allows both "programming in the small" to rapidly create quick and dirty throw-away programs, and "programming in the large" to create complete, large, and complex application programs.
Graphics:
MATLAB has extensive facilities for displaying vectors and matrices as graphs, as well as annotating and printing these graphs. It includes high-level functions for two-dimensional and three-dimensional data visualization, image processing, animation, and presentation graphics. It also includes low-level functions that allow you to fully customize the appearance of graphics as well as to build complete graphical user interfaces for your MATLAB applications.
The MATLAB Application Program Interface (API):
This is a library that allows you to write C and Fortran programs that interact with MATLAB. It includes facilities for calling routines from MATLAB (dynamic linking), calling MATLAB as a computational engine, and reading and writing MAT-files.
MATLAB DESKTOP:-
Matlab Desktop is the main Matlab application window. The desktop contains five sub windows, the
command window, the workspace browser, the current directory window, the command history window,
and one or more figure windows, which are shown only when the user displays a graphic.
The command window is where the user types MATLAB commands and expressions at the prompt (>>) and where the outputs of those commands are displayed.
MATLAB defines the workspace as the set of variables that the user creates in a work session. The workspace browser shows these variables and some information about them. Double clicking on a variable in the workspace browser launches the Array Editor, which can be used to obtain information about, and in some cases edit, certain properties of the variable.
The Current Directory tab above the workspace tab shows the contents of the current directory, whose path is shown in the current directory window. For example, in the Windows operating system the path might be as follows: C:\MATLAB\Work, indicating that directory "work" is a subdirectory of the main directory "MATLAB", which is installed in drive C. Clicking on the arrow in the current directory window shows a list of recently used paths. Clicking on the button to the right of the window allows the user to change the current directory.
MATLAB uses a search path to find M-files and other MATLAB-related files, which are organized in directories in the computer file system. Any file run in MATLAB must reside in the current directory or in a directory that is on the search path. By default, the files supplied with MATLAB and MathWorks toolboxes are included in the search path.
The easiest way to see which directories are on the search path, or to add or modify the search path, is to select Set Path from the File menu on the desktop, and then use the Set Path dialog box. It is good practice to add any commonly used directories to the search path to avoid repeatedly having to change the current directory.
The Command History Window contains a record of the commands a user has entered in the command window, including both current and previous MATLAB sessions. Previously entered MATLAB commands can be selected and re-executed from the command history window by right clicking on a command or sequence of commands. This action launches a menu from which to select various options in addition to executing the commands. This is a useful feature when experimenting with various commands in a work session.
The MATLAB editor is both a text editor specialized for creating M-files and a graphical MATLAB debugger. The editor can appear in a window by itself, or it can be a sub window in the desktop. M-files are denoted by the extension .m. The MATLAB editor window has numerous pull-down menus for tasks such as saving, viewing, and debugging files. Because it performs some simple checks and also uses color to differentiate between various elements of code, this text editor is recommended as the tool of choice for writing and editing M-functions. To open the editor, type edit at the prompt; typing edit filename opens the M-file filename.m in an editor window, ready for editing. As noted earlier, the file must be in the current directory, or in a directory in the search path.
Getting Help:
The principal way to get help online is to use the MATLAB Help Browser, opened as a separate window either by clicking on the question mark symbol (?) on the desktop toolbar, or by typing helpbrowser at the prompt in the command window. The Help Browser is a web browser integrated into the MATLAB desktop that displays Hypertext Markup Language (HTML) documents. The Help Browser consists of two panes: the help navigator pane, used to find information, and the display pane, used to view the information. Self-explanatory tabs other than the navigator pane are used to perform a search.
CHAPTER-4
Background:
Digital image processing is an area characterized by the need for extensive experimental work to establish the viability of proposed solutions to a given problem. An important characteristic underlying the design of image processing systems is the significant level of testing & experimentation that normally is required before arriving at an acceptable solution. This characteristic implies that the ability to formulate approaches & quickly prototype candidate solutions generally plays a major role in reducing the cost & time required to arrive at a viable system implementation.
What is DIP?
An image may be defined as a two-dimensional function f(x, y), where x & y are spatial coordinates, & the amplitude of f at any pair of coordinates (x, y) is called the intensity or gray level of the image at that point. When x, y & the amplitude values of f are all finite discrete quantities, we call the image a digital image. The field of DIP refers to processing digital images by means of a digital computer. A digital image is composed of a finite number of elements, each of which has a particular location and value; these elements are referred to as picture elements, image elements, pels, and pixels.
Vision is the most advanced of our senses, so it is not surprising that images play the single most important role in human perception. However, unlike humans, who are limited to the visual band of the EM spectrum, imaging machines cover almost the entire EM spectrum, ranging from gamma rays to radio waves. They can also operate on images generated by sources that humans are not accustomed to associating with images.
There is no general agreement among authors regarding where image processing stops & other related areas, such as image analysis & computer vision, start. Sometimes a distinction is made by defining image processing as a discipline in which both the input & output of a process are images. This is a limiting & somewhat artificial boundary. The area of image analysis (image understanding) is in between image processing and computer vision.
There are no clear-cut boundaries in the continuum from image processing at one end to complete vision at the other. However, one useful paradigm is to consider three types of computerized processes in this continuum: low-, mid-, & high-level processes. A low-level process involves primitive operations such as image preprocessing to reduce noise, contrast enhancement, & image sharpening. A low-level process is characterized by the fact that both its inputs & outputs are images.
A mid-level process on images involves tasks such as segmentation, description of those objects to reduce them to a form suitable for computer processing, & classification of individual objects. A mid-level process is characterized by the fact that its inputs generally are images, but its outputs are attributes extracted from those images (e.g., edges, contours, & the identity of individual objects).
Finally, higher-level processing involves "making sense" of an ensemble of recognized objects, as in image analysis, & at the far end of the continuum, performing the cognitive functions normally associated with vision. Digital image processing, as already defined, is used successfully in a broad range of areas of exceptional social & economic value.
What is an image?
An image is represented as a two dimensional function f(x, y) where x and y are spatial co-
ordinates and the amplitude of ‘f’ at any pair of coordinates (x, y) is called the intensity of the image at
that point.
A grayscale image is a function I(x, y) of the two spatial coordinates of the image plane.
I(x, y) is the intensity of the image at the point (x, y) on the image plane.
I(x, y) takes non-negative values; we assume the image is bounded by a rectangle [0, a] × [0, b], so that I: [0, a] × [0, b] → [0, ∞).
Color image:
It can be represented by three functions, R(x, y) for red, G(x, y) for green, and B(x, y) for blue.
An image may be continuous with respect to the x and y coordinates and also in amplitude.
Converting such an image to digital form requires that the coordinates as well as the amplitude to be
digitized. Digitizing the coordinate values is called sampling. Digitizing the amplitude values is called
quantization.
Coordinate convention:
The result of sampling and quantization is a matrix of real numbers. We use two principal ways to
represent digital images. Assume that an image f(x, y) is sampled so that the resulting image has M rows
and N columns. We say that the image is of size M X N. The values of the coordinates (x, y) are discrete quantities. For notational clarity and convenience, we use integer values for these discrete coordinates.
In many image processing books, the image origin is defined to be at (x, y) = (0, 0). The next coordinate values along the first row of the image are (x, y) = (0, 1). It is important to keep in mind that the notation (0, 1) is used to signify the second sample along the first row. It does not mean that these are the actual values of the physical coordinates when the image was sampled. The following figure shows the coordinate convention. Note that x ranges from 0 to M-1 and y from 0 to N-1 in integer increments.
The coordinate convention used in the toolbox to denote arrays is different from the preceding paragraph in two minor ways. First, instead of using (x, y), the toolbox uses the notation (r, c) to indicate rows and columns. Note, however, that the order of coordinates is the same as the order discussed in the previous paragraph, in the sense that the first element of a coordinate tuple, (a, b), refers to a row and the second to a column. The other difference is that the origin of the coordinate system is at (r, c) = (1, 1); thus, r ranges from 1 to M and c from 1 to N in integer increments. IPT documentation refers to the coordinates (r, c) as pixel coordinates. Less frequently, the toolbox also employs another coordinate convention, called spatial coordinates, which uses x to refer to columns and y to refer to rows. This is the opposite of our use of the variables x and y.
Image as Matrices:
The preceding discussion leads to the following representation for a digitized image function:

f(x, y) = [ f(0, 0)    f(0, 1)    ...  f(0, N-1)
            f(1, 0)    f(1, 1)    ...  f(1, N-1)
            ...        ...             ...
            f(M-1, 0)  f(M-1, 1)  ...  f(M-1, N-1) ]
The right side of this equation is a digital image by definition. Each element of this array is called an image element, picture element, pixel, or pel. The terms image and pixel are used throughout the rest of this document to denote a digital image and its elements. A digital image can be represented naturally as a MATLAB matrix:

f = [ f(1, 1)  f(1, 2)  ...  f(1, N)
      f(2, 1)  f(2, 2)  ...  f(2, N)
      ...      ...           ...
      f(M, 1)  f(M, 2)  ...  f(M, N) ]
where f(1, 1) = f(0, 0) (note the use of a monospace font to denote MATLAB quantities). Clearly, the two representations are identical, except for the shift in origin. The notation f(p, q) denotes the element located in row p and column q. For example, f(6, 2) is the element in the sixth row and second column of the matrix f. Typically, we use the letters M and N, respectively, to denote the number of rows and columns in a matrix. A 1xN matrix is called a row vector, whereas an Mx1 matrix is called a column vector. A 1x1 matrix is a scalar.
Matrices in MATLAB are stored in variables with names such as A, a, RGB, real_array, and so on. Variables must begin with a letter and contain only letters, numerals, and underscores.
As noted in the previous paragraph, all MATLAB quantities are written using monospace characters. We use conventional Roman italic notation, such as f(x, y), for mathematical expressions.
Reading Images:
Images are read into the MATLAB environment using function imread, whose syntax is

f = imread('filename')

Here, filename is a string containing the complete name of the image file (including any applicable extension). For example, the command line

f = imread('chestxray.jpg');

reads the JPEG image chestxray (see the table above) into image array f. Note the use of single quotes (') to delimit the string filename. The semicolon at the end of a command line is used by MATLAB for suppressing output. If a semicolon is not included, MATLAB displays the results of the operation(s)
specified in that line. The prompt symbol (>>) designates the beginning of a command line, as it appears in the MATLAB Command Window.
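A short usage sketch follows; the file name is the hypothetical example from the text.

% Reading an image and inspecting it; the semicolon suppresses output.
f = imread('chestxray.jpg');   % hypothetical file from the example above
[M, N] = size(f);              % number of rows and columns
whos f                         % class, dimensions, and memory used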
Data Classes:
Although we work with integer coordinates, the values of the pixels themselves are not restricted to be integers in MATLAB. The table below lists the various data classes supported by MATLAB and IPT for representing pixel values. The first eight entries in the table are referred to as numeric data classes. The ninth entry is the char class and, as shown, the last entry is referred to as the logical data class. All numeric computations in MATLAB are done in double quantities, so this is also a frequent data class encountered in image processing applications.
Class uint8 also is encountered frequently, especially when reading data from storage devices, as 8-bit images are the most common representations found in practice. These two data classes, class logical, and, to a lesser degree, class uint16 constitute the primary data classes on which we focus. Many IPT functions, however, support all the data classes listed in the table. Data class double requires 8 bytes to represent a number; uint8 and int8 require 1 byte each; uint16 and int16 require 2 bytes each; and uint32, int32, and single require 4 bytes each. The char data class holds characters in Unicode representation; a character string is merely a 1xn array of characters. A logical array contains only the values 0 and 1, with each element being stored in memory using 1 byte; logical arrays are created using the function logical or by using relational operators.

Name     Description
double   Double-precision floating-point numbers (8 bytes per element).
uint8    Unsigned 8-bit integers in the range [0, 255] (1 byte per element).
uint16   Unsigned 16-bit integers in the range [0, 65535] (2 bytes per element).
uint32   Unsigned 32-bit integers in the range [0, 4294967295] (4 bytes per element).
int8     Signed 8-bit integers in the range [-128, 127] (1 byte per element).
int16    Signed 16-bit integers in the range [-32768, 32767] (2 bytes per element).
int32    Signed 32-bit integers in the range [-2147483648, 2147483647] (4 bytes per element).
single   Single-precision floating-point numbers (4 bytes per element).
char     Characters (2 bytes per element).
logical  Values are 0 or 1 (1 byte per element).
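Conversions between the primary classes are handled by IPT functions such as im2double and im2uint8, as in this brief sketch:

% Moving pixel data between classes; value ranges are rescaled automatically.
f8 = imread('cameraman.tif');   % uint8 image, values in [0, 255]
fd = im2double(f8);             % double image, values scaled to [0, 1]
g8 = im2uint8(fd);              % back to uint8; isequal(f8, g8) is true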
The toolbox supports four types of images:
1. Intensity images;
2. Binary images;
3. Indexed images;
4. RGB images.
Most monochrome image processing operations are carried out using binary or intensity images, so our initial focus is on these two image types. Indexed and RGB colour images are discussed afterwards.
Intensity Images:
An intensity image is a data matrix whose values have been scaled to represent intensities. When the elements of an intensity image are of class uint8 or class uint16, they have integer values in the range [0, 255] and [0, 65535], respectively. If the image is of class double, the values are floating-point numbers. Values of scaled, double intensity images are in the range [0, 1] by convention.
Binary Images:
Binary images have a very specific meaning in MATLAB. A binary image is a logical array of 0s and 1s. Thus, an array of 0s and 1s whose values are of data class, say, uint8, is not considered a binary image in MATLAB. A numeric array is converted to binary using function logical. Thus, if A is a numeric array consisting of 0s and 1s, we create a logical array B using the statement

B = logical(A)

If A contains elements other than 0s and 1s, use of the logical function converts all nonzero quantities to logical 1s and all entries with value 0 to logical 0s. To test whether an array is logical, we use the function islogical: if C is a logical array, this function returns a 1; otherwise it returns a 0. Logical arrays can be converted to numeric arrays using the data class conversion functions.
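A brief usage sketch of the logical conversion described above:

% logical maps every nonzero entry to 1, producing a valid binary image.
A = [0 2 0; 1 0 3];
B = logical(A);      % B = [0 1 0; 1 0 1], class logical
islogical(B)         % returns 1, so B is a binary image in MATLAB
C = im2uint8(B);     % back to a numeric class (values 0 and 255)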
Indexed Images:
An indexed image has two components: a data matrix of integers, X, and a colormap matrix, map. Matrix map is an m*3 array of class double containing floating-point values in the range [0, 1]. The length m of the map is equal to the number of colors it defines. Each row of map specifies the red, green, and blue components of a single color. An indexed image uses "direct mapping" of pixel intensity values to colormap values. The color of each pixel is determined by using the corresponding value of the integer matrix X as a pointer into map. If X is of class double, then all of its components with values less than or equal to 1 point to the first row in map, all components with value 2 point to the second row, and so on. If X is of class uint8 or uint16, then all components with value 0 point to the first row in map, all components with value 1 point to the second row, and so on.
RGB Image:
An RGB color image is an M*N*3 array of color pixels, where each color pixel is a triplet corresponding to the red, green, and blue components of an RGB image at a specific spatial location. An RGB image may be viewed as a "stack" of three gray-scale images that, when fed into the red, green, and blue inputs of a color monitor, produce a color image on the screen. By convention, the three images forming an RGB color image are referred to as the red, green, and blue component images. The data class of the component images determines their range of values. If an RGB image is of class double, the range of values is [0, 1]. Similarly, the range of values is [0, 255] or [0, 65535] for RGB images of class uint8 or uint16, respectively. The number of bits used to represent the pixel values of the component images determines the bit depth of an RGB image. For example, if each component image is an 8-bit image, the corresponding RGB image is said to be 24 bits deep. Generally, the number of bits in all component images is the same. In this case, the number of possible colors in an RGB image is (2^b)^3, where b is the number of bits in each component image. For an 8-bit image, the number is 16,777,216 colors.
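The component-image view and the color count can be verified directly; peppers.png is a standard example image shipped with the toolbox.

% An RGB image as a stack of three component images, and its color depth.
rgb = imread('peppers.png');                           % M*N*3 uint8 example image
R = rgb(:, :, 1); G = rgb(:, :, 2); B = rgb(:, :, 3);  % component images
b = 8;                                                 % bits per component image
numColors = (2^b)^3                                    % 16,777,216 possible colors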
CHAPTER-5
INTRODUCTION
1.1 IMAGE:
An image is a two-dimensional picture, which has a similar appearance to some subject, usually a physical object or a person.
Fig 1 General image
An image is a rectangular grid of pixels. It has a definite height and a definite width counted in
pixels. Each pixel is square and has a fixed size on a given display. However different computer monitors
may use different sized pixels. The pixels that constitute an image are ordered as a grid (columns and
rows); each pixel consists of numbers representing magnitudes of brightness and color.
Each pixel has a color. The color is a 32-bit integer. The first eight bits determine the redness of the
pixel, the next eight bits the greenness, the next eight bits the blueness, and the remaining eight bits the
transparency of the pixel.
Fig 1.2 Transparency image
Image file size is expressed as the number of bytes that increases with the number of pixels
composing an image, and the color depth of the pixels. The greater the number of rows and columns, the
greater the image resolution, and the larger the file. Also, each pixel of an image increases in size when its
color depth increases, an 8-bit pixel (1 byte) stores 256 colors, a 24-bit pixel (3 bytes) stores 16 million
colors, the latter known as true color.
Image compression uses algorithms to decrease the size of a file. High resolution cameras produce
large image files, ranging from hundreds of kilobytes to megabytes, per the camera's resolution and the
image-storage format capacity. High-resolution digital cameras record 12 megapixel (1 MP = 1,000,000 pixels) images, or more, in true color. For example, consider an image recorded by a 12 MP camera: since each pixel uses 3 bytes to record true color, the uncompressed image would occupy 36,000,000 bytes
of memory, a great amount of digital storage for one image, given that cameras must record and store
many images to be practical. Faced with large file sizes, both within the camera and a storage disc, image
file formats were developed to store such large images.
Image file formats are standardized means of organizing and storing images. This entry is about
digital image formats used to store photographic and other images. Image files are composed of either
pixel or vector (geometric) data that are rasterized to pixels when displayed (with few exceptions) in a
vector graphic display. Including proprietary types, there are hundreds of image file types. The PNG,
JPEG, and GIF formats are most often used to display images on the Internet.
Fig 1.3 Resolution image
In addition to straight image formats, Metafile formats are portable formats which can include both
raster and vector information. The metafile format is an intermediate format. Most Windows applications
open metafiles and then save them in their own native format.
JPEG/JFIF:
JPEG (Joint Photographic Experts Group) is a compression method. JPEG compressed images are
usually stored in the JFIF (JPEG File Interchange Format) file format. JPEG compression is lossy
compression. Nearly every digital camera can save images in the JPEG/JFIF format, which supports 8 bits
per color (red, green, blue) for a 24-bit total, producing relatively small files. Photographic images may be
better stored in a lossless non-JPEG format if they will be re-edited, or if small "artifacts" are
unacceptable. The JPEG/JFIF format also is used as the image compression algorithm in many Adobe PDF
files.
EXIF:
The EXIF (Exchangeable image file format) format is a file standard similar to the JFIF format
with TIFF extensions. It is incorporated in the JPEG writing software used in most cameras. Its purpose is
to record and to standardize the exchange of images with image metadata between digital cameras and
editing and viewing software. The metadata are recorded for individual images and include such things as
camera settings, time and date, shutter speed, exposure, image size, compression, name of camera, color
information, etc. When images are viewed or edited by image editing software, all of this image
information can be displayed.
TIFF:
The TIFF (Tagged Image File Format) format is a flexible format that normally saves 8 bits or 16
bits per color (red, green, blue) for 24-bit and 48-bit totals, respectively, usually using either the TIFF or
TIF filename extension. The TIFF format supports both lossy and lossless compression; some variants offer relatively good lossless compression for
bi-level (black & white) images. Some digital cameras can save in TIFF format, using the LZW
compression algorithm for lossless storage. TIFF image format is not widely supported by web browsers.
TIFF remains widely accepted as a photograph file standard in the printing business. TIFF can handle
device-specific color spaces, such as the CMYK defined by a particular set of printing press inks.
PNG:
The PNG (Portable Network Graphics) file format was created as the free, open-source successor
to the GIF. The PNG file format supports true color (16 million colors) while the GIF supports only 256
colors. The PNG file excels when the image has large, uniformly colored areas. The lossless PNG format
is best suited for editing pictures, and the lossy formats, like JPG, are best for the final distribution of
photographic images, because JPG files are smaller than PNG files. PNG is an extensible file format for the
lossless, portable, well-compressed storage of raster images. PNG provides a patent-free replacement for
GIF and can also replace many common uses of TIFF. Indexed-color, grayscale, and true color images are
supported, plus an optional alpha channel. PNG is designed to work well in online viewing applications,
such as the World Wide Web. PNG is robust, providing both full file integrity checking and simple
detection of common transmission errors.
GIF:
GIF (Graphics Interchange Format) is limited to an 8-bit palette, or 256 colors. This makes the GIF
format suitable for storing graphics with relatively few colors such as simple diagrams, shapes, logos and
cartoon style images. The GIF format supports animation and is still widely used to provide image
animation effects. It also uses a lossless compression that is more effective when large areas have a single
color, and ineffective for detailed images or dithered images.
BMP:
The BMP file format (Windows bitmap) handles graphics files within the Microsoft Windows OS.
Typically, BMP files are uncompressed, hence they are large. The advantage is their simplicity and wide
acceptance in Windows programs.
1.3.2 VECTOR FORMATS:
As opposed to the raster image formats above (where the data describes the characteristics of each
individual pixel), vector image formats contain a geometric description which can be rendered smoothly at
any desired display size.
At some point, all vector graphics must be rasterized in order to be displayed on digital monitors.
However, vector images can be displayed with analog CRT technology such as that used in some
electronic test equipment, medical monitors, radar displays, laser shows and early video games. Plotters
are printers that use vector data rather than pixel data to draw graphics.
CGM:
CGM (Computer Graphics Metafile) is a file format for 2D vector graphics, raster graphics, and
text. All graphical elements can be specified in a textual source file that can be compiled into a binary file
or one of two text representations. CGM provides a means of graphics data interchange for computer
representation of 2D graphical information independent from any particular application, system, platform,
or device.
SVG:
SVG (Scalable Vector Graphics) is an open standard created and developed by the World Wide
Web Consortium to address the need for a versatile, scriptable and all purpose vector format for the web
and otherwise. The SVG format does not have a compression scheme of its own, but due to the textual
nature of XML, an SVG graphic can be compressed using a program such as gzip.
Several factors combine to indicate a lively future for digital image processing. A major factor is the declining cost of computer equipment. Several new technological trends promise to further promote digital image processing. These include parallel processing made practical by low-cost microprocessors; the use of charge coupled devices (CCDs) for digitizing, storage during processing, and display; and large, low-cost image storage arrays.
Fig 1.5 Image fundamental
A camera or scanner produces a two-dimensional image. If the output of the camera or other imaging sensor is not in digital form, an analog-to-digital converter digitizes it. The nature of the sensor and the image it produces are determined by the application.
Image enhancement is among the simplest and most appealing areas of digital image processing. Basically, the idea behind enhancement techniques is to bring out detail that is obscured, or simply to highlight certain features of interest in an image. A familiar example of enhancement is increasing the contrast of an image because "it looks better." It is important to keep in mind that enhancement is a very subjective area of image processing.
Fig 1.5.3 Image enhancement
Fig 1.5.5 Color & Gray scale image
Wavelets were first shown to be the foundation of a powerful new approach to signal processing and analysis called multiresolution theory. Multiresolution theory incorporates and unifies techniques from a variety of disciplines, including subband coding from signal processing, quadrature mirror filtering from digital speech recognition, and pyramidal image processing.
1.5.6 Compression:
Compression, as the name implies, deals with techniques for reducing the storage required saving
an image, or the bandwidth required for transmitting it. Although storage technology has improved
significantly over the past decade, the same cannot be said for transmission capacity. This is true
particularly in uses of the Internet, which are characterized by significant pictorial content. Image
compression is familiar to most users of computers in the form of image file extensions, such as the jpg
file extension used in the JPEG (Joint Photographic Experts Group) image compression standard.
1.5.7 Morphological processing:
In binary images, the sets in question are members of the 2-D integer space Z2, where each element
of a set is a 2-D vector whose coordinates are the (x,y) coordinates of a black(or white) pixel in the image.
Gray-scale digital images can be represented as sets whose components are in Z3. In this case, two
components of each element of the set refer to the coordinates of a pixel, and the third corresponds to its
discrete gray-level value.
1.5.8 Segmentation:
Segmentation procedures partition an image into its constituent parts or objects. In general,
autonomous segmentation is one of the most difficult tasks in digital image processing. A rugged
segmentation procedure brings the process a long way toward successful solution of imaging problems that
require objects to be identified individually.
On the other hand, weak or erratic segmentation algorithms almost always guarantee eventual
failure. In general, the more accurate the segmentation, the more likely recognition is to succeed.
1.5.9 Representation and description:
Regional representation is appropriate when the focus is on internal properties, such as texture or
skeletal shape. In some applications, these representations complement each other. Choosing a
representation is only part of the solution for transforming raw data into a form suitable for subsequent
computer processing. A method must also be specified for describing the data so that features of interest
are highlighted. Description, also called feature selection, deals with extracting attributes that result in
some quantitative information of interest or are basic for differentiating one class of objects from another.
1.5.10 Recognition and interpretation:
The last stage involves recognition and interpretation. Recognition is the process that assigns a
label to an object based on the information provided by its descriptors. Interpretation involves assigning
meaning to an ensemble of recognized objects.
1.5.11 Knowledgebase:
Knowledge about a problem domain is coded into an image processing system in the form of a knowledge database. This knowledge may be as simple as detailing regions of an image where the information of interest is known to be located, thus limiting the search that has to be conducted in seeking that information. The knowledge base can also be quite complex, such as an interrelated list of all major possible defects in a materials inspection problem, or an image database containing high-resolution satellite images of a region in connection with change-detection applications. In addition to guiding the operation of each processing module, the knowledge base also controls the interaction between modules. The system must be endowed with the knowledge to recognize the significance of the location of the string with respect to other components of an address field. This knowledge guides not only the operation of each module, but it also aids in feedback operations between modules through the knowledge base. We implemented the preprocessing techniques using MATLAB.
As recently as the mid-1980s, numerous models of image processing systems being sold
throughout the world were rather substantial peripheral devices that attached to equally substantial host
computers. Late in the 1980s and early in the 1990s, the market shifted to image processing hardware in
the form of single boards designed to be compatible with industry standard buses and to fit into
engineering workstation cabinets and personal computers. In addition to lowering costs, this market shift
also served as a catalyst for a significant number of new companies whose specialty is the development of
software written specifically for image processing.
Fig 1.6 Components of image processing
Although large-scale image processing systems still are being sold for massive imaging
applications, such as processing of satellite images, the trend continues toward miniaturizing and blending
of general-purpose small computers with specialized image processing hardware. Figure 1.6 shows the basic components comprising a typical general-purpose system used for digital image processing. The function of each component is discussed in the following paragraphs, starting with image sensing.
Image sensors:
With reference to sensing, two elements are required to acquire digital images. The first is a
physical device that is sensitive to the energy radiated by the object we wish to image. The second, called a
digitizer, is a device for converting the output of the physical sensing device into digital form. For
instance, in a digital video camera, the sensors produce an electrical output proportional to light intensity.
The digitizer converts these outputs to digital data.
Specialized image processing hardware:
Specialized image processing hardware usually consists of the digitizer just mentioned, plus
hardware that performs other primitive operations, such as an arithmetic logic unit (ALU), which performs
arithmetic and logical operations in parallel on entire images. One example of how an ALU is used is in
averaging images as quickly as they are digitized, for the purpose of noise reduction. This type of
hardware sometimes is called a front-end subsystem, and its most distinguishing characteristic is speed. In
other words, this unit performs functions that require fast data throughputs (e.g., digitizing and averaging
video images at 30 frames/s) that the typical main computer cannot handle.
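As a rough illustration of the averaging such a front-end subsystem performs, the following MATLAB sketch averages K digitized frames to reduce noise; the cell array frames is a hypothetical stand-in for the digitizer output, not part of any particular system described here.

K = numel(frames);
acc = zeros(size(frames{1}));        % accumulate in double precision
for k = 1:K
    acc = acc + double(frames{k});   % add each digitized frame
end
avgImage = uint8(acc / K);           % averaged image; noise variance drops by a factor of K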
Computer:
The computer in an image processing system is a general-purpose computer and can range from a
PC to a supercomputer. In dedicated applications, sometimes specially designed computers are used to
achieve a required level of performance, but our interest here is on general-purpose image processing
systems. In these systems, almost any well-equipped PC-type machine is suitable for offline image
processing tasks.
Image processing software:
Software for image processing consists of specialized modules that perform specific tasks. A well-
designed package also includes the capability for the user to write code that, as a minimum, utilizes the
specialized modules. More sophisticated software packages allow the integration of those modules and
general-purpose software commands from at least one computer language.
Mass storage:
Mass storage capability is a must in image processing applications. An image of size 1024*1024
pixels, in which the intensity of each pixel is an 8-bit quantity, requires one megabyte of storage space if
the image is not compressed. When dealing with thousands, or even millions, of images, providing
adequate storage in an image processing system can be a challenge. Digital storage for image processing
applications falls into three principal categories: (1) short-term storage for use during processing, (2) online
storage for relatively fast recall, and (3) archival storage, characterized by infrequent access. Storage
is measured in bytes (eight bits), Kbytes (one thousand bytes), Mbytes (one million bytes), Gbytes
(meaning giga, or one billion, bytes), and Tbytes (meaning tera, or one trillion, bytes).
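The one-megabyte figure quoted above can be checked with a one-line MATLAB calculation; the variable names below are illustrative only.

rows = 1024; cols = 1024; bitsPerPixel = 8;
bytesNeeded = rows * cols * bitsPerPixel / 8   % 1,048,576 bytes, i.e., about one megabyte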
Hardcopy:
Hardcopy devices for recording images include laser printers, film cameras, heat-sensitive devices,
inkjet units, and digital units, such as optical and CD-ROM disks. Film provides the highest possible
resolution, but paper is the obvious medium of choice for written material. For presentations, images are
displayed on film transparencies or in a digital medium if image projection equipment is used. The latter
approach is gaining acceptance as the standard for image presentations.
Network:
Networking is almost a default function in any computer system in use today. Because of the large
amount of data inherent in image processing applications, the key consideration in image transmission is
bandwidth. In dedicated networks, this typically is not a problem, but communications with remote sites
via the Internet are not always as efficient. Fortunately, this situation is improving quickly as a result of
optical fiber and other broadband technologies.
Color and texture are two low-level features widely used for image classification, indexing and retrieval. Color is
usually represented as a histogram, which is a first-order statistical measure that captures the global distribution of
color in an image. One of the main drawbacks of the histogram-based approaches is that the spatial distribution and
local variations in color are ignored. Local spatial variation of pixel intensity is commonly used to capture texture
information in an image. The Grayscale Co-occurrence Matrix (GCM) is a well-known method for texture extraction
in the spatial domain. A GCM stores the number of pixel neighborhoods in an image that have a particular grayscale
combination. Let I be an image and let p and Np respectively denote an arbitrary pixel and its neighbor in a given
direction. If GL denotes the total number of quantized gray levels and gl denotes the individual gray levels, where
gl ∈ {0, . . ., GL − 1}, then each component of GCM can be written as follows:

gcm(i, j) = |{(p, Np) : glp = i, glNp = j}| / N

That is, gcm(i, j) is the number of times the gray level of a pixel p, denoted glp, equals i, and the gray
level of its neighbor Np, denoted glNp, equals j, as a fraction of the total number N of pixels in the image. Thus, it
estimates the probability that the gray level of an arbitrary pixel in an image is i and that of its neighbor is j. One
GCM matrix is generated for each possible neighborhood direction, namely, 0°, 45°, 90° and 135°. The average and
range of 14 features like Angular Second Moment, Contrast, Correlation, etc., are computed by combining all four
matrices to get a total of 28 features. In the GCM approach for texture extraction, color information is completely
ignored.
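As a hedged illustration, GCM features of this kind can be approximated with the Image Processing Toolbox functions graycomatrix and graycoprops; note that graycoprops returns only four of the fourteen features mentioned above, and the test image and level count here are arbitrary choices, not those of the original work.

I = imread('cameraman.tif');                  % any grayscale test image
offsets = [0 1; -1 1; -1 0; -1 -1];           % neighbors at 0, 45, 90 and 135 degrees
glcms = graycomatrix(I, 'NumLevels', 8, 'Offset', offsets);
stats = graycoprops(glcms);                   % Contrast, Correlation, Energy, Homogeneity
contrastAvg = mean(stats.Contrast);           % average over the four directions
contrastRange = max(stats.Contrast) - min(stats.Contrast);   % range over the four directions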
To incorporate spatial information along with the color of image pixels, a feature called the
color correlogram has recently been proposed. It is a three-dimensional matrix that represents the probability of
finding pixels of any two given colors at a distance 'd' apart. The autocorrelogram is a variation of the correlogram
that represents the probability of finding two pixels with the same color at a distance 'd' apart. This approach can
effectively represent the color distribution in an image. However, correlogram features do not capture intensity variation.
Many image databases often contain both color and gray-scale images, and the color correlogram method does not
handle gray-scale images well.
Another method, called the Color Co-occurrence Matrix (CCM), has been proposed to capture
color variation in an image. CCM is represented as a three-dimensional matrix, where the color pair of the pixels p and
Np is captured in the first two dimensions of the matrix and the spatial distance 'd' between these two pixels is
captured in the third dimension. This approach is a generalization of the color correlogram and reduces to the pure
color correlogram for d = 1. CCM is generated using only the Hue plane of the HSV (Hue, Saturation and
Value) color space. The Hue axis is quantized into HL levels; if individual hue values are denoted by hl,
where hl ∈ {0, . . ., HL − 1}, then CCM is defined over hue levels in the same way that GCM is defined over gray
levels. Four matrices representing neighbors at angles 0°, 90°, 180° and 270° are considered. This approach
was further extended by separating the diagonal and the non-diagonal components of CCM to generate a Modified
Color Co-occurrence Matrix (MCCM). MCCM may thus be written as follows: MCCM = (CCMD; CCMND)
Here, CCMD and CCMND correspond to the diagonal and off-diagonal components of CCM.
The main drawback of this approach is that, like the correlogram, it captures only color information and ignores
intensity variation. An alternative approach is to capture intensity variation as a texture feature from an image and
combine it with color features like histograms using suitable weights. One of the challenges of this approach is to
determine suitable weights, since these are highly application-dependent. In certain applications like Content-based
Image Retrieval (CBIR), weights are often estimated from relevance feedback given by users.
While relevance feedback is sometimes effective, it makes the process of image retrieval user-
dependent and iterative. There is also no guarantee that the weight-learning algorithms converge. In order to
overcome these problems, researchers have tried to combine color and texture features together during extraction.
One line of work proposed two approaches for capturing color and intensity variations from an image using the LUV
color space. In the Single-channel Co-occurrence Matrix (SCM), variations for each color channel, namely L, U and
V, are considered independently. In the Multi-channel Co-occurrence Matrix (MCM), variations are captured taking
two channels at a time: UV, LU and LV. Since the LUV color space separates out luminance (L) from chrominance
(U and V), SCM in effect generates one GCM and two CCMs from each image independently. As a result, SCM does
not capture the joint variation across channels. In MCM, by contrast, the count of pairwise occurrences of the values
of different channels of the color space is captured. Thus, each component of MCM can be written as follows:

mcmUV(i, j) = |{(p, Np) : up = i, vNp = j}| / N

That is, mcmUV(i, j) is the number of times the U chromaticity value of a pixel p, denoted up, equals i, and the V
chromaticity value of its neighbor Np, denoted vNp, equals j, as a fraction of the total number N of pixels in the
image. mcmLU(i, j) and mcmLV(i, j) are defined similarly. One MCM matrix is generated
for each of the four neighborhood directions, namely, 0°, 45°, 90° and 135°.
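The following sketch builds the mcmUV component for the 0-degree direction directly from the definition above; U and V are assumed to be channel images already quantized to integer levels 1..Q, which is an assumption of this illustration rather than part of the original formulation.

Q = 8;                                  % assumed number of quantization levels
mcmUV = zeros(Q, Q);
[rows, cols] = size(U);
for r = 1:rows
    for c = 1:cols-1                    % neighbor at 0 degrees is (r, c+1)
        i = U(r, c);                    % U value of pixel p
        j = V(r, c+1);                  % V value of its neighbor Np
        mcmUV(i, j) = mcmUV(i, j) + 1;
    end
end
mcmUV = mcmUV / (rows * cols);          % fraction of the total number of pixels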
Deng and Manjunath (2001) proposed a two-stage method called JSEG, which combines color
and texture after image segmentation. In the first stage, colors are quantized to the levels required for differentiating
between various regions of an image. Pixel values of the regions are then replaced by their quantized color levels to
form a color map. Spatial variation of color levels between different regions in the map is viewed as a type of texture.
Yu et al. (2002) suggested the use of color texture moments to represent both the color and texture of an image,
derived from Local Fourier Transform (LFT) coefficients. Eight templates equivalent to the LFT are operated over an
image to generate a characteristic map of the image. Each template is a 3 × 3 filter that considers the eight neighbors
of the current pixel for the LFT calculation. First and second order moments of the characteristic map are then used
to generate a set of features.
In this paper, we propose an integrated approach for capturing spatial variation of both color and
intensity levels in the neighborhood of each pixel using the HSV color space. In contrast to the other methods, for
each pixel and its neighbor, the amount of color and intensity variation between them is estimated using a weight
function. Suitable constraints are satisfied while choosing the weight function for effectively relating visual
perception of color and the HSV color space properties. The color and intensity variations are represented in a single
composite feature known as Integrated Color and Intensity Co-occurrence Matrix (ICICM). While the existing
schemes generally treat color and intensity separately, the proposed method provides a composite view to both color
and intensity variations in the same feature. The main advantage of using ICICM is that it avoids the use of weights
to combine individual color and texture features. We use the ICICM feature in an image retrieval application over
large image databases.
Early results of this work were reported in (Vadivel et al., 2004a). In the next section, we describe
the proposed feature extraction technique after introducing some of the properties of the HSV color space. The
choice of quantization levels for the color and intensity axes, the selection of parameter values, and a brief overview
of the image retrieval application are also discussed.
We propose to capture color and intensity variation around each pixel in a two-dimensional
matrix called Integrated Color and Intensity Co-occurrence Matrix (ICICM). This is a generalization of the
Grayscale Co-occurrence Matrix and the Color Co-occurrence Matrix techniques. For each pair of neighboring
pixels, we consider their contribution to both color perception and gray-level perception by the human eye. Some of the useful
properties of the HSV color space and their relationship to human color perception are utilized for extracting this
feature. In the next sub-section, we briefly explain relevant properties of the HSV color space. In the subsequent
subsection, we describe how the properties can be effectively used for generating ICICM.
HSV Color space: There are three properties, or dimensions, of color: hue, saturation and value;
HSV stands for Hue, Saturation and Value. The model is important because it describes color in terms of these three
perceptual properties, and the full spectrum of colors can be created by editing the HSV values. The first dimension
is hue. Hue is the name of the color itself: the quality of color as determined by its dominant wavelength. Hue is
broadly classified into three categories: primary, secondary and tertiary. The primary hues are red, yellow and blue.
The secondary hues are formed by combining equal amounts of two primary hues, giving orange, green and violet.
The tertiary hues are formed by combining a primary hue with a secondary hue. A limitless number of colors can be
produced by mixing the primary hues in different amounts. The second dimension is saturation, the degree of purity
of a color. Saturation gives the intensity of a color; it drops when colors are mixed or when black is added, whereas
adding white makes the color lighter rather than more intense. The third dimension is value, the brightness of the
color. When the value is zero the color space is totally black; as the value increases, the brightness increases and the
various colors become visible. The value describes the lightness or darkness of a color and, like saturation, it spans
the tints and shades: tints are colors with added white, and shades are colors with added black.
Sensing of light from an image in the layers of human retina is a complex process with rod cells
contributing to scotopic or dim-light vision and cone cells to photopic or bright-light vision (Gonzalez and Woods,
2002). At low levels of illumination, only the rod cells are excited so that only gray shades are perceived. As the
illumination level increases, more and more cone cells are excited, resulting in increased color perception. Various
color spaces have been introduced to represent and specify colors in a way suitable for storage, processing or
transmission of color information in images. Out of these, HSV is one of the models that separate out the luminance
component (Intensity) of a pixel color from its chrominance components (Hue and Saturation). Hue represents pure
color, which is perceived when incident light is of sufficient illumination and contains a single wavelength.
Saturation gives a measure of the degree by which a pure color is diluted by white light. For light with low
illumination, the corresponding intensity value in the HSV color space is also low.
The HSV color space can be represented as a hexacone, with the central vertical axis denoting
the luminance component, I (often denoted by V for Value). Hue, H, is a chrominance component defined as
an angle in the range [0, 2π] relative to the red axis, with red at angle 0, green at 2π/3, blue at 4π/3 and red again at
2π. Saturation, S, is the other chrominance component, measured as a radial distance from the central axis of the
hexacone, with values between 0 at the center and 1 at the outer surface. For zero saturation, as the intensity is
increased, we move from black to white through various shades of gray. On the other hand, for a given intensity and
hue, if the saturation is changed from 0 to 1, the perceived color changes from a shade of gray to the most pure form
of the color represented by its hue. When saturation is near 0, all the pixels in an image look alike, even though their
hues may differ.
As we increase saturation towards 1, the colors get separated out and are visually perceived as the
true colors represented by their hues. Low saturation implies presence of a large number of spectral components in
the incident light, causing loss of color information even though the illumination level is sufficiently high. Thus, for
low values of saturation or intensity, we can approximate a pixel color by a gray level while for higher saturation
and intensity, the pixel color can be approximated by its hue. For low intensities, even for a high saturation, a pixel
color is close to its gray value. Similarly, for low saturation even for a high value of intensity, a pixel is perceived as
gray. We use these properties to estimate the degree by which a pixel contributes to color perception and gray level
perception.
One possible way of capturing color perception of a pixel is to choose suitable thresholds on
the intensity and saturation. If the saturation and the intensity are above their respective thresholds, we may consider
the pixel to have color dominance; else, it has gray level dominance. However, such a hard thresholding does not
properly capture color perception near the threshold values. This is due to the fact that there is no fixed level of
illumination above which the cone cells get excited. Instead, there is a gradual transition from scotopic to photopic
vision. Similarly, there is no fixed threshold for the saturation of cone cells that leads to loss of chromatic
information at higher levels of illumination caused by color dilution. We, therefore, use suitable weights that vary
smoothly with saturation and intensity to represent both color and gray scale perception for each pixel.
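One plausible realization of such smooth weights, given here only as a sketch, uses logistic functions of saturation s and value v (both scaled to [0, 1]); the slope k and the transition points r1 and r2 are illustrative values, not those of the original work.

k = 10; r1 = 0.25; r2 = 0.25;           % assumed slope and transition points
ws = 1 ./ (1 + exp(-k * (s - r1)));     % weight rises smoothly with saturation
wv = 1 ./ (1 + exp(-k * (v - r2)));     % weight rises smoothly with intensity
wColor = ws .* wv;                      % contribution to color perception
wGray = 1 - wColor;                     % contribution to gray-level perception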
If the HSV components are used directly, the computation required for retrieval becomes too heavy to ensure rapid
retrieval. It is essential to quantize the HSV space components to reduce computation and improve efficiency. At the
same time, because the human eye's ability to distinguish colors is limited, we do not need to calculate all segments,
and unequal-interval quantization according to human color perception can be adopted. Based on an analysis of the
color model, we divide hue into eight parts, while saturation and intensity are each divided into three parts in
accordance with the human eye's ability to discriminate. In accordance with the different colors and subjective color
perception, the quantized hue (H), saturation (S) and value (V) components are combined, with different weights,
into a one-dimensional feature vector given by the following equation:

G = Qs*Qv*H + Qv*S + V

where Qs is the quantized series of S and Qv is the quantized series of V. Taking Qs = Qv = 3, the three HSV
components are mapped to the one-dimensional vector G = 9H + 3S + V, which quantizes the whole color space into
72 main colors, so we can handle a one-dimensional histogram of 72 bins. This quantization is effective in reducing
the sensitivity of the images to light intensity, while also reducing the computational cost.
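A minimal sketch of this 72-bin quantization follows; H is assumed to be already quantized to the levels 0..7 and S, V to 0..2 according to the unequal intervals described above.

Qs = 3; Qv = 3;                         % quantized series of S and of V
G = Qs * Qv * H + Qv * S + V;           % one-dimensional code in 0..71
hist72 = histc(G(:), 0:71);             % the 72-bin color histogram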
IMAGE RETRIEVAL:
Image retrieval is a computer-based process for browsing, searching and
retrieving images from a large database of digital images. Most traditional and common methods of image retrieval
use some method of adding metadata, by captioning, keywords or descriptions, to the images so that retrieval can be
performed over this metadata. Manual image annotation, however, is time-consuming, expensive and laborious, and
for this reason a large amount of research has been done on automatic image annotation. It is crucial to understand
the scope and nature of the image data in order to determine the complexity of the image search system design. The
design is also largely dependent on the nature of the collection; typical categories include archives, domain-specific
collections, enterprise collections, personal collections and the Web.
Invention of the digital camera has given the common man the privilege to capture his
world in pictures and conveniently share them with others. One can today generate volumes of images with content
as diverse as family get-togethers and national park visits. Low-cost storage and easy Web hosting has fueled the
metamorphosis of common man from a passive consumer of photography in the past to a current-day active
producer. Today, searchable image data exists with extremely diverse visual and semantic content, spanning
geographically disparate locations, and is rapidly growing in size. All these factors have created innumerable
possibilities and hence considerations for real-world image search system designers.
Growth in content-based image retrieval has been unquestionably rapid. In recent years, there has been significant
effort put into understanding the real-world implications, applications, and constraints of the technology. Yet, real-
world application of the technology is currently limited. We devote this section to understanding image retrieval in
the real world and discuss user expectations, system constraints and requirements, and the research effort to make
image retrieval a reality in the near future.
An image retrieval system designed to serve a personal collection should focus on features
such as personalization, flexibility of browsing, and display methodology. For example, Google’s Picasa system
[Picasa 2004] provides a chronological display of images taking a user on a journey down memory lane. Domain
specific collections may impose specific standards for presentation of results. Searching an archive for content
discovery could involve long user search sessions. Good visualization and a rich query support system should be the
design goals. A system designed for the Web should be able to support massive user traffic. One way to supplement
software approaches for this purpose is to provide hardware support to the system architecture. Unfortunately, very
little has been explored in this direction, partly due to the lack of agreed-upon indexing and retrieval methods. The
notable few applications include an FPGA implementation of a color-histogram-based image retrieval system
[Kotoulas and Andreadis 2003], an FPGA implementation for subimage retrieval within an image database [Nakano
and Takamichi 2003], and a method for efficient retrieval in a network of imaging devices [Woodrow and
Heinzelman 2002].
Discussion. Regardless of the nature of the collection, as the expected user-base grows, factors such as
concurrent query support, efficient caching, and parallel and distributed processing of requests become critical. For
future real-world image retrieval systems, both software and hardware approaches to address these issues are
essential. More realistically, dedicated specialized servers, optimized memory and storage support, and highly
parallelizable image search algorithms that exploit cluster computing power are where the future of large-scale image
retrieval systems lies.
OVERVIEW OF TEXTURE:
Everyone is familiar with texture, yet it is hard to define precisely. One can nevertheless differentiate two
textures by recognizing their similarities and differences. Commonly, textures are used in three ways:
Images can be segmented into regions based on texture.
Already segmented regions can be differentiated, or classified, by their texture.
Textures can be reproduced from computed descriptions.
Texture can be analyzed in several ways, for instance statistically from pixel neighborhoods or spectrally in a
transform domain.
CHAPTER 3
Software Introduction:
MATLAB is a high-performance language for technical computing. It integrates computation, visualization,
and programming in an easy-to-use environment where problems and solutions are expressed in familiar
mathematical notation. Typical uses include:
Modeling, simulation, and prototyping
Data analysis, exploration, and visualization
Scientific and engineering graphics
Application development, including graphical user interface building
MATLAB is an interactive system whose basic data element is an array that does not require
dimensioning. This allows you to solve many technical computing problems, especially those with matrix
and vector formulations, in a fraction of the time it would take to write a program in a scalar noninteractive
language such as C or FORTRAN.
The name MATLAB stands for matrix laboratory. MATLAB was originally written to provide
easy access to matrix software developed by the LINPACK and EISPACK projects. Today, MATLAB
engines incorporate the LAPACK and BLAS libraries, embedding the state of the art in software for
matrix computation.
MATLAB has evolved over a period of years with input from many users. In university
environments, it is the standard instructional tool for introductory and advanced courses in mathematics,
engineering, and science. In industry, MATLAB is the tool of choice for high-productivity research,
development, and analysis.
Development Environment:
This is the set of tools and facilities that help you use MATLAB functions and files. Many of these
tools are graphical user interfaces. It includes the MATLAB desktop and command window, a command
history, an editor and debugger, and browsers for viewing help, the workspace, files, and the search path.
The MATLAB Mathematical Function Library:
This is a vast collection of computational algorithms ranging from elementary functions, like sum,
sine, cosine, and complex arithmetic, to more sophisticated functions like matrix inverse, matrix
eigenvalues, Bessel functions, and fast Fourier transforms.
The MATLAB Language:
This is a high-level matrix/array language with control flow statements, functions, data structures,
input/output, and object-oriented programming features. It allows both “programming in the small” to
rapidly create quick and dirty throw-away programs, and “programming in the large” to create large and
complex application programs.
Graphics:
MATLAB has extensive facilities for displaying vectors and matrices as graphs, as well as
annotating and printing these graphs. It includes high-level functions for two-dimensional and three-
dimensional data visualization, image processing, animation, and presentation graphics. It also includes
low-level functions that allow you to fully customize the appearance of graphics as well as to build
complete graphical user interfaces on your MATLAB applications.
The MATLAB Application Program Interface (API):
This is a library that allows you to write C and FORTRAN programs that interact with MATLAB.
It includes facilities for calling routines from MATLAB (dynamic linking), calling MATLAB as a
computational engine, and for reading and writing MAT-files.
Various toolboxes are available in MATLAB for computing and recognition techniques; in this work we use the
IMAGE PROCESSING toolbox.
MATLAB’s Graphical User Interface Development Environment (GUIDE) provides a rich set of
tools for incorporating graphical user interfaces (GUIs) in M-functions. Using GUIDE, the processes of
laying out a GUI (i.e., its buttons, pop-up menus, etc.) and programming the operation of the GUI are
divided conveniently into two easily managed and relatively independent tasks. The resulting graphical M-
function is composed of two identically named (ignoring extensions) files:
A file with extension .fig, called a FIG-file that contains a complete graphical description of all the
function’s GUI objects or elements and their spatial arrangement. A FIG-file contains binary data that
does not need to be parsed when the associated GUI-based M-function is executed.
A file with extension .m, called a GUI M-file, which contains the code that controls the GUI operation.
This file includes functions that are called when the GUI is launched and exited, and callback functions
that are executed when a user interacts with GUI objects for example, when a button is pushed.
GUIDE is invoked from the command line as follows:
guide filename
where filename is the name of an existing FIG-file on the current path. If filename is omitted, GUIDE
opens a new, untitled layout.
A graphical user interface (GUI) is a graphical display in one or more windows containing
controls, called components, that enable a user to perform interactive tasks. The user of the GUI does not
have to create a script or type commands at the command line to accomplish the tasks. Unlike coding
programs to accomplish tasks, the user of a GUI need not understand the details of how the tasks are
performed.
GUI components can include menus, toolbars, push buttons, radio buttons, list boxes, and sliders
just to name a few. GUIs created using MATLAB tools can also perform any type of computation, read
and write data files, communicate with other GUIs, and display data as tables or as plots.
If you are new to MATLAB, you should start by reading Manipulating Matrices. The most important
things to learn are how to enter matrices, how to use the : (colon) operator, and how to invoke functions.
After you master the basics, you should read the rest of the sections below and run the demos.
At the heart of MATLAB is a new language you must learn before you can fully exploit its power.
You can learn the basics of MATLAB quickly, and mastery comes shortly after. You will be rewarded
with high productivity, high-creativity computing power that will change the way you work.
3.4.3 Manipulating Matrices - introduces how to use MATLAB to generate matrices and perform
mathematical operations on matrices.
3.4.4 Graphics - introduces MATLAB graphic capabilities, including information about plotting data,
annotating graphs, and working with images.
3.4.5 Programming with MATLAB - describes how to use the MATLAB language to create
scripts and functions, and manipulate data structures, such as cell arrays and multidimensional arrays.
3.5.1Introduction
This chapter provides a brief introduction to starting and quitting MATLAB, and the tools and
functions that help you to work with MATLAB variables and files. For more information about the topics
covered here, see the corresponding topics under Development Environment in the MATLAB
documentation, which is available online as well as in print.
On a Microsoft Windows platform, to start MATLAB, double-click the MATLAB shortcut icon on
your Windows desktop. On a UNIX platform, to start MATLAB, type matlab at the operating system
prompt. After starting MATLAB, the MATLAB desktop opens - see MATLAB Desktop.
You can change the directory in which MATLAB starts, define startup options including running a script
upon startup, and reduce startup time in some situations.
When you start MATLAB, the MATLAB desktop appears, containing tools (graphical user
interfaces) for managing files, variables, and applications associated with MATLAB. The first time
MATLAB starts, the desktop appears as shown in the following illustration, although your Launch Pad
may contain different entries.
You can change the way your desktop looks by opening, closing, moving, and resizing the tools in
it. You can also move tools outside of the desktop or return them back inside the desktop (docking). All
the desktop tools provide common features such as context menus and keyboard shortcuts.
You can specify certain characteristics for the desktop tools by selecting Preferences from the File
menu. For example, you can specify the font characteristics for Command Window text. For more
information, click the Help button in the Preferences dialog box.
This section provides an introduction to MATLAB's desktop tools. You can also use MATLAB
functions to perform most of the features found in the desktop tools. The tools are:
Command History
Lines you enter in the Command Window are logged in the Command History window. In the
Command History, you can view previously used functions, and copy and execute selected lines. To save
the input and output from a MATLAB session to a file, use the diary function.
You can run external programs from the MATLAB Command Window. The exclamation point
character (!) is a shell escape and indicates that the rest of the input line is a command to the operating
system. This is useful for invoking utilities or running other programs without quitting MATLAB. On
Linux, for example, !emacs magik.m invokes an editor called emacs for a file named magik.m. When you
quit the external program, the operating system returns control to MATLAB.
Launch Pad
MATLAB's Launch Pad provides easy access to tools, demos, and documentation.
Help Browser
Use the Help browser to search and view documentation for all your MathWorks products. The Help
browser is a Web browser integrated into the MATLAB desktop that displays HTML documents.
To open the Help browser, click the help button in the toolbar, or type helpbrowser in the
Command Window. The Help browser consists of two panes, the Help Navigator, which you use to find
information, and the display pane, where you view the information.
Help Navigator
Product filter - Set the filter to show documentation only for the products you specify.
Contents tab - View the titles and tables of contents of documentation for your products.
Index tab - Find specific index entries (selected keywords) in the MathWorks documentation for your
products.
Search tab - Look for a specific phrase in the documentation. To get help for a specific function, set the
Search type to Function Name.
Display Pane
After finding documentation using the Help Navigator, view it in the display pane. While viewing the
documentation, you can:
Browse to other pages - Use the arrows at the tops and bottoms of the pages, or use the back and
forward buttons in the toolbar.
Find a term in the page - Type a term in the Find in page field in the toolbar and click Go.
Other features available in the display pane are: copying information, evaluating a selection, and
viewing Web pages.
MATLAB file operations use the current directory and the search path as reference points. Any file
you want to run must either be in the current directory or on the search path.
Search Path
To determine how to execute functions you call, MATLAB uses a search path to find M-files and other
MATLAB-related files, which are organized in directories on your file system. Any file you want to run in
MATLAB must reside in the current directory or in a directory that is on the search path. By default, the
files supplied with MATLAB and MathWorks toolboxes are included in the search path.
Workspace Browser
The MATLAB workspace consists of the set of variables (named arrays) built up during a MATLAB
session and stored in memory. You add variables to the workspace by using functions, running M-files,
and loading saved workspaces.
To view the workspace and information about each variable, use the Workspace browser, or use the
functions who and whos.
To delete variables from the workspace, select the variable and select Delete from the Edit menu.
Alternatively, use the clear function.
The workspace is not maintained after you end the MATLAB session. To save the workspace to a file
that can be read during a later MATLAB session, select Save Workspace As from the File menu, or use the
save function. This saves the workspace to a binary file called a MAT-file, which has a .mat extension.
There are options for saving to different formats. To read in a MAT-file, select Import Data from the File
menu, or use the load function.
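The equivalent command-line forms of these menu operations are, for example:

save('mywork.mat')      % write all workspace variables to the MAT-file mywork.mat
load('mywork.mat')      % read them back during a later session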
Array Editor
Double-click on a variable in the Workspace browser to see it in the Array Editor. Use the Array Editor
to view and edit a visual representation of one- or two-dimensional numeric arrays, strings, and cell arrays
of strings that are in the workspace.
Editor/Debugger
Use the Editor/Debugger to create and debug M-files, which are programs you write to run MATLAB
functions. The Editor/Debugger provides a graphical user interface for basic text editing, as well as for M-
file debugging.
You can use any text editor to create M-files, such as Emacs, and can use preferences (accessible from
the desktop File menu) to specify that editor as the default. If you use another editor, you can still use the
MATLAB Editor/Debugger for debugging, or you can use debugging functions, such as dbstop, which sets
a breakpoint.
If you just need to view the contents of an M-file, you can display it in the Command Window by
using the type function.
The best way for you to get started with MATLAB is to learn how to handle matrices. Start MATLAB
and follow along with each example. You can enter the magic square from Durer's engraving as a list of its
elements:

A = [16 3 2 13; 5 10 11 8; 9 6 7 12; 4 15 14 1]

MATLAB displays the matrix you just entered:

A =
16 3 2 13
5 10 11 8
9 6 7 12
4 15 14 1

This exactly matches the numbers in the engraving. Once you have entered the matrix, it is
automatically remembered in the MATLAB workspace. You can refer to it simply as A.
3.6.2 Expressions
Like most other programming languages, MATLAB provides mathematical expressions, but unlike
most programming languages, these expressions involve entire matrices. The building blocks of
expressions are:
Variables
Numbers
Operators
Functions
Variables
MATLAB does not require any type declarations or dimension statements. When MATLAB
encounters a new variable name, it automatically creates the variable and allocates the appropriate amount
of storage. If the variable already exists, MATLAB changes its contents and, if necessary, allocates new
storage. For example,
num_students = 25
creates a 1-by-1 matrix named num_students and stores the value 25 in its single element.
Variable names consist of a letter, followed by any number of letters, digits, or underscores.
MATLAB uses only the first 31 characters of a variable name. MATLAB is case sensitive; it distinguishes
between uppercase and lowercase letters. A and a are not the same variable. To view the matrix assigned
to any variable, simply enter the variable name.
Numbers
MATLAB uses conventional decimal notation, with an optional decimal point and leading plus or
minus sign, for numbers. Scientific notation uses the letter e to specify a power-of-ten scale factor.
Imaginary numbers use either i or j as a suffix. Some examples of legal numbers are
3 -99 0.0001
9.6397238 1.60210e-20 6.02252e23
1i -3.14159j 3e5i
All numbers are stored internally using the long format specified by the IEEE floating-point standard.
Floating-point numbers have a finite precision of roughly 16 significant decimal digits and a finite range of
roughly 10^-308 to 10^+308.
3.6.3 Operators
+ Addition
- Subtraction
* Multiplication
/ Division
^ Power
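For example, the operators above combine in ordinary expressions, as in this computation of the golden ratio:

rho = (1 + sqrt(5)) / 2     % the golden ratio, 1.6180...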
3.6.4 Functions
MATLAB provides a large number of standard elementary mathematical functions, including abs, sqrt,
exp, and sin. Taking the square root or logarithm of a negative number is not an error; the appropriate
complex result is produced automatically. MATLAB also provides many more advanced mathematical
functions, including Bessel and gamma functions. Most of these functions accept complex arguments. For
a list of the elementary mathematical functions, type help elfun. For a list of more advanced mathematical
and matrix functions, type help specfun and help elmat.
Some of the functions, like sqrt and sin, are built-in. They are part of the MATLAB core so they
are very efficient, but the computational details are not readily accessible. Other functions, like gamma and
sinh, are implemented in M-files. You can see the code and even modify it if you want. Several special
functions provide values of useful constants.
pi 3.14159265...
i Imaginary unit, sqrt(-1)
j Same as i
Inf Infinity
NaN Not-a-number
3.7 GUI
A graphical user interface (GUI) is a user interface built with graphical objects, such as buttons, text
fields, sliders, and menus. In general, these objects already have meanings to most computer users. For
example, when you move a slider, a value changes; when you press an OK button, your settings are
applied and the dialog box is dismissed. Of course, to leverage this built-in familiarity, you must be
consistent in how you use the various GUI-building components.
Applications that provide GUIs are generally easier to learn and use since the person using the
application does not need to know what commands are available or how they work. The action that results
from a particular user action can be made clear by the design of the interface.
The sections that follow describe how to create GUIs with MATLAB. This includes laying out the
components, programming them to do specific things in response to user actions, and saving and launching
the GUI; in other words, the mechanics of creating GUIs. This documentation does not attempt to cover
the "art" of good user interface design, which is an entire field unto itself.
MATLAB implements GUIs as figure windows containing various styles of uicontrol objects. You
must program each object to perform the intended action when activated by the user of the GUI. In
addition, you must be able to save and launch your GUI. All of these tasks are simplified by GUIDE,
MATLAB's graphical user interface development environment.
While it is possible to write an M-file that contains all the commands to lay out a GUI, it is easier to
use GUIDE to lay out the components interactively and to generate two files that save and launch the GUI:
A FIG-file - contains a complete description of the GUI figure and all of its
children (uicontrols and axes), as well as the values of all object properties.
An M-file - contains the functions that launch and control the GUI and the callbacks, which execute in
response to user actions.
Note that the application M-file does not contain the code that lays out the uicontrols; this information
is saved in the FIG-file.
GUIDE simplifies the creation of GUI applications by automatically generating an M-file framework
directly from your layout. You can then use this framework to code your application M-file. This approach
provides a number of advantages:
The M-file contains code to implement a number of useful features (see Configuring Application
Options for information on these features). The M-file adopts an effective approach to managing object
handles and executing callback routines (see Creating and Storing the Object Handle Structure for more
information). The M-file provides a way to manage global data (see Managing GUI Data for more
information).
The automatically inserted subfunction prototypes for callbacks ensure compatibility with future
releases. See Generating Callback Function Prototypes for information on syntax and arguments.
You can elect to have GUIDE generate only the FIG-file and write the application M-file yourself.
Keep in mind that there are no uicontrol creation commands in the application M-file; the layout
information is contained in the FIG-file generated by the Layout Editor.
Selecting GUIDE Application Options - set both FIG-file and M-file options.
Command-Line Accessibility
When MATLAB creates a graph, the figure and axes are included in the list of children of their
respective parents and their handles are available through commands such as findobj, set, and get. If you
issue another plotting command, the output is directed to the current figure and axes.
GUIs are also created in figure windows. Generally, you do not want GUI figures to be available as
targets for graphics output, since issuing a plotting command could direct the output to the GUI figure,
resulting in the graph appearing in the middle of the GUI.
In contrast, if you create a GUI that contains an axes and you want commands entered in the command
window to display in this axes, you should enable command-line access.
The Layout Editor component palette contains the user interface controls that you can use in your GUI.
These components are MATLAB uicontrol objects and are programmable via their Callback properties.
This section provides information on these components.
Push Buttons
Sliders
Toggle Buttons
Frames
Radio Buttons
Listboxes
Checkboxes
Popup Menus
Edit Text
Axes
Static Text
Figures
Push Buttons
Push buttons generate an action when pressed (e.g., an OK button may close a dialog box and apply
settings). When you click down on a push button, it appears depressed; when you release the mouse, the
button's appearance returns to its nondepressed state; and its callback executes on the button up event.
Properties to Set
String - set this property to the character string you want displayed on the push button.
Tag - GUIDE uses the Tag property to name the callback subfunction in the application M-file. Set Tag
to a descriptive name (e.g., close_button) before activating the GUI.
When the user clicks on the push button, its callback executes. Push buttons do not return a value or
maintain a state.
Toggle Buttons
Toggle buttons generate an action and indicate a binary state (e.g., on or off). When you click on a
toggle button, it appears depressed and remains depressed when you release the mouse button, at which
point the callback executes. A subsequent mouse click returns the toggle button to the nondepressed state
and again executes its callback.
The callback routine needs to query the toggle button to determine what state it is in. MATLAB sets
the Value property equal to the Max property when the toggle button is depressed (Max is 1 by default)
and equal to the Min property when the toggle button is not depressed (Min is 0 by default).
The following code illustrates how to program the callback in the GUIDE application M-file.
button_state = get(h,'Value');
if button_state == get(h,'Max')
% toggle button is pressed
elseif button_state == get(h,'Min')
% toggle button is not pressed
end
To display an image on a button, assign the CData property an m-by-n-by-3 array of RGB values that
defines a truecolor image. For example, the array a defines a 16-by-128 truecolor image using random
values between 0 and 1 (generated by rand).
a(:,:,1) = rand(16,128);
a(:,:,2) = rand(16,128);
a(:,:,3) = rand(16,128);
set(h,'CData',a)
Radio Buttons
Radio buttons are similar to checkboxes, but are intended to be mutually exclusive within a group of
related radio buttons (i.e., only one button is in a selected state at any given time). To activate a radio
button, click the mouse button on the object. The display indicates the state of the button.
Radio buttons have two states - selected and not selected. You can query and set the state of a radio
button through its Value property:
To make radio buttons mutually exclusive within a group, the callback for each radio button must set
the Value property to 0 on all other radio buttons in the group. MATLAB sets the Value property to 1 on
the radio button clicked by the user.
The following subfunction, when added to the application M-file, can be called by each radio button
callback. The argument is an array containing the handles of all other radio buttons in the group that must
be deselected.
function mutual_exclude(off)
set(off,'Value',0)
The handles of the radio buttons are available from the handles structure, which contains the handles of
all components in the GUI. This structure is an input argument to all radio button callbacks.
The following code shows the call to mutual_exclude being made from the first radio button's callback
in a group of four radio buttons.
off = [handles.radiobutton2,handles.radiobutton3,handles.radiobutton4];
mutual_exclude(off)
% Continue with callback
After setting the radio buttons to the appropriate state, the callback can continue with its
implementation-specific tasks.
Checkboxes
Check boxes generate an action when clicked and indicate their state as checked or not checked. Check
boxes are useful when providing the user with a number of independent choices that set a mode (e.g.,
display a toolbar or generate callback function prototypes).
The Value property indicates the state of the check box by taking on the value of the Max or Min
property (1 and 0, respectively, by default).
You can determine the current state of a check box from within its callback by querying the state of its
Value property, as illustrated in the following example:
function checkbox1_Callback(h,eventdata,handles,varargin)
if (get(h,'Value') == get(h,'Max'))
% checkbox is checked - take appropriate action
else
% checkbox is not checked - take appropriate action
end
Edit Text
Edit text controls are fields that enable users to enter or modify text strings. Use edit text when you
want text as input. The String property contains the text entered by the user.
To obtain the string typed by the user, get the String property in the callback.
user_string = get(h,'string');
MATLAB returns the value of the edit text String property as a character string. If you want users to
enter numeric values, you must convert the characters to numbers. You can do this using the str2double
command, which converts strings to doubles. If the user enters non-numeric characters, str2double returns
NaN.
You can use the following code in the edit text callback. It gets the value of the String property and
converts it to a double. It then checks if the converted value is NaN, indicating the user entered a non-
numeric character (isnan) and displays an error dialog (errordlg).
function edittext1_Callback(h,eventdata,handles,varargin)
user_entry = str2double(get(h,'string'));
if isnan(user_entry)
errordlg('You must enter a numeric value','Bad Input','modal')
end
On UNIX systems, clicking on the menubar of the figure window causes the edit text callback to
execute. However, on Microsoft Windows systems, if an editable text box has focus, clicking on the
menubar does not cause the editable text callback routine to execute. This behavior is consistent with the
respective platform conventions. Clicking on other components in the GUI executes the callback.
Static Text
Static text controls display lines of text. Static text is typically used to label other controls, provide
directions to the user, or indicate values associated with a slider. Users cannot change static text
interactively, and there is no way to invoke the callback routine associated with it.
Frames
Frames are boxes that enclose regions of a figure window. Frames can make a user interface easier to
understand by visually grouping related controls. Frames have no callback routines associated with them
and only uicontrols can appear within frames (axes cannot).
Frames are opaque. If you add a frame after adding components that you want to be positioned within
the frame, you need to bring forward those components. Use the Bring to Front and Send to Back
operations in the Layout menu for this purpose.
List Boxes
List boxes display a list of items and enable users to select one or more items.
The String property contains the list of strings displayed in the list box. The first item in the list has an
index of 1.
The Value property contains the index into the list of strings that correspond to the selected item. If the
user selects multiple items, then Value is a vector of indices. By default, the first item in the list is
highlighted when the list box is first displayed. If you do not want any item highlighted, then set the Value
property to empty.
The ListboxTop property defines which string in the list displays as the top most item when the list box
is not large enough to display all list entries. ListboxTop is an index into the array of strings defined by the
String property and must have a value between 1 and the number of strings. Noninteger values are fixed to
the next lowest integer.
The values of the Min and Max properties determine whether users can make single or multiple
selections:
If Max - Min > 1, then list boxes allow multiple item selection.
If Max - Min <= 1, then list boxes do not allow multiple item selection.
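For example, assuming a list box whose Tag is listbox1, multiple selection can be enabled and read back as follows:

set(handles.listbox1, 'Max', 2, 'Min', 0)    % Max - Min > 1 enables multiple selection
selected = get(handles.listbox1, 'Value')    % vector of indices of the selected items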
Selection Type
Listboxes differentiate between single and double clicks on an item and set the figure SelectionType
property to normal or open accordingly. See Triggering Callback Execution for information on how to
program multiple selection.
MATLAB evaluates the list box's callback after the mouse button is released or a keypress event
(including arrow keys) that changes the Value property (i.e., any time the user clicks on an item, but not
when clicking on the list box scrollbar). This means the callback is executed after the first click of a
double-click on a single item or when the user is making multiple selections. In these situations, you need
to add another component, such as a Done button (push button) and program its callback routine to query
the list box Value property (and possibly the figure SelectionType property) instead of creating a callback
for the list box. If you are using the automatically generated application M-file option, you need to either:
Set the list box Callback property to the empty string ('') and remove the callback subfunction from the
application M-file. Leave the callback subfunction stub in the application M-file so that no code executes
when users click on list box items.
The first choice is best if you are sure you will not use the list box callback and you want to minimize
the size and efficiency of the application M-file. However, if you think you may want to define a callback
for the list box at some time, it is simpler to leave the callback stub in the M-file.
Popup Menus
Popup menus open to display a list of choices when users press the arrow. The String property contains
the list of string displayed in the popup menu. The Value property contains the index into the list of strings
that correspond to the selected item. When not open, a popup menu displays the current choice, which is
determined by the index contained in the Value property. The first item in the list has an index of 1.
Popup menus are useful when you want to provide users with a number of mutually exclusive choices,
but do not want to take up the amount of space that a series of radio buttons requires.
Programming the Popup Menu
You can program the popup menu callback to work by checking only the index of the item selected
(contained in the Value property) or you can obtain the actual string contained in the selected item.
This callback checks the index of the selected item and uses a switch statement to take action based on the
value. If the contents of the popup menu are fixed, then you can use this approach.
val = get(h,'Value');
switch val
case 1
% the user selected the first item
case 2
% the user selected the second item
% etc.
end
This callback obtains the actual string selected in the popup menu. It uses the value to index into the
list of strings. This approach may be useful if your program dynamically loads the contents of the popup
menu based on user action and you need to obtain the selected string. Note that it is necessary to convert
the value returned by the String property from a cell array to a string.
val = get(h,'Value');
string_list = get(h,'String');
selected_string = string_list{val}; % convert from cell array to string
% etc.
Enabling or Disabling Controls
You can control whether a control responds to mouse button clicks by setting the Enable property.
Controls have three states:
on - The control is operational (the default).
off - The control is disabled and its label (set by the String property) is
grayed out.
inactive - The control is disabled, but its label is not grayed out.
When a control is disabled, clicking on it with the left mouse button does not execute its callback
routine. However, the left-click causes two other callback routines to execute: First the figure
WindowButtonDownFcn callback executes. Then the control's ButtonDownFcn callback executes. A right
mouse button click on a disabled control posts a context menu, if one is defined for that control. See the
Enable property description for more details.
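For example, for a hypothetical push button tagged pushbutton1:

set(handles.pushbutton1, 'Enable', 'off')    % disable the button; its label is grayed out
set(handles.pushbutton1, 'Enable', 'on')     % restore normal operation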
Axes
Axes enable your GUI to display graphics (e.g., graphs and images). Like all graphics objects, axes
have properties that you can set to control many aspects of its behavior and appearance. See Axes
Properties for general information on axes objects.
Axes Callbacks
Axes are not uicontrol objects, but can be programmed to execute a callback when users click a mouse
button in the axes. Use the axes ButtonDownFcn property to define the callback.
GUIs that contain axes should ensure the Command-line accessibility option in the Application
Options dialog is set to Callback (the default). This enables you to issue plotting commands from callbacks
without explicitly specifying the target axes.
If a GUI has multiple axes, you should explicitly specify which axes you want to target when you issue
plotting commands. You can do this using the axes command and the handles structure. For example,
axes(handles.axes1)
makes the axes whose Tag property is axes1 the current axes, and therefore the target for plotting
commands. You can switch the current axes whenever you want to target a different axes. See GUI with
Multiple Axes for an example that uses two axes.
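A minimal sketch, assuming two axes tagged axes1 and axes2:
t = 0:0.01:2*pi;
axes(handles.axes1)   % make axes1 the current axes
plot(t, sin(t))       % this plot targets axes1
axes(handles.axes2)   % switch the target to axes2
plot(t, cos(t))       % this plot targets axes2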
Figure
Figures are the windows that contain the GUI you design with the Layout Editor. See the description of
figure properties for information on what figure characteristics you can control.
CHAPTER-5
CONCLUSION
In this paper, we propose a simple yet fast and novel image super-resolution (SR) algorithm, which
belongs to the family of learning-based SR algorithms, using clustering and collaborative representation
(CCR). The algorithm employs a clustering method and collaborative representation [47] to learn numerous
projection matrices from the LR feature spaces to their HR feature spaces. When compared with other
state-of-the-art SR methods, our CCR-based algorithm shows the best performance in terms of both
objective evaluation metrics and subjective visual results. As for objective evaluation, our algorithm
obtains large gains over other competing methods in PSNR, SSIM, and VIF values and also consumes
the least running time. When it comes to visual SR results, our algorithm also reconstructs results that are
more faithful to the original HR images, with sharper edges and finer details. Moreover, our approach can
further speed up the SR procedure by using a very small number of projection matrices while maintaining
high-quality SR results, which makes it well suited for real-time applications.
In this letter, a novel resolution-enhancement technique based on the interpolation of the HF sub-band
images obtained via the DWT and the input LR image has been presented.
In contrast with other state-of-the-art resolution-enhancement techniques, the designed framework
exploits the edge and fine-feature information obtained in the wavelet-transform (WT) domain, performs
sparse interpolation over oriented blocks in the LR image, and uses the nonlocal means (NLM) denoising
algorithm for the SR restoration. Experimental results highlight the superior performance of the proposed
algorithm, in terms of both objective criteria and subjective perception via the human visual system, in
comparison with other conventional methods.
CHAPTER-6
REFERENCES
[1] Y. V. Shkvarko, J. Tuxpan, and S. R. Santos, “l2 − l1 structured descriptive experiment design
regularization based enhancement of fractional SAR imagery,” Signal Process., vol. 93, no. 12, pp.
[2] T. M. Lillesand, R. W. Kiefer, and J. W. Chipman, Remote Sensing and Image Interpretation.
Hoboken, NJ, USA: Wiley, 2004.
[3] A. Temizel and T. Vlachos, “Image resolution upscaling in the wavelet domain using directional
cycle spinning,” J. Electron. Imaging, vol. 14, no. 4, 2005.
[4] M. Elad, M. A. T. Figueiredo, and Y. Ma, “On the role of sparse and redundant representations in
image processing,” Proc. IEEE, vol. 98, no. 6, pp. 972–982, Jun. 2010.
[5] M. Unser, “Splines: A perfect fit for signal and image processing,” IEEE Signal Process. Mag.,
vol. 16, no. 6, pp. 22–38, Nov. 1999.
[6] K. Turkowski, “Filters for common resampling tasks,” in Graphics Gems. New York, NY, USA:
Academic, 1990.
[7] M. Protter, M. Elad, H. Takeda, and P. Milanfar, “Generalizing the nonlocal-means to super-
resolution reconstruction,” IEEE Trans. Image Process., vol. 18, no. 1, pp. 36–51, Jan. 2009.
[8] G. Anbarjafari and H. Demirel, “Image super resolution based on interpolation of wavelet domain
high frequency subbands and the spatial domain input image,” ETRI J., vol. 32, no. 3, pp. 390–394,
2010.
[9] A. Temizel and T. Vlachos, “Wavelet domain image resolution enhancement using cycle-spinning,”
Electron. Lett., vol. 41, no. 3, pp. 119–121, Feb. 2005.
[10] H. Demirel and G. Anbarjafari, “Image resolution enhancement by using discrete and stationary
wavelet decomposition,” IEEE Trans. Image Process., vol. 20, no. 5, pp. 1458–1460, May 2011.
[11] H. Demirel and G. Anbarjafari, “Discrete wavelet transform-based satellite image resolution
enhancement,” IEEE Trans. Geosci. Remote Sens., vol. 49, no. 6, pp. 1997–2004, Jun. 2011.
[12] M. Iqbal, A. Ghafoor, and A. Siddiqui, “Satellite image resolution enhancement using dual-tree
complex wavelet transform and nonlocal means,” IEEE Geosci. Remote Sens. Lett., vol. 10, no. 3, pp.
[15] S. Mallat and G. Yu, “Super-resolution with sparse mixing estimators,” IEEE Trans. Image
Process., vol. 19, no. 11, pp. 2889–2900, Nov. 2010.
[16] L. Feng, C. Y. Suen, Y. Y. Tang, and L. H. Yang, “Edge extraction of images by reconstruction
using wavelet decomposition details at different resolution levels,” Int. J. Pattern Recog. Artif. Intell.,
[17] Z. Wang and A. Bovik, “Mean squared error: Love it or leave it? A new look at signal fidelity
measures,” IEEE Signal Process. Mag., vol. 26, no. 1, pp. 98–117, Jan. 2009.