Content Based Image Retrieval (CBIR) in MATLAB Using Gabor Wavelet
To be submitted by:-
Yadav Satyam
Date: _______________
CERTIFICATE OF APPROVAL
Yadav Satyam, a student of “Master of Science (Information Technology)” at “Thakur Degree College of
Science & Commerce”, has successfully completed and submitted the project entitled “Content
Based Image Retrieval (CBIR) using Gabor Wavelet” in partial fulfillment of the syllabus
defined by the University of Mumbai.
The project is original study work, and the important sources used have been
duly acknowledged in the report.
________________ ______________________
_____________________
External Examiner
I am thankful to the Honorable Principal, Mrs. C. T. Chakraborty, and the Head of the IT
Department (H.O.D.), Mr. Santosh Kumar Singh, for their great support.
I am very grateful to Mrs. Varsha Turkar, my Project Guide, for the valuable
guidance and suggestions which helped me accomplish the project.
I am very grateful to Mr. Ravindra Patil for his suggestions, which helped me in the
project.
Content Based Image Retrieval (CBIR) is a system that retrieves images from an image
collection, where the retrieval is based on a query that is specified by content rather than
by index or address. The CBIR system retrieves relevant images from an image
collection based on automatically derived features. The derived features include primitive features
such as texture, color, and shape.
Problem Definition:
Image retrieval can be divided into Text-Based Image Retrieval (TBIR) and Content-
Based Image Retrieval (CBIR). The text-based image retrieval technique first annotates the
images by text, and then uses text-based database management systems to perform image
retrieval.
CBIR is the retrieval of images based on visual features such as color, texture and shape.
1. Feature Extraction: The first step in the process is extracting image features to a
distinguishable extent.
2. Matching: The second step involves matching these features to yield a result that is
visually similar.
Algorithm to be implemented:
1. Extract the feature vector of each image.
2. Store the feature vectors in the database.
3. Extract the feature vector of the query image.
4. Compare the feature vector of the query image with the database features.
5. Apply similarity measures.
6. Retrieve similar images.
Technology:
MATLAB
MATLAB is a high-performance language for technical computing. It allows easy matrix
manipulation, plotting of functions and data, implementation of algorithms, creation of user
interfaces and interfacing with programs in other languages.
MATLAB also provides a wide range of elementary functions. It supports various types of
data files and also has object-oriented capabilities.
INDEX
CONTENT BASED IMAGE RETRIEVAL Page 7
Sr. No. Content
1. Introduction
1.1 Content Based Image Retrieval
1.2 Digital Image and its Features
1.2.1 Feature Vectors
2. Wavelet
2.1 Introduction
2.2 Wavelet Transform
2.3 Gabor Wavelet
3. System Overview
3.1 Working of the project
3.2 Flow chart of the CBIR systems
4. System Analysis and Design
4.1 Methodology
4.2 Hardware and Software Requirement
4.3 Gantt Chart
4. Introduction to Matlab
5. Screenshots
6. Testing
7. Cost Estimation
8. Annexure
8.1 Limitation of the System
8.2 Future Enhancements
9. Conclusion
10. Appendix
Bibliography
Image retrieval is the process of browsing, searching and retrieving images from a large database
of digital images. The collection of images on the web keeps growing larger.
In recent years, with the large-scale storage of images, the need for
an efficient method of image searching and retrieval has increased. Such a method can simplify
many tasks in application areas such as biomedicine, forensics, artificial intelligence, military,
education and web image searching. Most of the image retrieval systems present today are
text-based: images are manually annotated with text keywords, and when we query by a keyword,
the system matches the query against the keywords present in the database instead of looking
into the contents of the image. This technique has some disadvantages:
a) Firstly, considering the huge collection of images present, it’s not feasible to manually
annotate them.
b) Secondly, the rich features present in an image cannot be described by keywords completely.
These disadvantages of text-based image retrieval techniques call for another relatively new
technique known as Content-Based Image Retrieval (CBIR).
A digital image is a representation of a two-dimensional image using ones and zeros (binary).
Depending on whether or not the image resolution is fixed, it may be of vector or raster type.
Without qualifications, the term "digital image" usually refers to raster images also called bitmap
images. Raster images have a finite set of digital values, called picture elements or pixels. The
digital image contains a fixed number of rows and columns of pixels. Pixels are the smallest
individual element in an image, holding quantized values that represent the brightness of a given
color at any specific point. Raster images can be created by a variety of input devices and
techniques, such as digital cameras, scanners, coordinate-measuring machines, seismographic
profiling, airborne radar, and more. They can also be synthesized from arbitrary non-image data.
Digital image processing is the use of computer algorithms to perform image processing on
digital images. As a subcategory or field of digital signal processing, digital image processing
has many advantages over analog image processing. It allows a much wider range of algorithms
to be applied to the input data and can avoid problems such as the build-up of noise and signal
distortion during processing. Since images are defined over two dimensions (perhaps more)
digital image processing may be modeled in the form of Multidimensional Systems.
Feature Vectors:-
1. Mean: The mean is the average gray level of the pixels in the image. Its formula is:
$$M = \frac{\sum X}{N}$$
2. Variance: The variance is a measure of dispersion. It tells us something about the scatter
of scores (here pixels) around the mean. It is defined as the mean squared deviation from
the mean, and is symbolized by a small sigma squared. Its formula is:
$$\sigma_x^2 = \frac{\sum (X - M)^2}{N}$$
where X = pixel value,
M = mean of the image,
N = total number of pixels.
3. Standard Deviation:
The standard deviation is the square root of the variance and is symbolized by a small
Greek sigma, σ. Its formula is the square root of any of the formulae for the variance,
e.g.
$$\sigma_x = \sqrt{\frac{\sum (X - M)^2}{N}}$$
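As a small illustration of the three statistics above, the following MATLAB sketch computes them for a hypothetical 3x3 gray-level patch. The patch values and variable names are assumptions made for this example and are not data from the project.

```matlab
% Hypothetical 3x3 gray-level patch (illustrative values only).
X = double([10 20 30; 40 50 60; 70 80 90]);

N = numel(X);                    % total number of pixels
M = sum(X(:)) / N;               % mean gray level
v = sum((X(:) - M).^2) / N;      % variance: mean squared deviation from M
s = sqrt(v);                     % standard deviation: square root of the variance

fprintf('mean = %.2f, variance = %.2f, std = %.2f\n', M, v, s);
```

The built-in calls var(X(:), 1) and std(X(:), 1) should give the same values, because the second argument 1 selects normalization by N rather than N-1.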
Fig 1.2 Examples of Textures: (a) Clouds, (b) Bricks, (c) Rocks
7. Shape :
WAVELET
In mathematics, a wavelet transform refers to the representation of a signal in terms of a
finite-length or fast-decaying oscillating waveform known as the mother wavelet. Wavelet analysis
allows the use of long time intervals where we want more precise low-frequency information, and
shorter regions where we want high-frequency information.
Wavelet analysis is capable of revealing aspects of data that other signal analysis techniques miss,
such as trends, breakdown points, discontinuities in higher derivatives, and self-similarity.
Furthermore, because it affords a different view of data than those presented by traditional
techniques, wavelet analysis can often compress or de-noise a signal without appreciable
degradation.
Wavelet transforms are broadly divided into three classes: the continuous wavelet transform, the
discretized wavelet transform and the multi-resolution based wavelet transform. The discrete
wavelet transform (DWT) is good for signals having high-frequency components for short durations
and low-frequency components for long durations, e.g. images. When a wavelet transform of an
image is performed, a coefficient in a low sub-band can be thought of as having four descendants
in the next higher sub-band. The four descendants each have four descendants in the next higher
sub-band.
The DWT transforms a discrete-time signal into a discrete wavelet
representation. A 2D wavelet transform works as follows:
WAVELET TRANSFORM
Unlike the usage of sine functions to represent signals in Fourier transforms, in wavelet
transform, we use functions known as wavelets. Wavelets are finite in time, yet the average value
of a wavelet is zero. In a sense, a wavelet is a waveform that is bounded in both frequency and
duration. While the Fourier transform converts a signal into a continuous series of sine waves,
each of which is of constant frequency and amplitude and of infinite duration, most real-world
signals (such as music or images) have a finite duration and abrupt changes in frequency. This
accounts for the efficiency of wavelet transforms: they convert a
signal into a series of wavelets, which can be stored more efficiently because they are finite in
time, and which can be constructed with rough edges, thereby better approximating real-world signals.
Wavelet Decomposition:
Wavelets are functions defined over a finite interval and having an average value of zero. In
wavelet decomposition, an image is decomposed into four components namely Approximate
coefficients, Horizontal coefficients, Diagonal coefficients and Vertical coefficients.
[Figure: an image decomposed into approximate (AC), horizontal (HC), diagonal (DC) and vertical (VC) coefficients]
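A one-level decomposition into these four components can be sketched by hand with the Haar wavelet, the simplest choice. The report does not state which wavelet was used, so the Haar wavelet, the 4x4 test matrix and the variable names below are assumptions. The Wavelet Toolbox function dwt2 performs the same kind of decomposition, although its sign and boundary conventions may differ from this sketch.

```matlab
% Hypothetical 4x4 gray-level image (even number of rows and columns).
X = double([1 2 3 4; 5 6 7 8; 9 10 11 12; 13 14 15 16]);

% Split the image into the four pixels of every non-overlapping 2x2 block.
a = X(1:2:end, 1:2:end);   % top-left pixel of each block
b = X(1:2:end, 2:2:end);   % top-right
c = X(2:2:end, 1:2:end);   % bottom-left
d = X(2:2:end, 2:2:end);   % bottom-right

% One-level 2D Haar transform (orthonormal scaling by 1/2).
cA = (a + b + c + d) / 2;  % approximate coefficients (local average)
cH = (a + b - c - d) / 2;  % horizontal coefficients (top rows minus bottom rows)
cV = (a - b + c - d) / 2;  % vertical coefficients (left columns minus right columns)
cD = (a - b - c + d) / 2;  % diagonal coefficients

% The transform is orthonormal, so the total energy is preserved.
energyIn  = sum(X(:).^2);
energyOut = sum(cA(:).^2) + sum(cH(:).^2) + sum(cV(:).^2) + sum(cD(:).^2);
fprintf('energy before = %.1f, after = %.1f\n', energyIn, energyOut);
```

Each output is half the size of the input in both directions, which is why a coefficient in a lower sub-band corresponds to four coefficients at the next finer level.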
In numerical analysis and functional analysis, a discrete wavelet transform (DWT) is any wavelet
transform for which the wavelets are discretely sampled. As with other wavelet transforms, a key
advantage it has over Fourier transforms is temporal resolution: it captures both frequency and
location information (location in time).
In image processing, a Gabor filter, named after Dennis Gabor, is a linear filter
used for edge detection. Frequency and orientation representations of Gabor filter are similar to
those of human visual system, and it has been found to be particularly appropriate for texture
representation and discrimination. In the spatial domain, a 2D Gabor filter is a Gaussian kernel
function modulated by a sinusoidal plane wave. The Gabor filters are self-similar – all filters can
be generated from one mother wavelet by dilation and rotation.
The Gabor filter (or Gabor wavelet) is widely adopted to extract texture features from images for
image retrieval and has been shown to be very efficient. Manjunath and Ma [5] have shown that
image retrieval using Gabor features outperforms that using pyramid-structured wavelet
transform (PWT) features, tree-structured wavelet transform (TWT) features and multiresolution
simultaneous autoregressive model (MR-SAR) features. Basically, Gabor filters are a group of
wavelets, with each wavelet capturing energy at a specific frequency and a specific direction.
Expanding a signal using this basis provides a localized frequency description, therefore
capturing local features/energy of the signal. Texture features can then be extracted from this
group of energy distributions. The scale (frequency) and orientation tunable property of Gabor
filter makes it especially useful for texture analysis.
The filter has a real and an imaginary component representing orthogonal directions. The two
components may be formed into a complex number or used individually.
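Real:
$$g_{\mathrm{re}}(x, y; \lambda, \theta, \psi, \sigma, \gamma) = \exp\left(-\frac{x'^2 + \gamma^2 y'^2}{2\sigma^2}\right) \cos\left(2\pi \frac{x'}{\lambda} + \psi\right)$$
Imaginary:
$$g_{\mathrm{im}}(x, y; \lambda, \theta, \psi, \sigma, \gamma) = \exp\left(-\frac{x'^2 + \gamma^2 y'^2}{2\sigma^2}\right) \sin\left(2\pi \frac{x'}{\lambda} + \psi\right)$$
Where
$$x' = x\cos\theta + y\sin\theta$$
And
$$y' = -x\sin\theta + y\cos\theta$$
In these equations,
λ represents the wavelength of the sinusoidal factor,
θ represents the orientation of the normal to the parallel stripes of a Gabor function,
ψ is the phase offset, σ is the sigma of the Gaussian envelope, and
γ is the spatial aspect ratio, which specifies the ellipticity of the support of the Gabor
function.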
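function gb = gabor_fn(sigma, theta, lambda, psi, gamma)
% Real component of a 2D Gabor filter kernel.
sigma_x = sigma;
sigma_y = sigma / gamma;
% Bounding box: cover nstds standard deviations of the Gaussian envelope
nstds = 3;
xmax = max(abs(nstds*sigma_x*cos(theta)), abs(nstds*sigma_y*sin(theta)));
xmax = ceil(max(1, xmax));
ymax = max(abs(nstds*sigma_x*sin(theta)), abs(nstds*sigma_y*cos(theta)));
ymax = ceil(max(1, ymax));
xmin = -xmax; ymin = -ymax;
[x, y] = meshgrid(xmin:xmax, ymin:ymax);
% Rotation
x_theta = x*cos(theta) + y*sin(theta);
y_theta = -x*sin(theta) + y*cos(theta);
% Gaussian envelope modulated by a cosine carrier, following the real-component definition
gb = exp(-0.5*(x_theta.^2/sigma_x^2 + y_theta.^2/sigma_y^2)) .* cos(2*pi/lambda*x_theta + psi);
end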
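To illustrate how such a kernel can yield texture features, the sketch below builds the real Gabor kernel at four orientations and summarizes each filter response by its mean and standard deviation. It is a sketch under assumptions: the parameter values, the fixed 13x13 support, the synthetic striped test image and the variable names are chosen for the example and are not taken from the project.

```matlab
% Hypothetical filter parameters (illustrative values only).
sigma = 2; lambda = 4; psi = 0; gamma = 1;
thetas = [0 45 90 135] * pi / 180;       % four orientations in radians

% Fixed 13x13 support, i.e. about 3*sigma on each side.
[x, y] = meshgrid(-6:6, -6:6);

% Synthetic 32x32 test image: vertical stripes with a wavelength of 4 pixels.
img = repmat(sin(2*pi*(0:31)/lambda), 32, 1);

features = zeros(numel(thetas), 2);      % one row [mean, std] per orientation
for k = 1:numel(thetas)
    t  = thetas(k);
    xt =  x*cos(t) + y*sin(t);           % rotated coordinates
    yt = -x*sin(t) + y*cos(t);
    gb = exp(-0.5*(xt.^2 + (gamma*yt).^2)/sigma^2) .* cos(2*pi*xt/lambda + psi);
    resp = abs(conv2(img, gb, 'valid')); % magnitude of the filter response
    features(k, :) = [mean(resp(:)), std(resp(:))];
end

disp(features);
```

Because the stripes vary along the x direction, the filter at 0° should respond much more strongly than the one at 90°, which is the orientation selectivity that makes Gabor features useful for texture.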
CBIR provides the retrieval of the digital images similar to the query image from the large
storage of the database according to their content.
1. When the user gives the query image, the Gabor transform is applied to the image.
2. Features (mean, entropy and standard deviation) of the transformed image are calculated.
3. Next, the query image is split into its R, G and B planes.
4. Features (mean, entropy and standard deviation) of each plane are calculated again.
5. The query image is then transformed with a 1-level discrete wavelet transform (DWT).
6. Again, the features of each component of the transformed image are calculated.
7. The same procedure is applied to each image of the database.
8. The feature vector of the database images (Fd) is obtained.
9. The feature vector of the query image (Fq) is obtained.
10. Similarity measurement using Euclidean distance is done between the feature vectors of
the database and query images.
11. Images whose distance is less than the predefined threshold are retrieved from the image
database.
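The matching part of this procedure can be sketched as follows. For brevity the sketch covers only the R, G and B plane features (mean, entropy and standard deviation) and omits the Gabor and DWT stages. The tiny synthetic images, the helper names (probs, entropyOf, planeFeat, imageFeat) and the threshold value are assumptions for illustration, not the project's actual code or data.

```matlab
% Hypothetical 4x4 RGB images standing in for the query and two database images.
q  = uint8(cat(3, 200*ones(4), 50*ones(4), 10*ones(4)));
d1 = q; d1(1,1,1) = 190;                                   % near-duplicate of the query
d2 = uint8(cat(3, 10*ones(4), 50*ones(4), 200*ones(4)));   % different colors
db = {d1, d2};

% Helper functions (hypothetical names). Entropy uses a 256-bin gray-level histogram.
probs     = @(v) nonzeros(accumarray(double(v(:)) + 1, 1, [256 1]) / numel(v));
entropyOf = @(v) -sum(probs(v) .* log2(probs(v)));
planeFeat = @(v) [mean(double(v(:))), entropyOf(v), std(double(v(:)), 1)];
imageFeat = @(im) [planeFeat(im(:,:,1)), planeFeat(im(:,:,2)), planeFeat(im(:,:,3))];

% Feature vector of the query (Fq) and of every database image (Fd, one row each).
Fq = imageFeat(q);
Fd = zeros(numel(db), numel(Fq));
for k = 1:numel(db)
    Fd(k, :) = imageFeat(db{k});
end

% Euclidean distance between Fq and each row of Fd.
dist = sqrt(sum((Fd - repmat(Fq, size(Fd, 1), 1)).^2, 2));

threshold = 20;                          % placeholder threshold
retrieved = find(dist <= threshold);     % indices of images considered similar
disp(retrieved);
```

A smaller distance means a closer match, so only images at or below the threshold are returned. In practice the features may need scaling so that no single component dominates the distance.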
The following diagram shows the working of the system.
The diagram above gives an overview of the Content Based Image Retrieval (CBIR) system.
Each block of the figure describes a particular process in the system. As shown in the figure,
the feature vectors of all the images are stored in a database called the feature database. The
corresponding feature vector of the query image is extracted and compared with all the feature
vectors stored in the database using a suitable similarity measurement technique, and relevant
images are retrieved on the basis of a predefined threshold.
Incremental development is a scheduling and staging strategy, in which the various parts of the
system are developed at different times or rates, and integrated as they are completed. It does not
imply, require nor preclude iterative development or waterfall development - both of those are
rework strategies. The alternative to incremental development is to develop the entire system
with “big bang” integration.
Iterative development is a rework scheduling strategy in which time is set aside to revise and
improve parts of the system. It does not presuppose incremental development, but works very
well with it. A typical difference is that the output from an increment is not necessarily subject to
further refinement, and its testing or user feedback is not used as input for revising the plans or
specifications of the successive increments. In contrast, the output from an iteration is
examined for modification, and especially for revising the targets of the successive iterations.
For every successful project, proper use of its resources is the major factor. For the
success of software projects, intelligent use of the available hardware and software is
important.
Hardware Requirement:
Processor: Pentium IV
RAM: 512 MB or more
Hard disk: 40 GB
Monitor
Software Requirement:
Matlab 9.0
[Gantt chart: project development phases: Requirement & specification, Analysis & research, Coding, System testing, Writing manual]
The above Gantt chart shows the development life cycle of the proposed system. It shows the
allotted time for each phase of the cycle and the actual time taken to go through a phase.
MATLAB is an interactive system whose basic data element is an array that does not require
dimensioning. This allows you to solve many technical computing problems, especially those
with matrix and vector formulations.
Development Environment: This is the set of tools and facilities that help you use MATLAB
functions and files.
Graphics: MATLAB has extensive facilities for displaying vectors and matrices as graphs, as
well as annotating and printing these graphs. It includes high-level and low-level functions.
The MATLAB Application Program Interface (API): This is a library that allows you to
write C and FORTRAN programs that interact with MATLAB.
Testing analyzes a program with the intent of finding problems and errors, and measures system
functionality and quality. Testing includes inspection and structured peer reviews of requirements
and design, as well as execution tests of the code. The code developed during the coding activity
is likely to have some requirement errors and design errors in addition to the errors introduced
during the coding activity. Testing performs a very critical role in quality assurance and in
ensuring the reliability of software. The system must be tested to evaluate the actual system
functionality.
White box testing is related to the structure of the program. To test the logic of the program,
various test cases are designed.
o Black box testing (BBT) is related to input and output and not to the internal structure of the
program.
o In BBT it is checked whether, when some input is given, the specific output is produced by the
program.
o Various sets of input test cases are prepared and applied to the program, and the corresponding
outputs are verified.
o This type of testing is done by test engineers.
Five images of each category are tested at threshold 20, and the results are summarized in the
following tables with their averages.
In information retrieval contexts, precision and recall are defined in terms of a set of retrieved
documents (e.g. the list of documents produced by a web search engine for a query) and a set of
relevant documents (e.g. the list of all documents on the internet that are relevant for a certain
topic).
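Precision:
In the field of information retrieval, precision is the fraction of retrieved documents that are
relevant to the search:
$$\text{precision} = \frac{|\{\text{relevant documents}\} \cap \{\text{retrieved documents}\}|}{|\{\text{retrieved documents}\}|}$$
Recall:
Recall in information retrieval is the fraction of the documents relevant to the query that
are successfully retrieved:
$$\text{recall} = \frac{|\{\text{relevant documents}\} \cap \{\text{retrieved documents}\}|}{|\{\text{relevant documents}\}|}$$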
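The two measures can be computed as in the sketch below. The image indices are hypothetical, and the figure of 50 relevant images per category is an assumption made for the example; the report does not state the category size.

```matlab
% Hypothetical result of one query: indices of the images that were returned.
retrieved = 1:11;
% Hypothetical ground truth: indices of all images in the query's category.
relevant  = 1:50;

hits      = numel(intersect(retrieved, relevant));   % relevant images that were retrieved
precision = hits / numel(retrieved);                 % fraction of retrieved images that are relevant
recall    = hits / numel(relevant);                  % fraction of relevant images that were retrieved

fprintf('precision = %.2f, recall = %.2f\n', precision, recall);
```

With a strict threshold only a few images are returned, so precision can be high while recall stays low, which is the pattern seen in most of the tables that follow.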
The following tables show the Precision and Recall for each category of the images.
Category Bus
Image Name   Precision   Recall   Precision Percentage   Recall Percentage
1.jpg        1.00        0.22     100%                   22%
3.jpg        1.00        0.08     100%                   8%
25.jpg       0.69        0.185    69%                    8%
44.jpg       0.50        0.06     50%                    6%
39.jpg       0.67        0.12     67%                    12%
Average      0.772       0.2664   77.2%                  26.64%
Category Dinosaur
Image Name   Precision   Recall   Precision Percentage   Recall Percentage
51.jpg       1.00        0.58     100%                   58%
54.jpg       1.00        0.44     100%                   44%
56.jpg       1.00        0.50     100%                   50%
94.jpg       1.00        0.18     100%                   18%
95.jpg       1.00        0.12     100%                   12%
Average      1.00        0.364    100%                   36.4%
Table 5.1.2 Precision & Recall of Dinosaur category
Category Beach
Image Name   Precision   Recall   Precision Percentage   Recall Percentage
143.jpg      0.5         0.06     50%                    6%
141.jpg      1.00        0.08     100%                   8%
131.jpg      0.67        0.08     67%                    8%
135.jpg      0.75        0.06     75%                    6%
120.jpg      0.36        0.10     36%                    10%
Average      0.656       0.076    65.6%                  7.6%
Category Rose
Image Name   Precision   Recall   Precision Percentage   Recall Percentage
151.jpg      1.00        0.10     100%                   10%
155.jpg      1.00        0.08     100%                   8%
157.jpg      0.60        0.06     60%                    6%
164.jpg      1.00        0.06     100%                   6%
186.jpg      1.00        0.08     100%                   8%
Average      0.92        0.076    92%                    7.6%
Category Horse
Image Name   Precision   Recall   Precision Percentage   Recall Percentage
203.jpg      1.00        0.20     100%                   20%
207.jpg      1.00        0.18     100%                   18%
209.jpg      1.00        0.20     100%                   20%
212.jpg      1.00        0.06     100%                   6%
240.jpg      1.00        0.04     100%                   4%
Average      1.00        0.136    100%                   13.6%
Category Elephant
Image Name   Precision   Recall   Precision Percentage   Recall Percentage
251.jpg      0.45        0.10     45%                    10%
253.jpg      1.00        0.06     100%                   6%
258.jpg      0.75        0.06     75%                    6%
263.jpg      0.5625      0.18     56.25%                 18%
270.jpg      0.714       0.10     71.4%                  10%
Average      0.6953      0.10     69.53%                 10%
Table 5.1.6 Precision & Recall of Elephant category
Category Mountain
Image Name   Precision   Recall   Precision Percentage   Recall Percentage
301.jpg      1.00        0.04     100%                   4%
303.jpg      0.60        0.12     60%                    12%
306.jpg      0.50        0.04     50%                    4%
320.jpg      0.714       0.10     71.4%                  10%
335.jpg      1.00        0.04     100%                   4%
Average      0.7628      0.068    76.28%                 6.8%
Category Dinosaur
Image Name   Precision   Recall   Precision Percentage   Recall Percentage
51.jpg       1.00        0.56     100%                   56%
54.jpg       1.00        0.44     100%                   44%
56.jpg       1.00        0.44     100%                   44%
94.jpg       1.00        0.18     100%                   18%
95.jpg       1.00        0.12     100%                   12%
Category Beach
Image Name   Precision   Recall   Precision Percentage   Recall Percentage
143.jpg      1.00        0.06     100%                   6%
141.jpg      1.00        0.08     100%                   8%
131.jpg      0.75        0.06     75%                    6%
135.jpg      1.00        0.06     100%                   6%
120.jpg      0.307       0.08     30.7%                  8%
Average      0.8114      0.068    81.14%                 6.8%
Category Rose
Image Name   Precision   Recall   Precision Percentage   Recall Percentage
151.jpg      1.00        0.02     100%                   2%
155.jpg      1.00        0.08     100%                   8%
157.jpg      0.75        0.06     75%                    6%
164.jpg      1.00        0.06     100%                   6%
186.jpg      0.80        0.08     80%                    8%
Average      0.91        0.078    91%                    7.8%
Table 5.2.4 Precision & Recall of Rose category
Category Elephant
Image Name   Precision   Recall   Precision Percentage   Recall Percentage
251.jpg      0.44        0.08     44%                    8%
253.jpg      1.00        0.06     100%                   6%
258.jpg      1.00        0.06     100%                   6%
263.jpg      0.5625      0.18     56.25%                 18%
270.jpg      1.00        0.01     100%                   1%
Average      0.8005      0.078    80.05%                 7.8%
Category Beach
Image Name   Precision   Recall   Precision Percentage   Recall Percentage
143.jpg      1.00        0.06     100%                   6%
141.jpg      1.00        0.08     100%                   8%
131.jpg      0.75        0.06     75%                    6%
135.jpg      1.00        0.06     100%                   6%
120.jpg      0.3076      0.08     30.76%                 8%
Average      0.81152     0.068    81.152%                6.8%
Category Rose
Image Name   Precision   Recall   Precision Percentage   Recall Percentage
151.jpg      1.00        0.10     100%                   10%
155.jpg      1.00        0.08     100%                   8%
157.jpg      0.75        0.06     75%                    6%
164.jpg      1.00        0.06     100%                   6%
186.jpg      1.00        0.08     100%                   8%
Average      0.95        0.076    95%                    7.6%
Table 5.3.4 Precision & Recall of Rose category
Category Horse
Image Name   Precision   Recall   Precision Percentage   Recall Percentage
203.jpg      1.00        0.16     100%                   16%
207.jpg      1.00        0.16     100%                   16%
209.jpg      1.00        0.20     100%                   20%
212.jpg      1.00        0.06     100%                   6%
240.jpg      1.00        0.04     100%                   4%
Average      1.00        0.124    100%                   12.4%
Category Elephant
Image Name   Precision   Recall   Precision Percentage   Recall Percentage
251.jpg      0.4444      0.08     44.44%                 8%
253.jpg      1.00        0.06     100%                   6%
258.jpg      1.00        0.06     100%                   6%
263.jpg      0.5333      0.16     53.33%                 16%
270.jpg      1.00        0.10     100%                   10%
Average      0.7955      0.092    79.55%                 9.2%
Category Mountain
Image Name   Precision   Recall   Precision Percentage   Recall Percentage
301.jpg      1.00        0.04     100%                   4%
303.jpg      1.00        0.12     100%                   12%
306.jpg      1.00        0.04     100%                   4%
320.jpg      1.00        0.08     100%                   8%
335.jpg      1.00        0.04     100%                   4%
Average      1.00        0.064    100%                   6.4%
Table 5.3.7 Precision & Recall of Mountain category
The cost of the project can be estimated using the Constructive Cost Model (COCOMO)
developed by Boehm.
The basic steps in this model are:-
A] Obtain an initial estimate of the development effort from the estimate of thousands of
delivered lines of source code (KDLOC). The initial estimate, also known as the nominal estimate,
is determined by an equation in which KDLOC is used to measure size. To determine the initial
effort Ei in person-months, the equation is:
$$E_i = a \times (\text{KDLOC})^b$$
where the values of a and b depend on the project type.
COCOMO projects are categorized into three types:
i. Organic: Suitable for an organization that has considerable experience and less stringent
requirements.
ii. Semidetached: An example of this type is developing a new database management
system.
iii. Embedded: The organization has little experience and the requirements are stringent.
The size of the different modules and the overall system are estimated to be:
There are 15 different attributes, called cost driver attributes, that determine the multiplying
factors.
These factors depend on product, computer, personnel and technology attributes (called
project attributes).
Examples of the attributes are required software reliability (RELY), product complexity (CPLX),
analyst capability (ACAP), application experience and required development schedule (SCED).
Ratings ("-" means no multiplier is defined for that rating)
Cost Drivers                                    Very Low   Low        Nominal    High       Very High  Extra High
Product attributes
Required software reliability                   0.75       0.88       1.00       1.15       1.40       -
Size of application database                    -          0.94       1.00       1.08       1.16       -
Complexity of the product                       0.70       0.85       1.00       1.15       1.30       1.65
Hardware attributes
Run-time performance constraints                -          -          1.00       1.11       1.30       1.66
Memory constraints                              -          -          1.00       1.06       1.21       1.56
Volatility of the virtual machine environment   -          0.87       1.00       1.15       1.30       -
Required turnabout time                         -          0.87       1.00       1.07       1.15       -
Personnel attributes
Analyst capability                              1.46       1.19       1.00       0.86       0.71       -
Applications experience                         1.29       1.13       1.00       0.91       0.82       -
Software engineer capability                    1.42       1.17       1.00       0.86       0.70       -
Virtual machine experience                      1.21       1.10       1.00       0.90       -          -
Programming language experience                 1.14       1.07       1.00       0.95       -          -
Project attributes
Use of software tools                           1.24       1.10       1.00       0.91       0.82       -
Application of software engineering methods     1.24       1.10       1.00       0.91       0.83       -
Required development schedule                   1.23       1.08       1.00       1.04       1.10       -
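The estimation steps can be sketched as follows. The report does not give the values of a and b, so the sketch assumes the commonly published intermediate COCOMO constants. The project size of 2 KDLOC, the project type and the chosen cost driver ratings are hypothetical and only illustrate the arithmetic.

```matlab
% Assumed intermediate COCOMO constants per project type (not stated in the report).
a = struct('organic', 3.2,  'semidetached', 3.0,  'embedded', 2.8);
b = struct('organic', 1.05, 'semidetached', 1.12, 'embedded', 1.20);

kdloc = 2;                  % hypothetical size in thousands of delivered lines of code
type  = 'organic';          % hypothetical project type

% Nominal (initial) effort in person-months: Ei = a * KDLOC^b.
Ei = a.(type) * kdloc ^ b.(type);

% Hypothetical ratings, with multipliers read from the cost driver table:
% required software reliability High (1.15), product complexity High (1.15),
% analyst capability High (0.86); all other drivers Nominal (1.00).
multipliers = [1.15 1.15 0.86];
eaf = prod(multipliers);    % effort adjustment factor
E   = Ei * eaf;             % adjusted effort in person-months

fprintf('nominal effort = %.2f PM, EAF = %.4f, adjusted effort = %.2f PM\n', Ei, eaf, E);
```

Drivers rated Nominal contribute a factor of 1.00, so only the ratings that differ from Nominal need to be listed when forming the product.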
Image analysis is very memory-intensive, so the program will run faster and more
effectively on machines with more RAM available.
Doesn’t work on shapes.
Threshold values need to be adjusted for images belonging to different categories.
Doesn’t do object recognition.
Future Enhancements
List of Figures:
Fig.2.6 Demonstration of a Gabor filter applied to Chinese character. Four orientations are
shown on the right 0°, 45°, 90° and 135°. The original character picture and the superposition of
all four orientations are shown on the left.
List of Charts:
List of Tables: