Efficient CBIR Using Color Histogram Processing
Department of Electronics and Communication, Sri Satya Sai Institute of Science and
Technology, Sehore
[email protected]
ABSTRACT
The need for efficient content-based image retrieval systems has increased greatly. The explosive growth
in the number of digital images calls for efficient and effective image retrieval techniques. Content-based
image retrieval (CBIR) is a promising approach because it indexes and retrieves images automatically, based
on their semantic features and visual appearance. The similarity of images depends on the feature
representation. However, users have difficulty expressing their information needs as queries to
content-based image retrieval systems. In this paper we investigate two methods for describing the contents
of images. The first characterizes images by global descriptor attributes, while the second is based on
the color histogram approach. Computing feature vectors for the global descriptor takes much less time than
computing a color histogram. Hence the cross correlation value and the image descriptor attributes are
calculated before the histogram stage to make the CBIR system more efficient. The performance of this
approach is measured and results are shown. The aim of this paper is to compare various global descriptor
attributes and to make the CBIR system more efficient. It is found that further modifications are needed to
produce better performance in searching images.
KEYWORDS
CBIR, Image Retrieval, Feature extraction, Global descriptor.
I. INTRODUCTION
Advances in computer and network technologies, coupled with relatively cheap high-volume data
storage devices, have brought tremendous growth in the number of digital images. Digital
content is being generated at an exponential rate. Businesses, the media, government
agencies and even individuals all need to organize their images somehow. As collections of
digital images grow, finding a desired image on the web becomes a
hard task. There is a need to develop an efficient method to retrieve digital images.
Nowadays, CBIR is an active topic in digital image processing. CBIR research started in
the early 1990s and is likely to continue during the first two decades of the 21st century. Many
research groups in leading universities and companies are actively working in this area, and a
fairly large number of prototypes and commercial products are already available. However, the
current solutions are still far from reaching the ultimate goal.
DOI : 10.5121/sipij.2011.2108
Signal & Image Processing : An International Journal(SIPIJ) Vol.2, No.1, March 2011
There are two approaches to image retrieval: the text-based approach and the content-based approach.
Today, the most common way of retrieving images is by textual description and categorization of images.
This approach has some obvious shortcomings. Different people might categorize or describe the
same image differently, leading to problems retrieving it again. It is also time consuming when
dealing with very large databases. Content-based image retrieval (CBIR) is a way to get around
these problems.
CBIR systems search collections of images based on features that can be extracted from the image
files themselves, without manual description. Many CBIR systems have been developed in past
decades; their common goal is to retrieve a desired image. Comparing two images
and deciding whether they are similar is relatively easy for a human. Getting a
computer to do the same thing effectively is, however, a different matter. Many different
approaches to CBIR have been tried, and many of them have one thing in common: the use of
color histograms.
Researchers working on CBIR claim that text-based image retrieval (TBIR) has limitations. For
example, Brahmi et al. mentioned the following two drawbacks of text-based image retrieval. First,
manual image annotation is time-consuming and therefore costly. Second, human annotation is
subjective. In addition, Sclaroff et al. indicated that some images cannot be annotated because
it is difficult to describe their content with words. This may be one of the main causes of the above
two problems. We agree that the above two problems of annotation seem valid; however, we do
not think that CBIR should be supported instead of TBIR. There are two reasons to support TBIR.
First, CBIR has its own problems, which are probably more crucial. Second, the negative effects
of the above problems in TBIR may be mitigated. Let us start with an analysis of the
problems in CBIR. It is obvious that there are many applications where the use of CBIR is
advantageous. For example, CBIR is suitable for medical diagnosis based on the comparison of
X-ray pictures with past cases, and for finding the faces of criminals in video shots of a crowd.
These examples can be categorized as “find-similar” tasks; the images to be searched may not
differ significantly in appearance, so the superficial similarities of the images are more
important than their semantic contents. Other applications that involve more semantic relationships
cannot be dealt with by CBIR, even if extensive image processing procedures are applied. For
instance, when gathering photos regarding the ‘Iraq war’, it is not clear what kind of
images should be used for querying. This is simply because visual features cannot fully
represent concepts. Only texts or words can do that. Also, it should be noted that in a
query-by-example (QbE) setting, which is usually the premise of CBIR, users must have an example
image at hand. In contrast, in a query-by-text (QbT) setting, users simply need to have their search
requests in mind, because they can compose queries freely in natural language. This clarifies the
advantages and disadvantages of CBIR.
CBIR systems use visual content such as color, texture, and simple shape properties to search
for images in large-scale image databases (Del Bimbo, 1999). Although they improve on text-based
image retrieval systems, these systems are not yet a commercial success. One of the major
reasons for this limited success is that CBIR systems rely upon a global view of the image, which
sometimes brings a lot of irrelevant image content into the search process. A solution for the
global view problem can be found in localized CBIR. These systems focus only on the portion of
the image the user is interested in.
In this paper, the three-dimensional RGB color space is investigated and a histogram-based image
retrieval method is used. A further aim of this work is to evaluate the performance measurement
parameters of all global descriptor attributes, including the cross correlation function, and to
compare them.
A. Feature Extraction
The first step in this process is to extract the image features to a distinguishable extent. In this
paper, global features are extracted to make the system more efficient. In this section, we
introduce the image features used by our methods for image description. We classify the various
features as follows:
Texture Features
Color Features
Where Pij is the color of pixel (i, j) and N is the total number of pixels in the image.
These values make it possible to estimate the average color, the dispersion of color values around
the average, and the symmetry of their distribution over the whole image.
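The equations for these three values are not reproduced in this copy of the paper. As a rough illustration only, the sketch below assumes the commonly used color-moment definitions (mean, standard deviation, and the cube root of the third central moment) for a single color channel; the function name `color_moments` and the flat-list input format are hypothetical and not taken from the paper.

```python
from math import copysign


def color_moments(channel):
    """Return (expectancy, dispersion, skewness) for one color channel.

    `channel` is a flat list of pixel values, e.g. the 64 red values of
    an 8x8 image. The definitions are an assumption, not the paper's.
    """
    n = len(channel)
    mean = sum(channel) / n                               # average color
    variance = sum((p - mean) ** 2 for p in channel) / n  # second central moment
    third = sum((p - mean) ** 3 for p in channel) / n     # third central moment
    dispersion = variance ** 0.5                          # standard deviation
    # Signed cube root, so a negative third moment gives negative skewness.
    skewness = copysign(abs(third) ** (1.0 / 3.0), third)
    return mean, dispersion, skewness
```

Calling the function once per channel (R, G and B) would give a nine-element feature vector per image under these assumptions.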
B. Matching
The second step involves matching these features to yield a result that is visually similar.
The basic idea behind CBIR is that, when building an image database, feature vectors are extracted
from the images (the features can be color, shape, texture, region or spatial features, features in
some compressed domain, etc.) and stored in another database for future use.
When a query image is given, its feature vector is computed. If the distance between the feature
vectors of the query image and an image in the database is small enough, that database image
is considered a match to the query. The search is usually based on similarity
rather than on exact match, and the retrieval results are ranked according to a similarity
index.
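The matching step described above can be sketched as follows. This is a minimal illustration, assuming feature vectors are plain lists of numbers and using a sum of absolute differences as the distance; the names `rank_matches`, `database` and `threshold` are hypothetical.

```python
def rank_matches(query_vec, database, threshold):
    """Return names of database images that match the query, best first.

    `database` maps an image name to its precomputed feature vector.
    An image is a match when its distance to the query is at most
    `threshold` (an illustrative cut-off, not a value from the paper).
    """
    scored = []
    for name, vec in database.items():
        # Distance between the two feature vectors (sum of absolute differences).
        dist = sum(abs(q - v) for q, v in zip(query_vec, vec))
        if dist <= threshold:
            scored.append((dist, name))
    scored.sort()  # smallest distance, i.e. most similar, first
    return [name for _, name in scored]
```

Because the feature vectors of the database images are stored in advance, only the query vector has to be computed at search time.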
The RGB model uses three primary colors, red, green and blue, in an additive fashion to
reproduce other colors. As this is the basis of most computer displays today, this model has the
advantage of being easy to extract. In a true-color image each pixel has a red, green and
blue value, each ranging from 0 to 255, giving a total of 16777216 different colors.
One disadvantage of the RGB model is its behaviour when the illumination in an image
changes. The distribution of RGB values changes proportionally with the illumination, thus
giving a very different histogram.
A. Color Histogram
The approach most frequently adopted for CBIR systems is based on the conventional color
histogram (CCH), which contains the number of occurrences of each color, obtained by counting all
image pixels having that color. Each pixel is associated with a specific histogram bin only on the
basis of its own color; color similarity across different bins and color dissimilarity within the
same bin are not taken into account. Since any pixel in the image can be described by three
components in a certain colour space (for instance, red, green and blue components in RGB space, or
hue, saturation and value in HSV space), a histogram, i.e., the distribution of the number of pixels
over the quantized bins, can be defined for each component.
By default the maximum number of bins one can obtain using the histogram function in MATLAB
is 256. The CCH of an image indicates the frequency of
occurrence of every color in the image. The appealing aspect of the CCH is its simplicity and ease
of computation.
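A per-component histogram of the kind described above can be sketched in a few lines. The paper's experiments were run in MATLAB; the Python below is only an illustration, and the function name `channel_histogram` and its `bins` parameter are hypothetical.

```python
def channel_histogram(values, bins=256):
    """Count how many 8-bit values (0..255) fall into each of `bins` bins.

    `values` holds one color component of every pixel, e.g. all red values.
    With bins=256 every possible value gets its own bin.
    """
    hist = [0] * bins
    for v in values:
        # Map the value 0..255 onto a bin index 0..bins-1.
        hist[v * bins // 256] += 1
    return hist
```

For example, `channel_histogram([0, 63, 64, 255], bins=4)` places the first two values in bin 0, the third in bin 1 and the last in bin 3.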
The flow chart of the experiment in the figure follows the procedure of a general image retrieval system.
In the figure above, once the user chooses a query image and a retrieval method, the rest of the
process is done automatically. However, the histogram data for all images in the database (DB) are
computed and saved in the DB in advance, so that only the image indexes and histogram data are
needed to compare the query image with the images in the DB. All processes were implemented in
MATLAB. The following sections explain the experiment in detail.
B. Generation of image DB
The image data used in the experiment were taken with a digital camera, and a few of the images were
downloaded from a web site to create a large database. However, in order to reduce the
computation time of the whole process, image sizes were reduced to 8×8 pixels.
C. Quantization
Comparing all the colours in two images would, however, be very time consuming and complex,
so a method of reducing the amount of information must be used. One way of doing this is by
quantizing the colour distribution into colour histograms. This is probably one of the more popular
approaches to image retrieval today.
When computing a colour histogram for an image, the different colour axes are divided into a
number of so-called bins. A three-dimensional 256x256x256 RGB histogram would therefore
contain a total of 16777216 such bins. When indexing the image, the colour of each pixel is found,
and the corresponding bin’s count is incremented by one.
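The indexing step can be illustrated with a coarser quantization than 256 bins per axis. The sketch below assumes 8-bit RGB pixels given as `(r, g, b)` tuples; the name `rgb_histogram` and the default of 4 bins per channel (64 bins in total) are illustrative choices, not values stated in the paper.

```python
def rgb_histogram(pixels, bins_per_channel=4):
    """Build a joint RGB histogram with bins_per_channel**3 bins.

    `pixels` is an iterable of (r, g, b) tuples with components in 0..255.
    """
    b = bins_per_channel
    hist = [0] * (b ** 3)
    for r, g, bl in pixels:
        # Quantize each colour axis to a bin index 0..b-1.
        ri, gi, bi = r * b // 256, g * b // 256, bl * b // 256
        # Flatten the 3-D bin coordinates into a single list index
        # and increment that bin's count by one.
        hist[(ri * b + gi) * b + bi] += 1
    return hist
```

With 4 bins per channel the histogram has 64 entries instead of 16777216, which makes comparing two images far cheaper at the cost of merging similar colours into one bin.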
$$D_{L_1}(Q, I) = \sum_{i} \lvert Q_i - I_i \rvert \qquad (4)$$
Where Qi is the value of bin i in the query image and Ii is the corresponding bin in the database
image. Based on experience from earlier projects, the L1-norm is the metric of choice for this
paper.
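As a small illustration of the L1-norm as a histogram metric, the sketch below sums the absolute bin-by-bin differences between a query histogram and a database histogram. The function name `l1_distance` is hypothetical, and any normalization the authors may have applied is not shown.

```python
def l1_distance(q_hist, i_hist):
    """L1 distance between a query histogram and a database histogram.

    Both histograms must use the same binning; a smaller value means
    the two colour distributions are more alike.
    """
    if len(q_hist) != len(i_hist):
        raise ValueError("histograms must have the same number of bins")
    return sum(abs(q - i) for q, i in zip(q_hist, i_hist))
```

Identical histograms give a distance of 0, so database images can be ranked by increasing distance to the query.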
There are, however, several difficulties associated with the color histogram (CH): a) the CH is
sensitive to noisy interference such as illumination changes and quantization errors; b) the large
dimension of the CH involves heavy computation during indexing; c) it does not take into
consideration color similarity across different bins; d) it cannot handle rotation and translation,
which means that information about object location, shape, and texture is discarded; e) two
perceptually very different images with similar colour distributions will be deemed similar by a
colour histogram-based retrieval system, as illustrated in figure 5. Hence images retrieved by using
the global color histogram may not be semantically related even though they share a similar color
distribution.
IX. EXPERIMENTS
Our method has been implemented with a general-purpose image database of about 20
pictures, stored in JPEG format at a size of 8x8 pixels. To make the color histogram based
CBIR method more efficient, the texture feature attribute (cross correlation) and the color
feature attributes are first calculated for all 20 database images against the query image.
Table 1: Distance Between Global Feature Attributes of Query Image & Database Images
A. AMP-measurements
Where N is the number of images in the database and R is the rank of the returned image. This is
calculated for each image and the results are averaged. A score of 100% indicates a perfect match.
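The AMP formula itself is missing from this copy of the paper. The sketch below assumes the widely used match-percentile definition, 100 × (N − R) / (N − 1), which is consistent with the statement that a score of 100% indicates a perfect match (R = 1); treat this formula and the function names as assumptions rather than the authors' exact definition.

```python
def match_percentile(rank, n_images):
    """Match percentile in % for one returned image (assumed definition).

    `rank` is R, the 1-based rank of the returned image, and `n_images`
    is N, the number of images in the database.
    """
    if n_images < 2 or not 1 <= rank <= n_images:
        raise ValueError("need n_images >= 2 and 1 <= rank <= n_images")
    return 100.0 * (n_images - rank) / (n_images - 1)


def average_match_percentile(ranks, n_images):
    """Average the match percentile over several images."""
    return sum(match_percentile(r, n_images) for r in ranks) / len(ranks)
```

Under this assumed definition, an image ranked first in a database of 20 scores 100%, and one ranked last scores 0%.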
Images for which the AMP result is above 60% are considered to be matched.
The top six images according to the AMP measurement are returned as the result. Secondly,
histogram comparison metrics using the RGB color model are applied to the matched
database images.
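The two-stage procedure described above (a cheap global-descriptor filter followed by histogram comparison on the surviving images) can be sketched as below. This is only an outline under stated assumptions: `prefilter_score` and `hist_distance` are caller-supplied placeholder functions, and the name `two_stage_search` is hypothetical; the 60% cut-off and the limit of six images are taken from the text.

```python
def two_stage_search(query, database, prefilter_score, hist_distance,
                     min_score=60.0, keep=6):
    """Filter by a cheap score, then rank the survivors by histogram distance.

    `database` maps image names to image records. `prefilter_score(query, img)`
    returns a percentage (higher is better); `hist_distance(query, img)`
    returns a distance (lower is better). Both are supplied by the caller.
    """
    # Stage 1: keep only images whose global-descriptor score exceeds the cut-off.
    scored = [(prefilter_score(query, img), name) for name, img in database.items()]
    matched = [(score, name) for score, name in scored if score > min_score]
    matched.sort(reverse=True)                       # best score first
    candidates = [name for _, name in matched[:keep]]
    # Stage 2: the more expensive histogram comparison runs on candidates only.
    return sorted(candidates, key=lambda name: hist_distance(query, database[name]))
```

The design intent is that the expensive histogram comparison is evaluated for at most `keep` images rather than for the whole database.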
B. Retrieval Efficiency
The retrieval efficiency measures, namely recall, precision and accuracy, were calculated for the 20
color images in the image database. Standard formulas were used to compute these parameters.
Total images   Relevant images   Total images   Relevant images   Precision   Recall   Accuracy
in database    in database       retrieved      retrieved         rate        rate     rate
20             10                9              6                 66.6%       60%      63.2%

Total images   Relevant images   Total images   Relevant images   Precision   Recall   Accuracy
in database    in database       retrieved      retrieved         rate        rate     rate
20             10                8              5                 62.5%       50%      56.25%

Total images   Relevant images   Total images   Relevant images   Precision   Recall   Accuracy
in database    in database       retrieved      retrieved         rate        rate     rate
20             10                8              7                 87.5%       70%      78.75%
Total images   Relevant images   Total images   Relevant images   Precision   Recall   Accuracy
in database    in database       retrieved      retrieved         rate        rate     rate
20             10                6              5                 83.3%       50%      66.6%
Table 8 : Precision And Recall Values In % For Average Value Of Image Descriptor Attributes
Figure 8: Comparative global descriptor attributes of the proposed CBIR system for various
retrieval efficiency measurement parameters.
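The rates in the tables above can be recomputed from the image counts. Precision and recall follow their standard definitions; the paper does not state its accuracy formula, so the sketch assumes accuracy is the mean of precision and recall, which agrees with the reported rows up to rounding. The function name `retrieval_rates` is hypothetical.

```python
def retrieval_rates(relevant_in_db, retrieved, relevant_retrieved):
    """Return (precision, recall, accuracy) in percent.

    precision = relevant retrieved / total retrieved
    recall    = relevant retrieved / relevant images in the database
    accuracy  = mean of precision and recall (assumed, see lead-in)
    """
    precision = 100.0 * relevant_retrieved / retrieved
    recall = 100.0 * relevant_retrieved / relevant_in_db
    accuracy = (precision + recall) / 2.0
    return precision, recall, accuracy
```

For instance, 7 relevant images among 8 retrieved, with 10 relevant images in the database, gives a precision of 87.5%, a recall of 70% and an accuracy of 78.75%.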
A comparison between the global descriptor attributes shows that the cross correlation function
achieves better retrieval results than all color descriptor attributes for all retrieval efficiency
measurement parameters. However recall rate is same for both E and  global descriptor
attributes. Precision and recall rate of both E and  is also better than & one. It was found that
the cross correlation function also works better than the result obtained by taking the average of
all image descriptor attributes. The precision rate of the average value is greater than that of all
color descriptor attributes, but its recall rate is the same as for &. From these comparison results,
we can see that the cross correlation function achieves the highest retrieval efficiency.
X. CONCLUSION
Computing feature vectors for the global descriptor takes much less time than computing a color
histogram. Hence, to enhance the efficiency of the retrieval system, a new CBIR technique is
developed in which the global descriptor attributes of all database images are measured first, and
the histogram-based search method in RGB color space is then applied only to the matched images. A
higher success rate in retrieving a target image is obtained.
The performance of the various image feature attributes is measured and compared. A comparison
between the global descriptor attributes, comprising the texture feature attribute “cross correlation”
and the color feature attributes “color expectancy, color variance, skewness”, shows that the cross
correlation function achieves better retrieval results than all color descriptor attributes for all
retrieval efficiency measurement parameters.
Histogram search characterizes an image by its color distribution, or histogram, but the drawback
of a global histogram representation is that information about object location, shape, and texture
is discarded. Thus this paper showed that, in some results, images retrieved using the global color
histogram may not be semantically related even though they share a similar color distribution.
This drawback is also reduced to some extent by calculating color feature attributes
along with an efficient implementation.
Authors
[1] Neetu Sharma is pursuing her postgraduate degree (M.Tech) in VLSI Design. Her areas of interest
include Image Processing, Multimedia and VLSI Technology.