
A survey on Content Based Image Retrieval using Color, Texture and Shape

Jasmine Shiny. D
Assistant Professor, Department of Information Technology, CMS College of Engineering, Namakkal, Tamilnadu, India. e-mail: [email protected]

Faisal E.K
Project Engineer, Wipro Technologies, Cochin, Kerala. e-mail: [email protected]

Santhosh. M
Assistant Professor, Department of Information Technology, CMS College of Engineering, Namakkal, Tamilnadu, India. e-mail: [email protected]

Abstract— Content Based Image Retrieval (CBIR) systems receive an image, or a description of an image, as input and retrieve images from a database that are similar to the query. In this paper we survey technical aspects of several current CBIR systems [1], [8], [9], [10], [11], [12]. The purpose of this survey is to provide an overview of the functionality of contemporary image retrieval systems in terms of technical aspects: querying, relevance feedback, features, matching measures, indexing data structures, and result presentation. We discuss Content Based Image Retrieval systems that use features such as color, texture and shape.

Keywords— Content Based Image Retrieval, Gradient Vector Flow (GVF), Gabor Filter, Euclidean distance

I. INTRODUCTION

Content based image retrieval (CBIR) is a technique for retrieving images on the basis of automatically derived features such as color, texture and shape. CBIR operates on a totally different principle from keyword indexing: primitive features characterizing image content, such as color, texture, and shape, are computed for both stored and query images, and used to identify the stored images most closely matching the query. The most challenging aspect of CBIR is to bridge the gap between low-level feature layout and high-level semantic concepts. "Content-based" means that the search analyzes the actual contents of the image rather than metadata such as keywords, tags, or descriptions associated with the image. The term 'content' in this context might refer to colors, shapes, textures, or any other information that can be derived from the image itself. CBIR is desirable because most web-based image search engines rely purely on metadata, which produces many irrelevant results. Also, having humans manually enter keywords for images in a large database is inefficient and expensive, and may not capture every keyword that describes the image. A system that can filter images based on their content would therefore provide better indexing and return more accurate results. CBIR differs from classical information retrieval in that image databases are essentially unstructured, since digitized images consist purely of arrays of pixel intensities, with no inherent meaning. One of the key issues with any kind of image processing is the need to extract useful information from the raw data before any kind of reasoning about the image's contents is possible. Image databases thus differ fundamentally from text databases, where the raw material has already been logically structured. The objective of this paper is to provide an overview of the functionality of contemporary image retrieval systems in terms of technical aspects: querying, relevance feedback, features, matching measures, indexing data structures, and result presentation.

II. PRINCIPLE OF CBIR

Content-based image retrieval, also known as query by image content and content-based visual information retrieval, is the application of computer vision to the image retrieval problem, that is, the problem of searching for digital images in large databases. Content-based means that the search makes use of the contents of the images themselves, rather than relying on human-input metadata such as captions or keywords. A content-based image retrieval (CBIR) system is a piece of software that implements CBIR. In CBIR, each image stored in the database has its features extracted and compared to the features of the query image. This involves two steps. Feature extraction: the first step is to extract the image features to a distinguishable extent. Matching: the second step is to match these features to yield a result that is visually similar.

III. SYSTEMS
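The two steps above can be sketched with a minimal pipeline. The choice of a global color histogram as the feature and Euclidean distance as the matching measure is an illustrative assumption here, not the design of any one surveyed system:

```python
import numpy as np

def extract_features(image, bins=8):
    """Feature extraction: a normalized per-channel color histogram,
    a simple global color descriptor for an RGB image."""
    hists = [np.histogram(image[..., c], bins=bins, range=(0, 256))[0]
             for c in range(3)]
    h = np.concatenate(hists).astype(float)
    return h / h.sum()

def retrieve(query, database, top_k=3):
    """Matching: rank stored images by Euclidean distance between
    their feature vectors and the query's feature vector."""
    qf = extract_features(query)
    scored = sorted((np.linalg.norm(qf - extract_features(img)), i)
                    for i, img in enumerate(database))
    return [i for _, i in scored[:top_k]]
```

Querying with an image that is already in the database returns that image first, since its distance to itself is zero.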

Below we describe a number of content-based image retrieval systems.

A. Blobworld: Image Segmentation Using Expectation-Maximization and Its Application to Image Querying [8]

Retrieving images from large and varied collections using image content as a key is a challenging and important problem. Blobworld presents a new framework for image retrieval based on segmentation into regions and querying using properties of these regions. The regions generally correspond to objects or parts of objects. While Blobworld does not exist completely in the "thing" domain, it recognizes the nature of images as combinations of objects, and querying in Blobworld is more meaningful than it is with simple "stuff" representations. Blobworld presents a new image representation that provides a transformation from the raw pixel data to a small set of image regions that are coherent in color and texture. This "Blobworld" representation is created by clustering pixels in a joint color-texture-position feature space. The segmentation algorithm is fully automatic and has been run on a collection of 10,000 natural images. The system uses the Blobworld representation to retrieve images from this collection. An important aspect of the system is that the user is allowed to view the internal representation of the submitted image and of the query results. Similar systems do not offer the user this view into the workings of the system; consequently, query results from those systems can be inexplicable, despite the availability of knobs for adjusting the similarity metrics. By finding image regions that roughly correspond to objects, Blobworld allows querying at the level of objects rather than global image properties. In order to segment each image automatically, Blobworld models the joint distribution of color, texture, and position features with a mixture of Gaussians, and uses the Expectation-Maximization (EM) algorithm to estimate the parameters of this model; the resulting pixel-cluster memberships provide a segmentation of the image. After the image is segmented into regions, a description of each region's color and texture characteristics is produced. In a querying task, the user can access the regions directly, in order to see the segmentation of the query image and specify which aspects of the image are important to the query.
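The EM segmentation step can be sketched as follows. This is a simplified, pure-NumPy mixture of Gaussians over color and position only (Blobworld's full feature space also includes texture); the diagonal covariances and the deterministic initialization are assumptions made to keep the sketch short:

```python
import numpy as np

def segment_em(image, n_regions=2, n_iter=50):
    """Cluster pixels in a joint color-position feature space with a
    mixture of Gaussians fitted by EM; the per-pixel cluster
    memberships give the segmentation."""
    h, w, _ = image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Per-pixel feature (R, G, B, x, y), each scaled to roughly [0, 1].
    X = np.column_stack([image.reshape(-1, 3) / 255.0,
                         xs.ravel() / w, ys.ravel() / h])
    n, d = X.shape
    mu = X[np.linspace(0, n - 1, n_regions).astype(int)]  # spread-out init
    var = np.ones((n_regions, d))
    pi = np.full(n_regions, 1.0 / n_regions)
    for _ in range(n_iter):
        # E-step: posterior membership of each pixel in each Gaussian
        # (diagonal covariance; the shared 2*pi constant cancels).
        log_p = (-0.5 * (((X[:, None, :] - mu) ** 2 / var).sum(-1)
                         + np.log(var).sum(-1)) + np.log(pi))
        log_p -= log_p.max(axis=1, keepdims=True)
        r = np.exp(log_p)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate mixture weights, means, and variances.
        nk = r.sum(axis=0)
        pi = nk / n
        mu = (r.T @ X) / nk[:, None]
        var = (r.T @ X ** 2) / nk[:, None] - mu ** 2 + 1e-6
    return r.argmax(axis=1).reshape(h, w)
```

On an image made of two differently colored halves, the memberships converge so that each half forms one coherent region.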
When query results are returned, the user also sees the Blobworld representation of each retrieved image; this information assists greatly in refining the query. Result presentation: the retrieved images are ranked in linear order and presented together with the segmented version showing the regions. Advantages: the user is allowed to view the internal representation of the submitted image and the query results; querying is at the level of objects rather than global image properties; and Blobworld produces higher precision than querying using color and texture histograms. Disadvantages: the segmentation algorithm will sometimes oversegment objects; for example, a zebra may be oversegmented into trunk, legs, and head. The current features also do not encode all the important information about a blob.

B. A Region-Based Fuzzy Feature Matching Approach to Content-Based Image Retrieval [9]

In this scheme, an image is represented as a set of segmented regions. Regions are characterized by fuzzy features (fuzzy sets) reflecting color, texture, and shape properties; an image is thus a family of fuzzy features corresponding to its regions. Fuzzy features naturally characterize the gradual transition between regions (blurry boundaries) within an image and incorporate

the segmentation-related uncertainties into the retrieval algorithm. The resemblance of two images is then defined as the overall similarity between two families of fuzzy features, quantified by a similarity measure, the UFM measure, which integrates properties of all the regions in the images. Compared with similarity measures based on individual regions, or on all regions with crisp-valued feature representations, the UFM measure greatly reduces the influence of inaccurate segmentation and provides a very intuitive quantification. UFM has been implemented as part of the SIMPLIcity image retrieval system, and the performance of the system is illustrated using examples from an image database of about 60,000 general-purpose images. The system segments images based on color and spatial variation features using a k-means algorithm, a very fast statistical clustering method. To segment an image, the system first partitions the image into small blocks and extracts a feature vector for each block. In the UFM scheme, an image is first segmented into regions; each region is then represented by a fuzzy feature determined by a center location and a width. A direct consequence of the fuzzy feature representation is region-level similarity. Advantages: the scheme provides good accuracy and robustness to segmentation errors and image alteration, and a region is allowed to be matched with several regions in case of inaccurate segmentation. Disadvantages: objects that are totally different in semantics may be clustered into the same region, and although the system utilizes shape and size information, it is not fully exploited; all fuzzy features within one image have the same shape.

C. Snakes, Shapes, and Gradient Vector Flow [10]

Snakes, or active contours, are used extensively in computer vision and image processing applications, particularly to locate object boundaries.
An active contour model is a framework for finding an object outline in a possibly noisy 2D image. This paper presents a new class of external forces for active contour models. These fields, called gradient vector flow (GVF) fields, are dense vector fields derived from images by minimizing a certain energy functional in a variational framework. The minimization is achieved by solving a pair of decoupled linear partial differential equations that diffuse the gradient vectors of a gray-level or binary edge map computed from the image. The active contour that uses the GVF field as its external force is called a GVF snake. The GVF snake is distinguished from nearly all previous snake formulations in that its external forces cannot be written as the negative gradient of a potential function. Because of this, it cannot be formulated using the standard energy minimization framework; instead, it is specified directly from a force balance condition. Advantages: the scheme helps detect object boundaries using the gradient vector flow (GVF) field, which is well suited to image segmentation.
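The diffusion of gradient vectors described above can be sketched numerically. The following is a minimal sketch assuming an explicit-Euler iteration of the decoupled diffusion equations with periodic boundaries via np.roll; the step count and regularization value are illustrative, not those of the paper:

```python
import numpy as np

def gvf(edge_map, mu=0.2, n_iter=200):
    """Iterate the decoupled GVF diffusion equations
        du/dt = mu * lap(u) - (u - fx) * (fx^2 + fy^2)
        dv/dt = mu * lap(v) - (v - fy) * (fx^2 + fy^2)
    starting from the gradient (fx, fy) of the edge map f."""
    f = edge_map.astype(float)
    fy, fx = np.gradient(f)          # np.gradient returns (d/drow, d/dcol)
    mag2 = fx ** 2 + fy ** 2         # squared gradient magnitude
    u, v = fx.copy(), fy.copy()

    def laplacian(a):                # 5-point stencil, periodic boundaries
        return (np.roll(a, 1, 0) + np.roll(a, -1, 0)
                + np.roll(a, 1, 1) + np.roll(a, -1, 1) - 4 * a)

    for _ in range(n_iter):          # explicit Euler time stepping
        u = u + mu * laplacian(u) - (u - fx) * mag2
        v = v + mu * laplacian(v) - (v - fy) * mag2
    return u, v
```

Near strong edges the field stays close to the edge-map gradient, while in homogeneous regions, where the plain gradient is zero, the diffusion extends the field; this extended reach is what gives the GVF snake its larger capture range.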

Disadvantages: the scheme focuses only on the shape of an object, and it requires longer computation time for higher accuracy.

D. PicToSeek: Combining Color and Shape Invariant Features for Image Retrieval [11]

PicToSeek aims at combining color and shape invariants for indexing and retrieving images. To this end, color models are proposed that are independent of the object geometry, object pose, and illumination. From these color models, color invariant edges are derived, from which shape invariant features are computed. Computational methods are described to combine the color and shape invariants into a unified high-dimensional invariant feature set for discriminatory object retrieval. The basic idea of image retrieval by image example is to extract characteristic features from target images, which are then matched with those of the query image. These features are typically derived from shape, texture, or color properties of the query and target images. After matching, images are ordered with respect to the query image according to their similarity measure and displayed for viewing. Result presentation: the retrieved images are shown without explicit order. Advantages: the retrieval scheme is highly robust to partial occlusion, object clutter and changes in viewing position. Disadvantages: the scheme depends only on color and shape.

E. Texture Features for Browsing and Retrieval of Image Data [12]

The focus of this paper is on the image processing aspects, and in particular on using texture information for browsing and retrieval in large image collections. It proposes the use of Gabor wavelet features for texture analysis and provides a comprehensive experimental evaluation. Comparisons with other multiresolution texture features using the Brodatz texture database indicate that the Gabor features provide the best pattern retrieval accuracy.
The objective of this paper is to study the use of texture as an image feature for pattern retrieval. An image can be considered a mosaic of different texture regions, and the image features associated with these regions can be used for search and retrieval. A typical query could be a region of interest provided by the user, such as an outlined vegetation patch in a satellite image. The input information in such cases is an intensity pattern or texture within a rectangular window. Texture analysis has a long history, and texture analysis algorithms range from random field models to multiresolution filtering techniques such as the wavelet transform. Several researchers have considered the use of such texture features for pattern retrieval. This paper focuses on a multiresolution representation based on Gabor filters; the use of Gabor filters in extracting texture features is motivated by various factors.
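As a rough illustration of Gabor-based texture features (the exact filter-bank design of Manjunath and Ma, with its specific scales and orientations, is not reproduced here; the kernel parameters, frequencies, and statistics below are illustrative assumptions):

```python
import numpy as np

def convolve2d_same(img, kern):
    """'Same'-size 2D linear convolution via FFT zero-padding."""
    h = img.shape[0] + kern.shape[0] - 1
    w = img.shape[1] + kern.shape[1] - 1
    spec = np.fft.fft2(img, (h, w)) * np.fft.fft2(kern, (h, w))
    full = np.fft.ifft2(spec)
    r0, c0 = (kern.shape[0] - 1) // 2, (kern.shape[1] - 1) // 2
    return full[r0:r0 + img.shape[0], c0:c0 + img.shape[1]]

def gabor_kernel(freq, theta, sigma=3.0, size=15):
    """A complex sinusoid at spatial frequency `freq`, oriented at
    `theta`, windowed by an isotropic Gaussian envelope."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)   # rotated coordinate
    envelope = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))
    return envelope * np.exp(2j * np.pi * freq * xr)

def texture_features(image, freqs=(0.1, 0.2), n_orient=4):
    """Texture descriptor: mean and standard deviation of the filter
    response magnitude for each (frequency, orientation) pair."""
    feats = []
    for f in freqs:
        for k in range(n_orient):
            kern = gabor_kernel(f, np.pi * k / n_orient)
            resp = np.abs(convolve2d_same(image, kern))
            feats += [resp.mean(), resp.std()]
    return np.array(feats)
```

A filter whose oscillation direction matches the dominant variation of the texture responds much more strongly than one oriented across it, which is what makes the descriptor discriminative.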

Advantages: an adaptive filter selection algorithm is proposed which can facilitate fast image browsing. Disadvantages: the scheme depends only on the texture feature.

F. Content Based Image Retrieval using Color, Texture and Shape [1]

In this method, an image is partitioned into 24 (4 x 6 or 6 x 4) non-overlapping tiles as shown in Fig. 1. These tiles serve as local color and texture descriptors for the image. Gabor features are used for texture similarity. With the Corel dataset used for experimentation and 6 x 4 (or 4 x 6) partitioning, the size of an individual tile is 64 x 64. Choosing tiles smaller than 64 x 64 degrades performance, and most texture analysis techniques make use of 64 x 64 blocks. This tiling structure is extended to a second-level decomposition of the image.

Fig. 1. System overview.

At the second level the image is decomposed to size M/2 x N/2, where M and N are the number of rows and columns in the original image respectively. With a 64 x 64 tile size, the number of tiles resulting at this resolution is 6, as shown in Fig. 1. This allows different image information to be captured across resolutions. For robustness, the tile features resulting from the same grid structure (i.e. 24 tiles at resolution 2 and 6 tiles at resolution 1) are also included, as shown in Fig. 1. Going beyond the second level of decomposition added no significant information, so a two-level structure is used. The matching of images at different resolutions is done independently, as shown in Fig. 1. Since at any given level of decomposition the number of tiles is the same for all images (either 24 at the first level or 6 at the second), all tiles have equal significance. A tile from the query image is allowed to be matched to any tile in the target image; however, a tile may participate in the matching process only once. A bipartite graph of tiles for the query image and the target image is built, as shown in Fig. 2, with the labeled edges of the graph indicating the distances between tiles, and a minimum cost matching is found for this graph. Since this process involves many comparisons, the method has to be implemented efficiently. To this effect, an algorithm has been designed for finding the minimum cost matching based on the most similar highest priority (MSHP) principle, using the adjacency matrix of the bipartite graph.
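The MSHP matching can be sketched directly on the adjacency (distance) matrix; here np.inf plays the role of the blocking "high value", and the distances used in the usage example below are assumed values for illustration:

```python
import numpy as np

def mshp_match(dist):
    """Greedy minimum-cost matching on the Most Similar Highest
    Priority (MSHP) principle: repeatedly pick the globally smallest
    remaining entry of the distance matrix, then block its row and
    column so each tile is matched at most once."""
    d = np.asarray(dist, dtype=float).copy()
    total, pairs = 0.0, []
    for _ in range(d.shape[0]):
        i, j = np.unravel_index(np.argmin(d), d.shape)
        total += d[i, j]
        pairs.append((int(i), int(j)))
        d[i, :] = np.inf    # block row i (tile i of the query is used)
        d[:, j] = np.inf    # block column j (tile j of the target is used)
    return total, pairs
```

With four tiles and suitably chosen distances, the greedy order picks the smallest entry first, then the smallest among the unblocked rows and columns, and so on, accumulating the integrated minimum cost match distance.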

Fig. 2. Bipartite graph showing 4 tiles of both the images.

Here the distance matrix is computed as an adjacency matrix. The minimum distance dij of this matrix is found between tile i of the query and tile j of the target. The distance is recorded, and the row corresponding to tile i and the column corresponding to tile j are blocked (replaced by some high value, say 999). This prevents tile i of the query image and tile j of the target image from further participating in the matching process. The distances between i and the other tiles of the target image, and between j and the other tiles of the query image, are ignored (because every tile is allowed to participate in the matching process only once). This process is repeated until every tile finds a match. The process is demonstrated in Fig. 3 using an example with 4 tiles.

D = 1.67 + 2.56 + 4.78 + 25.33 = 34.34

Fig. 3. Image similarity computation based on MSHP principle, (a) first pair of matched tiles i=2, j=1 (b) second pair of matched tiles i=1, j=2 (c) third pair of matched tiles i=3, j=4 (d) fourth pair of matched tiles i=4, j=3, yielding the integrated minimum cost match distance 34.34.

The complexity of this matching procedure is reduced from O(n^2) to O(n), where n is the number of tiles involved. The integrated minimum cost match distance between images is now defined as

Dqt = sum of dij over all matched tile pairs (i, j)   (2)

where dij is the best-match distance between tile i of query image q and tile j of target image t, and Dqt is the distance between images q and t.

Shape information is captured in terms of the edge image of the gray-scale equivalent of every image in the database, and gradient vector flow (GVF) fields are used to obtain the edge image. GVF is a static external force used in the active contour method, computed as a diffusion of the gradient vectors of a gray-level or binary edge map derived from the image. It differs fundamentally from traditional snake external forces in that it cannot be written as the negative gradient of a potential function, and the corresponding snake is formulated directly from a force balance condition rather than a variational formulation. The GVF uses a force balance condition given by

Fint + Fext(p) = 0   (3)

where Fint is the internal force and Fext(p) is the external force. The external force field Fext(p) = V(x, y) is referred to as the GVF field. The GVF field V(x, y) = [u(x, y), v(x, y)] is the vector field that minimizes the energy functional

E = integral over the image of mu*(ux^2 + uy^2 + vx^2 + vy^2) + |grad f|^2 * |V - grad f|^2 dx dy   (4)

where f is the edge map. This variational formulation follows the standard principle of making the result smooth where there is no data. In particular, when |grad f| is small, the energy is dominated by the sum of squares of the partial derivatives of the vector field, yielding a slowly varying field. On the other hand, when |grad f| is large, the second term dominates the integrand and is minimized by setting V = grad f. This produces the desired effect of keeping V nearly equal to the gradient of the edge map where that gradient is large, while forcing the field to vary slowly in homogeneous regions. The parameter mu is a regularization parameter governing the trade-off between the two terms of the integrand. Using this method, the edge features, color and texture of the images are computed, and images are retrieved according to the features of the query image.

IV. CONCLUSION

We have discussed Content Based Image Retrieval systems that use color, texture and shape features. Most systems are products of research, and therefore emphasize one aspect of content-based retrieval; some provide a user interface that allows more powerful query formulation than is exposed in the demo system. Most systems use color and texture features, few use a shape feature, and we have discussed a system that uses color, texture and shape features together. A combination of color, texture and shape features provides a robust feature set for image retrieval.

REFERENCES

[1] P. S. Hiremath and Jagadeesh Pujari, "Content Based Image Retrieval using Color, Texture and Shape features," 15th IEEE International Conference on Advanced Computing and Communications (ADCOM 2007), pp. 780-784.
[2] A. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain, "Content-based image retrieval at the end of the early years," IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 12, pp. 1349-1380, 2000.
[3] John P. Eakins and Margaret E. Graham, "Content-based image retrieval," report to the JISC Technology Applications Programme, Institute for Image Database Research, University of Northumbria at Newcastle, U.K., January 1999.
[4] Hideyuki Tamura and Naokazu Yokoya, "Image Database Systems: A Survey," Pattern Recognition, vol. 17, no. 1, pp. 29-49, 1984.
[5] Fuhui Long, Hongjiang Zhang and David Dagan Feng, "Fundamentals of content-based image retrieval," Microsoft Corporation research articles, 2003.
[6] Tobias Weyand and Thomas Deselaers, "Combining Content-based Image Retrieval with Textual Information Retrieval," Department of Computer Science, RWTH Aachen, October 2005.
[7] Remco C. Veltkamp and Mirela Tanase, "Content-based image retrieval systems: a survey," Technical report, Department of Computer Science, Utrecht University, October 2000.
[8] C. Carson, S. Belongie, H. Greenspan, and J. Malik, "Blobworld: Image Segmentation Using Expectation-Maximization and Its Application to Image Querying," IEEE Trans. on PAMI, vol. 24, no. 8, pp. 1026-1038, 2002.
[9] Y. Chen and J. Z. Wang, "A Region-Based Fuzzy Feature Matching Approach to Content-Based Image Retrieval," IEEE Trans. on PAMI, vol. 24, no. 9, pp. 1252-1267, 2002.
[10] Chenyang Xu and Jerry L. Prince, "Snakes, Shapes, and Gradient Vector Flow," IEEE Transactions on Image Processing, vol. 7, no. 3, pp. 359-369, March 1998.
[11] T. Gevers and A. W. M. Smeulders, "PicToSeek: Combining Color and Shape Invariant Features for Image Retrieval," IEEE Transactions on Image Processing, vol. 9, no. 1, January 2000.
[12] B. S. Manjunath and W. Y. Ma, "Texture Features for Browsing and Retrieval of Image Data," IEEE Transactions on PAMI, vol. 18, no. 8, August 1996.
[13] S. Nandagopalan, B. S. Adiga, and N. Deepak, "A Universal Model for Content-Based Image Retrieval," International Journal of Computer Science, vol. 4, no. 4, 2009.
[14] D. Lowe, "Distinctive image features from scale invariant keypoints," International Journal of Computer Vision, vol. 2, no. 6, pp. 91-110, 2004.
[15] K. Mikolajczyk and C. Schmid, "Scale and affine invariant interest point detectors," International Journal of Computer Vision, vol. 1, no. 60, pp. 63-86, 2004.
[16] Etienne Loupias and Nicu Sebe, "Wavelet-based salient points: Applications to image retrieval using color and texture features," in Advances in Visual Information Systems, Proceedings of the 4th International Conference, VISUAL 2000, pp. 223-232, 2000.
[17] M. Banerjee, M. K. Kundu and P. K. Das, "Image Retrieval with Visually Prominent Features using Fuzzy Set Theoretic Evaluation," ICVGIP 2004, India, Dec 2004.
[18] Yong Rui, Thomas S. Huang, and Shih-Fu Chang, "Image Retrieval: Current Techniques, Promising Directions and Open Issues," 2003.
[19] https://wang.ist.psu.edu/
