A Study On Image Annotation Techniques
All content following this page was uploaded by Reena Pagare on 11 February 2016.
International Journal of Computer Applications (0975 – 8887)
Volume 37– No.6, January 2012
3.2 Manual Annotation
In manual annotation, users enter descriptive keywords when images are loaded, registered or browsed. Manual annotation of image content is considered the "best case" in terms of accuracy, since the keywords are selected based on human determination of the semantic content of the images. At the same time, it is an effort-intensive and monotonous process. Manual annotation also has a retrieval-time problem: after a long period of time, users can forget the annotations they have used.

3.3 Image Annotation Based On Ontology
Semantic Web technologies such as ontologies can be used to annotate images with semantic descriptions. An ontology is an abstract design that defines a collection of representative terms called concepts. Ontology-based semantic image annotation focuses on relating the contents of an image and tries to describe those contents as completely as possible. A three-layer architecture [10] for image annotation has been suggested: low-level features of images are extracted at the bottom layer; these features are then mapped to semantically significant keywords in the middle layer; and these keywords are then connected to schemas and ontologies at the top layer.

The keyword-based approach is user friendly and can easily be applied with satisfactory retrieval accuracy, while a semantically rich ontology addresses the need for complete descriptions in image retrieval and improves retrieval accuracy. An ontology performs well when low-level image features are combined with high-level textual information, because the visual information is effective at sorting out most of the imprecise results.

3.4 Semi Automatic Annotation
There is a wide semantic gap between the low-level visual features of an image and high-level human semantics, due to which the performance of traditional Content-Based Image Retrieval (CBIR) systems degrades. Semi-automatic image annotation therefore requires some form of user participation in the annotation process.

In [10], machine learning algorithms for user-supported image annotation are explained. A three-layer architecture is used for image annotation: visual information taken from the raw image contents forms the bottom layer; these contents are then mapped to semantically rich keywords at the middle layer; and the top layer maps the keywords to schemas (structures described in a formal language) and ontologies (formal, explicit descriptions of concepts). Machine learning together with user feedback makes it possible to reuse previously annotated images to increase the annotation rate for images from the same domain, giving consistent, cost-effective, fast and intelligent annotation of visual data. This approach uses an Intelligent Image Indexing Web Service (I3WS), which takes a raw image repository (along with some optional restrictions and parameters such as schemas, keywords, ontologies, etc.) as input and returns its annotated version as output.

In [11], to acquire information about the semantic meaning of an image in the form of keywords, the image is divided based on its contents, including objects together with their category, personality and action. The resulting semantic classification of the image, its semantic class, is treated as the root of a hierarchical description structure. A sequence of keywords is used to annotate the image, and the selection of a keyword depends on the occurrence of the corresponding concept in the image. A set of training examples is used as input, where each training example is described by its low-level features and the corresponding annotation of the image. This results in a template for annotating an image with a set of relevant keywords. To bridge the gap between low-level features and high-level semantics in retrieval systems, user participation is required in semi-automatic annotation: the user is expected to improve the results by supplying negative as well as positive examples and to revise the knowledge about image classes in the semantic space.

The semi-automatic annotation method of [12] combines the efficiency of automatic annotation with the accuracy of manual annotation. The user provides feedback while examining retrieval results. The method has three main parts: the query interface (a keyword query), the image browser and the relevance feedback interface. When a user submits a query, the search results are returned as a list of images ranked by relevance to the query. The images are displayed on the image browser in ranked order, where the user can view them. After browsing the images, the user can give feedback through the relevance feedback interface. The system then returns refined retrieval results based on the user's feedback and presents them in the browser. This method is particularly suitable for a dynamic database system in which new images are constantly being introduced.

3.5 Automatic Image Annotation
In the automatic image annotation method of [13], image segmentation algorithms are used to divide the images into a number of irregularly shaped "blob" regions and to work on these blobs. The method uses "global" features for automated image annotation. Its modeling framework is based on nonparametric density estimation using the technique of "kernel smoothing". A word is selected for annotating an image with some probability, and this probability can be expressed in terms of the probability density of the image x and the density of x conditional on the assignment of annotation w.

In the approach of [14], a training set of images is used for automatically annotating images. A vocabulary of blobs is used to describe the regions of an image, so an image can be seen as a collection of blobs. Using the training set of images with annotated keywords, the task is to predict the probability of deriving a label from the blobs in an image. For each image there is a probability distribution called the relevance model of the image. This relevance model can be treated as a container holding all the possible blobs that exist in the image, together with the keywords that exist for it. With the help of the training set of images with annotated labels, the probability of producing a tag given the blobs in an image can be estimated.

The method of [15] uses word-to-word correlation, because image features alone are sometimes inadequate for establishing the corresponding word annotation. To integrate the word-to-word correlation, the method approximates the probability of annotating an image with a set of words, using a language model to produce the annotation words for the image. This model contains a set of word probabilities, where each probability expresses how likely the particular word is to be used for annotation. An advantage of this approach is that it automatically determines the annotation length for a given image, which in turn enhances the precision of image retrieval.

The approach of [16] improves the existing annotations of images, i.e. it refines the conditional probability so that more accurate annotations receive higher probabilities. In effect, the annotations with the highest probabilities will be kept as the final annotations.
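The kernel-smoothing idea described for [13] can be sketched in a few lines: p(x | w) is estimated by averaging Gaussian kernels centred on the training images annotated with w, and candidate words are ranked by p(x | w)·p(w) via Bayes' rule. The feature vectors, vocabulary and bandwidth below are invented for illustration and are far simpler than the global image features used in [13].

```python
import math

# Toy training set of (global feature vector, keyword set) pairs.
# Features, words and the bandwidth h are invented for illustration.
TRAIN = [
    ([0.9, 0.1], {"sky"}),
    ([0.8, 0.2], {"sky", "sea"}),
    ([0.2, 0.9], {"grass"}),
    ([0.1, 0.8], {"grass", "tree"}),
]
VOCAB = {"sky", "sea", "grass", "tree"}

def gaussian_kernel(x, xi, h=0.3):
    """Gaussian kernel between feature vectors x and xi."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, xi))
    return math.exp(-d2 / (2 * h * h))

def p_x_given_w(x, w):
    """Kernel-smoothed density estimate p(x | w) over images annotated with w."""
    support = [xi for xi, words in TRAIN if w in words]
    if not support:
        return 0.0
    return sum(gaussian_kernel(x, xi) for xi in support) / len(support)

def rank_words(x):
    """Rank vocabulary words by p(x | w) * p(w) (Bayes' rule, unnormalised)."""
    score = lambda w: p_x_given_w(x, w) * sum(w in ws for _, ws in TRAIN) / len(TRAIN)
    return sorted(VOCAB, key=score, reverse=True)

print(rank_words([0.85, 0.15]))  # "sky" ranks first for this sky-like vector
```

Real implementations would use high-dimensional colour and texture features and a bandwidth chosen on held-out data; the ranking step is otherwise the same.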
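The relevance-model idea behind [14] — estimating the joint probability of a word and the blobs of an unannotated image by averaging, over the training images, the smoothed probabilities of generating that word and those blobs — can be sketched as follows. The blob identifiers, annotations and smoothing constant are invented for illustration; [14] derives the blob vocabulary by segmenting and clustering image regions.

```python
from collections import Counter

# Toy training set: each image is (blob identifiers, annotation words).
TRAIN = [
    (["b1", "b1", "b2"], ["tiger", "grass"]),
    (["b2", "b3"], ["grass", "sky"]),
    (["b3", "b3", "b1"], ["sky", "water"]),
]
BLOBS = {"b1", "b2", "b3"}
WORDS = {"tiger", "grass", "sky", "water"}

def p_given_image(item, items, universe, smoothing=0.1):
    """Smoothed probability of generating one blob/word from a training image."""
    counts = Counter(items)
    return (counts[item] + smoothing) / (len(items) + smoothing * len(universe))

def joint(word, query_blobs):
    """P(w, b1..bm): average over training images of generating w and the blobs."""
    total = 0.0
    for blobs, words in TRAIN:
        p = p_given_image(word, words, WORDS)
        for b in query_blobs:
            p *= p_given_image(b, blobs, BLOBS)
        total += p / len(TRAIN)
    return total

query = ["b1", "b2"]  # blobs of a new, unannotated image
ranked = sorted(WORDS, key=lambda w: joint(w, query), reverse=True)
print(ranked)
```

Words that co-occur in training with the query's blobs ("tiger", "grass") score highest, which is exactly the relevance-model behaviour the text describes.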
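The word-to-word correlation idea attributed to [15] can be loosely sketched as below: image-based word scores are combined with pairwise word correlations, and annotation stops once no remaining word scores above a threshold, so the annotation length adapts to the image. The greedy selection rule, scores, correlations and threshold are all invented for illustration; the actual method in [15] is a coherent language model trained with active learning.

```python
# Illustrative image-based word scores and word-to-word correlations
# (all values invented; [15] estimates them from a training corpus).
image_score = {"tiger": 0.5, "grass": 0.3, "cat": 0.4, "sky": 0.2}
correlation = {
    ("tiger", "grass"): 0.8, ("tiger", "cat"): 0.1, ("tiger", "sky"): 0.3,
    ("grass", "sky"): 0.6, ("grass", "cat"): 0.1, ("cat", "sky"): 0.2,
}

def corr(a, b):
    return correlation.get((a, b)) or correlation.get((b, a), 0.0)

def annotate(threshold=0.35):
    """Greedily grow a coherent word set; stop when the best candidate's
    combined (image + correlation) score falls below the threshold, so the
    number of annotation words depends on the image itself."""
    chosen = []
    candidates = set(image_score)
    while candidates:
        def combined(w):
            if not chosen:
                return image_score[w]
            coherence = sum(corr(w, c) for c in chosen) / len(chosen)
            return 0.5 * image_score[w] + 0.5 * coherence
        best = max(candidates, key=combined)
        if combined(best) < threshold:
            break
        chosen.append(best)
        candidates.remove(best)
    return chosen

print(annotate())  # → ['tiger', 'grass']
```

Note how "cat" is dropped despite a decent image score: it correlates poorly with the already-chosen words, which is the coherence effect the text describes.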
4.4 Annotation Process
Input on a mobile device is limited, so the functions of the annotation process are kept compact, and the annotation algorithm needs to be as simple as possible because of the restricted computing ability of a mobile phone. An image is annotated in the following steps:

1) Metadata Analysis: The metadata is acquired from the image (EXIF), including the time, GPS and artist fields.

2) Getting Personal Context: The time and GPS information is used to find valuable information in personal utilities such as the calendar, contacts and email, and the results are presented as annotation suggestions.

3) Tagging: Images are annotated based on these suggestions, and other tags (such as emotion) are added manually if required.

4) Uploading: The photo and the annotations are uploaded to the server. The server stores the photo and the annotations and generates multidimensional indices for the photos.

When a photo is selected for upload, the system reads the date/time, GPS and photographer (artist) fields from the EXIF segment of the photo file. The event scheduled at that time in the calendar is then looked up through an API (application programming interface), and the search results are listed as event annotation suggestions.

After the photo is uploaded, the server obtains the GPS and other metadata and annotates the upload time and the owner of the photo. The server stores the photo and creates thumbnails for it. Along with this, it also generates multidimensional indices for the photo, which are used to retrieve the image:

1) Time Index
2) User Index
3) Emotion Index
4) Location Index
5) Relevance Index
6) Event Index

Fig 2: Annotation Process

5. CONCLUSION
Manual annotation is costly and time-consuming work, especially on a mobile device. We have discussed methods of automatic and semi-automatic image annotation. Semi-automatic annotation performs better than the other annotation techniques in terms of accuracy because the user participates in the annotation process. A semi-automatic way to add annotations to images by using the contextual information of the mobile device has also been discussed. If a machine learning mechanism is integrated into the annotation system, it can make the annotation method more intelligent and precise.

6. REFERENCES
[1] W. Liu, X. Li, D. Huang, "A Survey on Context Awareness", Proc. of the International Conference on Computer Science and Service System (CSSS), IEEE, June 2011.
[2] B. Shevade, H. Sundaram, L. Xie, "Modeling Personal and Social Network Context for Event Annotation in Images", Proc. of JCDL, ACM Press, 2007.
[3] S. Xia, X. Gong, W. Wang, Y. Tian, "Context-Aware Image Annotation and Retrieval on Mobile Device", IEEE, 2010.
[4] L. Cao, J. Luo, H. Kautz, T. S. Huang, "Image Annotation within the Context of Personal Photo Collections Using Hierarchical Event and Scene Models", IEEE Multimedia, 11(2), 208-219, 2009.
[5] W. Viana, J. B. Filho, J. Gensel, M. Villanova-Oliver, H. Martin, "PhotoMap: From Location and Time to Context-Aware Photo Annotations", Journal of Location Based Services, 2(3), 211-235, 2008.
[6] M. Ames, M. Naaman, "Why We Tag: Motivations for Annotation", Proc. CHI, ACM Press, 971-980, 2007.
[7] U. Westermann, R. Jain, "Toward a Common Event Model for Multimedia Applications", IEEE Multimedia, 14(1), 19-29, 2007.
[8] M. Davis, N. V. House, J. Towle, S. King, S. Ahern, C. Burgener, Perkel, M. Finn, V. Viswanathan, M. Rothenberg, "MMM2: Mobile Media Metadata for Media Sharing", Ext. Abstracts CHI, ACM Press, 1335-1338, 2005.
[9] N. K. Alham, M. Li, S. Hammoud, H. Qi, "Evaluating Machine Learning Techniques for Automatic Image Annotations", IEEE, 2009.
[10] O. Marques, N. Barman, "Semi-Automatic Semantic Annotation of Images Using Machine Learning Techniques", Proc. of ISWC, pp. 550-565, 2003.
[11] J. Vompras, S. Conrad, "A Semi-Automated Framework for Supporting Semantic Image Annotation", Proc. of ISWC, pp. 105-109, 2005.
[12] L. Wenyin, S. Dumais, Y. Sun, H. Zhang, M. Czerwinski, B. Field, "Semi-Automatic Image Annotation", Proc. of INTERACT, pp. 326-333, 2001.
[13] A. Yavlinsky, E. Schofield, S. M. Rüger, "Automated Image Annotation Using Global Features and Robust Nonparametric Density Estimation", Proc. of CIVR, pp. 507-517, 2005.
[14] J. Jeon, V. Lavrenko, R. Manmatha, "Automatic Image Annotation and Retrieval Using Cross-Media Relevance Models", Proc. of ACM SIGIR, pp. 119-126, 2003.
[15] R. Jin, J. Chai, L. Si, "Effective Automatic Image Annotation via a Coherent Language Model and Active Learning", Proc. of ACM Multimedia, pp. 892-899, 2004.
[16] C. Wang, F. Jing, L. Zhang, H. Zhang, "Content-Based Image Annotation Refinement", Proc. of CVPR, 2007.