Generation of Stratified Image Database With Web Image Sharing Service and Ontology
Ryoya Fujimoto, Ryosuke Yamanishi, Yuji Iwahori, Kohichi Toshioka and Junichi Fukumoto
Dept. of Computer Science
Graduate School of Engineering, Chubu University
1200 Matsumoto-cho, Kasugai, 487-8501 Japan
Email: [email protected], {iwahori, toshioka}@cs.chubu.ac.jp
Dept. of Media Technology
College of Information Science and Engineering, Ritsumeikan University
1-1-1 Nojihigashi, Kusatsu, 525-8500 Japan
Email: [email protected]
Abstract-Preparing a training image dataset for a generic object recognition system is costly. Some datasets have recently been created from Web image sharing services such as Flickr, but few of them exploit semantic classification or the information in the tags given to photos. This paper proposes a method that automatically generates an image dataset by introducing the semantics of each label given to the images, where the images are positioned in a semantic stratification according to the meaning of their labels. Here the semantic stratification consists of IS-A relations, in which "Animal" is an upper concept of "Dog" and "Cat". With an image dataset that carries conceptual information like this, a computer will be able to recognize objects pictured in photos gradually, from a wide concept such as "Animal" or "Vehicle" down to narrower concepts. In experiments, 47,910 photos were automatically classified into 183 classes aligned in a stratified tree. The labels of the generated image dataset were evaluated manually, and it was confirmed that most of the labels were reasonably assigned.

Keywords-Web Image Mining; Ontology; Stratified Image Dataset; Generic Object Recognition;

I. INTRODUCTION

Recognizing objects in a given image without any constraints is known as Generic Object Recognition (GOR). GOR is one of the difficult topics in the field of computer vision, and several studies have tried to achieve it [1], [2], [3].

To achieve GOR, a huge training image database covering various objects must be prepared, because the environment around an object and the angle of view vary from image to image even when the same object is photographed. As a result, a multi-class classification problem must be solved. Existing GOR studies often use manually prepared image datasets, especially in computer vision research [4], [5]. However, preparing an appropriate image database is generally hard work: the database should cover various objects, it should contain a huge number of images, and the labels given to the image data should not depend on the original researcher.

As a way to obtain a general image dataset, Web image mining [6], [7] has recently attracted attention. Nowadays, individuals easily upload their photos of various objects to social Web services. Web image mining makes it possible to obtain a huge image dataset consisting of photos uploaded to such Web services.

Flickr (http://flickr.com/) is one of the Web image sharing services and is often used as a source for Web image mining. Several GOR studies using image datasets obtained from Flickr have been reported [8], [9]. These studies directly use the tags given to image data from Flickr (i.e., photos uploaded to Flickr) as data labels. The tags of each image on Flickr are often given by multiple Web users, so one would intuitively expect them to be appropriate. However, inappropriate tags (e.g., coined terms and meaningless strings) are sometimes given, and the relation between the contents of the image data and the tag becomes inappropriate. Existing research removes such noisy data statistically, using the appearance, frequency, and dispersion of tags and images; semantics is not taken into consideration. How to prepare the training image dataset automatically therefore becomes the key problem.

This paper proposes the automatic generation of a huge, semantically stratified image dataset from Flickr. Tags given to photos on Flickr are verified against an ontology dictionary, and noise tags are removed. The image dataset is then semantically stratified according to the ontology; e.g., the hypernym of "Dog" and "Cat" is "Animal". Such an automatically generated stratified image dataset may make it possible to realize GOR in a way closer to human recognition.

II. THE PROPOSED METHOD

The general procedure of the proposed method is as follows:
1) Acquirement of image data
2) Verification of tags with ontology
3) Removing inappropriate tags for image data with the idea of IS-A
4) Stratification of the image dataset

First, we search Flickr for each class name in the ontology and acquire the URLs of the image data for that class name together with their tags. Next, the acquired tags are verified with the ontology, and coined words and meaningless strings are removed. Then, tags that are inappropriate for the contents of the image data are removed with the idea of IS-A. Last, the image dataset is stratified using the remaining tags and the ontology.

Table I
CLASS THAT EACH TAG OF FIGURE 3 BELONGS TO
Tag      Class that the tag belongs to
animal   Animal
cat      Mammal
fun      MusicGenre
grey     Color
italy    Country
nikon    Company

A. DBpedia as Ontology
DBpedia [10] is used as the ontology in this paper. DBpedia is a service that generates an ontology from Wikipedia and publishes it as RDF. The conceptual structure of DBpedia is periodically updated, so DBpedia is highly modifiable. Moreover, DBpedia is suitable for stratifying a dataset because hyponymy is provided as a tree structure whose top node is http://www.w3.org/2002/07/owl#Thing. The query language SPARQL can be used to search the ontology on DBpedia, and an API is published.

Data in RDF are expressed as triples of subject, predicate, and object, and each concept is uniquely identified by a URI. Figure 1 shows an example of the conceptual structure on DBpedia, with only the last path segment of each URI shown for brevity. In the figure, an ellipse is a concept, and relations among concepts (e.g., affiliation and hyponymy) are shown as arrows. Concepts such as "Mammal" and "Animal" are called classes, and "Dog" and "Cat" are called their instances. In Figure 1, "Dog" and "Cat" belong to the class "Mammal"; this relation is IS-A, and "Mammal" is a hyponym of "Animal".

In this paper, each tag given to image data is mapped to an instance. The tags are then adjusted with the ontology, and the image dataset is stratified.

B. Acquirement of Image Data

Image data are acquired from Flickr to generate the stratified image dataset. If unstandardized queries are used, the obtained image data are biased and an inappropriate image dataset is generated. Image data about animals can naturally be obtained with the query "animal"; however, inappropriate tags (e.g., something other than the animal contained in the image, or a personal name) are also given. For example, the inappropriate tag "bazooka", which is the name of a weapon, is one of the tags given to the image in Figure 2.

Therefore, we use class names in the ontology as keyword queries, search Flickr with them, and acquire the image data related to each class name. Image data in which the searched class name is included in Title, Description and Tags are retrieved and stored. This process makes it possible to apply the idea of IS-A to remove inappropriate tags such as "bazooka" in Figure 2. The details of removing inappropriate tags with the idea of IS-A are given in Section II-C.

Here, an image tag is mapped to an instance in the ontology. "Dog" and "Cat" are instances in Figure 1, and they are actually represented by the URIs http://dbpedia.org/resource/Dog and http://dbpedia.org/resource/Cat. We append a tag given to image data to the end of the URI http://dbpedia.org/resource/, and the tag is thereby mapped to the instance. If the URI with the tag is not included in DBpedia, the tag is regarded as a coined word or an uncommon word and is removed from the tags given to the image data.

C. Removing of Inappropriate Tags with Idea of IS-A

Through the procedures detailed in Section II-B, we obtain image data searched with a class name in the ontology and tags verified with the ontology. However, tags inappropriate for the contents of the image data (e.g., "bazooka" in Figure 2) still remain.

To remove such inappropriate tags, the semantic relation between the contents of the image data and the tags should be taken into consideration. In an ontology, a class and its instances are semantically related by IS-A. Therefore, tags inappropriate for the contents of the image data are removed by introducing the ideas of class and instance, i.e., IS-A.

Regarding a tag as an instance, the tag is kept if it is the searched class itself or one of the instances of the searched class. If a tag is neither the searched class nor one of its instances, the tag is removed. For example, in the case of Figure 2, the image data were searched with the query "animal". Thus, "animal", which is the searched class name itself, and "dog", which is one of
Figure panels (tags of one image at each step): Originally given tags on Flickr. Only tags verified with ontology. Appropriate tags for the contents in the image data introducing idea of IS-A.
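The two tag-cleaning steps of Section II (verification against the ontology, then removal with the idea of IS-A) can be illustrated with a minimal sketch. It assumes a small in-memory dictionary in place of DBpedia; the names TOY_INSTANCES, TOY_PARENT, tag_to_uri, verify_tags, belongs_to and filter_by_is_a are hypothetical and are not taken from the paper, and capitalising the first letter of a tag when building the URI is likewise an assumption.

```python
RESOURCE_PREFIX = "http://dbpedia.org/resource/"

# Toy stand-in for DBpedia (assumption): instance URI -> class it belongs to.
TOY_INSTANCES = {
    RESOURCE_PREFIX + "Animal": "Animal",
    RESOURCE_PREFIX + "Dog": "Mammal",
    RESOURCE_PREFIX + "Cat": "Mammal",
    RESOURCE_PREFIX + "Bazooka": "Weapon",
}

# Toy class hierarchy (assumption): class -> parent class.
TOY_PARENT = {"Mammal": "Animal", "Animal": "Thing", "Weapon": "Thing"}


def tag_to_uri(tag):
    """Append the tag to the resource prefix (capitalisation is an assumption)."""
    name = tag.strip().replace(" ", "_")
    return RESOURCE_PREFIX + name[:1].upper() + name[1:]


def verify_tags(tags):
    """Keep tags whose URI exists in the ontology; drop coined or uncommon words."""
    return [tag for tag in tags if tag_to_uri(tag) in TOY_INSTANCES]


def belongs_to(cls, searched_class):
    """Return True if cls is the searched class or one of its descendants."""
    while cls is not None:
        if cls == searched_class:
            return True
        cls = TOY_PARENT.get(cls)
    return False


def filter_by_is_a(tags, searched_class):
    """Keep a tag if it is the searched class itself or an instance under it."""
    kept = []
    for tag in tags:
        if tag.lower() == searched_class.lower():
            kept.append(tag)
            continue
        cls = TOY_INSTANCES.get(tag_to_uri(tag))
        if cls is not None and belongs_to(cls, searched_class):
            kept.append(tag)
    return kept


original = ["animal", "dog", "bazooka", "xyzzy123"]  # made-up tag list
verified = verify_tags(original)                     # drops the meaningless string
appropriate = filter_by_is_a(verified, "Animal")     # drops "bazooka"
print(verified)
print(appropriate)
```

In this sketch the instance test follows the class hierarchy upward, so "dog" (an instance of "Mammal") is kept for the searched class "Animal". With the real ontology the lookups would be SPARQL queries rather than dictionary accesses.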
Table II
TEN OF THE TOP NUMBER OF REMOVED IMAGE DATA FOR EACH SEARCHED CLASS
Searched Class               Number of removed image data
Australian football league   3863
Ice hockey league            3832
Imdb                         3817
Bowling league               3788
American football league     3608
British royalty              3547
Beach volleyball player      3514
American football team       3477
Racing driver                3473
Philosopher                  3389

Figure (stratification tree): root Thing with classes Weapon (tag: glock), Vehicle (tags: car, bike), and Animal (tags: cat, dog).
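The stratification step can be sketched as a tree built from IS-A pairs, using the node labels shown above (Thing, Weapon, Vehicle, Animal and the tags glock, car, bike, cat, dog). This is only an illustration under stated assumptions: the parent-child pairs are read off the figure labels, and the image file names and the helper names build_children, path_to_root and images_under are hypothetical.

```python
from collections import defaultdict

# IS-A pairs (child, parent); the edges are inferred from the figure labels.
IS_A = [
    ("Weapon", "Thing"), ("Vehicle", "Thing"), ("Animal", "Thing"),
    ("glock", "Weapon"), ("car", "Vehicle"), ("bike", "Vehicle"),
    ("cat", "Animal"), ("dog", "Animal"),
]


def build_children(pairs):
    """Map each parent to the list of its direct children."""
    children = defaultdict(list)
    for child, parent in pairs:
        children[parent].append(child)
    return children


def path_to_root(node, pairs):
    """Return the chain of concepts from the root down to the given node."""
    parent_of = dict(pairs)
    path = [node]
    while path[-1] in parent_of:
        path.append(parent_of[path[-1]])
    return path[::-1]


def images_under(node, children, images_by_tag):
    """Collect images labeled with the node or with any of its descendants."""
    found = list(images_by_tag.get(node, []))
    for child in children.get(node, []):
        found.extend(images_under(child, children, images_by_tag))
    return found


# Made-up image names, for illustration only.
images_by_tag = {
    "cat": ["img1.jpg"],
    "dog": ["img2.jpg", "img3.jpg"],
    "car": ["img4.jpg"],
}

children = build_children(IS_A)
print(path_to_root("dog", IS_A))
print(images_under("Animal", children, images_by_tag))
```

Collecting images recursively means that an image tagged "dog" also serves as an example of "Animal" and of "Thing", which is the wide-to-narrow recognition that the abstract describes.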
Table IV
TRANSITION OF EXPERIMENTAL DATA FOR EACH PROCEDURE
Procedure                         Number of image data   Number of kinds of tag   Average number of tags for an image data
Originally from Flickr            1,603,542              612,727                  15.01
Tags are verified with ontology   835,598                30,250                   3.85
Inappropriate tags are removed    47,910                 4,872                    1.55
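To make the reduction in Table IV easier to read, the share of image data and of tag kinds that survives each procedure can be computed directly from the reported counts. The snippet below is a small worked calculation and not part of the proposed method; the variable names are illustrative.

```python
# Counts copied from Table IV: (procedure, number of image data, number of kinds of tag).
stages = [
    ("Originally from Flickr", 1_603_542, 612_727),
    ("Tags are verified with ontology", 835_598, 30_250),
    ("Inappropriate tags are removed", 47_910, 4_872),
]

base_images, base_tags = stages[0][1], stages[0][2]

# Fraction of the original image data and tag kinds remaining after each procedure.
retention = {
    name: (images / base_images, tags / base_tags)
    for name, images, tags in stages
}

for name, (image_ratio, tag_ratio) in retention.items():
    print(f"{name}: {image_ratio:.1%} of image data, {tag_ratio:.2%} of tag kinds")
```

Roughly 52% of the image data remain after verification with the ontology, and roughly 3% remain after the inappropriate tags are removed, so the IS-A step is by far the more selective of the two.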
Table III
THE TOP TEN CLASSES THAT REMOVED TAGS BELONG TO
Class             Number of kinds of removed tag
Municipality      8617
Year              1416
Company           1292
Populated place   881
Animal            861
Town              755
Band              754
City              743
Person            717
Plant             585

Figure 5. Examples of image data with tag Cat
Figure 11. Examples of image data with tag Overweight belonging to class Disease
Table V
RESULTS OF EVALUATIONAL EXPERIMENT FOR TAGS
Tag          Percentage of image data correctly labeled (%)
animal       82.0
cat          64.0
coffee       53.0
conidae      90.0
cypraeidae   78.0
dog          79.0
earth        31.0
elephant     59.0
fencing      69.0
fern         65.0
gecko        74.0
green        25.0
lion         70.0
lizard       89.0
moth         62.0
pine         79.0
plant        52.0
pulmonata    76.0
quartz       19.0
spider       53.0
AVG          63.5
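The percentages in Table V come from a manual evaluation in which an image counts as correctly labeled only when more than six subjects judged it appropriate. The sketch below shows how such a percentage could be computed and checks the reported average. The vote counts, the total number of subjects, and the function name percent_correct are assumptions made for illustration.

```python
from statistics import mean


def percent_correct(votes_per_image, threshold=6):
    """Percentage of images judged appropriate by more than `threshold` subjects."""
    if not votes_per_image:
        raise ValueError("no images to evaluate")
    correct = sum(1 for votes in votes_per_image if votes > threshold)
    return 100.0 * correct / len(votes_per_image)


# Hypothetical numbers of "appropriate" votes for five images of one tag.
example_votes = [10, 7, 6, 3, 9]
example_score = percent_correct(example_votes)  # exactly six votes is not enough

# Percentages reported in Table V.
TABLE_V = {
    "animal": 82.0, "cat": 64.0, "coffee": 53.0, "conidae": 90.0,
    "cypraeidae": 78.0, "dog": 79.0, "earth": 31.0, "elephant": 59.0,
    "fencing": 69.0, "fern": 65.0, "gecko": 74.0, "green": 25.0,
    "lion": 70.0, "lizard": 89.0, "moth": 62.0, "pine": 79.0,
    "plant": 52.0, "pulmonata": 76.0, "quartz": 19.0, "spider": 53.0,
}
table_v_avg = mean(TABLE_V.values())

print(example_score)
print(round(table_v_avg, 2))
```

The mean of the twenty tag scores is 63.45, which agrees with the AVG row of 63.5 after rounding to one decimal place.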
Table VI
RESULTS OF EVALUATIONAL EXPERIMENT FOR CLASS
Class             Percentage of image data correctly labeled (%)
amphibian         98.0
animal            80.0
arachnid          66.0
flowering plant   85.0
fungus            90.0
insect            86.0
mammal            83.0
mollusca          66.0
plant             70.0
sport             48.0
AVG               77.2

earth. The average percentage of image data correctly labeled in the evaluation experiment for tags was 63.5%. The precision for "green" and "earth" was low because these tags are ambiguous as object names. For example, the image data labeled "earth" were varied: the earth as a planet, its illustration, and natural scenery. Although "fencing" is not a tag that denotes an object, high precision was confirmed for it; image data of people fencing were appropriately collected in the generated image dataset.

From Table VI, the average percentage of image data correctly labeled in the evaluation experiment for classes was 77.2%, about 14 percentage points higher than that for tags. A class covers a wider concept than a tag. Therefore, the image data of a tiger in Figure 9 were appropriate for the class "Animal", but the same image data were inappropriate for the tag "cat".

The image dataset used in the experiment included confusable image data, such as a player on a podium labeled "fencing" and a spider's web labeled "spider". In this experiment, only the image data that more than six subjects evaluated as appropriate were regarded as correct, under the assumption that these confusable image data are inappropriately labeled. However, such image data could also be regarded as image data that show information related to the label.

GOR experiment in order to confirm the usefulness of the stratified image dataset.

ACKNOWLEDGMENT

Iwahori's research is supported by JSPS Grant-in-Aid for Scientific Research (C) (23500228) and a Chubu University Grant. The authors would like to thank Iwahori Lab. and Toshioka Lab. members for their useful discussions.

REFERENCES

[1] R. Bergevin and M. D. Levine, "Generic object recognition: Building and matching coarse descriptions from line drawings," Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 15, no. 1, pp. 19-36, 1993.

[2] Y. LeCun, F. J. Huang, and L. Bottou, "Learning methods for generic object recognition with invariance to pose and lighting," in Computer Vision and Pattern Recognition, 2004. CVPR 2004. Proceedings of the 2004 IEEE Computer Society Conference on, vol. 2. IEEE, 2004, pp. II-97.

[3] A. Opelt, A. Pinz, M. Fussenegger, and P. Auer, "Generic object recognition with boosting," Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 28, no. 3, pp. 416-431, 2006.

[4] G. Csurka, C. Dance, L. Fan, J. Willamowski, and C. Bray, "Visual categorization with bags of keypoints," in Workshop on statistical learning in computer vision, ECCV, vol. 1, no. 1-22, 2004, pp. 1-2.

[5] G. Griffin, A. Holub, and P. Perona, "Caltech-256 object category dataset," 2007.

[6] T.-S. Chua, J. Tang, R. Hong, H. Li, Z. Luo, and Y. Zheng, "NUS-WIDE: a real-world web image database from National University of Singapore," in Proceedings of the ACM international conference on image and video retrieval. ACM, 2009, p. 48.

[7] K. Yanai, "Generic image classification using visual knowledge on the web," in Proceedings of the eleventh ACM international conference on Multimedia. ACM, 2003, pp. 167-176.