2014 IIAI 3rd International Conference on Advanced Applied Informatics

Generation of Stratified Image Database with Web Image Sharing Service and Ontology

Ryoya Fujimoto, Ryosuke Yamanishi, Yuji Iwahori, Kohichi Toshioka and Junichi Fukumoto
Dept. of Computer Science
Graduate School of Engineering, Chubu University
1200 Matsumoto-cho, Kasugai, 487-8501 Japan
Email: [email protected], {iwahori, toshioka}@cs.chubu.ac.jp
Dept. of Media Technology
College of Information Science and Engineering, Ritsumeikan University
1-1-1 Nojihigashi, Kusatsu, 525-8500 Japan
Email: [email protected]

978-1-4799-4173-5/14 $31.00 © 2014 IEEE
DOI 10.1109/IIAI-AAI.2014.189

Abstract-Preparing a training image dataset for a generic object recognition system is costly. Some datasets have recently been created from Web image sharing services such as Flickr, but few of them use a semantic classification or the information in the tags given to photos. This paper proposes a method that automatically generates an image dataset by introducing the semantics of each label given to the images, where the images are positioned in a semantic stratification according to the meaning of their labels. Here the semantic stratification consists of IS-A relations, in which "Animal" is an upper concept of "Dog" and "Cat". With an image dataset that carries such conceptual information, a computer will be able to recognize objects pictured in photos gradually, from a wide concept such as "Animal" or "Vehicle" down to narrower concepts. In experiments, 47,910 photos were automatically classified into 183 classes aligned in a stratified tree. The labels of the generated image dataset were evaluated manually, and it was confirmed that most of the labeled images were reasonably collected.

Keywords-Web Image Mining; Ontology; Stratified Image Dataset; Generic Object Recognition

I. INTRODUCTION

Recognizing objects in a given image without any constraints is known as Generic Object Recognition (GOR). GOR is one of the difficult topics in the field of computer vision, and several studies have tried to achieve it [1], [2], [3].

To achieve GOR, a huge training image database covering various objects must be prepared, because the environment around an object and the angle of view vary from image to image even when the same object is photographed. As a result, a classification problem over many classes must be solved. Existing GOR studies often use manually prepared image datasets, especially in computer vision research [4], [5]. However, preparing an appropriate image database is generally hard work, because the database must cover various objects, it must contain a huge number of images, and the labels given to the image data should not depend on the original researcher.

As an approach to obtaining a general image dataset, Web image mining [6], [7] has recently attracted attention. Nowadays, individuals easily upload their photos of various objects to social Web services. Web image mining is used to obtain a huge image dataset consisting of photos uploaded to such Web services.

Flickr (http://flickr.com/) is one of the Web image sharing services, and it is often used as a source for Web image mining. Several GOR studies using image datasets obtained from Flickr have been reported [8], [9]. These studies directly use the tags given to image data from Flickr (i.e., photos uploaded to Flickr) as data labels. The tags of each image on Flickr are often given by multiple Web users, so the tags given to a photo should agree with general intuition. However, inappropriate tags (e.g., coined terms and meaningless strings) are sometimes given, and the relation between the contents of the image data and the tag then becomes inappropriate. Existing research removes such noisy data statistically, using the appearance, frequency, and dispersion of tags and images; semantics is not taken into consideration. How to prepare the training image dataset automatically thus becomes the key problem.

This paper proposes the automatic generation of a huge, semantically stratified image dataset from Flickr. The tags given to photos on Flickr are verified by checking whether they correspond reasonably to an ontology dictionary, and noise tags are then removed. The image dataset is then stratified semantically according to the ontology; e.g., the hypernym of "Dog" and "Cat" is "Animal". This automatic generation of a stratified image dataset may make it possible to realize GOR in a way closer to human recognition.

II. THE PROPOSED METHOD

The general procedure of the proposed method is as follows:
1) Acquirement of image data
2) Verification of tags with ontology
3) Removal of inappropriate tags for image data with the idea of IS-A
4) Stratification of the image dataset

First, we search Flickr for each class name in the ontology and acquire the URLs of the image data for the class name together with their tags. Next, the acquired tags are verified with the ontology, and coined words and meaningless strings are removed. Then, tags that are inappropriate for the contents of the image data are removed with the idea of IS-A. Last, the image dataset is stratified with the remaining tags and the ontology.

A. DBpedia as Ontology

DBpedia [10] is used as the ontology in this paper. DBpedia is a service that generates an ontology from Wikipedia and publishes it as RDF. The conceptual structure of DBpedia is periodically updated, so DBpedia has high modifiability. Moreover, DBpedia is suitable for stratifying a dataset because hyponymy is provided as a tree structure whose top node is http://www.w3.org/2002/07/owl#Thing. To search the ontology on DBpedia, the query language SPARQL is available and an API is published.

Data in RDF are expressed as triples of subject, predicate, and object, and each concept is uniquely identified by a URI. Figure 1 shows an example of the conceptual structure on DBpedia, with only the last path component of each URI shown for brevity. In the figure, an ellipse is a concept, and relations among concepts (e.g., affiliation and hyponymy) are shown as arrows. Concepts such as "Mammal" and "Animal" are called classes, and "Dog" and "Cat" are called their instances. In Figure 1, "Dog" and "Cat" belong to the class "Mammal"; this relation is IS-A, and "Mammal" is a hyponym of "Animal".

Figure 1. Example of conceptual structure on DBpedia

In this paper, a tag given to image data is treated as an instance. Tags are then adjusted with the ontology, and the image dataset is stratified.

B. Acquirement of Image Data

Image data are acquired from Flickr to generate the stratified image dataset. If unstandardized queries are used, the obtained image data are biased and an inappropriate image dataset is generated. Image data about animals can naturally be obtained with the query "animal"; however, inappropriate tags (e.g., names of things in the image other than the animal, and personal names) are also given. For example, the inappropriate tag "bazooka", which is the name of a weapon, is one of the tags given to the image in Figure 2.

Figure 2. Image data with tags: [bazuca, bazooka, dog, animal, pet]

Therefore, we use class names in the ontology as keyword queries, search Flickr with each query, and acquire the image data related to the class name. Image data in which the searched class name is included in the "Title", "Description", and "Tags" fields are retrieved and stored. This process makes it possible to apply the idea of IS-A to remove inappropriate tags such as "bazooka" in Figure 2. The details of removing inappropriate tags with the idea of IS-A are given in Section II-C.

Here, an image tag is mapped to an instance in the ontology. "Dog" and "Cat" are instances in Figure 1, and they are actually represented by the URIs http://dbpedia.org/resource/Dog and http://dbpedia.org/resource/Cat. We append a tag given to image data to the end of the URI http://dbpedia.org/resource/, and the tag is thereby mapped to the instance. A tag is assumed to be a coined word or an uncommon word if the URI formed with the tag is not included in DBpedia, and such a tag is removed from the tags given to the image data.

C. Removal of Inappropriate Tags with the Idea of IS-A

Through the procedures detailed in Section II-B, we obtain image data retrieved with a class name in the ontology and tags verified with the ontology. However, tags that are inappropriate for the contents of the image data (e.g., "bazooka" in Figure 2) still remain.

To remove such inappropriate tags, the semantic relation between the contents of the image data and the tags should be taken into consideration. In an ontology, a class and an instance are semantically related by the IS-A relation. Therefore, by introducing the ideas of class and instance, i.e., IS-A, the tags inappropriate for the contents of the image data are removed.

Treating a tag as an instance, the tag is kept if it is the searched class itself or one of the instances of the searched class. If a tag is neither the searched class nor one of its instances, the tag is removed. For example, the image data in Figure 2 were retrieved with the query "animal". Thus, "animal", which is the searched class name itself, and "dog", which is one of the instances of "animal", remain, and "bazuca" and "bazooka" are removed as inappropriate tags for the contents of the image data.

The tags given to image data change through the above procedures: first the original tags on Flickr, then only the tags verified with the ontology, and finally only the tags whose relation to the searched class is reasonable as that of an instance to its class. Figure 3 shows an example of this transition. On Flickr, 19 tags were originally given to the image data shown in Figure 3, which were retrieved with the query "animal". Only six of the tags are verified with the ontology, as detailed in Table I. As shown in Table I, the classes that satisfy the above selection requirement are "Animal", which is the searched class, and "Mammal", which is a hyponym of "Animal". Eventually, "animal" and "cat" remain, considering the relationship between each tag and the searched class "animal".

Figure 3. Example of transition of tags through each procedure; the searched query is "Animal".
Originally given tags on Flickr: [ambra, animal, cat, colors, cute, eyes, freetime, fun, gatto, grey, house, italy, kitten, moment, nikon, pet, shadow, shot, wood]
Only tags verified with ontology: [animal, cat, fun, grey, italy, nikon]
Appropriate tags for the contents in the image data, introducing the idea of IS-A: [animal, cat]

Table I
CLASS THAT EACH TAG OF FIGURE 3 BELONGS TO

Tag      Class that the tag belongs to
animal   Animal
cat      Mammal
fun      MusicGenre
grey     Color
italy    Country
nikon    Company

D. Stratification of Image Dataset

We then stratify the image dataset with the hyponymy of classes in the ontology. Through the above procedures, we obtain an image dataset in which the relationship between the tags and the searched class is IS-A. The image dataset is stratified with http://www.w3.org/2002/07/owl#Thing as the top node, as shown in Figure 4.

Figure 4. Conceptual diagram of stratified image dataset
[Diagram: Thing -> Weapon (tag:glock); Thing -> Vehicle (tag:car, tag:bike); Thing -> Animal (tag:cat, tag:dog)]

Stratification of the image dataset enables us to handle image data together with their semantics. As shown in Figure 4, each tag belongs to a class such as "Animal", and each tag has image data. Because the relationship between the searched class and the tags is IS-A, noise data (e.g., image data about "Dog" belonging to "Weapon") are reduced.

Using the stratified image dataset, a computer becomes able to recognize an input object in a stepwise fashion: first, the computer resolves a conceptual problem, "Is the image data Weapon, Vehicle, or Animal?", and next it resolves the specific GOR.

III. STRATIFIED IMAGE DATASET GENERATION

Image data are obtained from Flickr, and a stratified image dataset is generated according to the proposed method. Each of 518 classes is searched on Flickr, and the maximum number of image data for each searched class is set to 4,000 as an experimental parameter.

Table II
TEN OF THE TOP NUMBER OF REMOVED IMAGE DATA FOR EACH SEARCHED CLASS

Searched Class               Number of removed image data
Australian football league   3863
Ice hockey league            3832
Imdb                         3817
Bowling league               3788
American football league     3608
British royalty              3547
Beach volleyball player      3514
American football team       3477
Racing driver                3473
Philosopher                  3389

A. Experimental Data

As a result of the search for each class, 1,603,542 image data and 612,727 kinds of tags were obtained from Flickr. The tags of the obtained image data were processed as detailed in Section II, and the image dataset was stratified. The transition of the experimental data is shown in Table IV.
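
The processing of Section II that was applied to these data can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: a small in-memory dictionary built from Table I stands in for DBpedia, the class tree fragment and the helper names (resource_uri, verify_tags, is_a, filter_by_is_a, stratum) are hypothetical, and capitalizing a tag before appending it to the resource URI is an assumption.

```python
RESOURCE_PREFIX = "http://dbpedia.org/resource/"

# Toy stand-in for DBpedia, taken from Table I: tag -> class of its instance.
INSTANCE_CLASS = {
    "animal": "Animal", "cat": "Mammal", "fun": "MusicGenre",
    "grey": "Color", "italy": "Country", "nikon": "Company",
}

# Hypothetical fragment of the class tree (child -> parent), rooted at Thing.
SUBCLASS_OF = {
    "Mammal": "Animal", "Animal": "Thing", "MusicGenre": "Thing",
    "Color": "Thing", "Country": "Thing", "Company": "Thing",
}


def resource_uri(tag):
    """Append the tag to the resource prefix (capitalization is an assumption)."""
    return RESOURCE_PREFIX + tag.capitalize()


# URIs that "exist" in the toy ontology.
KNOWN_URIS = {resource_uri(tag) for tag in INSTANCE_CLASS}


def verify_tags(tags):
    """Keep only the tags whose resource URI exists in the toy ontology."""
    return [tag for tag in tags if resource_uri(tag) in KNOWN_URIS]


def is_a(cls, ancestor):
    """True if cls equals ancestor or lies below it in the class tree."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def filter_by_is_a(tags, searched_class):
    """Keep the searched class itself and the instances of the searched class."""
    return [tag for tag in tags
            if tag == searched_class.lower()
            or is_a(INSTANCE_CLASS[tag], searched_class)]


def stratum(searched_class, tag):
    """Path from the top node down to the tag, as in Figure 4."""
    path, cls = [], searched_class
    while cls is not None:
        path.append(cls)
        cls = SUBCLASS_OF.get(cls)
    return path[::-1] + ["tag:" + tag]


# The 19 tags originally given on Flickr in the example of Figure 3.
flickr_tags = ["ambra", "animal", "cat", "colors", "cute", "eyes", "freetime",
               "fun", "gatto", "grey", "house", "italy", "kitten", "moment",
               "nikon", "pet", "shadow", "shot", "wood"]

verified = verify_tags(flickr_tags)
kept = filter_by_is_a(verified, "Animal")
print(verified)                  # ['animal', 'cat', 'fun', 'grey', 'italy', 'nikon']
print(kept)                      # ['animal', 'cat']
print(stratum("Animal", "cat"))  # ['Thing', 'Animal', 'tag:cat']
```

Run on the tags of Figure 3, the sketch reproduces the transition from 19 tags to six verified tags and finally to "animal" and "cat". A real implementation would replace the dictionaries with queries to DBpedia.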

Table IV
TRANSITION OF EXPERIMENTAL DATA FOR EACH PROCEDURE

Procedure                         Number of image data   Number of kinds of tag   Average number of tags for an image data
Originally from Flickr            1,603,542              612,727                  15.01
Tags are verified with ontology   835,598                30,250                   3.85
Inappropriate tags are removed    47,910                 4,872                    1.55
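
As a quick check of how strongly each procedure reduces the data, the counts in Table IV can be converted into retention rates. The following is a minimal sketch; the helper name retention and the constant STAGES are introduced here for illustration only.

```python
# Counts copied from Table IV: (procedure, number of image data, kinds of tag).
STAGES = [
    ("Originally from Flickr", 1_603_542, 612_727),
    ("Tags are verified with ontology", 835_598, 30_250),
    ("Inappropriate tags are removed", 47_910, 4_872),
]


def retention(after, before):
    """Percentage of items that survive from `before` to `after`."""
    return 100.0 * after / before


original_images = STAGES[0][1]
original_kinds = STAGES[0][2]
for name, images, kinds in STAGES:
    # Both rates are relative to the data originally obtained from Flickr.
    print(f"{name}: {retention(images, original_images):.1f}% of image data, "
          f"{retention(kinds, original_kinds):.1f}% of tag kinds")
```

Relative to the data originally obtained from Flickr, roughly 52% of the image data keep at least one tag after verification with the ontology, and about 3% remain after the removal of inappropriate tags.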

Image data left with no tags after the removal of inappropriate tags were eliminated from the image dataset, and the number of image data decreased as a result. Approximately half of the experimental image data had only tags that could not be verified with the ontology, and these image data were removed from the dataset: the number of image data decreased from 1,603,542 to 835,598. Even among the tags verified with the ontology, some are not instances of the searched class, and image data having only such inappropriate tags were removed from the dataset: the number of image data decreased from 835,598 to 47,910.

Table II shows the ten searched classes with the largest numbers of removed image data. From Table II, image data were removed from the dataset in large numbers when classes about sports leagues were searched. Instances of "American football league", such as "Italian football league" and "Austrian football league", were inappropriate concepts as labels of image data. These inappropriate concepts for GOR were correctly removed from the dataset.

Table III shows the top ten classes to which the removed tags belong; all 30,250 tags were verified with the ontology. It was confirmed that the removed tags belong to classes about place names such as "Municipality" and "Populated place", to "Year", to which numerals belong, and to "Company", which represents commercial names. Tags about place names tend to be given to indicate the shooting location regardless of the contents of the image data, so such tags are treated as inappropriate labels of image data for GOR. It was confirmed that tags inappropriate for stratifying the image dataset for GOR were correctly removed through the procedures of the proposed method.

Table III
THE TOP TEN CLASSES THAT REMOVED TAG BELONG TO

Class             Number of kinds of removed tag
Municipality      8617
Year              1416
Company           1292
Populated place   881
Animal            861
Town              755
Band              754
City              743
Person            717
Plant             585

B. Generated Image Dataset

With the processed tags and the ontology, the stratified image dataset was generated. Figures 5-10 show examples of image data and their labels in the generated stratified image dataset. Figures 5, 6, 7, and 8 were obtained from the generated stratified image dataset with a tag (i.e., an instance), and Figures 9 and 10 were obtained with a class. Various image data appropriate for their tags were obtained in the stratified image dataset, as shown in Figures 5-10. In particular, it was confirmed that image data for "Species" such as "Animal" and "Plant" were sufficiently gathered in the dataset.

Figure 5. Examples of image data with tag "Cat"
Figure 6. Examples of image data with tag "Glock"
Figure 7. Examples of image data with tag "Tucano"
Figure 8. Examples of image data with tag "Candy"
Figure 9. Examples of image data with class "Animal"
Figure 10. Examples of image data with class "Food"

As shown in Figures 11 and 12, there are some classes that are difficult to relate to images, e.g., "Disease" and "Language". These classes are treated as inappropriate labels for representing objects.

Figure 11. Examples of image data with tag "Overweight" belonging to class "Disease"
Figure 12. Examples of image data with tag "English" belonging to class "Language"

Figure 13 shows the stratification of the generated image dataset, where some parts of the tree are expanded. The generated image dataset consists of 183 classes although 518 classes were searched on Flickr, because the classes shown in Table were removed. The class on the far left in Figure 13 is http://www.w3.org/2002/07/owl#Thing. In Figure 13, the image data are structured so that "Thing" is the top node and each tag is a node.

Figure 13. Stratification of the generated image dataset

IV. EVALUATIONAL EXPERIMENT

Experiments were done to evaluate the precision of the labels of image data in the generated image dataset.

A. Evaluational Procedure

20 tags and 10 classes were randomly selected from the generated image dataset. Image data were obtained from the generated image dataset using these tags and classes. Then 100 image data were randomly selected for each tag and class and used in the experiment. Classes that do not represent an object (e.g., "Place", "Time period" and "Topical concept") and the tags below them were excluded beforehand.

The experiment was done through a Web browser, as shown in Figure 14. Subjects referred to the abstract of each label in Wikipedia. Seven subjects evaluated whether the relationship between the image data and their labels was appropriate or not. In this experiment, image data whose label was evaluated as appropriate by more than six subjects were regarded as correctly labeled image data; note that this is a strict standard for an affective experiment.

Figure 14. Web page prepared for experiment

B. Results

Table V and Table VI show the results of the evaluation experiment for 20 tags and 10 classes, respectively. Here, both tags and classes are used as labels of image data.

Table V
RESULTS OF EVALUATIONAL EXPERIMENT FOR TAGS

Tag          Percentage of image data correctly labeled (%)
animal       82.0
cat          64.0
coffee       53.0
conidae      90.0
cypraeidae   78.0
dog          79.0
earth        31.0
elephant     59.0
fencing      69.0
fern         65.0
gecko        74.0
green        25.0
lion         70.0
lizard       89.0
moth         62.0
pine         79.0
plant        52.0
pulmonata    76.0
quartz       19.0
spider       53.0
AVG          63.5

Table VI
RESULTS OF EVALUATIONAL EXPERIMENT FOR CLASS

Class             Percentage of image data correctly labeled (%)
amphibian         98.0
animal            80.0
arachnid          66.0
flowering plant   85.0
fungus            90.0
insect            86.0
mammal            83.0
mollusca          66.0
plant             70.0
sport             48.0
AVG               77.2

From Table V, high precision was confirmed for tags that denote animals, such as "conidae" and "lizard"; however, low precision was confirmed for "quartz", "green", and "earth". The average percentage of image data correctly labeled in the evaluation experiment for tags was 63.5%.

The precision for "green" and "earth" was low because these tags are ambiguous as names of objects. For example, the image data labeled as "earth" varied: the earth as a planet, illustrations of it, and natural scenery. Although "fencing" is not a tag that denotes an object, high precision was confirmed for it; image data of people fencing were appropriately collected in the generated image dataset.

From Table VI, it was confirmed that the average percentage of image data correctly labeled in the evaluation experiment for classes was 77.2%; this is about 14 percentage points higher than that for tags. A class covers a wider concept than a tag. Therefore, the image data of a tiger in Figure 9 were appropriate for the class "Animal", whereas the same image data were inappropriate for the tag "cat".

The image dataset used in the experiment included confusable image data, such as a player on a podium labeled as "fencing" and a spider's web labeled as "spider". In this experiment, only the image data that more than six subjects evaluated as appropriate were regarded as correct, so these confusable image data were treated as inappropriately labeled image data. However, such image data could also be regarded as image data that show information related to the label.

V. CONCLUSION

This paper proposed a new method to generate a stratified image dataset with a Web image sharing service and an ontology.

A stratified image dataset including 47,910 image data was generated; the image dataset was stratified with 4,872 kinds of tags and 183 classes. The relationships between the image data obtained from Flickr and their given tags were verified by introducing the idea of IS-A. Through an evaluation experiment, it was confirmed that the precision of the relationships between the image data and their labels was sufficient for a practical level.

It is expected that the generated stratified image dataset is applicable to GOR and that generic object recognition like that of humans will become feasible through further research activities. As future work, a GOR algorithm will be designed by applying this stratified image dataset, and a GOR experiment will be conducted in order to confirm the usefulness of the stratified image dataset.

ACKNOWLEDGMENT

Iwahori's research is supported by JSPS Grant-in-Aid for Scientific Research (C)(23500228) and a Chubu University Grant. The authors would like to thank Iwahori Lab. and Toshioka Lab. members for their useful discussions.

REFERENCES

[1] R. Bergevin and M. D. Levine, "Generic object recognition: Building and matching coarse descriptions from line drawings," Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 15, no. 1, pp. 19-36, 1993.

[2] Y. LeCun, F. J. Huang, and L. Bottou, "Learning methods for generic object recognition with invariance to pose and lighting," in Computer Vision and Pattern Recognition, 2004. CVPR 2004. Proceedings of the 2004 IEEE Computer Society Conference on, vol. 2. IEEE, 2004, pp. II-97.

[3] A. Opelt, A. Pinz, M. Fussenegger, and P. Auer, "Generic object recognition with boosting," Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 28, no. 3, pp. 416-431, 2006.

[4] G. Csurka, C. Dance, L. Fan, J. Willamowski, and C. Bray, "Visual categorization with bags of keypoints," in Workshop on statistical learning in computer vision, ECCV, vol. 1, no. 1-22, 2004, pp. 1-2.

[5] G. Griffin, A. Holub, and P. Perona, "Caltech-256 object category dataset," 2007.

[6] T.-S. Chua, J. Tang, R. Hong, H. Li, Z. Luo, and Y. Zheng, "Nus-wide: a real-world web image database from national university of singapore," in Proceedings of the ACM international conference on image and video retrieval. ACM, 2009, p. 48.

[7] K. Yanai, "Generic image classification using visual knowledge on the web," in Proceedings of the eleventh ACM international conference on Multimedia. ACM, 2003, pp. 167-176.

[8] H. Nakayama, T. Harada, and Y. Kuniyoshi, "Canonical contextual distance for large-scale image annotation and retrieval," in Proceedings of the First ACM workshop on Large-scale multimedia retrieval and mining. ACM, 2009, pp. 3-10.

[9] X. Li, C. G. Snoek, and M. Worring, "Unsupervised multi-feature tag relevance learning for social image retrieval," in Proceedings of the ACM International Conference on Image and Video Retrieval. ACM, 2010, pp. 10-17.

[10] J. Lehmann, R. Isele, M. Jakob, A. Jentzsch, D. Kontokostas, P. N. Mendes, S. Hellmann, M. Morsey, P. van Kleef, S. Auer et al., "Dbpedia-a large-scale, multilingual knowledge base extracted from wikipedia," Semantic Web Journal, 2013.