Exploring Measures of Inter Tagger Consistency

Margaret E. I. Kipp <[email protected]>


Information Organization Research Group (IOrg), School of Information Studies, UW-Milwaukee

INTRODUCTION

Studies of inter indexer consistency have traditionally examined the consistency of indexing documents between 2 or 3 indexers. Leonard (1977) and Markey (1984) examined the results of multiple inter indexer consistency studies, examining not only the levels of consistency, which varied widely, but also the level of indexing exhaustivity (number of terms assigned), data collection method and vocabulary size. The majority of inter indexer consistency studies show high levels of inconsistency between indexers.

SOCIAL TAGGING

Studies of social tagging show convergence of popular terms in frequency graphs, but an examination of user tag lists shows that divergence of opinion continues (Kipp 2009). Many measures of inter indexer consistency can be modified to examine indexing with an arbitrary number of indexers or taggers, allowing the calculation of inter tagger consistency.

METHODOLOGY

•Data was collected as part of a larger study examining patterns in convergence and divergence of tags on delicious.com (see Kipp 2009).
•This study used a number of inter indexer consistency measures to examine inter tagger consistency on delicious.com.
•All measures except the Pairwise Jaccard used a calculated centroid for comparison. Two centroids were used: one composed of all tags, the other of the top 25 tags.

INTER TAGGER CONSISTENCY MEASURES

A. Salton's Cosine
   cosine similarity - weighted and unweighted terms and a centroid
B. Jaccard (aka Hooper and Rolling's)
   ratio of the intersection of tag lists to the union with a centroid
C. Inter-indexer Consistency Density (Wolfram and Olson 2007)
   Euclidean distance measure of difference with a centroid
D. Pairwise Jaccard (Kipp 2009)
   compares tag lists to each other

RESULTS

                          gtd_tiddly wiki     43folders          webmd
                          (lightly tagged)    (highly tagged)
cosine full               0.034               0.045              0.068
cosine partial            0.25                0.24               0.26
cosine full tf-idf        0.012               0.016              0.027
cosine partial tf-idf     0.12                0.1                0.11
cosine full df            0.48                0.56               0.63
cosine partial df         0.48                0.56               0.63
jaccard full              0.0014              0.0024             0.0053
jaccard exclusive full    0.0014              0.0024             0.0054
jaccard partial           0.077               0.072              0.083
jaccard excl. partial     0.087               0.082              0.095
pairwise jaccard          0.079               0.11               0.12
pairwise excl. jaccard    0.11                0.16               0.17
ICD                       0.021               0.03               0.04

DISCUSSION AND CONCLUSIONS

Results of inter indexer consistency studies vary depending on the indexers and use of controlled vocabulary or natural language, showing ranges from 4% to 82% (Markey 1984; Leonard 1977).

Many inter tagger consistency values in this study fall into this range (4-82%) and are thus consistent with previous findings. Some of the consistency in tagging on delicious.com may be influenced by the incorporation of suggested terms into the interface, acting as a form of semi-controlled vocabulary.

REFERENCES

Kipp, Margaret E.I. 2009. Information Organisation Practices on the Web: Tagging and the Social Organisation of Information. PhD Thesis, University of Western Ontario, London, Ontario.
Leonard, L. 1977. Inter-indexer consistency studies, 1954-1975: a review of the literature and summary of study results. Occasional papers (University of Illinois at Urbana-Champaign. Graduate School of Library Science) 131.
Markey, Karen. 1984. Interindexer Consistency Tests: a Literature Review and Report of a Test of Consistency in Indexing Visual Materials. Library and Information Science Research 6(2): 155-177.
Wolfram, Dietmar and Olson, Hope. 2007. A method for comparing large scale inter-indexer consistency using IR modeling. Proceedings of the 35th conference of the Canadian Association for Information Science, Montreal, QC, May 10-12, 2007.
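
ILLUSTRATIVE SKETCH OF THE MEASURES

The poster names its measures but does not give formulas, so the following Python sketch is only one plausible reading of the Jaccard, cosine and pairwise Jaccard measures, not the study's actual code. The assumptions are: each tagger's tag list is treated as a set, the centroid is the count of taggers using each tag (optionally cut to the most common tags, as with the top 25 centroid), and scores are averaged over taggers. The function names and the sample tag lists are hypothetical. The "exclusive" variants, the tf-idf and df weightings, and ICD are left out because their definitions are not stated here.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def jaccard(a, b):
    """Ratio of the intersection of two tag collections to their union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine(u, v):
    """Cosine similarity between two tag -> weight mappings."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def centroid(tag_lists, top_n=None):
    """Assumed centroid: number of taggers using each tag.

    top_n=None keeps all tags (the "full" centroid); top_n=25 would mimic
    a top 25 ("partial") centroid.
    """
    counts = Counter(tag for tags in tag_lists for tag in set(tags))
    if top_n is not None:
        counts = Counter(dict(counts.most_common(top_n)))
    return counts


def mean_centroid_jaccard(tag_lists, top_n=None):
    """Average Jaccard score of each tagger's tags against the centroid's tags."""
    c = centroid(tag_lists, top_n)
    return sum(jaccard(tags, c) for tags in tag_lists) / len(tag_lists)


def mean_centroid_cosine(tag_lists, top_n=None):
    """Average cosine of each tagger's unweighted tag vector against the centroid."""
    c = centroid(tag_lists, top_n)
    return sum(cosine(Counter(set(tags)), c) for tags in tag_lists) / len(tag_lists)


def mean_pairwise_jaccard(tag_lists):
    """Average Jaccard score over every pair of taggers (no centroid)."""
    pairs = list(combinations(tag_lists, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0


# Hypothetical tag lists from three taggers of the same bookmark.
taggers = [
    ["productivity", "gtd", "blog"],
    ["productivity", "lifehacks"],
    ["gtd", "productivity", "tools"],
]

if __name__ == "__main__":
    print("centroid jaccard (all tags):", mean_centroid_jaccard(taggers))
    print("centroid cosine (all tags): ", mean_centroid_cosine(taggers))
    print("pairwise jaccard:           ", mean_pairwise_jaccard(taggers))
```

Under this reading, a centroid built from all tags grows with every rarely used tag, which enlarges the union in the Jaccard ratio. That would be consistent with the "full" scores in the results being much lower than the "partial" ones, though the poster itself does not state this explanation.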
