Improving Folksonomies Quality by Syntactic Tag Variations Grouping – SAC 2009
Abstract
Folksonomies offer an easy way to organize information on the current Web. This ease of use, together with their collaborative features, has led to their wide adoption in many Social Web projects. However, they have important drawbacks: their exploring and searching capabilities are limited compared with other methods such as taxonomies, thesauri and ontologies. One of these drawbacks stems from the flexibility of tagging, which frequently produces multiple variations of the same tag. In this paper we propose a method to group syntactic variations of tags using pattern matching techniques. We propose the use of a fuzzy similarity measure and, after comparing it with other classic techniques on a large real dataset, we conclude that it offers better results.
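To make the idea of grouping syntactic tag variations concrete, the following is a minimal sketch of the two classic baselines named on this page, Hamming and Levenshtein distance, together with a simple greedy grouping step. It is not the fuzzy similarity measure proposed in the paper, and it does not reproduce the authors' procedure: the function names, the greedy strategy, the lowercasing step and the `max_distance` threshold are all assumptions made for illustration.

```python
def hamming(a, b):
    """Count differing positions; only defined for equal-length strings."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn a into b (unit costs)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def group_tags(tags, max_distance=1):
    """Greedy grouping (illustrative assumption, not the paper's method):
    a tag joins the first group whose first member is within max_distance
    edits, ignoring case; otherwise it starts a new group."""
    groups = []
    for tag in tags:
        for group in groups:
            if levenshtein(tag.lower(), group[0].lower()) <= max_distance:
                group.append(tag)
                break
        else:
            groups.append([tag])
    return groups


if __name__ == "__main__":
    sample = ["folksonomy", "folksonomies", "Folksonomy",
              "ontology", "ontologies"]
    print(group_tags(sample, max_distance=3))
```

Hamming distance only compares strings of equal length, so it cannot relate a tag to a variation that adds or drops characters, whereas Levenshtein distance can. The threshold is a tuning choice: a larger value merges more variations but also risks grouping unrelated tags.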
Authors
Francisco Echarte, José Javier Astrain, Alberto Córdoba, Jesús Villadangos
More information
- Paper: download
- CSV file with CiteULike annotations (24 MB): download
- CSV file with DS1 data set (330 KB): download
- CSV file with DS2 data set (22 KB): download
- CSV file with results for Hamming distance (280 KB): download
- CSV file with results for Levenshtein distance (270 KB): download
- CSV file with results for fuzzy similarity measure (1 MB): download
- CSV file with results for fuzzy similarity measure with variable costs (1.1 MB): download
Reference
Echarte, F., Astrain, J. J., Córdoba, A., and Villadangos, J. 2009. Improving folksonomies quality by syntactic tag variations grouping. In Proceedings of the 2009 ACM Symposium on Applied Computing (Honolulu, Hawaii). SAC ’09. ACM, New York, NY, 1226-1230.