Pattern Matching Techniques to Identify Syntactic Variations of Tags in Folksonomies – WSKS 2008
Abstract
Folksonomies offer an easy way to organize information on the current Web. This ease of use, together with their collaborative nature, has led to their widespread adoption in many Social Web projects. However, they have important drawbacks: their exploring and searching capabilities are limited compared with other methods such as taxonomies, thesauri and ontologies. One of these drawbacks stems from their flexibility in tagging, which frequently produces multiple syntactic variations of the same tag. In this paper we study the application of two classical pattern matching techniques, Levenshtein distance for imperfect string matching and Hamming distance for perfect string matching, to identify syntactic variations of tags.
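As a rough illustration of the two distances named in the abstract, the following Python sketch computes each one and uses a threshold to flag candidate variations of a tag. It is not the authors' implementation: the function names, the `max_distance` threshold, the sample tags, and the choice to reject unequal-length inputs for Hamming distance are all assumptions made for this example, and the paper may handle these points differently.

```python
from typing import Iterable, List


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        # Assumption: the classical definition only covers equal lengths.
        raise ValueError("Hamming distance requires strings of equal length")
    return sum(x != y for x, y in zip(a, b))


def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and substitutions
    needed to turn one string into the other."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    prev = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, 1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def variation_candidates(tag: str, vocabulary: Iterable[str],
                         max_distance: int = 1) -> List[str]:
    """Hypothetical helper: tags within max_distance edits of `tag`."""
    return [t for t in vocabulary
            if t != tag and levenshtein(tag, t) <= max_distance]
```

For example, `variation_candidates("semantic-web", ["semanticweb", "semantic_web", "ontology"])` keeps the first two tags, since each is a single edit away from the query, and discards `"ontology"`.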
Authors
Francisco Echarte, José Javier Astrain, Alberto Córdoba, Jesús Villadangos
Workbench
- Paper: download
- CSV file with CiteULike annotations (24 MB): download
- CSV file with DS1 data set (330 KB): download
- CSV file with DS2 data set (22 KB): download
- CSV file with results for Hamming distance (280 KB): download
- CSV file with results for Levenshtein distance (270 KB): download
Reference
Echarte, F., Astrain, J. J., Córdoba, A., and Villadangos, J. 2008. Pattern Matching Techniques to Identify Syntactic Variations of Tags in Folksonomies. In Proceedings of the 1st World Summit on the Knowledge Society: Emerging Technologies and Information Systems for the Knowledge Society (Athens, Greece, September 24 – 26, 2008). M. D. Lytras, J. M. Carroll, E. Damiani, and R. D. Tennyson, Eds. Lecture Notes in Artificial Intelligence, vol. 5288. Springer-Verlag, Berlin, Heidelberg,