September 27, 2017

Hatem Mousselly: Novel Tag Similarity Approach with Application on Geotagged Folksonomy

Abstract

Folksonomies, collections of user-contributed tags, have proved effective in reducing the inherent semantic gap. However, user tags are noisy, so they need to be processed before further applications can use them. To address this problem, we present an approach for identifying similar tags in a folksonomy. In this approach, each tag is represented by an empirical probability distribution derived from tag co-occurrence statistics. The similarity between two tags is determined from the distance between their corresponding probability distributions. To deal with statistical fluctuations, we also propose an extension of the well-known Jensen-Shannon divergence. The proposed approach is compared to a widely used method for identifying similar tags based on the cosine measure. The evaluation shows promising results and emphasizes the advantage of our approach.
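
To make the idea concrete, the following is a minimal sketch of the general technique the abstract describes: each tag is turned into an empirical distribution over the tags it co-occurs with, and two tags are compared with the standard Jensen-Shannon divergence and, as a baseline, the cosine measure. It is not the author's implementation. The extension of the divergence that handles statistical fluctuations is not specified in the abstract and is not reproduced here. The function names, the way co-occurrence is counted (once per pair of distinct tags in a post), and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict
from itertools import permutations


def cooccurrence_distributions(posts):
    """Map each tag to an empirical distribution over its co-occurring tags.

    Assumption: a co-occurrence is counted once for every ordered pair of
    distinct tags that appear in the same post.
    """
    counts = defaultdict(Counter)
    for tags in posts:
        for a, b in permutations(set(tags), 2):
            counts[a][b] += 1
    dists = {}
    for tag, ctr in counts.items():
        total = sum(ctr.values())
        dists[tag] = {other: n / total for other, n in ctr.items()}
    return dists


def jensen_shannon_divergence(p, q):
    """Standard Jensen-Shannon divergence with base-2 logarithms (range 0 to 1)."""
    keys = set(p) | set(q)
    # Mixture distribution M = (P + Q) / 2 over the union of both supports.
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a):
        # Kullback-Leibler divergence of a from the mixture m.
        return sum(pa * math.log2(pa / m[k]) for k, pa in a.items() if pa > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


def cosine_similarity(p, q):
    """Cosine measure between two sparse vectors stored as dicts."""
    dot = sum(v * q.get(k, 0.0) for k, v in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


# Hypothetical toy folksonomy: each inner list holds the tags of one post.
posts = [
    ["paris", "eiffel", "tower"],
    ["paris", "eiffeltower", "tower"],
    ["berlin", "wall"],
]

dists = cooccurrence_distributions(posts)
jsd_close = jensen_shannon_divergence(dists["eiffel"], dists["eiffeltower"])
jsd_far = jensen_shannon_divergence(dists["eiffel"], dists["berlin"])
cos_close = cosine_similarity(dists["eiffel"], dists["eiffeltower"])
cos_far = cosine_similarity(dists["eiffel"], dists["berlin"])
```

In this toy data, "eiffel" and "eiffeltower" never appear in the same post, yet they share the same co-occurring tags, so their distributions coincide and the divergence between them is small. "eiffel" and "berlin" share no co-occurring tags, so their divergence is at its maximum. A lower divergence indicates more similar tags, whereas a higher cosine value indicates more similar tags.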