Abstract
In this work we study the task of term extraction for word cloud generation in sparsely tagged domains, in which manual tags are scarce. We present a folksonomy-based term extraction method, called tag-boost, which boosts terms that are frequently used by the public to tag content. Our experiments with tag-boost based term extraction over different domains demonstrate tremendous improvement in word cloud quality, as reflected by the agreement between manual tags of the testing items and the cloud's terms extracted from the items' content. Moreover, our results demonstrate the high robustness of this approach, as compared to alternative cloud generation methods that exhibit a high sensitivity to data sparseness. Additionally, we show that tag-boost can be effectively applied even in nontagged domains, by using an external rich folksonomy borrowed from a well-tagged domain.
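As an illustration only, the idea of boosting terms that the public frequently uses as tags, and of scoring a cloud by its agreement with manual tags, can be sketched as follows. The abstract does not give the actual tag-boost scoring formula, so the logarithmic boost, the `alpha` weight, the tokenizer, and the names `tag_boost_terms`, `tag_agreement`, and `folksonomy_counts` are all assumptions made for this sketch, not the authors' method.

```python
import math
import re
from collections import Counter


def tag_boost_terms(text, folksonomy_counts, k=5, alpha=1.0):
    """Rank the terms of `text`, boosting those that are popular tags.

    Assumption: score = term frequency * (1 + alpha * log(1 + tag count)).
    This formula is illustrative; it is not taken from the paper.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    term_freq = Counter(tokens)
    scores = {
        term: freq * (1.0 + alpha * math.log1p(folksonomy_counts.get(term, 0)))
        for term, freq in term_freq.items()
    }
    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(scores, key=lambda term: (-scores[term], term))
    return ranked[:k]


def tag_agreement(cloud_terms, manual_tags):
    """Fraction of cloud terms that also appear among the item's manual tags."""
    if not cloud_terms:
        return 0.0
    manual = set(manual_tags)
    return sum(term in manual for term in cloud_terms) / len(cloud_terms)


# Hypothetical folksonomy: how often each term was used as a tag.
folksonomy_counts = {"jazz": 50, "concert": 20}
text = "the jazz concert was the best the city had; jazz fans loved the concert"

boosted = tag_boost_terms(text, folksonomy_counts, k=2)
unboosted = tag_boost_terms(text, folksonomy_counts, k=2, alpha=0.0)
```

With `alpha=0.0` the ranking falls back to raw term frequency, which gives a simple baseline to compare against. Because `folksonomy_counts` is just a mapping, it could equally come from an external, well-tagged domain, which is how the abstract describes handling nontagged domains.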
Original language | English |
---|---|
Article number | 60 |
Journal | ACM Transactions on Intelligent Systems and Technology |
Volume | 3 |
Issue number | 4 |
State | Published - 1 Sep 2012 |
Externally published | Yes |
Keywords
- Keyword extraction
- Tag-boost
- Tag-cloud generation
ASJC Scopus subject areas
- Theoretical Computer Science
- Artificial Intelligence