DegExt - A language-independent graph-based keyphrase extractor

Marina Litvak, Mark Last, Hen Aizenman, Inbal Gobits, Abraham Kandel

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

46 Scopus citations

Abstract

In this paper, we introduce DegExt, a graph-based, language-independent keyphrase extractor, which extends the keyword extraction method described in [6]. We compare DegExt with two state-of-the-art approaches to keyphrase extraction: GenEx [11] and TextRank [8]. Our experiments on a collection of benchmark summaries show that DegExt outperforms TextRank and GenEx in terms of precision and area under the curve (AUC) for summaries of 15 keyphrases or more, at the expense of a non-significant decrease in recall and F-measure. Moreover, DegExt surpasses both GenEx and TextRank in terms of implementation simplicity and computational complexity.
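The precision, recall, and F-measure figures mentioned in the abstract are standard set-overlap scores. The sketch below is only an illustration of how such scores are commonly computed for a list of extracted keyphrases against a gold-standard list; it is not the authors' evaluation code. The function name `precision_recall_f1`, the case-insensitive exact-match rule, and the sample phrases are all assumptions made for this example.

```python
def precision_recall_f1(extracted, gold):
    """Score extracted keyphrases against a gold-standard list.

    Assumption: phrases match only if they are identical after
    lowercasing and trimming whitespace (no stemming, no partial match).
    """
    extracted_set = {phrase.strip().lower() for phrase in extracted}
    gold_set = {phrase.strip().lower() for phrase in gold}

    # Number of extracted phrases that also appear in the gold standard.
    hits = len(extracted_set & gold_set)

    precision = hits / len(extracted_set) if extracted_set else 0.0
    recall = hits / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        # F-measure with equal weight on precision and recall (F1).
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical sample data, not taken from the paper.
extracted = ["text mining", "graph", "summarization", "keyword"]
gold = ["Text Mining", "summarization", "keyphrase extraction"]
precision, recall, f1 = precision_recall_f1(extracted, gold)
```

With this sample, two of the four extracted phrases are in the gold list, so precision is 2/4 and recall is 2/3. Returning more keyphrases tends to move the two scores in opposite directions, which is the kind of trade-off the abstract reports for summaries of 15 keyphrases or more.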

Original language: English
Title of host publication: Advances in Intelligent Web Mastering - Proceedings of the 7th Atlantic Web Intelligence Conference, AWIC 2011, Fribourg, Switzerland, January, 2011
Editors: Elena Mugellini, Maria Sokhn, Piotr Szczepaniak, Maria Chiara Pettenati
Pages: 121-130
Number of pages: 10
State: Published - 23 Sep 2011

Publication series

Name: Advances in Intelligent and Soft Computing
Volume: 86
ISSN (Print): 1867-5662

Keywords

  • Keyphrase extraction
  • graph-based document representation
  • summarization
  • text mining

ASJC Scopus subject areas

  • Computer Science (all)
