A new approach to improving multilingual summarization using a genetic algorithm

Marina Litvak, Mark Last, Menahem Friedman

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

28 Scopus citations

Abstract

Automated summarization methods can be defined as “language-independent” if they are not based on any language-specific knowledge. Such methods can be used for multilingual summarization, defined by Mani (2001) as “processing several languages, with summary in the same language as input.” In this paper, we introduce MUSE, a language-independent approach for extractive summarization based on the linear optimization of several sentence ranking measures using a genetic algorithm. We tested our methodology on two languages, English and Hebrew, and evaluated its performance with ROUGE-1 Recall vs. state-of-the-art extractive summarization approaches. Our results show that MUSE performs better than the best known multilingual approach (TextRank) in both languages. Moreover, our experimental results on a bilingual (English and Hebrew) document collection suggest that MUSE does not need to be retrained on each language and the same model can be used across at least two different languages.
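The core idea in the abstract, scoring each sentence with a weighted linear combination of ranking measures and letting a genetic algorithm search for the weights, can be illustrated with a minimal sketch. This is not the MUSE implementation: the feature values, the set-based unigram recall (a simplified stand-in for ROUGE-1 Recall), the selection, crossover, and mutation operators, and all function and parameter names are assumptions made for illustration.

```python
import random

def score_sentences(features, weights):
    """Score each sentence as a weighted sum of its feature values."""
    return [sum(w * f for w, f in zip(weights, row)) for row in features]

def extract(features, weights, k):
    """Return indices of the k top-scoring sentences, in document order."""
    scores = score_sentences(features, weights)
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(top)

def unigram_recall(candidate_tokens, reference_tokens):
    """Simplified recall: share of distinct reference unigrams found in the
    candidate (an assumption; real ROUGE-1 works on clipped counts)."""
    ref = set(reference_tokens)
    if not ref:
        return 0.0
    return len(ref & set(candidate_tokens)) / len(ref)

def fitness(weights, docs):
    """Average recall of the extracted summaries over a training collection.
    Each doc is a dict with hypothetical keys: sentences, features, reference, k."""
    total = 0.0
    for doc in docs:
        chosen = extract(doc["features"], weights, doc["k"])
        summary = [tok for i in chosen for tok in doc["sentences"][i]]
        total += unigram_recall(summary, doc["reference"])
    return total / len(docs)

def evolve(docs, n_features, pop_size=20, generations=30,
           mutation_rate=0.2, seed=0):
    """Search for a weight vector with a small generational GA
    (requires pop_size >= 4 so that two parents can be sampled)."""
    rng = random.Random(seed)
    population = [[rng.uniform(-1, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda w: fitness(w, docs), reverse=True)
        parents = ranked[: pop_size // 2]      # truncation selection, keeps the best
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]   # uniform crossover
            child = [w + rng.gauss(0, 0.1) if rng.random() < mutation_rate else w
                     for w in child]                           # Gaussian mutation
            children.append(child)
        population = parents + children
    return max(population, key=lambda w: fitness(w, docs))
```

Because the features in such a setup are computed without language-specific resources, a weight vector found on one collection can in principle be applied unchanged to documents in another language, which is the property the abstract reports for English and Hebrew.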

Original language: English
Title of host publication: ACL 2010 - 48th Annual Meeting of the Association for Computational Linguistics, Conference Proceedings
Editors: Jan Hajic, Sandra Carberry, Stephen Clark
Publisher: Association for Computational Linguistics (ACL)
Pages: 927-936
Number of pages: 10
ISBN (Electronic): 1932432663, 9781932432664
State: Published - 1 Jan 2010
Event: 48th Annual Meeting of the Association for Computational Linguistics, ACL 2010 - Uppsala, Sweden
Duration: 11 Jul 2010 to 16 Jul 2010

Publication series

Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
Volume: 2010-July
ISSN (Print): 0736-587X

Conference

Conference: 48th Annual Meeting of the Association for Computational Linguistics, ACL 2010
Country/Territory: Sweden
City: Uppsala
Period: 11/07/10 to 16/07/10

ASJC Scopus subject areas

  • Computer Science Applications
  • Linguistics and Language
  • Language and Linguistics
