Multi-document summarization by extended graph text representation and importance refinement

Uri Mirchev, Mark Last

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

10 Scopus citations

Abstract

Automatic multi-document summarization aims to recognize important text content in a collection of topic-related documents and to represent it in the form of a short abstract or extract. This chapter presents a novel approach to the multi-document summarization problem, focusing on the generic summarization task. The proposed SentRel (Sentence Relations) multi-document summarization algorithm assigns importance scores to documents and sentences in a collection based on two aspects: static and dynamic. In the static aspect, the significance score is recursively inferred from a novel, tripartite graph representation of the text corpus. In the dynamic aspect, the significance score is continuously refined with respect to the current summary content. The resulting summary is generated in the form of complete sentences exactly as they appear in the summarized documents, ensuring the summary's grammatical correctness. The proposed algorithm is evaluated on the TAC 2011 dataset, using DUC 2001 for training and DUC 2004 for parameter tuning. The SentRel ROUGE-1 and ROUGE-2 scores are comparable to those of state-of-the-art summarization systems, which require a different set of textual entities.
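
The abstract does not give the node types of the tripartite graph or the scoring formulas, so the following is only an illustrative sketch of the general two-aspect idea, not the SentRel algorithm itself. It assumes a document/sentence/term graph, a simple mutual-reinforcement iteration for the static score, and a novelty discount for the dynamic score; the names `tokenize`, `static_scores`, and `summarize` are all hypothetical.

```python
def tokenize(text):
    """Lowercase a sentence and return its set of punctuation-stripped words."""
    words = (w.strip(".,;:!?\"'()").lower() for w in text.split())
    return {w for w in words if w}


def static_scores(docs, iterations=20):
    """Assumed static aspect: sentences, terms, and documents reinforce each other.

    docs is a list of documents, each a list of sentence strings.
    """
    sents = [(d, s) for d, doc in enumerate(docs) for s in doc]
    terms = [tokenize(s) for _, s in sents]
    sent_score = [1.0] * len(sents)
    for _ in range(iterations):
        # A term is as important as the sentences that contain it.
        term_score = {}
        for i, words in enumerate(terms):
            for w in words:
                term_score[w] = term_score.get(w, 0.0) + sent_score[i]
        # A document is as important as the sentences it holds.
        doc_score = [0.0] * len(docs)
        for i, (d, _) in enumerate(sents):
            doc_score[d] += sent_score[i]
        # A sentence inherits importance from its terms and its document.
        new = [sum(term_score[w] for w in terms[i]) + doc_score[d]
               for i, (d, _) in enumerate(sents)]
        total = sum(new) or 1.0
        sent_score = [x / total for x in new]
    return sents, terms, sent_score


def summarize(docs, max_sentences=2):
    """Assumed dynamic aspect: rescore candidates against the summary so far."""
    sents, terms, static = static_scores(docs)
    chosen, covered = [], set()
    while len(chosen) < min(max_sentences, len(sents)):
        def dynamic(i):
            # Discount a sentence by the share of its terms already covered.
            novelty = len(terms[i] - covered) / len(terms[i]) if terms[i] else 0.0
            return static[i] * novelty
        best = max((i for i in range(len(sents)) if i not in chosen), key=dynamic)
        chosen.append(best)
        covered |= terms[best]
    # Sentences are returned verbatim, in the order they were selected.
    return [sents[i][1] for i in chosen]


docs = [
    ["The storm hit the coast on Monday.", "Officials ordered evacuations."],
    ["The storm hit the coast on Monday.", "Power outages affected thousands."],
]
summary = summarize(docs, max_sentences=2)
```

In this sketch the sentence repeated across both documents receives the highest static score, and once it is selected its duplicate is discounted to zero, so the second slot goes to a sentence that adds new content.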

Original language: English
Title of host publication: Innovative Document Summarization Techniques
Subtitle of host publication: Revolutionizing Knowledge Understanding
Publisher: IGI Global
Pages: 28-53
Number of pages: 26
ISBN (Electronic): 9781466650206
ISBN (Print): 1466650192, 9781466650190
State: Published - 31 Jan 2014

ASJC Scopus subject areas

  • Computer Science (all)
