Query-based summarization using MDL principle

Natalia Vanetik, Marina Litvak

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

29 Scopus citations

Abstract

Query-based text summarization aims to extract from the original text the essential information that answers a query. The answer is presented in a minimal, often predefined, number of words. In this paper we introduce a new unsupervised approach to query-based extractive summarization, based on the minimum description length (MDL) principle and employing the Krimp compression algorithm (Vreeken et al., 2011). The key idea of our approach is to select frequent word sets related to a given query that compress document sentences better and therefore describe the document better. A summary is extracted by selecting sentences that best cover query-related frequent word sets. The approach is evaluated on the DUC 2005 and DUC 2006 datasets, which are specifically designed for query-based summarization (DUC, 2005, 2006). It is competitive with the best reported results.
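
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it replaces Krimp's compression-based choice of word sets with plain support counting, and the function names (`tokenize`, `frequent_word_sets`, `summarize`) and parameters (`min_support`, `max_size`, `max_words`) are hypothetical. It only shows the idea of greedily picking sentences that cover query-related frequent word sets within a word budget.

```python
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Return the set of lowercased words in a text, without punctuation."""
    return {w.strip(".,;:!?()\"'").lower() for w in text.split()} - {""}


def frequent_word_sets(token_sets, min_support=2, max_size=2):
    """Brute-force word sets (up to max_size words) found in >= min_support sentences.

    Assumption: a stand-in for Krimp, which would instead keep the sets
    that shrink the encoded size of the sentences the most.
    """
    counts = Counter()
    for words in token_sets:
        for size in range(1, max_size + 1):
            for itemset in combinations(sorted(words), size):
                counts[frozenset(itemset)] += 1
    return [ws for ws, c in counts.items() if c >= min_support]


def summarize(sentences, query, max_words=20, min_support=2):
    """Greedily select sentences covering query-related frequent word sets."""
    token_sets = [tokenize(s) for s in sentences]
    query_words = tokenize(query)
    # Keep only frequent word sets that share at least one word with the query.
    related = [ws for ws in frequent_word_sets(token_sets, min_support)
               if ws & query_words]
    covered, chosen, used = set(), [], 0
    candidates = list(range(len(sentences)))
    while candidates:
        # Gain = number of not-yet-covered related word sets a sentence contains.
        gains = {i: sum(1 for ws in related
                        if ws not in covered and ws <= token_sets[i])
                 for i in candidates}
        best = max(candidates, key=gains.get)
        if gains[best] == 0:
            break
        candidates.remove(best)
        length = len(sentences[best].split())
        if used + length > max_words:
            continue  # sentence does not fit the word budget; try the next one
        chosen.append(best)
        used += length
        covered.update(ws for ws in related if ws <= token_sets[best])
    # Return the selected sentences in their original document order.
    return [sentences[i] for i in sorted(chosen)]
```

Calling `summarize(sentences, query, max_words=...)` with a list of sentence strings returns the chosen sentences in document order. A faithful version would rank word sets by description-length gain, as Krimp does, rather than by raw frequency.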

Original language: English
Title of host publication: MultiLing 2017 - Workshop on Summarization and Summary Evaluation Across Source Types and Genres, Proceedings of the Workshop
Publisher: Association for Computational Linguistics (ACL)
Pages: 22-31
Number of pages: 10
ISBN (Electronic): 9781945626418
State: Published - 1 Jan 2017
Externally published: Yes
Event: 2017 Workshop on Summarization and Summary Evaluation Across Source Types and Genres, MultiLing 2017 at the EACL 2017 Workshop - Valencia, Spain
Duration: 3 Apr 2017 → …

Publication series

Name: MultiLing 2017 - Workshop on Summarization and Summary Evaluation Across Source Types and Genres, Proceedings of the Workshop

Conference

Conference: 2017 Workshop on Summarization and Summary Evaluation Across Source Types and Genres, MultiLing 2017 at the EACL 2017 Workshop
Country/Territory: Spain
City: Valencia
Period: 3/04/17 → …

ASJC Scopus subject areas

  • Language and Linguistics
  • Computational Theory and Mathematics
  • Computer Science Applications
  • Software
