TY - GEN
T1 - Query-based summarization using MDL principle
AU - Vanetik, Natalia
AU - Litvak, Marina
N1 - Publisher Copyright: © 2017 Association for Computational Linguistics.
PY - 2017/1/1
Y1 - 2017/1/1
N2 - Query-based text summarization aims to extract essential information that answers the query from the original text. The answer is presented in a minimal, often predefined, number of words. In this paper we introduce a new unsupervised approach for query-based extractive summarization, based on the minimum description length (MDL) principle, that employs the Krimp compression algorithm (Vreeken et al., 2011). The key idea of our approach is to select frequent word sets related to a given query that compress document sentences better and therefore describe the document better. A summary is extracted by selecting sentences that best cover query-related frequent word sets. The approach is evaluated on the DUC 2005 and DUC 2006 datasets, which are specifically designed for query-based summarization (DUC, 2005, 2006). Its results are competitive with the best reported results.
AB - Query-based text summarization aims to extract essential information that answers the query from the original text. The answer is presented in a minimal, often predefined, number of words. In this paper we introduce a new unsupervised approach for query-based extractive summarization, based on the minimum description length (MDL) principle, that employs the Krimp compression algorithm (Vreeken et al., 2011). The key idea of our approach is to select frequent word sets related to a given query that compress document sentences better and therefore describe the document better. A summary is extracted by selecting sentences that best cover query-related frequent word sets. The approach is evaluated on the DUC 2005 and DUC 2006 datasets, which are specifically designed for query-based summarization (DUC, 2005, 2006). Its results are competitive with the best reported results.
UR - http://www.scopus.com/inward/record.url?scp=85061896931&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85061896931
T3 - MultiLing 2017 - Workshop on Summarization and Summary Evaluation Across Source Types and Genres, Proceedings of the Workshop
SP - 22
EP - 31
BT - MultiLing 2017 - Workshop on Summarization and Summary Evaluation Across Source Types and Genres, Proceedings of the Workshop
PB - Association for Computational Linguistics (ACL)
T2 - 2017 Workshop on Summarization and Summary Evaluation Across Source Types and Genres, MultiLing 2017 at the EACL 2017 Workshop
Y2 - 3 April 2017
ER -