Question answering as an automatic evaluation metric for news article summarization

Matan Eyal, Tal Baumel, Michael Elhadad

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

9 Scopus citations

Abstract

Recent work in the field of automatic summarization and headline generation focuses on maximizing ROUGE scores for various news datasets. We present an alternative, extrinsic, evaluation metric for this task, Answering Performance for Evaluation of Summaries. APES utilizes recent progress in the field of reading-comprehension to quantify the ability of a summary to answer a set of manually created questions regarding central entities in the source article. We first analyze the strength of this metric by comparing it to known manual evaluation metrics. We then present an end-to-end neural abstractive model that maximizes APES, while increasing ROUGE scores to competitive results.

Original languageEnglish
Title of host publicationLong and Short Papers
PublisherAssociation for Computational Linguistics (ACL)
Pages3938-3948
Number of pages11
ISBN (Electronic)9781950737130
DOIs
StatePublished - 1 Jan 2019
Event2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2019 - Minneapolis, United States
Duration: 2 Jun 20197 Jun 2019

Publication series

NameNAACL HLT 2019 - 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference
Volume1

Conference

Conference2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2019
Country/TerritoryUnited States
CityMinneapolis
Period2/06/197/06/19

Fingerprint

Dive into the research topics of 'Question answering as an automatic evaluation metric for news article summarization'. Together they form a unique fingerprint.

Cite this