With measured words: Simple sentence selection for black-box optimization of sentence compression algorithms

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

    1 Scopus citation

    Abstract

    Sentence compression is the task of generating a shorter yet grammatical version of a given sentence that preserves the essence of the original. This paper proposes a Black-Box Optimizer for Compression (B-BOC): given a black-box compression algorithm, and assuming that not all sentences need to be compressed, find the best candidates for compression in order to maximize both compression rate and quality. Given a required compression ratio, we consider two scenarios: (i) single-sentence compression, and (ii) sentences-sequence compression. In the first scenario, our optimizer is trained to predict how well each sentence could be compressed while meeting the specified ratio requirement. In the second, the desired compression ratio is applied to a sequence of sentences (e.g., a paragraph) as a whole, rather than to each individual sentence. To achieve that, we use B-BOC to assign an optimal compression ratio to each sentence, then cast the assignment as a Knapsack problem, which we solve using bounded dynamic programming. We evaluate B-BOC in both scenarios on three datasets, demonstrating that our optimizer improves both accuracy and ROUGE F1 score compared to direct application of other compression algorithms.
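
The sequence-level scenario in the abstract can be illustrated with a small dynamic program. The sketch below is not the authors' implementation: it assumes a multiple-choice knapsack formulation in which each sentence offers a few candidate versions, each given as a (token length, predicted quality) pair, and exactly one version per sentence is chosen so that the total length fits a token budget while the summed quality is maximal. The function name `select_compressions`, the option format, and the quality scores are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical option format: (length in tokens, predicted quality score).
Option = Tuple[int, float]


def select_compressions(
    options: List[List[Option]], budget: int
) -> Tuple[float, List[int]]:
    """Choose one option per sentence so the total length is <= budget
    and the summed quality is maximal (multiple-choice knapsack).

    Returns (best total quality, index of the chosen option per sentence).
    Raises ValueError if no combination fits the budget.
    """
    neg_inf = float("-inf")
    n = len(options)
    # best[i][b]: best quality over the first i sentences using exactly b tokens.
    best = [[neg_inf] * (budget + 1) for _ in range(n + 1)]
    # choice[i][b]: option index picked for sentence i-1 to reach best[i][b].
    choice = [[-1] * (budget + 1) for _ in range(n + 1)]
    best[0][0] = 0.0

    for i, opts in enumerate(options, start=1):
        for b in range(budget + 1):
            for k, (length, quality) in enumerate(opts):
                if length <= b and best[i - 1][b - length] > neg_inf:
                    candidate = best[i - 1][b - length] + quality
                    if candidate > best[i][b]:
                        best[i][b] = candidate
                        choice[i][b] = k

    # Any total length up to the budget is acceptable; take the best one.
    end = max(range(budget + 1), key=lambda b: best[n][b])
    if best[n][end] == neg_inf:
        raise ValueError("no assignment fits the budget")

    # Walk the table backwards to recover the chosen option per sentence.
    picks: List[int] = []
    b = end
    for i in range(n, 0, -1):
        k = choice[i][b]
        picks.append(k)
        b -= options[i - 1][k][0]
    picks.reverse()
    return best[n][end], picks
```

For instance, with three sentences of 10, 8, and 12 tokens, each with one compressed alternative, `select_compressions([[(10, 1.0), (6, 0.9)], [(8, 1.0), (5, 0.4)], [(12, 1.0), (7, 0.8)]], 24)` picks the compressed version of the first and third sentences and leaves the second untouched, since compressing the second costs the most quality per token saved. The table has (sentences + 1) × (budget + 1) cells, so the run time grows with the number of sentences, the budget, and the number of options per sentence.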

    Original language: English
    Title of host publication: EACL 2021 - 16th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference
    Publisher: Association for Computational Linguistics (ACL)
    Pages: 1625-1634
    Number of pages: 10
    ISBN (Electronic): 9781954085022
    State: Published - 1 Jan 2021
    Event: 16th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2021 - Virtual, Online
    Duration: 19 Apr 2021 to 23 Apr 2021

    Publication series

    Name: EACL 2021 - 16th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference

    Conference

    Conference: 16th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2021
    City: Virtual, Online
    Period: 19/04/21 to 23/04/21

    ASJC Scopus subject areas

    • Software
    • Computational Theory and Mathematics
    • Linguistics and Language
