Capturing the Content of a Document through Complex Event Identification

Zheng Qi, Elior Sulem, Haoyu Wang, Xiaodong Yu, Dan Roth

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Granular events, instantiated in a document by predicates, can usually be grouped into more general events, called complex events. Together, they capture the major content of the document. Recent work grouped granular events by defining event regions, filtering out sentences that are irrelevant to the main content. However, this approach assumes that a given complex event is always described in consecutive sentences, which does not always hold in practice. In this paper, we introduce the task of complex event identification. We address this task as a pipeline, first predicting whether two granular events mentioned in the text belong to the same complex event, independently of their position in the text, and then using this to cluster them into complex events. Due to the difficulty of predicting whether two granular events belong to the same complex event in isolation, we propose a context-augmented representation learning approach, CONTEXTRL, that adds context to better model the pairwise relation between granular events. We show that our approach outperforms strong baselines on the complex event identification task and further present a promising case study exploring the effectiveness of using complex events as input for document-level argument extraction.
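The two-stage pipeline described in the abstract (pairwise prediction, then clustering) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not say which clustering method the authors use, so the sketch simply merges events transitively with union-find, and `toy_scorer` is a hypothetical stand-in for a learned pairwise model such as CONTEXTRL, not the authors' implementation.

```python
from itertools import combinations
from typing import Callable, Dict, List, Sequence


def cluster_events(
    events: Sequence[str],
    same_complex_event: Callable[[str, str], float],
    threshold: float = 0.5,
) -> List[List[str]]:
    """Group events whose pairwise score reaches the threshold, transitively.

    `same_complex_event` is a hypothetical pairwise scorer; thresholded
    union-find is an assumed clustering step, not the paper's method.
    """
    parent = list(range(len(events)))

    def find(i: int) -> int:
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Stage 1: score every pair, regardless of position in the document.
    for i, j in combinations(range(len(events)), 2):
        if same_complex_event(events[i], events[j]) >= threshold:
            # Stage 2: merge the two events' clusters.
            parent[find(i)] = find(j)

    groups: Dict[int, List[str]] = {}
    for i, event in enumerate(events):
        groups.setdefault(find(i), []).append(event)
    return list(groups.values())


def toy_scorer(a: str, b: str) -> float:
    """Hypothetical scorer: 1.0 if two predicates share a hand-written topic."""
    topics = {
        "attacked": "conflict",
        "fired": "conflict",
        "negotiated": "talks",
        "signed": "talks",
    }
    return 1.0 if topics[a] == topics[b] else 0.0


# Events listed in document order; related events are not adjacent.
events = ["attacked", "negotiated", "fired", "signed"]
clusters = cluster_events(events, toy_scorer)
print(clusters)
```

Because every pair is scored, the events "attacked" and "fired" end up in the same cluster even though another event sits between them, which is the situation that region-based grouping over consecutive sentences cannot handle.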

Original language: English
Title of host publication: *SEM 2022 - 11th Joint Conference on Lexical and Computational Semantics, Proceedings of the Conference
Editors: Vivi Nastase, Ellie Pavlick, Mohammad Taher Pilehvar, Jose Camacho-Collados, Alessandro Raganato
Publisher: Association for Computational Linguistics (ACL)
Pages: 331-340
Number of pages: 10
ISBN (Electronic): 9781955917988
State: Published - 1 Jan 2022
Externally published: Yes
Event: 11th Joint Conference on Lexical and Computational Semantics, *SEM 2022 - Seattle, United States
Duration: 14 Jul 2022 to 15 Jul 2022

Publication series

Name: *SEM 2022 - 11th Joint Conference on Lexical and Computational Semantics, Proceedings of the Conference

Conference

Conference: 11th Joint Conference on Lexical and Computational Semantics, *SEM 2022
Country/Territory: United States
City: Seattle
Period: 14/07/22 to 15/07/22

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Science Applications
  • Theoretical Computer Science
