Cross-Lingual Extractive Question Answering with Unanswerable Questions

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Cross-lingual Extractive Question Answering (EQA) extends standard EQA by requiring models to find answers in passages written in languages different from that of the question. The Generalized Cross-Lingual Transfer (G-XLT) task evaluates models' zero-shot ability to transfer question answering capabilities across languages using only English training data. While previous research has primarily focused on scenarios where answers are always present, real-world applications often encounter situations where no answer exists within the given context. This paper introduces an enhanced G-XLT task definition that explicitly handles unanswerable questions, bridging a critical gap in current research. To address this challenge, we present two new datasets, miXQuAD and MLQA-IDK, which include both answerable and unanswerable questions and cover 12 and 7 language pairs, respectively. Our study evaluates state-of-the-art large language models using fine-tuning, parameter-efficient techniques, and in-context learning approaches, revealing interesting trade-offs between a smaller fine-tuned model's performance on answerable questions and a larger in-context learning model's capability on unanswerable questions. We also examine language similarity patterns based on model performance, finding alignments with known language families.
Original language: English
Title of host publication: Proceedings of the 14th Joint Conference on Lexical and Computational Semantics (*SEM 2025)
Editors: Lea Frermann, Mark Stevenson
Place of Publication: Suzhou, China
Publisher: Association for Computational Linguistics
Pages: 100-121
Number of pages: 22
ISBN (Print): 9798891763401
DOIs
State: Published - 1 Nov 2025

