Cross-Lingual UMLS Named Entity Linking using UMLS Dictionary Fine-Tuning

Rina Galperin, Shachar Schnapp, Michael Elhadad

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

2 Scopus citations

Abstract

We study cross-lingual UMLS named entity linking, where mentions in a given source language are mapped to UMLS concepts, most of which are labeled in English. Our cross-lingual framework includes an offline unsupervised construction of a translated UMLS dictionary and a per-document pipeline which identifies UMLS candidate mentions and uses a fine-tuned pretrained transformer language model to filter candidates according to context. Our method exploits a small dataset of manually annotated UMLS mentions in the source language and uses this supervised data in two ways: to extend the unsupervised UMLS dictionary and to fine-tune the contextual filtering of candidate mentions in full documents. We demonstrate results of our approach on both Hebrew and English. We achieve new state-of-the-art (SOTA) results on the Hebrew Camoni corpus, +8.9 F1 on average across three communities in the dataset. We also achieve new SOTA on the English dataset MedMentions with +7.3 F1.

Original languageEnglish
Title of host publicationACL 2022 - 60th Annual Meeting of the Association for Computational Linguistics, Findings of ACL 2022
EditorsSmaranda Muresan, Preslav Nakov, Aline Villavicencio
PublisherAssociation for Computational Linguistics (ACL)
Pages3380-3390
Number of pages11
ISBN (Electronic)9781955917254
DOIs
StatePublished - 1 Jan 2022
Event60th Annual Meeting of the Association for Computational Linguistics, ACL 2022 - Dublin, Ireland
Duration: 22 May 202227 May 2022

Publication series

NameProceedings of the Annual Meeting of the Association for Computational Linguistics
ISSN (Print)0736-587X

Conference

Conference60th Annual Meeting of the Association for Computational Linguistics, ACL 2022
Country/TerritoryIreland
CityDublin
Period22/05/2227/05/22

ASJC Scopus subject areas

  • Computer Science Applications
  • Linguistics and Language
  • Language and Linguistics

Fingerprint

Dive into the research topics of 'Cross-Lingual UMLS Named Entity Linking using UMLS Dictionary Fine-Tuning'. Together they form a unique fingerprint.

Cite this