Recent Advances in Natural Language Processing via Large Pre-trained Language Models: A Survey

Bonan Min, Hayley Ross, Elior Sulem, Amir Pouran Ben Veyseh, Thien Huu Nguyen, Oscar Sainz, Eneko Agirre, Ilana Heintz, Dan Roth

Research output: Contribution to journal › Article › peer-review

678 Scopus citations

Abstract

Large, pre-trained language models (PLMs) such as BERT and GPT have drastically changed the Natural Language Processing (NLP) field. For numerous NLP tasks, approaches leveraging PLMs have achieved state-of-the-art performance. The key idea is to learn a generic, latent representation of language from a generic task once, then share it across disparate NLP tasks. Language modeling serves as this generic task, one for which abundant text is available for extensive self-supervised training. This article presents the key fundamental concepts of PLM architectures and a comprehensive view of the shift to PLM-driven NLP techniques. It surveys work applying three approaches: pre-train then fine-tune, prompting, and NLP as text generation. In addition, it discusses PLM limitations and suggests directions for future research.
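As a rough illustration of the two adaptation paradigms the abstract contrasts (fine-tuning a pre-trained encoder with a task-specific head versus prompting a generative model directly), the sketch below uses the Hugging Face transformers library. It is not drawn from the survey itself; the model names (bert-base-uncased, gpt2) and the toy sentiment example are illustrative assumptions.

    # Illustrative sketch only; not code from the survey. Assumes the Hugging Face
    # `transformers` library and PyTorch are installed.
    import torch
    from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                              pipeline)

    # (1) Pre-train then fine-tune: load a pre-trained encoder (here BERT) plus a
    #     randomly initialized classification head, then train on labeled task data.
    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2)

    inputs = tokenizer("The movie was surprisingly good.", return_tensors="pt")
    labels = torch.tensor([1])  # toy label: 1 = positive sentiment
    loss = model(**inputs, labels=labels).loss
    loss.backward()             # an optimizer step would complete one fine-tuning update

    # (2) Prompting: query a generative PLM (here GPT-2) with natural-language text,
    #     leaving its weights unchanged.
    generator = pipeline("text-generation", model="gpt2")
    print(generator("Review: The movie was surprisingly good. Sentiment:",
                    max_new_tokens=3))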

Original language: English
Article number: 30
Journal: ACM Computing Surveys
Volume: 56
Issue number: 2
DOIs
State: Published - 29 Feb 2024

Keywords

  • Large language models
  • foundational models
  • generative AI
  • neural networks

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
