Recent advances in text line segmentation and baseline detection in historical document images: a systematic review

  • Irina Rabaev
  • Marina Litvak

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

This survey provides a comprehensive overview of recent advances in text line segmentation and baseline detection for historical document images. Text line extraction is an essential step in the historical document image analysis pipeline, because its results significantly affect the accuracy of subsequent processes such as handwritten text recognition (HTR). Through a multi-stage procedure, we selected 49 peer-reviewed studies published since 2019. Based on a careful analysis of these studies, we summarize the existing datasets, describe and categorize the methods, and summarize the evaluation protocols. In addition, we compare the results of various methods on benchmark datasets. Finally, we highlight gaps and suggest directions for future research. We believe this survey will assist researchers working in historical document image analysis by offering critical insights into the latest developments and a foundation for future research.

Original language: English
Pages (from-to): 3-39
Number of pages: 37
Journal: International Journal on Document Analysis and Recognition
Volume: 29
Issue number: 1
DOIs
State: Published - 1 Mar 2026
Externally published: Yes

Keywords

  • Baseline detection
  • Historical documents
  • Review
  • Survey
  • Text line extraction

ASJC Scopus subject areas

  • Software
  • Computer Vision and Pattern Recognition
  • Computer Science Applications
