Abstract
This survey provides a comprehensive overview of recent advances in text line segmentation and baseline detection for historical document image analysis. Text line extraction is an essential step in the historical document image analysis pipeline, because its results significantly affect the accuracy of subsequent processes such as handwritten text recognition (HTR). Through a multi-stage selection procedure, we selected 49 peer-reviewed studies published since 2019. Based on an analysis of these studies, we summarize the existing datasets, describe and categorize the methods, and summarize the evaluation protocols. In addition, we compare the results of various methods on benchmark datasets. Finally, we highlight gaps and suggest directions for future research. We believe this survey will assist researchers working in historical document image analysis by offering critical insights into the latest developments and providing a foundation for future research.
| Original language | English |
|---|---|
| Pages (from-to) | 3-39 |
| Number of pages | 37 |
| Journal | International Journal on Document Analysis and Recognition |
| Volume | 29 |
| Issue number | 1 |
| State | Published - 1 Mar 2026 |
| Externally published | Yes |
Keywords
- Baseline detection
- Historical documents
- Review
- Survey
- Text line extraction
ASJC Scopus subject areas
- Software
- Computer Vision and Pattern Recognition
- Computer Science Applications
Title
Recent advances in text line segmentation and baseline detection in historical document images: a systematic review