Abstract
We present a learning-free method for text line segmentation of historical handwritten document images. The method combines automatic scale selection with second derivatives of anisotropic Gaussian filters to detect the blob lines that strike through the text lines. The detected blob lines then guide an energy minimization procedure that extracts the text lines. Historical handwritten documents contain noise, heterogeneous text line heights, skew, and characters that touch across adjacent text lines. Automatic scale selection adapts to this heterogeneity, provided that the character height range is correctly estimated. In the extraction phase, the method can accurately split characters that touch across text lines. We report results for various settings and compare the method with recent learning-free and learning-based methods on the cBAD competition dataset.
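The blob line detection step described in the abstract can be illustrated with a minimal sketch. It assumes dark text on a light background, horizontal (unskewed) lines, and scale normalization by the squared vertical sigma; the function name `blob_line_response`, the `elongation` parameter, and the candidate `sigmas` are hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np
from scipy.ndimage import gaussian_filter


def blob_line_response(image, sigmas, elongation=2.0):
    """Multi-scale second-derivative response for horizontal blob lines.

    image      : 2D float array, dark text on a light background (assumption).
    sigmas     : candidate vertical scales, e.g. derived from an estimated
                 character height range (hypothetical parameterization).
    elongation : ratio sigma_x / sigma_y of the anisotropic Gaussian
                 (illustrative value, not from the paper).
    Returns the per-pixel maximum response and the scale that produced it.
    """
    sigmas = np.asarray(sigmas, dtype=float)
    responses = []
    for sigma_y in sigmas:
        sigma_x = elongation * sigma_y
        # Second derivative along the vertical axis (axis 0) of an
        # anisotropic Gaussian that is wider than it is tall.
        d2 = gaussian_filter(
            image, sigma=(sigma_y, sigma_x), order=(2, 0), mode="nearest"
        )
        # Scale-normalize so responses at different scales are comparable.
        responses.append(sigma_y ** 2 * d2)
    stack = np.stack(responses)
    best = np.argmax(stack, axis=0)       # index of the strongest scale
    return stack.max(axis=0), sigmas[best]


# Synthetic page: a single dark band, 8 pixels tall, on a white background.
page = np.ones((96, 128))
page[44:52, :] = 0.0
response, selected_sigma = blob_line_response(page, [2.0, 4.0, 8.0])
center_row = int(np.argmax(response[:, 64]))
```

In this toy case the strongest response falls on the middle of the band, and the selected scale is the candidate closest to half the band height. Real documents with skew or curved lines would need oriented filters, and the energy minimization that extracts the text lines is not covered by this sketch.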
Original language | English |
---|---|
Article number | 8276 |
Pages (from-to) | 1-19 |
Number of pages | 19 |
Journal | Applied Sciences (Switzerland) |
Volume | 10 |
Issue number | 22 |
State | Published - 2 Nov 2020 |
Keywords
- Historical handwritten documents
- Learning-free
- Text line detection
- Text line extraction
- Text line segmentation
ASJC Scopus subject areas
- General Materials Science
- Instrumentation
- General Engineering
- Process Chemistry and Technology
- Computer Science Applications
- Fluid Flow and Transfer Processes