Robust and efficient text‐line extraction by local minimal sub-seams

Raid Saabni

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-reviewed

1 Scopus citation

Abstract

Robust text-line extraction from document images is a vital prerequisite for any successful text recognition or analysis process. Most algorithms proposed for this task assume some kind of binarization pre-processing step in order to ensure good performance. In this paper, we present a novel, robust, and efficient algorithm that extracts text lines directly from gray-level document images. The algorithm tracks minimal-energy sub-seams, which are accumulated into full local minimal/maximal separating and medial seams that define the text lines. To make such seams easier to extract, we enhance the image using a double-sided adaptive local density projection profile followed by a multi-scale anisotropic second-derivative-of-Gaussian filter bank. Following the observation that the centers of lines are more reliable to follow, we first extract the seams that follow the centers of lines and use them to constrain the evolution of the separating seams. The algorithm requires no user-tuned parameters: its free parameters are estimated directly by analyzing the image properties and the pixel distribution. We have tested our approach on various multilingual datasets covering a range of image quality and obtained very encouraging results, which outperform state-of-the-art algorithms.
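The seam-tracking idea in the abstract can be illustrated with a generic dynamic-programming sketch. This is not the paper's algorithm: it assumes an energy map is already available (for example, low values in the gaps between text lines), it lets a seam move at most one row per column, and it splits the image into fixed-width column strips as a stand-in for "sub-seams". The names `minimal_horizontal_seam`, `sub_seams`, and `strip_width` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def minimal_horizontal_seam(energy):
    """Return, per column, the row index of a minimal-energy left-to-right seam.

    Assumption for this sketch: the seam may move at most one row up or down
    between neighbouring columns.
    """
    energy = np.asarray(energy, dtype=float)
    rows, cols = energy.shape
    cost = energy.copy()                      # cumulative cost table
    back = np.zeros((rows, cols), dtype=int)  # row chosen in the previous column
    row_idx = np.arange(rows)

    for c in range(1, cols):
        prev = cost[:, c - 1]
        # Cumulative costs of the three possible predecessors (rows r-1, r, r+1);
        # infinity marks predecessors that fall outside the image.
        up = np.concatenate(([np.inf], prev[:-1]))
        down = np.concatenate((prev[1:], [np.inf]))
        stacked = np.stack((up, prev, down))
        choice = np.argmin(stacked, axis=0)   # 0 -> r-1, 1 -> r, 2 -> r+1
        cost[:, c] += stacked[choice, row_idx]
        back[:, c] = row_idx + choice - 1

    # Backtrack from the cheapest end point in the last column.
    seam = np.empty(cols, dtype=int)
    seam[-1] = int(np.argmin(cost[:, -1]))
    for c in range(cols - 1, 0, -1):
        seam[c - 1] = back[seam[c], c]
    return seam


def sub_seams(energy, strip_width):
    """Compute one minimal seam per vertical strip of `strip_width` columns.

    Returns a list of (start_column, seam_rows) pairs. How the paper selects
    and joins its sub-seams is not specified here; this only shows the
    per-strip computation.
    """
    energy = np.asarray(energy, dtype=float)
    return [
        (start, minimal_horizontal_seam(energy[:, start:start + strip_width]))
        for start in range(0, energy.shape[1], strip_width)
    ]
```

Calling `sub_seams(energy, strip_width)` on a 2-D array yields one locally optimal seam per strip; a full pipeline would still need to build the energy map from the enhanced image and to link the per-strip seams into complete separating or medial seams.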

Original language: English
Title of host publication: Proceedings of the 2nd International Symposium on Computer Science and Intelligent Control, ISCSIC 2018
Publisher: Association for Computing Machinery
ISBN (Electronic): 9781450366281
State: Published - 21 Sep 2018
Externally published: Yes
Event: 2nd International Symposium on Computer Science and Intelligent Control, ISCSIC 2018 - Stockholm, Sweden
Duration: 21 Sep 2018 → 23 Sep 2018

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 2nd International Symposium on Computer Science and Intelligent Control, ISCSIC 2018
Country/Territory: Sweden
City: Stockholm
Period: 21/09/18 → 23/09/18

Keywords

  • Document Image Analyzing
  • Line Extraction
  • Local projection profile
  • Minimal Seams
  • Multi-scale anisotropic Gaussian filter bank

ASJC Scopus subject areas

  • Software
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Computer Networks and Communications
