Text Line Extraction Using Fully Convolutional Network and Energy Minimization

Berat Kurar Barakat, Ahmad Droby, Reem Alaasam, Boraq Madi, Irina Rabaev, Jihad El-Sana

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Text lines are important components of handwritten document images and are easier for downstream applications to analyze. Despite recent progress in text line detection, text line extraction from handwritten documents remains an unsolved task. This paper proposes a fully convolutional network for text line detection and energy minimization for text line extraction. Detected text lines are represented by blob lines that strike through the text lines, and these blob lines guide an energy function for text line extraction. The detection stage can locate arbitrarily oriented text lines. The extraction stage can identify the pixels of text lines with varying heights and interline proximity, independent of their orientation, and can finely split touching and overlapping text lines without an orientation assumption. We evaluate the proposed method on the VML-AHTE, VML-MOC, and Diva-HisDB datasets. The VML-AHTE dataset contains overlapping, touching, and close text lines with rich diacritics. The VML-MOC dataset is very challenging because of its multiply oriented and skewed text lines. The Diva-HisDB dataset exhibits distinct text line heights and touching text lines. The results demonstrate the effectiveness of the method across these challenges, using the same parameters in all experiments.
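
The extraction idea in the abstract (blob lines guiding an energy function) can be illustrated with a small sketch. The abstract does not give the energy terms or the solver, so everything below is an assumption made for illustration: connected components are reduced to hypothetical centroids, the data cost is the distance from a centroid to a blob line given as a polyline, the smoothness cost is a constant Potts penalty (smooth_weight) on hypothetical neighbor pairs, and the minimizer is iterated conditional modes (ICM), a simple local search swapped in here and not necessarily the authors' optimizer.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def data_cost(centroid, blob_line):
    """Assumed data term: distance from a centroid to a polyline blob line."""
    return min(
        point_segment_distance(centroid, a, b)
        for a, b in zip(blob_line, blob_line[1:])
    )


def energy(labels, centroids, blob_lines, neighbors, smooth_weight):
    """Total energy: data costs plus a Potts penalty per disagreeing pair."""
    data = sum(data_cost(c, blob_lines[l]) for c, l in zip(centroids, labels))
    smooth = sum(smooth_weight for i, j in neighbors if labels[i] != labels[j])
    return data + smooth


def nearest_labels(centroids, blob_lines):
    """Baseline: assign each component to its closest blob line."""
    return [
        min(range(len(blob_lines)), key=lambda l: data_cost(c, blob_lines[l]))
        for c in centroids
    ]


def assign_components(centroids, blob_lines, neighbors, smooth_weight, max_iters=20):
    """Approximately minimize the energy with ICM (illustrative solver)."""
    labels = nearest_labels(centroids, blob_lines)
    adjacency = {i: [] for i in range(len(centroids))}
    for i, j in neighbors:
        adjacency[i].append(j)
        adjacency[j].append(i)
    for _ in range(max_iters):
        changed = False
        for i, c in enumerate(centroids):
            def local_cost(l):
                disagreements = sum(1 for j in adjacency[i] if labels[j] != l)
                return data_cost(c, blob_lines[l]) + smooth_weight * disagreements
            best = min(range(len(blob_lines)), key=local_cost)
            # Accept only strict improvements so the energy never increases.
            if local_cost(best) < local_cost(labels[i]):
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels


# Toy layout: two horizontal blob lines and five component centroids.
# Component 4 mimics a diacritic lying slightly closer to the upper blob
# line but linked (as a hypothetical neighbor) to lower-line components.
blob_lines = [[(0, 0), (100, 0)], [(0, 10), (100, 10)]]
centroids = [(10, 1), (50, -1), (20, 9), (60, 11), (52, 4.5)]
neighbors = [(0, 1), (2, 3), (2, 4), (3, 4)]
labels = assign_components(centroids, blob_lines, neighbors, smooth_weight=1.0)
```

In this toy layout, a purely nearest-line assignment would attach component 4 to the first blob line, while the smoothness term pulls it to the line of its neighbors. The weight and the neighbor relation are illustrative choices, not values taken from the paper.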

Original language: English GB
Title of host publication: Pattern Recognition. ICPR International Workshops and Challenges
Editors: Alberto Del Bimbo, Rita Cucchiara, Stan Sclaroff, Giovanni Maria Farinella, Tao Mei, Marco Bertini, Hugo Jair Escalante, Roberto Vezzani
Place of Publication: Cham
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 126-140
Number of pages: 15
ISBN (Print): 9783030687861
State: Published - 1 Jan 2021
Event: 25th International Conference on Pattern Recognition Workshops, ICPR 2020 - Milan, Italy
Duration: 10 Jan 2021 to 11 Jan 2021

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12667 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 25th International Conference on Pattern Recognition Workshops, ICPR 2020
Country/Territory: Italy
City: Milan
Period: 10/01/21 to 11/01/21

Keywords

  • Handwritten document
  • Historical documents analysis
  • Text line detection
  • Text line extraction
  • Text line segmentation

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
