Evolution maps for connected components in text documents

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

5 Scopus citations

Abstract

For highly degraded text documents, common tasks such as binarization and line extraction remain difficult. Equipped with reliable information about the distribution of character dimensions in the document, one can improve the results of these algorithms significantly. We introduce a novel perspective on the image data that maps the evolution of connected components as the gray-scale threshold changes. We use these maps to provide a robust algorithm for extracting information about character dimensions in degraded documents, and demonstrate improved binarization results using this information. We statistically analyze the characteristics of the evolution maps for text documents, and compare our results with ground truth data.
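The abstract does not spell out how an evolution map is built, so the following is only a minimal sketch of one plausible reading: for each gray-scale threshold, binarize the image, label the connected components, and histogram their bounding-box heights. The function names (`component_heights`, `evolution_map`), the use of 4-connectivity, the choice of height as the tracked dimension, and the toy image are all assumptions made for illustration, not the authors' method.

```python
from collections import deque


def component_heights(gray, threshold):
    """Bounding-box heights of 4-connected dark components at one threshold.

    A pixel counts as foreground when its gray value is below `threshold`
    (assumption: dark text on a light background).
    """
    rows, cols = len(gray), len(gray[0])
    seen = [[False] * cols for _ in range(rows)]
    heights = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or gray[r][c] >= threshold:
                continue
            # Breadth-first flood fill over one component.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, bottom = r, r
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and gray[ny][nx] < threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            heights.append(bottom - top + 1)
    return heights


def evolution_map(gray, thresholds):
    """Map each threshold to a histogram {component height: count}."""
    result = {}
    for t in thresholds:
        hist = {}
        for h in component_heights(gray, t):
            hist[h] = hist.get(h, 0) + 1
        result[t] = hist
    return result


# Hypothetical toy image: one dark stroke (value 10) and one faint stroke (120).
image = [
    [255, 255, 255, 255, 255],
    [255,  10, 255, 120, 255],
    [255,  10, 255, 120, 255],
    [255,  10, 255, 255, 255],
    [255, 255, 255, 255, 255],
]
emap = evolution_map(image, [50, 150])
```

In this toy case the faint stroke only appears as a component once the threshold rises above its gray value, which is the kind of change across thresholds that such a map is meant to expose. On real scans a library labeling routine would likely replace the hand-written flood fill.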

Original language: English
Title of host publication: Proceedings - 13th International Conference on Frontiers in Handwriting Recognition, ICFHR 2012
Pages: 405-410
Number of pages: 6
State: Published - 1 Dec 2012
Event: 13th International Conference on Frontiers in Handwriting Recognition, ICFHR 2012 - Bari, Italy
Duration: 18 Sep 2012 to 20 Sep 2012

Publication series

Name: Proceedings - International Workshop on Frontiers in Handwriting Recognition, IWFHR
ISSN (Print): 1550-5235

Conference

Conference: 13th International Conference on Frontiers in Handwriting Recognition, ICFHR 2012
Country/Territory: Italy
City: Bari
Period: 18/09/12 to 20/09/12
