TY - GEN
T1 - Text line extraction using deep learning and minimal sub seams
AU - Azran, Adi
AU - Schclar, Alon
AU - Saabni, Raid
N1 - Publisher Copyright: © 2021 ACM.
PY - 2021/8/16
Y1 - 2021/8/16
N2 - Accurate text line extraction is a vital prerequisite for efficient and successful text recognition systems, ranging from keyword and phrase search to complete conversion to text. In many cases, the proposed algorithms target binary, pre-processed versions of the image, which may yield unsatisfactory results when the document images are of poor quality. Recently, more papers have presented solutions that work directly on gray-level images [1,2,7,12,15]. In this paper, we present a novel, robust, and efficient algorithm to extract text lines directly from gray-level document images. The proposed approach combines two variants of convolutional neural networks (CNNs), followed by minimal energy seam extraction. The first ConvNet is a modified version of the autoencoder used for biomedical image segmentation [8]. The second is a deep convolutional neural network that works on overlapping vertical slices of the original image. The two variants are combined into one neural net after the output slices of the second net are re-attached. The merged results of the two nets serve as a preprocessed image from which an energy map is obtained for the second phase. In the second phase, we use the algorithm presented in [2] to track minimal energy sub-seams, which are accumulated into full local minimal/maximal separating and medial seams that define the text baselines and the text line regions. We have tested our approach on various multilingual datasets, based on the ICDAR datasets, that span a range of image quality.
AB - Accurate text line extraction is a vital prerequisite for efficient and successful text recognition systems, ranging from keyword and phrase search to complete conversion to text. In many cases, the proposed algorithms target binary, pre-processed versions of the image, which may yield unsatisfactory results when the document images are of poor quality. Recently, more papers have presented solutions that work directly on gray-level images [1,2,7,12,15]. In this paper, we present a novel, robust, and efficient algorithm to extract text lines directly from gray-level document images. The proposed approach combines two variants of convolutional neural networks (CNNs), followed by minimal energy seam extraction. The first ConvNet is a modified version of the autoencoder used for biomedical image segmentation [8]. The second is a deep convolutional neural network that works on overlapping vertical slices of the original image. The two variants are combined into one neural net after the output slices of the second net are re-attached. The merged results of the two nets serve as a preprocessed image from which an energy map is obtained for the second phase. In the second phase, we use the algorithm presented in [2] to track minimal energy sub-seams, which are accumulated into full local minimal/maximal separating and medial seams that define the text baselines and the text line regions. We have tested our approach on various multilingual datasets, based on the ICDAR datasets, that span a range of image quality.
KW - convolutional neural networks
KW - historical document image analysis
KW - image processing
KW - line extraction
KW - local projection profile
KW - minimal seams
KW - seam carving
KW - text line extraction
UR - https://www.scopus.com/pages/publications/85113610367
U2 - 10.1145/3469096.3474941
DO - 10.1145/3469096.3474941
M3 - Conference contribution
AN - SCOPUS:85113610367
T3 - DocEng 2021 - Proceedings of the 2021 ACM Symposium on Document Engineering
BT - DocEng 2021 - Proceedings of the 2021 ACM Symposium on Document Engineering
PB - Association for Computing Machinery, Inc
T2 - 21st ACM Symposium on Document Engineering, DocEng 2021
Y2 - 24 August 2021 through 27 August 2021
ER -