Layout analysis on challenging historical Arabic manuscripts using Siamese network

Reem Alaasam, Berat Kurar, Jihad El-Sana

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

5 Scopus citations

Abstract

This paper presents a layout analysis method for historical Arabic documents using a Siamese network. Given pages from different documents, we divide them into patches of similar size. We train a Siamese network model that takes a pair of patches as input and outputs a distance that reflects how similar the two patches are. We use the trained model to compute a distance matrix, which in turn is used to cluster the patches of a page into main text, side text, or background. We evaluate our method on a challenging dataset of historical Arabic manuscripts and report the F-measure. We demonstrate the effectiveness of our method by comparing it with other deep learning approaches and show that it achieves state-of-the-art results.
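The pipeline described in the abstract (patches, pairwise distances, clustering into three groups) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `placeholder_distance` is a hypothetical stand-in for the trained Siamese network, and k-medoids is an assumed choice, since the abstract does not name the clustering algorithm. All function names are illustrative.

```python
from typing import Callable, List, Sequence

Patch = List[List[float]]  # a grayscale patch stored as rows of pixel values


def split_into_patches(page: Sequence[Sequence[float]], size: int) -> List[Patch]:
    """Cut a page into non-overlapping size x size patches (edge remainders are dropped)."""
    rows, cols = len(page), len(page[0])
    return [
        [list(page[r][left:left + size]) for r in range(top, top + size)]
        for top in range(0, rows - size + 1, size)
        for left in range(0, cols - size + 1, size)
    ]


def mean_intensity(patch: Patch) -> float:
    return sum(sum(row) for row in patch) / (len(patch) * len(patch[0]))


def placeholder_distance(a: Patch, b: Patch) -> float:
    """ASSUMPTION: stand-in for the distance output by the trained Siamese network."""
    return abs(mean_intensity(a) - mean_intensity(b))


def distance_matrix(
    patches: List[Patch], distance: Callable[[Patch, Patch], float]
) -> List[List[float]]:
    """Symmetric matrix of pairwise patch distances."""
    n = len(patches)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = distance(patches[i], patches[j])
    return d


def assign(d: List[List[float]], medoids: List[int]) -> List[int]:
    """Label every item with the index of its nearest medoid."""
    return [
        min(range(len(medoids)), key=lambda c: d[i][medoids[c]])
        for i in range(len(d))
    ]


def k_medoids(d: List[List[float]], k: int, max_iter: int = 20) -> List[int]:
    """ASSUMPTION: k-medoids is used here only because it needs nothing but distances."""
    # Farthest-point initialisation keeps the sketch deterministic.
    medoids = [0]
    while len(medoids) < k:
        medoids.append(
            max(range(len(d)), key=lambda i: min(d[i][m] for m in medoids))
        )
    for _ in range(max_iter):
        labels = assign(d, medoids)
        updated = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            # New medoid: the member with the smallest total distance to its cluster.
            updated.append(
                min(members, key=lambda i: sum(d[i][j] for j in members))
                if members
                else medoids[c]
            )
        if updated == medoids:
            break
        medoids = updated
    return assign(d, medoids)


# Toy page: three vertical bands with intensities 0.0, 0.5 and 1.0.
page = [[0.0] * 4 + [0.5] * 4 + [1.0] * 4 for _ in range(4)]
patches = split_into_patches(page, size=2)
labels = k_medoids(distance_matrix(patches, placeholder_distance), k=3)
```

On the toy page, patches from the same band end up in the same cluster. The cluster indices are arbitrary; deciding which cluster is main text, side text, or background is a separate step that this sketch does not cover.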

Original language: English
Title of host publication: Proceedings - 15th IAPR International Conference on Document Analysis and Recognition, ICDAR 2019
Publisher: IEEE Computer Society
Pages: 738-742
Number of pages: 5
ISBN (Electronic): 9781728128610
State: Published - 1 Sep 2019
Event: 15th IAPR International Conference on Document Analysis and Recognition, ICDAR 2019 - Sydney, Australia
Duration: 20 Sep 2019 to 25 Sep 2019

Publication series

Name: Proceedings of the International Conference on Document Analysis and Recognition, ICDAR
ISSN (Print): 1520-5363

Conference

Conference: 15th IAPR International Conference on Document Analysis and Recognition, ICDAR 2019
Country/Territory: Australia
City: Sydney
Period: 20/09/19 to 25/09/19

Keywords

  • Clustering
  • Historical Arabic Documents
  • Layout Analysis
  • Siamese Network
