Abstract
This paper presents a layout analysis method for historical Arabic documents based on a Siamese network. Given pages from different documents, we divide them into patches of similar sizes. We train a Siamese network model that takes a pair of patches as input and outputs a distance that reflects how similar the two patches are. We use the trained model to compute a distance matrix, which in turn is used to cluster the patches of a page into main text, side text, or background. We evaluate our method on a challenging dataset of historical Arabic manuscripts and report the F-measure. We demonstrate its effectiveness by comparing it with other deep learning approaches and show that it achieves state-of-the-art results.
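The distance-matrix and clustering steps described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `toy_distance` stands in for the trained Siamese model, patches are reduced to short feature vectors, and k-medoids is used only because it works directly on a precomputed distance matrix (the abstract does not say which clustering algorithm the authors use).

```python
import math
from typing import Callable, List, Sequence

Patch = Sequence[float]  # hypothetical stand-in for an image patch


def distance_matrix(patches: List[Patch],
                    dist: Callable[[Patch, Patch], float]) -> List[List[float]]:
    """Symmetric pairwise distances; `dist` plays the role of the trained model."""
    n = len(patches)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = dist(patches[i], patches[j])
    return d


def assign(d: List[List[float]], medoids: List[int]) -> List[int]:
    """Label each patch with the index of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: d[i][medoids[c]])
            for i in range(len(d))]


def k_medoids(d: List[List[float]], k: int = 3, iters: int = 20) -> List[int]:
    """Cluster using only a precomputed distance matrix (assumed algorithm)."""
    n = len(d)
    # Deterministic farthest-point initialisation.
    medoids = [0]
    while len(medoids) < k:
        medoids.append(max(range(n),
                           key=lambda i: min(d[i][m] for m in medoids)))
    for _ in range(iters):
        labels = assign(d, medoids)
        updated = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:  # keep the old medoid if a cluster is empty
                updated.append(medoids[c])
                continue
            # New medoid: the member with the smallest total distance to its cluster.
            updated.append(min(members,
                               key=lambda m: sum(d[m][j] for j in members)))
        if updated == medoids:
            break
        medoids = updated
    return assign(d, medoids)


def toy_distance(a: Patch, b: Patch) -> float:
    """Euclidean distance, used here instead of a learned similarity."""
    return math.dist(a, b)


# Six toy "patches" forming three well-separated groups.
patches = [[0, 0], [0.1, 0], [5, 5], [5.1, 5], [10, 0], [10, 0.2]]
labels = k_medoids(distance_matrix(patches, toy_distance), k=3)
```

The cluster indices in `labels` are arbitrary; deciding which cluster corresponds to main text, side text, or background is a separate step that this sketch does not cover.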
Original language | English |
---|---|
Title of host publication | Proceedings - 15th IAPR International Conference on Document Analysis and Recognition, ICDAR 2019 |
Publisher | Institute of Electrical and Electronics Engineers |
Pages | 738-742 |
Number of pages | 5 |
ISBN (Electronic) | 9781728128610 |
State | Published - 1 Sep 2019 |
Event | 15th IAPR International Conference on Document Analysis and Recognition, ICDAR 2019 - Sydney, Australia; Duration: 20 Sep 2019 → 25 Sep 2019 |
Publication series
Name | Proceedings of the International Conference on Document Analysis and Recognition, ICDAR |
---|---|
ISSN (Print) | 1520-5363 |
Conference
Conference | 15th IAPR International Conference on Document Analysis and Recognition, ICDAR 2019 |
---|---|
Country/Territory | Australia |
City | Sydney |
Period | 20/09/19 → 25/09/19 |
Keywords
- Clustering
- Historical Arabic Documents
- Layout Analysis
- Siamese Network
ASJC Scopus subject areas
- Computer Vision and Pattern Recognition