WAHD: A database for writer identification of Arabic historical documents

Alaa Abdelhaleem, Ahmed Droby, Abedelkader Asi, Majeed Kassis, Reem Al Asam, Jihad El-Sanaa

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

15 Scopus citations

Abstract

A comprehensive Arabic handwritten text database is an important resource for Arabic handwritten text recognition research. It is essential for training text recognition algorithms and vital for evaluating their performance. In this paper, we present a database that includes manuscripts from the Islamic Heritage Project (IHP), consisting of 333 historical manuscripts written by 302 different writers, 23 of whom are known. The database contains 54 manuscripts with known writers, from 13 sources. Among these known writers, 11 have written multiple manuscripts. The total number of pages in the entire database is 36,969. Each manuscript in the database is accompanied by metadata that includes various properties of the manuscript, such as title, creator, subject, language, and copyist name. To enrich our database, we added twenty historical books scanned from the National Library (NLJ) in Jerusalem. The books have different numbers of pages and different writing styles. In addition, we present a number of experimental results on the database using two classifiers, the GMMS System and the OBI/SIFT System. The database is made freely available to researchers worldwide for research on various handwriting-related problems such as text recognition, writer identification, verification, forms analysis, pre-processing and segmentation.

Original language: English
Title of host publication: 1st IEEE International Workshop on Arabic Script Analysis and Recognition, ASAR 2017
Publisher: Institute of Electrical and Electronics Engineers
Pages: 64-68
Number of pages: 5
ISBN (Electronic): 9781509066285
State: Published - 13 Oct 2017
Event: 1st IEEE International Workshop on Arabic Script Analysis and Recognition, ASAR 2017 - Nancy, France
Duration: 3 Apr 2017 to 5 Apr 2017

Publication series

Name: 1st IEEE International Workshop on Arabic Script Analysis and Recognition, ASAR 2017

Conference

Conference: 1st IEEE International Workshop on Arabic Script Analysis and Recognition, ASAR 2017
Country/Territory: France
City: Nancy
Period: 3/04/17 to 5/04/17

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
  • Linguistics and Language
  • Computer Science Applications
