TY - GEN
T1 - WAHD: A database for writer identification of Arabic historical documents
AU - Abdalhaleem, Alaa
AU - Droby, Ahmed
AU - Asi, Abedelkadir
AU - Kassis, Majeed
AU - Asam, Reem Al
AU - El-Sana, Jihad
N1 - Publisher Copyright: © 2017 IEEE
PY - 2017
Y1 - 2017
N2 - A comprehensive Arabic handwritten text database is an important resource for Arabic handwritten text recognition research. It is essential for training text recognition algorithms and vital for evaluating their performance. In this paper, we present a database that includes manuscripts from the Islamic Heritage Project (IHP), consisting of 333 historical manuscripts written by 302 different writers, 23 of whom are known. The database contains 54 manuscripts whose writers are known, from 13 sources. Among these known writers, 11 have written multiple manuscripts. The total number of pages in the entire database is 36,969. Each manuscript in the database is accompanied by metadata that includes various properties of the manuscript, such as title, creator, subject, language, and copyist name. To enrich the database, we added twenty historical books scanned from the National Library (NLJ) in Jerusalem. The books have different numbers of pages and different writing styles. In addition, we present a number of experimental results on the database using two classifiers, the GMMS system and the OBI/SIFT system. The database is made freely available to researchers worldwide for research on various handwriting-related problems such as text recognition, writer identification, verification, forms analysis, pre-processing, and segmentation.
AB - A comprehensive Arabic handwritten text database is an important resource for Arabic handwritten text recognition research. It is essential for training text recognition algorithms and vital for evaluating their performance. In this paper, we present a database that includes manuscripts from the Islamic Heritage Project (IHP), consisting of 333 historical manuscripts written by 302 different writers, 23 of whom are known. The database contains 54 manuscripts whose writers are known, from 13 sources. Among these known writers, 11 have written multiple manuscripts. The total number of pages in the entire database is 36,969. Each manuscript in the database is accompanied by metadata that includes various properties of the manuscript, such as title, creator, subject, language, and copyist name. To enrich the database, we added twenty historical books scanned from the National Library (NLJ) in Jerusalem. The books have different numbers of pages and different writing styles. In addition, we present a number of experimental results on the database using two classifiers, the GMMS system and the OBI/SIFT system. The database is made freely available to researchers worldwide for research on various handwriting-related problems such as text recognition, writer identification, verification, forms analysis, pre-processing, and segmentation.
UR - http://www.scopus.com/inward/record.url?scp=85027768590&partnerID=8YFLogxK
U2 - 10.1109/ASAR.2017.8067761
DO - 10.1109/ASAR.2017.8067761
M3 - Conference contribution
T3 - 1st IEEE International Workshop on Arabic Script Analysis and Recognition, ASAR 2017
SP - 64
EP - 68
BT - 1st International Workshop on Arabic Script Analysis and Recognition, ASAR 2017, Nancy, France, April 3-5, 2017
PB - Institute of Electrical and Electronics Engineers
Y2 - 3 April 2017 through 5 April 2017
ER -