TY - GEN
T1 - WAHD
T2 - 1st IEEE International Workshop on Arabic Script Analysis and Recognition, ASAR 2017
AU - Abdelhaleem, Alaa
AU - Droby, Ahmed
AU - Asi, Abedelkader
AU - Kassis, Majeed
AU - Al Asam, Reem
AU - El-Sanaa, Jihad
N1 - Publisher Copyright: © 2017 IEEE
PY - 2017/10/13
Y1 - 2017/10/13
N2 - A comprehensive Arabic handwritten text database is an important resource for Arabic handwritten text recognition research. It is essential for training text recognition algorithms and for evaluating their performance. In this paper, we present a database that includes manuscripts from the Islamic Heritage Project (IHP), consisting of 333 historical manuscripts written by 302 different writers, 23 of whom are known. The database contains 54 manuscripts with known writers, from 13 sources. Of the known writers, 11 wrote multiple manuscripts. The total number of pages in the entire database is 36,969. Each manuscript in the database is accompanied by metadata that includes various properties of the manuscript, such as title, creator, subject, language, and copyist name. To enrich the database, we added twenty historical books scanned from the National Library (NLJ) in Jerusalem. The books differ in page count and writing style. In addition, we present experimental results on the database using two classifiers, the GMMS system and the OBI/SIFT system. The database is freely available to researchers worldwide for research on handwriting-related problems such as text recognition, writer identification, verification, form analysis, pre-processing, and segmentation.
AB - A comprehensive Arabic handwritten text database is an important resource for Arabic handwritten text recognition research. It is essential for training text recognition algorithms and for evaluating their performance. In this paper, we present a database that includes manuscripts from the Islamic Heritage Project (IHP), consisting of 333 historical manuscripts written by 302 different writers, 23 of whom are known. The database contains 54 manuscripts with known writers, from 13 sources. Of the known writers, 11 wrote multiple manuscripts. The total number of pages in the entire database is 36,969. Each manuscript in the database is accompanied by metadata that includes various properties of the manuscript, such as title, creator, subject, language, and copyist name. To enrich the database, we added twenty historical books scanned from the National Library (NLJ) in Jerusalem. The books differ in page count and writing style. In addition, we present experimental results on the database using two classifiers, the GMMS system and the OBI/SIFT system. The database is freely available to researchers worldwide for research on handwriting-related problems such as text recognition, writer identification, verification, form analysis, pre-processing, and segmentation.
UR - http://www.scopus.com/inward/record.url?scp=85027768590&partnerID=8YFLogxK
U2 - 10.1109/ASAR.2017.8067761
DO - 10.1109/ASAR.2017.8067761
M3 - Conference contribution
AN - SCOPUS:85027768590
T3 - 1st IEEE International Workshop on Arabic Script Analysis and Recognition, ASAR 2017
SP - 64
EP - 68
BT - 1st IEEE International Workshop on Arabic Script Analysis and Recognition, ASAR 2017
PB - Institute of Electrical and Electronics Engineers
Y2 - 3 April 2017 through 5 April 2017
ER -