Abstract
In this paper we present a new database of handwritten Arabic script. It is based on five books written by different writers between 1088 and 1451. We took 680 pages from these five books and fully annotated them at the sub-word level. For each page, we manually drew bounding boxes around the individual sub-words and annotated each with its sequence of characters. The database consists of 121,636 sub-word occurrences comprising 244,553 characters, drawn from a vocabulary of 1,731 sub-word forms. The database is described in detail and is designed for training and testing recognition systems for handwritten Arabic sub-words. It is available for research purposes, and we encourage researchers to develop and test new methods using it.
Original language | English |
---|---|
Pages | 11-14 |
Number of pages | 4 |
State | Published - 13 Oct 2017 |
Event | 1st IEEE International Workshop on Arabic Script Analysis and Recognition, ASAR 2017, Nancy, France. Duration: 3 Apr 2017 → 5 Apr 2017 |
Conference
Conference | 1st IEEE International Workshop on Arabic Script Analysis and Recognition, ASAR 2017 |
---|---|
Country/Territory | France |
City | Nancy |
Period | 3/04/17 → 5/04/17 |
ASJC Scopus subject areas
- Computer Vision and Pattern Recognition
- Linguistics and Language
- Computer Science Applications