TY - GEN
T1 - HADARA - A software system for semi-automatic processing of historical handwritten Arabic documents
AU - Pantke, Werner
AU - Märgner, Volker
AU - Fecker, Daniel
AU - Fingscheidt, Tim
AU - Asi, Abedelkadir
AU - Biller, Ofer
AU - El-Sana, Jihad
AU - Saabni, Raid
AU - Yehia, Mohammad
PY - 2013/9/3
Y1 - 2013/9/3
AB - Recently, many large libraries around the world have been scanning their collections to make them publicly available and to preserve historical documents. We present a modular software system that can be used as a tool for semi-automatic processing of historical handwritten Arabic documents. The development of this system is part of the HADARA project, which aims at historical document analysis of Arabic manuscripts; the project team includes engineers and computer scientists as well as users such as linguists and historians. The HADARA system is designed to support script and content analysis, identification, and classification of historical Arabic documents. The system has been created following an iterative development approach, and the current version assists the user interactively and, in part, automatically. In this paper, a system overview is given and the first modules, which support the semi-automatic annotation of a scanned manuscript, are presented. They comprise page layout analysis, text line segmentation, and transcription. Word spotting is the first application implemented in the HADARA system, and its concept is outlined in this paper.
UR - http://www.scopus.com/inward/record.url?scp=84883159396&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84883159396
SN - 9780892083046
T3 - Archiving 2013 - Final Program and Proceedings
SP - 161
EP - 166
BT - Archiving 2013 - Final Program and Proceedings
T2 - 10th IS&T Archiving Conference, Archiving 2013
Y2 - 2 April 2013 through 5 April 2013
ER -