Abstract
We present our work on a paleographic analysis and recognition system for processing historical Hebrew calligraphy documents. The main goal is to analyze documents written in different styles in order to identify the locations, dates, and writers of test documents. Using interactive software tools, we have established a database of extracted characters. It currently contains about 20,000 characters from 34 different writers and will be substantially expanded in the near future. We present preliminary results on the automatic extraction of pre-specified letters using the erosion operator. We further propose and test topological features for handwriting style classification based on a selected subset of the Hebrew alphabet. A writer identification experiment with 34 writers yielded 100% correct classification.
Original language | English |
---|---|
Pages (from-to) | 89-99 |
Number of pages | 11 |
Journal | International Journal on Document Analysis and Recognition |
Volume | 9 |
Issue number | 2-4 |
State | Published - 1 Apr 2007 |
Keywords
- Binarization
- Character extraction
- Document analysis
- Historical documents
- Writer identification
ASJC Scopus subject areas
- Software
- Computer Vision and Pattern Recognition
- Computer Science Applications