TY - GEN
T1 - Hard and Soft Labeling for Hebrew Paleography
T2 - 15th IAPR International Workshop on Document Analysis Systems, DAS 2022
AU - Droby, Ahmad
AU - Shapira, Daria Vasyutinsky
AU - Rabaev, Irina
AU - Barakat, Berat Kurar
AU - El-Sana, Jihad
N1 - Publisher Copyright: © 2022, Springer Nature Switzerland AG.
PY - 2022/1/1
Y1 - 2022/1/1
AB - Paleography studies the writing styles of manuscripts and recognizes different styles and modes of scripts. We explore the applicability of hard and soft-labeling for training deep-learning models to classify Hebrew scripts. In contrast to the hard-labeling scheme, where each document image has one label representing its class, the soft-labeling approach labels an image by a label vector. Each element of the vector is the similarity of the document image to a certain regional writing style or graphical mode. In addition, we introduce a dataset of medieval Hebrew manuscripts that provides complete coverage of major Hebrew writing styles and modes. A Hebrew paleography expert manually annotated the ground truth for soft-labeling. We compare the applicability of soft and hard-labeling approaches on the presented dataset, analyze, and discuss the findings.
KW - Convolutional neural network
KW - Digital paleography
KW - Medieval Hebrew manuscripts
KW - Script type classification
KW - Soft-labeling
UR - http://www.scopus.com/inward/record.url?scp=85131137328&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-06555-2_33
DO - 10.1007/978-3-031-06555-2_33
M3 - Conference contribution
AN - SCOPUS:85131137328
SN - 9783031065545
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 492
EP - 506
BT - Document Analysis Systems - 15th IAPR International Workshop, DAS 2022, Proceedings
A2 - Uchida, Seiichi
A2 - Barney, Elisa
A2 - Eglin, Véronique
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 22 May 2022 through 25 May 2022
ER -