Deep learning for paleographic analysis of medieval Hebrew manuscripts: A DH team collaboration experience

Daria Vasyutinsky Shapira, Irina Rabaev, Berat Kurar Barakat, Ahmad Droby, Jihad El-Sana

Research output: Contribution to journal › Conference article › peer-review


Our research project is part of the Visual Media Lab, headed by Professor Jihad El-Sana, in the Department of Computer Science at Ben-Gurion University of the Negev, Israel. In this interdisciplinary project we apply deep learning models to classify script types and sub-types in medieval Hebrew manuscripts. The model incorporates the techniques and databases of Hebrew paleography and (with reservations) Hebrew codicology. The main theoretical basis of our project is the SfarData dataset, which includes the full codicological descriptions and paleographical definitions of all dated medieval Hebrew manuscripts up to the year 1540. In some exceptional cases, we go beyond the framework of this dataset. The major source of data, in the form of high-definition photos of manuscripts, is the Institute of Microfilmed Hebrew Manuscripts at the National Library of Israel, which has undertaken the mission of collecting copies of all extant Hebrew manuscripts from all over the world. We mostly use manuscripts from the National Library of Israel, the British Library, and the French National Library.

This multidisciplinary project brings together researchers from both fields, Humanities and Computer Science. Currently, one professor, one lecturer, one post-doc, and two doctoral students are participating in the project. This is very exciting work, for which there are no ready-made solutions to the various challenges. We collectively discuss ways to address these challenges and adapt our solutions as we go.

During the presentation, we will talk about how our project functions and how we strive to achieve a common result. The inevitable difficulties that we face during this collaboration include, inter alia, different research systems in Humanities and in Computer Science, a lack of common terminology, different technical training, and different requirements for publications and conferences.

Original language: English
Pages (from-to): 84-92
Number of pages: 9
Journal: CEUR Workshop Proceedings
State: Published - 1 Jan 2020
Event: 2020 Twin Talks 2 and 3 Workshops at DHN 2020 and DH 2020: Understanding and Facilitating Collaboration in Digital Humanities 2020, TwinTalks 2020 - Virtual, Online
Duration: 20 Oct 2020 → …

ASJC Scopus subject areas

  • General Computer Science

