Abstract
Historical documents and archaeological artifacts are hard to process due to natural degradation, fading, spills, tears, overlaid data, and so on. In this work, we focus on the task of recovering characters and symbols from images of corrupted archaeological artifacts where data is partially erased, occluded, or overwritten by other data. Such phenomena are widely observed in image datasets of palimpsests and petroglyphs, which consist of erased, overwritten, and in general heavily degraded data. Segmentation and binarization are typically applied to such images to detect characters and symbols and separate them from their background. However, these methods mainly focus on the visible data, whereas in our case, due to heavy corruption, both visible and invisible information should be considered. For example, computing the segmentation mask of an occluded character also requires labeling invisible pixels and missing parts. In this work, we introduce a deep neural network that computes character segmentation in palimpsests and petroglyphs while overcoming occlusions, missing parts, and degradation. Our network has inference abilities: it not only segments the symbol's foreground pixels but also infers and completes missing and corrupted parts. Since palimpsests and petroglyphs have very limited annotated ground-truth data, we also introduce data augmentation tools to properly train our network. We demonstrate both the qualitative and quantitative performance of our method, including a user study involving expert evaluation.
Original language | English |
---|---|
Article number | 13 |
Journal | Journal on Computing and Cultural Heritage |
Volume | 15 |
Issue number | 1 |
State | Published - 1 Feb 2022 |
Keywords
- Segmentation
- Machine learning
- Shape recognition
ASJC Scopus subject areas
- Conservation
- Information Systems
- Computer Science Applications
- Computer Graphics and Computer-Aided Design