We propose a variational method for model-based segmentation of gray-scale images of highly degraded historical documents. Given a training set of characters of a given letter, we construct a small set of shape models that cover most of the training set's shape variance. For each gray-scale image of a degraded character, we construct a custom-made shape prior from those fragments of the shape models that best fit the character's boundary. As a result, the prior is not restricted to any single shape in the shape model set. In addition, we demonstrate the application of our shape prior to degraded character recognition. Experiments show that our method achieves accurate results in both segmentation and recognition of highly degraded characters. When compared with manual segmentation, the average distance between the boundaries of corresponding segmented characters was 0.8 pixels (the average character size was 70 × 70 pixels).
- Degraded character recognition
- Historical documents
- Level set
- Shape prior