Abstract
Few-shot learning in semantic segmentation has recently gained significant attention for its adaptability in applications where only a few or no examples are available as support for training. Here we advocate a new testing paradigm, which we call half-shot learning (HSL), that evaluates a model's ability to generalise to new categories when support objects are partially viewed, significantly cropped, occluded, noised, or aggressively transformed. This paradigm introduces challenges that can drive advances in the field, allowing us to benchmark existing models and analyze their acquired sense of objectness. Humans are remarkably good at recognizing objects even when they are partially obstructed. HSL seeks to bridge the gap between human perception and machine learning models by forcing models to recognize objects from incomplete, fragmented, or noisy views, just as humans do. We propose a highly augmented image set for HSL, built by intentionally manipulating PASCAL-5i and COCO-20i to fit this paradigm. Our results reveal the shortcomings of state-of-the-art few-shot learning models and suggest improvements through data augmentation or the incorporation of additional attention-based modules to enhance the generalization capabilities of few-shot semantic segmentation (FSS). To improve training, we incorporate a channel and spatial attention module (Woo et al., 2018): an FSS model is retrained with the attention module and tested against the highly augmented support information. Our experiments demonstrate that an FSS model trained with the proposed method achieves significantly higher accuracy (by approximately 5%) when exposed to limited or highly cropped support data.
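To make the idea of a "half-shot" support example concrete, the following is a minimal sketch of how a support image and its mask might be degraded by noising and cropping away part of the object. It is not the authors' augmentation pipeline: the function name `half_shot_support`, its parameters, and the choice to hide the right-hand part of the object's bounding box are all assumptions made for illustration.

```python
import numpy as np


def half_shot_support(image, mask, keep_fraction=0.5, noise_std=0.0, seed=0):
    """Return a degraded copy of a support (image, mask) pair.

    Hypothetical helper, not taken from the paper.
    image: (H, W, C) float array with values in [0, 1].
    mask:  (H, W) binary array marking the support object.
    keep_fraction: share of the object's bounding-box width left visible.
    noise_std: standard deviation of Gaussian noise added to the image.
    """
    rng = np.random.default_rng(seed)
    out_img = image.astype(float)  # astype returns a copy by default
    out_mask = mask.copy()

    # Optional additive Gaussian noise, clipped back to the valid range.
    if noise_std > 0:
        out_img = np.clip(out_img + rng.normal(0.0, noise_std, out_img.shape), 0.0, 1.0)

    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        # No object in the mask: nothing to crop or occlude.
        return out_img, out_mask

    # Column at which the object is cut, measured inside its bounding box.
    x0, x1 = xs.min(), xs.max() + 1
    cut = x0 + int(round((x1 - x0) * keep_fraction))

    # Blank out everything to the right of the cut in both image and mask.
    out_img[:, cut:] = 0.0
    out_mask[:, cut:] = 0
    return out_img, out_mask
```

With `keep_fraction=0.5`, roughly half of the object's width remains visible in the support pair, which approximates the "partially viewed" condition described above; other degradations mentioned in the abstract (occlusion patches, aggressive transforms) would need their own variants.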
| Original language | English |
|---|---|
| Pages (from-to) | 430-438 |
| Number of pages | 9 |
| Journal | Proceedings of the International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications |
| Volume | 2 |
| State | Published - 1 Jan 2025 |
| Externally published | Yes |
| Event | 20th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, VISIGRAPP 2025 - Porto, Portugal. Duration: 26 Feb 2025 → 28 Feb 2025 |
Keywords
- Deep Neural Networks
- Few-Shot Learning
- Machine Vision
- Meta-Learning
- Semantic Segmentation
ASJC Scopus subject areas
- Computer Graphics and Computer-Aided Design
- Computer Vision and Pattern Recognition
- Human-Computer Interaction