Abstract
Recent image search or content-based image retrieval (CBIR) systems rely on deep metric learning (DML) for extracting representative image features; however, their generalisation is limited by the dependency on large volumes of high-quality, diverse and unbiased training data. We introduce ATLANTIS, a framework with a novel methodology that automatically identifies training data deficiencies and then performs targeted and controlled synthetic data augmentation. Our framework comprises a Data Insight Generator for extracting contextual insights and deficiencies from the existing training data, an Augmentation Protocol Selector to define dynamic, context-aware augmentation strategies, and an Outlier Removal and Diversity Control module to control the synthetic data's semantic coherence and diversity. ATLANTIS leverages image-to-text transformations, large language models, and text-to-image synthesis to iteratively generate and refine synthetic data while ensuring alignment with the original data and augmenting training data diversity in a controlled manner. Our comprehensive empirical evaluations reveal that ATLANTIS surpasses the state of the art in challenging domain-scarce and class-imbalanced data scenarios while also enhancing adversarial robustness, thus underscoring the generalisation gains. ATLANTIS also sets new benchmarks in standard balanced DML tasks, thereby establishing it as a robust and scalable framework for CBIR.
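The abstract does not say how the Outlier Removal and Diversity Control module works internally. As a rough illustration only, the sketch below shows one way such a step could be approximated on image embeddings: drop synthetic samples whose cosine similarity to the centroid of the real data is too low (outliers), then greedily drop samples that are near-duplicates of ones already kept (diversity). The function names, the thresholds `min_sim` and `max_dup_sim`, and the use of cosine similarity are all assumptions made for this example and are not taken from ATLANTIS.

```python
import math


def normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def filter_synthetic(real, synthetic, min_sim=0.8, max_dup_sim=0.99):
    """Hypothetical filter, not the ATLANTIS module.

    Keeps a synthetic embedding only if it is close enough to the
    centroid of the real embeddings (outlier removal) and not almost
    identical to an embedding that was already kept (diversity control).
    """
    center = centroid([normalize(v) for v in real])
    kept = []
    for cand in synthetic:
        if cosine(cand, center) < min_sim:
            continue  # too far from the real data: treat as an outlier
        if any(cosine(cand, k) >= max_dup_sim for k in kept):
            continue  # near-duplicate of a kept sample: adds no diversity
        kept.append(cand)
    return kept


real = [[1.0, 0.0], [0.9, 0.1]]
synthetic = [[1.0, 0.05], [1.0, 0.05], [0.0, 1.0], [0.8, 0.3]]
kept = filter_synthetic(real, synthetic)
```

With these toy inputs, the exact duplicate and the orthogonal vector are discarded and the two remaining candidates are kept. In practice the embeddings would come from a trained feature extractor and the thresholds would need tuning per dataset.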
| Original language | English |
|---|---|
| State | Published - 1 Jan 2024 |
| Event | 35th British Machine Vision Conference, BMVC 2024 - Glasgow, United Kingdom. Duration: 25 Nov 2024 → 28 Nov 2024 |
Conference
| Conference | 35th British Machine Vision Conference, BMVC 2024 |
|---|---|
| Country/Territory | United Kingdom |
| City | Glasgow |
| Period | 25/11/24 → 28/11/24 |
ASJC Scopus subject areas
- Artificial Intelligence
- Computer Vision and Pattern Recognition
Title
ATLANTIS: A Framework for Automated Targeted Language-guided Augmentation Training for Robust Image Search