TY - JOUR
T1 - Comparison of state-of-the-art deep learning APIs for image multi-label classification using semantic metrics
AU - Kubany, Adam
AU - Ben Ishay, Shimon
AU - Ohayon, Ruben Sacha
AU - Shmilovici, Armin
AU - Rokach, Lior
AU - Doitshman, Tomer
N1 - Funding Information:
This study was supported by grants from the MAGNET program of the Israeli Innovation Authority and the MAFAT program of the Israeli Ministry of Defense.
Declaration of Competing Interest:
The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: This work was supported by grants from the MAGNET program of the Israeli Innovation Authority and the MAFAT program of the Israeli Ministry of Defense.
Publisher Copyright:
© 2020 Elsevier Ltd
PY - 2020/12/15
Y1 - 2020/12/15
N2 - Image understanding heavily relies on accurate multi-label classification. In recent years, deep learning algorithms have become very successful for such tasks, and various commercial and open-source APIs have been released for public use. However, these APIs are often trained on different datasets, which, besides affecting their performance, can complicate their performance evaluation. The challenge arises from the differing object-class dictionaries of the APIs’ training datasets and the benchmark dataset: predicted labels may be semantically similar to the benchmark labels yet be considered different simply because they are worded differently in the dictionaries. To address this challenge, we propose semantic similarity metrics that provide a richer understanding of the APIs’ predicted labels and thus of their performance. In this study, we evaluate and compare the performance of 13 of the most prominent commercial and open-source APIs in a best-of-breed challenge on the Visual Genome and Open Images benchmark datasets. Our findings demonstrate that, under traditional metrics, the Microsoft Computer Vision, Imagga, and IBM APIs performed better than the others. However, applying semantic metrics also reveals the InceptionResNet-v2, Inception-v3, and ResNet50 APIs, which are trained only on the simpler ImageNet dataset, as challengers for top semantic performance.
AB - Image understanding heavily relies on accurate multi-label classification. In recent years, deep learning algorithms have become very successful for such tasks, and various commercial and open-source APIs have been released for public use. However, these APIs are often trained on different datasets, which, besides affecting their performance, can complicate their performance evaluation. The challenge arises from the differing object-class dictionaries of the APIs’ training datasets and the benchmark dataset: predicted labels may be semantically similar to the benchmark labels yet be considered different simply because they are worded differently in the dictionaries. To address this challenge, we propose semantic similarity metrics that provide a richer understanding of the APIs’ predicted labels and thus of their performance. In this study, we evaluate and compare the performance of 13 of the most prominent commercial and open-source APIs in a best-of-breed challenge on the Visual Genome and Open Images benchmark datasets. Our findings demonstrate that, under traditional metrics, the Microsoft Computer Vision, Imagga, and IBM APIs performed better than the others. However, applying semantic metrics also reveals the InceptionResNet-v2, Inception-v3, and ResNet50 APIs, which are trained only on the simpler ImageNet dataset, as challengers for top semantic performance.
KW - Deep learning
KW - Image multi-label classification comparison
KW - Image understanding
KW - Semantic evaluation
UR - http://www.scopus.com/inward/record.url?scp=85087591749&partnerID=8YFLogxK
U2 - 10.1016/j.eswa.2020.113656
DO - 10.1016/j.eswa.2020.113656
M3 - Article
AN - SCOPUS:85087591749
SN - 0957-4174
VL - 161
JO - Expert Systems with Applications
JF - Expert Systems with Applications
M1 - 113656
ER -