DETECTION IN COMPLEX SCENES USING RGB AND DEPTH MULTIMODAL FEATURE FUSION

Shengli Yan, Yuan Rao, Wenhui Hou

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Unlike RGB images, depth images are robust to the complex scenes of densely planted orchards. In this paper, we propose a fruit detection method built on a multimodal feature fusion (MMFF) module for RGB and depth images. A dual-stream convolutional neural network extracts features from both modalities and uses feature pyramids to capture multi-scale information from the RGB and depth images. The MMFF module separates features that are shared between the modalities from those that differ, suppressing the shared features and fusing the differing ones. In addition, a multi-scale feature fusion method combines information across scales to improve fruit detection accuracy. To validate the method, we conduct experiments on a self-created multimodal pear dataset. Extensive experiments demonstrate that the proposed approach achieves state-of-the-art performance at low computational cost.
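The abstract does not specify how the MMFF module is implemented, so the following is only a minimal sketch of the general idea of suppressing shared features and fusing differing ones. It assumes a hypothetical gating rule based on per-channel cosine similarity; the function names `cosine_similarity` and `fuse_channels`, the gating rule, and the toy inputs are all illustrative assumptions, not the authors' method.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def fuse_channels(rgb_feats, depth_feats):
    """Fuse per-channel feature vectors from two modalities.

    Each argument is a list of channels; each channel is a flat list of floats
    (a flattened H*W feature map).

    Hypothetical gating rule (an assumption, not the paper's MMFF): the depth
    channel is added with weight (1 - similarity), so a depth channel that
    duplicates its RGB counterpart is suppressed, while one that differs is kept.
    """
    if len(rgb_feats) != len(depth_feats):
        raise ValueError("both modalities must have the same number of channels")
    fused = []
    for rgb_ch, depth_ch in zip(rgb_feats, depth_feats):
        sim = cosine_similarity(rgb_ch, depth_ch)
        weight = 1.0 - max(0.0, min(1.0, sim))  # clamp similarity to [0, 1]
        fused.append([r + weight * d for r, d in zip(rgb_ch, depth_ch)])
    return fused


# Toy features: channel 0 is redundant across modalities, channel 1 is not.
rgb = [[1.0, 2.0, 3.0], [1.0, 0.0, 0.0]]
depth = [[2.0, 4.0, 6.0], [0.0, 1.0, 0.0]]
fused = fuse_channels(rgb, depth)
```

In this toy input, the first depth channel is a scaled copy of the RGB channel, so it is almost entirely suppressed and the fused channel stays approximately equal to the RGB one. The second pair is orthogonal, so the depth channel is added at full weight. A real detector would apply a learned version of such a gate at each level of the feature pyramid rather than a fixed similarity rule.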

Original language: English
Title of host publication: 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2024 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers
Pages: 2495-2499
Number of pages: 5
ISBN (Electronic): 9798350344851
State: Published - 1 Jan 2024
Externally published: Yes
Event: 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2024 - Seoul, Korea, Republic of
Duration: 14 Apr 2024 to 19 Apr 2024

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Conference

Conference: 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2024
Country/Territory: Korea, Republic of
City: Seoul
Period: 14/04/24 to 19/04/24

Keywords

  • Depth image
  • Feature fusion
  • Multimodality
  • Object detection
  • RGB-D

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
