Fusing visual and range imaging for object class recognition

Aharon Bar-Hillel, Dmitri Hanukaev, Dan Levi

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

17 Scopus citations

Abstract

Category-level object recognition has improved significantly in the last few years, but machine performance remains unsatisfactory for most real-world applications. We believe this gap may be bridged using additional depth information obtained from range imaging, which was recently used to overcome similar problems in body shape interpretation. This paper presents a system that successfully fuses visual and range imaging for object category classification. We explore fusion at multiple levels: using depth as an attention mechanism, high-level fusion at the classifier level, and low-level fusion of local descriptors, and show that each mechanism makes a unique contribution to performance. For low-level fusion we present a new algorithm for training local descriptors, the Generalized Image Feature Transform (GIFT), which generalizes current representations such as SIFT and spatial pyramids and allows for the creation of new representations based on multiple channels of information. We show that our system improves on state-of-the-art visual-only and depth-only methods on a diverse dataset of everyday objects.

Original language: English
Title of host publication: 2011 International Conference on Computer Vision, ICCV 2011
Pages: 65-72
Number of pages: 8
State: Published - 1 Dec 2011
Externally published: Yes
Event: 2011 IEEE International Conference on Computer Vision, ICCV 2011 - Barcelona, Spain
Duration: 6 Nov 2011 to 13 Nov 2011

Publication series

Name: Proceedings of the IEEE International Conference on Computer Vision

Conference

Conference: 2011 IEEE International Conference on Computer Vision, ICCV 2011
Country/Territory: Spain
City: Barcelona
Period: 6/11/11 to 13/11/11

ASJC Scopus subject areas

  • Software
  • Computer Vision and Pattern Recognition
