Enhanced control of a wheelchair-mounted robotic manipulator using 3-D vision and multimodal interaction

Hairong Jiang, Ting Zhang, Juan P. Wachs, Bradley S. Duerstock

Research output: Contribution to journal › Article › peer-review


Abstract

This paper presents a multi-sensor, 3D vision-based, autonomous wheelchair-mounted robotic manipulator (WMRM). Two 3D sensors were employed: one for object recognition and the other for recognizing body parts (face and hands). The goal is to recognize everyday items and automatically interact with them in an assistive fashion. For example, when a cereal box is recognized, it is grasped, poured into a bowl, and brought to the user. Daily objects (e.g., a bowl and a hat) were automatically detected and classified using a three-step procedure: (1) remove the background based on 3D information and find the point cloud of each object; (2) extract feature vectors for each segmented object from its 3D point cloud and its color image; and (3) classify the feature vectors into object classes using a nonlinear support vector machine (SVM). To retrieve specific objects, three user interface methods were adopted: voice-based, gesture-based, and hybrid commands. The presented system was tested on two common activities of daily living: feeding and dressing. The results revealed that an accuracy of 98.96% was achieved for a dataset of twelve daily objects. The experimental results indicated that hybrid (gesture and speech) interaction outperformed either single modality alone.
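The three-step classification procedure could be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the table-depth background heuristic, the specific features (bounding-box extents plus a hue histogram), and the SVM hyperparameters are all assumptions; the abstract specifies only 3D background removal, features from the point cloud and color image, and a nonlinear SVM.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

def segment_foreground(depth, table_depth, tol=0.02):
    """Step 1 (assumed heuristic): remove background using 3D information.
    Pixels at or beyond the table depth are discarded; the remaining mask
    is taken to contain the objects' point clouds."""
    return depth < (table_depth - tol)

def extract_features(points, colors, bins=8):
    """Step 2 (assumed features): build a feature vector from an object's
    3D point cloud (bounding-box extents) and its color image (hue histogram)."""
    extents = points.max(axis=0) - points.min(axis=0)    # 3 shape features
    hist, _ = np.histogram(colors[:, 0], bins=bins,
                           range=(0.0, 1.0), density=True)  # 8 color features
    return np.concatenate([extents, hist])

# Step 3: nonlinear SVM (RBF kernel assumed) over the feature vectors.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0, gamma="scale"))

# Hypothetical usage, with X an (n_objects, 11) feature matrix and y labels:
# clf.fit(X_train, y_train)
# predictions = clf.predict(X_test)
```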
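The hybrid (gesture and speech) commands suggest a simple late-fusion rule; the sketch below is hypothetical, since the abstract does not describe how the modalities are combined. The Command type and fuse function are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Command:
    action: str            # e.g. "retrieve", from the speech recognizer
    target: Optional[str]  # object class, e.g. designated by pointing

def fuse(speech: Optional[Command], gesture: Optional[Command]) -> Optional[Command]:
    """Assumed late fusion: the speech verb names the action, a pointing
    gesture disambiguates the target; either modality alone is a fallback."""
    if speech and gesture:
        return Command(speech.action, gesture.target or speech.target)
    return speech or gesture

# Example: "bring me that" plus pointing at the bowl.
cmd = fuse(Command("retrieve", None), Command("retrieve", "bowl"))
assert cmd == Command("retrieve", "bowl")
```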

Original language: English
Pages (from-to): 21-31
Number of pages: 11
Journal: Computer Vision and Image Understanding
Volume: 149
DOIs
State: Published - 1 Aug 2016
Externally published: Yes

Keywords

  • 3D vision
  • Assistive robotics
  • Multi-modal interface
  • Wheelchair mounted robotic manipulator

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Computer Vision and Pattern Recognition

