Development of a Low-Cost, Noninvasive, Portable Visual Speech Recognition Program

Gavriel D. Kohlberg, Ya'Akov Gal, Anil K. Lalwani

Research output: Contribution to journal › Article › peer-review

4 Scopus citations


Objectives: Loss of speech following tracheostomy and laryngectomy severely limits communication to simple gestures and facial expressions that are largely ineffective. To facilitate communication in these patients, we seek to develop a low-cost, noninvasive, portable, and simple visual speech recognition program (VSRP) to convert articulatory facial movements into speech.

Methods: A Microsoft Kinect-based VSRP was developed to capture the spatial coordinates of lip movements and translate them into speech. The articulatory speech movements associated with 12 sentences were used to train an artificial neural network classifier. The accuracy of the classifier was then evaluated on a separate, previously unseen set of articulatory speech movements.

Results: The VSRP was successfully implemented and tested in 5 subjects. It achieved an accuracy rate of 77.2% (65.0%-87.6% across the 5 speakers) on a 12-sentence data set. The mean time to classify an individual sentence was 2.03 milliseconds (range, 1.91-2.16 milliseconds).

Conclusion: We have demonstrated the feasibility of a low-cost, noninvasive, portable, Kinect-based VSRP that accurately predicts speech from articulation movements in clinically negligible time. This VSRP could serve as a novel communication device for aphonic patients.
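The pipeline described in the Methods, lip-landmark coordinates captured per frame, flattened into feature vectors, and classified among 12 sentences by an artificial neural network, can be sketched as below. This is an illustrative sketch only, not the authors' implementation: the landmark count, frame count, synthetic data, and training settings are all hypothetical stand-ins.

```python
# Illustrative sketch -- NOT the authors' code. Mimics the described pipeline:
# depth-camera lip-landmark trajectories are flattened into feature vectors and
# classified among 12 sentences by a small artificial neural network.
# All dimensions and the synthetic data are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

N_SENTENCES = 12           # 12-sentence set, as in the study
N_FRAMES = 30              # hypothetical frames captured per utterance
N_LANDMARKS = 8            # hypothetical lip points, each with (x, y, z)
FEAT = N_FRAMES * N_LANDMARKS * 3
HIDDEN = 32

# Synthetic stand-in data: each sentence has a characteristic articulation
# "template"; recordings are noisy copies of it.
templates = rng.normal(size=(N_SENTENCES, FEAT))

def make_set(per_class):
    """Build (features, labels) with `per_class` noisy recordings per sentence."""
    X = np.repeat(templates, per_class, axis=0) + 0.3 * rng.normal(
        size=(N_SENTENCES * per_class, FEAT))
    y = np.repeat(np.arange(N_SENTENCES), per_class)
    return X, y

X_train, y_train = make_set(10)
X_test, y_test = make_set(4)          # separate, previously unseen movements

# One-hidden-layer network trained with full-batch gradient descent.
W1 = rng.normal(size=(FEAT, HIDDEN)) * np.sqrt(2.0 / FEAT)
b1 = np.zeros(HIDDEN)
W2 = rng.normal(size=(HIDDEN, N_SENTENCES)) * np.sqrt(2.0 / HIDDEN)
b2 = np.zeros(N_SENTENCES)
Y = np.eye(N_SENTENCES)[y_train]      # one-hot targets

for _ in range(400):
    H = np.maximum(0.0, X_train @ W1 + b1)            # ReLU hidden layer
    logits = H @ W2 + b2
    P = np.exp(logits - logits.max(axis=1, keepdims=True))
    P /= P.sum(axis=1, keepdims=True)                 # softmax probabilities
    d_logits = (P - Y) / len(X_train)                 # cross-entropy gradient
    dW2, db2 = H.T @ d_logits, d_logits.sum(axis=0)
    dH = d_logits @ W2.T
    dH[H <= 0.0] = 0.0                                # ReLU gradient mask
    dW1, db1 = X_train.T @ dH, dH.sum(axis=0)
    for param, grad in ((W1, dW1), (b1, db1), (W2, dW2), (b2, db2)):
        param -= 0.3 * grad                           # in-place update, lr 0.3

def classify(X):
    """Predict a sentence index for each flattened landmark trajectory."""
    H = np.maximum(0.0, X @ W1 + b1)
    return np.argmax(H @ W2 + b2, axis=1)

accuracy = float(np.mean(classify(X_test) == y_test))
print(f"held-out accuracy: {accuracy:.1%}")
```

Classification here is a single forward pass (two matrix multiplications), which is consistent with the millisecond-scale per-sentence classification times the abstract reports.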

Original language: English
Pages (from-to): 752-757
Number of pages: 6
Journal: Annals of Otology, Rhinology and Laryngology
Issue number: 9
State: Published - 1 Sep 2016


Keywords

  • communication aids for disabled
  • lipreading
  • silent speech interface
  • user computer interface
  • visual speech recognition

ASJC Scopus subject areas

  • Otorhinolaryngology


