Modeling Non-Linguistic Contextual Signals in LSTM Language Models Via Domain Adaptation

Min Ma, Shankar Kumar, Fadi Biadsy, Michael Nirschl, Tomas Vykruta, Pedro Moreno

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

7 Scopus citations

Abstract

Language Models (LMs) for Automatic Speech Recognition (ASR) can benefit from non-linguistic contextual signals. Examples of these signals include the geographical location of the user speaking to the system and/or the identity of the application (app) being spoken to. In practice, the vast majority of input speech queries lack annotations of such signals, which makes it challenging to train domain-specific LMs directly. To obtain robust domain LMs, an LM pre-trained on general data is generally adapted to specific domains. We propose four domain adaptation schemes to improve the domain performance of Long Short-Term Memory (LSTM) LMs by incorporating app-based contextual signals of voice search queries. We show that most of our adaptation strategies are effective, reducing word perplexity by up to 21% relative to a fine-tuned baseline on a held-out domain-specific development set. Initial experiments using a state-of-the-art Italian ASR system show a 3% relative reduction in word error rate (WER) on top of an unadapted 5-gram LM. In addition, human evaluations show significant improvements on sub-domains from using app signals.

Original language: English
Title of host publication: 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers
Pages: 6094-6098
Number of pages: 5
ISBN (Print): 9781538646588
State: Published - 10 Sep 2018
Externally published: Yes
Event: 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Calgary, Canada
Duration: 15 Apr 2018 to 20 Apr 2018

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2018-April
ISSN (Print): 1520-6149

Conference

Conference: 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018
Country/Territory: Canada
City: Calgary
Period: 15/04/18 to 20/04/18

Keywords

  • Domain adaptation
  • Language model adaptation
  • Neural network based language models
  • Speech recognition

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
