
Sparse non-negative matrix language modeling: Maximum entropy flexibility on the cheap

  • Ciprian Chelba
  • Diamantino Caseiro
  • Fadi Biadsy

Research output: Contribution to journal, Conference article, peer-review

2 Scopus citations

Abstract

We present a new method for estimating the sparse non-negative model (SNM) by using a small amount of held-out data and the multinomial loss that is natural for language modeling; we validate it experimentally against the previous estimation method, which uses leave-one-out on training data and a binary loss function, and show that it performs equally well. Being able to train on held-out data is very important in practical situations where training data is mismatched from held-out/test data. We find that fairly small amounts of held-out data (on the order of 30-70 thousand words) are sufficient for training the adjustment model, which is the only model component estimated using gradient descent; the bulk of model parameters are relative frequencies counted on training data. A second contribution is a comparison between SNM and the related class of Maximum Entropy language models. Although SNM is much cheaper computationally, we show that it achieves slightly better perplexity results for the same feature set, and the same speech recognition accuracy on voice search and short message dictation.

Original language: English
Pages (from-to): 2725-2729
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume: 2017-August
State: Published - 1 Jan 2017
Externally published: Yes
Event: 18th Annual Conference of the International Speech Communication Association, INTERSPEECH 2017 - Stockholm, Sweden
Duration: 20 Aug 2017 to 24 Aug 2017

Keywords

  • Language modeling
  • Machine learning
  • Maximum entropy
  • Speech recognition

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modeling and Simulation
