Abstract
Wav2vec 2.0 is a state-of-the-art speech recognition model that maps speech audio waveforms into latent representations. The largest version of wav2vec 2.0 contains 317 million parameters. Hence, the inference latency of wav2vec 2.0 will be a bottleneck in production, leading to high costs and a significant environmental footprint. To improve the applicability of wav2vec 2.0 to a production setting, we explore multiple model compression methods borrowed from the domain of large language models. Using a teacher-student approach, we distilled the knowledge from the original wav2vec 2.0 model into a student model, which is 2 times faster and 4.8 times smaller than the original model. More importantly, the student model is 2 times more energy efficient than the original model in terms of CO2 emissions. These efficiency gains are accomplished with only a 7% degradation in word error rate (WER). Our quantized model is 3.6 times smaller than the original model, with only a 0.1% degradation in WER. To the best of our knowledge, this is the first work that compresses wav2vec 2.0.
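The abstract does not state which distillation objective was used, so the following is only a minimal sketch of one common teacher-student formulation: a temperature-softened KL divergence between teacher and student output distributions. The function names, the temperature value, and the toy logits are all assumptions made for illustration and are not taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    This is a generic soft-target loss (an assumption), not necessarily
    the objective used for the wav2vec 2.0 student model.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# Hypothetical per-frame logits over a 4-symbol vocabulary.
teacher = [4.0, 1.0, 0.5, -2.0]
student = [3.0, 1.5, 0.0, -1.0]

print(distillation_loss(teacher, student))  # positive: the outputs differ
print(distillation_loss(teacher, teacher))  # zero: the outputs match
```

In a training loop, a loss of this kind would typically be averaged over frames and combined with the task loss, so that the smaller student learns to reproduce the teacher's output distribution rather than only the hard labels.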
| Original language | English |
|---|---|
| Title of host publication | SustaiNLP 2021 - 2nd Workshop on Simple and Efficient Natural Language Processing, Proceedings of SustaiNLP |
| Editors | Nafise Sadat Moosavi, Iryna Gurevych, Angela Fan, Thomas Wolf, Yufang Hou, Ana Marasovic, Sujith Ravi |
| Publisher | Association for Computational Linguistics (ACL) |
| Pages | 134-141 |
| Number of pages | 8 |
| ISBN (Electronic) | 9781955917018 |
| State | Published - 1 Jan 2021 |
| Externally published | Yes |
| Event | 2nd Workshop on Simple and Efficient Natural Language Processing, SustaiNLP 2021 - Virtual, Online; Duration: 10 Nov 2021 → … |
Publication series
| Name | SustaiNLP 2021 - 2nd Workshop on Simple and Efficient Natural Language Processing, Proceedings of SustaiNLP |
|---|---|
Conference
| Conference | 2nd Workshop on Simple and Efficient Natural Language Processing, SustaiNLP 2021 |
|---|---|
| City | Virtual, Online |
| Period | 10/11/21 → … |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 7 Affordable and Clean Energy
- SDG 13 Climate Action
ASJC Scopus subject areas
- Language and Linguistics
- Computational Theory and Mathematics
- Software
- Linguistics and Language
Fingerprint
Dive into the research topics of 'Shrinking Bigfoot: Reducing wav2vec 2.0 footprint'. Together they form a unique fingerprint.