Multilingual Adaptation of RNN Based ASR Systems

Müller, Markus; Stüker, Sebastian; Waibel, Alex

arXiv:1711.04569v1 (eess)

[Submitted on 13 Nov 2017 (this version), latest version 27 Feb 2018 (v2)]

Abstract: Automatic speech recognition (ASR) systems require a large amount of data to achieve good performance. While such data is readily available for languages like English, there exists a long tail of languages with only limited language resources. This problem can be mitigated by using data from additional source languages. In this work, we focus on multilingual systems based on recurrent neural networks (RNNs), trained using the Connectionist Temporal Classification (CTC) loss function. Using a multilingual set of acoustic units to train systems jointly on multiple languages poses difficulties: while the same phones share the same symbols across languages, they are pronounced slightly differently because of, e.g., small shifts in tongue position. To address this issue, we proposed Language Feature Vectors (LFVs) to train language-adaptive multilingual systems. In this work, we extended this approach by introducing a novel technique, which we call "modulation", to add LFVs. We evaluated our approach in multiple conditions, showing improvements in both full and low-resource conditions as well as for grapheme- and phone-based systems.
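The abstract does not spell out how "modulation" adds LFVs to the network. As a rough illustration only, the sketch below contrasts two ways a language vector could be injected: appending it to each frame's input features, and scaling hidden-layer activations by it elementwise. Both function names, the vector sizes, and the reading of modulation as elementwise scaling are assumptions made for this example, not details confirmed by the abstract.

```python
from typing import List


def append_lfv(features: List[float], lfv: List[float]) -> List[float]:
    """Hypothetical baseline: concatenate the LFV to one frame's acoustic features."""
    return features + lfv


def modulate(hidden: List[float], lfv: List[float]) -> List[float]:
    """Assumed form of modulation: scale each hidden activation by the
    LFV component at the same index. Requires equal lengths."""
    if len(hidden) != len(lfv):
        raise ValueError("hidden layer and LFV must have the same size")
    return [h * w for h, w in zip(hidden, lfv)]


# Toy values, chosen only to make the arithmetic easy to follow.
frame = [0.2, -1.0, 0.5]        # one frame of acoustic features
hidden = [1.0, 2.0, -3.0, 0.5]  # activations of one hidden layer
lfv = [0.5, 0.0, 1.0, 2.0]      # language feature vector

appended = append_lfv(frame, lfv)  # input grows from 3 to 7 dimensions
modulated = modulate(hidden, lfv)  # same size as hidden, rescaled per unit

print(appended)
print(modulated)
```

Under this reading, appending leaves it to the network to learn how the language code interacts with the acoustics, whereas scaling lets the language code directly strengthen or suppress individual hidden units.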

Comments:	Submitted to ICASSP 2018
Subjects:	Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Cite as:	arXiv:1711.04569 [eess.AS]
	(or arXiv:1711.04569v1 [eess.AS] for this version)
	https://doi.org/10.48550/arXiv.1711.04569

Submission history

From: Markus Müller
[v1] Mon, 13 Nov 2017 13:22:54 UTC (17 KB)
[v2] Tue, 27 Feb 2018 13:44:46 UTC (17 KB)
