Learning Problem-agnostic Speech Representations from Multiple Self-supervised Tasks

Pascual, Santiago; Ravanelli, Mirco; Serrà, Joan; Bonafonte, Antonio; Bengio, Yoshua

arXiv:1904.03416 (cs)

[Submitted on 6 Apr 2019]

Abstract: Learning good representations without supervision is still an open issue in machine learning, and is particularly challenging for speech signals, which are often characterized by long sequences with a complex hierarchical structure. Some recent works, however, have shown that it is possible to derive useful speech representations by employing a self-supervised encoder-discriminator approach. This paper proposes an improved self-supervised method, where a single neural encoder is followed by multiple workers that jointly solve different self-supervised tasks. The consensus needed across the different tasks naturally imposes meaningful constraints on the encoder, helping it discover general representations and minimizing the risk of learning superficial ones. Experiments show that the proposed approach can learn transferable, robust, and problem-agnostic features that carry relevant information from the speech signal, such as speaker identity, phonemes, and even higher-level features such as emotional cues. In addition, a number of design choices make the encoder easily exportable, facilitating its direct use or adaptation to different problems.
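The shared-encoder, multiple-worker idea described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the class `MultiWorkerModel`, the two worker names, the layer sizes, the use of mean squared error, and combining the worker losses by a plain sum. No training step is shown; the point is only that every worker reads the same encoder output, so each task's loss constrains the same representation.

```python
import random


def make_linear(n_in, n_out, rng):
    """Return an n_out x n_in weight matrix with small random values."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]


def linear(weights, x):
    """Apply a weight matrix (list of rows) to a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def mse(pred, target):
    """Mean squared error between two equal-length vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


class MultiWorkerModel:
    """Hypothetical toy model: one shared encoder, one linear head per worker."""

    def __init__(self, n_in, n_hidden, worker_dims, seed=0):
        rng = random.Random(seed)
        self.encoder = make_linear(n_in, n_hidden, rng)
        # Each worker has its own head on top of the shared representation.
        self.workers = {
            name: make_linear(n_hidden, dim, rng)
            for name, dim in worker_dims.items()
        }

    def encode(self, x):
        """Shared representation: linear map followed by ReLU."""
        return [max(0.0, h) for h in linear(self.encoder, x)]

    def total_loss(self, x, targets):
        """Sum of per-worker losses (the sum is an assumed combination rule)."""
        z = self.encode(x)  # computed once, reused by every worker
        losses = {
            name: mse(linear(head, z), targets[name])
            for name, head in self.workers.items()
        }
        return sum(losses.values()), losses


# Usage: two illustrative (assumed) self-supervised targets derived from the input.
x = [0.1 * i for i in range(8)]
targets = {
    "reconstruct_input": x,                                # regress the input itself
    "predict_energy": [sum(v * v for v in x) / len(x)],    # regress its mean energy
}
model = MultiWorkerModel(
    n_in=8, n_hidden=4,
    worker_dims={"reconstruct_input": 8, "predict_energy": 1},
)
total, per_worker = model.total_loss(x, targets)
```

Because `total` depends on every worker's loss and all workers share `encode`, a gradient step on `total` would push the encoder toward features useful to all tasks at once, which is the consensus effect the abstract refers to.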

Subjects:	Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
Cite as:	arXiv:1904.03416 [cs.LG]
	(or arXiv:1904.03416v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1904.03416

Submission history

From: Santiago Pascual de la Puente
[v1] Sat, 6 Apr 2019 10:51:25 UTC (94 KB)