Towards Semi-Supervised Semantics Understanding from Speech

Lai, Cheng-I; Cao, Jin; Bodapati, Sravan; Li, Shang-Wen

arXiv:2011.06195 (cs)

[Submitted on 11 Nov 2020]

Abstract: Much recent work on Spoken Language Understanding (SLU) falls short in at least one of three ways: models were trained on oracle text input and neglected Automatic Speech Recognition (ASR) outputs, models were trained to predict only intents without the slot values, or models were trained on a large amount of in-house data. To address these shortcomings, we proposed a clean and general framework that learns semantics directly from speech with semi-supervision from transcribed speech. Our framework is built upon pretrained end-to-end (E2E) ASR and self-supervised language models, such as BERT, and is fine-tuned on a limited amount of target SLU corpus data. In parallel, we identified two settings under which SLU models have been inadequately tested: noise-robustness and E2E semantics evaluation. We tested the proposed framework under realistic environmental noise and with a new metric, the slots edit F1 score, on two public SLU corpora. Experiments show that our SLU framework with speech as input can perform on par with frameworks that take oracle text as input in semantics understanding, even when environmental noise is present and only a limited amount of labeled semantics data is available.

Comments:	arXiv admin note: text overlap with arXiv:2010.13826
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2011.06195 [cs.CL]
	(or arXiv:2011.06195v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2011.06195

Submission history

From: Cheng-I Lai
[v1] Wed, 11 Nov 2020 01:48:09 UTC (2,147 KB)
