BART based semantic correction for Mandarin automatic speech recognition system

Zhao, Yun; Yang, Xuerui; Wang, Jinchao; Gao, Yongyu; Yan, Chao; Zhou, Yuanfu

doi:10.21437/Interspeech.2021-739

arXiv:2104.05507 (cs)

[Submitted on 26 Mar 2021]

Abstract: Although automatic speech recognition (ASR) systems have improved significantly in recent years, they still make recognition errors that human listeners can easily spot. Various language modeling techniques have been developed for post-recognition tasks such as semantic correction. In this paper, we propose a Transformer-based semantic correction method initialized from pretrained BART. Experiments on a 10,000-hour Mandarin speech dataset show that the character error rate (CER) is reduced by 21.7% relative to our baseline ASR system. Expert evaluation demonstrates that the actual improvement of our model surpasses what CER indicates.
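The abstract reports its result as a relative CER reduction. As a minimal sketch of how such a figure is typically computed, the Python below implements character-level edit distance, CER, and relative reduction. The function names and the sample CER values are illustrative assumptions; the abstract does not state absolute CER values, and this is not the authors' evaluation code.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Character-level Levenshtein distance between reference and hypothesis."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


def relative_reduction(baseline: float, corrected: float) -> float:
    """Relative reduction of an error rate with respect to a baseline."""
    if baseline <= 0:
        raise ValueError("baseline error rate must be positive")
    return (baseline - corrected) / baseline


if __name__ == "__main__":
    # Hypothetical example sentence with one substituted character.
    print(cer("今天天气很好", "今天天汽很好"))
    # Hypothetical CER values chosen only to illustrate the arithmetic.
    print(relative_reduction(0.10, 0.0783))
```

Under these made-up values, a baseline CER of 10% falling to 7.83% corresponds to a relative reduction of about 21.7%, which is the kind of quantity the abstract reports.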

Comments:	submitted to INTERSPEECH2021
Subjects:	Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2104.05507 [cs.CL]
	(or arXiv:2104.05507v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2104.05507
Journal reference:	Interspeech 2021
Related DOI:	https://doi.org/10.21437/Interspeech.2021-739

Submission history

From: Yun Zhao
[v1] Fri, 26 Mar 2021 06:41:16 UTC (69 KB)
