Visual-Only Recognition of Normal, Whispered and Silent Speech

Petridis, Stavros; Shen, Jie; Cetin, Doruk; Pantic, Maja

arXiv:1802.06399 (cs)

[Submitted on 18 Feb 2018]

Abstract: Silent speech interfaces have recently been proposed as a way to enable communication when the acoustic signal is not available. This creates a need for visual speech recognition systems for silent and whispered speech. However, almost all recently proposed systems have been trained on vocalised data only. This is at odds with evidence in the literature suggesting that lip movements change depending on the speech mode. In this work, we introduce a new, publicly available audiovisual database which contains normal, whispered and silent speech. To the best of our knowledge, this is the first study to investigate the differences between the three speech modes using the visual modality only. We show that an absolute decrease in classification rate of up to 3.7% is observed when training on normal speech and testing on whispered speech, and vice versa. An even larger decrease of up to 8.5% is reported when the models are tested on silent speech. This reveals that there are indeed visual differences between the three speech modes, and that the common assumption that vocalised training data can be used directly to train a silent speech recognition system may not hold.

Comments:	Accepted to ICASSP 2018
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:1802.06399 [cs.CV]
	(or arXiv:1802.06399v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1802.06399

Submission history

From: Stavros Petridis
[v1] Sun, 18 Feb 2018 16:40:46 UTC (247 KB)
