Reactive Multi-Stage Feature Fusion for Multimodal Dialogue Modeling

Yeh, Yi-Ting; Lin, Tzu-Chuan; Cheng, Hsiao-Hua; Deng, Yu-Hsuan; Su, Shang-Yu; Chen, Yun-Nung

arXiv:1908.05067 (cs)

[Submitted on 14 Aug 2019]

Abstract: Visual question answering and visual dialogue tasks have been increasingly studied in the multimodal field as steps towards more practical real-world scenarios. A more challenging task, audio visual scene-aware dialogue (AVSD), has been proposed to further advance the technologies that connect audio, vision, and language; it introduces temporal video information and dialogue interactions between a questioner and an answerer. This paper proposes an intuitive mechanism that fuses features and attention in multiple stages to better integrate multimodal features, and the experimental results demonstrate its capability. We also apply several models that are state-of-the-art on other tasks to the AVSD task and analyze how well they generalize across tasks.

Comments:	Accepted for a poster session at the DSTC7 workshop at AAAI 2019
Subjects:	Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:1908.05067 [cs.CL]
	(or arXiv:1908.05067v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1908.05067

Submission history

From: Yi-Ting Yeh
[v1] Wed, 14 Aug 2019 10:58:14 UTC (4,409 KB)
