Improving Visually Grounded Sentence Representations with Self-Attention

Yoo, Kang Min; Shin, Youhyun; Lee, Sang-goo

arXiv:1712.00609 (cs)

[Submitted on 2 Dec 2017]

Abstract: Sentence representation models trained only on language can suffer from the grounding problem. Recent work has shown promising results in improving the quality of sentence representations by jointly training them with associated image features. However, the grounding capability is limited because the architecture only distantly connects the input sentences to the image features. To close this gap further, we propose applying a self-attention mechanism to the sentence encoder to deepen the grounding effect. Our results on transfer tasks show that self-attentive encoders are better suited to visual grounding, as they exploit specific words with strong visual associations.
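The abstract does not spell out how the self-attention is formulated, so the following is only a rough sketch of one common form of self-attentive pooling (additive scoring over per-word encoder states), not the authors' implementation. The names `self_attentive_pool`, `H`, `W`, and `v` are hypothetical, the parameters are random stand-ins for learned weights, and the image-grounding objective is not modeled here.

```python
import math
import random


def softmax(scores):
    # Numerically stable softmax over a list of scalar scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attentive_pool(H, W, v):
    """Pool per-word states into one sentence vector (illustrative only).

    H: list of n word vectors, each of dimension d (assumed encoder outputs)
    W: d x d_a projection matrix (hypothetical learned parameter)
    v: d_a scoring vector (hypothetical learned parameter)
    """
    scores = []
    for h in H:
        # Project each word state, squash with tanh, then score it with v.
        hidden = [
            math.tanh(sum(h[i] * W[i][j] for i in range(len(h))))
            for j in range(len(v))
        ]
        scores.append(sum(a * b for a, b in zip(hidden, v)))
    weights = softmax(scores)  # one attention weight per word, summing to 1
    d = len(H[0])
    # Sentence vector is the attention-weighted average of the word states.
    sentence = [sum(w * h[k] for w, h in zip(weights, H)) for k in range(d)]
    return sentence, weights


random.seed(0)
n_words, d, d_a = 5, 8, 4
H = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n_words)]
W = [[random.gauss(0, 1) for _ in range(d_a)] for _ in range(d)]
v = [random.gauss(0, 1) for _ in range(d_a)]

sentence_vec, weights = self_attentive_pool(H, W, v)
```

In a sketch like this, words that receive larger weights contribute more to the sentence vector, which is the kind of behavior the abstract attributes to words with strong visual associations.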

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:1712.00609 [cs.CL]
	(or arXiv:1712.00609v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1712.00609

Submission history

From: Kang Min Yoo
[v1] Sat, 2 Dec 2017 14:14:50 UTC (262 KB)
