Improving Reinforcement Learning Based Image Captioning with Natural Language Prior

Guo, Tszhang; Chang, Shiyu; Yu, Mo; Bai, Kun

[Submitted on 13 Sep 2018]

Abstract: Recently, Reinforcement Learning (RL) approaches have demonstrated advanced performance in image captioning by directly optimizing the metric used for testing. However, this shaped reward introduces learning biases that reduce the readability of the generated text. In addition, the large sample space makes training unstable and slow. To alleviate these issues, we propose a simple, coherent solution that constrains the action space using an n-gram language prior. Quantitative and qualitative evaluations on benchmarks show that RL with this simple add-on module performs favorably against its counterpart in terms of both readability and speed of convergence. Human evaluation results show that the output of our model is more readable and graceful. The implementation will be made publicly available upon acceptance of the paper.
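
The abstract does not spell out how the n-gram prior constrains the action space. As a rough illustration only, and not the authors' implementation, the sketch below assumes the simplest case: a bigram prior collected from a toy corpus, used to mask out next-word actions that were never observed after the previous word, with the remaining probabilities renormalized. The function names (`build_bigram_prior`, `constrain_policy`), the toy corpus, and the fallback to the unconstrained policy are all assumptions made for this example.

```python
from collections import defaultdict


def build_bigram_prior(corpus):
    """Map each word to the set of words observed directly after it."""
    allowed = defaultdict(set)
    for sentence in corpus:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            allowed[prev].add(nxt)
    return allowed


def constrain_policy(probs, prev_word, allowed):
    """Drop actions the prior never saw after prev_word, then renormalize.

    probs is a dict mapping candidate next words to policy probabilities.
    If the prior rules out every candidate, fall back to the original
    distribution (an assumption of this sketch, not taken from the paper).
    """
    candidates = allowed.get(prev_word, set())
    masked = {w: p for w, p in probs.items() if w in candidates}
    total = sum(masked.values())
    if total == 0.0:
        return dict(probs)
    return {w: p / total for w, p in masked.items()}


# Hypothetical toy data, for illustration only.
corpus = ["a dog runs", "a cat sits", "the dog sits"]
prior = build_bigram_prior(corpus)
policy = {"dog": 0.4, "cat": 0.1, "a": 0.3, "sits": 0.2}
constrained = constrain_policy(policy, "a", prior)
```

Here only "dog" and "cat" follow "a" in the toy corpus, so the constrained policy keeps just those two actions and rescales their probabilities to sum to one. Shrinking the set of actions the RL agent can sample from in this way is one plausible reading of how a language prior could reduce the sample space.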

Comments: 8 pages, 5 figures, EMNLP 2018
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as: arXiv:1809.06227 [cs.CV] (or arXiv:1809.06227v1 [cs.CV] for this version)
DOI: https://doi.org/10.48550/arXiv.1809.06227

Submission history

From: Tszhang Guo
[v1] Thu, 13 Sep 2018 17:21:56 UTC (7,121 KB)
