Differentiable Scene Graphs

Raboh, Moshiko; Herzig, Roei; Chechik, Gal; Berant, Jonathan; Globerson, Amir

arXiv:1902.10200 (cs)

[Submitted on 26 Feb 2019 (v1), last revised 14 Mar 2020 (this version, v5)]

Abstract: Reasoning about complex visual scenes involves perception of entities and their relations. Scene graphs (SGs) provide a natural representation for reasoning tasks by assigning labels to both entities (nodes) and relations (edges). Unfortunately, reasoning systems based on SGs are typically trained in a two-step procedure: first, a model is trained to predict SGs from images; then, a separate model is created to reason over the predicted SGs. In many domains it is preferable to train systems jointly in an end-to-end manner, but SGs are not commonly used as intermediate components in visual reasoning systems because, being discrete and sparse, scene-graph representations are non-differentiable and difficult to optimize. Here we propose Differentiable Scene Graphs (DSGs), an image representation that is amenable to differentiable end-to-end optimization and requires supervision only from the downstream tasks. DSGs provide a dense representation for all regions and pairs of regions, and do not spend modelling capacity on areas of the images that do not contain objects or relations of interest. We evaluate our model on the challenging task of identifying referring relationships (RR) in three benchmark datasets: Visual Genome, VRD, and CLEVR. We describe a multi-task objective and train in an end-to-end manner, supervised by the downstream RR task. Using DSGs as an intermediate representation leads to new state-of-the-art performance.
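
The idea of a dense representation "for all regions and pairs of regions" can be illustrated with a minimal sketch. This is not the authors' architecture: the function names, the feature sizes, and the use of a single tanh projection are all assumptions made for illustration. It only shows that every region gets a continuous node vector and every ordered pair of regions gets a continuous edge vector, with no discrete labels in between, so a downstream loss could in principle be backpropagated through the whole structure.

```python
import math
import random


def linear(x, w):
    """Multiply vector x (length d_in) by weight matrix w (d_in x d_out)."""
    return [sum(xi * w[i][j] for i, xi in enumerate(x)) for j in range(len(w[0]))]


def tanh_vec(x):
    """Apply tanh element-wise; a smooth stand-in for any differentiable layer."""
    return [math.tanh(v) for v in x]


def random_matrix(rows, cols, rng):
    """Hypothetical weight initialisation; real weights would be learned."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def dense_scene_representation(region_feats, w_node, w_edge):
    """Return a node vector per region and an edge vector per ordered region pair."""
    nodes = [tanh_vec(linear(f, w_node)) for f in region_feats]
    # Edge (i, j) is computed from the concatenated features of regions i and j.
    edges = [
        [tanh_vec(linear(fi + fj, w_edge)) for fj in region_feats]
        for fi in region_feats
    ]
    return nodes, edges


rng = random.Random(0)
n_regions, d_in, d_out = 4, 3, 2  # assumed toy sizes
region_feats = [[rng.uniform(-1, 1) for _ in range(d_in)] for _ in range(n_regions)]
w_node = random_matrix(d_in, d_out, rng)
w_edge = random_matrix(2 * d_in, d_out, rng)

nodes, edges = dense_scene_representation(region_feats, w_node, w_edge)
print(len(nodes), len(edges), len(edges[0]), len(edges[0][0]))
```

The sketch deliberately omits how regions are proposed, how uninformative regions are down-weighted, and the multi-task objective; those details are described in the paper itself rather than in this abstract.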

Comments: Winter Conference on Applications of Computer Vision (WACV), 2020
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:1902.10200 [cs.CV] (or arXiv:1902.10200v5 [cs.CV] for this version)
https://doi.org/10.48550/arXiv.1902.10200

Submission history

From: Roei Herzig
[v1] Tue, 26 Feb 2019 20:22:33 UTC (8,892 KB)
[v2] Tue, 26 Mar 2019 21:25:01 UTC (8,527 KB)
[v3] Sun, 28 Jul 2019 06:21:43 UTC (6,924 KB)
[v4] Sun, 26 Jan 2020 10:33:19 UTC (6,965 KB)
[v5] Sat, 14 Mar 2020 16:25:32 UTC (6,965 KB)
