Fine-Grained Predicates Learning for Scene Graph Generation

Lyu, Xinyu; Gao, Lianli; Guo, Yuyu; Zhao, Zhou; Huang, Hao; Shen, Heng Tao; Song, Jingkuan

arXiv:2204.02597 (cs)

[Submitted on 6 Apr 2022 (v1), last revised 8 Apr 2022 (this version, v2)]

Abstract: The performance of current Scene Graph Generation (SGG) models is severely hampered by hard-to-distinguish predicates, e.g., "woman-on/standing on/walking on-beach" or "woman-near/looking at/in front of-child". General SGG models tend to predict head predicates, while existing re-balancing strategies favor tail categories; neither handles these hard-to-distinguish predicates appropriately. To tackle this issue, and inspired by fine-grained image classification, which focuses on differentiating among hard-to-distinguish object classes, we propose Fine-Grained Predicates Learning (FGPL), a method that aims to differentiate among hard-to-distinguish predicates for the SGG task. Specifically, we first introduce a Predicate Lattice that helps SGG models identify fine-grained predicate pairs. Then, using the Predicate Lattice, we propose a Category Discriminating Loss and an Entity Discriminating Loss, both of which help distinguish fine-grained predicates while maintaining the learned discriminatory power over recognizable ones. The proposed model-agnostic strategy significantly boosts the performance of three benchmark models (Transformer, VCTree, and Motif) by 22.8%, 24.1%, and 21.7% of Mean Recall (mR@100) on the Predicate Classification sub-task, respectively. Our model also outperforms state-of-the-art methods by a large margin (i.e., 6.1%, 4.6%, and 3.2% of Mean Recall (mR@100)) on the Visual Genome dataset.

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2204.02597 [cs.CV]
	(or arXiv:2204.02597v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2204.02597

Submission history

From: Xinyu Lyu
[v1] Wed, 6 Apr 2022 06:20:09 UTC (531 KB)
[v2] Fri, 8 Apr 2022 00:43:13 UTC (531 KB)
