Global Meets Local: Effective Multi-Label Image Classification via Category-Aware Weak Supervision

Zhan, Jiawei; Liu, Jun; Tang, Wei; Jiang, Guannan; Wang, Xi; Gao, Bin-Bin; Zhang, Tianliang; Wu, Wenlong; Zhang, Wei; Wang, Chengjie; Xie, Yuan

doi:10.1145/3503161.3547834

arXiv:2211.12716 (cs)

[Submitted on 23 Nov 2022]

Abstract: Multi-label image classification, which can be categorized into label-dependency and region-based methods, is a challenging problem due to the complex underlying object layouts. Although region-based methods are less likely to encounter issues with model generalizability than label-dependency methods, they often generate hundreds of meaningless or noisy proposals with non-discriminative information, and the contextual dependency among the localized regions is often ignored or over-simplified. This paper builds a unified framework to perform effective noisy-proposal suppression and to interact between global and local features for robust feature learning. Specifically, we propose category-aware weak supervision to concentrate on non-existent categories so as to provide deterministic information for local feature learning, restricting the local branch to focus on more high-quality regions of interest. Moreover, we develop a cross-granularity attention module to explore the complementary information between global and local features, which can build the high-order feature correlation containing not only global-to-local, but also local-to-local relations. Both advantages guarantee a boost in the performance of the whole network. Extensive experiments on two large-scale datasets (MS-COCO and VOC 2007) demonstrate that our framework achieves superior performance over state-of-the-art methods.
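
The abstract does not describe the internals of the cross-granularity attention module, so the following is only a rough illustration of the general idea, not the authors' implementation. It assumes plain scaled dot-product self-attention over one global token and several local (region) tokens, with learned projections omitted; the function name `cross_granularity_attention`, the feature shapes, and the random inputs are all hypothetical. The point is simply that a single attention matrix over the joint token set contains both global-to-local and local-to-local weights.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def cross_granularity_attention(global_feat, local_feats):
    """Hypothetical sketch: joint self-attention over 1 global and n local tokens.

    global_feat: shape (d,), a whole-image feature (assumed input).
    local_feats: shape (n, d), one feature per region proposal (assumed input).
    Returns the updated tokens, shape (n + 1, d), and the attention
    weights, shape (n + 1, n + 1).
    """
    # Stack the global token on top of the local tokens: row 0 is global.
    tokens = np.vstack([global_feat[None, :], local_feats])
    d = tokens.shape[1]
    # Scaled dot-product scores; learned Q/K/V projections are omitted here.
    scores = tokens @ tokens.T / np.sqrt(d)
    weights = softmax(scores, axis=-1)
    return weights @ tokens, weights


# Random stand-in features (assumption: 4 regions, 8-dimensional features).
rng = np.random.default_rng(0)
global_feat = rng.normal(size=8)
local_feats = rng.normal(size=(4, 8))
updated, weights = cross_granularity_attention(global_feat, local_feats)

global_to_local = weights[0, 1:]   # global token attending to each region
local_to_local = weights[1:, 1:]   # regions attending to one another
print(updated.shape, weights.shape)
```

In this sketch, row 0 of `weights` holds the global-to-local relations and the lower-right block holds the local-to-local relations; how the paper actually combines or supervises them is not stated in the abstract.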

Comments:	12 pages, 10 figures, published in ACMMM 2022
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2211.12716 [cs.CV]
	(or arXiv:2211.12716v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2211.12716
Journal reference:	Proceedings of the 30th ACM International Conference on Multimedia. 2022: 6318-6326
Related DOI:	https://doi.org/10.1145/3503161.3547834

Submission history

From: Jiawei Zhan
[v1] Wed, 23 Nov 2022 05:39:17 UTC (8,309 KB)
