VectorNet: Encoding HD Maps and Agent Dynamics from Vectorized Representation

Gao, Jiyang; Sun, Chen; Zhao, Hang; Shen, Yi; Anguelov, Dragomir; Li, Congcong; Schmid, Cordelia

arXiv:2005.04259 (cs)

[Submitted on 8 May 2020]

Abstract: Behavior prediction in dynamic, multi-agent systems is an important problem in the context of self-driving cars, due to the complex representations and interactions of road components, including moving agents (e.g. pedestrians and vehicles) and road context information (e.g. lanes, traffic lights). This paper introduces VectorNet, a hierarchical graph neural network that first exploits the spatial locality of individual road components represented by vectors and then models the high-order interactions among all components. In contrast to most recent approaches, which render trajectories of moving agents and road context information as bird's-eye-view images and encode them with convolutional neural networks (ConvNets), our approach operates on a vector representation. By operating on the vectorized high-definition (HD) maps and agent trajectories, we avoid lossy rendering and computationally intensive ConvNet encoding steps. To further boost VectorNet's capability in learning context features, we propose a novel auxiliary task to recover the randomly masked-out map entities and agent trajectories based on their context. We evaluate VectorNet on our in-house behavior prediction benchmark and the recently released Argoverse forecasting dataset. Our method achieves performance on par with or better than the competitive rendering approach on both benchmarks while saving over 70% of the model parameters with an order of magnitude reduction in FLOPs. It also outperforms the state of the art on the Argoverse dataset.
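
The two-level idea in the abstract (local aggregation within each road component, then interaction among all components) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`vectorize`, `pool_polyline`, `global_interaction`) and the six-element vector layout are assumptions made for illustration, and the learned layers of a real graph neural network are replaced by plain max pooling and unweighted dot-product attention.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
# Assumed layout: [x_start, y_start, x_end, y_end, type_id, polyline_id]
Vector = List[float]


def vectorize(points: Sequence[Point], type_id: int, polyline_id: int) -> List[Vector]:
    """Turn an ordered point sequence into one vector per consecutive point pair."""
    return [
        [xs, ys, xe, ye, float(type_id), float(polyline_id)]
        for (xs, ys), (xe, ye) in zip(points, points[1:])
    ]


def pool_polyline(vectors: Sequence[Vector]) -> Vector:
    """Local stage: element-wise max over all vectors of one polyline."""
    return [max(column) for column in zip(*vectors)]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def global_interaction(features: Sequence[Vector]) -> List[Vector]:
    """Global stage: each polyline feature attends to every polyline feature."""
    scale = math.sqrt(len(features[0]))
    updated = []
    for query in features:
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in features]
        weights = softmax(scores)
        updated.append([
            sum(w * f[d] for w, f in zip(weights, features))
            for d in range(len(query))
        ])
    return updated


# Hypothetical toy scene: one lane centerline and one agent trajectory.
lane = vectorize([(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)], type_id=0, polyline_id=0)
agent = vectorize([(0.0, 0.5), (0.5, 0.5), (1.0, 0.6)], type_id=1, polyline_id=1)
polyline_features = [pool_polyline(lane), pool_polyline(agent)]
context_features = global_interaction(polyline_features)
```

Each polyline is reduced to a single feature before any cross-component interaction takes place, so the global stage scales with the number of road components rather than with the number of pixels in a rendered image.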

Comments:	CVPR 2020
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:2005.04259 [cs.CV]
	(or arXiv:2005.04259v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2005.04259

Submission history

From: Jiyang Gao
[v1] Fri, 8 May 2020 19:07:03 UTC (1,620 KB)
