Optimizing the Whole-life Cost in End-to-end CNN Acceleration

Zhang, Jiaqi; Chen, Xiangru; Ray, Sandip

Computer Science > Distributed, Parallel, and Cluster Computing

arXiv:2104.05541 (cs)

[Submitted on 12 Apr 2021 (v1), last revised 22 Jul 2021 (this version, v2)]


Abstract: The acceleration of CNNs has gained increasing attention since their success in computer vision. Because heterogeneous functional layers cannot be processed by accelerators proposed for convolution layers only, modern end-to-end CNN acceleration solutions either transform the diverse computation into matrix/vector arithmetic, which loses data reuse opportunities in convolution, or introduce a dedicated functional unit for each kind of layer, which results in underutilization and high update expense. To enhance the whole-life cost efficiency, we need an acceleration solution that is efficient in processing CNN layers and has the generality to apply to all kinds of existing and emerging layers. To this end, we propose GCONV Chain, a method to convert the entire CNN computation into a chain of standard general convolutions (GCONV) that can be efficiently processed by existing CNN accelerators. This paper comprehensively analyzes the GCONV Chain model and proposes a full-stack implementation to support GCONV Chain. On one hand, the results on seven different CNNs demonstrate that GCONV Chain improves the performance and energy efficiency of existing CNN accelerators by an average of 3.4x and 3.2x, respectively. On the other hand, we show that GCONV Chain provides low whole-life costs for CNN acceleration, including both developer effort and total cost of ownership for users.
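The abstract does not define GCONV itself, so the following is only a minimal sketch of the general idea that non-convolution layers can be written as convolutions. It is not the paper's formulation. The function names (`conv2d`, `avg_pool_as_conv`, `dense_as_conv`) and the single-channel, no-padding setting are assumptions chosen for illustration.

```python
def conv2d(x, kernel, stride=1):
    """Plain 2-D convolution (cross-correlation, no padding) on nested lists."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = (len(x) - kh) // stride + 1
    out_w = (len(x[0]) - kw) // stride + 1
    return [
        [
            sum(
                x[i * stride + di][j * stride + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def avg_pool_as_conv(x, size=2):
    """Average pooling written as a strided convolution with a constant kernel."""
    kernel = [[1.0 / (size * size)] * size for _ in range(size)]
    return conv2d(x, kernel, stride=size)


def dense_as_conv(x, weights):
    """One fully connected output written as a convolution whose kernel
    covers the whole input, so the output is a single value."""
    return conv2d(x, weights, stride=1)[0][0]


# Hypothetical 4x4 single-channel feature map used only for illustration.
feature_map = [
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
    [9.0, 10.0, 11.0, 12.0],
    [13.0, 14.0, 15.0, 16.0],
]

pooled = avg_pool_as_conv(feature_map)  # 2x2 average pooling
dense_out = dense_as_conv(feature_map, [[1.0] * 4 for _ in range(4)])
```

Both layers reuse the same `conv2d` routine with different kernels and strides, which is the kind of uniformity that lets a single convolution engine serve several layer types in this toy setting.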

Subjects:	Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (cs.LG)
Cite as:	arXiv:2104.05541 [cs.DC]
	(or arXiv:2104.05541v2 [cs.DC] for this version)
	https://doi.org/10.48550/arXiv.2104.05541

Submission history

From: Jiaqi Zhang
[v1] Mon, 12 Apr 2021 15:12:42 UTC (790 KB)
[v2] Thu, 22 Jul 2021 00:46:39 UTC (790 KB)
