Red-blue pebbling revisited: near optimal parallel matrix-matrix multiplication

Kwasniewski, Grzegorz; Kabić, Marko; Besta, Maciej; VandeVondele, Joost; Solcà, Raffaele; Hoefler, Torsten

arXiv:1908.09606 (cs)

[Submitted on 26 Aug 2019 (v1), last revised 13 Dec 2019 (this version, v3)]

Authors: Grzegorz Kwasniewski (1), Marko Kabić (2,3), Maciej Besta (1), Joost VandeVondele (2,3), Raffaele Solcà (2,3), Torsten Hoefler (1) ((1) Department of Computer Science, ETH Zurich, (2) ETH Zurich, (3) Swiss National Supercomputing Centre (CSCS))

Abstract: We propose COSMA: a parallel matrix-matrix multiplication (MMM) algorithm that is near communication-optimal for all combinations of matrix dimensions, processor counts, and memory sizes. The key idea behind COSMA is to derive an optimal (up to a factor of 0.03\% for 10 MB of fast memory) sequential schedule and then parallelize it, preserving I/O optimality. To achieve this, we use the red-blue pebble game to precisely model MMM dependencies and derive constructive and tight sequential and parallel I/O lower bound proofs. Compared to 2D or 3D algorithms, which fix the processor decomposition upfront and then map it to the matrix dimensions, COSMA reduces communication volume by up to $\sqrt{3}$ times. COSMA outperforms the established ScaLAPACK, CARMA, and CTF algorithms in all scenarios, by up to 12.8x (2.2x on average), achieving up to 88\% of Piz Daint's peak performance. Our work does not require any hand tuning and is maintained as an open-source implementation.

Comments:	18 pages, 29 figures, short version submitted to the SC'19 conference
Subjects:	Computational Complexity (cs.CC); Distributed, Parallel, and Cluster Computing (cs.DC); Performance (cs.PF)
Cite as:	arXiv:1908.09606 [cs.CC]
	(or arXiv:1908.09606v3 [cs.CC] for this version)
	https://doi.org/10.48550/arXiv.1908.09606

Submission history

From: Grzegorz Kwasniewski
[v1] Mon, 26 Aug 2019 11:40:17 UTC (1,235 KB)
[v2] Thu, 29 Aug 2019 00:24:39 UTC (1,235 KB)
[v3] Fri, 13 Dec 2019 15:36:04 UTC (1,235 KB)
