Stream Iterative Distributed Coded Computing for Learning Applications in Heterogeneous Systems

Esfahanizadeh, Homa; Cohen, Alejandro; Medard, Muriel

arXiv:2204.13195 (cs)

[Submitted on 27 Apr 2022]

Abstract: To improve the utility of learning applications and render machine learning solutions feasible for complex applications, a substantial amount of heavy computation is needed. Thus, it is essential to distribute the computations among several workers, which brings up the major challenge of coping with delays and failures caused by the system's heterogeneity and uncertainties. In particular, minimizing the end-to-end in-order job execution delay, from arrival to delivery, is of great importance for real-world delay-sensitive applications. In this paper, for the computation of each job iteration in a stochastic heterogeneous distributed system where the workers vary in their computing and communication powers, we present a novel joint scheduling-coding framework that optimally splits the coded computational load among the workers. This closes the gap between the workers' response times and is critical for maximizing resource utilization. To further reduce the in-order execution delay, we also incorporate redundant computations in each iteration of a distributed computational job. Our simulation results demonstrate that the delay obtained using the proposed solution is dramatically lower than that of the uniform split, which is oblivious to the system's heterogeneity, and, in fact, is very close to an ideal lower bound just by introducing a small percentage of redundant computations.
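
To make the idea of a heterogeneity-aware split concrete, the following is a minimal sketch and not the paper's framework. It assumes a simplified deterministic model in which worker i has a fixed computing rate and a fixed communication delay, and it ignores coding, redundancy, and randomness entirely. All names (`uniform_split`, `balanced_split`, `completion_time`, `rates`, `delays`) are hypothetical and chosen only for illustration. Under this model, the balanced split picks loads so that every worker responds at the same time, which is one simple way to close the gap between response times.

```python
from typing import List


def uniform_split(total_load: float, n_workers: int) -> List[float]:
    """Give every worker the same share, ignoring heterogeneity."""
    return [total_load / n_workers] * n_workers


def balanced_split(total_load: float, rates: List[float],
                   delays: List[float]) -> List[float]:
    """Choose loads so that all workers respond at the same time T.

    Assumed toy model: worker i responds at delays[i] + load_i / rates[i].
    Setting this equal to T for every i and requiring the loads to sum to
    total_load gives T = (total_load + sum(rate*delay)) / sum(rate).
    """
    finish = (total_load + sum(r * d for r, d in zip(rates, delays))) / sum(rates)
    loads = [r * (finish - d) for r, d in zip(rates, delays)]
    if min(loads) < 0:
        # A worker whose delay exceeds T would need a negative load; this
        # sketch does not handle excluding such workers.
        raise ValueError("a worker is too slow to participate in this toy model")
    return loads


def completion_time(loads: List[float], rates: List[float],
                    delays: List[float]) -> float:
    """The iteration finishes when the slowest worker responds."""
    return max(d + l / r for l, r, d in zip(loads, rates, delays))


# Hypothetical example values: three workers with different speeds and delays.
rates = [1.0, 2.0, 4.0]      # tasks per second (assumed)
delays = [0.5, 0.2, 0.1]     # seconds of communication delay (assumed)
total_load = 70.0            # tasks in one job iteration (assumed)

uniform = uniform_split(total_load, len(rates))
balanced = balanced_split(total_load, rates, delays)

print("uniform split time :", completion_time(uniform, rates, delays))
print("balanced split time:", completion_time(balanced, rates, delays))
```

In this toy setting the uniform split is held back by the slowest worker, while the balanced split assigns more load to faster workers so that none sits idle. The framework in the paper addresses the stochastic case and adds coded redundancy, which this sketch does not attempt to model.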

Comments:	Accepted to appear in IEEE International Conference on Computer Communications (IEEE Infocom)
Subjects:	Distributed, Parallel, and Cluster Computing (cs.DC); Information Theory (cs.IT)
Cite as:	arXiv:2204.13195 [cs.DC]
	(or arXiv:2204.13195v1 [cs.DC] for this version)
	https://doi.org/10.48550/arXiv.2204.13195

Submission history

From: Homa Esfahanizadeh
[v1] Wed, 27 Apr 2022 21:08:29 UTC (2,600 KB)
