The Unreasonable Effectiveness of Random Pruning: Return of the Most Naive Baseline for Sparse Training

Liu, Shiwei; Chen, Tianlong; Chen, Xiaohan; Shen, Li; Mocanu, Decebal Constantin; Wang, Zhangyang; Pechenizkiy, Mykola

arXiv:2202.02643 (cs)

[Submitted on 5 Feb 2022]


Abstract: Random pruning is arguably the most naive way to attain sparsity in neural networks, but it has been deemed uncompetitive in either post-training pruning or sparse training. In this paper, we focus on sparse training and highlight a perhaps counter-intuitive finding: random pruning at initialization can be quite powerful for the sparse training of modern neural networks. Without any delicate pruning criteria or carefully pursued sparsity structures, we empirically demonstrate that sparsely training a randomly pruned network from scratch can match the performance of its dense equivalent. Two key factors contribute to this revival: (i) network size matters: as the original dense networks grow wider and deeper, the performance of a randomly pruned sparse network quickly grows to match that of its dense equivalent, even at high sparsity ratios; (ii) appropriate layer-wise sparsity ratios can be pre-chosen for sparse training, which proves to be another important performance booster. Simple as it looks, a randomly pruned subnetwork of Wide ResNet-50 can be sparsely trained to outperform a dense Wide ResNet-50 on ImageNet. We also observe that such randomly pruned networks outperform their dense counterparts in other favorable aspects, such as out-of-distribution detection, uncertainty estimation, and adversarial robustness. Overall, our results strongly suggest that there is larger-than-expected room for sparse training at scale, and that the benefits of sparsity might be more universal, extending beyond carefully designed pruning. Our source code can be found at this https URL.
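As a rough illustration of what "random pruning at initialization" with pre-chosen layer-wise sparsity ratios could look like, the following minimal sketch draws one random binary mask per layer. It is not the authors' implementation: the function names `random_mask` and `prune_at_init`, the layer names, the layer sizes, and the sparsity values are all hypothetical, and the abstract does not say which layer-wise ratios the paper uses.

```python
import random


def random_mask(num_weights, sparsity, rng):
    """Return a 0/1 list that keeps round(num_weights * (1 - sparsity))
    positions, chosen uniformly at random with no pruning criterion."""
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must be in [0, 1)")
    num_kept = round(num_weights * (1.0 - sparsity))
    kept = set(rng.sample(range(num_weights), num_kept))
    return [1 if i in kept else 0 for i in range(num_weights)]


def prune_at_init(layer_sizes, layer_sparsity, seed=0):
    """Draw one random mask per layer before any training happens.

    layer_sizes:    {layer name: number of weights}  (hypothetical values)
    layer_sparsity: {layer name: fraction of weights to remove}
    """
    rng = random.Random(seed)
    return {
        name: random_mask(size, layer_sparsity[name], rng)
        for name, size in layer_sizes.items()
    }


# Hypothetical three-layer network with hand-picked layer-wise ratios.
layer_sizes = {"conv1": 1000, "conv2": 4000, "fc": 500}
layer_sparsity = {"conv1": 0.5, "conv2": 0.9, "fc": 0.8}

masks = prune_at_init(layer_sizes, layer_sparsity)
kept_counts = {name: sum(mask) for name, mask in masks.items()}
overall_sparsity = 1.0 - sum(kept_counts.values()) / sum(layer_sizes.values())
```

In a sparse training loop under these assumptions, each mask would stay fixed and be multiplied into its layer's weights after every update, so pruned weights remain zero from initialization to the end of training.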

Comments:	Published as a conference paper at ICLR 2022. Code is available at this https URL
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2202.02643 [cs.LG]
	(or arXiv:2202.02643v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2202.02643

Submission history

From: Shiwei Liu
[v1] Sat, 5 Feb 2022 21:19:41 UTC (3,600 KB)
