Memory Requirement Reduction of Deep Neural Networks Using Low-bit Quantization of Parameters

Nicodemo, Niccoló; Naithani, Gaurav; Drossos, Konstantinos; Virtanen, Tuomas; Saletti, Roberto

arXiv:1911.00527 (eess)

[Submitted on 1 Nov 2019]

Abstract: Effective deployment of deep neural networks (DNNs) on mobile devices and embedded systems is hampered by their memory and computational requirements. This paper presents a non-uniform quantization approach that allows dynamic quantization of DNN parameters, both across different layers and within the same layer. A virtual bit shift (VBS) scheme is also proposed to improve the accuracy of the quantization. The method reduces the memory requirements while preserving the performance of the network. It is validated in a speech enhancement application, where a fully connected DNN is used to predict the clean speech spectrum from the noisy input speech spectrum. The DNN is optimized, and its memory footprint and performance are evaluated using the short-time objective intelligibility (STOI) metric. Applying the low-bit quantization allows a 50% reduction of the DNN memory footprint, while the STOI performance drops by only 2.7%.
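
The memory arithmetic behind choosing a bit width per layer can be illustrated with a minimal sketch. It is not the method of the paper: the abstract does not specify the quantizer or the VBS scheme, so the code below assumes a plain symmetric uniform quantizer applied layer by layer, and the function names, layer sizes, and bit widths are all hypothetical.

```python
def quantize_layer(weights, bits):
    """Map float weights to signed integer codes of the given bit width.

    Assumption: a symmetric uniform quantizer with one scale per layer.
    """
    qmax = 2 ** (bits - 1) - 1                      # largest positive code
    max_abs = max(abs(w) for w in weights) or 1.0   # avoid a zero scale
    scale = max_abs / qmax
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def dequantize_layer(codes, scale):
    """Recover approximate float weights from integer codes."""
    return [c * scale for c in codes]


def footprint_bits(layer_sizes, bits_per_layer):
    """Total parameter storage in bits, given one bit width per layer."""
    return sum(n * b for n, b in zip(layer_sizes, bits_per_layer))


# Hypothetical three-layer network: parameter counts and per-layer bit widths.
layer_sizes = [1000, 2000, 1000]
baseline = footprint_bits(layer_sizes, [32, 32, 32])
quantized = footprint_bits(layer_sizes, [8, 16, 24])
print(quantized / baseline)  # 0.5
```

With these assumed sizes, mixed widths of 8, 16, and 24 bits halve the footprint relative to 32-bit storage. That figure is a property of the chosen example, not a reproduction of the result reported in the abstract.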

Subjects:	Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Performance (cs.PF); Sound (cs.SD)
Cite as:	arXiv:1911.00527 [eess.AS]
	(or arXiv:1911.00527v1 [eess.AS] for this version)
	https://doi.org/10.48550/arXiv.1911.00527

Submission history

From: Konstantinos Drossos
[v1] Fri, 1 Nov 2019 18:03:12 UTC (84 KB)
