Distributed Inference for Linear Support Vector Machine

Wang, Xiaozhou; Yang, Zhuoyi; Chen, Xi; Liu, Weidong

arXiv:1811.11922 (stat)

[Submitted on 29 Nov 2018 (v1), last revised 20 Sep 2019 (this version, v2)]

Abstract: The growing size of modern data brings many new challenges to existing statistical inference methodologies and theories, and calls for the development of distributed inferential approaches. This paper studies distributed inference for the linear support vector machine (SVM) in the binary classification task. Despite a vast literature on SVM, much less is known about its inferential properties, especially in a distributed setting. We propose a multi-round distributed linear-type (MDL) estimator for conducting inference for linear SVM. The proposed estimator is computationally efficient: it requires only an initial SVM estimator, which it then successively refines by solving a simple weighted least squares problem. Theoretically, we establish the Bahadur representation of the estimator. Based on this representation, we further derive its asymptotic normality, which shows that the MDL estimator achieves the optimal statistical efficiency, i.e., the same efficiency as the classical linear SVM applied to the entire data set in a single-machine setup. Moreover, our asymptotic result avoids the condition on the number of machines or data batches that is commonly assumed in the distributed estimation literature, and it allows for diverging dimension. We provide simulation studies to demonstrate the performance of the proposed MDL estimator.

Comments:	50 pages, 11 figures
Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG); Methodology (stat.ME)
Cite as:	arXiv:1811.11922 [stat.ML]
	(or arXiv:1811.11922v2 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.1811.11922
Journal reference:	Journal of Machine Learning Research (JMLR), v20(113):1-41, 2019

Submission history

From: Zhuoyi Yang
[v1] Thu, 29 Nov 2018 02:05:09 UTC (631 KB)
[v2] Fri, 20 Sep 2019 20:23:16 UTC (631 KB)
