Cost-sensitive probabilistic predictions for support vector machines

Benítez-Peña, Sandra; Blanquero, Rafael; Carrizosa, Emilio; Ramírez-Cobo, Pepa

doi:10.1016/j.ejor.2023.09.027

arXiv:2310.05997 (stat)

[Submitted on 9 Oct 2023]

Title: Cost-sensitive probabilistic predictions for support vector machines

Authors: Sandra Benítez-Peña, Rafael Blanquero, Emilio Carrizosa, Pepa Ramírez-Cobo

Abstract: Support vector machines (SVMs) are among the most widely studied and used machine learning models for two-class classification. Classification in SVMs is based on a score procedure that yields a deterministic classification rule; this rule can be transformed into a probabilistic one (as implemented in off-the-shelf SVM libraries), but it is not probabilistic in nature. Moreover, tuning the regularization parameters of an SVM is known to require a high computational effort and generates information that is not fully exploited, since it is not used to build a probabilistic classification rule. In this paper we propose a novel approach to generate probabilistic outputs for the SVM. The new method has three properties. First, it is designed to be cost-sensitive, so the different importance of sensitivity (true positive rate, TPR) and specificity (true negative rate, TNR) is readily accommodated in the model. As a result, the model can deal with imbalanced datasets, which are common in operational business problems such as churn prediction or credit scoring. Second, the SVM is embedded in an ensemble method to improve its performance, making use of the valuable information generated in the parameter-tuning process. Finally, the probabilities are estimated via bootstrap estimates, avoiding the parametric models used by competing approaches. Numerical tests on a wide range of datasets show the advantages of our approach over benchmark procedures.
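
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of a bootstrap-based probability estimate, and it is not the authors' method. It assumes a nearest-centroid rule as a dependency-free stand-in for the SVM, omits the parameter-tuning ensemble entirely, and estimates the probability of the positive class as the fraction of bootstrap-trained models that vote positive. All function names, the toy data, and the thresholds are hypothetical. Lowering the decision threshold is shown as one simple way to trade TNR for TPR on an imbalanced sample.

```python
import random


def fit_centroids(X, y):
    """Stand-in base classifier (NOT an SVM): one mean vector per class."""
    centroids = {}
    for label in (-1, 1):
        pts = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(pts) for col in zip(*pts)]
    return centroids


def predict(centroids, x):
    """Assign x to the class whose centroid is closest (squared distance)."""
    dist = {lab: sum((a - b) ** 2 for a, b in zip(x, c))
            for lab, c in centroids.items()}
    return 1 if dist[1] <= dist[-1] else -1


def bootstrap_models(X, y, n_models=50, seed=0):
    """Train one stand-in classifier per bootstrap resample of (X, y)."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    while len(models) < n_models:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y[i] for i in idx]
        if 1 in ys and -1 in ys:  # skip resamples that lost a class
            models.append(fit_centroids([X[i] for i in idx], ys))
    return models


def positive_probability(models, x):
    """Bootstrap estimate: share of models that classify x as positive."""
    return sum(1 for m in models if predict(m, x) == 1) / len(models)


def tpr_tnr(y_true, y_pred):
    """Sensitivity (TPR) and specificity (TNR) for labels in {-1, +1}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == -1 and p == -1)
    pos = sum(1 for t in y_true if t == 1)
    neg = sum(1 for t in y_true if t == -1)
    return tp / pos, tn / neg


# Hypothetical imbalanced toy data: 40 negatives, 10 positives.
_rng = random.Random(1)
X = [[_rng.gauss(0, 1), _rng.gauss(0, 1)] for _ in range(40)] + \
    [[_rng.gauss(1.5, 1), _rng.gauss(1.5, 1)] for _ in range(10)]
y = [-1] * 40 + [1] * 10

models = bootstrap_models(X, y)
probs = [positive_probability(models, x) for x in X]


def rates_at(threshold):
    """TPR/TNR when predicting positive iff estimated probability >= threshold."""
    preds = [1 if p >= threshold else -1 for p in probs]
    return tpr_tnr(y, preds)
```

Calling `rates_at(0.5)` and `rates_at(0.2)` on the toy data compares a neutral threshold with one that favors the positive class: the lower threshold can only keep or raise TPR, and can only keep or lower TNR.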

Comments:	European Journal of Operational Research (2023)
Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as:	arXiv:2310.05997 [stat.ML]
	(or arXiv:2310.05997v1 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.2310.05997
Related DOI:	https://doi.org/10.1016/j.ejor.2023.09.027

Submission history

From: Sandra Benítez-Peña
[v1] Mon, 9 Oct 2023 11:00:17 UTC (2,052 KB)
