Near-Bayesian Support Vector Machines for Imbalanced Data Classification with Equal or Unequal Misclassification Costs

Overview
Journal: Neural Netw
Specialties: Biology, Neurology
Date: 2015 Jul 27
PMID: 26210983
Citations: 9
Abstract

Support Vector Machines (SVMs) form a family of popular classifier algorithms originally developed to solve two-class classification problems. However, SVMs are likely to perform poorly when the classes are imbalanced, particularly when the target class is under-represented. This paper proposes a Near-Bayesian Support Vector Machine (NBSVM) for such imbalanced classification problems, combining the philosophies of decision boundary shift and unequal regularization costs. Based on certain assumptions that hold true for most real-world datasets, we use the fraction of the data that each class represents to achieve both the boundary shift and the asymmetric regularization costs. The proposed approach is extended to the multi-class scenario and also adapted for cases with unequal misclassification costs for the different classes. An extensive comparison with the standard SVM and some state-of-the-art methods demonstrates that the proposed approach performs competitively on imbalanced datasets. A modified Sequential Minimal Optimization (SMO) algorithm is also presented to solve the NBSVM optimization problem in a computationally efficient manner.
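
The abstract does not give the NBSVM formulas, so the following is only a rough sketch of the general idea of deriving asymmetric regularization costs from class fractions. It assumes the common inverse-frequency ("balanced") heuristic, which may differ from the paper's actual rule, and it does not attempt the boundary shift. The function names `class_fractions` and `asymmetric_costs` and the parameter `base_cost` are hypothetical, chosen for illustration.

```python
from collections import Counter


def class_fractions(labels):
    """Return the fraction of samples that belongs to each class."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: n / total for cls, n in counts.items()}


def asymmetric_costs(labels, base_cost=1.0):
    """Per-class regularization costs, inversely proportional to class fraction.

    Assumption (not from the paper): the inverse-frequency heuristic
    C_k = base_cost / (K * f_k), where K is the number of classes and
    f_k is the fraction of samples in class k.
    """
    fractions = class_fractions(labels)
    n_classes = len(fractions)
    return {cls: base_cost / (n_classes * f) for cls, f in fractions.items()}


# Toy imbalanced problem: 90 majority samples (class 0), 10 minority samples (class 1).
labels = [0] * 90 + [1] * 10
fractions = class_fractions(labels)
costs = asymmetric_costs(labels)
```

Under this heuristic the under-represented class receives the larger cost, so errors on it are penalized more heavily during training. Dictionaries of this kind could, for instance, be passed as per-class weights to an SVM implementation that supports them.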

Citing Articles

Exploring the Impact of the NULL Class on In-the-Wild Human Activity Recognition.

Cherian J, Ray S, Taele P, Koh J, Hammond T. Sensors (Basel). 2024; 24(12).

PMID: 38931682 PMC: 11207638. DOI: 10.3390/s24123898.


On sparse ensemble methods: An application to short-term predictions of the evolution of COVID-19.

Benitez-Pena S, Carrizosa E, Guerrero V, Jimenez-Gamero M, Martin-Barragan B, Molero-Rio C. Eur J Oper Res. 2022; 295(2):648-663.

PMID: 36569384 PMC: 9759092. DOI: 10.1016/j.ejor.2021.04.016.


Integration of gene co-expression analysis and multi-class SVM specifies the functional players involved in determining the fate of HTLV-1 infection toward the development of cancer (ATLL) or neurological disorder (HAM/TSP).

Ghobadi M, Emamzadeh R. PLoS One. 2022; 17(1):e0262739.

PMID: 35041720 PMC: 8765610. DOI: 10.1371/journal.pone.0262739.


Research on expansion and classification of imbalanced data based on SMOTE algorithm.

Wang S, Dai Y, Shen J, Xuan J. Sci Rep. 2021; 11(1):24039.

PMID: 34912009 PMC: 8674253. DOI: 10.1038/s41598-021-03430-5.


Automatic Multi-Label ECG Classification with Category Imbalance and Cost-Sensitive Thresholding.

Liu Y, Li Q, Wang K, Liu J, He R, Yuan Y. Biosensors (Basel). 2021; 11(11).

PMID: 34821669 PMC: 8615597. DOI: 10.3390/bios11110453.