
Neural Networks 161 (2023) 178–184

Contents lists available at ScienceDirect: Neural Networks
journal homepage: www.elsevier.com/locate/neunet

Neural Networks Letter

VISAL—A novel learning strategy to address class imbalance

Sree Rama Vamsidhar S.∗,1, Arun Kumar Sivapuram1,2, Vaishnavi Ravi1,2, Gowtham Senthil3, Rama Krishna Gorthi4
Indian Institute of Technology, Tirupati, 517619, India

article info

Article history:
Received 24 November 2021
Received in revised form 22 November 2022
Accepted 16 January 2023
Available online 20 January 2023

Keywords:
Data imbalance
Deep neural networks
Image classification
Learning function

abstract

In imbalanced data scenarios, Deep Neural Networks (DNNs) fail to generalize well on minority classes. In this letter, we propose a simple and effective learning function, Visually Interpretable Space Adjustment Learning (VISAL), to handle the imbalanced data classification task. VISAL's objective is to create more room for the generalization of minority class samples by bringing both angular and euclidean margins into the cross-entropy learning strategy. When evaluated on imbalanced versions of the CIFAR, Tiny ImageNet, COVIDx and IMDB reviews datasets, our proposed method outperforms the state-of-the-art works by a significant margin.

© 2023 Elsevier Ltd. All rights reserved.

1. Introduction

Image classification is a well-established deep learning task in computer vision. The more training data available, the more robust and reliable the resulting deep learning model will be. In real-world scenarios, acquiring large amounts of open-source data is difficult for various reasons such as confidentiality and privacy. Further, data annotation is a hindering task. Especially when scarcity occurs only among a few classes, it leads to an imbalance in the data. In this case, DNNs with a huge number of parameters tend to overfit to the minority classes and generalize poorly to unseen minority samples.

The best way to counter this class imbalance problem is to re-balance the situation either at the data level or through algorithmic approaches.

(i) Data level approaches: "Data re-sampling" is the most commonly used method to address the imbalance problem. Re-sampling the given data balances the difference in prior information of majority and minority classes. It involves under-sampling the majority class samples or over-sampling the minority class. Under-sampling eliminates samples from the majority classes (He & Garcia, 2009; Japkowicz & Stephen, 2002), particularly the overlapping ones (Bunkhumpornpat & Sinapiromsaran, 2016; Vuttipittayamongkol & Elyan, 2020). Removing samples from the majority class, however, also removes bias, leading to the loss of true diverse information. One more point to note is that under-sampling leaves behind a small number of samples for training, which is often not enough to train deep models. In over-sampling, training samples are increased either by augmenting them or by synthesizing new samples from existing minority class samples (Chawla, Bowyer, Hall, & Kegelmeyer, 2002). Kim, Jeong, and Shin (2020) attempted to bring the diversity existing in majority class data into the minority class. CDSMOTE (Elyan, Moreno-Garcia, & Jayne, 2021) reduced the dominance of majority class instances using class decomposition besides over-sampling the minority class. Generative Adversarial Network (GAN)-based methods (Goodfellow et al., 2014) have also been proposed for data augmentation to reduce imbalance ratios. These methods aim to generate minority class samples to re-balance the dataset (Ali-Gombe & Elyan, 2019; Antoniou, Storkey, & Edwards, 2017; Baur, Albarqouni, & Navab, 2018; Mariani, Scheidegger, Istrate, Bekas, & Malossi, 2018). However, augmentation is not equivalent to more data, due to the lack of true diversity. Therefore, it results in over-fitting to the small number of samples, leading to poor generalization in the regime of "extreme" imbalance.

(ii) Algorithmic approaches: These approaches aim to alter the decision boundary to balance the difference in the given imbalanced data distribution. The Support Vector Machine (SVM) is one of the first learning algorithms employed for this purpose, in the work Kernel Boundary Alignment (KBA) (Wu & Chang, 2005). KBA adjusts the class boundary and clusters the majority classes by modifying the kernel matrix, taking the imbalanced data distribution as its prior information to provide a large margin. DNN-based methods modify the objective function to adjust the prior probability bias created by the variation in class-wise frequency of samples in the dataset, re-weighting the loss function by a factor inversely proportional to the sample frequency of each class (Buda, Maki, & Mazurowski, 2018).

∗ Corresponding author.
E-mail addresses: [email protected] (Sree Rama Vamsidhar S.), [email protected] (A.K. Sivapuram), [email protected] (V. Ravi), [email protected] (G. Senthil), [email protected] (R.K. Gorthi).
1 PhD Scholar at Dept of Electrical Engineering, IITTP.
2 Equally contributed.
3 B.Tech Student at Dept of Electrical Engineering, IITTP.
4 Associate Professor at Dept of Electrical Engineering, IITTP.

https://doi.org/10.1016/j.neunet.2023.01.015
Categorical Cross-Entropy (CE) is the standard loss function used for the classification task when the dataset is balanced. However, in the case of imbalance, CE loss works against the minority class and the model is prone to overfitting. In order to achieve better class separability under imbalance conditions, vanilla CE loss is transformed either by revising the similarity assessment term (i.e., the final logits) or by supplementing regularizers (Kornblith, Lee, Chen, & Norouzi, 2020). Some such variants are: Class-Balanced Loss (Cui, Jia, Lin, Song, & Belongie, 2019), which uses the concept of the "effective number" of samples as alternative weights in the re-weighting method; Focal loss (Lin, Goyal, Girshick, He, & Dollár, 2017), designed to address class imbalance by down-weighting easy examples; and re-weighting approaches (Khan et al., 2017; Khan, Hayat, Zamir, Shen, & Shao, 2019b), proposed for improving the generalization of long-tail data. Influence Balanced (IB) loss (Park et al., 2021) re-weights the samples according to their influence, to form a well-generalized decision boundary on class-imbalanced data.

Some recent works aim to improve the model's ability to distinguish between the classes by including a margin. The margin of class i is defined as the minimum distance of data in the ith class to the decision boundary. Asymmetrical margins for imbalanced data applications are studied in Khan, Hayat, Zamir, Shen, and Shao (2019a). The angular margin is the additive/multiplicative margin introduced between the classes in angular space to minimize intra-class variance and maximize inter-class variance. Angular margin-based losses like SphereFace (Liu et al., 2017), ArcFace (Deng, Guo, Xue, & Zafeiriou, 2019), Large-Margin Softmax (Liu, Wen, Yu, & Yang, 2016), and Additive Margin Softmax (Wang, Cheng, Liu, & Liu, 2018) were introduced in face recognition tasks to obtain highly discriminative features. All these works treat the angular margin as a scalar hyperparameter and were not proposed to address imbalance in data. Label Distribution Aware Margin loss (LDAM) (Cao, Wei, Gaidon, Arechiga, & Ma, 2019) handles imbalanced data with an additive euclidean margin on the final logits. LDAM adjusts the decision boundary in favour of minority classes with class-label-dependent euclidean margins introduced into the CE loss function. Merely providing a large margin for the minority class may not be helpful in all cases, since the data distribution differs across scenarios. In the same way, only introducing discriminative ability through an angular margin will not tackle the prior information gap between the classes in the dataset.

In order to address this issue, we propose a simple and effective learning function with both angular and euclidean margins, which has the combined effect of improved discriminative ability and a large margin for minority class separation. To the best of our knowledge, we are the first to propose the angular margin as a function of the inverse class sample distribution, to provide dense clustering especially among the minority samples and to create a large margin for the minority class in angular space to compensate for the asymmetry in the data.

The idea of VISAL is incorporated into the well-designed CE loss function. Hence it is very easy to integrate with existing DNN models. Even though the proposed idea bears similarity to KBA (Wu & Chang, 2005), the major advantage of VISAL is that it is easy to integrate into any DNN framework, which facilitates addressing imbalance in high-level vision applications like image segmentation, object detection, etc.

The main contributions of the proposed work are:

1. Proposition of the angular margin as a function of the inverse class sample distribution to aid imbalanced data classification.
2. Formulation of a modified CE loss function, VISAL, by incorporating the modified angular margin along with the euclidean margin, which has the combined effect of achieving improved discriminative ability along with a large margin for minority class separation, to better address data imbalance problems in diverse classification scenarios.
3. Introduction of the Change Learning Strategy (CLS), an effective training technique for task-specific applications (discussed in Section 2.3).

The rest of this paper is organized as follows: Section 2 discusses the proposed work. In Section 3, we describe the datasets and implementation details. Section 4 presents the results of our empirical studies, and Section 5 concludes.

2. Proposed method

In this section, we elaborate on VISAL and explain visually how it achieves the motive of providing large margins for the less frequent classes by increasing the inter-class distance through label-dependent angular and euclidean margins.

2.1. Visually Interpretable Space Adjustment Learning (VISAL)

In an n-class classification task, the pre-final layer scores are typically the inner product between the column vectors of the weight matrix of the final fully connected layer and the feature vector from the penultimate layer. The score obtained for class j can be written as

s_j = w_j^T x_i + b_j,    (1)

where s_j is the score obtained for class j from the pre-final layer, w_j ∈ R^d is the jth column vector of the final fully connected layer weight matrix W ∈ R^{d×n}, and x_i ∈ R^d is the feature vector at the penultimate layer for input sample i. Here d is the dimension of the flattened features, n denotes the number of classes, and b_j is the jth component of the bias vector b ∈ R^n.

The angle θ_j can be determined from the similarity between w_j and x_i and is given by:

w_j^T x_i = ∥w_j∥ ∥x_i∥ cos(θ_j),    (2)

θ_j = cos⁻¹( w_j^T x_i / (∥w_j∥ ∥x_i∥) ).    (3)

Based on the n_j samples present in class j, a class balancing term for effective clustering, φ, which is a function of the inverse class sample distribution (i.e., φ_j ∝ 1/n_j), is added as an additive angular margin to θ_j. The cosine of this angle sum (θ_j + φ_j), scaled by a scalar r, gives the transformed logit of class j in angular space. Here, scaling compensates for the magnitude information of the product ∥w_j∥ ∥x_i∥. Hence, s′_j can be written as

s′_j = r · cos(θ_j + φ_j).    (4)

To further alter the margin controlling the class separation ability, a euclidean margin ∆ is included as a function of the class sample distribution. For class j, ∆_j, with ∆_j ∝ 1/n_j, is applied to s′_j (given in Eq. (4)). The overall expression for the predicted final logits s*_j, which includes both euclidean and angular margins, can be written as

s*_j = r · cos(θ_j + φ_j) − ∆_j.    (5)
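To make the formulation concrete, below is a minimal PyTorch sketch of the margin-adjusted logits of Eqs. (2)–(5) and the cross-entropy over them (formalized as Eq. (6) in the next section). This is our illustration rather than the authors' released code: the class name VISALLoss and the hyperparameters r, phi_scale and delta_scale are assumptions, and the margins are applied only to the ground-truth class of each sample, following the usual convention of ArcFace/LDAM-style margin losses.

```python
# Hypothetical sketch of VISAL's margin-adjusted logits (Eqs. (2)-(5));
# hyperparameter values are illustrative, not taken from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class VISALLoss(nn.Module):
    def __init__(self, class_counts, r=30.0, phi_scale=0.1, delta_scale=0.5):
        super().__init__()
        counts = torch.as_tensor(class_counts, dtype=torch.float32)
        # phi_j and Delta_j are both proportional to 1/n_j (Section 2.1)
        self.register_buffer("phi", phi_scale / counts)      # angular margin per class
        self.register_buffer("delta", delta_scale / counts)  # euclidean margin per class
        self.r = r

    def forward(self, features, weight, targets):
        # Eqs. (2)-(3): cos(theta_j) from the normalized product w_j^T x_i
        cos_theta = F.normalize(features, dim=1) @ F.normalize(weight, dim=0)
        theta = torch.acos(cos_theta.clamp(-1 + 1e-7, 1 - 1e-7))
        onehot = F.one_hot(targets, num_classes=weight.shape[1]).float()
        # Eqs. (4)-(5): add phi in angle space and subtract Delta,
        # here only at the ground-truth class of each sample
        logits = self.r * torch.cos(theta + onehot * self.phi) - onehot * self.delta
        # cross-entropy over the margin-adjusted logits
        return F.cross_entropy(logits, targets)
```

Here `features` are the penultimate-layer vectors x_i and `weight` is the final-layer matrix W ∈ R^{d×n}; in this sketch the bias term of Eq. (1) is omitted, as is standard once the score is rewritten purely in angular form.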
Fig. 1. Visual understanding of learning with softmax CE vs. LDAM vs. VISAL for an imbalanced classification task.

These final logits s*_j are given to the CE learning framework in place of the actual predicted logits. Hence, the modified CE, referred to as the VISAL loss function for the imbalanced data classification task, can be formulated as in Eq. (6) below:

L = −(1/N) Σ_{i=1}^{N} ln [ e^{s*_{y_i}} / ( e^{s*_{y_i}} + Σ_{k≠y_i} e^{s_k} ) ],    (6)

where y_i denotes the ground-truth class of sample i and s_k denotes the score of each remaining class k ≠ y_i.

By introducing the angular and euclidean margins, φ and ∆ respectively, the minority classes are provided with extra margin to compensate for the asymmetry in the data. Hence, we can say that the margins help in achieving generalization to unseen samples from the minority class to a greater extent, by improving the separability and discriminative ability of the DNN models.

2.2. Visual interpretation of VISAL

This subsection provides the intuition and visual interpretation for the proposed objective function, VISAL.

Consider a binary classification problem with linearly separable samples in an imbalanced setting, with the aim of placing a decision margin to classify them. Three different representations are shown in Fig. 1, pertaining to the arrangement of the samples and the location of the decision boundary when trained with different variants of the CE learning function: softmax CE, LDAM, and the proposed VISAL. It can be observed that only in the case of VISAL are the data samples clustered up, with dense packing seen especially among the minority samples, since the angular margin φ in VISAL is a function of the inverse class sample distribution. As a consequence, the distance between the two classes increases, which helps the model form the decision boundary effortlessly. Therefore, if the inter-class distances of CE, LDAM, and VISAL are taken as γ, γ′ and γ*, then because of φ we can say that γ* > γ′ = γ.

It is important to note that since the euclidean margin ∆ in VISAL is also a function of the inverse class sample distribution, ∆ customizes the decision boundary to provide a large margin γ* in favour of the minority class, thus providing more room for unseen minority samples. Considering γ₂, γ₂′ and γ₂* as the minority-class margins given by softmax CE, LDAM, and VISAL respectively, it can be observed from Fig. 1 that γ₂* > γ₂′ > γ₂.

On account of this, we claim that VISAL has the enhanced effect of improved discriminative ability along with a large margin for minority class separation, which aids in dealing with imbalance scenarios.

2.3. Change Learning Strategy (CLS)

A two-stage learning strategy called the "Change Learning Strategy" (CLS) for tackling special scenarios like imbalanced data classification is also proposed in this paper. The main idea of CLS is to train the model with multiple loss functions one after another. This is motivated by the proposition in Kornblith et al. (2020) that the layers of a DNN model close to the input layer learn similar features irrespective of the cost function used during training. Hence, in our case, introducing a complex variant of CE in the initial stages of training brings in computational complexity compared to regular CE. Therefore, it is proposed to train the DNN model with CE to its maximum ability and then bring in the task-specific learning objective to train the model on top of the features already learnt from CE. The proposed CLS technique with CE and VISAL achieved improved accuracy through implicit regularization from multiple loss functions on the CIFAR-10 and CIFAR-100 datasets. The CLS training technique with the proposed loss function is evaluated and compared in Table 4.

3. Experiments

In this work, to demonstrate the generalizability and effectiveness of the proposed loss function (VISAL), besides the four benchmark image classification datasets, i.e., CIFAR-10, CIFAR-100 and Tiny ImageNet from computer vision and COVIDx from the medical imaging domain, the IMDB reviews dataset from the natural language processing domain was also considered in our experimentation. In addition to the details of the distinct datasets considered for experimentation, their imbalance ratios (the ratio between sample sizes of the most frequent and least frequent class) and the architectures of the respective train–test models are also detailed in this section. The overall details of the datasets along with their corresponding imbalance ratios are given in Table 1.

3.1. COVIDx

COVIDx (Wang et al., 2020) is a severely imbalanced CXR medical imaging dataset with three classes: normal, pneumonia and COVID-19. The original dataset (Wang et al., 2020) contains 10,000, 9,000 and 142 CXR image samples for the Normal, Pneumonia and COVID classes respectively. Here, the imbalance ratio between the majority and minority class is 100:1. In this work, to reduce the imbalance ratio to 1:2, an under-sampling technique is applied, and the final training dataset contains 250, 250 and 117 image samples for the Normal, Pneumonia and COVID classes respectively. The test data comprises 221 Normal, 273 Pneumonia and 25 COVID samples. The reason for choosing under-sampling over over-sampling is that existing over-sampling techniques (Chawla et al., 2002) may introduce artifacts in the generated data which can totally alter the classification results. Hence, the use of over-sampling is not recommended for medical data. Finally, this under-sampled dataset is preprocessed by resizing each input to 224 × 224 and applying histogram equalization before giving it to the classification model.
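The stated preprocessing pipeline (resizing to 224 × 224 followed by histogram equalization) can be sketched as follows with OpenCV; this is an illustrative reconstruction, and the helper name and channel handling are our assumptions, not the authors' exact code.

```python
# Illustrative sketch of the stated CXR preprocessing: resize to 224 x 224
# and apply histogram equalization.
import cv2
import numpy as np

def preprocess_cxr(path: str) -> np.ndarray:
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)  # read CXR as 8-bit grayscale
    img = cv2.resize(img, (224, 224))             # resize each input to 224 x 224
    img = cv2.equalizeHist(img)                   # histogram equalization
    # replicate to 3 channels for a pretrained backbone, scale to [0, 1]
    return np.repeat(img[..., None], 3, axis=2).astype(np.float32) / 255.0
```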

Table 1
Datasets with their imbalance ratios.

Dataset                              No. of classes   Imbalance ratio   Imbalance type
COVIDx (Wang, Lin, & Wong, 2020)     3                1:100             Step
CIFAR-10 (Cui et al., 2019)          10               1:10, 1:100       Exponential
CIFAR-100 (Cui et al., 2019)         100              1:10, 1:100       Exponential
IMDB reviews (Maas et al., 2011)     2                1:10              Step
Tiny ImageNet (Buda et al., 2018)    200              1:10, 1:100       Exponential
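The two imbalance types in Table 1 correspond to two class-size profiles. As a small sketch (our code, following the exponential-decay convention of Cui et al. (2019) and the step convention of Buda et al. (2018)), the per-class sample counts can be generated as:

```python
# Sketch of the "Exponential" (long-tailed) and "Step" class-size profiles.
def exponential_profile(n_max: int, num_classes: int, ratio: float) -> list:
    # class j keeps n_max * mu^j samples, with mu chosen so that the
    # rarest class has n_max / ratio samples (e.g. ratio=100 for 1:100)
    mu = (1.0 / ratio) ** (1.0 / (num_classes - 1))
    return [round(n_max * mu ** j) for j in range(num_classes)]

def step_profile(n_max: int, num_classes: int, ratio: float,
                 num_minority: int) -> list:
    # majority classes keep n_max samples, minority classes n_max / ratio
    return [n_max if j < num_classes - num_minority else round(n_max / ratio)
            for j in range(num_classes)]

# e.g. CIFAR-10 at 1:100 decays from 5000 samples down to 50 per class
print(exponential_profile(5000, 10, 100))
```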

Fig. 2. Heat Guided Convolutional Neural Network (HGCNN) for CXR image classification.

3.1.1. Heat Guided Convolutional Neural Network (HGCNN)

In this work, we propose the Heat map Guided Convolutional Neural Network (HGCNN), a modified Attention Guided CNN (AGCNN) architecture with the ability to explain its predictions. The proposed HGCNN and the explainability analysis are detailed in this subsection.

The HGCNN architecture consists of three branches: global, local and fusion (Guan et al., 2018), for the CXR image classification task. The global branch is a standard DenseNet121 that takes the preprocessed image as input and produces an attention weight map and a 1024-dimensional feature vector. This attention map is multiplied with the input to selectively weigh the ROI, and the result is passed to the local branch. The local branch, again a standard DenseNet121, extracts better features and gives one more 1024 × 1 vector. These two vectors are concatenated and passed through a fusion branch consisting of a single dense layer to produce the three class probabilities.
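A minimal sketch of this three-branch design is given below, assuming torchvision DenseNet121 backbones. The way the heat map is formed from the global feature maps (channel-wise average, upsampled and rescaled) is our illustrative assumption rather than the authors' exact construction.

```python
# Hypothetical sketch of the three-branch HGCNN described above.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models

class HGCNN(nn.Module):
    def __init__(self, num_classes=3):
        super().__init__()
        self.global_branch = models.densenet121(weights=None).features  # 1024 maps
        self.local_branch = models.densenet121(weights=None).features
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fusion = nn.Linear(2 * 1024, num_classes)  # single dense fusion layer

    def forward(self, x):
        g = self.global_branch(x)                # global feature maps
        g_vec = self.pool(g).flatten(1)          # 1024-d global vector
        # heat map: channel-averaged activations, upsampled and rescaled to [0, 1]
        heat = g.mean(dim=1, keepdim=True)
        heat = F.interpolate(heat, size=x.shape[-2:], mode="bilinear",
                             align_corners=False)
        lo = heat.amin(dim=(2, 3), keepdim=True)
        hi = heat.amax(dim=(2, 3), keepdim=True)
        heat = (heat - lo) / (hi - lo + 1e-7)
        l_vec = self.pool(self.local_branch(x * heat)).flatten(1)  # ROI-weighted input
        return self.fusion(torch.cat([g_vec, l_vec], dim=1))       # 3-class logits
```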
The main distinction between AGCNN and HGCNN lies in how the Region of Interest (ROI) information present in the heat map obtained from the global branch is fed to the local branch. HGCNN is demonstrated to have better classification performance than its base model AGCNN by utilizing the complete ROI information at the local branch.

In this work, the HGCNN is pretrained on the 14-class multi-label NIH Chest X-ray14 dataset (Wang et al., 2017), containing 67,000 CXR images with the postero-anterior (PA) view. Compared to AGCNN, HGCNN is found to produce a significant improvement of 9% and 6% in accuracy at the local and fusion branches respectively on the COVIDx (Wang et al., 2020) dataset. The proposed architecture is shown in Fig. 2.

Fig. 3. Visual attention maps for the proposed model's classification of Normal, Pneumonia and COVID-19 CXR samples, provided by Grad-CAM (Selvaraju et al., 2017) and by the heat maps from the global branch of HGCNN, are shown in columns 2 and 3 respectively.

3.1.2. Explainability analysis for CXR image classification

It is imperative that all learning models in medical imaging incorporate explainability, to validate that their predictions are based on meaningful features and not merely on other artifacts or inherent biases in the images. For this reason, we made use of an attention-based architecture which globally predicts the amount of attention to be provided to different local regions of the image. These attention weights are compared against the existing explainability approach Grad-CAM (Selvaraju et al., 2017) to validate our model predictions. In order to obtain visual explanations for the classification done by the HGCNN network, Grad-CAM-based class-discriminative feature maps are plotted for test images from all three classes in Fig. 3, which specifies the ROI in the images based on which they are classified. It is observed that the attention maps generated from the fusion branch are in line with the visualizations from Grad-CAM, confirming the HGCNN model's ability to classify COVID samples and explain its predictions simultaneously, as shown in Fig. 3.
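For reference, a compact version of Grad-CAM (Selvaraju et al., 2017) of the kind used for this validation can be sketched as below; grad_cam is a hypothetical helper written against the HGCNN sketch above, not the authors' code.

```python
# Illustrative Grad-CAM: gradient-weighted activation maps for one input.
import torch.nn.functional as F

def grad_cam(model, layer, x, target_class):
    store = {}
    handle = layer.register_forward_hook(lambda m, i, o: store.update(act=o))
    logits = model(x)                      # full forward pass through the model
    handle.remove()
    store["act"].retain_grad()             # keep grads for this intermediate map
    logits[0, target_class].backward()     # d(class score) / d(activation maps)
    act, grad = store["act"], store["act"].grad
    weights = grad.mean(dim=(2, 3), keepdim=True)           # GAP of the gradients
    cam = F.relu((weights * act).sum(dim=1, keepdim=True))  # weighted activations
    cam = F.interpolate(cam, size=x.shape[-2:], mode="bilinear", align_corners=False)
    return cam / (cam.max() + 1e-7)        # class-discriminative map in [0, 1]

# e.g. cam = grad_cam(model, model.local_branch, image.unsqueeze(0), target_class=2)
```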

Table 2
Comparison between recent works on 3-class COVID CXR datasets, each obeying its own train–test data distribution ratio.

Model                                                    CovRecall   Accuracy   Train–test ratio
Nishio2020 (Nishio, Noguchi, Matsuo, & Murakami, 2020)   90.9%       83.68%     90:10
Sitaula2020 (Sitaula & Hossain, 2021)                    77%         79.58%     70:30
VISAL                                                    100%        94%        90:10
VISAL                                                    100%        89%        70:30

Table 3
Performance of HGCNN with NLL, Arc-Face (Deng et al., 2019), LDAM (Cao et al., 2019), and the proposed VISAL loss on the COVIDx dataset.

Learning function              CovRecall   Accuracy   Imbalance ratio
NLL                            92%         93.45%     1:2
Arc-Face (Deng et al., 2019)   96%         90.37%     1:2
LDAM (Cao et al., 2019)        96%         92.49%     1:2
VISAL                          100%        91.91%     1:2
NLL                            84%         92.1%      1:10
Arc-Face (Deng et al., 2019)   76%         93.1%      1:10
LDAM (Cao et al., 2019)        96%         92.67%     1:10
VISAL                          96%         92.1%      1:10

3.2. CIFAR-10 and CIFAR-100

CIFAR-10 and CIFAR-100 are prominent benchmark datasets for the classification task in the computer vision domain. The original versions of CIFAR-10 and CIFAR-100 contain 50,000 training images and 10,000 validation images of size 32 × 32, with 10 and 100 classes respectively. From these, two different imbalanced versions of the training set, with imbalance ratios of 100:1 and 10:1, are generated from the original training set following an exponential decay in sample sizes across the classes (Cui et al., 2019). The validation set is kept unchanged in the number of samples. The standard ResNet32 architecture is used as the classification model with the proposed learning objective.

3.3. Tiny ImageNet

Tiny ImageNet contains 100,000 images of 200 classes. Each class has 500 training and 50 validation images of size 64 × 64. The imbalanced versions of the Tiny ImageNet dataset, with imbalance ratios of 100:1 and 10:1, are generated from the original training set following an exponential decay in sample sizes across the classes (Buda et al., 2018). The classification architecture employed for this dataset is the standard ResNet18. Table 5 outlines the top-1 and top-5 validation errors.

3.4. IMDB review dataset

Binary sentiment classification (Maas et al., 2011) is carried out on the IMDB review dataset containing 50,000 movie reviews. The dataset is balanced, with equal numbers of positive and negative reviews. Keeping the imbalance ratio at 10:1, an imbalanced dataset is generated from the balanced dataset with positive reviews as the majority class. A two-layer bidirectional Long Short-Term Memory (LSTM) network is used for this classification. The results corresponding to this imbalanced dataset are reported in Table 6.

4. Results

In this section, the experimental results demonstrating the performance of our proposed learning function, VISAL, on the various diverse datasets mentioned in Section 3 are presented in detail. The experimental results of the various methods used for comparison in this work are taken directly from the corresponding papers, LDAM (Cao et al., 2019) and IB (Park et al., 2021).

4.1. COVID CXR dataset results

As there are no common benchmark datasets on which to compare all CXR-based COVID diagnosis works, we list the scores reported in recently published works (Nishio et al., 2020; Sitaula & Hossain, 2021) on a similar three-class classification dataset and compare our model's performance against them (refer to Table 2).⁵ The HGCNN trained with VISAL is the first work to achieve a 100% recall score while maintaining high accuracy on the imbalanced COVIDx dataset, with at least a 10% improvement in both COVID recall and accuracy when compared with Nishio et al. (2020) and Sitaula and Hossain (2021) (as reported on the respective datasets in those works). To the best of our knowledge, our work is the first to employ an attention-based CNN and learning functions like the LDAM loss (Cao et al., 2019) for COVID X-ray classification (see Table 3).

4.2. CIFAR dataset results

The top-1 validation errors of various methods addressing data imbalance, as reported in Cao et al. (2019), for the imbalanced CIFAR datasets are included in Table 4. Our experiments showed that both of the proposed methods, VISAL and CLS-VISAL, surpass the existing works on CIFAR-10 and CIFAR-100 at imbalance ratios of 1:10 and 1:100. In this experiment, CLS-VISAL improved top-1 validation accuracy by at least 4% over CE and by more than 0.5% over LDAM (Cao et al., 2019).

4.3. Tiny ImageNet dataset results

The top-1 and top-5 validation errors of VISAL on the synthetically imbalanced Tiny ImageNet dataset, in comparison with various approaches, are presented in Table 5. It is observed that VISAL exceeds the previous state of the art by more than 15% in both top-1 and top-5 errors in the imbalance ratio 1:100 scenario.

⁵ Note: The bold faces in the following tables represent the best metric values among the works mentioned in that table.
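Before the comparison tables, here is a brief sketch of the two-layer bidirectional LSTM classifier used for the IMDB experiments of Section 3.4; the embedding and hidden sizes are our illustrative assumptions, not values reported in the paper.

```python
# Hypothetical sketch of the two-layer bidirectional LSTM sentiment classifier.
import torch
import torch.nn as nn

class BiLSTMSentiment(nn.Module):
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=128, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, num_layers=2,
                            bidirectional=True, batch_first=True)
        self.fc = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, tokens):                  # tokens: (batch, seq_len) token ids
        out, _ = self.lstm(self.embed(tokens))  # (batch, seq_len, 2 * hidden_dim)
        return self.fc(out[:, -1])              # logits from the final time step
```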
Table 4
Top-1 validation errors of ResNet32 on imbalanced CIFAR-10 and imbalanced CIFAR-100 datasets.

Dataset            CIFAR-10          CIFAR-100
Imbalance ratio    1:100 | 1:10      1:100 | 1:10
CE                 29.64 | 13.61     61.68 | 44.3
Focal              29.62 | 13.64     61.59 | 44.2
LDAM               26.65 | 13.04     60.40 | 43.09
CB Re-Sampling     29.45 | 13.21     66.56 | 44.94
CB Re-Weighting    27.63 | 13.46     66.01 | 42.88
CB Focal           25.43 | 12.90     63.98 | 42.01
LDAM DRW           22.97 | 11.84     57.96 | 41.29
IB                 21.24 | 11.75     57.86 | 42.87
VISAL              22.57 | 11.76     58.23 | 41.20
CLS-VISAL          22.32 | 11.60     57.81 | 40.87

Table 5
Top-1 and top-5 validation errors of ResNet18 on the imbalanced (long-tailed) Tiny ImageNet dataset.

Imbalance ratio    1:100              1:10
                   Top-1    Top-5     Top-1    Top-5
ERM-SGD            66.19    42.63     50.33    26.68
CB SM-SGD          72.72    52.62     51.58    28.91
ERM-DRW            64.57    40.79     50.03    26.19
LDAM-SGD           64.04    40.46     48.08    24.80
LDAM-DRW           62.53    39.06     47.22    23.84
IB                 57.35    –         42.78    –
VISAL              30.12    20.05     42.17    24.22

4.4. IMDB reviews dataset results

The top-1 validation errors of various learning-based approaches on the IMDB reviews dataset are reported in Table 6. In this experiment, the proposed VISAL improved top-1 validation accuracy by at least 4% over LDAM (Cao et al., 2019).

Table 6
Top-1 validation errors on the imbalanced IMDB review dataset. Our proposed approach VISAL outperforms the baselines.

Approach        Error on positive reviews   Error on negative reviews   Mean error
ERM             2.86                        70.78                       36.82
Re-Sampling     7.12                        45.88                       26.50
Re-Weighting    5.20                        42.12                       23.66
LDAM-DRW        4.91                        30.77                       17.84
VISAL           5.9                         26.52                       16.21

5. Conclusion

In this paper, we propose a novel loss function, VISAL, to tackle the problem of overfitting to the minority class in data imbalance scenarios. VISAL is formulated by incorporating an angular margin and a euclidean margin into the CE loss function, which can densely cluster and also provide a large margin to the samples in the minority class. A DNN model trained with VISAL as the objective function exhibits improved discriminative ability and can generalize well in imbalance scenarios. The improved performance on the diverse datasets presented in this work validates the generalization capability of the proposed method. Moreover, VISAL can be easily incorporated into deep learning algorithms; hence, we plan to bring VISAL into other frameworks like online sequential learning, image segmentation and object detection to address class imbalance problems in our future work.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

No data was used for the research described in the article.

References

Ali-Gombe, A., & Elyan, E. (2019). MFC-GAN: Class-imbalanced dataset classification using multiple fake class generative adversarial network. Neurocomputing, 361, 212–221.
Antoniou, A., Storkey, A., & Edwards, H. (2017). Data augmentation generative adversarial networks. arXiv preprint arXiv:1711.04340.
Baur, C., Albarqouni, S., & Navab, N. (2018). MelanoGANs: High resolution skin lesion synthesis with GANs. arXiv preprint arXiv:1804.04338.
Buda, M., Maki, A., & Mazurowski, M. A. (2018). A systematic study of the class imbalance problem in convolutional neural networks. Neural Networks, 106, 249–259.
Bunkhumpornpat, C., & Sinapiromsaran, K. (2016). DBMUTE: Density-based majority under-sampling technique. Knowledge and Information Systems, 50, 827–850.
Cao, K., Wei, C., Gaidon, A., Arechiga, N., & Ma, T. (2019). Learning imbalanced datasets with label-distribution-aware margin loss. Advances in Neural Information Processing Systems, 32.
Chawla, N. V., Bowyer, K. W., Hall, L. O., & Kegelmeyer, W. P. (2002). SMOTE: Synthetic minority over-sampling technique. Journal of Artificial Intelligence Research, 16, 321–357.
Cui, Y., Jia, M., Lin, T. Y., Song, Y., & Belongie, S. (2019). Class-balanced loss based on effective number of samples. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 9268–9277).
Deng, J., Guo, J., Xue, N., & Zafeiriou, S. (2019). ArcFace: Additive angular margin loss for deep face recognition. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 4690–4699).
Elyan, E., Moreno-Garcia, C. F., & Jayne, C. (2021). CDSMOTE: Class decomposition and synthetic minority class oversampling technique for imbalanced-data classification. Neural Computing and Applications, 33(7), 2839–2851.
Goodfellow, I., Pouget-Abadie, J., Mirza, M., et al. (2014). Generative adversarial nets. Advances in Neural Information Processing Systems, 27.
Guan, Q., Huang, Y., Zhong, Z., Zheng, Z., Zheng, L., & Yang, Y. (2018). Diagnose like a radiologist: Attention guided convolutional neural network for thorax disease classification. arXiv preprint arXiv:1801.09927.
He, H., & Garcia, E. A. (2009). Learning from imbalanced data. IEEE Transactions on Knowledge and Data Engineering, 21(9), 1263–1284.
Japkowicz, N., & Stephen, S. (2002). The class imbalance problem: A systematic study. Intelligent Data Analysis, 6(5), 429–449.
Khan, S. H., Hayat, M., Bennamoun, M., et al. (2017). Cost-sensitive learning of deep feature representations from imbalanced data. IEEE Transactions on Neural Networks and Learning Systems, 29(8), 3573–3587.
Khan, S., Hayat, M., Zamir, S. W., Shen, J., & Shao, L. (2019a). Striking the right balance with uncertainty. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 103–112).
Khan, S., Hayat, M., Zamir, S. W., Shen, J., & Shao, L. (2019b). Striking the right balance with uncertainty. In The IEEE conference on computer vision and pattern recognition.
Kim, J., Jeong, J., & Shin, J. (2020). M2m: Imbalanced classification via major-to-minor translation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition.
Kornblith, S., Lee, H., Chen, T., & Norouzi, M. (2020). What's in a loss function for image classification. arXiv preprint arXiv:2010.16402.
Lin, T. Y., Goyal, P., Girshick, R., He, K., & Dollár, P. (2017). Focal loss for dense object detection. In Proceedings of the IEEE international conference on computer vision (pp. 2980–2988).
Liu, W., Wen, Y., Yu, Z., Li, M., Raj, B., & Song, L. (2017). SphereFace: Deep hypersphere embedding for face recognition. In IEEE conference on computer vision and pattern recognition.
Liu, W., Wen, Y., Yu, Z., & Yang, M. (2016). Large-margin softmax loss for convolutional neural networks. In ICML, vol. 2 (p. 7).
Maas, A. L., Daly, R. E., Pham, P. T., Huang, D., Ng, A. Y., & Potts, C. (2011). Learning word vectors for sentiment analysis. In Proceedings of the 49th annual meeting of the Association for Computational Linguistics: Human language technologies, volume 1 (pp. 142–150). Association for Computational Linguistics.
Mariani, G., Scheidegger, F., Istrate, R., Bekas, C., & Malossi, C. (2018). BAGAN: Data augmentation with balancing GAN. arXiv preprint arXiv:1803.09655.
Nishio, M., Noguchi, S., Matsuo, H., & Murakami, T. (2020). Automatic classification between COVID-19 pneumonia, non-COVID-19 pneumonia, and the healthy on chest X-ray image: Combination of data augmentation methods. Scientific Reports, 10(1), 1–6.
Park, S., Lim, J., Jeon, Y., et al. (2021). Influence-balanced loss for imbalanced visual classification. In Proceedings of the IEEE/CVF international conference on computer vision.
Selvaraju, R. R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., & Batra, D. (2017). Grad-CAM: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE international conference on computer vision (pp. 618–626).
Sitaula, C., & Hossain, M. B. (2021). Attention-based VGG-16 model for COVID-19 chest X-ray image classification. Applied Intelligence, 51(5), 2850–2863.
Vuttipittayamongkol, P., & Elyan, E. (2020). Neighbourhood-based undersampling approach for handling imbalanced and overlapped data. Information Sciences, 509, 47–70.
Wang, F., Cheng, J., Liu, W., & Liu, H. (2018). Additive margin softmax for face verification. IEEE Signal Processing Letters, 25(7), 926–930.
Wang, L., Lin, Z. Q., & Wong, A. (2020). COVID-Net: A tailored deep convolutional neural network design for detection of COVID-19 cases from chest X-ray images. Scientific Reports, 10(1), 1–12.
Wang, X., Peng, Y., Lu, L., Lu, Z., Bagheri, M., & Summers, R. M. (2017). ChestX-ray8: Hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 2097–2106).
Wu, G., & Chang, E. Y. (2005). KBA: Kernel boundary alignment considering imbalanced data distribution. IEEE Transactions on Knowledge and Data Engineering, 17(6), 786–795.
