Semi-Supervised Variational Adversarial Active Learning
Abstract. Active learning aims to alleviate the amount of labor involved in data
labeling by automating the selection of unlabeled samples via an acquisition func-
tion. For example, variational adversarial active learning (VAAL) leverages an
adversarial network to discriminate unlabeled samples from labeled ones using
latent space information. However, VAAL has the following shortcomings: (i) it
does not exploit target task information, and (ii) unlabeled data is only used for
sample selection rather than model training. To address these limitations, we in-
troduce novel techniques that significantly improve the use of abundant unlabeled
data during training and take into account the task information. Concretely, we
propose an improved pseudo-labeling algorithm that leverages information from
all unlabeled data in a semi-supervised manner, thus allowing a model to ex-
plore a richer data space. In addition, we develop a ranking-based loss prediction
module that converts predicted relative ranking information into a differentiable
ranking loss. This loss can be embedded as a rank variable into the latent space of
a variational autoencoder and then trained with a discriminator in an adversarial
fashion for sample selection. We demonstrate the superior performance of our ap-
proach over the state of the art on various image classification and segmentation
benchmark datasets.
1 Introduction
Deep learning has shown impressive results on computer vision tasks mainly due to an-
notated large-scale datasets. Yet, acquiring labeled data can be extremely costly or even
infeasible. To overcome this issue, active learning (AL) was introduced [6,31]. In AL, a
model is initialized with a relatively small set of labeled training samples. Then, an AL
algorithm progressively chooses samples for annotation that yield high classification
performance while minimizing labeling costs. By demonstrating a reduced requirement
for training instances, AL has been applied to various computer vision applications in-
cluding image categorization, image segmentation, text classification, and more.
Among the most prevalent AL strategies, pool-based approaches have access to a
huge supply of unlabeled data. This provides valuable information about the under-
lying structure of the whole data distribution, especially for small labeling budgets.
Nevertheless, many AL methods still fail to leverage valuable information within the
2 Zongyao Lyu and William J. Beksi
Fig. 1: An overview of SS-VAAL. First, a loss prediction module attached to the tar-
get model predicts losses on the input data. Next, the predicted losses along with the
actual target losses are transformed into ranking losses via a pretrained ranking func-
tion. Unlabeled samples are then passed to the target model and subsequently through
a k-means algorithm to acquire pseudo labels for additional training. Finally, a discrim-
inator following a variational autoencoder is trained in an adversarial manner to select
unlabeled samples for annotation.
unlabeled data during training. On the other hand, semi-supervised learning (SSL), in
particular the technique of pseudo labeling, thrives on utilizing unlabeled data. Pseudo
labeling is based on the concept whereby a model assigns “pseudo labels” to samples
that produce high-confidence scores. It then integrates these samples into the training
process. In contrast, AL typically selects only a handful of highly-informative samples
(i.e., samples with low prediction confidence) at each learning step and regularly seeks
user input. Although AL and pseudo labeling both aim to leverage a model’s uncer-
tainty, they look at different ends of the same spectrum. Hence, their combination can
be expected to achieve increased performance [14].
In light of this observation, we propose to exploit both labeled and unlabeled data
during model training by (i) predicting pseudo labels for unlabeled samples, and (ii)
incorporating these samples and their pseudo labels into the labeled training data in
every AL cycle. The idea of using unlabeled data for training is not new. Earlier work by
Wang et al. [36] showed promising results by applying entropy-based pseudo labeling to
AL. However, pseudo labeling can perform poorly in its original formulation. The sub-
par performance is attributed to inaccurate high-confidence predictions made by poorly
calibrated models. These predictions produce numerous incorrect pseudo labels [1]. To
tackle this issue, we introduce a novel agreement-based clustering technique that assists
in determining pseudo labels. Clustering algorithms can analyze enormous amounts of
unlabeled data in an unsupervised way [26,7], and cluster centers are highly useful for
querying labels from an oracle [18]. Our two-step process involves (i) separately clustering
labeled and unlabeled data, and (ii) assigning each piece of unlabeled data an initial
pseudo label and a clustering label. A final pseudo label is confirmed only if these two
Semi-Supervised Variational Adversarial Active Learning 3
labels agree. The end result is a significant reduction in the number of incorrect pseudo
labels.
The second aspect of our work focuses on the sample selection strategy in AL.
We base our approach on the VAAL [34] framework. VAAL uses an adversarial dis-
criminator to discern between labeled and unlabeled data, which informs the sample
selection process. Later adaptations of VAAL (e.g., TA-VAAL [19]) incorporate a loss
prediction module, relaxing the task of exact loss prediction to loss ranking prediction.
Additionally, a ranking conditional generative adversarial network (RankCGAN) [29]
is employed to combine normalized ranking loss information into VAAL. To better in-
tegrate task-related information into the training process, we propose a learning-to-rank
method for VAAL. This decision is inspired by the realization that the loss prediction
can be interpreted as a ranking problem [23], a concept central to information retrieval.
We refine the loss prediction process by applying a contemporary learning-to-rank tech-
nique for approximating non-differentiable operations in ranking-based scores. The loss
prediction module estimates a loss for labeled input, converting the predicted loss and
actual target loss into a differentiable ranking loss. This ranking loss, along with la-
beled and unlabeled data, is provided as input into an adversarial learning process that
identifies unlabeled samples for annotation. Therefore, by explicitly exploiting the loss
information directly related to the given task, task-related information is integrated into
the AL process. The architecture of our proposed method, SS-VAAL, is depicted in
Fig. 1.
To summarize, our contributions are the following.
2 Related Work
3 Method
Let $(X_L, Y_L)$ be a pool of data and their labels, and $X_U$ the pool of unlabeled data.
Training starts with $K$ available labeled sample pairs $(X_L^K, Y_L^K)$. Given a fixed labeling
budget in each AL cycle, $b$ samples from the unlabeled pool are queried according to
an acquisition function. Next, the samples are annotated by human experts and added
to the labeled pool. The model is then iteratively trained on the updated labeled pool
$(X_L^{K+b}, Y_L^{K+b})$, and this process is repeated until the labeling budget is exhausted.
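The pool-based cycle described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `train`, `acquire`, and `oracle` are hypothetical callables standing in for model training, the acquisition function, and the human annotator, and the number of cycles stands in for the overall labeling budget.

```python
import random

def active_learning_loop(labeled, unlabeled, train, acquire, oracle, b, cycles):
    """Generic pool-based AL loop (all helper names are hypothetical).

    labeled:   list of (x, y) pairs, the initial K labeled samples
    unlabeled: list of x, the unlabeled pool
    train:     callable(labeled) -> model
    acquire:   callable(model, unlabeled, b) -> list of indices into unlabeled
    oracle:    callable(x) -> y, stands in for the human annotator
    """
    labeled, unlabeled = list(labeled), list(unlabeled)
    model = train(labeled)
    for _ in range(cycles):
        if not unlabeled:
            break
        picked = set(acquire(model, unlabeled, min(b, len(unlabeled))))
        # Move the queried samples from the unlabeled pool to the labeled pool.
        labeled += [(unlabeled[i], oracle(unlabeled[i])) for i in sorted(picked)]
        unlabeled = [x for i, x in enumerate(unlabeled) if i not in picked]
        # Retrain on the updated labeled pool.
        model = train(labeled)
    return model, labeled, unlabeled


# Toy usage: random acquisition and a "model" that is just the labeled-set size.
random.seed(0)
model, labeled, unlabeled = active_learning_loop(
    labeled=[(0, 0), (1, 1)],
    unlabeled=list(range(2, 12)),
    train=lambda data: len(data),
    acquire=lambda m, pool, b: random.sample(range(len(pool)), b),
    oracle=lambda x: x % 2,
    b=3,
    cycles=2,
)
```

In SS-VAAL the acquisition step is driven by the discriminator, but the surrounding loop has this same shape.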
SS-VAAL enhances the VAAL framework and its variant, TA-VAAL, as follows.
VAAL employs adversarial learning to distinguish features of labeled and unlabeled
data, which reduces outlier impact and leverages both labeled and unlabeled data in
a semi-supervised training scheme. TA-VAAL, building on the groundwork of VAAL,
utilizes global data structures and local task-related information for sample queries. Our
methodology improves upon these predecessors by harnessing the full potential of the
data distribution and model uncertainty, hence further refining the query strategy in the
AL process.
Given $X_L$ and $X_U$ for labeled and unlabeled examples, respectively, we apply a classifier
$f$ to the unlabeled data, $f(X_U)$, and select and assign pseudo labels $\hat{y}$ for the
most certain predictions. Traditionally, the labeled set will be directly augmented by
$y = y + \hat{y}$ for the next round of training. Nonetheless, pseudo labeling in its initial form
may produce high-confidence predictions that are incorrect, resulting in numerous er-
roneous pseudo labels and ultimately causing an unstable training process.
To mitigate this issue, we present a semi-supervised pre-clustering technique for
each pseudo label selection process that enhances robustness by reducing incorrect
pseudo labels. In each AL cycle, we first train a model on the available labeled data.
We modify the network to output both the probability score and the feature vector from
the last fully-connected layer before sending it to the softmax function. Then, we fit a
k-means clustering algorithm on the output features of the labeled training data. This
allows the algorithm to learn the structure of the labeled data and predict clusters, each
of whose centroids corresponds to one of the classes of the dataset. One thing to note is
that the cluster assignments will not necessarily correspond directly to the classes of the
dataset being trained. This is because clustering algorithms (e.g., k-means) do not have
any inherent knowledge of class labels and thus the cluster labels they assign have no
intrinsic meaning. To make them meaningful, we map the clustering labels to the actual classes
to ensure that they correspond to each other. This is done by assigning each cluster label
to the most frequent true class label within that cluster based on the labeled training
data.
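The majority-vote mapping from cluster labels to class labels can be sketched as follows. This is an illustrative sketch only: the function name and the toy data are assumptions, and the cluster ids are taken as given rather than produced by an actual k-means fit.

```python
from collections import Counter

def map_clusters_to_classes(cluster_ids, true_labels):
    """Map each cluster id to the most frequent true class among its labeled members."""
    votes = {}
    for c, y in zip(cluster_ids, true_labels):
        votes.setdefault(c, Counter())[y] += 1
    return {c: counts.most_common(1)[0][0] for c, counts in votes.items()}

# Toy example: a clustering assigned arbitrary ids 0..2 to six labeled samples.
cluster_ids = [2, 2, 0, 0, 1, 2]
true_labels = ["cat", "cat", "dog", "dog", "bird", "dog"]
mapping = map_clusters_to_classes(cluster_ids, true_labels)
```

Note that a plain majority vote does not prevent two clusters from mapping to the same class; whether and how that case is handled is not specified in the text.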
Next, we run the trained classifier on all unlabeled data to get the predicted probability
vectors
$$ p(y_i = j \mid x_i) = f(X_U) \xrightarrow{X_U} \mathbb{R}^c, \tag{1} $$
where c is the number of total classes. We assign initial pseudo labels to the unlabeled
data with the most certain predictions only when their associated probabilities are larger
than a threshold τ (we set τ = 0.95 in the experiments), i.e.,
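$$
j^* = \max_j \, p(y_i = j \mid x_i), \qquad
\hat{y}_i =
\begin{cases}
\arg\max_j \, p(y_i = j \mid x_i), & j^* > \tau \\
0, & \text{otherwise.}
\end{cases}
\tag{2}
$$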
Then, we apply the k-means function learned on the labeled data to the unlabeled data
to predict the clusters they belong to. Each unlabeled sample is grouped to the nearest
cluster and assigned a label to which the cluster centroid corresponds.
Each unlabeled data point now has both an initial pseudo label and a clustering
label. Lastly, we compare the initial pseudo labels with the clustering labels and
assign a final pseudo label to an unlabeled sample only if the two agree. By doing so,
we reduce the number of incorrect pseudo labels while still drawing on the abundant
unlabeled data for model training. Stage 2 in Fig. 2 shows this agreement-based
pseudo-labeling process. We demonstrate improvement over conventional pseudo labeling
through an ablation study in the supplementary material.
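The agreement check can be sketched as follows. This is a minimal sketch under assumptions: centroids and the cluster-to-class mapping are taken as already computed from the labeled data, the function names are hypothetical, and rejected samples are simply omitted from the result instead of being assigned the value 0 as in Eq. (2).

```python
import math

def nearest_centroid(feature, centroids):
    """Index of the centroid closest to `feature` in Euclidean distance."""
    return min(range(len(centroids)), key=lambda k: math.dist(feature, centroids[k]))

def agreement_pseudo_labels(probs, features, centroids, cluster_to_class, tau=0.95):
    """Return {sample index: pseudo label} for samples that pass both checks."""
    accepted = {}
    for i, (p, feat) in enumerate(zip(probs, features)):
        j = max(range(len(p)), key=p.__getitem__)   # most probable class
        if p[j] <= tau:
            continue                                # not confident enough
        cluster_label = cluster_to_class[nearest_centroid(feat, centroids)]
        if cluster_label == j:                      # classifier and clustering agree
            accepted[i] = j
    return accepted

# Toy data: two classes, two centroids fitted on (hypothetical) labeled features.
centroids = [(0.0, 0.0), (4.0, 4.0)]
cluster_to_class = {0: 0, 1: 1}
probs = [(0.98, 0.02), (0.97, 0.03), (0.60, 0.40)]
features = [(0.1, -0.2), (3.9, 4.2), (0.0, 0.1)]
pseudo = agreement_pseudo_labels(probs, features, centroids, cluster_to_class)
```

In this toy run only the first sample is kept: the second is confident but lies nearest to the other class's centroid, and the third falls below the threshold.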
original LL4AL as both only consider the neighboring data pairs and ignore the over-
all list structure. This motivates us to use a more appropriate listwise ranking scheme.
Ranking is crucial for many computing tasks, such as information retrieval, and it is
often addressed via a listwise approach (e.g., [5,25]). This involves taking ranked lists
of objects as instances and training a ranking function through the minimization of a
listwise loss function defined on the predicted and ground-truth lists [37].
SoDeep [10] is a method for approximating the non-differentiable sorting operation
in ranking-based losses. It uses a DNN as a sorter to approximate the ranking function
and it is pretrained separately on synthetic values and their ground-truth ranks. The
trained sorter can then be applied directly in downstream tasks by combining it with an
existing model (e.g., the loss prediction module) and converting the value list given by
the model into a ranking list. The ranking loss between the predicted and ground-truth
ranks can then be calculated and backpropagated through the differentiable sorter and
used to update the weights of the model. Fig. 3 illustrates the sorter architecture. We
find this process works well with the loss prediction task in the loss module. Therefore,
we apply SoDeep to the loss prediction module and learn to predict the ranking loss as
a variable that injects task-related information into the subsequent adversarial learning
process, which increases the robustness of the unlabeled sample selection. Concretely,
we substitute the loss prediction module into the sorter architecture as the DNN target
model to produce the predicted scores where the target losses are used as the ground-
truth scores.
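To make the idea of a differentiable ranking loss concrete, the sketch below replaces the learned SoDeep sorter with a much simpler stand-in: a fixed sigmoid-based soft rank. This is an assumption made for illustration and is not the pretrained DNN sorter used in the paper; the function names and the temperature parameter are likewise hypothetical.

```python
import math

def _sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def soft_rank(values, temperature=0.1):
    """Smooth rank surrogate: rank_i is approximately 1 + #{j : v_j < v_i}."""
    return [
        1.0 + sum(_sigmoid((vi - vj) / temperature)
                  for j, vj in enumerate(values) if j != i)
        for i, vi in enumerate(values)
    ]

def hard_rank(values):
    """Exact ranks (1 = smallest); ties are not handled specially."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    for position, idx in enumerate(order, start=1):
        ranks[idx] = float(position)
    return ranks

def ranking_loss(predicted_losses, target_losses, temperature=0.1):
    """Mean squared error between soft ranks of predictions and true ranks of targets."""
    pred = soft_rank(predicted_losses, temperature)
    true = hard_rank(target_losses)
    return sum((p - t) ** 2 for p, t in zip(pred, true)) / len(pred)

target = [0.2, 1.5, 0.7, 3.0]
good = ranking_loss([0.1, 0.9, 0.5, 2.0], target)   # same ordering as target
bad = ranking_loss([2.0, 0.5, 0.9, 0.1], target)    # reversed ordering
```

Only the ordering of the predicted losses matters here, not their scale, which is the property that motivates relaxing exact loss prediction to rank prediction.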
The upper-right side of Fig. 2 displays the architecture of the modified loss learning
process. We retain the basic structure of the original loss prediction module. Given an
input, the target model generates a prediction, while the loss prediction module takes
multi-layer features as inputs that are extracted from multiple mid-level blocks of the
target model. These features are connected to multiple identical blocks each of which
consists of a global average pooling layer and a fully-connected layer. Then, the outputs
which can be used to update the weights of the model. The objective function of the
task learner with the ranking loss module is
where ŷL and yL are the predicted and ground-truth labels, respectively, and λ is a
scaling constant. This training process is illustrated as Stage 1 in Fig. 2. The learned
ranking loss is embedded as a task-related rank variable in the latent space of a VAE
for the subsequent adversarial learning process, which is described in detail in Sec. 3.3.
Stage 1 of the two-stage training is summarized in Alg. 1.
N (0, I) be the unit Gaussian prior. The transductive learning of the VAE to capture
latent representation information on both labeled and unlabeled data is characterized by
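$$
\begin{aligned}
\mathcal{L}_{VAE}^{trans} ={}& \mathbb{E}[\log q_\phi(x_L \mid z_L, r_L)] - \beta \, \mathrm{KL}(p_\theta(z_L \mid x_L) \,\|\, p(z)) \\
&+ \mathbb{E}[\log q_\phi(x_U \mid z_U, \hat{l}_U)] - \beta \, \mathrm{KL}(p_\theta(z_U \mid x_U) \,\|\, p(z)),
\end{aligned}
\tag{5}
$$
where $\hat{l}_U$ is the predicted loss $L_{pred}$ over unlabeled data, $\beta$ is the Lagrangian parameter,
and $\mathbb{E}$ denotes the expectation [16].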
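The KL terms in Eq. (5) have a closed form when the encoder outputs a diagonal Gaussian. The sketch below assumes exactly that (a mean vector and a log-variance vector per sample), which is the usual VAE parameterization but is not stated explicitly in the text; the function names are hypothetical and the reconstruction log-likelihood is passed in as a precomputed number.

```python
import math

def kl_to_unit_gaussian(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) in closed form."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var))

def transductive_vae_term(recon_log_lik, mu, log_var, beta=1.0):
    """One half of Eq. (5): E[log q(x | z, rank)] - beta * KL(encoder || prior)."""
    return recon_log_lik - beta * kl_to_unit_gaussian(mu, log_var)

kl_zero = kl_to_unit_gaussian([0.0, 0.0], [0.0, 0.0])    # encoder equals the prior
kl_shift = kl_to_unit_gaussian([1.0, 0.0], [0.0, 0.0])   # mean shifted by 1 in one dim
```

Eq. (5) is the sum of two such terms, one evaluated on labeled samples with the rank variable $r_L$ and one on unlabeled samples with $\hat{l}_U$.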
With the latent representations zL and zU learned by the VAE of both the labeled
and unlabeled data, the objective function of the VAE in adversarial training is then
$$ \mathcal{L}_{VAE}^{adv} = -\mathbb{E}[\log(D(p_\theta(z_L \mid x_L, r_L)))] - \mathbb{E}[\log(D(p_\theta(z_U \mid x_U, \hat{l}_U)))]. \tag{6} $$
Combining (5) and (6), the overall objective function of the VAE is
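$$ \mathcal{L}_{VAE} = \mathcal{L}_{VAE}^{trans} + \eta \mathcal{L}_{VAE}^{adv}, \tag{7} $$
$$ \mathcal{L}_{D}^{adv} = -\mathbb{E}[\log(D(p_\theta(z_L \mid x_L, r_L)))] - \mathbb{E}[\log(1 - D(p_\theta(z_U \mid x_U, \hat{l}_U)))], \tag{8} $$
$$ \min_{p_\theta} \max_{D} \; \mathbb{E}[\log(D(p_\theta(z_L \mid x_L, r_L)))] + \mathbb{E}[\log(1 - D(p_\theta(z_U \mid x_U, \hat{l}_U)))]. \tag{9} $$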
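The two adversarial losses in Eq. (6) and Eq. (8) differ only in how the unlabeled term is scored. The sketch below evaluates both from lists of discriminator outputs in (0, 1); the function names and the small `eps` added for numerical safety are assumptions, and expectations are approximated by batch means.

```python
import math

def discriminator_loss(d_labeled, d_unlabeled, eps=1e-12):
    """Eq. (8): -E[log D(z_L)] - E[log(1 - D(z_U))]."""
    term_l = sum(math.log(d + eps) for d in d_labeled) / len(d_labeled)
    term_u = sum(math.log(1.0 - d + eps) for d in d_unlabeled) / len(d_unlabeled)
    return -term_l - term_u

def vae_adversarial_loss(d_labeled, d_unlabeled, eps=1e-12):
    """Eq. (6): -E[log D(z_L)] - E[log D(z_U)]; the VAE wants both to look labeled."""
    term_l = sum(math.log(d + eps) for d in d_labeled) / len(d_labeled)
    term_u = sum(math.log(d + eps) for d in d_unlabeled) / len(d_unlabeled)
    return -term_l - term_u

# When the discriminator separates the two pools well, its own loss is low
# while the same outputs give the VAE a high adversarial loss.
d_l, d_u = [0.9, 0.8], [0.1, 0.2]
loss_d = discriminator_loss(d_l, d_u)
loss_vae = vae_adversarial_loss(d_l, d_u)
```

This opposition between the two losses is what the minimax form in Eq. (9) expresses.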
The VAE and discriminator are trained in an adversarial manner. Specifically, the
VAE maps the labeled pθ (zL | xL ) and unlabeled pθ (zU | xU ) data into the latent space
with binary labels 1 and 0, respectively, and tries to trick the discriminator into classi-
fying all the inputs as labeled. On the other hand, the discriminator tries to distinguish
the unlabeled data from the labeled data by predicting the probability of each sample
being from the labeled pool. Thus, the adversarial network is trained to serve as the
sampling scheme via the discriminator by predicting the samples associated with the
latent representations of zL and zU to be from the labeled pool xL or the unlabeled
pool xU according to its predicted probability D(·). In short, sample selection is based
on the predicted probability of the discriminator adversarially trained with the VAE.
The smaller the probability, the more likely the sample will be selected for annotat-
ing. This adversarial training process is shown as Stage 3 in Fig. 2 and summarized in
Alg. 2.
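The selection rule itself is simple: query the unlabeled samples whose discriminator output is smallest. A minimal sketch, with a hypothetical function name and made-up probabilities:

```python
def select_for_annotation(disc_probs, b):
    """Indices of the b unlabeled samples with the smallest D(.) values."""
    order = sorted(range(len(disc_probs)), key=disc_probs.__getitem__)
    return order[:b]

# D(.) is the predicted probability that a sample comes from the labeled pool.
disc_probs = [0.91, 0.12, 0.55, 0.07, 0.80]
queried = select_for_annotation(disc_probs, b=2)
```

Here the samples at indices 3 and 1 look least like labeled data to the discriminator, so they are the ones sent for annotation.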
4 Experiments
were incorporated back into the initial training set and the process was carried out again
on the updated training set.
For Caltech-101 and ImageNet, the images were resized to 224 × 224 and we initi-
ated the process with 10% of the samples from the dataset as labeled data with a budget
size equivalent to 5% of the dataset. All other settings remained the same as those used
for CIFAR-10 and CIFAR-100, except that the main task was trained for 100 epochs
for the ImageNet dataset. The effectiveness of our approach was assessed based on the
accuracy of the test data. We compared against a random sampling strategy baseline
and state-of-the-art methods including the core-set approach [30], LL4AL [40], VAAL
[34], TA-VAAL [19], and MAOAL [13].
Results. The results of all compared methods were averaged across 5 trials on the CIFAR-10,
CIFAR-100, and Caltech-101 datasets, and across 2 trials on ImageNet. Fig. 4 and
Fig. 6 (see supplementary material) show the classification accuracy on the benchmark
datasets. The results obtained for the competing methods are largely in line with those
reported in the literature. Our comprehensive methodology, SS-VAAL, incorporates
both the ranking loss prediction module and the clustering-assisted pseudo labeling. The
empirical results consistently show that SS-VAAL surpasses all the competing methods
at each AL stage.
5 Conclusion
In this paper we developed key enhancements to both better optimize the use of vast
amounts of unlabeled data during training and incorporate task-related information. Our
approach, SS-VAAL, includes a novel pseudo-labeling algorithm that allows a model to
delve deeper into the data space, thus enhancing its representation ability by exploiting
all unlabeled data in a semi-supervised way in every AL cycle. SS-VAAL also incorpo-
rates a ranking-based loss prediction module that converts predicted losses into a differ-
entiable ranking loss. It can be inserted as a rank variable into VAAL’s latent space for
References
1. Arazo, E., Ortego, D., Albert, P., O’Connor, N.E., McGuinness, K.: Pseudo-labeling and
confirmation bias in deep semi-supervised learning. In: Proceedings of the International Joint
Conference on Neural Networks. pp. 1–8. IEEE (2020)
2. Ash, J.T., Zhang, C., Krishnamurthy, A., Langford, J., Agarwal, A.: Deep batch active learn-
ing by diverse, uncertain gradient lower bounds. In: Proceedings of the International Confer-
ence on Learning Representations (2020)
3. Beluch, W.H., Genewein, T., Nürnberger, A., Köhler, J.M.: The power of ensembles for
active learning in image classification. In: Proceedings of the IEEE/CVF Conference on
Computer Vision and Pattern Recognition. pp. 9368–9377 (2018)
4. Buchert, F., Navab, N., Kim, S.T.: Exploiting diversity of unlabeled data for label-efficient
semi-supervised active learning. In: Proceedings of the International Conference on Pattern
Recognition. pp. 2063–2069 (2022)
5. Cao, Z., Qin, T., Liu, T.Y., Tsai, M.F., Li, H.: Learning to rank: From pairwise approach to
listwise approach. In: Proceedings of the International Conference on Machine learning. pp.
129–136 (2007)
6. Cohn, D.A., Ghahramani, Z., Jordan, M.I.: Active learning with statistical models. Journal
of Artificial Intelligence Research 4, 129–145 (1996)
7. Coletta, L.F., Ponti, M., Hruschka, E.R., Acharya, A., Ghosh, J.: Combining clustering and
active learning for the detection and learning of new image classes. Neurocomputing 358,
150–165 (2019)
8. Cordts, M., Omran, M., Ramos, S., Rehfeld, T., Enzweiler, M., Benenson, R., Franke, U.,
Roth, S., Schiele, B.: The cityscapes dataset for semantic urban scene understanding. In:
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp.
3213–3223 (2016)
9. Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: Imagenet: A large-scale hierar-
chical image database. In: Proceedings of the IEEE/CVF Conference on Computer Vision
and Pattern Recognition. pp. 248–255 (2009)
10. Engilberge, M., Chevallier, L., Pérez, P., Cord, M.: Sodeep: a sorting deep net to learn rank-
ing loss surrogates. In: Proceedings of the IEEE/CVF Conference on Computer Vision and
Pattern Recognition. pp. 10792–10801 (2019)
11. Fei-Fei, L., Fergus, R., Perona, P.: Learning generative visual models from few training ex-
amples: An incremental bayesian approach tested on 101 object categories. In: Proceedings
of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. pp.
178–178 (2004)
12. Gal, Y., Islam, R., Ghahramani, Z.: Deep bayesian active learning with image data. In: Pro-
ceedings of the International Conference on Machine Learning. pp. 1183–1192 (2017)
13. Geng, L., Liu, N., Qin, J.: Multi-classifier adversarial optimization for active learning. In:
Proceedings of the AAAI Conference on Artificial Intelligence. vol. 37, pp. 7687–7695
(2023)
14. Gilhuber, S., Jahn, P., Ma, Y., Seidl, T.: Verips: Verified pseudo-label selection for deep active
learning. In: Proceedings of the IEEE International Conference on Data Mining. pp. 951–956
(2022)
15. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Pro-
ceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp.
770–778 (2016)
16. Higgins, I., Matthey, L., Pal, A., Burgess, C., Glorot, X., Botvinick, M., Mohamed, S., Lerch-
ner, A.: beta-vae: Learning basic visual concepts with a constrained variational framework.
In: Proceedings of the International Conference on Learning Representations (2017)
17. Huang, S.J., Jin, R., Zhou, Z.H.: Active learning by querying informative and representative
examples. IEEE Transactions on Pattern Analysis and Machine Intelligence 36(10),
1936–1949 (2014)
18. Huang, Z., He, Y., Vogt, S., Sick, B.: Uncertainty and utility sampling with pre-clustering.
In: Proceedings of the Workshop on Interactive Adaptive Learning (2021)
19. Kim, K., Park, D., Kim, K.I., Chun, S.Y.: Task-aware variational adversarial active learning.
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
pp. 8166–8175 (2021)
20. Krizhevsky, A.: Learning multiple layers of features from tiny images. Tech. rep., University
of Toronto, Toronto, Ontario (2009)
21. Lee, D.H.: Pseudo-label: The simple and efficient semi-supervised learning method for deep
neural networks. In: Proceedings of the Workshop on Challenges in Representation Learning.
vol. 3, p. 896 (2013)
22. Lewis, D.D.: A sequential algorithm for training text classifiers: Corrigendum and additional
data. In: Proceedings of the ACM SIGIR Forum. vol. 29, pp. 13–19. ACM New York, NY, USA
(1995)
23. Li, M., Liu, X., van de Weijer, J., Raducanu, B.: Learning to rank for active learning: A
listwise approach. In: Proceedings of the International Conference on Pattern Recognition.
pp. 5587–5594 (2020)
24. Li, X., Guo, Y.: Adaptive active learning for image classification. In: Proceedings of the
IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 859–866 (2013)
25. Liu, T.Y.: Learning to rank for information retrieval. Foundations and Trends® in Informa-
tion Retrieval 3(3), 225–331 (2009)
26. Nguyen, H.T., Smeulders, A.: Active learning using pre-clustering. In: Proceedings of the
International Conference on Machine Learning. p. 79 (2004)
27. Rizve, M.N., Duarte, K., Rawat, Y.S., Shah, M.: In defense of pseudo-labeling: An
uncertainty-aware pseudo-label selection framework for semi-supervised learning. In: Pro-
ceedings of the International Conference on Learning Representations (2021)
28. Sajjadi, M., Javanmardi, M., Tasdizen, T.: Regularization with stochastic transformations and
perturbations for deep semi-supervised learning. In: Proceedings of the Advances in Neural
Information Processing Systems. vol. 29 (2016)
29. Saquil, Y., Kim, K.I., Hall, P.: Ranking cgans: Subjective control over semantic image at-
tributes. In: Proceedings of the British Machine Vision Conference (2018)
30. Sener, O., Savarese, S.: Active learning for convolutional neural networks: A core-set ap-
proach. In: Proceedings of the International Conference on Learning Representations (2018)
31. Settles, B.: Active learning literature survey. Tech. rep., University of Wisconsin-Madison
Department of Computer Science (2009)
32. Siméoni, O., Budnik, M., Avrithis, Y., Gravier, G.: Rethinking deep active learning: Using
unlabeled data at model training. In: Proceedings of the International Conference on Pattern
Recognition. pp. 1220–1227 (2020)
33. Sindhwani, V., Niyogi, P., Belkin, M.: Beyond the point cloud: From transductive to semi-
supervised learning. In: Proceedings of the International Conference on Machine Learning.
pp. 824–831 (2005)
34. Sinha, S., Ebrahimi, S., Darrell, T.: Variational adversarial active learning. In: Proceedings
of the IEEE/CVF International Conference on Computer Vision. pp. 5972–5981 (2019)
35. https://github.com/robotic-vision-lab/Semi-Supervised-Variational-Adversarial-Active-Learning
36. Wang, K., Zhang, D., Li, Y., Zhang, R., Lin, L.: Cost-effective active learning for deep image
classification. IEEE Transactions on Circuits and Systems for Video Technology 27(12),
2591–2600 (2016)
37. Xia, F., Liu, T.Y., Wang, J., Zhang, W., Li, H.: Listwise approach to learning to rank: theory
and algorithm. In: Proceedings of the International Conference on Machine Learning. pp.
1192–1199 (2008)
38. Yan, X., Nazmi, S., Gebru, B., Anwar, M., Homaifar, A., Sarkar, M., Gupta, K.D.: A
clustering-based active learning method to query informative and representative samples.
Applied Intelligence 52(11), 13250–13267 (2022)
39. Yang, Y., Ma, Z., Nie, F., Chang, X., Hauptmann, A.G.: Multi-class active learning by uncer-
tainty sampling with diversity maximization. International Journal of Computer Vision 113,
113–127 (2015)
40. Yoo, D., Kweon, I.S.: Learning loss for active learning. In: Proceedings of the IEEE/CVF
Conference on Computer Vision and Pattern Recognition. pp. 93–102 (2019)
41. Yu, F., Koltun, V., Funkhouser, T.: Dilated residual networks. In: Proceedings of the
IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 472–480 (2017)
42. Zhang, B., Li, L., Yang, S., Wang, S., Zha, Z.J., Huang, Q.: State-relabeling adversarial active
learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern
Recognition. pp. 8756–8765 (2020)
Supplementary Material
In this supplement we provide image classification results on the ImageNet dataset and
additional experimental results for the ablation study to assess the impact of each SS-
VAAL component.
Fig. 6 presents a performance comparison of our full methodology against several main
competing approaches on the ImageNet dataset. The results clearly show that SS-VAAL
consistently outperforms the others in every iteration, demonstrating its efficacy and
scalability in handling large-scale datasets.
We conducted an ablation study on the image classification task to assess the impact
of each proposed component. SS-VAAL (w/ ranking only) refers to the enhancement
of VAAL by integrating the ranking loss-based prediction module. According to Fig. 7
- Fig. 9, this configuration outperforms VAAL and LL4AL, confirming the benefits
of considering task-related information in task learning. Furthermore, this setting also
outperforms TA-VAAL, indicating that our selection of the listwise ranking method
more effectively conveys task-related information than that of TA-VAAL.
In contrast, SS-VAAL (w/ CAPL only) applies the proposed clustering-assisted
pseudo-labeling procedure at every stage of model training.
This setup yields a noticeable improvement over all compared methods, highlighting
the effectiveness of exploiting unlabeled data during model training. It also offers a
modest improvement over the SS-VAAL (w/ ranking only) configuration (Fig. 10 -
Fig. 12), implying that leveraging unlabeled data for training contributes more to per-
formance improvement than employing alternative means for conveying task-related
information. Additionally, we contrast this configuration with SS-VAAL (w/ PL only),
which represents the use of the conventional pseudo-labeling technique. The enhance-
ment in performance underscores the effectiveness of our refinement on this method
(Fig. 13 - Fig. 15).
Fig. 7: Ablation results on analyzing the effect of the ranking component on the CIFAR-
10 dataset.
Fig. 8: Ablation results on analyzing the effect of the ranking component on the CIFAR-
100 dataset.
Fig. 9: Ablation results on analyzing the effect of the ranking component on the Caltech-
101 dataset.
Fig. 10: Ablation results on analyzing the effect of each component on the CIFAR-10
dataset.
Fig. 11: Ablation results on analyzing the effect of each component on the CIFAR-100
dataset.
Fig. 12: Ablation results on analyzing the effect of each component on the Caltech-101
dataset.
Fig. 13: Ablation results on analyzing the effect of the pseudo-labeling component on
the CIFAR-10 dataset.
Fig. 14: Ablation results on analyzing the effect of the pseudo-labeling component on
the CIFAR-100 dataset.
Fig. 15: Ablation results on analyzing the effect of the pseudo-labeling component on
the Caltech-101 dataset.