Universal Semi-Supervised Learning For Medical Image Classification
3 Monash Medical AI Group, Monash University, Melbourne, Australia
4 Faculty of Information Technology, Monash University, Melbourne, Australia
5 Harbin Engineering University, Harbin, China
6 Centre for Eye Research Australia, Melbourne University, Melbourne, Australia
https://www.monash.edu/mmai-group
[email protected], [email protected]
1 Introduction
Training a satisfactory deep model for medical classification tasks remains highly challenging due to the expensive cost of collecting adequate high-quality annotated data. Hence, semi-supervised learning (SSL) [15,1,25,16,19] has become a popular technique for exploiting unlabeled data when only limited annotated data are available. Essentially, most existing SSL methods are based on the assumption that labeled
Fig. 1: Problem illustration. (a) Close-set SSL. The samples in the labeled and
unlabeled data share the same classes and are collected under the same environ-
ment, i.e., dermatoscopes. (b) Open-set SSL. There are unknown classes (UKC)
in the unlabeled data, e.g., BCC and BKL. (c) Universal SSL. In addition to the
unknown classes, the samples in the unlabeled data may come from other un-
known domains (UKD), e.g., samples from other datasets with different imaging
and condition settings.
and unlabeled data come from the same close-set distribution, which neglects realistic scenarios. However, in practical clinical tasks (e.g., skin lesion classification), unlabeled data may contain samples from unknown/open-set classes that are not present in the training set, leading to sub-optimal performance.
We further illustrate this problem in Fig. 1. Specifically, Fig. 1-(a) shows
a classic close-set SSL setting: the labeled and unlabeled data from ISIC 2019
dataset [10] share the same classes, i.e., melanocytic nevus (NV) and melanoma
(MEL). Fig. 1-(b) shows an Open-set SSL setting, where novel classes such as basal cell carcinoma (BCC) and benign keratosis (BKL) are introduced. Recent works on Open-set SSL [23,17] mainly focus on identifying those outliers during model training. Unlike them, here we further consider a more realistic scenario that also greatly violates the close-set assumption posed above, as shown in Fig. 1-(c). The unlabeled data may share the same classes but come from quite different domains, e.g., MEL from the Derm7pt dataset [14], which contains clinical images in addition to dermoscopic images. Meanwhile, there are some unknown novel classes, e.g., BCC from the Derm7pt dataset, leading to both seen/unseen class and domain mismatch.
To handle this mismatch issue, Huang et al. [9] proposed CAFA, which combines open-set recognition (OSR) and domain adaptation (DA), namely Universal SSL. Specifically, they proposed to measure the possibility of a sample belonging to unknown classes (UKC) or unknown domains (UKD), which is leveraged to re-weight the unlabeled samples. The domain adaptation term adapts features from unknown domains into the known domain, ensuring that the model
can fully exploit the value of UKD samples. However, the effectiveness of CAFA relies heavily on the detection of open-set samples, and its proposed techniques often fail to generalize to medical datasets. For medical images such as skin data, UKC and UKD samples can be highly inseparable (e.g., MEL in Fig. 1-(a) vs. BCC in Fig. 1-(b)), particularly when training with limited samples in a semi-supervised setting.
Therefore, in this work, we propose a novel universal semi-supervised framework for medical image classification that handles both class and domain mismatch. Specifically, to measure the possibility of an unlabeled sample being UKC, we propose a dual-path outlier estimation technique that operates at both the feature and classifier levels using prototypes and prediction confidence. In addition, we present a scoring mechanism to measure the possibility of an unlabeled sample being UKD by pre-training a Variational AutoEncoder (VAE) model, which is more suitable for medical image domain separation when fewer labeled samples are available. With the detected UKD samples, we apply domain adaptation methods to match features across domains. After that, the labeled and unlabeled samples (including feature-adapted UKD samples) can be optimized using traditional SSL techniques.
Our contributions can be summarized as follows: (1) We present a novel framework for universal semi-supervised medical image classification, which enables the model to learn from unknown classes/domains using open-set recognition and domain adaptation techniques. (2) We propose a novel scoring mechanism to improve the reliability of detecting outliers from UKC/UKD for further unified training. (3) Experiments on datasets with various modalities demonstrate that our proposed method performs well in different open-set SSL scenarios.
2 Methodology
2.1 Overview
Recent open-set SSL methods [23,8] mainly focus on the detection of UKC sam-
ples, which is known as the OSR task. Those outliers will be removed during
the training phase. In this section, we propose a novel OSR technique, namely
Dual-path Outlier Estimation (DOE) for the assessment of UKC, based on both feature similarities and the confidence of classifier predictions. Formally, given labeled samples $X_l$, we first warm up the model with a standard cross-entropy loss. Unlike CAFA [9], which computes instance-wise feature similarity, we argue that samples from known classes should have closer distances to centric representations, e.g., prototypes, than outliers. The prototype of a class can be computed as the average output of its corresponding samples $x_{l,i} \in X_l$:
$$v_{l,c_j} = \frac{\sum_{i=1,\, x_{l,i} \in X_{l,c_j}}^{N_{c_j}} \mathcal{F}(x_{l,i})}{N_{c_j}}, \qquad (1)$$
where $N_{c_j}$ denotes the number of instances of class $j$ and $v_{l,c_j}$ is a vector of shape $1 \times D$ after the global average pooling layer. Then, the feature similarity of an instance $x_{u,i} \in X_U$ to each known class can be calculated as:
$$d = \{\, d_{i,c_j} \,\}_{j=1}^{N_c} = \big\{\, \|\mathcal{F}(x_{u,i}) - v_{l,c_j}\|_2 \,\big\}_{j=1}^{N_c}. \qquad (2)$$
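To make Eqs. (1)-(2) concrete, a minimal PyTorch sketch is given below: it computes class prototypes from pooled labeled features and the per-class L2 distances of unlabeled features. The function and variable names are illustrative assumptions, not identifiers from our released code.

```python
# Minimal sketch of Eqs. (1)-(2): prototypes from labeled features and
# per-class L2 distances of unlabeled features. Names are illustrative only.
import torch

def class_prototypes(feats_l: torch.Tensor, labels_l: torch.Tensor, num_classes: int) -> torch.Tensor:
    # feats_l: (N_l, D) pooled features F(x_l); labels_l: (N_l,) integer class labels
    protos = torch.zeros(num_classes, feats_l.size(1), device=feats_l.device)
    for c in range(num_classes):
        mask = labels_l == c
        if mask.any():
            protos[c] = feats_l[mask].mean(dim=0)   # Eq. (1): class-wise average feature
    return protos                                    # (num_classes, D)

def prototype_distances(feats_u: torch.Tensor, protos: torch.Tensor) -> torch.Tensor:
    # Eq. (2): L2 distance of each unlabeled feature to every class prototype
    return torch.cdist(feats_u, protos, p=2)         # (N_u, num_classes)

# A sample whose mean distance over classes is large is a potential UKC outlier:
# d_avg = prototype_distances(feats_u, protos).mean(dim=1)
```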
We can assume that if a sample is relatively far from all class-specific prototypes, it has a larger average value $d_{avg}$ of the distances $d$ and can be considered a potential outlier [13,20,28]. Then, we perform strong augmentations on each unlabeled input to generate two views $x'_{u_i,1}$ and $x'_{u_i,2}$, which are subsequently fed into the pre-trained network to obtain the predictions $p_{u_i,1}$ and $p_{u_i,2}$. Inspired by the agreement maximization principle [24], whether a sample is an outlier can be determined by the consistency of these two predictions, normalized by a function $\sigma$ that maps the original distribution into $(0,1]$.
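A possible realization of this dual-path score is sketched below: the mean prototype distance (feature level) is min-max normalized, and the disagreement between the predictions on the two strongly augmented views (classifier level) is measured with a symmetric KL divergence; the two parts are then averaged. The fusion rule and the choice of divergence are assumptions made for illustration, not the exact formula.

```python
# Hedged sketch of the dual-path UKC score: normalized prototype distance plus
# prediction disagreement across two strong augmentations. The fusion is assumed.
import torch
import torch.nn.functional as F

def ukc_score(d_avg: torch.Tensor, p_view1: torch.Tensor, p_view2: torch.Tensor) -> torch.Tensor:
    # d_avg: (N_u,) mean prototype distances; p_view1/2: (N_u, C) softmax predictions
    dist_score = (d_avg - d_avg.min()) / (d_avg.max() - d_avg.min() + 1e-8)
    log_p1 = p_view1.clamp_min(1e-8).log()
    log_p2 = p_view2.clamp_min(1e-8).log()
    # Symmetric KL divergence as a disagreement measure between the two views
    disagreement = 0.5 * (F.kl_div(log_p1, p_view2, reduction="none").sum(dim=1)
                          + F.kl_div(log_p2, p_view1, reduction="none").sum(dim=1))
    cons_score = 1.0 - torch.exp(-disagreement)       # maps disagreement into [0, 1)
    return 0.5 * (dist_score + cons_score)            # higher -> more likely UKC
```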
Fig. 2: Overview of the proposed framework. (A) Overall pipeline: inputs pass through a shared feature extractor and classifier, with domain adaptation, semi-supervised learning, and weighted loss computation. (B) Assessment of UKC via prototype clustering, similarity calculation, and pseudo labels. (C) Assessment of UKD via a VAE (probabilistic encoder/decoder) trained with a reconstruction loss, followed by a Gaussian Mixture Model for domain separation.
In our scenario, we pre-train a VAE model using the labeled data and evaluate it on the unlabeled data to obtain reconstruction errors. Then, we fit a two-component Gaussian Mixture Model (GMM) with the Expectation-Maximization algorithm, which is flexible in the sharpness of the distribution and more sensitive to low-dimensional distributions [18], i.e., the reconstruction losses $\mathcal{L}_{re}$. For each sample $x_i$, we obtain its posterior probability $w_{d,i}$ for domain separation. With known-domain samples from the labeled data (denoted as $y_{l,d,i} = 0$) and the possibility of UKD samples from the unlabeled data (denoted as $y_{u,d,j} = w_{d,j}$), we optimize a binary cross-entropy loss for the non-adversarial discriminator $D'$:
$$\mathcal{L}_{dom} = -\frac{1}{N_l}\sum_{i=1}^{N_l}\log\big(1-\hat{y}_{l,d,i}\big) - \frac{1}{N_u}\sum_{j=1}^{N_u}\Big[ w_{d,j}\cdot\log\big(\hat{y}_{u,d,j}\big) + (1-w_{d,j})\cdot\log\big(1-\hat{y}_{u,d,j}\big) \Big], \qquad (6)$$
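Under our reading of this step, the sketch below first scores UKD samples by fitting a two-component GMM to the VAE reconstruction errors and then uses the resulting posteriors $w_d$ as soft labels in the weighted binary cross-entropy of Eq. (6). The VAE interface (`vae(x)` returning a reconstruction) and the use of scikit-learn's GaussianMixture are illustrative assumptions.

```python
# Sketch of UKD scoring (VAE reconstruction error + 2-component GMM) and the
# weighted domain BCE of Eq. (6). The `vae` forward returning a reconstruction is assumed.
import torch
import torch.nn.functional as F
from sklearn.mixture import GaussianMixture

def ukd_weights(vae, x_unlabeled: torch.Tensor) -> torch.Tensor:
    with torch.no_grad():
        recon = vae(x_unlabeled)                              # assumed interface: x -> reconstruction
        err = F.mse_loss(recon, x_unlabeled, reduction="none").flatten(1).mean(dim=1)
    err_np = err.cpu().numpy().reshape(-1, 1)
    gmm = GaussianMixture(n_components=2).fit(err_np)         # EM over reconstruction losses
    ukd_comp = int(gmm.means_.argmax())                       # larger-error component ~ unknown domain
    return torch.from_numpy(gmm.predict_proba(err_np)[:, ukd_comp]).float()  # w_d in [0, 1]

def domain_loss(y_hat_l: torch.Tensor, y_hat_u: torch.Tensor, w_d: torch.Tensor) -> torch.Tensor:
    # Eq. (6): labeled samples are known domain (label 0); unlabeled samples use soft labels w_d.
    # y_hat_l / y_hat_u are the sigmoid outputs of the non-adversarial discriminator D'.
    loss_l = -torch.log(1 - y_hat_l + 1e-8).mean()
    loss_u = -(w_d * torch.log(y_hat_u + 1e-8)
               + (1 - w_d) * torch.log(1 - y_hat_u + 1e-8)).mean()
    return loss_l + loss_u
```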
where $\theta$ denotes the parameters of the corresponding module, and $y_s = 1$ and $y_t = 0$ are the initial domain labels for the source and target domains. Then, we can perform unified training on the labeled data and the selectively feature-adapted unlabeled data under the control of the estimated weights. The overall loss can be formulated as:
$$\mathcal{L}_{overall} = \mathcal{L}_{CE}(X_l) - \alpha\cdot\mathcal{L}'_{adv}(X_u \mid w_{u,d}, w_{u,c}) + \beta\cdot\mathcal{L}_{SSL}(X_u \mid w_{u,c}), \qquad (8)$$
where $\alpha$ and $\beta$ are coefficients. For the semi-supervised term $\mathcal{L}_{SSL}$, we adopt the Π-model [15]. Thus, we can perform a global optimization to better utilize the unlabeled data under class/domain mismatch.
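As a rough illustration of Eq. (8), the sketch below combines the supervised cross-entropy, the (negated) adversarial domain term, and a Π-model-style consistency term on the unlabeled data; down-weighting unlabeled samples by $(1 - w_c)$ is our interpretation of the weight-controlled training, not a quoted formula.

```python
# Sketch of the overall objective in Eq. (8). The down-weighting of likely UKC
# samples via (1 - w_c) is an assumption about how the weights enter L_SSL.
import torch
import torch.nn.functional as F

def overall_loss(logits_l, labels_l, p_u1, p_u2, adv_loss, w_c, alpha, beta):
    l_ce = F.cross_entropy(logits_l, labels_l)                     # L_CE on labeled data
    # Pi-model consistency between two stochastic predictions on unlabeled data
    l_ssl = (((p_u1 - p_u2) ** 2).sum(dim=1) * (1.0 - w_c)).mean()
    return l_ce - alpha * adv_loss + beta * l_ssl                  # Eq. (8)
```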
3 Experiments
3.1 Datasets & Implementation Details
Dermatology For skin lesion recognition, we use four datasets to evaluate our
methods: ISIC 2019 [10], PAD-UFES-20 [22], Derm7pt [14] and Dermnet [4].
The statistics of the four datasets can be found in our supplementary documents. The images in ISIC 2019 are captured with dermatoscopes, while the images in the PAD-UFES-20 and Dermnet datasets are captured in clinical scenarios; the Derm7pt dataset contains both. Firstly, we divide the ISIC 2019 dataset into 4 known classes (NV, MEL, BCC, BKL) and 4 unknown classes (AK, SCC, VASC, DF). We sample 500 instances per known class to construct the labeled dataset, and then sample 250 / 250 instances per known class to construct the validation and test datasets. We sample 30% of the close-set samples and all open-set samples from the remaining 17,331 instances to form the unlabeled dataset. For the other three datasets, we mix each dataset with the ISIC 2019 unlabeled dataset to validate the effectiveness of our proposed method when training with different unknown domains.
Ophthalmology We also evaluate our proposed method on in-house fundus datasets collected from regular fundus cameras, handheld fundus cameras, and ultra-widefield fundus imaging, covering fields of view of 60°, 45°, and 200°, respectively. We follow [12,11] and take diabetic retinopathy (DR) grading with 5 sub-classes (normal, mild DR, moderate DR, severe DR, and proliferative DR) as the known classes. We sample 1000 / 500 / 500 instances per class to construct the training/validation/test datasets. Samples with age-related macular degeneration (AMD), which have similar features to DR, are introduced as 4 unknown classes (small drusen, big drusen, dry AMD, and wet AMD). Please refer to our supplementary files for more details.
Implementation Details All skin images are resized to 224×224 pixels and all fundus images are resized to 512×512 pixels. We take ResNet-50 [7] as the backbone for both the classification model and VAE training. We use the Π-model [15] as the SSL regularizer. We warm up the model for 80 of 200 epochs and use an exponential ramp-up [15] to adjust the coefficients of the adversarial training term α and the SSL term β. We use the SGD optimizer with a learning rate of 3×10−4 and a batch size of 32. Regular augmentation techniques such as random crop and flip are applied, with color jitter and Gaussian blur as strong augmentations for the assessment of UKC. For a fair comparison, we keep all basic hyper-parameters such as augmentations, batch size, and learning rate the same across the compared methods.
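For reference, a minimal configuration consistent with the reported settings might look as follows; the concrete transform parameters (crop scale, jitter strength, blur kernel) and the SGD momentum are assumptions, since only the augmentation types and the main hyper-parameters are stated above.

```python
# Illustrative setup matching the reported hyper-parameters (ResNet-50 backbone,
# SGD, lr 3e-4, batch size 32, 224x224 skin images). Transform parameters and
# momentum are assumed, not quoted values.
import torch
import torchvision.transforms as T
from torchvision.models import resnet50

weak_aug = T.Compose([T.RandomResizedCrop(224), T.RandomHorizontalFlip(), T.ToTensor()])
strong_aug = T.Compose([
    T.RandomResizedCrop(224),
    T.RandomHorizontalFlip(),
    T.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4),  # strong augmentation
    T.GaussianBlur(kernel_size=23),                                # strong augmentation
    T.ToTensor(),
])

model = resnet50(num_classes=4)                                    # 4 known skin classes
optimizer = torch.optim.SGD(model.parameters(), lr=3e-4, momentum=0.9)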
Fig. 3: The visualized examples from unlabeled data with normalized scores.
CAFA achieves satisfactory results except on UWF images. UASD improves the performance over the baseline ERM model. This is probably because DR and AMD share similar features or semantic information, such as hemorrhages and exudates, which can enhance feature learning [12].
Novel Domain Detection Since we claim that our proposed scoring mechanism can reliably identify UKD samples, we perform experiments on unknown domain separation using different techniques. Our proposed CDS scoring mechanism achieves the best results for unknown domain separation. Moreover, we find that UKD samples are also sensitive to prototype distances, e.g., with a high AUC of 80.79% on Dermnet (DN), which confirms the importance and necessity of disentangling the domain information for the detection of UKC.
4 Conclusion
References
1. Berthelot, D., Carlini, N., Goodfellow, I., Papernot, N., Oliver, A., Raffel, C.A.:
Mixmatch: A holistic approach to semi-supervised learning. Advances in Neural
Information Processing Systems 32 (2019)
2. Cao, Z., Ma, L., Long, M., Wang, J.: Partial adversarial domain adaptation. In:
European Conference on Computer Vision. pp. 135–150 (2018)
3. Chen, Y., Zhu, X., Li, W., Gong, S.: Semi-supervised learning under class distribu-
tion mismatch. In: Proceedings of the AAAI Conference on Artificial Intelligence.
vol. 34(4), pp. 3569–3576 (2020)
4. Dermnet: Dermnet (2023), https://dermnet.com/
5. Esteva, A., Kuprel, B., Novoa, R.A., Ko, J., Swetter, S.M., Blau, H.M., Thrun, S.:
Dermatologist-level classification of skin cancer with deep neural networks. Nature
542(7639), 115–118 (2017)
6. Guo, L.Z., Zhang, Z.Y., Jiang, Y., Li, Y.F., Zhou, Z.H.: Safe deep semi-supervised
learning for unseen-class unlabeled data. In: International Conference on Machine
Learning. pp. 3897–3906. PMLR (2020)
7. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In:
IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 770–778
(2016)
8. Huang, J., Fang, C., Chen, W., Chai, Z., Wei, X., Wei, P., Lin, L., Li, G.: Trash
to treasure: harvesting ood data with cross-modal matching for open-set semi-
supervised learning. In: IEEE/CVF International Conference on Computer Vision.
pp. 8310–8319 (2021)
9. Huang, Z., Xue, C., Han, B., Yang, J., Gong, C.: Universal semi-supervised learn-
ing. Advances in Neural Information Processing Systems 34, 26714–26725 (2021)
10. ISIC: Isic archive (2023), https://www.isic-archive.com/
11. Ju, L., Wang, X., Zhao, X., Bonnington, P., Drummond, T., Ge, Z.: Leverag-
ing regular fundus images for training uwf fundus diagnosis models via adversar-
ial learning and pseudo-labeling. IEEE Transactions on Medical Imaging 40(10),
2911–2925 (2021)
12. Ju, L., Wang, X., Zhao, X., Lu, H., Mahapatra, D., Bonnington, P., Ge, Z.: Synergic
adversarial label learning for grading retinal diseases via knowledge distillation and
multi-task learning. IEEE Journal of Biomedical and Health Informatics 25(10),
3709–3720 (2021)
13. Ju, L., Wu, Y., Wang, L., Yu, Z., Zhao, X., Wang, X., Bonnington, P., Ge, Z.:
Flexible sampling for long-tailed skin lesion classification. In: Medical Image Com-
puting and Computer Assisted Intervention–MICCAI 2022. pp. 462–471. Springer
(2022)
14. Kawahara, J., Daneshvar, S., Argenziano, G., Hamarneh, G.: Seven-point checklist
and skin lesion classification using multitask multimodal neural nets. IEEE Journal
of Biomedical and Health Informatics 23(2), 538–546 (2018)
15. Laine, S., Aila, T.: Temporal ensembling for semi-supervised learning. arXiv
preprint arXiv:1610.02242 (2016)
16. Lee, D.H., et al.: Pseudo-label: The simple and efficient semi-supervised learning
method for deep neural networks. In: ICML Workshop on challenges in represen-
tation learning. vol. 3(2), p. 896 (2013)
17. Lee, D., Kim, S., Kim, I., Cheon, Y., Cho, M., Han, W.S.: Contrastive regulariza-
tion for semi-supervised learning. In: IEEE/CVF Conference on Computer Vision
and Pattern Recognition. pp. 3911–3920 (2022)
18. Li, J., Socher, R., Hoi, S.C.: Dividemix: Learning with noisy labels as semi-
supervised learning. In: International Conference on Learning Representations
(2020)
19. Liu, Q., Yu, L., Luo, L., Dou, Q., Heng, P.A.: Semi-supervised medical image
classification with relation-driven self-ensembling model. IEEE Transactions on
Medical Imaging 39(11), 3429–3440 (2020)
20. Ming, Y., Sun, Y., Dia, O., Li, Y.: How to exploit hyperspherical embeddings for
out-of-distribution detection? arXiv preprint arXiv:2203.04450 (2022)
21. Miyato, T., Maeda, S.i., Koyama, M., Ishii, S.: Virtual adversarial training: a regu-
larization method for supervised and semi-supervised learning. IEEE Transactions
on Pattern Analysis and Machine Intelligence 41(8), 1979–1993 (2018)
22. Pacheco, A.G., Lima, G.R., Salomao, A.S., Krohling, B., Biral, I.P., de Angelo,
G.G., Alves Jr, F.C., Esgario, J.G., Simora, A.C., Castro, P.B., et al.: Pad-ufes-20:
A skin lesion dataset composed of patient data and clinical images collected from
smartphones. Data in Brief 32, 106221 (2020)
23. Saito, K., Kim, D., Saenko, K.: Openmatch: Open-set semi-supervised learning
with open-set consistency regularization. Advances in Neural Information Process-
ing Systems 34, 25956–25967 (2021)
24. Sindhwani, V., Niyogi, P., Belkin, M.: A co-regularization approach to semi-
supervised learning with multiple views. In: Proceedings of ICML workshop on
learning with multiple views. vol. 2005, pp. 74–79 (2005)
25. Sohn, K., Berthelot, D., Carlini, N., Zhang, Z., Zhang, H., Raffel, C.A., Cubuk,
E.D., Kurakin, A., Li, C.L.: Fixmatch: Simplifying semi-supervised learning with
consistency and confidence. Advances in Neural Information Processing Systems
33, 596–608 (2020)
26. Sun, X., Yang, Z., Zhang, C., Ling, K.V., Peng, G.: Conditional gaussian distribu-
tion learning for open set recognition. In: Proceedings of the IEEE/CVF Confer-
ence on Computer Vision and Pattern Recognition. pp. 13480–13489 (2020)
27. Tarvainen, A., Valpola, H.: Mean teachers are better role models: Weight-averaged
consistency targets improve semi-supervised deep learning results. Advances in
Neural Information Processing Systems 30 (2017)
28. Ye, H., Xie, C., Cai, T., Li, R., Li, Z., Wang, L.: Towards a theoretical framework
of out-of-distribution generalization. Advances in Neural Information Processing
Systems 34, 23519–23531 (2021)
29. Yu, Q., Ikami, D., Irie, G., Aizawa, K.: Multi-task curriculum framework for open-
set semi-supervised learning. In: European Conference on Computer Vision. pp.
438–454. Springer (2020)