
IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, VOL. 26, NO. 1, JANUARY 2022

A Progressive Generative Adversarial Method for Structurally Inadequate Medical Image Data Augmentation

Ruixuan Zhang, Wenhuan Lu, Xi Wei, Jialin Zhu, Han Jiang, Zhiqiang Liu, Jie Gao, Xuewei Li, Jian Yu, Mei Yu, and Ruiguo Yu

Abstract—The generation-based data augmentation method can overcome the challenge caused by the imbalance of medical image data to a certain extent. However, most current research focuses on images with a unified structure, which are easy to learn. Ultrasound images, in contrast, are structurally inadequate, making it difficult for the structure to be captured by the generative network, so the generated images lack structural legitimacy. Therefore, a Progressive Generative Adversarial Method for Structurally Inadequate Medical Image Data Augmentation is proposed in this paper, including a network and a strategy. Our Progressive Texture Generative Adversarial Network alleviates the adverse effect of completely truncating the reconstruction of structure and texture during the generation process and enhances the implicit association between structure and texture. The Image Data Augmentation Strategy based on Mask-Reconstruction overcomes data imbalance from a novel perspective, maintains the legitimacy of the structure in the generated data, and increases the diversity of disease data interpretably. Experiments prove the effectiveness of our method on data augmentation and image reconstruction for structurally inadequate medical images, both qualitatively and quantitatively. Finally, weakly supervised segmentation of the lesion is an additional contribution of our method.

Index Terms—Ultrasound images, structural legitimacy, data augmentation.

I. INTRODUCTION

IN RECENT years, some Auxiliary Diagnosis research based on Deep Learning (ADDL) has approached or even surpassed doctors' performance [1]. However, ADDL is often limited by several aspects of the data: 1) The cost of data annotation. The annotation of medical image data requires precise labeling by professional doctors, and it is impossible to obtain massive amounts of data for unsupervised learning. 2) The long-tailed distribution of disease data, which leads to excessive data imbalance. 3) The scarcity of normal data, since the data collected by hospitals is mostly abnormal data.

The problems above are usually overcome by data augmentation. Currently, data augmentation methods can be divided into Traditional Augmentation (TA) and Synthetic Augmentation (SA). There are two methods in TA: geometric transformation and color transformation. SA can also be divided into two categories: synthesis methods based on interpolation and synthesis methods based on generation models. Among them, the interpolation-based methods include SMOTE [2], SamplePairing [3] and mixup [4]. The generation-based augmentation methods based on Generative Adversarial Networks (GAN) fulfill the task of data augmentation outside the sample space to a certain extent. However, it is still a controversial topic to use this method for data augmentation in medical images, as the synthetic data often lose their fixed physiological structure, which is especially evident in ultrasound images, as shown in Fig. 1.

Compared with other medical images, such as MRI and CT, ultrasound images have the advantage of flexible and variable imaging angles and scales. From another perspective, however, the structure of ultrasound data is inadequate, which results in the lack of legitimacy of generated samples in ultrasound images. The thyroid ultrasound image in Fig. 1 is a generated result of our previous research, in which the position and features of the trachea and artery have completely deviated. In contrast, for other medical images such as brain MRI [5], chest X-ray [6], and lung CT [7], the unified structure makes the distribution of the generated samples in the sample space basically legitimate. Therefore, this paper focuses on thyroid ultrasound images and aims to address the issue that generation-based augmentation methods find it difficult to maintain structural legitimacy.

A strong hypothesis is proposed in this paper: Medical Image = Local Human Body Structure + Tissue Texture. For modeling

Manuscript received January 29, 2021; revised June 14, 2021; accepted July 25, 2021. Date of publication August 4, 2021; date of current version January 5, 2022. This work was supported in part by the National Natural Science Foundation of China under Grant 61976155 and in part by the Major Scientific and Technological Projects for A New Generation of Artificial Intelligence of Tianjin under Grant 18ZXZNSY00300. (Corresponding author: Xuewei Li.)
Ruixuan Zhang, Wenhuan Lu, Zhiqiang Liu, Jie Gao, Xuewei Li, Jian Yu, Mei Yu, and Ruiguo Yu are with the College of Intelligence and Computing, Tianjin University, Tianjin 300072, China, with the Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin 300072, China, and also with the Tianjin Key Laboratory of Advanced Networking, Tianjin 300072, China (e-mail: [email protected]; [email protected]; [email protected]; [email protected]; [email protected]; [email protected]; [email protected]; rgyu@tju.edu.cn).
Xi Wei and Jialin Zhu are with the Tianjin Medical University Cancer Hospital, Tianjin 300060, China (e-mail: [email protected]; [email protected]).
Han Jiang is with the OpenBayes (Tianjin) IT Company, Ltd., Tianjin 300074, China (e-mail: [email protected]).
Digital Object Identifier 10.1109/JBHI.2021.3101551

2168-2194 © 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.

Authorized licensed use limited to: KIIT University. Downloaded on June 07,2024 at 09:04:52 UTC from IEEE Xplore. Restrictions apply.

data with inconsistent structure, the generation model can only learn a relatively unified texture. Based on this, a Progressive Generative Adversarial Method (PGAM) for Structurally Inadequate Medical Image Data Augmentation is proposed in this paper, which can augment normal data by repairing lesions in disease data and can also alleviate data imbalance between disease data through a Conditional Category-Sensitive Strategy. The whole process uses only image-level annotation and the point annotation of the lesion (the annotation used by doctors to measure the size of the lesion during ultrasound examination, as shown in Fig. 2(a)). The contributions of this paper are as follows:
1) Aiming at the data augmentation task for structurally inadequate images, a Progressive Generative Adversarial Method is proposed to complete image augmentation under the premise of preserving the legitimate structure;
2) A Progressive Texture Generative Adversarial Network (PTGAN) is proposed that integrates the repair processes of structure and texture, which benefits both the repair of lesions and the synthesis of new lesions;
3) An Image Data Augmentation Strategy based on Mask-Reconstruction (MR-IDAS) is proposed, in which the Conditional Category-Sensitive Strategy increases the diversity of the lesion sample distribution under prior guidance;
4) In addition, this method can achieve weakly supervised segmentation of the lesion through point annotation.

Fig. 1. Several medical image generation results based on GAN, including Chest X-ray [6], MRI [5], Breast Ultrasound Image [8], Nuclei Image [9] and Skin Lesion [10]. The structure of ultrasound images is too flexible, so generated images lose the legitimacy of their own structure unless the generation focuses on the Region Of Interest (ROI) [8].

Fig. 2. Ultrasound images of thyroid.

II. RELATED WORKS

As our method is a data augmentation task based on image inpainting, this section introduces the related GAN-based works on image augmentation and image inpainting separately.

A. GAN for Medical Data Augmentation

The GAN-based data Augmentation method (GA) improves the accuracy of downstream tasks to a certain extent, but researchers hold different views [11]–[13] on whether GA or TA performs better. GAN-based data augmentation methods can be divided into three groups according to the type of input data:
1) Noise Vector: Many methods generate fake dermatoscopic images [10], MRI [14], and chest X-ray images [15] by taking 1-D noise vectors as input, to augment imbalanced image datasets. For relatively complex data such as ultrasound images, Al-Dhabyani et al. [8] focused on the Region Of Interest (ROI), avoiding the problems caused by structurally inadequate breast ultrasound images.
2) Segmentation Mask: These methods are mostly used for cell image generation [9], [16] and also for CT images [17]. Even though the masks constrain the shape of the lesion in new samples, they cannot control structures outside the lesion region.
3) Real Image: This kind of method [6], [7] makes the generator learn the mapping relationship between real data and fake data by encoding and decoding. Among them, [18] synthesized two images (abnormal and normal) into an abnormal one, which ensures the legitimacy of the structure, but the diversity is limited.

At present, most of the data used by GAN-based medical image data augmentation methods have a consistent or loose structure. In addition, the methods above lack the constraints of prior knowledge when sampling latent features. More specifically, the diversity of samples and the rationality of their distribution in the feature space need to be considered simultaneously. Therefore, it is worth studying a data augmentation method for structurally inadequate images that weighs the relationship between the diversity and the distributional rationality of the generated data. The methods of image inpainting give us inspiration.

B. GAN-Based Image Inpainting Method

GAN-based inpainting methods include single-stage inpainting and multi-stage inpainting. [19] used GAN for image inpainting for the first time. [20] inpainted damaged images on the basis of a latent vector z recovered in a single stage. [21] believed that inpainting and mask prediction are complementary processes, so learning them at the same time helps both tasks achieve better results. Inspired by the idea of [22], [23] removed the metallic implants of MRI adversarially, using two discriminators, one global and one local. Although [24] divided image inpainting into two parts, content and texture, it is essentially a single-stage inpainting of image content.

According to the content inpainted in the first step, the multi-stage methods can be divided into: a) edge; b) structure; c) rough result; d) other optimization strategies [25], [26]. Firstly, [27]–[29] generated the edge and color of the image in the first step, and integrated the information in the second step to complete the inpainting task. [30] is also a contour completion method, but it uses additional saliency information. Secondly, [31] repaired the overall structure of the image in the first stage and the details in the second stage. [32] proposed a method of content + texture inpainting. Thirdly, [33] proposed a gradual inpainting process from low resolution to high. [34] introduced a two-stage coarse-to-fine network architecture where the first network makes an initial coarse prediction, and the second network takes the coarse prediction as input and predicts refined results.

However, although reconstructing the structure and texture of the image separately can improve the reconstruction performance of the generation model, the implicit association and progressive relationship between these elements are ignored. Therefore, this paper strengthens the incremental relationship between image structure and texture in a progressive way.

Fig. 3. Overview of our Progressive Adversarial Method, including a Network and two Strategies. At first, we perform different strategies on data to accommodate different data augmentation tasks. The Conditional Category-Sensitive Strategy adjusts the mask according to the feature distribution of the specific category, and the Tissue Texture Reconstruction Strategy is designed to enable the model to learn the tissue texture features in the ultrasound image. Then, the ability of the model to infer structure and texture is enhanced by the Progressive Texture Module. The final Refining Module completes the refinement of the reconstructed images.

III. METHODS

One of the main challenges of medical image data augmentation is to maintain the structural legitimacy of generated structurally inadequate images. This paper proposes a Progressive Generative Adversarial Method for Structurally Inadequate Medical Image Data Augmentation, including a Progressive Texture Generative Adversarial Network and an Image Data Augmentation Strategy based on Mask-Reconstruction. Our method uses only image-level annotation and point annotation; it can alleviate data imbalance between different categories, maintain the legitimacy of the structure in the generated data, and increase the diversity of disease data interpretably. The flowchart of our method is shown in Fig. 3. The PTGAN includes the Progressive Texture Module (PTM), the Refining Module (RM) and the loss function. The MR-IDAS includes a Conditional Category-Sensitive Strategy (CCS) for disease data and a Tissue Texture Reconstruction Strategy (TTR) for normal-disease data.

A. Progressive Texture GAN Overview

In this paper, we argue that methods that completely separate the structure of an image from its texture for image generation ignore the implicit association between image structure and texture, yet the relationship between them should be a simple-to-complex one. Therefore, one of the core ideas of this method is to combine the reconstruction processes of structure and texture to carry out progressive texture learning. The loss function of the entire model is composed of the losses of each module:

L = λ_p * L_P + λ_r * L_R + λ_c * L_C    (1)

where L_C is the cross-entropy loss, which imposes a penalty on the generator for generating errors of a specific disease category when disease data is generated. L_P and L_R are the losses of the PTM and RM, respectively, and λ_* represents the weight of each loss. We break down each of the other loss terms below.


1) Progressive Texture Module: This module consists of a generator G_P and a discriminator D_P. Different from other generative models, the Progressive Texture Module (PTM) utilizes image structures with different degrees of detail as supervision, from simple to complex, so the training process is disassembled into gradual texture learning of the image content. We define the original image as I_gt and the mask as M, which will be described in detail in Section III-B. At this point, M can be understood as a simple binarized matrix, where 0 represents the missing region and 1 represents the background. The fusion of image structure and texture is expressed as S_i, where i ∈ [1, n] represents the different degrees of texture detail contained in the structure and n is the number of progressive training stages in the PTM. The smaller the i, the less high-frequency texture feature is contained in S_i, which can be calculated by mean filtering, as shown in Equation (2). The reason for choosing mean filtering as the fusion method for mixing structure and texture is that it can not only reduce the adverse effect of high-frequency information (such as speckle noise) on modeling, but can also adjust the degree of texture detail through the kernel size. In addition, after the image is filtered by mean filtering, each pixel has a larger receptive field, so more intensive supervision is provided during back-propagation.

S_i(x, y) = (1 / (k_i × k_i)) * Σ I(x ± k_i/2, y ± k_i/2)    (2)

where k_i represents the size of the filter kernel and (x, y) represents the pixel coordinates. In order to ensure that the filtered image still maintains its original size, padding should be performed. At this point, our progressive texture learning generator can be written as:

S'_n = G_P(I × M, S_n × M, M)    (3)

where × denotes the pixel-wise product. The three elements are concatenated together as the input of the network. As the training progresses, the joint loss of generator and discriminator is optimized gradually:

L_P = Σ_i (α * L1_i + (1 − α) * L_Pd_i)    (4)

L1_i = ||S'_i − S_i||_1    (5)

L_Pd_i = E[log(1 − D_P(G_P(I × M, S_i × M, M)))] + E[log D_P(S)]    (6)

where L1 is the reconstruction loss, calculating the ℓ1 distance between the generated result and S. L_Pd is the adversarial loss of the discriminator, and α is a regularization parameter.

2) Refining Module: After the PTM, since the supervision is filtered to smooth out the high-frequency information, another refinement generator G_R and a corresponding discriminator D_R are needed in the RM to generate more realistic images. The whole refinement process can be expressed as:

I' = G_R(I × M, S', M)    (7)

where I' is the final generated image. The loss function can be expressed as:

L_R = β * (L1 + L_f) + (1 − β) * L_Rd    (8)

L1 = ||I' − I||_1    (9)

L_f = exp(−(1/N) * Σ_{(x,y) ∈ {M=1}} Sim(F'_{x,y}, F_{x,y}) / Sim_max(F', F_{x,y}))    (10)

L_Rd = E[log(1 − D_R(G_R(I × M, S', M)))] + E[log D_R(I)]    (11)

Among them, L1 is the refined reconstruction loss. Drawing on the idea of [35], L_f uses Appearance Flow to compare the cosine similarity (Sim(·)) between the features F'_{x,y} of missing regions in the input image and the Ground Truth features F; for more details, please refer to [35]. N is the number of elements in the set {M = 1}. L_Rd is the adversarial loss of the discriminator D_R, and β is used to adjust the weight between the losses.

Fig. 4. Histograms of the mean and variance distributions of benign and malignant nodule regions. This difference needs to be taken into account when sampling, and can be treated as prior knowledge for data augmentation.

B. Image Data Augmentation Strategy Overview

A binarized mask is often used as artificial image damage in image inpainting tasks. This is simple but does little for the diversity of generated samples. The strategy proposed in this paper increases the diversity of generated disease data and alleviates the imbalance between normal and disease data through Mask-Reconstruction.

1) Conditional Category-Sensitive Strategy: We first performed distribution statistics on the mean and variance of benign and malignant nodules and found certain differences, as shown in Fig. 4. So we use the mask defined in Equation (12) to integrate the prior constraints in a novel way.

M|{μ, σ} = M_{0−1} ⊗ {μ(I_T), σ(I_T)}    (12)

where M_{0−1} is a dual-channel 0−1 matrix, and μ(I_T), σ(I_T) represent the mean and variance within the nodule regions. ⊗ is the channel-wise multiplication operation. Therefore, the results generated by the model in Section III-A can be rewritten as:

S'_n|{μ, σ} = G_P(I × M|{μ, σ}, S_n × M|{μ, σ}, M|{μ, σ})    (13)

I'|{μ, σ} = G_R(I × M|{μ, σ}, S'|{μ, σ}, M|{μ, σ})    (14)
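Taken together, the pieces above can be assembled in a few lines. The plain-NumPy sketch below shows a mean-filtered structure S_n as in Eq. (2), a dual-channel conditional mask in the spirit of Eq. (12), and the channel-wise concatenation fed to the generator as in Eqs. (3) and (13). This is one plausible reading of the equations, with illustrative names throughout; the actual PTGAN generator is not reproduced here.

```python
import numpy as np

def mean_filter(img, k):
    """Eq. (2): k x k mean filter with edge padding so the output
    keeps the original size. A smaller k keeps more texture detail."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.empty_like(img, dtype=float)
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            out[y, x] = padded[y:y + k, x:x + k].mean()
    return out

def conditional_mask(m01, mean_val, var_val):
    """Eq. (12), one reading: the inverted binary mask (1 inside the
    missing region) scaled channel-wise by the target nodule mean and
    variance, giving a dual-channel conditional mask M|{mu, sigma}."""
    hole = 1.0 - m01                      # 1 where content is missing
    return np.stack([hole * mean_val, hole * var_val])

# Toy 8x8 "image" with a masked 4x4 region (0 = missing, 1 = background).
rng = np.random.default_rng(0)
img = rng.random((8, 8))
m01 = np.ones((8, 8))
m01[2:6, 2:6] = 0.0

s_n = mean_filter(img, k=3)               # coarse structure-texture fusion
m_cond = conditional_mask(m01, mean_val=0.4, var_val=0.02)

# Eqs. (3)/(13): the generator input is the channel-wise concatenation of
# the masked image, the masked structure, and the (conditional) mask.
gen_input = np.concatenate([(img * m01)[None], (s_n * m01)[None], m_cond])
```

In the progressive schedule, this filtering would be repeated with a shrinking kernel k_i, so the supervision S_i gains high-frequency texture stage by stage.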


The reason for using mean and variance is that these two
numerical features have practical meaning and can represent
the echo and tissue components of the nodule to a certain
extent. Therefore, in the sampling stage of data generation, the
diversity of new samples is achieved by changing the conditional
distribution of sampling. But it is also necessary to follow the
prior distribution of the data to ensure the rationality of the
feature distribution, as shown in (15)–(18), where μ(·) and σ(·)
are shown in Fig. 4.

Sample_B^μ = P(μ | I ∈ I_B) ∼ N(μ_bm, σ_bm)    (15)

Sample_B^σ = P(σ | I ∈ I_B) ∼ N(μ_bv, σ_bv)    (16)

Sample_M^μ = P(μ | I ∈ I_M) ∼ N(μ_mm, σ_mm)    (17)

Sample_M^σ = P(σ | I ∈ I_M) ∼ N(μ_mv, σ_mv)    (18)

Fig. 5. Histogram of our dataset. The solid boxes represent the real data, which is quite imbalanced. We use PTGAN to generate new samples to alleviate this issue, represented by the dashed boxes. The imbalance of malignant data is similar to that of benign data, so it is not shown in the figure.
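In code, the sampling in (15)–(18) amounts to drawing a target mean and variance for each new nodule from Gaussians fitted per category. A small sketch follows, with made-up distribution parameters standing in for the statistics of Fig. 4; all names and numbers here are illustrative, not the paper's fitted values.

```python
import numpy as np

# Hypothetical per-category parameters (mu, sigma) of the Gaussians in
# Eqs. (15)-(18); the real values would be fitted to the histograms of
# benign (B) and malignant (M) nodule regions as in Fig. 4.
PRIORS = {
    "B": {"mean": (0.45, 0.08), "var": (0.020, 0.005)},
    "M": {"mean": (0.30, 0.10), "var": (0.035, 0.008)},
}

def sample_condition(category, rng):
    """Draw a (mean, variance) sampling condition for one new nodule,
    following the prior distribution of the requested category."""
    p = PRIORS[category]
    mu = rng.normal(*p["mean"])      # Eq. (15) or (17)
    sigma = rng.normal(*p["var"])    # Eq. (16) or (18)
    return mu, max(sigma, 0.0)       # variance clamped to be non-negative

rng = np.random.default_rng(42)
conditions = [sample_condition("B", rng) for _ in range(1000)]
```

Each drawn pair would then parameterize the conditional mask of Eq. (12), so that diversity comes from varying the condition while the category prior keeps the feature distribution rational.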
2) Tissue Texture Reconstruction Strategy: The problem of translation between normal and disease data for thyroid ultrasound images can be reduced to the problem of repairing lesions in disease data. Specifically, the tissue region in the ultrasound image is repaired by (3) and (7), and the PTGAN is trained with random masks; the M in TTR is a simple 0−1 matrix. In this case, training the generator with images S_i that contain nodules will not make the model generate nodules. This is owing to the fact that: 1) Generally, for the translation from disease to normal data, most studies are based on the assumption [36] that a generator trained with normal data can remove lesions since it has not learned the distribution of lesions. The unique flexibility of ultrasound images makes this assumption invalid here, so for modeling data with inconsistent structure, the generation model can only learn a relatively uniform tissue texture. 2) After selecting data according to nodule size, the nodules cover a small region, so random masking results in the vast majority of missing regions being normal tissue. Although we could force the mask not to overlap with the nodules, we found that even without restricting the position of the mask, there is no negative impact on the generator.

IV. RESULTS

A. Dataset and Experimental Setup

This study has been approved by the Medical Ethics Committee of Tianjin Medical University Cancer Institute and Hospital. Written consent had been obtained from each patient after a full explanation of the purpose and nature of all procedures used.

The thyroid ultrasound image data in this paper were collected from Tianjin Medical University Cancer Institute and Hospital, including 8232 Benign Images (BI, Fig. 2(c)), 9966 Malignant Images (MI, Fig. 2(d)) and 404 Normal Images (NI, Fig. 2(b)), labeled by professional imaging physicians. We divide the dataset into different subsets to train and test PTGAN and the classifiers, as shown in Fig. 5(a). The training set of PTGAN contains 4000 BI and 6000 MI, and the validation set contains 2000 BI and 2000 MI. In addition, we choose VGG-19 [37] as the auxiliary diagnostic model. We take 285 NI, 1932 BI and 1666 MI as the training set of the normal-disease experiment, and 119 NI, 300 BI and 300 MI as the test set, as shown in Fig. 5(b). Since there is no imbalance between the benign and malignant data, we randomly select some data of a specific category to artificially create an imbalanced situation, as shown in Fig. 5(b). Firstly, part of the BI (700 training, 300 test) and the MI (3666 training, 300 test) are taken as one dataset for the diagnosis model. Secondly, part of the MI (800 training, 300 test) and the BI (3932 training, 300 test) are taken as another imbalanced dataset. Finally, the generated data is added to the respective training sets, as shown by the dashed boxes in Fig. 5(b).

In this paper, the input image size is 256 × 256 pixels, and the batch size is 6. The PTM and RM are trained separately and then jointly optimized. The kernel size k is updated whenever the training has been stable 5 times; that is, the hyperparameter n is set to 5. The Adam optimizer is used with a learning rate of 10^−4. Also, we set λ_p = λ_r = λ_c = 1, α to 0.8, and β to 0.85.

B. Normal-Disease Data Augmentation

The input images in this experiment are masked by random boxes, and the CCS is ignored. We first conduct a qualitative analysis to show the results of generating normal data from disease data. Then we perform a quantitative analysis to show the positive effect of PGAM on the classifier.

1) Qualitative Analysis: The generator can fit the distribution of tissue texture features in the flexible thyroid ultrasound image by inpainting random regions over the whole image. We remove specific modules from the model to prove their effectiveness. In Fig. 6, we show the real disease images (including benign and malignant) and the images generated by PTGAN, PTGAN w/o PTM, PTGAN w/o RM, and PTGAN w/o PTM&RM. To make comparison easier for readers, the orange boxes in Fig. 6 zoom in on the reconstructed lesion regions. It can be seen that the PTM can guide the network to generate more meaningful structure and texture; on the contrary, the structure and texture generated by the model without PTM are more chaotic. In addition, the RM makes the details of the generated image more realistic, including the speckle noise unique to ultrasound images.

We use the same training strategy to compare with the latest image inpainting methods [38], [39], as shown in the last two rows of Fig. 6. Obviously, our method has a superior effect on


Fig. 6. Generation results of normal data from disease data. The missing regions in the input are shown in white for visibility (their actual value is 0).


TABLE II
THE PERFORMANCE COMPARISON BETWEEN TA AND PTGAM ON VGG-19
FOR NORMAL-DISEASE DATA IMBALANCE

Fig. 7. A local zoom-in of an image during the training of inpainting by PTGAN, including progressive texture and refining performance.

TABLE I
THE OBJECTIVE ASSESSMENT ON PTGAN AND OTHER GANS USING
PSNR AND SSIM METRIC

lesion reconstruction. For the case where the mask is connected to the hypoechoic region of the artery (the first and fourth columns), the existing methods cannot reconstruct the destroyed vessel wall. However, our method can reconstruct the masked vessel wall while effectively restoring the normal tissue texture.

In addition, Fig. 7 shows the reconstruction process of an image with masked skin texture in our method. The first row, from left to right, shows the step-by-step reconstruction of the structure and texture. The second row, from right to left, shows the effect of the Refining Module on the optimization of the details in the ultrasound image.

2) Quantitative Analysis: Due to the lack of matched disease-normal pairs, we take random boxes to inpaint the images in the validation set when measuring the effect of PTGAN quantitatively, comparing the PSNR and SSIM of the various methods, as shown in Table I. It can be seen that our method achieves the best results for medical image inpainting. Without the help of PTM and RM, our method is comparable to RFR, but in terms of visual observation, our method is much closer to the real image. In addition, it is reasonable that the improvement of RM on the metrics is better than that of PTM, because the supervision of the PTM is a structure of the image obtained by mean filtering, while the metrics measure the difference from the original image.

In addition, we performed experiments on data augmentation and compared with TA, as shown in Table II. We use the generated data to balance the training set to a close number, as shown in Fig. 5(b). It is worth noting that the disease data used to generate the new data were derived from the training set, i.e., no additional samples were introduced. Compared with traditional methods, our method gives the auxiliary diagnosis model higher accuracy and precision. What's more, optimal results can be achieved when our method is combined with traditional augmentation methods.

Fig. 8. Generation results of disease data from disease data. Please see text for analysis.

C. Disease Data Augmentation

In this experiment, input images are masked by fixed boxes that conceal the nodules. As in Section IV-B, this section also presents the experimental results from both qualitative and quantitative aspects.

1) Qualitative Analysis: We performed ablation experiments to prove the effect of image generation by the PTM and the benefit of CCS for the diversity of generated samples, with PTGAN, PTGAN w/o PTM, PTGAN w/o CCS, and PTGAN w/o PTM&CCS, respectively. The results of each experiment are shown in Fig. 8. Interestingly, we found that the PTM can guide the generator to generate nodules with an oval shape. In contrast, in the absence of the PTM, the generated nodules have a bias towards the shape of the mask (rectangular), as shown


TABLE IV
THE PERFORMANCE COMPARISON BETWEEN TA AND PTGAM ON VGG-19
FOR MALIGNANT DATA IMBALANCE

Fig. 9. The diversity of our method under different mean and variance sampling conditions.

TABLE III
THE PERFORMANCE COMPARISON BETWEEN TA AND PTGAM ON VGG-19
FOR BENIGN DATA IMBALANCE

Fig. 10. Performance of data augmentation methods under different degrees of data imbalance.

in the second column of Fig. 8. Without the constraint of PTM, the model will be confused about the reconstruction of structure and texture, as shown in the third column of Fig. 8. When the mask covers part of the skin, only PTM can make the model reconstruct the nodules under the skin tissue. Without PTM, skin texture is instead regenerated underneath the skin tissue.

In addition, CCS enables the model to obtain the maximum likelihood estimation under the conditional probability. Since the visual effect of the generated results cannot be seen clearly in Fig. 8, we show the diversity under different sampling conditions in Fig. 9. The echo intensity of the generated nodule decreases as the mean of the sampling distribution increases, and the solid component in the generated nodule gradually increases with its variance. Therefore, we reconstruct diverse and reasonably distributed new data by sampling different mean-variance distributions for benign and malignant nodules.

2) Quantitative Analysis: To prove that our method can alleviate the data imbalance among different diseases, we select a small part of the benign data and all malignant data to perform an experiment. New benign images are then generated from randomly selected samples. As before, no additional samples were brought in throughout the process. The experimental results are shown in Table III, where IC denotes the Imbalance Category, i.e., the scarce class in the dataset. In addition, we also carry out a data augmentation experiment on malignant data, making the amount of data roughly the same as in the benign augmentation setting. As Table IV shows, our method has equal augmentation effects for each category. NPV in Table IV denotes the Negative Predictive Value, TN/(TN + FN).

In the case of scarce benign data, the model tends to fall into a local optimum biased toward the malignant category. Therefore, we choose the False Positive Rate (FPR) and Precision to measure the different augmentation methods. It can be seen that our method greatly reduces false positives. Conversely, when malignant data is scarce, we choose the False Negative Rate (FNR) and NPV. As a result, the highest accuracy is obtained when our method is combined with TA. Moreover, in terms of FPR and FNR, our method minimizes the undesired effects of data bias.

In addition, to prove the ability of our method to address the data imbalance problem, we performed several experiments with different degrees of scarcity under malignant data imbalance. Specifically, experiments were performed by limiting the number of malignant images in the training set to 800, 900, 1000, 1100, and 1200, respectively, with sufficient benign data; the results are shown in Fig. 10. The orange dashed line in the figure shows the accuracy obtained by the classifier without artificial removal (3932 BI, 3666 MI). As can be seen, our method consistently outperforms TA, and once the proportion of malignant data reaches a certain extent (1200:3932), the combination of the two augmentation methods enables the classifier to outperform the model trained on real and balanced data.

In summary, the Progressive Generative Adversarial Method is a sample generation method for data augmentation that preserves structure in ultrasound images. It has not only produced superior augmentation results on thyroid ultrasound images, but also applies equally to other medical imaging datasets such as BUSI [8]. Our method surpassed the augmentation of [8] by 85.47% while generating only 6356 images


([8] generated 5000 images per category, generating 15 000 images in total).

Fig. 11. The segmentation results of our method. The red color in the third column represents a larger difference and the blue color represents a smaller one. GT is outlined with a solid orange line.

TABLE V
WEAKLY SUPERVISED SEGMENTATION COMPARISON USING IOU AND DICE METRICS
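The metrics reported in Tables III–V all follow standard definitions: Precision, FPR, FNR, and NPV derive from the binary confusion matrix (with NPV = TN/(TN + FN) as noted above), while IoU and Dice measure overlap between binary masks. The following is a minimal sketch of these formulas, not the authors' evaluation code:

```python
import numpy as np

def classification_metrics(y_true, y_pred):
    """Precision, FPR, FNR, and NPV for binary labels (1 = positive class)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return {
        "precision": tp / (tp + fp),
        "fpr": fp / (fp + tn),   # False Positive Rate
        "fnr": fn / (fn + tp),   # False Negative Rate
        "npv": tn / (tn + fn),   # Negative Predictive Value, TN/(TN + FN)
    }

def iou_dice(pred_mask, gt_mask):
    """IoU and Dice between two binary masks, e.g., a thresholded
    difference map and the ground-truth lesion region."""
    p, g = np.asarray(pred_mask, bool), np.asarray(gt_mask, bool)
    inter = np.logical_and(p, g).sum()
    union = np.logical_or(p, g).sum()
    return inter / union, 2 * inter / (p.sum() + g.sum())
```

For example, `classification_metrics([1, 1, 0, 0], [1, 0, 0, 1])` yields 0.5 for all four metrics, and a predicted mask covering twice the area of a fully contained ground truth gives IoU 0.5 and Dice 2/3.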

D. Weakly Supervised Segmentation


Besides its data augmentation ability, our method makes an additional contribution: completing the task of weakly supervised segmentation of the lesion with strict point annotation. As the lesion in the disease image is inpainted to normal tissue, the absolute value of the difference between the original image and the generated image can represent the abnormality of each pixel as a lesion; this is transformed into a binary prediction by a threshold of 0.1, as shown in Fig. 11. In addition, we compared IoU and Dice with other weakly supervised segmentation methods, showing that our method performs better, as shown in Table V. Among them, CAM [40] denotes the method that generates seeds in the weakly supervised segmentation task, while CIAN [41] is a retrained method based on the optimized CAM. The threshold for both methods is set to 0.6. Although the weakly supervised annotation in our method is relatively strict, this kind of annotation is necessary in ultrasound examinations and requires no additional labeling work.

V. CONCLUSION

As an exploratory study on data augmentation methods for structurally inadequate medical images, this paper proposes a Progressive Generative Adversarial Method for Data Augmentation, including a Progressive Texture Generative Adversarial Network and an Image Data Augmentation Strategy based on Mask-Reconstruction. Taking structurally inadequate thyroid ultrasound images as the research object, the image generation method based on mask-reconstruction generates realistic normal data and benign-malignant data with legitimate structure. At the same time, it takes the diversity and rationality of the generated samples into account, achieving excellent performance in the authenticity of image generation and the effect


of data augmentation. The whole method makes full use of the point annotation and image-level information available during ultrasonic examination, without any extra annotation work.

Nonetheless, there are aspects of this method that can be optimized, for example, combining PTM with the loss function to free the whole process from the limits of hyperparameters. In addition, a limitation of this method is that it relies on the original structural information in the image, that is, the lesion/nodule in the image cannot be too large. Therefore, how to fully exploit the overly flexible structural information in structurally inadequate images is the next urgent problem.

REFERENCES

[1] Y. Gong et al., "Fetal congenital heart disease echocardiogram screening based on DGACNN: Adversarial one-class classification combined with video transfer learning," IEEE Trans. Med. Imag., vol. 39, no. 4, pp. 1206–1222, Apr. 2020.
[2] N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, "SMOTE: Synthetic minority over-sampling technique," J. Artif. Intell. Res., vol. 16, pp. 321–357, 2002.
[3] H. Inoue, "Data augmentation by pairing samples for images classification," 2018, arXiv:1801.02929.
[4] H. Zhang, M. Cissé, Y. N. Dauphin, and D. Lopez-Paz, "Mixup: Beyond empirical risk minimization," 2017, arXiv:1710.09412.
[5] L. Sun, J. Wang, Y. Huang, X. Ding, H. Greenspan, and J. W. Paisley, "An adversarial learning approach to medical image synthesis for lesion detection," IEEE J. Biomed. Health Informat., vol. 24, no. 8, pp. 2303–2314, Aug. 2020.
[6] B. Bozorgtabar et al., "Informative sample generation using class aware generative adversarial networks for classification of chest X-rays," Comput. Vis. Image Understanding, vol. 184, pp. 57–65, 2019.
[7] D. Zhao, D. Zhu, J. Lu, Y. Luo, and G. Zhang, "Synthetic medical images using F&BGAN for improved lung nodules classification by multi-scale VGG16," Symmetry, vol. 10, no. 10, pp. 1–16, 2018.
[8] W. Al-Dhabyani, M. Gomaa, H. Khaled, and A. Fahmy, "Deep learning approaches for data augmentation and classification of breast masses using ultrasound images," Int. J. Adv. Comput. Sci. Appl., vol. 10, no. 5, pp. 618–627, 2019.
[9] S. Pandey, P. R. Singh, and J. Tian, "An image augmentation approach using two-stage generative adversarial network for nuclei image segmentation," Biomed. Signal Process. Control, vol. 57, 2020, Art. no. 101782.
[10] Z. Qin, Z. Liu, P. Zhu, and Y. Xue, "A GAN-based image synthesis method for skin lesion classification," Comput. Meth. Programs Biomed., vol. 195, 2020, Art. no. 105568.
[11] M. Rezaei, T. Uemura, J. Nappi, H. Yoshida, and C. Meinel, "Generative synthetic adversarial network for internal bias correction and handling class imbalance problem in medical image diagnosis," in Proc. Med. Imag., Comput.-Aided Diagnosis, 2020, pp. 113140E-1–113140E-8.
[12] P. Ganesan, S. Rajaraman, L. R. Long, B. Ghoraani, and S. K. Antani, "Assessment of data augmentation strategies toward performance improvement of abnormality classification in chest radiographs," in Proc. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc., 2019, pp. 841–844.
[13] S. Guan and M. H. Loew, "Breast cancer detection using synthetic mammograms from generative adversarial networks in convolutional neural networks," in Proc. 14th Int. Workshop Breast Imag., vol. 10718, 2018, pp. 1–9, Art. no. 107180X.
[14] C. Han et al., "Combining noise-to-image and image-to-image GANs: Brain MR image augmentation for tumor detection," IEEE Access, vol. 7, pp. 156966–156977, 2019.
[15] V. Bhagat and S. Bhaumik, "Data augmentation using generative adversarial networks for pneumonia classification in chest X-rays," in Proc. 5th Int. Conf. Image Inf. Process., 2019, pp. 574–579.
[16] J. Liu, C. Shen, T. Liu, N. Aguilera, and J. Tam, "Active appearance model induced generative adversarial network for controlled data augmentation," in Proc. Med. Image Comput. Comput. Assist. Interv., Part I, vol. 11764, 2019, pp. 201–208.
[17] Y. Tang, S. Oh, Y. Tang, J. Xiao, and R. M. Summers, "CT-realistic data augmentation using generative adversarial network for robust lymph node segmentation," in Proc. Med. Imag., Comput.-Aided Diagnosis, vol. 10950, 2019, Art. no. 109503V.
[18] T. Kanayama et al., "Gastric cancer detection from endoscopic images using synthesis by GAN," in Proc. Med. Image Comput. Comput. Assist. Interv., Part V, vol. 11768, 2019, pp. 530–538.
[19] D. Pathak, P. Krähenbühl, J. Donahue, T. Darrell, and A. A. Efros, "Context encoders: Feature learning by inpainting," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2016, pp. 2536–2544.
[20] R. A. Yeh, C. Chen, T. Lim, A. G. Schwing, M. Hasegawa-Johnson, and M. N. Do, "Semantic image inpainting with deep generative models," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2017, pp. 6882–6890.
[21] S. Lee, N. U. Islam, and S. Lee, "Robust image completion and masking with application to robotic bin picking," Robot. Auton. Syst., vol. 131, 2020, Art. no. 103563.
[22] S. Iizuka, E. Simo-Serra, and H. Ishikawa, "Globally and locally consistent image completion," ACM Trans. Graph., vol. 36, no. 4, pp. 107:1–107:14, 2017.
[23] K. Armanious, Y. Mecky, S. Gatidis, and B. Yang, "Adversarial inpainting of medical image modalities," in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., 2019, pp. 3267–3271.
[24] C. Yang, X. Lu, Z. Lin, E. Shechtman, O. Wang, and H. Li, "High-resolution image inpainting using multi-scale neural patch synthesis," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2017, pp. 4076–4084.
[25] M. Sagong, Y. Shin, S. Kim, S. Park, and S. Ko, "PEPSI: Fast image inpainting with parallel decoding network," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2019, pp. 11360–11368.
[26] Y.-G. Shin, M.-C. Sagong, Y.-J. Yeo, S.-W. Kim, and S.-J. Ko, "PEPSI++: Fast and lightweight network for image inpainting," IEEE Trans. Neural Netw. Learn. Syst., vol. 32, no. 1, pp. 252–265, Jan. 2021.
[27] D. Zhao, B. Guo, and Y. Yan, "Parallel image completion with edge and color map," Appl. Sci.-Basel, vol. 9, no. 18, pp. 1–29, Sep. 2019.
[28] K. Nazeri, E. Ng, T. Joseph, F. Z. Qureshi, and M. Ebrahimi, "EdgeConnect: Generative image inpainting with adversarial edge learning," 2019, arXiv:1901.00212.
[29] Y. Chai, B. Xu, K. Zhang, N. Leporé, and J. C. Wood, "MRI restoration using edge-guided adversarial learning," IEEE Access, vol. 8, pp. 83858–83870, 2020.
[30] W. Xiong et al., "Foreground-aware image inpainting," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2019, pp. 5840–5848.
[31] Y. Ren, X. Yu, R. Zhang, T. H. Li, S. Liu, and G. Li, "StructureFlow: Image inpainting via structure-aware appearance flow," in Proc. IEEE Int. Conf. Comput. Vis., 2019, pp. 181–190.
[32] T. Sun, W. Fang, W. Chen, Y. Yao, F. Bi, and B. Wu, "High-resolution image inpainting based on multi-scale neural network," Electronics, vol. 8, no. 11, pp. 1370-1–1370-17, Nov. 2019.
[33] Y. Chen and H. Hu, "An improved method for semantic image inpainting with GANs: Progressive inpainting," Neural Process. Lett., vol. 49, no. 3, pp. 1355–1367, 2019.
[34] J. Yu, Z. Lin, J. Yang, X. Shen, X. Lu, and T. S. Huang, "Generative image inpainting with contextual attention," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2018, pp. 5505–5514.
[35] T. Zhou, S. Tulsiani, W. Sun, J. Malik, and A. A. Efros, "View synthesis by appearance flow," in Proc. Eur. Conf. Comput. Vis., Part IV, vol. 9908, 2016, pp. 286–301.
[36] J. P. Cohen, M. Luck, and S. Honari, "Distribution matching losses can hallucinate features in medical image translation," in Proc. Med. Image Comput. Comput. Assist. Interv., Part I, vol. 11070, 2018, pp. 529–536.
[37] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," 2014, arXiv:1409.1556.
[38] Y. Zeng, J. Fu, H. Chao, and B. Guo, "Learning pyramid-context encoder network for high-quality image inpainting," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2019, pp. 1486–1494.
[39] J. Li, N. Wang, L. Zhang, B. Du, and D. Tao, "Recurrent feature reasoning for image inpainting," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2020, pp. 7757–7765.
[40] A. Kolesnikov and C. H. Lampert, "Seed, expand and constrain: Three principles for weakly-supervised image segmentation," in Proc. Eur. Conf. Comput. Vis., vol. 9908, 2016, pp. 695–711.
[41] J. Fan, Z. Zhang, T. Tan, C. Song, and J. Xiao, "CIAN: Cross-image affinity net for weakly supervised semantic segmentation," in Proc. AAAI Conf. Artif. Intell., 2020, pp. 10762–10769.
