A Progressive Generative Adversarial Method For Structurally Inadequate Medical Image Data Augmentation
Fig. 3. Overview of our Progressive Adversarial Method, comprising one network and two strategies. First, different strategies are applied to the data to accommodate different augmentation tasks: the Conditional Category-Sensitive Strategy adjusts the mask according to the feature distribution of the specific category, while the Tissue Texture Reconstruction Strategy enables the model to learn the tissue texture features of the ultrasound image. The Progressive Texture Module then enhances the model's ability to infer structure and texture, and the final Refining Module completes the refinement of the reconstructed images.
rough result; d) other optimization strategies [25], [26]. First, [27]–[29] generated the edges and colors of the image in a first step and integrated that information in a second step to complete the inpainting task; [30] also performed contour completion, but used additional saliency information. Second, [31] repaired the overall structure of the image in the first stage and the details in the second stage, while [32] proposed a content-plus-texture inpainting method. Third, [33] proposed a gradual inpainting process from low resolution to high, and [34] introduced a two-stage coarse-to-fine network architecture in which the first network makes an initial coarse prediction and the second network takes that prediction as input and produces the refined result.

However, although reconstructing the structure and the texture of an image separately can improve the reconstruction performance of a generative model, the implicit association and the progressive relationship between these elements are ignored. This paper therefore strengthens the incremental relationship between image structure and texture in a progressive way.

III. METHODS

One of the main challenges of medical image data augmentation is to maintain the structural legitimacy of generated structurally inadequate images. This paper proposes a Progressive Generative Adversarial Method for Structurally Inadequate Medical Image Data Augmentation, consisting of a Progressive Texture Generative Adversarial Network and an Image Data Augmentation Strategy based on Mask-Reconstruction. Our method uses only image-level annotations and point annotations; it can alleviate the data imbalance between categories, maintain the structural legitimacy of the generated data, and increase the diversity of disease data in an interpretable way. The flowchart of our method is shown in Fig. 3. The PTGAN comprises a Progressive Texture Module (PTM), a Refining Module (RM), and the loss function. The MR-IDAS comprises a Conditional Category-Sensitive Strategy (CCS) for disease data and a Tissue Texture Reconstruction Strategy (TTR) for normal-disease data.

A. Progressive Texture GAN Overview

In this paper, we argue that methods that completely separate the structure of an image from its texture during generation ignore the implicit association between the two, whereas their relationship should progress from simple to complex. A core idea of this method is therefore to combine the reconstruction of structure and texture into a single process of progressive texture learning. The loss function of the entire model is composed of the losses of each module:

L = λ_p L_P + λ_r L_R + λ_c L_C    (1)

where L_C is the cross-entropy loss, which penalizes the generator for producing the wrong disease category when disease data is generated; L_P and L_R are the losses of the PTM and RM, respectively; and the λ_* are the weights of the individual losses. We break down the other loss terms below.
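As a minimal illustration, the total objective in (1) can be assembled as in the PyTorch-style sketch below. The PTM and RM losses are assumed to be computed elsewhere, and the λ values shown are placeholders rather than the tuned weights used in our experiments:

    import torch
    import torch.nn.functional as F

    def total_loss(loss_ptm: torch.Tensor,
                   loss_rm: torch.Tensor,
                   category_logits: torch.Tensor,
                   category_labels: torch.Tensor,
                   lam_p: float = 1.0,
                   lam_r: float = 1.0,
                   lam_c: float = 1.0) -> torch.Tensor:
        # L_C: cross-entropy penalty on the predicted category of
        # generated disease data.
        loss_c = F.cross_entropy(category_logits, category_labels)
        # Eq. (1): weighted sum of the PTM, RM, and category losses.
        return lam_p * loss_ptm + lam_r * loss_rm + lam_c * loss_c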
The reason for using the mean and variance is that these two numerical features have practical meaning: to a certain extent they represent the echo and the tissue composition of the nodule. In the sampling stage of data generation, the diversity of new samples is therefore achieved by changing the conditional distribution from which we sample. At the same time, it is necessary to follow the prior distribution of the data to keep the feature distribution plausible, as shown in (15)–(18), where μ(·) and σ(·) are shown in Fig. 4.
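To make the sampling step concrete, the sketch below draws a (mean, variance) condition per category from Gaussian priors. The numerical priors are illustrative assumptions only; in our method, μ(·) and σ(·) are estimated from the data as in Fig. 4 and (15)–(18):

    import numpy as np

    rng = np.random.default_rng(seed=0)

    # Hypothetical per-category priors over the nodule-region mean
    # (echo intensity) and standard deviation (tissue composition),
    # given as (prior mean, prior spread) pairs on a [0, 1] scale.
    PRIORS = {
        "benign":    {"mu": (0.35, 0.05), "sigma": (0.10, 0.02)},
        "malignant": {"mu": (0.20, 0.05), "sigma": (0.15, 0.03)},
    }

    def sample_condition(category: str) -> tuple:
        # Sampling around the category prior varies echo (mean) and
        # solidity (variance) while keeping the features plausible.
        p = PRIORS[category]
        mu = rng.normal(*p["mu"])
        sigma = abs(rng.normal(*p["sigma"]))  # std must be non-negative
        return mu, sigma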
Fig. 6. Generation results of normal data from disease data. The missing regions of the input are rendered in white for visibility (their actual value is 0).
TABLE I
THE OBJECTIVE ASSESSMENT OF PTGAN AND OTHER GANS USING THE PSNR AND SSIM METRICS

TABLE II
THE PERFORMANCE COMPARISON BETWEEN TA AND PTGAM ON VGG-19 FOR NORMAL-DISEASE DATA IMBALANCE
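For reference, the two quality metrics in Table I can be computed in the standard way, e.g., with scikit-image; this sketch assumes grayscale images scaled to [0, 1]:

    from skimage.metrics import peak_signal_noise_ratio, structural_similarity

    def assess_pair(real, generated):
        # Both metrics compare a generated image against its real
        # counterpart; data_range must match the intensity scale.
        psnr = peak_signal_noise_ratio(real, generated, data_range=1.0)
        ssim = structural_similarity(real, generated, data_range=1.0)
        return psnr, ssim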
TABLE III
THE PERFORMANCE COMPARISON BETWEEN TA AND PTGAM ON VGG-19 FOR BENIGN DATA IMBALANCE

TABLE IV
THE PERFORMANCE COMPARISON BETWEEN TA AND PTGAM ON VGG-19 FOR MALIGNANT DATA IMBALANCE

Fig. 9. The diversity of our method under different means and variances of the sampling conditions.
in the second column of Fig. 8. Without the constraint of the PTM, the model confuses the reconstruction of structure with that of texture, as shown in the third column of Fig. 8. When the mask covers part of the skin, only the PTM enables the model to reconstruct the nodule beneath the skin tissue; without the PTM, skin texture is simply regenerated underneath the skin.

In addition, the CCS enables the model to obtain the maximum likelihood estimate under the conditional probability. Since this visual effect is hard to see in Fig. 8, we show the diversity obtained under different sampling conditions in Fig. 9. The echo intensity of the generated nodule decreases as the mean of the sampling condition increases, and the solid component of the generated nodule grows as the variance increases. We can therefore reconstruct diverse and plausibly distributed new data by sampling different mean-variance distributions for benign and malignant nodules.

2) Quantitative Analysis: To show that our method can alleviate the data imbalance between different diseases, we select a small part of the benign data and all of the malignant data for this experiment. New benign images are then generated from randomly selected samples; as before, no additional real samples are introduced at any point. The results are shown in Table III, where IC denotes the Imbalance Category, i.e., the scarce class in the dataset. We also carry out a data augmentation experiment on malignant data, keeping the amount of data roughly the same as in the benign setting. As Table IV shows, our method has an equally strong augmentation effect for each category. NPV in Table IV denotes the Negative Predictive Value, TN/(TN + FN).

When benign data is scarce, the model tends to fall into a local optimum biased toward the malignant category. We therefore choose the False Positive Rate (FPR) and Precision to compare the augmentation methods; our method greatly reduces false positives. Conversely, when malignant data is scarce, we report the False Negative Rate (FNR) and NPV. The highest accuracy is obtained when our method is combined with TA, but in terms of FPR and FNR our method minimizes the undesired effects of data bias.

In addition, to demonstrate the ability of our method to address the data imbalance problem, we performed several experiments with different degrees of malignant-data scarcity. Specifically, the number of malignant images in the training set was limited to 800, 900, 1000, 1100, and 1200, respectively, with sufficient benign data; the results are shown in Fig. 10. The orange dashed line in the figure shows the accuracy obtained by the classifier without any artificial removal (3932 benign images, 3666 malignant images). As can be seen, our method consistently outperforms TA, and once the proportion of malignant data reaches a certain level (1200:3932), the combination of the two augmentation methods enables the classifier to outperform a model trained on real, balanced data.

In summary, the Progressive Generative Adversarial Method is a sample generation method for data augmentation that preserves structure in ultrasound images. It not only produces superior augmentation results on thyroid ultrasound images but also applies equally to other medical imaging datasets such as BUSI [8]. Our method surpassed the augmentation of [8] by 85.47% while generating only 6356 augmented images.
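For clarity, the imbalance metrics reported above follow their standard confusion-matrix definitions, as in the sketch below (which treats the malignant class as "positive", our reading of the tables):

    def imbalance_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
        # Standard confusion-matrix ratios; "positive" = malignant.
        return {
            "precision": tp / (tp + fp),  # reported when benign data is scarce
            "fpr": fp / (fp + tn),        # false positive rate
            "fnr": fn / (fn + tp),        # false negative rate
            "npv": tn / (tn + fn),        # negative predictive value (Table IV)
        }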
Fig. 11. The segmentation results of our method. In the third column, red indicates a larger difference and blue a smaller one. The ground truth (GT) is outlined with a solid orange line.
The whole method makes full use of the point annotations and image-level information produced during the ultrasonic examination, without any extra annotation work.

Nonetheless, some aspects of the method can still be optimized; for example, the PTM and the loss function could be combined so that the whole process is freed from its dependence on hyperparameters. The method is also limited by its reliance on the original structural information in the image; that is, the lesion/nodule in the image cannot be too large. How to fully exploit the highly flexible structural information in structurally inadequate images is therefore the next problem to be addressed.

REFERENCES

[1] Y. Gong et al., "Fetal congenital heart disease echocardiogram screening based on DGACNN: Adversarial one-class classification combined with video transfer learning," IEEE Trans. Med. Imag., vol. 39, no. 4, pp. 1206–1222, Apr. 2020.
[2] N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, "SMOTE: Synthetic minority over-sampling technique," J. Artif. Intell. Res., vol. 16, pp. 321–357, 2002.
[3] H. Inoue, "Data augmentation by pairing samples for images classification," 2018, arXiv:1801.02929.
[4] H. Zhang, M. Cissé, Y. N. Dauphin, and D. Lopez-Paz, "Mixup: Beyond empirical risk minimization," 2017, arXiv:1710.09412.
[5] L. Sun, J. Wang, Y. Huang, X. Ding, H. Greenspan, and J. W. Paisley, "An adversarial learning approach to medical image synthesis for lesion detection," IEEE J. Biomed. Health Informat., vol. 24, no. 8, pp. 2303–2314, Aug. 2020.
[6] B. Bozorgtabar et al., "Informative sample generation using class aware generative adversarial networks for classification of chest X-rays," Comput. Vis. Image Understanding, vol. 184, pp. 57–65, 2019.
[7] D. Zhao, D. Zhu, J. Lu, Y. Luo, and G. Zhang, "Synthetic medical images using F&BGAN for improved lung nodules classification by multi-scale VGG16," Symmetry, vol. 10, no. 10, pp. 1–16, 2018.
[8] W. Al-Dhabyani, M. Gomaa, H. Khaled, and A. Fahmy, "Deep learning approaches for data augmentation and classification of breast masses using ultrasound images," Int. J. Adv. Comput. Sci. Appl., vol. 10, no. 5, pp. 618–627, 2019.
[9] S. Pandey, P. R. Singh, and J. Tian, "An image augmentation approach using two-stage generative adversarial network for nuclei image segmentation," Biomed. Signal Process. Control, vol. 57, 2020, Art. no. 101782.
[10] Z. Qin, Z. Liu, P. Zhu, and Y. Xue, "A GAN-based image synthesis method for skin lesion classification," Comput. Meth. Programs Biomed., vol. 195, 2020, Art. no. 105568.
[11] M. Rezaei, T. Uemura, J. Nappi, H. Yoshida, and C. Meinel, "Generative synthetic adversarial network for internal bias correction and handling class imbalance problem in medical image diagnosis," in Proc. Med. Imag.: Comput.-Aided Diagnosis, 2020, Art. no. 113140E.
[12] P. Ganesan, S. Rajaraman, L. R. Long, B. Ghoraani, and S. K. Antani, "Assessment of data augmentation strategies toward performance improvement of abnormality classification in chest radiographs," in Proc. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc., 2019, pp. 841–844.
[13] S. Guan and M. H. Loew, "Breast cancer detection using synthetic mammograms from generative adversarial networks in convolutional neural networks," in Proc. 14th Int. Workshop Breast Imag., vol. 10718, 2018, Art. no. 107180X.
[14] C. Han et al., "Combining noise-to-image and image-to-image GANs: Brain MR image augmentation for tumor detection," IEEE Access, vol. 7, pp. 156966–156977, 2019.
[15] V. Bhagat and S. Bhaumik, "Data augmentation using generative adversarial networks for pneumonia classification in chest X-rays," in Proc. 5th Int. Conf. Image Inf. Process., 2019, pp. 574–579.
[16] J. Liu, C. Shen, T. Liu, N. Aguilera, and J. Tam, "Active appearance model induced generative adversarial network for controlled data augmentation," in Proc. Med. Image Comput. Comput.-Assist. Interv., Part I, vol. 11764, 2019, pp. 201–208.
[17] Y. Tang, S. Oh, Y. Tang, J. Xiao, and R. M. Summers, "CT-realistic data augmentation using generative adversarial network for robust lymph node segmentation," in Proc. Med. Imag.: Comput.-Aided Diagnosis, vol. 10950, 2019, Art. no. 109503V.
[18] T. Kanayama et al., "Gastric cancer detection from endoscopic images using synthesis by GAN," in Proc. Med. Image Comput. Comput.-Assist. Interv., Part V, vol. 11768, 2019, pp. 530–538.
[19] D. Pathak, P. Krähenbühl, J. Donahue, T. Darrell, and A. A. Efros, "Context encoders: Feature learning by inpainting," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2016, pp. 2536–2544.
[20] R. A. Yeh, C. Chen, T. Lim, A. G. Schwing, M. Hasegawa-Johnson, and M. N. Do, "Semantic image inpainting with deep generative models," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2017, pp. 6882–6890.
[21] S. Lee, N. U. Islam, and S. Lee, "Robust image completion and masking with application to robotic bin picking," Robot. Auton. Syst., vol. 131, 2020, Art. no. 103563.
[22] S. Iizuka, E. Simo-Serra, and H. Ishikawa, "Globally and locally consistent image completion," ACM Trans. Graph., vol. 36, no. 4, Art. no. 107, 2017.
[23] K. Armanious, Y. Mecky, S. Gatidis, and B. Yang, "Adversarial inpainting of medical image modalities," in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., 2019, pp. 3267–3271.
[24] C. Yang, X. Lu, Z. Lin, E. Shechtman, O. Wang, and H. Li, "High-resolution image inpainting using multi-scale neural patch synthesis," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2017, pp. 4076–4084.
[25] M. Sagong, Y. Shin, S. Kim, S. Park, and S. Ko, "PEPSI: Fast image inpainting with parallel decoding network," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2019, pp. 11360–11368.
[26] Y.-G. Shin, M.-C. Sagong, Y.-J. Yeo, S.-W. Kim, and S.-J. Ko, "PEPSI++: Fast and lightweight network for image inpainting," IEEE Trans. Neural Netw. Learn. Syst., vol. 32, no. 1, pp. 252–265, Jan. 2021.
[27] D. Zhao, B. Guo, and Y. Yan, "Parallel image completion with edge and color map," Appl. Sci., vol. 9, no. 18, pp. 1–29, Sep. 2019.
[28] K. Nazeri, E. Ng, T. Joseph, F. Z. Qureshi, and M. Ebrahimi, "EdgeConnect: Generative image inpainting with adversarial edge learning," 2019, arXiv:1901.00212.
[29] Y. Chai, B. Xu, K. Zhang, N. Leporé, and J. C. Wood, "MRI restoration using edge-guided adversarial learning," IEEE Access, vol. 8, pp. 83858–83870, 2020.
[30] W. Xiong et al., "Foreground-aware image inpainting," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2019, pp. 5840–5848.
[31] Y. Ren, X. Yu, R. Zhang, T. H. Li, S. Liu, and G. Li, "StructureFlow: Image inpainting via structure-aware appearance flow," in Proc. IEEE Int. Conf. Comput. Vis., 2019, pp. 181–190.
[32] T. Sun, W. Fang, W. Chen, Y. Yao, F. Bi, and B. Wu, "High-resolution image inpainting based on multi-scale neural network," Electronics, vol. 8, no. 11, Art. no. 1370, Nov. 2019.
[33] Y. Chen and H. Hu, "An improved method for semantic image inpainting with GANs: Progressive inpainting," Neural Process. Lett., vol. 49, no. 3, pp. 1355–1367, 2019.
[34] J. Yu, Z. Lin, J. Yang, X. Shen, X. Lu, and T. S. Huang, "Generative image inpainting with contextual attention," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2018, pp. 5505–5514.
[35] T. Zhou, S. Tulsiani, W. Sun, J. Malik, and A. A. Efros, "View synthesis by appearance flow," in Proc. Eur. Conf. Comput. Vis., Part IV, vol. 9908, 2016, pp. 286–301.
[36] J. P. Cohen, M. Luck, and S. Honari, "Distribution matching losses can hallucinate features in medical image translation," in Proc. Med. Image Comput. Comput.-Assist. Interv., Part I, vol. 11070, 2018, pp. 529–536.
[37] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," 2014, arXiv:1409.1556.
[38] Y. Zeng, J. Fu, H. Chao, and B. Guo, "Learning pyramid-context encoder network for high-quality image inpainting," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2019, pp. 1486–1494.
[39] J. Li, N. Wang, L. Zhang, B. Du, and D. Tao, "Recurrent feature reasoning for image inpainting," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2020, pp. 7757–7765.
[40] A. Kolesnikov and C. H. Lampert, "Seed, expand and constrain: Three principles for weakly-supervised image segmentation," in Proc. Eur. Conf. Comput. Vis., vol. 9908, 2016, pp. 695–711.
[41] J. Fan, Z. Zhang, T. Tan, C. Song, and J. Xiao, "CIAN: Cross-image affinity net for weakly supervised semantic segmentation," in Proc. AAAI Conf. Artif. Intell., 2020, pp. 10762–10769.