Transfer Learning Scenarios on Deep Learning for Ultrasound-Based Image Segmentation
Corresponding Author:
Nur Iriawan
Department of Statistics, Faculty of Science and Data Analytics, Institut Teknologi Sepuluh Nopember
Kampus ITS Sukolilo, Surabaya, Indonesia
Email: [email protected]
1. INTRODUCTION
Image segmentation is a crucial task in image and video processing. This process involves dividing
the image into multiple segments or objects by assigning a class label to each pixel [1]. Its applications are
widespread and encompass medical imaging [2]–[4], remote sensing [5]–[7], and the development of
autonomous vehicles [8]–[10]. Among various segmentation methods, deep learning has emerged as a promising approach [11]–[13]. Deep learning models decompose complex mappings into a sequence of simpler ones, each described by a different layer [14]. The input is presented in a visible layer, and subsequent hidden layers extract abstract
features from it. The refinement of these layers is driven by the results of the training process, rather than
manual intervention [15]. With a large number of layers, they can accurately represent input features and
effectively perform complex tasks like image segmentation, natural language processing, or stock price
prediction [16]. This benefit gives deep learning an advantage over traditional machine learning methods, which still rely on domain expertise for feature extraction.
Deep learning implementation, however, requires a large amount of training data and can take considerable time to complete [17]. This presents difficulties, particularly in the medical domain where labeled datasets are scarce [18]. To overcome this issue, transfer learning can be coupled with the deep learning approach [17]. This process involves reusing components of a pre-trained network, such as its structure and parameter values. To be more precise, the network is typically divided into two parts: the part receiving transfer learning and the part not receiving it. The first, leveraging transfer learning, is structurally identical to the corresponding part of a pre-trained model and receives its parameter values from it. The source model is typically trained on a larger dataset, which may be related or entirely different. The second part consists of non-transferred layers, whose parameter values are initialized from scratch and updated during training.
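For illustration, a minimal Keras sketch of this split is given below. It pairs a pre-trained DenseNet-121 encoder with a small, freshly initialized decoder head; the decoder here is deliberately simplified and does not reproduce the Dense-UNet decoder used later in this study.

from tensorflow import keras
from tensorflow.keras import layers

# Encoder: structure and parameter values reused from a model pre-trained on ImageNet.
encoder = keras.applications.DenseNet121(
    include_top=False, weights="imagenet", input_shape=(224, 224, 3))

# Decoder: newly initialized layers whose parameters are learned from scratch
# (a simplified stand-in, not the Dense-UNet decoder described in the method section).
x = layers.UpSampling2D(size=(32, 32))(encoder.output)
x = layers.Conv2D(16, 3, padding="same", activation="relu")(x)
outputs = layers.Conv2D(1, 1, activation="sigmoid")(x)   # per-pixel probability map

model = keras.Model(encoder.input, outputs)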
Furthermore, variations exist in how parameter values are handled in layers affected by transfer
learning. These values can be "frozen" (non-trainable) and maintained in that state, or they can be "unfrozen"
(trainable) and updated as the training progresses. Some studies treat them as non-trainable parameters
[19]–[22]. On the other hand, some researchers utilize the transferred values for initialization and update them from the first training iteration onward [23], [24]. Unfortunately, to the best of our knowledge, no
research has evaluated the effectiveness of these two scenarios simultaneously. The majority of the articles
only contrasted one scenario of transfer learning with a model that did not employ transfer learning [19],
[21]. Furthermore, many applications only construct a transfer learning model without contrasting it with any other model [18]. This leaves a knowledge gap that requires research. Therefore, this study aims to compare these two parameter update scenarios, as well as to introduce a new transfer learning scenario in which the newly transferred parameter values are updated only after a specific point in training is reached.
Dense-UNet, a deep learning architecture that hybridizes U-Net [4] and DenseNet [25], was
employed in this investigation. This architecture was implemented to limit the number of model parameters,
maximize information flow between network layers, and address vanishing gradient concerns due to its
feature reuse and dense connections at each stage [26]. The encoder and the decoder are the two primary
components of this architecture in general. The encoder, also known as the contraction path, is responsible
for applying transfer learning from a pre-trained model and extracting features. The second component,
known as the expanding path or decoder, is responsible for reconstructing features and boosting spatial resolution
through the use of upsampling operators [4], [27]. These two paths are connected via skip connections, in
which the feature maps from the encoder are bypassed and concatenated with the decoder results at specific
positions [28].
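A rough Keras-style sketch of one such decoder stage with a skip connection is shown below; the single convolution after the concatenation is only a placeholder for whatever block follows it in a concrete architecture.

from tensorflow.keras import layers

def decoder_stage(decoder_feats, encoder_feats, filters):
    # Boost spatial resolution, then merge with the mirrored encoder feature maps.
    x = layers.UpSampling2D(size=(2, 2))(decoder_feats)
    x = layers.Concatenate()([x, encoder_feats])          # skip connection
    return layers.Conv2D(filters, 3, padding="same", activation="relu")(x)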
The simulation will be conducted on an ultrasound-based cardiac assessment dataset. Ultrasound,
known for its accessibility, affordability, and absence of radiation exposure, addresses key healthcare concerns
[29]. However, due to increased noise and decreased contrast, certain cardiac features can be challenging to observe and interpret [30]. Therefore, automatic segmentation
is urgently required for assistance in identifying the region of interest in ultrasound-based images.
Nevertheless, in contrast to other non-invasive imaging modalities like magnetic resonance imaging (MRI)
and computed tomography scan (CT-scan), research on automatic segmentation in ultrasound, particularly
utilizing deep learning, has been very limited in recent years [31]. To overcome this problem, we employ a
publicly available dataset from Hamad Medical Corporation, Qatar University, and Tampere University
known as the HMC-QU dataset, accessible at https://www.kaggle.com/datasets/aysendegerli/hmcqu-dataset.
This dataset encompasses ultrasound-based assessments featuring diverse patients and viewpoint types.
Furthermore, the ground truth is supplied, with the left ventricular wall (LVW) serving as the region of
interest (ROI). This is essential to us because LVW movement and structure analysis serves as an early
indicator of various heart problems, including myocardial infarction and hypertrophic cardiomyopathy [30],
[32]. This dataset has been used in several earlier investigations, either for segmentation or for the
identification of structural and movement anomalies [33]–[37]. While deep learning remains the dominant
option, none of these studies has explored the use of transfer learning to the extent that we propose.
Therefore, our research provides practical benefits for the development of ultrasound-based cardiac image
processing in addition to theoretical benefits for deep learning transfer learning scenarios.
2. METHOD
2.1. Dense-UNet architecture
Dense-UNet is a modified U-Net architecture that incorporates dense blocks and transition layers
into its structure, drawing inspiration from the DenseNet architecture introduced by [25]. The main distinction between standard blocks and dense blocks lies in their layer-to-layer linkages. Each layer in a dense block obtains feature maps from every preceding layer via concatenation [25]. This feature reuse
minimizes the addition of excessive features in each layer, consequently reducing the required parameters.
However, it necessitates that the dimensions of feature maps remain unchanged due to concatenation-based
merging. This limitation impedes the implementation of a pooling procedure, which is generally resolved by
adding a transition layer. In the original configuration, this transition layer consists of 2×2 average pooling
preceded by 1×1 convolution.
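A minimal Keras sketch of a dense block and a transition layer is given below; the growth rate, bottleneck width, and compression factor are illustrative assumptions rather than the exact values used in this study.

from tensorflow.keras import layers

def dense_block(x, num_layers, growth_rate=32):
    # Each layer receives the concatenation of all preceding feature maps.
    for _ in range(num_layers):
        y = layers.BatchNormalization()(x)
        y = layers.Activation("relu")(y)
        y = layers.Conv2D(4 * growth_rate, 3, padding="same", use_bias=False)(y)
        y = layers.BatchNormalization()(y)
        y = layers.Activation("relu")(y)
        y = layers.Conv2D(growth_rate, 1, padding="same", use_bias=False)(y)
        x = layers.Concatenate()([x, y])                  # feature reuse via concatenation
    return x

def transition_layer(x, compression=0.5):
    # 1x1 convolution followed by 2x2 average pooling, as in the original DenseNet.
    channels = int(x.shape[-1] * compression)
    x = layers.Conv2D(channels, 1, padding="same", use_bias=False)(x)
    return layers.AveragePooling2D(pool_size=2, strides=2)(x)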
Figure 1 illustrates the structure of the nine-stage Dense-UNet. A 7×7 convolution is employed in
the first stage to reduce the input dimensions from 224×224 to 112×112. This process continues with the first
transition layer, leading us to the first dense block in the second stage. Within a dense block, layer
configurations include batch normalization (BN), rectified linear unit (ReLU) activation, 3×3 convolution,
another BN, ReLU activation, and 1×1 convolution. This sequence is repeated several times depending on
the architectural construction. Subsequently, the second transition layer guides us to the third stage (second
dense block). This process continues until the fourth dense block in the fifth stage, where the feature maps are 7×7. The next step applies 2×2 upsampling and concatenates the result with the final feature maps from the fourth stage. The concatenated result serves as the input for the fifth dense block, which has the same layer configuration as its mirrored counterpart (the third dense block). This pattern continues until the ninth stage, which concludes with a sigmoid activation layer and an output of size 224×224.
Determining how many layers are present in each dense block is another crucial factor. The number
of layers in this study, ranging from stage one to stage five, follows the DenseNet-121, DenseNet-169, and
DenseNet-201 structure of the original DenseNet versions [25]. The sixth to ninth stages replicate this
structure by mirroring the number of layers. Under these conditions, the three Dense-UNet architectures in
this study are named Dense-UNet-121, Dense-UNet-169, and Dense-UNet-201.
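For reference, the per-block layer counts of the original DenseNet variants, which the encoder dense blocks follow and the decoder dense blocks mirror, can be summarized as follows.

# Layers per dense block in the original DenseNet variants [25]; the decoder
# dense blocks mirror these counts in reverse, as described above.
DENSE_BLOCK_SIZES = {
    "Dense-UNet-121": (6, 12, 24, 16),
    "Dense-UNet-169": (6, 12, 32, 32),
    "Dense-UNet-201": (6, 12, 48, 32),
}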
regardless of whether they are in layers with or without transfer learning. The initialization procedure is
where the differences arise: non-transferred layers begin with Glorot uniform initialization, whereas other
layers start with values from a pre-trained model.
‒ Scenario 3: freeze-unfreeze scenario (TL-S3). Parameters in layers affected by transfer learning will
remain unchanged for an initial portion of the training process. In other words, only the parameters in
layers not affected by transfer learning will be updated, while those influenced by transfer learning will be
frozen. After reaching a pre-defined epoch threshold, the transfer learning layer is unfrozen, and training
continues across all layers. The transfer learning cutoff will be explored at various stages, including 20%,
40%, 60%, and 80% of the total training epochs. This exploration will clarify how the timing of the
transition impacts the final outcome.
In this study, we will simulate the three scenarios that are depicted in Figure 2.
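As an illustration of how TL-S3 can be realized in Keras, the sketch below keeps the transferred layers frozen until a pre-defined cutoff epoch and then unfreezes them; the callback name and the list of transferred layers are assumptions made for the sake of the example, not the exact implementation used here.

from tensorflow import keras

class UnfreezeAtEpoch(keras.callbacks.Callback):
    # Freeze-unfreeze schedule: transferred layers stay non-trainable until the
    # cutoff epoch, after which training continues across all layers (TL-S3).
    def __init__(self, transferred_layers, cutoff_epoch):
        super().__init__()
        self.transferred_layers = transferred_layers
        self.cutoff_epoch = cutoff_epoch

    def on_epoch_begin(self, epoch, logs=None):
        if epoch == self.cutoff_epoch:
            for layer in self.transferred_layers:
                layer.trainable = True
            # Re-compile so the change in trainable status takes effect.
            self.model.compile(optimizer=self.model.optimizer, loss=self.model.loss)

With 100 training epochs, a cutoff epoch of 20 would correspond to the TL-S3 20%-F setting, 40 to TL-S3 40%-F, and so on.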
In the Glorot uniform initialization [42], the initial parameter values are drawn from a uniform distribution on [-a, a], where the bound a depends on the number of input units n_in and output units n_out of the layer:

a = \frac{\sqrt{6}}{\sqrt{n_{in} + n_{out}}}    (1)
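A small NumPy sketch of drawing initial weights with this bound is shown below; the function name and seed handling are illustrative only.

import numpy as np

def glorot_uniform(n_in, n_out, shape, seed=0):
    # Sample weights uniformly from [-a, a] with a = sqrt(6) / sqrt(n_in + n_out), as in (1).
    a = np.sqrt(6.0) / np.sqrt(n_in + n_out)
    return np.random.default_rng(seed).uniform(-a, a, size=shape)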
Next, the adaptive moment estimation (Adam) technique proposed in [43] will be utilized to update the initial values iteratively. This method updates parameter values using bias-corrected estimates of the first and second moments of the gradients. Algorithm 1 illustrates the procedure. The first component that must be calculated is the gradient of the loss function with respect to the model parameters, denoted by g_t, where t is the index of the iteration performed. The binary cross-entropy loss function in (2) was selected to suit the binary classification task.
The loss for the i-th pixel, denoted as L_i, is defined for i = 1, ..., N, with N representing the total number of pixels in the output image. The actual classification class of the i-th pixel is denoted by c_i ∈ {0, 1}, in which c_i = 0 is the background and c_i = 1 is the ROI. Lastly, p(c_i) is the predicted probability of belonging to class c_i calculated by the model. After finding g_t, we are able to calculate the exponentially weighted moving averages of the gradient (m_t) and the squared gradient (v_t). This step requires us to configure the hyperparameters β_1, β_2 ∈ [0, 1) as the exponential decay rates for the moment estimates. We then utilize the bias-corrected versions of m_t and v_t, along with the learning rate η and the constant ε, to update the parameter values from θ_{t-1} to θ_t. We set the hyperparameter values at β_1 = 0.9, β_2 = 0.999, η = 10^{-6}, and ε = 10^{-8}.
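The sketch below illustrates these two ingredients in NumPy: a per-pixel binary cross-entropy written in its usual form (not a verbatim reproduction of (2)), and a single Adam update following the steps described above, with the stated hyperparameter values as defaults.

import numpy as np

def bce_loss(c, p_roi, eps=1e-7):
    # Mean binary cross-entropy over all N pixels; c holds the true labels (0 or 1)
    # and p_roi the predicted probability of the ROI class for each pixel.
    p_roi = np.clip(p_roi, eps, 1.0 - eps)
    return -np.mean(c * np.log(p_roi) + (1 - c) * np.log(1.0 - p_roi))

def adam_step(theta, g, m, v, t, eta=1e-6, beta1=0.9, beta2=0.999, eps=1e-8):
    # One Adam iteration: moment estimates, bias correction, then the parameter update.
    m = beta1 * m + (1 - beta1) * g            # first moment m_t
    v = beta2 * v + (1 - beta2) * g**2         # second moment v_t
    m_hat = m / (1 - beta1**t)                 # bias-corrected m_t
    v_hat = v / (1 - beta2**t)                 # bias-corrected v_t
    theta = theta - eta * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v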
CCR = \sum_{j=0}^{1} \frac{|GT_j \cap Seg_j|}{|GT|}    (9)
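Reading GT_j and Seg_j as the sets of pixels assigned to class j in the ground truth and in the segmentation result, and |GT| as the total number of ground truth pixels, the CCR can be computed as in the following NumPy sketch.

import numpy as np

def correct_classification_ratio(gt, seg):
    # Pixels correctly classified as ROI plus pixels correctly classified as
    # background, divided by the total number of pixels, as in (9).
    gt = np.asarray(gt).astype(bool)
    seg = np.asarray(seg).astype(bool)
    correct = np.sum(gt & seg) + np.sum(~gt & ~seg)
    return correct / gt.size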
The training utilized a batch size of 10, with 10 images selected at random for each iteration. Each
epoch concluded after processing all images, and this procedure was repeated for 100 epochs. The
experiment was conducted on Google Colab using an NVIDIA V100 GPU, with Python 3 and the Keras framework chosen for their efficiency and ease of execution.
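Tying the earlier sketches together, an illustrative form of the training call is given below; the arrays are random placeholders standing in for the ultrasound frames and masks, and model, encoder, and UnfreezeAtEpoch refer to the hypothetical sketches above rather than the exact code used in this study.

import numpy as np

# Random placeholders with the 224x224 input size described earlier.
x_train = np.random.rand(20, 224, 224, 3).astype("float32")
y_train = np.random.randint(0, 2, (20, 224, 224, 1)).astype("float32")

# Placeholder: the layers that received pre-trained parameter values.
transferred_layers = encoder.layers

# Batch size 10 and 100 epochs as described above; cutoff_epoch=20 mimics TL-S3 20%-F.
model.fit(x_train, y_train, batch_size=10, epochs=100, shuffle=True,
          callbacks=[UnfreezeAtEpoch(transferred_layers, cutoff_epoch=20)])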
Figure 3. Some examples of ultrasound-based images and their ground truth (mask)
Table 1 summarizes the training durations (in seconds), loss values, and CCR for the three
Dense-UNet architectures with various transfer learning scenarios. Notably, across all architectures, the
suggested third scenario (TL-S3) consistently outperforms models without transfer learning (NoTL), TL-S1,
and TL-S2. The models under TL-S3 achieve a remarkable CCR exceeding 0.99, a level not attained by
models from other scenarios. Furthermore, TL-S3 models demonstrate far lower losses than the others, with
reductions ranging from 82% to 97%.
Investigation reveals that when transfer learning parameters are unfrozen after the cutoff, the TL-S3
models perform noticeably better. TL-S3 20%-F models, for example, exhibit a performance spike after 20 epochs, whereas TL-S3 40%-F models show a performance surge after 40 epochs. The TL-S3
60%-F and TL-S3 80%-F models also exhibit this pattern. The learning curve provides a visual
representation of this circumstance, with Figures 4(a) to 4(c) representing CCR and Figures 5(a) to 5(c)
representing loss. It validates the hypothesis that temporarily freezing transfer learning layers enables the
model to adapt to the current case's characteristics without disrupting the robust feature extraction of pre-
trained layers. After the non-transfer learning layer stabilizes, unfreezing the transfer learning layer boosts
performance by iteratively updating all parameters. This performance jump occurs shortly after the cutoff.
Figure 4. Learning curve for CCR values: (a) Dense-UNet-121, (b) Dense-UNet-169, and (c) Dense-UNet-201
Figure 5. Learning curve for loss values: (a) Dense-UNet-121, (b) Dense-UNet-169, and (c) Dense-UNet-201
The average CCR increase for TL-S3 models during the next twenty epochs was 0.0216, compared
to 0.0048 for other scenarios. This approximately five-fold difference highlights how much more preferable the
TL-S3 scenario is. We discover that the TL-S3 20%-F scenario corresponds to the best-performing model
during training in each Dense-UNet architecture. Dense-UNet-121, 169, and 201 with this scenario had CCR
values of 0.9950, 0.9950, and 0.9949, respectively, placing them among the top three in terms of both CCR
and loss. With a CCR of 0.9699, the Dense-UNet-121 model with TL-S3 20%-F also leads in validation.
Dense-UNet-121 with TL-S3 40%-F and Dense-UNet-169 with TL-S1 are the second and third-best models,
respectively, with CCR values of 0.9694. TL-S3 scenario models were able to maintain two of the top three
positions in this instance. Then, a different testing dataset was employed to further evaluate these three
models, which were determined to be the best options. Once more, the model with the greatest CCR of
0.9695 was Dense-UNet-121 with TL-S3 20%-F. It performed better than Dense-UNet-121 with TL-S3
40%-F and Dense-UNet-169 with TL-S1, which had CCR values of 0.9685 and 0.9681, respectively. The
results demonstrate the strong segmentation capabilities of Dense-UNet-121, confirming its superior
performance with TL-S3 20%-F. It continuously achieves the greatest CCR (0.9950, 0.9699, and 0.9695,
respectively) across training, validation, and testing datasets.
When comparing models with and without transfer learning, models with transfer learning generally
demonstrate faster training times. Dense-UNet-201 TL-S2 is an exception, taking 19 seconds longer than
Dense-UNet-201 without transfer learning. In all other cases, transfer learning consistently speeds up the training process. We also anticipated that TL-S1 would demonstrate the fastest training time, the rationale being that TL-S1 has fewer trainable parameters than TL-S2 and TL-S3. Nevertheless, our research indicates that this hypothesis holds only when comparing TL-S1 and TL-S2. Interestingly, some of the models in the TL-S3 scenario required less training
time than those in TL-S1. This result adds an interesting new perspective to our investigation, indicating
that the special parameter update approach employed by TL-S3 may help enhance the effectiveness of
training. We additionally discover that among TL-S3 models, the training period varies depending on the
cutoff position selection. The earlier the transition from non-trainable (freeze) to trainable (unfreeze) status
occurs, the longer the training duration. This condition is attributed to the increasing proportion of epochs
with a full-scale trainable parameter set. In terms of processing time, our best model, the Dense-UNet-121
with TL-S3 20%-F, also performed well. With a training duration of 2,857 seconds, it is faster than 52% of the other
models.
Lastly, Figure 6 provides a visualization of data segmentation testing with our best model. The
original images are displayed in the top row, and a comparison of the ROI contour generated by the model
(red line) and the ground truth (blue line) is presented in the bottom row. This figure illustrates how the
model can segment data from a new dataset that was not utilized during training.
4. CONCLUSION
This study provides several important conclusions. Firstly, during training, the TL-S3 scenario
consistently outperforms other scenarios, achieving CCRs over 0.99 and losses under 0.0205. This
superiority is explained by TL-S3's learning curve exhibiting a performance increase after surpassing the
freezing cutoff. The average CCR increase in the 20 epochs post-cutoff is 0.0216, five times higher than that of the other scenarios. Furthermore, the excellence of TL-S3 extends to the validation process, securing top positions in terms
of the highest CCR. In summary, the Dense-UNet-121 model with TL-S3 20%-F is deemed the best,
achieving a training duration of 2,857 seconds and attaining the highest CCR values for training, validation,
and testing data (0.9950, 0.9699, and 0.9695, respectively). This study establishes opportunities for further
research on the TL-S3 scenario by raising two crucial issues: first, determining the optimal transition point
from non-trainable to trainable status, and second, exploring how distinct training parameter adjustments can
be made for each layer impacted by transfer learning. These investigations are expected to enhance the
robustness and performance of the deep learning model with transfer learning.
ACKNOWLEDGEMENTS
The research presented in this paper was supported by the Department of Statistics, Institut Teknologi Sepuluh Nopember, and the Indonesia Endowment Fund for Education Agency under scholarship no.
KET-438/LPDP.4/2022.
REFERENCES
[1] R. Szeliski, Computer vision: algorithms and applications. Cham: Springer, 2022.
[2] R. Ranjbarzadeh, A. Caputo, E. B. Tirkolaee, S. J. Ghoushchi, and M. Bendechache, “Brain tumor segmentation of MRI images: a
comprehensive review on the application of artificial intelligence tools,” Computers in Biology and Medicine, vol. 152, 2023, doi:
10.1016/j.compbiomed.2022.106405.
[3] N. Salpea, P. Tzouveli, and D. Kollias, “Medical image segmentation: a review of modern architectures,” in Computer Vision –
ECCV 2022 Workshops, 2023, pp. 691–708, doi: 10.1007/978-3-031-25082-8_47.
[4] O. Ronneberger, P. Fischer, and T. Brox, “U-Net: convolutional networks for biomedical image segmentation,” in Medical Image
Computing and Computer-Assisted Intervention – MICCAI 2015, Cham: Springer, 2015, pp. 234–241, doi: 10.1007/978-3-319-
24574-4_28.
[5] S. M. Azimi, C. Henry, L. Sommer, A. Schumann, and E. Vig, “SkyScapes fine-grained semantic understanding of aerial
scenes,” in 2019 IEEE/CVF International Conference on Computer Vision (ICCV), 2019, pp. 7392–7402, doi:
10.1109/ICCV.2019.00749.
[6] P. Bhadoria, S. Agrawal, and R. Pandey, “Image segmentation techniques for remote sensing satellite images,” IOP Conference
Series: Materials Science and Engineering, vol. 993, no. 1, pp. 1–17, 2020, doi: 10.1088/1757-899X/993/1/012050.
[7] B. E. -Zahouani et al., “Remote sensing imagery segmentation in object-based analysis: a review of methods, optimization, and
quality evaluation over the past 20 years,” Remote Sensing Applications: Society and Environment, vol. 32, 2023, doi:
10.1016/j.rsase.2023.101031.
[8] D. Feng et al., “Deep multi-modal object detection and semantic segmentation for autonomous driving: datasets, methods, and
challenges,” IEEE Transactions on Intelligent Transportation Systems, vol. 22, no. 3, pp. 1341–1360, 2021, doi:
10.1109/TITS.2020.2972974.
[9] D. -V. Giurgi, T. J. -Laurain, M. Devanne, and J. -P. Lauffenburger, “Real-time road detection implementation of UNet
architecture for autonomous driving,” in 2022 IEEE 14th Image, Video, and Multidimensional Signal Processing Workshop
(IVMSP), 2022, pp. 1–5, doi: 10.1109/IVMSP54334.2022.9816237.
[10] L. Lizhou and Z. Yong, “A closer look at U-net for road detection,” in Tenth International Conference on Digital Image
Processing (ICDIP 2018), 2018, doi: 10.1117/12.2503282.
[11] M. Aljabri and M. AlGhamdi, “A review on the use of deep learning for medical images segmentation,” Neurocomputing, vol.
506, pp. 311–335, 2022, doi: 10.1016/j.neucom.2022.07.070.
[12] B. Sistaninejhad, H. Rasi, and P. Nayeri, “A review paper about deep learning for medical image analysis,” Computational and
Mathematical Methods in Medicine, vol. 2023, pp. 1–10, 2023, doi: 10.1155/2023/7091301.
[13] S. M. Khaniabadi, H. Ibrahim, I. A. Huqqani, F. M. Khaniabadi, H. A. M. Sakim, and S. S. Teoh, “Comparative review on
traditional and deep learning methods for medical image segmentation,” in 2023 IEEE 14th Control and System Graduate
Research Colloquium (ICSGRC), 2023, pp. 45–50, doi: 10.1109/ICSGRC57744.2023.10215402.
[14] I. Goodfellow, Y. Bengio, and A. Courville, Deep learning. Cambridge, Massachusetts: MIT Press, 2016.
[15] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, pp. 436–444, 2015, doi: 10.1038/nature14539.
[16] E. Alaros, M. Marjani, D. A. Shafiq, and D. Asirvatham, “Predicting consumption intention of consumer relationship
management users using deep learning techniques: a review,” Indonesian Journal of Science and Technology, vol. 8, no. 2, pp.
307–328, 2023, doi: 10.17509/ijost.v8i2.55814.
[17] C. Tan, F. Sun, T. Kong, W. Zhang, C. Yang, and C. Liu, “A survey on deep transfer learning,” in Artificial Neural Networks and
Machine Learning – ICANN 2018, Cham: Springer, 2018, pp. 270–279, doi: 10.1007/978-3-030-01424-7_27.
[18] P. Kora et al., “Transfer learning techniques for medical image analysis: a review,” Biocybernetics and Biomedical Engineering,
vol. 42, no. 1, pp. 79–107, 2022, doi: 10.1016/j.bbe.2021.11.004.
[19] A. A. Pravitasari et al., “UNet-VGG16 with transfer learning for MRI-based brain tumor segmentation,” Telkomnika
(Telecommunication Computing Electronics and Control), vol. 18, no. 3, pp. 1310–1318, 2020, doi:
10.12928/TELKOMNIKA.v18i3.14753.
[20] M. Oquab, L. Bottou, I. Laptev, and J. Sivic, “Learning and transferring mid-level image representations using convolutional
neural networks,” in 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 1717–1724, doi:
10.1109/CVPR.2014.222.
[21] D. A. Rasyid, G. H. Huang, and N. Iriawan, “Segmentation of low-grade gliomas using U-Net VGG16 with transfer learning,” in
2021 11th International Conference on Cloud Computing, Data Science & Engineering (Confluence), 2021, pp. 393–398, doi:
10.1109/Confluence51648.2021.9377093.
[22] O. T. Bişkin, İ. Kırbaş, and A. Çelik, “A fast and time-efficient glitch classification method: a deep learning-based visual feature
extractor for machine learning algorithms,” Astronomy and Computing, vol. 42, 2023, doi: 10.1016/j.ascom.2022.100683.
[23] J. Yosinski, J. Clune, Y. Bengio, and H. Lipson, “How transferable are features in deep neural networks?,” Advances in Neural
Information Processing Systems, vol. 4, pp. 3320–3328, 2014.
[24] Z. Yang, J. Yue, Z. Li, and L. Zhu, “Vegetable image retrieval with fine-tuning VGG model and image hash,” IFAC-
PapersOnLine, vol. 51, no. 17, pp. 280–285, 2018, doi: 10.1016/j.ifacol.2018.08.175.
[25] G. Huang, Z. Liu, L. V. D. Maaten, and K. Q. Weinberger, “Densely connected convolutional networks,” in 2017 IEEE
Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 2261–2269, doi: 10.1109/CVPR.2017.243.
[26] Y. Cao, S. Liu, Y. Peng, and J. Li, “DenseUNet: densely connected UNet for electron microscopy image segmentation,” IET
Image Processing, vol. 14, no. 12, pp. 2682–2689, 2020, doi: 10.1049/iet-ipr.2019.1527.
[27] J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks for semantic segmentation,” in 2015 IEEE Conference on
Computer Vision and Pattern Recognition (CVPR), 2015, vol. 39, no. 4, pp. 3431–3440, doi: 10.1109/CVPR.2015.7298965.
[28] S. Cai, Y. Tian, H. Lui, H. Zeng, Y. Wu, and G. Chen, “Dense-UNet: a novel multiphoton in vivo cellular image segmentation
model based on a convolutional neural network,” Quantitative Imaging in Medicine and Surgery, vol. 10, no. 6, pp. 1275–1285,
2020, doi: 10.21037/QIMS-19-1090.
[29] J. E. -Taraboulsi, C. P. Cabrera, C. Roney, and N. Aung, “Deep neural network architectures for cardiac image segmentation,”
Artificial Intelligence in the Life Sciences, vol. 4, pp. 1–19, 2023, doi: 10.1016/j.ailsci.2023.100083.
[30] A. Degerli et al., “Early detection of myocardial infarction in low-quality echocardiography,” IEEE Access, vol. 9, pp. 34442–
34453, 2021, doi: 10.1109/ACCESS.2021.3059595.
[31] C. Chen et al., “Deep learning for cardiac image segmentation: a review,” Frontiers in Cardiovascular Medicine, vol. 7, pp. 1–33,
2020, doi: 10.3389/fcvm.2020.00025.
[32] J. A. U. -Moral et al., “Contrast-enhanced echocardiographic measurement of left ventricular wall thickness in hypertrophic
cardiomyopathy: comparison with standard echocardiography and cardiac magnetic resonance,” Journal of the American Society
of Echocardiography, vol. 33, no. 9, pp. 1106–1115, 2020, doi: 10.1016/j.echo.2020.04.009.
[33] O. Hamila et al., “Fully automated 2D and 3D convolutional neural networks pipeline for video segmentation and myocardial
infarction detection in echocardiography,” Multimedia Tools and Applications, vol. 81, no. 26, pp. 37417–37439, 2022, doi:
10.1007/s11042-021-11579-4.
[34] G. Sanjeevi, U. Gopalakrishnan, R. K. Pathinarupothi, and T. Madathil, “Automatic diagnostic tool for detection of regional wall
motion abnormality from echocardiogram,” Journal of Medical Systems, vol. 47, no. 1, 2023, doi: 10.1007/s10916-023-01911-w.
[35] I. Adalioglu, M. Ahishali, A. Degerli, S. Kiranyaz, and M. Gabbouj, “SAF-Net: self-attention fusion network for myocardial
infarction detection using multi-view echocardiography,” in Computing in Cardiology, 2023, pp. 1–4, doi:
10.22489/CinC.2023.240.
[36] Y. Li, W. Lu, P. Monkam, Z. Zhu, W. Wu, and M. Liu, “LVSnake: accurate and robust left ventricle contour localization for
myocardial infarction detection,” Biomedical Signal Processing and Control, vol. 85, 2023, doi: 10.1016/j.bspc.2023.105076.
[37] A. Degerli, S. Kiranyaz, T. Hamid, R. Mazhar, and M. Gabbouj, “Early myocardial infarction detection over multi-view
echocardiography,” Biomedical Signal Processing and Control, vol. 87, pp. 1–12, 2024, doi: 10.1016/j.bspc.2023.105448.
[38] A. Hosna, E. Merry, J. Gyalmo, Z. Alom, Z. Aung, and M. A. Azim, “Transfer learning: a friendly introduction,” Journal of Big
Data, vol. 9, no. 1, pp. 1–19, 2022, doi: 10.1186/s40537-022-00652-w.
[39] A. H. Zim et al., “Smart manufacturing with transfer learning under limited data: towards data-driven intelligences,” Materials
Today Communications, vol. 37, 2023, doi: 10.1016/j.mtcomm.2023.107357.
[40] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,”
Communications of the ACM, vol. 60, no. 6, pp. 84–90, 2017, doi: 10.1145/3065386.
[41] H. Li, M. Krček, and G. Perin, “A comparison of weight initializers in deep learning-based side-channel analysis,” in Applied
Cryptography and Network Security Workshops, Cham: Springer, 2020, pp. 126–143, doi: 10.1007/978-3-030-61638-0_8.
[42] X. Glorot and Y. Bengio, “Understanding the difficulty of training deep feedforward neural networks,” Journal of Machine
Learning Research, vol. 9, pp. 249–256, 2010.
[43] D. P. Kingma and J. L. Ba, “Adam: a method for stochastic optimization,” Arxiv-Computer Science, vol. 1, pp. 1–15, 2015.
BIOGRAPHIES OF AUTHORS
Didik Bani Unggul earned his Bachelor of Science in Statistics from Universitas
Indonesia, graduating in 2020. Currently, he is pursuing a master's degree in statistics at
Institut Teknologi Sepuluh Nopember in Surabaya, Indonesia. Actively involved in projects at
the Laboratory of Computational Statistics and Data Science, his research areas of interest
include deep learning, biomedical image processing, and computational statistics. He can be
contacted at email: [email protected] or [email protected].
Nur Iriawan received a bachelor's degree in statistics from the Institut Teknologi
Sepuluh Nopember (ITS) Surabaya, a master's degree in computer science from the University
of Maryland, USA, and a Ph.D. in statistics from Curtin University of Technology, Australia.
He is a professor at the Department of Statistics, Faculty of Science and Data Analytics, ITS,
Surabaya. He also serves as the head of the Laboratory of Computational Statistics and Data
Science. He has supervised and co-supervised over 20 master and 10 Ph.D. students. He has
authored or co-authored more than 60 Scopus-indexed articles, with an H-index of 12 and over 1,000
citations. His research interests encompass stochastic processes, statistical computations, and
Bayesian models. He can be contacted at email: [email protected].
Heri Kuswanto holds a Statistics B.Sc. (2003) and M.Sc. (2005) from Institut
Teknologi Sepuluh Nopember, Indonesia, and a Dr.rer.pol. in statistics (econometrics) from
Leibniz Hannover University, Germany (2009). He subsequently pursued postdoctoral research at Laval University, Canada, focusing on the calibration of ensemble weather forecasts in 2010.
Currently a professor in statistics at ITS, he also serves as the Director of Graduate Program
and Academic Development. His academic career includes an appointment as the Head of the Climate Change Research Group. His research spans weather forecasting, solar radiation
management, computational statistics, time series forecasting, econometrics, machine
learning, and advanced data analysis. He also received awards such as the Harvard Residency
Program on Solar Geoengineering and DAAD Scholarship for Doctoral research in Germany.
He can be contacted at email: [email protected].