Federated Learning With Privacy Preserving For Multi-Institutional Three-Dimensional Brain Tumor Segmentation
1 Mathematics, Informatics and Systems LAboratory—LAMIS Laboratory, University of Echahid Cheikh Larbi
Tebessi, Tebessa 12000, Algeria; [email protected] (M.E.Y.);
[email protected] (M.G.)
2 Artificial Intelligence and Autonomous Things Laboratory—LIAOA, University of Oum el Bouaghi,
Oum El Bouaghi 04000, Algeria
3 Faculty of Computer Studies (FCS), Arab Open University—Oman, Muscat 130, Oman; [email protected]
4 Department of Computer Science and Software Engineering, College of Information Technology,
United Arab Emirates University, Al Ain 15551, United Arab Emirates
5 Department of Management Information System, College of Commerce & Business Administration,
Dhofar University, Salalah 211, Oman; [email protected]
6 Department of Information Technology, Aylol University College, Yarim 547, Yemen; [email protected]
* Correspondence: [email protected] (M.D.); [email protected] (S.T.);
[email protected] (A.B.)
Abstract: Background and Objectives: Brain tumors are complex diseases that require careful diagnosis and treatment. A minor error in the diagnosis may easily lead to significant consequences. Thus, one must place a premium on accurately identifying brain tumors. However, deep learning (DL) models often face challenges in obtaining sufficient medical imaging data due to legal, privacy, and technical barriers hindering data sharing between institutions. This study aims to implement a federated learning (FL) approach with privacy-preserving techniques (PPTs) directed toward segmenting brain tumor lesions in a distributed and privacy-aware manner. Methods: The suggested approach employs a 3D U-Net model, which is trained using federated learning on the BraTS 2020 dataset. PPTs, such as differential privacy, are included to ensure data confidentiality while managing privacy and heterogeneity challenges with minimal communication overhead. The efficiency of the model is measured in terms of Dice similarity coefficients (DSCs) and 95% Hausdorff distances (HD95) for the target tumor regions, which include the whole tumor (WT), tumor core (TC), and enhancing tumor core (ET). Results: In the validation phase, the partial federated model achieved DSCs of 86.1%, 83.3%, and 79.8%, corresponding to HD95 values of 25.3 mm, 8.61 mm, and 9.16 mm for WT, TC, and ET, respectively. On the final test set, the model demonstrated improved performance, achieving DSCs of 89.85%, 87.55%, and 86.6%, with HD95 values of 22.95 mm, 8.68 mm, and 8.32 mm for WT, TC, and ET, respectively, which indicates the effectiveness of the segmentation approach and its privacy preservation. Conclusion: This study presents a highly competitive, collaborative federated learning model with PPTs that can successfully segment brain tumor lesions without compromising patient data confidentiality. Future work will improve model generalizability and extend the framework to other medical imaging tasks.
2. Related Work
Many studies have predominantly adopted centralized approaches for machine learn-
ing (ML) tasks [10–21]. A paper [19] explores advanced hybrid techniques for the early
detection of brain tumors based on ML and DL models. It presents four systems: the first
combines an artificial neural network (ANN) and a feedforward neural network (FFNN) using local binary patterns (LBP), the gray-level co-occurrence matrix (GLCM), and the discrete wavelet transform (DWT);
the second uses pretrained GoogLeNet and ResNet-50 models for feature extraction and
classification; the third merges a convolutional neural network (CNN) with a support vector machine (SVM) to improve classification; and the fourth
combines GoogLeNet, ResNet-50, and handcrafted features. In that study, the FFNN model
achieved an accuracy of 99.9%. A study [20] used a dataset of 3060 MRI images divided
into four classes: three malignant types and one normal class. The proposed hybrid system
combined AlexNet + SVM and achieved excellent performance with an accuracy of 95.10%,
a sensitivity of 95.25% and a specificity of 98.50%. A paper [21] presents a study on the
diagnosis of intracranial hemorrhage using advanced imaging techniques, focusing on CT
images. Three methods are presented. The first uses pretrained CNNs such as GoogLeNet,
ResNet-50, and AlexNet, achieving accuracies of 94%, 91.7%, and 91.5%, respectively. The
second method integrates CNNs with SVM, achieving accuracies of 97.4%, 97.2% and
95.7%. The third method uses an ANN that combines the features extracted from CNNs
with the features from the co-occurrence matrix (GLCM and LBP), achieving the highest
accuracy of 99.3%. This conventional paradigm involves aggregating and processing data
in a centralized manner, often posing challenges related to data privacy, security, and
scalability. Recently, FL has demonstrated itself to be a paradigm with the potential to
transform the field of ML, offering a decentralized approach that is particularly well suited
to the distributed nature of modern computing. In FL, the training of models is conducted
locally on distributed devices, such as mobile phones or edge devices. Only aggregated
model updates are shared with a central server. This enables collaborative model training
while mitigating the privacy concerns associated with centralized data storage. Several
studies have explored the application of FL in the healthcare domain, particularly in the
context of medical image segmentation and tumor analysis. For instance, Sheller et al. [22] demonstrated the feasibility of using FL to identify cancerous tissue in the brain across different medical institutions, showing that FL improves the performance of the shared model without sharing or compromising the data of the patients involved. Similarly, Qiu et al. [23] used a federated semisupervised
learning (FSSL) model for medical image segmentation, which incorporates a federated
pseudo-labeling strategy to address the annotation deficiency for unlabeled clients. In [24],
the authors present a new approach to brain tumor segmentation using FL, which enables
collaborative model learning without the need to share sensitive patient data. The study
used a 3D U-Net architecture on the BraTS 2020 dataset, achieving high segmentation
performance with a whole-tumor (WT) DSC of 0.896 and an HD95 of 23.611 mm, comparable
to those of centralized models. In the multiorgan segmentation context, Xu et al. [25]
proposed an FL approach that leverages inconsistent labels across institutions to improve
model robustness and generalization (Fed-MENU). This method utilizes a multiencoding
U-Net to extract organ-specific features from partially labeled data and was evaluated
with six public abdominal CT datasets. Additionally, Liu and co-authors [26] introduced
episodic learning in continuous frequency space (ELCFS) to address the problem of fed-
erated domain generalization (FedDG) in medical image segmentation. Their approach
involves transferring distribution information across clients in a simple privacy-protecting
manner and implementing a boundary-oriented episodic learning paradigm to enhance
model generalization. FL has successfully ensured data privacy and collaborative model
training in fields beyond medical imaging. Examples include its application in intrusion
detection systems (IDSs) [27,28], the development of smart cities [29], and electrical driv-
ing systems [30]. However, despite these successes, the adoption of FL with advanced
PPTs remains limited. Presently, PPTs in FL have garnered significant attention due to
the increasing focus on safeguarding sensitive information during collaborative model
training. Several works [31–35] have explored diverse privacy-preserving (PP) mecha-
nisms, aiming to strike a balance between model accuracy and individual privacy. For
instance, Chen et al. [31] proposed an efficient PP and traceable FL framework with mini-
mal overhead. The approach incorporates hierarchical aggregation, random seed-based
noise addition, and encryption for secure global parameter transmission through subag-
gregators. Authors [32] proposed PrivateKT for PP knowledge transfer in FL. The method
demonstrates comparable performance to centralized learning while providing strict pri-
vacy guarantees through differential privacy constraints. Shao et al. [33] developed a
selective knowledge-sharing mechanism for federated distillation, which has the potential
to enable a PP, communication-efficient, and heterogeneity-adaptive federated training
framework. Several works rely on blockchain-based FL [36–38]. In particular, in [36], the
authors propose a 5G framework integrating blockchain for enhanced security and privacy.
The system incorporates a DRL optimization strategy, local noise addition for privacy, and a
DPoS protocol for efficient transaction validation and edge server computation. Similarly, a
study [37] proposed an Industrial Internet of Things BCFL model with adaptive differential
privacy, using the Laplace mechanism to introduce noise during model updates. The model
adapts the cropping threshold dynamically, minimizing noise’s impact on accuracy, and
incorporates validation and consensus mechanisms to detect and prevent malicious attacks,
ensuring fairness through a reputation-based node assessment system. In one recent ex-
ample [38], the authors explored privacy solutions in the context of blockchain-enabled
federated learning (BCFL), summarizing background information, evaluating integration
architectures, addressing privacy concerns, and highlighting applications and challenges in
this emerging field. In the medicine and healthcare domain, Li et al. [34] explored FL
for brain tumor segmentation while preserving patient data privacy. They implemented
and evaluated practical systems on the BraTS 2018 dataset, investigating the use of differen-
tial privacy techniques. The study compared federated averaging algorithms, addressing
momentum-based optimization and imbalanced training nodes. Additionally, the authors
empirically studied the sparse vector technique to provide a strong differential privacy
guarantee, highlighting the trade-off between model performance and privacy protection.
Authors [35] introduced PriMIA (privacy-preserving medical image analysis), a framework
for secure medical image analysis. PriMIA employs FL and encrypted diagnosis to protect
patient privacy while ensuring accurate classification. The study evaluated the framework’s
security against model inversion attacks. In the literature [39], a comprehensive overview of
current and future approaches to PP artificial intelligence, particularly in medical imaging,
is provided. The authors addressed the challenges, potential vulnerabilities compromising
privacy, and future developments in the field. Moreover, a paper [40] proposes a mobile
crowdsensing system that leverages unmanned aerial vehicles (UAVs), incorporating local
differential privacy and a reinforcement-learning-based incentive system. In [41,42], homo-
morphic encryption was employed in FL to reinforce the security of parameters shared with
the external network system, ensuring continued privacy of patient data and facilitating the
accurate aggregation of models. From our literature review, we identified several important
limitations in previous studies. Many studies have primarily used centralized approaches
for ML tasks, raising significant concerns related to data confidentiality, security, and scala-
bility. Challenges in acquiring sufficient medical imaging data, due to legal, technical, and
privacy-related obstacles, were also common, particularly in multi-institutional collabora-
tions. While FL has emerged as a promising solution, its application to medical imaging,
especially for brain tumor segmentation, remains limited. Furthermore, existing research
often overlooked advanced PPTs, which are crucial for protecting sensitive patient data
during collaborative learning. Our study addresses these gaps by introducing a specifically
designed FL framework for brain tumor segmentation using a deep 3D U-Net model on
the BraTS 2020 dataset. We implemented privacy techniques, including differential privacy,
to ensure robust protection of individual patient data throughout the collaborative learning
process. In addition, we explored secure model aggregation methods in a heterogeneous
and distributed environment, enhancing privacy while improving model performance.
In summary, our work not only fills existing gaps in the literature by combining FL with
advanced privacy protection measures but also makes a significant contribution to the field
of secure and efficient medical image analysis.
3. Background
3.1. Neural Network Architecture
In addressing the task of segmenting distinct glioma subregions within 3D medical images, our proposed neural network architecture draws inspiration from the U-Net framework. The model follows a dual-path structure and operates on a 3D input tensor of shape (4, 128, 128, 128). The contracting path employs convolutional blocks that combine group normalization with ReLU activations for enhanced nonlinearity, followed by pooling operations to capture contextual information; each convolutional block applies two 3D convolutions with a kernel size of 3 × 3 × 3. As the contracting path deepens, the number of filters gradually increases to extract high-level features while the input volume is downsampled. At the intermediate stage, the model introduces further convolutional blocks (M4 and M5). The expanding path uses transposed convolutions (Conv3DTranspose) for upsampling and concatenation blocks that fuse features from the corresponding levels of the contracting path. Subsequent convolutional blocks in the expanding path restore spatial resolution, and a final 1 × 1 × 1 convolutional layer with a sigmoid activation function produces the segmentation map. Figure 1 provides a visual representation of this tailored architecture.
[Figure 1. The dual-path 3D U-Net: encoder blocks E1–E3 (16, 32, and 64 filters), middle blocks M4–M5 (128 and 256 filters), and decoder blocks D1–D3 connected by skip connections and concatenations, mapping the 4 × 128 × 128 × 128 input to a 3-channel output.]
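As a concrete illustration of this dual-path design, the following is a minimal PyTorch sketch of a 3D U-Net with two 3 × 3 × 3 convolutions plus group normalization and ReLU per block, max-pooling downsampling, transposed-convolution upsampling with skip concatenations, and a sigmoid-activated 3-channel head. The exact filter counts, group sizes, and pooling choices are illustrative assumptions rather than the authors' implementation (which the text describes in Keras terms such as Conv3DTranspose).

```python
import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    """Two 3x3x3 convolutions, each followed by group normalization and ReLU."""
    def __init__(self, in_ch, out_ch, groups=8):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv3d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.GroupNorm(groups, out_ch), nn.ReLU(inplace=True),
            nn.Conv3d(out_ch, out_ch, kernel_size=3, padding=1),
            nn.GroupNorm(groups, out_ch), nn.ReLU(inplace=True),
        )
    def forward(self, x):
        return self.block(x)

class UNet3D(nn.Module):
    def __init__(self, in_ch=4, out_ch=3, base=16):
        super().__init__()
        # Contracting path: three encoder blocks with increasing filter counts.
        self.enc1 = ConvBlock(in_ch, base)
        self.enc2 = ConvBlock(base, base * 2)
        self.enc3 = ConvBlock(base * 2, base * 4)
        self.pool = nn.MaxPool3d(2)
        # Intermediate stage (M4/M5 in the text).
        self.middle = ConvBlock(base * 4, base * 8)
        # Expanding path: transposed convolutions + concatenation with skip features.
        self.up3 = nn.ConvTranspose3d(base * 8, base * 4, kernel_size=2, stride=2)
        self.dec3 = ConvBlock(base * 8, base * 4)
        self.up2 = nn.ConvTranspose3d(base * 4, base * 2, kernel_size=2, stride=2)
        self.dec2 = ConvBlock(base * 4, base * 2)
        self.up1 = nn.ConvTranspose3d(base * 2, base, kernel_size=2, stride=2)
        self.dec1 = ConvBlock(base * 2, base)
        # 1x1x1 convolution + sigmoid -> three overlapping probability maps (ET, TC, WT).
        self.head = nn.Conv3d(base, out_ch, kernel_size=1)

    def forward(self, x):                       # x: (B, 4, 128, 128, 128)
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        e3 = self.enc3(self.pool(e2))
        m = self.middle(self.pool(e3))
        d3 = self.dec3(torch.cat([self.up3(m), e3], dim=1))
        d2 = self.dec2(torch.cat([self.up2(d3), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return torch.sigmoid(self.head(d1))     # (B, 3, 128, 128, 128)
```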
3.2. Dataset
In this study, we evaluated the performance of the proposed intelligent model on the
BraTS2020 dataset [42–45], which is well known in the scientific community and freely
available to the public. It contains four distinct MRI brain scans from glioma patients: the
native (T1), post-contrast T1-weighted (T1Gd), T2-weighted (T2), and T2 fluid-attenuated
inversion recovery (T2-FLAIR) volumes, each paired with a ground truth mask, all from
multiple institutional sources. The masks highlight three key labels: peritumoral edema
(ED-label 2), the necrotic and nonenhancing tumor core (NCR/NET label 1), and ET
(label 4). The dataset is challenging due to the variability in imaging protocols, scanner
types, and patient populations across institutions. We trained and tested our model on the
BraTS2020 training dataset, which was divided into subsets, with 267 cases for training, 66
for validation, and 36 for testing. Figure 2 illustrates the format of a multimodality brain
MRI scan together with an example of the corresponding expert annotation.
Figure 2. Sample glioma from the BraTS 2020 training dataset. Yellow: enhancing tumor (ET), blue:
nonenhancing tumor/necrotic tumor (NET/NCR), green: peritumoral edema (ED).
• Random Flip Along Each Spatial Axis: The data were subjected to a random horizontal
flip, a random vertical flip, and potentially a random flip along the depth, with a
probability of 80%.
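The random-flip augmentation above can be sketched as follows in NumPy; applying the 80% probability independently to each spatial axis is an assumption about how the three flips are combined.

```python
import numpy as np

def random_flip(image, mask, p=0.8, rng=None):
    """Randomly flip a (C, D, H, W) image and its (D, H, W) label map along each spatial axis.

    Applying the probability `p` independently per axis is an assumption about
    how the horizontal, vertical, and depth flips are combined.
    """
    rng = rng or np.random.default_rng()
    for img_axis, mask_axis in zip((1, 2, 3), (0, 1, 2)):  # depth, height, width
        if rng.random() < p:
            image = np.flip(image, axis=img_axis)
            mask = np.flip(mask, axis=mask_axis)
    return np.ascontiguousarray(image), np.ascontiguousarray(mask)
```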
$$\mathrm{DSC} = 1 - \frac{1}{N}\sum_{n}\frac{S_n \times R_n + \epsilon}{S_n^{2} + R_n^{2} + \epsilon} \quad (1)$$

where N is the total number of classes, and $S_n$ and $R_n$ are the prediction and ground truth for each channel. A smoothing factor ($\epsilon$) equal to 1 was incorporated to improve stability.
Unlike traditional approaches, our optimization strategy targeted the final tumor regions
of interest (ET, TC, and WT), bypassing the individual components (such as NET-NCR and
ED). The neural network’s output, structured as a 3-channel volume, provided distinct
probability maps for each tumor region.
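A soft, multichannel Dice-based loss of the kind defined in Equation (1) can be sketched in PyTorch as below, applied directly to the three-channel output (ET, TC, WT). The reduction over batch and voxels, and the conventional factor of 2 in the numerator, are assumptions for illustration.

```python
import torch

def soft_dice_loss(pred, target, eps=1.0):
    """Multichannel soft Dice loss over the three region channels (ET, TC, WT).

    `pred` and `target` are (B, 3, D, H, W) tensors of probabilities and binary
    masks. The factor of 2 in the numerator is the conventional soft-Dice form
    (an assumption; Equation (1) is written without it).
    """
    dims = (0, 2, 3, 4)                            # sum over batch and spatial voxels
    intersection = (pred * target).sum(dim=dims)
    denom = (pred ** 2).sum(dim=dims) + (target ** 2).sum(dim=dims)
    dice_per_channel = (2.0 * intersection + eps) / (denom + eps)
    return 1.0 - dice_per_channel.mean()           # average over the N = 3 channels
```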
$$\mathrm{DSC} = \frac{2\,TP}{2\,TP + FP + FN} \quad (2)$$

$$\mathrm{Sensitivity} = \frac{TP}{TP + FN} \quad (3)$$

$$\mathrm{Specificity} = \frac{TN}{TN + FP} \quad (4)$$
In these equations, TP represents true positives, which correspond to voxels correctly
classified by the model. FP stands for false positives, FN represents false negatives, and
TN denotes the number of true negatives.
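The voxel-wise metrics in Equations (2)–(4) can be computed per binarized tumor region with a short NumPy helper such as the sketch below (function and variable names are illustrative).

```python
import numpy as np

def region_metrics(pred, truth):
    """Voxel-wise DSC, sensitivity, and specificity for one binarized tumor region."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    tp = np.count_nonzero(pred & truth)     # true positives
    fp = np.count_nonzero(pred & ~truth)    # false positives
    fn = np.count_nonzero(~pred & truth)    # false negatives
    tn = np.count_nonzero(~pred & ~truth)   # true negatives
    dsc = 2 * tp / (2 * tp + fp + fn)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return dsc, sensitivity, specificity
```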
4. Methodology
Our approach employs two distinct training pipelines with the same neural network
architecture based on the 3D U-Net. We initially adopted a centralized approach to train the
3D model, utilizing a single-site dataset for model training and evaluation. Subsequently,
we evaluated the FL approach using the same 3D U-Net architecture as the shared model, with a mod-
ified federated averaging (FedAvg) serving as the aggregation algorithm. The experiments
used the BraTS2020 datasets, focusing on multiclass brain tumor segmentation. Within
the FL pipeline, distributed devices perform segmentation tasks using their respective
local data, ensuring no data sharing occurs among participants to uphold data privacy.
The subsequent sections provide detailed insights into each pipeline and underscore the
distinct effects of the privacy-preserving FL techniques employed. Comprehensive training
configurations and other technical aspects of the proposed method are expounded. To pro-
vide a visual representation of our proposed methodology, Figure 3 illustrates a flowchart
outlining the fundamental steps and components of our approach.
[Figure 3 flowchart: train/test split with 5-fold cross-validation; the training set split across four clients; preprocessing (min-max intensity scaling, intensity clipping, zero-padding and re-cropping, input-channel dropping, random flips); 350- and 420-epoch training runs; prediction binarization and padding removal; and comparison of predictions against label maps.]
Figure 3. Overall block diagram of the proposed strategy used for enhancing brain tumor segmenta-
tion accuracy and preserving data privacy.
Medical institutions may have diverse patient populations with varying demographics,
medical histories, and conditions. Therefore, treating the local data as representative of
the entire population might lead to biased or inaccurate model training. Similarly, a
large research hospital may have more extensive patient records than a smaller clinic.
Addressing this imbalance is crucial to ensure fair model training and prevent biases
toward institutions with larger datasets. Additionally, numerous hospitals, clinics, and
research centers contribute their data for collective model training. This collaborative
approach ensures a diverse representation of medical scenarios and conditions across the
federated network. Furthermore, given the diverse and dynamic nature of healthcare
settings, not all medical institutions may be online simultaneously for training. Therefore,
training with flexibility in participation ensures that the FL process continues even when
some institutions are temporarily offline. Overall, in medical FL, these challenges are
particularly significant due to the sensitive nature of patient data. Ensuring that models are
robust and generalizable across different medical institutions, regardless of variations in
data distribution, size, and communication constraints, is essential for the successful and
ethical application of FL in healthcare.
Algorithm 1 The FedAvg algorithm. The K clients are indexed by k; B is the local minibatch size, E is the number of local epochs, and η is the learning rate.
1: Server executes:
2:   initialize w_0
3:   for each round t = 1, 2, . . . do
4:     m ← max(C · K, 1)
5:     S_t ← (random set of m clients)
6:     for each client k ∈ S_t in parallel do
7:       w^k_{t+1} ← ClientUpdate(k, w_t)
8:     end for
9:     w_{t+1} ← Σ^K_{k=1} (n_k / n) · w^k_{t+1}
10:   end for
11: ClientUpdate(k, w): // run on client k
12:   B ← (split P_k into batches of size B)
13:   for each local epoch i from 1 to E do
14:     for each batch b ∈ B do
15:       w ← w − η · ∇ℓ(w, b)
16:     end for
17:   end for
18:   return w to Server
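A minimal PyTorch sketch of one FedAvg round as written in Algorithm 1 is given below: each sampled client runs E local epochs of minibatch SGD starting from the current global weights, and the server forms the data-size-weighted average of the returned models. The client and data-loader representation and the hyperparameters are assumptions, and this is the plain FedAvg baseline rather than the modified aggregation described later in this section.

```python
import copy
import random
import torch

def client_update(model, loader, loss_fn, epochs=3, lr=1e-3):
    """ClientUpdate(k, w): E local epochs of minibatch SGD on one client's data."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model.state_dict()

def fedavg_round(global_model, client_loaders, loss_fn, frac=1.0, epochs=3, lr=1e-3):
    """One federation round: sample m clients, train locally, and average by data size."""
    m = max(int(frac * len(client_loaders)), 1)
    sampled = random.sample(client_loaders, m)                     # S_t
    n_total = sum(len(loader.dataset) for loader in sampled)
    new_state = {k: torch.zeros_like(v, dtype=torch.float32)
                 for k, v in global_model.state_dict().items()}
    for loader in sampled:
        local = copy.deepcopy(global_model)                        # start from w_t
        w_k = client_update(local, loader, loss_fn, epochs, lr)
        weight = len(loader.dataset) / n_total                     # n_k / n
        for name in new_state:
            new_state[name] += weight * w_k[name].float()          # weighted average
    global_model.load_state_dict(new_state)
    return global_model
```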
• Reward-Driven Approaches
– Objective: Encourage participants to contribute data while protecting their privacy.
– Methodology: Implement incentive structures that reward data contributors, strik-
ing a balance between participation encouragement and privacy preservation.
In our FL approach, differential privacy is implemented through perturbation: randomly generated noise is added to each patient's data points at every client before training begins, and noise is also injected into the local models to improve privacy and to avoid accidental or intentional leakage of sensitive information. Additionally, data augmentation techniques are employed to introduce diversity. During training, the data are shuffled at each epoch, which helps to avoid overfitting and improves privacy by preventing the model from memorizing the order of the data points. In this approach,
a comprehensive security analysis is conducted to identify and address vulnerabilities,
ensuring the robustness and integrity of our FL system against potential threats, guiding
the development of effective countermeasures; see Figure 5.
[Figure 5. Threat modeling for the multi-institutional horizontal FL setting: non-IID data, information leakage, limited communication, and random client disconnection.]
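A minimal sketch of the perturbation step described above is given below: zero-mean noise is added to each patient's input volume before training and to the local model weights before they are shared. Gaussian noise and the specific scales are illustrative assumptions; they are not calibrated privacy budgets.

```python
import torch

def perturb_inputs(volume, sigma=0.01):
    """Add zero-mean Gaussian noise to one patient's (C, D, H, W) input volume."""
    return volume + sigma * torch.randn_like(volume)

def perturb_local_model(state_dict, sigma=1e-3):
    """Add zero-mean Gaussian noise to every local weight tensor before sharing it."""
    return {name: w + sigma * torch.randn_like(w.float())
            for name, w in state_dict.items()}
```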
Encryption and security measures, while crucial, are reserved for future works. To
avoid the leakage of sensitive information during model sharing and to prevent potential
adversaries from launching attacks, we carefully select a random subset of local models’
training weights to share during each federation round. This measure, combined with
differential privacy, mitigates the risk of reverse attacks [47]. To address the high latency
and limited bandwidth issues, communication between clients and the server is limited
to once per round (every three epochs), starting from the ninth epoch. This strategic approach
serves to both reduce the number of communication rounds and simultaneously limit the
size of transmitted messages, optimizing communication and computational overhead.
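Sharing only a random subset of each local model's trained weights can be sketched as below, where unselected entries are transmitted as zeros so that the server-side rules described next can skip them; the per-tensor sharing ratio is an illustrative assumption.

```python
import torch

def random_weight_subset(state_dict, share_ratio=0.5):
    """Keep a random fraction of each trained weight tensor and zero out the rest.

    Zeroed entries mark weights that are *not* shared in this federation round,
    which the server-side aggregation rules are designed to skip.
    """
    shared = {}
    for name, w in state_dict.items():
        keep = (torch.rand_like(w.float()) < share_ratio).to(w.dtype)
        shared[name] = w * keep
    return shared
```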
During each federation round, clients transmit their local models to the aggregator to
facilitate the update of global model parameters. On the server side, the algorithm scruti-
nizes each weight in every layer. The update process adheres to the following principles:
• Handling Zero Weights
– If a weight is 0 for all clients, the algorithm retains the previous global model
weight for that position in the new model.
– If a weight is 0 for all clients except one, the algorithm incorporates the nonzero
weight from the single client into the new model.
– If weights are nonzero for a subset of clients, the algorithm computes the average
of those nonzero weights and disregards clients with 0 weights.
– If weights are nonzero for all clients, the algorithm computes the average of
those weights.
• Global Model Update
– The algorithm concludes by computing a simple average between the previous
global model and the new model weights.
– The adjusted server model is shared with all participating clients in preparation
for the forthcoming federation round.
This PP algorithm ensures that sensitive information is carefully managed during the
FL process, striking a balance between privacy preservation and model utility.
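The server-side rules above can be sketched as follows: positions that are zero for every client keep the previous global value, nonzero positions are averaged over only the clients that reported them, and the result is then averaged with the previous global model before being broadcast for the next round. Tensor names and dtypes are simplified assumptions.

```python
import torch

def aggregate_with_zero_handling(prev_global, client_states):
    """Element-wise aggregation that ignores zeroed (unshared) client weights."""
    new_state = {}
    for name, prev_w in prev_global.items():
        stacked = torch.stack([cs[name].float() for cs in client_states])   # (K, ...)
        nonzero = (stacked != 0).float()
        count = nonzero.sum(dim=0)                       # clients reporting this entry
        summed = (stacked * nonzero).sum(dim=0)
        # Average over reporting clients; keep the previous global value where count == 0.
        averaged = torch.where(count > 0, summed / count.clamp(min=1), prev_w.float())
        # Final step: simple average of the previous global model and the new weights.
        new_state[name] = 0.5 * (prev_w.float() + averaged)
    return new_state
```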
Figure 7. Partial federated mode analysis: examination of the segmentation results on the validation
set, showcasing the contributions from each site. (a–d) display the ET, TC, and WT Dice scores, along
with the validation loss for Clients 1 to 4, respectively.
Upon analysis of the results, a discernible trend emerged. The global model is stable
and consistently performs well across epochs, with lower validation loss compared to
individual clients. In contrast, the federated clients’ models, while achieving scores that
are slightly lower than the global model, demonstrate a lack of stability, manifesting as
numerous drops in all evaluated metrics, including ET Dice score, TC Dice score, WT
Dice score, and validation loss. This observation underscores the intricate dynamics of
the FL process and prompts a closer examination of the factors influencing the fluctuating
performance of individual clients within the federated framework.
Figure 8. Partial federated mode analysis: validation loss and Dice score metrics. (a) Validation loss
comparison: server vs. clients. (b) Dice score metrics’ plot: server aggregator performance.
Figure 9. Full federated mode analysis: examination of the segmentation results on the validation set,
showcasing the contributions from each site. (a–d) display the ET, TC, and WT Dice scores, along
with the validation loss for Clients 1 to 4, respectively.
Figure 10a,b illustrate the aggregated results from the server, offering a comprehensive
view of the overall model performance. The global model demonstrates stability and
consistent performance across epochs, achieving a lower validation loss compared to
individual clients. While the federated clients’ models achieve competitive scores, they have
slightly lower performance metrics than the global model. The individual client models
show fluctuations and drops in different metrics, indicating variability in their performance.
Figure 10. Full federated mode analysis: validation loss and Dice score metrics. (a) Validation loss
comparison: server vs. clients. (b) Dice score metrics’ plot: server aggregator performance.
On the server side, each federated round takes approximately 10 s for training and
the same duration for validation (110 s). Therefore, completing each federated round
requires approximately 2108 s (about 35 min). These detailed timings provide insights
into the computational costs and resource requirements associated with the FL approach.
A closer inspection of Figure 12, which illustrates the metrics of those models over their
last 118 epochs, facilitates a comprehensive analysis and visualization. We observe that
all models achieve a segmentation DSC above 80%. However, the centralized model
demonstrates superior performance, as evidenced by a lower validation loss. Particularly
noteworthy is the TC segmentation, where the centralized model attains values closer to
0.89. The federated models yield comparable segmentation results in terms of WT DSC,
aligning closely with the centralized model. In the ET segmentation, the partial federated
model trails slightly behind the full federated model, yet both experience a marginal drop
of approximately 1% in performance compared to the centralized model. This nuanced
analysis sheds light on the trade-offs between the efficiency of centralized training and the
collaborative nature of FL, offering valuable insights into the model convergence dynamics
and segmentation outcomes.
Figure 12. First step of validation comparison: centralized vs. partial federated vs. full federated
approach. (a–c) Three cases of Dice score comparison, representing ET, TC, and WT, respectively.
(d) Validation loss comparison.
Table 1. A comparative analysis of the performance of various approaches on the validation set. The
metrics are presented as mean values. The highest score is indicated in bold.
Figure 13. Second step of performance comparison: federated vs. centralized learning. (a) Dice values. (b) Hausdorff distance. (c) Sensitivity.
On the test set, the federated models also perform strongly relative to the centralized model in terms of HD95, particularly the partial federated model, which exhibits the lowest
values for all tumor parts compared to both the centralized and full federated models.
Furthermore, regarding sensitivity, both federated models surpass the centralized model,
indicating that the HFL system, with the modified FedAvg as the aggregating algorithm, ex-
cels in multiclass segmentation tasks. This nuanced analysis underscores the effectiveness
of the FL approach, balancing privacy concerns with robust segmentation performance
on diverse datasets. For example, Figure 15 displays segmented tumors from the test
set. These visual representations offer a succinct overview of the models’ strengths and
weaknesses, aiding in interpreting the nuanced results discussed above.
Figure 14. Inference performance comparison: federated vs. centralized learning. (a) Dice values. (b) Hausdorff distance. (c) Sensitivity.
Figure 15. Visual segmentation results of proposed methods from two different patients’ data. Axial
slice of MRI images in two modalities, ground truth, and predicted results from both the centralized
and federated models.
Table 2. Model performance results on our own test set extracted from the BraTS2020 training dataset.
Bold values represent the best performance for each metric.
Figure 16. Performance comparison of the proposed method with state-of-the-art techniques [10–18] on the BraTS2020 test set in terms of Dice coefficient and Hausdorff distance. (a) Validation Dice values. (b) Validation Hausdorff distance.
Proposed Model (Centralized): 0.878 0.893 0.881 | 0.836 0.934 0.884 | 0.999 0.998 0.999 | 9.28 24.36 9.61
Proposed Model (Full Federated): 0.868 0.873 0.896 | 0.865 0.957 0.890 | 0.999 0.997 0.998 | 11.088 23.611 12.208
Proposed Model (Partial Federated): 0.866 0.875 0.898 | 0.846 0.965 0.883 | 0.999 0.998 0.999 | 8.321 22.959 8.683
Our results present a significant advance in the application of FL for brain tumor
segmentation, particularly in the context of PP medical imaging. Our results not only
contribute to the existing body of knowledge but also highlight the potential of FL to
address critical challenges in healthcare. Our model achieved DSCs of 0.898, 0.875, and
0.866 for the WT, TC, and ET, respectively. These results are consistent
with the performance measures reported in the recent literature, where DL models applied
to the Brats dataset have demonstrated similar or slightly lower levels of performance. For
example, studies using 2D and 3D convolutional neural networks reported DSCs ranging
from 0.85 to 0.90, indicating that our approach is competitive and effective for accurately
segmenting brain tumor regions. This performance is particularly noteworthy given the
complexities inherent in brain tumor imaging, including variations in tumor morphology
and the presence of surrounding anatomical structures. The integration of FL into our
methodology represents a paradigm shift in how medical imaging data can be used for
model learning. Traditional centralized approaches often face significant obstacles related
to data confidentiality, security, and regulatory compliance. By employing FL, we enable
model training on decentralized data sources, without the need to transfer sensitive patient
information. This approach not only mitigates confidentiality issues but also enhances the
potential for collaboration between institutions, enabling the sharing of diverse datasets
while adhering to ethical standards. Our results build on previous research highlighting
the benefits of FL in healthcare applications. For example, studies have shown that FL can
improve model performance while preserving patient data privacy, which promotes confi-
dence in AI applications in clinical settings. The successful implementation of differential
privacy techniques in our study further highlights the importance of protecting individual
patient data during the training process. This is essential as the risks of data leakage and
model inversion attacks pose significant threats to patient privacy in traditional ML frame-
works. Despite these promising results, our study has limitations. The performance of the
FL model may be influenced by data heterogeneity between different institutions, including
variations in imaging protocols and patient demographics. Future research should focus on
expanding the dataset to encompass a wider range of tumor types and imaging modalities,
enabling a more comprehensive assessment of the model’s performance. In addition, the
6. Conclusions
In this research, we demonstrated the success of FL techniques in improving brain
tumor segmentation from MRI images while prioritizing patient data privacy. Our analysis
of partial and full federated deep models versus a centralized approach revealed that feder-
ated models achieved comparable segmentation performance, with notable advantages in
terms of sensitivity and robustness, particularly in WT segmentation. The results obtained
indicated that while the centralized model showed slightly superior performance metrics,
the federated models maintained a high level of accuracy, with only marginal decreases
in DSC for the enhancing tumor (ET) and tumor core (TC) regions. Furthermore, the partial
federated model outperformed its counterparts in terms of the HD95 metric, suggesting
better delineation of tumor boundaries, which is essential for clinical applications. This
study highlights the potential of FL to facilitate collaborative research between institutions
without compromising sensitive patient data, thereby removing an important hurdle in
the field of medical imaging. The results support the wider adoption of FL methodolo-
gies in medical applications, paving the way for future research to explore more complex
models and larger datasets, helping to improve patient outcomes and advance the field of
medical imaging.
Author Contributions: M.E.Y., M.D., M.G. and A.B. conceived and designed the work. A.R., S.T.,
A.A. and M.A.S. critically revised the article. All authors have read and agreed to the published
version of this manuscript.
Funding: This research was funded by the United Arab Emirates University, UAEU Strategic Research
Grant G00003676 (Fund No.: 12R136) through the Big Data Analytics Center.
Institutional Review Board Statement: Not applicable: This study did not involve humans or
animals.
Informed Consent Statement: Not Applicable.
Data Availability Statement: The data supporting the findings of this study are based on a publicly
available dataset in the reference: Brats2020 at https://www.med.upenn.edu/cbica/brats2020/
registration.html accessed on 20 November 2024.
Acknowledgments: The authors thank United Arab Emirates University for supporting this work
through the UAEU Strategic Research Grant G00003676 and SURE+ Grant G00004748.
Conflicts of Interest: The authors declare no conflicts of interest.
References
1. Siegel, R.L.; Giaquinto, A.N.; Jemal, A. Cancer statistics, 2024. CA A Cancer J. Clin. 2024, 74, 12–49. [CrossRef]
2. Litjens, G.; Kooi, T.; Bejnordi, B.E.; Setio, A.A.A.; Ciompi, F.; Ghafoorian, M.; van der Laak, J.A.; van Ginneken, B.; Sánchez, C.I. A
survey on deep learning in medical image analysis. Med. Image Anal. 2017, 42, 60–88. [CrossRef] [PubMed]
3. Bouamrane, A.; Derdour, M. Enhancing Lung Cancer Detection and Classification Using Machine Learning and Deep Learning
Techniques: A Comparative Study. In Proceedings of the 2023 International Conference on Networking and Advanced Systems
(ICNAS), Algiers, Algeria, 21–23 October 2023; IEEE: New York, NY, USA, 2023; pp. 1–6.
4. Gasmi, M.; Derdour, M.; Gahmous, A. Transfer learning for the classification of small-cell and non-small-cell lung cancer. In
Proceedings of the International Conference on Intelligent Systems and Pattern Recognition, Hammamet, Tunisia, 24–26 March
2022; Springer: Berlin/Heidelberg, Germany, 2022; pp. 341–348.
5. Gasmi, M.; Derdour, M.; Gahmousse, A.; Amroune, M.; Bendjenna, H.; Sahraoui, B. Multi-Input CNN for molecular classification
in breast cancer. In Proceedings of the 2021 International Conference on Recent Advances in Mathematics and Informatics
(ICRAMI), Tebessa, Algeria, 21–22 September 2021; IEEE: New York, NY, USA, 2021; pp. 1–5.
6. Menaceur, S.; Derdour, M.; Bouramoul, A. Using Query Expansion Techniques and Content-Based Filtering for Personalizing
Analysis in Big Data. Int. J. Inf. Technol. Web Eng. (IJITWE) 2020, 15, 77–101. [CrossRef]
7. Mounir, A.; Adel, A.; Makhlouf, D.; Sébastien, L.; Philippe, R. A New Two-Level Clustering Approach for Situations Management
in Distributed Smart Environments. Int. J. Ambient. Comput. Intell. (IJACI) 2019, 10, 91–111. [CrossRef]
8. Kahil, M.S.; Bouramoul, A.; Derdour, M. GreedyBigVis–A greedy approach for preparing large datasets to multidimensional
visualization. Int. J. Comput. Appl. 2022, 44, 760–769. [CrossRef]
9. Kahil, M.S.; Bouramoul, A.; Derdour, M. Multi Criteria-Based Community Detection and Visualization in Large-scale Networks
Using Label Propagation Algorithm. In Proceedings of the 2021 International Conference on Recent Advances in Mathematics
and Informatics (ICRAMI), Tebessa, Algeria, 21–22 September 2021; IEEE: New York, NY, USA, 2021; pp. 1–6.
10. Silva, C.A.; Pinto, A.; Pereira, S.; Lopes, A. Multi-stage Deep Layer Aggregation for Brain Tumor Segmentation. In Brainlesion: Glioma,
Multiple Sclerosis, Stroke and Traumatic Brain Injuries; Crimi, A., Bakas, S., Eds.; Springer: Cham, Switzerland, 2021; pp. 179–188.
11. Sahoo, A.K.; Parida, P.; Muralibabu, K.; Dash, S. An improved DNN with FFCM method for multimodal brain tumor segmentation.
Intell. Syst. Appl. 2023, 18, 200245. [CrossRef]
12. Hu, J.; Gu, X.; Wang, Z.; Gu, X. Mixture of calibrated networks for domain generalization in brain tumor segmentation. Knowl.
-Based Syst. 2023, 270, 110520. [CrossRef]
13. Liu, Z.; Wei, J.; Li, R.; Zhou, J. Learning multi-modal brain tumor segmentation from privileged semi-paired MRI images with
curriculum disentanglement learning. Comput. Biol. Med. 2023, 159, 106927. [CrossRef]
14. Wang, Y.; Chen, J.; Bai, X. Gradient-assisted deep model for brain tumor segmentation by multi-modality MRI volumes. Biomed.
Signal Process. Control 2023, 85, 105066. [CrossRef]
15. Chang, Y.; Zheng, Z.; Sun, Y.; Zhao, M.; Lu, Y.; Zhang, Y. DPAFNet: A Residual Dual-Path Attention-Fusion Convolutional
Neural Network for Multimodal Brain Tumor Segmentation. Biomed. Signal Process. Control 2023, 79, 104037. [CrossRef]
16. Wang, Y.; Cao, Y.; Li, J.; Wu, H.; Wang, S.; Dong, X.; Yu, H. A lightweight hierarchical convolution network for brain tumor
segmentation. BMC Bioinform. 2021, 22, 636. [CrossRef] [PubMed]
17. Zhao, J.; Xing, Z.; Chen, Z.; Wan, L.; Han, T.; Fu, H.; Zhu, L. Uncertainty-aware multi-dimensional mutual learning for brain and
brain tumor segmentation. IEEE J. Biomed. Health Inform. 2023, 27, 4362–4372. [CrossRef] [PubMed]
18. Liu, Z.; Ma, C.; She, W.; Wang, X. TransMVU: Multi-view 2D U-Nets with transformer for brain tumour segmentation. IET Image
Process. 2023, 17, 1874–1882. [CrossRef]
19. Mohammed, B.A.; Senan, E.M.; Alshammari, T.S.; Alreshidi, A.; Alayba, A.M.; Alazmi, M.; Alsagri, A.N. Hybrid techniques of
analyzing mri images for early diagnosis of brain tumours based on hybrid features. Processes 2023, 11, 212. [CrossRef]
20. Senan, E.M.; Jadhav, M.E.; Rassem, T.H.; Aljaloud, A.S.; Mohammed, B.A.; Al-Mekhlafi, Z.G. Early diagnosis of brain tumour
mri images using hybrid techniques between deep and machine learning. Comput. Math. Methods Med. 2022, 2022, 8330833.
[CrossRef] [PubMed]
21. Mohammed, B.A.; Senan, E.M.; Al-Mekhlafi, Z.G.; Rassem, T.H.; Makbol, N.M.; Alanazi, A.A.; Almurayziq, T.S.; Ghaleb, F.A.;
Sallam, A.A. Multi-method diagnosis of CT images for rapid detection of intracranial hemorrhages based on deep and hybrid
learning. Electronics 2022, 11, 2460. [CrossRef]
22. Sheller, M.J.; Reina, G.A.; Edwards, B.; Martin, J.; Bakas, S. Multi-institutional deep learning modeling without sharing patient
data: A feasibility study on brain tumor segmentation. In Proceedings of the Brainlesion: Glioma, Multiple Sclerosis, Stroke and
Traumatic Brain Injuries: 4th International Workshop, BrainLes 2018, Held in Conjunction with MICCAI 2018, Granada, Spain, 16
September 2018; Revised Selected Papers, Part I 4; Springer: Berlin/Heidelberg, Germany, 2019; pp. 92–104.
23. Qiu, L.; Cheng, J.; Gao, H.; Xiong, W.; Ren, H. Federated semi-supervised learning for medical image segmentation via
pseudo-label denoising. IEEE J. Biomed. Health Inform. 2023, 27, 4672–4683. [CrossRef]
24. Elbachir, Y.M.; Makhlouf, D.; Mohamed, G.; Bouhamed, M.M.; Abdellah, K. Federated Learning for Multi-institutional on 3D
Brain Tumor Segmentation. In Proceedings of the 2024 6th International Conference on Pattern Analysis and Intelligent Systems
(PAIS), El Oued, Algeria, 24–25 April 2024; IEEE: New York, NY, USA, 2024; pp. 1–8.
25. Xu, X.; Deng, H.H.; Gateno, J.; Yan, P. Federated multi-organ segmentation with inconsistent labels. IEEE Trans. Med. Imaging
2023, 42, 2948-2960. [CrossRef]
26. Liu, Q.; Chen, C.; Qin, J.; Dou, Q.; Heng, P.A. Feddg: Federated domain generalization on medical image segmentation via
episodic learning in continuous frequency space. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern
Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 1013–1023.
27. Agrawal, S.; Sarkar, S.; Aouedi, O.; Yenduri, G.; Piamrat, K.; Alazab, M.; Bhattacharya, S.; Maddikunta, P.K.R.; Gadekallu,
T.R. Federated Learning for intrusion detection system: Concepts, challenges and future directions. Comput. Commun. 2022,
195, 346–361. [CrossRef]
28. Lazzarini, R.; Tianfield, H.; Charissis, V. Federated learning for IoT intrusion detection. AI 2023, 4, 509–530. [CrossRef]
29. Wang, W.; He, F.; Li, Y.; Tang, S.; Li, X.; Xia, J.; Lv, Z. Data information processing of traffic digital twins in smart cities using edge
intelligent federation learning. Inf. Process. Manag. 2023, 60, 103171. [CrossRef]
30. Zhou, F.; Liu, S.; Fujita, H.; Hu, X.; Zhang, Y.; Wang, B.; Wang, K. Fault diagnosis based on federated learning driven by dynamic
expansion for model layers of imbalanced client. Expert Syst. Appl. 2024, 238, 121982. [CrossRef]
31. Chen, J.; Xue, J.; Wang, Y.; Huang, L.; Baker, T.; Zhou, Z. Privacy-Preserving and Traceable Federated Learning for data sharing in
industrial IoT applications. Expert Syst. Appl. 2023, 213, 119036. [CrossRef]
32. Qi, T.; Wu, F.; Wu, C.; He, L.; Huang, Y.; Xie, X. Differentially private knowledge transfer for federated learning. Nat. Commun.
2023, 14, 3785. [CrossRef] [PubMed]
33. Shao, J.; Wu, F.; Zhang, J. Selective knowledge sharing for privacy-preserving federated distillation without a good teacher. Nat.
Commun. 2024, 15, 349. [CrossRef] [PubMed]
34. Li, W.; Milletarì, F.; Xu, D.; Rieke, N.; Hancox, J.; Zhu, W.; Baust, M.; Cheng, Y.; Ourselin, S.; Cardoso, M.J.; et al. Privacy-
preserving federated brain tumour segmentation. In Proceedings of the Machine Learning in Medical Imaging: 10th International
Workshop, MLMI 2019, Held in Conjunction with MICCAI 2019, Shenzhen, China, 13 October 2019; Proceedings 10; Springer:
Berlin/Heidelberg, Germany, 2019; pp. 133–141.
35. Ziller, A.; Passerat-Palmbach, J.; Ryffel, T.; Usynin, D.; Trask, A.; Junior, I.D.L.C.; Mancuso, J.; Makowski, M.; Rueckert, D.; Braren,
R.; et al. Privacy-preserving medical image analysis. arXiv 2020, arXiv:2012.06354.
36. Lu, Y.; Huang, X.; Zhang, K.; Maharjan, S.; Zhang, Y. Blockchain and federated learning for 5G beyond. IEEE Netw. 2020, 35, 219–225.
[CrossRef]
37. Xu, G.; Zhou, Z.; Dong, J.; Zhang, L.; Song, X. A blockchain-based federated learning scheme for data sharing in industrial
internet of things. IEEE Internet Things J. 2023, 10, 21467–21478. [CrossRef]
38. Sameera, K.M.; Nicolazzo, S.; Arazzi, M.; Nocera, A.; KA, R.R.; Vinod, P.; Conti, M. Privacy-Preserving in Blockchain-based
Federated Learning Systems. arXiv 2024, arXiv:2401.03552.
39. Kaissis, G.A.; Makowski, M.R.; Rückert, D.; Braren, R.F. Secure, privacy-preserving and federated machine learning in medical
imaging. Nat. Mach. Intell. 2020, 2, 305–311. [CrossRef]
40. Wang, Y.; Su, Z.; Zhang, N.; Benslimane, A. Learning in the air: Secure federated learning for UAV-assisted crowdsensing. IEEE
Trans. Netw. Sci. Eng. 2020, 8, 1055–1069. [CrossRef]
41. Zhang, C.; Li, S.; Xia, J.; Wang, W.; Yan, F.; Liu, Y. {BatchCrypt}: Efficient homomorphic encryption for {Cross-Silo} federated
learning. In Proceedings of the 2020 USENIX Annual Technical Conference (USENIX ATC 20), Online, 15–17 July 2020;
pp. 493–506.
42. Zhang, L.; Xu, J.; Vijayakumar, P.; Sharma, P.K.; Ghosh, U. Homomorphic encryption-based privacy-preserving federated learning
in iot-enabled healthcare system. IEEE Trans. Netw. Sci. Eng. 2022, 10, 2864–2880. [CrossRef]
43. Henry, T.; Carré, A.; Lerousseau, M.; Estienne, T.; Robert, C.; Paragios, N.; Deutsch, E. Brain Tumor Segmentation with Self-
ensembled, Deeply-Supervised 3D U-Net Neural Networks: A BraTS 2020 Challenge Solution. Lect. Notes Comput. Sci. (Incl.
Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinform.) 2021, 12658 LNCS, 327–339. [CrossRef]
44. Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Proceedings of
the Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich,
Germany, 5–9 October 2015; Proceedings, Part III 18; Springer: Berlin/Heidelberg, Germany, 2015; pp. 234–241.
45. Brendan McMahan, H.; Moore, E.; Ramage, D.; Hampson, S.; Agüera y Arcas, B. Communication-efficient learning of deep
networks from decentralized data. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics,
AISTATS 2017, Lauderdale, FL, USA, 20–22 April 2017; Volume 54.
46. Konečný, J.; McMahan, B.; Ramage, D. Federated Optimization: Distributed Optimization Beyond the Datacenter. arXiv 2015,
arXiv:1511.03575.
47. Hitaj, B.; Ateniese, G.; Perez-Cruz, F. Deep models under the GAN: information leakage from collaborative deep learning. In
Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, Dallas, TX, USA, 30 October–3
November 2017; pp. 603–618.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.