19602 IEEE INTERNET OF THINGS JOURNAL, VOL. 11, NO. 11, 1 JUNE 2024

Federated Learning Meets Blockchain in Decentralized Data Sharing: Healthcare Use Case
Saeed Hamood Alsamhi, Raushan Myrzashova, Ammar Hawbani, Santosh Kumar, Member, IEEE, Sumit Srivastava, Liang Zhao, Xi Wei, Mohsen Guizan, Fellow, IEEE, and Edward Curry

Abstract—In the era of data-driven healthcare, the amalgamation of blockchain and federated learning (FL) introduces a paradigm shift toward secure, collaborative, and patient-centric data sharing. This article pioneers the exploration of the conceptual framework and technical synergy of FL and blockchain for decentralized data sharing, aiming to strike a balance between data utility and privacy. FL, a decentralized machine learning paradigm, enables collaborative AI model training across multiple healthcare institutions without sharing raw patient data. Combined with blockchain, a transparent and immutable ledger, it establishes an ecosystem fostering trust, security, and data integrity. This article elucidates the technical foundations of FL and blockchain, unravelling their roles in reshaping healthcare data sharing. This article vividly illustrates the potential impact of this fusion on patient care. The proposed approach preserves patient privacy while granting healthcare providers and researchers access to diversified data sets, ultimately leading to more accurate models and improved diagnoses. The findings underscore the potential acceleration of medical research, improved treatment outcomes, and patient empowerment through data ownership. The synergy of FL and blockchain envisions a healthcare ecosystem that prioritizes individual privacy and propels advancements in medical science.

Index Terms—Blockchain, data sharing, Dataspace 4.0, decentralized data sharing, federated learning (FL), healthcare, Industry 4.0, Industry 5.0, IoE.

Manuscript received 27 November 2023; revised 5 February 2024; accepted 13 February 2024. Date of publication 19 February 2024; date of current version 23 May 2024. This work was supported by the Science Foundation Ireland under Grant SFI/12/RC/2289_P2. The work of Raushan Myrzashova was supported by the ANSO Scholarship for Young Talents. The work of Xi Wei was supported by the Natural Science Foundation of Anhui Province under Grant BJ2060000039. (Corresponding author: Saeed Hamood Alsamhi.)
Saeed Hamood Alsamhi is with the Insight Centre for Data Analytics, University of Galway, Galway, H91 TK33 Ireland, and also with the Faculty of Engineering, IBB University, Ibb, Yemen (e-mail: Saeed.alsamhi@insight-centre.org).
Raushan Myrzashova is with the School of Computer Science and Technology, University of Science and Technology of China, Hefei 230026, Anhui, China (e-mail: [email protected]).
Ammar Hawbani and Liang Zhao are with the School of Computer Science, Shenyang Aerospace University, Shenyang 110136, China (e-mail: [email protected]; [email protected]).
Santosh Kumar is with the Department of Computer Science and Engineering, International Institute of Information Technology-Naya Raipur, Atal Nagar-Nava Raipur 493661, India (e-mail: [email protected]).
Sumit Srivastava is with the Department of Electronics and Communication Engineering, FET, MJP Rohilkhand University, Bareilly 243001, India (e-mail: [email protected]).
Xi Wei is with the Department of Chemistry, University of Science and Technology of China, Hefei 230026, Anhui, China (e-mail: [email protected]).
Mohsen Guizan is with the Machine Learning Department, Mohamed Bin Zayed University of Artificial Intelligence, Abu Dhabi, UAE (e-mail: [email protected]).
Edward Curry is with the Insight Centre for Data Analytics, University of Galway, Galway, H91 TK33 Ireland (e-mail: edward.curry@insight-centre.org).
Digital Object Identifier 10.1109/JIOT.2024.3367249
2327-4662 © 2024 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.

I. INTRODUCTION

The rapid development of the Internet of Things (IoT), cloud computing, and big data has led to Dataspace 4.0, a digital ecosystem where massive amounts of data from various sources are seamlessly integrated and shared among stakeholders. Dataspace 4.0, funded by the European Union, aims to establish shared principles for exchanging manufacturing data at the EU level; Dataspace 4.0 is to pave the way for a unified manufacturing data ecosystem and foster the formation of a cohesive European community focused on Dataspace 4.0 [1]. Therefore, data sharing is essential in Dataspace 4.0 to create a coherent European community and a unified industrial data environment. With the advent of the sixth generation (6G), the capabilities of Dataspace 4.0 are expected to be further enhanced, providing new opportunities for data-driven applications and services. Dataspace 4.0 refers to the next generation of data management systems expected to enable the integration and sharing of data across various industries and domains [2]. Varga et al. [3] discussed how advanced technologies and the needs set for 6G affect Industry 4.0 developments based on massive data. The foundation of Industry 4.0 is data sharing, which facilitates smooth communication between entities, machines, and processes, improving operational excellence, decision making, and resource usage. Furthermore, Han et al. [4] provided a vision for a 6G industrial digital twin (DT) ecosystem to bridge the gaps between machines, humans, and data infrastructure to enable numerous applications. As a result, data sharing is essential to achieving the full potential of Industry 4.0 and Dataspace 4.0, not merely necessary.

The safe and ethical sharing of private patient data is a crucial challenge when healthcare data is expanding exponentially, and there is an increasing demand for data-driven medical advancements. Healthcare institutions, researchers, and patients need to strike a delicate balance between the utility of aggregated medical data for scientific progress and the paramount importance of preserving individual privacy and data security. The challenge has spurred the emergence of innovative technologies poised to reshape the landscape of healthcare data sharing. Data sharing has become an essential component of modern society, enabling businesses,
governments, and individuals to access and analyze vast amounts of data for various purposes, such as research, decision making, and innovation. However, centralized data-sharing systems have limitations, such as data privacy and security issues [5], interoperability issues [6], and single points of failure [7]. To address these challenges, decentralized data sharing has emerged as a promising alternative that distributes data across multiple nodes or peers without needing a central authority or intermediary. In addition, decentralized data sharing offers several benefits, such as increased privacy and security, improved data ownership and control, and enhanced transparency and accountability [8].

Decentralized data sharing is an essential aspect of Dataspace 4.0, as it allows multiple parties to share data without needing a central authority or intermediary [9], leading to improved collaboration, increased data privacy and security, and the potential for new business models and revenue streams. Several decentralized data-sharing technologies and techniques, such as federated learning (FL) [10] and blockchain [11], have emerged as promising solutions to address these challenges. The technologies above have been applied in various domains, such as healthcare, finance, and the IoT, to address specific use cases and requirements. Two such technologies, FL and blockchain, have garnered significant attention for their potential to solve this conundrum. FL, a decentralized machine learning (ML) approach that Google pioneered [12], offers a novel paradigm for collaborative model training across a network of data sources without centralizing raw data. It inherently safeguards data privacy at its source, a crucial factor in healthcare, where data confidentiality is sacrosanct [13]. Initially developed as the underlying technology for cryptocurrencies like Bitcoin [14], blockchain has transcended its financial origins to become a secure and immutable ledger capable of ensuring data integrity and transparency. Its characteristics are well suited to address the need for trust and accountability in data-sharing ecosystems [15]. Despite the potential benefits of decentralized data sharing, several challenges and limitations are associated with the above technologies, such as scalability, interoperability, and regulatory compliance.

In this article, we explore the intersection between FL and blockchain in the context of decentralized data sharing, with a particular focus on the healthcare sector. Our objective is to unravel the synergies between these two technologies, shedding light on how they can be harnessed to revolutionize healthcare data sharing while preserving individual privacy and fostering collaboration. The significance of this article extends beyond theoretical exploration and embraces practical implications for healthcare institutions, researchers, and, ultimately, patients. The combination of blockchain technology and FL has become a game-changer in the quickly developing field of data-driven technologies, providing a fresh approach to decentralized data sharing. In the context of a decentralized data-sharing framework, this article examines the synergies between these two technologies, highlighting how they could transform collaborative data sharing while protecting individual privacy and promoting smooth collaboration.

A. Motivation and Contributions

Modern societies depend on data sharing because it promotes cooperation, spurs innovation, and increases industry transparency [16]. Although it is essential to research, development, and the welfare of society, the explosion in data generation, especially since the introduction of the 6G network and the spread of the IoT, brings new difficulties. Once shared, centralized data-sharing solutions now have privacy, security, and accessibility issues. To overcome these obstacles, this article proposes a paradigm shift toward decentralized data sharing by utilizing blockchain technology and FL. The synergy of blockchain and FL strategy guarantees enhanced security, privacy, and a strong barrier against unwanted access and possible data breaches. Furthermore, it offers protection from changing cyber threats by sharing power and leveraging blockchain's advantages.

Moreover, the synergy of FL and blockchain gives stakeholders unparalleled control over data in addition to security [17]. It creates an environment of trust and accountability among players by protecting intellectual property rights and promoting openness. At the vanguard of transforming healthcare data exchange, the synergy strategy goes beyond satisfying urgent needs. Safe, effective, patient-centered data sharing will speed up medical research, enhance patient care, and accelerate improvements in healthcare. Our proposed paradigm stands out for resolving the conventional tradeoff between privacy and data sharing. Not only does it comply with strict regulations, but it also dramatically increases productivity and openness in the healthcare industry. In addition to providing a comprehensive solution, our work establishes a new benchmark for the interchange of healthcare information. The combination of blockchain technology and FL promises to transform the healthcare industry by promoting scientific breakthroughs, enhancing patient care, and guaranteeing legal compliance.

Data sharing is pivotal in shaping modern societies, offering myriad benefits that span individuals, organizations, and the broader community [16]. It fosters collaboration, drives efficiencies, and fosters innovation across various sectors. Data sharing enhances transparency and accountability, acting as a bulwark against corruption and building trust among stakeholders [18]. It also streamlines resource utilization, leading to significant cost savings and productivity gains. In public services, data sharing catalyzes research and development, particularly in critical areas like healthcare, environmental conservation, and societal well-being. However, the landscape of data sharing is not without its complexities. With the proliferation of the IoT and the advent of the 6G network, there has been an exponential increase in data generation, presenting both opportunities and challenges. Data sharing in this context raises significant privacy, security, and interoperability concerns, necessitating a careful balance between innovation and risk mitigation. Centralized data-sharing models, traditionally prevalent, are increasingly seen as inadequate due to their inherent privacy and security limitations, reliance on singular management entities, and accessibility challenges. This article argues for a shift toward decentralized data sharing,
utilizing FL and blockchain technology. Such a decentralized approach leverages distributed computing for efficiency and scalability while harnessing blockchain's strengths in immutability and security. This method promises enhanced security and privacy, mitigating risks like unauthorized access and data breaches. It also empowers stakeholders by granting greater control over data, fostering transparency, and safeguarding intellectual property rights. Additionally, it promotes interoperability and seamless data exchange, thereby reducing fragmentation and improving collaboration.

Our work is at the forefront of reshaping healthcare data sharing by exploring the synergistic potential of FL and blockchain technologies. Our approach addresses the critical needs of secure, efficient, and patient-centric healthcare data sharing in a world increasingly driven by data. We propose an innovative framework that enables healthcare institutions, researchers, and patients to share data securely and efficiently. This approach not only enhances patient care and accelerates medical research but also promises greater accuracy in diagnoses, personalized treatment options, and rapid advancements in medical science. The primary driving force behind our work is the need to bridge the gap between collaborative healthcare research and the imperative to protect patient data privacy. Our proposed decentralized data-sharing model effectively resolves the traditional tradeoff between sharing and privacy. It aligns with stringent regulatory requirements while boosting efficiency, transparency, and trust in the healthcare sector.

The main contributions of this article are encapsulated in the development of a groundbreaking, patient-centric framework for healthcare data sharing in the 6G era, integrating FL and blockchain technologies. This integration is poised to revolutionize the healthcare landscape, fostering advancements in research, improving patient care, and ensuring regulatory compliance, all while maintaining a steadfast focus on patient privacy. We offer a comprehensive solution to decentralized data sharing, setting a new standard in healthcare information exchange.
1) Innovative Integration of Technologies: We propose a novel framework combining FL and blockchain for healthcare data sharing. This synergy addresses complex challenges related to data security and efficient sharing, ensuring a patient-centric approach.
2) Privacy-Preserving Data-Sharing Model: Our work introduces a privacy-preserving framework for healthcare data sharing. By amalgamating FL's capability to train models without exposing raw data and blockchain's strength in maintaining data integrity, we ensure patient information's confidentiality and immutability.
3) Empowerment of Patients in Data Sharing: The proposed model enhances patient empowerment by allowing them to maintain control over their healthcare data. Blockchain technology enables patients to participate in medical research while actively preserving data ownership.
4) Healthcare Use Case Application: We demonstrate the practical applicability of our framework through a healthcare use case. This approach leads to more accurate medical models, personalized treatment options, and improved patient care, revolutionizing healthcare.
5) Bridging the Privacy-Utility Gap in Healthcare: Our framework addresses the critical balance between collaborative healthcare research and patient data privacy, aligning with stringent regulatory standards and enhancing transparency and trust in healthcare data sharing.

B. Related Work

Industry 4.0 is characterized by integrating several cutting-edge technologies, such as the Industrial IoT (IIoT), artificial intelligence (AI), including augmented intelligence, big data analytics, ML, and deep learning (DL), and edge–fog cloud computing. These technologies are driving the next phase of digital transformation [28], [29], [30]. However, unlocking the full potential of IIoT requires cross-company collaboration, such as multiparty computation, pooled analyses, data sharing, and data exchanging within a network of collaborators or organizations, which is essential to overcome the significant fragmentation of data. The integration of FL, blockchain technology, and healthcare data sharing has attracted increasing research interest. Numerous studies have examined the technologies individually and in conjunction to address the pressing challenges of healthcare data privacy, security, and collaborative research. Table I summarizes the comparison of existing related work.

FL in Healthcare: FL allows multiple parties to train an ML model collaboratively without sharing raw data. Liu et al. [31] proposed an FL-based approach for decentralized data sharing in the IIoT. The authors showed that their approach achieved better accuracy and reduced communication overhead compared to traditional centralized learning. However, FL still faces challenges, such as the privacy–utility tradeoff and communication efficiency [32]. The combination of homomorphic encryption and FL enables privacy-preserving healthcare data analysis, demonstrating the feasibility of collaborative model training without exposing sensitive patient data [13]. The challenges, methods, and prospects, including their applications in the healthcare domain, are discussed in [33] and [34]. Moreover, FL is a privacy-preserving paradigm in healthcare, emphasizing its potential in medical research and the development of diagnostic models [35].

Blockchain in Healthcare: Blockchain is a decentralized and tamper-proof ledger that records transactions and stores data securely and transparently. Blockchain has been proposed as a potential solution for decentralized data sharing due to its ability to provide data immutability, auditability, and transparency. Makhdoom et al. [36] proposed a blockchain-based decentralized data-sharing framework that addressed data privacy and security concerns. Blockchain's relevance in healthcare has been extensively investigated. Chen et al. [15] examined the patient-centric blockchain model in healthcare, highlighting its capacity for secure and transparent health data management and sharing. Fatima et al. [37] provided a comprehensive review of blockchain's role in healthcare privacy and data security, focusing on its applications in
patient records, clinical trials, and supply chain management. Additionally, Fan et al. [38] explored secure multiparty computations using blockchain, with implications for privacy-preserving distributed prediction in healthcare analytics.

Integration of FL and Blockchain: While significant progress has been made in investigating FL and blockchain individually in healthcare, a notable gap in research exploring their synergistic potential exists. This article represents a pioneering effort to integrate these technologies specifically for decentralized healthcare data sharing. Our integration aims to harness the advantages of both approaches, such as FL's data privacy preservation and blockchain's data integrity, to address the challenges faced by traditional healthcare data-sharing methods.

TABLE I. COMPARISON OF EXISTING RELATED WORK

II. OVERVIEW

A. Decentralized Data Sharing

Decentralized data sharing refers to distributing data across a network of independent participants rather than relying on a centralized authority to manage and control access to the data. In a decentralized data-sharing system, each participant has a copy of the data and is responsible for maintaining and updating their copy. In addition, participants share data with other participants, either directly or through a P2P network, and access data shared by other participants. Decentralized data sharing is designed with security and privacy in mind to protect against data breaches and unauthorized access to sensitive information. Decentralized data sharing involves encryption, access controls, and other security measures to safeguard the data [39].

Decentralized data sharing represents a groundbreaking departure from traditional data-sharing approaches, offering many compelling advantages. Primarily, it fortifies data security through its distributed structure, rendering it resistant to targeted cyber-attacks or data breaches [40], [41]. Unlike centralized systems, where all data resides in a single location vulnerable to hacking [42], decentralized data sharing scatters data across a network of nodes, bolstering protection measures with encryption and access controls. Each node possesses a private key [43], ensuring only intended recipients can access shared data, even if the network is compromised. Furthermore, consensus algorithms verify data accuracy [44], fortifying the security and control over data access. Second, decentralized data sharing empowers individuals with heightened data privacy control. It eliminates the need for a central authority to manage data access, permitting individuals to grant access exclusively to trusted parties. Within this framework, data is distributed across nodes, safeguarded by cryptography. Each entity holds a private key for data encryption and decryption, assuring data privacy and thwarting unauthorized access. This approach significantly augments personal data privacy and control, aligning with contemporary demands for robust privacy measures to enable requirements of Industry 4.0 toward Industry 5.0 [45]. Fig. 1 illustrates the architecture of decentralized data sharing using blockchain, including components such as smart contracts, blockchain databases, and data governance mechanisms.

Fig. 1. Decentralized data sharing using blockchain.

Additionally, decentralized data sharing improves interoperability across diverse systems and organizations. It achieves this by embracing open standards and protocols that streamline data sharing among distinct platforms and applications. The result is reduced inefficiencies, redundancies, and delays in data exchange, facilitating seamless collaboration and resource optimization. Moreover, this decentralized approach enhances transparency by allowing all parties to access and validate shared data, cultivating trust and collaborative potential. Decentralized data-sharing systems use encryption to protect the data from unauthorized access or tampering. Each node in the network has a private key used to encrypt and decrypt data,
ensuring that only the intended recipient can access the data to prevent unauthorized access to the data and provide a greater level of security for the data. Therefore, decentralized data sharing improves resilience by creating a distributed network of nodes that continue to operate even if some nodes fail or are compromised and by using encryption to protect the data from unauthorized access or tampering, leading to a reduction of the risks associated with data sharing and enabling organizations to work more effectively and efficiently [46]. Table II outlines the differences between centralized and decentralized data sharing, emphasizing the superior resilience, privacy, and interoperability of the latter.

TABLE II. COMPARING CENTRALIZED AND DECENTRALIZED DATA SHARING

B. Blockchain

Blockchain technology is a formidable decentralized and distributed data-sharing solution renowned for its robust security and transparency features. Functioning as a ledger system, it organizes data into immutable and chronological blocks, authenticated through consensus mechanisms among a network of nodes, ensuring its accuracy and timeliness [47]. The successful implementation of a blockchain-based decentralized data-sharing system hinges on several key considerations. It must accommodate substantial data volumes and transactions, necessitating high scalability and performance. Robust security measures, including encryption and tamper-proofing, are vital to data integrity and confidentiality. Additionally, the versatility to support various applications and use cases, spanning financial transactions, supply chain management, and digital identity verification, is paramount [14]. Blockchain's essential attributes position it as a pivotal player in the evolution of data-sharing systems, such as Dataspace 4.0 and 6G, offering a pathway to highly secure, efficient, and transparent decentralized data-sharing platforms [45], [48].

Blockchain technology's prowess extends to enhancing interoperability in decentralized data sharing, offering a unified framework for secure and effective interaction among diverse systems and organizations. Blockchain-based systems facilitate secure data exchange while preserving data integrity using common data structures and cryptographic algorithms. Transparency, another hallmark feature of blockchain, ensures all participants maintain a shared, comprehensive view of data and its historical changes. It achieves this through a distributed ledger, creating an immutable, transparent record of all data transactions. This heightened transparency fosters trust among parties, promotes accountability, and ensures compliance. Furthermore, blockchain's resilience factor is crucial in decentralized data sharing, guaranteeing data availability despite system failures or network disruptions. Advanced consensus mechanisms bolster this resilience, rendering the system less susceptible to malicious attacks or data breaches [49]. Blockchain's multifaceted potential is vividly evident in various industries, including supply chain management, healthcare, and financial services. It offers secure and transparent data recording and sharing capabilities, enhances efficiency, accountability, and transparency, and presents novel solutions to industry-specific challenges. While blockchain holds immense promise, it is essential to acknowledge and address challenges, such as scalability, energy consumption, and regulatory frameworks, to fully harness its potential for decentralized data sharing across a spectrum of applications [49].

Every node in a decentralized blockchain network has a copy of the ledger. A new transaction is announced to the network whenever one is proposed. The transaction is then independently verified by nodes using pre-established protocols and regulations. The consensus process's primary goal is to reach a consensus over the ledger's current status. This keeps any one node from intentionally or mistakenly changing the blockchain by requiring all nodes to verify and concur on the sequence and legitimacy of transactions. Every node in the network is equal and cooperates to keep the blockchain current. These nodes divide up the transaction processing, including consensus-building and validation. Blockchain's decentralization guarantees that no single entity controls the network. Rather, a democratic consensus is reached among nodes through the consensus process. To enhance security and resilience, no single organization can dictate changes to the blockchain. Blockchain technology's core feature is the distribution of processing among the network of nodes. It guarantees that the system is resilient to attacks, strong, and able to unite different people when trust is lacking. In conclusion, the blockchain's distributed bulk processing site highlights the decentralized character of consensus processes across the nodes. This decentralized processing enhances the blockchain's security, transparency, and reliability.

C. Federated Learning

FL presents an innovative approach to ML that prioritizes collaborative model training while preserving data privacy and security [50]. In this decentralized paradigm, each participating entity retains its data on its local device or network, eliminating the need to transmit sensitive information to a centralized repository. FL operates by having each participant train an ML model using their local data and sending model updates to a central server, reflecting the parameter differences post-training. The server aggregates these updates from all participants, typically through algorithms like averaging
or median computation. The central server then returns an updated global model to each participant. This iterative process of local training, update transmission, and model retrieval continues until the global model reaches an acceptable level of accuracy or satisfies other predetermined criteria. The inherent structure of FL facilitates participation from multiple parties in the ML process without necessitating the sharing of raw data. Utilizing a standardized ML model across all participants ensures consistent application, ultimately leading to a more accurate global model. However, the effectiveness of FL relies heavily on a robust communication infrastructure for efficient model exchange between participants and the central server. Weak infrastructure or connectivity can delay model updates and compromise learning processes [51].
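
The local-training and aggregation loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the model is a hypothetical one-dimensional linear regressor, the function names (`local_update`, `aggregate`, `federated_round`) and the three toy data sets are invented for the example, and weighting the average by local data-set size is an assumption borrowed from the common FedAvg convention rather than something this article specifies.

```python
from statistics import median


def local_update(global_weights, local_data, lr=0.1, epochs=5):
    """Train y = w*x + b on one participant's data; return the parameter delta."""
    w, b = global_weights
    n = len(local_data)
    for _ in range(epochs):
        # Gradients of the mean squared error on the local data only.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in local_data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in local_data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    # Only the parameter difference is returned, never the raw (x, y) pairs.
    return [w - global_weights[0], b - global_weights[1]]


def aggregate(updates, sizes=None, rule="mean"):
    """Combine per-participant deltas coordinate-wise (weighted mean or median)."""
    dims = range(len(updates[0]))
    if rule == "median":
        return [median(u[i] for u in updates) for i in dims]
    sizes = sizes or [1] * len(updates)  # assumption: weight by data-set size
    total = sum(sizes)
    return [sum(s * u[i] for u, s in zip(updates, sizes)) / total for i in dims]


def federated_round(global_weights, datasets, rule="mean"):
    """One round: local training, update transmission, aggregation, new global model."""
    updates = [local_update(global_weights, data) for data in datasets]
    delta = aggregate(updates, [len(data) for data in datasets], rule)
    return [g + d for g, d in zip(global_weights, delta)]


# Three hypothetical participants whose private data all follow y = 2x + 1.
datasets = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (0.5, 2.0)],
    [(1.5, 4.0), (-1.0, -1.0), (0.2, 1.4)],
]
weights = [0.0, 0.0]
for _ in range(200):  # iterate until the global model is accurate enough
    weights = federated_round(weights, datasets)
```

In this sketch only parameter differences cross the network; the `(x, y)` pairs never leave the participant that owns them. Passing `rule="median"` swaps in the coordinate-wise median, which is less sensitive to a single outlying update.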
FL significantly augments data security and privacy by retaining sensitive information locally, thereby reducing the risk of data breaches during transmission [51]. Since raw data remains on local devices, potential attackers face formidable challenges accessing sensitive information. Compromising multiple devices to reconstruct a complete data set is considerably more complex than targeting a single centralized server. Moreover, FL's design ensures only model updates, typically aggregated and abstracted information, are transmitted to the central server. These updates do not reveal the raw data from which they were derived, further fortifying data privacy [52]. Regarding privacy preservation, FL guarantees that user data remains private by avoiding central server sharing. Data remains confined to each user's device, rendering it inaccessible to third parties, including entities engaged in the learning process. FL incorporates privacy-enhancing techniques like differential privacy, which introduces statistical noise into data or model updates and makes reidentifying individuals from shared information exceedingly difficult. This feature is especially valuable in sectors governed by strict data privacy regulations, such as healthcare, finance, and telecommunications [53].

Fig. 2. Decentralized data sharing using FL.

In Fig. 2, the FL framework ensures that most processing happens in a distributed manner by distributing computational workloads across participating nodes. Every node in the network performs part of the total calculation. Using its local data set, each participating node trains its model independently. Therefore, it eliminates the need to send raw data to a central server by processing it locally on the node's device. Following local model training, the nodes only send the model updates, not the raw data, to a central server. Usually, these updates show the modifications or enhancements made to the models during local training. The central server combines these changes to create a global model. In brief, FL's decentralized architecture contributes to a safe, private, and cooperative ML environment by distributing processing duties across nodes and ensuring that crucial operations like model training occur locally.
Furthermore, transparency plays a pivotal role in estab-
lishing trust among collaborating parties. Participants can III. C OMBINATION OF FL AND B LOCKCHAIN FOR
verify that sensitive data remains unexposed during model- D ECENTRALIZED DATA S HARING
building [54]. In terms of resilience, FL enhances system The combination of FL and blockchain presents a robust
robustness through various means. For instance, it ensures solution for decentralized data sharing. FL enables secure,
efficient data utilization even in environments with limited local model training across multiple parties without central-
network connectivity [55]. Most computations occur on edge izing data, enhancing privacy and reducing network load.
devices (i.e., locally), requiring only intermittent network Blockchain complements this by providing a secure, trans-
access to transmit aggregated model updates. Additionally, parent ledger for recording transactions and maintaining data
FL is designed to handle device failures and data corruption integrity. Together, they create a powerful platform that
robustly. Should a device go offline or experience data cor- enhances security, privacy, interoperability, and transparency
ruption, the FL process continues with minimal disruption, in data sharing in healthcare [56]. Our approach uniquely
as it relies on numerous other devices that persist in their addresses end-to-end data security, from local model training
local computations. This redundancy significantly enhances to secure data storage and sharing, promising substantial
the reliability of FL models, ensuring their functionality even improvements in the efficiency and trustworthiness of collab-
in adverse circumstances [52]. For instance, there are three orative data sharing. A comparative analysis of decentralized
nodes (Node 1, Node 2, and Node 3) in the decentralized data-sharing when combining FL with blockchain technology
infrastructure layer, as shown in Fig. 2. Each node has its reveals enhanced security, improved transparency, and efficient
own instance of the FL framework, represented by the FL collaboration, as illustrated in Table III.
Framework layer. The nodes communicate with each other Enhanced Security: The combination of blockchain and
through the decentralized infrastructure to collaborate on FL ensures that data is encrypted, hashed, and distributed
training an ML model using their local data while ensuring across a network of nodes, making it difficult for hackers to
data privacy and security through FL techniques. compromise the system. FL can enhance security by allowing

TABLE III
DECENTRALIZED DATA SHARING WHEN BLOCKCHAIN MEETS FL

TABLE IV
ADVANTAGES OF USING FL AND BLOCKCHAIN FOR DECENTRALIZED DATA SHARING

local model training on the user's device without transferring data to a central server.
Improved Privacy: By using blockchain to store data in an encrypted and distributed manner, users can retain control over their data and decide who can access it. FL can also improve privacy by allowing local model training on user devices without centralized data collection.
Improved Interoperability: Blockchain and FL can enable interoperability between different systems and platforms, allowing seamless data sharing across different networks. FL can also improve interoperability by aggregating locally trained models across different devices and platforms.
Greater Transparency: The use of blockchain can provide greater transparency in data sharing by providing an immutable record of all transactions. The combination can further enhance transparency by enabling users to verify the authenticity of data and model outputs. FL can also improve transparency by allowing for the inspection of locally trained models by independent auditors.
Improved Resilience: The combination of blockchain and FL can ensure that data and models are distributed across a decentralized network of nodes, making the system more resilient to failures and attacks. FL can also improve resilience by allowing local model training on user devices, reducing the reliance on centralized servers. Using blockchain with FL can increase security and privacy while ensuring a transparent and fair training process. However, it may also require additional computational resources and coordination between parties and may not always be necessary or practical, depending on the specific use case. Table IV compares FL with and without blockchain for decentralized data sharing.
Fig. 3 shows the combination of FL and blockchain for decentralized data sharing. Data sources represent data sources that can be used in Dataspace 4.0. These can include sensors, devices, databases, and other sources. At the same time, FL represents the ML algorithms used for training models on distributed data. FL allows models to be trained without the need for centralized data storage. Data labeling and model training represent the processes of labeling data and training ML models on the labeled data. This process can be done in a decentralized manner using FL. Blockchain consensus and validation of transactions represent the use of blockchain for consensus and validation of transactions in Dataspace 4.0. Blockchain provides a decentralized mechanism for validating and verifying data transactions.

Fig. 3. Combination of FL and blockchain for decentralized data sharing.

Decentralized Data Management represents using the interplanetary file system (IPFS) for decentralized data management. IPFS allows data to be stored and accessed in a decentralized manner without relying on a central server. Mining Mechanism and Rewards represent the mechanism for mining data and rewarding data contributors. The rewards can be in the form of tokens or other incentives. Data analytics and reporting represent data analytics and tools to analyze and visualize data in Dataspace 4.0. These tools can be used to gain insights and make data-driven decisions. Data Governance represents using smart contracts for data governance in Dataspace 4.0. Smart contracts can be utilized to enforce rules and regulations for data sharing and access. Data consumers and smart data providers represent the users of Dataspace 4.0 who consume and provide data. Table V presents the security, privacy, interoperability, transparency, and resilience benefits of FL and blockchain technologies individually and in synergy within different industrial applications.
Nodes representing patients, researchers, and healthcare organizations must be set up to create a local, decentralized network for sharing medical data. By starting a blockchain, the nodes create a visible and safe ledger. Smart contracts are used to automate governance and guarantee compliance. Patients voluntarily supply personal health data, academics offer analytical models, and healthcare facilities contribute data sets. Nodes validate transactions using consensus procedures, keeping an accurate record. The network encourages cooperation by enabling a range of inputs without centralizing unprocessed data. By creating a safe and effective environment for healthcare data sharing, participants get access to a larger pool of data for research, improved privacy management, and transparent governance.
By distributing blockchain nodes across medical facilities, researchers, and patients, a distributed ledger is created to integrate blockchain technology into the local decentralized network. For automated governance, smart contracts enforce compliance with pre-established guidelines. By reaching a consensus on the ledger's current state, consensus mechanisms such as Proof of Authority or Proof of Stake validate transactions and preserve data integrity. Blockchain improves security by guaranteeing data confidentiality and limiting unwanted access. Offering an unchangeable and auditable record of transactions encourages openness and builds participant confidence. Data immutability is a significant advantage: because data cannot be changed once it is stored on the blockchain, it offers a solid basis for healthcare data exchange inside the local network.
An essential component of the infrastructure of the local network is the use of IPFS for decentralized data management. Instead of depending on a single server, IPFS functions as a distributed file system where data is saved among several nodes. It functions as a peer-to-peer network, enabling direct data storage and retrieval for any member of the healthcare ecosystem. Using a content-addressed architecture, IPFS ensures data integrity and minimizes redundancy by assigning a unique hash to each piece of data depending on its content. Because the data is spread across several nodes, IPFS has improved resilience, making the system resistant to failures. By enabling direct data retrieval from other network users, IPFS improves data accessibility and encourages a decentralized and effective method.
The community that the local decentralized network in healthcare serves benefits greatly. First, it allows hospitals, researchers, and patients to safely and effectively share medical data, improving patient care. The cooperative method improves the precision of medical diagnosis and available treatments. Second, the network expedites medical research by giving interested parties access to a large and varied data set while protecting personal privacy [57]. It encourages advancements in medical research and the creation of more realistic models. With its robust consensus processes, blockchain guarantees data security and privacy when integrated, while IPFS increases accessibility by decentralizing data storage and retrieval. In conclusion, cooperative data sharing on a local decentralized network advances healthcare, and IPFS and blockchain are essential for guaranteeing security, privacy, and accessibility for all parties involved [58].

IV. DECENTRALIZED DATA SHARING IN HEALTHCARE: USE CASE
Our methodology presents a decentralized approach in an era dominated by centralized data repositories. Let H represent the set of hospitals, where each hospital h ∈ H maintains its independent data set. Integrating FL and blockchain in our framework presents a powerful combination. FL facilitates the initial stages of data preprocessing and distribution between entities like Hospital A and Hospital B. Meanwhile, blockchain serves as the decentralized ledger, ensuring the transparency, security, and immutability of subsequent data transactions. By leveraging the strengths of both paradigms, we enhance the privacy, security, and efficiency of

TABLE V
BENEFITS OF DECENTRALIZED DATA SHARING IN DIFFERENT INDUSTRIES WITHIN THE CONTEXT OF INDUSTRY 4.0

Fig. 5. Nodes initialization.

decentralized data sharing. The schematic in Fig. 4 depicts the decentralized data-sharing process in a healthcare use case, highlighting the role of federated learning and the protection against unauthorized access.

Fig. 4. Decentralized data sharing in healthcare use case.

The processing in our BCFL system is highly distributed across multiple nodes. Each node operates autonomously within the decentralized infrastructure, conducting computations using its local data. This design is foundational to the FL framework we have implemented. It allows for a resilient and reliable process, as each node independently contributes to the overarching ML model without centralizing data, thus preserving privacy and minimizing the risk of data corruption or loss. For instance, our framework involves multiple nodes collaborating through a decentralized network to train an ML model. The local computations at each node mean that even if one device goes offline or experiences data corruption, the FL process experiences minimal disruption. This not only enhances the reliability of the FL models but also ensures their functionality even in adverse circumstances. In essence, the bulk of the processing in our BCFL system occurs in a distributed manner. Each node in the network takes on a portion of the computational load, with local data being processed at the edge, close to the data sources. The FL framework ensures that processing occurs locally at each node, particularly the computationally intensive model training tasks. This distributed processing approach is crucial for maintaining the system's integrity, ensuring data privacy, and enabling collaborative model training across various nodes. The local processing at the nodes is complemented by the blockchain, which provides a secure and transparent way to record and validate the model updates contributed by each node.

A. FL for Data Preprocessing and Distribution
In our approach, FL plays a pivotal role in the initial stages. Hospitals A and B utilize FL for data preprocessing while ensuring the raw data set remains securely within their respective premises. Through FL, both hospitals, despite retaining the actual data locally, collaboratively develop a model using shared insights and updates. The goal here is to benefit from the data available across both entities, and by the time any information gets ready for the blockchain, it is not the raw data but its processed encrypted attributes. The overall flow can be described as follows.
1) Hospital A and Hospital B each start with their local data sets. Fig. 5 provides an example of the node initialization process in a blockchain network, detailing the assigned hashes and the encryption keys for each node.
2) An FL cycle is initiated, where both hospitals collaborate to preprocess the data. Fig. 6 shows a sequence diagram for transactions between hospitals in a blockchain network, emphasizing the encryption, signature generation, and verification processes.
3) The processed data, now in a standardized format, is integrated into the blockchain for subsequent decentralized transactions.
It is worth noting that by utilizing FL at this stage, the integrity and privacy of the hospital data are maintained. Only aggregated updates are exchanged, ensuring data privacy.

B. Sharing Iris Data Set
The Iris data set, a widely used data set in ML and data analysis, was employed as the primary data set for this

Algorithm 1 Data Attributes Retrieval

function RetrieveData(species, sk)
    filtered_data ← filter_by_species(species)
    decrypted_data ← []
    for T in filtered_data do
        append(decrypted_data, decrypt(T, sk))
    end for
    return decrypted_data
end function
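Algorithm 1 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the toy textbook-RSA parameters (`P`, `Q`, `E`), the `Transaction` class, the `make_transaction` helper, and the encoding of measurements as integers in tenths of a centimeter are all hypothetical, and keys this small offer no real security.

```python
from dataclasses import dataclass

# Toy textbook-RSA key pair (assumption for illustration; far too small to be secure).
P, Q, E = 61, 53, 17
N = P * Q                          # public modulus
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (requires Python 3.8+)
PUBLIC_KEY, PRIVATE_KEY = (E, N), (D, N)


def encrypt(value, pk):
    e, n = pk
    return pow(value, e, n)


def decrypt(cipher, sk):
    d, n = sk
    return pow(cipher, d, n)


@dataclass
class Transaction:
    attributes: list  # encrypted sepal length/width and petal length/width
    species: str      # plaintext metadata, usable for filtering without decryption


def make_transaction(measurements, species, pk):
    """Encrypt each measurement (an integer in tenths of a cm) and attach the label."""
    return Transaction([encrypt(m, pk) for m in measurements], species)


def retrieve_data(ledger, species, sk):
    """Filter transactions by the unencrypted species label, then decrypt each match."""
    filtered = [t for t in ledger if t.species == species]
    return [[decrypt(c, sk) for c in t.attributes] for t in filtered]


ledger = [
    make_transaction([51, 35, 14, 2], "setosa", PUBLIC_KEY),
    make_transaction([63, 33, 60, 25], "virginica", PUBLIC_KEY),
    make_transaction([49, 30, 14, 2], "setosa", PUBLIC_KEY),
]
print(retrieve_data(ledger, "setosa", PRIVATE_KEY))
```

The filter touches every transaction once and each match costs one decryption per attribute, which is where the linear dependence on the number of transactions and on the decryption time comes from.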

research. This data set consists of 150 samples from three species of Iris flowers (Iris setosa, Iris virginica, and Iris versicolor). Four features were measured from each sample: the lengths and the widths of the sepals and petals. Given its rich history in data analysis and ML, the Iris data set served as an ideal foundation for demonstrating the feasibility and effectiveness of our decentralized data-sharing mechanism.

Fig. 6. Transactions between hospitals.

1) Data Representation in Federated Learning: FL ensures that the participating nodes, like hospitals, retain their local data without exposing the raw data set to others. However, essential attributes or insights derived from the data might undergo encryption and be shared for collaborative learning. These shared attributes, rather than the actual data, get recorded on the blockchain, ensuring transparency, security, and consistency.
2) Data Representation in Blockchain: While the actual data sets, like the Iris data set, do not leave the respective hospitals, specific data attributes are processed and then encrypted for sharing on the blockchain. Specifically, the attributes of the Iris data set (sepal length, sepal width, petal length, and petal width) are encrypted using the recipient's public key. Additionally, the species label acts as metadata, which is not encrypted, allowing for querying based on species without requiring decryption. The complexity in these sections is centered around data attribute encryption and decryption. The encryption process used for the Iris data set attributes, like sepal length and width, is based on public-key cryptography. The time complexity for such operations typically depends on the key size and the algorithm used, and is often polynomial in the key length

$$T = \{\mathrm{encrypt}(s_l, pk),\ \mathrm{encrypt}(s_w, pk),\ \mathrm{encrypt}(p_l, pk),\ \mathrm{encrypt}(p_w, pk),\ \mathit{species}\} \tag{1}$$

where:
1) $s_l$ is the sepal length;
2) $s_w$ is the sepal width;
3) $p_l$ is the petal length;
4) $p_w$ is the petal width.
3) Data Retrieval and Analysis: To retrieve specific data attributes from the blockchain, we implement Algorithm 1. The retrieval algorithm's complexity depends on the filtered data's size and the decryption process's efficiency. If $n$ represents the number of transactions and $d$ represents the decryption time, the total time complexity would be $O(n \cdot d)$, assuming the filter operation's complexity is less than or equal to $O(n)$. After retrieving the data, standard data analysis or ML techniques can be applied to the decrypted data set.
4) Data Structure (Blockchain): Each hospital's blockchain can be represented as a sequence of blocks

$$B = \{b_0, b_1, b_2, \ldots, b_n\} \tag{2}$$

where $b_0$ is the genesis block and $b_n$ is the latest block. Each block $b_i$ contains

$$b_i = \{T, h(b_{i-1}), \mathit{nonce}\} \tag{3}$$

where:
1) $T$ is the transaction data;
2) $h(b_{i-1})$ is a cryptographic hash of the previous block;
3) $\mathit{nonce}$ is a variable adjusted during the proof-of-work process.
The blockchain structure comprises a sequence of blocks, each linking to its predecessor through a hash. The complexity of adding a new block involves calculating the hash and performing the proof of work, which has a complexity of $O(2^k)$ on average, where $k$ is the number of bits required by the difficulty target $D$.

C. Data Transaction
Given a message $M$, the encrypted message $E$ for a recipient with public key $pk$ is

$$E = \mathrm{encrypt}(M, pk). \tag{4}$$

The signature $S$ using the sender's private key $sk$ is

$$S = \mathrm{sign}(M, sk). \tag{5}$$

The data transaction process involves encryption and signing operations. Both operations run in time polynomial in the key sizes used for encryption and signing. The transmission complexity depends on network factors and is typically considered $O(1)$ in the context of algorithmic analysis. Algorithm 2 outlines the process for sending encrypted data and the corresponding digital signature in a blockchain-based data transaction.

D. Consensus Mechanism: Proof of Work
The proof-of-work consensus mechanism aims to find a nonce such that

$$h(T, h(b_{i-1}), \mathit{nonce}) < D \tag{6}$$

where:

Algorithm 2 Data Transaction

procedure SendData(M, pk_recipient, sk_sender)
    E ← encrypt(M, pk_recipient)
    S ← sign(M, sk_sender)
    transmit(E, S)    ▷ Send encrypted data and signature
end procedure

Algorithm 3 Proof of Work

procedure MineBlock(T, h(b_{i-1}), D)
    nonce ← 0
    while h(T, h(b_{i-1}), nonce) ≥ D do
        nonce ← nonce + 1
    end while
    return nonce
end procedure

Algorithm 4 Authorization Check

function IsAuthorized(pk)
    if ∃(id, pk) ∈ R then
        return True
    else
        return False
    end if
end function

Algorithm 5 Potential Attack Methods

function ReplayTransaction(interceptedTransaction)
    send(interceptedTransaction)
end function

function Masquerade(fakeID, transactionData)
    fakeSignature ← forgeSignature(transactionData)
    send(transactionData, fakeSignature, fakeID)
end function

function InterceptAndAlter(transaction)
    interceptedData ← transaction.data
    alteredData ← modify(interceptedData)
    forward(alteredData)
end function
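Algorithm 3 can be sketched in Python with the standard `hashlib` module. The sketch is only an illustration: the choice of SHA-256 as the hash function h, the string serialization in `block_hash`, and the deliberately easy `TARGET` (12 leading zero bits) are assumptions, not details taken from the paper.

```python
import hashlib

# Assumed difficulty target D: hashes must be below 2**244, i.e. start with 12 zero bits.
TARGET = 1 << 244


def block_hash(transactions, prev_hash, nonce):
    """h(T, h(b_{i-1}), nonce), interpreted as a 256-bit integer."""
    payload = f"{transactions}|{prev_hash}|{nonce}".encode()
    return int.from_bytes(hashlib.sha256(payload).digest(), "big")


def mine_block(transactions, prev_hash, target):
    """Increment the nonce until the block hash falls below the target."""
    nonce = 0
    while block_hash(transactions, prev_hash, nonce) >= target:
        nonce += 1
    return nonce


def is_valid(transactions, prev_hash, nonce, target):
    """Verification needs a single hash, whereas mining needs many."""
    return block_hash(transactions, prev_hash, nonce) < target


genesis_hash = "0" * 64
nonce = mine_block("encrypted iris attributes", genesis_hash, TARGET)
print(nonce, is_valid("encrypted iris attributes", genesis_hash, nonce, TARGET))
```

Halving the target adds one required leading zero bit and doubles the expected number of hash evaluations, which is the exponential behavior the complexity discussion refers to.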

1) $D$ represents the target difficulty;
2) $h$ is the hashing function.
Proof of work is inherently designed to be computationally intensive. The complexity is not fixed and is adjusted by the difficulty target $D$. The average time needed to find a valid nonce is inversely proportional to the target $D$ (a smaller target means a higher difficulty) and is typically exponential in the number of leading zeros required in the hash output. Algorithm 3 describes the Proof of Work (PoW) process, essential for maintaining the integrity and trust in blockchain operations.

E. Authorization Mechanism
Let the centralized registry $R$ be a set of tuples

$$R = \{(id_1, pk_1), (id_2, pk_2), \ldots, (id_n, pk_n)\} \tag{7}$$

where:
1) $id_i$ is the unique identifier of hospital $h_i$;
2) $pk_i$ is the public key of hospital $h_i$.
The authorization check function, isAuthorized(pk), verifies if a given public key exists in the registry $R$. In Fig. 7, the transactional workflow for requesting and granting data access between hospitals is depicted, demonstrating the use of encryption and blockchain verification.

Fig. 7. Requests for data access between hospitals.

The authorization check involves searching through a registry for a matching public key. If the registry is unsorted and has $n$ entries, this operation has a worst case time complexity of $O(n)$. If the registry is sorted or hashed, the time complexity could be reduced to $O(\log n)$ or even $O(1)$, respectively. Algorithm 4 presents a method for checking authorization of a participant in a blockchain network using a public key.

F. Adversarial Simulation: Hospital C
For our research, we introduced a malicious third-party entity termed Hospital C. This entity was not part of the list of authorized hospitals and acted as an adversary, simulating various attack vectors to compromise the system's security. Algorithm 5 enumerates potential attack methods within a blockchain network, including replay, masquerade, and intercept and alter attacks.
1) Replay Attack: Hospital C eavesdrops on the transactions between Hospital A and Hospital B. It tries to resend intercepted transactions, aiming to reinsert data or initiate unauthorized data requests.
2) Identity Masquerade: Hospital C attempts to masquerade as Hospital A or Hospital B by forging signatures or manipulating its IP address.
3) Man-in-the-Middle Attack: Hospital C places itself between Hospital A and Hospital B, intercepting and potentially altering the data being exchanged.

G. Defense Mechanisms
Against the backdrop of these simulated attacks, our blockchain implementation showcased several defense mechanisms.
1) Nonce and Hash Verification: Every block contains a nonce value, ensuring the block's hash matches a particular pattern. Replay attacks get detected as the blockchain verifies the nonce and hash values, and a reused nonce value indicates a replay attempt.
2) Digital Signatures and IP Verification: Our system uses RSA-based digital signatures to verify the authenticity of transactions. The digital signature verification will fail if Hospital C masquerades as Hospital A or B. Additionally, IP address checks were implemented to add an extra layer of verification, further thwarting identity masquerade attempts.
3) End-to-End Encryption: Data exchanged between hospitals is encrypted using the recipient's public key. This ensures that even if Hospital C intercepts the data in a man-in-the-middle attack, it cannot decrypt or modify it without the corresponding private key.
Through these defense mechanisms, our decentralized data-sharing blockchain system demonstrated resilience against the common threats posed by adversarial entities.
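A receiver-side check combining the registry lookup, signature verification, and replay detection described above might look like the following sketch. It is a hedged illustration, not the authors' implementation: the `Receiver` class, the `build_message` layout, the per-transaction counter used as a nonce, and the toy textbook-RSA key built from two small Mersenne primes are all assumptions, and unpadded RSA with a modulus this small is not secure.

```python
import hashlib

# Toy textbook-RSA key pair for Hospital A (illustration only).
P, Q, E = 2**61 - 1, 2**31 - 1, 65537
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (requires Python 3.8+)


def digest(message, n):
    """SHA-256 digest of the message, reduced modulo the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n


def sign(message, d, n):
    return pow(digest(message, n), d, n)


def verify(message, signature, e, n):
    return pow(signature, e, n) == digest(message, n)


def build_message(sender_id, nonce, payload):
    """Bind sender, nonce, and payload together so none can be swapped unnoticed."""
    return f"{sender_id}|{nonce}|".encode() + payload


class Receiver:
    def __init__(self, registry):
        self.registry = registry  # sender id -> public key (e, n)
        self.seen = set()         # (sender id, nonce) pairs already accepted

    def accept(self, sender_id, nonce, payload, signature):
        if sender_id not in self.registry:
            return "unauthorized"
        e, n = self.registry[sender_id]
        if not verify(build_message(sender_id, nonce, payload), signature, e, n):
            return "bad signature"
        if (sender_id, nonce) in self.seen:
            return "replay"
        self.seen.add((sender_id, nonce))
        return "accepted"


receiver = Receiver({"hospital_A": (E, N)})
payload = b"encrypted attributes"
signature = sign(build_message("hospital_A", 1, payload), D, N)
print(receiver.accept("hospital_A", 1, payload, signature))  # first delivery
print(receiver.accept("hospital_A", 1, payload, signature))  # same transaction resent
```

In this sketch a resent transaction is rejected because its nonce was already seen, an unregistered sender fails the registry lookup, and any change to the sender, nonce, or payload invalidates the signature. The IP address check mentioned above is not modeled here.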
Fig. 8. Attacks' success rates.

H. Evaluation
The effectiveness of the defense mechanisms against poisoning attacks conducted by the adversarial entity is evaluated as follows.
1) Replay Attack: The defense mechanism includes nonce and hash verification within the blockchain. When Hospital C attempts to resend intercepted transactions, the system checks for nonce values. A reused nonce indicates a replay attempt, which the blockchain is designed to detect. The graph in Fig. 8 shows a low success rate for replay attacks, remaining consistently low across multiple attempts. This indicates the system's effective detection and prevention of replay attempts, attributed to the robust verification process.
2) Identity Masquerade: The system uses RSA-based digital signatures and IP verification to ensure the authenticity of transactions. Hospital C's attempts to forge signatures or manipulate its IP address will likely be unsuccessful due to these stringent checks. The graph in Fig. 8 corroborates this: the success rate for identity masquerade attacks is also low and does not significantly increase with more attempts. This reflects the strength of the digital signature verification and IP checks in preventing unauthorized entity masquerading.
3) Man-in-the-Middle Attack: With end-to-end encryption, Hospital C cannot decrypt or alter the information without the corresponding private key, even if it intercepts the data. The graph in Fig. 8 suggests that man-in-the-middle attacks have a slightly higher success rate than the other two types but remain relatively low. This slight increase could be due to the complexity of detecting and preventing active interception compared to the more straightforward detection of replay and identity attacks. Nonetheless, the encryption mechanism is a solid barrier, preventing Hospital C from gaining meaningful access to the data.
The overall low success rates across all attack types illustrate the robustness of the defense mechanisms. The nonce and hash checks, digital signature and IP verification, and end-to-end encryption collectively contribute to the resilience of the blockchain system, effectively mitigating the risk of poisoning attacks. This analysis, supported by the empirical data shown in Fig. 8, demonstrates that the defense strategies are sufficiently robust, and the system can be considered secure against the simulated adversarial actions.

V. CHALLENGES, OPPORTUNITIES, AND FUTURE DIRECTIONS
A. Challenges
Decentralized data sharing presents several technical challenges that must be addressed to ensure its effectiveness and security. Some of these challenges include the following.
Interoperability: Different decentralized data-sharing systems may use different protocols and standards, making sharing data across different systems difficult. This requires standardization and interoperability between systems.
Scalability: Decentralized data-sharing systems must be designed to handle large amounts of data and many participants. This requires efficient data storage and retrieval mechanisms and distributed processing capabilities.
Consensus: Decentralized data-sharing systems rely on consensus mechanisms to ensure that all participants agree on the validity of shared data. This requires robust consensus algorithms to handle malicious attacks and ensure data integrity.
Security: Decentralized data-sharing systems must be designed to protect data from unauthorized access, tampering, and corruption. This requires robust authentication, encryption, and effective mechanisms for detecting and mitigating attacks.
Privacy: Decentralized data-sharing systems must protect the privacy of participants' data and sensitive personal and financial data. This requires effective mechanisms for anonymizing and protecting data and ensuring participants have control over the data.
Data Quality: Decentralized data-sharing systems must ensure the accuracy and reliability of shared data, especially in cases where data is collected from multiple sources. This requires effective data validation and verification mechanisms to resolve conflicts between data sources.

B. Opportunities
First, it enables businesses and organizations to access a broader range of data, leading to more comprehensive insights and improved decision making. This can lead to the development of new products and services and enhance

the competitiveness of companies. Second, decentralized data FL and blockchain, signalling a paradigm shift toward secure,
sharing promotes collaboration among participants, allowing collaborative, and patient-centric decentralized data sharing
them to work together to solve complex problems and develop in the data-driven healthcare era. The combination of FL’s
new solutions. This can lead to new business models, part- decentralized ML paradigm and blockchain’s transparent and
nerships, and ecosystems. Third, decentralized data sharing can facilitate the development of new technologies and applications, such as blockchain and edge computing, which can further enhance the capabilities of Dataspace 4.0. Fourth, it can lead to increased transparency and accountability, which is particularly important in healthcare and finance, where privacy and security are crucial. Finally, decentralized data sharing can give individuals more control over their data, increasing privacy and security. This can lead to the development of new services that provide individuals with more control over their personal information.

Applying BCFL to decentralized data sharing presents a unique and promising use case in healthcare, particularly for remote monitoring applications.
1) Remote Patient Monitoring (RPM): RPM involves tracking patient health data outside of traditional clinical settings. This could include monitoring vital signs, blood sugar levels, heart rate, or other relevant health metrics through wearable devices or home-based equipment [59].
2) Collaborative Research and Treatment Optimization: BCFL can facilitate collaborative research among different healthcare entities while maintaining data privacy. This collaboration can lead to more comprehensive health models, benefiting treatment optimization [60].
3) Regulatory Compliance and Consent Management: Healthcare is a highly regulated sector, and BCFL can aid in complying with regulations like HIPAA, GDPR, and others concerning patient data protection [61].

C. Future Directions
Advancing decentralized data sharing requires multifaceted research efforts. Technical challenges, including data integration, interoperability, and security, demand the development of tailored algorithms and architectures. Legal and regulatory dimensions necessitate the exploration of frameworks safeguarding privacy amid data sharing. Investigating the potential of decentralized data sharing in industries like healthcare and finance involves identifying domain-specific use cases. Additionally, emerging technologies, such as blockchain and edge computing, require scrutiny for their performance in decentralized contexts. Finally, developing business models and ecosystems with incentives for collaboration is vital. Looking ahead, a focus on practical applications, exemplified through case studies in healthcare partnerships, aims to validate methodologies, address concerns about centralized control, and enhance flexibility for global applicability. The commitment to refining and verifying these approaches in real-world healthcare underscores a dedicated thrust for the evolution of decentralized data sharing.

VI. CONCLUSION
This article has introduced a groundbreaking exploration of the conceptual framework and technical synergy between

immutable ledger creates an ecosystem fostering trust, security, and data integrity. While a specific real-world healthcare use case is not presented, this article vividly outlines the potential impact of this fusion on patient care, emphasizing the preservation of patient privacy alongside granting healthcare providers and researchers access to diverse data sets. The proposed approach promises to accelerate medical research, improve treatment outcomes, and empower patients through data ownership. The synergy of FL and blockchain envisions a healthcare ecosystem that prioritizes individual privacy, fosters advancements in medical science, and sets the stage for a transformative shift in healthcare data sharing. This innovative approach addresses the challenges of balancing data utility and privacy and opens avenues for more accurate models, leading to enhanced diagnoses and ultimately contributing to the evolution of a patient-centric and collaborative healthcare landscape.

REFERENCES
[1] “A common data space 4.0 for European manufacturing.” DFA. Accessed: Oct. 16, 2023. [Online]. Available: https://digitalfactoryalliance.eu/moving-towards-a-common-data-space-4-0-for-european-manufacturing/
[2] “Dataspace 4.0.” Accessed: Jun. 23, 2023. [Online]. Available: https://digitalfactoryalliance.eu/data-space-4-0-alliance/
[3] P. Varga et al., “Converging telco-grade solutions 5G and beyond to support production in industry 4.0,” Appl. Sci., vol. 12, no. 15, p. 7600, 2022.
[4] B. Han et al., “Digital twins for industry 4.0 in the 6G era,” 2022, arXiv:2210.08970.
[5] T. White, E. Blok, and V. D. Calhoun, “Data sharing and privacy issues in neuroimaging research: Opportunities, obstacles, challenges, and monsters under the bed,” Hum. Brain Map., vol. 43, no. 1, pp. 278–291, 2022.
[6] A. Torab-Miandoab, T. Samad-Soltani, A. Jodati, and P. Rezaei-Hachesu, “Interoperability of heterogeneous health information systems: A systematic literature review,” BMC Med. Informat. Decis. Mak., vol. 23, no. 1, p. 18, 2023.
[7] M. A. Uddin, A. Stranieri, and I. Gondal, “A survey on the adoption of blockchain in IoT: Challenges and solutions,” Blockchain, Res. Appl., vol. 2, no. 2, 2021, Art. no. 100006.
[8] V. Neumann et al., “Examining public views on decentralised health data sharing,” PLoS ONE, vol. 18, no. 3, 2023, Art. no. e0282257.
[9] E. Curry et al., “Data sharing spaces: The BDVA perspective,” in Designing Data Spaces: The Ecosystem Approach to Competitive Advantage. Cham, Switzerland: Springer Int. Publ., 2022, pp. 365–382.
[10] V. Pandi Chellapandi, A. Upadhyay, A. Hashemi, and S. H. Zak, “On the convergence of decentralized federated learning under imperfect information sharing,” 2023, arXiv:2303.10695.
[11] H. Niavis, N. Papadis, V. Reddy, H. Rao, and L. Tassiulas, “A blockchain-based decentralized data sharing infrastructure for off-grid networking,” in Proc. IEEE Int. Conf. Blockchain Cryptocurr. (ICBC), 2020, pp. 1–5.
[12] B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y Arcas, “Communication-efficient learning of deep networks from decentralized data,” in Proc. 20th Artif. Intell. Statist., 2017, pp. 1273–1282.
[13] H. Fang and Q. Qian, “Privacy preserving machine learning with homomorphic encryption and federated learning,” Future Internet, vol. 13, no. 4, p. 94, 2021.
[14] S. Nakamoto, “Bitcoin: A peer-to-peer electronic cash system,” in Proc. Decent. Bus. Rev., 2008, p. 21260.
[15] H. S. Chen, J. T. Jarrell, K. A. Carpenter, D. S. Cohen, and X. Huang, “Blockchain in healthcare: A patient-centered model,” Biomed. J. Sci. Techn. Res., vol. 20, no. 3, 2019, Art. no. 15017.

[16] S. Alansari, “A blockchain-based approach for secure, transparent and accountable personal data sharing,” Ph.D. dissertation, Faculty of Eng., Sci. Math., University of Southampton, Southampton, U.K., 2020.
[17] S. H. Alsamhi et al., “Drones’ edge intelligence over smart environments in B5G: Blockchain and federated learning synergy,” IEEE Trans. Green Commun. Netw., vol. 6, no. 1, pp. 295–312, Mar. 2022.
[18] I. Jao et al., “Research stakeholders’ views on benefits and challenges for public health research data sharing in Kenya: the importance of trust and social relations,” PLoS ONE, vol. 10, no. 9, 2015, Art. no. e0135545.
[19] Y. M. Arif, H. Nurhayati, F. Kurniawan, S. M. S. Nugroho, and M. Hariadi, “Blockchain-based data sharing for decentralized tourism destinations recommendation system,” Int. J. Intell. Eng. Syst., vol. 13, no. 6, pp. 472–486, 2020.
[20] T. Guggenberger, A. Schweizer, and N. Urbach, “Improving interorganizational information sharing for vendor managed inventory: Toward a decentralized information hub using blockchain technology,” IEEE Trans. Eng. Manag., vol. 67, no. 4, pp. 1074–1085, Nov. 2020.
[21] M. Johnson, M. Jones, M. Shervey, J. T. Dudley, and N. Zimmerman, “Building a secure biomedical data sharing decentralized app (DApp): Tutorial,” J. Med. Internet Res., vol. 21, no. 10, 2019, Art. no. e13601.
[22] A. Balador, A. Bazzi, U. Hernandez-Jayo, I. de la Iglesia, and H. Ahmadvand, “A survey on vehicular communication for cooperative truck platooning application,” Veh. Commun., vol. 35, 2022, Art. no. 100460.
[23] M. Firdaus, S. Rahmadika, and K. H. Rhee, “Decentralized trusted data sharing management on internet of vehicle edge computing (IoVEC) networks using consortium blockchain,” Sensors, vol. 21, no. 7, p. 2410, 2021.
[24] P. Wang, W. Cui, and J. Li, “A framework of data sharing system with decentralized network,” in Proc. 1st Int. Conf. (BigSDM), 2019, pp. 255–262.
[25] O. Gallay, K. Korpela, N. Tapio, and J. K. Nurminen, “A peer-to-peer platform for decentralized logistics,” in Proc. Hamburg Int. Conf. Logist. (HICL), 2017, pp. 19–34.
[26] S. Swetha and P. M. JoePrathap, “A study on a decentralized network secured data sharing using blockchain,” in Proc. 1st Int. Conf. Comput. Sci. Technol. (ICCST), 2022, pp. 620–624.
[27] C. F. L. Hickman et al., “Data sharing: Using blockchain and decentralized data technologies to unlock the potential of artificial intelligence: What can assisted reproduction learn from other areas of medicine?” Fertil. Steril., vol. 114, no. 5, pp. 927–933, 2020.
[28] F. Firouzi, B. Farahani, and A. Marinšek, “The convergence and interplay of edge, fog, and cloud in the AI-driven Internet of Things (IoT),” Inf. Syst., vol. 107, Jul. 2022, Art. no. 101840.
[29] F. Firouzi et al., “Fusion of IoT, AI, edge–fog–cloud, and blockchain: Challenges, solutions, and a case study in healthcare and medicine,” IEEE Internet Things J., vol. 10, no. 5, pp. 3686–3705, Mar. 2023.
[30] B. Farahani, F. Firouzi, and M. Luecking, “The convergence of IoT and distributed ledger technologies (DLT): Opportunities, challenges, and solutions,” J. Netw. Comput. Appl., vol. 177, 2021, Art. no. 102936.
[31] Y. Liu et al., “Deep anomaly detection for time-series data in Industrial IoT: A communication-efficient on-device federated learning approach,” IEEE Internet Things J., vol. 8, no. 8, pp. 6348–6358, Apr. 2021.
[32] Q. Meng, F. Zhou, H. Ren, T. Feng, G. Liu, and Y. Lin, “Improving federated learning face recognition via privacy-agnostic clusters,” 2022, arXiv:2201.12467.
[33] T. Li, A. Kumar Sahu, A. Talwalkar, and V. Smith, “Federated learning: Challenges, methods, and future directions,” IEEE Signal Process. Mag., vol. 37, no. 3, pp. 50–60, May 2020.
[34] S. H. Alsamhi, A. V. Shvetsov, A. Hawbani, S. V. Shvetsova, S. Kumar, and L. Zhao, “Survey on federated learning enabling indoor navigation for industry 4.0 in B5G,” Future Gener. Comput. Syst., vol. 148, pp. 250–265, Nov. 2023.
[35] Y. Tian, S. Wang, J. Xiong, R. Bi, Z. Zhou, and M. Z. A. Bhuiyan, “Robust and privacy-preserving decentralized deep federated learning training: Focusing on digital healthcare applications,” IEEE/ACM Trans. Comput. Biol. Bioinformat., early access, Mar. 3, 2023, doi: 10.1109/TCBB.2023.3243932.
[36] I. Makhdoom, I. Zhou, M. Abolhasan, J. Lipman, and W. Ni, “PrivySharing: A blockchain-based framework for privacy-preserving and secure data sharing in smart cities,” Comput. Secur., vol. 88, Jan. 2020, Art. no. 101653.
[37] N. Fatima, P. Agarwal, and S. S. Sohail, “Security and privacy issues of blockchain technology in health care—A review,” in ICT Analysis and Applications. Singapore: Springer, 2022, pp. 193–201.
[38] H. Fan et al., “Privacy preserving ultra-short-term wind power prediction based on secure multi party computation,” 2023, arXiv:2301.13513.
[39] Y. Ye, L. Zhang, W. You, and Y. Mu, “Secure decentralized access control policy for data sharing in smart grid,” in Proc. IEEE Conf. Comput. Commun. Workshops (INFOCOM WKSHPS), 2021, pp. 1–6.
[40] M. Sultana, A. Hossain, F. Laila, K. A. Taher, and M. N. Islam, “Towards developing a secure medical image sharing system based on zero trust principles and blockchain technology,” BMC Med. Informat. Decis. Mak., vol. 20, no. 1, pp. 1–10, 2020.
[41] Y. Lu, X. Huang, Y. Dai, S. Maharjan, and Y. Zhang, “Blockchain and federated learning for privacy-preserved data sharing in Industrial IoT,” IEEE Trans. Ind. Informat., vol. 16, no. 6, pp. 4177–4186, Jun. 2020.
[42] M. Stolpe, “The Internet of Things: Opportunities and challenges for distributed data analysis,” ACM SIGKDD Explor. Newslett., vol. 18, no. 1, pp. 15–34, 2016.
[43] R. E. Endeley, “End-to-end encryption in messaging services and national security—Case of Whatsapp messenger,” J. Inf. Secur., vol. 9, no. 1, p. 95, 2018.
[44] B. Lashkari and P. Musilek, “A comprehensive review of blockchain consensus mechanisms,” IEEE Access, vol. 9, pp. 43620–43652, 2021.
[45] M. Pilkington, “11 blockchain technology: Principles and applications,” in Research Handbook on Digital Transformations, vol. 225. Cheltenham, U.K.: Edward Elgar Publ., 2016.
[46] E.-H. Diallo, O. Dib, and K. Al Agha, “A scalable blockchain-based scheme for traffic-related data sharing in VANETs,” Blockchain, Res. Appl., vol. 3, no. 3, 2022, Art. no. 100087.
[47] M. Pilkington, “Blockchain technology: Principles and applications,” in Research Handbook on Digital Transformations. Cheltenham, U.K.: Edward Elgar Publ., 2016, pp. 225–253.
[48] H. Xu, P. V. Klaine, O. Onireti, B. Cao, M. Imran, and L. Zhang, “Blockchain-enabled resource management and sharing for 6G communications,” Digit. Commun. Netw., vol. 6, no. 3, pp. 261–269, 2020.
[49] N. Radziwill, “Blockchain revolution: How the technology behind bitcoin is changing money, business, and the world,” Qual. Manag. J., vol. 25, no. 1, pp. 64–65, 2018.
[50] J. Konečný, H. B. McMahan, F. X. Yu, P. Richtárik, A. Theertha Suresh, and D. Bacon, “Federated learning: Strategies for improving communication efficiency,” 2016, arXiv:1610.05492.
[51] K. Bonawitz et al., “Towards federated learning at scale: System design,” in Proc. Mach. Learn. Syst., vol. 1, 2019, pp. 374–388.
[52] T. Li, A. K. Sahu, M. Zaheer, M. Sanjabi, A. Talwalkar, and V. Smith, “Federated optimization in heterogeneous networks,” in Proc. Mach. Learn. Syst., vol. 2, 2020, pp. 429–450.
[53] N. Rieke et al., “The future of digital health with federated learning,” NPJ Digit. Med., vol. 3, no. 1, p. 119, 2020.
[54] Q. Yang, Y. Liu, T. Chen, and Y. Tong, “Federated machine learning: Concept and applications,” ACM Trans. Intell. Syst. Technol., vol. 10, no. 2, pp. 1–19, 2019.
[55] T. Nishio and R. Yonetani, “Client selection for federated learning with heterogeneous resources in mobile edge,” in Proc. IEEE Int. Conf. Commun. (ICC), 2019, pp. 1–7.
[56] R. Myrzashova, S. H. Alsamhi, A. V. Shvetsov, A. Hawbani, and X. Wei, “Blockchain meets federated learning in healthcare: A systematic review with challenges and opportunities,” IEEE Internet Things J., vol. 10, no. 16, pp. 14418–14437, Aug. 2023.
[57] Z. Zhou, C. Guo, X. Zhang, R. Wang, L. Zhang, and M. Imran, “A Blockchain-based data sharing marketplace with a federated learning use case,” in Proc. IEEE Int. Conf. Blockchain Cryptocurr. (ICBC), Dubai, UAE, 2023, pp. 1041–1044.
[58] J. Bang and M.-J. Choi, “Design of personal data protection decentralized model using blockchain and IPFS,” in Proc. 24th Asia–Pac. Netw. Oper. Manag. Symp. (APNOMS), Sejong, South Korea, 2023, pp. 251–254.
[59] D. K. Acharya, M. Shrivastava, and P. Padhi, “A decentralized blockchain-based IoT system for privacy-preserving data sharing,” in Proc. IEEE Int. Conf. Blockchain Distrib. Syst. Secur. (ICBDS), New Raipur, India, 2023, pp. 1–5.
[60] R. Song, B. Xiao, Y. Song, S. Guo, and Y. Yang, “A survey of blockchain-based schemes for data sharing and exchange,” IEEE Trans. Big Data, vol. 9, no. 6, pp. 1477–1495, Dec. 2023.
[61] Y. Liu, P. Liu, W. Jing, and H. H. Song, “PD2S: A privacy-preserving differentiated data sharing scheme based on blockchain and federated learning,” IEEE Internet Things J., vol. 10, no. 24, pp. 21489–21501, Dec. 2023.
